chronix.server's People

Contributors

gitter-badger, justdabuk, lamora, phxql, r4j4h, sebfz1, sebhaub, stigkj, waffle-iron


chronix.server's Issues

New Aggregation: Signed Difference

Implement signed difference aggregation.

For negative values

first = -1
last = -10
=> diff = -9

For positive values

first = 1
last = 10
=> diff = 9

Positive first, negative last

first = 1
last = -10
=> diff = -11

Negative first, positive last

first = -1
last = 10
=> diff = 11
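The examples above all reduce to "last minus first". A minimal sketch of such an aggregation (class and method names are hypothetical, not Chronix's actual API):

```java
public class SignedDifference {
    // Signed difference of a time series: last value minus first value.
    // Matches the examples above, e.g. first = 1, last = -10 => diff = -11.
    public static double apply(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("empty time series");
        }
        return values[values.length - 1] - values[0];
    }
}
```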

Chronix Simple Ingestion Interface

We should provide a simple ingestion interface for time series data, e.g. pairs of (timestamp, value). We should adapt the protocols of InfluxDB, Graphite, ...

Server-side Analysis Scripts

Discuss whether we should provide a way for a user to send a Groovy script to Chronix that is evaluated on the server side. This could be an easy way to extend Chronix with missing analyses.

Merging attributes on aggregations and analyses

The attributes of the time series included in an analysis or aggregation are currently not merged; only the attributes of the first time series are set in the result. We should merge the attributes, using a set as the value that holds all attributes of the same key.
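A minimal sketch of the proposed merge, assuming attributes are plain key-value maps (class and method names are hypothetical):

```java
import java.util.*;

public class AttributeMerger {
    // Merge the attribute maps of several time series: each key maps to the
    // set of all values seen for that key across the merged series.
    public static Map<String, Set<Object>> merge(List<Map<String, Object>> attributeMaps) {
        Map<String, Set<Object>> merged = new HashMap<>();
        for (Map<String, Object> attributes : attributeMaps) {
            for (Map.Entry<String, Object> entry : attributes.entrySet()) {
                merged.computeIfAbsent(entry.getKey(), k -> new LinkedHashSet<>())
                      .add(entry.getValue());
            }
        }
        return merged;
    }
}
```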

FastDTW Analysis

In some cases a time series has two or more points with exactly the same timestamp. FastDTW cannot deal with that. Hence we have to filter / aggregate the points that share a timestamp.
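One way to pre-process the series before running FastDTW (a sketch; Chronix's point types and method names differ):

```java
import java.util.*;

public class DuplicateTimestampFilter {
    // Collapse points that share a timestamp into a single point whose value
    // is the average of the duplicates. Input arrays are parallel: the i-th
    // timestamp belongs to the i-th value.
    public static Map<Long, Double> aggregate(long[] timestamps, double[] values) {
        Map<Long, double[]> sumAndCount = new LinkedHashMap<>();
        for (int i = 0; i < timestamps.length; i++) {
            double[] acc = sumAndCount.computeIfAbsent(timestamps[i], t -> new double[2]);
            acc[0] += values[i]; // running sum
            acc[1] += 1;         // count
        }
        Map<Long, Double> result = new LinkedHashMap<>();
        for (Map.Entry<Long, double[]> e : sumAndCount.entrySet()) {
            result.put(e.getKey(), e.getValue()[0] / e.getValue()[1]);
        }
        return result;
    }
}
```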

MovingAverage implementation is wrong

The documentation says:

/**
 * Calculates the moving average of the time series using the following algorithm:
 * <p>
 * 1) Get all points within the defined time window
 * -> Calculate the time and value average sum(values)/#points
 * <p>
 * 2) If the distance of two timestamps (i, j) is larger than the time window
 * -> Ignore the empty window
 * -> Use j as the start of the window and continue with step 1)
 *
 * @param timeSeries the time series that is transformed
 * @return the transformed time series
 */

That is not a moving average (and it's jumpy).
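For comparison, a true sliding time-window moving average produces one smoothed value per point, averaging all points in the window ending at that point. A minimal sketch (names and exact window semantics are assumptions, not Chronix's implementation):

```java
public class TimeWindowMovingAverage {
    // For each point i, average all values whose timestamps lie within
    // windowMillis of timestamps[i] (looking backwards). Input must be
    // sorted by timestamp; the arrays are parallel.
    public static double[] apply(long[] timestamps, double[] values, long windowMillis) {
        double[] result = new double[values.length];
        int start = 0;  // first index still inside the window
        double sum = 0; // running sum of values[start..i]
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            while (timestamps[i] - timestamps[start] > windowMillis) {
                sum -= values[start++];
            }
            result[i] = sum / (i - start + 1);
        }
        return result;
    }
}
```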

Key-value attributes

Hi,

I just learned about Chronix, so bear with me if I have overlooked this, but is it possible to add key-value metadata to the measurements, like host:myhost, application:myapp? Like the InfluxDB format or the format described here: https://www.elastic.co/blog/elasticsearch-as-a-time-series-data-store.

It would also be nice to have documentation about the HTTP ingestion protocol and format (if available), as well as the query API and aggregation functions.

Result never returns

The Chronix client asks for the number of time series in a first call. If the subsequent result (e.g. of an analysis) reduces the number of time series, the result is never returned.

Filter / Window Transformations

Chronix currently has aggregations and high-level analyses but no transformations like filter or window / sliding window.

Enable CORS

Without CORS enabled, the Grafana plugin won't work.
Solution: enable it by default.

Add this to the web.xml

<!-- Activates CORS for queries, e.g. from Grafana -->
<filter>
  <filter-name>cross-origin</filter-name>
  <filter-class>org.eclipse.jetty.servlets.CrossOriginFilter</filter-class>
  <init-param>
    <param-name>allowedOrigins</param-name>
    <param-value>http://localhost*</param-value>
  </init-param>
  <init-param>
    <param-name>allowedMethods</param-name>
    <param-value>GET,POST,DELETE,PUT,HEAD,OPTIONS</param-value>
  </init-param>
  <init-param>
    <param-name>allowedHeaders</param-name>
    <param-value>origin, content-type, cache-control, accept, options, authorization, x-requested-with</param-value>
  </init-param>
  <init-param>
    <param-name>supportsCredentials</param-name>
    <param-value>true</param-value>
  </init-param>
  <init-param>
    <param-name>chainPreflight</param-name>
    <param-value>false</param-value>
  </init-param>
</filter>

<filter-mapping>
  <filter-name>cross-origin</filter-name>
  <url-pattern>/*</url-pattern>
</filter-mapping>

Explicit option to enable data return.

Currently aggregations do not return the data field, but high-level analyses do.
We should provide an option so the user can decide whether the data is needed.

The default is that no data is returned for all types of analyses (aggregations / high-level analyses). With the option fl=data enabled, Chronix returns the raw data.

Pass multiple analyses to Chronix

q=host:xyz&fq=aggregation=min,max, ...

The results of the aggregations and analyses are added to the resulting document:

start: A
end: B
data:[...]
min:X
max:Y
...

Hence one can request several values at once.

Frequency detection documentation is confusing.

I just had a look at the frequency detection code to find out what its purpose is.
Reading the documentation is not very enlightening: "Detects if a point occurs multiple times within a defined time range".

Reading the code doesn't really help either:

  • It takes multiple time series as arguments, but only looks at the first one.
  • It uses the List<Long> currentWindow as a counter (just the size, contents irrelevant).
  • It seems to subdivide a timeseries into chunks just smaller than windowSize minutes in duration and returns true if a chunk has at least windowThreshold more observations than its predecessor.

New Aggregation: Range

Absolute difference between the minimum and the maximum:

min = -100
max = 200
range => 300
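A sketch of the aggregation (hypothetical names):

```java
public class RangeAggregation {
    // Range of a time series: max(values) - min(values).
    // E.g. min = -100, max = 200 => range = 300.
    public static double apply(double[] values) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : values) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        return max - min;
    }
}
```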

Data as JSON even for transformations

Currently the dataAsJson functionality only works for range queries without any functions (aggregations, transformations, analyses). We should provide this feature also for queries that include functions.

Time Series Vectorization

Implement a transformation that does a server-side time series vectorization.
This is useful in many cases, e.g. for data reduction on the client side.

Could be something like that:

transform=vector:points,threshold
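What the transform would do exactly is open; one simple, hypothetical strategy (not necessarily what Chronix would implement) is to drop points whose value stays within the threshold of the last kept point:

```java
import java.util.*;

public class SimpleVectorization {
    // Reduce a series to the indices of "significant" points: the first and
    // last point are always kept; an inner point is kept only if its value
    // deviates from the last kept point by more than the threshold.
    public static List<Integer> reduce(double[] values, double threshold) {
        List<Integer> kept = new ArrayList<>();
        if (values.length == 0) {
            return kept;
        }
        kept.add(0);
        double lastKept = values[0];
        for (int i = 1; i < values.length - 1; i++) {
            if (Math.abs(values[i] - lastKept) > threshold) {
                kept.add(i);
                lastKept = values[i];
            }
        }
        if (values.length > 1) {
            kept.add(values.length - 1);
        }
        return kept;
    }
}
```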

Data-mining or real-time?

Asked in Gitter...

@FlorianLautenschlager I have a question about Chronix. Maybe about chronix-storage in particular...

It seems like Chronix is designed more for data mining than real-time use, is that correct?

I ask because it seems that a time series should only be added once a sufficient number of data points has been collected.

For example, in order to benefit from the compression it seems that "chunks" of data points need to be accumulated before adding the total series to Solr. If this is true, the "recent" values would not be available for query. Correct?

Or can I collect a set of metrics every 5 seconds, and add them through the storage service, whereby they can be queried? Does something underlying in Chronix "merge" them in some way into a document of "significant size" over time to achieve better compression and query performance?

My concern is that we are building a monitoring system with thousands (or tens of thousands) of disparate metrics collected every 5 seconds, but for any given host/metric pair there would only be 12 per minute -- but they need to be available "immediately" for query to display on real-time dashboards.

Small chunk compaction

Chronix's performance is best when the chunk size is ideal (1024 kbyte, uncompressed). But in live monitoring we need small chunks (short time ranges) << 1024 kbyte, so query and storage performance drops.

Feature:

  • Periodically check whether Chronix has records with small chunks. If so, group these records and build larger chunks.

Prometheus Integration

Build an integration between Prometheus and Chronix to read data out of Prometheus into Chronix, so that Chronix can be used as long-term storage.

Moving Average based on a fixed size of samples

We currently only provide a moving average with a time window. Hence the number of points within a window varies. Therefore we should also provide a moving-average transformation over a fixed number of points.
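A sketch of the sample-count variant (hypothetical names; `n` is the window size in points):

```java
public class SampleCountMovingAverage {
    // Moving average over the previous n samples (including the current one);
    // the first n-1 results average over the points seen so far.
    public static double[] apply(double[] values, int n) {
        double[] result = new double[values.length];
        double sum = 0; // running sum of the last at most n values
        for (int i = 0; i < values.length; i++) {
            sum += values[i];
            if (i >= n) {
                sum -= values[i - n]; // drop the value that left the window
            }
            result[i] = sum / Math.min(i + 1, n);
        }
        return result;
    }
}
```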

Bug when joining fields

If a field used for joining records is not defined in the requested fields, the join key contains "null" values leading to wrong joins.
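A minimal illustration of the failure mode (names are hypothetical, not the actual Chronix code): if a join field was not requested, the record map has no value for it, so every record contributes the string "null" to its key and unrelated records end up joined together.

```java
import java.util.*;

public class JoinKeyExample {
    // Build a join key by concatenating the values of the join fields.
    // record.get(field) returns null for fields missing from the requested
    // field list, which degenerates the key to "null-..." for all records.
    public static String joinKey(Map<String, Object> record, String... joinFields) {
        StringBuilder key = new StringBuilder();
        for (String field : joinFields) {
            key.append(record.get(field)).append('-');
        }
        return key.toString();
    }
}
```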

Allow server-side response compression

Add this to the jetty-gzip.xml in chronix-X.X/chronix-solr-X.X.X/server/etc

<?xml version="1.0"?>
<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_3.dtd">

<!-- =============================================================== -->
<!-- Mixin the GZIP Handler                                          -->
<!-- This applies the GZIP Handler to the entire server              -->
<!-- If a GZIP handler is required for an individual context, then   -->
<!-- use a context XML (see test.xml example in distribution)        -->
<!-- =============================================================== -->

<Configure id="Server" class="org.eclipse.jetty.server.Server">
  <Call name="insertHandler">
    <Arg>
      <New id="GzipHandler" class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
        <Set name="minGzipSize"><Property name="jetty.gzip.minGzipSize" deprecated="gzip.minGzipSize" default="0"/></Set>
        <Set name="checkGzExists"><Property name="jetty.gzip.checkGzExists" deprecated="gzip.checkGzExists" default="false"/></Set>
        <Set name="compressionLevel"><Property name="jetty.gzip.compressionLevel" deprecated="gzip.compressionLevel" default="1"/></Set>
        <Set name="excludedAgentPatterns">
          <Array type="String">
            <Item><Property name="jetty.gzip.excludedUserAgent" deprecated="gzip.excludedUserAgent" default=".*MSIE.6\.0.*"/></Item>
          </Array>
        </Set>
        <Set name="includedMethods">
          <Array type="String">
            <Item>GET</Item>
          </Array>
        </Set>
      </New>
    </Arg>
  </Call>
</Configure>

And the following snippet to chronix-X.X/chronix-solr-X.X.X/server/contexts/solr-jetty-context.xml

<!DOCTYPE Configure PUBLIC "-//Jetty//Configure//EN" "http://www.eclipse.org/jetty/configure_9_0.dtd">
<Configure class="org.eclipse.jetty.webapp.WebAppContext">
  <Set name="contextPath"><Property name="hostContext" default="/solr"/></Set>
  <Set name="war"><Property name="jetty.base"/>/solr-webapp/webapp</Set>
  <Set name="defaultsDescriptor"><Property name="jetty.base"/>/etc/webdefault.xml</Set>
  <Set name="extractWAR">false</Set>

  <!-- Enable gzip compression -->
  <Set name="gzipHandler">
    <New class="org.eclipse.jetty.server.handler.gzip.GzipHandler">
      <Set name="minGzipSize">2048</Set>      
    </New>
  </Set>
</Configure>

Add the gzip.mod to chronix-X.X/chronix-solr-X.X.X/server/modules

When querying with fq, query handler raises an exception

Steps to reproduce:

  • Install chronix-solr-6.0.1
  • solr start
  • Go to http://localhost:8983/solr/#/chronix/query
  • q = metric:Load
  • fq = anything you want

The Solr UI shows:

{
  "responseHeader": {
    "status": 400,
    "QTime": 4
  },
  "error": {
    "metadata": [
      "error-class", "org.apache.solr.common.SolrException",
      "root-error-class", "org.apache.solr.common.SolrException"
    ],
    "msg": "no field name specified in query and no default specified via 'df' param",
    "code": 400
  }
}

solr.log shows:
2016-06-07 21:04:59.630 ERROR (qtp110456297-17) [ x:chronix] o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: no field name specified in query and no default specified via 'df' param
at org.apache.solr.parser.SolrQueryParserBase.checkNullField(SolrQueryParserBase.java:700)
