kapacitor's Introduction

Kapacitor

Open source framework for processing, monitoring, and alerting on time series data

Installation

Kapacitor has two binaries:

  • kapacitor – a CLI program for calling the Kapacitor API.
  • kapacitord – the Kapacitor server daemon.

You can either download the binaries directly from the downloads page or go get them:

go get github.com/influxdata/kapacitor/cmd/kapacitor
go get github.com/influxdata/kapacitor/cmd/kapacitord

Configuration

An example configuration file can be found here

Kapacitor can also provide an example config for you using this command:

kapacitord config

Getting Started

This README gives you a high-level overview of what Kapacitor is, what it's like to use, and some details of how it works. To get started using Kapacitor, see this guide. After you finish the getting-started exercise, you can check out the TICKscripts for different Telegraf plugins.

Basic Example

Kapacitor uses a DSL named TICKscript to define tasks.

A simple TICKscript that alerts on high cpu usage looks like this:

stream
    |from()
        .measurement('cpu_usage_idle')
        .groupBy('host')
    |window()
        .period(1m)
        .every(1m)
    |mean('value')
    |eval(lambda: 100.0 - "mean")
        .as('used')
    |alert()
        .message('{{ .Level}}: {{ .Name }}/{{ index .Tags "host" }} has high cpu usage: {{ index .Fields "used" }}')
        .warn(lambda: "used" > 70.0)
        .crit(lambda: "used" > 85.0)

        // Send the alert to the handler of your choice.

        // Slack
        .slack()
        .channel('#alerts')

        // VictorOps
        .victorOps()
        .routingKey('team_rocket')

        // PagerDuty
        .pagerDuty()

Place the above script into a file named cpu_alert.tick, then run these commands to start the task:

# Define the task (assumes cpu data is in db 'telegraf')
kapacitor define \
    cpu_alert \
    -type stream \
    -dbrp telegraf.default \
    -tick ./cpu_alert.tick
# Start the task
kapacitor enable cpu_alert

kapacitor's People

Contributors

aanthony1243, alespour, bbczeuz, bednar, bnpfeife, codyshepherd, danxmoran, davidby-influx, dependabot[bot], desa, docmerlin, elohmeier, exabrial, goller, gpestana, jdstrand, jonseymour, jsternberg, karel-rehor, lesam, mattnotmitt, nathanielc, onlynone, rhajek, rossmcdonald, sputnik13, sranka, timhallinflux, yosiat, zabullet


kapacitor's Issues

Add OpsGenie as notifications system

Please add output integration with OpsGenie.
If you need any help with their documentation or best practices for the API calls, just let me know.

D.
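A handler for this could follow the same chaining pattern as the existing alert integrations in the README; a hypothetical sketch (the opsGenie method and its teams property are assumptions about a possible API, not something that exists yet):

```
stream
    |from()
        .measurement('cpu_usage_idle')
    |alert()
        .crit(lambda: "value" < 30.0)
        // Hypothetical OpsGenie handler, mirroring .slack()/.victorOps()
        .opsGenie()
        .teams('team_rocket')
```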

Fix InfluxDBOut tags

Error: invalid task: error calling func "tag" on obj *pipeline.InfluxDBOutNode: assignment to entry in nil map

Make the difference between 'as' and 'rename' clear

When joining two streams there are a few names that need to be renamed.

  1. The name of the stream itself. Since each point has a name the new joined data point needs a name. The default is to use the name of the left data point. You can also explicitly name the new point using the rename method.
  2. The names of the fields could be in conflict so the as method on a JoinNode renames each field name in the points by prefixing them with the respective name passed to as.
    var errors = stream.fork().from("errors")
    var requests = stream.fork().from("requests")
    // Join the errors and requests stream
    errors.join(requests)
             //prefix field names
            .as("errors", "requests")
             // rename stream
            .rename("error_rate")
        .apply(expr("rate", "errors.value / requests.value"))

This is confusing so we should try to make it clear.

Maybe we can rename the as method to fieldPrefix so it's less like an alias in SQL and more specific to what is actually happening.

At the very least we should add better docs around this difference between the methods.
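To illustrate the difference using the pipe syntax from the README above, here is a rough sketch (measurement and field names are illustrative, and streamName stands in for the rename discussed above — treat the exact method names as assumptions):

```
// .as() prefixes FIELD names; streamName/rename names the joined POINT itself.
var errors = stream
    |from()
        .measurement('errors')
var requests = stream
    |from()
        .measurement('requests')

errors
    |join(requests)
        // fields become "errors.value" and "requests.value"
        .as('errors', 'requests')
        // the joined stream's point name becomes "error_rate"
        .streamName('error_rate')
    |eval(lambda: "errors.value" / "requests.value")
        .as('rate')
```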

Make TICKscripts scoped

For future authorization reasons a TICKscript should be scoped to the data it can access.

This means changing the way stream works so that fork is not necessary.

Question: re kapacitor vs continuous queries influxdb

Hi Guys,

Awesome product announcements. I would like to know what the preferred way of aggregating data would be. I don't want to spend too much time developing continuous queries if I can achieve the same or maybe even better results with kapacitor.

Thanks for the hard work you guys have put into these products!

Alert reloads not working

Hi,
I'm trying to reload my alerts and it doesn't seem to be working.

Please see sequence below.
dev@sesh-dev1:~/alerts$ kapacitor show test_alert
Name: test_alert
Error:
Type: stream
Enabled: true
Databases Retention Policies: ["sesh"."default"]
TICKscript:
stream
// Select just the cpu_usage_idle measurement from our example database.
.from().measurement('DCVolt1')
.alert()
.crit(lambda: "value" < 30 )
// Whenever we get an alert write it to a file.
.log('/tmp/alerts.log')

DOT:
digraph test_alert {
stream0 -> stream1;
stream1 -> alert2;
}

dev@sesh-dev1:~/alerts$ cat test_alert.tick
stream
// Select just the cpu_usage_idle measurement from our example database.
.from().measurement('DCVolt1')
.as("Battery_Voltage")
.alert()
.crit(lambda: "value" < 30 )
// Whenever we get an alert write it to a file.
.log('/tmp/alerts.log')
.slack() //added slack
.email() //added email

dev@sesh-dev1:~/alerts$ kapacitor reload test_alert

dev@sesh-dev1:~/alerts$ kapacitor show test_alert
Name: test_alert
Error:
Type: stream
Enabled: true
Databases Retention Policies: ["sesh"."default"]
TICKscript:
stream
// Select just the cpu_usage_idle measurement from our example database.
.from().measurement('DCVolt1')
.alert()
.crit(lambda: "value" < 30 )
// Whenever we get an alert write it to a file.
.log('/tmp/alerts.log')

DOT:
digraph test_alert {
stream0 -> stream1;
stream1 -> alert2;
}

Nothing in the logs indicating an error.

Alert Levels should be able to go to different endpoints.

Right now an AlertNode will send an alert of any level to all configured handlers.

To send them to different endpoints you can create multiple alert nodes that only send alerts for certain levels.

Maybe we should change it so that each level is a node and can have its own handlers.

Example: send INFO alerts to email; send WARNING and CRITICAL alerts to an HTTP endpoint and log them.

Currently this can be done this way:

var data = stream. ....

data
  .alert()
    .info(...)
    .email(...)

data
  .alert()
    .warn(...)
    .crit(...)
    .post(...)
    .log(...)

Would something like this make more sense?

var alertNode = stream ....
  .alert()

alertNode
     .info(...)
        .email(...)
alertNode
     .warn(...)
        .post(...)
        .log(...)

alertNode
     .crit(...)
        .post(...)
        .log(...)

The second method would be inefficient, as it has to evaluate all three alert conditions, whereas the first method can break out early if it's only a warning.

I like the way it is now, so maybe some good examples are all we need.

@pauldix Thoughts?

Bug: "kapacitor delete <task>" is not working

I think it is somehow mapped to "enable" instead of "delete" because it throws

vagrant@vagrant-ubuntu-trusty-64:~$ kapacitor delete event
Must pass at least one task name or recording ID
Usage: kapacitor enable [task name...]

        Enable and start a task running from the live data.

Kapacitor 0.2.0 (git: master 18b2061)

Subscriptions causing problems with influxdb 0.9.4.2

Hello,
I was setting up Kapacitor to play with some alerting, though I've run into a problem right at the get-go.
I'm running influxdb 0.9.4.
When I start Kapacitor I get
2015/12/02 13:14:51 Using configuration at: kapacitor.conf
run: open server: open service: error parsing query: found SUBSCRIPTIONS, expected CONTINUOUS, DATABASES, FIELD, GRANTS, MEASUREMENTS, RETENTION, SERIES, SERVERS, TAG, USERS at line 1, char 6

Looking at the documentation https://github.com/influxdb/influxdb/blob/0.9.5/influxql/INFLUXQL.md

I've tried a few subscription queries and I also get the same error. So this might belong as an error on the influxdb side. Apologies if that's the case. I was able to create continuous queries without problem though.

Oddly I don't see any mention of SUBSCRIPTIONS in this documentation https://influxdb.com/docs/v0.9/query_language/continuous_queries.html

So I'm not sure where the exact problem is. Influxdb or kapacitor.

Thanks!
Alp

Kapacitor / Bosun

As a long-time InfluxDB customer, I'm hopeful about the futures of these new tools -- Telegraf, Chronograf, Kapacitor. How does the vision / capabilities for Kapacitor differ from Bosun which is another alerting daemon from Stack Exchange? What other alerting daemons are influencing the design of Kapacitor?

Alert - Custom field

It would be good if the mail sent by the alerting system could include content like a URL pointing to a procedure.

That way we could have a mail template containing:

  • the metrics impacted by the alert
  • the current threshold that was exceeded
  • the URL of the procedure linked to this alert

Just an idea for when it's a support team receiving the alert. That way they will know which metrics are impacted and which action is required, following the procedure URL included in the alert to resolve the problem ;-)

Thanks for your feedback.

Julien.

Add Monitoring to Nodes

As the amount of traffic Kapacitor serves increases, so does the need to understand how quickly it is being processed.

We need to add a monitor service that captures stats on each of the nodes for a task and publishes them somewhere, probably an InfluxDB host.

Dependencies alerts

In some alerts you want more than one condition to hold before issuing an alert.

So the flow will be like the following,
Query -> Alert -> Check N query -> Alert \ cancel the alert

What do you think?
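Until something like this exists, one way to approximate a dependent alert is to join the two streams and check both conditions in a single lambda; a rough sketch using the README's pipe syntax (measurement and field names are made up for illustration):

```
var errors = stream
    |from()
        .measurement('errors')

var health = stream
    |from()
        .measurement('service_health')

errors
    |join(health)
        .as('errors', 'health')
    |alert()
        // Only alert when errors are high AND the dependency is unhealthy.
        .crit(lambda: "errors.value" > 100.0 AND "health.value" < 1.0)
```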

Support Sensu

We currently use sensu for all of our alerting needs and routing.

Handle Tags appropriately for selector MR functions

At first it was assumed that all tags could be filtered by the group-by dimensions. There are a few selector functions, top and bottom specifically, that can access tags not in the group-by clause. As a result we need a way to propagate both raw tag sets and group tag sets.
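For example, a task like the following groups by one dimension but asks top to carry along another tag, which is why raw per-point tag sets must survive past the group-by (a sketch only; the measurement, field, and tag names are illustrative):

```
stream
    |from()
        .measurement('requests')
        .groupBy('region')
    |window()
        .period(1m)
        .every(1m)
    // 'host' is not in the group-by dimensions, but top()
    // needs its raw per-point value, not the group tag set.
    |top(5, 'value', 'host')
```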

Be able to join dynamically on Groups

Thinking about how you could join across groups of the same measurement.

Joining across measurements is straightforward and documented.

How would you join two different groups from the same measurement? Do you even need to? I think so, and I think you may want to do it dynamically as well.

Proposed solution:

.joinOnGroups(<group_selector or star>, ...)

A group selector could just be { 'dim1' : 'value1', 'dim2': 'value2' }, etc.

The number of arguments to joinOnGroups indicates how many series are joined.

A * indicates all groups.
For example:

stream
  .from('errors')
  .groupBy('page', 'section')
  // do some work
  .joinOnGroups({'page' : 'home', 'section':'sidebar'}, *)
    .as('home', 'other')
  // do more work on the joined data.

This would join the home page sidebar section with every other page-section group, producing new groups with tags like:

{
 home.page: home,
 home.section: sidebar,
 other.page: login,
 other.section: oauth,
}

Seems like manipulating tags like we do fields with dot separators is a simple solution.

Going to think on this some more but it feels like a good start.

Anomaly Detection Algorithm

The kapacitor article mentioned that anomaly detection is in the plans. I was curious if an algorithm had been chosen yet.

I am about to work on a custom solution using [nupic htm](https://github.com/numenta/nupic) with InfluxDB, but after the kapacitor announcement (this is really awesome!) I may cancel my work depending on the status of this feature.

I'm particularly interested in how well the chosen algorithm will score on the [NAB](https://github.com/numenta/NAB)

AlertNode - history

Hello,
when I try to set history (ex: .history(10)), the task is refused by kapacitor:

invalid task: error calling func "history" on obj *pipeline.AlertNode: reflect.Set: value of type int64 is not assignable to type int
  • Debian 8 64bits on Docker
  • Go version go1.5.1
  • Kapacitor[d]: Kapacitor (git: unknown unknown) (installed with go get ...)

Thanks guys :-)

Multiple services for PagerDuty alert

Some users might want alerts to take different escalation paths (eg in Pagerduty to multiple rotations), so adding support to reference "named" alert handlers of the same type would be useful.

I was thinking I could attempt to tackle this when I get some time, or at least open a discussion on the best way to accomplish this (while staying in the style guidelines of the project).

Very cool project by the way, really like the DSL and how it is structured.

EDIT: Not sure if this already exists, haven't had a whole bunch of time to go through the code. So if it does, please ignore this request, I was basing this off configuration examples.

Custom User Functions

Right now the only functions that can be applied to the data stream are the ones available in the InfluxQL language or can be expressed via lambda expressions. We need a way to allow users to define their own functions without having to compile them into Kapacitor.

Basic Plan so far:

  • Use RPC and external processes.
  • The RPC API will have basic Collect/Emit calls to stream data in and out of the process.
  • Initial candidate for RPC is using gRPC
  • Snapshot state periodically so that processes can resume where they left off after restart.

Questions:

  • How to configure such a custom function and expose it via TICKscript. Maybe a separate lookup for custom chaining methods, based off registered custom functions.
  • Which languages do we want to support? The choice of RPC system will greatly affect this. Maybe we want a fast, robust system like gRPC and a simple JSON one for other languages.
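Whatever the RPC transport ends up being, the TICKscript side could expose a registered function as a chaining method; a purely hypothetical sketch (the @movingAvg function, its properties, and the @ invocation syntax are assumptions about a possible surface, not an existing API):

```
stream
    |from()
        .measurement('cpu_usage_idle')
    // Hypothetical user-defined function, registered with the
    // server out-of-process and invoked like any other node.
    @movingAvg()
        .field('value')
        .size(10)
    |alert()
        .crit(lambda: "value" < 30.0)
```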

Support Derivative

We need to be able to do derivatives. Since they are not MR functions they got missed.
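A derivative node would slot into the pipeline like any other transformation; a sketch of what the TICKscript side could look like (the derivative method and its unit property are an assumption about the proposed API, not a shipped one):

```
stream
    |from()
        .measurement('net_bytes_recv')
    // Proposed: rate of change of the field, normalized per second.
    |derivative('value')
        .unit(1s)
    |alert()
        .crit(lambda: "value" > 1000000.0)
```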

Add alert integrations

Right now alerts can log, post to an HTTP endpoint, and exec a command.

We need to add a few more integrations for early adoption. Namely:

  • PagerDuty
  • VictorOps
  • Slack

Add kafka as metrics consumer

It would be awesome if, instead of using InfluxDB resources (querying it or adding UDP subscriptions), Kapacitor were a more standalone solution, able to consume metrics from Kafka and analyze them as a sliding window.

The stream type is very powerful for the feature above and would complement a Kafka consumer well.
This integration may need a small DB to store the sliding-window metrics for further queries.

D.
