tattle's Introduction

Tattle (formerly Graphite-Tattle)

A self-service alerting and dashboard frontend for Graphite and Ganglia

This tool was first presented at EscConf by drowe from Wayfair.

Concepts

Checks

A Check is a Graphite or Ganglia target combined with user-defined Error and Warning thresholds.
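As a rough illustration of the concept, a Check can be thought of as a function from the latest metric value to a state. The sketch below is written in Python rather than Tattle's PHP. The names (`State`, `evaluate_check`, the `over` flag) and the inclusive `>=` comparison are assumptions made for illustration, not Tattle's actual implementation.

```python
from enum import Enum


class State(Enum):
    OK = 0
    WARNING = 1
    ERROR = 2


def evaluate_check(value, warn, error, over=True):
    """Classify a metric value against Warning and Error thresholds.

    over=True treats high values as bad; over=False treats low values as bad.
    Both the flag and the inclusive comparison are assumptions of this sketch.
    """
    if over:
        if value >= error:
            return State.ERROR
        if value >= warn:
            return State.WARNING
    else:
        if value <= error:
            return State.ERROR
        if value <= warn:
            return State.WARNING
    return State.OK
```

For example, with a Warning threshold of 30 and an Error threshold of 50, a value of 35 would be classified as `State.WARNING`.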

Subscriptions

A Subscription is a user signing up to be alerted by a plugin when the Check reaches the Error or Warning state. A user can have multiple subscriptions to an alert based on different thresholds and plugins. (Example: SMS for Error, and Email for Warning.)

Several notification plugins are available off the shelf: e-mail through SMTP, IRC through IRCcat, HipChat, PagerDuty, and PushOver. In addition, it's quite easy to write your own plugin for other notification methods.

Alerts

An Alert is the signal that the Check either passed its defined Error or Warning threshold, or has returned to the OK state after being in a bad state. The frequency of Alerts is defined by the Repeat Delay (in minutes), which can't be shorter than the interval of the processing cronjob.
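One way to picture the interaction between state changes and the Repeat Delay is the small decision function below. It is a Python sketch under stated assumptions: the function name, the string states, and the rule that any state change alerts immediately are all hypothetical and not taken from Tattle's code.

```python
def should_alert(prev_state, new_state, minutes_since_last_alert, repeat_delay):
    """Decide whether to send an alert on this processing run.

    Assumptions of this sketch: a state change always alerts (including the
    recovery back to OK), and an unchanged bad state re-alerts only once the
    Repeat Delay has elapsed.
    """
    if new_state != prev_state:
        return True
    if new_state == "OK":
        return False
    return minutes_since_last_alert >= repeat_delay
```

Since the function is only evaluated when the cronjob runs, a Repeat Delay shorter than the cron interval would have no effect, which matches the constraint described above.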

Dashboards

A Dashboard is a collection of pre-defined graphs that allows for self service creation and has a fullscreen option.

Graph

A Graph in Tattle is a combination of "lines". A graph can have one or more lines, has a display type (stacked or not), and has a weight for ordering on the dashboard.

Lines

Lines are a combination of an Alias, a graphite Target, and a color.
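To make the Line and Graph concepts concrete, the Python sketch below turns a list of lines into a Graphite render URL using Graphite's `alias()` and `color()` functions and the `/render` endpoint. The helper names (`line_target`, `graph_url`) and the dictionary layout are hypothetical; how Tattle actually builds its URLs is not shown in this document.

```python
from urllib.parse import urlencode


def line_target(alias, target, color):
    """Wrap a Graphite target so it is drawn with the given alias and color."""
    return f'color(alias({target},"{alias}"),"{color}")'


def graph_url(base_url, lines, width, height):
    """Build a Graphite render URL with one target parameter per line."""
    params = [("target", line_target(**line)) for line in lines]
    params += [("width", width), ("height", height)]
    return f"{base_url.rstrip('/')}/render?{urlencode(params)}"


lines = [{"alias": "Load", "target": "system.host_a.cpu_load", "color": "red"}]
print(graph_url("http://graph", lines, width=300, height=500))
```

`urlencode` percent-encodes the quotes and parentheses in each target, which matters for targets that contain function calls.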

Screen Shots

Installation Requirements

  • PHP
  • MySQL
  • Lighttpd / NGINX / Apache
  • Flourishlib for the PHP framework
  • Bootstrap for the HTML/CSS framework
  • HTTP access to a Graphite or Ganglia installation
  • php-curl for some alerting modules

Installation and Configuration

  • Import .sql file to create database and tables

  • Create a session storage folder for flourishlib

  • Create a file called inc/config.override.php so that upgrades don't overwrite your config. See inc/config.php for the other available settings:

    <?php
    $GLOBALS['DATABASE_HOST'] = '127.0.0.1';
    $GLOBALS['DATABASE_NAME'] = 'tattle';
    $GLOBALS['DATABASE_USER'] = 'dbuser';
    $GLOBALS['DATABASE_PASS'] = 'dbpass';
    $GLOBALS['GRAPHITE_URL'] = 'http://graph';
    ?>
    
  • Create a logs folder which is writable by your webserver user

  • Set up a cronjob to run processor.php. In theory, this script can be run either from the CLI or through the web server. Although the CLI may be required depending on your plugins and the permissions they need, running it through the web server has been reported to be easier to configure, e.g. with a crontab entry such as:

    * * * * * curl 127.0.0.1/Graphite-Tattle/processor.php
    
  • Register via the web interface. (Currently, the first user registered is the admin, until roles and other permissions are implemented.)

If you are on EL6 or a recent Fedora, make sure the short_open_tag = off line in your php.ini is commented out, or you will get bogus output.

Dashboard Cleanurls

If you use Apache with mod_rewrite enabled and .htaccess files allowed, you can try the new clean dashboard URLs. Initial URLs look like this:

http://localhost/dash/1/500/300

The second parameter is the id of the dashboard you want to see. The third parameter is the height of the individual graphs. The fourth parameter is the width of the individual graphs.
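The parameter order can be illustrated by splitting the example path into its parts. This is a Python sketch only; `parse_dash_url` is a hypothetical helper, and the real routing is done by the rewrite rules shipped with Tattle, which are not reproduced here.

```python
def parse_dash_url(path):
    """Split a clean dashboard path into dashboard id, graph height and width."""
    parts = path.strip("/").split("/")
    if len(parts) != 4 or parts[0] != "dash":
        raise ValueError(f"not a clean dashboard path: {path!r}")
    dashboard_id, height, width = (int(part) for part in parts[1:])
    return {"dashboard_id": dashboard_id, "height": height, "width": width}


print(parse_dash_url("/dash/1/500/300"))
# {'dashboard_id': 1, 'height': 500, 'width': 300}
```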

HTTP Auth Based User Accounts

If you are already using Web Server based authentication, then you can tell Tattle to use those credentials instead of keeping two sets of user accounts.

Just set the following config variable to true in your config.override.php file:

$GLOBALS['ALLOW_HTTP_AUTH'] = true;

Reason for creation

StatsD, from the team over at Etsy, added a simple Dev- and Ops-friendly way to send metrics to Graphite. Graphite makes graphing metrics and data self-serve and simple for anyone.

With this tag team in our environment, alerting seemed to be the weakest link from an ad hoc/self-service perspective, which is where the idea for Tattle came from.

Caution!

This project is still in an Alpha status and not feature complete or ready for full production use yet. Any help smoothing out the edges and adding additional features / functions would be greatly appreciated!

If you're having strange SQL issues, make sure you are using the most recent schema

How to Contribute

You're interested in contributing to Tattle? Sweet!

Fork Tattle from here: http://github.com/Graphite-Tattle/Tattle

  1. Clone your fork
  2. Hack it up
  3. Push the branch up to GitHub
  4. Send a pull request to the Graphite-Tattle/Tattle project.

We'll do our best to get your changes in as soon as possible!

Contributors

In lieu of a list of contributors, check out the commit history for the project: https://github.com/Graphite-Tattle/Tattle/graphs/contributors. A special shout-out goes to jpatapoff, who helped a lot but whose commits weren't attributed because they were merged manually, and to f80, who made all his commits with an email address that is no longer associated with his account.

tattle's People

Contributors

ajablonski, barshow, chrisdotm, draco2003, drowe-wayfair, ehammerv, g76r, krisbuytaert, lqybill, matthew-lucidchart, mreeves1, ssandler, trbs, twinforces, zeebonk, zonywhoop

tattle's Issues

Error while trying to run

I get this error when I try to access the index.php webpage.

PHP Fatal error: Call to undefined function plugin_listener() in /var/www/Graphite-Tattle/plugins/irccat_plugin.php on line 3

Special handling for nulls at the very end of a metric.

When a metric breaches one of the thresholds, an alert is triggered. When a metric that has been triggered falls back below the threshold, an "all-clear" message is sent out.

I have seen these "all-clear" messages sent out when they should not have been, simply due to the fact that the stats had not yet been published for that metric.

When this happens, the JSON will contain a Null value as the very last entry.
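A possible workaround is to skip trailing nulls before evaluating the check. The Python sketch below assumes the standard shape of Graphite's `format=json` output, where each datapoint is a `[value, timestamp]` pair. The helper name `last_known_value` and the sample payload are made up for illustration.

```python
import json


def last_known_value(datapoints):
    """Return the most recent non-null value, or None if every value is null."""
    for value, _timestamp in reversed(datapoints):
        if value is not None:
            return value
    return None


# Sample payload shaped like Graphite's format=json output (values are made up).
payload = json.loads(
    '[{"target": "stats.errors", "datapoints": [[12.0, 100], [14.0, 110], [null, 120]]}]'
)
print(last_known_value(payload[0]["datapoints"]))  # prints 14.0
```

With this approach, a not-yet-published final sample would no longer look like a recovery, so no premature "all-clear" would be sent.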

Add screen shots to README

I'm about to start using Graphite. After skimming through their docs in search of dashboards, I opened about a dozen tabs. I don't want to install and configure all these apps just to see what they look like. Please add screen shots to the README.

Unsecured SMTP

The plugin can only connect using SMTP over SSL.

Alerts should be triggering, but are not.

Tried to do some debugging on my own, but could not get Flourish to respect the $debug = true parameter. Right now we have Tattle up and running, but none of the alert thresholds are working even though they are 1,000% higher than my error value.

This is what I see on the alerts page.

There are currently no Alerts based on your subscriptions. Smile, looks like everything is happy!

Here is an image of the graph.

http://imgur.com/5ajxr

Any advice to help debug this would be appreciated.

How to handle graphite multi-valued metrics

Graphite allows you to ask for something like 'system.*.cpu_load' and get back an array of data for everything that matches that pattern: you get back a JSON array for 'system.host_a.cpu_load', 'system.host_b.cpu_load', etc.

For our use, we'd like to be able to get alerts for any of those wildcarded metrics going over the limit, e.g. "Any system load over 3" or whatnot. Right now, Tattle accepts that as a metric, but only does notification for the first metric that happens to be reported (Check::getResultValue explicitly looks at $data[0]).

Setting aside the future issue of things like "I want to be notified for every system's CPU load over 3, except big_host, where I don't care until it reaches 10, or test_host, where I don't care at all", I'd like to figure out the sanest way of handling this. Has anyone else thought of this, or wanted this? I've got a few ideas on how to handle the simple case, but I wanted to check with other folks before going too far down that road.
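For the simple case, one option is to evaluate every returned series instead of only the first one. The following Python sketch assumes Graphite's JSON render format (a list of objects with `target` and `datapoints`). `breaching_series` is a hypothetical helper, not an existing Tattle method.

```python
def breaching_series(series_list, threshold):
    """Map each target whose latest non-null value exceeds threshold to that value."""
    breaches = {}
    for series in series_list:
        values = [v for v, _ts in series["datapoints"] if v is not None]
        if values and values[-1] > threshold:
            breaches[series["target"]] = values[-1]
    return breaches


series_list = [
    {"target": "system.host_a.cpu_load", "datapoints": [[1.0, 1], [4.2, 2]]},
    {"target": "system.host_b.cpu_load", "datapoints": [[0.5, 1], [None, 2]]},
]
print(breaching_series(series_list, 3))  # {'system.host_a.cpu_load': 4.2}
```

Per-host overrides, such as a higher limit for big_host, could later be layered on top by looking up the threshold per target.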

Support other schemas than InnoDB

Just pulled the newest source, ran the sql file. Got the following error when I clicked the register button:

Uncaught fProgrammerException
-----------------------------
{doc_root}/tattle/user.php(41): fActiveRecord->__construct()
{doc_root}/tattle/inc/flourish/fActiveRecord.php(1115): User->configure()
{doc_root}/tattle/inc/classes/User.php(6): fORMRelated::setOrderBys(Object(User), 'Subscription', Array)
{doc_root}/tattle/inc/flourish/fORMRelated.php(1343): fORMSchema::getRouteName(Object(fSchema), 'users', 'subscriptions', NULL, '*-to-many')
{doc_root}/tattle/inc/flourish/fORMSchema.php(181)
The table users is not in a *-to-many relationship with the table subscriptions

My DB is forced to use MyISAM. I ran a previous build a few days ago, on MyISAM (different server) and that seemed to work just fine.

Is this a DB issue or a code issue related to registration?

EDIT:

Looking at the Flourish docs, it seems like it wants InnoDB regardless, but will function on MyISAM... but sounds like the programmer has to do more work. I'm very unfamiliar with Flourish.

Running Tattle

Hi, I'm a new user and I'm having trouble bringing up Graphite-Tattle with Nginx. Any chance you could post an example configuration showing how to accomplish that?

Graph delete button doesn't work

On the dashboard edit page, there's a "delete" button for each graph. Clicking this brings you to a blank page and the graph doesn't actually get deleted.

It is possible to set someone's password to an empty string.

Even though the password box has an asterisk, indicating that it is required, it is possible to set someone's password to an empty string.

I was just trying to change their email address, and did not know their password.

I would recommend not updating their password when it is left blank.
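The recommended behaviour could look roughly like the Python sketch below. Everything in it is hypothetical: the field names, the `apply_user_update` helper, and the SHA-256 stand-in hash (a real application should use a dedicated password hashing scheme) are assumptions for illustration only.

```python
import hashlib


def _stand_in_hash(password):
    # Placeholder only: a real application should use a dedicated password hash.
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def apply_user_update(user, form, hasher=_stand_in_hash):
    """Return a copy of user with form changes applied.

    A blank password field leaves the stored password hash unchanged.
    """
    updated = dict(user)
    if form.get("email"):
        updated["email"] = form["email"]
    new_password = form.get("password", "")
    if new_password:
        updated["password_hash"] = hasher(new_password)
    return updated
```

An admin who only changes the email address and leaves the password field empty would then keep the user's existing password intact.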

Introducing a way to reorder graphs

We want to add a way to reorder the graphs in a dashboard. When Tattle gets all the graphs, they are already ordered by "weight" ascending. But when two graphs have the same weight, those two are ordered by "graph_id" ascending. That leads to my point: what is the use of the "Weight" column for a graph? And is this column really useful?

We want to add up and down arrows to the graphs list, which are more user-friendly. We may also remove the select box in the "add_edit_graph.php" file and rename the "weight" column to "order" or another word that is more explicit than "weight" for this use.

Does anybody use this column for anything other than sorting the graphs?
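The ordering described above, and an up arrow built on top of it, can be sketched in Python as follows. The dictionary keys mirror the "weight" and "graph_id" columns mentioned in this issue. The helpers `ordered` and `move_up` and the renumbering strategy are assumptions, not existing Tattle code.

```python
def ordered(graphs):
    """Sort graphs by weight, breaking ties by graph_id (both ascending)."""
    return sorted(graphs, key=lambda g: (g["weight"], g["graph_id"]))


def move_up(graphs, graph_id):
    """Move one graph a position earlier and renumber weights as 0, 1, 2, ..."""
    ranked = ordered(graphs)
    index = [g["graph_id"] for g in ranked].index(graph_id)
    if index > 0:
        ranked[index - 1], ranked[index] = ranked[index], ranked[index - 1]
    return [{**g, "weight": position} for position, g in enumerate(ranked)]


graphs = [
    {"graph_id": 3, "weight": 0},
    {"graph_id": 1, "weight": 0},
    {"graph_id": 2, "weight": 5},
]
print([g["graph_id"] for g in ordered(graphs)])     # [1, 3, 2]
print([g["graph_id"] for g in move_up(graphs, 2)])  # [1, 2, 3]
```

Renumbering after each move keeps weights unique, so ties on weight no longer fall back to graph_id.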

Best practice to implement processor.php cron job?

I originally tried running processor.php in root's cron job like so:

    */5 * * * * php /var/www/html/Graphite-Tattle/processor.php

but I got path errors.

So I made a change in /inc/init.php because getcwd will not always set the root correctly unless you are currently in the Graphite-Tattle directory. I replaced

    define('TATTLE_ROOT', str_replace(array('ajax'),'',getcwd()));

with:

    define('TATTLE_ROOT', str_replace(array('ajax'),'',dirname($_SERVER['SCRIPT_FILENAME'])));

One caveat is I am using the f80 fork of Graphite-Tattle so the next bit may not be relevant.

I tried running the cron so it appeared to be done by the apache user and it seems to succeed (writes to log, sets alerts, etc.).

    */5 * * * * su -s /bin/sh -c "php /var/www/html/Graphite-Tattle/processor.php -debug >> /var/www/html/Graphite-Tattle/logs/debug.log 2>> /var/www/html/Graphite-Tattle/logs/errors.log" apache

But for some reason I would not receive email notifications. I checked logs, etc. and everything seems fine.

However, I had noticed if I hit processor.php from the web client it worked. So I decided to try this:

    */5 * * * * curl 127.0.0.1/Graphite-Tattle/processor.php?debug=true >> /var/www/html/Graphite-Tattle/logs/debug.log

which works.

I just wanted to see if anyone had any better advice on how to run processor.php. I was thinking I may add in my curl command crontab to the README as a pull request if you like.

Todo : Add JSON based graph and line details

Might make sense to move to a json or other formatted database column for storing the graph details vs. having to create a new column per tweak as we constantly add new features.

Add better null handling to checks

Currently if you have a metric that is reported only semi-frequently (i.e. a job that runs every 1 or 2 minutes) you'll get a lot of nulls back from graphite which throw off the averages and medians. You can wrap the target in a keepLastValue() function but it can still end up with nulls at the beginning of the results. We should probably ignore nulls, but we need to handle the case of all results being nulls.

It'd also be nice to have an option to explicitly alert on all the values being null (i.e. if my job that sends data to graphite is broken, alert on the fact that graphite is not getting any data).
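Both ideas (ignoring nulls, and treating an all-null result as its own condition) can be sketched as below. This is Python for illustration, assuming Graphite's `[value, timestamp]` datapoint pairs. The function name `average_ignoring_nulls` and the use of `None` as the "no data" signal are assumptions.

```python
def average_ignoring_nulls(datapoints):
    """Average the non-null values; return None when there is no data at all."""
    values = [v for v, _ts in datapoints if v is not None]
    if not values:
        return None  # caller can treat this as a separate "no data" state
    return sum(values) / len(values)


print(average_ignoring_nulls([[None, 60], [2.0, 120], [None, 180], [4.0, 240]]))  # 3.0
print(average_ignoring_nulls([[None, 60], [None, 120]]))                          # None
```

A check could then offer an option to alert when the result is `None`, which would cover the case of a broken job that has stopped sending data.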

Add ability for a check to only occur during certain times of the day.

This could be a nice feature to have. My team has some checks that we usually shut off overnight, or that need different thresholds overnight when batch jobs are running. Schedule-based checks would allow for either of those (in the latter case you'd need two separate checks, but it's probably the simplest method).
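A schedule test for such a feature might look like the Python sketch below. The `is_active` helper and the half-open window convention (start included, end excluded) are assumptions. The only subtle part is a window that wraps past midnight.

```python
from datetime import time


def is_active(now, start, end):
    """Return True if now falls inside the window [start, end).

    A window whose start is later than its end is treated as wrapping midnight.
    """
    if start <= end:
        return start <= now < end
    return now >= start or now < end


print(is_active(time(23, 0), time(22, 0), time(6, 0)))  # True (overnight window)
print(is_active(time(12, 0), time(22, 0), time(6, 0)))  # False
```

Two checks on the same target with complementary windows would give different daytime and overnight thresholds.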

How to integrate a hosted BML (Bill Me Later) page

Hi all,

My requirement is to integrate the hosted BML (Bill Me Later) page into my existing e-commerce project.

  1. We currently use the merchant site only for BML.
  2. Instead of this page, I want to integrate the BML hosted page, including user data such as address info.
  3. How do I link from the merchant page to the BML secure hosted page? (We currently use a "click here" link that goes to the merchant page; that is exactly where I want to go to the hosted BML secure page instead.)
  4. I have seen a Hosted Order Page on the CyberSource website at the link below, but I did not understand much of it.
    http://www.cybersource.com/developers/develop/integration_methods/hosted_order_page/

Please help and share your thoughts.

Thank you all

processor.php is alerting with status '0.0'

My alert tests for a metric to be above 30 for warning status and above 50 for error status. It's consistently between 0 and ~15 most of the time, but I get an alert every valid interval with status '0.0'.

Todo : Add yAxis Max to Graph

Some metrics have very high spikes which skew the graphs. It would be nice to have a yAxis Max at the graph level to fix this.

Graphite uses the following URL parameter to set this:
&yMax=

targets with " won't alert

This happens when you use a target with alias(foo.target,"bar").

prepareTarget turns this into alias(foo.target,&quot;bar&quot;). That's fine for HTML but not for a URL.

The solution is probably to have an "encodeTarget" method that uses urlencode.
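The difference between the two kinds of encoding can be shown with Python's standard library, used here as a stand-in for the PHP functions involved. `encode_target` is a hypothetical name borrowed from the suggestion above.

```python
from html import escape
from urllib.parse import quote


def encode_target(target):
    """Percent-encode a Graphite target for use as a URL query value."""
    return quote(target, safe="")


target = 'alias(foo.target,"bar")'
print(escape(target))         # alias(foo.target,&quot;bar&quot;)
print(encode_target(target))  # alias%28foo.target%2C%22bar%22%29
```

The HTML-escaped form is only meaningful inside markup. In a URL, the `&` of `&quot;` starts a new query parameter and truncates the target.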

time offset for checks

Please add a time-based offset for the values that are requested from Graphite.
Use case:
We calculate different business KPIs/metrics in Hadoop at intervals of 3, 6, 12 and 24 hours,
so we can't use checks based on the last x values or minutes.

Todo : Add time period for graph

Add the ability to set the timeframe of the graph when creating it.

Example url format:
&from=-2hours

inc/Classes/Graph.php : drawGraph()

false positive threshold alerts with complex targets

I often receive false positive threshold alerts when the alert target contains a complex formula such as diffSeries(sum(foo.bar.*.baz),sum(foo.bar.*.foo)). This is probably because the alert is evaluated on the last non-null value, whereas not all metrics are recorded at the same second within Graphite's sample interval. For example, if many foo.bar.*.foo metrics are recorded the second before the current minute, the diffSeries result will be higher than expected.

Todo : Add Database update method

There are a few changes we need to make and each one is most likely going to require a database schema update.

Figure out a good way to iteratively update the database without borking any data.
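One common approach is a version table plus an ordered list of migrations that are applied at most once. The Python sketch below uses the standard `sqlite3` module with an in-memory database so that it is runnable on its own. Tattle itself uses MySQL, and the table names, columns, and statements here are invented for illustration.

```python
import sqlite3

# Hypothetical migrations: (version, SQL statement), applied in order.
MIGRATIONS = [
    (1, "CREATE TABLE graphs (graph_id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE graphs ADD COLUMN y_max REAL"),
]


def migrate(conn):
    """Apply every migration newer than the recorded schema version."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    current = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()[0] or 0
    for version, statement in MIGRATIONS:
        if version > current:
            with conn:  # commit after each successfully applied step
                conn.execute(statement)
                conn.execute("INSERT INTO schema_version (version) VALUES (?)", (version,))
            current = version
    return current


conn = sqlite3.connect(":memory:")
print(migrate(conn))  # 2
print(migrate(conn))  # 2 (second run applies nothing)
```

Because existing rows are only altered, never recreated, re-running the updater after each upgrade should leave stored data untouched. On MySQL, schema changes are not transactional, so a failed step would need manual cleanup.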

The movingAverage function operates on samples, not minutes.

The graphite documentation specifies that the movingAverage function operates on samples, not minutes. However, on a 24-hour graph, Graphite condenses samples into 1-minute buckets, leading to the confusion.

For me, this means that, if I choose "3", Graphite averages the last 30 seconds (since my samples are kept at the default resolution of 10 seconds).

This is confusing, since the UI mentions "minutes", which is only true for the 24-hour graph; whereas the actual polling of the data operates on a much smaller timescale and performs the moving average on a sample-by-sample basis.

See:
http://graphite.readthedocs.org/en/1.0/functions.html
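The arithmetic behind the confusion is simply the number of samples multiplied by the resolution of the data being averaged. A tiny Python sketch, with a hypothetical helper name:

```python
def moving_average_span_seconds(samples, resolution_seconds):
    """Time span covered by a moving average over a given number of samples."""
    return samples * resolution_seconds


print(moving_average_span_seconds(3, 10))  # 30  (10-second resolution)
print(moving_average_span_seconds(3, 60))  # 180 (1-minute buckets)
```

So a setting of 3 only corresponds to three minutes when the samples are one-minute buckets.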

Remove the moving-average requirement.

We are using graphite to track an error bucket of sorts.

We need to get alerts any time this bucket has more than zero items in it.

There are several issues with Graphite-Tattle that prevent us from doing this effectively:

  1. We are unable to turn off moving averages (aside from using a single sample).
  2. We are unable to tell Graphite-Tattle to look at a shorter window than 1 day. When looking at a 1 day window, the values are averaged.
  3. We don't want separate error or warning thresholds here (but we can use 0.5 for warnings as a workaround).

Thoughts?

Table headers incorrect on edit dashboard page

The "description" column actually shows Area, and the "background color" column shows the Description, and there is no background color option to display in its place. Need to fix the ordering and naming of the table headers.

bootstrap update

I'm working on a few Tattle enhancements and would like to upgrade to the latest Bootstrap release.

Does anybody object to such a change?

I prefer to ask first because there is a pretty large version gap, and I wonder whether there is a good reason for still using v1.3.0 nowadays.

Thanks.

Brief errors are not triggering notifications.

We sometimes have errors that last only for a few seconds (CPU usage above the error threshold, for example). We would like to be notified of these issues, even if they aren't still in error when processor.php is being called.

Rather than checking the last value, we would prefer if Graphite-Tattle checked all values since the last time processor.php was run.

This is related to #33.
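The requested behaviour amounts to classifying the peak value seen since the previous run rather than the latest value. The Python sketch below assumes `[value, timestamp]` datapoints and a stored timestamp of the last run. The function name, the string states, and the "high values are bad" direction are assumptions.

```python
def worst_state_since(datapoints, last_run_ts, warn, error):
    """Classify the peak value seen after last_run_ts (high values are bad)."""
    recent = [v for v, ts in datapoints if ts > last_run_ts and v is not None]
    if not recent:
        return "NO_DATA"
    peak = max(recent)
    if peak >= error:
        return "ERROR"
    if peak >= warn:
        return "WARNING"
    return "OK"


datapoints = [[10, 100], [95, 110], [12, 120]]
print(worst_state_since(datapoints, 90, warn=50, error=90))   # ERROR
print(worst_state_since(datapoints, 110, warn=50, error=90))  # OK
```

In the first call the spike at timestamp 110 is caught even though the latest value is back to normal.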

Split the 'Samples' setting

I use Graphite to store temperatures from sensors once every 15 minutes. As I understand it, the "samples" setting is used for two purposes:

  • how many samples the moving average is based upon
  • how far back processor.php searches for data in the Graphite render request, using the "from" parameter

I think the two purposes should be considered separately. With my 15-minute retention frequency, I must set "samples" to at least 15 to be sure to find data to check. But then I'm stuck with a moving average of 15 samples, which I don't want.
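Separating the two settings could look like the Python sketch below: one value controls the `from` parameter of the render request, the other controls how many of the returned non-null samples are averaged. The helper names and parameters are hypothetical. The relative `from=-15minutes` syntax and `format=json` are standard Graphite render options.

```python
def render_params(target, lookback_minutes):
    """Query parameters for a Graphite render request covering the lookback."""
    return {"target": target, "from": f"-{lookback_minutes}minutes", "format": "json"}


def recent_average(datapoints, window):
    """Average the last `window` non-null values, or None if there are none."""
    values = [v for v, _ts in datapoints if v is not None]
    if not values:
        return None
    tail = values[-window:]
    return sum(tail) / len(tail)


print(render_params("sensors.room1.temp", 30))
print(recent_average([[None, 1], [20.0, 2], [None, 3], [22.0, 4]], window=1))  # 22.0
```

With a lookback of 30 minutes and a window of 1, a sensor reporting every 15 minutes would always be found, and only its latest reading would be checked.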
