
ramen's Introduction

What?

A stream processing language and compiler for human-scale infrastructure monitoring

"The right solution for 100X is often not optimal for X" — Jeff Dean

Why?

In recent years, thanks to large companies such as Google, Facebook, LinkedIn and Netflix, the culture and practice of modern infrastructure monitoring has vastly improved, and many good, free tools have been released publicly. Understandably, those tools focus on large distributed infrastructures.

For smaller use cases though, tools have been left where they were in the 90s, with the notable exception of Riemann. But Riemann only monitors hosts and uses Clojure as a configuration language, which in turn requires a resource-hungry JVM.

If you need an all-purpose stream processor to manipulate time series, turning inputs from sensors or network probes into alerts, but do not want to deploy Kubernetes in your three racks of hardware, or have only a couple of GiB of RAM left for monitoring, then you might want to consider Ramen.

How?

This is what an operation looks like:

DEFINE memory_alert AS
  FROM memory
  SELECT
    time, host,
    free + used + cached + buffered + slab AS total,
    (total - free) * 100 / total AS used_ratio,
    used_ratio > 50 AS firing
  GROUP BY host
  COMMIT AND KEEP ALL WHEN COALESCE (out.firing <> previous.firing, false)
  NOTIFY "http://192.168.1.1/notify?title=RAM%20is%20low%20on%20${host}&time=${time}&text=Memory%20on%20${host}%20is%20filled%20up%20to%20${used_ratio}%25";

Currently the stream processing programs are compiled into a language with automatic memory management (OCaml), so performance is not optimal. The plan is to compile down to C (or similar) in a later step.

Also, imports and exports are limited: Ramen currently accepts time series from CSV files and understands the collectd and netflow (v5) protocols. As output, it merely reaches out to alerting systems via HTTP requests.

Other than that, it is possible to “tail” the output of operations from the CLI. More protocols, for both input and output, obviously need to be added.

ramen's People

Contributors

axiles, darlentar, elrik75, rixed


ramen's Issues

Preserve event time information as long as possible

If all parents have the same event-time definition, all the fields used in that definition are also present in the child, and no specific event description is given, then propagate it.

We do not want to propagate the export flag though, so this needs to be separated from the event time description.

In.next and group.next tuples

in.next and group.next tuples would come in very handy for some non-trivial commit clauses.

If in.next is used, we would wait for that tuple before processing the input tuple.
If group.next is used, we would store the in tuple in the aggregate without further processing and wait for the next one before taking action; we would then process the last stored tuple as if it were the new one.

Depends on #360
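The group.next behaviour above amounts to a one-tuple lookahead buffer per group. A minimal sketch of that idea (illustrative Python, not Ramen's actual machinery — the `lookahead_by_group` name and signature are invented):

```python
# Hypothetical sketch of group.next semantics: each group buffers its latest
# tuple and only processes it once its successor in the same group is known.

def lookahead_by_group(tuples, key, process):
    """For each group (selected by `key`), call process(current, next) on a
    tuple only when its successor in the same group has arrived. The last
    tuple of each group stays pending, as it never sees a successor."""
    pending = {}   # group key -> last stored tuple
    results = []
    for t in tuples:
        k = key(t)
        if k in pending:
            results.append(process(pending[k], t))  # previous tuple, with lookahead
        pending[k] = t                              # wait for this tuple's successor
    return results
```

For instance, computing per-host deltas needs the next tuple of the same group, which is exactly what the buffer provides.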

default_team is broken

Instead, either broadcast the alert to all teams, or force an editable "default" flag on one team?
Broadcasting seems to invite all kinds of problems, as it adds corner cases to the data model.

Implement parameters

Many operations differ only by some constant parameters. If we generated code that obtained those values from the environment rather than hardcoding them, we could drastically reduce the number of distinct binaries in many cases.

Possible approaches:

  1. Mark such parameters in the operation text, for instance: WHERE foo=$p FOR PARAMETER p=42

  2. Define template operations (or functions) in a specific, distinct phase: DEFINE FOO_FILTER(p) AS WHERE foo=$p and then set the operation text as FOO_FILTER(42)

  3. More radically, make every immediate value a parameter.

This latter approach is appealing since it does not involve the user at all and might reuse binaries that would not otherwise have been noticed as reusable. On the other hand, it can hurt performance, as it is no longer possible to optimise for known constants.
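Approach 1 could generate code along these lines (a sketch in Python for readability; the `RAMEN_PARAM_` environment-variable convention and `get_param` helper are invented for illustration):

```python
import os

# Sketch of approach 1: the generated worker reads its constant parameters
# from the environment instead of hardcoding them, so one compiled binary can
# be reused with different settings.

def get_param(name, default, parse=int):
    """Fetch parameter `name` from the environment, falling back to the
    default given in the operation text (e.g. FOR PARAMETER p=42)."""
    raw = os.environ.get("RAMEN_PARAM_" + name)
    return default if raw is None else parse(raw)

# WHERE foo = $p FOR PARAMETER p=42 would then compile to roughly:
p = get_param("p", 42)
where = lambda tup: tup["foo"] == p
```

The supervisor would then set the environment when forking the worker, instead of compiling a fresh binary per parameter value.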

Export on demand

  1. separate export flag from event time info
  2. have export flag be changeable without recompilation (easy since we just have to edit the out_ref file)
  3. ramen should then turn it on/off when timeseries are requested, and timeout after a while
  4. a keyword to explicitly ask for a node result to be saved?

Lock out_ref files when altering them

We should not currently be modifying them simultaneously, but better safe than sorry.
Taking the lock is also a good way to check that assertion.

Also, we will need this later for a fork-based implementation anyway.
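Advisory locking with flock(2) is one simple way to do this; a minimal sketch (illustrative Python, assuming a POSIX system — the `locked` helper name is invented):

```python
import fcntl
from contextlib import contextmanager

# Sketch of advisory locking around out_ref edits using flock(2). Holding the
# exclusive lock while editing also checks, in practice, the assertion that
# nobody else edits the file concurrently.

@contextmanager
def locked(path):
    with open(path, "a+") as f:
        fcntl.flock(f, fcntl.LOCK_EX)   # blocks until we own the file
        try:
            yield f
        finally:
            fcntl.flock(f, fcntl.LOCK_UN)
```

Every reader and writer of the out_ref file would go through this guard, so concurrent edits serialize instead of corrupting the file.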

More reliable NOTIFY/EXEC

Repeat until an acknowledgment (or a 0 exit status) is received.
Repeated notifications must be stored somewhere other than the ringbuf and persisted on disk.

Also solves #127 :

  1. Each notification must have an identifier and a firing bool. We could add these to the generic notification and consider the others as fire-and-forget; the identifier would be the name, and we must add the firing boolean to the notify_cmd.
  2. When a firing notification is received, it is timestamped (with both a receive time and a schedule time) and saved in a heap (ordered by schedule time), which is immediately serialised on disk;
  3. When a non-firing notification is received, look in the heap for this notification (which requires every notification to have an identifier: the keyword should provide it, as a string, along with the firing bool) and cancel it. If not found, just ignore it;
  4. While waiting for new notifications, check the top of the heap for a notification to schedule (starting from the top, look for the first one not already sending);
  5. When eventually sending the heap top, leave the notification on the heap but flag it as sending. On success, remove it and save the heap; on failure, reschedule it and save the heap;
  6. At start up, read the heap from disk and reset all sending flags.
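The steps above can be sketched as follows (an illustrative Python model, not the actual notifier; persistence is reduced to a `save` callback marking where the heap would be serialised to disk, and the 30-second retry delay is an assumed value):

```python
import heapq
import time

class NotifScheduler:
    """Retry heap for notifications: firing notifs are scheduled and retried
    until acknowledged; non-firing notifs cancel their firing counterpart."""

    def __init__(self, save=lambda heap: None):
        self.heap = []     # (schedule_time, ident), ordered by schedule time
        self.firing = {}   # ident -> payload, while the notif is outstanding
        self.save = save   # called wherever the heap must hit the disk

    def receive(self, ident, firing, payload=None, now=None):
        now = time.time() if now is None else now
        if firing:
            self.firing[ident] = payload
            heapq.heappush(self.heap, (now, ident))   # schedule immediately
        else:
            self.firing.pop(ident, None)              # cancel; ignore if unknown
        self.save(self.heap)

    def run_due(self, send, now=None):
        """Send every due notification; reschedule on failure."""
        now = time.time() if now is None else now
        while self.heap and self.heap[0][0] <= now:
            _, ident = heapq.heappop(self.heap)
            if ident not in self.firing:
                continue                              # was cancelled meanwhile
            if send(ident, self.firing[ident]):
                del self.firing[ident]                # acknowledged
            else:
                heapq.heappush(self.heap, (now + 30.0, ident))  # retry later
            self.save(self.heap)
```

Restarting then amounts to reloading `heap` and `firing` from disk and clearing any sending flags, as in step 6.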

Several global states?

The recent WHERE patch has shown that we need a distinct global state for the WHERE clause.
Maybe we could generalize and have one global state for all input, one for all selected, one for all unselected, and one per group? The clause would tell which default to use.

A JOIN clause

Given a column that is supposed to be synchronized across N input streams, merge-sort them into a single output (using the select clause to construct the output tuple).
For each next value of the synchronized column, build the output using the last input tuple of each stream, then replace the input tuple of one stream to reach the next smallest synchronized value (or advance several input streams if the same sync value is present in several of them).

This should be an extension to the normal Group-By operation as opposed to a new kind of operation.
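The merge-sort step is the classic k-way merge; a sketch under the assumption that each stream is already sorted on the synchronized column (illustrative Python — a real implementation would feed the select clause instead of yielding raw tuples):

```python
import heapq

# Sketch of the JOIN merge step: N pre-sorted input streams are merged into a
# single stream ordered on the synchronized column. heapq.merge implements
# exactly the "advance the stream holding the smallest sync value" logic.

def join_streams(streams, sync_key):
    """Merge streams pre-sorted on sync_key into one sorted output."""
    return list(heapq.merge(*streams, key=sync_key))
```

Duplicated sync values across streams simply come out adjacent, matching the "advance several input streams" case above.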

Graphite sink

A single listener on port 2003 must be able to fill several tables.
We could restrict this to one table per metric (with additional fields from the tags, as factors).
The function name would be the metric path minus its last few components, and the main field name would be those last components. How many components to chop off the metric path is unclear, though, and should be part of the schema.
As tags are not necessarily present on every message, the schema will have to be declared by the workers wishing to convert some graphite metrics into a tuple stream anyway.

So:

  • a new command ramen graphite that starts a TCP forking server and a UDP listener on the given ports;
  • collected graphite metrics should be enqueued without further ado into a #graphite ringbuf;
  • then it should be possible to LISTEN FOR GRAPHITE $metric_prefix (...schema...) that would perform a pivot.

Alternatively, we do the same as what we do for collectd, using a simple tuple type of reception time, sender, metric name, tags (as a list), value and event time.
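Decoding one line of the Graphite plaintext protocol into the simple tuple shape suggested above could look like this (a sketch; tag support via `;tag=value` suffixes exists since Graphite 1.1, and the output shape here is the one proposed in the last paragraph, not an established format):

```python
# Parse one Graphite plaintext line, "<metric.path>[;tag=value...] <value>
# <timestamp>", into (metric name, tags, value, event time).

def parse_graphite_line(line):
    path, value, timestamp = line.strip().split()
    name, *raw_tags = path.split(";")          # tags are appended to the path
    tags = dict(t.split("=", 1) for t in raw_tags)
    return name, tags, float(value), float(timestamp)
```

The listener would add reception time and sender address before enqueueing the tuple into the #graphite ringbuf.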

A time type (alias for float) parsed/printed with time units

Would also help declaring start/stop/durations, as we could annotate a field with what type of time it represents (event start / event stop / event duration).

It is not clear whether we want instead some semantics attached to the type, which could also be used to parse/print data volume units, for instance, without requiring a whole new type.
That would probably depend on how lightweight these new types can be made and how convenient they are to use as types.
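Parsing such a time literal into the underlying float of seconds is straightforward; a sketch (illustrative Python, with an assumed unit table of s/m/h/d — the actual units Ramen would accept are not specified here):

```python
# Sketch of parsing a duration literal with a unit suffix into the underlying
# float of seconds, as the proposed time alias would do.

_UNITS = {"s": 1.0, "m": 60.0, "h": 3600.0, "d": 86400.0}

def parse_duration(text):
    """Parse e.g. "90s" or "1.5h"; a bare number is taken as seconds."""
    text = text.strip()
    if text and text[-1] in _UNITS:
        return float(text[:-1]) * _UNITS[text[-1]]
    return float(text)
```

Printing would run the table in reverse, picking the largest unit that divides the value cleanly.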

API and visualizer for tops

Timeseries selects the tuple range using time only.
Here we need, in addition to the time range, to select a single batch of TOP output.
So we need either a way to recognize such a batch, or a key that would be used to retrieve only the tuples from this time range having a single value for that key (so that we could reuse it for other things than tops).

Then a dedicated visualizer.

Timeline type of queries

and visualizer.

With this, and once #21 is in place, then we could render incident chronology over any graph.

A SORT clause

It might be more useful to sort the input rather than the output. So have one optional sorting clause per input stream?

Sharded group-by

Idea: keep only one of N shards in RAM (both the aggregates and the tuple queue) and the other N-1 on disk, along with N tuple queues.
When a tuple for another shard is received, it is batch-queued on disk.
When we rotate the shards, we load the queue and process it while newly received tuples for that shard are enqueued, so that we still process them in order.
On-disk queues could be growable ring buffers, like the ones we could also use for node history.

Requires getting rid of serialization and working on mmapped files for the groups.
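The routing and rotation logic could be sketched like this (illustrative Python; the on-disk queues are plain lists here, standing in for the growable ring buffers, and the class name is invented):

```python
# Sketch of sharded group-by: one shard of N is resident in RAM; tuples for
# the other shards are batch-queued until their shard is rotated in, so each
# shard still processes its tuples in arrival order.

class ShardedGroupBy:
    def __init__(self, n_shards, resident=0):
        self.n = n_shards
        self.resident = resident          # the single shard kept in RAM
        self.groups = {}                  # in-RAM aggregates for that shard
        self.disk_queues = {i: [] for i in range(n_shards) if i != resident}

    def shard_of(self, key):
        return hash(key) % self.n

    def input(self, key, tuple_):
        s = self.shard_of(key)
        if s == self.resident:
            self.groups.setdefault(key, []).append(tuple_)   # aggregate now
        else:
            self.disk_queues[s].append((key, tuple_))        # replay at rotation

    def rotate_to(self, shard):
        """Swap the resident shard, replaying its queued tuples in order."""
        self.disk_queues[self.resident] = []   # old resident goes back to disk
        self.groups = {}
        self.resident = shard
        for key, t in self.disk_queues.pop(shard):
            self.groups.setdefault(key, []).append(t)
```

A real implementation would flush the outgoing shard's aggregates to its mmapped file instead of discarding them, as noted above.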

Saving of the last error message in the node does not work any more

Since we fail out of the compilation, we do not save the resulting graph, and therefore the last error is not recorded in the node.

An alternative design (which would still require a rw-lock to protect the file) would be for the conf to be a persistent data structure: compile would return a new one with either the new graph or the new error messages. But then we would have to catch exceptions when calling compile and return a conf carrying the error rather than an HTTP error code.

Get rid of LWT somehow

To de-uglify the code and avoid mixing LWT with exceptions.

We need to use actual threads instead, assuming we do not need to parallelise anything that blocks.

  • compile with threads
  • update RamenRWLock
  • update RamenOutRef
  • update RingBufLib
  • everything else should follow rather easily
  • regarding cohttp server, either wrap it, or replace it with a custom http server (à la csview), or replace it with a CGI interface and a lightweight external http server
  • regarding cohttp client in notifier, either wrap it or replace it with ocamlnet or exec shell.

async threads to be replaced:

  • watchdogs -> another thread, or a small filesystems monitored by the supervisor (touching a file per worker, where suspending the watchdog is replaced by deleting that file);
  • workers reporting stats every X seconds -> easier from a dedicated thread;
  • workers merging several ringbuffers with foreground wait times -> still need a background thread;
  • wait_all_pids thread should not be needed any longer -> easier without threads;
  • ramen tail resetting export timeouts periodically -> no real need for a thread;
  • notifier scheduler -> still better with a separate thread;
  • notifier asynchronously sending notifications -> anything IO is easier with LWTs;
  • tester threads in RamenTest to be replaced by actual posix threads.

If we start with the workers, we need to focus on watchdogs, reports, and merge.
Watchdogs are problematic as they are used both by workers and by ramen CLI tools such as supervisor and notifier. We could have both a mutex-based and an LWT version of the watchdogs?

Propagate null/not-null knowledge down the AST

In a CASE or in OR/AND operations, if an operand is a nullability test then we should propagate the knowledge that something is null or not down the AST. We could then better type operations on that operand, especially if it is known to be not null.

For instance:

if (a is not null && b is not null) then remember a || remember b

currently the state of the two remember operations will be an optional bloom filter, and the code will match against Some a and Some b; but all this is useless since we know that a and b will, here, always be non-null.

What is needed is a special operation from nullable to not-nullable (akin to Option.get) that would be automatically added around all usages of a and b in the branches, before the actual typing starts.
Given this operator, this is a pure rewrite operation.
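The rewrite could be sketched on a toy AST like this (illustrative Python; the tuple-based node shapes, the "unsafe" wrapper name and the helper functions are all invented for the example, not Ramen's actual AST):

```python
# Toy sketch: inside a branch guarded by "x IS NOT NULL", wrap every use of x
# in an ("unsafe", ...) node (akin to Option.get) so the typer can treat it as
# non-nullable. Nodes are tuples: ("var", name), ("if", cond, then, else), etc.

def not_null_tests(cond):
    """Collect variables tested non-null in a conjunction of tests."""
    if cond[0] == "is_not_null" and cond[1][0] == "var":
        return {cond[1][1]}
    if cond[0] == "and":
        return not_null_tests(cond[1]) | not_null_tests(cond[2])
    return set()

def rewrite(expr, known_not_null):
    kind = expr[0]
    if kind == "var":
        return ("unsafe", expr) if expr[1] in known_not_null else expr
    if kind == "if":                     # ("if", cond, then, else)
        _, cond, then_, else_ = expr
        extra = not_null_tests(cond)     # facts proven in the then-branch only
        return ("if", rewrite(cond, known_not_null),
                rewrite(then_, known_not_null | extra),
                rewrite(else_, known_not_null))
    # any other node: recurse into tuple-shaped children, keep leaves as-is
    return (kind,) + tuple(rewrite(a, known_not_null) if isinstance(a, tuple)
                           else a for a in expr[1:])
```

On the remember example above, both uses of a and b in the then-branch get wrapped, so their bloom-filter states can be typed non-optional.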

Allow a FROM clause to mention nodes with different formats

As long as the parent nodes output the fields that this node uses.

We could cram all output tuples in the child input ringbuf, provided:

  1. There is a prefix on each serialized tuple saying where it comes from;
  2. There are as many read_tuple functions as there are parents;
  3. All those read_tuple functions output the same in_tuple made of only the fields that are
    mentioned (in the order they are mentioned).

Alternatively, we could have several input ringbufs so no prefix required.

Alternatively, the parent could strip down its output to match a child's input. This might not be as stupid as it sounds, since we already update its out_ref; we could add a format spec in that file (a bit mask of the fields), and then, provided fields are ordered by name in encoded tuples, the single write-out-tuple function could just skip unwanted fields. This would come in handy for exporting data, as we could enable it on a per-field basis.
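The field-mask idea in the last alternative amounts to the following projection (a sketch in Python; the dict-based tuple representation and `project` helper are illustrative, not the actual ringbuf encoding):

```python
# Sketch of the per-child field mask: fields are ordered by name in the
# encoded tuple, and bit i of the mask keeps the i-th field in that order.

def project(out_tuple, field_mask):
    """out_tuple: field name -> value; returns only the masked-in fields."""
    names = sorted(out_tuple)            # fields ordered by name, as proposed
    return {name: out_tuple[name]
            for i, name in enumerate(names)
            if field_mask & (1 << i)}
```

The writer would apply the child's mask from out_ref while serializing, skipping unwanted fields instead of building this intermediate dict.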
