postmates / cernan
telemetry aggregation and shipping, last up the ladder
License: Other
It is desirable to have cernan report its internal health for operational monitoring.
The 'bucket' data store present in cernan is acceptable for statsd multiplexing but breaks down as soon as you want to retain datapoints, which you do if you pump in graphite or logs.
This has come up in conversation, especially with @doubleyou, @blakebarnett and @dvdklnr. Very likely we'll just use wavefront proxy's extension to statsd to support tags as this is standard-ish and somewhat supported by existing clients.
See also #10.
The memory use of cernan will grow without limit because it does not flush old metrics as needed. This can be corrected by moving from an internal HashMap to an LRU or other expiring cache.
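A minimal sketch of the fix, assuming the third-party lru crate (cernan's actual store type is not shown here):

```rust
use std::num::NonZeroUsize;

use lru::LruCache;

fn main() {
    // Cap the store at 1024 metric names; the cap itself is an assumption.
    let mut store: LruCache<String, f64> = LruCache::new(NonZeroUsize::new(1024).unwrap());
    store.put("a.metric".to_string(), 1.0);
    // Once the cache is full, inserting a new key evicts the
    // least-recently-used entry, bounding memory where a plain
    // HashMap would retain every key forever.
    assert_eq!(store.get("a.metric"), Some(&1.0));
}
```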
Statsd is a lossy protocol. The server is intended to aggregate before passing on, which is not strictly needed with any of our existing aggregation services. Graphite, by contrast, is a lossless format, implying that the server merely stores and forwards points.
Supporting graphite will allow ingestion of local collectd points.
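For reference, the graphite plaintext protocol is a newline-delimited name, value, timestamp triple; a collectd-fed line might look like the following (hostname and values illustrative):

collectd.web01.cpu-0.cpu-idle 97.2 1476304000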
In the process of avoiding generic string handling I've introduced a lot of copying of strings. This is okay while we're just multiplexing statsd but will break down as soon as we start storing points or want higher performance.
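A sketch of the usual remedy, borrowing on the hot path with Cow and allocating only when a name must actually be rewritten (illustrative, not cernan's code):

```rust
use std::borrow::Cow;

// Return the name unchanged without copying in the common case,
// allocating only when a rewrite is actually required.
fn normalize(name: &str) -> Cow<'_, str> {
    if name.contains(' ') {
        Cow::Owned(name.replace(' ', "_"))
    } else {
        Cow::Borrowed(name)
    }
}

fn main() {
    assert!(matches!(normalize("a.metric"), Cow::Borrowed(_)));
    assert!(matches!(normalize("a metric"), Cow::Owned(_)));
}
```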
Right now we have our sinks hard-coded into the binary. This is undesirable for anyone that wants to emit to new and interesting places. We should have the ability to allow users to write their own sinks and to prove this out we ought to extract some of our own.
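What an extracted sink interface might look like; the trait and type names below are hypothetical, not cernan's actual API:

```rust
// A point-in-time observation; fields are illustrative.
pub struct Metric {
    pub name: String,
    pub value: f64,
    pub time: i64,
}

// The contract a user-written sink would implement.
pub trait Sink {
    /// Accept a single point for eventual delivery.
    fn deliver(&mut self, point: Metric);
    /// Push any buffered points to the backing store.
    fn flush(&mut self);
}
```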
It turns out that wavefront does not like sub-second points. That is, if you report

a.metric 1 00001
a.metric 1 00001
a.metric 1 00001

Wavefront will only include a single a.metric 1 00001 point in the time series.
So! We need to use the metadata associated with metrics and aggregate appropriately.
A few statsd clients (primarily those written in Go) have been observed to default to the IPv6 loopback instead of the IPv4 loopback.
The net result is confused users and silently dropped metrics.
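A minimal sketch of the defensive fix, binding listeners on both loopbacks so clients that default to IPv6 still reach us (the port is the statsd convention, not cernan's config):

```rust
use std::net::UdpSocket;

fn main() -> std::io::Result<()> {
    // 8125 is the conventional statsd port.
    let _v4 = UdpSocket::bind("127.0.0.1:8125")?;
    let _v6 = UdpSocket::bind("[::1]:8125")?;
    // In a real server each socket would feed the same ingestion path.
    Ok(())
}
```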
Investigate the possibility of introducing an optional mechanism for transmitting metrics over UDP with forward error correction. tl;dr: by sending redundant messages over a faster protocol, we can significantly increase egress throughput.
Criteria include:
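As a strawman for the simplest redundancy scheme (sequence-tagged duplicate datagrams; real FEC such as parity packets would be more bandwidth-efficient), a hypothetical sketch:

```rust
use std::net::UdpSocket;

// Tag each datagram with a sequence id and send it `copies` times;
// the receiver de-duplicates by id. All names here are hypothetical.
fn send_redundant(
    sock: &UdpSocket,
    dst: &str,
    seq: u64,
    payload: &[u8],
    copies: u8,
) -> std::io::Result<()> {
    let mut frame = seq.to_be_bytes().to_vec();
    frame.extend_from_slice(payload);
    for _ in 0..copies {
        sock.send_to(&frame, dst)?; // duplicates are cheap over UDP
    }
    Ok(())
}

fn main() -> std::io::Result<()> {
    let sock = UdpSocket::bind("127.0.0.1:0")?;
    send_redundant(&sock, "127.0.0.1:9000", 1, b"a.metric:1|c", 3)
}
```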
Part and parcel with #61: since we should no longer report exact points to wavefront, we can no longer rely on the timeseries information to provide accurate counts for our aggregates. This is something we need to collect and report ourselves.
The CLI of cernan should support its former set of configuration options. Without this, master has become difficult to promote to stable as all hosts will necessarily need a correct configuration file on-disk.
The CLI will not be updated to support the variety of configuration available through the config file.
Right now we have a --wavefront-skip-aggrs flag and want to eventually remove this option entirely by removing the ability to ship aggregates to wavefront. They are not useful.
Cernan's logging should be more descriptive than it is today. Before releasing 0.4.0 we ought to audit every place where we currently log and ensure that the message provides the full context available at the time, as well as add additional logging in critical areas where it is currently lacking.
The present state of affairs in cernan is that counters are per flush-interval. That is, if cernan is configured to have a flush interval of 15 seconds a counter is implicitly per fifteen seconds.
Instead, the bucket concept should be expanded to aggregate at one second bins. Exactly how we want to cook this in the presence of potential flush failures is unknown as of this writing.
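A minimal sketch of one-second binning (not cernan's implementation): increments land in the bin for their timestamp's second, and a flush drains whole bins rather than a single per-interval sum.

```rust
use std::collections::BTreeMap;

struct Bins {
    counts: BTreeMap<u64, f64>, // unix second -> count
}

impl Bins {
    fn add(&mut self, ts_secs: u64, value: f64) {
        *self.counts.entry(ts_secs).or_insert(0.0) += value;
    }

    /// Drain every bin strictly older than `now_secs`, leaving the
    /// still-open current second in place for late arrivals.
    fn flush(&mut self, now_secs: u64) -> Vec<(u64, f64)> {
        let ready: Vec<u64> = self.counts.range(..now_secs).map(|(k, _)| *k).collect();
        ready
            .into_iter()
            .map(|k| {
                let v = self.counts.remove(&k).unwrap();
                (k, v)
            })
            .collect()
    }
}

fn main() {
    let mut bins = Bins { counts: BTreeMap::new() };
    bins.add(100, 1.0);
    bins.add(100, 1.0);
    bins.add(101, 3.0);
    assert_eq!(bins.flush(101), vec![(100, 2.0)]); // second 100 drains, 101 stays open
}
```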
Right now cernan tests do not produce coverage reports. This makes it difficult to determine how well tested cernan is in practice. Issues that have slipped into the wild suggest that it is not.
As noted by @tsantero in #85 (comment): "memory allocation is inefficient; reading in 6 log files concurrently at an avg of 10k lines/sec of 100 bytes/line cost several gigs of RAM, with utilization growing as the counts increased." That input rate is only about 6 MB/s, so multi-gigabyte residency points to unbounded buffering rather than raw throughput.
This is... not okay. Our present channel-based method of communication has met the end of its useful lifetime.
In order to preserve information about the distribution of request times or similar data, we can't rely on percentiles: there's no correct way to later aggregate two percentiles to get another percentile, and a mean of percentiles isn't usually what you want.
What we can do is calculate histograms for each reporting period, then report each bin of that histogram, for each flush interval, to Wavefront. From Wavefront's perspective it's just a bunch of separate metrics, but we can reassemble them and calculate a histogram for the distribution of some values for arbitrary intervals without loss of information.
The tricky part here is communicating from the application to Cernan what the bins should be. Statsd can be configured to calculate histograms in this way, but it involves defining the bins in statsd's config file which is awkward to realize.
So I propose an extension to the statsd protocol so applications can communicate to cernan at application start the desired histogram bins. Proposed format:
metric.name:-inf,1,10,100,inf|h
This configures four bins for the metric metric.name, bounded by consecutive pairs of those edges: -inf to 1, 1 to 10, 10 to 100, and 100 to inf.
Applications can send these each time they start, and Cernan will write them to the filesystem so they don't need to be sent again if Cernan restarts.
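A hypothetical parser for the proposed declaration (not part of cernan today):

```rust
// Parse "metric.name:-inf,1,10,100,inf|h" into a name plus bin edges.
fn parse_bins(line: &str) -> Option<(String, Vec<f64>)> {
    let (name, rest) = line.split_once(':')?;
    let edges: Option<Vec<f64>> = rest
        .strip_suffix("|h")?
        .split(',')
        .map(|e| match e {
            "-inf" => Some(f64::NEG_INFINITY),
            "inf" => Some(f64::INFINITY),
            other => other.parse().ok(),
        })
        .collect();
    let edges = edges?;
    // n edges describe n-1 bins; require at least one bin.
    if edges.len() < 2 {
        return None;
    }
    Some((name.to_string(), edges))
}

fn main() {
    let (name, edges) = parse_bins("metric.name:-inf,1,10,100,inf|h").unwrap();
    assert_eq!(name, "metric.name");
    assert_eq!(edges.len(), 5); // five edges -> four bins
}
```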
Right now cernan has a global concept of flush intervals. It would be much nicer if sinks had a per-sink flush interval, allowing us to differentiate between disk IO flushes and network flushes.
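A hypothetical shape for this in the config file; the key names are assumptions, not cernan's actual schema:

```toml
# Global default, overridable per sink.
flush-interval = 60

[wavefront]
flush-interval = 60  # network flushes can stay coarse

[firehose]
flush-interval = 1   # disk-backed sinks can flush aggressively
```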
Cernan must learn to ingest log files. This issue does not imply modification of the data streams exposed from the files. At present it would be sufficient to note that a line was written to the log file and convert this to a PIT metric.
Depends on #37
Use the console backend as an example.
Cernan presently has in-memory channels for shuttling information back and forth between system components. This was the same approach that heka adopted, and it crippled heka in high-load situations. In-memory channels are problematic in that they have no concept of backpressure. They are also not crash tolerant.
Here the primary inspiration will be hindsight, which the heka folks have blessed as the works-well-enough replacement for heka.
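For the backpressure half of the problem, std's bounded channels already illustrate the behavior we want from whatever replaces the current setup (crash tolerance, hindsight's disk-backed queues, is out of scope for this sketch):

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

fn main() {
    // Bounded: send() blocks once 1024 messages are in flight,
    // applying backpressure instead of growing memory without limit.
    let (tx, rx) = sync_channel::<u64>(1024);
    let producer = thread::spawn(move || {
        for i in 0..10_000u64 {
            tx.send(i).unwrap();
        }
        // tx drops here, ending the consumer's iteration below.
    });
    let sum: u64 = rx.iter().sum();
    producer.join().unwrap();
    println!("consumed sum = {}", sum);
}
```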
Recording the metrics from every single celery worker times every single instance is going to create a bunch of noise in the graphs and increase our wavefront upload rate to a level that, even with aggregation-by-time tweaks, won't make sense. Cernan tagging will enable us to tag everything as either prod or staging.
Cernan should allow the end-user to manipulate their logs into telemetry streams for differing backends. This implies a scripting capability and the ability for scripts to deliver packets into named queues.
Possibly depends on #40
Pgbouncer metrics were reported each second in the past, but are now reported every 10 seconds. I'm guessing this has to do with recent changes to Cernan's aggregation behavior.
To preserve the semantics expected from a graphite interface, Cernan should not aggregate these values, or buffer them for the flush interval, but instead forward them to Wavefront as reported.
There are many places where cernan is aggregating important things in memory, holding until there's enough information from disk-based sources etc etc. Presently if cernan restarts all of this is lost.
Cernan must not lose things.
Related to #98, each metric should have a notion of its own metadata. Presently metadata is per-sink.
As it says on the tin. Determining the running cernan version is very hard without this.
Right now if you send cernan a metric name that is, say, 1MB, cernan will allocate 1MB for you.
So! Probably oughtn't to do that.
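A sketch of the obvious defense; the 256-byte cap is an assumption, not a cernan decision:

```rust
const MAX_NAME_LEN: usize = 256;

// Reject oversized names up front rather than allocating
// attacker-controlled amounts of memory for them.
fn accept_name(raw: &str) -> Option<&str> {
    if raw.len() > MAX_NAME_LEN {
        None
    } else {
        Some(raw)
    }
}

fn main() {
    assert!(accept_name("a.metric").is_some());
    assert!(accept_name(&"x".repeat(1_000_000)).is_none());
}
```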
It would be very useful for cernan to distinguish runtime environments. Inside of postmates we want this to be a first-class bit of metadata on all points.
It's come up on a few occasions that it'd be awfully handy to have cernan forward to a remote cernan. This is possible to do.
As a part of #61 we can no longer report exact points to wavefront. Doing so eats into our point budget and provides no benefit.
Introduced in #91, the mpmc log file will grow without bound. This needs to be corrected. Current thinking is to do periodic rotation and to abstract the state machine used in file_server to service the Receiver.
Motivated by the fact that collectd puts metadata into the metric name, we need to be able to adjust metric names as they come in, in a user-configurable fashion.
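An illustrative rename pass using the regex crate; the pattern below is an assumption about collectd-style names, not a shipped default:

```rust
use regex::Regex;

fn main() {
    // Strip a leading "collectd.<host>." prefix, keeping the rest.
    let re = Regex::new(r"^collectd\.(?P<host>[^.]+)\.(?P<rest>.+)$").unwrap();
    let renamed = re.replace("collectd.web01.cpu.idle", "$rest");
    assert_eq!(renamed, "cpu.idle");
}
```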
Our current QOS setup seemingly causes elision of metrics. Steps to reproduce:
At present all data sources--graphite, statsd, log files--are reported to all backends. This is not appropriate. Cernan should instead allow, by configuration, different sources to be reported to different sinks.
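A hypothetical shape for such routing in the config file; the schema is an assumption for illustration:

```toml
[sources.statsd]
forwards = ["wavefront", "console"]

[sources.graphite]
forwards = ["wavefront"]

[sources.files.syslog]
path = "/var/log/syslog"
forwards = ["firehose"]
```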
It would be very useful for integration testing if SIGINT were to cause backends to flush and then result in program exit.
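A sketch of the shutdown shape, assuming the third-party ctrlc crate (not cernan's actual shutdown path, and flush_all_sinks is hypothetical):

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

fn main() {
    let stop = Arc::new(AtomicBool::new(false));
    let s = stop.clone();
    // On SIGINT, flip a flag so the main loop can wind down cleanly.
    ctrlc::set_handler(move || s.store(true, Ordering::SeqCst)).unwrap();
    while !stop.load(Ordering::SeqCst) {
        // ... ingest and aggregate ...
        std::thread::sleep(std::time::Duration::from_millis(100));
    }
    // flush_all_sinks(); // hypothetical: flush every backend before exit
    println!("flushed, exiting");
}
```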
Related to #82, cernan's test coverage is poor. We must improve this.
At the suggestion of @pulltab we should introduce a quality-of-service notion to our metric types. The idea here being that backends can choose what rate to flush certain types. By default, we'll maintain flush-rate compatibility with existing statsd implementations.
This is intended to allow us to reduce the burden on our backend reporters.
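One possible shape for the idea; names are illustrative, not cernan's types:

```rust
// Each metric kind carries a QoS the sink may honor at flush time.
enum Qos {
    EveryFlush,    // statsd-compatible default
    EveryNth(u32), // flush only every Nth interval
}

fn should_flush(qos: &Qos, interval_count: u32) -> bool {
    match qos {
        Qos::EveryFlush => true,
        Qos::EveryNth(n) => interval_count % n == 0,
    }
}

fn main() {
    assert!(should_flush(&Qos::EveryFlush, 7));
    assert!(!should_flush(&Qos::EveryNth(10), 7));
    assert!(should_flush(&Qos::EveryNth(10), 20));
}
```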
Cernan has grown a plethora of CLI flags. This makes it difficult to add arbitrary files to track.
Cernan should accept a configuration file. We must keep the current CLI flags around in the meantime.
Consider a system that emits self-aggregated histograms as gauges with names like foo.bar.baz.median, foo.bar.baz.999. Cernan will only allow the first.
We must have some capability of determining that cernan emits the points we expect for any given input from the outside of the project.
This requires #34
Right now cernan does not do a DNS lookup to resolve non-IP hosts into IP addresses. This significantly limits the fun things we can get up to.
For instance, we ought to do more than crash on wavefront-proxy.us-west-1.postmates.com.
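std's ToSocketAddrs already resolves hostnames, so a sketch of the fix (the port is the wavefront proxy convention, assumed here):

```rust
use std::net::ToSocketAddrs;

fn main() {
    let host = "wavefront-proxy.us-west-1.postmates.com:2878";
    // Resolve rather than crash; a real sink would retry on failure.
    match host.to_socket_addrs() {
        Ok(mut addrs) => println!("resolved to {:?}", addrs.next()),
        Err(e) => eprintln!("resolution failed, will retry: {}", e),
    }
}
```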
Ambition is to parse --tags source=foo,service=bar,metadata=blerg and emit appropriately to correct emission sites. As indicated, --metric-source will be folded into this.
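A hypothetical parse of that flag's value:

```rust
// Split "k=v,k=v,..." into pairs, skipping malformed entries.
fn parse_tags(raw: &str) -> Vec<(String, String)> {
    raw.split(',')
        .filter_map(|pair| {
            let (k, v) = pair.split_once('=')?;
            Some((k.to_string(), v.to_string()))
        })
        .collect()
}

fn main() {
    let tags = parse_tags("source=foo,service=bar,metadata=blerg");
    assert_eq!(tags.len(), 3);
    assert_eq!(tags[0], ("source".to_string(), "foo".to_string()));
}
```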
In existing postmates systems there is a need to parse metrics with names like source.machine-metric.name as being a metric named 'metric.name' with source 'source.machine'. The etsy/statsd backend has been modified to take a regex to fiddle with this, but I've hard-coded it in the expectation that future backends will be programmable by the end-user.
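The hard-coded split, expressed with the regex crate (the pattern is a sketch of the behavior described, not the exact shipped expression):

```rust
use regex::Regex;

fn main() {
    // Everything before the first '-' is the source; the rest is the name.
    let re = Regex::new(r"^(?P<source>[^-]+)-(?P<name>.+)$").unwrap();
    let caps = re.captures("source.machine-metric.name").unwrap();
    assert_eq!(&caps["source"], "source.machine");
    assert_eq!(&caps["name"], "metric.name");
}
```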
We don't have a license for TravisCI. We should still have feedback on all PRs, though. What we must have is:
Feedback and badges etc would be a bonus.
This issue is an elaboration of #35
Presently we report the following percentiles:
Kosher?
Presently cernan implements its counters as continuously increasing sums of points. What it should do instead is reset the counts on each flush.
This is related to #61
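A minimal sketch of reset-on-flush semantics:

```rust
struct Counter {
    sum: f64,
}

impl Counter {
    fn add(&mut self, v: f64) {
        self.sum += v;
    }

    /// Report the interval's sum and reset, rather than letting the
    /// counter grow monotonically across flushes.
    fn flush(&mut self) -> f64 {
        std::mem::replace(&mut self.sum, 0.0)
    }
}

fn main() {
    let mut c = Counter { sum: 0.0 };
    c.add(2.0);
    c.add(3.0);
    assert_eq!(c.flush(), 5.0);
    assert_eq!(c.flush(), 0.0); // reset after each flush
}
```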