Comments (7)

Fluepke avatar Fluepke commented on May 24, 2024

@hellerve and I are working on a fix.

from influxdb_exporter.

matthiasr avatar matthiasr commented on May 24, 2024

Hmm, I am not sure I agree with this. The sample time is in the protocol because it could be different from "now". That only works so-so with Prometheus, but the exporter does have rudimentary support for that.
If we start keeping track of submission time, we now impose on the exporter (and its operators) the complexity of a second, hidden timestamp. It is not immediately clear to me which time would take effect when. For example, why do we not ignore the submitted timestamps when specifying --timestamps?
I know keeping exact time is difficult, but with expiration on the order of seconds or minutes, is it unreasonable to expect to keep system clocks in sync enough? Does every system that receives time need to handle wibbly wobbly?

matthiasr avatar matthiasr commented on May 24, 2024

How would InfluxDB handle these timestamps?

vidister avatar vidister commented on May 24, 2024

At the moment, the influxdb_exporter uses the timestamp supplied by the data source to decide whether entries should be deleted or left in place. That works fine when all the devices share the same system time.
But if, for example, your device lives in "the future", data will accumulate until the memory of your server is full. If the device lives in the past, data will never reach Prometheus because it will be deleted before Prometheus can even ask for it.

The change suggested in #62 doesn't affect the timestamp of the data itself; it just introduces a new value, based on the system time, that is used to decide whether to keep or discard the data.

hellerve avatar hellerve commented on May 24, 2024

I guess this boils down to a fundamental question: is a data point “fresh” when the supplier says it is, or is it fresh when it arrives in the system? I think fundamentally both approaches are valid, and both introduce weird error cases.

If we assume that a data point is fresh when whoever sends it says it is, time drift fundamentally changes how we look at the underlying data: it might be expired by the time it even arrives in our system. We trust the data, and take a hands-off approach.

If we assume that a data point is fresh when it arrives in our system, we assume that only current data points will ever make it into a request to us. This leads to a different error case, where we impose semantics on someone else’s data (and it leads to two timestamps that should be equivalent or at least close but might not be, because the world is big and messy). We don’t trust the data, and take a hands-on approach.

In both cases, we fix an issue with the other approach, and it might not be possible to get to a “best of both worlds” situation here. Also: in both cases, we should probably at least document the potential error case.

hellerve avatar hellerve commented on May 24, 2024

> I know keeping exact time is difficult, but with expiration on the order of seconds or minutes, is it unreasonable to expect to keep system clocks in sync enough? Does every system that receives time need to handle wibbly wobbly?

I also want to add to this question for a second. We found this bug because, while playing around with this system, we got metrics from a (lab) device that was a little wonky, as lab devices often are. It sent us data with a timestamp from last year!

"This shouldn't happen! This device shouldn't even be operational!" was our first reaction, too. But it was, and it ran without any problems, except that the metrics didn't show up. Of course this needs to be fixed on the device, but I'm saying this to illustrate that there are weird systems in this world that we have to deal with in some way. Both approaches above are valid, but we have to acknowledge that they might fail in some cases.

matthiasr avatar matthiasr commented on May 24, 2024

@hellerve thank you, you have perfectly summarized the core of the issue.

After thinking about this for a while, I would like to leave things as they are. As you pointed out, both approaches are valid. However, we already use and handle the client-supplied timestamps (if --timestamps is enabled). This would not change, so with the proposed change we would be treating the same sample differently in different contexts.

In the end, there is no perfect way to deal with clients that are significantly out of time; all we can hope for is to deal with them consistently.
