Comments (7)

yeya24 commented on August 25, 2024

Hey @zoglam, sorry for the late reply. I think I got your point now. The merge between regular and OOO blocks should use one-to-one deduplication, and the deduplication between HA pairs should use penalty mode.
So in this case, it would require one-to-one deduplication first, then penalty.
I think this is not something we support at the moment.

I think deploying two sets of compactors that use different strategies could help. But I don't think we have a way to compact only the blocks matching the target strategy.
I will think about it more and see if there is a better way to do it.
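
To make the one-to-one step concrete, here is a minimal sketch of merging a regular block's samples with an OOO block's samples for one series, keeping a single sample per timestamp. This is a simplified illustration, not the Thanos or Prometheus implementation: the `Sample` alias, the `one_to_one_merge` function, and the sample data are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical representation: (timestamp in milliseconds, value).
Sample = Tuple[int, float]


def one_to_one_merge(a: List[Sample], b: List[Sample]) -> List[Sample]:
    """Merge two timestamp-sorted sample lists, keeping one sample per timestamp.

    When both inputs carry a sample at the same timestamp, the one from `a`
    is kept (an assumption made for this sketch).
    """
    merged: List[Sample] = []
    i = j = 0
    while i < len(a) or j < len(b):
        # Take the next sample in timestamp order, preferring `a` on ties.
        if j >= len(b) or (i < len(a) and a[i][0] <= b[j][0]):
            candidate = a[i]
            i += 1
        else:
            candidate = b[j]
            j += 1
        # Skip a sample whose timestamp was already emitted.
        if merged and merged[-1][0] == candidate[0]:
            continue
        merged.append(candidate)
    return merged


regular = [(1000, 1.0), (2000, 2.0), (4000, 4.0)]
ooo = [(2000, 2.0), (3000, 3.0)]
merged = one_to_one_merge(regular, ooo)
```

Penalty deduplication between HA replicas is a different algorithm and is not modelled here; the sketch only covers the first of the two steps.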

from thanos.

kennylevinsen commented on August 25, 2024

Relates to:

On the Thanos Receive side, the solution has been to disable all local compaction and in turn require Thanos Compact to have vertical compaction enabled. On the Thanos Sidecar side, there are suggestions for allowing upload of compacted chunks and to enable vertical compaction to solve the duplication.

None of this is documented though, and vertical compaction on the thanos compactor side is still experimental with open bugs related to rate/irate and a big warning in the manual.

One hacky solution for sidecar would be to enable compacted block upload, but add an option to delay block upload. By delaying past the out-of-order window and the block max time, we would be able to:

  1. Know for sure if a range has overlaps, just by inspecting whether multiple blocks for the same range are present at the time of upload.
  2. Reject uploading blocks with overlap, instead waiting for the compacted block.

The delay would have to be larger than the block max time plus the out-of-order time window, and smaller than the Prometheus retention time, as the store would be further behind. If Prometheus has crashed, there could still be an old overlap with something in the WAL/WBL. Skipping the upload of very old blocks, together with the thanos compact skip overlap option, might be enough for that.
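
The bounds on the delay can be written down as a small check. This is only a sketch of the constraint described above: it reads "max time" as the maximum block duration, and the function name, parameter names, and example durations are all assumptions, not existing sidecar options.

```python
from datetime import timedelta


def upload_delay_is_valid(
    delay: timedelta,
    max_block_duration: timedelta,
    ooo_window: timedelta,
    retention: timedelta,
) -> bool:
    """True if the delay exceeds block duration + OOO window and stays below retention."""
    return max_block_duration + ooo_window < delay < retention


# Hypothetical settings: 2h blocks, 30m out-of-order window, 15d retention.
block = timedelta(hours=2)
ooo = timedelta(minutes=30)
retention = timedelta(days=15)

ok = upload_delay_is_valid(timedelta(hours=3), block, ooo, retention)
too_short = upload_delay_is_valid(timedelta(hours=2), block, ooo, retention)
too_long = upload_delay_is_valid(timedelta(days=16), block, ooo, retention)
```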

Another (stupid) solution would be to have a way to make Prometheus always compact, and then only upload compacted blocks.

Of course, the best solution would just be to have thanos compact do the vertical compaction without rate issues...

yeya24 commented on August 25, 2024

> From the Thanos docs "Vertical Compaction Risks" section: #2890. It's quite an old issue with the occasional ping, making the current situation unclear, but being left in the documentation does make it seem that end-users should still be cautious.

Got it. If it is this rate bug then it is specific to the penalty deduplication mode and it shouldn't happen for the default 1:1 deduplication in vertical sharding.

For OOO blocks handled by the compactor with vertical compaction, as long as penalty deduplication is not enabled (and it shouldn't be, since penalty dedup is for HA), I don't see any risks.

yeya24 commented on August 25, 2024

> vertical compaction on the thanos compactor side is still experimental with open bugs related to rate/irate and a big warning in the manual

Can you please remind me what the open issue about rate/irate is? I am aware of such issues with downsampling, but nothing about vertical compaction, as Thanos vertical compaction is the same as what Prometheus does.

We should mark vertical compaction as non-experimental, I think. Thanks for this, we should update the docs.

I think we can just enable the same configuration in Prometheus, so that Prometheus disables overlap compaction locally and the compactor handles it: prometheus/prometheus#13112

kennylevinsen commented on August 25, 2024

> Can you please remind me what the open issue about rate/irate is? I am aware of such issues with downsampling, but nothing about vertical compaction, as Thanos vertical compaction is the same as what Prometheus does.

From the Thanos docs "Vertical Compaction Risks" section: #2890. It's quite an old issue with the occasional ping, making the current situation unclear, but being left in the documentation does make it seem that end-users should still be cautious.

Making Vertical Compaction non-experimental and recommended at least for this use-case, in conjunction with a new Prometheus flag to mimic the new receiver behavior sounds good to me.

zoglam commented on August 25, 2024

> Got it. If it is this rate bug then it is specific to the penalty deduplication mode and it shouldn't happen for the default 1:1 deduplication in vertical sharding.
>
> For OOO blocks handled by the compactor with vertical compaction, as long as penalty deduplication is not enabled (and it shouldn't be, since penalty dedup is for HA), I don't see any risks.

Let me describe an example situation.

An HA pair of Prometheus instances sends data through thanos-sidecar. The compactor is operating in "penalty dedup" mode.
Later on, the system expands, and remote_write + OOO functionality is added to Prometheus. (According to the documentation, different modes are used for the two scenarios: one-to-one for receivers, penalty for an HA pair.) The setup is now: HA pair + remote_write + OOO with penalty dedup.

After deduplication, the counter metrics that were written via remote_write to the HA pair Prometheus are reduced. In one-to-one deduplication mode, everything falls apart.
1. What discussions will there be?
2. Why can't an HA pair with penalty dedup mode and OOO be done?

yeya24 commented on August 25, 2024

Created issue prometheus-operator/prometheus-operator#6829 on the operator to track the idea described in #7551 (comment). The flag is already added to Prometheus.
