thanos-sidecar uploads OOO block before Prometheus compacts it, upsetting thanos-compact (thanos issue, open)

kennylevinsen commented on August 25, 2024 1

thanos-sidecar uploads OOO block before Prometheus compacts it, upsetting thanos-compact

Comments (7)

yeya24 commented on August 25, 2024 2

Hey @zoglam, sorry for the late reply. I think I get your point now. The merge between regular and OOO blocks should use one-to-one deduplication, and the deduplication between HA pairs should use penalty mode.
So in this case, it would require one-to-one deduplication first, then penalty.
I don't think this is something we support at the moment.

I think deploying two sets of compactors that use different strategies could help, but I don't think we have a way to make a compactor only compact blocks matching its target strategy.
I will think about it more and see if there is a better way to do it.

kennylevinsen commented on August 25, 2024 1

Relates to:

On the Thanos Receive side, the solution has been to disable all local compaction and in turn require Thanos Compact to have vertical compaction enabled. On the Thanos Sidecar side, there are suggestions for allowing upload of compacted chunks and to enable vertical compaction to solve the duplication.

None of this is documented though, and vertical compaction on the thanos compactor side is still experimental with open bugs related to rate/irate and a big warning in the manual.

One hacky solution for sidecar would be to enable compacted block upload, but add an option to delay block upload. By delaying past the out-of-order window and the block max time, we would be able to:

Know for sure whether a range has overlaps, just by checking whether multiple blocks for the same range are present at the time of upload.
Reject uploading blocks with overlap, instead waiting for the compacted block.

The delay would have to be larger than the block max time plus the out-of-order time window, and smaller than the Prometheus retention time, since the store would be further behind. If Prometheus has crashed, there could still be an old overlap with something in the WAL/WBL. Skipping upload of very old blocks, combined with the thanos compact skip-overlap option, might be enough for that.
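The delayed-upload idea above can be sketched roughly as follows. This is only an illustration of the decision logic, not sidecar code: the `Block` type, `valid_delay`, `uploadable`, and all parameter names are hypothetical, times are plain integers, and the delay is interpreted as "time to wait after a block's max time", which is one possible reading of the proposal.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for a TSDB block's metadata."""
    ulid: str
    min_time: int  # inclusive
    max_time: int  # exclusive


def overlaps(a: Block, b: Block) -> bool:
    """True if the two half-open time ranges intersect."""
    return a.min_time < b.max_time and b.min_time < a.max_time


def valid_delay(delay: int, ooo_window: int, retention: int) -> bool:
    """The delay must outlast the out-of-order window but stay below retention."""
    return ooo_window < delay < retention


def uploadable(blocks: list[Block], now: int, delay: int) -> list[Block]:
    """Return blocks that are old enough and have no overlapping sibling on disk.

    Overlapping blocks are held back in the expectation that Prometheus
    will later replace them with a single compacted block.
    """
    settled = [b for b in blocks if b.max_time + delay <= now]
    return [
        b for b in settled
        if not any(overlaps(b, other) for other in blocks if other.ulid != b.ulid)
    ]


if __name__ == "__main__":
    local = [
        Block("A", 0, 7200),      # regular block
        Block("B", 3600, 7200),   # OOO block overlapping A
        Block("C", 7200, 14400),  # regular block, no overlap
    ]
    for block in uploadable(local, now=20000, delay=5000):
        print("upload", block.ulid)
```

With these made-up values, A and B are both held back because they overlap, and only C is considered safe to upload. The sketch does not cover the crash case mentioned above, where an overlap may still be sitting in the WAL/WBL.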

Another (stupid) solution would be to have a way to make Prometheus always compact, and then only upload compacted blocks.

Of course, the best solution would just be to have thanos compact do the vertical compaction without rate issues...

yeya24 commented on August 25, 2024 1

From the Thanos docs "Vertical Compaction Risks" section: #2890. It's quite an old issue with the occasional ping, which makes the current situation unclear, but the fact that it is still in the documentation suggests that end users should remain cautious.

Got it. If it is this rate bug then it is specific to the penalty deduplication mode and it shouldn't happen for the default 1:1 deduplication in vertical sharding.

For OOO blocks handled by compactor with vertical compaction, as long as penalty deduplication is not enabled (shouldn't as well since penalty dedup is for HA) I don't see any risks.
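To make the one-to-one case concrete, here is a minimal sketch of merging the samples of one series from a regular block and an overlapping OOO block. It is a simplified illustration, not the Thanos or Prometheus implementation: the function name `merge_one_to_one` is hypothetical, samples are plain `(timestamp, value)` tuples, and it assumes that when both inputs carry the same timestamp, a single sample is kept.

```python
Sample = tuple[int, float]  # (timestamp, value)


def merge_one_to_one(regular: list[Sample], ooo: list[Sample]) -> list[Sample]:
    """Merge two sample lists of the same series, keeping one sample per timestamp.

    Where both inputs contain the same timestamp, the sample from
    `regular` is kept (an assumption made for this sketch).
    """
    merged: dict[int, float] = {}
    for ts, value in ooo:
        merged[ts] = value
    for ts, value in regular:
        merged[ts] = value  # regular block wins on identical timestamps
    return sorted(merged.items())


if __name__ == "__main__":
    in_order = [(10, 1.0), (30, 3.0)]
    out_of_order = [(20, 2.0), (30, 3.0)]
    print(merge_one_to_one(in_order, out_of_order))
```

No sample is discarded based on gaps or replica switching here, which is the behavior that distinguishes this mode from penalty deduplication in the discussion above.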

yeya24 commented on August 25, 2024

vertical compaction on the thanos compactor side is still experimental with open bugs related to rate/irate and a big warning in the manual

Can you please remind me what the open issue about rate/irate is? I am aware of such issues with downsampling, but nothing about vertical compaction, as Thanos vertical compaction is the same as what Prometheus does.

I think we should mark vertical compaction as non-experimental. Thanks for this, we should update the docs.

I think we can just enable the same configuration in Prometheus, so that Prometheus disables overlap compaction locally and the compactor handles it: prometheus/prometheus#13112

kennylevinsen commented on August 25, 2024

Can you please remind me what the open issue about rate/irate is? I am aware of such issues with downsampling, but nothing about vertical compaction, as Thanos vertical compaction is the same as what Prometheus does.

From the Thanos docs "Vertical Compaction Risks" section: #2890. It's quite an old issue with the occasional ping, which makes the current situation unclear, but the fact that it is still in the documentation suggests that end users should remain cautious.

Making Vertical Compaction non-experimental and recommended at least for this use-case, in conjunction with a new Prometheus flag to mimic the new receiver behavior sounds good to me.

zoglam commented on August 25, 2024

Got it. If it is this rate bug then it is specific to the penalty deduplication mode and it shouldn't happen for the default 1:1 deduplication in vertical sharding.

For OOO blocks handled by compactor with vertical compaction, as long as penalty deduplication is not enabled (shouldn't as well since penalty dedup is for HA) I don't see any risks.

To describe an example situation:

An HA pair of Prometheus instances sends data through thanos-sidecar. The compactor is operating in "penalty dedup" mode.
Later on, the system expands, and remote_write + OOO functionality is added to Prometheus. (According to the documentation, different modes are used for the two scenarios: one-to-one for receivers, penalty for HA pairs.) The result is HA pair + remote_write + OOO WITH penalty dedup.

After deduplication, there is a reduction in the counter metrics that were written via remote_write to the HA pair of Prometheus instances. In one-to-one deduplication mode, everything falls apart.
1. What further discussion will there be on this?
2. Why can't an HA pair with penalty dedup mode and OOO be supported?

yeya24 commented on August 25, 2024

Created issue prometheus-operator/prometheus-operator#6829 on the operator to track the idea described in #7551 (comment). The flag has already been added to Prometheus.
