
store-gateway: store sparse index headers in object store (mimir, OPEN)

dimitarvdimitrov commented on September 23, 2024

store-gateway: store sparse index headers in object store

from mimir.

Comments (3)

GroovyCarrot commented on September 23, 2024

This really bit me recently as well. We'd had 27TB of data rack up in the block store, and then tried to start store-gateway nodes; they basically never started / reported ready as the index was so huge.

Also, I found that the Helm chart uses the default pod management policy for the store-gateway, so it blocks spinning up any additional nodes until the previous one has reported ready. I think podManagementPolicy: Parallel will fix this, though I didn't try it as we decided to just destroy the bucket and start fresh. I expect this change would allow multiple nodes to spin up and then decide which tenants/tokens they are responsible for, rather than one node starting up and thinking it needs to index everything before anything else is allowed to start.

I think it makes sense for the compactor to do this as it is changing the index anyway when it runs?

Is it possible to optimise this by compiling an index per day, or something? And then store those indices for lazy-loading by the store-gateways if a query is run for that period? It seems like you could then start a store-gateway node and have it take queries practically straight away?


dimitarvdimitrov commented on September 23, 2024

I think you're bringing up another problem. When the store-gateway starts, it downloads from the bucket the index headers for the blocks that shard to it. Figuring out which blocks shard to it is fast, but downloading the index headers from the bucket is slow. It's better to do this before starting up; otherwise, this latency would hit queries.

> Also I found the helm chart uses the default pod management policy for the storegateway, and will block spinning up any additional nodes until the previous one has reported ready. I think podManagementPolicy: Parallel will fix this

This is the other problem. It's already configurable, but making it the default is a breaking change, so we've been saving this for Helm chart 6.0 (#4560).

This issue (8166) is about the step after that: sampling the index headers when a query comes in. The sampled version is called the "sparse index header" and is also persisted on disk today. Sampling requires reading (effectively) the full index header from disk with a lot of random reads, which is why it's slow. The sparse header is computed lazily. This issue suggests computing it in the compactor so that the store-gateway can quickly download it, instead of having to sample the index header whenever the sparse index header is not already on disk.
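
To make "sampling" a bit more concrete, here is a minimal sketch of the general idea, not Mimir's actual code. It assumes a simplified model where the index header contains a sorted table of label values and their positions; `postingOffset`, `sparseSample`, and `lookupRange` are hypothetical names made up for this illustration. The sparse version keeps only every N-th entry, so a lookup narrows down to a small range of the full table that still has to be read.

```go
package main

import "sort"

// postingOffset is a simplified, hypothetical stand-in for one entry of a
// sorted offset table: a label value and the position of its entry on disk.
type postingOffset struct {
	value  string
	offset int64
}

// sparseSample keeps every rate-th entry of a sorted table, plus the last
// entry, so that any value inside the table falls between two samples.
func sparseSample(full []postingOffset, rate int) []postingOffset {
	if rate < 1 {
		rate = 1
	}
	var sparse []postingOffset
	for i, e := range full {
		if i%rate == 0 || i == len(full)-1 {
			sparse = append(sparse, e)
		}
	}
	return sparse
}

// lookupRange returns the range [start, end] of the full table that has to
// be scanned to find value. ok is false when value lies outside the table.
func lookupRange(sparse []postingOffset, value string) (start, end int64, ok bool) {
	if len(sparse) == 0 || value < sparse[0].value || value > sparse[len(sparse)-1].value {
		return 0, 0, false
	}
	// Index of the first sampled entry whose value is >= the one we want.
	i := sort.Search(len(sparse), func(i int) bool { return sparse[i].value >= value })
	if sparse[i].value == value {
		return sparse[i].offset, sparse[i].offset, true
	}
	// value sits strictly between two samples; scan only that slice.
	return sparse[i-1].offset, sparse[i].offset, true
}
```

Building the sample means walking the whole table once, which is the slow part described above; lookups afterwards only touch the small sampled slice plus one short range of the full table.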

> Is it possible to optimise this by compiling an index per day, or something?

Blocks are already split into 24h ranges; if you're using the split-and-merge compactor, there can even be multiple blocks per 24h range.


dimitarvdimitrov commented on September 23, 2024

Some notes from the comments in the PR: it won't actually be that hard to let the compactor create sparse headers and upload them.

I chatted with @pstibrany and he suggested doing this at the end of BucketCompactor.runCompactionJob so that we don't fail compactions if sparse headers can't be uploaded. It makes sense to still keep the ability to create sparse headers in the store-gateways so they are more autonomous and don't depend on the compactor for performance.
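
A rough sketch of the "don't fail compactions" part, under stated assumptions: `memBucket`, `compactedBlock`, `runJob`, and the object name `sparse-index-header` are all hypothetical and only stand in for the real compactor and bucket client. The point is that the upload is the last step and its error is logged rather than returned.

```go
package main

import (
	"errors"
	"log"
)

// memBucket is a hypothetical in-memory stand-in for the object store.
type memBucket struct {
	objects     map[string][]byte
	failUploads bool // simulates an object-store outage
}

// Upload stores data under name, or fails when failUploads is set.
func (b *memBucket) Upload(name string, data []byte) error {
	if b.failUploads {
		return errors.New("object store unavailable")
	}
	if b.objects == nil {
		b.objects = map[string][]byte{}
	}
	b.objects[name] = data
	return nil
}

// compactedBlock is what the (hypothetical) compaction step hands back.
type compactedBlock struct {
	id           string
	sparseHeader []byte
}

// runJob runs the compaction and then tries to upload the sparse header.
// Only a compaction error is returned to the caller.
func runJob(bkt *memBucket, compact func() (compactedBlock, error)) error {
	blk, err := compact()
	if err != nil {
		return err
	}
	// Last step, best effort: a failed upload is logged, never returned.
	// The object name below is an assumption made for this sketch.
	if err := bkt.Upload(blk.id+"/sparse-index-header", blk.sparseHeader); err != nil {
		log.Printf("block %s: sparse header not uploaded, store-gateways can still build it: %v", blk.id, err)
	}
	return nil
}
```

With this shape, a bucket outage during the upload costs the store-gateways some startup work later, but the compaction job itself still counts as successful.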

Worth noting that the compactors should upload these sparse headers for new blocks only, so as not to create a huge backlog upon deploying a new Mimir version. But store-gateways should still be able to construct sparse headers themselves if those aren't available in the bucket.
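
On the store-gateway side, the lookup order described above could look roughly like the sketch below: local disk first, then the bucket, then the slow path of building it from the index header. This is an illustration only; `sparseLoader`, its fields, and the `source` values are hypothetical, and maps stand in for the disk and the bucket.

```go
package main

// source reports where a sparse header came from.
type source string

const (
	fromDisk   source = "disk"
	fromBucket source = "bucket"
	built      source = "built"
)

// sparseLoader holds hypothetical stand-ins for the three places a sparse
// header can come from.
type sparseLoader struct {
	disk   map[string][]byte           // sparse headers already persisted locally
	bucket map[string][]byte           // sparse headers uploaded by the compactor
	build  func(blockID string) []byte // slow path: sample the full index header
}

// load returns the sparse header for blockID, preferring the cheapest source,
// and persists the result locally so later calls hit the disk copy.
func (l *sparseLoader) load(blockID string) ([]byte, source) {
	if h, ok := l.disk[blockID]; ok {
		return h, fromDisk
	}
	h, ok := l.bucket[blockID]
	src := fromBucket
	if !ok {
		// Older blocks have nothing in the bucket, so build it locally.
		h, src = l.build(blockID), built
	}
	if l.disk == nil {
		l.disk = map[string][]byte{}
	}
	l.disk[blockID] = h
	return h, src
}
```

Blocks compacted before the new version simply take the `built` path once, which is the same behaviour as today, so no backfill of old blocks is needed.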

