Comments (3)
I am not a fan of non-strict mode. Currently, if the config is updated incorrectly, a metric is generated, which can be used for alerting. If no Cortex processes are restarted, they keep the old config and everything keeps working. If we are using Kubernetes and enough pods have crashed, the PDB can protect us from having too many restarts while we attend to the alert.
With non-strict mode, we might have broken things for a user with no alert for it. That makes it hard to reason about the health of the config.
from cortex.
If I have a typo in the `active_series` field, for example, then in non-strict mode Cortex will ignore the field and use the default value for `active_series`? This sounds even worse than crashing Cortex pods.
from cortex.
To answer your question @yeya24: yes, a typo in the `active_series` field will force Cortex to use the default value.
I think these are fair points. We can use the Kubernetes PDB to protect Cortex from loading a bad runtime configuration, and build a mechanism on our service side to roll back a bad configuration version.
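The typo scenario discussed above can be sketched as follows. This is an illustration only: it uses Go's standard `encoding/json` and its `DisallowUnknownFields` option as a stand-in for strict YAML parsing, and the `Limits` struct, the `decode` helper, and the default of 1000 are assumptions, not Cortex's real schema or loader.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// Limits is a hypothetical per-tenant limits struct.
type Limits struct {
	ActiveSeries int `json:"active_series"`
}

// defaultActiveSeries is an illustrative default, not Cortex's.
const defaultActiveSeries = 1000

// decode parses raw into a Limits value pre-filled with defaults.
// In strict mode an unknown field is an error; in non-strict mode it
// is silently dropped and the default stays in place.
func decode(raw []byte, strict bool) (Limits, error) {
	l := Limits{ActiveSeries: defaultActiveSeries}
	dec := json.NewDecoder(bytes.NewReader(raw))
	if strict {
		dec.DisallowUnknownFields()
	}
	err := dec.Decode(&l)
	return l, err
}

func main() {
	// Note the typo: "active_serie" instead of "active_series".
	typo := []byte(`{"active_serie": 50000}`)

	// Non-strict: no error, and the intended 50000 is lost.
	l, err := decode(typo, false)
	fmt.Println(l.ActiveSeries, err)

	// Strict: the typo surfaces as an error that can be alerted on.
	_, err = decode(typo, true)
	fmt.Println(err)
}
```

In the non-strict case the tenant silently gets the default limit instead of the intended one, which is the failure mode that is hard to alert on.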
from cortex.