Comments (13)

prymitive commented on August 26, 2024

Hmmm, I guess support for multiple instances will need to be added to fix #37, so I think it's reasonable to just allow passing multiple instances, update them concurrently, tag alerts with the Alertmanager instance name, and add a filter for it.
For HA we should probably de-duplicate alerts, so maybe alerts should be tagged with the list of instances reporting the same alert.
Should be the next big feature to work on, I'll start working on it once I have some time. Thanks!
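
As a rough illustration of the de-duplication idea above, here is a minimal Python sketch. It assumes two alerts are "the same" when their label sets are identical; the function name `deduplicate` and the data shapes are hypothetical and not taken from unsee, whose actual fingerprinting may differ.

```python
def deduplicate(alerts_by_instance):
    """Merge alerts reported by several Alertmanager instances.

    alerts_by_instance maps an instance name to a list of label dicts.
    Assumption: alerts with identical labels are the same alert.
    """
    merged = {}
    for instance, alerts in alerts_by_instance.items():
        for labels in alerts:
            # A frozenset of label pairs serves as a hashable identity key.
            key = frozenset(labels.items())
            entry = merged.setdefault(
                key, {"labels": dict(labels), "instances": []}
            )
            # Record every instance that reports this alert, once each.
            if instance not in entry["instances"]:
                entry["instances"].append(instance)
    return list(merged.values())
```

Each merged alert then carries an `instances` list, which is what a per-instance filter could match against.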

from unsee.

prymitive commented on August 26, 2024

Now that everyone mentioned it I really want to have that too ;)

I also need to do some refactoring around the code that fetches data from Alertmanager API, so that's a good opportunity to do that.

prymitive commented on August 26, 2024

I looked into it a bit and even got some code written, things that need to be changed:

  • configuration syntax - need to support multiple URIs, should allow naming AM instances and, if possible, configuring timeouts; having a config file starts to be a good idea
  • silence and alert ids/fingerprints need to contain the AM instance
  • silence API endpoint URI (used by the silence form) needs to be instance aware, probably best if each alert was tagged with instance: [am1, am2], a name -> uri mapping was provided in the JSON response, and the form would send silences to all instances where the alert is spotted (or even allow the user to select instances where it should be silenced, probably the best)
  • internal metrics will need to have another dimension with the instance name
  • upstream errors are now per instance, not global
  • error handling needs to be refactored - it's no longer binary, one instance might be down while another is up, so instead of a fullscreen error we only need a top bar warning if it's only a partial issue (fullscreen only if all AM instances are down)
  • alert source button needs to be per instance, this probably means that there should be a single button/label that will trigger a modal with list of instances and details (like link to source per instance).
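
The non-binary error handling described in the list can be sketched as follows. This is only an illustration under stated assumptions: the function name `error_level`, the return values, and the input shape (instance name mapped to an error string or `None`) are hypothetical and not part of unsee.

```python
def error_level(upstream_errors):
    """Decide how loudly the UI should report upstream failures.

    upstream_errors maps an instance name to an error string,
    or to None when that instance responded fine.
    """
    failed = [name for name, err in upstream_errors.items() if err]
    if not failed:
        return "ok"          # every instance is healthy
    if len(failed) == len(upstream_errors):
        return "fullscreen"  # all instances are down
    return "warning"         # partial outage: top bar warning only
```

For example, `error_level({"am1": "timeout", "am2": None})` yields `"warning"`, since only one of the two instances failed.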

All of that on top of the obvious changes, so a bit of work and will probably take me a week or two.

prymitive commented on August 26, 2024

Got some code in multi-upstream branch, it's working (as in should collect alerts from multiple Alertmanagers and deduplicate them), but it's not done, some features are missing, need to write new tests and clean up that branch. But it's usable for testing, ALERTMANAGER_URI now accepts multiple values separated by spaces, format is "name:uri name2:uri2", e.g. ALERTMANAGER_URI="prod:http://localhost staging:http://foo.bar"
It should be mergeable in a week or so, there's UI work needed which will require a bit of time, I don't want to overload it with yet another label, so will add some other UI elements to indicate which Alertmanager instance it was collected from (color bar on the side or something equally non-intrusive, but that must still allow filtering actions).
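
To make the "name:uri name2:uri2" format concrete, here is a minimal Python sketch of how such a value could be split. The function name `parse_alertmanager_uri` and its validation rules are assumptions for illustration; unsee's own parsing may behave differently.

```python
def parse_alertmanager_uri(value):
    """Parse 'name:uri name2:uri2' into a {name: uri} dict."""
    instances = {}
    for item in value.split():
        # Split on the first colon only, since the URI contains colons too.
        name, sep, uri = item.partition(":")
        if not sep or not name or not uri:
            raise ValueError(f"expected name:uri, got {item!r}")
        if name in instances:
            raise ValueError(f"duplicate instance name {name!r}")
        instances[name] = uri
    return instances


print(parse_alertmanager_uri("prod:http://localhost staging:http://foo.bar"))
```

Note that in this sketch a bare URI without a name, such as `http://localhost`, would be read as an instance named `http`, so a real implementation would need an explicit rule for that case.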

vincepii commented on August 26, 2024

Similar use case here: multiple alertmanagers on different environments and we would like to have a single dashboard where we could see, at a glance, all the alerts that have triggered in all the environments.

It would be amazing to be able to use unsee for this job.

Rudd-O commented on August 26, 2024

Wanted to second OP and vincepii — we also need data to be sourced from multiple alertmanagers.

filippog commented on August 26, 2024

Thanks for working on this! I'd like to have it too

> But it's usable for testing, ALERTMANAGER_URI now accepts multiple values separated by spaces, format is "name:uri name2:uri2", e.g. ALERTMANAGER_URI="prod:http://localhost staging:http://foo.bar"

Random nitpick: I'd use a comma-separated list of values for ALERTMANAGER_URI, it would match what Prometheus expects and it is easier to quote, e.g. in the shell :D

prymitive commented on August 26, 2024

All multi-value env options are space separated, so we would have to change everything for consistency. I'd rather keep it as is right now.

filippog commented on August 26, 2024

@prymitive ah, if space separated is already the rule then yes 👍

prymitive commented on August 26, 2024

What's on multi-upstream is almost feature complete, filter support is missing and some UI tweaks are needed, but it should be usable. The long weekend might allow me to finish it, or maybe it will make me lazy, no promises.

prymitive commented on August 26, 2024

I think I'm done and all pieces are in place, PR awaits a review now.

prymitive commented on August 26, 2024

PR merged, this will be in the next release, which will be tagged after some testing (and likely some fixes); that will probably take a week or two.
You can use it today via docker using the cloudflare/unsee:latest image.

prymitive commented on August 26, 2024

unsee 0.7.0 was just tagged and it includes those changes, see https://github.com/cloudflare/unsee/releases/tag/v0.7.0
