
Comments (9)

dmke commented on May 25, 2024

Yeah, alertmanager_cluster["listen-address"] = "" does the trick. I'll close this for now.

from ansible-alertmanager.

paulfantom commented on May 25, 2024

Sorry for my lack of reply.

We try not to have any dependencies between roles, and we cannot ensure there is a prometheus.service on the same host as alertmanager. Even when there is a prometheus.service, there is no guarantee that alertmanager is configured to receive alerts from that prometheus instance. I would recommend doing such a check at the playbook level.

As for the second part (RestartSec=5s), I am currently securing all service files and will include it shortly.


dmke commented on May 25, 2024

Fair point. We're actually already using drop-ins for that purpose:

# systemctl status alertmanager.service
● alertmanager.service - Prometheus Alertmanager
   Loaded: loaded (/etc/systemd/system/alertmanager.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/alertmanager.service.d
           └─after-prometheus.conf
           └─slowdown-restarts.conf
[...]

with /etc/systemd/system/alertmanager.service.d/after-prometheus.conf containing just

[Unit]
After=prometheus.service

Maybe this could be added to the README.md, e.g. in the Example section?
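
For readers who want to manage such a drop-in at the playbook level, as recommended earlier in the thread, a minimal sketch might look like the following. The host group name `monitoring` is hypothetical, and the tasks are an illustration, not part of the ansible-alertmanager role.

```yaml
# Sketch only: a play that installs the ordering drop-in shown above.
# "monitoring" is a hypothetical host group; adjust to your inventory.
- hosts: monitoring
  become: true
  tasks:
    - name: Create the drop-in directory for alertmanager.service
      ansible.builtin.file:
        path: /etc/systemd/system/alertmanager.service.d
        state: directory
        mode: "0755"

    - name: Order alertmanager after prometheus on this host
      ansible.builtin.copy:
        dest: /etc/systemd/system/alertmanager.service.d/after-prometheus.conf
        mode: "0644"
        content: |
          [Unit]
          After=prometheus.service
      notify: Reload systemd

  handlers:
    # systemd only picks up new drop-ins after a daemon reload.
    - name: Reload systemd
      ansible.builtin.systemd:
        daemon_reload: true
```

Note that After= only affects start ordering when both units are started together; it does not make prometheus.service a requirement.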

> As for the second part (RestartSec=5s), I am currently securing all service files and will include it shortly.

Great!


paulfantom commented on May 25, 2024

I also looked at your error messages, and it seems quite strange that your alertmanager requires a prometheus server to start. This is not the usual case for alertmanager, as it should be able to operate without any prometheus server (communication is unidirectional and alertmanager is on the receiving end).

From what I see, you are having problems because of a networking issue, as can be seen in the logs:

level=warn ts=2019-07-10T00:00:21.115Z caller=cluster.go:154 component=cluster err="couldn't deduce an advertise address: no private IP found, explicit advertise addr not provided"

Are you using alertmanager in HA mode with a gossip network? It looks like alertmanager cannot start because the gossip network address (normally set with --cluster.advertise-address) is not available. Could you provide the ansible variables that were used to deploy alertmanager?


> Maybe this could be added to the README.md, e.g. in the Example section?

No, there is no requirement to have prometheus running in order to use alertmanager. Of course, running it without prometheus doesn't make much sense, but it is not a hard requirement.


paulfantom commented on May 25, 2024

This error:

level=warn ts=2019-07-10T00:00:21.115Z caller=cluster.go:154 component=cluster err="couldn't deduce an advertise address: no private IP found, explicit advertise addr not provided"

is something you get when one of alertmanager's requirements is not met. As the alertmanager docs say:

The cluster.advertise-address flag is required if the instance doesn't have an IP address that is part of RFC 6890 with a default route.

Source: https://github.com/prometheus/alertmanager#high-availability
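
If HA mode is actually wanted on a host without such an address, the advertise address would have to be set explicitly. A hedged sketch, assuming the role's alertmanager_cluster dictionary passes each key through as a --cluster.* flag (not confirmed here), with a placeholder address from the documentation range:

```yaml
# Assumption: each key in alertmanager_cluster becomes a --cluster.<key> flag.
# 192.0.2.10:9094 is a placeholder; use an address reachable by the other peers.
alertmanager_cluster:
  advertise-address: "192.0.2.10:9094"
```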


dmke commented on May 25, 2024

I'm not aware that we're using alertmanager in HA mode. From what I can gather from our GitLab instance (I don't have the repo checked out right now), these are the only alertmanager_* variables set:

alertmanager_version:        0.18.0
alertmanager_receivers:      [] # multiple configured
alertmanager_route:          {} # configured
alertmanager_child_routes:   [] # routes omitted
alertmanager_listen_address: "127.0.0.1:9093"
alertmanager_external_url:   "https://alertmanager.example.com"

Maybe of note: we've submoduled 50d90b5.

I'll have a more detailed look tomorrow when I'm back in the office.


paulfantom commented on May 25, 2024

I am not sure if this would cause problems, but you have a misconfiguration in two places:
- alertmanager_listen_address should be alertmanager_web_listen_address
- alertmanager_external_url should be alertmanager_web_external_url

Those options were renamed in 0.11.0 to make them similar to the alertmanager config.

I just saw that we have a backwards-compatibility layer in place, so those variables are translated to their newer names. Nevertheless, you should consider updating them.
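
As an illustration only, the two renamed variables from the list above would look like this after the update. The values are the ones reported earlier in this thread; nothing else is changed.

```yaml
# Same values as reported in this thread, using the current variable names.
alertmanager_web_listen_address: "127.0.0.1:9093"                   # was alertmanager_listen_address
alertmanager_web_external_url: "https://alertmanager.example.com"   # was alertmanager_external_url
```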

On a side note, wow, you have been using it for quite a long time (0.11.0 was released more than a year ago)!

I just discovered we didn't update that part of our docs for over a year 🤦‍♂️


dmke commented on May 25, 2024

I've checked and we don't run alertmanager in HA mode. However, the alertmanager docs state (emphasis mine):

--cluster.listen-address string: cluster listen address (default "0.0.0.0:9094"; empty string disables HA mode)

combined with alertmanager_cluster configured to be {} (i.e. the default value), this results in the following ExecStart directive:

ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager \
  --web.listen-address=127.0.0.1:9093 \
  --web.external-url=https://alertmanager.example.com

Note the absence of --cluster.listen-address="", which would disable HA mode.

This is just a bad default on alertmanager's part. I'll check whether

alertmanager_cluster:
  listen-address: ""

shuts that off.


> I just discovered we didn't update that part of our docs for over a year 🤦‍♂️

😀


lock commented on May 25, 2024

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

