Comments (9)
Yeah, alertmanager_cluster["listen-address"] = "" does the trick. I'll close this for now.
from ansible-alertmanager.
Sorry for my lack of reply.
We try not to have any dependencies between roles, and we cannot ensure there is a prometheus.service on the same host as alertmanager. Even when there is a prometheus.service, there is no guarantee that alertmanager is configured to receive alerts from this prometheus instance. I would recommend doing such a check at the playbook level.
As for the second part (RestartSec=5s), I am currently securing all service files and will include it shortly.
from ansible-alertmanager.
Fair point. We're actually already using drop-ins for that purpose:
# systemctl status alertmanager.service
● alertmanager.service - Prometheus Alertmanager
Loaded: loaded (/etc/systemd/system/alertmanager.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/alertmanager.service.d
└─after-prometheus.conf
└─slowdown-restarts.conf
[...]
with /etc/systemd/system/alertmanager.service.d/after-prometheus.conf containing just
[Unit]
After=prometheus.service
Maybe this could be added to the README.md, e.g. in the Example section?
As for the second part (RestartSec=5s), I am currently securing all service files and will include it shortly.
Great!
from ansible-alertmanager.
I also looked at your error messages, and it seems quite strange that your alertmanager requires a prometheus server to start. This is not the usual case for alertmanager, as it should be able to operate without any prometheus server (communication is unidirectional and alertmanager is on the receiving end).
From what I see, your problems stem from a networking issue, as can be seen in the logs:
level=warn ts=2019-07-10T00:00:21.115Z caller=cluster.go:154 component=cluster err="couldn't deduce an advertise address: no private IP found, explicit advertise addr not provided"
Are you using alertmanager in HA mode with a gossip network? It looks like alertmanager cannot start because the gossip network address specified with --cluster.advertise-address is not available. Could you provide the ansible variables that were used to deploy alertmanager?
Maybe this could be added to the README.md, e.g. in the Example section?
No, there is no requirement to have prometheus running in order to use alertmanager. Of course running alertmanager on its own doesn't make much sense, but prometheus is not a hard requirement.
from ansible-alertmanager.
This error:
level=warn ts=2019-07-10T00:00:21.115Z caller=cluster.go:154 component=cluster err="couldn't deduce an advertise address: no private IP found, explicit advertise addr not provided"
is something you get when one of alertmanager's requirements is not met. As the alertmanager docs say:
The cluster.advertise-address flag is required if the instance doesn't have an IP address that is part of RFC 6890 with a default route.
Source: https://github.com/prometheus/alertmanager#high-availability
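To make that requirement a bit more concrete, here is a rough Python sketch of the kind of check involved. It is only an approximation under stated assumptions: it does not reproduce alertmanager's actual address deduction and it ignores the default-route condition entirely. The function name `has_usable_private_ip` and the sample addresses are made up for illustration.

```python
import ipaddress


def has_usable_private_ip(addresses):
    """Return True if any address is private and not loopback or link-local.

    Hypothetical helper: a simplified stand-in for "the instance has a
    private IP address", not alertmanager's real logic.
    """
    for raw in addresses:
        ip = ipaddress.ip_address(raw)
        if ip.is_private and not ip.is_loopback and not ip.is_link_local:
            return True
    return False


# A host with only loopback and a public address: nothing to advertise.
print(has_usable_private_ip(["127.0.0.1", "8.8.8.8"]))
# A host that also has an address in 10.0.0.0/8.
print(has_usable_private_ip(["127.0.0.1", "10.0.0.5"]))
```

A host in the first situation would have to pass an explicit advertise address or disable clustering altogether.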
from ansible-alertmanager.
I'm not aware we're using alertmanager in HA mode. From what I can gather from our GitLab instance (I don't have the repo checked out right now), these are the only alertmanager_* variables set:
alertmanager_version: 0.18.0
alertmanager_receivers: [] # multiple configured
alertmanager_route: {} # configured
alertmanager_child_routes: [] # routes omitted
alertmanager_listen_address: "127.0.0.1:9093"
alertmanager_external_url: "https://alertmanager.example.com"
Maybe of note: we've submoduled 50d90b5.
I'll have a more detailed look tomorrow when I'm back in the office.
from ansible-alertmanager.
I am not sure whether this causes problems, but you have a misconfiguration in two places:
- alertmanager_listen_address should be alertmanager_web_listen_address
- alertmanager_external_url should be alertmanager_web_external_url
Those options were changed in 0.11.0 to make them similar to the alertmanager config.
I just saw that we have a backwards-compatibility layer in place, so those variables are translated to their newer names. Nevertheless, you should consider updating them.
On a side note, wow, you have been using it for quite a long time (0.11.0 was released more than a year ago)!
I just discovered we didn't update that part of our docs for over a year 🤦♂️
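For anyone wondering what such a translation amounts to, here is a minimal Python sketch. It is not the role's actual implementation (which lives in Ansible, not Python); the helper name `translate_legacy_vars` is hypothetical, and the mapping contains only the two renames mentioned above. The sketch assumes that an explicitly set new-style variable takes precedence over its legacy counterpart.

```python
# Hypothetical mapping of legacy variable names to their current names,
# limited to the two renames mentioned in this thread.
LEGACY_NAMES = {
    "alertmanager_listen_address": "alertmanager_web_listen_address",
    "alertmanager_external_url": "alertmanager_web_external_url",
}


def translate_legacy_vars(role_vars):
    """Return a copy of role_vars with legacy names replaced by new ones."""
    # Keep everything that is not a legacy name as-is.
    translated = {
        name: value
        for name, value in role_vars.items()
        if name not in LEGACY_NAMES
    }
    # Fill in new names from legacy ones unless the new name is already set.
    for old, new in LEGACY_NAMES.items():
        if old in role_vars and new not in translated:
            translated[new] = role_vars[old]
    return translated


print(translate_legacy_vars({
    "alertmanager_version": "0.18.0",
    "alertmanager_listen_address": "127.0.0.1:9093",
    "alertmanager_external_url": "https://alertmanager.example.com",
}))
```

Switching to the new names directly in the inventory makes this step unnecessary.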
from ansible-alertmanager.
I've checked and we don't run alertmanager in HA mode. However, the alertmanager docs state (emphasis mine):
--cluster.listen-address string: cluster listen address (default "0.0.0.0:9094"; empty string disables HA mode)
Combined with alertmanager_cluster configured to be {} (i.e. the default value), this results in the following ExecStart directive:
ExecStart=/usr/local/bin/alertmanager \
  --config.file=/etc/alertmanager/alertmanager.yml \
  --storage.path=/var/lib/alertmanager \
  --web.listen-address=127.0.0.1:9093 \
  --web.external-url=https://alertmanager.example.com
Note the absence of a --cluster.listen-address="", which would disable HA mode.
This is just a bad default on alertmanager's part. I'll check whether
alertmanager_cluster:
  listen-address: ""
shuts that off.
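To illustrate why the default {} leaves the flag out entirely, here is a small Python sketch. It assumes the role emits one --cluster.<option>=<value> argument per entry of alertmanager_cluster; that is my reading of the generated unit file, not something confirmed from the template, and `render_cluster_flags` is a hypothetical name.

```python
def render_cluster_flags(alertmanager_cluster):
    """Hypothetical helper: one --cluster.<option>=<value> flag per entry."""
    return [
        f"--cluster.{option}={value}"
        for option, value in sorted(alertmanager_cluster.items())
    ]


# Default value: no entries, so no cluster flag is rendered and
# alertmanager falls back to its own default listen address.
print(render_cluster_flags({}))

# Explicit empty string: the flag is rendered with an empty value.
print(render_cluster_flags({"listen-address": ""}))
```

Under that assumption, the empty dict and the explicit empty string behave differently, which is the distinction that matters here.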
I just discovered we didn't update that part of our docs for over a year 🤦♂️
😀
from ansible-alertmanager.
This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.