prometheus-msteams / prometheus-msteams
Forward Prometheus Alert Manager notifications to Microsoft Teams.
License: MIT License
I am using the prometheus-msteams 1.1.4 Helm chart to integrate the latest Prometheus/Alertmanager with MS Teams in a Kubernetes environment.
There are issues with the payloads Prometheus sends to msteams, so alert notifications are sometimes not delivered correctly to MS Teams. The prometheus-msteams container logs show this error:
time="2019-11-06T07:01:08Z" level=info msg="/alertmanager received a request"
time="2019-11-06T07:01:08Z" level=debug msg="Prometheus Alert: {\"receiver\":\"prometheus-msteams\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"KubeDeploymentReplicasMismatch\",\"deployment\":\"storagesvc\",\"endpoint\":\"http\",\"instance\":\"10.233.108.72:8080\",\"job\":\"kube-state-metrics\",\"namespace\":\"fission\",\"pod\":\"monitor-kube-state-metrics-856bc9455b-7z5qx\",\"prometheus\":\"monitoring/monitor-prometheus-operato-prometheus\",\"service\":\"monitor-kube-state-metrics\",\"severity\":\"critical\"},\"annotations\":{\"message\":\"Deployment fission/storagesvc has not matched the expected number of replicas for longer than 15 minutes.\",\"runbook_url\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch\"},\"startsAt\":\"2019-11-06T07:00:32.453590324Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://monitor-prometheus-operato-prometheus.monitoring:9090/graph?g0.expr=kube_deployment_spec_replicas%7Bjob%3D%22kube-state-metrics%22%7D+%21%3D+kube_deployment_status_replicas_available%7Bjob%3D%22kube-state-metrics%22%7D\\u0026g0.tab=1\"},{\"status\":\"firing\",\"labels\":{\"alertname\":\"KubePodNotReady\",\"namespace\":\"fission\",\"pod\":\"storagesvc-5bff46b69b-vfdrd\",\"prometheus\":\"monitoring/monitor-prometheus-operato-prometheus\",\"severity\":\"critical\"},\"annotations\":{\"message\":\"Pod fission/storagesvc-5bff46b69b-vfdrd has been in a non-ready state for longer than 15 minutes.\",\"runbook_url\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready\"},\"startsAt\":\"2019-11-06T07:00:32.453590324Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://monitor-prometheus-operato-prometheus.monitoring:9090/graph?g0.expr=sum+by%28namespace%2C+pod%29+%28kube_pod_status_phase%7Bjob%3D%22kube-state-metrics%22%2Cphase%3D~%22Failed%7CPending%7CUnknown%22%7D%29+%3E+0\\u0026g0.tab=1\"}],\"groupLabels\":{\"namespace\":\"fission\",\"severity\":\"critical\"},\"commonLabels\":{\"namespace\":\"fission\",\"prometheus\":\"monitoring/monitor-prometheus-operato-prometheus\",\"severity\":\"critical\"},\"commonAnnotations\":{},\"externalURL\":\"http://monitor-prometheus-operato-alertmanager.monitoring:9093\",\"version\":\"4\",\"groupKey\":\"{}:{namespace=\\\"fission\\\", severity=\\\"critical\\\"}\"}"
time="2019-11-06T07:01:08Z" level=debug msg="Alert rendered in template file: \r\n{\r\n \"@type\": \"MessageCard\",\r\n \"@context\": \"http://schema.org/extensions\",\r\n \"themeColor\": \"8C1A1A\",\r\n \"summary\": \"\",\r\n \"title\": \"Prometheus Alert (firing)\",\r\n \"sections\": [ \r\n {\r\n \"activityTitle\": \"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\r\n \"facts\": [\r\n {\r\n \"name\": \"message\",\r\n \"value\": \"Deployment fission/storagesvc has not matched the expected number of replicas for longer than 15 minutes.\"\r\n },\r\n {\r\n \"name\": \"runbook\\\\_url\",\r\n \"value\": \"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch\"\r\n },\r\n {\r\n \"name\": \"alertname\",\r\n \"value\": \"KubeDeploymentReplicasMismatch\"\r\n },\r\n {\r\n \"name\": \"deployment\",\r\n \"value\": \"storagesvc\"\r\n },\r\n {\r\n \"name\": \"endpoint\",\r\n \"value\": \"http\"\r\n },\r\n {\r\n \"name\": \"instance\",\r\n \"value\": \"10.233.108.72:8080\"\r\n },\r\n {\r\n \"name\": \"job\",\r\n \"value\": \"kube-state-metrics\"\r\n },\r\n {\r\n \"name\": \"namespace\",\r\n \"value\": \"fission\"\r\n },\r\n {\r\n \"name\": \"pod\",\r\n \"value\": \"monitor-kube-state-metrics-856bc9455b-7z5qx\"\r\n },\r\n {\r\n \"name\": \"prometheus\",\r\n \"value\": \"monitoring/monitor-prometheus-operato-prometheus\"\r\n },\r\n {\r\n \"name\": \"service\",\r\n \"value\": \"monitor-kube-state-metrics\"\r\n },\r\n {\r\n \"name\": \"severity\",\r\n \"value\": \"critical\"\r\n }\r\n ],\r\n \"markdown\": true\r\n },\r\n {\r\n \"activityTitle\": \"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\r\n \"facts\": [\r\n {\r\n \"name\": \"message\",\r\n \"value\": \"Pod fission/storagesvc-5bff46b69b-vfdrd has been in a non-ready state for longer than 15 minutes.\"\r\n },\r\n {\r\n \"name\": \"runbook\\\\_url\",\r\n \"value\": \"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready\"\r\n },\r\n {\r\n \"name\": \"alertname\",\r\n \"value\": \"KubePodNotReady\"\r\n },\r\n {\r\n \"name\": \"namespace\",\r\n \"value\": \"fission\"\r\n },\r\n {\r\n \"name\": \"pod\",\r\n \"value\": \"storagesvc-5bff46b69b-vfdrd\"\r\n },\r\n {\r\n \"name\": \"prometheus\",\r\n \"value\": \"monitoring/monitor-prometheus-operato-prometheus\"\r\n },\r\n {\r\n \"name\": \"severity\",\r\n \"value\": \"critical\"\r\n }\r\n ],\r\n \"markdown\": true\r\n }\r\n ]\r\n}\r\n"
time="2019-11-06T07:01:08Z" level=debug msg="Size of message is 1714 Bytes (~1 KB)"
time="2019-11-06T07:01:08Z" level=info msg="Created a card for Microsoft Teams /alertmanager"
time="2019-11-06T07:01:08Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"8C1A1A\",\"summary\":\"\",\"title\":\"Prometheus Alert (firing)\",\"sections\":[{\"activityTitle\":\"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\"facts\":[{\"name\":\"message\",\"value\":\"Deployment fission/storagesvc has not matched the expected number of replicas for longer than 15 minutes.\"},{\"name\":\"runbook\\\\_url\",\"value\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch\"},{\"name\":\"alertname\",\"value\":\"KubeDeploymentReplicasMismatch\"},{\"name\":\"deployment\",\"value\":\"storagesvc\"},{\"name\":\"endpoint\",\"value\":\"http\"},{\"name\":\"instance\",\"value\":\"10.233.108.72:8080\"},{\"name\":\"job\",\"value\":\"kube-state-metrics\"},{\"name\":\"namespace\",\"value\":\"fission\"},{\"name\":\"pod\",\"value\":\"monitor-kube-state-metrics-856bc9455b-7z5qx\"},{\"name\":\"prometheus\",\"value\":\"monitoring/monitor-prometheus-operato-prometheus\"},{\"name\":\"service\",\"value\":\"monitor-kube-state-metrics\"},{\"name\":\"severity\",\"value\":\"critical\"}],\"markdown\":true},{\"activityTitle\":\"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\"facts\":[{\"name\":\"message\",\"value\":\"Pod fission/storagesvc-5bff46b69b-vfdrd has been in a non-ready state for longer than 15 minutes.\"},{\"name\":\"runbook\\\\_url\",\"value\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready\"},{\"name\":\"alertname\",\"value\":\"KubePodNotReady\"},{\"name\":\"namespace\",\"value\":\"fission\"},{\"name\":\"pod\",\"value\":\"storagesvc-5bff46b69b-vfdrd\"},{\"name\":\"prometheus\",\"value\":\"monitoring/monitor-prometheus-operato-prometheus\"},{\"name\":\"severity\",\"value\":\"critical\"}],\"markdown\":true}]}]"
time="2019-11-06T07:01:08Z" level=info msg="Microsoft Teams response text: Summary or Text is required."
time="2019-11-06T07:01:08Z" level=error msg="Failed sending to the Teams Channel. Teams http response: 400 Bad Request"
Maybe this is an issue with the default card template, due to which MS Teams returns the "Summary or Text is required." response.
Also, I would suggest: it would be nice to be able to declare the connectors in one or more ConfigMaps, so that connectors can be added separately from the service deployment.
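A rough sketch of what I mean (resource names and webhook URLs here are made up, and as far as I know the chart does not support this today):
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-msteams-connectors   # assumed name, managed separately from the Deployment
  namespace: monitoring                 # assumed namespace
data:
  connectors.yaml: |
    connectors:
    - team_a_channel: https://outlook.office.com/webhook/xxx   # placeholder webhook
    - team_b_channel: https://outlook.office.com/webhook/yyy   # placeholder webhook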
Hi team,
First off, thanks for this app. I would like to add a feature request to allow variables to be used in the URL. We have a use case with dozens of MS Teams channels to set up, and it would be very useful to be able to define a single receiver that uses a variable to select the correct URL path. In the following example I am adding annotations for email and the msteams connector. It would be useful to be able to use the {{ .annotation.msteams }} variable in the receiver so I do not have to maintain dozens of receivers, each with its connector in the URL (see the sketch after the alertmanager config below).
rule:
  alert: Alert  # testing rule new alert
  expr: up{hostname="xxxx.dev.local",job="deploydirector"} == 1
  for: 1m
  labels:
    channel: msteams,email,monolith
    severity: Critical
  annotations:
    application: git.tools.foo.net
    description: '{{ $labels.job }} server has been down for more than 5 minutes.'
    documentation_link: https://confluence.com/display/MOD/Deployments+FAQ
    email: [email protected]
    msteams: Operations
    summary: THIS IS A TEST- {{ $labels.job }} is down- THIS IS A TEST
alertmanager config:
global:
  resolve_timeout: 5m
route:
  receiver: email
  group_by:
  - alertname
  routes:
  - receiver: email
    match_re:
      channel: ^(?:.*(email).*)$
    continue: true
  - receiver: msteams
    match_re:
      channel: ^(?:.*(msteams).*)$
    continue: true
  - receiver: monolith
    match_re:
      channel: ^(?:.*(monolith).*)$
    repeat_interval: 8737h
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 4h
receivers:
- name: email
  email_configs:
- name: msteams
  webhook_configs:
- name: monolith
  webhook_configs:
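For illustration only, the kind of receiver I have in mind would look something like this (hypothetical; Alertmanager does not template webhook URLs today, which is exactly the feature being requested):
receivers:
- name: msteams
  webhook_configs:
  # hypothetical: the path segment would be filled from the alert's msteams annotation,
  # e.g. /Operations, so one receiver could serve many Teams channels
  - url: 'http://prometheus-msteams:2000/{{ .CommonAnnotations.msteams }}'
    send_resolved: true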
Would this be possible?
thanks
The Prometheus documentation provides a list of integrations made by 3rd party projects using the webhook receiver: https://prometheus.io/docs/operating/integrations/#alertmanager-webhook-receiver.
This documentation is open source, and we should add prometheus-msteams to the list.
Hello, I am trying to test the functionality of the prometheus-msteams pod by directly sending it a message using your template. I get this in return (URL modified for privacy):
/ # curl -X POST -d @prom-alert.json http://prometheus-msteams:2000/alertmanager
Failed sending to webhook url https://outlook.office.com/webhook/xxx/xxx. Got the error: Post https://outlook.office.com/webhook/xxx/xxx: Forbidden
I check the logs, and I see this:
time="2019-12-03T22:40:58Z" level=info msg="/alertmanager received a request"
time="2019-12-03T22:40:59Z" level=debug msg="Prometheus Alert: {\"receiver\":\"teams_proxy\",\"status\":\"firing\",\"alerts\":[{\"status\":\"\",\"labels\":{\"alertname\":\"high_memory_load\",\"instance\":\"10.80.40.11:9100\",\"job\":\"docker_nodes\",\"monitor\":\"master\",\"severity\":\"warning\"},\"annotations\":{\"description\":\"xxxxxxx\",\"summary\":\"Server High Memory usage\"},\"startsAt\":\"2018-03-07T06:33:21.873077559-05:00\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"\"}],\"groupLabels\":{\"alertname\":\"high_memory_load\"},\"commonLabels\":{\"alertname\":\"high_memory_load\",\"monitor\":\"master\",\"severity\":\"warning\"},\"commonAnnotations\":{\"summary\":\"Server High Memory usage\"},\"externalURL\":\"http://alertmanager:9093\",\"version\":\"4\",\"groupKey\":\"{}:{alertname=\\\"high_memory_load\\\"}\"}"
time="2019-12-03T22:40:59Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"FFA500\",\n \"summary\": \"Server High Memory usage\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[xxxxxxx](http://alertmanager:9093)\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"xxxxxxx\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"Server High Memory usage\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"high memory load\"\n },\n {\n \"name\": \"instance\",\n \"value\": \"10.80.40.11:9100\"\n },\n {\n \"name\": \"job\",\n \"value\": \"docker nodes\"\n },\n {\n \"name\": \"monitor\",\n \"value\": \"master\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"warning\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2019-12-03T22:40:59Z" level=debug msg="Size of message is 557 Bytes (~0 KB)"
time="2019-12-03T22:40:59Z" level=info msg="Created a card for Microsoft Teams /alertmanager"
time="2019-12-03T22:40:59Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"FFA500\",\"summary\":\"Server High Memory usage\",\"title\":\"Prometheus Alert (firing)\",\"sections\":[{\"activityTitle\":\"[xxxxxxx](http://alertmanager:9093)\",\"facts\":[{\"name\":\"description\",\"value\":\"xxxxxxx\"},{\"name\":\"summary\",\"value\":\"Server High Memory usage\"},{\"name\":\"alertname\",\"value\":\"high memory load\"},{\"name\":\"instance\",\"value\":\"10.80.40.11:9100\"},{\"name\":\"job\",\"value\":\"docker nodes\"},{\"name\":\"monitor\",\"value\":\"master\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true}]}]"
time="2019-12-03T22:40:59Z" level=error msg="Failed sending to webhook url https://outlook.office.com/webhook/xxx/xxx. Got the error: Post https://outlook.office.com/webhook/xxx/xxx: Forbidden"
I tested the webhook directly using the test messages and format from here - https://docs.microsoft.com/en-us/outlook/actionable-messages/send-via-connectors
And these work fine, so I know the webhook is operational and functioning.
Note, I have not configured Alertmanager yet; I am simply testing directly against the prometheus-msteams pod to verify that it will work once Alertmanager is set up.
Not sure why it's telling me my connection is forbidden.
Hi,
I tested prometheus-msteams these days; it is really a wonderful tool, and better than the Python one introduced in the official link, I think.
Now I'm managing connectors in the config file via SaltStack, which dynamically creates connectors from pillar values. But if no connectors are defined, the service fails to start; it looks like prometheus-msteams needs at least one connector to start, so what I have done is this:
connectors:
- dummy: "Used for empty connectors"
By doing this, if I have no actual connectors defined in my pillar, at least the service can be running.
Any thoughts?
Command cat $HOME/.prometheus-msteams.yaml
produces:
connectors:
cc_channel: https://localhost
dd_channel: https://localhost
Command /bin/promteams server
produces:
Using config file: /root/.prometheus-msteams.yaml
2018/06/22 05:05:31 A config file (-f) or --request-uri or --webhook-url is not found.
Usage:
prometheus-msteams server [flags]
Flags:
-h, --help help for server
-l, --listen-address string the address on which the server will listen (default "0.0.0.0")
-p, --port int port on which the server will listen (default 2000)
-r, --request-uri string the request uri path. Do not use this if using a config file. (default "alertmanager")
-w, --webhook-url string the incoming webhook url to post the alert messages. Do not use this if using a config file.
Global Flags:
--config string config file (default is $HOME/.prometheus-msteams.yaml)
Command /bin/promteams server --config /test.yaml
produces:
Using config file: /test.yaml
2018/06/22 05:06:46 A config file (-f) or --request-uri or --webhook-url is not found.
Usage:
prometheus-msteams server [flags]
Flags:
-h, --help help for server
-l, --listen-address string the address on which the server will listen (default "0.0.0.0")
-p, --port int port on which the server will listen (default 2000)
-r, --request-uri string the request uri path. Do not use this if using a config file. (default "alertmanager")
-w, --webhook-url string the incoming webhook url to post the alert messages. Do not use this if using a config file.
Global Flags:
--config string config file (default is $HOME/.prometheus-msteams.yaml)
The alert is being sent to Alertmanager. Then it is not forwarded to the msteams receiver, I guess. I am getting the following error:
" unexpected status code 500 from http://teamreceiver:2000/alertmanager"
Hey,
Having issues with payloads from Prometheus being sent to msteams. We've recently updated to the latest Prometheus/Alertmanager and version 1.1.4 of prometheus-msteams.
Test alerts work fine, so it's an error in the formatting coming out of a default prometheus-operator Helm chart install. I haven't looked much further yet, but I can see a few people do have templating errors. Given 1.1.4 has just been released, I thought I'd raise an issue as well.
The template is the default one, unedited, and this is the error being received in the logs:
time="2019-08-05T07:42:15Z" level=debug msg="Prometheus Alert: {\"receiver\":\"high_priority_receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"annotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"startsAt\":\"2019-08-05T07:36:45.470372903Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://prometheus-dashboard.testing/graph?g0.expr=vector%281%29\\u0026g0.tab=1\"}],\"groupLabels\":{},\"commonLabels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"commonAnnotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"externalURL\":\"http://alertmanager.testing\",\"version\":\"4\",\"groupKey\":\"{}:{}\"}"
time="2019-08-05T07:42:15Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"808080\",\n \"summary\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[ aaaa](http://alertmanager.testing)\",\n \"facts\": [\n {\n \"name\": \"message\",\n \"value\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"prometheus\",\n \"value\": \"monitoring/prometheus-operator-prometheus\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"none\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2019-08-05T07:42:15Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
time="2019-08-05T07:42:15Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
I installed via helm:
helm install --name prometheus-msteams ./prometheus-msteams --namespace monitoring
I didn't use a custom config.yaml; instead I updated the values.yaml:
connectors:
- alertmanager: https://outlook.office.com/webhook/ad5b7417-6a87-4f47-xxxxxxxx8@8a4925a9-fd8e-xxxxxxxxx/IncomingWebhook/21595d04c1aa432xxxxxxxxxxxxxx/602f4df1-xxxxxx-xxxxx
Now I am getting an error:
level=error ts=2019-01-10T12:15:31.712451123Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="Post http://localhost:2000/alertmanager: dial tcp 127.0.0.1:2000: connect: connection refused"
I have updated the alertmanager.yaml as well with the below details:
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 5m
  receiver: sns-forwarder
receivers:
- name: 'sns-forwarder'
  slack_configs:
  - api_url: https://hooks.slack.com/services/TF90Yxxxxxxxxxxxxxxxxxxxx
    channel: '#k8'
    icon_emoji: ':bell:'
    send_resolved: true
    text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
  webhook_configs: # https://prometheus.io/docs/alerting/configuration/#webhook_config
  - send_resolved: true
    url: 'http://localhost:2000/alertmanager'
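For what it's worth, since Alertmanager runs in its own pod, http://localhost:2000 points back at the Alertmanager container itself rather than at the prometheus-msteams service, which would explain the connection-refused error. A sketch of the webhook pointed at the chart's Service instead (the Service name and namespace are assumptions based on the install above):
webhook_configs:
- send_resolved: true
  # assumed Service name/namespace created by the prometheus-msteams chart in "monitoring"
  url: 'http://prometheus-msteams.monitoring.svc.cluster.local:2000/alertmanager'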
I have enabled a custom webhook with Alertmanager to self-trigger Jenkins jobs when an alert is received. When I introduce the Microsoft Teams proxy, my original webhook stops working: I get alerts via email and on Teams that something is wrong, but it does not trigger my original webhook to invoke Jenkins to take the required action and start my job.
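One thing that may be worth checking here (a sketch with assumed receiver names): Alertmanager stops at the first matching route unless continue: true is set, so the route for the Teams proxy and the route for the Jenkins webhook both need to match for both receivers to fire.
route:
  receiver: default
  routes:
  - receiver: msteams            # assumed receiver name for the Teams proxy
    match:
      severity: critical
    continue: true               # keep evaluating the remaining routes
  - receiver: jenkins-webhook    # assumed receiver name for the Jenkins trigger
    match:
      severity: critical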
Thanks for the great app. Just wondered if there was a straightforward way of using it with a proxy server?
Thanks
Hi I am running prom2teams through docker-compose
my docker-compose.yml file:
version: '2'
volumes:
  prometheus_data: {}
  grafana_data: {}
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus/:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    ports:
      - "9090:9090"
  alertmanager:
    image: "prom/alertmanager"
    volumes:
      - ./msteams.yml:/alertmanager.yml
    command:
      - "--config.file=/alertmanager.yml"
    ports:
      - "9093:9093"
  grafana:
    image: grafana/grafana:5.1.0
    depends_on:
      - prometheus
    ports:
      - "3000:3000"
    user: "104"
  promteams:
    image: bzon/prometheus-msteams:latest
    environment:
      - TEAMS_INCOMING_WEBHOOK_URL="https://outlook.office.com/webhook/xxx"
      - PROMTEAMS_DEBUG="true"
    ports:
      - "2000:2000"
alertmanager.yml file
route:
  receiver: receiver
  group_interval: 1m
  repeat_interval: 15m
receivers:
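(The receiver definitions were omitted above; for reference, a webhook receiver matching this compose setup would typically look like the sketch below, with the path assumed to be the default /alertmanager.)
receivers:
- name: receiver
  webhook_configs:
  - send_resolved: true
    # promteams is the compose service name above; /alertmanager is the default request URI
    url: 'http://promteams:2000/alertmanager'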
logs:
alertmanager_1 | level=error ts=2018-08-24T12:31:59.017048267Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
alertmanager_1 | level=error ts=2018-08-24T12:31:59.017134279Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
alertmanager_1 | level=error ts=2018-08-24T12:32:59.017103102Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
alertmanager_1 | level=error ts=2018-08-24T12:32:59.017262416Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
I modified the "title" of the card-with-action.tmpl, changing "title": "Prometheus Alert ({{ .Status }})", to "title": "[{{ .Status }}] Prometheus Alert":
card-with-action.tmpl:
{{ define "teams.card" }}
{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "firing" -}}
{{- if eq .CommonLabels.severity "critical" -}}8C1A1A
{{- else if eq .CommonLabels.severity "warning" -}}FFA500
{{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "{{- if eq .CommonAnnotations.summary "" -}}
{{- if eq .CommonAnnotations.message "" -}}
{{- .CommonLabels.alertname -}}
{{- else -}}
{{- .CommonAnnotations.message -}}
{{- end -}}
{{- else -}}
{{- .CommonAnnotations.summary -}}
{{- end -}}",
"title": "[{{ .Status }}] Prometheus Alert",
"sections": [ {{$externalUrl := .ExternalURL}}
{{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
{
"activityTitle": "[{{ $alert.Annotations.description }}]({{ $externalUrl }})",
"facts": [
{{- range $key, $value := $alert.Annotations }}
{
"name": "{{ reReplaceAll "_" "\\\\_" $key }}",
"value": "{{ reReplaceAll "_" "\\\\_" $value }}"
},
{{- end -}}
{{$c := counter}}{{ range $key, $value := $alert.Labels }}{{if call $c}},{{ end }}
{
"name": "{{ reReplaceAll "_" "\\\\_" $key }}",
"value": "{{ reReplaceAll "_" "\\\\_" $value }}"
}
{{- end }}
],
"markdown": true
}
{{- end }}
],
"potentialAction": [
{
"@context": "http://schema.org",
"@type": "ViewAction",
"name": "Runbook",
"target": [
"{{ reReplaceAll "_" "\\\\_" .CommonAnnotations.runbook }}"
]
}
]
}
{{ end }}
Logs:
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Prometheus Alert: {\"receiver\":\"msteams-default-receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"Watchdog\",\"severity\":\"none\"},\"annotations\":{\"description\":\"Watchdog\",\"summary\":\"Watchdog\"},\"startsAt\":\"2019-07-31T12:09:25.689259224Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://a8ed24b9cfee:9090/graph?g0.expr=vector%281%29\\u0026g0.tab=1\"}],\"groupLabels\":{\"alertname\":\"Watchdog\"},\"commonLabels\":{\"alertname\":\"Watchdog\",\"severity\":\"none\"},\"commonAnnotations\":{\"description\":\"Watchdog\",\"summary\":\"Watchdog\"},\"externalURL\":\"http://c72d6a34aa58:9093\",\"version\":\"4\",\"groupKey\":\"{}/{}:{alertname=\\\"Watchdog\\\"}\"}"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"808080\",\n \"summary\": \"Watchdog\",\n \"title\": \"[firing]: Prometheus Alert\",\n \"sections\": [ \n {\n \"activityTitle\": \"[Watchdog](http://c72d6a34aa58:9093)\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"none\"\n }\n ],\n \"markdown\": true\n }\n ],\n \"potentialAction\": [\n {\n \"@context\": \"http://schema.org\",\n \"@type\": \"ViewAction\",\n \"name\": \"Runbook\",\n \"target\": [\n \"\"\n ]\n }\n ]\n}\n"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Size of message is 501 Bytes (~0 KB)"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=info msg="Created a card for Microsoft Teams /prom-alerts-dev-leo"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"808080\",\"summary\":\"Watchdog\",\"title\":[firing]: Prometheus Alert,\"sections\":[{\"activityTitle\":\"[Watchdog](http://c72d6a34aa58:9093)\",\"facts\":[{\"name\":\"description\",\"value\":\"Watchdog\"},{\"name\":\"summary\",\"value\":\"Watchdog\"},{\"name\":\"alertname\",\"value\":\"Watchdog\"},{\"name\":\"severity\",\"value\":\"none\"}],\"markdown\":true}],\"potentialAction\":[{\"@context\":\"http://schema.org\",\"@type\":\"ViewAction\",\"name\":\"Runbook\",\"target\":[\"\"]}]}]"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=info msg="Microsoft Teams response text: Bad payload received by generic incoming webhook."
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=error msg="Failed sending to the Teams Channel. Teams http response: 400 Bad Request"
Alert:
groups:
- name: alerting_rules
  rules:
  - alert: Watchdog
    expr: vector(1)
    labels:
      severity: none
    annotations:
      summary: "Watchdog"
      description: "Watchdog"
As you can see, there are no double quotes around the title value in this part of the JSON: ,\"title\":[firing]: Prometheus Alert,
Happens on v1.1.2 and v1.1.3.
Hey,
Can you tell me how to edit the template that is sent to MS Teams, for example to add an external link? Is there a JSON template I can edit to include this?
I compared it to the Prom2Teams project, where I was able to find the template, but I was not able to find it in this one. Can you point me in the right direction, please?
Kind regards,
Laura
Is it possible to use the AdaptiveCard format from MS to customise the template? I've tried to use it but I always get an error. The predefined actions at this moment are quite limited.
Do you have any example for this?
Thanks in advance
[{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "8C1A1A",
"summary": "",
"title": "Prometheus Alert (firing)",
"sections": [{
"activityTitle": "[](http://xxxx:9093)",
"facts": [{
"name": "message",
"value": "Only 0% of the desired Pods of DaemonSet monitoring/prometheus-operator-prometheus-node-exporter are scheduled and ready."
},
{
"name": "runbook\\_url",
"value": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck"
},
{
"name": "alertname",
"value": "KubeDaemonSetRolloutStuck"
},
{
"name": "cluster",
"value": "yyyy"
},
{
"name": "daemonset",
"value": "prometheus-operator-prometheus-node-exporter"
},
{
"name": "endpoint",
"value": "http"
},
{
"name": "instance",
"value": "zzzzzz:8080"
},
{
"name": "job",
"value": "kube-state-metrics"
},
{
"name": "namespace",
"value": "monitoring"
},
{
"name": "pod",
"value": "prometheus-operator-kube-state-metrics-6c7cc58ff8-h8tnc"
},
{
"name": "prometheus",
"value": "monitoring/prometheus-operator-prometheus"
},
{
"name": "service",
"value": "prometheus-operator-kube-state-metrics"
},
{
"name": "severity",
"value": "critical"
}
],
"markdown": true
}]
}]
docker log:
time="2019-04-01T15:50:31Z" level=info msg="Microsoft Teams response text: Summary or Text is required."
time="2019-04-01T15:50:31Z" level=error msg="Failed sending to the Teams Channel. Teams http response: 400 Bad Request"
Perhaps it is also important to mention that I had several issues when alerts do not contain anything, which is the case for "Watchdog" and some targets-down alerts such as "AlertmanagerDown".
In order to get it working, I had to define a new route with a "fake" receiver for the cases mentioned before.
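A sketch of the kind of workaround route I mean (receiver names assumed): the always-firing Watchdog alert is matched first and routed to a receiver with no notification configs, so it never reaches the Teams proxy.
route:
  receiver: msteams              # assumed default receiver
  routes:
  - receiver: 'null'             # the "fake" receiver with no configs
    match:
      alertname: Watchdog
receivers:
- name: 'null'
- name: msteams
  webhook_configs:
  - url: 'http://prometheus-msteams:2000/alertmanager'   # assumed service address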
We are also using the description field for instructions on how to resolve the error for 1st and 2nd level support. Wouldn't it be better to use the summary field as the headline instead of the description field?
For example, in your screenshot this would mean the summary text is used for the link, so the anchor text would be "Server High Memory Usage".
To streamline this a bit more, we could drop the summary field from the list below.
Would you accept a PR with this?
Generally, for testing and developing it's great to use the DEBUG log level. For the general user, however, INFO should be the default log level.
I have the following card.tmpl
{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "firing" -}}
{{- if eq .CommonLabels.severity "L1" -}}8C1A1A
{{- else if eq .CommonLabels.severity "L4" -}}FFA500
{{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "{{- .CommonLabels.alertname -}}",
"title": "Prometheus Alert ({{ .Status }})",
"sections": [ {{$externalUrl := .ExternalURL}}
{{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
{
"activityTitle": "[{{ $alert.Annotations.description }}]({{ $externalUrl }})",
"markdown": true
}
{{- end }}
]
}
{{ end }}
It's throwing the error msg="Failed to parse json with key 'sections': Key path not found" when running my application in minikube. Please help me out.
The sample payload which needs to be parsed is below:
"{\"receiver\":\"high_priority_receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alert_team\":\"P2HOBJ\",\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"instance\":\"172.17.0.11:8081\",\"job\":\"kube-api-scrape\",\"sdk_version\":\"3.25.10\",\"server_version\":\"5.4.1\",\"severity\":\"L1\",\"team\":\"P2HOBJ\"},\"annotations\":{\"description\":\"172.17.0.11:8081 of hobject job \\\"kube-api-scrape\\\" has been down for more than 1 minute.\",\"summary\":\"Instance 172.17.0.11:8081 down\"},\"startsAt\":\"2019-05-15T04:30:24.063168873Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://hobject-prometheus-bbcb9f9cd-s4sfq:9090/graph?g0.expr=up+%3D%3D+0\\u0026g0.tab=1\"},{\"status\":\"firing\",\"labels\":{\"alert_team\":\"P2HOBJ\",\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"instance\":\"172.17.0.8:8081\",\"job\":\"kube-api-scrape\",\"sdk_version\":\"3.25.10\",\"server_version\":\"5.4.1\",\"severity\":\"L1\",\"team\":\"P2HOBJ\"},\"annotations\":{\"description\":\"172.17.0.8:8081 of hobject job \\\"kube-api-scrape\\\" has been down for more than 1 minute.\",\"summary\":\"Instance 172.17.0.8:8081 down\"},\"startsAt\":\"2019-05-15T04:30:34.063168873Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://hobject-prometheus-bbcb9f9cd-s4sfq:9090/graph?g0.expr=up+%3D%3D+0\\u0026g0.tab=1\"}],\"groupLabels\":{\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"team\":\"P2HOBJ\"},\"commonLabels\":{\"alert_team\":\"P2HOBJ\",\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"job\":\"kube-api-scrape\",\"sdk_version\":\"3.25.10\",\"server_version\":\"5.4.1\",\"severity\":\"L1\",\"team\":\"P2HOBJ\"},\"commonAnnotations\":{},\"externalURL\":\"http://hobject-alertmanager-59d8577b9d-zzktw:9093\",\"version\":\"4\",\"groupKey\":\"{}:{alertname=\\\"Instance Down\\\", app_name=\\\"p2hobj-3-25-10-mainftr\\\", team=\\\"P2HOBJ\\\"}\"}"
Is there a way to disable a few labels? Currently we get all the labels associated with a particular metric; can this be made more selective?
I am using Prometheus v1.5.0 and Alertmanager v0.5.
It seems like this does not support older versions and throws the error below:
level=error msg="Failed decoding Prometheus alert message: json: cannot unmarshal number into Go struct field PrometheusAlertMessage.groupKey of type string"
and it fails to send the message afterwards.
Since v1.0, the webhooks specified in config.yml are spit out in the container logs at the INFO log level.
As per best practices, these secrets should either be logged only at the DEBUG log level, or they could be logged at the INFO level but at least be redacted. This is to prevent secrets from being leaked as plaintext onto the disk or to the backing log database.
Is it on the roadmap to publish this chart to https://github.com/helm/charts? So far I have been installing my charts using requirements.yaml, but there doesn't currently seem to be a way to do that with this repo.
Hi bzon,
Thanks for your work on this! Just a small remark: when using the helm chart, the config map is always created in the default namespace; if you deploy to a different namespace than default, the container will not start, as the config map cannot be mounted:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned monitoring/prometheus-msteams-59dfbc84b6-ts9dq to <...>
Warning FailedMount 58s (x8 over 2m) kubelet, <...> MountVolume.SetUp failed for volume "config-volume" : configmaps "prometheus-msteams-config" not found
Because:
$ kc get configmaps -n default
NAME DATA AGE
prometheus-msteams-config 1 3m
Using the Go templating engine, we can include a feature to let the user create their own YAML templates and feed it to prometheus-msteams. The teams message card creation reference is here https://docs.microsoft.com/en-us/microsoftteams/platform/concepts/cards/cards-reference.
We can add a new function named CreateCardFromTemplates that will be called if the user uses the --card-template $TEMPLATE_FILE flag.
Modification on teams.go: adding an orange warning color to your code (I am not authorized to push with my credentials) and a case for when the alert is firing (warning/critical).
// Copyright © 2018 bzon [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.

package alert

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "strings"

    log "github.com/sirupsen/logrus"
)

// Constants for Sending a Card
const (
    messageType   = "MessageCard"
    context       = "http://schema.org/extensions"
    colorResolved = "2DC72D"
    colorCritical = "8C1A1A"
    colorWarning  = "FFA500"
    colorUnknown  = "808080"
)

// TeamsMessageCard is for the Card Fields to send in Teams
// The Documentation is in https://docs.microsoft.com/en-us/outlook/actionable-messages/card-reference#card-fields
type TeamsMessageCard struct {
    Type       string                    `json:"@type"`
    Context    string                    `json:"@context"`
    ThemeColor string                    `json:"themeColor"`
    Summary    string                    `json:"summary"`
    Title      string                    `json:"title"`
    Text       string                    `json:"text,omitempty"`
    Sections   []TeamsMessageCardSection `json:"sections"`
}

func (card *TeamsMessageCard) String() string {
    b, err := json.Marshal(card)
    if err != nil {
        log.Errorf("failed marshalling TeamsMessageCard: %v", err)
    }
    return string(b)
}

// TeamsMessageCardSection is placed under TeamsMessageCard.Sections
// Each element of AlertWebHook.Alerts becomes one element of TeamsMessageCard.Sections
type TeamsMessageCardSection struct {
    ActivityTitle string                         `json:"activityTitle"`
    Facts         []TeamsMessageCardSectionFacts `json:"facts"`
    Markdown      bool                           `json:"markdown"`
}

func (section *TeamsMessageCardSection) String() string {
    b, err := json.Marshal(section)
    if err != nil {
        log.Errorf("failed marshalling TeamsMessageCardSection: %v", err)
    }
    return string(b)
}

// TeamsMessageCardSectionFacts is placed under TeamsMessageCardSection.Facts
type TeamsMessageCardSectionFacts struct {
    Name  string `json:"name"`
    Value string `json:"value"`
}

// SendCard sends the JSON Encoded TeamsMessageCard
func SendCard(webhook string, card *TeamsMessageCard) (*http.Response, error) {
    buffer := new(bytes.Buffer)
    if err := json.NewEncoder(buffer).Encode(card); err != nil {
        return nil, fmt.Errorf("Failed encoding message card: %v", err)
    }
    res, err := http.Post(webhook, "application/json", buffer)
    if err != nil {
        return nil, fmt.Errorf("Failed sending to webhook url %s. Got the error: %v",
            webhook, err)
    }
    rb, err := ioutil.ReadAll(res.Body)
    if err != nil {
        log.Error(err)
    }
    log.Infof("Microsoft Teams response text: %s", string(rb))
    if res.StatusCode != http.StatusOK {
        if err != nil {
            return nil, fmt.Errorf("Failed reading Teams http response: %v", err)
        }
        return nil, fmt.Errorf("Failed sending to the Teams Channel. Teams http response: %s",
            res.Status)
    }
    if err := res.Body.Close(); err != nil {
        log.Error(err)
    }
    return res, nil
}

// createCardMetadata creates the metadata for alerts of the same type
func createCardMetadata(promAlert PrometheusAlertMessage, markdownEnabled bool) *TeamsMessageCard {
    card := &TeamsMessageCard{
        Type:    messageType,
        Context: context,
        Title:   fmt.Sprintf("Prometheus Alert (%s)", promAlert.Status),
        // Set a default Summary, this is required for Microsoft Teams
        Summary: "Prometheus Alert received",
    }
    // Override the value of the Summary if the common annotation exists
    if value, ok := promAlert.CommonAnnotations["summary"]; ok {
        card.Summary = value
    }
    switch promAlert.Status {
    case "resolved":
        card.ThemeColor = colorResolved
    case "firing":
        switch promAlert.CommonLabels["severity"] {
        case "critical":
            card.ThemeColor = colorCritical
        case "warning":
            card.ThemeColor = colorWarning
        default:
            card.ThemeColor = colorUnknown
        }
    default:
        card.ThemeColor = colorUnknown
    }
    return card
}

// CreateCards creates the TeamsMessageCard based on values gathered from PrometheusAlertMessage
func CreateCards(promAlert PrometheusAlertMessage, markdownEnabled bool) []*TeamsMessageCard {
    // maximum message size of 14336 Bytes (14KB)
    const maxSize = 14336
    cards := []*TeamsMessageCard{}
    card := createCardMetadata(promAlert, markdownEnabled)
    cardMetadataJSON := card.String()
    cardMetadataSize := len(cardMetadataJSON)
    // append first card to cards
    cards = append(cards, card)
    for _, alert := range promAlert.Alerts {
        var s TeamsMessageCardSection
        s.ActivityTitle = fmt.Sprintf("[%s](%s)",
            alert.Annotations["description"], promAlert.ExternalURL)
        s.Markdown = markdownEnabled
        for key, val := range alert.Annotations {
            s.Facts = append(s.Facts, TeamsMessageCardSectionFacts{key, val})
        }
        for key, val := range alert.Labels {
            // Auto escape underscores if markdown is enabled
            if markdownEnabled {
                if strings.Contains(val, "_") {
                    val = strings.Replace(val, "_", "\\_", -1)
                }
            }
            s.Facts = append(s.Facts, TeamsMessageCardSectionFacts{key, val})
        }
        currentCardSize := len(card.String())
        newSectionSize := len(s.String())
        newCardSize := cardMetadataSize + currentCardSize + newSectionSize
        // if total Size of message exceeds maximum message size then split it
        if (newCardSize) < maxSize {
            card.Sections = append(card.Sections, s)
        } else {
            card = createCardMetadata(promAlert, markdownEnabled)
            card.Sections = append(card.Sections, s)
            cards = append(cards, card)
        }
    }
    return cards
}
I deployed the latest version (1.1.4) and see the following exception in the log; I don't receive the card in Teams.
time="2019-08-26T20:45:45Z" level=info msg="Created a card for Microsoft Teams /global_channel"
time="2019-08-26T20:45:45Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"FFA500\",\"summary\":\"TargetDown\",\"title\":\"Prometheus Alert (firing)\",\"sections\":[{\"activityTitle\":\"[](https://alertmanager.xxx.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the identityserver-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"identityserver-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.xxx.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the mediaserver-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"mediaserver-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the pdfcreator-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"pdfcreator-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the policyserver-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"policyserver-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the -contractmanagerselfservice-api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"-contractmanagerselfservice-api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the -employeeselfservice-api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"--api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the -finance-api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"-finance-api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true}]}]"
time="2019-08-26T20:45:47Z" level=info msg="**Microsoft Teams response text: System.Reflection.ReflectionTypeLoadException: Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information.**"
time="2019-08-26T20:45:47Z" level=info msg="A card was successfully sent to Microsoft Teams Channel. Got http status: 200 OK"
Hello,
Could you contribute your helm chart to the official Helm chart github repository?
When running the latest version in kubernetes, we saw it using up 100% CPU.
The output was thousands of these lines:
Failed to parse json with key 'sections': Key path not found"
We don't overwrite the default template, so my first guess would be that /default-message-card.tmpl might not be correct?
I will have another try on monday and report back.
This will help users to easily troubleshoot the JSON payload that is received from AlertManager and sent to Microsoft Teams. I'm thinking of using the go-kit logger to enable this since it's the one I'm most familiar with.
If anyone wants to take this issue, just let me know here. :)
In docker, running /bin/promteams server -f /test.yaml:
Error: unknown shorthand flag: 'f' in -f
Usage:
prometheus-msteams server [flags]
Flags:
-h, --help help for server
-l, --listen-address string the address on which the server will listen (default "0.0.0.0")
-p, --port int port on which the server will listen (default 2000)
-r, --request-uri string the request uri path. Do not use this if using a config file. (default "alertmanager")
-w, --webhook-url string the incoming webhook url to post the alert messages. Do not use this if using a config file.
Global Flags:
--config string config file (default is $HOME/.prometheus-msteams.yaml)
unknown shorthand flag: 'f' in -f
In the README, -f is referenced as the shorthand flag for --config.
This is the error I am getting
component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for \"webhook\" due to unrecoverable error: unexpected status code 404 from http://teamreceiver:2000/alertmanager"
Hi,
As the topic of the alert, the summary and not the description should be used, I think.
The description is a more detailed explanation and may be too long for this.
Best
C.
Currently each event gets added as a section to a card so old alerts
are coming through again and again on new cards.
This bug can be replicated by taking the example test json and running
it back to back a few times. The first card will have 1 section, the
second two sections (1 duplicate), the third 3 sections (2 duplicates),
and so on.
When running as a docker container, the custom message card template file is mounted in the container via volume and then the TEMPLATE_FILE environment variable is set.
When running as a binary, the --template-file flag is used.
But there is no mention in the README about how to achieve this when deployed with Helm in a k8s cluster.
In the prometheus-msteams chart, the message card template file is mounted into the container via a configMap volume. This configMap reads data only from the default template card.tmpl:
binaryData:
card.tmpl: {{ .Files.Get "card.tmpl" | b64enc }}
The function .Files.Get reads only those files which are present inside the chart's folder.
Here you can read that:
"Currently, there is no way to pass files external to the chart during helm install. So if you are asking users to supply data, it must be loaded using helm install -f or helm install --set."
So as of now, it is not possible to use an external custom card template file with helm.
According to me, there could be 2 ways to fulfill the requirement:
- Replace prometheus-msteams/card.tmpl with the custom template file. The custom template file should also have the same name, card.tmpl.
- card.tmpl
Am I right here? Is there any other solution for this?
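For what it's worth, a sketch of the kind of values-driven override I have in mind, assuming the chart's ConfigMap were changed to render the template from a value instead of reading it with .Files.Get (the customCardTemplate key is hypothetical and does not exist today):
customCardTemplate: |
  {{ define "teams.card" }}
  {
    "@type": "MessageCard",
    "@context": "http://schema.org/extensions",
    "themeColor": "808080",
    "summary": "{{ .CommonAnnotations.summary }}",
    "title": "Prometheus Alert ({{ .Status }})",
    "sections": []
  }
  {{ end }}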
I found a bug wherein 413 "too large" errors from MS Teams are not being caught properly.
Looking at the logs will show that everything was okay, and that the HTTP response code was 200:
time="2018-12-07T01:33:32Z" level=info msg="A card was successfully sent to Microsoft Teams Channel. Got http status: 200 OK"
But it doesn't show up in MS Teams.
I took the request body from the logs and sent it with curl. This is what happens (verbose mode):
> Content-Type: application/json
> Content-Length: 19609
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* We are completely uploaded and fine
< HTTP/2 200
< cache-control: no-cache
< pragma: no-cache
< content-length: 185
< content-type: text/plain; charset=utf-8
< expires: -1
< request-id: 9243cf0d-6de1-4a2e-9ebe-2f0647b3ceb6
< x-calculatedfetarget: ME2PR01CU007.internal.outlook.com
< x-backendhttpstatus: 200
< x-feproxyinfo: ME2PR01CA0157.AUSPRD01.PROD.OUTLOOK.COM
< x-calculatedfetarget: SY2PR01CU001.internal.outlook.com
< x-backendhttpstatus: 200
< x-feproxyinfo: SY2PR01CA0011.AUSPRD01.PROD.OUTLOOK.COM
< x-calculatedbetarget: SYAPR01MB2863.ausprd01.prod.outlook.com
< x-backendhttpstatus: 200
< x-aspnet-version: 4.0.30319
< x-cafeserver: SY2PR01CA0011.AUSPRD01.PROD.OUTLOOK.COM
< x-beserver: SYAPR01MB2863
< x-rum-validated: 1
< x-feserver: SY2PR01CA0011
< x-feserver: ME2PR01CA0157
< x-powered-by: ASP.NET
< x-feserver: HK0PR03CA0043
< x-msedge-ref: Ref A: 8B559015A2E546309D7219160649965E Ref B: HK2EDGE1006 Ref C: 2018-12-07T01:34:28Z
< date: Fri, 07 Dec 2018 01:34:27 GMT
<
* Connection #0 to host outlook.office.com left intact
Webhook message delivery failed with error: Microsoft Teams endpoint returned HTTP error 413 with ContextId tcid=8241112485782354073,server=SG2PEPF00000467,cv=oRcpKL9MdU6iz4isFYTR6A.0..
So, infuriatingly, the HTTP response code is 200, but because there are too many alerts being sent at once, the actual response code is 413, but it's in the body.
(It's because the endpoint is a proxy for the actual MS Teams endpoint behind it, and the inner endpoint is the one giving the 413 HTTP Code, but that's not important right now.)
We need a fix to be able to set a maximum size for each call to the webhook, and then just send successive calls if it all doesn't fit in a single call.
Hi guys,
I'm wondering if there is a way to display the date and time of the alert in the Teams card.
Thanks for your help.
I've tried POSTing a sample alert with curl and I've tried configuring a route to this instance, but no alerts are fired.
I can see from the server logs, that prometheus-msteams receives the alert, but it fails with this error message:
failed to template alerts: template: :1:12: executing "" at <{{template "teams.ca...>: template "teams.card" not defined
I've even copy-pasted the default template and tried passing that as a template to the binary with the --template-file flag, and that didn't work either.
I'm using the latest binary from the release page for linux.