prometheus-msteams / prometheus-msteams
Forward Prometheus Alert Manager notifications to Microsoft Teams.
License: MIT License
I am using the prometheus-msteams 1.1.4 Helm chart to integrate the latest Prometheus/Alertmanager with MS Teams in a Kubernetes environment.
There are issues with the payloads Prometheus sends to msteams, so alert notifications are sometimes not delivered correctly to MS Teams. The prometheus-msteams container logs show this error:
time="2019-11-06T07:01:08Z" level=info msg="/alertmanager received a request"
time="2019-11-06T07:01:08Z" level=debug msg="Prometheus Alert: {\"receiver\":\"prometheus-msteams\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"KubeDeploymentReplicasMismatch\",\"deployment\":\"storagesvc\",\"endpoint\":\"http\",\"instance\":\"10.233.108.72:8080\",\"job\":\"kube-state-metrics\",\"namespace\":\"fission\",\"pod\":\"monitor-kube-state-metrics-856bc9455b-7z5qx\",\"prometheus\":\"monitoring/monitor-prometheus-operato-prometheus\",\"service\":\"monitor-kube-state-metrics\",\"severity\":\"critical\"},\"annotations\":{\"message\":\"Deployment fission/storagesvc has not matched the expected number of replicas for longer than 15 minutes.\",\"runbook_url\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch\"},\"startsAt\":\"2019-11-06T07:00:32.453590324Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://monitor-prometheus-operato-prometheus.monitoring:9090/graph?g0.expr=kube_deployment_spec_replicas%7Bjob%3D%22kube-state-metrics%22%7D+%21%3D+kube_deployment_status_replicas_available%7Bjob%3D%22kube-state-metrics%22%7D\\u0026g0.tab=1\"},{\"status\":\"firing\",\"labels\":{\"alertname\":\"KubePodNotReady\",\"namespace\":\"fission\",\"pod\":\"storagesvc-5bff46b69b-vfdrd\",\"prometheus\":\"monitoring/monitor-prometheus-operato-prometheus\",\"severity\":\"critical\"},\"annotations\":{\"message\":\"Pod fission/storagesvc-5bff46b69b-vfdrd has been in a non-ready state for longer than 15 minutes.\",\"runbook_url\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready\"},\"startsAt\":\"2019-11-06T07:00:32.453590324Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://monitor-prometheus-operato-prometheus.monitoring:9090/graph?g0.expr=sum+by%28namespace%2C+pod%29+%28kube_pod_status_phase%7Bjob%3D%22kube-state-metrics%22%2Cphase%3D~%22Failed%7CPending%7CUnknown%22%7D%29+%3E+0\\u0026g0.tab=1\"}],\"groupLabels\":{\"namespace\":\"fission\",\"severity\":\"critical\"},\"commonLabels\":{\"namespace\":\"fission\",\"prometheus\":\"monitoring/monitor-prometheus-operato-prometheus\",\"severity\":\"critical\"},\"commonAnnotations\":{},\"externalURL\":\"http://monitor-prometheus-operato-alertmanager.monitoring:9093\",\"version\":\"4\",\"groupKey\":\"{}:{namespace=\\\"fission\\\", severity=\\\"critical\\\"}\"}"
time="2019-11-06T07:01:08Z" level=debug msg="Alert rendered in template file: \r\n{\r\n \"@type\": \"MessageCard\",\r\n \"@context\": \"http://schema.org/extensions\",\r\n \"themeColor\": \"8C1A1A\",\r\n \"summary\": \"\",\r\n \"title\": \"Prometheus Alert (firing)\",\r\n \"sections\": [ \r\n {\r\n \"activityTitle\": \"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\r\n \"facts\": [\r\n {\r\n \"name\": \"message\",\r\n \"value\": \"Deployment fission/storagesvc has not matched the expected number of replicas for longer than 15 minutes.\"\r\n },\r\n {\r\n \"name\": \"runbook\\\\_url\",\r\n \"value\": \"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch\"\r\n },\r\n {\r\n \"name\": \"alertname\",\r\n \"value\": \"KubeDeploymentReplicasMismatch\"\r\n },\r\n {\r\n \"name\": \"deployment\",\r\n \"value\": \"storagesvc\"\r\n },\r\n {\r\n \"name\": \"endpoint\",\r\n \"value\": \"http\"\r\n },\r\n {\r\n \"name\": \"instance\",\r\n \"value\": \"10.233.108.72:8080\"\r\n },\r\n {\r\n \"name\": \"job\",\r\n \"value\": \"kube-state-metrics\"\r\n },\r\n {\r\n \"name\": \"namespace\",\r\n \"value\": \"fission\"\r\n },\r\n {\r\n \"name\": \"pod\",\r\n \"value\": \"monitor-kube-state-metrics-856bc9455b-7z5qx\"\r\n },\r\n {\r\n \"name\": \"prometheus\",\r\n \"value\": \"monitoring/monitor-prometheus-operato-prometheus\"\r\n },\r\n {\r\n \"name\": \"service\",\r\n \"value\": \"monitor-kube-state-metrics\"\r\n },\r\n {\r\n \"name\": \"severity\",\r\n \"value\": \"critical\"\r\n }\r\n ],\r\n \"markdown\": true\r\n },\r\n {\r\n \"activityTitle\": \"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\r\n \"facts\": [\r\n {\r\n \"name\": \"message\",\r\n \"value\": \"Pod fission/storagesvc-5bff46b69b-vfdrd has been in a non-ready state for longer than 15 minutes.\"\r\n },\r\n {\r\n \"name\": \"runbook\\\\_url\",\r\n \"value\": \"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready\"\r\n },\r\n {\r\n \"name\": \"alertname\",\r\n \"value\": \"KubePodNotReady\"\r\n },\r\n {\r\n \"name\": \"namespace\",\r\n \"value\": \"fission\"\r\n },\r\n {\r\n \"name\": \"pod\",\r\n \"value\": \"storagesvc-5bff46b69b-vfdrd\"\r\n },\r\n {\r\n \"name\": \"prometheus\",\r\n \"value\": \"monitoring/monitor-prometheus-operato-prometheus\"\r\n },\r\n {\r\n \"name\": \"severity\",\r\n \"value\": \"critical\"\r\n }\r\n ],\r\n \"markdown\": true\r\n }\r\n ]\r\n}\r\n"
time="2019-11-06T07:01:08Z" level=debug msg="Size of message is 1714 Bytes (~1 KB)"
time="2019-11-06T07:01:08Z" level=info msg="Created a card for Microsoft Teams /alertmanager"
time="2019-11-06T07:01:08Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"8C1A1A\",\"summary\":\"\",\"title\":\"Prometheus Alert (firing)\",\"sections\":[{\"activityTitle\":\"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\"facts\":[{\"name\":\"message\",\"value\":\"Deployment fission/storagesvc has not matched the expected number of replicas for longer than 15 minutes.\"},{\"name\":\"runbook\\\\_url\",\"value\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedeploymentreplicasmismatch\"},{\"name\":\"alertname\",\"value\":\"KubeDeploymentReplicasMismatch\"},{\"name\":\"deployment\",\"value\":\"storagesvc\"},{\"name\":\"endpoint\",\"value\":\"http\"},{\"name\":\"instance\",\"value\":\"10.233.108.72:8080\"},{\"name\":\"job\",\"value\":\"kube-state-metrics\"},{\"name\":\"namespace\",\"value\":\"fission\"},{\"name\":\"pod\",\"value\":\"monitor-kube-state-metrics-856bc9455b-7z5qx\"},{\"name\":\"prometheus\",\"value\":\"monitoring/monitor-prometheus-operato-prometheus\"},{\"name\":\"service\",\"value\":\"monitor-kube-state-metrics\"},{\"name\":\"severity\",\"value\":\"critical\"}],\"markdown\":true},{\"activityTitle\":\"[](http://monitor-prometheus-operato-alertmanager.monitoring:9093)\",\"facts\":[{\"name\":\"message\",\"value\":\"Pod fission/storagesvc-5bff46b69b-vfdrd has been in a non-ready state for longer than 15 minutes.\"},{\"name\":\"runbook\\\\_url\",\"value\":\"https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubepodnotready\"},{\"name\":\"alertname\",\"value\":\"KubePodNotReady\"},{\"name\":\"namespace\",\"value\":\"fission\"},{\"name\":\"pod\",\"value\":\"storagesvc-5bff46b69b-vfdrd\"},{\"name\":\"prometheus\",\"value\":\"monitoring/monitor-prometheus-operato-prometheus\"},{\"name\":\"severity\",\"value\":\"critical\"}],\"markdown\":true}]}]"
time="2019-11-06T07:01:08Z" level=info msg="Microsoft Teams response text: Summary or Text is required."
time="2019-11-06T07:01:08Z" level=error msg="Failed sending to the Teams Channel. Teams http response: 400 Bad Request"
Maybe this is an issue with the default card template, due to which MS Teams returns the "Summary or Text is required." response.
Also, I would suggest: it would be nice to be able to declare the connectors in one or more ConfigMaps, so that connectors can be added separately from the service deployment.
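A rough sketch of what I mean (resource names and webhook URLs here are made up, and as far as I know the chart does not support this today):
apiVersion: v1
kind: ConfigMap
metadata:
  name: prometheus-msteams-connectors   # assumed name, managed separately from the Deployment
  namespace: monitoring                 # assumed namespace
data:
  connectors.yaml: |
    connectors:
    - team_a_channel: https://outlook.office.com/webhook/xxx   # placeholder webhook
    - team_b_channel: https://outlook.office.com/webhook/yyy   # placeholder webhook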
Hi team,
First off, thanks for this app. I would like to add a feature request to allow variables to be used in the URL. We have a use case with dozens of MS Teams channels to set up, and it would be very useful to be able to define a single receiver that uses a variable to select the correct URL path. In the following example I am adding annotations for email and the msteams connector. It would be useful to be able to use the {{ .annotation.msteams }} variable in the receiver so I do not have to maintain dozens of receivers, each with its connector in the URL (see the sketch after the alertmanager config below).
rule:
  alert: Alert  # testing rule new alert
  expr: up{hostname="xxxx.dev.local",job="deploydirector"} == 1
  for: 1m
  labels:
    channel: msteams,email,monolith
    severity: Critical
  annotations:
    application: git.tools.foo.net
    description: '{{ $labels.job }} server has been down for more than 5 minutes.'
    documentation_link: https://confluence.com/display/MOD/Deployments+FAQ
    email: [email protected]
    msteams: Operations
    summary: THIS IS A TEST- {{ $labels.job }} is down- THIS IS A TEST
alertmanager config:
global:
  resolve_timeout: 5m
route:
  receiver: email
  group_by:
  - alertname
  routes:
  - receiver: email
    match_re:
      channel: ^(?:.*(email).*)$
    continue: true
  - receiver: msteams
    match_re:
      channel: ^(?:.*(msteams).*)$
    continue: true
  - receiver: monolith
    match_re:
      channel: ^(?:.*(monolith).*)$
    repeat_interval: 8737h
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 4h
receivers:
- name: email
  email_configs:
- name: msteams
  webhook_configs:
- name: monolith
  webhook_configs:
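For illustration only, the kind of receiver I have in mind would look something like this (hypothetical; Alertmanager does not template webhook URLs today, which is exactly the feature being requested):
receivers:
- name: msteams
  webhook_configs:
  # hypothetical: the path segment would be filled from the alert's msteams annotation,
  # e.g. /Operations, so one receiver could serve many Teams channels
  - url: 'http://prometheus-msteams:2000/{{ .CommonAnnotations.msteams }}'
    send_resolved: true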
Would this be possible?
thanks
The Prometheus documentation provides a list of integrations made by 3rd party projects using the webhook receiver: https://prometheus.io/docs/operating/integrations/#alertmanager-webhook-receiver.
This documentation is open source, and we should add prometheus-msteams to the list.
Hello, I am trying to test the functionality of the prometheus-msteams pod by directly sending it a message using your template. I get this in return (URL modified for privacy):
/ # curl -X POST -d @prom-alert.json http://prometheus-msteams:2000/alertmanager
Failed sending to webhook url https://outlook.office.com/webhook/xxx/xxx. Got the error: Post https://outlook.office.com/webhook/xxx/xxx: Forbidden
I check the logs, and I see this:
time="2019-12-03T22:40:58Z" level=info msg="/alertmanager received a request"
time="2019-12-03T22:40:59Z" level=debug msg="Prometheus Alert: {\"receiver\":\"teams_proxy\",\"status\":\"firing\",\"alerts\":[{\"status\":\"\",\"labels\":{\"alertname\":\"high_memory_load\",\"instance\":\"10.80.40.11:9100\",\"job\":\"docker_nodes\",\"monitor\":\"master\",\"severity\":\"warning\"},\"annotations\":{\"description\":\"xxxxxxx\",\"summary\":\"Server High Memory usage\"},\"startsAt\":\"2018-03-07T06:33:21.873077559-05:00\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"\"}],\"groupLabels\":{\"alertname\":\"high_memory_load\"},\"commonLabels\":{\"alertname\":\"high_memory_load\",\"monitor\":\"master\",\"severity\":\"warning\"},\"commonAnnotations\":{\"summary\":\"Server High Memory usage\"},\"externalURL\":\"http://alertmanager:9093\",\"version\":\"4\",\"groupKey\":\"{}:{alertname=\\\"high_memory_load\\\"}\"}"
time="2019-12-03T22:40:59Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"FFA500\",\n \"summary\": \"Server High Memory usage\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[xxxxxxx](http://alertmanager:9093)\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"xxxxxxx\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"Server High Memory usage\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"high memory load\"\n },\n {\n \"name\": \"instance\",\n \"value\": \"10.80.40.11:9100\"\n },\n {\n \"name\": \"job\",\n \"value\": \"docker nodes\"\n },\n {\n \"name\": \"monitor\",\n \"value\": \"master\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"warning\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2019-12-03T22:40:59Z" level=debug msg="Size of message is 557 Bytes (~0 KB)"
time="2019-12-03T22:40:59Z" level=info msg="Created a card for Microsoft Teams /alertmanager"
time="2019-12-03T22:40:59Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"FFA500\",\"summary\":\"Server High Memory usage\",\"title\":\"Prometheus Alert (firing)\",\"sections\":[{\"activityTitle\":\"[xxxxxxx](http://alertmanager:9093)\",\"facts\":[{\"name\":\"description\",\"value\":\"xxxxxxx\"},{\"name\":\"summary\",\"value\":\"Server High Memory usage\"},{\"name\":\"alertname\",\"value\":\"high memory load\"},{\"name\":\"instance\",\"value\":\"10.80.40.11:9100\"},{\"name\":\"job\",\"value\":\"docker nodes\"},{\"name\":\"monitor\",\"value\":\"master\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true}]}]"
time="2019-12-03T22:40:59Z" level=error msg="Failed sending to webhook url https://outlook.office.com/webhook/xxx/xxx. Got the error: Post https://outlook.office.com/webhook/xxx/xxx: Forbidden"
I tested the webhook directly using the test messages and format from here - https://docs.microsoft.com/en-us/outlook/actionable-messages/send-via-connectors
And these work fine, so I know the webhook is operational and functioning.
Note, I have not configured Alertmanager yet; I am simply testing directly against the prometheus-msteams pod to verify that it will work once Alertmanager is set up.
Not sure why it's telling me my connection is forbidden.
Hi,
I tested prometheus-msteams these days; it is really a wonderful tool, and better than the Python one introduced in the official link, I think.
Now I'm managing connectors in the config file via SaltStack, which dynamically creates connectors from pillar values. But if no connectors are defined, the service fails to start; it looks like prometheus-msteams needs at least one connector to start, so what I have done is this:
connectors:
- dummy: "Used for empty connectors"
By doing this, if I have no actual connectors defined in my pillar, at least the service can be running.
Any thoughts?
Command cat $HOME/.prometheus-msteams.yaml
produces:
connectors:
cc_channel: https://localhost
dd_channel: https://localhost
Command /bin/promteams server
produces:
Using config file: /root/.prometheus-msteams.yaml
2018/06/22 05:05:31 A config file (-f) or --request-uri or --webhook-url is not found.
Usage:
prometheus-msteams server [flags]
Flags:
-h, --help help for server
-l, --listen-address string the address on which the server will listen (default "0.0.0.0")
-p, --port int port on which the server will listen (default 2000)
-r, --request-uri string the request uri path. Do not use this if using a config file. (default "alertmanager")
-w, --webhook-url string the incoming webhook url to post the alert messages. Do not use this if using a config file.
Global Flags:
--config string config file (default is $HOME/.prometheus-msteams.yaml)
Command /bin/promteams server --config /test.yaml
produces:
Using config file: /test.yaml
2018/06/22 05:06:46 A config file (-f) or --request-uri or --webhook-url is not found.
Usage:
prometheus-msteams server [flags]
Flags:
-h, --help help for server
-l, --listen-address string the address on which the server will listen (default "0.0.0.0")
-p, --port int port on which the server will listen (default 2000)
-r, --request-uri string the request uri path. Do not use this if using a config file. (default "alertmanager")
-w, --webhook-url string the incoming webhook url to post the alert messages. Do not use this if using a config file.
Global Flags:
--config string config file (default is $HOME/.prometheus-msteams.yaml)
The alert is being sent to Alertmanager. Then it is not forwarded to the msteams receiver, I guess. I am getting the following error:
" unexpected status code 500 from http://teamreceiver:2000/alertmanager"
Hey,
Having issues with payloads from Prometheus being sent to msteams. We've recently updated to the latest Prometheus/Alertmanager and version 1.1.4 of prometheus-msteams.
Test alerts work fine, so it's an error in the formatting coming out of a default prometheus-operator Helm chart install. I haven't looked much further yet, but I can see a few people do have templating errors. Given 1.1.4 has just been released, I thought I'd raise an issue as well.
The template is the default one, unedited, and this is the error being received in the logs:
time="2019-08-05T07:42:15Z" level=debug msg="Prometheus Alert: {\"receiver\":\"high_priority_receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"annotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"startsAt\":\"2019-08-05T07:36:45.470372903Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://prometheus-dashboard.testing/graph?g0.expr=vector%281%29\\u0026g0.tab=1\"}],\"groupLabels\":{},\"commonLabels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"commonAnnotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"externalURL\":\"http://alertmanager.testing\",\"version\":\"4\",\"groupKey\":\"{}:{}\"}"
time="2019-08-05T07:42:15Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"808080\",\n \"summary\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[ aaaa](http://alertmanager.testing)\",\n \"facts\": [\n {\n \"name\": \"message\",\n \"value\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"prometheus\",\n \"value\": \"monitoring/prometheus-operator-prometheus\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"none\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2019-08-05T07:42:15Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
time="2019-08-05T07:42:15Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
I installed via helm:
helm install --name prometheus-msteams ./prometheus-msteams --namespace monitoring
I didn't use a custom config.yaml; instead I updated the values.yaml:
connectors:
- alertmanager: https://outlook.office.com/webhook/ad5b7417-6a87-4f47-xxxxxxxx8@8a4925a9-fd8e-xxxxxxxxx/IncomingWebhook/21595d04c1aa432xxxxxxxxxxxxxx/602f4df1-xxxxxx-xxxxx
Now I am getting an error:
level=error ts=2019-01-10T12:15:31.712451123Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="Post http://localhost:2000/alertmanager: dial tcp 127.0.0.1:2000: connect: connection refused"
I have updated the alertmanager.yaml as well with the below details:
global:
  resolve_timeout: 5m
route:
  group_by: ['alertname']
  group_wait: 10s
  group_interval: 1m
  repeat_interval: 5m
  receiver: sns-forwarder
receivers:
- name: 'sns-forwarder'
  slack_configs:
  - api_url: https://hooks.slack.com/services/TF90Yxxxxxxxxxxxxxxxxxxxx
    channel: '#k8'
    icon_emoji: ':bell:'
    send_resolved: true
    text: "<!channel> \nsummary: {{ .CommonAnnotations.summary }}\ndescription: {{ .CommonAnnotations.description }}"
  webhook_configs: # https://prometheus.io/docs/alerting/configuration/#webhook_config
  - send_resolved: true
    url: 'http://localhost:2000/alertmanager'
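For what it's worth, since Alertmanager runs in its own pod, http://localhost:2000 points back at the Alertmanager container itself rather than at the prometheus-msteams service, which would explain the connection-refused error. A sketch of the webhook pointed at the chart's Service instead (the Service name and namespace are assumptions based on the install above):
webhook_configs:
- send_resolved: true
  # assumed Service name/namespace created by the prometheus-msteams chart in "monitoring"
  url: 'http://prometheus-msteams.monitoring.svc.cluster.local:2000/alertmanager'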
I have enabled a custom webhook with Alertmanager to self-trigger Jenkins jobs when an alert is received. When I introduce the Microsoft Teams proxy, my original webhook stops working: I get alerts via email and on Teams that something is wrong, but it does not trigger my original webhook to invoke Jenkins to take the required action and start my job.
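One thing that may be worth checking here (a sketch with assumed receiver names): Alertmanager stops at the first matching route unless continue: true is set, so the route for the Teams proxy and the route for the Jenkins webhook both need to match for both receivers to fire.
route:
  receiver: default
  routes:
  - receiver: msteams            # assumed receiver name for the Teams proxy
    match:
      severity: critical
    continue: true               # keep evaluating the remaining routes
  - receiver: jenkins-webhook    # assumed receiver name for the Jenkins trigger
    match:
      severity: critical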
Thanks for the great app. Just wondered if there was a straightforward way of using it with a proxy server?
Thanks
Hi I am running prom2teams through docker-compose
my docker-compose.yml file:
version: '2'
volumes:
  prometheus_data: {}
  grafana_data: {}
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus/:/etc/prometheus/
      - prometheus_data:/prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
      - '--storage.tsdb.path=/prometheus'
    ports:
      - "9090:9090"
  alertmanager:
    image: "prom/alertmanager"
    volumes:
      - ./msteams.yml:/alertmanager.yml
    command:
      - "--config.file=/alertmanager.yml"
    ports:
      - "9093:9093"
  grafana:
    image: grafana/grafana:5.1.0
    depends_on:
      - prometheus
    ports:
      - "3000:3000"
    user: "104"
  promteams:
    image: bzon/prometheus-msteams:latest
    environment:
      - TEAMS_INCOMING_WEBHOOK_URL="https://outlook.office.com/webhook/xxx"
      - PROMTEAMS_DEBUG="true"
    ports:
      - "2000:2000"
alertmanager.yml file
route:
  receiver: receiver
  group_interval: 1m
  repeat_interval: 15m
receivers:
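(The receiver definitions were omitted above; for reference, a webhook receiver matching this compose setup would typically look like the sketch below, with the path assumed to be the default /alertmanager.)
receivers:
- name: receiver
  webhook_configs:
  - send_resolved: true
    # promteams is the compose service name above; /alertmanager is the default request URI
    url: 'http://promteams:2000/alertmanager'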
logs:
alertmanager_1 | level=error ts=2018-08-24T12:31:59.017048267Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
alertmanager_1 | level=error ts=2018-08-24T12:31:59.017134279Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
alertmanager_1 | level=error ts=2018-08-24T12:32:59.017103102Z caller=notify.go:332 component=dispatcher msg="Error on notify" err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
alertmanager_1 | level=error ts=2018-08-24T12:32:59.017262416Z caller=dispatch.go:280 component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for "webhook" due to unrecoverable error: unexpected status code 404 from http://promteams:2000/alertmanager"
I modified the "title" of the card-with-action.tmpl, changing "title": "Prometheus Alert ({{ .Status }})", to "title": "[{{ .Status }}] Prometheus Alert":
card-with-action.tmpl:
{{ define "teams.card" }}
{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "firing" -}}
{{- if eq .CommonLabels.severity "critical" -}}8C1A1A
{{- else if eq .CommonLabels.severity "warning" -}}FFA500
{{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "{{- if eq .CommonAnnotations.summary "" -}}
{{- if eq .CommonAnnotations.message "" -}}
{{- .CommonLabels.alertname -}}
{{- else -}}
{{- .CommonAnnotations.message -}}
{{- end -}}
{{- else -}}
{{- .CommonAnnotations.summary -}}
{{- end -}}",
"title": "[{{ .Status }}] Prometheus Alert",
"sections": [ {{$externalUrl := .ExternalURL}}
{{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
{
"activityTitle": "[{{ $alert.Annotations.description }}]({{ $externalUrl }})",
"facts": [
{{- range $key, $value := $alert.Annotations }}
{
"name": "{{ reReplaceAll "_" "\\\\_" $key }}",
"value": "{{ reReplaceAll "_" "\\\\_" $value }}"
},
{{- end -}}
{{$c := counter}}{{ range $key, $value := $alert.Labels }}{{if call $c}},{{ end }}
{
"name": "{{ reReplaceAll "_" "\\\\_" $key }}",
"value": "{{ reReplaceAll "_" "\\\\_" $value }}"
}
{{- end }}
],
"markdown": true
}
{{- end }}
],
"potentialAction": [
{
"@context": "http://schema.org",
"@type": "ViewAction",
"name": "Runbook",
"target": [
"{{ reReplaceAll "_" "\\\\_" .CommonAnnotations.runbook }}"
]
}
]
}
{{ end }}
Logs:
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Prometheus Alert: {\"receiver\":\"msteams-default-receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"Watchdog\",\"severity\":\"none\"},\"annotations\":{\"description\":\"Watchdog\",\"summary\":\"Watchdog\"},\"startsAt\":\"2019-07-31T12:09:25.689259224Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://a8ed24b9cfee:9090/graph?g0.expr=vector%281%29\\u0026g0.tab=1\"}],\"groupLabels\":{\"alertname\":\"Watchdog\"},\"commonLabels\":{\"alertname\":\"Watchdog\",\"severity\":\"none\"},\"commonAnnotations\":{\"description\":\"Watchdog\",\"summary\":\"Watchdog\"},\"externalURL\":\"http://c72d6a34aa58:9093\",\"version\":\"4\",\"groupKey\":\"{}/{}:{alertname=\\\"Watchdog\\\"}\"}"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"808080\",\n \"summary\": \"Watchdog\",\n \"title\": \"[firing]: Prometheus Alert\",\n \"sections\": [ \n {\n \"activityTitle\": \"[Watchdog](http://c72d6a34aa58:9093)\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"none\"\n }\n ],\n \"markdown\": true\n }\n ],\n \"potentialAction\": [\n {\n \"@context\": \"http://schema.org\",\n \"@type\": \"ViewAction\",\n \"name\": \"Runbook\",\n \"target\": [\n \"\"\n ]\n }\n ]\n}\n"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Size of message is 501 Bytes (~0 KB)"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=info msg="Created a card for Microsoft Teams /prom-alerts-dev-leo"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"808080\",\"summary\":\"Watchdog\",\"title\":[firing]: Prometheus Alert,\"sections\":[{\"activityTitle\":\"[Watchdog](http://c72d6a34aa58:9093)\",\"facts\":[{\"name\":\"description\",\"value\":\"Watchdog\"},{\"name\":\"summary\",\"value\":\"Watchdog\"},{\"name\":\"alertname\",\"value\":\"Watchdog\"},{\"name\":\"severity\",\"value\":\"none\"}],\"markdown\":true}],\"potentialAction\":[{\"@context\":\"http://schema.org\",\"@type\":\"ViewAction\",\"name\":\"Runbook\",\"target\":[\"\"]}]}]"
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=info msg="Microsoft Teams response text: Bad payload received by generic incoming webhook."
prometheus-msteams_1 | time="2019-07-31T12:10:03Z" level=error msg="Failed sending to the Teams Channel. Teams http response: 400 Bad Request"
Alert:
groups:
- name: alerting_rules
  rules:
  - alert: Watchdog
    expr: vector(1)
    labels:
      severity: none
    annotations:
      summary: "Watchdog"
      description: "Watchdog"
As you can see, there are no double quotes around the title value in this part of the JSON: ,\"title\":[firing]: Prometheus Alert,
Happens on v1.1.2 and v1.1.3.
Hey,
Can you tell me how to edit the template that is sent to MS Teams, for example to add an external link? Is there a JSON template I can edit to include this?
I compared it to the Prom2Teams project, where I was able to find the template, but I was not able to find it in this one. Can you point me in the right direction, please?
Kind regards,
Laura
Is it possible to use the AdaptiveCard format from MS to customise the template? I've tried to use it but I always get an error. The predefined actions at this moment are quite limited.
Do you have any example for this?
Thanks in advance
[{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "8C1A1A",
"summary": "",
"title": "Prometheus Alert (firing)",
"sections": [{
"activityTitle": "[](http://xxxx:9093)",
"facts": [{
"name": "message",
"value": "Only 0% of the desired Pods of DaemonSet monitoring/prometheus-operator-prometheus-node-exporter are scheduled and ready."
},
{
"name": "runbook\\_url",
"value": "https://github.com/kubernetes-monitoring/kubernetes-mixin/tree/master/runbook.md#alert-name-kubedaemonsetrolloutstuck"
},
{
"name": "alertname",
"value": "KubeDaemonSetRolloutStuck"
},
{
"name": "cluster",
"value": "yyyy"
},
{
"name": "daemonset",
"value": "prometheus-operator-prometheus-node-exporter"
},
{
"name": "endpoint",
"value": "http"
},
{
"name": "instance",
"value": "zzzzzz:8080"
},
{
"name": "job",
"value": "kube-state-metrics"
},
{
"name": "namespace",
"value": "monitoring"
},
{
"name": "pod",
"value": "prometheus-operator-kube-state-metrics-6c7cc58ff8-h8tnc"
},
{
"name": "prometheus",
"value": "monitoring/prometheus-operator-prometheus"
},
{
"name": "service",
"value": "prometheus-operator-kube-state-metrics"
},
{
"name": "severity",
"value": "critical"
}
],
"markdown": true
}]
}]
docker log:
time="2019-04-01T15:50:31Z" level=info msg="Microsoft Teams response text: Summary or Text is required."
time="2019-04-01T15:50:31Z" level=error msg="Failed sending to the Teams Channel. Teams http response: 400 Bad Request"
Perhaps it is also important to mention that I had several issues when alerts do not contain anything, which is the case for "Watchdog" and some targets-down alerts such as "AlertmanagerDown".
In order to get it working, I had to define a new route with a "fake" receiver for the cases mentioned before.
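A sketch of the kind of workaround route I mean (receiver names assumed): the always-firing Watchdog alert is matched first and routed to a receiver with no notification configs, so it never reaches the Teams proxy.
route:
  receiver: msteams              # assumed default receiver
  routes:
  - receiver: 'null'             # the "fake" receiver with no configs
    match:
      alertname: Watchdog
receivers:
- name: 'null'
- name: msteams
  webhook_configs:
  - url: 'http://prometheus-msteams:2000/alertmanager'   # assumed service address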
We are also using the description field for instructions on how to resolve the error for 1st and 2nd level support. Wouldn't it be better to use the summary field as the headline instead of the description field?
For example, in your screenshot this would mean the summary text is used for the link, so the anchor text would be "Server High Memory Usage".
To streamline this a bit more, we could drop the summary field from the list below.
Would you accept a PR with this?
Generally, for testing and developing it's great to use the DEBUG log level. For the general user, however, INFO should be the default log level.
I have the following card.tmpl
{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "firing" -}}
{{- if eq .CommonLabels.severity "L1" -}}8C1A1A
{{- else if eq .CommonLabels.severity "L4" -}}FFA500
{{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "{{- .CommonLabels.alertname -}}",
"title": "Prometheus Alert ({{ .Status }})",
"sections": [ {{$externalUrl := .ExternalURL}}
{{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
{
"activityTitle": "[{{ $alert.Annotations.description }}]({{ $externalUrl }})",
"markdown": true
}
{{- end }}
]
}
{{ end }}
It's throwing the error msg="Failed to parse json with key 'sections': Key path not found" when running my application in minikube. Please help me out.
The sample payload which needs to be parsed is below:
"{\"receiver\":\"high_priority_receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alert_team\":\"P2HOBJ\",\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"instance\":\"172.17.0.11:8081\",\"job\":\"kube-api-scrape\",\"sdk_version\":\"3.25.10\",\"server_version\":\"5.4.1\",\"severity\":\"L1\",\"team\":\"P2HOBJ\"},\"annotations\":{\"description\":\"172.17.0.11:8081 of hobject job \\\"kube-api-scrape\\\" has been down for more than 1 minute.\",\"summary\":\"Instance 172.17.0.11:8081 down\"},\"startsAt\":\"2019-05-15T04:30:24.063168873Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://hobject-prometheus-bbcb9f9cd-s4sfq:9090/graph?g0.expr=up+%3D%3D+0\\u0026g0.tab=1\"},{\"status\":\"firing\",\"labels\":{\"alert_team\":\"P2HOBJ\",\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"instance\":\"172.17.0.8:8081\",\"job\":\"kube-api-scrape\",\"sdk_version\":\"3.25.10\",\"server_version\":\"5.4.1\",\"severity\":\"L1\",\"team\":\"P2HOBJ\"},\"annotations\":{\"description\":\"172.17.0.8:8081 of hobject job \\\"kube-api-scrape\\\" has been down for more than 1 minute.\",\"summary\":\"Instance 172.17.0.8:8081 down\"},\"startsAt\":\"2019-05-15T04:30:34.063168873Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://hobject-prometheus-bbcb9f9cd-s4sfq:9090/graph?g0.expr=up+%3D%3D+0\\u0026g0.tab=1\"}],\"groupLabels\":{\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"team\":\"P2HOBJ\"},\"commonLabels\":{\"alert_team\":\"P2HOBJ\",\"alertname\":\"Instance Down\",\"app_name\":\"p2hobj-3-25-10-mainftr\",\"job\":\"kube-api-scrape\",\"sdk_version\":\"3.25.10\",\"server_version\":\"5.4.1\",\"severity\":\"L1\",\"team\":\"P2HOBJ\"},\"commonAnnotations\":{},\"externalURL\":\"http://hobject-alertmanager-59d8577b9d-zzktw:9093\",\"version\":\"4\",\"groupKey\":\"{}:{alertname=\\\"Instance Down\\\", app_name=\\\"p2hobj-3-25-10-mainftr\\\", team=\\\"P2HOBJ\\\"}\"}"
Is there a way to disable a few labels? Currently we get all the labels associated with a particular metric; can this be made more selective?
I am using Prometheus v1.5.0 and Alertmanager v0.5.
It seems like this does not support older versions and throws the error below:
level=error msg="Failed decoding Prometheus alert message: json: cannot unmarshal number into Go struct field PrometheusAlertMessage.groupKey of type string"
and it fails to send the message afterwards.
Since v1.0, the webhooks specified in config.yml are spit out in the container logs at the INFO log level.
As per best practices, these secrets should either be logged only at the DEBUG log level, or they could be logged at the INFO level but at least be redacted. This is to prevent secrets from being leaked as plaintext onto the disk or to the backing log database.
Is it on the roadmap to publish this chart to https://github.com/helm/charts? So far I have been installing my charts using requirements.yaml, but there doesn't currently seem to be a way to do that with this repo.
Hi bzon,
Thanks for your work on this! Just a small remark: when using the helm chart, the config map is always created in the default namespace; if you deploy to a different namespace than default, the container will not start, as the config map cannot be mounted:
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 2m default-scheduler Successfully assigned monitoring/prometheus-msteams-59dfbc84b6-ts9dq to <...>
Warning FailedMount 58s (x8 over 2m) kubelet, <...> MountVolume.SetUp failed for volume "config-volume" : configmaps "prometheus-msteams-config" not found
Because:
$ kc get configmaps -n default
NAME DATA AGE
prometheus-msteams-config 1 3m
Using the Go templating engine, we can include a feature to let the user create their own YAML templates and feed it to prometheus-msteams. The teams message card creation reference is here https://docs.microsoft.com/en-us/microsoftteams/platform/concepts/cards/cards-reference.
We can add a new function named CreateCardFromTemplates that will be called if the user uses the --card-template $TEMPLATE_FILE flag.
Modification on teams.go: adding an orange warning color to your code (I am not authorized to push with my credentials) and a case for when the alert is firing (warning/critical).
// Copyright © 2018 bzon [email protected]
//
// Permission is hereby granted, free of charge, to any person obtaining a copy
// of this software and associated documentation files (the "Software"), to deal
// in the Software without restriction, including without limitation the rights
// to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
// copies of the Software, and to permit persons to whom the Software is
// furnished to do so, subject to the following conditions:
//
// The above copyright notice and this permission notice shall be included in
// all copies or substantial portions of the Software.
//
// THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
// IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
// FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
// AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
// LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
// OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
// THE SOFTWARE.

package alert

import (
    "bytes"
    "encoding/json"
    "fmt"
    "io/ioutil"
    "net/http"
    "strings"

    log "github.com/sirupsen/logrus"
)

// Constants for Sending a Card
const (
    messageType   = "MessageCard"
    context       = "http://schema.org/extensions"
    colorResolved = "2DC72D"
    colorCritical = "8C1A1A"
    colorWarning  = "FFA500"
    colorUnknown  = "808080"
)

// TeamsMessageCard is for the Card Fields to send in Teams
// The Documentation is in https://docs.microsoft.com/en-us/outlook/actionable-messages/card-reference#card-fields
type TeamsMessageCard struct {
    Type       string                    `json:"@type"`
    Context    string                    `json:"@context"`
    ThemeColor string                    `json:"themeColor"`
    Summary    string                    `json:"summary"`
    Title      string                    `json:"title"`
    Text       string                    `json:"text,omitempty"`
    Sections   []TeamsMessageCardSection `json:"sections"`
}

func (card *TeamsMessageCard) String() string {
    b, err := json.Marshal(card)
    if err != nil {
        log.Errorf("failed marshalling TeamsMessageCard: %v", err)
    }
    return string(b)
}

// TeamsMessageCardSection is placed under TeamsMessageCard.Sections
// Each element of AlertWebHook.Alerts becomes one element of TeamsMessageCard.Sections
type TeamsMessageCardSection struct {
    ActivityTitle string                         `json:"activityTitle"`
    Facts         []TeamsMessageCardSectionFacts `json:"facts"`
    Markdown      bool                           `json:"markdown"`
}

func (section *TeamsMessageCardSection) String() string {
    b, err := json.Marshal(section)
    if err != nil {
        log.Errorf("failed marshalling TeamsMessageCardSection: %v", err)
    }
    return string(b)
}

// TeamsMessageCardSectionFacts is placed under TeamsMessageCardSection.Facts
type TeamsMessageCardSectionFacts struct {
    Name  string `json:"name"`
    Value string `json:"value"`
}

// SendCard sends the JSON Encoded TeamsMessageCard
func SendCard(webhook string, card *TeamsMessageCard) (*http.Response, error) {
    buffer := new(bytes.Buffer)
    if err := json.NewEncoder(buffer).Encode(card); err != nil {
        return nil, fmt.Errorf("Failed encoding message card: %v", err)
    }
    res, err := http.Post(webhook, "application/json", buffer)
    if err != nil {
        return nil, fmt.Errorf("Failed sending to webhook url %s. Got the error: %v",
            webhook, err)
    }
    rb, err := ioutil.ReadAll(res.Body)
    if err != nil {
        log.Error(err)
    }
    log.Infof("Microsoft Teams response text: %s", string(rb))
    if res.StatusCode != http.StatusOK {
        if err != nil {
            return nil, fmt.Errorf("Failed reading Teams http response: %v", err)
        }
        return nil, fmt.Errorf("Failed sending to the Teams Channel. Teams http response: %s",
            res.Status)
    }
    if err := res.Body.Close(); err != nil {
        log.Error(err)
    }
    return res, nil
}

// createCardMetadata creates the metadata for alerts of the same type
func createCardMetadata(promAlert PrometheusAlertMessage, markdownEnabled bool) *TeamsMessageCard {
    card := &TeamsMessageCard{
        Type:    messageType,
        Context: context,
        Title:   fmt.Sprintf("Prometheus Alert (%s)", promAlert.Status),
        // Set a default Summary, this is required for Microsoft Teams
        Summary: "Prometheus Alert received",
    }
    // Override the value of the Summary if the common annotation exists
    if value, ok := promAlert.CommonAnnotations["summary"]; ok {
        card.Summary = value
    }
    switch promAlert.Status {
    case "resolved":
        card.ThemeColor = colorResolved
    case "firing":
        switch promAlert.CommonLabels["severity"] {
        case "critical":
            card.ThemeColor = colorCritical
        case "warning":
            card.ThemeColor = colorWarning
        default:
            card.ThemeColor = colorUnknown
        }
    default:
        card.ThemeColor = colorUnknown
    }
    return card
}

// CreateCards creates the TeamsMessageCard based on values gathered from PrometheusAlertMessage
func CreateCards(promAlert PrometheusAlertMessage, markdownEnabled bool) []*TeamsMessageCard {
    // maximum message size of 14336 Bytes (14KB)
    const maxSize = 14336
    cards := []*TeamsMessageCard{}
    card := createCardMetadata(promAlert, markdownEnabled)
    cardMetadataJSON := card.String()
    cardMetadataSize := len(cardMetadataJSON)
    // append first card to cards
    cards = append(cards, card)
    for _, alert := range promAlert.Alerts {
        var s TeamsMessageCardSection
        s.ActivityTitle = fmt.Sprintf("[%s](%s)",
            alert.Annotations["description"], promAlert.ExternalURL)
        s.Markdown = markdownEnabled
        for key, val := range alert.Annotations {
            s.Facts = append(s.Facts, TeamsMessageCardSectionFacts{key, val})
        }
        for key, val := range alert.Labels {
            // Auto escape underscores if markdown is enabled
            if markdownEnabled {
                if strings.Contains(val, "_") {
                    val = strings.Replace(val, "_", "\\_", -1)
                }
            }
            s.Facts = append(s.Facts, TeamsMessageCardSectionFacts{key, val})
        }
        currentCardSize := len(card.String())
        newSectionSize := len(s.String())
        newCardSize := cardMetadataSize + currentCardSize + newSectionSize
        // if total Size of message exceeds maximum message size then split it
        if (newCardSize) < maxSize {
            card.Sections = append(card.Sections, s)
        } else {
            card = createCardMetadata(promAlert, markdownEnabled)
            card.Sections = append(card.Sections, s)
            cards = append(cards, card)
        }
    }
    return cards
}
I deployed the latest version (1.1.4) and see the following exception in the log; I don't receive the card in Teams.
time="2019-08-26T20:45:45Z" level=info msg="Created a card for Microsoft Teams /global_channel"
time="2019-08-26T20:45:45Z" level=debug msg="Teams message cards: [{\"@type\":\"MessageCard\",\"@context\":\"http://schema.org/extensions\",\"themeColor\":\"FFA500\",\"summary\":\"TargetDown\",\"title\":\"Prometheus Alert (firing)\",\"sections\":[{\"activityTitle\":\"[](https://alertmanager.xxx.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the identityserver-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"identityserver-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.xxx.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the mediaserver-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"mediaserver-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the pdfcreator-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"pdfcreator-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the policyserver-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"policyserver-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the -contractmanagerselfservice-api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"-contractmanagerselfservice-api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the -employeeselfservice-api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"--api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true},{\"activityTitle\":\"[](https://alertmanager.-dev.net)\",\"facts\":[{\"name\":\"message\",\"value\":\"100% of the -finance-api-host targets are down.\"},{\"name\":\"alertname\",\"value\":\"TargetDown\"},{\"name\":\"job\",\"value\":\"-finance-api-host\"},{\"name\":\"prometheus\",\"value\":\"monitoring/k8s\"},{\"name\":\"severity\",\"value\":\"warning\"}],\"markdown\":true}]}]"
time="2019-08-26T20:45:47Z" level=info msg="**Microsoft Teams response text: System.Reflection.ReflectionTypeLoadException: Unable to load one or more of the requested types. Retrieve the LoaderExceptions property for more information.**"
time="2019-08-26T20:45:47Z" level=info msg="A card was successfully sent to Microsoft Teams Channel. Got http status: 200 OK"
Hello,
Could you contribute your helm chart to the official Helm chart github repository?
When running the latest version in kubernetes, we saw it using up 100% CPU.
The output was thousands of these lines:
Failed to parse json with key 'sections': Key path not found"
We don't overwrite the default template, so my first guess would be that /default-message-card.tmpl might not be correct?
I will have another try on monday and report back.
This will help users to easily troubleshoot the JSON payload that is received from AlertManager and sent to Microsoft Teams. I'm thinking of using the go-kit logger to enable this since it's the one I'm most familiar with.
If anyone wants to take this issue, just let me know here. :)
In docker, running /bin/promteams server -f /test.yaml:
Error: unknown shorthand flag: 'f' in -f
Usage:
prometheus-msteams server [flags]
Flags:
-h, --help help for server
-l, --listen-address string the address on which the server will listen (default "0.0.0.0")
-p, --port int port on which the server will listen (default 2000)
-r, --request-uri string the request uri path. Do not use this if using a config file. (default "alertmanager")
-w, --webhook-url string the incoming webhook url to post the alert messages. Do not use this if using a config file.
Global Flags:
--config string config file (default is $HOME/.prometheus-msteams.yaml)
unknown shorthand flag: 'f' in -f
In the README, -f is referenced as the shorthand flag for --config.
This is the error I am getting
component=dispatcher msg="Notify for alerts failed" num_alerts=1 err="cancelling notify retry for \"webhook\" due to unrecoverable error: unexpected status code 404 from http://teamreceiver:2000/alertmanager"
Hi,
As the topic of the alert, the summary and not the description should be used, I think.
The description is a more detailed explanation and may be too long for this.
Best
C.
Currently each event gets added as a section to a card so old alerts
are coming through again and again on new cards.
This bug can be replicated by taking the example test json and running
it back to back a few times. The first card will have 1 section, the
second two sections (1 duplicate), the third 3 sections (2 duplicates),
and so on.
When running as a docker container, the custom message card template file is mounted in the container via volume and then the TEMPLATE_FILE environment variable is set.
When running as a binary, the --template-file flag is used.
But there is no mention in the README about how to achieve this when deployed with Helm in a k8s cluster.
In the prometheus-msteams chart, the message card template file is mounted into the container via a configMap volume. This configMap reads data only from the default template card.tmpl:
binaryData:
card.tmpl: {{ .Files.Get "card.tmpl" | b64enc }}
The function .Files.Get reads only those files which are present inside the chart's folder.
Here you can read that:
"Currently, there is no way to pass files external to the chart during helm install. So if you are asking users to supply data, it must be loaded using helm install -f or helm install --set."
So as of now, it is not possible to use an external custom card template file with helm.
According to me, there could be 2 ways to fulfill the requirement:
- Replace prometheus-msteams/card.tmpl with the custom template file. The custom template file should also have the same name, card.tmpl.
- card.tmpl
Am I right here? Is there any other solution for this?
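For what it's worth, a sketch of the kind of values-driven override I have in mind, assuming the chart's ConfigMap were changed to render the template from a value instead of reading it with .Files.Get (the customCardTemplate key is hypothetical and does not exist today):
customCardTemplate: |
  {{ define "teams.card" }}
  {
    "@type": "MessageCard",
    "@context": "http://schema.org/extensions",
    "themeColor": "808080",
    "summary": "{{ .CommonAnnotations.summary }}",
    "title": "Prometheus Alert ({{ .Status }})",
    "sections": []
  }
  {{ end }}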
I found a bug wherein 413 "too large" errors from MS Teams are not being caught properly.
Looking at the logs will show that everything was okay, and that the HTTP response code was 200:
time="2018-12-07T01:33:32Z" level=info msg="A card was successfully sent to Microsoft Teams Channel. Got http status: 200 OK"
But it doesn't show up in MS Teams.
I took the request body from the logs and sent it with curl. This is what happens (verbose mode):
> Content-Type: application/json
> Content-Length: 19609
>
* Connection state changed (MAX_CONCURRENT_STREAMS updated)!
* We are completely uploaded and fine
< HTTP/2 200
< cache-control: no-cache
< pragma: no-cache
< content-length: 185
< content-type: text/plain; charset=utf-8
< expires: -1
< request-id: 9243cf0d-6de1-4a2e-9ebe-2f0647b3ceb6
< x-calculatedfetarget: ME2PR01CU007.internal.outlook.com
< x-backendhttpstatus: 200
< x-feproxyinfo: ME2PR01CA0157.AUSPRD01.PROD.OUTLOOK.COM
< x-calculatedfetarget: SY2PR01CU001.internal.outlook.com
< x-backendhttpstatus: 200
< x-feproxyinfo: SY2PR01CA0011.AUSPRD01.PROD.OUTLOOK.COM
< x-calculatedbetarget: SYAPR01MB2863.ausprd01.prod.outlook.com
< x-backendhttpstatus: 200
< x-aspnet-version: 4.0.30319
< x-cafeserver: SY2PR01CA0011.AUSPRD01.PROD.OUTLOOK.COM
< x-beserver: SYAPR01MB2863
< x-rum-validated: 1
< x-feserver: SY2PR01CA0011
< x-feserver: ME2PR01CA0157
< x-powered-by: ASP.NET
< x-feserver: HK0PR03CA0043
< x-msedge-ref: Ref A: 8B559015A2E546309D7219160649965E Ref B: HK2EDGE1006 Ref C: 2018-12-07T01:34:28Z
< date: Fri, 07 Dec 2018 01:34:27 GMT
<
* Connection #0 to host outlook.office.com left intact
Webhook message delivery failed with error: Microsoft Teams endpoint returned HTTP error 413 with ContextId tcid=8241112485782354073,server=SG2PEPF00000467,cv=oRcpKL9MdU6iz4isFYTR6A.0..
So, infuriatingly, the HTTP response code is 200, but because there are too many alerts being sent at once, the actual response code is 413, but it's in the body.
(It's because the endpoint is a proxy for the actual MS Teams endpoint behind it, and the inner endpoint is the one giving the 413 HTTP Code, but that's not important right now.)
We need a fix to be able to set a maximum size for each call to the webhook, and then just send successive calls if it all doesn't fit in a single call.
Hi guys,
I'm wondering if there is a way to display the date and time of the alert in the Teams card.
Thanks for your help.
I've tried POSTing a sample alert with curl and I've tried configuring a route to this instance, but no alerts are fired.
I can see from the server logs, that prometheus-msteams receives the alert, but it fails with this error message:
failed to template alerts: template: :1:12: executing "" at <{{template "teams.ca...>: template "teams.card" not defined
I've even copy-pasted the default template and tried passing that as a template to the binary with the --template-file flag, and that didn't work either.
I'm using the latest binary from the release page for linux.