Comments (41)
Sorry for the delay. I needed to check with my workplace before posting.
Here's the revised template we are using. Perhaps it can be added as an alternative to the default template.
Known gotchas:
- GeneratorURL is not useful
- assumes that "Annotations" and "Labels" exist (it will fail to render if they don't)
- still not enough validation
{{ define "teams.card" }}
{
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "firing" -}}
{{- if eq .CommonLabels.severity "critical" -}}8C1A1A
{{- else if eq .CommonLabels.severity "warning" -}}FFA500
{{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "Prometheus Alert ({{ .Status }})",
"title": "Prometheus Alert ({{ .Status }})",
"sections": [ {{$externalUrl := .ExternalURL}}
{{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
{
"activityTitle":
{{- if ne $alert.Annotations.description "" -}}
"[{{ js $alert.Annotations.description }}]({{ $externalUrl }})",
{{- else -}}
"[{{ js $alert.Annotations.message }}]({{ $externalUrl }})",
{{- end -}}
"facts": [
{ "name": "Status", "value": "{{ .Status }}" },
{ "name": "StartsAt", "value": "{{ js .StartsAt }}" },
{{- if and .EndsAt ( not .EndsAt.IsZero ) }}
{ "name": "EndsAt", "value": "{{ js .EndsAt }}" },
{{- end}}
{ "name": "ExternalURL", "value": "{{ js $externalUrl }}" },
{ "name": "GeneratorURL", "value": "{{ js .GeneratorURL }}" },
{{- range $key, $value := $alert.Annotations }}
{
"name": "{{ reReplaceAll "_" "\\\\_" $key }}",
"value": "{{ reReplaceAll "_" "\\\\_" $value | js }}"
},
{{- end -}}
{{$c := counter}}{{ range $key, $value := $alert.Labels }}{{if call $c}},{{ end }}
{
"name": "{{ reReplaceAll "_" "\\\\_" $key }}",
"value": "{{ reReplaceAll "_" "\\\\_" $value | js }}"
}
{{- end }}
],
"markdown": true
}
{{- end }}
]
}
{{ end }}
If time permits I will prepare proper PRs for this, and possibly for inclusion of the sprig template function library. Helm uses sprig to make templating more concise and harder to "break".
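As a rough, untested sketch of what that could look like (hypothetical, since sprig is not wired into prometheus-msteams yet), sprig's string helpers could replace the nested reReplaceAll/js calls:

```
{{/* hypothetical: requires sprig to be registered with the template engine */}}
"value": {{ $value | replace "_" "\\_" | quote }}
```

Here `replace` swaps every `_` for `\_`, and `quote` wraps the result in double quotes with Go-style escaping, so the rendered JSON stays valid.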
I am afraid it may take a while due to personal time constraints.
Many thanks for prometheus-msteams project !
I've opened a PR to modify the default template to handle newlines and single quotes, please review: #77
I spent a bit more time looking at this today, and it seems it doesn't like the line breaks.
I modified the message here from the template to be a single line and removed the `"` as well to make sure it was OK, and it works now. So it looks like it's not handling multi-line alerts/messages?
I used this website to help out
https://jsonformatter.curiousconcept.com/
helm/prometheus-operator/templates/prometheus/rules/general.rules.yaml
@@ -29,17 +29,7 @@ spec:
severity: warning
- alert: Watchdog
annotations:
- message: 'This is an alert meant to ensure that the entire alerting pipeline is functional.
-
- This alert is always firing, therefore it should always be firing in Alertmanager
-
- and always fire against a receiver. There are integrations with various notification
-
- mechanisms that send a notification when this alert is not firing. For example the
-
- DeadMansSwitch integration in PagerDuty.
-
- '
+ message: 'This is an alert meant to ensure that the entire alerting pipeline is functional. This alert is always firing, therefore it should always be firing in Alertmanager and
expr: vector(1)
The Go template `js` function seems to produce quoting which works correctly with msteams:
Example:
"[{{ js $alert.Annotations.message }}]({{ $externalUrl }})",
@nickadams675
the "greater than" in your alert description is triggering this:
"description\":\"CPU load is > 80%\\n
I will look for a solution, but until we have one, as a workaround you may want to use "CPU load greater than 80%" for your description.
The `js` function in the template is replacing `>` with `\x3E`, causing the issue.
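For reference, a small standalone Go check of that behavior (a sketch; it assumes the template's `js` function maps to text/template's JSEscapeString, and note that newer Go versions emit `\u003E` instead):

```go
package main

import (
	"fmt"
	"text/template"
)

func main() {
	// JS-escaping '>' yields \x3E on the Go version used here: a valid
	// JavaScript escape, but not a valid JSON one, so the rendered card
	// fails to parse.
	fmt.Println(template.JSEscapeString("CPU load is > 80%"))
}
```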
@dlevene1 hello, does it work ok with the previous prom-msteams version?
Hi @dlevene1, it seems you don't escape `\\\"DeadMansSnitch\\\"` correctly. It should be `\\\\"DeadMansSnitch\\\\"` in the Prometheus alert.
We know that the log output is not very good for debugging; we will improve this soon and output JSON, which will remove the other confusing backslashes. With that improvement you would immediately see, in the `level=debug msg="Alert rendered in template file:` output, that `DeadMansSnitch` has a non-escaped `"`.
Hey @Knappek, Thanks for the response.
So to confirm, this breaks it as it should be escaped?
https://github.com/helm/charts/blob/master/stable/prometheus-operator/templates/prometheus/rules/general.rules.yaml
general.rules.yaml - link to the particular release in case master is updated
Regards
David
@dlevene1 sorry for getting back to you so late, it was a crazy week. I don't think multiline messages are the problem; it should be able to handle them. Simply remove the `"` and everything should be fine. Can you test that and verify please?
If you still want to have the `"`, this message should work:
message: 'This is an alert meant to ensure that the entire alerting pipeline is functional.
This alert is always firing, therefore it should always be firing in Alertmanager
and always fire against a receiver. There are integrations with various notification
mechanisms that send a notification when this alert is not firing. For example the
\"DeadMansSnitch\" integration in PagerDuty.
'
I don't have time now to test this, but I can do that today in the evening and give you an update if you want.
Hey @Knappek,
I just left it as a single line, as it works for me. The concern is that out of the box the prometheus-operator helm chart and prometheus-msteams don't work together and display errors. I discovered 2 issues:
- The default alert noted in a previous post is the default in the prometheus-operator helm chart; while this isn't in your control, it does break the test alert.
- When multiple alerts were firing, CommonAnnotations was an empty string, which made the summary an empty string too, breaking Teams.
time="2019-08-08T11:03:31Z" level=info msg="Microsoft Teams response text: Summary or Text is required."
time="2019-08-08T11:03:31Z" level=error msg="Failed sending to the Teams Channel. Teams http response: 400 Bad Request"
I worked around this by always putting the external URL in the summary, plus the message if there was one.
"summary": "{{ $externalUrl }} - {{- if eq .CommonAnnotations.summary "" -}}
{{- if eq .CommonAnnotations.message "" -}}
{{- .CommonLabels.alertname -}}
{{- else -}}
{{- .CommonAnnotations.message -}}
{{- end -}}
{{- else -}}
{{- .CommonAnnotations.summary -}}
{{- end -}}",
I'm happy for this issue to be closed as I have worked through the issues myself, but others may hit the same.
@dlevene1 I see your point and I agree that it should work out of the box. Would you like to open a PR with an updated default card template where summary can't be empty? Regarding the `"`, maybe this can be fixed as well in the template with the `reReplaceAll` function. What do you think?
I would love to test this as well, but I don't think I will find the time in the next 3 weeks I'm afraid.
Hey, I looked at this again today, and tried escaping like this
"summary": "{{ $externalUrl }} - {{- reReplaceAll "(\"_)" "\\\\$1" $summary -}}",
But in the debug output - the reReplaceAll didn't have any effect on the alert rendered in the template file - am I missing something?
Even if I remove that reReplaceAll completely, it doesn't change the output as shown below.
level=debug msg="Prometheus Alert:
\\\"DeadMansSnitch\\\"
level=debug msg="Alert rendered in template file:
\"DeadMansSnitch\"
Hey, I'm sorry, I was not able to test this on my own and I don't have a computer with me for the next few weeks. I'd try to play around with escaping; I'm always confused by escaping, so I don't have a better idea I'm afraid 😀
Unfortunately this didn't fix the problem; it did escape it differently though. Again, I'm not modifying what's in the prometheus-operator helm chart upstream.
time="2019-08-30T05:25:45Z" level=debug msg="Prometheus Alert: {\"receiver\":\"high_priority_receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"annotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"startsAt\":\"2019-08-30T05:25:15.470372903Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://prometheus-dashboard.sandpit/graph?g0.expr=vector%281%29\\u0026g0.tab=1\"}],\"groupLabels\":{},\"commonLabels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"commonAnnotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"externalURL\":\"http://alertmanager.sandpit\",\"version\":\"4\",\"groupKey\":\"{}:{}\"}"
time="2019-08-30T05:25:45Z" level=debug msg="Alert rendered in template file: \n{\n \n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"808080\",\n \"summary\": \"http://alertmanager.sandpit -This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\",\n \"title\": \"Prometheus Alert (firing) - http://alertmanager.sandpit\",\n \"sections\": [\n {\n \"activityTitle\": \"[This is an alert meant to ensure that the entire alerting pipeline is functional.\\u000AThis alert is always firing, therefore it should always be firing in Alertmanager\\u000Aand always fire against a receiver. There are integrations with various notification\\u000Amechanisms that send a notification when this alert is not firing. For example the\\u000A\\\"DeadMansSnitch\\\" integration in PagerDuty.\\u000A](http://alertmanager.sandpit)\",\n \"facts\": [\n {\n \"name\": \"message\",\n \"value\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\\u000AThis alert is always firing, therefore it should always be firing in Alertmanager\\u000Aand always fire against a receiver. There are integrations with various notification\\u000Amechanisms that send a notification when this alert is not firing. For example the\\u000A\\\"DeadMansSnitch\\\" integration in PagerDuty.\\u000A\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"prometheus\",\n \"value\": \"monitoring/prometheus-operator-prometheus\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"none\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2019-08-30T05:25:45Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
time="2019-08-30T05:25:45Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
# cat /etc/template/card.tmpl
{{ define "teams.card" }}
{
{{$externalUrl := .ExternalURL}}
"@type": "MessageCard",
"@context": "http://schema.org/extensions",
"themeColor": "{{- if eq .Status "resolved" -}}2DC72D
{{- else if eq .Status "firing" -}}
{{- if eq .CommonLabels.severity "critical" -}}8C1A1A
{{- else if eq .CommonLabels.severity "warning" -}}FFA500
{{- else -}}808080{{- end -}}
{{- else -}}808080{{- end -}}",
"summary": "{{ $externalUrl }} - {{- if eq .CommonAnnotations.summary "" -}}
{{- if eq .CommonAnnotations.message "" -}}
{{- .CommonLabels.alertname -}}
{{- else -}}
{{- .CommonAnnotations.message -}}
{{- end -}}
{{- else -}}
{{- .CommonAnnotations.summary -}}
{{- end -}}",
"title": "Prometheus Alert ({{ .Status }}) - {{ $externalUrl }}",
"sections": [
{{- range $index, $alert := .Alerts }}{{- if $index }},{{- end }}
{
"activityTitle": "[{{ js $alert.Annotations.message }}]({{ $externalUrl }})",
"facts": [
{{- range $key, $value := $alert.Annotations }}
{
"name": "{{ js $key }}",
"value": "{{ js $value }}"
},
{{- end -}}
{{$c := counter}}{{ range $key, $value := $alert.Labels }}{{if call $c}},{{ end }}
{
"name": "{{ js $key }}",
"value": "{{ js $value }}"
}
{{- end }}
],
"markdown": true
}
{{- end }}
]
}
{{ end }}
@dlevene1 in my case I also had to:
- add quoting to summary section formatting (.CommonAnnotations.message)
- quote ExternalURL and GeneratorURL
- keep reReplaceAll, e.g.:
{{ reReplaceAll "_" "\\\\_" $value | js }}
- as I understand the default card template, templating may fail if an alert arrives without labels (the rendered JSON will have an extra trailing comma); see the sketch after this list
Related to templating, but does not cause alert transmit to fail:
- prometheus may send multiple unrelated alerts in a batch. In that case there may be no commonAnnotations/commonLabels. Templating will not fail, but the summary field will be left empty.
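A minimal sketch of guarding against that trailing comma, reusing the counter/call trick already in the template (untested; assumes the project's custom counter helper):

```
"facts": [
{{- $c := counter }}
{{- range $key, $value := $alert.Annotations }}{{ if call $c }},{{ end }}
{ "name": "{{ js $key }}", "value": "{{ js $value }}" }
{{- end }}
{{- range $key, $value := $alert.Labels }}{{ if call $c }},{{ end }}
{ "name": "{{ js $key }}", "value": "{{ js $value }}" }
{{- end }}
],
```

Because a single counter spans both loops, a comma is only emitted between facts, so empty Annotations or Labels no longer produce invalid JSON.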
Thanks for the info. I've managed to get it working; I ended up modifying the source of the alerts as there are only a couple. If you have resolved it as a whole, could you open a PR to fix it for everyone?
Hello guys, sorry for getting back to you late. Thank you @fuzzycow for helping out; if you could help improve the default template to make it work with whatever prometheus alert comes in, that would be great :).
Ha, never heard of sprig, looks promising. Take your time, looking forward to your contribution :)
Hello,
is there any progress on this? I've tried to use a replaceAll for escaping `"` but had no success there. Did anyone manage to do that?
@VincenzoDo I think there is no further progress yet. I am trying to understand your use case: what does your incoming prometheus alert look like? I mean, it is JSON; why do you want to replace quotes?
We are also facing the same issue, with the error "Failed to parse json with key 'sections': Key path not found", with version v1.1.4 for some of our incoming prometheus alerts. Any idea when this could be fixed? Any workaround would be helpful.
I solved the issue by using URL encoding for all `"` (%22). I've also done some tests and found that the reReplaceAll for `_` (https://github.com/bzon/prometheus-msteams/blob/master/default-message-card.tmpl#L28) seems to have no effect.
@karuneshk can you provide the prometheus alert that is causing the error please?
I resolved my particular issue by modifying the prometheus-operator helm chart. My notes against it, in the event I need to do an upgrade, are:
The default alert has `"` and some line breaks in it which, if not escaped correctly, will break the version of prometheus-msteams used, and you'll see errors in the logs of that helm chart.
Ensure that the file `prometheus-operator/templates/prometheus/rules/general.rules.yaml` has the alert message updated to something more appropriate.
To see any errors in msteams this command should do the trick: `kubectl logs -n monitoring -l app=prometheus-msteams | grep 'http status' | grep -v 200`
- alert: Watchdog
annotations:
message: 'This is an alert meant to ensure that the entire alerting pipeline is functional.'
expr: vector(1)
labels:
severity: none
> I solved the issue by using URL encoding for all `"` (%22). I've also done some tests and found that the reReplaceAll for `_` (https://github.com/bzon/prometheus-msteams/blob/master/default-message-card.tmpl#L28) seems to have no effect.

Oh, I wasn't aware of that. Thanks for the hint; I will review in the following days, and maybe I had some reasons to use it ;). Will keep you posted.
> I resolved my particular issue by modifying the prometheus-operator helm chart. My notes against it, in the event I need to do an upgrade, are: the default alert has `"` and some line breaks in it which, if not escaped correctly, will break the version of prometheus-msteams used, and you'll see errors in the logs of that helm chart. Ensure that the file `prometheus-operator/templates/prometheus/rules/general.rules.yaml` has the alert message updated to something more appropriate. To see any errors in msteams this command should do the trick: `kubectl logs -n monitoring -l app=prometheus-msteams | grep 'http status' | grep -v 200`
> - alert: Watchdog
>   annotations:
>     message: 'This is an alert meant to ensure that the entire alerting pipeline is functional.'
>   expr: vector(1)
>   labels:
>     severity: none
Interesting! I haven't seen an alert with quotes yet and I can't imagine what this looks like, as the incoming request is JSON. Can somebody please provide an example alert which has quotes in it? Are the `"` already escaped with `\"`?
the very first post in this issue has the output from msteams
Finally I was able to reproduce the error. I fixed it by using `reReplaceAll "\"" "\\\""`. Make sure to also provide it for the `activityTitle`.
It should be used in conjunction with `reReplaceAll "_" "\\\\_"`, but I don't know how to apply multiple functions in Go templating and I haven't found anything on Google ^^. Anybody having some ideas?
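Untested sketch of one way to chain them: Alertmanager-style template functions take the text as their last argument, so the pipe operator composes them, each piped value becoming the final argument of the next call:

```
"value": "{{ $value | reReplaceAll "_" "\\\\_" | reReplaceAll "\"" "\\\"" | js }}"
```

(Whether `js` should stay at the end of the chain depends on the quote-escaping fix above; this only illustrates the piping mechanics.)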
On v1.1.3 and v1.1.4 I'm getting the error when there is a `'` char in any of these values of an alert. I haven't tested older versions yet. Hope it offers some kind of clue.
{
"version": "4",
"groupKey": "{}:{alertname=\"high_memory_load\"}",
"status": "firing",
"receiver": "teams_proxy",
"groupLabels": {
"alertname": "high_memory_load"
},
"commonLabels": {
"alertname": "high_memory_load",
"monitor": "master",
"severity": "warning"
},
"commonAnnotations": {
"summary": "Server High Memory usage" // here
},
"externalURL": "http://alertmanager:9093",
"alerts": [
{
"labels": {
"alertname": "high_memory_load", // here
"instance": "10.80.40.11:9100", // here
"job": "docker_nodes", // here
"monitor": "master", // here
"severity": "warning" // here
},
"annotations": {
"description": "xxxxxxx", // here
"summary": "Server High Memory usage" // here
},
"startsAt": "2018-03-07T06:33:21.873077559-05:00",
"endsAt": "0001-01-01T00:00:00Z"
}
]
}
Edit: Sorry, I should mention that this only happens with `"value": "{{ reReplaceAll "_" "\\\\_" $value | js }}"` instead of `"value": "{{ reReplaceAll "_" "\\\\_" $value }}"` in `card-with-action.tmpl`, as suggested by the above modified template. Once I reverted the change there was no such error.
Which means there's something in the `js` pipeline function when it is escaping `'`.
Demo: So I append a `'` to the end of the `instance` value of the `sample.alert.json` so it becomes `"instance": "10.80.40.11:9100'"`.
When using `"value": "{{ reReplaceAll "_" "\\\\_" $value }}"` in the `card-with-action.tmpl`, the card is successfully sent. Rendered alert in logs:
prometheus-msteams_1 | time="2019-12-04T22:34:22Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"FFA500\",\n \"summary\": \"Server High Memory usage\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[xxxxxxx](http://alertmanager:9093)\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"xxxxxxx\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"Server High Memory usage\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"high\\\\_memory\\\\_load\"\n },\n {\n \"name\": \"instance\",\n \"value\": \"10.80.40.11:9100'\"\n },\n {\n \"name\": \"job\",\n \"value\": \"docker\\\\_nodes\"\n },\n {\n \"name\": \"monitor\",\n \"value\": \"master\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"warning\"\n }\n ],\n \"markdown\": true\n }\n ],\n \"potentialAction\": [\n {\n \"@context\": \"http://schema.org\",\n \"@type\": \"ViewAction\",\n \"name\": \"Runbook\",\n \"target\": [\n \"\"\n ]\n }\n ]\n}\n"
prometheus-msteams_1 | time="2019-12-04T22:34:22Z" level=debug msg="Size of message is 669 Bytes (~0 KB)"
However, when using `"value": "{{ reReplaceAll "_" "\\\\_" $value | js }}"`, I get the error. Rendered alert in logs:
prometheus-msteams_1 | time="2019-12-04T22:34:30Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"FFA500\",\n \"summary\": \"Server High Memory usage\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[xxxxxxx](http://alertmanager:9093)\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"xxxxxxx\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"Server High Memory usage\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"high\\\\\\\\_memory\\\\\\\\_load\"\n },\n {\n \"name\": \"instance\",\n \"value\": \"10.80.40.11:9100\\'\"\n },\n {\n \"name\": \"job\",\n \"value\": \"docker\\\\\\\\_nodes\"\n },\n {\n \"name\": \"monitor\",\n \"value\": \"master\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"warning\"\n }\n ],\n \"markdown\": true\n }\n ],\n \"potentialAction\": [\n {\n \"@context\": \"http://schema.org\",\n \"@type\": \"ViewAction\",\n \"name\": \"Runbook\",\n \"target\": [\n \"\"\n ]\n }\n ]\n}\n" // Not compacted
prometheus-msteams_1 | time="2019-12-04T22:34:30Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
prometheus-msteams_1 | time="2019-12-04T22:34:30Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
The difference is that in the former, the rendered template shows `\"value\": \"10.80.40.11:9100'\"\n`, whereas in the latter it shows `\"value\": \"10.80.40.11:9100\\'\"\n`. The extra `\` added by the `js` pipeline function makes the final rendered JSON invalid.
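A quick standalone check of that claim (a sketch using Go's encoding/json):

```go
package main

import (
	"encoding/json"
	"fmt"
)

func main() {
	// A bare apostrophe is fine in a JSON string, but \' is a JavaScript
	// escape that JSON rejects, which is why the rendered card fails to parse.
	fmt.Println(json.Valid([]byte(`{"v":"10.80.40.11:9100'"}`)))  // true
	fmt.Println(json.Valid([]byte(`{"v":"10.80.40.11:9100\'"}`))) // false
}
```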
It would be great if we used a proper templating library like `sprig`, as mentioned above; that would solve a lot of these kinds of issues.
Thanks @leojonathanoh for the detailed explanations. Yes, absolutely, I would very much appreciate using a templating library like `sprig`. Currently I do not have much time though, hence I would highly appreciate some contributions :).
I've run into this issue as well. It looks like line breaks trigger this bug.
See the extra quote in the activityTitle field:
time="2019-12-08T23:14:34Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"808080\",\n \"summary\": \"Umonitored EC2 instance: localhost:9090\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[\"EC2 instance localhost:9090 is not reachable by prometheus.\nPlease run telegraf and expose port 9273 with security group rules.\nFix to monitor for CPU, memory, disk and network usage\"\n](http://7f764863d830:9093)\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"\"EC2 instance localhost:9090 is not reachable by prometheus.\nPlease run telegraf and expose port 9273 with security group rules.\nFix to monitor for CPU, memory, disk and network usage\"\n\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"Umonitored EC2 instance: localhost:9090\"\n },\n {\n \"name\": \"account\\\\_id\",\n \"value\": \"account-id\"\n },\n {\n \"name\": \"alert\\\\_team\",\n \"value\": \"ad\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"test-alert\"\n },\n {\n \"name\": \"environment\",\n \"value\": \"dev\"\n },\n {\n \"name\": \"instance\",\n \"value\": \"localhost:9090\"\n },\n {\n \"name\": \"job\",\n \"value\": \"self-scrape\"\n },\n {\n \"name\": \"pod\",\n \"value\": \"ad\"\n },\n {\n \"name\": \"region\",\n \"value\": \"us-east-1\"\n },\n {\n \"name\": \"service\",\n \"value\": \"self-scraps\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"Low\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2019-12-08T23:14:34Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
time="2019-12-08T23:14:34Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
Adding `js` in the template field fixes it: `"[{{ js $alert.Annotations.message }}]({{ $externalUrl }})",`
Why is the `reReplaceAll` necessary? (This replaces all `_` with `\\_` in `$value`, correct?)
`reReplaceAll "_" "\\\\_" $value`
Is this an ms teams restriction with `_`?
Exactly, it is needed to escape `_` when markdown is enabled; otherwise the text will be printed in italics.
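For illustration (hypothetical fact value), this is how the two variants render in Teams:

```
high_memory_load      -> "memory" shown in italics (the _..._ pair is markdown emphasis)
high\_memory\_load    -> high_memory_load, shown literally
```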
should be fixed in v1.1.5, thanks to @AnandDevan 🎉
@Knappek Still seeing this occurring:
time="2020-01-14T16:24:21Z" level=info msg="Version: v1.1.5, Commit: f74544b, Branch: HEAD, Build Date: 2020-01-03T11:09:13+0000"
time="2020-01-14T16:24:21Z" level=info msg="Parsing the message card template file: ./default-message-card.tmpl"
time="2020-01-14T16:24:21Z" level=info msg="Creating the server request path "/alertmanager""
time="2020-01-14T16:24:21Z" level=info msg="prometheus-msteams server started listening at 0.0.0.0:2000"
time="2020-01-14T16:25:20Z" level=info msg="/alertmanager received a request"
time="2020-01-14T16:25:20Z" level=error msg="Error calling json.compact: invalid character 'x' in string escape code"
time="2020-01-14T16:25:20Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
Apologies to comment on a closed issue :(
@nickadams675 Can you also post the message JSON from the logs so that we can look at it and troubleshoot more?
> @nickadams675 Can you also post the message JSON from the logs so that we can look at it and troubleshoot more?

@Knappek
I added:

- name: PROMTEAMS_DEBUG
  value: "true"

Though it appears it may not be honoring this var; did I miss something?
time="2020-01-14T17:43:04Z" level=info msg="Version: v1.1.5, Commit: f74544b, Branch: HEAD, Build Date: 2020-01-03T11:09:13+0000"
time="2020-01-14T17:43:04Z" level=info msg="Parsing the message card template file: ./default-message-card.tmpl"
time="2020-01-14T17:43:04Z" level=info msg="Creating the server request path "/alertmanager""
time="2020-01-14T17:43:04Z" level=info msg="prometheus-msteams server started listening at 0.0.0.0:2000"
time="2020-01-14T17:46:20Z" level=info msg="/alertmanager received a request"
time="2020-01-14T17:46:22Z" level=error msg="Error calling json.compact: invalid character 'x' in string escape code"
time="2020-01-14T17:46:22Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
The message JSON may be a few lines before this error in the logs. Here's an example someone submitted earlier in the discussion, see: #56 (comment)
time="2019-08-05T07:42:15Z" level=debug msg="Prometheus Alert: {\"receiver\":\"high_priority_receiver\",\"status\":\"firing\",\"alerts\":[{\"status\":\"firing\",\"labels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"annotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"startsAt\":\"2019-08-05T07:36:45.470372903Z\",\"endsAt\":\"0001-01-01T00:00:00Z\",\"generatorURL\":\"http://prometheus-dashboard.testing/graph?g0.expr=vector%281%29\\u0026g0.tab=1\"}],\"groupLabels\":{},\"commonLabels\":{\"alertname\":\"Watchdog\",\"prometheus\":\"monitoring/prometheus-operator-prometheus\",\"severity\":\"none\"},\"commonAnnotations\":{\"message\":\"This is an alert meant to ensure that the entire alerting pipeline is functional.\\nThis alert is always firing, therefore it should always be firing in Alertmanager\\nand always fire against a receiver. There are integrations with various notification\\nmechanisms that send a notification when this alert is not firing. For example the\\n\\\"DeadMansSnitch\\\" integration in PagerDuty.\\n\"},\"externalURL\":\"http://alertmanager.testing\",\"version\":\"4\",\"groupKey\":\"{}:{}\"}"
time="2019-08-05T07:42:15Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"808080\",\n \"summary\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\",\n \"title\": \"Prometheus Alert (firing)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[ aaaa](http://alertmanager.testing)\",\n \"facts\": [\n {\n \"name\": \"message\",\n \"value\": \"This is an alert meant to ensure that the entire alerting pipeline is functional.\nThis alert is always firing, therefore it should always be firing in Alertmanager\nand always fire against a receiver. There are integrations with various notification\nmechanisms that send a notification when this alert is not firing. For example the\n\"DeadMansSnitch\" integration in PagerDuty.\n\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"Watchdog\"\n },\n {\n \"name\": \"prometheus\",\n \"value\": \"monitoring/prometheus-operator-prometheus\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"none\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2019-08-05T07:42:15Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
time="2019-08-05T07:42:15Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
@AnandDevan Apologies, I don't have debug-level logs appearing in the container's stdout. I've added `PROMTEAMS_DEBUG: "true"` but still no joy; any thoughts?
Can you provide the log-level env var or CLI option to change the log level? I don't see it referenced in any of the documentation.
@AnandDevan and @Knappek
OK, so the log level was changed in the latest commit, and 1.1.5 is hardcoded to INFO and not DEBUG. I redeployed 1.1.4 with the default template of 1.1.5 for testing:
time="2020-01-14T19:49:20Z" level=info msg="/alertmanager received a request"
time="2020-01-14T19:49:20Z" level=debug msg="Prometheus Alert: {\"receiver\":\"webhook\",\"status\":\"resolved\",\"alerts\":[{\"status\":\"resolved\",\"labels\":{\"alertname\":\"HighCpuLoad\",\"instance\":\"10.121.97.27:9100\",\"severity\":\"warning\"},\"annotations\":{\"description\":\"CPU load is \\u003e 80%\\n VALUE = 87.44583333333442\\n LABELS: map[instance:10.121.97.27:9100]\",\"summary\":\"High CPU load (instance 10.121.97.27:9100)\"},\"startsAt\":\"2020-01-14T19:34:10.000423689Z\",\"endsAt\":\"2020-01-14T19:47:10.00038213Z\",\"generatorURL\":\"http://sanatized:9090/graph?g0.expr=100+-+%28avg+by%28instance%29+%28irate%28node_cpu_seconds_total%7Bmode%3D%22idle%22%7D%5B5m%5D%29%29+%2A+100%29+%3E+80\\u0026g0.tab=1\"}],\"groupLabels\":{\"alertname\":\"HighCpuLoad\"},\"commonLabels\":{\"alertname\":\"HighCpuLoad\",\"instance\":\"10.121.97.27:9100\",\"severity\":\"warning\"},\"commonAnnotations\":{\"description\":\"CPU load is \\u003e 80%\\n VALUE = 87.44583333333442\\n LABELS: map[instance:10.121.97.27:9100]\",\"summary\":\"High CPU load (instance 10.121.97.27:9100)\"},\"externalURL\":\"\",\"version\":\"4\",\"groupKey\":\"{}:{alertname=\\\"HighCpuLoad\\\"}\"}"
time="2020-01-14T19:49:20Z" level=debug msg="Alert rendered in template file: \n{\n \"@type\": \"MessageCard\",\n \"@context\": \"http://schema.org/extensions\",\n \"themeColor\": \"2DC72D\",\n \"summary\": \"High CPU load (instance 10.121.97.27:9100)\",\n \"title\": \"Prometheus Alert (resolved)\",\n \"sections\": [ \n {\n \"activityTitle\": \"[CPU load is \\x3E 80%\\u000A VALUE = 87.44583333333442\\u000A LABELS: map[instance:10.121.97.27:9100]]()\",\n \"facts\": [\n {\n \"name\": \"description\",\n \"value\": \"CPU load is \\x3E 80%\\u000A VALUE = 87.44583333333442\\u000A LABELS: map[instance:10.121.97.27:9100]\"\n },\n {\n \"name\": \"summary\",\n \"value\": \"High CPU load (instance 10.121.97.27:9100)\"\n },\n {\n \"name\": \"alertname\",\n \"value\": \"HighCpuLoad\"\n },\n {\n \"name\": \"instance\",\n \"value\": \"10.121.97.27:9100\"\n },\n {\n \"name\": \"severity\",\n \"value\": \"warning\"\n }\n ],\n \"markdown\": true\n }\n ]\n}\n"
time="2020-01-14T19:49:20Z" level=debug msg="Size of message is 0 Bytes (~0 KB)"
time="2020-01-14T19:49:20Z" level=error msg="Failed to parse json with key 'sections': Key path not found"
Thanks!
Can you set the --log-level to DEBUG and try? (Default is now INFO.)
By using a --config file, you will be able to define multiple prometheus request URIs and webhooks for different channels.
This is an example config file in YAML format:

```yaml
---
connectors:
- channel_1: https://outlook.office.com/webhook/xxxx/hook/for/channel1
- channel_2: https://outlook.office.com/webhook/xxxx/hook/for/channel2
```

Usage:

```
prometheus-msteams server [flags]

Flags:
      --config string                    The connectors configuration file.
                                         WARNING: 'request-uri' and 'webhook-url' flags will be ignored if this is used.
  -h, --help                             help for server
      --idle-conn-timeout duration       The idle connection timeout (in seconds) (default 1m30s)
  -l, --listen-address string            The address on which the server will listen to. (default "0.0.0.0")
      --log-level string                 Log levels: INFO | DEBUG | WARN | ERROR | FATAL | PANIC (default "INFO")
      --markdown                         Format the prometheus alert in Microsoft Teams with markdown. (default true)
  -m, --max-idle-conns int               The maximum number of idle connections allowed (default 100)
  -p, --port int                         The port on which the server will listen to. (default 2000)
  -r, --request-uri string               The default request uri path where Prometheus will post to. (default "alertmanager")
  -t, --template-file string             The Microsoft Teams Message Card template file. (default "./default-message-card.tmpl")
      --tls-handshake-timeout duration   The TLS handshake timeout (in seconds) (default 30s)
  -w, --webhook-url string               The default Microsoft Teams Webhook connector.
```
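For example, a debug run using the flags above might look like this (hypothetical webhook URL):

```
prometheus-msteams server \
  --log-level DEBUG \
  --template-file ./default-message-card.tmpl \
  --webhook-url "https://outlook.office.com/webhook/xxxx/hook/for/channel1"
```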
@AnandDevan Thanks for the quick find! Working on changing this ASAP, thanks!