ipahealthcheck_exporter

Prometheus exporter for exposing ipa-healthcheck metrics. It's essentially a wrapper around the ipa-healthcheck command.

Prerequisites

Running

You can run the exporter locally:

# ./ipahealthcheck_exporter
INFO[0000] ipa-healthcheck exporter listening on http://0.0.0.0:9888  source="ipahealthcheck_exporter.go:139"

Or with a systemd service:

[Unit]
Description=Prometheus ipahealthcheck_exporter
Wants=basic.target
After=basic.target network.target

[Service]
User=ipahealthcheck-exporter
Group=ipahealthcheck-exporter
ExecStart=/usr/local/bin/ipahealthcheck_exporter

ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always

[Install]
WantedBy=multi-user.target

The following arguments are supported:

# ./ipahealthcheck_exporter -h
Usage of ./ipahealthcheck_exporter:
  -address string
    	Address on which to expose metrics. (default "0.0.0.0")
  -ipahealthcheck-log-path string
    	Path to the ipa-healthcheck log file. (default "/var/log/ipa/healthcheck/healthcheck.log")
  -ipahealthcheck-path string
    	Path to the ipa-healthcheck binary. (default "/usr/bin/ipa-healthcheck")
  -metrics-path string
    	Path under which to expose the metrics. (default "/metrics")
  -port int
    	Port on which to expose metrics. (default 9888)
  -sudo
    	Use privilege escalation to run the health checks
  -v	Verbose mode (print more logs)

Exported Metrics

# HELP ipa_cert_expiration Expiration date of the certificates in warning or error state (unix timestamp)
# TYPE ipa_cert_expiration gauge
ipa_cert_expiration{certificate_request_id="20200626075943"} 1.604761504e+09
...
# HELP ipa_dogtag_connectivity_check Check to verify dogtag basic connectivity. (1: success, 0: error)
# TYPE ipa_dogtag_connectivity_check gauge
ipa_dogtag_connectivity_check{ipahealthcheck="DogtagCertsConnectivityCheck"} 1
# HELP ipa_replication_check Replication checks (1: success, 0: error)
# TYPE ipa_replication_check gauge
ipa_replication_check{ipahealthcheck="ReplicationConflictCheck"} 1
# HELP ipa_service_state State of the services monitored by IPA healthcheck (1: running, 0: not running)
# TYPE ipa_service_state gauge
ipa_service_state{service="certmonger"} 1
ipa_service_state{service="httpd"} 1
...
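
The samples above follow the Prometheus text exposition format. As a rough illustration of how a consumer might read them, here is a minimal Python sketch. The function name `parse_samples` is hypothetical, and the parser only handles the simple `name{label="value"} number` shape shown above (no escaping, timestamps, or histograms).

```python
import re

# Matches lines such as: ipa_service_state{service="httpd"} 1
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>[^}]*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def parse_samples(text):
    """Return a list of (metric_name, labels_dict, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines, HELP/TYPE comments, and "..." elisions.
        if not line or line.startswith("#") or line == "...":
            continue
        match = SAMPLE_RE.match(line)
        if match is None:
            continue
        labels = dict(LABEL_RE.findall(match.group("labels") or ""))
        samples.append((match.group("name"), labels, float(match.group("value"))))
    return samples


EXAMPLE = '''
# TYPE ipa_service_state gauge
ipa_service_state{service="certmonger"} 1
ipa_service_state{service="httpd"} 1
ipa_cert_expiration{certificate_request_id="20200626075943"} 1.604761504e+09
'''

for name, labels, value in parse_samples(EXAMPLE):
    print(name, labels, value)
```

For anything beyond a quick check, a full exposition-format parser is a safer choice than this regular-expression approach.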

Prometheus

Labels

The exporter labels are the following:

  • severity: Severity of the check ("success", "critical", "error", "warning")
  • source: Source (plugin / list of checks) of the check
  • check: Name of the check

Alerts

Alerting rules

Here is an example of two alerting rules that fire when a check is in a bad state:

- alert: IPAHealthcheckIsCritical
  expr: ipa_healthcheck_state{severity="critical"} == 1
  for: 5m
  labels:
    severity: critical
  annotations:
    description: "An IPA healthcheck is in critical state ( {{ $labels.source }} / {{ $labels.check }} )"
- alert: IPAHealthcheckIsError
  expr: ipa_healthcheck_state{severity="error"} == 1
  for: 5m
  labels:
    severity: error
  annotations:
    description: "An IPA healthcheck is in error state ( {{ $labels.source }} / {{ $labels.check }} )"

Misc

When a check is in error, you can rerun it on the server with the following command to get more information about the problem:

# ipa-healthcheck --source <label "source"> --check <label "check">

We currently have to use the --output-file option of the ipa-healthcheck command and a temp file to parse the checks; otherwise some warnings are written to stdout alongside the JSON output.
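
The temp-file approach described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the exporter's actual code (which is written in Go): the helper name `run_healthcheck` is hypothetical, and the only ipa-healthcheck option it relies on is `--output-file`, mentioned above.

```python
import json
import os
import subprocess
import tempfile


def run_healthcheck(base_cmd, extra_args=()):
    """Run base_cmd with --output-file set to a temp file and parse the JSON.

    base_cmd is a list such as ["/usr/bin/ipa-healthcheck"]. Anything the
    command prints to stdout is discarded; only the temp file is parsed.
    """
    fd, path = tempfile.mkstemp(prefix="ipa-healthcheck.out")
    os.close(fd)
    try:
        cmd = list(base_cmd) + list(extra_args) + ["--output-file", path]
        # The exit status is deliberately not checked: the logs quoted in
        # the issues show the tool exiting with status 1 in some runs.
        subprocess.run(cmd, stdout=subprocess.DEVNULL, check=False)
        with open(path, encoding="utf-8") as handle:
            # Raises json.JSONDecodeError if the tool wrote nothing.
            return json.load(handle)
    finally:
        os.remove(path)
```

A call such as `run_healthcheck(["/usr/bin/ipa-healthcheck"], ["--source", "ipahealthcheck.meta.services"])` would return the parsed list of checks, provided the command is installed and the caller has sufficient privileges.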

TODO:

  • Implement our own direct scraping mechanism (via ipalib) so the exporter is not tied to ipa-healthcheck and performs better.
  • Use prometheus/common/promlog package for logging handling.

ipahealthcheck_exporter's People

Contributors

lconsuegra, tgerczei

ipahealthcheck_exporter's Issues

Please add an option to set the listening IP address to localhost

Thank you for the exporter!

I'm trying to get all network traffic encrypted and to store telemetry outside of the IPA server, so I'm looking at ingesting the Prometheus data using Telegraf, from which I can ship the data once a day to InfluxDB. The Telegraf agent is already present and securely ships system telemetry, so it's easy for me to just add another input there.

It would be great to have the option of not exposing the Prometheus port, when only needing a localhost connection.

ipahealthcheck_exporter.service: Start request repeated too quickly.

Hi, I am trying to start the exporter as a service and I get this error in the systemctl output:

ipahealthcheck_exporter.service: Main process exited, code=exited, status=203/EXEC
ipahealthcheck_exporter.service: Failed with result 'exit-code'.
ipahealthcheck_exporter.service: Scheduled restart job, restart counter is at 5.
Stopped Prometheus ipahealthcheck_exporter.
ipahealthcheck_exporter.service: Start request repeated too quickly.
ipahealthcheck_exporter.service: Failed with result 'exit-code'.
Failed to start Prometheus ipahealthcheck_exporter.

I've installed it on the Prometheus server. Should the ipa-healthcheck tool also be installed on the same machine? I am confused by this error message 🤔

Check for empty file before trying to scrape it [request]

Jun 15 02:05:07 ipa.domain.com ipahealthcheck_exporter[34663]: time="2024-06-15T02:05:07+02:00" level=error msg="Cannot unmarshal json from ipa-healthcheck log file: unexpected end of JSON input" source="ipahealthcheck_exporter.go:176"

This error is shown after the healthcheck.log file is (log)rotated, until the healthcheck is run again. So it might be a good idea to add a check for an empty file before trying to read json from it.
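
The check requested above could look roughly like the following Python sketch. It illustrates the idea only and is not the exporter's code; the function name `load_healthcheck_log` is hypothetical.

```python
import json
import os


def load_healthcheck_log(path):
    """Return the parsed checks, or None when the log is missing or empty."""
    try:
        if os.path.getsize(path) == 0:
            return None  # e.g. right after log rotation
    except FileNotFoundError:
        return None
    with open(path, encoding="utf-8") as handle:
        content = handle.read()
    if not content.strip():
        return None  # only whitespace: still nothing to parse
    return json.loads(content)
```

Returning `None` lets the caller skip the scrape quietly instead of logging an unmarshalling error on every request until the next healthcheck run.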

Cannot unmarshal json from ipa-healthcheck log file

Hello,

I'm getting the below error while trying to pull the metrics:

[root@idm-1 ipahealthcheck_exporter]# ./ipahealthcheck_exporter
INFO[0000] ipa-healthcheck exporter listening on http://0.0.0.0:9888  source="ipahealthcheck_exporter.go:236"
INFO[0003] Scraping metrics from /usr/bin/ipa-healthcheck  source="ipahealthcheck_exporter.go:104"
INFO[0006] Scraping metrics from /var/log/ipa/healthcheck/healthcheck.log  source="ipahealthcheck_exporter.go:145"
ERRO[0006] Cannot unmarshal json from ipa-healthcheck log file: unexpected end of JSON input  source="ipahealthcheck_exporter.go:154"

Output from the ipa-healthcheck tool seems to be OK. Here is my ipa packages version:

[root@idm-1 ~]# rpm -qa | grep -i ipa
sssd-ipa-2.8.2-3.el8_8.x86_64
ipa-common-4.9.11-6.module+el8.8.0+19022+e8902f4b.noarch
libipa_hbac-2.8.2-3.el8_8.x86_64
python3-ipaserver-4.9.11-6.module+el8.8.0+19022+e8902f4b.noarch
python3-ipalib-4.9.11-6.module+el8.8.0+19022+e8902f4b.noarch
ipa-client-4.9.11-6.module+el8.8.0+19022+e8902f4b.x86_64
redhat-logos-ipa-84.5-1.el8.noarch
ipa-server-4.9.11-6.module+el8.8.0+19022+e8902f4b.x86_64
ipa-client-common-4.9.11-6.module+el8.8.0+19022+e8902f4b.noarch
ipa-server-trust-ad-4.9.11-6.module+el8.8.0+19022+e8902f4b.x86_64
python3-iniparse-0.4-31.el8.noarch
python3-ipaclient-4.9.11-6.module+el8.8.0+19022+e8902f4b.noarch
ipa-server-common-4.9.11-6.module+el8.8.0+19022+e8902f4b.noarch
ipa-healthcheck-0.12-1.module+el8.8.0+17582+6bf5bf91.noarch
ipa-server-dns-4.9.11-6.module+el8.8.0+19022+e8902f4b.noarch
ipa-selinux-4.9.10-9.module+el8.7.0+17437+cf46f77f.noarch
ipa-healthcheck-core-0.12-1.module+el8.8.0+17582+6bf5bf91.noarch
python3-libipa_hbac-2.8.2-3.el8_8.x86_64

So the end result is that some metrics are displayed and others, such as the replication status, are not. Am I missing anything here?

ipahealthcheck_exporter stopped working after upgrade from Fedora 37 to 39

The Journal shows the following error:

Mar 13 11:35:03 ipa.domain.com systemd[1]: Started ipahealthcheck-exporter.service - Prometheus ipahealthcheck_exporter.
Mar 13 11:35:03 ipa.domain.com ipahealthcheck_exporter[729983]: time="2024-03-13T11:35:03+01:00" level=info msg="Running exporter version development, commit none, built at 2024-03-13 11:35:03.101782656 +0100 CET m=+0.002325668" source="ipahealthcheck_exporter.go:283"
Mar 13 11:35:03 ipa.domain.com ipahealthcheck_exporter[729983]: time="2024-03-13T11:35:03+01:00" level=info msg="ipa-healthcheck exporter listening on http://127.0.0.1:9888\n" source="ipahealthcheck_exporter.go:285"
Mar 13 11:40:00 ipa.domain.com ipahealthcheck_exporter[729983]: time="2024-03-13T11:40:00+01:00" level=info msg="Scraping metrics from /usr/bin/ipa-healthcheck" source="ipahealthcheck_exporter.go:115"
Mar 13 11:40:00 ipa.domain.com ipahealthcheck_exporter[729983]: time="2024-03-13T11:40:00+01:00" level=info msg="using sudo to execute health check" source="ipahealthcheck_exporter.go:126"
Mar 13 11:40:00 ipa.domain.com sudo[733710]: ipahealthcheck-exporter : PWD=/ ; USER=root ; COMMAND=/usr/bin/ipa-healthcheck --source ipahealthcheck.meta.services --output-file /dev/shm/ipa-healthcheck.out1284489533
Mar 13 11:40:02 ipa.domain.com ipahealthcheck_exporter[729983]: time="2024-03-13T11:40:02+01:00" level=fatal msg="Cannot unmarshal json from ipa-healthcheck output: unexpected end of JSON input" source="ipahealthcheck_exporter.go:143"
Mar 13 11:40:02 ipa.domain.com systemd[1]: ipahealthcheck-exporter.service: Main process exited, code=exited, status=1/FAILURE
Mar 13 11:40:02 ipa.domain.com systemd[1]: ipahealthcheck-exporter.service: Failed with result 'exit-code'.
Mar 13 11:40:03 ipa.domain.com systemd[1]: ipahealthcheck-exporter.service: Scheduled restart job, restart counter is at 4.

After noticing that ipa-healthcheck now needs to run as root, I created a sudo exception for user ipahealthcheck-exporter to run ipa-healthcheck without a password, and added the -sudo flag to ExecStart in the systemd service file.

This resolved the "must be root to run this script" warning/error:

Mar 13 10:50:00 ipa.domain.com ipahealthcheck_exporter[695469]: You must be root to run this script.
Mar 13 10:50:00 ipa.domain.com ipahealthcheck_exporter[694623]: time="2024-03-13T10:50:00+01:00" level=info msg="ipa-healthcheck tool returned errors: exit status 1" source="ipahealthcheck_exporter.go:132"
Mar 13 10:50:00 ipa.domain.com ipahealthcheck_exporter[694623]: time="2024-03-13T10:50:00+01:00" level=fatal msg="Cannot unmarshal json from ipa-healthcheck output: unexpected end of JSON input" source="ipahealthcheck_exporter.go:143"

But the "Cannot unmarshal json" error remains. Have I lost a prerequisite package in the OS upgrade or is something else going on?

ipa_replication_check - metrics disappeared after the version '0.0.10'

ipahealthcheck_exporter-0.0.12.linux-amd64

# TYPE ipa_dogtag_connectivity_check gauge
ipa_dogtag_connectivity_check{ipahealthcheck="DogtagCertsConnectivityCheck"} 1
# HELP ipa_service_state State of the services monitored by IPA healthcheck (1: running, 0: not running)
# TYPE ipa_service_state gauge
ipa_service_state{service="certmonger"} 1
ipa_service_state{service="dirsrv"} 1
ipa_service_state{service="gssproxy"} 1
ipa_service_state{service="httpd"} 1
ipa_service_state{service="ipa_custodia"} 1
ipa_service_state{service="ipa_dnskeysyncd"} 1
ipa_service_state{service="ipa_otpd"} 1
ipa_service_state{service="kadmin"} 1
ipa_service_state{service="krb5kdc"} 1
ipa_service_state{service="named"} 1
ipa_service_state{service="pki_tomcatd"} 1
ipa_service_state{service="sssd"} 1

ipahealthcheck_exporter-0.0.9.linux-amd64 & ipahealthcheck_exporter-0.0.10.linux-amd64

# TYPE ipa_dogtag_connectivity_check gauge
ipa_dogtag_connectivity_check{ipahealthcheck="DogtagCertsConnectivityCheck"} 1
# HELP ipa_replication_check Replication checks (1: success, 0: error)
# TYPE ipa_replication_check gauge
ipa_replication_check{ipahealthcheck="ReplicationChangelogCheck"} 1
ipa_replication_check{ipahealthcheck="ReplicationCheck"} 1
# HELP ipa_service_state State of the services monitored by IPA healthcheck (1: running, 0: not running)
# TYPE ipa_service_state gauge
ipa_service_state{service="certmonger"} 1
ipa_service_state{service="dirsrv"} 1
ipa_service_state{service="gssproxy"} 1
ipa_service_state{service="httpd"} 1
ipa_service_state{service="ipa_custodia"} 1
ipa_service_state{service="ipa_dnskeysyncd"} 1
ipa_service_state{service="ipa_otpd"} 1
ipa_service_state{service="kadmin"} 1
ipa_service_state{service="krb5kdc"} 1
ipa_service_state{service="named"} 1
ipa_service_state{service="pki_tomcatd"} 1
ipa_service_state{service="sssd"} 1

Services ipa_dnskeysyncd and named missing

Hello,
The list of services (ipa_service_state) is still missing "ipa_dnskeysyncd" and "named".
But if I look in "/var/log/ipa/healthcheck/healthcheck.log", these two services are there.

# HELP ipa_service_state State of the services monitored by IPA healthcheck (1: running, 0: not running)
# TYPE ipa_service_state gauge
ipa_service_state{service="certmonger"} 1
ipa_service_state{service="dirsrv"} 1
ipa_service_state{service="gssproxy"} 1
ipa_service_state{service="httpd"} 1
ipa_service_state{service="ipa_custodia"} 1
ipa_service_state{service="ipa_otpd"} 1
ipa_service_state{service="kadmin"} 1
ipa_service_state{service="krb5kdc"} 1
ipa_service_state{service="pki_tomcatd"} 1
ipa_service_state{service="sssd"} 1

  {
    "source": "ipahealthcheck.meta.services",
    "check": "ipa_dnskeysyncd",
    "result": "SUCCESS",
    "uuid": "02d729ef-e1a9-4f4b-b8ec-3636bab7f5e5",
    "when": "20230525210918Z",
    "duration": "0.027094",
    "kw": {
      "status": true
    }
  },
  {
    "source": "ipahealthcheck.meta.services",
    "check": "named",
    "result": "SUCCESS",
    "uuid": "e283ab08-faad-4f2c-a668-dfbe4fe48428",
    "when": "20230525210918Z",
    "duration": "0.026866",
    "kw": {
      "status": true
    }
  }
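
To make the expected mapping concrete, here is a small Python sketch that turns entries like the two above into `ipa_service_state` sample lines. It is a hypothetical illustration, not the exporter's actual logic: the function name `service_state_lines` and the rule "SUCCESS means 1, anything else means 0" are assumptions.

```python
import json

SERVICES_SOURCE = "ipahealthcheck.meta.services"


def service_state_lines(checks):
    """Build ipa_service_state sample lines from healthcheck result entries."""
    lines = []
    for entry in checks:
        if entry.get("source") != SERVICES_SOURCE:
            continue
        # Assumption: a SUCCESS result means the service is running.
        value = 1 if entry.get("result") == "SUCCESS" else 0
        lines.append('ipa_service_state{service="%s"} %d' % (entry["check"], value))
    return sorted(lines)


RAW = '''[
  {"source": "ipahealthcheck.meta.services", "check": "ipa_dnskeysyncd",
   "result": "SUCCESS", "kw": {"status": true}},
  {"source": "ipahealthcheck.meta.services", "check": "named",
   "result": "SUCCESS", "kw": {"status": true}}
]'''

for line in service_state_lines(json.loads(RAW)):
    print(line)
```

With a mapping like this, every entry whose source is `ipahealthcheck.meta.services` yields one sample, so both `ipa_dnskeysyncd` and `named` would be expected in the output.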

Only a tcp6 connection available

When starting the process as a service, I only have a tcp6 connection available:

[root@freeipa bin]# sudo netstat -ltnp | grep ipa
tcp6       0      0 :::9888                 :::*                    LISTEN      103338/ipahealthche 

Here is the content of my service file:

[root@freeipa bin]# cat /etc/systemd/system/ipahealthcheck_exporter.service 
[Unit]
Description=Prometheus ipahealthcheck_exporter
Wants=basic.target
After=basic.target network.target

[Service]
User=ipahealthcheck-exporter
Group=ipahealthcheck-exporter
ExecStart=/usr/local/bin/ipahealthcheck_exporter

ExecReload=/bin/kill -HUP $MAINPID
KillMode=process
Restart=always

[Install]
WantedBy=multi-user.target

Failed to fetch metrics

I've started the systemd unit, and when I curl the metrics endpoint I get an empty reply.
In the logs I see an error:
Jun 01 17:19:05 pkles-gt0002399.novalocal ipahealthcheck_exporter[6377]: time="2021-06-01T17:19:05+03:00" level=info msg="ipa-healthcheck exporter listening on http://0.0.0.0:98>
Jun 01 17:21:35 pkles-gt0002399.novalocal ipahealthcheck_exporter[6377]: time="2021-06-01T17:21:35+03:00" level=info msg="Scraping metrics from /usr/bin/ipa-healthcheck" source=>
Jun 01 17:21:36 pkles-gt0002399.novalocal ipahealthcheck_exporter[6377]: time="2021-06-01T17:21:36+03:00" level=info msg="ipa-healthcheck tool returned errors: exit status 1" so>
Jun 01 17:21:36 pkles-gt0002399.novalocal ipahealthcheck_exporter[6377]: time="2021-06-01T17:21:36+03:00" level=fatal msg="Cannot unmarshal json from ipa-healthcheck output: une>
Jun 01 17:21:36 pkles-gt0002399.novalocal systemd[1]: ipahealthcheck_exporter.service: Main process exited, code=exited, status=1/FAILURE
Jun 01

metric collection error

Hi, after I run the ipa-healthcheck --failures-only > /var/log/ipa/healthcheck/healthcheck.log command, the collector starts to give errors, but without it I can't get replication and other LDAP-related problems:

collected metric "ipa_replication_check" { label:<name:"ipahealthcheck" value:"ReplicationCheck" > gauge:<value:0 > } was collected before with the same name and label values
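
The message above says that two samples with the same metric name and label values were produced in one collection. One way to avoid that, sketched here in Python as a hypothetical illustration (the function name `dedupe_samples` and the "lowest value wins" rule are assumptions, not the exporter's behaviour), is to merge duplicates before exposing them:

```python
def dedupe_samples(samples):
    """Collapse samples sharing a metric name and label set.

    samples is an iterable of (name, labels_dict, value). When duplicates
    occur the lowest value wins, so a single failure (0) is not hidden.
    """
    merged = {}
    for name, labels, value in samples:
        key = (name, tuple(sorted(labels.items())))
        merged[key] = min(value, merged.get(key, value))
    return [(name, dict(labels), value) for (name, labels), value in merged.items()]
```

Keeping the lowest value is a conservative choice for a 1/0 success gauge; other policies, such as keeping the most recent entry, would work equally well for avoiding the duplicate.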
