
puppet-prometheus's Introduction

puppet-prometheus

Compatibility

Prometheus version      Recommended Puppet module version
>= 0.16.2               latest

node_exporter >= 0.15.0, consul_exporter >= 0.3.0

This module supports the following Prometheus architectures:

  • x86_64/amd64
  • i386
  • armv7l (tested on a Raspberry Pi 3)

The prometheus::ipmi_exporter class has a dependency on the saz/sudo Puppet module.
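A minimal sketch, assuming the saz/sudo module is already installed in the environment's modulepath:

# saz/sudo must be available in the modulepath before this class is declared
include prometheus::ipmi_exporter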

Background

This module automates the installation and configuration of the Prometheus monitoring tool (see the Prometheus web site).

What This Module Affects

  • Installs the prometheus daemon, alertmanager, or exporters (via URL or package; see the sketch after this list)
    • The package method was implemented, but currently there isn't any package for prometheus
  • Optionally installs a user to run it under (per exporter)
  • Installs a configuration file for the prometheus daemon (/etc/prometheus/prometheus.yaml) and the alert rules file (/etc/prometheus/alert.rules)
  • Manages the services via upstart, sysv, or systemd
  • Optionally creates alert rules
  • The following exporters are currently implemented: node_exporter, statsd_exporter, process_exporter, haproxy_exporter, mysqld_exporter, blackbox_exporter, consul_exporter, redis_exporter, varnish_exporter, graphite_exporter, postgres_exporter, collectd_exporter, grok_exporter, ipsec_exporter, openldap_exporter, openvpn_exporter, ssh_exporter, ssl_exporter
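A minimal sketch of switching an exporter to the package install method; this assumes the target distribution actually ships a package for that exporter (the default install method downloads the upstream release tarball via URL):

# a minimal sketch: install node_exporter from a distribution package
# instead of the upstream release tarball
class { 'prometheus::node_exporter':
  install_method => 'package',
}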

Usage

Notice about breaking changes

Version 5.0.0 and older of this module allowed you to deploy the prometheus server with a simple include prometheus. We introduced a new class layout in version 6. By default, including the prometheus class no longer deploys the server. You need to include the prometheus::server class for this (it has the same parameters that prometheus had). An alternative approach is to set the manage_prometheus_server parameter to true in the prometheus class. Background information about this change is described in the related pull request and issue.
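For example, on version 6 and newer, either of the two approaches described above deploys the server:

# option 1: include the server class directly
include prometheus::server

# option 2: let the main class manage the server
class { 'prometheus':
  manage_prometheus_server => true,
}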

For more information regarding class parameters please take a look at the class docstrings.

Prometheus Server (versions < 1.0.0)

class { 'prometheus::server':
  global_config  => {
    'scrape_interval'     => '15s',
    'evaluation_interval' => '15s',
    'external_labels'     => {'monitor' => 'master'},
  },
  rule_files     => ['/etc/prometheus/alert.rules'],
  scrape_configs => [
    {
      'job_name'        => 'prometheus',
      'scrape_interval' => '10s',
      'scrape_timeout'  => '10s',
      'target_groups'   => [
        {
          'targets' => ['localhost:9090'],
          'labels'  => {'alias' => 'Prometheus'}
        },
      ],
    },
  ],
}

Prometheus Server (versions >= 1.0.0 < 2.0.0)

class { 'prometheus::server':
  version        => '1.0.0',
  scrape_configs => [
    {
      'job_name'        => 'prometheus',
      'scrape_interval' => '30s',
      'scrape_timeout'  => '30s',
      'static_configs'  => [
        {
          'targets' => ['localhost:9090'],
          'labels'  => {
            'alias' => 'Prometheus',
          },
        },
      ],
    },
  ],
  alerts         => [
    {
      'name'         => 'InstanceDown',
      'condition'    => 'up == 0',
      'timeduration' => '5m',
      'labels'       => [
        {
          'name'    => 'severity',
          'content' => 'page',
        },
      ],
      'annotations'  => [
        {
          'name'    => 'summary',
          'content' => 'Instance {{ $labels.instance }} down',
        },
        {
          'name'    => 'description',
          'content' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.',
        },
      ],
    },
  ],
  extra_options  => '-alertmanager.url http://localhost:9093 -web.console.templates=/opt/prometheus-1.0.0.linux-amd64/consoles -web.console.libraries=/opt/prometheus-1.0.0.linux-amd64/console_libraries',
  localstorage   => '/prometheus/prometheus',
}

Prometheus Server (versions >= 2.0.0)

class { 'prometheus::server':
  version        => '2.4.3',
  alerts         => {
    'groups' => [
      {
        'name'  => 'alert.rules',
        'rules' => [
          {
            'alert'       => 'InstanceDown',
            'expr'        => 'up == 0',
            'for'         => '5m',
            'labels'      => {
              'severity' => 'page',
            },
            'annotations' => {
              'summary'     => 'Instance {{ $labels.instance }} down',
              'description' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
            }
          },
        ],
      },
    ],
  },
  scrape_configs => [
    {
      'job_name'        => 'prometheus',
      'scrape_interval' => '10s',
      'scrape_timeout'  => '10s',
      'static_configs'  => [
        {
          'targets' => [ 'localhost:9090' ],
          'labels'  => {
            'alias' => 'Prometheus',
          }
        }
      ],
    },
  ],
}

When using Prometheus >= 2.0, use the new YAML format for rules and alerts.

In Puppet, this means the alerts parameter looks like this:

alerts => {
  'groups' => [
    {
      'name' => 'alert.rules',
      'rules' => [
        {
          'alert'  => 'InstanceDown',
          'expr'   => 'up == 0',
          'for'    => '5m',
          'labels' => {
            'severity' => 'page',
          },
          'annotations' => {
            'summary'     => 'Instance {{ $labels.instance }} down',
            'description' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.',
          }
        }
      ],
    },
  ],
},

This results in the following YAML configuration:

---
alerts:
  groups:
    - name: 'alert.rules'
      rules:
        - alert: 'InstanceDown'
          expr: 'up == 0'
          for: '5m'
          labels:
            severity: 'page'
          annotations:
            summary: 'Instance {{ $labels.instance }} down'
            description: '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'

Monitored Nodes

include prometheus::node_exporter

or:

class { 'prometheus::node_exporter':
  version            => '0.12.0',
  collectors_disable => ['loadavg', 'mdadm'],
  extra_options      => '--collector.ntp.server ntp1.orange.intra',
}

Example

A real-world Prometheus >= 2.0.0 setup example, including alertmanager and slack_configs.

class { 'prometheus':
  manage_prometheus_server => true,
  version                  => '2.0.0',
  alerts                   => {
    'groups' => [
      {
        'name'  => 'alert.rules',
        'rules' => [
          {
            'alert'       => 'InstanceDown',
            'expr'        => 'up == 0',
            'for'         => '5m',
            'labels'      => {'severity' => 'page'},
            'annotations' => {
              'summary'     => 'Instance {{ $labels.instance }} down',
              'description' => '{{ $labels.instance }} of job {{ $labels.job }} has been down for more than 5 minutes.'
            },
          },
        ],
      },
    ],
  },
  scrape_configs           => [
    {
      'job_name'        => 'prometheus',
      'scrape_interval' => '10s',
      'scrape_timeout'  => '10s',
      'static_configs'  => [
        {
          'targets' => ['localhost:9090'],
          'labels'  => {'alias' => 'Prometheus'}
        }
      ],
    },
    {
      'job_name'        => 'node',
      'scrape_interval' => '5s',
      'scrape_timeout'  => '5s',
      'static_configs'  => [
        {
          'targets' => ['nodexporter.domain.com:9100'],
          'labels'  => {'alias' => 'Node'}
        },
      ],
    },
  ],
  alertmanagers_config     => [
    {
      'static_configs' => [{'targets' => ['localhost:9093']}],
    },
  ],
}

class { 'prometheus::alertmanager':
  version   => '0.13.0',
  route     => {
    'group_by'        => ['alertname', 'cluster', 'service'],
    'group_wait'      => '30s',
    'group_interval'  => '5m',
    'repeat_interval' => '3h',
    'receiver'        => 'slack',
  },
  receivers => [
    {
      'name'          => 'slack',
      'slack_configs' => [
        {
          'api_url'       => 'https://hooks.slack.com/services/ABCDEFG123456',
          'channel'       => '#channel',
          'send_resolved' => true,
          'username'      => 'username'
        },
      ],
    },
  ],
}

If you want to use Hiera to declare the values instead, you can simply include the prometheus class and set your Hiera data as shown below:

Puppet Code

include prometheus

Hiera Data (in YAML)

---
prometheus::manage_prometheus_server: true

prometheus::version: '2.0.0'

prometheus::alerts:
  groups:
    - name: 'alert.rules'
      rules:
        - alert: 'InstanceDown'
          expr: 'up == 0'
          for: '5m'
          labels:
            severity: 'page'
          annotations:
            summary: 'Instance {{ $labels.instance }} down'
            description: '{{ $labels.instance }} of job {{ $labels.job }} has been
              down for more than 5 minutes.'

prometheus::scrape_configs:
  - job_name: 'prometheus'
    scrape_interval: '10s'
    scrape_timeout: '10s'
    static_configs:
      - targets:
          - 'localhost:9090'
        labels:
          alias: 'Prometheus'
  - job_name: 'node'
    scrape_interval: '10s'
    scrape_timeout: '10s'
    static_configs:
      - targets:
          - 'nodexporter.domain.com:9100'
        labels:
          alias: 'Node'

prometheus::alertmanagers_config:
  - static_configs:
      - targets:
          - 'localhost:9093'

prometheus::alertmanager::version: '0.13.0'

prometheus::alertmanager::route:
  group_by:
    - 'alertname'
    - 'cluster'
    - 'service'
  group_wait: '30s'
  group_interval: '5m'
  repeat_interval: '3h'
  receiver: 'slack'

prometheus::alertmanager::receivers:
  - name: 'slack'
    slack_configs:
      - api_url: 'https://hooks.slack.com/services/ABCDEFG123456'
        channel: "#channel"
        send_resolved: true
        username: 'username'

Test your commit with Vagrant: https://github.com/kalinux/vagrant-puppet-prometheus.git

Known issues

In version 0.1.14 of this module, the alertmanager was configured to run as the service alert_manager. This was changed in version 0.2.00 to alertmanager.

Do not use version 1.0.0 of Prometheus: https://groups.google.com/forum/#!topic/prometheus-developers/vuSIxxUDff8 ; it is not compatible with this module!

Postfix is not supported on Archlinux because it relies on puppet-postfix, which does not support Archlinux.

Development

This project contains tests for rspec-puppet.

Quickstart to run all linter and unit tests:

bundle install --path .vendor/ --without system_tests --without development --without release
bundle exec rake test

Authors

puppet-prometheus is maintained by Vox Pupuli; it was originally written by brutus333.

puppet-prometheus's People

Contributors

alexjfisher, anarcat, bastelfreak, baurmatt, blupman, bramblek1, brutus333, costela, defenestration, dhoppe, ekohl, ghoneycutt, jgodin-c2c, jhooyberghs, juniorsysadmin, kkunkel, lswith, mdebruyn-trip, mindriot88, othalla, roidelapluie, romdav00, sathieu, smortex, themeier, treydock, tuxmea, vide, wyardley, zipkid


puppet-prometheus's Issues

node exporter also installs prometheus server on monitored node

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 4.10.10
  • Ruby: 2.1.9p490 (2016-03-30 revision 54437) [x86_64-linux]
  • Distribution: Debian 9.4 (Stretch)
  • Module version: v5.0.1-rc0

How to reproduce (e.g Puppet code you use)

When we use

class { 'prometheus::node_exporter':
    version => '0.14.0',
    ...
}

for the monitored nodes, a prometheus server is also installed on them.

What are you seeing

prometheus server gets installed (Version 1.5.2)
and node exporter gets installed (Version 0.14.0)

What behaviour did you expect instead

only node exporter gets installed (Version 0.14.0)

Any additional information you'd like to impart

I think prometheus::node_exporter inherits prometheus; that's why the prometheus server gets installed in version 1.
See your documentation in the section "On the server (for prometheus version >= 1.0.0)".

Feature request for recording rule support

While I can define where prometheus can find a recording rule file, this module gives me no means to specify the contents of that file.

Ideally I would like to be able to use the same module to handle both the prometheus configuration and recording rule definitions.

Blackbox_exporter manifest erroneously uses -config.file instead of --config.file parameter

The recently merged support for the blackbox_exporter contains a typo for the config.file parameter, which is a long-form parameter in blackbox_exporter but is erroneously passed as -config.file.

Actual behaviour: blackbox_exporter is started with -config.file=/etc/blackbox-exporter.yaml and bails out.
Expected behaviour: blackbox_exporter is started with --config.file=/etc/blackbox-exporter.yaml and works.

Reference:

$options = "-config.file=${config_file} ${extra_options}"
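The expected fix described above is the long-form flag:

$options = "--config.file=${config_file} ${extra_options}"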

prometheus.yaml broken syntax when generated from hiera

Hello,

I keep the prometheus config in Hiera; it looks like this:

prometheus::scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets:
        - 'localhost:9090'
        labels:
          instance: 'prometheus'
  - job_name: 'linux'
    static_configs:
      - targets:
        - 'ser1:9100'
        - 'ser2:9100'
        - 'ser3:9100'

After switching to the latest master commit, my /etc/prometheus/prometheus.yaml was transformed from nice YAML into "garbage":

---
"global":
  "scrape_interval": >-
    15s
  "evaluation_interval": >-
    15s
  "external_labels":
    "monitor": >-
      master
"rule_files":
- >-
  /etc/prometheus/alert.rules
"scrape_configs":
- "job_name": >-
    prometheus
  "static_configs":
  - "targets":
    - >-
      localhost:9090

After a little digging I found out that it's related to the change introduced in
681bc1b

Some details about the environment

  • puppet-agent 1.10.8-1jessie
  • puppetserver 2.8.0-1puppetlabs1
  • Debian Jessie

puppetlabs/stdlib dependency appears to be 4.20.0 and not 4.13.1

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 4.10 PC1
  • Ruby: 2.1.5p273 [x86_64-linux-gnu]
  • Distribution: Debian
  • Module version: 4.1.0

How to reproduce (e.g Puppet code you use)

Run puppet agent with version 4.13.1 of stdlib specified in Puppetfile

What are you seeing

Manifest failure with error:

Error: Unknown function: 'to_yaml'. at /etc/puppetlabs/code/environments/prometheus/third_party/prometheus/manifests/alerts.pp:35:30

What behaviour did you expect instead

Manifest compilation.

Output log

Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Unknown function: 'to_yaml'. at /etc/puppetlabs/code/environments/prometheus/third_party/prometheus/manifests/alerts.pp:35:30 on node test
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

Any additional information you'd like to impart

Updated to version 4.20.0 of stdlib and the manifest compiled.

I checked the changelog of stdlib and the to_yaml function appears to have been introduced in version 4.20.0

https://forge.puppet.com/puppetlabs/stdlib/changelog#supported-release-4200

Many thanks for this module.

prometheus wouldn't start anymore at boot time, because it now has a dependency on "multi-user.target" as well as being part of it

jhooyberghs (on slack):
hey Martin, if you have some time somewhere, I've got a little question about a commit/merge of yours in puppet-prometheus:
58d0da6

We've noticed that since that commit, our prometheus wouldn't start anymore at boot time, because it now has a dependency on "multi-user.target" as well as being part of it. The reason in our setup was a problem in fstab, so the multi-user.target dependency wasn't met. Do you have any recollection of why this dependency was added? The commit message only speaks about the "restart" change and I fail to see a deeper reason 🙂

Prometheus Rule Files

Could this module manage the rule files? I would like to be able to make sure the rule file is defined before the service starts.

node_exporter v0.15.0

Release 0.15.0 of the node exporter changes the configuration of the collectors.

[CHANGE] Replace --collectors.enabled with per-collector flags #640

prometheus/node_exporter#640

This renders the collector handling as currently implemented in this module useless.

Also, #33 is probably obsoleted by this.

Usage of `puppet` in custom alertmanager fact breaks if puppet not in $PATH (e.g. systemd service)

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 4.10.8
  • Ruby:2.0.0p648 (2015-12-16) [x86_64-linux] (that's the system ruby)
  • Distribution: CentOS Linux release 7.4.1708 (Core)
  • Module version: 3.1.0

How to reproduce (e.g Puppet code you use)

Using a puppet master/agent architecture

  • Install the puppet/prometheus module on the master
  • Run puppet agent on an agent node as part of a service, e.g. systemd service

As soon as the prometheus module is installed in an environment, the custom fact will be collected on the master and agents, triggering the bug.

What are you seeing

When attempting to collect the custom prometheus_alert_manager_running fact an error occurs:

(Facter) error while resolving custom fact "prometheus_alert_manager_running": execution of command "puppet resource service alert_manager 2>&1" failed: command not found.

What behaviour did you expect instead

No error - fact collected as normal.

Output log

see above

Any additional information you'd like to impart

I think this error is down to the fact that puppet is not in the PATH when puppet runs as part of the service. The default systemd service file that comes with the puppetlabs collections does not insert /opt/puppetlabs/bin into the PATH of the service file.
In contrast, the same collection does add a drop-in at /etc/profile.d/ to ensure login-shells do have puppet in the path (so running the command from the fact from a login shell works fine).

I would suggest the fix is to put the full path of the puppet binary into the fact; ideally there's some way of fetching that inside Facter.

Allow to configure scrape options by file

Hi!

Really nice module, but right now I do not want to configure the scrape options directly in manifest files. I would like to provide my own prometheus.yml file with all the configuration already done.

As far as I can see, right now you populate prometheus.yaml from the prometheus::config class. What do you think about a new parameter for the prometheus class: scrape_config_file? If there is something in this file, just drop it in the correct place and that is all. If this parameter is empty and scrape_configs is set, do what you are doing right now.
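A hypothetical sketch of the proposed usage; scrape_config_file is the parameter suggested in this issue (it does not exist in the module), and the source path is only an example:

class { 'prometheus':
  # hypothetical parameter proposed above: ship a complete, pre-written
  # config file instead of generating it from scrape_configs
  scrape_config_file => 'puppet:///modules/profile/prometheus/prometheus.yml',
}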

Running puppet restarts service

Changes to the configuration or alert rules result in the Prometheus service restarting.

This regularly leads to crash-recovery mode on Prometheus. Often, sending kill -HUP to reload the service would be sufficient. Is this a change you'd be okay making?

Add support for exporting/collecting *_exporter configs

It would be nice to have the individual *_exporter classes export Puppet resources that could be collected on the prometheus node to automatically configure scrapers.

I already have a PR for this in the pipeline, so this ticket is just for reference until the PR is submitted.

Unable to use this module on fresh alert manager instances

In the alertmanager.pp manifest there is a service resource that instructs puppet to stop the alert_manager service if it is running, since the service has been renamed. This works fine on an old instance that has the alert_manager service, but breaks down on new instances that have no such service. I am seeing the following error on a fresh instance that attempts to start the alertmanager:

/Stage[main]/Prometheus::Alertmanager/Service[alert_manager]) Could not evaluate: Could not find init script or upstart conf file for 'alert_manager'

service { 'alert_manager':
  ensure => 'stopped',
}

Issue when install prometheus and alertmanager

Hi,

When I install prometheus and alertmanager, the installation fails because the download URL needs a different path:

Example puppet output:
err: /Stage[main]/Prometheus::Alert_manager::Install/Staging::File[alert_manager-0.5.0-alpha.0.tar.gz]/Exec[/opt/staging/prometheus/alert_manager-0.5.0-alpha.0.tar.gz]/returns: change from notrun to 0 failed: curl -f -L -o /opt/staging/prometheus/alert_manager-0.5.0-alpha.0.tar.gz https://github.com/prometheus/alertmanager/releases/download/0.5.0-alpha.0/alertmanager-0.5.0-alpha.0.linux-amd64.tar.gz returned 22 instead of one of [0]

  • Current line in init.pp & alertmanager.pp:
    $real_download_url = pick($download_url, "${download_url_base}/download/${version}/${package_name}-${version}.${os}-${arch}.${download_extension}")
  • Patch to make it work (prepend v to ${version}):
    $real_download_url = pick($download_url, "${download_url_base}/download/v${version}/${package_name}-${version}.${os}-${arch}.${download_extension}")

Thanks Oscar

Change Prometheus port

Hi!

Is there any option to change the Prometheus port to something other than 9090? I cannot see such an option anywhere in the code, but maybe it is buried somewhere. I can make a PR and add the possibility to specify a port, if this is not a problem.

alertmanager default inhibit_rules error

While starting to use alertmanager, I noticed that the default inhibit_rules cause alertmanager to not start.

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 4.10.10
  • Ruby: ruby 2.3.3p222
  • Distribution: debian/stretch
  • Module version: voxpupuli/puppet-prometheus master branch

How to reproduce (e.g Puppet code you use)

  class { '::prometheus::alertmanager':
    version        => '0.14.0',
    install_method => 'url',
    route          => { 'group_by' => [ 'alertname', 'cluster', 'service' ], 'group_wait'=> '30s', 'group_interval'=> '5m', 'repeat_interval'=> '3h', 'receiver'=> 'slack' },
    receivers      => [ { 'name' => 'slack', 'slack_configs'=> [ { 'api_url'=> 'https://hooks.slack.com/services/secret', 'channel' => '#test', 'send_resolved' => true, 'username' => 'username'}] }]
  }

(so, not setting inhibit_rules)

What are you seeing

This causes alertmanager to be installed, but it crashes after starting. Upon investigating, I found the error when I started alertmanager by hand.

What behaviour did you expect instead

an empty running alertmanager

Output log

root@723fefb34eec:/docker/development# alertmanager --config.file /etc/alertmanager/alertmanager.yaml 
level=info ts=2018-04-05T10:32:22.8074184Z caller=main.go:136 msg="Starting Alertmanager" version="(version=0.14.0, branch=HEAD, revision=30af4d051b37ce817ea7e35b56c57a0e2ec9dbb0)"
level=info ts=2018-04-05T10:32:22.8074907Z caller=main.go:137 build_context="(go=go1.9.2, user=root@37b6a49ebba9, date=20180213-08:16:42)"
level=info ts=2018-04-05T10:32:22.8112018Z caller=main.go:275 msg="Loading configuration file" file=/etc/alertmanager/alertmanager.yaml
level=error ts=2018-04-05T10:32:22.8118108Z caller=main.go:278 msg="Loading configuration file failed" file=/etc/alertmanager/alertmanager.yaml err="unknown fields in inhibit rule: severity"

Any additional information you'd like to impart

I have created a pull request to generate the same config that params.pp used to generate.

Default Node Exporter Collectors

The node exporter has been updated and now has decent defaults. We should change the params to simply not specify any collectors by default.

Flaky Acceptance Tests in TravisCI

I notice the acceptance tests tend to fail in Travis and seem to be inconsistent/flaky. This should be investigated. I talked with @bastelfreak; he looked into it but was unable to reproduce it locally.

#113

Prometheus Logging to file

I would like to make sure the stdout of prometheus is logged to a file. I would like this done on the Debian version of this module.

node_exporter: "Could not find init script for node_exporter"

Under some circumstances (which I still couldn't pinpoint), puppet is unable to find the installed systemd unit file and fails with "Could not find init script for node_exporter" (this probably affects other daemons as well). An easy fix seems to be passing provider => $init_style in daemon.pp, as sketched below.
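A rough sketch of the suggested change; the resource shape and variable names here are assumptions about daemon.pp, not its actual code:

# inside prometheus::daemon (shape assumed): pin the service provider
# to the init style the module already selected
service { $name:
  ensure   => $service_ensure,
  provider => $init_style, # suggested addition from this report
}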

Facter error on older distributions (Ubuntu Trusty)

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 5.3.3
  • Ruby: ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-linux] (/opt/puppetlabs/puppet/bin/ruby)
  • Distribution: Ubuntu 14.04.5 LTS
  • Module version: 4.0.0

How to reproduce (e.g Puppet code you use)

vlad@mini:~ $ /opt/puppetlabs/bin/puppet module install puppet-prometheus --version 4.0.0

vlad@mini:~ $ /opt/puppetlabs/bin/puppet apply --noop -e 'class { "prometheus::node_exporter": version => "0.15.2", }'

What are you seeing

Error: Facter: error while resolving custom fact "prometheus_alert_manager_running": Could not find init script or upstart conf file for 'alert_manager'

What behaviour did you expect instead

No error

Output log

vlad@mini:~ $ /opt/puppetlabs/bin/puppet module install puppet-prometheus --version 4.0.0
Notice: Preparing to install into /home/vlad/.puppetlabs/etc/code/modules ...
Notice: Created target directory /home/vlad/.puppetlabs/etc/code/modules
Notice: Downloading from https://forgeapi.puppet.com ...
Notice: Installing -- do not interrupt ...
/home/vlad/.puppetlabs/etc/code/modules
└─┬ puppet-prometheus (v4.0.0)
  ├── camptocamp-systemd (v1.1.1)
  └─┬ puppet-archive (v2.2.0)
    └── puppetlabs-stdlib (v4.24.0)
vlad@mini:~ $ /opt/puppetlabs/bin/puppet apply --noop -e 'class { "prometheus::node_exporter": version => "0.15.2", }'
Error: Facter: error while resolving custom fact "prometheus_alert_manager_running": Could not find init script or upstart conf file for 'alert_manager'
Notice: Compiled catalog for mini in environment production in 1.17 seconds
Notice: Applied catalog in 0.50 seconds

Any additional information you'd like to impart

Support Debian packages

There are packages for Debian in testing and backports; it would be very nice if they were used.

Error: Could not parse application options: invalid option: --to_yaml

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 3.8.7
  • Ruby:2.1.5p273
  • Distribution: Debian
  • Module version: since #101.

How to reproduce (e.g Puppet code you use)

Use older Puppet agent (we use Puppet master 4.8.2).

What are you seeing

[...]
Info: Loading facts
Error: Could not parse application options: invalid option: --to_yaml
Could not retrieve fact='prometheus_alert_manager_running', resolution='<anonymous>': undefined method `[]' for false:FalseClass
Info: Caching catalog for mynode
[...]

What behaviour did you expect instead

No warning

Output log

Any additional information you'd like to impart

Alert rule validation error

Hi,

Getting an error while using puppet-prometheus

init.pp

  class { '::prometheus':
    version              => '2.0.0',
  }

Error shown below

==> prometheus: Error: Execution of '/usr/local/bin/promtool check rules /etc/prometheus/alert.rules20180110-5183-s8nrev' returned 1: Checking /etc/prometheus/alert.rules20180110-5183-s8nrev
==> prometheus:   FAILED:
==> prometheus: yaml: unmarshal errors:
==> prometheus:   line 1: cannot unmarshal !!seq into rulefmt.RuleGroups
==> prometheus: Error: /Stage[main]/Prometheus::Alerts/File[/etc/prometheus/alert.rules]/ensure: change from absent to file failed: Execution of '/usr/local/bin/promtool check rules /etc/prometheus/alert.rules20180110-5183-s8nrev' returned 1: Checking /etc/prometheus/alert.rules20180110-5183-s8nrev
==> prometheus:   FAILED:
==> prometheus: yaml: unmarshal errors:
==> prometheus:   line 1: cannot unmarshal !!seq into rulefmt.RuleGroups

probably something wrong here: https://github.com/voxpupuli/puppet-prometheus/blob/master/manifests/alerts.pp#L36

BR
Karen

Service resource in `prometheus::daemon` does not depend on `init_style` dependent service description

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 4.10
  • Ruby: 2.1.9
  • Distribution: CentOS 7
  • Module version: v2.0.0

How to reproduce (e.g Puppet code you use)

 include prometheus::node_exporter

What are you seeing

Notice: /Stage[main]/Prometheus::Node_exporter/Prometheus::Daemon[node_exporter]/Group[prometheus]/ensure: created
Notice: /Stage[main]/Prometheus::Node_exporter/Prometheus::Daemon[node_exporter]/User[prometheus]/ensure: created
Notice: /Stage[main]/Prometheus::Node_exporter/Prometheus::Daemon[node_exporter]/Package[prometheus-node-exporter]/ensure: created
Error: Systemd start for node_exporter failed!
journalctl log for node_exporter:
-- No entries --

Error: /Stage[main]/Prometheus::Node_exporter/Prometheus::Daemon[node_exporter]/Service[node_exporter]/ensure: change from stopped to running failed: Systemd start for node_exporter journalctl log for node_exporter:
-- No entries --
Notice: /Stage[main]/Prometheus::Node_exporter/Prometheus::Daemon[node_exporter]/Systemd::Unit_file[node_exporter.service]/File[/etc/systemd/system/node_exporter.service]/ensure: defined content as '{md5}b94d9aa7e7c105276490a5eebf37560b'
...

Note that the service was started before the unit file was created, resulting in an error.

What behaviour did you expect instead

The service is only started after the unit file has been created.
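A minimal sketch of the expected relationship, using the resource titles from the log above (the module's actual resource declarations may differ):

# the unit file must be in place before the service is started
File['/etc/systemd/system/node_exporter.service'] -> Service['node_exporter']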

Exporters unpacked to /opt are not root:root

Hello,

Most exporters pulled from GitHub have an owner/group different from root:root after unpacking. In some specific cases a non-root user can replace the binary with malicious code and run it with (possibly) more permissions as the exporter user, or just fake diagnostic data, which can lead to other issues.

bug: alert rules are still 1.0 syntax for Prometheus 2

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet:
  • Ruby:
  • Distribution:
  • Module version:

How to reproduce (e.g Puppet code you use)

See templates/alert.epp

What are you seeing

1.0 syntax

What behaviour did you expect instead

2.0 yaml syntax

Output log

Any additional information you'd like to impart

Wrong service reload command on ubuntu 14.04

On Ubuntu 14.04 we use the wrong reload command. Currently upstart is hardcoded, but that isn't present in all 14.04 releases (or in any?). It should be changed to 'service', which wraps sysv or upstart.

Update forge version

Hello,

You should update the version of the module on the Puppet Forge, because the released version is not working with systemd as is and you have also fixed some bugs since :)

Regards

Minor nitpick

Hi, I would like to suggest that the example config also include config for a local node_exporter :)

Also - thanks for making this module!

Running dual instances for v1.8 -> 2.1 migration

Based on the best practices documented in the Prometheus migration guide it may be necessary to run two versions in parallel while data history is built.

https://prometheus.io/docs/prometheus/latest/migration/

Due to Puppet constraints, as I understand them, it requires some unique customization to run these two instances. It would be wonderful if this were a capability of this profile. Essentially we would want to start the 1.8.x version in a non-scraping mode and v2.1 with a remote_read reference to the old instance running on the same box during the migration period.

Minor: add explicit retention option?

It's probably a matter of taste, but I think an explicit option for setting storage retention (as opposed to using prometheus::extra_options) would be a nice minor addition.
Any thoughts?

Tag 1.0.0

Hi all,

Version 1.0.0 seems to have been released on March 27th, 2017.
Could you tag this version?

Best regards.

Prometheus service wont run if installed from package

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 4.10.6
  • Ruby: puppets very own ruby
  • Distribution: Debian Jessie
  • Module version: from master

How to reproduce (e.g Puppet code you use)

class { 'prometheus':
  install_method => 'package',
  ...

What are you seeing

Sep 27 20:30:54 hostname systemd[2208]: Failed at step EXEC spawning /usr/local/bin/prometheus: No such file or directory

What behaviour did you expect instead

If the "package" installation method is chosen, the systemd service distributed with the package should also be used.

Output log

Any additional information you'd like to impart

If the 'package' method is chosen, the module still uses its own service unit, which makes prometheus not run. And without the module's own service, 'prometheus.yaml' is not loaded because the package default is 'prometheus.yml'. So, am I missing something, or is this not a supported use case? Thanks!

Upgrade to Puppet4?

Would it be possible to refactor the code to be Puppet 4 compatible sooner rather than later? From what I understand, Puppet 3 modules are going out of support soon. I would be open to putting in a PR and branching the module in its current state as a puppet3 branch.

Propose to add configuration for http proxy

puppet-prometheus downloads packages directly from the Internet. Unfortunately my hosts are behind an HTTP proxy, so downloading always fails. Since the archive module itself supports proxy_server and proxy_type, shall I add configuration in puppet-prometheus to enable an HTTP proxy?
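A hypothetical sketch of what such a configuration might look like; proxy_server and proxy_type are the parameter names borrowed from puppet/archive, not existing prometheus class parameters, and the proxy URL is an example:

class { 'prometheus':
  # hypothetical parameters mirroring puppet/archive's proxy support
  proxy_server => 'http://proxy.example.com:3128',
  proxy_type   => 'http',
}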

relabel_configs unmarshal errors

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 5.3.4
  • Ruby: 2.4.3p205p
  • Distribution: Debian 8
  • Module version: 4.0

How to reproduce (e.g Puppet code you use)

Part of the code from Hiera (JSON):

          "relabel_configs": [
            {
              "source_labels": "['__meta_consul_service']",
              "regex": "'(.*)'",
              "target_label": "service",
              "replacement": "$1"
            }
          ]

What are you seeing

After converting to YAML:

  - "source_labels": >-
      ['__meta_consul_service']
     "regex": >-
       '(.*)'
    "target_label": >-
      'service'
     "replacement": >-
       '$1'

What behaviour did you expect instead

relabel_configs:
- source_labels: ['__meta_consul_service']
  regex:         '(.*)'
  target_label:  'service'
  replacement:   '$1'

Output log

Error: /Stage[main]/Prometheus::Config/File[prometheus.yaml]/content: change from '{md5}8822bf5e2346fa4e41e3cfb3f5e1d3d2' to '{md5}465e86f9620ff98445bc84004f7d491b' failed: Execution of '/usr/local/bin/promtool check-config /etc/prometheus/prometheus.yaml20180215-2965-nsyebg' returned 1: Checking /etc/prometheus/prometheus.yaml20180215-2965-nsyebg
FAILED: yaml: unmarshal errors:
line 26: cannot unmarshal !!str ['__met... into model.LabelNames

$rule_files parameter not respected

Affected Puppet, Ruby, OS and module versions/distributions

  • Puppet: 4.10.9
  • Ruby:2.3.6
  • Distribution: Ubuntu 16.04
  • Module version: 5.0.0

How to reproduce (e.g Puppet code you use)

I declared the following code in the init.pp manifest file, which is pretty straightforward.

class { 'prometheus':
    rule_files        =>  ['/etc/prometheus/testing.rules'],
    scrape_configs    => $scrape_configs,
    extra_options     => $extra_options_real,
    localstorage      => "${data_dir}/prometheus",
    storage_retention => $retention,
  }

After running Puppet on the host, I couldn't see any change related to the $rule_files parameter. Furthermore, the resulting prometheus configuration contains an empty array:

What are you seeing

I see no changes with regard to the $rule_files parameter, and it looks like this parameter is not used at all. The resulting YAML configuration shows only:

...
rule_files: []
...

If you read lines 194-224:

$extra_rule_files = suffix(prefix(keys($extra_alerts), "${config_dir}/rules/"), '.rules')

  if ! empty($alerts) {
    ::prometheus::alerts { 'alert':
      alerts   => $alerts,
      location => $config_dir,
    }
    $_rule_files = concat(["${config_dir}/alert.rules"], $extra_rule_files)
  }
  else {
    $_rule_files = $extra_rule_files
  }

-> class { '::prometheus::config':
    global_config        => $global_config,
    rule_files           => $_rule_files,
    scrape_configs       => $scrape_configs,
    remote_read_configs  => $remote_read_configs,
    remote_write_configs => $remote_write_configs,
    config_template      => $config_template,
    storage_retention    => $storage_retention,
  }

You can see that the value is actually calculated from $_rule_files.

What behaviour did you expect instead

We are currently using version 3.1.0 in production|staging|dev, but we want to upgrade to 5.0.0. The first step is being able to set the rules parameter, since at the moment Prometheus is loaded without any of our current rules. I expect to have our rules in place, in a format such as:

rule_files:
- "/etc/prometheus/rules/alerts-service1/*.rules"
- "/etc/prometheus/rules/common/*.rules"

PS: there is another issue regarding importing rules using that style (see golang/go#11862 for further information), but at the moment it works for us.

Please let me know if you need something from this side.
cheers!

Install Promtool

We should install promtool so that the configuration can be checked before reloading the service.

./promtool -help
usage: promtool <command> [<args>]

Available commands:
  check-config    validate configuration files for correctness
  check-rules     validate rule files for correctness
  help            prints this help text
  version         print the version of this binary
