
Datadog Puppet Module

This module installs the Datadog Agent and sends Puppet reports to Datadog.

Requirements

The Datadog Puppet module supports Linux and Windows and is compatible with Puppet >= 4.6.x or Puppet Enterprise version >= 2016.4. For detailed information on compatibility, check the module page on Puppet Forge.

Installation

Install the datadog_agent Puppet module in your Puppet master's module path:

puppet module install datadog-datadog_agent

Upgrading

  • By default, Datadog Agent v7.x is installed. To use an earlier Agent version, change the agent_major_version setting (see the example after this list).
  • agent5_enable is no longer used, as it has been replaced by agent_major_version.
  • agent6_extra_options has been renamed to agent_extra_options since it applies to both Agent v6 and v7.
  • agent6_log_file has been renamed to agent_log_file since it applies to both Agent v6 and v7.
  • agent5_repo_uri and agent6_repo_uri have been replaced by agent_repo_uri for all Agent versions.
  • conf_dir and conf6_dir have been merged into conf_dir for all Agent versions.
  • The repository file created on Linux is named datadog for all Agent versions instead of datadog5/datadog6.
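For example, a minimal sketch pinning Agent v6 with the current parameter name:

class { 'datadog_agent':
    api_key             => '<YOUR_DD_API_KEY>',
    agent_major_version => 6,
}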

Configuration

Once the datadog_agent module is installed on your puppetserver/puppetmaster (or on a masterless host), follow these configuration steps:

  1. Obtain your Datadog API key.

  2. Add the Datadog class to your node manifests (e.g. /etc/puppetlabs/code/environments/production/manifests/site.pp).

    class { 'datadog_agent':
        api_key => "<YOUR_DD_API_KEY>",
    }
    

    If using a Datadog site other than the default 'datadoghq.com', set it here as well:

    class { 'datadog_agent':
        api_key => "<YOUR_DD_API_KEY>",
        datadog_site => "datadoghq.eu",
    }
    

    For CentOS/RHEL versions < 7.0 and Ubuntu < 15.04, specify the service provider as upstart:

    class { 'datadog_agent':
        api_key => "<YOUR_DD_API_KEY>",
        service_provider => 'upstart'
    }
    

    See the Configuration variables section for a list of arguments you can use here.

  3. (Optional) Include any integrations you want to use with the Agent. The following example installs the mongo integration:

    class { 'datadog_agent::integrations::mongo':
        # integration arguments go here
    }
    

    See the comments in the code for all arguments available for a given integration.

    If an integration does not have a manifest with a dedicated class, you can still add a configuration for it. Below is an example for the ntp check:

    class { 'datadog_agent':
        api_key      => "<YOUR_DD_API_KEY>",
        integrations => {
            "ntp" => {
                init_config => {},
                instances => [{
                    offset_threshold => 30,
                }],
            },
        },
    }
    
  4. (Optional) To collect metrics and events about Puppet itself, see the section about Reporting.

Upgrading integrations

To install and pin specific integration versions, use datadog_agent::install_integration. This calls the datadog-agent integration command to ensure a specific integration is installed or uninstalled, for example:

datadog_agent::install_integration { "mongo-1.9":
    ensure => present,
    integration_name => 'datadog-mongo',
    version => '1.9.0',
    third_party => false,
}

The ensure argument can take two values:

  • present (default)
  • absent (removes a previously pinned version of an integration)

To install a third-party integration (e.g. from the Datadog Marketplace), set the third_party argument to true.
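For example, a sketch where the integration name and version are hypothetical placeholders:

datadog_agent::install_integration { 'example-marketplace-check':
    ensure           => present,
    integration_name => 'datadog-example', # hypothetical third-party integration name
    version          => '1.0.0',           # hypothetical version
    third_party      => true,
}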

Note that it is not possible to downgrade an integration to a version older than the one bundled with the Agent.

Reporting

To enable reporting of Puppet runs to your Datadog timeline, enable the report processor on your Puppet master and enable reporting on your clients. Clients then send a run report back to the master after each check-in.

  1. Set the puppet_run_reports option to true in the node configuration manifest for your master:

    class { 'datadog_agent':
      api_key            => '<YOUR_DD_API_KEY>',
      puppet_run_reports => true,
      # ...
    }

    The dogapi gem is automatically installed. Set manage_dogapi_gem to false if you want to customize the installation.

  2. Add these configuration options to the Puppet master config (eg: /etc/puppetlabs/puppet/puppet.conf):

    [main]
    # No modification needed to this section
    # ...
    
    [master]
    # Enable reporting to Datadog
    reports=datadog_reports
    # If you use other reports, add datadog_reports to the end,
    # for example: reports=store,log,datadog_reports
    # ...
    
    [agent]
    # ...
    report=true

Or, with the ini_setting resource from the puppetlabs-inifile module:

  ini_setting { 'puppet_conf_master_report_datadog_puppetdb':
    ensure  => present,
    path    => '/etc/puppetlabs/puppet/puppet.conf',
    section => 'master',
    setting => 'reports',
    value   => 'datadog_reports,puppetdb',
    notify  => [
      Service['puppet'],
      Service['puppetserver'],
    ],
  }
  3. On all of your Puppet client nodes, add the following in the same location:

    [agent]
    # ...
    report=true

Or, with the ini_setting resource:

  ini_setting { 'puppet_conf_agent_report_true':
    ensure  => present,
    path    => '/etc/puppetlabs/puppet/puppet.conf',
    section => 'agent',
    setting => 'report',
    value   => 'true',
    notify  => [
      Service['puppet'],
    ],
  }
  4. (Optional) Enable tagging of reports with facts:

    You can add tags to the reports sent to Datadog as events, sourced from the Puppet facts of the node each report concerns. Use simple key/value facts rather than structured facts (hashes, arrays, etc.) so the tags stay readable. To enable regular fact tagging, set the parameter datadog_agent::reports::report_fact_tags to an array of fact names, for example ["virtual","operatingsystem"]. To enable trusted fact tagging, set the parameter datadog_agent::reports::report_trusted_fact_tags to an array of fact names, for example ["certname","extensions.pp_role","hostname"]. A Hiera sketch follows the tips below.

    NOTE: Changing these settings requires a restart of pe-puppetserver (or puppetserver) to re-read the report processor. Ensure the changes are deployed prior to restarting the service(s).

    Tips:

    • Use dot notation (for example extensions.pp_role) to target a value inside a structured fact; otherwise the entire fact is serialized into a single string, which is rarely useful
    • Do not duplicate data that monitoring already collects, such as hostname, uptime, and memory
    • Align core facts like role, owner, template, and datacenter with the same tags used on your metrics, so you can build meaningful correlations
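    A minimal Hiera sketch of both parameters, using the example facts from above:

    datadog_agent::reports::report_fact_tags:
      - 'virtual'
      - 'operatingsystem'
    datadog_agent::reports::report_trusted_fact_tags:
      - 'certname'
      - 'extensions.pp_role'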
  5. Verify your Puppet data is reaching Datadog by searching for sources:puppet in the Event Stream.

NPM setup

To enable the Datadog Agent Network Performance Monitoring (NPM) features follow these steps:

  1. (Windows only) If the Agent is already installed, uninstall it by passing win_ensure => absent to the main class and removing the other class definitions.
  2. (Windows only) Pass the windows_npm_install option with a value of true to the datadog_agent class. Remove win_ensure if it was added in the previous step.
  3. Use the datadog_agent::system_probe class to create the configuration file:

class { 'datadog_agent::system_probe':
    network_enabled => true,
}

USM setup

To enable the Datadog Agent Universal Service Monitoring (USM) use the datadog_agent::system_probe class to properly create the configuration file:

class { 'datadog_agent::system_probe':
    service_monitoring_enabled => true,
}

Troubleshooting

You can run the Puppet Agent manually to check for errors in the output:

```shell
sudo systemctl restart puppetserver
sudo puppet agent --onetime --no-daemonize --no-splay --verbose
```

Example response:

```text
info: Retrieving plugin
info: Caching catalog for alq-linux.dev.datadoghq.com
info: Applying configuration version '1333470114'
notice: Finished catalog run in 0.81 seconds
```

If you see the following error, ensure reports=datadog_reports is defined in [master], not [main]:

```text
err: Could not send report:
Error 400 on SERVER: Could not autoload datadog_reports:
Class Datadog_reports is already defined in Puppet::Reports
```

If you don't see any reports coming in, check your Puppet server logs.

Masterless Puppet

  1. The Datadog module and its dependencies must be installed on all nodes that run masterless.

  2. Add this to each node's site.pp file:

    class { "datadog_agent":
        api_key            => "<YOUR_DD_API_KEY>",
        puppet_run_reports => true
    }
    
  3. Run puppet in masterless configuration:

    puppet apply --modulepath <path_to_modules> <path_to_site.pp>

Tagging client nodes

The Datadog Agent configuration file is recreated from the template every Puppet run. If you need to tag your nodes, add an array entry in Hiera:

datadog_agent::tags:
- 'keyname:value'
- 'anotherkey:%{factname}'

To generate tags from custom facts, pass an array of fact names to the facts_to_tags parameter, either through the Puppet Enterprise console or Hiera. Here is an example:

class { "datadog_agent":
  api_key            => "<YOUR_DD_API_KEY>",
  facts_to_tags      => ["os.family","networking.domain","my_custom_fact"],
}

Tips:

  1. For structured facts, index into the specific value (as in os.family above); otherwise the entire structure comes over as a single string and is difficult to use.
  2. Dynamic facts such as CPU usage and uptime, which change on every run, are poor candidates for tagging. Static facts that persist for the life of a node are the best candidates.

Configuration variables

These variables can be set in the datadog_agent class to control settings in the Agent. See the comments in code for the full list of supported arguments.

| Variable name | Description |
|---|---|
| agent_major_version | The major version of the Agent to install: 5, 6, or 7 (default: 7). |
| agent_version | Pins a specific minor version of the Agent to install, for example 1:7.16.0-1. Leave empty to install the latest version. |
| collect_ec2_tags | Set to true to collect an instance's custom EC2 tags as Agent tags. |
| collect_instance_metadata | Set to true to collect an instance's EC2 metadata as Agent tags. |
| datadog_site | The Datadog site to report to (Agent v6 and v7 only). Defaults to datadoghq.com; e.g. datadoghq.eu or us3.datadoghq.com. |
| dd_url | The Datadog intake server URL. You are unlikely to need to change this. Overrides datadog_site. |
| host | Overrides the node's host name. |
| local_tags | An array of <KEY:VALUE> strings that are set as tags for the node. |
| non_local_traffic | Allow other nodes to relay their traffic through this node. |
| apm_enabled | A boolean to enable the APM Agent (defaults to false). |
| process_enabled | A boolean to enable the Process Agent (defaults to false). |
| scrub_args | A boolean to enable process cmdline scrubbing (defaults to true). |
| custom_sensitive_words | An array of words to add beyond the defaults used by the scrubbing feature (defaults to []). |
| logs_enabled | A boolean to enable the Logs Agent (defaults to false). |
| windows_npm_install | A boolean to enable the Windows NPM driver installation (defaults to false). |
| win_ensure | An enum (present/absent) to ensure the presence or absence of the Datadog Agent on Windows (defaults to present). |
| container_collect_all | A boolean to enable log collection for all containers. |
| agent_extra_options (1) | A hash providing additional configuration options (Agent v6 and v7 only). |
| hostname_extraction_regex (2) | A regex whose hostname capture group is reported to Datadog instead of the Puppet nodename, for example '^(?<hostname>.*\.datadoghq\.com)(\.i-\w{8}\..*)?$'. |

(1) agent_extra_options gives fine-grained control over additional Agent v6/v7 configuration options. A deep merge is performed, so these options may override ones provided through other datadog_agent class parameters. For example:

class { "datadog_agent":
    < your other arguments to the class >,
    agent_extra_options => {
        use_http => true,
        use_compression => true,
        compression_level => 6,
    },
}

(2) hostname_extraction_regex is useful when the Puppet module and the Datadog Agent are reporting different host names for the same host in the infrastructure list.
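For example, a minimal sketch using the regex above:

class { 'datadog_agent':
    api_key                   => '<YOUR_DD_API_KEY>',
    puppet_run_reports        => true,
    hostname_extraction_regex => '^(?<hostname>.*\.datadoghq\.com)(\.i-\w{8}\..*)?$',
}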


Issues

revamp of integrations

TL;DR: revamp integrations to allow multiple instances and standard parameter validation on all integrations.

Hey folks! I'm looking for some feedback on the current state of integrations vs where you'd like to go with things.

I see in the history there is at least one commit that references "integrations are classes now, not defines", and I can understand why that is to a degree. However, I don't think that hard "this is the only way it can be" model makes a lot of sense, and I think there is a better approach that can satisfy everyone's needs.

current state

Because of the way the configuration files for integrations are structured, the 'classes only' model makes a lot of sense. There is one init_config section that is common to the file itself, and then there can be multiple associated instances, as requested in #130. This maps decently to a class whose parameters cover the init_config section, plus an array of instances as one of the class parameters. Great.

However there are some significant downsides to this model:

  1. Since we're using class params, it's hard for a class to say "hey, I need one of these too" like you might with a type. For instance, with the role/profile pattern, if you have 2 postgres databases in different profiles and want to monitor them both, you can't simply declare the class in both, because those duplicate class declarations can't exist within the same catalog. Now you have to factor the declaration out into another class which can be included in both profiles, and that's not always the easiest thing to do. Also, consider the case where you want to separate these profiles onto different machines: now you have to un-factor this, and ugh.
  2. Semantically, this is actually not the right way to do it with puppet. These are instances, they should be a type so you can declare multiple of them.
  3. Currently, the module implements this pattern very sporadically. Some have it (nginx, mongo, zk), while most do not. There does appear to be demand for multiple instances from people other than me (see #130, #64, #56), and I know we (Stripe) will be needing it shortly as we start setting up datadog monitoring for our RDS instances (among other things).
  4. Deep validation of the hashes and arrays used to declare the instances is difficult, and as a result, almost completely missing.

The problem of course is that this is all modifying one file, and that file can only be declared once. And part of the file is "class-like" and part of the file is "type-like". IMO the best way to resolve this would actually be to change the configuration parser to allow multiple files to configure the same plugin, and maybe each config file supports an integration parameter which specifies which integration the configs are for. This model is used very well with collectd and the puppet/collectd module (disclaimer: I have made many contributions to that module).

However, that's a fairly major change which depends on major changes in the datadog agent itself, which is a rather large dependency for this :)

new hotness

I'd like to propose a different model.

There would be 2 types that define the underlying file:

  • datadog_agent::integration::init_config
  • datadog_agent::integration::instance

These two would be used to define the init_config section of an integration config file, and the instances section. They would expose a pretty generic API which would pretty much just take some 'standard' parts (like tags, on the instance) and then a hash of other parameters to simply YAML.dump into the config file.

With these we could build types and classes to define specific integrations.

As an example, for the mongo integration it would look something like this:

class datadog_agent::integrations::mongo::init_config (params) {
    # validate params
    datadog_agent::integration::init_config { 'mongo':
        params => 'here',
    }
}
define datadog_agent::integrations::mongo (params) {
    include datadog_agent::integrations::mongo::init_config
    # validate params
    datadog_agent::integration::instance { "mongo-${name}":
        integration => 'mongo',
        params => 'here',
        tags => [ 'tags', 'go', 'here' ],
    }
}

Then using the integration would be as simple as:

datadog_agent::integrations::mongo { 'localhost':
    params => 'here',
    tags   => 'here',
}

And then you get instances for free. If you need to specify additional parameters to the init_config section, you can declare it separately yourself, or specify them via hiera.

As far as having 'generic' integrations that people can use (maybe their own integrations they've written, maybe an integration provided by the datadog agent itself which doesn't have puppet support yet), they can use those 2 building blocks in their own classes/defines.

Some of the integrations don't make sense to have "instances" of (disk, agent_metrics) even though they use the instances section. These can just be classes:

class datadog_agent::integrations::disk (params) {
    datadog_agent::integration::init_config { 'disk':
        params => 'here',
    }
    datadog_agent::integration::instance { 'disk':
        params => 'here',
        tags   => ['whatever'],
    }
}

The underlying implementation of the two core types could use puppetlabs/concat or richardc/datacat to actually build the files. The nice thing about datacat is that since we're generating a YAML file, it maps nicely to building up a hash, YAML.dump-ing it in the template, and calling it good; but I've always thought datacat, while great, was the wrong approach to the problem. Concat is nice because it's a puppetlabs-supported module, but building a YAML data structure out of file fragments has also always been a bad smell for me. However, Because Puppet™, these are basically the best solutions we have to this problem.

As far as supporting every-feature-under-the-sun of the integrations, I believe if we make the underlying types (the init_config, and instance types) generic enough, then arbitrary integrations are possible. And for the 'official' integrations, assuming they are defined well enough in the python code, we could probably generate the puppet module based on those (instance/init_config parameters, etc). This is something which may require a fair amount of work on the python code (adding metadata to each integration type), but could make supporting the puppet module a much easier prospect for you folks in general.

Anywho, I'd like to get cracking on these changes as soon as possible, and I'd love to have some feedback if there are any concerns y'all have or changes you'd like to request. I can also do a proof of concept to show what I'm talking about if you'd like before I get cracking on changing all of the current integrations.

Thanks, and I look forward to your feedback!

HTTPS is not yet a thing for the datadog repo

It seems that SSL support for the Datadog repo has not yet been deployed, meaning that our AMIs were failing and we didn't know why. I found that on the 7th, the module was changed to use https. Here is what port 443 returns for this URL as of today (10-09-15):

nick.parry@nparry-laptop:~$ nc -vzw2 yum.datadoghq.com 443
nc: connect to yum.datadoghq.com port 443 (tcp) failed: Connection refused

Just wanted to provide some visibility on this issue.

hiera support?

It would be great if the API key could be defined in hiera
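For reference, a minimal sketch of what that could look like via Puppet's automatic parameter lookup (assuming the key stays a plain class parameter, as it is today):

# In Hiera data, e.g. common.yaml:
datadog_agent::api_key: '<YOUR_DD_API_KEY>'

With that in place, a plain include datadog_agent would pick the key up from Hiera.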

Add generic integration

It would be great to have a generic integration manifest. This allows us to pass larger, more complicated configuration files to integrations that need to be rendered with many variables, and are too site specific.

Allow modifications to reported hostnames

Related to #1

Hosts may show up as duplicates in infrastructure list, with puppet metrics only intermittently reporting. Some cases have included a workaround specific to each case, but these workarounds need to be included each time the puppet module is updated

rubygems installation error on machines with ruby 1.9

As of Ruby 1.9, what used to be in the rubygems package is now included in the base ruby package.

When using this Puppet script on Ubuntu 12.04 (and presumably any distro that ships Ruby 1.9), the output includes the error

E: Package 'rubygems' has no installation candidate

Since the needed code is included in the ruby 1.9 package, the agent will work correctly.

datadog_agent::integrations::directory is not functional?

Seems to stem from this commit

The problem is $name being reassigned.

Also, I really think the whole comment of "we do integrations as classes, not defines" doesn't make a lot of sense. In this case, if you wanted to use multiple directories (which the integration seems to support, because instances; I'll admit I have not used the integration myself, so I'm just guessing), you'd ... what?

Anywho, I'd like to fix this, but I'm not certain what the best approach is. What is $name trying to do there and why, and since it's currently non-functional, how much backward compatibility do I need to worry about?

Again, I haven't used this integration, I am working on writing tests for all of the integration classes so I can factor out a generic integration type, and ran into this issue. Please let me know how I can help :)

Missing Metrics on PE/POSS

When reporting is enabled, events are created but metrics seem to be missing. We should check where we should be pulling the expected metrics from; something may have changed in newer versions of Puppet.

Mesos Integration Problems

Running dd-agent info gives me the following error.

...
Collector (v 5.6.3)
...

mesos
-----
  - instance #0 [WARNING]
      Warning: This check is deprecated in favor of Mesos master and slave specific checks. It will be removed in a future version of the Datadog Agent.
      Warning: ('Connection aborted.', error(101, 'Network is unreachable'))
      Warning: ('Connection aborted.', error(101, 'Network is unreachable'))
      Warning: ('Connection aborted.', error(101, 'Network is unreachable'))
  - Collected 0 metrics, 0 events & 4 service checks

I noticed that the integration instructions in the webapp are inconsistent with the puppet module. To fix this, I had to rename the mesos.yaml file to mesos_slave.yaml. Please resolve this either in the agent or the Puppet module. Changing the agent is preferred because it probably isn't good practice to use the name of a configuration file for application logic. After changing the name of the YAML file and restarting the collector, I get:

mesos_slave
-----------
  - instance #0 [OK]
  - Collected 30 metrics, 0 events & 2 service checks

To summarize, the mesos integration does not work out of the box with this Puppet module. The root cause lies within the agent, but it can be worked around by adding a slave/master parameter to the Puppet module to dictate what the configuration file should be named.

puppet-datadog-agent/manifests/integrations/mesos.pp

Redis slowlog_max_len not configurable in module but raises warning in collector

For redis servers configured with a non-default slowlog max len, collector issues a warning. Also, users who have configured their redis instance with a higher length probably want access to that data. Error message from sudo service datadog-agent info:

[WARNING]:Redis slowlog-max-len is higher than 128. Defaulting to 128.If you need a higher value, please set slowlog-max-len in your check config

The Puppet template doesn't expose this as an option, nor does the manifest accept it as a setting.

Support puppetlabs-ruby > 0.2.0

Could we make the requirement of puppetlabs-ruby a little less strict?

PR #110 changes metadata.json to support 0.2.0 and higher, until the next major version.

Cheers,

Otto

When puppetserver, install dogapi into puppetserver context

If using open source Puppet with puppetserver, or Puppet Enterprise, the 'dogapi' gem must be installed in the puppetserver (JRuby) context:

# /opt/puppetlabs/bin/puppetserver gem install dogapi
Fetching: multi_json-1.11.2.gem (100%)
Successfully installed multi_json-1.11.2
Fetching: dogapi-1.21.0.gem (100%)
Successfully installed dogapi-1.21.0
2 gems installed
# /etc/init.d/pe-puppetserver restart

Use port 80 when fetching the DataDog APT key

By default, apt-key attempts to reach the keyserver on 11371/tcp, which can be problematic for environments with firewalls or network ACLs.

The Ubuntu keyserver also listens on port 80/tcp which is much more firewall friendly.
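For example, fetching the key over port 80 instead (the key ID below is a placeholder):

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv-keys <DATADOG_KEY_ID>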

Logic in datadog.yaml is incorrect

The following needs to be changed; as it currently stands, it doesn't work when passing in a value that isn't nil:

<% if @hostname_extraction_regex.nil? -%>
:hostname_extraction_regex: '<%= @hostname_extraction_regex %>'
<% end -%>

to:

<% if !@hostname_extraction_regex.nil? -%>
:hostname_extraction_regex: '<%= @hostname_extraction_regex %>'
<% end -%>

On puppet 3.0.x, I get an error message from the datadog report processor.

Since upgrading to 3.0.2, I've been getting an error message from the datadog report processor. Here's the output from puppet with --debug --trace:

Notice: Finished catalog run in 16.60 seconds
Debug: Using settings: adding file resource 'rrddir': 'File[/var/lib/puppet/rrd]{:links=>:follow, :group=>"puppet", :backup=>false, :ensure=>:directory, :owner=>"puppet", :mode=>"750", :loglevel=>:debug, :path=>"/var/lib/puppet/rrd"}'
Debug: Finishing transaction 69891384984120
Debug: Received report to process from instance5.zicasso.com
Debug: Processing report from instance5.zicasso.com with processor Puppet::Reports::Store
Debug: Processing report from instance5.zicasso.com with processor Puppet::Reports::Datadog_reports
Debug: Sending metrics for instance5.zicasso.com to Datadog
Debug: Sending events for instance5.zicasso.com to Datadog
undefined method `[]' for :@aggregation_key:Symbol

"name" attribute is not parsed in by the http_check.pp script

In "http_check.yaml.example", it states that we can specify a "name" attribute.

# - name: My second service 
#   url: https://another.url.example.com 

However, the "name" attribute is not being parsed in "http_check.pp" script.

class datadog_agent::integrations::http_check (
  $url      = undef,
  $username = undef,
  # ...

Potentially, this can be fixed by modifying the script and the .erb files as below:

  1. In https://github.com/DataDog/puppet-datadog-agent/blob/master/manifests/integrations/http_check.pp:

    class datadog_agent::integrations::http_check (
  +   $svcname   = undef,
      $url       = undef,

  2. In https://github.com/DataDog/puppet-datadog-agent/blob/master/templates/agent-conf.d/http_check.yaml.erb:

    init_config:

    instances:
  -     -   name: <%= @name %>
  +     -   name: <%= @svcname %>
            url: <%= @url %>
    <% if @timeout -%>
            timeout: <%= @timeout %>

New integrations anytime soon?

Hi,

I've seen on several issues/requests that you are going to push a major release of integrations.
The problem is that at the moment there are very few supported services from those we need.

I would gladly help contribute new integrations but from what I've seen, things are delayed until the major release is out.

My question is, what is next? Should I start working on integrations, or do you have an ETA?

Anyway, thanks for the great job you did so far!

Eliran

`dogapi` is required to use Datadog report

I am running open source Puppet 4.2.2. I followed the instructions for this project, but the Puppet integration on the dashboard says "No Data Received". After looking into puppetserver.log, I found the following error message:

[puppet-server] Puppet You need the `dogapi` gem to use the Datadog report (run puppet with puppet_run_reports on your master)

Reporting has been working well since I installed the dogapi gem by hand:

sudo puppetserver gem install dogapi

Missing tornado package causes failing to start on Fedora 20

I was unable to start the agent installed with this module on a Fedora 20 box, since the tornado python module was missing. Once I'd installed the module, the agent started working. The problem is that this particular box has very few packages (and that's the desired state), so installing tornado by hand produced error messages complaining about the inability to compile some speedups.

The second issue is the pip installer. I had to install it prior to tornado installation. I'd like to avoid that, too.

The last problem is adding the installation to the puppet scripts. Either it has to be handcrafted, or the puppet-python module has to be used. I use the latter option, but I don't feel it's the preferred one. The best approach would be to avoid external dependencies.

Make the integration manifests 'parameter agnostic'

It's a pain to have to specify every key of the YAML file in a different variable.
We could just get a dictionary representing the instances and convert it to YAML.

class { 'datadog_agent::integrations::example' :
  instances => [
    { 'host' => 'localhost', 'port' => 42, 'tags' => [] }
  ]
}

# in the template
<%= require 'yaml'; {'init_config'=>@init_config, 'instances'=>@instances}.to_yaml %>

Moreover, it would be easy to add new integrations; the puppet user just has to stick to the description format of the agent YAML file.

cc @alq666

That is potentially a non-backward-compatible change for people's manifests, though.

Service enable on Centos

Each time Puppet runs this module on CentOS, it re-enables the service, which by itself triggers a false "changed" Puppet report on the node:

Debug: Executing: '/bin/systemctl is-active datadog-agent'
Debug: Executing: '/bin/systemctl show datadog-agent --property LoadState --property UnitFileState --no-pager'
Debug: Executing: '/bin/systemctl unmask datadog-agent'
Debug: Executing: '/bin/systemctl enable datadog-agent'
Notice: /Stage[main]/Datadog_agent::Redhat/Service[datadog-agent]/enable: enable changed 'false' to 'true'

I find that setting provider => 'redhat' in redhat.pp solves this problem.
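A sketch of that override, assuming the service resource in redhat.pp looks roughly like this:

service { 'datadog-agent':
  ensure   => running,
  enable   => true,
  provider => 'redhat', # avoids the systemctl enable churn described above
}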

Improve the parameters for datadog.conf.erb

At the moment a lot of the options within datadog.conf.erb aren't set as parameters. This leads to some headaches when trying to set up things such as dogstatsd.

Ideally all options within datadog.conf.erb should be able to be defined by parameters provided to the datadog_agent class. This would enable them to be declared where required without having to use a modified version of the puppet-datadog-agent module or override the file in a later module.

Additionally, it would be potentially worthwhile having all the parameters covered by the rspec tests in order to validate the default values and the custom ones. It would just provide a bit more confidence when introducing new parameters.

This would probably need to take place as three parts:

  • Existing parameters get tests added to the appropriate spec
  • Tests are written up for the new parameters
  • datadog_agent class and datadog.conf.erb are modified in order to pass the new parameters, with defaults matching the current setup in order to ensure the change is backwardly compatible.

Installation issue in Ubuntu

Resources defined in manifests/ubuntu.pp are not guaranteed to run in a consistent order.

I saw this when I tried to deploy the datadog agent to a handful of Ubuntu boxes. The expected order in which resources execute is:

  1. Exec['datadog_key']
  2. File['/etc/apt/sources.list.d/datadog.list']
  3. Exec['datadog_apt-get_update'] via refresh
  4. Package['datadog-agent']
  5. Service['datadog-agent']

... but looking at reports from failed puppet agents in Puppet Enterprise, I see that 1 and 2 ran, but 3 didn't, and 4 failed. Unfortunately, PE doesn't show the order in which resources ran, and I'm not 100% positive it reports refresh events, but I think one of two things is going on.

  • it tried to run 1->2->4->3 in that order. 4 requires 3, but 3 is a refreshonly resource, and there are some dragons in this part of Puppet.
  • it tried to run 2->3->4->1 in that order, because no other resources actually explicitly require 1.

Either way, once the system gets into this state, 2 is up-to-date, so 3 will never run, and the system cannot recover by itself. I had to go in and manually run "apt-get update" to get the nodes unstuck.

I recommend you collapse 1 and 3 into a single exec resource, and define dependencies in 2->(1+3)->4->5.
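In Puppet's chaining syntax, the suggested graph would look something like this (the combined exec's title is hypothetical; the other resource titles follow the list above):

# Notify the combined key-import/apt-update exec when the sources file changes,
# and make the package and service depend on it:
File['/etc/apt/sources.list.d/datadog.list']
~> Exec['datadog_key_and_apt_update']
-> Package['datadog-agent']
~> Service['datadog-agent']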

Every second execution of puppet fails

Every second execution of Puppet removes the datadog-agent package and fails.
We use commit 3cffbcf plus #78 on top of it, but that pull request does not touch the affected area.
So, half of the time, we don't even have the datadog-agent package installed.
This part of redhat.pp causes every second Puppet run to fail:

package { 'datadog-agent-base':
  ensure => absent,
  before => Package['datadog-agent'],
}

We don't have datadog-agent-base installed, but for some reason this code deletes the datadog-agent package, including the dd-agent user. Here is the relevant portion of the Puppet log:

Info: Applying configuration version '1427837407'
Debug: Prefetching yum resources for package
Debug: Executing '/bin/rpm --version'
Debug: Executing '/bin/rpm -qa --nosignature --nodigest --qf '%{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n''
Debug: Executing '/bin/rpm -q datadog-agent-base --nosignature --nodigest --qf %{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n'
Debug: Executing '/bin/rpm -q datadog-agent-base --nosignature --nodigest --qf %{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n --whatprovides'
Debug: Executing '/bin/rpm -e datadog-agent-5.2.2-1.x86_64'
Notice: /Stage[main]/Datadog_agent::Redhat/Package[datadog-agent-base]/ensure: removed
Debug: /Stage[main]/Datadog_agent::Redhat/Package[datadog-agent-base]: The container Class[Datadog_agent::Redhat] will propagate my refresh event

After that, puppet fails at init.pp:
Error: Could not find user dd-agent
Error: /Stage[main]/Datadog_agent/File[/etc/dd-agent/datadog.conf]/owner: change from 498 to dd-agent failed: Could not find user dd-agent

On the next Puppet execution, the datadog-agent package gets reinstalled and execution completes normally:
Debug: Prefetching inifile resources for yumrepo
Debug: Executing '/bin/rpm -q datadog-agent --nosignature --nodigest --qf %{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n'
Debug: Executing '/bin/rpm -q datadog-agent --nosignature --nodigest --qf %{NAME} %|EPOCH?{%{EPOCH}}:{0}| %{VERSION} %{RELEASE} %{ARCH}\n --whatprovides'
Debug: Packagedatadog-agent: Ensuring => latest
Debug: Executing '/usr/bin/yum -d 0 -e 0 -y install datadog-agent'
Notice: /Stage[main]/Datadog_agent::Redhat/Package[datadog-agent]/ensure: created

http_check doesn't support multiple instances

The HTTP check here is declared as a class, which means only one can be declared per host. Could you please convert this to a define, as I have in my fork (which you cherry-picked):

https://github.com/ordenull/puppet-datadog-agent/blob/master/manifests/defines/http_check.pp
https://github.com/ordenull/puppet-datadog-agent/blob/master/templates/http_check.yaml.erb

The pattern of declaring resources as defines is much more Puppet-friendly and really should apply to all the other checks as well. There is also another pattern in use by the process check: it does allow multiple process checks to be defined, but because it wraps them all in the same class, it prevents the use of if statements in node definitions such as the following:

if defined(Class['Apache']) {
  datadog::process { 'apache2':
    name          => 'apache2',
    search_string => 'apache2',
  }
}

if defined(Class['Varnish']) {
  datadog::process { 'varnish':
    name          => 'varnish',
    search_string => 'varnish',
  }
}

datadog_agent::integrations::http_check should be a defined resource

Because it's a class, you can only define a single HTTP check per agent. It should be changed to a defined resource, perhaps combined with the puppetlabs/concat module, to create a YAML file with multiple http_check instances. A single HTTP check per host is pretty useless unless you're strictly checking localhost.

module has wrong dependency for stdlib in metadata.json

metadata.json indicates that any 4.x version of stdlib is sufficient, but the use of validate_integer means it's not.

We have 4.5.1 installed currently which doesn't have the validate_integer function.

Please correct your metadata.json file to reflect the minimum version of stdlib that includes validate_integer. It looks like v4.6 is what you want.

Ensure puppet module works well on puppet 3.x

We've received reports of customers trying it out and having issues, primarily around the error "Class Datadog_reports is already defined in Puppet::Reports", despite confirming correct configuration.

In general, we should also strive to pass puppet lint, and any other testing that makes sense.

Include/exclude rules in the report class

Basically, having a way to say "I want these nodes to report, but not those" is sometimes useful when using a mix of prod and not-so-prod nodes in Puppet, or to slowly roll out Datadog reporting across all nodes.
