netdata_to_nagios

Netdata to Nagios, a plugin for alerting on Netdata perfcounters and for long term storage

root@Nagios:~# python netdata_to_nagios.py -H 127.0.0.1 -D apps.cpu -i 60 -w 80 -c 90
OK | kernel=0.070277775, log=0.000277777777778, system=0.0130555555556, inetd=0, cron=0.00333333333333, ksmd=0, other=0.000277777777778, lxc=0, ssh=0.00833333055556, netdata=1.35777795556, apache=0.00333333333333, nms=0.025,
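The output line above can be post-processed by other tools. The following is a minimal sketch, not part of the plugin, that splits such a line into its status and a dictionary of values. The function name `parse_plugin_output` is hypothetical, and the sketch assumes the comma-separated `name=value` layout shown in the sample.

```python
def parse_plugin_output(line):
    """Split 'STATUS | name=value, name=value, ...' into (status, dict).

    Hypothetical helper: assumes the comma-separated layout of the sample output.
    """
    status, _, perfdata = line.partition("|")
    values = {}
    for item in perfdata.split(","):
        item = item.strip()
        if not item:
            continue  # skip the empty piece left by the trailing comma
        name, _, raw = item.partition("=")
        values[name.strip()] = float(raw)
    return status.strip(), values


sample = "OK | kernel=0.070277775, log=0.000277777777778, inetd=0, netdata=1.35777795556,"
status, values = parse_plugin_output(sample)
# Print the status and the name of the dimension with the highest value.
print("%s %s" % (status, max(values, key=values.get)))
```

With the shortened sample line used here, the busiest dimension is `netdata`.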

Example of a Grafana graph generated with this plugin.

Introduction

Netdata to Nagios is a plugin that lets you raise alerts from Netdata performance counters. Netdata is a neat project that gives you real-time metrics. The plugin works with Nagios, Shinken, Icinga and Centreon. It also outputs perfdata for long-term metrics storage.
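This README does not describe how the plugin queries Netdata. As a rough sketch only, one way to read an averaged value per dimension is Netdata's REST API. The `/api/v1/data` endpoint, its query parameters, and the `labels`/`data` response shape used below are assumptions about that API, and the helper names `build_data_url` and `latest_values` are hypothetical. The example uses a canned response, so it needs no network access.

```python
import json

try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2.7


def build_data_url(host, port, chart, interval):
    """Build a URL asking for the average over the last `interval` seconds.

    Assumption: Netdata exposes /api/v1/data with these query parameters.
    """
    query = urlencode([
        ("chart", chart),
        ("after", -interval),  # negative value = seconds relative to now
        ("points", 1),         # collapse the whole window into one point
        ("group", "average"),
        ("format", "json"),
    ])
    return "http://%s:%d/api/v1/data?%s" % (host, port, query)


def latest_values(payload):
    """Map each dimension label to its value in the first data row.

    Assumption: the payload looks like {"labels": [...], "data": [[...]]}.
    """
    doc = json.loads(payload)
    labels, row = doc["labels"], doc["data"][0]
    return dict((label, value) for label, value in zip(labels, row) if label != "time")


# Canned response in the assumed shape, used instead of a real HTTP call.
canned = '{"labels": ["time", "netdata", "apache"], "data": [[1500000000, 1.36, 0.003]]}'
print(build_data_url("127.0.0.1", 19999, "apps.cpu", 60))
print(latest_values(canned))
```

A real check would fetch the URL with the standard library's HTTP client and pass the response body to `latest_values`.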

Install and Config

The plugin only needs Python 2.7, with no additional dependencies or modules. It has only been tested on Linux, but it should work on other Unix and Windows systems, since it uses no operating system commands or OS-specific calls.

To install the plugin, copy the file into your plugin directory and configure your monitoring system like so:

cp netdata_to_nagios.py /usr/libexec/nagiosplugins/
chmod +x /usr/libexec/nagiosplugins/netdata_to_nagios.py

Generic command (despite its name, it works for any datasource):

define command{
    command_name    check_memory_via_netdata
    command_line    $PLUGIN_PATH$/netdata_to_nagios.py -H $HOSTADDRESS$ -p $ARG1$ -D $ARG2$ -i $ARG3$ -w $ARG4$ -c $ARG5$
}

Monitor memory usage:

define service{
    use                             generic-service         ; Name of service template to use
    host_name                       mymachine
    service_description             Memory Usage
    check_command                   check_memory_via_netdata!19999!system.ram!2!80!90
}

Monitor CPU usage per application. This alerts when a process consumes too much CPU and helps find which application is using CPU resources:

define service{
    use                             generic-service         ; Name of service template to use
    host_name                       mymachine
    service_description             CPU Usage per process
    check_command                   check_memory_via_netdata!19999!apps.cpu!60!80!90 ; Average CPU load during the last 60 seconds
}
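The warning and critical arguments in the service above (80 and 90) can be pictured as a per-dimension comparison. The following is a minimal sketch, assuming each dimension is compared against the thresholds with `>=` and that the worst result decides the state; the plugin's actual comparison rule is not documented here, and `evaluate` is a hypothetical name. The exit codes follow the usual Nagios plugin convention (0 OK, 1 WARNING, 2 CRITICAL).

```python
# Conventional Nagios plugin exit codes.
OK, WARNING, CRITICAL = 0, 1, 2
NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def evaluate(values, warning, critical):
    """Return (exit_code, offenders) for a dict of dimension -> value.

    Assumption: a value at or above a threshold triggers that state.
    """
    code, offenders = OK, []
    for name in sorted(values):
        value = values[name]
        if value >= critical:
            level = CRITICAL
        elif value >= warning:
            level = WARNING
        else:
            continue  # below both thresholds, nothing to report
        offenders.append(name)
        code = max(code, level)  # the worst state wins
    return code, offenders


code, offenders = evaluate({"apache": 85.0, "netdata": 1.4, "ssh": 0.01}, 80, 90)
print("%s %s" % (NAMES[code], ", ".join(offenders)))
```

Returning the offending dimension names alongside the state is what lets a per-process check say which application crossed the threshold.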

Monitor CPU usage at the system level. This helps determine whether the CPU is busy because of iowait, irq, system operations, etc.:

define service{
    use                             generic-service         ; Name of service template to use
    host_name                       mymachine
    service_description             CPU Usage (system)
    check_command                   check_memory_via_netdata!19999!system.cpu!60!80!90 ; Average CPU load during the last 60 seconds
}

Monitor disk space:

define service{
    use                             generic-service         ; Name of service template to use
    host_name                       mymachine
    service_description             Disk Space
    check_command                   check_memory_via_netdata!19999!disk_space._!60!80!90 ; Monitor the / partition
}

Monitor disk load:

define service{
    use                             generic-service         ; Name of service template to use
    host_name                       mymachine
    service_description             Disk Load
    check_command                   check_memory_via_netdata!19999!disk_util.sda!60!80!90 ; Average load during the last 60 seconds
}

Monitor Apache workers:

define service{
    use                             generic-service         ; Name of service template to use
    host_name                       mymachine
    service_description             Apache Workers
    check_command                   check_memory_via_netdata!19999!apache_local.workers!60!80!90 ; Average worker consumption during the last 60 seconds
}

Monitor Nginx connections:

define service{
    use                             generic-service         ; Name of service template to use
    host_name                       mymachine
    service_description             Nginx Connections
    check_command                   check_memory_via_netdata!19999!nginx_local.connections!60!1900!2048 ; Average number of connections during the last 60 seconds
}

Command options

Here is the full manual of command options:

    Usage:
    netdata_to_nagios.py -H host -p port [-D <datasource>] [-i <interval>] -w <80> -c <90>
    
    Options:
     -h, --help 
        Show detailed help
        
     -H, --host
        Specify remote netdata host address
        Default : 127.0.0.1
        
     -p, --port
        Specify remote netdata port
        Default : 19999
        
     -D, --datasource
        Specify which datasource you want to check. 
        Available datasources :
            - apps.cpu                 (default) Check CPU load per process
            - system.ram               Check REAL RAM consumption
            - system.cpu               System-wide CPU load (user, system, nice, irq, softirq, iowait)
            - disk_util.sda            Check disk load (replace sda with the name of your drive: sda, sdb, ...)
            - disk_space.sda1          Check disk space (replace sda1 with the name of your partition: sda2, sdb1, ...)
            - apache_local.workers     Check Apache worker consumption
            - nginx_local.connections  Check nginx connections
            - nginx_local.requests     Check nginx request rate 
            - mdstat.mdstat_health     Check if there is a faulty md raid array
            
     -i interval
        Specify an interval in seconds (minimum 2)
        Default : 60
        
     -w, --warning
        Specify warning threshold
        
     -c, --critical
        Specify critical threshold

More probes will be added soon.

Contributors

thecattony, faust64