nagios-eventhandler-cachet

Cachet system status updates from Nagios

This project is a Nagios event handler that updates a public system status page powered by Cachet when Nagios detects changes to services on your infrastructure.

It is a derivative of mpellegrin/nagios-eventhandler-cachet, but is differentiated by:

  • The ability, driven by a configuration file, to consider the status of more than one service, or of services across multiple hosts, when deciding how to update a component of the system status page. This is useful when, for example, several hosts behind a load balancer perform the same task.
  • A mechanism that lets operators manually pin a component on the status page to a given status, for cases where your Nagios checks are incorrect. To do this, open an incident for the component and mark it "stickied". As long as a component has a stickied incident, this script will not change the component's status or close out the incident.
  • Limited interaction with related incidents:
    If a component is automatically marked as having an issue, and an operator then opens an incident about that component to give people more information without marking it "stickied", the incident is automatically marked as fixed when the component is set back to Operational.
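
The pinning and incident rules above can be sketched in a few lines. This is only an illustration of the described behaviour, not the project's code (which is PHP): the Incident and Component classes and the apply_status function are hypothetical names, and it assumes that only incidents not yet marked as fixed count when checking for a stickied incident.

```python
from dataclasses import dataclass, field


@dataclass
class Incident:
    """Hypothetical stand-in for a Cachet incident."""
    name: str
    stickied: bool = False
    fixed: bool = False


@dataclass
class Component:
    """Hypothetical stand-in for a Cachet component."""
    name: str
    status: str = "Operational"
    incidents: list = field(default_factory=list)


def apply_status(component, new_status):
    """Apply an automatic status change.

    Returns False (and changes nothing) when the component is pinned by a
    stickied incident; returns True otherwise.
    """
    # Assumption: only incidents that are not yet fixed are considered.
    open_incidents = [i for i in component.incidents if not i.fixed]
    if any(i.stickied for i in open_incidents):
        return False  # pinned by an operator: leave status and incidents alone
    component.status = new_status
    if new_status == "Operational":
        # Non-stickied incidents are closed out once the component recovers.
        for incident in open_incidents:
            incident.fixed = True
    return True
```

With a stickied incident attached, apply_status(component, "Major Outage") leaves the component untouched; without one, the status changes, and a later call with "Operational" also marks the open incident as fixed.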

Installation

  • Clone this repository into a new directory
  • Run the command composer install inside the cloned directory.

Configuration

  • Copy config.sample.yaml to nagios_eventhandler_cachet.yaml or /etc/nagios_eventhandler_cachet.yaml.
  • Get a Cachet API key:
    • Create a new user in the Cachet dashboard.
    • Log in as this user.
    • Copy the API key from the user's profile.
  • Update your copy of the config file to contain the URL of your Cachet server's API and the API key from above.
    • These go in cachet_api/url and cachet_api/api_key.
  • Update your copy of the config file to contain the URL of your Nagios server's JSON CGI endpoints, and optionally an HTTP username and password if your Nagios instance is protected by some form of HTTP authentication.
    • These go in nagios_api/url and optionally nagios_api/username and nagios_api/password.

Cachet Component to Nagios Service mapping configuration

The most interesting part of your config file maps Cachet Components (these are the individual systems you can set the status of) to Nagios service(s) on particular host(s).

From the config.sample.yaml sample configuration, we have:

components:
  Login Nodes:
    service_aggregator: degrade_if_any_fail_if_all
    nagios_services:
      - host: sfec1
        service: SSH
      - host: sfec2
        service: SSH
      - host: sfec3
        service: SSH

In this sample,

  • "Login Nodes" is the name of a Cachet Component on your status page.
  • This component depends on the "SSH" Nagios service on three different hosts ("sfec1", "sfec2", and "sfec3").
  • To combine the three SSH statuses from these hosts into the single status that Cachet shows for the "Login Nodes" component, we use a service_aggregator called degrade_if_any_fail_if_all. It sets the status to Operational if all of the Nagios services are OK, Major Outage if none of them are OK, and Partial Outage if only some of them are not OK.
    • Other logic can be added / used instead. The currently available values for service_aggregator are defined in this source file.
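
The rule described for degrade_if_any_fail_if_all can be written out as a small function. This is a minimal Python sketch of the rule as stated above, not the project's own implementation: the boolean-list input and the string return values are assumptions made for illustration.

```python
def degrade_if_any_fail_if_all(service_ok):
    """Combine per-service results into one component status.

    service_ok: iterable of booleans, True when that Nagios service is OK.
    Returns the name of the Cachet status to show for the component.
    """
    flags = list(service_ok)
    if not flags:
        # Assumption: a component with no services is a configuration error.
        raise ValueError("at least one Nagios service is required")
    if all(flags):
        return "Operational"      # every service is OK
    if not any(flags):
        return "Major Outage"     # every service is failing
    return "Partial Outage"       # some, but not all, are failing


# The three SSH checks from the sample, with sfec2 failing:
print(degrade_if_any_fail_if_all([True, False, True]))
```

For the "Login Nodes" sample, one failing SSH check out of three yields Partial Outage, and the component only reaches Major Outage once all three hosts fail.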

Try it out

Set up at least one Cachet component and its underlying nagios_services in your config file. Then, run

./system_status_auto.php '[hostname]' '[service name]' CRITICAL HARD

where [hostname] matches a host: line in your config file and [service name] matches the service: line that goes with it.
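
Before running the command, it can help to check which components a given host and service pair maps to. The following sketch mirrors the structure of the sample configuration as a Python dictionary. The components_for function is a hypothetical helper, not part of the project, and it assumes that matching is an exact, case-sensitive comparison.

```python
# The sample configuration from above, written as a Python dictionary.
CONFIG = {
    "components": {
        "Login Nodes": {
            "service_aggregator": "degrade_if_any_fail_if_all",
            "nagios_services": [
                {"host": "sfec1", "service": "SSH"},
                {"host": "sfec2", "service": "SSH"},
                {"host": "sfec3", "service": "SSH"},
            ],
        }
    }
}


def components_for(config, host, service):
    """Return the names of components that list this host/service pair."""
    matches = []
    for name, component in config.get("components", {}).items():
        for entry in component.get("nagios_services", []):
            # Assumption: exact, case-sensitive match on both fields.
            if entry["host"] == host and entry["service"] == service:
                matches.append(name)
                break  # one match is enough for this component
    return matches


print(components_for(CONFIG, "sfec1", "SSH"))
```

A pair that appears in no nagios_services list yields an empty result, which suggests the arguments do not correspond to any configured component.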
