OpenShift-Zabbix

Summary

This repository contains monitoring scripts and configuration files to enable monitoring an OpenShift installation with Zabbix. Many of these scripts originated with the OpenShift Online product, but are expected to be useful in monitoring an OpenShift Enterprise installation as well.

Scope

While Zabbix is the primary target for these scripts, it is not expected to be the only use case. Many of these scripts should work with any network monitoring software with little or no modification. Where changes are required to support other monitoring solutions, patches are welcome.

This repository also contains configuration management code for deploying and configuring the scripts using Puppet. Similarly, while Puppet is the primary target, patches to support other configuration management software are welcome.

License

All code in this repository is licensed under the Apache License, Version 2.0. See the LICENSE file for the complete license text.

Copyrights are attributed individually in each file. If no attribution exists, the file is Copyright 2012 Red Hat Inc.

File Layout

./openshift_zabbix/
|-- files/           - Static files
|   |-- checks/      - Monitoring check scripts
|   |-- lib/         - Libraries used by checks
|   |-- userparams/  - Zabbix userparameter configuration files
|   `-- xml/         - XML template files defining Zabbix Templates.
|-- manifests/       - Puppet configuration manifests
`-- templates/       - ERB Templated files
    |-- checks/      - ERB Templated monitoring check scripts
    |-- lib/         - ERB Templated Libraries used by checks
    `-- userparams/  - ERB Templated Zabbix userparameter configuration files

Prerequisites

Getting Started

  1. Import the XML templates from the openshift_zabbix/files/xml/ directory into your Zabbix server.
  2. (Optional) Add the openshift_zabbix module to your Puppet code repository, and integrate it into your manifests.
  3. Deploy openshift_zabbix/files/{checks,lib} onto your OpenShift broker, node and messaging (ActiveMQ) servers as documented in openshift_zabbix/manifests.
  4. (Optional) Use openshift_zabbix/files/openshift_zabbix.conf.sample as an example of how to deploy common configuration settings to your OpenShift systems. All openshift_zabbix check scripts accept either command-line arguments or a YAML configuration file (default: /etc/openshift/openshift_zabbix.conf).
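The combination of a YAML configuration file and command-line arguments described in step 4 can be pictured with a minimal sketch like the one below. It is not the repository's own option parser: the flag spellings and YAML key names are assumptions chosen for illustration, and the assumption that command-line values take precedence over the file is also not confirmed by the README. Only the default path is taken from the text above.

```ruby
require 'yaml'
require 'optparse'

# Default location named in the README.
DEFAULT_CONFIG = '/etc/openshift/openshift_zabbix.conf'

# Load settings from a YAML file; a missing file yields an empty hash.
def load_config(path = DEFAULT_CONFIG)
  return {} unless File.exist?(path)
  data = YAML.load_file(path) || {}
  # Normalise keys to symbols so they line up with the CLI option names.
  Hash[data.map { |key, value| [key.to_sym, value] }]
end

# Parse command-line flags (hypothetical names); only flags that were
# actually given appear in the returned hash.
def parse_cli(argv)
  opts = {}
  OptionParser.new do |o|
    o.on('-s', '--server HOST') { |v| opts[:server] = v }
    o.on('-p', '--port PORT', Integer) { |v| opts[:port] = v }
    o.on('-v', '--verbose') { opts[:verbose] = true }
  end.parse(argv)
  opts
end

# Assumed precedence: values from the command line override the file.
def effective_options(argv, path = DEFAULT_CONFIG)
  load_config(path).merge(parse_cli(argv))
end
```

With a file containing `server: zabbix.example.com` and `port: 10051`, calling `effective_options(['--port', '10052'])` would keep the server from the file and take the port from the command line.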

openshift_zabbix's People

Contributors

blentz, fkolacek, hellaenergy, mwoodson, rharrison10, tdawson, wshearn, yocum137

openshift_zabbix's Issues

check-mc-ping: cannot load such file -- facter (LoadError)

When trying to run check-mc-ping on an OSE 2.1.4 Broker Host on RHEL 6.5 I get the following error:

# /usr/share/zabbix/bin/check-mc-ping 
/opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require': cannot load such file -- facter (LoadError)
    from /opt/rh/ruby193/root/usr/share/rubygems/rubygems/custom_require.rb:36:in `require'
    from /usr/share/zabbix/lib/zabbix_sender.rb:17:in `<top (required)>'
    from /usr/share/zabbix/bin/check-mc-ping:29:in `require_relative'
    from /usr/share/zabbix/bin/check-mc-ping:29:in `<main>'

What am I missing exactly? I tried installing facter from EPEL 6 w/o luck.
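The trace shows the script running under the ruby193 Software Collection (/opt/rh/ruby193/...). One possible explanation, which is an assumption and not confirmed in this report, is that the facter package from EPEL installs its library for the system Ruby, so the collection's interpreter never sees it. A small diagnostic like the following sketch, run with the same interpreter the check script uses, reports whether a given library is loadable from there. The helper names are made up for this example.

```ruby
require 'rbconfig'

# Return true if the named library can be loaded by the current interpreter.
def loadable?(name)
  require name
  true
rescue LoadError
  false
end

# Summarise which interpreter is running and whether the library is visible to it.
def load_report(name)
  {
    :ruby     => File.join(RbConfig::CONFIG['bindir'], RbConfig::CONFIG['ruby_install_name']),
    :version  => RUBY_VERSION,
    :library  => name,
    :loadable => loadable?(name)
  }
end

if __FILE__ == $0
  # Defaults to the library named in the error message above.
  load_report(ARGV[0] || 'facter').each { |key, value| puts "#{key}: #{value}" }
end
```

If the report shows `loadable: false` under the collection's Ruby but `true` under the system Ruby, the library is installed for a different interpreter than the one running the check.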

broker.xml and node.xml will not import into Zabbix 2.2.1

When trying to import the broker.xml into Zabbix 2.2.1 (revision 40808) (EPEL 6) I get the following error:

XML file contains fatal error 73:expected '>' [ Line: 299 | Column: 15 ]

When looking in the broker.xml on line 299 I see:

With the node.xml I get the following error:

Cannot import template "Template OpenShift Node", linked templates "Template App Apache Worker, Template File System Checks" do not exist.

I do not see anything in the default Zabbix templates or in the README about these linked templates. Where do I get them?

check-mc-ping: Cannot connect to a replica set

Even though OpenShift is able to connect to the MongoDB replica set via oo-admin-chk, check-mc-ping seems to be failing with the following error:

/opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/util/pool_manager.rb:265:in `get_valid_seed_node': Cannot connect to a replica set using seeds server1.example.com:27017, server2.example.com:27017, server3.example.com:27017 (Mongo::ConnectionFailure)
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/util/pool_manager.rb:171:in `connect_to_members'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/util/pool_manager.rb:66:in `block in connect'
    from <internal:prelude>:10:in `synchronize'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/util/pool_manager.rb:61:in `connect'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/mongo_replica_set_client.rb:202:in `block in connect'
    from <internal:prelude>:10:in `synchronize'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/mongo_replica_set_client.rb:191:in `connect'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/mongo_client.rb:693:in `setup'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/mongo_replica_set_client.rb:495:in `setup'
    from /opt/rh/ruby193/root/usr/share/gems/gems/mongo-1.9.2/lib/mongo/mongo_replica_set_client.rb:168:in `initialize'
    from /usr/share/zabbix/lib/openshift_mongo.rb:48:in `new'
    from /usr/share/zabbix/lib/openshift_mongo.rb:48:in `initialize'
    from /usr/share/zabbix/bin/check-mc-ping:99:in `new'
    from /usr/share/zabbix/bin/check-mc-ping:99:in `main'
    from /usr/share/zabbix/bin/check-mc-ping:182:in `<main>'

The MONGO entries in the /etc/openshift/broker.conf look like:

MONGO_HOST_PORT="server1.example.com:27017,server2.example.com:27017,server3.example.com:27017"
MONGO_USER="user"
MONGO_PASSWORD="password"
MONGO_DB="openshift_broker_db"
MONGO_TEST_DB="openshift_broker_test_db".
MONGO_SSL="true"
MONGO_WRITE_REPLICAS=1
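Because oo-admin-chk connects successfully with these settings, it may help to compare what broker.conf declares with what the check script actually hands to the MongoDB driver. The sketch below only parses the settings shown above into seeds and an SSL flag; it does not connect to anything. Whether openshift_mongo.rb forwards the SSL setting to the replica set client is not shown in this report, so treating MONGO_SSL as the cause is a guess. The function names are hypothetical.

```ruby
# Parse KEY=value lines (optionally double-quoted) into a hash of strings.
def parse_conf(text)
  settings = {}
  text.each_line do |line|
    line = line.strip
    next if line.empty? || line.start_with?('#')
    key, value = line.split('=', 2)
    next if value.nil?
    # Drop one pair of surrounding double quotes, if present.
    settings[key.strip] = value.strip.sub(/\A"(.*)"\z/, '\1')
  end
  settings
end

# Split MONGO_HOST_PORT into [host, port] pairs, defaulting the port to 27017.
def mongo_seeds(settings)
  settings.fetch('MONGO_HOST_PORT', '').split(',').map do |seed|
    host, port = seed.strip.split(':', 2)
    [host, Integer(port || 27017)]
  end
end

# True when the configuration asks for SSL connections to MongoDB.
def mongo_ssl?(settings)
  settings['MONGO_SSL'].to_s.downcase == 'true'
end
```

Printing `mongo_seeds(...)` and `mongo_ssl?(...)` next to the options the check script passes to the driver makes any mismatch between the two easy to spot.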

check-district-capacity: LoadError: cannot infer basepath

I got the following error when trying to run check-district-capacity:

# irb /usr/share/zabbix/bin/check-district-capacity 
check-district-capacity(main):001:0> #!/usr/bin/env oo-ruby
check-district-capacity(main):002:0* #
check-district-capacity(main):003:0* #   Copyright 2012 Red Hat Inc.
check-district-capacity(main):004:0* #
check-district-capacity(main):005:0* #   Licensed under the Apache License, Version 2.0 (the "License");
check-district-capacity(main):006:0* #   you may not use this file except in compliance with the License.
check-district-capacity(main):007:0* #   You may obtain a copy of the License at
check-district-capacity(main):008:0* #
check-district-capacity(main):009:0* #       http://www.apache.org/licenses/LICENSE-2.0
check-district-capacity(main):010:0* #
check-district-capacity(main):011:0* #   Unless required by applicable law or agreed to in writing, software
check-district-capacity(main):012:0* #   distributed under the License is distributed on an "AS IS" BASIS,
check-district-capacity(main):013:0* #   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
check-district-capacity(main):014:0* #   See the License for the specific language governing permissions and
check-district-capacity(main):015:0* #   limitations under the License.
check-district-capacity(main):016:0* #
check-district-capacity(main):017:0* # Purpose: report district capacity statistics to Zabbix.
check-district-capacity(main):018:0* #
check-district-capacity(main):019:0* require_relative '../lib/zabbix_sender'
LoadError: cannot infer basepath
    from /usr/share/zabbix/bin/check-district-capacity:19:in `require_relative'
    from /usr/share/zabbix/bin/check-district-capacity:19
    from /opt/rh/ruby193/root/usr/bin/irb:12:in `<main>'
check-district-capacity(main):020:0> require_relative '../lib/utils/log'
LoadError: cannot infer basepath
    from /usr/share/zabbix/bin/check-district-capacity:20:in `require_relative'
    from /usr/share/zabbix/bin/check-district-capacity:20
    from /opt/rh/ruby193/root/usr/bin/irb:12:in `<main>'
check-district-capacity(main):021:0> require_relative '../lib/utils/cli_opts'
LoadError: cannot infer basepath
    from /usr/share/zabbix/bin/check-district-capacity:21:in `require_relative'
    from /usr/share/zabbix/bin/check-district-capacity:21
    from /opt/rh/ruby193/root/usr/bin/irb:12:in `<main>'
check-district-capacity(main):022:0> 
check-district-capacity(main):023:0* load '/usr/sbin/oo-stats'
=> true
check-district-capacity(main):024:0> 
check-district-capacity(main):025:0* def get_district_stats
check-district-capacity(main):026:1>   begin
check-district-capacity(main):027:2*     stats = OOStats.new
check-district-capacity(main):028:2>     stats.gather_statistics
check-district-capacity(main):029:2>     return stats.results
check-district-capacity(main):030:2>   rescue => e
check-district-capacity(main):031:2>     $log.puts e.message
check-district-capacity(main):032:2>     $log.puts e.backtrace
check-district-capacity(main):033:2>   end
check-district-capacity(main):034:1> 
check-district-capacity(main):035:1*   return nil
check-district-capacity(main):036:1> end
=> nil
check-district-capacity(main):037:0> 
check-district-capacity(main):038:0* def main
check-district-capacity(main):039:1>   stats = get_district_stats
check-district-capacity(main):040:1> 
check-district-capacity(main):041:1*   if stats.nil?
check-district-capacity(main):042:2>     $log.error "Unable to gather stats from oo-stats. Exiting..."
check-district-capacity(main):043:2>     exit
check-district-capacity(main):044:2>   end
check-district-capacity(main):045:1> 
check-district-capacity(main):046:1*   zs = ZabbixSender.new($opts[:server], :port=>$opts[:port], :log=>$log)
check-district-capacity(main):047:1> 
check-district-capacity(main):048:1*   stats['profile_summaries'].each do |profile|
check-district-capacity(main):049:2*     name        = profile['profile']
check-district-capacity(main):050:2>     avail_gears = profile['effective_available_gears']
check-district-capacity(main):051:2>     avail_uids  = profile['dist_avail_uids']
check-district-capacity(main):052:2> 
check-district-capacity(main):053:2*     zs.add_entry("#{name}_district_capacity_uuids", avail_uids)
check-district-capacity(main):054:2>     zs.add_entry("#{name}_district_capacity_gears", avail_gears)
check-district-capacity(main):055:2> 
check-district-capacity(main):056:2*     if $opts[:verbose]
check-district-capacity(main):057:3>       $log.puts "#{name}: #{avail_gears} gears; #{avail_uids} uuids"
check-district-capacity(main):058:3>     end
check-district-capacity(main):059:2>   end
check-district-capacity(main):060:1> 
check-district-capacity(main):061:1*   zs.send_data($opts[:verbose]) unless $opts[:test]
check-district-capacity(main):062:1> end
=> nil
check-district-capacity(main):063:0> 
check-district-capacity(main):064:0* if __FILE__ == $0
check-district-capacity(main):065:1>   cli = CLIOpts.new
check-district-capacity(main):066:1>   cli.parse
check-district-capacity(main):067:1>   $opts = cli.options
check-district-capacity(main):068:1>   $log  = Log.new
check-district-capacity(main):069:1>   main
check-district-capacity(main):070:1> end
NameError: uninitialized constant CLIOpts
    from /usr/share/zabbix/bin/check-district-capacity:65
    from /opt/rh/ruby193/root/usr/bin/irb:12:in `<main>'
check-district-capacity(main):071:0> 

The requirements are in fact there:

# ls -R /usr/share/zabbix/lib
/usr/share/zabbix/lib:
accept_node.rb  logfile_parser.rb  openshift_mcollective.rb  openshift_mongo.rb  user_action.rb  utils  zabbix_sender.rb

/usr/share/zabbix/lib/utils:
cli_opts.rb  config.rb  log.rb  state.rb

This is on RHEL 6.5 with SCL 1.1 and OSE 2.1
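The transcript begins with `irb /usr/share/zabbix/bin/check-district-capacity`, so the script is fed to irb line by line instead of being executed as a file. `require_relative` resolves its argument against the file that contains the call and raises "cannot infer basepath" when no such file exists, which is the case inside irb; the later `NameError` for `CLIOpts` follows from those failed loads. Running the script directly should avoid this. The sketch below uses made-up file names in a temporary directory to show that a path resolved relative to the script's own location works once the script is loaded as a file.

```ruby
require 'tmpdir'
require 'fileutils'

# Build a throwaway bin/ and lib/ layout, then load the script as a file.
# Returns true if the helper library was found relative to the script.
def demo_file_relative_require
  Dir.mktmpdir do |dir|
    FileUtils.mkdir_p(File.join(dir, 'lib'))
    FileUtils.mkdir_p(File.join(dir, 'bin'))

    # Hypothetical helper library; it only defines a marker constant.
    File.write(File.join(dir, 'lib', 'demo_helper.rb'),
               "DEMO_HELPER_LOADED = true\n")

    # Hypothetical check script; it locates lib/ relative to its own path.
    script = File.join(dir, 'bin', 'demo-check')
    File.write(script,
               "require File.expand_path('../../lib/demo_helper', __FILE__)\n")

    # When loaded as a file, __FILE__ is the script path, so the require resolves.
    load script
    Object.const_defined?(:DEMO_HELPER_LOADED)
  end
end
```

The same resolution cannot happen in irb because the code being evaluated has no file path of its own to anchor a relative lookup.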

/usr/share/zabbix/lib/logfile_parser.rb:30:in `initialize': ArgumentError (ArgumentError)

I am attempting to run check-user-action-log on an OSE 2.2.4 host. I have /etc/openshift/openshift_zabbix.conf set up and working; to verify this, I have check-mc-ping running, and it is sending data as expected.

/usr/share/zabbix/lib/logfile_parser.rb:30:in `initialize': ArgumentError (ArgumentError)
    from ./check-user-action-log:45:in `new'
    from ./check-user-action-log:45:in `parse_action_log'
    from ./check-user-action-log:139:in `block in <main>'
    from /opt/rh/ruby193/root/usr/share/ruby/benchmark.rb:295:in `realtime'
    from ./check-user-action-log:138:in `<main>'

[RFE] Improve Implementation Documentation

I'd like to see more documentation around how the check scripts are run and how to check if they're functioning correctly in Zabbix (within the README perhaps?).

Take the node host check as an example. I imported the template, installed the scripts, and set up a cron job to run them, based on reading through the provided Puppet manifests. I get back the following result on one of my node hosts:

zabbix_sender [5471]: DEBUG: answer [{
    "response":"success",
    "info":"processed: 0; failed: 1; total: 1; seconds spent: 0.000139"}]
info from server: "processed: 0; failed: 1; total: 1; seconds spent: 0.000139"
sent: 1; skipped: 0; total: 1

On the Zabbix server side I see:

  9200:20140811:171956.720 trapper #2 [processing data]
  9200:20140811:171956.720 Trapper got [{
        "request":"sender data",
        "data":[
                {
                        "host":"server1",
                        "key":"accept_node",
                        "value":"4"}]}] len 114
  9200:20140811:171956.720 In recv_agenthistory()
  9200:20140811:171956.720 In process_hist_data()
  9200:20140811:171956.720 In process_mass_data()
  9200:20140811:171956.720 End of process_mass_data()
  9200:20140811:171956.720 End of process_hist_data():SUCCEED
  9200:20140811:171956.721 In zbx_send_response()
  9200:20140811:171956.721 zbx_send_response() '{
        "response":"success",
        "info":"processed: 0; failed: 1; total: 1; seconds spent: 0.000180"}'
  9200:20140811:171956.721 End of zbx_send_response():SUCCEED

What does this mean exactly? It seems that nothing is being updated on the Zabbix server when looking at the "Latest Data" for that node. Meanwhile I get messages from Zabbix like:

[Heal] cgred is not running on server1
[Heal] Mcollective is not running on server1

Which is not true. Both are in fact running on that server.

I'm now left in a state where I don't quite know what I missed from an implementation perspective, or whether everything is functioning as expected. Some more detailed implementation and debugging documentation would be really nice right about now.
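On reading the output above: `"response":"success"` only says that the server understood the request. The counters in the `info` string say what happened to the values, and `processed: 0; failed: 1` means the single `accept_node` value was not stored. With Zabbix trapper items this typically happens when the host name sent ("server1") does not match a host configured on the server, when that host has no trapper item with the given key, or when the sender is not in the item's allowed hosts; which of these applies here cannot be told from the logs. A minimal sketch for turning the `info` string into counters that a wrapper script could check (the function names are hypothetical):

```ruby
# Parse the "info" string returned by the Zabbix trapper into a hash,
# e.g. "processed: 0; failed: 1; total: 1; seconds spent: 0.000139".
def parse_sender_info(info)
  result = {}
  info.split(';').each do |part|
    key, value = part.split(':', 2)
    next if value.nil?
    key = key.strip.tr(' ', '_').to_sym
    value = value.strip
    # Counters are integers; the elapsed time is a float.
    result[key] = value.include?('.') ? value.to_f : value.to_i
  end
  result
end

# True only when every value sent was accepted by the server.
def all_accepted?(info)
  counts = parse_sender_info(info)
  counts[:failed] == 0 && counts[:processed] == counts[:total]
end
```

A cron wrapper could log or alert whenever `all_accepted?` returns false, so that silently rejected values do not go unnoticed.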
