
couchbase_exporter's Introduction

Couchbase Exporter


Exposes metrics from a Couchbase cluster for consumption by Prometheus.

News

Couchbase has released an official exporter: couchbase-exporter.

Getting Started

Run from the command line:

./couchbase_exporter [flags]

The exporter can be configured in several ways: command-line arguments take precedence over environment variables, which in turn take precedence over the configuration file.

A configuration file can be provided on the command line with the --config.file option. It must be written in JSON or YAML. If none is provided, the exporter looks for a file named config.json or config.yml in the same directory as the exporter binary. You can find complete examples of configuration files in the sources (examples directory).
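As an illustration, a YAML configuration file might look like the following. The key names here are assumed to mirror the command-line flags; refer to the examples directory for the authoritative format:

```yaml
# Hypothetical config.yml; key names are inferred from the flag names
# and may differ from the files shipped in the examples directory.
web:
  listen-address: ":9191"
  telemetry-path: "/metrics"
  timeout: 10s
db:
  uri: "http://127.0.0.1:8091"
  timeout: 10s
scrape:
  cluster: true
  node: true
  bucket: true
  xdcr: false
log:
  level: "info"
  format: "text"
```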

The available flags and their equivalent environment variables are listed below:

| Environment variable | Argument | Description | Default |
| --- | --- | --- | --- |
| | -config.file | Configuration file to load data from | |
| CB_EXPORTER_LISTEN_ADDR | -web.listen-address | Address to listen on for HTTP requests | :9191 |
| CB_EXPORTER_TELEMETRY_PATH | -web.telemetry-path | Path under which to expose metrics | /metrics |
| CB_EXPORTER_SERVER_TIMEOUT | -web.timeout | Server read timeout in seconds | 10s |
| CB_EXPORTER_DB_URI | -db.uri | Address of Couchbase cluster | http://127.0.0.1:8091 |
| CB_EXPORTER_DB_TIMEOUT | -db.timeout | Couchbase client timeout in seconds | 10s |
| CB_EXPORTER_TLS_ENABLED | -tls.enabled | If true, enable TLS communication with the cluster | false |
| CB_EXPORTER_TLS_SKIP_INSECURE | -tls.skip-insecure | If true, the certificate won't be verified | false |
| CB_EXPORTER_TLS_CA_CERT | -tls.ca-cert | Root certificate of the cluster | |
| CB_EXPORTER_TLS_CLIENT_CERT | -tls.client-cert | Client certificate | |
| CB_EXPORTER_TLS_CLIENT_KEY | -tls.client-key | Client private key | |
| CB_EXPORTER_DB_USER | not allowed | Administrator username | |
| CB_EXPORTER_DB_PASSWORD | not allowed | Administrator password | |
| CB_EXPORTER_LOG_LEVEL | -log.level | Log level: info, debug, warn, error, fatal | error |
| CB_EXPORTER_LOG_FORMAT | -log.format | Log format: text, json | text |
| CB_EXPORTER_SCRAPE_CLUSTER | -scrape.cluster | If false, won't scrape cluster metrics | true |
| CB_EXPORTER_SCRAPE_NODE | -scrape.node | If false, won't scrape node metrics | true |
| CB_EXPORTER_SCRAPE_BUCKET | -scrape.bucket | If false, won't scrape bucket metrics | true |
| CB_EXPORTER_SCRAPE_XDCR | -scrape.xdcr | If false, won't scrape xdcr metrics | false |
| | -help | Command-line help | |

Important: for security reasons, credentials cannot be set with command-line arguments.
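For example, credentials can be supplied through the environment while everything else comes from flags. The cluster address below is a placeholder:

```shell
# Credentials must come from the environment (or a config file),
# never from command-line arguments.
export CB_EXPORTER_DB_USER=admin
export CB_EXPORTER_DB_PASSWORD='complicatedpassword'
# Then start the exporter; flags override environment and file settings:
# ./couchbase_exporter -db.uri=http://10.0.0.1:8091 -log.level=info
```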

Metrics

All metrics are listed in resources/metrics.md.

Docker

Use it like this:

docker run --name cbexporter -p 9191:9191 -e CB_EXPORTER_DB_USER=admin -e CB_EXPORTER_DB_PASSWORD=complicatedpassword blakelead/couchbase-exporter:latest
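If you prefer Compose, a hypothetical docker-compose.yml equivalent of the command above could look like this:

```yaml
# Sketch of a docker-compose.yml mirroring the docker run command above.
version: "3"
services:
  cbexporter:
    image: blakelead/couchbase-exporter:latest
    ports:
      - "9191:9191"
    environment:
      CB_EXPORTER_DB_USER: admin
      CB_EXPORTER_DB_PASSWORD: complicatedpassword
```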

Examples

You can find example files in the resources directory.

Prometheus

Some simple alerting rules: resources/prometheus-alerts.yaml.
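To have Prometheus scrape the exporter, a minimal scrape configuration along these lines should work (the job name is illustrative):

```yaml
# Minimal Prometheus scrape config; job name chosen for illustration.
scrape_configs:
  - job_name: couchbase
    static_configs:
      - targets: ['localhost:9191']
```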

Grafana

Minimal dashboard (resources/grafana-dashboard.json):

Systemd

You can adapt and use the provided service template to run the exporter with systemd (resources/couchbase-exporter.service):

sudo mv couchbase-exporter.service /etc/systemd/system/couchbase-exporter.service
sudo systemctl enable couchbase-exporter.service
sudo systemctl start couchbase-exporter.service
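The provided template is not reproduced here; as a rough sketch, a unit file might look like the following, where paths, user, and credentials are assumptions to adapt. See resources/couchbase-exporter.service for the actual template:

```ini
# Hypothetical couchbase-exporter.service; adjust paths, user and
# environment to your setup.
[Unit]
Description=Couchbase Exporter
After=network.target

[Service]
User=couchbase-exporter
Environment=CB_EXPORTER_DB_USER=admin
Environment=CB_EXPORTER_DB_PASSWORD=complicatedpassword
ExecStart=/usr/local/bin/couchbase_exporter -db.uri=http://127.0.0.1:8091
Restart=on-failure

[Install]
WantedBy=multi-user.target
```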

Contributors

Special thanks to:

  • @Berchiche
  • @bitdba88
  • @CharlesRaymond1
  • @pandrieux


couchbase_exporter's Issues

couchbase_export build failed

go build

github.com/leansys-team/couchbase_exporter

./couchbase_exporter.go:62:3: unknown field 'TLSEnabled' in struct literal of type collector.Context
./couchbase_exporter.go:63:3: unknown field 'TLSSkipInsecure' in struct literal of type collector.Context
./couchbase_exporter.go:64:3: unknown field 'TLSCACert' in struct literal of type collector.Context
./couchbase_exporter.go:65:3: unknown field 'TLSClientCert' in struct literal of type collector.Context
./couchbase_exporter.go:66:3: unknown field 'TLSClientKey' in struct literal of type collector.Context

go mod collector source is not new

diff collector/common.go /mnt/e/go/dev/pkg/mod/github.com/blakelead/[email protected]/collector/common.go
9,10d8
< 	"crypto/tls"
< 	"crypto/x509"
40,52c38,45
< 	URI             string
< 	Username        string
< 	Password        string
< 	Timeout         time.Duration
< 	ScrapeCluster   bool
< 	ScrapeNode      bool
< 	ScrapeBucket    bool
< 	ScrapeXDCR      bool
< 	TLSEnabled      bool
< 	TLSSkipInsecure bool
< 	TLSCACert       string
< 	TLSClientCert   string
< 	TLSClientKey    string
---
> 	URI           string
> 	Username      string
> 	Password      string
> 	Timeout       time.Duration
> 	ScrapeCluster bool
> 	ScrapeNode    bool
> 	ScrapeBucket  bool
> 	ScrapeXDCR    bool
121,128d113
< 	tlsClientConfig := &tls.Config{}
< 	if c.TLSEnabled {
< 		tlsClientConfig, err = createTLSClientConfig(c)
< 		if err != nil {
< 			log.Error(err)
< 		}
< 	}
<
130,136c115
< 	client := http.Client{
< 		Timeout: c.Timeout,
< 		Transport: &http.Transport{
< 			TLSClientConfig: tlsClientConfig,
< 		},
< 	}
<
---
> 	client := http.Client{Timeout: c.Timeout}
138d116
<
143d120
<
266,288d242
< }
<
< // createTLSClientConfig loads certificates and create TLS config
< func createTLSClientConfig(c Context) (*tls.Config, error) {
< 	caCert, err := ioutil.ReadFile(c.TLSCACert)
< 	if err != nil {
< 		return nil, err
< 	}
< 	certPool := x509.NewCertPool()
< 	certPool.AppendCertsFromPEM(caCert)
<
< 	keyPair, err := tls.LoadX509KeyPair(c.TLSClientCert, c.TLSClientKey)
< 	if err != nil {
< 		return nil, err
< 	}
<
< 	config := tls.Config{
< 		Certificates:       []tls.Certificate{keyPair},
< 		ClientCAs:          certPool,
< 		InsecureSkipVerify: c.TLSSkipInsecure,
< 	}
<
< 	return &config, nil
Support for Couchbase 6.x

Hi,

is Couchbase 6.x supported?

I haven't tested it yet, thought you probably already know if it's the case or not.

Thanks

Issue with docker image run

I am trying to run the docker image with all default values, but the container is exiting with the error below:

docker run blakelead/couchbase-exporter:0.6.0
time="2019-06-19T11:09:45Z" level=info msg="stat /bin/config.yml: no such file or directory: using command-line parameters and/or environment variables if provided"
time="2019-06-19T11:09:45Z" level=info msg="Couchbase Exporter Version: 0.6.0"
time="2019-06-19T11:09:45Z" level=info msg="Supported Couchbase versions: 4.5.1, 4.6.5, 5.1.1"
time="2019-06-19T11:09:45Z" level=info msg="config.file=config.yml"
time="2019-06-19T11:09:45Z" level=info msg="web.listen-address=9191"
time="2019-06-19T11:09:45Z" level=info msg="web.telemetry-path=/metrics"
time="2019-06-19T11:09:45Z" level=info msg="web.timeout=0s"
time="2019-06-19T11:09:45Z" level=info msg="db.uri=http://127.0.0.1:8091"
time="2019-06-19T11:09:45Z" level=info msg="db.timeout=0s"
time="2019-06-19T11:09:45Z" level=info msg="log.level=info"
time="2019-06-19T11:09:45Z" level=info msg="log.format=text"
time="2019-06-19T11:09:45Z" level=info msg="scrape.cluster=true"
time="2019-06-19T11:09:45Z" level=info msg="scrape.node=true"
time="2019-06-19T11:09:45Z" level=info msg="scrape.bucket=true"
time="2019-06-19T11:09:45Z" level=info msg="scrape.xdcr=false"
time="2019-06-19T11:09:45Z" level=info msg="Started listening at 9191"
time="2019-06-19T11:09:45Z" level=fatal msg="listen tcp: address 9191: missing port in address"

Do you have the Kubernetes deployment file for this?

[xdcr] crash when scraping

go version go1.8.3 linux/amd64
-sh-4.2$ go run *go
INFO[0000] No configuration file was found in the working directory /tmp/go-build797253098/command-line-arguments/_obj/exe 
INFO[0000] Couchbase version: 3.0.1-1444-rel-community  
INFO[0000] Community version: true                      
WARN[0000] Version 3.0.1-1444-rel-community may not be supported by this exporter 
ERRO[0000] Could not read file /tmp/go-build797253098/command-line-arguments/_obj/exe/metrics/cluster-default.json 
ERRO[0000] Error during creation of cluster exporter. Cluster metrics won't be scraped 
INFO[0000] Listening at :9191         

json error

exporter version 0.7.0 (compiled binary not in a container)
couchbase version: 5.0.1

The exporter is handing out metrics BUT throws some errors on various servers:

level=error msg="json: cannot unmarshal number 530.5305305305305 into Go struct field .cmd_get of type int"
level=error msg="json: cannot unmarshal number 8.998001998001998 into Go struct field .diskFetches of type int"
level=error msg="json: cannot unmarshal object into Go struct field BucketData.autoCompactionSettings of type bool"

The new version is not running

Hey there,

I tried the new build and it exits immediately after being created.

I run it with: docker run --name cbexporter -p 9191:9191 -e CB_EXPORTER_TELEMETRY_PATH=/metrics -e CB_EXPORTER_DB_URI=http://10.7.62.132 -e CB_EXPORTER_DB_USER=prometheus -e CB_EXPORTER_DB_PASSWORD=prompwd blakelead/couchbase-exporter:latest

Then I see this: cdfb3009fb0f blakelead/couchbase-exporter:latest "/bin/sh -c /bin/cou…" 6 seconds ago Exited (1) 5 seconds ago cbexporter

Thanks for your input

Node vs Bucket metrics

I have a 3 node cluster. I'm wondering do I need a separate exporter for each and every node to grab each "NODE's" metrics (for example: cb_node_stats_cmd_get)? If my understanding is correct, each exporter would be calculating the same BUCKET and CLUSTER metrics redundantly.

Is there a way to just have 1 exporter running that points to a single node in the cluster, and then can identify all other nodes in that cluster and provide the *NODE metrics breakdown for each node? (Much like we have for BUCKETs, if we have multiple buckets, we report on each bucket from a single exporter).

What am I doing wrong

I tried to start this from the command line in a CentOS container, and I tried to download https://github.com/blakelead/couchbase_exporter/releases/download/0.1.0/couchbase_exporter-0.1.0-linux-amd64.tar.gz. I also tried the premade docker image "blakelead/couchbase-exporter". I am not understanding what to do here. After running the docker image I get the following:
sudo docker run -dit --name dbapocs_cbexp_004 --label triton.network.public="SDC-PCI-Dev-DB" -web.listen-address=":9191" -web.telemetry-path="/metrics" -db.url="http://xx.x.xx.xxx:8091" -db.user="prometheus" -db.pwd="prompwd" blakelead/couchbase-exporter
Password:
unknown shorthand flag: 'b' in -b.url=http://xx.x.xx.xxx:8091

I don't understand why it's not functioning properly.
In the CentOS container I run the command-line one and I see all of the metrics I need in the terminal window. However, when I go to the URL http://server:9191 I am not seeing all the metrics that I see in the terminal window.
[screenshot: 2018-08-22, 9:27:50 am]

I suspect that is the log that is showing there. See screenshot below.

[screenshot: 2018-08-22, 9:17:10 am]

If you would like a screen shot, I can send you one via email. Just let me know.

Thanks,
Michael

Incorrect binary in linux amd64 release

Wrong binary in tar.gz for linux amd64.

The linux-amd64.tar.gz from the release page does not run on alpine:3.9.3.

However, if I clone the repo and build the binary with env GOOS=linux GOARCH=amd64 go build ./couchbase_exporter.go, it runs on alpine with no problem.

This leads me to believe the release tar.gz does not contain the correct binary.

Support Couchbase 6.0

A similar question was raised before. When we start the exporter, we get the warnings below and we do not see any metrics emitted.

[root]# ./couchbase2_exporter -scrape.xdcr=false
INFO[0000] Couchbase version: 6.0.0-1693-community
INFO[0000] Community version: true
WARN[0000] Version 6.0.0-1693-community may not be supported by this exporter
INFO[0000] Listening at :9420

Is TYPE correct for cb_bucket_ep_oom_errors and cb_bucket_ep_tmp_oom_errors ?

I have 558 Couchbase nodes, and over the past week I've only ever seen these metrics increase and plateau, never decrease, even on machines that presently have 85% RAM free. The couchbase_exporter documents them as:

# HELP cb_bucket_ep_oom_errors Number of times unrecoverable OOMs happened while processing operations
# TYPE cb_bucket_ep_oom_errors gauge

I found DataDog documents these metrics as type 'gauge' but I worry these are actually of type counter, because the numbers stay high long after the machine's memory pressure is relieved.
I'm having a terrible time finding mention of "Samples.EpOomErrors" or "Samples.EpTmpOomErrors" in the Couchbase documentation. All I've found is a passing mention to "ep_oom_errors" and how it's a bad thing if you see it at all... and about a dozen other websites that copy-paste that one paragraph.

I'm certain the exporter is correctly relaying the information from Couchbase, but I would like assurance that these metrics are of type gauge, and if so, a more comprehensible description of their meaning. I.e., if these are a gauge measurement of erroneous operations, how many operations were sampled for this gauge? A minute's worth? An hour's worth? If I scrape less often than the sample range, could errors go undetected between scrapes?
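Until the metric type is confirmed, one defensive approach is to alert on growth rather than absolute values, which behaves the same whether the series is a counter or a gauge that only ratchets upward. A PromQL sketch (the window is illustrative):

```promql
# Fires if OOM errors grew in the last 10 minutes, regardless of
# whether the series is a counter or a plateauing gauge.
delta(cb_bucket_ep_oom_errors[10m]) > 0
```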

Errors when running against couchbase 3.0.1 -- timeout exceeded while awaiting headers

Hello,

Thanks for the help earlier -- it can run now without crashing :) One issue though: yes, it runs, but it doesn't actually get any stats when it runs now. I get these errors every time it scrapes (below). When I go to http://:8091/pools/default/buckets/presence/stats/replications I do actually see json, so the endpoints seem to be there. Any ideas? Thanks!

ERRO[0098] Get http://:8091/pools/default/buckets/mwi/stats: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
ERRO[0098] Get http://:8091/pools/default/buckets/presence/stats: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
ERRO[0098] Could not unmarshal bucketstats data for bucket mwi

Collect performance issues

Collect can take a long time eventually failing with a timeout, especially for bucket metrics fetching.

A solution would be to parallelize bucket API requests.

can't identify some couchbase cluster metrics

I am trying to add some Couchbase cluster metrics such as cluster RAM/disk, but I can't identify them, even though they seem to be defined in the metrics directory in the repo. By the way, what should I do if I need to add more custom metrics in the future?

    "name": "cluster",
    "route": "/pools/default",
    "list": [
        { "name": "ram_total_bytes",         "id": "StorageTotals.ram.total",       "description": "Total memory available to the cluster",              "labels": [] },
        { "name": "ram_used_bytes",          "id": "StorageTotals.ram.used",        "description": "Memory used by the cluster",                         "labels": [] },
        { "name": "ram_used_by_data_bytes",  "id": "StorageTotals.ram.usedByData",  "description": "Memory used by the data in the cluster",             "labels": [] },
        { "name": "ram_quota_total_bytes",   "id": "StorageTotals.ram.quotaTotal", "description": "Total memory allocated to Couchbase in the cluster", "labels": [] },
        { "name": "ram_quota_used_bytes",    "id": "StorageTotals.ram.quotaUsed",   "description": "Memory quota used by the cluster",                   "labels": [] },
        { "name": "disk_total_bytes",        "id": "StorageTotals.hdd.total",       "description": "Total disk space available to the cluster",          "labels": [] },
        { "name": "disk_used_bytes",         "id": "StorageTotals.hdd.used",        "description": "Disk space used by the cluster",                     "labels": [] },
        { "name": "disk_quota_total_bytes",  "id": "StorageTotals.hdd.quotaTotal",  "description": "Disk space quota for the cluster",                   "labels": [] },
        { "name": "disk_used_by_data_bytes", "id": "StorageTotals.hdd.usedByData",  "description": "Disk space used by the data in the cluster",         "labels": [] },
        { "name": "disk_free_bytes",         "id": "StorageTotals.hdd.free",        "description": "Free disk space in the cluster",                     "labels": [] },

    ]
}

Do not force installing exporters on each node in cluster

I have read your answer to @shipmak in issue #6 and I read the Prometheus deployment paragraph, but I think this should be an exception here.
It looks as if the exporter is meant to run on each node in the Couchbase cluster, but in our case we have a 6-node cluster (others might have more) and I'm running the exporter as a deployment in a separate Kubernetes cluster.
I don't think it makes sense in my case to have 6 deployments running with only different node variables; plus the redundant bucket metrics DO mess up your queries once you start summing things up and forget about splitting by the "instance" label.
I feel the best solution for this is a self-discovery mechanism for each node in the cluster, reporting them all as a label.
For example, instead of:
cb_node_service_up 1 =>
cb_node_service_up {node="node1"} 1
cb_node_service_up {node="node2"} 1

Also, in the Grafana dashboard that you provided (nice work btw!), the $node variable assumes the same idea of the exporter running on the Couchbase node itself, when in fact the "instance" label is just the Kubernetes node the exporter is running on.

multiple xdcr crashes the exporter

exporter version 0.9.0
couchbase version 5.0.1-5003 (community)

Issue: setting up a Couchbase cluster to replicate to 2 datacenters (each with a distinct cluster id and name) throws the exporter into a strange loop. After a few Prometheus scrapes it uses over 1030 file descriptors and errors out with this for every xdcr metric.

  • collected metric "cb_xdcr_error_count" { label:<name:"destination_bucket" value:"XXXXX" > label:<name:"remote_cluster_id" value:"YYYY" > label:<name:"remote_cluster_name" value:"backup" > label:<name:"source_bucket" value:"ZZZZ" > gauge:<value:0 > } was collected before with the same name and label values

Cluster configuration

Is it necessary to run the exporter against each node in the cluster? Also what permissions are required by the exporter to get statistics?

Make the application more secure

As pointed out by Brian Brazil in the Prometheus developer group, password is passed in an unsecure way.

I'm looking for the best way to address this issue but as I'm not proficient enough, I'm open to advice of any form.
