prometheus / node_exporter
Exporter for machine metrics
Home Page: https://prometheus.io/
License: Apache License 2.0
I'm getting the following error while trying to build:
GOROOT=/usr/local/go GOPATH=/home/f8366545/Projects/node_exporter-0.8.0/.build/gopath /usr/local/go/bin/go build -o node_exporter
# github.com/prometheus/client_golang/text
.build/gopath/src/github.com/prometheus/client_golang/text/proto.go:30: undefined: ext.WriteDelimited
make: *** [node_exporter] Error 2
I'm not a gopher myself, so it may be related to my env.
After activating the runit exporter, only the node_exporter_scrape_duration_seconds* metric gets populated with runit values.
No service statuses are exposed.
This happens in version 94d2259; other versions were not tested.
The system is Debian testing with runit 2.1.2-3.
A Debian wheezy system with runit 2.1.1-6.2 was also tested.
Hi,
sometimes the ntp collector returns a drift of around -3643548636.539542 seconds, which seems to correspond to a date around 1900.
We're using https://github.com/beevik/ntp/blob/master/ntp.go, which apparently relies on the UDP stack to ensure all UDP packets are complete. The working theory right now is that some UDP responses are truncated, causing the drift to be off.
Like the ntp package, the runit package has no explicit license. Although I presume you can fix that one easily :-)
We don't want to shell out (particularly to things that require root), so these should move to the textfile collector. We should provide scripts or binaries that can be run from cron.
I'm trying to cross-compile to the FreeBSD archs, and they're all failing. All the code and filenames seem to be correct, so I'm not sure what's going wrong. I'm using the official Go 1.5.1 package.
export GOOS=freebsd
export GOARCH=amd64
export GO15VENDOREXPERIMENT=1
go build github.com/prometheus/node_exporter
# github.com/prometheus/node_exporter/collector
../gopath/src/github.com/prometheus/node_exporter/collector/filesystem_common.go:34: undefined: defIgnoredMountPoints
../gopath/src/github.com/prometheus/node_exporter/collector/filesystem_common.go:62: undefined: filesystemLabelNames
../gopath/src/github.com/prometheus/node_exporter/collector/filesystem_common.go:68: undefined: filesystemLabelNames
../gopath/src/github.com/prometheus/node_exporter/collector/filesystem_common.go:74: undefined: filesystemLabelNames
../gopath/src/github.com/prometheus/node_exporter/collector/filesystem_common.go:80: undefined: filesystemLabelNames
../gopath/src/github.com/prometheus/node_exporter/collector/filesystem_common.go:86: undefined: filesystemLabelNames
../gopath/src/github.com/prometheus/node_exporter/collector/filesystem_common.go:100: c.GetStats undefined (type *filesystemCollector has no field or method GetStats)
There are currently two ways collectors can be configured: by using CLI parameters or by reading from a config file. I'd propose settling on a single option to make node_exporter easier to use.
The collectors using the config files are:
Collectors using parameters:
Additionally, the selection of collectors itself also uses parameters.
Given that the majority of configuration already happens via parameters, I propose replacing the config file options with parameters for now. Additionally, to make it easier to read all config options, all parameters get namespaced with their collector name. The dot-based namespacing scheme will be used to align node_exporter with other Prometheus components, like the Pushgateway or the server, which also use a dot.
Example:
Usage of node_exporter:
-attributes.list="": comma-separated list of static attributes
-filesystem.ignoredMountPoints="^/(sys|proc|dev)($|/)": Regexp of mount points to ignore for filesystem collector.
-ntp.server="": NTP server to use for ntp collector.
-megacli.command="megacli": Command to execute to retrieve information.
As we've now clarified that our general stance is that exporter auth should be done via a reverse proxy, we should remove the basic auth support from node exporter.
See #60 (comment)
Like the supervisord and runit collectors, it would be great to export service status from systemd. This shouldn't be too hard; there is already a go-systemd library here:
https://github.com/coreos/go-systemd
It can be used to get service information/status over D-Bus.
As some collectors are not available on all operating systems, they should not be part of the collectors.enabled flag by default. This list needs to be dynamic and remove any collectors which are not available/compiled.
Right now node exporter only tracks desired state, normal state, and state (by the way, what's the difference?). It's possible for a service to be restarted between scrapes, and no one will know about it from node exporter.
Would you consider reporting the CPU-related info from /proc/stat in ticks / sysconf(_SC_CLK_TCK) instead of raw ticks? It would make comparisons between VMs and across kernels more robust.
The Makefile doesn't find go on CentOS 7.
I worked around the issue by hacking Makefile.COMMON:
-GOCC ?= $(GOROOT)/bin/go
+GOCC ?= $(GOROOT)/bin/linux_amd64/go
Not sure how to do this generically.
The filesystem exporter should collect a boolean indicating if the filesystem is read-only or read-write.
This is a placeholder issue, since I'll implement the feature soon.
Currently, node_disk_sectors_read and node_disk_sectors_written are exported; however, what use are these, considering the sector size can differ per disk?
Would it not be a better idea to export node_disk_read_bytes and node_disk_written_bytes instead?
Hi,
I am using node_exporter and I would like to use the official prom/node_exporter container from docker hub.
The versioning system is not implemented, and currently prom/node_exporter:latest contains node_exporter v0.12.0-rc1. Would it be possible to push containers with a sane versioning scheme?
Just had some interesting times diagnosing a problem where this data would prove helpful to alert off of in the future. This issue is a note to myself to implement it.
Whenever I run make test on node_exporter on OS X, the test cases always pass, even if I purposely break a test case.
$ cd /tmp
$ git clone https://github.com/prometheus/node_exporter.git
$ cd node_exporter
Change the test case in loadavg_linux_test.go from 0.21 to 0.25:
func TestLoad(t *testing.T) {
load, err := parseLoad("0.21 0.37 0.39 1/719 19737")
if err != nil {
t.Fatal(err)
}
if want := 0.25; want != load {
t.Fatalf("want load %f, got %f", want, load)
}
}
Run the test cases using make:
$ make test
mkdir -p /private/tmp/node_exporter/.build/gopath/src/github.com/prometheus/
ln -s /private/tmp/node_exporter /private/tmp/node_exporter/.build/gopath/src/github.com/prometheus/node_exporter
GOPATH=/private/tmp/node_exporter/.build/gopath /usr/local/go/bin/go get -d
touch dependencies-stamp
GOPATH=/private/tmp/node_exporter/.build/gopath /usr/local/go/bin/go test ./...
? _/tmp/node_exporter [no test files]
ok _/tmp/node_exporter/collector 0.082s
? _/tmp/node_exporter/collector/ganglia [no test files]
$ which go
/usr/local/go/bin/go
$ go version
go version go1.4.2 darwin/amd64
$ uname -a
Darwin XXX.local 14.5.0 Darwin Kernel Version 14.5.0: Wed Jul 29 02:26:53 PDT 2015; root:xnu-2782.40.9~1/RELEASE_X86_64 x86_64
A trap for new users is that node_exporter defaults (due to glog) to logging to /tmp, but doesn't seem to set any sensible rotation limits. Since /tmp can be quite small and/or a tmpfs, this can be a sneaky problem, with log files going to unexpected places.
I think we should default -logtostderr to true, since the vast majority of users are going to (1) run it and see if it accepts requests, and then (2) run it and forget about it (if it goes down, Prometheus will tell us).
This is most likely due to the lack of /proc on OS X. The only collectors that worked for me were textfile, time, and mdadm (hehe).
The current code contains many golint violations and even a few go vet violations.
Since new collectors share boilerplate code with existing ones, contributors copy those violations into new code as well. The badness spreads.
In order to have reproducible builds, we need to vendor all dependencies.
It would be good to be able to expose the node_exporter metrics over HTTPS with basic auth. Bonus points for client certificate verification.
For now will probably use a reverse proxy for this but it does seem like something that makes sense to have baked in.
I want to create a setup where I guard node_exporter behind an nginx reverse proxy. Since this proxy should work directory-based and not vhost-based, I created the following proxy rule in nginx:
location /tirn-node {
proxy_pass http://192.168.1.9:9100;
include /etc/nginx/auth-basic.conf;
}
So when you access /tirn-node, you should get the landing page of node_exporter, and when you access /tirn-node/metrics, you get the metrics, which are then scraped by Prometheus.
To achieve this, I start node_exporter with the following CLI flag:
-web.telemetry-path="/tirn-node/metrics"
which works great when I start node_exporter by hand. The result here is:
But now I want to start it with systemd.
The corresponding line in the unit definition is:
ExecStart=/home/simonszu/go/src/github.com/prometheus/node_exporter/node_exporter -web.telemetry-path="/tirn-node/metrics"
I have checked with ps that this exact line is called by systemd.
But the behaviour is now:
So I am not able to access the metrics when I start node_exporter via systemd, even though it has exactly the same CLI flags as when I started it manually and could access the metrics.
Note: when I bypass the proxy and access node_exporter directly, all occurrences of host.tld/tirn-node are replaced by the plain ip:port URLs, which is the desired behaviour, so I really think it has something to do with node_exporter itself.
I am a little clueless about what to do now, so I'm filing this issue in the hope that you have an idea.
When there's a Docker container running on a host and the node exporter is not run as root, the filesystem collector fails with the following error:
INFO[0320] ERROR: filesystem failed after 0.002522s: Statfs on /var/lib/docker/aufs returned permission denied file=node_exporter.go line=97
df is smarter about which filesystems to show usage for. It first does a stat, then only a statfs for the ones that are actually "relevant". Determine what it does exactly and perhaps use the same strategy.
Hello,
As defined here debops/ansible-ferm#63 I use ferm to manage iptables.
As soon as I start a container with the --net=host parameter, I cannot update my iptables rules anymore. I get the error:
stderr: iptables v1.4.21: host/network `' not found
Is there a way to start the node_exporter container without --net=host?
Thanks
Hello,
I don't really plan to use the default node.html to monitor my system (more PromDash and custom dashboards), but just to let you know: when I browse to http://server_ip:9090/consoles/node.html and click on the node link, I get the following error:
error executing template __console_/node-overview.html: template: __console_/node-overview.html:38:109: executing "__console_/node-overview.html" at <query>: error calling query: parse error at char 72: unknown escape sequence U+0064 'd'
See #60 (comment)
panic: runtime error: slice bounds out of range
goroutine 35 [running]:
github.com/prometheus/node_exporter/collector.(*statCollector).Update(0xc20806e200, 0xc2080526c0, 0x0, 0x0)
/usr/local/src/node_exporter/.deps/gopath/src/github.com/prometheus/node_exporter/collector/stat.go:101 +0x74a
main.Execute(0x8aef80, 0x4, 0x7f716c18a4d0, 0xc20806e200, 0xc2080526c0)
/usr/local/src/node_exporter/node_exporter.go:89 +0x75
main.func·001(0x8aef80, 0x4, 0x7f716c18a4d0, 0xc20806e200)
/usr/local/src/node_exporter/node_exporter.go:62 +0x5b
created by main.NodeCollector.Collect
/usr/local/src/node_exporter/node_exporter.go:64 +0x1c7
Kernel 2.6.32-25-pve provides 8 instead of 9 values for each CPU.
Is there any option for quiet mode?
Node exporter sends a lot of data to syslog...
IIUC, sar can give you a lot of metrics.
Multiple simultaneous scrapes can result in bad data.
Hi
I'm trying to build the node_exporter master branch and get this error. Any hints?
go build node_exporter.go
package runtime: C source files not allowed when not using cgo or SWIG: atomic_amd64x.c defs.c float.c heapdump.c lfstack.c malloc.c mcache.c mcentral.c mem_linux.c mfixalloc.c mgc0.c mheap.c msize.c os_linux.c panic.c parfor.c proc.c runtime.c signal.c signal_amd64x.c signal_unix.c stack.c string.c sys_x86.c vdso_linux_amd64.c
The same happens with make:
make
mkdir -p /home/fk/work/src/go/src/github.com/prometheus/node_exporter/.build/gopath/src/github.com/prometheus/
ln -s /home/fk/work/src/go/src/github.com/prometheus/node_exporter /home/fk/work/src/go/src/github.com/prometheus/node_exporter/.build/gopath/src/github.com/prometheus/node_exporter
GOPATH=/home/fk/work/src/go/src/github.com/prometheus/node_exporter/.build/gopath /usr/local/go/bin/go get -d
package .
imports runtime: C source files not allowed when not using cgo or SWIG: atomic_amd64x.c defs.c float.c heapdump.c lfstack.c malloc.c mcache.c mcentral.c mem_linux.c mfixalloc.c mgc0.c mheap.c msize.c os_linux.c panic.c parfor.c proc.c runtime.c signal.c signal_amd64x.c signal_unix.c stack.c string.c sys_x86.c vdso_linux_amd64.c
Makefile.COMMON:93: recipe for target 'dependencies-stamp' failed
make: *** [dependencies-stamp] Error 1
regards f0
Hi,
having this filesystem here:
$ mount|grep docker
/dev/mapper/ubuntu--vg-docker on /var/lib/docker type btrfs (rw)
$ grep docker /proc/mounts
/dev/mapper/ubuntu--vg-docker /var/lib/docker btrfs rw,relatime,nospace_cache 0 0
/dev/mapper/ubuntu--vg-docker /var/lib/docker/btrfs btrfs rw,relatime,nospace_cache 0 0
But the node_exporter doesn't expose this mountpoint:
node_filesystem_free{env="prod",filesystem="/.dockerinit",instance="http://1.2.3.4:9080/metrics",job="node_exporter"} 350100799488.000000
node_filesystem_free{env="prod",filesystem="/etc/hosts",instance="http://1.2.3.4:9080/metrics",job="node_exporter"} 350100799488.000000
node_filesystem_free{env="prod",filesystem="/",instance="http://1.2.3.4:9080/metrics",job="node_exporter"} 350100799488.000000
node_filesystem_free{env="prod",filesystem="/etc/resolv.conf",instance="http://1.2.3.4:9080/metrics",job="node_exporter"} 120811970560.000000
node_filesystem_free{env="prod",filesystem="/etc/hostname",instance="http://1.2.3.4:9080/metrics",job="node_exporter"} 350100799488.000000
Hi,
The configuration file is not really documented anywhere I could find, and a quick look at the code did not really help. A small reference to it would be very helpful.
This would allow us to monitor the number of TCP and UDP connections and bandwidth
thanks
I have been running node_exporter version 0.8.0 for a while now, maybe two weeks, and I am now seeing a pretty high IO load. If I run iotop, I see it reading megabytes of data in a matter of seconds. It is basically using 99% of the IO on the system.
If I run lsof -p $PID, I see a lot of open file descriptors to /proc/stat, /proc/$PID/limits, /proc/meminfo, and /proc/$PID/stat.
In case this helps:
sudo lsof -p 25359 | awk '{print $9}' | sort | uniq -c | sort -n
1 *:7301
1 /dev/null
1 /proc/diskstats
1 /tmp/node_exporter.ip-XXXX.root.log.INFO.20150501-131008.25359
1 /usr/lib64/ld-2.19.so
1 /usr/lib64/libc-2.19.so
1 /usr/lib64/libnss_files-2.19.so
1 /usr/lib64/libpthread-2.19.so
1 /var/lib/prometheus/node_exporter
1 NAME
1 anon_inode
1 ip-XXXX.compute.internal:7301->ip-XXXX.compute.internal:49115
2 /
2 /proc/25359/mounts
2 /proc/25359/net/dev
2 socket
12 /proc/25359/limits
17 /proc/stat
19 /proc/25359/stat
20 /proc/meminfo
47 can't identify protocol
There are currently many metrics which don't follow our naming guidelines, e.g. they're missing _total suffixes or units like _bytes.
This will be a breaking change, but for the better. As node_exporter is one of the most popular exporters, it should also lead by example.
Currently it reports /etc/hostname, /etc/hosts, and /etc/resolv.conf as mount points inside the Docker container, and then reports / as just the filesystem that the Docker container is mounted on.
It would be nice to be able to report the mounts from the host machine, or at least have some configuration option to do that, similar to cAdvisor.
I see node_exporter choking on one host:
ERROR: netdev failed after 0.000096s: Couldn't get netstats: Invalid line in /proc/net/dev: lo:181005683161 224490607 0 0 0 0 0 0 181005683161 224490607 0 0 0 0 0 0
Looking at the code, I think it is because there is no space between the colon and the number; I presume this happens when the counters have more than 10 digits. This is the /proc/net/dev file on the same host:
$ cat /proc/net/dev
Inter-| Receive | Transmit
face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
lo:181033261002 224517415 0 0 0 0 0 0 181033261002 224517415 0 0 0 0 0 0
eth0:68210035552 520993275 0 0 0 0 0 0 9315587528 43451486 0 0 0 0 0 0
eth1: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
sit0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
It looks like data are written to /tmp by default. Is there a way to change this? I did not see any relevant flag in node_exporter -h.
Our production is a mix of Ubuntu 12.04 and Centos 5.9.
The kernel version on the Centos is 2.6.18.
Please advise
Thanks
/proc/vmstat has useful metrics to find out what the kernel memory system is up to.
We're running node_exporter on machines on which we use Docker. A lot of Docker containers get spawned and destroyed. Over time, node_exporter accumulates hundreds of entries for node_filesystem_*{filesystem="/var/lib/docker/devicemapper/mnt/*"} and node_network_*{device="veth*"}. I think that could be a reason why our Prometheus is using a lot of memory. I see two possible solutions: one is to filter those values completely, the other is to forget about old entries after some time. What do you think?
Hi,
I noticed that the ntp package that is used by node_exporter has no license, and so technically, it is not legal to use it at all. I have already filed a bug with the author: beevik/ntp#1
Thanks
Prometheus metrics should be appropriately namespaced. Native node_exporter metrics are exported with a node_ prefix, but Ganglia ones are exported as-is. We should probably prefix them with ganglia_.
Are there any plans to support windows nodes?
I'm monitoring a set of Aurora clusters. Because of how we move slaves between them, the only source of truth for which cluster a given slave belongs to is a Puppet configuration run, so I need a way to annotate the metrics exposed by node exporter with that info -- it's not part of the metric URI that Prometheus saves for an instance.
Is this module actually problematic for technical reasons? If so, is there a better way of doing this?
netstat -s reports on "active connection openings" and "passive connection openings", which is interesting to us: we just had issues because one of our apps wasn't properly using HTTP keep-alive and thus created a ton of new TCP connections each second. We could have seen that with those values, but it seems they are not reported at the moment. The source that netstat uses is /proc/net/snmp.