Comments (9)
The information is updated at every scrape. If the data doesn't change, then the underlying data has not changed. The os collector reads from os-release files on Linux.
This is likely a NixOS issue.
from node_exporter.
That's what I assumed; however, node_exporter doesn't appear to follow the new symlink to the updated os-release:
Pre-update:
jdavies@zima ~> ls -l /etc/os-release
lrwxrwxrwx 1 root root 22 Mar 20 15:54 /etc/os-release -> /etc/static/os-release
jdavies@zima ~> ls -l /etc/static/os-release
lrwxrwxrwx 2 root root 58 Jan 1 1970 /etc/static/os-release -> /nix/store/nz7p4wg27400l981d4czq2f6j82yn5d7-etc-os-release
jdavies@zima ~> cat /etc/os-release
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="23.11.20240319.fa9f817"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 23.11 (Tapir)"
SUPPORT_END="2024-06-30"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="23.11 (Tapir)"
VERSION_CODENAME=tapir
VERSION_ID="23.11"
jdavies@zima /e/nixos > sudo nix flake update
warning: updating lock file '/etc/nixos/flake.lock':
• Updated input 'nixpkgs':
'github:NixOS/nixpkgs/fa9f817df522ac294016af3d40ccff82f5fd3a63' (2024-03-19)
→ 'github:NixOS/nixpkgs/219951b495fc2eac67b1456824cc1ec1fd2ee659' (2024-03-28)
jdavies@zima ~> curl http://localhost:9100/metrics | grep os_info
# HELP node_os_info A metric with a constant '1' value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
# TYPE node_os_info gauge
node_os_info{build_id="23.11.20240319.fa9f817",id="nixos",id_like="",image_id="",image_version="",name="NixOS",pretty_name="NixOS 23.11 (Tapir)",variant="",variant_id="",version="23.11 (Tapir)",version_codename="tapir",version_id="23.11"} 1
Update system:
jdavies@zima ~> sudo nixos-rebuild switch
-> generates nixos-system-zima-23.11.20240328.219951b
jdavies@zima ~> cat /etc/os-release
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="23.11.20240328.219951b"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 23.11 (Tapir)"
SUPPORT_END="2024-06-30"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="23.11 (Tapir)"
VERSION_CODENAME=tapir
VERSION_ID="23.11"
jdavies@zima ~> ls -l /etc/static/os-release
lrwxrwxrwx 2 root root 58 Jan 1 1970 /etc/static/os-release -> /nix/store/mc3razq0wglc6hzik2dw8vpsvmskpxc6-etc-os-release
jdavies@zima ~> curl http://localhost:9100/metrics | grep os_info
# HELP node_os_info A metric with a constant '1' value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
# TYPE node_os_info gauge
node_os_info{build_id="23.11.20240319.fa9f817",id="nixos",id_like="",image_id="",image_version="",name="NixOS",pretty_name="NixOS 23.11 (Tapir)",variant="",variant_id="",version="23.11 (Tapir)",version_codename="tapir",version_id="23.11"} 1
jdavies@zima ~> sudo systemctl restart prometheus-node-exporter.service
jdavies@zima ~> curl http://localhost:9100/metrics | grep os_info
# HELP node_os_info A metric with a constant '1' value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
# TYPE node_os_info gauge
node_os_info{build_id="23.11.20240328.219951b",id="nixos",id_like="",image_id="",image_version="",name="NixOS",pretty_name="NixOS 23.11 (Tapir)",variant="",variant_id="",version="23.11 (Tapir)",version_codename="tapir",version_id="23.11"} 1
from node_exporter.
The code clearly shows that Update() is called and does an os.Open() at each scrape. Without knowing what kind of filesystem isolation is being done by NixOS or the systemd unit, it's impossible to say why you're having this issue.
from node_exporter.
~> cat /etc/systemd/system/prometheus-node-exporter.service
[Unit]
After=network.target
[Service]
Environment="LOCALE_ARCHIVE=/nix/store/d2nnadv7fdvains3rziq8lkpzw7anh9x-glibc-locales-2.38-44/lib/locale/locale-archive"
Environment="PATH=/nix/store/rk067yylvhyb7a360n8k1ps4lb4xsbl3-coreutils-9.3/bin:/nix/store/q7x6rjg6ya1gsg068fxj1sgf1k2n144n-findutils-4.9.0/bin:/nix/store/r1lp9kxlrc6h7vrba90gm6i94s31xvvx-gnugrep-3.11/bin:/nix/store/29w8hg0fis0pl3j4d3v0p02aicyw10lv-gnused-4.9/bin:/nix/store/dzp7d4k1d94s1x49p9171mvcsfyxr7bj-systemd-254.6/bin:/nix/store/rk067yylvhyb7a360n8k1ps4lb4xsbl3-coreutils-9.3/sbin:/nix/store/q7x6rjg6ya1gsg068fxj1sgf1k2n144n-findutils-4.9.0/sbin:/nix/store/r1lp9kxlrc6h7vrba90gm6i94s31xvvx-gnugrep-3.11/sbin:/nix/store/29w8hg0fis0pl3j4d3v0p02aicyw10lv-gnused-4.9/sbin:/nix/store/dzp7d4k1d94s1x49p9171mvcsfyxr7bj-systemd-254.6/sbin"
Environment="TZDIR=/nix/store/i6nk8llh46f2xjzc5h8j83kwwr1w3kx0-tzdata-2024a/share/zoneinfo"
CapabilityBoundingSet=
DeviceAllow=
DynamicUser=false
ExecStart=/nix/store/vwpkipjynqgwpp2pgyl5mxxffysn1c60-node_exporter-1.7.0/bin/node_exporter \
--web.listen-address 0.0.0.0:9100
Group=node-exporter
LockPersonality=true
MemoryDenyWriteExecute=true
NoNewPrivileges=true
PrivateDevices=true
PrivateTmp=true
ProtectClock=false
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectSystem=strict
RemoveIPC=true
Restart=always
RestrictAddressFamilies=AF_NETLINK
RestrictAddressFamilies=AF_INET
RestrictAddressFamilies=AF_INET6
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
RuntimeDirectory=prometheus-node-exporter
SystemCallArchitectures=native
UMask=0077
User=node-exporter
WorkingDirectory=/tmp
There's also nothing in my logs from node-exporter.
from node_exporter.
I don't think this can be anything on the node-exporter side, so closing for now.
from node_exporter.
I see the issue now. I wasn't reading the code carefully enough. We cache the mtime of the file and only update the data if the file mtime has changed.
But for some reason, on NixOS the stat is not following the symlink. On Ubuntu, this doesn't seem to be a problem.
Then again, maybe this is a systemd masking issue, as I don't see this problem when running under a normal shell.
from node_exporter.
"We cache the mtime of the file and only update the data if the file mtime is changed"
This is where the NixOS weirdness comes in: all files in the Nix store have an mtime of the Unix epoch (as it doesn't support extended attributes).
node_exporter would look at the symlink to the Nix store, see that the mtime hadn't changed because they're all epoch, and then just carry on reporting the old value (rather than follow the new symlink to the updated store path).
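To make the failure mode concrete, here is a minimal Go sketch of an mtime-gated cache going stale when a symlink is re-pointed between two files that share the same mtime. The cachedReader type and the demo function are hypothetical names made up for illustration and are not node_exporter's actual code. It assumes a Linux-like filesystem where symlinks and os.Chtimes work.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// cachedReader is a hypothetical mtime-gated cache, for illustration only.
type cachedReader struct {
	path    string
	modTime time.Time
	data    string
	loaded  bool
}

// Read re-reads the file only when the mtime reported by os.Stat
// (which follows symlinks) differs from the cached mtime.
func (c *cachedReader) Read() (string, error) {
	info, err := os.Stat(c.path)
	if err != nil {
		return "", err
	}
	if c.loaded && info.ModTime().Equal(c.modTime) {
		return c.data, nil // cache hit: file contents are never looked at
	}
	b, err := os.ReadFile(c.path)
	if err != nil {
		return "", err
	}
	c.data, c.modTime, c.loaded = string(b), info.ModTime(), true
	return c.data, nil
}

// demo creates two files with identical epoch mtimes, points a symlink
// at the first, reads, re-points the symlink at the second, reads again.
func demo() (string, string, error) {
	dir, err := os.MkdirTemp("", "osrelease-demo")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	epoch := time.Unix(0, 0) // mirrors the "Jan 1 1970" seen in the listing
	files := map[string]string{"old": "BUILD_ID=old\n", "new": "BUILD_ID=new\n"}
	for name, content := range files {
		p := filepath.Join(dir, name)
		if err := os.WriteFile(p, []byte(content), 0o644); err != nil {
			return "", "", err
		}
		if err := os.Chtimes(p, epoch, epoch); err != nil {
			return "", "", err
		}
	}

	link := filepath.Join(dir, "os-release")
	if err := os.Symlink(filepath.Join(dir, "old"), link); err != nil {
		return "", "", err
	}
	r := &cachedReader{path: link}
	before, err := r.Read()
	if err != nil {
		return "", "", err
	}

	// Simulate the rebuild: the symlink now targets a different file
	// whose mtime is identical to the previous target's.
	if err := os.Remove(link); err != nil {
		return "", "", err
	}
	if err := os.Symlink(filepath.Join(dir, "new"), link); err != nil {
		return "", "", err
	}
	after, err := r.Read()
	if err != nil {
		return "", "", err
	}
	return before, after, nil
}

func main() {
	before, after, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, "demo failed:", err)
		return
	}
	fmt.Printf("before swap: %q\nafter swap:  %q\n", before, after)
}
```

In this sketch the stat does follow the symlink to the new target, but because both targets report the same mtime, the second read still returns the old contents.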
from node_exporter.
Oh, that's a completely different problem. Maybe we need a --collector.os.cache flag so that --no-collector.os.cache can be used on NixOS.
Or we could just completely eliminate the whole mtime thing and read the file every time. I kinda feel like the whole mtime check is a bit of an over-optimization considering most of the time the whole file will be in page cache anyway.
from node_exporter.
@SuperQ yeah I think we should just read the file every time
from node_exporter.