Comments (9)

SuperQ commented on September 27, 2024

The information is updated at every scrape. If the data doesn't change then the underlying data has not changed. The os collector reads from os-release files on Linux.

This is likely a NixOS issue.

from node_exporter.

jpds commented on September 27, 2024

That's what I assumed, however node_exporter doesn't appear to follow the new symlink to the updated os-release:

Pre-update:

jdavies@zima ~> ls -l /etc/os-release
lrwxrwxrwx 1 root root 22 Mar 20 15:54 /etc/os-release -> /etc/static/os-release
jdavies@zima ~> ls -l /etc/static/os-release
lrwxrwxrwx 2 root root 58 Jan  1  1970 /etc/static/os-release -> /nix/store/nz7p4wg27400l981d4czq2f6j82yn5d7-etc-os-release
jdavies@zima ~> cat /etc/os-release
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="23.11.20240319.fa9f817"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 23.11 (Tapir)"
SUPPORT_END="2024-06-30"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="23.11 (Tapir)"
VERSION_CODENAME=tapir
VERSION_ID="23.11"
jdavies@zima /e/nixos > sudo nix flake update
warning: updating lock file '/etc/nixos/flake.lock':
• Updated input 'nixpkgs':
    'github:NixOS/nixpkgs/fa9f817df522ac294016af3d40ccff82f5fd3a63' (2024-03-19)
  → 'github:NixOS/nixpkgs/219951b495fc2eac67b1456824cc1ec1fd2ee659' (2024-03-28)

jdavies@zima ~>  curl http://localhost:9100/metrics | grep os_info
# HELP node_os_info A metric with a constant '1' value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
# TYPE node_os_info gauge
node_os_info{build_id="23.11.20240319.fa9f817",id="nixos",id_like="",image_id="",image_version="",name="NixOS",pretty_name="NixOS 23.11 (Tapir)",variant="",variant_id="",version="23.11 (Tapir)",version_codename="tapir",version_id="23.11"} 1

Update system:

jdavies@zima ~> sudo nixos-rebuild switch
-> generates nixos-system-zima-23.11.20240328.219951b
jdavies@zima ~> cat /etc/os-release
BUG_REPORT_URL="https://github.com/NixOS/nixpkgs/issues"
BUILD_ID="23.11.20240328.219951b"
DOCUMENTATION_URL="https://nixos.org/learn.html"
HOME_URL="https://nixos.org/"
ID=nixos
LOGO="nix-snowflake"
NAME=NixOS
PRETTY_NAME="NixOS 23.11 (Tapir)"
SUPPORT_END="2024-06-30"
SUPPORT_URL="https://nixos.org/community.html"
VERSION="23.11 (Tapir)"
VERSION_CODENAME=tapir
VERSION_ID="23.11"
jdavies@zima ~> ls -l /etc/static/os-release
lrwxrwxrwx 2 root root 58 Jan  1  1970 /etc/static/os-release -> /nix/store/mc3razq0wglc6hzik2dw8vpsvmskpxc6-etc-os-release
jdavies@zima ~>  curl http://localhost:9100/metrics | grep os_info
# HELP node_os_info A metric with a constant '1' value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
# TYPE node_os_info gauge
node_os_info{build_id="23.11.20240319.fa9f817",id="nixos",id_like="",image_id="",image_version="",name="NixOS",pretty_name="NixOS 23.11 (Tapir)",variant="",variant_id="",version="23.11 (Tapir)",version_codename="tapir",version_id="23.11"} 1
jdavies@zima ~> sudo systemctl restart prometheus-node-exporter.service
jdavies@zima ~> curl http://localhost:9100/metrics | grep os_info
# HELP node_os_info A metric with a constant '1' value labeled by build_id, id, id_like, image_id, image_version, name, pretty_name, variant, variant_id, version, version_codename, version_id.
# TYPE node_os_info gauge
node_os_info{build_id="23.11.20240328.219951b",id="nixos",id_like="",image_id="",image_version="",name="NixOS",pretty_name="NixOS 23.11 (Tapir)",variant="",variant_id="",version="23.11 (Tapir)",version_codename="tapir",version_id="23.11"} 1

SuperQ commented on September 27, 2024

The code clearly shows that Update() is called and does an os.Open() at each scrape. Without knowing what kind of filesystem isolation is being done by NixOS or the systemd unit, it's impossible to say why you're having this issue.

jpds commented on September 27, 2024

~> cat /etc/systemd/system/prometheus-node-exporter.service
[Unit]
After=network.target

[Service]
Environment="LOCALE_ARCHIVE=/nix/store/d2nnadv7fdvains3rziq8lkpzw7anh9x-glibc-locales-2.38-44/lib/locale/locale-archive"
Environment="PATH=/nix/store/rk067yylvhyb7a360n8k1ps4lb4xsbl3-coreutils-9.3/bin:/nix/store/q7x6rjg6ya1gsg068fxj1sgf1k2n144n-findutils-4.9.0/bin:/nix/store/r1lp9kxlrc6h7vrba90gm6i94s31xvvx-gnugrep-3.11/bin:/nix/store/29w8hg0fis0pl3j4d3v0p02aicyw10lv-gnused-4.9/bin:/nix/store/dzp7d4k1d94s1x49p9171mvcsfyxr7bj-systemd-254.6/bin:/nix/store/rk067yylvhyb7a360n8k1ps4lb4xsbl3-coreutils-9.3/sbin:/nix/store/q7x6rjg6ya1gsg068fxj1sgf1k2n144n-findutils-4.9.0/sbin:/nix/store/r1lp9kxlrc6h7vrba90gm6i94s31xvvx-gnugrep-3.11/sbin:/nix/store/29w8hg0fis0pl3j4d3v0p02aicyw10lv-gnused-4.9/sbin:/nix/store/dzp7d4k1d94s1x49p9171mvcsfyxr7bj-systemd-254.6/sbin"
Environment="TZDIR=/nix/store/i6nk8llh46f2xjzc5h8j83kwwr1w3kx0-tzdata-2024a/share/zoneinfo"
CapabilityBoundingSet=
DeviceAllow=
DynamicUser=false
ExecStart=/nix/store/vwpkipjynqgwpp2pgyl5mxxffysn1c60-node_exporter-1.7.0/bin/node_exporter \
  --web.listen-address 0.0.0.0:9100

Group=node-exporter
LockPersonality=true
MemoryDenyWriteExecute=true
NoNewPrivileges=true
PrivateDevices=true
PrivateTmp=true
ProtectClock=false
ProtectControlGroups=true
ProtectHome=true
ProtectHostname=true
ProtectKernelLogs=true
ProtectKernelModules=true
ProtectKernelTunables=true
ProtectSystem=strict
RemoveIPC=true
Restart=always
RestrictAddressFamilies=AF_NETLINK
RestrictAddressFamilies=AF_INET
RestrictAddressFamilies=AF_INET6
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
RuntimeDirectory=prometheus-node-exporter
SystemCallArchitectures=native
UMask=0077
User=node-exporter
WorkingDirectory=/tmp

There's also nothing in my logs from node-exporter.

discordianfish commented on September 27, 2024

I don't think this can be anything on the node-exporter side, so closing for now.

SuperQ commented on September 27, 2024

I see the issue now. I wasn't reading the code carefully enough. We cache the mtime of the file and only update the data if the file mtime is changed.

But, for some reason on NixOS the stat is not following the symlink. On Ubuntu, this doesn't seem to be a problem.

Although, maybe again this is a systemd masking issue as I don't see this problem running under a normal shell.

jpds commented on September 27, 2024

> We cache the mtime of the file and only update the data if the file mtime is changed

This is where the NixOS weirdness comes in: all files in the Nix store have an mtime of the Unix epoch (as it doesn't support extended attributes).

node_exporter would look at the symlink to the Nix store, see that the mtime hadn't changed because they're all epoch, and then just carry on reporting the old value (rather than follow the new symlink to the updated store path).
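
To illustrate the failure mode described above, here is a minimal Go sketch. It is not node_exporter's actual code: `mtimeCache`, `writeEpochFile`, and `demo` are hypothetical names, and the temporary files only stand in for Nix store paths. It assumes a platform where unprivileged symlinks work (e.g. Linux). The stat does follow the symlink, but because both targets carry the same epoch mtime, the cache never notices that the link was retargeted.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"time"
)

// mtimeCache is a hypothetical mtime-gated cache, not node_exporter's real type.
type mtimeCache struct {
	modTime time.Time
	loaded  bool
	data    string
}

// read returns the cached contents unless the file's mtime differs from the cached one.
func (c *mtimeCache) read(path string) (string, error) {
	info, err := os.Stat(path) // os.Stat follows symlinks to the target file
	if err != nil {
		return "", err
	}
	if c.loaded && info.ModTime().Equal(c.modTime) {
		return c.data, nil // mtime unchanged: serve the cached value
	}
	b, err := os.ReadFile(path)
	if err != nil {
		return "", err
	}
	c.modTime, c.data, c.loaded = info.ModTime(), string(b), true
	return c.data, nil
}

// writeEpochFile creates a file whose mtime is the Unix epoch, mimicking a Nix store path.
func writeEpochFile(path, content string) error {
	if err := os.WriteFile(path, []byte(content), 0o644); err != nil {
		return err
	}
	epoch := time.Unix(0, 0)
	return os.Chtimes(path, epoch, epoch)
}

// demo returns what the cache reports before and after the symlink is retargeted.
func demo() (before, after string, err error) {
	dir, err := os.MkdirTemp("", "osrelease")
	if err != nil {
		return "", "", err
	}
	defer os.RemoveAll(dir)

	oldFile := filepath.Join(dir, "old-os-release")
	newFile := filepath.Join(dir, "new-os-release")
	link := filepath.Join(dir, "os-release")

	if err = writeEpochFile(oldFile, "BUILD_ID=old\n"); err != nil {
		return "", "", err
	}
	if err = writeEpochFile(newFile, "BUILD_ID=new\n"); err != nil {
		return "", "", err
	}
	if err = os.Symlink(oldFile, link); err != nil {
		return "", "", err
	}

	c := &mtimeCache{}
	if before, err = c.read(link); err != nil {
		return "", "", err
	}

	// Retarget the symlink, as a system rebuild would.
	if err = os.Remove(link); err != nil {
		return "", "", err
	}
	if err = os.Symlink(newFile, link); err != nil {
		return "", "", err
	}

	after, err = c.read(link)
	return before, after, err
}

func main() {
	before, after, err := demo()
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("before: %q\nafter:  %q\n", before, after)
}
```

In this sketch both reads return the old contents, even though the symlink points at the new file after the retarget.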

SuperQ commented on September 27, 2024

Oh, that's a completely different problem. Maybe we need a --collector.os.cache flag so that --no-collector.os.cache can be used on NixOS.

Or we could just completely eliminate the whole mtime thing and read the file every time. I kinda feel like the whole mtime check is a bit of an over-optimization considering most of the time the whole file will be in page cache anyway.

discordianfish commented on September 27, 2024

@SuperQ yeah I think we should just read the file every time
