Comments (6)

stelfrag commented on June 24, 2024

@davidcba1 Thanks for the report. We are investigating the issue.

from netdata.

vkalintiris commented on June 24, 2024

> Not sure what causes it.. just leaving it running and it stops working

Out of curiosity, how often/predictably can you reproduce this?

davidcba1 commented on June 24, 2024

I can't. I just notice that monitoring isn't running and restart it. I just did this and can see it's happened 4 times in the past ~2 weeks. This host retrieves metrics from all the other hosts, so that's the only difference.

# journalctl -u netdata | grep -C10 -i 'METRIC: refcount is 0 (zero or negative) during release'
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: recalculating tier 0 retention for 12672 metrics starting with datafile 7052
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: migrated journal file '/var/cache/netdata/dbengine/journalfile-1-0000007086.njfv2', file size 1144716
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: updating tier 0 metrics registry retention for 12672 metrics
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: deleting data file '/var/cache/netdata/dbengine/datafile-1-0000007051.ndf'.
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: deleting data and journal files to maintain disk quota
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: deleted journal file "/var/cache/netdata/dbengine/journalfile-1-0000007051.njf".
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: deleted journal file "/var/cache/netdata/dbengine/journalfile-1-0000007051.njfv2".
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: deleted data file "/var/cache/netdata/dbengine/datafile-1-0000007051.ndf".
Feb 22 22:56:25 test_host netdata[972574]: DBENGINE: reclaimed 7307372 bytes of disk space.
Feb 22 23:01:20 test_host netdata[972574]: Host 'AWS Host' with machine guid '67f378a4-cf90-11ee-a877-06708782dc81' is obsolete - cleaning up.
Feb 22 23:01:20 test_host netdata[972574]: METRIC: refcount is 0 (zero or negative) during release
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0xbfe50)[0x559f6946be50]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x370f4b)[0x559f6971cf4b]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x371f7d)[0x559f6971df7d]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x36787d)[0x559f6971387d]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x265c16)[0x559f69611c16]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x266820)[0x559f69612820]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x8b3d9)[0x559f694373d9]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x8cd9c)[0x559f69438d9c]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x8d6ed)[0x559f694396ed]
Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x265d44)[0x559f69611d44]
--
Feb 26 03:03:22 test_host netdata[2542786]: Deleting chart 'systemd_insights-client-results.pids_current' ('systemd_insights-client-results.pids_current_3') from disk...
Feb 26 03:03:22 test_host netdata[2542786]: NETDATA SHUTDOWN: in    2803 ms, clean rrdhost database - next: stop aclk threads
Feb 26 03:03:22 test_host netdata[2542786]: NETDATA SHUTDOWN: in       0 ms, stop aclk threads - next: stop all remaining worker threads
Feb 26 03:03:22 test_host netdata[2542786]: NETDATA SHUTDOWN: in       0 ms, stop all remaining worker threads - next: cancel main threads
Feb 26 03:03:22 test_host netdata[2542786]: EXIT: Stopping main thread: DYNCFG
Feb 26 03:03:22 test_host netdata[2542786]: Waiting 1 threads to finish...
Feb 26 03:03:22 test_host netdata[2542786]: cleaning up...
Feb 26 03:03:23 test_host netdata[2542786]: All threads finished.
Feb 26 03:03:23 test_host netdata[2542786]: NETDATA SHUTDOWN: in     100 ms, cancel main threads - next: flush dbengine tiers
Feb 26 03:03:24 test_host netdata[2542786]: NETDATA SHUTDOWN: in    1455 ms, flush dbengine tiers - next: stop collection for all hosts
Feb 26 03:03:24 test_host netdata[2542786]: METRIC: refcount is 0 (zero or negative) during release
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0xbfe50)[0x5585464d1e50]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x370f4b)[0x558546782f4b]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x371f7d)[0x558546783f7d]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x36787d)[0x55854677987d]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x265c16)[0x558546677c16]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x271f90)[0x558546683f90]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x26c30d)[0x55854667e30d]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x26c398)[0x55854667e398]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x6e569)[0x558546480569]
Feb 26 03:03:24 test_host netdata[2542786]: /usr/sbin/netdata(+0x70bcc)[0x558546482bcc]
--
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: indexing file '/var/cache/netdata/dbengine/journalfile-1-0000007582.njfv2': extents 202, metrics 12925, pages 12928
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: migrated journal file '/var/cache/netdata/dbengine/journalfile-1-0000007582.njfv2', file size 1144896
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: recalculating tier 0 retention for 12733 metrics starting with datafile 7548
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: updating tier 0 metrics registry retention for 12733 metrics
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: deleting data file '/var/cache/netdata/dbengine/datafile-1-0000007547.ndf'.
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: deleting data and journal files to maintain disk quota
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: deleted journal file "/var/cache/netdata/dbengine/journalfile-1-0000007547.njf".
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: deleted journal file "/var/cache/netdata/dbengine/journalfile-1-0000007547.njfv2".
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: deleted data file "/var/cache/netdata/dbengine/datafile-1-0000007547.ndf".
Feb 26 09:16:38 test_host netdata[1571916]: DBENGINE: reclaimed 7325200 bytes of disk space.
Feb 26 09:22:54 test_host netdata[1571916]: METRIC: refcount is 0 (zero or negative) during release
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0xbfe50)[0x557ea5bc2e50]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x370f4b)[0x557ea5e73f4b]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x371f7d)[0x557ea5e74f7d]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x36787d)[0x557ea5e6a87d]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x265c16)[0x557ea5d68c16]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x266820)[0x557ea5d69820]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x8b3d9)[0x557ea5b8e3d9]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x8cd9c)[0x557ea5b8fd9c]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x8d6ed)[0x557ea5b906ed]
Feb 26 09:22:54 test_host netdata[1571916]: /usr/sbin/netdata(+0x265d44)[0x557ea5d68d44]
--
Feb 27 22:59:10 test_host netdata[1909088]: Deleting chart header file '/var/cache/netdata/systemd_dnf-makecache.throttle_serviced_ops/main.db'.
Feb 27 22:59:10 test_host netdata[1909088]: Deleting dimension file '/var/cache/netdata/systemd_dnf-makecache.throttle_serviced_ops/read.db'.
Feb 27 22:59:10 test_host netdata[1909088]: Deleting dimension file '/var/cache/netdata/systemd_dnf-makecache.throttle_serviced_ops/write.db'.
Feb 27 22:59:10 test_host netdata[1909088]: Deleting empty directory '/var/cache/netdata/systemd_dnf-makecache.throttle_serviced_ops'
Feb 27 22:59:10 test_host netdata[1909088]: Deleting chart 'systemd_dnf-makecache.pids_current' ('systemd_dnf-makecache.pids_current_6') from disk...
Feb 27 22:59:10 test_host netdata[1909088]: Deleting chart header file '/var/cache/netdata/systemd_dnf-makecache.pids_current/main.db'.
Feb 27 22:59:10 test_host netdata[1909088]: Deleting dimension file '/var/cache/netdata/systemd_dnf-makecache.pids_current/pids.db'.
Feb 27 22:59:10 test_host netdata[1909088]: Deleting empty directory '/var/cache/netdata/systemd_dnf-makecache.pids_current'
Feb 27 23:01:10 test_host netdata[1909088]: Host 'AWS Host' with machine guid '7bdb3c90-d106-11ee-919f-0269db087c85' is obsolete - cleaning up.
Feb 27 23:01:10 test_host netdata[1909088]: Host 'AWS host' with machine guid '67f378a4-cf90-11ee-a877-06708782dc81' is obsolete - cleaning up.
Feb 27 23:01:10 test_host netdata[1909088]: METRIC: refcount is 0 (zero or negative) during release
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0xbfe50)[0x557699a5de50]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x370f4b)[0x557699d0ef4b]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x371f7d)[0x557699d0ff7d]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x36787d)[0x557699d0587d]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x265c16)[0x557699c03c16]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x266820)[0x557699c04820]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x8b3d9)[0x557699a293d9]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x8cd9c)[0x557699a2ad9c]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x8d6ed)[0x557699a2b6ed]
Feb 27 23:01:10 test_host netdata[1909088]: /usr/sbin/netdata(+0x265d44)[0x557699c03d44]
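The frames above are unsymbolized, so they only show offsets into the binary. For anyone who wants to turn them into function names, here is a minimal sketch that pulls the `(+0x...)` offsets out of the journal output and builds an `addr2line` command for them. It assumes debug symbols for the exact same netdata build are available and that the offsets can be passed to `addr2line` directly, which may not hold on every setup. The helper names `extract_frames` and `addr2line_commands` are made up for this example.

```python
import re

# Matches frames such as "/usr/sbin/netdata(+0xbfe50)[0x559f6946be50]".
# Frames that already carry a symbol, e.g. "libc.so.6(clone+0x6d)", are skipped.
FRAME_RE = re.compile(
    r"(?P<binary>/\S+?)\(\+(?P<offset>0x[0-9a-fA-F]+)\)\[(?P<addr>0x[0-9a-fA-F]+)\]"
)


def extract_frames(log_text):
    """Return (binary, offset) pairs in the order they appear in the log."""
    return [(m.group("binary"), m.group("offset")) for m in FRAME_RE.finditer(log_text)]


def addr2line_commands(frames):
    """Group offsets per binary and build one addr2line argument list for each."""
    by_binary = {}
    for binary, offset in frames:
        by_binary.setdefault(binary, []).append(offset)
    # -f prints function names, -C demangles, -e selects the binary to inspect.
    return [
        ["addr2line", "-f", "-C", "-e", binary, *offsets]
        for binary, offsets in by_binary.items()
    ]


# Two frames copied from the log above, used only as sample input.
sample = (
    "Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0xbfe50)[0x559f6946be50]\n"
    "Feb 22 23:01:20 test_host netdata[972574]: /usr/sbin/netdata(+0x370f4b)[0x559f6971cf4b]\n"
)
for cmd in addr2line_commands(extract_frames(sample)):
    print(" ".join(cmd))
```

The printed command can be run on a host that has the matching binary and its debug info installed. The sketch only builds the command and does not run it.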

hvulin commented on June 24, 2024

Just to add to the count, I have the same problem:
time=2024-03-13T11:00:51.384+02:00 comm=netdata source=daemon level=alert tid=32420 thread=SERVICE msg="METRIC: refcount is 0 (zero or negative) during release"
/usr/sbin/netdata(+0xc832e)[0x55b35a11e32e]
/usr/sbin/netdata(+0x59b6f)[0x55b35a0afb6f]
/usr/sbin/netdata(+0x367632)[0x55b35a3bd632]
/usr/sbin/netdata(+0x35caa8)[0x55b35a3b2aa8]
/usr/sbin/netdata(+0x2662ba)[0x55b35a2bc2ba]
/usr/sbin/netdata(+0x266d9a)[0x55b35a2bcd9a]
/usr/sbin/netdata(+0x96dd3)[0x55b35a0ecdd3]
/usr/sbin/netdata(+0x9839f)[0x55b35a0ee39f]
/usr/sbin/netdata(+0x98dad)[0x55b35a0eedad]
/usr/sbin/netdata(+0x2663d0)[0x55b35a2bc3d0]
/usr/sbin/netdata(+0x2767f8)[0x55b35a2cc7f8]
/usr/sbin/netdata(+0x96dd3)[0x55b35a0ecdd3]
/usr/sbin/netdata(+0x9839f)[0x55b35a0ee39f]
/usr/sbin/netdata(+0x98dad)[0x55b35a0eedad]
/usr/sbin/netdata(+0x273537)[0x55b35a2c9537]
/usr/sbin/netdata(+0x26a0b6)[0x55b35a2c00b6]
/usr/sbin/netdata(+0x7dc57)[0x55b35a0d3c57]
/usr/sbin/netdata(+0xd5790)[0x55b35a12b790]
/lib64/libpthread.so.0(+0x7ea5)[0x7f7927884ea5]
/lib64/libc.so.6(clone+0x6d)[0x7f7926d8d8dd]

The web component dies and port 19999 is no longer open, although the process is still running.
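Since the process stays alive while the port goes away, checking for the process alone will not catch this. A rough sketch of a port check that could be used from a watchdog script follows. The function name `port_is_open`, the `127.0.0.1` host and the 2-second timeout are assumptions for illustration, and 19999 is simply the port mentioned above.

```python
import socket


def port_is_open(host="127.0.0.1", port=19999, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError (refused, timed out, unreachable) on failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    state = "open" if port_is_open() else "closed"
    print(f"port 19999 is {state}")
```

This only tells you whether something accepts TCP connections on that port, not whether the dashboard actually responds.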

hugovalente-pm commented on June 24, 2024

I believe #17239 will address this issue.

ilyam8 commented on June 24, 2024

Yes, should be fixed in #17239. We will do a patch release (v1.45.1) later this week.
