
Comments (8)

cheatfate commented on July 19, 2024

Could you please provide the execution arguments of the nimbus_beacon_node binary?

from nimbus-eth2.

catwith1hat commented on July 19, 2024

/home/user/nimbus_beacon_node --network=holesky --jwt-secret=/jwt.hex \
  --udp-port=${PORT} --tcp-port=${PORT} --data-dir=/nimbus-data \
  --el=http://localhost:8640 --enr-auto-update --metrics --metrics-port=11140 \
  --metrics-address=0.0.0.0 --rest --rest-port=4040 --rest-address=0.0.0.0 \
  --suggested-fee-recipient=${ADDR} --doppelganger-detection=off \
  --history=prune --web3-signer-update-interval=300 \
  --in-process-validators=false --payload-builder=true \
  --payload-builder-url=http://localhost:18920


cheatfate commented on July 19, 2024

Could you please confirm that you are using the latest release version of nimbus-eth2 (24.5.1) and that you have not specified the --listen-address CLI option?


kdeme commented on July 19, 2024

@catwith1hat Do you also have the full log line for:

WRN 2024-05-29 18:00:03.718+00:00 Peer count low, no new peers discovered  >

?

Normally this log line should also print the discovered_nodes, new_peers, current_peers and wanted_peers.


catwith1hat commented on July 19, 2024

@cheatfate I can confirm that I haven't set the --listen-address CLI option:

$ ps axu | grep beacon | grep listen-add | wc
      0       0       0

@kdeme: Sorry, I truncated the line while copying it. Here is the full line:

WRN 2024-05-29 18:00:03.718+00:00 Peer count low, no new peers discovered    topics="networking" discovered_nodes=0 new_peers=@[] current_peers=0 wanted_peers=160
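
For anyone comparing many of these lines, the key=value fields can be pulled out mechanically. This is a minimal sketch, assuming the fields always follow the `key=value` or `key="quoted value"` shape seen in the line above; `parse_fields` and `FIELD_RE` are hypothetical helper names, not part of Nimbus.

```python
import re

# Matches key=value pairs where the value is either a double-quoted string
# or a run of non-space characters (for example 0, @[] or 160).
FIELD_RE = re.compile(r'(\w+)=("[^"]*"|\S+)')


def parse_fields(line: str) -> dict:
    """Return the key=value fields of one log line as a dict of strings."""
    return {key: value.strip('"') for key, value in FIELD_RE.findall(line)}


# The full log line quoted in this thread.
line = (
    "WRN 2024-05-29 18:00:03.718+00:00 Peer count low, no new peers discovered    "
    'topics="networking" discovered_nodes=0 new_peers=@[] current_peers=0 wanted_peers=160'
)
fields = parse_fields(line)
print(fields["discovered_nodes"], fields["current_peers"], fields["wanted_peers"])
```

All values stay strings here; converting `discovered_nodes` and the peer counts to integers is left to the caller, since `new_peers` is not numeric.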


kdeme commented on July 19, 2024

> Sorry, I truncated the line while copying it. Here is the full line:

Thank you, that's very useful.

In terms of reproducing this: Do I understand correctly that you are running a Nimbus Docker container inside a QEMU VM?

edit:
As I cannot reproduce this myself, it would be really good to know the exact setup, as I think this will be some setup specific issue. I also think that the Discovery send failed msg="(101) Network is unreachable" log is not something that would show up when run from a Docker container, well, at least not in network-bridge mode.

Additional question, did all the Peer count low, no new peers discovered lines give discovered_nodes set to 0?


catwith1hat commented on July 19, 2024

> In terms of reproducing this: Do I understand correctly that you are running a Nimbus Docker container inside a QEMU VM?

That's correct.

> Additional question, did all the Peer count low, no new peers discovered lines give discovered_nodes set to 0?

Pretty much:

$ journalctl -u podman-nimbus-N4-I0.service --since="2024-05-29 16:00:00" --until="2024-05-29 20:00:00" | \
  grep -oE "Peer count low.*" | \
  awk '
$0 != prev {
    if (count > 1) {
        print count "x" prev
    }
    prev = $0
    count = 1
}
$0 == prev {
    count++
}
END {
    if (count > 1) {
        print count "x" prev
    }
}'
2xPeer count low, no new peers discovered    topics="networking" discovered_nodes=0 new_peers=@[] current_peers=36 wanted_peers=160
2xPeer count low, no new peers discovered    topics="networking" discovered_nodes=0 new_peers=@[] current_peers=35 wanted_peers=160
2xPeer count low, no new peers discovered    topics="networking" discovered_nodes=1 new_peers=@[] current_peers=0 wanted_peers=160
5xPeer count low, no new peers discovered    topics="networking" discovered_nodes=0 new_peers=@[] current_peers=0 wanted_peers=160
4xPeer count low, no new peers discovered    topics="networking" discovered_nodes=1 new_peers=@[] current_peers=0 wanted_peers=160
3xPeer count low, no new peers discovered    topics="networking" discovered_nodes=0 new_peers=@[] current_peers=0 wanted_peers=160
4xPeer count low, no new peers discovered    topics="networking" discovered_nodes=1 new_peers=@[] current_peers=0 wanted_peers=160
5xPeer count low, no new peers discovered    topics="networking" discovered_nodes=0 new_peers=@[] current_peers=0 wanted_peers=160
2xPeer count low, no new peers discovered    topics="networking" discovered_nodes=1 new_peers=@[] current_peers=0 wanted_peers=160
29xPeer count low, no new peers discovered    topics="networking" discovered_nodes=0 new_peers=@[] current_peers=0 wanted_peers=160

(the awk script counts how many times a line repeats)
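
One caveat when reading the counts: as written, both awk rules appear to fire on the first line of every run (the first rule sets prev to the current line, so the second rule then matches as well), which would make each printed count one higher than the actual number of consecutive repeats. The pattern in the output is unaffected. A minimal sketch of the same run-length counting without that quirk, using a hypothetical helper `count_runs` and made-up sample input:

```python
from itertools import groupby


def count_runs(lines):
    """Collapse consecutive identical lines into (count, line) pairs."""
    return [(sum(1 for _ in group), line) for line, group in groupby(lines)]


# Hypothetical, shortened sample input; real input would be the grep output.
sample = [
    "discovered_nodes=0 current_peers=0",
    "discovered_nodes=0 current_peers=0",
    "discovered_nodes=1 current_peers=0",
    "discovered_nodes=0 current_peers=0",
]
for count, line in count_runs(sample):
    print(f"{count}x{line}")
```

Unlike the awk version, this also reports runs of length one, which makes it easier to see every transition between discovered_nodes=0 and discovered_nodes=1.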

> As I cannot reproduce this myself, it would be really good to know the exact setup, as I think this will be some setup specific issue. I also think that the Discovery send failed msg="(101) Network is unreachable" log is not something that would show up when run from a Docker container, well, at least not in network-bridge mode.

You are probably right that one would not normally get Network is unreachable. My networking inside the Docker container is indeed non-standard: I use my personal equivalent of gluetun, which sets up VPN networking inside a container, and the Nimbus container attaches to that container's network. When I cut the link, OpenVPN dies and the default route for Nimbus disappears. However, the networking comes back, along with a new default route. So I do believe that getting stuck is still undesirable behavior on Nimbus's part.

But spotting Network is unreachable might be a great catch. Hypothesis: maybe Nimbus reacts differently to a socket error that returns "Network unreachable" than to a connection that simply times out? Maybe Nimbus treats the former as a more permanent error when connecting to a peer, such that this peer will never be tried again in the future?
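
To make the two cases in this hypothesis concrete: at the socket level, a missing route and a timeout surface as different errors, so code that wants to treat them differently can tell them apart. The sketch below only illustrates that distinction in Python; it says nothing about how Nimbus (which is written in Nim) actually handles either case, and `classify_connect_error` and `try_connect` are hypothetical names. On Linux, errno 101 is ENETUNREACH, which matches the "(101) Network is unreachable" message.

```python
import errno
import socket


def classify_connect_error(exc: OSError) -> str:
    """Label a connection error as a timeout, a routing failure, or other."""
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout"
    if exc.errno in (errno.ENETUNREACH, errno.EHOSTUNREACH):
        return "unreachable"
    return "other"


def try_connect(host: str, port: int, timeout: float = 5.0) -> str:
    """Attempt a TCP connection and report "ok" or the error class."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "ok"
    except OSError as exc:
        return classify_connect_error(exc)


# A missing default route typically shows up as ENETUNREACH.
print(classify_connect_error(OSError(errno.ENETUNREACH, "Network is unreachable")))
```

If the hypothesis holds, the interesting question is whether the "unreachable" branch leads to peers or bootstrap nodes being dropped for good, while the "timeout" branch leaves them eligible for retry.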

@kdeme If you still have your setup at hand, would you mind trying to reproduce this by removing the default route of the Docker container (or of the whole VM/host) for, let's say, 60 minutes?


catwith1hat commented on July 19, 2024

Two more data points:

  • I tried to reproduce this on a non-VM machine by removing the default route inside the Nimbus Docker container. In this case, the Nimbus mainnet instance came back to life.
  • On my VM-hosted Holesky node, this is reproducible. The node stays permanently stuck with 0 peers after I restore network connectivity to the VM. I also double-checked that the Nimbus container does indeed have network connectivity, by running podman exec -it nimbus /bin/bash and then using bash's built-in TCP support to connect to some random website.
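
The connectivity check in the second point can also be scripted. This is a minimal sketch, assuming a Python interpreter is available wherever the check runs (which may not be the case inside the Nimbus container); `can_connect` is a hypothetical helper that does roughly what a bash redirection to /dev/tcp/HOST/PORT does.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, DNS failures and missing routes.
        return False
```

For example, `can_connect("example.org", 443)` (the host is just a placeholder) returning True while Nimbus still reports current_peers=0 would support the observation that the container has connectivity but the node does not recover.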

