Comments (16)

jamilbk commented on June 1, 2024

I think this is actually because the portal is now sending all candidates to the Gateways, which might be way too many, and prevent STUN from working properly.

At least in theory, it shouldn't. We have a limit of 300 candidate pairs that we test (upped from the default of 100) but they should be eliminated by relay-first so STUN should not be affected by that.

Connections only work Relayed. Client is main.

How are you inferring this from the logs? It says it established a direct connection.

The Direct connection is to a Relay though. The peer IP is the Relay.

Ping in this scenario is nearly 300ms, versus the 80ms I get after restarting the gateway, when the Direct peer is actually the gateway's IP.

Edit: will see if I can replicate. May have missed capturing the relevant logs

from firezone.

AndrewDryga commented on June 1, 2024

@jamilbk yes, it was fixed in the PR that added relay presence broadcasting.

mdp commented on June 1, 2024

Yep, I can confirm that this is working now and I'm not seeing the decapsulation issues. Thanks

jamilbk commented on June 1, 2024

I think this is actually because the portal is now sending all candidates to the Gateways, which might be way too many, and prevent STUN from working properly.

jamilbk commented on June 1, 2024

cc @AndrewDryga

jamilbk commented on June 1, 2024

Possibly related #4290

jamilbk commented on June 1, 2024

Restarting the gateway fixed the issue, for now.

jamilbk commented on June 1, 2024

Is it possible that b33328a caused this by removing all filtering of Relay candidates on the Gateways? We probably need the related fix in the portal to send only nearby Relays to Gateways, and this will probably get fixed with that.

thomaseizinger commented on June 1, 2024

I think this is actually because the portal is now sending all candidates to the Gateways, which might be way too many, and prevent STUN from working properly.

At least in theory, it shouldn't. We have a limit of 300 candidate pairs that we test (upped from the default of 100) but they should be eliminated by relay-first so STUN should not be affected by that.

Connections only work Relayed. Client is main.

How are you inferring this from the logs? It says it established a direct connection.
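
To illustrate why the number of candidates matters for the pair limit mentioned above, here is a rough sketch of how candidate pairs multiply. It is not the str0m or snownet implementation: `candidate_pairs`, `pairs_to_test`, and `MAX_PAIRS` are hypothetical names, the cap of 300 is simply the figure quoted in this thread, and pairing is simplified to "same IP family only".

```rust
use std::net::IpAddr;

/// Hypothetical cap on tested pairs, taken from the figure quoted in this thread.
const MAX_PAIRS: usize = 300;

/// Counts the pairs formed from local and remote candidate addresses.
/// Simplification: a pair is formed only when both addresses share an IP family.
fn candidate_pairs(local: &[IpAddr], remote: &[IpAddr]) -> usize {
    local
        .iter()
        .flat_map(|l| remote.iter().map(move |r| (l, r)))
        .filter(|(l, r)| l.is_ipv4() == r.is_ipv4())
        .count()
}

/// Number of pairs that would actually be tested once the cap is applied.
fn pairs_to_test(local: &[IpAddr], remote: &[IpAddr]) -> usize {
    candidate_pairs(local, remote).min(MAX_PAIRS)
}
```

Because the pair count grows with the product of both candidate lists, every extra Relay candidate sent to a Gateway adds a pair for each matching candidate on the other side, which is why the cap is reached quickly.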

thomaseizinger commented on June 1, 2024

Packet is a STUN message but no agent handled it

You are seeing this because we invalidate all other candidates as we move sockets.
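
For readers unfamiliar with this warning, the following is a minimal sketch of how a packet can be recognised as STUN and still end up unhandled. It is not the firezone code: `looks_like_stun`, `dispatch`, and `Outcome` are hypothetical names, and agents are modelled as plain functions that return `true` when they accept a packet. The header checks follow the STUN message layout from RFC 5389 (20-byte header, two leading zero bits, magic cookie `0x2112A442`).

```rust
/// Magic cookie carried in bytes 4..8 of every RFC 5389 STUN message.
const STUN_MAGIC_COOKIE: [u8; 4] = [0x21, 0x12, 0xA4, 0x42];
/// Fixed size of the STUN header in bytes.
const STUN_HEADER_LEN: usize = 20;

#[derive(Debug, PartialEq)]
enum Outcome {
    Handled,
    StunButNoAgent,
    NotStun,
}

/// Returns true if the datagram has the shape of a single STUN message.
fn looks_like_stun(packet: &[u8]) -> bool {
    if packet.len() < STUN_HEADER_LEN {
        return false;
    }
    // The two most significant bits of the first byte are always zero.
    if packet[0] & 0xC0 != 0 {
        return false;
    }
    // The length field excludes the header and is a multiple of 4.
    let body_len = u16::from_be_bytes([packet[2], packet[3]]) as usize;
    if body_len % 4 != 0 || body_len + STUN_HEADER_LEN != packet.len() {
        return false;
    }
    packet[4..8] == STUN_MAGIC_COOKIE
}

/// Offers a packet to each agent in turn (hypothetical agent model).
fn dispatch(packet: &[u8], agents: &[fn(&[u8]) -> bool]) -> Outcome {
    if !looks_like_stun(packet) {
        return Outcome::NotStun;
    }
    if agents.iter().any(|accepts| accepts(packet)) {
        Outcome::Handled
    } else {
        Outcome::StunButNoAgent
    }
}
```

With an empty agent list, any well-formed STUN packet falls through to `StunButNoAgent`, which corresponds to the `num_agents = 0` case reported in the warning.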

AndrewDryga commented on June 1, 2024

@jamilbk at standup I said that the portal was sending all relays, but since the patch it just sends the 2 nearest ones; the bug was there all the time. If it needs all relays to work, I'd say there is a bug in the gateway.

jamilbk commented on June 1, 2024

@AndrewDryga just so I'm following -- portal sends only the nearest 2 now?

jamilbk commented on June 1, 2024

This should be fixed now that we invalidate candidates on both sides.

mdp commented on June 1, 2024

This looks like it's in 1.0.0 (#4685), but I'm running into a similar issue on my Linux gateway. Logs from the gateway here:

2024-04-30T15:11:35.567349Z  WARN handle_timeout{id=980cb9e2-82a4-4de0-94b2-c5a313df2e39}: boringtun::noise::timers: HANDSHAKE(REKEY_TIMEOUT)
2024-04-30T15:11:35.692200Z DEBUG handle_timeout{relay=Some(35.239.126.166:3478)}: snownet::allocation: Request timed out after 3.375s, re-sending id=TransactionId(0x0FA5F2508203396737B2D5A5) method=binding dst=[2600:1900:4000:a47e:0:d::]:3478
2024-04-30T15:11:35.692271Z DEBUG handle_timeout{relay=Some(34.23.241.97:3478)}: snownet::allocation: Request timed out after 3.375s, re-sending id=TransactionId(0x58D53A86CD55B61FBCF236C8) method=binding dst=[2600:1900:4020:8f68:0:27::]:3478
2024-04-30T15:11:40.754005Z DEBUG handle_timeout{relay=Some(35.239.126.166:3478)}: snownet::allocation: Request timed out after 5.0625s, re-sending id=TransactionId(0x0FA5F2508203396737B2D5A5) method=binding dst=[2600:1900:4000:a47e:0:d::]:3478
2024-04-30T15:11:40.754095Z DEBUG handle_timeout{relay=Some(34.23.241.97:3478)}: snownet::allocation: Request timed out after 5.0625s, re-sending id=TransactionId(0x58D53A86CD55B61FBCF236C8) method=binding dst=[2600:1900:4020:8f68:0:27::]:3478
2024-04-30T15:11:40.828057Z DEBUG handle_timeout{relay=Some(35.239.126.166:3478)}: snownet::allocation: Request timed out after 11.390625s, re-sending id=TransactionId(0x5F58780FDE3657E2AC45AF3C) method=binding dst=[2600:1900:4000:a47e:0:d::]:3478
2024-04-30T15:11:40.828108Z DEBUG handle_timeout{relay=Some(34.94.40.209:3478)}: snownet::allocation: Request timed out after 11.390625s, re-sending id=TransactionId(0x25B8F240BF8CDEE961D928EA) method=binding dst=[2600:1900:4120:691:0:27::]:3478
2024-04-30T15:11:41.067570Z  INFO handle_timeout{id=980cb9e2-82a4-4de0-94b2-c5a313df2e39}: str0m::ice_::agent: State change (got new possible): Connected -> Checking
2024-04-30T15:11:41.556670Z  INFO handle_timeout{id=980cb9e2-82a4-4de0-94b2-c5a313df2e39}: str0m::ice_::agent: State change (no possible pairs): Checking -> Disconnected
2024-04-30T15:11:41.556750Z  INFO handle_timeout{id=980cb9e2-82a4-4de0-94b2-c5a313df2e39}: snownet::node: Connection failed (ICE timeout)
2024-04-30T15:11:41.988542Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:43.177407Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet was not accepted by any wireguard tunnel; num_tunnels = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=216
2024-04-30T15:11:43.493475Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:43.988540Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:45.001547Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:46.490520Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:47.991575Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:48.347919Z DEBUG handle_timeout{relay=Some(35.239.126.166:3478)}: snownet::allocation: Request timed out after 7.59375s, re-sending id=TransactionId(0x0FA5F2508203396737B2D5A5) method=binding dst=[2600:1900:4000:a47e:0:d::]:3478
2024-04-30T15:11:48.347967Z DEBUG handle_timeout{relay=Some(34.23.241.97:3478)}: snownet::allocation: Request timed out after 7.59375s, re-sending id=TransactionId(0x58D53A86CD55B61FBCF236C8) method=binding dst=[2600:1900:4020:8f68:0:27::]:3478
2024-04-30T15:11:48.612463Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet was not accepted by any wireguard tunnel; num_tunnels = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=32
2024-04-30T15:11:49.498574Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:50.991735Z  WARN firezone_tunnel::gateway: Failed to decapsulate incoming packet: Packet is a STUN message but no agent handled it; num_agents = 0 local=172.17.0.2:36528 from=xxx.xx.xx.xx:55240 num_bytes=92
2024-04-30T15:11:53.837464Z DEBUG accept_connection{id=980cb9e2-82a4-4de0-94b2-c5a313df2e39}:refresh{relay=Some(35.239.126.166:3478)}: snownet::allocation: Refreshing allocation
2024-04-30T15:11:53.837514Z DEBUG accept_connection{id=980cb9e2-82a4-4de0-94b2-c5a313df2e39}:refresh{relay=Some(34.23.241.97:3478)}: snownet::allocation: Refreshing allocation

This is running 1.0.1 via docker. Happy to provide any more info on my end.
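
A note on the timeouts in the log excerpt: the reported values (3.375s, 5.0625s, 7.59375s, 11.390625s) are exactly 1.5 raised to the powers 3 through 6, which is consistent with an exponential backoff using a factor of 1.5. The sketch below only reproduces that arithmetic. The factor is inferred from the logs rather than confirmed from the snownet source, and `backoff_timeout` and `BACKOFF_FACTOR` are hypothetical names.

```rust
use std::time::Duration;

/// Assumed backoff factor, inferred from the timeouts in the log excerpt.
const BACKOFF_FACTOR: f64 = 1.5;

/// Returns BACKOFF_FACTOR raised to `exponent`, expressed as a duration in seconds.
fn backoff_timeout(exponent: u32) -> Duration {
    // Repeated multiplication keeps these small powers of 1.5 exact in f64.
    let secs = (0..exponent).fold(1.0_f64, |acc, _| acc * BACKOFF_FACTOR);
    Duration::from_secs_f64(secs)
}
```

Calling `backoff_timeout` with exponents 3, 4, 5, and 6 yields the four timeout values seen in the log lines.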

jamilbk commented on June 1, 2024

Thanks for the report @mdp -- we're investigating a possible partial outage of a couple Relay servers at the moment: https://firezone.statuspage.io/

jamilbk commented on June 1, 2024

@mdp We probably need to tune the log level on those a bit further. Sometimes you'll get a few expected ones, for example when you roam networks. #4872
