Comments (8)
What do you think is the bug here? That Cilium generates a packet of 1508 bytes? If yes, what else should it do (keep in mind that we can't avoid encapsulation)?
Can you either drop Cilium's MTU to 1492 or increase your underlying network to >= 1508?
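The arithmetic behind these two options can be sketched as follows. This is a minimal illustration, not Cilium's actual MTU logic: the 8-byte overhead is inferred from the numbers in this thread (1508 - 1500), and the function names are hypothetical.

```python
# Assumption: the extra header space is 8 bytes, inferred from the thread
# (1508-byte packets observed on a network with a 1500-byte MTU).
OVERHEAD_BYTES = 1508 - 1500


def max_inner_mtu(link_mtu: int, overhead: int = OVERHEAD_BYTES) -> int:
    """Largest MTU Cilium could use so packets still fit the link once the extra header is added."""
    if overhead < 0 or link_mtu <= overhead:
        raise ValueError("link MTU must be larger than a non-negative overhead")
    return link_mtu - overhead


def required_link_mtu(inner_mtu: int, overhead: int = OVERHEAD_BYTES) -> int:
    """Smallest underlying network MTU that carries inner_mtu-sized packets plus the extra header."""
    if overhead < 0 or inner_mtu <= 0:
        raise ValueError("inner MTU must be positive and overhead non-negative")
    return inner_mtu + overhead


print(max_inner_mtu(1500))      # 1492: option 1, lower Cilium's MTU
print(required_link_mtu(1500))  # 1508: option 2, raise the underlying network MTU
```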
from cilium.
I'm not sure if it's possible via ebpf, but cilium should fragment the packet, like any other router would in a scenario like this.
But I would also like to suggest what it shouldn't do: Break random network connections without any warning, and put the task of debugging onto the end-user. This issue took a significant amount of time to figure out (SIP connections breaking randomly, good luck), and I would at least expect some form of mention about this in the documentation, that "you should watch out for this if you are doing this", even if the solution is simply to increase/decrease the MTU in certain places. Make it appear in a red alert box on the native routing documentation page.
I could increase the MTU on the hosts, and that's what I did in the end, but I feel like this is more of a workaround than a solution. It turns out the host network interfaces were dropping the packets, because they automatically set the same MRU as the MTU - and also on a hardware level, so we couldn't even see those packets with tcpdump. However, now the hosts technically could send 1508 byte packets to other network entities, which is not ideal (cilium's MTU was left at 1500, which meant we had to take it out from automatic MTU mode), so we implemented another workaround: a bond interface on top of the physical one with 1500 MTU. I hope you see the absurdity of this situation. In my opinion, you shouldn't have to manually set the MTU anywhere, anytime.
Sorry if this comment came out as aggressive, I really like what you do, and I like cilium as a software as well, I'm just through a few hours of debugging, and now a bit emotionally attached to this case.
from cilium.
Maybe have a look at #21825?
from cilium.
> Sorry if this comment came out as aggressive, I really like what you do, and I like cilium as a software as well, I'm just through a few hours of debugging, and now a bit emotionally attached to this case.
Sure, it does come out that way. I empathise with you, but you are using software that is made available to you for free. Debugging is the price you pay. Expressing your frustration this way to a stranger trying to help is not the way to go.
> I'm not sure if it's possible via ebpf, but cilium should fragment the packet, like any other router would in a scenario like this.
Fragmentation is a tricky topic. I'm not sure we have plans to do outbound fragmentation. Enabling pmtu discovery should at least help for compliant clients.
> However, now the hosts technically could send 1508 byte packets to other network entities, which is not ideal (cilium's MTU was left at 1500, which meant we had to take it out from automatic MTU mode), so we implemented another workaround: a bond interface on top of the physical one with 1500 MTU.
Could you lower the default route MTU to 1500 instead?
Coming back to making this easier to debug: oversized packets should emit DROP_FRAG_NEEDED in cilium-dbg monitor. Did you check that output?
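Lowering the default route MTU, as suggested above, could look roughly like the sketch below. It only builds the iproute2 command rather than running it. The helper name, gateway address, and device name are hypothetical placeholders, and `ip route change` needs the rest of the route specification to match the route that already exists on the host.

```python
def default_route_mtu_command(gateway: str, device: str, mtu: int = 1500) -> list[str]:
    """Build an iproute2 command that sets the MTU attribute on the default route."""
    if mtu <= 0:
        raise ValueError("MTU must be a positive number of bytes")
    return ["ip", "route", "change", "default",
            "via", gateway, "dev", device, "mtu", str(mtu)]


# Placeholder gateway (documentation address range) and device name.
print(" ".join(default_route_mtu_command("192.0.2.1", "eth0")))
```

This pins only the default route. Any other routes on the node would keep the interface MTU unless they are given their own `mtu` attribute.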
from cilium.
Thanks for the help.
I can lower the default route MTU to 1500, but we have a lot of other (dynamically added) routes on the nodes, and setting an MTU for those would be tricky.
There is no such output in cilium-dbg monitor. The packets are dropped by the network card, on the hardware level, and not by cilium.
from cilium.
@lmb @lgyurci Sorry to bother you, but I encountered the same issue. I cannot modify the switch MTU to 1508 (since the switch belongs to someone else). If I only modify the node MTU to 1508, packets will still be dropped at the switch. If Cilium cannot fragment the packets, how can this issue be resolved?
from cilium.
@xichaocn I guess you can't modify the router MTU either... Like I said in my previous comments, our issue turned out to be caused by the network card, and not the switch. Are you absolutely sure that the switch is dropping the packets, and not the network card? Because if so, there is really not much you can do here.
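One way to probe where oversized packets disappear is to send pings with the "don't fragment" flag set at a chosen packet size. The sketch below only builds the command; it assumes Linux iputils `ping` (where `-M do` prohibits fragmentation) and IPv4 without options. The helper name and target address are hypothetical.

```python
IPV4_HEADER_BYTES = 20  # IPv4 header without options
ICMP_HEADER_BYTES = 8   # ICMP echo header


def df_ping_command(target: str, packet_size: int, count: int = 3) -> list[str]:
    """Build a ping command whose on-wire IPv4 packet is packet_size bytes, with DF set."""
    payload = packet_size - IPV4_HEADER_BYTES - ICMP_HEADER_BYTES
    if payload < 0:
        raise ValueError("packet size is smaller than the IPv4 + ICMP headers")
    return ["ping", "-M", "do", "-s", str(payload), "-c", str(count), target]


# Placeholder target from the documentation address range.
print(" ".join(df_ping_command("192.0.2.10", 1508)))
# ping -M do -s 1480 -c 3 192.0.2.10
```

Running it hop by hop (node to node, then node to gateway) at both 1500 and 1508 bytes may help narrow down which segment drops the larger packets. The sending interface needs an MTU of at least the tested size, otherwise the ping fails locally.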
from cilium.
As far as I can tell, the official docs around DSR don't actually describe the required MTU changes; they only hint at them. Perhaps the minimal change here is to either (a) explicitly document the requirements that DSR imposes on MTU in an environment or, as a next step, (b) in DSR mode, adjust Cilium's autodetected MTU to account for the additional header space that the directly-returned packets may need.
from cilium.