
Comments (8)

lmb commented on September 27, 2024

What do you think is the bug here? That Cilium generates a packet of 1508 bytes? If yes, what else should it do (keep in mind that we can't avoid encapsulation)?

Can you either drop Cilium's MTU to 1492 or increase your underlying network to >= 1508?
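The arithmetic behind these two options can be sketched as follows. This is a minimal illustration, assuming the extra header space is 8 bytes (inferred from 1508 - 1500 in this thread); the function names are hypothetical and not part of Cilium.

```python
# Assumption: 8 bytes of extra header space, inferred from the
# 1508-byte packet vs. the 1500-byte MTU mentioned in this thread.
ENCAP_OVERHEAD = 8


def required_underlay_mtu(pod_mtu: int, overhead: int = ENCAP_OVERHEAD) -> int:
    """Smallest underlying-network MTU that carries a full-size pod packet."""
    return pod_mtu + overhead


def max_pod_mtu(underlay_mtu: int, overhead: int = ENCAP_OVERHEAD) -> int:
    """Largest pod-facing MTU that still fits the underlay after the extra header."""
    return underlay_mtu - overhead


print(required_underlay_mtu(1500))  # 1508
print(max_pod_mtu(1500))            # 1492
```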

from cilium.

lgyurci commented on September 27, 2024

I'm not sure if it's possible via ebpf, but cilium should fragment the packet, like any other router would in a scenario like this.
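For reference, this is roughly what classic IPv4 router fragmentation would do with a 1508-byte packet on a 1500-byte link. It is only a sketch of the general IPv4 rule (fragment payloads are multiples of 8 bytes), assuming a plain 20-byte header with no options; it says nothing about what Cilium or eBPF can actually do, and `ipv4_fragments` is a hypothetical helper.

```python
def ipv4_fragments(total_len: int, mtu: int, header_len: int = 20) -> list[tuple[int, int]]:
    """Return (fragment_offset_in_8_byte_units, fragment_total_length) pairs.

    Assumes a fixed header of header_len bytes that is repeated on every fragment.
    """
    payload = total_len - header_len
    # Every fragment except the last must carry a multiple of 8 payload bytes.
    max_payload = (mtu - header_len) // 8 * 8
    if max_payload <= 0:
        raise ValueError("MTU too small to carry any payload")

    fragments = []
    offset = 0
    while offset < payload:
        chunk = min(max_payload, payload - offset)
        fragments.append((offset // 8, header_len + chunk))
        offset += chunk
    return fragments


print(ipv4_fragments(1508, 1500))  # [(0, 1500), (185, 28)]
```

Under these assumptions the oversized packet becomes one full 1500-byte fragment plus a 28-byte fragment carrying the remaining 8 payload bytes.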

But I would also like to suggest what it shouldn't do: Break random network connections without any warning, and put the task of debugging onto the end-user. This issue took a significant amount of time to figure out (SIP connections breaking randomly, good luck), and I would at least expect some form of mention about this in the documentation, that "you should watch out for this if you are doing this", even if the solution is simply to increase/decrease the MTU in certain places. Make it appear in a red alert box on the native routing documentation page.

I could increase the MTU on the hosts, and that's what I did in the end, but I feel like this is more of a workaround than a solution. It turns out the host network interfaces were dropping the packets, because they automatically set the same MRU as the MTU - and also on a hardware level, so we couldn't even see those packets with tcpdump. However, now the hosts technically could send 1508 byte packets to other network entities, which is not ideal (cilium's MTU was left at 1500, which meant we had to take it out from automatic MTU mode), so we implemented another workaround: a bond interface on top of the physical one with 1500 MTU. I hope you see the absurdity of this situation. In my opinion, you shouldn't have to manually set the MTU anywhere, anytime.

Sorry if this comment came out as aggressive, I really like what you do, and I like cilium as a software as well, I'm just through a few hours of debugging, and now a bit emotionally attached to this case.


julianwiedmann commented on September 27, 2024

👋 maybe have a look at #21825?


lmb commented on September 27, 2024

> Sorry if this comment came out as aggressive, I really like what you do, and I like cilium as a software as well, I'm just through a few hours of debugging, and now a bit emotionally attached to this case.

Sure, it does come out that way. I empathise with you, but you are using software that is made available to you for free. Debugging is the price you pay. Expressing your frustration this way to a stranger trying to help is not the way to go.

> I'm not sure if it's possible via ebpf, but cilium should fragment the packet, like any other router would in a scenario like this.

Fragmentation is a tricky topic. I'm not sure we have plans to do outbound fragmentation. Enabling pmtu discovery should at least help for compliant clients.

> However, now the hosts technically could send 1508 byte packets to other network entities, which is not ideal (cilium's MTU was left at 1500, which meant we had to take it out from automatic MTU mode), so we implemented another workaround: a bond interface on top of the physical one with 1500 MTU.

Could you lower the default route MTU to 1500 instead?
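As a rough sketch of that suggestion, the helper below only builds the iproute2 command that would set a per-route MTU on the default route. The gateway `192.0.2.1` and device `eth0` are placeholder values, `default_route_mtu_cmd` is a hypothetical name, and nothing is executed here.

```python
def default_route_mtu_cmd(gateway: str, dev: str, mtu: int = 1500) -> list[str]:
    """Build an `ip route replace` command that pins an MTU on the default route."""
    return ["ip", "route", "replace", "default",
            "via", gateway, "dev", dev, "mtu", str(mtu)]


# Placeholder gateway and device; substitute the node's real values.
cmd = default_route_mtu_cmd("192.0.2.1", "eth0")
print(" ".join(cmd))  # ip route replace default via 192.0.2.1 dev eth0 mtu 1500
```

This only covers the default route; any other routes on the node would keep the interface MTU unless they are given their own `mtu` attribute.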

Coming back to making this easier to debug: oversized packets should emit DROP_FRAG_NEEDED in cilium-dbg monitor. Did you check that output?


lgyurci commented on September 27, 2024

Thanks for the help.

I can lower the default route MTU to 1500, but we have a lot of other (dynamically added) routes on the nodes, and setting an MTU for those would be tricky.

There is no such output in cilium-dbg monitor. The packets are dropped by the network card, on the hardware level, and not by cilium.


xichaocn commented on September 27, 2024

@lmb @lgyurci Sorry to bother you, but I encountered the same issue. I cannot modify the switch MTU to 1508 (since the switch belongs to someone else). If I only modify the node MTU to 1508, packets will still be dropped at the switch. If Cilium cannot fragment the packets, how can this issue be resolved?


lgyurci commented on September 27, 2024

@xichaocn I guess you can't modify the router MTU either... Like I said in my previous comments, our issue turned out to be caused by the network card, and not the switch. Are you absolutely sure that the switch is dropping the packets, and not the network card? Because if so, there is really not much you can do here.


joestringer commented on September 27, 2024

As far as I can tell, the official docs around DSR don't actually describe the required MTU changes; they only hint at them. Perhaps the minimal change here is to either (a) document explicitly the requirements that DSR imposes on MTU in an environment, or, as a next step, (b) in DSR mode adjust Cilium's autodetected MTU based on the potential need for additional header space for the directly-returned packets.

