Comments (7)

tgraf commented on May 13, 2024

@deitch
Great to hear from you. We should be putting this into the documentation, but let me give you a quick answer before we have that.

The programmability of BPF allows Cilium to implement whatever model is desired. What is currently implemented is as follows:

  • Cilium will perform internal routing for any packets addressed to another local container (L2 rewrite, TTL decrement, redirect into the device of the destination container).
  • For packets addressed to containers on other servers or to external network endpoints, Cilium can be configured to either:
    • Pass the packet to the Linux routing layer to perform a normal L3 operation as per the routing table. We do not include any control plane to modify the Linux routing table in this case. We require the routing table to have an entry for each per-server prefix out of which each server allocates IPs to local containers. This can be achieved easily with static configuration or by running routing daemons; a rough sketch of those routes follows this list.
    • The second option is to run in encapsulation mode, in which case Cilium will create an encapsulation device of your choice (VXLAN, GRE, Geneve, ...) in metadata mode and will then encapsulate each packet addressed to a container on another server. After encapsulation, the packet is again given to Linux for routing. This scenario requires the routing table of the server to have enough information to reach all other servers.
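
Here is that rough sketch of the routes the first (plain L3) option implies. The per-server prefixes and node addresses are made up for illustration, and Cilium does not generate this for you; it only expects the routes to exist.

    // Sketch only: the per-server prefixes and node addresses below are
    // hypothetical examples, not anything Cilium configures for you.
    package main

    import "fmt"

    func main() {
        // per-server container prefix -> address of the server that owns it
        peers := map[string]string{
            "10.11.0.0/16": "192.168.1.11",
            "10.12.0.0/16": "192.168.1.12",
        }

        // Every server needs one route per remote per-server prefix, installed
        // either statically or by a routing daemon advertising the prefixes.
        for prefix, node := range peers {
            fmt.Printf("ip route add %s via %s\n", prefix, node)
        }
    }

A routing daemon advertising each server's local prefix gives you the same routing table without any static configuration.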

Hope this helps

deitch commented on May 13, 2024

@tgraf : and great to be welcomed! :-)

The programmability of BPF allows Cilium to implement whatever model is desired

So Cilium is more of a framework enabling implementation of Overlay or L3 strategies, but by doing it right there in the kernel using BPF? And you can build complete solutions on top of that?

Is the intent that, e.g., Weave or Flannel might rebuild their systems - which distribute IPs using a key/value store like etcd as a backend to coordinate, and encapsulate packets - but using Cilium as the encapsulation layer? Or that Calico would do the same but using Cilium?

tgraf commented on May 13, 2024

So Cilium is more of a framework enabling implementation of Overlay or L3 strategies, but by doing it right there in the kernel using BPF? And you can build complete solutions on top of that?

Cilium will assume and derive sane defaults wherever possible if run without specific options, e.g., running Cilium in a Kubernetes environment will cause Cilium to automatically derive the node pod CIDR of each server for allocation of IPs to local containers, and networking will just work. No additional configuration is required.
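
As a toy illustration of what allocating IPs to local containers out of that per-node pod CIDR amounts to (the CIDR and the skipped addresses below are assumptions for the example, not Cilium's actual IPAM logic):

    // Toy per-node allocator: hands out successive addresses from the node's
    // pod CIDR. Purely illustrative, not Cilium's IPAM code.
    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        // Hypothetical pod CIDR assigned to this node by Kubernetes.
        _, podCIDR, _ := net.ParseCIDR("10.2.3.0/24")

        next := make(net.IP, len(podCIDR.IP))
        copy(next, podCIDR.IP)
        next[len(next)-1] = 2 // skip the network address and a gateway address

        for i := 0; i < 3; i++ {
            fmt.Printf("container %d -> %s\n", i, next)
            next[len(next)-1]++ // toy increment, good enough for a /24 demo
        }
    }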

For encapsulation, each container address is structured in a way that allows deriving the address of the server where the container is running, so encapsulation will work without any additional routing configuration beyond simple server-to-server connectivity, which is typically a given anyway. All you have to do here is specify the type of encapsulation at startup, e.g. -t vxlan.
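
To illustrate what deriving the server from the container address can look like, here is a deliberately simplified toy scheme; the address layout below is an assumption for the example, not Cilium's actual addressing.

    // Toy scheme: node N owns container prefix 10.N.0.0/16 and has underlay
    // address 192.168.1.N, so the tunnel endpoint can be read directly out of
    // the destination container address without consulting any table.
    package main

    import (
        "fmt"
        "net"
    )

    func main() {
        dst := net.ParseIP("10.12.0.42").To4() // destination container
        nodeID := dst[1]                       // second octet identifies the node
        tunnelDst := net.IPv4(192, 168, 1, nodeID)

        fmt.Printf("outer (VXLAN) destination for %s is %s\n", dst, tunnelDst)
    }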

If these defaults do not meet your exact requirements, Cilium allows you to define and implement almost any L3 strategy.

Is the intent that, e.g., Weave or Flannel might rebuild their systems - which distribute IPs using a key/value store like etcd as a backend to coordinate, and encapsulate packets - but using Cilium as the encapsulation layer? Or that Calico would do the same but using Cilium?

Cilium has an IPAM plugin for libnetwork and CNI and will allocate out of a per-server prefix, so it would replace flannel with regard to allocation and encapsulation, but likely with better performance due to Cilium operating exclusively in the kernel. I guess you could run the routing daemon components of Calico side by side with Cilium if you wanted, as the L3 routing mode of Cilium seems very similar to how Calico operates. You could also potentially run any of those IPAM components as long as the basic requirement of the host allocator is met, i.e. that a container receives an IP out of a per-server prefix.

deitch commented on May 13, 2024

will cause Cilium to automatically derive the node pod CIDR of each server for allocation of IPs to local containers and networking will just work

That's the default k8s networking, so it will just use that? What are the advantages to doing it that way vs. the shipped-with-k8s CNI plugin?

you could run the routing daemon components of Calico side by side with Cilium if you wanted, as the L3 routing mode of Cilium seems very similar to how Calico operates

Ah, that makes sense if you want a more complex routing model.

So, basically, in the end, Cilium handles in-kernel (using BPF) much of what flannel/weave do in user space (although Weave has a fast datapath mode which is supposed to avoid 2 context switches per packet; more at https://www.weave.works/documentation/net-latest-how-it-works/net-latest-fastdp-how-it-works/). From a Calico perspective, it all happens in the kernel anyway because no userspace process needs to touch each packet, which is kind of nice.

I can say that there are performance differences. I gave a presentation at LinuxCon Tokyo and Berlin last year comparing the performance of various networking options for containers. Would love to get Cilium in the mix. Actually, I heard about Cilium from a colleague at Docker who was at the presentation and sent me the link this week.

My presentation is here: http://www.slideshare.net/deitcher/linux-con-berlin2016presentationdeitchera

If/when I do the next low-latency project, I will have to add this to the mix.

tgraf commented on May 13, 2024

That's the default k8s networking, so it will just use that? What are the advantages to doing it that way vs. the shipped-with-k8s CNI plugin?

Cilium derives the default k8s network address model, yes. The advantages of Cilium compared to stock k8s networking are scalable policy enforcement with a single hashtable lookup, increased visibility via a fast ring buffer (e.g. you will see the src/dst pod labels of dropped packets), integrated load balancing, policy enforcement of a pod talking to itself, ... I'm probably missing quite a bit here.

So, basically, in the end, Cilium handles in-kernel (using BPF) much of what flannel/weave do in user space (although Weave has a fast datapath mode which is supposed to avoid 2 context switches per packet; more at https://www.weave.works/documentation/net-latest-how-it-works/net-latest-fastdp-how-it-works/). From a Calico perspective, it all happens in the kernel anyway because no userspace process needs to touch each packet, which is kind of nice.

I'm careful about comparing Cilium with other solutions, as small details often matter and I don't know enough about all the details for it to be a fair comparison. The Weave fast datapath uses the OVS in-kernel datapath, which is a wildcard flow table capable of executing actions. It is programmable to a great extent, but not to the same extent as BPF. A simple example here is that if a pod requires port mapping, we will construct the BPF program for that pod to perform the port mapping for all packets it receives/transmits, whereas in a flow-table model this requires an action which needs to be configured. Native instructions without extra branching and cache misses are simply faster.
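
A rough way to picture the per-pod program point, as a conceptual sketch in Go rather than Cilium's generated BPF: the pod's mapping is baked into its own datapath, so the hot path is a plain comparison instead of a generic table lookup plus a configured action.

    // Conceptual sketch only; BPF program generation is mimicked with a closure.
    package main

    import "fmt"

    // Flow-table-style model: every packet consults a shared mapping table.
    func genericRewrite(table map[uint16]uint16, dport uint16) uint16 {
        if to, ok := table[dport]; ok {
            return to
        }
        return dport
    }

    // Per-pod model: the pod's single mapping is baked in, so the rewrite is
    // one comparison with no table lookup.
    func buildPodRewrite(from, to uint16) func(uint16) uint16 {
        return func(dport uint16) uint16 {
            if dport == from {
                return to
            }
            return dport
        }
    }

    func main() {
        table := map[uint16]uint16{80: 8080}
        podRewrite := buildPodRewrite(80, 8080)

        fmt.Println(genericRewrite(table, 80)) // 8080, via a table lookup
        fmt.Println(podRewrite(80))            // 8080, via an inlined comparison
    }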

While Calico and Cilium will show very similar performance for raw forwarding, as both ultimately depend on the kernel, they will show differences when enforcing security policies. Calico depends on iptables and ipset to enforce policies, while Cilium compiles a hashtable of allowed consumers, which gives average O(1) behaviour versus traversing a sequence of rules in the case of iptables.
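
Conceptually, the difference looks like this; it is deliberately simplified, and the numeric security-identity key is an assumption for the example rather than the real datapath structures.

    // Conceptual sketch: O(1) policy lookup vs. linear rule traversal.
    package main

    import "fmt"

    type rule struct {
        srcIdentity uint32
        allow       bool
    }

    // iptables-style: walk the rules in order until one matches.
    func linearVerdict(rules []rule, src uint32) bool {
        for _, r := range rules {
            if r.srcIdentity == src {
                return r.allow
            }
        }
        return false // default deny
    }

    // hashtable-style: one lookup in the set of allowed consumers.
    func hashVerdict(allowed map[uint32]struct{}, src uint32) bool {
        _, ok := allowed[src]
        return ok
    }

    func main() {
        rules := []rule{{100, true}, {200, true}, {300, false}}
        allowed := map[uint32]struct{}{100: {}, 200: {}}

        fmt.Println(linearVerdict(rules, 200)) // true, after scanning the rules
        fmt.Println(hashVerdict(allowed, 200)) // true, after a single lookup
    }

The cost of the first version grows with the number of rules, while the second stays flat as policies are added.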

Another example of this is the difference between the k8s services implementations of Cilium and kube-proxy. kube-proxy eventually results in n*1K iptables rules to DNAT services, whereas Cilium maintains a hashtable for this, and both insertion and lookup scale nicely as the number of services increases.
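
The services case, sketched the same way; the data structures are illustrative only, not kube-proxy's or Cilium's actual ones.

    // Conceptual sketch of service translation: a single hash lookup from the
    // service frontend to a backend instead of per-service iptables chains.
    package main

    import "fmt"

    type frontend struct {
        ip   string
        port uint16
    }

    func main() {
        // service VIP:port -> backend pod addresses
        services := map[frontend][]string{
            {"10.96.0.10", 80}: {"10.2.3.4:8080", "10.2.7.9:8080"},
        }

        key := frontend{"10.96.0.10", 80}
        if backends, ok := services[key]; ok {
            // pick a backend (a real implementation would hash or round-robin)
            fmt.Println("DNAT to", backends[0])
        }
    }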

To summarise, Cilium leverages BPF to gain the programmability of user-space solutions such as flannel or Weave while remaining in the kernel, and shares many of the benefits of Calico without depending on iptables. I believe there is a sweet spot there. The downside of using BPF is obviously the increased platform requirements in terms of the minimal kernel version (4.8) required to run Cilium.

We are still in very early stages and this is alpha at best. Given that we have implemented all the functionality we wanted initially, we will shift focus to documenting and stabilising what we have so far.

deitch commented on May 13, 2024

The advantages of Cilium compared to stock k8s networking ... I'm probably missing quite a bit here.

Definitely worth documenting, as you said. You built this for a reason, feeling existing solutions were insufficient; let people know the reason.

FYI, I do not love the native k8s networking model. Assuming I will have 254 containers (a /24) on each host, and then however many in the subnet in which the hosts live, is rigid, although I get the desire to make route aggregation simple.

Calico's model appeals to me for its pure L3 approach, although, without aggregation, it means very large routing tables on each host.

The real solution, of course, is route aggregation via IPv6. Then, you can have subnets of almost any size on a host and not worry at all.

I'm careful about comparing Cilium with other solutions, as small details often matter

I get that. But people will ask, because in the end, implementors will want to know why to use X vs Y.

while Cilium compiles a hashtable of allowed consumers, which gives average O(1) behaviour versus traversing a sequence of rules in the case of iptables

That is an interesting point. iptables still just runs down the sequence with each packet?

The downside of using BPF is obviously the increased platform requirements in terms of the minimal kernel version (4.8) required to run Cilium

Eh, don't worry about that. Eventually everyone catches up, and it is worth it.

we will shift focus to documenting and stabilising what we have so far

I very much look forward to reading them!

deitch commented on May 13, 2024

In any case, thanks for all of the details. I will close out the issue.
