Comments (10)

kevinGC commented on May 24, 2024

Very interested in making this happen. Thinking of this as separate sub-issues:

  1. General third-party stack support - I think this is great. The largest issue I see is API stability -- we develop and build gVisor+netstack as one big binary, so there's no defined API for network stacks. The primary concern for me is getting stuck on an API with problems. From experience I can tell you that we've changed that API within gVisor many times -- that flexibility is useful and finding a way to keep it is ideal. I wonder whether we can do API versioning (think Go modules-esque) so that stable APIs exist, but don't hamper development.
  2. CGO in gVisor - gVisor/runsc can't introduce CGO as a dependency for security reasons. This will have to be explicitly turned on by plugin users.
  3. TLDK performance - With those performance numbers I have a ton of questions. Too many for this post, but generally I'm curious whether your stack is portable or specifically tailored to your environment, e.g.:
    1. Can you have multiple pods on a node? Normally DPDK steals the entire NIC, but maybe you use SR-IOV to create multiple NICs.
    2. Does SR-IOV tie you to particular hardware NICs? If I understand correctly it's not fully portable, which could create problems if different nodes have different NICs.
    3. If this is running in Kubernetes, what network plugin (CNI) is used to set everything up?
    4. Do non-gVisor pods run in the same environment?
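
The Go-modules-style versioning floated in item 1 could look roughly like this: each stack declares the API version it was built against, and the host accepts only the major versions it still supports. This is a minimal sketch under assumptions. The names `supportedMajors`, `majorOf`, and `checkCompatible` are hypothetical and not part of gVisor.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// supportedMajors lists the API major versions the host still accepts.
// Hypothetical: gVisor defines no such table today.
var supportedMajors = map[int]bool{1: true, 2: true}

// majorOf extracts the major number from a version such as "v2.3.0".
func majorOf(version string) (int, error) {
	v := strings.TrimPrefix(version, "v")
	major, _, _ := strings.Cut(v, ".")
	n, err := strconv.Atoi(major)
	if err != nil {
		return 0, fmt.Errorf("malformed version %q: %w", version, err)
	}
	return n, nil
}

// checkCompatible reports whether a stack built against the given API
// version can be used by this host.
func checkCompatible(version string) error {
	major, err := majorOf(version)
	if err != nil {
		return err
	}
	if !supportedMajors[major] {
		return fmt.Errorf("API major v%d is not supported", major)
	}
	return nil
}

func main() {
	for _, v := range []string{"v1.4.0", "v2.0.1", "v3.0.0"} {
		fmt.Printf("%s: %v\n", v, checkCompatible(v))
	}
}
```

Keeping several majors in the table at once is what would let the internal API keep changing while older stacks continue to build against a stable one.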

Please let me know what you think. Also happy to discuss your specific setup in email/chat/wherever if that's easier.

from gvisor.

tanjianfeng commented on May 24, 2024

@kevinGC Among those sub-issues, the core one is CGO.

> gVisor/runsc can't introduce CGO as a dependency for security reasons.

  • Does the CGO interface itself introduce a security issue? In other words, if we introduce a Rust-based component (also memory-safe) into the sentry, does that break the security model?
  • gVisor itself is a defense-in-depth solution, with the host kernel jailers (seccomp/cgroup/namespace/capabilities/...) as the last line of defense. Can we trade off sentry security for performance? An example at hand (maybe not entirely apt): directfs sacrifices some security by allowing open() in the sandbox process.

> This will have to be explicitly turned on by plugin users.

If we understand correctly, keeping the binary pure Go requires the decision to be made at compile time. Do we have a conditional compilation mechanism in gVisor's Bazel build?

amysaq2023 commented on May 24, 2024

@kevinGC Thanks for your quick response. We are happy to discuss these sub-issues further.

To answer sub-issue 3: in short, our stack is portable to other environments. Detailed answers are below:

> Can you have multiple pods on a node? Normally DPDK steals the entire NIC, but maybe you use SR-IOV to create multiple NICs.

Yes, we support multiple pods on a node, and this is the common scenario at Antgroup. We use SR-IOV to create multiple ENIs to do that.

> Does SR-IOV tie you to particular hardware NICs? If I understand correctly it's not fully portable, which could create problems if different nodes have different NICs.

Our current implementation of gVisor with TLDK+DPDK places no requirements on the NIC. As long as the NIC can be used as a virtio backend device, our TLDK support works on it.

> If this is running in Kubernetes, what network plugin (CNI) is used to set everything up?

We do not use any CNI to set up the TLDK stack. Instead, we invoke a CGO wrapper to initialize the TLDK stack while gVisor runs StartRoot().

> Do non-gVisor pods run in the same environment?

Yes, non-gVisor pods can run in the same environment as gVisor pods that use TLDK.

kevinGC commented on May 24, 2024

> Does the CGO interface itself introduce a security issue? In other words, if we introduce a Rust-based component (also memory-safe) into the sentry, does that break the security model?

We've never discussed the CGO interface on its own, i.e. with something other than C being called into. But my first take is that the runsc binary should always be flagged as no CGO. I think a good solution would be to leave runsc as pure Go, and have this plugin system usable by defining a different go_binary target. That way we keep the high level of security, and users who want to make the tradeoff just need to write their own BUILD target. So ideally you'd have your own target looking something like:

go_binary(
    name = "runsc-tldk",
    srcs = ["main.go"],
    pure = False,
    visibility = [
        "//visibility:public",
    ],
    deps = [
        "@dev_gvisor//runsc/cli",
        "@dev_gvisor//runsc/version",
        "//my/codebase/tldk:runsc_plugin",
    ],
)

This yields a few benefits:

  • gVisor remains CGO-free
  • Plugin network stacks can be developed independently of upstream gVisor
  • By consuming gVisor as a Bazel dependency, you would pin to a specific version of gVisor. This is useful for avoiding breakage when gVisor makes breaking API changes

@tanjianfeng what do you think? Since you already have a third-party network stack, we want to hear what setup would work for you. If you have specific ideas in mind, we'd love to hear them. Once we have some agreement here, we can get others onboard and actually make the changes.

> gVisor itself is a defense-in-depth solution, with the host kernel jailers (seccomp/cgroup/namespace/capabilities/...) as the last line of defense. Can we trade off sentry security for performance?

Yes. Generally such tradeoffs are implemented but off by default. For example, raw sockets are implemented because people need tools like tcpdump, but must be enabled via a flag. Since CGO introduces a security issue just by being present in the binary, we shouldn't compile it in by default.
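
The "implemented but off by default" pattern can be sketched with Go's standard flag package. This is only an illustration under assumptions: the `config` type, `parseConfig`, `openRawSocket`, and the flag name used here are hypothetical stand-ins, not runsc's actual code.

```go
package main

import (
	"flag"
	"fmt"
)

// config holds opt-in features. Field and flag names are hypothetical.
type config struct {
	netRaw bool
}

// parseConfig parses args into a config. Every security tradeoff
// defaults to false, so the user must ask for it explicitly.
func parseConfig(args []string) (config, error) {
	var c config
	fs := flag.NewFlagSet("sandbox", flag.ContinueOnError)
	fs.BoolVar(&c.netRaw, "net-raw", false, "enable raw sockets (weakens isolation)")
	if err := fs.Parse(args); err != nil {
		return config{}, err
	}
	return c, nil
}

// openRawSocket refuses to proceed unless the feature was enabled.
func openRawSocket(c config) error {
	if !c.netRaw {
		return fmt.Errorf("raw sockets are disabled; pass -net-raw to enable")
	}
	return nil // a real implementation would create the socket here
}

func main() {
	c, _ := parseConfig(nil)
	fmt.Println("default:", openRawSocket(c))

	c, _ = parseConfig([]string{"-net-raw"})
	fmt.Println("opted in:", openRawSocket(c))
}
```

A runtime flag like this only works when the guarded code is safe to ship in the binary. CGO differs because its mere presence is the concern, which is why the discussion moves the switch to build time.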


@amysaq2023 that's super impressive that you're getting the benefits of kernel bypass without many of the traditional issues (e.g. machines being single-app only). A few more questions (if you can answer):

  • Are the nodes in that Redis benchmark VMs or actual machines? My understanding is that the performance boost mostly comes from cutting out the host network stack, but if these are VMs then I'd expect the host machine's stack to slow things down.
  • Did you consider using XDP instead of DPDK? I wonder how performant it would be relative to DPDK, given that it's probably easier to use.
  • Generally, do you think it's DPDK or TLDK that provide the bulk of the performance improvement? I'd like to do some experimenting of my own, and am wondering whether I'm more likely to see performance differences by hooking kernel bypass up to netstack or TLDK up to an AF_PACKET socket.

amysaq2023 commented on May 24, 2024

@kevinGC

> what do you think? Since you already have a third-party network stack, we want to hear what setup would work for you.

Thank you for your insightful suggestion on how to support TLDK while maintaining the high level of security in gVisor. We have an additional proposal to consider:
First, we propose abstracting a set of APIs for gVisor's network stack. This way, third-party network stacks only need to implement these APIs to be compatible with gVisor.
Next, we will compile the third-party network stack, with the gVisor APIs implemented, into an object file. This approach ensures seamless integration between gVisor and the third-party network stack.
Most importantly, gVisor needs to support a way to invoke these APIs within the network stack binary. Currently, we are considering options such as Go plugins or something similar.
We feel that this solution decouples the development of third-party network stacks from gVisor more thoroughly. Additionally, supporting binary plugins may benefit other modules, like the filesystem, by enabling third-party implementations in the future.

> Are the nodes in that Redis benchmark VMs or actual machines? My understanding is that the performance boost mostly comes from cutting out the host network stack, but if these are VMs then I'd expect the host machine's stack to slow things down.

The nodes in the Redis benchmark are actual physical machines.

> Did you consider using XDP instead of DPDK? I wonder how performant it would be relative to DPDK, given that it's probably easier to use.
> Generally, do you think it's DPDK or TLDK that provide the bulk of the performance improvement? I'd like to do some experimenting of my own, and am wondering whether I'm more likely to see performance differences by hooking kernel bypass up to netstack or TLDK up to an AF_PACKET socket.

DPDK not only functions as a driver, but also offers various performance enhancements. For instance, it utilizes rte_ring for efficient communication with hardware and introduces its own memory management mechanisms with mbuf and mempool. Moreover, DPDK operates entirely at the user-level, completely detached from the host kernel, unlike XDP which still relies on hooking into the host kernel. Therefore, the performance enhancement achieved with TLDK+DPDK goes beyond just kernel bypass, benefiting from the improvements introduced by both TLDK and DPDK.

kevinGC commented on May 24, 2024

> First, we propose abstracting a set of APIs for gVisor's network stack. This way, third-party network stacks only need to implement these APIs to be compatible with gVisor.

Agreed! Maybe you could send a PR with the interface you use now to work with TLDK -- that would be a really good starting point. Much better than trying to come up with an arbitrary API, given that you've got this running already.

> Next, we will compile the third-party network stack, with the gVisor APIs implemented, into an object file. This approach ensures seamless integration between gVisor and the third-party network stack.

Right, if I understand correctly the build process for cgo requires building the object file first, then writing a Go layer around it that can call into it using the tools provided by import "C".

> Most importantly, gVisor needs to support a way to invoke these APIs within the network stack binary. Currently, we are considering options such as Go plugins or something similar.

Can you help me understand why we couldn't just build a static binary containing gVisor and the third party network stack? As part of the API we talked about above, gVisor can support registering third party netstacks. So the third party stack would contain an implementation of the API (socket ops like in your diagram), the cgo wrapper, the third party stack itself, and an init function that registers the stack to be used instead of netstack:

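package tldk

import (
  "gvisor.dev/gvisor/pkg/abi/linux"
  "gvisor.dev/gvisor/pkg/sentry/socket"
)

// RegisterThirdPartyProvider is the proposed hook (it does not exist yet);
// tldkProvider is the stack's implementation, defined elsewhere in this package.
func init() {
  socket.RegisterThirdPartyProvider(linux.AF_INET, &tldkProvider)
  // etc..
}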

This keeps everything building statically and avoids issues introduced by go plugins as far as I can tell, but maybe I'm missing something.
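
To make the registration idea concrete, here is a self-contained sketch of an init-time provider registry. It is only a toy model under assumptions: the `Provider` interface, `providerFor`, and the `netstack` and `tldk` types are hypothetical placeholders, and `RegisterThirdPartyProvider` mirrors the proposed name rather than an existing gVisor function.

```go
package main

import (
	"fmt"
	"sync"
)

// Linux address family numbers.
const (
	afInet  = 2
	afInet6 = 10
)

// Provider stands in for the socket-ops API a stack would implement.
// Hypothetical: the real interface would be far larger.
type Provider interface {
	Name() string
}

var (
	mu        sync.Mutex
	providers = map[int]Provider{}
)

// RegisterThirdPartyProvider installs p for the given address family.
func RegisterThirdPartyProvider(family int, p Provider) {
	mu.Lock()
	defer mu.Unlock()
	providers[family] = p
}

// providerFor returns the registered provider for family, falling back
// to the built-in stack when nothing was registered.
func providerFor(family int) Provider {
	mu.Lock()
	defer mu.Unlock()
	if p, ok := providers[family]; ok {
		return p
	}
	return netstack{}
}

type netstack struct{}

func (netstack) Name() string { return "netstack" }

type tldk struct{}

func (tldk) Name() string { return "tldk" }

// init runs before main, so linking this code into the binary is
// enough to swap the stack; no flag or plugin loading is involved.
func init() {
	RegisterThirdPartyProvider(afInet, tldk{})
}

func main() {
	fmt.Println("AF_INET  ->", providerFor(afInet).Name())
	fmt.Println("AF_INET6 ->", providerFor(afInet6).Name())
}
```

In a real build the `init` function would live in the third-party package, so a binary that does not depend on that package never registers it and keeps using netstack.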

kevinGC commented on May 24, 2024

Something I should've been more clear about regarding the static binary idea: I'm suggesting that the existing, cgo-free runsc target remain as-is, and that we support third party network stacks by having multiple BUILD targets. So the existing target will look mostly (or entirely) the same as it is today:

go_binary(
    name = "runsc",
    srcs = ["main.go"],
    pure = True,
    tags = ["staging"],
    visibility = [
        "//visibility:public",
    ],
    x_defs = {"gvisor.dev/gvisor/runsc/version.version": "{STABLE_VERSION}"},
    deps = [
        "//runsc/cli",
        "//runsc/version",
    ],
)

And building runsc with a third party network stack requires adding another target (which could be in the same BUILD file, a different one, or even a separate bazel project):

go_binary(
    name = "runsc_tldk",
    srcs = ["main_tldk.go"],
    pure = False,
    tags = ["staging"],
    visibility = [
        "//visibility:public",
    ],
    x_defs = {"gvisor.dev/gvisor/runsc/version.version": "{STABLE_VERSION}"},
    deps = [
        "//runsc/cli",
        "//runsc/version",
        "//othernetstacks/tldk:tldk_provider",
    ],
)

Both go_binary targets are static and avoid Go plugins and their headaches, and the default runsc binary remains cgo-free.

amysaq2023 commented on May 24, 2024

@kevinGC
Great! We are fully on board with the idea of introducing an additional target to support third-party networking stacks. To kick things off, we will prepare a PR that includes the gVisor APIs for networking modules, along with our TLDK implementation of these APIs for integration with gVisor. We sincerely appreciate all the valuable insights shared throughout this discussion thread.
