
ctnr's Issues

Simplify CNI plugin/bridge usage for unprivileged users

User-mode networking is now supported within a user namespace. However, to allow containers to communicate with each other, they currently need to be created as nested containers. This results in poor usability in most use cases.

Instead, users should also be able to set up all containers from within the host namespace and bridge them in a shared namespace with a single ctnr command.

Option 1: Move container execution into another, implicit, shared container

a) An implicit shared container with its own file system (from a stage image containing plugin binaries, as in rkt) is maintained to run nested containers.
Problem 1: If the shared container does not inherit the host's file system, external files cannot be resolved.
It is also not possible to mount them into the outer container, since this would require a container update, which would break other containers that have already been started there.
Problem 2: The container is no longer visible on the host, only the outer container is. At least this way terminating the outer container also terminates the children (although they may leak kernel resources when not properly terminated/unmounted).

b) Alternatively, the outer container could inherit the host's file system with only minimal isolation (userns, netns, mountns) to avoid breaking external file references and to keep containers visible on the host.
Problem 1: Plugin binaries cannot be provided.
=> In the outer container, the stage image's file contents could be mounted over the rootfs, providing all required plugin binaries.
Problem 2: Child containers on the host cannot be associated with the parent. When a container is terminated on the host, the bridge plugin would not be able to clean up the veth if it no longer runs within the outer container's netns, which cannot be enforced using plain runc.
=> The child containers could be mapped to the outer container by writing their state into a separate directory and by a naming convention. The user would then need to be made aware of the container hierarchy and deal with it explicitly, which is probably not a bad idea and can be simplified in high-level tooling/compose. To make containers communicate with each other, the user may create a parent container/pod first and add containers or nested pods to it.
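
A minimal sketch of how such a mapping could look, assuming a state layout of `<root>/<parent>/children/<child>`. The layout, the `/run/ctnr` root and both function names are hypothetical and only illustrate the idea of a naming convention. They are not taken from ctnr.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// childStateDir returns the directory in which a child container's state
// would be stored. The layout <root>/<parent>/children/<child> is an
// assumed convention for illustration only.
func childStateDir(root, parentID, childID string) string {
	return filepath.Join(root, parentID, "children", childID)
}

// parentOf recovers the parent ID from a child state directory. It reports
// false if the path does not follow the assumed convention.
func parentOf(root, dir string) (string, bool) {
	rel, err := filepath.Rel(root, dir)
	if err != nil {
		return "", false
	}
	parts := strings.Split(filepath.ToSlash(rel), "/")
	if len(parts) != 3 || parts[0] == ".." || parts[1] != "children" {
		return "", false
	}
	return parts[0], true
}

func main() {
	dir := childStateDir("/run/ctnr", "pod1", "app1")
	parent, ok := parentOf("/run/ctnr", dir)
	fmt.Println(dir, parent, ok)
}
```

With a convention like this, tooling on the host could list a pod's children by reading a single directory, without runc itself knowing about the hierarchy.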

c) Another alternative that would provide both the necessary bridge namespace and the OCI/CNI plugin binaries would be to mount the host's namespace into a subdirectory of the container and rewrite file references accordingly. Unfortunately, the subdirectory prefix may still be shown to the user in error messages.

EDIT:
In general, this option also supports running multiple containers as a pod. The outer/pod container would define the network and the child/app containers would remain in the outer container's network namespace.
This needs to be done for pods anyway. Once this is done, communication between pods can be achieved using another outer container: the same problem, one layer higher. Container hierarchies should become part of the ctnr design, as option 1b shows.

Option 2: Move OCI CNI network hook execution into other namespace

The OCI network hook would need to be configured with the namespaces it should enter before CNI execution.
Thus it could ensure that network deletion/termination is initiated from within the same namespace.
The container would be visible and could easily be controlled on the host.
The functionality could also be used with plain runc or other OCI runtimes.
Other CNI plugins could also benefit from this approach: for instance, the existing portmap plugin could be used as well.

On the other hand, it may limit CNI plugin capabilities since it can only be applied to ALL plugins:
For instance, a plugin is planned that forwards/proxies ports from the host to the container's netns using socat or similar.
This plugin would require explicit configuration of the namespace to forward ports to, in addition to the container's namespace.
This would be acceptable since the plugin could be used to connect any namespace this way.

Where and how should the userns/netns be created or looked up from within the OCI hook?
Create/join the namespace by name dynamically? When should it be removed?
=> Make the OCI hook join an existing namespace only (tooling on top of runc must create/provide the namespace and remove it when it is no longer needed)
=> Make ctnr create/gc a container representing the namespace dynamically
=> The stage container is still required to get the hook/plugin binaries. Back to option 1?
=> These two features should be separated because one is about file system dependencies and the other about network namespaces, which users may want to combine independently of each other?!

Option 3: Extend bridge plugin to bridge custom userns/netns with container

The bridge plugin would get an additional parameter specifying the namespaces to bridge to instead of the current namespace.
Thus it could ensure that network deletion/termination is triggered from within the same namespace.
The container would be visible and can easily be controlled on the host.
A plugin to map ports using socat would not need additional configuration of the namespace to map ports to (except when it should work with a completely different namespace).

=> Problem: The netns configuration is dynamic. If it were part of a static plugin configuration, a dependency on the container engine (no!) or another mapping would need to be created
=> Provide the userns/netns, or rather the process PID, in the runtime config as in the portmap plugin
=> Problem: It is unclear to ctnr when to create the custom netns since it doesn't interpret CNI network configurations. It would either always need to create it to be sure, or
=> require an additional parameter
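
A sketch of how a plugin might read such a dynamic value, assuming it arrives under the `runtimeConfig` key of the network configuration the plugin receives on stdin (the same mechanism the portmap plugin uses for port mappings). The `bridgeNetNS` field name and the function are hypothetical and not part of the existing bridge plugin.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// netConf is the part of a CNI network configuration this sketch reads.
// RuntimeConfig.BridgeNetNS is an assumed, non-standard field.
type netConf struct {
	Name          string `json:"name"`
	Type          string `json:"type"`
	RuntimeConfig struct {
		BridgeNetNS string `json:"bridgeNetNS,omitempty"`
	} `json:"runtimeConfig"`
}

// bridgeNetNS returns the netns path the bridge should be created in.
// Without a runtime value it falls back to the current namespace.
func bridgeNetNS(confJSON []byte) (string, error) {
	var conf netConf
	if err := json.Unmarshal(confJSON, &conf); err != nil {
		return "", fmt.Errorf("parse network config: %w", err)
	}
	if conf.RuntimeConfig.BridgeNetNS == "" {
		return "/proc/self/ns/net", nil
	}
	return conf.RuntimeConfig.BridgeNetNS, nil
}

func main() {
	conf := []byte(`{"name":"shared","type":"bridge","runtimeConfig":{"bridgeNetNS":"/proc/7/ns/net"}}`)
	ns, err := bridgeNetNS(conf)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(ns)
}
```

Keeping the namespace out of the static JSON file avoids the dependency on the container engine, but the engine still has to know that it must supply the value, which is the open problem noted above.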

Option 4: Create new CNI plugin that enters custom netns and executes nested plugins there

=> Provides the most flexibility
=> Problems: as in option 3
=> The plugin should also manage the slirped shared network namespace itself: The namespace should be identified by a simple host-independent name so that it can be configured statically within a CNI JSON file. The plugin should map the provided name to a namespace persistently. The plugin should also keep track of the namespace's users and terminate the slirp4netns process and destroy the shared namespace as soon as no container is using it anymore. This would also allow usage in other contexts, for instance with plain runc without ctnr, and decouples the network logic from the rest.
Problem: The plugin requires a lot of the container engine's functionality!
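
The user tracking described for option 4 amounts to reference counting per namespace name. A minimal in-memory sketch follows. The `registry` type and its methods are hypothetical, and the persistence the text asks for (for example a state file with locking) is deliberately left out.

```go
package main

import "fmt"

// registry tracks which containers use which named shared namespace.
// This sketch keeps everything in memory; a real plugin would have to
// persist it, because each CNI invocation is a separate process.
type registry struct {
	users map[string]map[string]struct{}
}

func newRegistry() *registry {
	return &registry{users: make(map[string]map[string]struct{})}
}

// acquire registers containerID as a user of the named namespace.
// It returns true if the namespace had no users before, i.e. the caller
// would have to create it and start slirp4netns.
func (r *registry) acquire(name, containerID string) (create bool) {
	set, ok := r.users[name]
	if !ok {
		set = make(map[string]struct{})
		r.users[name] = set
	}
	set[containerID] = struct{}{}
	return !ok
}

// release removes containerID from the named namespace's users.
// It returns true if that was the last user, i.e. the caller would have
// to stop slirp4netns and destroy the namespace.
func (r *registry) release(name, containerID string) (destroy bool) {
	set, ok := r.users[name]
	if !ok {
		return false
	}
	delete(set, containerID)
	if len(set) > 0 {
		return false
	}
	delete(r.users, name)
	return true
}

func main() {
	r := newRegistry()
	fmt.Println(r.acquire("shared", "c1"))
	fmt.Println(r.acquire("shared", "c2"))
	fmt.Println(r.release("shared", "c1"))
	fmt.Println(r.release("shared", "c2"))
}
```

Even this small amount of bookkeeping needs durable state and cleanup after crashes, which illustrates why the plugin would end up duplicating container engine functionality.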

Result so far

Actually, three features were mixed here (especially in option 1):

  • Support bridging multiple containers to a shared namespace.
  • Provide additional binaries to create a container using a stage image as in rkt.
  • Support pods.

EDIT:
This issue is about bridging containers, but the pod feature could also solve it. Currently I'd go with option 1b.

@AkihiroSuda please have a quick look at this. I may be missing something in my considerations.

Inverted logic for resolving tmpfs mounts?

Hi. This looks wrong:

ctnr/model/resolve.go

Lines 44 to 45 in d3c3072

	if m.Source == "" && m.Type != MOUNT_TYPE_TMPFS {
		src = self.anonymous(m.Target)

I would have expected tmpfs to specifically want anonymous mount sources (== instead of !=).

Currently I have a ctnr run --mount type=tmpfs,dst=/foo failing with

WARN[0000] exit status 1                                
ERRO[0000] run process: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"rootfs_linux.go:58: mounting \\\"/home/tv\\\" to rootfs \\\"/home/tv/.ctnr/bundles/p2jen2rvarc43inrum66qzktna/rootfs\\\" at \\\"/foo\\\" caused \\\"invalid argument\\\"\"" 

And it really doesn't seem like a tmpfs should be trying to mount my home dir.

Then again, ctnr run --mount type=tmpfs,src=/tmp,dst=/foo shows my host /tmp inside the container, so it seems that tmpfs mounts don't work at all.
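
To make the expectation concrete, here is a minimal sketch of one possible resolution order in which a tmpfs mount never resolves to a host path. The `mount` type, the constants and `resolveSource` are simplified stand-ins rather than ctnr's actual model, and returning the literal source `tmpfs` follows the usual form of tmpfs entries in an OCI runtime config. It is an assumption about the desired behaviour, not the project's fix.

```go
package main

import "fmt"

const (
	mountTypeBind  = "bind"
	mountTypeTmpfs = "tmpfs"
)

// mount is a simplified stand-in for the mount model in resolve.go.
type mount struct {
	Type   string
	Source string
	Target string
}

// resolveSource picks the mount source. anonymous stands in for the
// helper that maps a target to an anonymous volume directory.
func resolveSource(m mount, anonymous func(target string) string) string {
	if m.Type == mountTypeTmpfs {
		// A tmpfs has no backing host path, so any given source is
		// ignored here (assumed behaviour).
		return "tmpfs"
	}
	if m.Source == "" {
		// No explicit source: fall back to an anonymous volume.
		return anonymous(m.Target)
	}
	return m.Source
}

func main() {
	anon := func(target string) string { return "/volumes" + target }
	fmt.Println(resolveSource(mount{Type: mountTypeTmpfs, Target: "/foo"}, anon))
	fmt.Println(resolveSource(mount{Type: mountTypeBind, Target: "/data"}, anon))
}
```

Under this ordering neither of the two commands above could end up mounting a host directory at `/foo`.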
