mgoltzsche / ctnr
rootless runc-based container engine - deprecated in favour of podman
License: Apache License 2.0
Will be supported once docker/cli#573 is merged and containers/image has updated its docker/docker dependency.
User-mode networking is now supported within a user namespace. However, to allow containers to communicate with each other, nested containers currently need to be created, which makes for poor usability in most use cases.
Instead, users should also be able to set up all containers from within the host namespace and bridge them in a shared namespace with a single ctnr command.
a) An implicit shared container with its own file system (from a stage image containing plugin binaries, as in rkt) is maintained to run nested containers.
Problem1: If the shared container does not inherit the host's file system, external files cannot be resolved.
It is also not possible to mount them into the outer container since this would require a container update which would break other containers that have already been started there.
Problem2: the container is not visible on the host anymore - only the outer container. But at least this way terminating the outer container also terminates the children (although they may leak kernel resources when not properly terminated/unmounted).
b) Alternatively, the outer container could inherit the host's file system with only minimal isolation (userns, netns, mountns) to avoid breaking external file references and to keep containers visible on the host.
Problem1: plugin binaries cannot be provided.
=> In the outer container the stage image's file contents could be mounted over the rootfs providing all required plugin binaries.
Problem2: Child containers on the host cannot be associated with the parent. When a container is terminated on the host, the bridge plugin would not be able to clean up the veth if it no longer runs within the outer container's netns, and this cannot be enforced using plain runc.
=> The child containers could be mapped to the outer container by writing their state into a separate directory and following a naming convention. Thus the user should be made aware of the container hierarchy and deal with it explicitly, which is probably not a bad idea and can be simplified in high-level tooling/compose. In order to make containers communicate with each other she may create a parent container/pod first and add containers or nested pods to it.
c) Another alternative that would provide both the necessary bridge namespace and the OCI/CNI plugin binaries would be to mount the host's namespace into a sub directory of the container and rewrite file references accordingly. Unfortunately the sub directory prefix may still be shown to the user in error messages.
EDIT:
In general, this option also supports running multiple containers as a pod. The outer/pod container would define the network and the child/app containers would remain in the outer container's network namespace.
This needs to be done for pods anyway. Once this is done, communication between pods can be achieved using another outer container - the same problem one layer higher. Container hierarchies should become part of the ctnr design as 1b shows.
The OCI network hook would need to be configured with the namespaces it should enter before CNI execution.
Thus it could ensure that network deletion/termination is initiated from within the same namespace.
The container would be visible and can easily be controlled on the host.
The functionality could also be used with plain runc/other OCI runtimes.
Also other CNI plugins could benefit from this approach: for instance the existing portmap plugin could be used as well.
On the other hand it may limit CNI plugin capabilities since it can only be applied to ALL plugins:
For instance a plugin is planned to forward/proxy ports from the host to the container's netns using socat or similar.
This plugin would require explicit configuration of the namespace to forward ports to in addition to the container's namespace.
This would be acceptable since the plugin could be used to connect any namespace this way.
Where/how to create/lookup the userns/netns from within the OCI hook?
Create/join namespace by name dynamically? When to remove?
=> Make OCI hook join existing namespace only (tooling on top of runc must create/provide and remove the namespace when not needed anymore)
=> Make ctnr create/gc a container representing the namespace dynamically
=> The stage container is still required to get the hook/plugin binaries. Back to option 1?
=> These two features should be separated because one is about file system dependencies and the other about network namespaces, and users may want to combine them independently of each other?!
The bridge plugin would get an additional parameter specifying the namespaces to bridge to instead of the current namespace.
Thus it could ensure that network deletion/termination is triggered from within the same namespace.
The container would be visible and can easily be controlled on the host.
A plugin to map ports using socat would not need additional configuration of the namespace to map ports to (except when it should work with a completely different namespace).
=> Problem: the netns configuration is dynamic. If it were to be part of any static plugin configuration, a dependency on the container engine (no!) or another mapping would need to be created
=> provide userns/netns or rather process PID in runtime config as in portmap plugin
=> Problem: it is unclear for ctnr when to create the custom netns since it doesn't interpret CNI network configurations - it would always need to create it to be sure or
=> require an additional parameter
=> provides most flexibility
=> Problems: as in option 3
=> The plugin should also manage the slirped shared network namespace itself: The namespace should be a simple host-independent name so that it can be configured statically within a CNI JSON file. The plugin should map the provided name to a namespace persistently. The plugin should also keep track of the namespace's users and terminate the slirp4netns process and destroy the shared namespace as soon as no container is using it anymore. This would also allow usage in other contexts as for instance with plain runc without ctnr and decouples the network logic from the rest.
Problem: The plugin requires a lot of the container engine's functionality!
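The bookkeeping described here, where the plugin maps a static name to a shared namespace and tears it down once its last user is gone, amounts to reference counting. The sketch below shows only that logic in memory. The type `nsRegistry` and its methods are hypothetical names, and the persistence across plugin invocations (for example a locked state file) as well as the actual slirp4netns and namespace handling are omitted.

```go
package main

import "fmt"

// nsRegistry tracks which containers use which named shared namespace.
// A real plugin would have to persist this, since each CNI call is a
// separate process; this in-memory version only illustrates the logic.
type nsRegistry struct {
	users map[string]map[string]struct{}
}

func newNSRegistry() *nsRegistry {
	return &nsRegistry{users: map[string]map[string]struct{}{}}
}

// Acquire registers containerID as a user of the namespace called name.
// It returns true if this is the first user, i.e. the namespace and its
// slirp4netns process would have to be created now.
func (r *nsRegistry) Acquire(name, containerID string) bool {
	set, exists := r.users[name]
	if !exists {
		set = map[string]struct{}{}
		r.users[name] = set
	}
	set[containerID] = struct{}{}
	return !exists
}

// Release removes containerID from the namespace's users. It returns true
// if no user is left, i.e. slirp4netns should be terminated and the
// namespace destroyed. Releasing an unknown user is a no-op.
func (r *nsRegistry) Release(name, containerID string) bool {
	set, exists := r.users[name]
	if !exists {
		return false
	}
	if _, isUser := set[containerID]; !isUser {
		return false
	}
	delete(set, containerID)
	if len(set) > 0 {
		return false
	}
	delete(r.users, name)
	return true
}

func main() {
	reg := newNSRegistry()
	fmt.Println("create namespace:", reg.Acquire("shared", "app1"))
	fmt.Println("create namespace:", reg.Acquire("shared", "app2"))
	fmt.Println("destroy namespace:", reg.Release("shared", "app1"))
	fmt.Println("destroy namespace:", reg.Release("shared", "app2"))
}
```

Even this small piece of state needs locking and crash recovery once it is persisted, which illustrates how much container engine functionality the plugin would end up duplicating.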
Actually three features were mixed here (especially in option 1):
EDIT:
This issue is about bridging containers but the pod feature could also solve it. Currently I'd go with option 1b.
@AkihiroSuda please have a quick look over this. I may be missing something in my considerations.
Hi. This looks wrong:
Lines 44 to 45 in d3c3072
if m.Source == "" && m.Type != MOUNT_TYPE_TMPFS {
src = self.anonymous(m.Target)
I would have expected tmpfs to specifically want anonymous mount sources (== instead of !=).
Currently I have a ctnr run --mount type=tmpfs,dst=/foo failing with
WARN[0000] exit status 1
ERRO[0000] run process: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"rootfs_linux.go:58: mounting \\\"/home/tv\\\" to rootfs \\\"/home/tv/.ctnr/bundles/p2jen2rvarc43inrum66qzktna/rootfs\\\" at \\\"/foo\\\" caused \\\"invalid argument\\\"\""
And it really doesn't seem like a tmpfs should be trying to mount my home dir.
Then again, ctnr run --mount type=tmpfs,src=/tmp,dst=/foo shows my host /tmp inside the container, so it seems that tmpfs mounts don't work at all.