ctnr

ctnr is a CLI built on top of runc to manage and build OCI images as well as containers on Linux.
ctnr aims to ease system container creation and execution as an unprivileged user.
ctnr is also a tool to experiment with runc features.

THIS PROJECT IS NO LONGER MAINTAINED IN FAVOUR OF podman.

Features

  • OCI bundle and container preparation as well as execution as an unprivileged user using runc
  • OCI image build as an unprivileged user
  • Simple concurrently accessible image and bundle store
  • Image and bundle file system creation (based on umoci)
  • Various image formats and transports supported by containers/image
  • Container networking using CNI (optional, requires root, as OCI runtime hook)
  • Dockerfile support
  • Docker Compose 3 support (subset) using docker/cli (WIP)
  • Easy to learn: docker-like CLI
  • Easy installation: single statically linked binary (plus optional binaries: CNI plugins, proot) and convention over configuration

Rootless containers

In terms of accessibility, usability and security, a rootless container engine has several advantages:

  • Containers can be run by unprivileged users.
    Required in restrictive environments and useful for graphical applications.
  • Container images can be built in almost every Linux environment.
    More flexibility in unprivileged builds - nesting containers is also possible (see experiments and limitations).
  • A higher and more flexible degree of security.
    An attacker is less likely to gain root access when the engine runs as an unprivileged user.
    User/group-based container access control. Separation of responsibilities.

Limitations & challenges

Container execution as an unprivileged user is limited:

Container networking is limited. With plain ctnr/runc only the host network can be used, since the standard CNI plugins require root privileges.
One workaround is to map ports on the host network using PRoot*, accepting bad performance.
A better solution is to use slirp4netns, which efficiently emulates the TCP/IP stack in a user namespace. It can be used with ctnr via the slirp-cni-plugin. Once container initialization is also moved into a user namespace with slirp, the standard CNI plugins can be used again. For instance, the bridge plugin can be used to achieve communication between containers (see user-mode networking).
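For illustration, this is roughly how slirp4netns attaches a tap device to a running container's user and network namespaces (a minimal sketch; the PID and device name are placeholders, and the slirp-cni-plugin automates this wiring):

slirp4netns --configure --mtu=65520 <CONTAINER_PID> tap0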

Inside the container a process's or file's user cannot be changed. This is because all operations in the container are still run by the host user (who is merely mapped to user 0 inside the container). Unfortunately this stops many package managers as well as official docker images from working: while apk and dnf already work with plain runc, apt-get does not, since it needs to change users permanently.
To overcome this limitation ctnr supports the user.rootlesscontainers xattr and integrates with PRoot*.
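For example, a chown emulated via PRoot is not applied to the host file; instead the requested IDs are recorded in the user.rootlesscontainers xattr (a protobuf-encoded value). Assuming a bundle in ctnr's store, it could be inspected from the host like this (hypothetical path):

getfattr -n user.rootlesscontainers ~/.ctnr/bundles/<bundle>/rootfs/some/file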

For more details see Aleksa Sarai's summary of the state of the art of rootless containers.

* PRoot is a binary that hooks its child processes' kernel-space system calls using ptrace to simulate them in user space. This is more reliable but slower than hooking libc calls using LD_PRELOAD as fakechroot does.

Installation

Download the binary:

wget -O ctnr https://github.com/mgoltzsche/ctnr/releases/download/v0.7.0-alpha/ctnr.linux-amd64 &&
chmod +x ctnr &&
sudo mv ctnr /usr/local/bin/
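Verify the installation (assuming /usr/local/bin is on your PATH):

ctnr --help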

If you need PRoot or CNI plugins, you can build them by calling make proot cni-plugins-static within this repository's directory.
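For example (the binaries are written to dist/bin/, see the Build section below):

git clone https://github.com/mgoltzsche/ctnr.git
cd ctnr
make proot cni-plugins-static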

Build

Build the binary dist/bin/ctnr as well as dist/bin/cni-plugins on a Linux machine with git, make and docker:

git clone https://github.com/mgoltzsche/ctnr.git
cd ctnr
make

Install in /usr/local:

sudo make install

Optionally the project can now be opened with LiteIDE running in a ctnr container
(please note that building the LiteIDE container image takes some time):

make ide

Examples

The following examples assume that your policy accepts docker images or that you have copied image-policy-example.json to /etc/containers/policy.json on your host.
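For example (requires root and overwrites any existing policy):

$ sudo cp image-policy-example.json /etc/containers/policy.json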

Create and run container from Docker image

$ ctnr run docker://alpine:3.8 echo hello world
hello world

Create and run Firefox as unprivileged user

Build a Firefox ESR container image local/firefox:alpine (cached operation):

$ ctnr image build \
	--from=docker://alpine:3.8 \
	--author='John Doe' \
	--run='apk add --update --no-cache firefox-esr libcanberra-gtk3 adwaita-icon-theme ttf-ubuntu-font-family' \
	--cmd=firefox \
	--tag=local/firefox:alpine

Create and run a bundle named firefox from the previously built image:

$ ctnr run -b firefox --update \
	--env DISPLAY=$DISPLAY \
	--mount src=/tmp/.X11-unix,dst=/tmp/.X11-unix \
	--mount src=/etc/machine-id,dst=/etc/machine-id,opt=ro \
	local/firefox:alpine

(Unfortunately tabs in Firefox tend to crash.) The -b <BUNDLE> and --update options make this operation idempotent: the bundle's file system is reused and only recreated when the underlying image has changed. Use these options to restart containers very quickly. Without them, ctnr copies the image file system on bundle creation, which can take some time and disk space depending on the image's size.
These options also enable a container update on restart when the base image is updated more frequently than the child image is rebuilt; import the new base image using the following command:

$ ctnr image import docker://alpine:3.8
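After the import, re-running the run command shown above recreates the bundle's file system if the underlying image has changed (mount and environment options omitted here for brevity):

$ ctnr run -b firefox --update local/firefox:alpine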

Build Dockerfile as unprivileged user

This example shows how to build a Debian-based image with the help of PRoot.

Dockerfile Dockerfile-cowsay:

FROM debian:9
RUN apt-get update && apt-get install -y cowsay
ENTRYPOINT ["/usr/games/cowsay"]

Build the image (please note that this works only with --proot enabled; with plain ctnr/runc, apt-get fails to change uid/gid):

$ ctnr image build --proot --dockerfile Dockerfile-cowsay --tag example/cowsay

Run a container using the previously built image (please note that --proot is no longer required):

$ ctnr run example/cowsay hello from container
 ______________________
< hello from container >
 ----------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

Port mapping

ctnr supports port mapping using the -p, --publish option. Unprivileged users can use the --proot option in addition.

Port mapping as root using a contained CNI network

When a container is run as root in a contained network (--network default, the default as root), the portmap CNI plugin is used to map ports from a specified IP or the host network to the container.

Map the container network's port 80 to port 8080 on the host:

$ sudo ctnr run -p 8080:80 docker://alpine:3.8 nc -l -p 80 -e echo hello from container

Connectivity test on the host on another shell:

$ nc 127.0.0.1 8080
hello from container

Port mapping as unprivileged user using proot

Unprivileged users can enable the --proot option to map ports within the host network namespace on a syscall level.

Map bind/connect syscalls with port 80 to port 8080:

$ ctnr run --proot -p 8080:80 docker://alpine:3.8 nc -l -p 80 -e echo hello from container

You can now also run another container using the same port, as long as you don't map it to the same host port (proot maps it to a random free port and back within the container):

$ ctnr run --proot docker://alpine:3.8 /bin/sh -c 'nc -l -p 80 -e echo hello & sleep 1; timeout -t 1 nc 127.0.0.1 80'
hello

Connectivity test on the host on another shell:

$ nc 127.0.0.1 8080
hello from container

OCI specs and this implementation

An OCI image provides a base configuration and file system from which an OCI bundle can be created. The file system consists of a list of layers, represented by tar files, each containing the diff to its predecessor.
ctnr manages images in its local store directory in the OCI image layout format. Images are imported into the local store using the containers/image library. A new bundle is created by extracting the image's file system into a directory and deriving the bundle's default configuration from the image's configuration plus user-defined options.
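For reference, an OCI image layout directory as defined by the image spec looks roughly like this (the store location itself is an implementation detail of ctnr):

<image-store>/
  oci-layout          # layout format version marker
  index.json          # image index referencing manifests and tags
  blobs/sha256/...    # content-addressed configs, manifests and layer tars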

An OCI bundle describes a container by a configuration and a file system. Basically it is a directory containing a config.json file with the configuration and a subdirectory with the root file system.
ctnr manages bundles in its local store directory. Alternatively, a custom directory can also be used as a bundle. OCI bundles generated by ctnr can also be run with plain runc.
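For instance, a bundle from ctnr's store could be run with plain runc like this (a sketch; the bundle ID is a placeholder, and runc typically requires root unless configured for rootless operation):

$ cd ~/.ctnr/bundles/<bundle-id>
$ sudo runc run mycontainer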

An OCI container is a host-specific bundle instance. On Linux it is a set of namespaces in which a configured process can be run.
ctnr provides two wrapper implementations of the OCI runtime reference implementation runc/libcontainer: controlled by a compiler flag, it either uses an external runc binary or uses libcontainer directly (no runtime dependencies!).

Related tools

Roadmap

  • system.Context aware processes, unpacking/packing images
  • improved multi-user support (store per user group, file permissions, lock location)
  • CLI integration tests
  • advanced rootless networking (using a network daemon run by root)
  • separate OCI CNI network hook binary
  • health check
  • improved Docker Compose support
  • service discovery integration (hook / DNS; consul, etcd)
  • detached mode
  • systemd integration (cgroup, startup notification)
  • advanced logging
  • support additional read-only image stores


ctnr's Issues

Simplify CNI plugin/bridge usage for unprivileged users

User-mode networking is now supported within a user namespace. However, to allow containers to communicate with each other, nested containers currently need to be created, which makes for poor usability in most use cases.

Instead, users should also be able to set up all containers from within the host namespace and bridge them in a shared namespace with a single ctnr command.

Option 1: Move container execution into another, implicit, shared container

a) An implicit shared container with its own file system (from a stage image containing plugin binaries, as in rkt) is maintained to run nested containers.
Problem 1: If the shared container does not inherit the host's file system, external files cannot be resolved.
It is also not possible to mount them into the outer container, since this would require a container update, which would break other containers that have already been started there.
Problem 2: The container is not visible on the host anymore - only the outer container is. But at least this way terminating the outer container also terminates the children (although they may leak kernel resources when not properly terminated/unmounted).

b) Alternatively, the outer container could inherit the host's file system with only minimal isolation (userns, netns, mountns) to avoid breaking external file references and to keep containers visible on the host.
Problem 1: Plugin binaries cannot be provided.
=> In the outer container the stage image's file contents could be mounted over the rootfs, providing all required plugin binaries.
Problem 2: Child containers on the host cannot be associated with the parent. On container termination on the host, the bridge plugin would not be able to clean up the veth when it is no longer run within the outer container's netns, which cannot be enforced using plain runc.
=> The child containers could be mapped to the outer container by writing their state into a separate directory and using a naming convention. Thus the user would be made aware of the container hierarchy and deal with it explicitly, which is probably not a bad idea and can be simplified in high-level tooling/compose. In order to make containers communicate with each other, they may create a parent container/pod first and add containers or nested pods to it.

c) Another alternative that would provide both the necessary bridge namespace and the OCI/CNI plugin binaries would be to mount the host's namespace into a subdirectory of the container and rewrite file references accordingly. Unfortunately, the subdirectory prefix may still be shown to the user in error messages.

EDIT:
In general this option also supports running multiple containers as a pod. The outer/pod container would define the network and the child/app containers would remain in the outer container's network namespace.
This needs to be done for pods anyway. Once this is done, communication between pods can be achieved using another outer container - the same problem one layer higher. Container hierarchies should become part of the ctnr design, as 1b shows.

Option 2: Move OCI CNI network hook execution into other namespace

The OCI network hook would need to be configured with the namespaces it should enter before CNI execution.
Thus it could ensure that network deletion/termination is initiated from within the same namespace.
The container would be visible and could easily be controlled on the host.
The functionality could also be used with plain runc/other OCI runtimes.
Other CNI plugins could benefit from this approach as well: for instance, the existing portmap plugin could be used.

On the other hand it may limit CNI plugin capabilities, since it can only be applied to ALL plugins:
For instance, a plugin is planned that forwards/proxies ports from the host to the container's netns using socat or similar.
This plugin would require explicit configuration of the namespace to forward ports to, in addition to the container's namespace.
This would be acceptable, since the plugin could be used to connect any namespace this way.

Where/how should the userns/netns be created/looked up from within the OCI hook?
Create/join a namespace by name dynamically? When to remove it?
=> Make the OCI hook join an existing namespace only (tooling on top of runc must create/provide the namespace and remove it when no longer needed).
=> Make ctnr create/gc a container representing the namespace dynamically.
=> The stage container is still required to get the hook/plugin binaries. Back to option 1?
=> These two features should be separated, because one is about file system dependencies and the other about network namespaces, which users may want to combine independently of each other?!

Option 3: Extend bridge plugin to bridge custom userns/netns with container

The bridge plugin would get an additional parameter specifying the namespaces to bridge to instead of the current namespace.
Thus it could ensure that network deletion/termination is triggered from within the same namespace.
The container would be visible and could easily be controlled on the host.
A plugin to map ports using socat would not need additional configuration of the namespace to map ports to (except when it should work with a completely different namespace).

=> Problem: The netns configuration is dynamic. If it were part of a static plugin configuration, a dependency on the container engine (no!) or another mapping would need to be created.
=> Provide the userns/netns, or rather the process PID, in the runtime config as in the portmap plugin.
=> Problem: It is unclear for ctnr when to create the custom netns, since it doesn't interpret CNI network configurations - it would either always need to create it to be sure, or
=> require an additional parameter.

Option 4: Create new CNI plugin that enters custom netns and executes nested plugins there

=> Provides the most flexibility.
=> Problems: as in option 3.
=> The plugin should also manage the slirped shared network namespace itself: The namespace should have a simple host-independent name so that it can be configured statically within a CNI JSON file. The plugin should map the provided name to a namespace persistently. The plugin should also keep track of the namespace's users, terminate the slirp4netns process and destroy the shared namespace as soon as no container uses it anymore. This would also allow usage in other contexts, for instance with plain runc without ctnr, and decouples the network logic from the rest.
Problem: The plugin requires a lot of the container engine's functionality!

Result so far

Actually three features were mixed here (especially in option 1):

  • Support bridging multiple containers to a shared namespace.
  • Provide additional binaries to create a container using a stage image, as in rkt.
  • Support pods.

EDIT:
This issue is about bridging containers, but the pod feature could also solve it. Currently I'd go with option 1b.

@AkihiroSuda please have a quick look over this. I may be missing something in my considerations.

Inverted logic for resolving tmpfs mounts?

Hi. This looks wrong:

ctnr/model/resolve.go, lines 44 to 45 in d3c3072:

	if m.Source == "" && m.Type != MOUNT_TYPE_TMPFS {
		src = self.anonymous(m.Target)

I would have expected tmpfs to specifically want anonymous mount sources (== instead of !=).

Currently I have a ctnr run --mount type=tmpfs,dst=/foo failing with

WARN[0000] exit status 1                                
ERRO[0000] run process: container_linux.go:348: starting container process caused "process_linux.go:402: container init caused \"rootfs_linux.go:58: mounting \\\"/home/tv\\\" to rootfs \\\"/home/tv/.ctnr/bundles/p2jen2rvarc43inrum66qzktna/rootfs\\\" at \\\"/foo\\\" caused \\\"invalid argument\\\"\"" 

And it really doesn't seem like a tmpfs should be trying to mount my home dir.

Then again, ctnr run --mount type=tmpfs,src=/tmp,dst=/foo shows my host /tmp inside the container, so it seems that tmpfs mounts don't work at all.
