
sarus's Introduction

Sarus - An OCI-compatible container engine for HPC

License: BSD 3-Clause

Sarus is software for running Linux containers in High Performance Computing (HPC) environments. Its development has been driven by the specific requirements of HPC systems, while leveraging open standards and technologies to encourage vendor and community involvement.

Key features:

  • Spawning of isolated software environments (containers), built by users to fit the deployment of a specific application

  • Security oriented to HPC systems

  • Extensible runtime by means of OCI hooks to allow current and future support of custom hardware while achieving native performance

  • Creation of container filesystems tailored for diskless nodes and parallel filesystems

  • Compatibility with workload managers

  • Compatibility with the Open Container Initiative (OCI) standards:

    • Can pull images from registries adopting the OCI Distribution Specification or the Docker Registry HTTP API V2 protocol
    • Can import and convert images adopting the OCI Image Format
    • Sets up a container bundle complying with the OCI Runtime Specification
    • Uses an OCI-compliant runtime to spawn the container process

Accessing the documentation

The full documentation is available on Read the Docs.

If you wish to generate the documentation yourself, the sources are located in the doc directory and can be built using Python 3 and Sphinx:

cd doc
python3 -m venv ./venv
source venv/bin/activate
pip3 install -r requirements.txt
make html

Communications

If you think you've identified a security issue in the project, please DO NOT report the issue publicly via the GitHub issue tracker. Instead, send an email with as many details as possible to the following address: [email protected]. This address reaches the core Sarus maintainers directly.

To report bugs and request new features, you can use GitHub issues.

Contributing

Contributions to Sarus are welcome and greatly appreciated. Please refer to the CONTRIBUTING.md file for more detailed information about contributing.

Publications

  • Benedicic, L., Cruz, F.A., Madonna, A. and Mariotti, K., 2019, June. Sarus: Highly Scalable Docker Containers for HPC Systems. In International Conference on High Performance Computing (pp. 46-60). Springer, Cham.

    https://doi.org/10.1007/978-3-030-34356-9_5

sarus's People

Contributors

fawzi, finkandreas, jenkins-cscs, jgphpc, madeeks, michele-brambilla, rukkal, taliaga, tdhooks, teojgo


sarus's Issues

Config Sarus runtime using environment variables

As a user, I would like to be able to control Sarus by setting environment variables. In principle, the entire list of config options (https://sarus.readthedocs.io/en/latest/config/configuration_reference.html) could be controlled through environment variables.

For instance, siteMounts (https://sarus.readthedocs.io/en/latest/config/configuration_reference.html#sitemounts-array-optional) could be exposed as mount or mounts and defined as:

export SARUS_MOUNTS="type:bind,source:/home,destination:/home;type:bind,source:/apps,destination:/apps"

Or one could define environment variables using something like

export SARUS_ENVS="LD_PRELOAD=/my-path/bla.so;MYVAR=1"

The names and the syntax are not important; what is important is the control.
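To make the proposal concrete, here is a minimal sketch in Python of how such a value could be parsed, assuming the delimiters implied by the examples above (';' between entries, ',' between fields, ':' between key and value). The variable name and syntax are hypothetical:

```python
def parse_sarus_mounts(value):
    """Parse the hypothetical SARUS_MOUNTS syntax into a list of dicts.

    Entries are separated by ';', fields by ',', key and value by ':'.
    """
    mounts = []
    for entry in filter(None, value.split(";")):
        fields = dict(field.split(":", 1) for field in entry.split(","))
        mounts.append(fields)
    return mounts

example = "type:bind,source:/home,destination:/home;type:bind,source:/apps,destination:/apps"
```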

Failed to retrieve username for uid=xxxxx

Receiving the error mentioned in the subject. I'd welcome any ideas about what I might be doing wrong. Thanks.

-bash-4.2$ sarus --debug images
[6215047.318118181] [<redacted>] [CLI] [DEBUG] parsing CLI arguments of images command
[6215047.318211795] [<redacted>] [CommonUtility] [DEBUG] initializing CLI config's directories for local repository
Failed to retrieve username for uid=<redacted>
See 'sarus help images'
[6215047.345666132] [<redacted>] [main] [ERROR] Error trace (most nested error last):
#0   parseCommandArguments at "CommandImages.hpp":137 Failed to retrieve username for uid=<redacted>
See 'sarus help images'
-bash-4.2$

steps taken before trying the above command

sudo mkdir /opt/sarus
cd /opt/sarus
sudo wget https://github.com/eth-cscs/sarus/releases/download/1.3.0/sarus-Release.tar.gz
tar xf sarus-Release.tar.gz
cd 1.3.0-Release
sudo yum install squashfs-tools
sudo ./configure_installation.sh
export PATH=/opt/sarus/1.3.0-Release/bin:${PATH}

extra info:

File: ‘/opt/sarus/1.3.0-Release/bin/sarus’
Access: (6755/-rwsr-sr-x)  Uid: (    0/    root)   Gid: (    0/    root)
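The error comes from a user-database lookup: Sarus asks the system for the passwd entry of the calling uid (the equivalent of getpwuid) and fails if no entry is returned, typically because the node's NSS configuration (e.g. LDAP or sssd) does not serve that uid. The lookup can be reproduced with a few lines of Python:

```python
import pwd

def username_for_uid(uid):
    """Return the username for a uid, or None when the passwd database
    has no entry, which is the situation Sarus reports as
    'Failed to retrieve username for uid=...'."""
    try:
        return pwd.getpwuid(uid).pw_name
    except KeyError:
        return None
```

If this returns None for your uid on the node where Sarus runs, the problem is in the node's user database rather than in Sarus itself.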

powerpc support

Hi,

Would Sarus be expected to work on ppc64 machines, such as IBM HPC systems? For example, with this Docker image: ppc64le/centos

Thanks,
Lukas

RDMA failed to open device

Hello,
I am trying to run some MPI benchmarks with Sarus containers. In particular I am using OpenMPI 4.
Nodes are RDMA-capable and have InfiniBand. Everything works fine without the container, and if I run ibv_devinfo on the host I get:

hca_id: mlx5_0
        transport:                      InfiniBand (0)
        fw_ver:                         16.26.0206
        node_guid:                      0015:5dff:fe33:ff0d
        sys_image_guid:                 506b:4b03:00fb:f03a
        vendor_id:                      0x02c9
        vendor_part_id:                 4120
        hw_ver:                         0x0
        board_id:                       MT_0000000010
        phys_port_cnt:                  1
                port:   1
                        state:                  PORT_ACTIVE (4)
                        max_mtu:                4096 (5)
                        active_mtu:             4096 (5)
                        sm_lid:                 1
                        port_lid:               700
                        port_lmc:               0x00
                        link_layer:             InfiniBand

But if I run it inside a container, I get Failed to open device. So I tried to mount the device with a bind mount, but it does not work without sudo:

[user@controller1 ~]$ sarus run --mount=src=/dev/infiniband/uverbs0,dst=/dev/infiniband/uverbs0,type=bind nichr/hpc-bench:v2 bash
[895.208658764] [controller1-5327] [main] [ERROR] Error trace (most nested error last):
#0   createFoldersIfNecessary at "Utility.cpp":437 Failed to create directory "/opt/sarus/1.3.0-Release/var/OCIBundleDir/rootfs/dev/infiniband"
#1   "unknown function" at "unknown file":-1 boost::filesystem::create_directory: Permission denied: "/opt/sarus/1.3.0-Release/var/OCIBundleDir/rootfs/dev/infiniband"

On the other hand, it works with sudo and the device is recognized inside the container.

1. Is there any other way to mount the device without sudo?

The guide reports that I need to use the SSH hook in order to run OpenMPI.
But if I launch sarus with sudo, mount and srun:

[user@controller1 sarus]$ srun sudo /opt/sarus/1.3.0-Release/bin/sarus run --ssh --mount=src=/dev/infiniband/uverbs0,dst=/dev/infiniband/uverbs0,type=bind nichr/hpc-bench:v2 bash -c 'if [ $SLURM_PROCID -eq 0 ]; then mpirun -npernode 1 --allow-run-as-root --map-by node -mca pml ucx --mca btl ^vader,tcp,openib -x UCX_NET_DEVICES=mlx5_0:1 -x UCX_IB_PKEY=$UCX_IB_PKEY /opt/benchmarks/mpiBench/mpiBench -e 1K; else sleep infinity; fi'

I got:

bash: line 0: [: -eq: unary operator expected
bash: line 0: [: -eq: unary operator expected

2. If I use OpenMPI I need the SSH hook, am I right?


I have created the container with the following Dockerfile:

FROM centos:7.6.1810

# set up base
RUN yum install -y epel-release \
    && yum groupinstall -y "Development tools" \
    && yum install -y \
        libusbx pciutils-libs pciutils lsof ethtool fuse-libs \
        ca-certificates wget openssh-server openssh-clients net-tools \
        numactl-devel gtk2 atk cairo tcsh libnl3 tcl libmnl tk

# set up workdir
ENV INSTALL_PREFIX=/opt
WORKDIR /tmp/mpi

# download and install mlnx
RUN wget -q -O - http://content.mellanox.com/ofed/MLNX_OFED-5.1-0.6.6.0/MLNX_OFED_LINUX-5.1-0.6.6.0-rhel7.6-x86_64.tgz | tar -xzf - \
    && ./MLNX_OFED_LINUX-5.1-0.6.6.0-rhel7.6-x86_64/mlnxofedinstall --user-space-only --without-fw-update --all --force \
    && rm -rf MLNX_OFED_LINUX-5.1-0.6.6.0-rhel7.6-x86_64

# download and install HPC-X
ENV HPCX_VERSION="v2.7.0"
RUN cd ${INSTALL_PREFIX} && \
    wget -q -O - https://azhpcstor.blob.core.windows.net/azhpc-images-store/hpcx-v2.7.0-gcc9.2.0-MLNX_OFED_LINUX-5.1-0.6.6.0-redhat7.6-x86_64.tbz | tar -xjf - \
    && HPCX_PATH=${INSTALL_PREFIX}/hpcx-${HPCX_VERSION}-gcc-MLNX_OFED_LINUX-5.1-0.6.6.0-redhat7.6-x86_64 \
    && HCOLL_PATH=${HPCX_PATH}/hcoll \
    && UCX_PATH=${HPCX_PATH}/ucx

# download and install OpenMPI
ENV OMPI_VERSION="4.0.4"
RUN wget -q -O - https://download.open-mpi.org/release/open-mpi/v4.0/openmpi-${OMPI_VERSION}.tar.gz | tar -xzf - \
    && cd openmpi-${OMPI_VERSION} \
    && ./configure --with-ucx=${UCX_PATH} --with-hcoll=${HCOLL_PATH} --enable-mpirun-prefix-by-default \
    && make -j 8 && make install \
    && cd .. \
    && rm -rf openmpi-${OMPI_VERSION} 

# install and setup benchmarks
WORKDIR /opt/benchmarks

# download and install mpiBench
RUN wget -q -O - https://codeload.github.com/LLNL/mpiBench/tar.gz/master | tar -xzf - \
    && mv ./mpiBench-master ./mpiBench \
    && cd mpiBench/ \
    && make

# download and install osu micro benchmarks
RUN wget -q -O - http://mvapich.cse.ohio-state.edu/download/mvapich/osu-micro-benchmarks-5.6.3.tar.gz | tar -xzf - \
    && mv ./osu-micro-benchmarks-5.6.3 ./osu-micro-benchmarks \
    && cd osu-micro-benchmarks/ \
    && ./configure CC=mpicc CXX=mpicxx \
    && make \
    && make install

I am new to Sarus and HPC world, thank you for your support!

Too many cpus requested

On my PC with an AMD Ryzen 7 3700X and Linux 5.4.0, I'm facing an issue where the number of requested CPUs is too large, which causes the container to fail to start.

The generated config.json for runc contains ... "linux":{"resources":{"cpu":{"cpus":"0-31"}} ... which indeed corresponds to the Cpus_allowed_list:

$ cat /proc/self/status | grep Cpus_allowed_list
Cpus_allowed_list:	0-31

but I only have 8 cores / 16 threads:

$ nproc
16

The error I'm getting is

ERRO[0000] container_linux.go:349: starting container process caused "process_linux.go:297: applying cgroup configuration for process caused \"failed to write \\\"0-31\\\" to \\\"/sys/fs/cgroup/cpuset/container-ccxounuuahznpsds/cpuset.cpus\\\": write /sys/fs/cgroup/cpuset/container-ccxounuuahznpsds/cpuset.cpus: invalid argument\"" 

If I hard-code cpus to 0-15 everything is fine.
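A plausible reading of the report: the generated cpuset is taken from the process's Cpus_allowed_list, which the kernel reports as 0-31 (all possible CPU ids) even though only 16 CPUs are online, so writing 0-31 into cpuset.cpus fails. A small sketch of expanding such a list so it can be compared against the online CPU count (the helper name is mine):

```python
def parse_cpu_list(spec):
    """Expand a cpuset list such as '0-31' or '0-3,8' into a set of CPU ids."""
    cpus = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.update(range(int(lo), int(hi) + 1))
        else:
            cpus.add(int(part))
    return cpus

# On the reporter's machine: 32 allowed CPU ids, but nproc reports only 16.
allowed = parse_cpu_list("0-31")
```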

Sarus spack recipe inside main Spack repo?

Hi @Madeeks,

is there any specific reason why the Spack recipe is sitting in this repo rather than in the main Spack repo?
If there is any technical blocker, I am happy to try and have a look myself!

Best regards,
Marco

unable to build

Hi,

Any ideas why this fails? OS: Ubuntu 18.04.5 LTS.

$ sudo apt search rapidjson
Sorting... Done
Full Text Search... Done
rapidjson-dev/bionic,now 1.1.0+dfsg2-3 all [installed]
fast JSON parser/generator for C++ with SAX/DOM style API

...

[ 2%] Building CXX object src/common/CMakeFiles/common_library.dir/Utility.o
cd /home/torel/workspace/Sarus/sarus-1.3.2/build-x86_64/src/common && /usr/bin/c++ -DBOOST_ALL_NO_LIB -DBOOST_FILESYSTEM_DYN_LINK -DBOOST_NO_CXX11_SCOPED_ENUMS -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_RANDOM_DYN_LINK -DBOOST_REGEX_DYN_LINK -DBOOST_SYSTEM_DYN_LINK -DBOOST_THREAD_DYN_LINK -DCPPREST_FORCE_HTTP_CLIENT_ASIO -DCPPREST_FORCE_HTTP_LISTENER_ASIO -DCPPREST_NO_SSL_LEAK_SUPPRESS -I/home/torel/workspace/Sarus/sarus-1.3.2/src -isystem /cm/shared/apps/boost/gcc/1.71.0/include -MD -MT src/common/CMakeFiles/common_library.dir/Utility.o -MF CMakeFiles/common_library.dir/Utility.o.d -o CMakeFiles/common_library.dir/Utility.o -c /home/torel/workspace/Sarus/sarus-1.3.2/src/common/Utility.cpp
/home/torel/workspace/Sarus/sarus-1.3.2/src/common/Utility.cpp: In function ‘rapidjson::SchemaDocument sarus::common::readJSONSchema(const boost::filesystem::path&)’:
/home/torel/workspace/Sarus/sarus-1.3.2/src/common/Utility.cpp:970:94: error: no matching function for call to ‘rapidjson::GenericSchemaDocument<rapidjson::GenericValue<rapidjson::UTF8<> > >::GenericSchemaDocument()’
return rapidjson::SchemaDocument{ schemaJSON, nullptr, rapidjson::SizeType(0), &provider };
^
In file included from /home/torel/workspace/Sarus/sarus-1.3.2/src/common/Utility.hpp:26:0,
from /home/torel/workspace/Sarus/sarus-1.3.2/src/common/Utility.cpp:11:
/usr/include/rapidjson/schema.h:1405:5: note: candidate: rapidjson::GenericSchemaDocument<ValueType, Allocator>::GenericSchemaDocument(const rapidjson::GenericSchemaDocument<ValueType, Allocator>&) [with ValueT = rapidjson::GenericValue<rapidjson::UTF8<> >; Allocator = rapidjson::CrtAllocator]
GenericSchemaDocument(const GenericSchemaDocument&);
^~~~~~~~~~~~~~~~~~~~~
/usr/include/rapidjson/schema.h:1405:5: note: candidate expects 1 argument, 4 provided
/usr/include/rapidjson/schema.h:1378:5: note: candidate: rapidjson::GenericSchemaDocument<ValueType, Allocator>::GenericSchemaDocument(rapidjson::GenericSchemaDocument<ValueType, Allocator>&&) [with ValueT = rapidjson::GenericValue<rapidjson::UTF8<> >; Allocator = rapidjson::CrtAllocator]
GenericSchemaDocument(GenericSchemaDocument&& rhs) RAPIDJSON_NOEXCEPT :
^~~~~~~~~~~~~~~~~~~~~
/usr/include/rapidjson/schema.h:1378:5: note: candidate expects 1 argument, 4 provided
/usr/include/rapidjson/schema.h:1341:14: note: candidate: rapidjson::GenericSchemaDocument<ValueType, Allocator>::GenericSchemaDocument(const ValueType&, rapidjson::GenericSchemaDocument<ValueType, Allocator>::IRemoteSchemaDocumentProviderType*, Allocator*) [with ValueT = rapidjson::GenericValue<rapidjson::UTF8<> >; Allocator = rapidjson::CrtAllocator; rapidjson::GenericSchemaDocument<ValueType, Allocator>::ValueType = rapidjson::GenericValue<rapidjson::UTF8<> >; rapidjson::GenericSchemaDocument<ValueType, Allocator>::IRemoteSchemaDocumentProviderType = rapidjson::IGenericRemoteSchemaDocumentProvider<rapidjson::GenericSchemaDocument<rapidjson::GenericValue<rapidjson::UTF8<> > > >]
explicit GenericSchemaDocument(const ValueType& document, IRemoteSchemaDocumentProviderType* remoteProvider = 0, Allocator* allocator = 0) :
^~~~~~~~~~~~~~~~~~~~~
/usr/include/rapidjson/schema.h:1341:14: note: candidate expects 3 arguments, 4 provided
/home/torel/workspace/Sarus/sarus-1.3.2/src/common/Utility.cpp: In function ‘rapidjson::Document sarus::common::readAndValidateJSON(const boost::filesystem::path&, const boost::filesystem::path&)’:
/home/torel/workspace/Sarus/sarus-1.3.2/src/common/Utility.cpp:1007:24: error: ‘class rapidjson::SchemaValidatingReader<0, rapidjson::BasicIStreamWrapper<std::basic_istream >, rapidjson::UTF8<> >’ has no member named ‘GetError’
reader.GetError().Accept(w);
^~~~~~~~
src/common/CMakeFiles/common_library.dir/build.make:246: recipe for target 'src/common/CMakeFiles/common_library.dir/Utility.o' failed
make[2]: *** [src/common/CMakeFiles/common_library.dir/Utility.o] Error 1
make[2]: Leaving directory '/global/D1/homes/torel/workspace/Sarus/sarus-1.3.2/build-x86_64'
CMakeFiles/Makefile2:657: recipe for target 'src/common/CMakeFiles/common_library.dir/all' failed
make[1]: *** [src/common/CMakeFiles/common_library.dir/all] Error 2
make[1]: Leaving directory '/global/D1/homes/torel/workspace/Sarus/sarus-1.3.2/build-x86_64'
Makefile:148: recipe for target 'all' failed
make: *** [all] Error 2

Avoiding ldconfig

I've stumbled upon what is likely a long-standing issue in ldconfig that prevents me from running it. It seems to be necessary for sarus run --mpi .. to work.

To reduce the size of my docker image I'm running a bundling tool that collects all executables in a bin/ folder and recursively all dependent shared libraries in a lib/ folder. Then it changes the RUNPATH of the binaries using the patchelf tool so that ldd can resolve them. Everything is fine, it seems I can run the binaries, however, ldconfig chokes on it in the following way:

root@afd0fbda1255:~# ldconfig ~/SIRIUS.AppDir/usr/lib/
/sbin/ldconfig.real: file /root/SIRIUS.AppDir/usr/lib/libgsl.so.23 is truncated
...

This is apparently a 10 year old issue (https://nix-dev.science.uu.narkive.com/q6Ww5fyO/ldconfig-problem-with-patchelf-and-64-bit-libs, NixOS/patchelf#44, mesonbuild/meson#4685, and more), and apparently everybody seems to be working around it.

Can we somehow avoid relying on ldconfig in sarus so that we don't run into this problem?

Since my executables are configured to use RUNPATH, it should be enough to mount MPI into the container and set an environment variable LD_LIBRARY_PATH=/path/to/mpi/lib, since that takes precedence.

Pull RepoDigest

Pulling an image based on a sha256 hash is not currently possible. I reckon it would benefit reproducibility to be able to pull an image based on the RepoDigest:

$ docker inspect -f '{{.RepoDigests}}' qnib/uplain-osu-benchmark
[qnib/uplain-osu-benchmark@sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396]

Docker allows downloading an image based on the digest:

$ docker pull qnib/uplain-osu-benchmark@sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396
sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396: Pulling from qnib/uplain-osu-benchmark
Digest: sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396
Status: Image is up to date for qnib/uplain-osu-benchmark@sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396

Sarus breaks when using a digest:

$ sarus pull qnib/uplain-osu-benchmark@sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396
# image            : index.docker.io/qnib/uplain-osu-benchmark@sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396
# cache directory  : "/home/ubuntu/.sarus/cache"
# temp directory   : "/tmp"
# images directory : "/home/ubuntu/.sarus/images"
Failed authentication for image 'index.docker.io/qnib/uplain-osu-benchmark@sha256:cd7957a4291dc0a6b34c4ae033c9b224fd2647fc1337459bfbf34f9ba046e396'
Did you perform a login with the proper credentials?
See 'sarus help pull' (--login option)
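For reference, a digest reference differs from a tag reference only in its separator: name@sha256:<hash> instead of name:tag. A rough sketch of splitting a reference the way a pull command would need to (the function name is mine; real registry clients handle more corner cases):

```python
def split_reference(ref):
    """Split an image reference into (name, tag, digest).

    A digest reference uses '@' (e.g. repo@sha256:...); a tag uses the
    last ':' unless that colon belongs to a registry port.
    """
    digest = None
    if "@" in ref:
        ref, digest = ref.split("@", 1)
    name, sep, tag = ref.rpartition(":")
    if not sep or "/" in tag:  # no tag, or the ':' was a registry port
        name, tag = ref, None
    return name, tag, digest
```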

Cannot run an OCI image in Sarus

$ sarus pull stabbles/sarus-breaks
[153440.815513349] [nid003192-256744] [main] [ERROR] Error trace (most nested error last):
#0   parseJSON at "Utility.cpp":1027 Error parsing JSON string:
'{
"mediaType":"application/vnd.oci.image.layer.v1.tar+gzip",
"digest":"sha256:342a65410489fb8ec0e87f1f9af0b7da91ba922f689345b0007380e19cc2703f",
"size":18007253
}
],
"annotations":{
"org.opencontainers.image.description":"mvapich@=3.0b%gcc@=12.1.0~alloca~cuda~debug+regcache+wrapperrpath build_system=autotools ch3_rank_bits=32 file_systems=auto netmod=ofi pmi_version=simple process_managers=auto threads=multiple arch=linux-ubuntu22.04-x86_64_v2"
}
}'
Input data is not valid JSON
Error(offset 163): The document root must not be followed by other values.

This image runs fine elsewhere.

Issue when just appending flags for image with entrypoint set

Currently I cannot do:

$ sarus run gcr.io/kaniko-project/executor --dockerfile something
Invalid image ID ["gcr.io/kaniko-project/executor", "--dockerfile"]
The image ID is expected to be a single token without options
See 'sarus help run'

Whereas docker run gcr.io/kaniko-project/executor --dockerfile /something works.

Note that I cannot just provide the binary like docker run gcr.io/kaniko-project/executor /kaniko/executor --dockerfile /something, because there is no shell or anything (see https://github.com/GoogleContainerTools/kaniko/blob/master/deploy/Dockerfile#L35)

The workaround seems to be adding '' as an initial argument:

$ sarus run gcr.io/kaniko-project/executor '' --dockerfile something

Can this be avoided?

Error sending request to remote registry

I've installed Sarus on a compute node on one of our HPC clusters in Norway.
The compute nodes do not have direct access to the Internet.
We use a proxy for this.
When we use wget we set the shell variable http_proxy.

Is it possible to set a proxy for Sarus?

Of course, I may have misinterpreted the whole issue—here is what happens:

[mhu027@c13-6 ~]$ sarus pull alpine
# image            : index.docker.io/library/alpine:latest
# cache directory  : "/home/mhu027/.sarus/cache"
# temp directory   : "/tmp"
# images directory : "/home/mhu027/.sarus/images"
[2346780.841753263] [c13-6-79584] [main] [ERROR] Error trace (most nested error last):
#0   retrieveImageManifest at "Puller.cpp":333 Error while sending request for manifest to remote registry
#1   "unknown function" at "unknown file":-1 Request canceled by user.

I tested the same thing successfully from a virtual machine (that has full Internet access).

Race condition with static auth.json filename

authFilePath = authFileBasePath / "auth.json";

The auth file used when pulling an image from a registry that requires authentication has a race condition. Consider the following workflow:

  • Pull1 starting for my.registry.com/container_image_1:latest
  • Pull2 starting for my.registry.com/container_image_2:latest
  • Pull1 writing authentication information to auth.json for my.registry.com/container_image_1
  • Pull2 writing authentication information to auth.json for my.registry.com/container_image_2
  • Pull1 reaching the point where it wants to actually do skopeo copy --src-authfile auth.json
  • Pull2 reaching the point where it wants to actually do skopeo copy --src-authfile auth.json

At this point Pull2 would succeed, because its authentication is in auth.json, but Pull1 will fail, because its authentication information was overwritten by Pull2.

An obvious way to fix this is to have a unique name for the authentication file.
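A minimal sketch of that fix, using a per-process uniquely named file in place of the fixed auth.json (function and file naming are mine):

```python
import json
import os
import tempfile

def write_auth_file(base_dir, auth):
    """Write registry credentials to a uniquely named file so that
    concurrent pulls cannot overwrite each other's auth.json."""
    fd, path = tempfile.mkstemp(prefix="auth-", suffix=".json", dir=base_dir)
    with os.fdopen(fd, "w") as f:
        json.dump(auth, f)
    return path
```

With this, Pull1 and Pull2 each pass their own file to skopeo copy --src-authfile and never observe each other's credentials.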

home basedir hard-coded

When running sarus pull, it tries to write to /home/$USER/.sarus/ instead of $HOME/.sarus/.

[[email protected] ~]$ sarus pull alpine
# image            : index.docker.io/library/alpine:latest
# cache directory  : "/home/submarco/.sarus/cache"
# temp directory   : "/tmp"
# images directory : "/home/submarco/.sarus/images"
> save image layers ...
[704118.455223156] [vm-13-8629] [main] [ERROR] Error trace (most nested error last):
#0   createFoldersIfNecessary at "Utility.cpp":460 Failed to create directory "/home/submarco"
#1   "unknown function" at "unknown file":-1 boost::filesystem::create_directory: Permission denied: "/home/submarco"
[[email protected] ~]$ echo $HOME
/home/sub/submarco

This is relevant, for instance, on our compute nodes, where users' home directories are under /cluster/home/${USER}.
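The robust lookup is the home directory from the passwd entry (which is also what $HOME is initialized from at login), not a path assembled as /home/<user>. The difference, sketched in Python (the helper names are mine):

```python
import os
import pwd

def home_hardcoded(user):
    """What the reporter observed: a path assembled as /home/<user>."""
    return os.path.join("/home", user)

def home_from_passwd(uid):
    """What is expected: the home directory recorded in the passwd
    entry, e.g. /cluster/home/<user> on the reporter's system."""
    return pwd.getpwuid(uid).pw_dir
```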

sarus run --pull

docker run --pull ensures the latest version of the image is pulled from the repository. Currently we have a sarus pull before the sarus run invocation, but this takes 30 s to 1 min each time, as it builds a new squashfs image. Is it possible to not incur this cost on each run, yet still ensure we have the latest image? If so, I suggest adding this as a --pull flag to mimic the Docker interface.

Going rootless

Hi all,

I've experimented a bit with runc's rootless containers plus additional Linux capabilities using sarus. I think rootless containers will become quite popular with the next major release of Docker, and I think they provide the perfect trade-off between flexibility and security. (Rootless in this context means dropping the setuid privileges before executing the container command.)

The main reason to look into this is being able to build images inside a container running in the sarus runtime, which is currently impossible (#10). It's also currently impossible to run package manager commands like apt-get [...] inside an ubuntu container with sarus.

To solve these two problems, it seems we need a few Linux capabilities, to be precise: CAP_CHOWN, CAP_SETUID, CAP_SETGID, CAP_FOWNER, and CAP_DAC_OVERRIDE.

In the current situation we cannot grant those capabilities in sarus because they are too powerful. E.g. a user could chown a root-owned file from a mounted directory to make themselves the owner, and there are probably more issues.

With user namespaces, however, this is no longer an issue. We can drop the seteuid and setegid privileges right before executing the container command, so that the container is executed as the current user, and then use namespaces with a user mapping to map the current user to root inside the container. This solves at least the obvious issues with mounting root-owned files (even when the user has CAP_CHOWN permissions):

# create a file owned by root that cannot be read by others, and verify it cannot be chown'ed when mounted inside the container
test-escalation-sarus $ echo "hi" > root-owned-file.txt
test-escalation-sarus $ sudo chown root:root root-owned-file.txt
test-escalation-sarus $ sudo chmod go-rw root-owned-file.txt
test-escalation-sarus $ sarus run --mount=type=bind,src=`pwd`,destination=/workspace -t ubuntu:18.04 /bin/bash
root@harmen-desktop:/# cd workspace/
root@harmen-desktop:/workspace# id
uid=0(root) gid=0(root) groups=0(root),65534(nogroup)
root@harmen-desktop:/workspace# cat root-owned-file.txt
cat: root-owned-file.txt: Permission denied
root@harmen-desktop:/workspace# chown harmen:harmen root-owned-file.txt
chown: changing ownership of 'root-owned-file.txt': Operation not permitted

Another great feature of namespaces is that files created as root inside of a mounted directory are in fact owned by the current user outside of the container.

The only potential issue at the moment seems to be that cgroups are not yet handled well with rootless containers, but the runc folks seem to have a workaround using cgroups v2, which is nearly finished.

Also note that this seems like a step toward making sarus not a setuid binary. Because we have to mount things, we can probably never entirely get rid of that, but with rootless containers we can at least drop the privileges before executing container commands.

I have a working example of everything here: develop...haampie:rootless (not too many changes). If you want to compile it, you need to copy some hard-coded values from /etc/subuid and /etc/subgid.

With the above I can make sarus do all the things I would wish to do :) e.g.

$ sarus run ubuntu:18.04 /bin/bash -c 'apt-get update -qq && apt-get install --no-install-recommends -qq cowsay && /usr/games/cowsay "Hello from rootless sarus"'
debconf: delaying package configuration, since apt-utils is not installed
Selecting previously unselected package perl-modules-5.26.
(Reading database ... 4046 files and directories currently installed.)
Preparing to unpack .../0-perl-modules-5.26_5.26.1-6ubuntu0.3_all.deb ...
Unpacking perl-modules-5.26 (5.26.1-6ubuntu0.3) ...
Selecting previously unselected package libgdbm5:amd64.
Preparing to unpack .../1-libgdbm5_1.14.1-6_amd64.deb ...
Unpacking libgdbm5:amd64 (1.14.1-6) ...
Selecting previously unselected package libgdbm-compat4:amd64.
Preparing to unpack .../2-libgdbm-compat4_1.14.1-6_amd64.deb ...
Unpacking libgdbm-compat4:amd64 (1.14.1-6) ...
Selecting previously unselected package libperl5.26:amd64.
Preparing to unpack .../3-libperl5.26_5.26.1-6ubuntu0.3_amd64.deb ...
Unpacking libperl5.26:amd64 (5.26.1-6ubuntu0.3) ...
Selecting previously unselected package perl.
Preparing to unpack .../4-perl_5.26.1-6ubuntu0.3_amd64.deb ...
Unpacking perl (5.26.1-6ubuntu0.3) ...
Selecting previously unselected package libtext-charwidth-perl.
Preparing to unpack .../5-libtext-charwidth-perl_0.04-7.1_amd64.deb ...
Unpacking libtext-charwidth-perl (0.04-7.1) ...
Selecting previously unselected package cowsay.
Preparing to unpack .../6-cowsay_3.03+dfsg2-4_all.deb ...
Unpacking cowsay (3.03+dfsg2-4) ...
Setting up perl-modules-5.26 (5.26.1-6ubuntu0.3) ...
Setting up libgdbm5:amd64 (1.14.1-6) ...
Setting up libtext-charwidth-perl (0.04-7.1) ...
Setting up libgdbm-compat4:amd64 (1.14.1-6) ...
Setting up libperl5.26:amd64 (5.26.1-6ubuntu0.3) ...
Setting up perl (5.26.1-6ubuntu0.3) ...
Setting up cowsay (3.03+dfsg2-4) ...
Processing triggers for libc-bin (2.27-3ubuntu1) ...
 ___________________________
< Hello from rootless sarus >
 ---------------------------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||

InfiniBand support

One of our users has tested Sarus on our cluster. He gets poor performance, and we think it is because InfiniBand is not used efficiently. This test was with Sarus 1.4.2 (one release behind).

If our assumption is correct, are there plans or is it possible to provide InfiniBand support, e.g. through an OCI hook?

`sarus run` failing without a tag when latest is not available

This is not urgent at all but would be nice to have :)

If I have one ubuntu image available, e.g. with tag jammy, I would like to be able to run sarus run -t ubuntu without specifying the tag.
This command errors out:

$  sarus --version
1.6.2
$  sarus images
REPOSITORY   TAG          IMAGE ID       CREATED               SIZE         SERVER
ubuntu       jammy        174c8c134b2a   2024-01-11T15:36:32   28.40MB      docker.io
$  sarus run -t ubuntu
Image docker.io/library/ubuntu:latest is not available. Attempting to look for equivalent image in index.docker.io server repositories
Image index.docker.io/library/ubuntu:latest is not available
$  sarus run -t ubuntu:jammy
emily@desktop:/$ 

podman equivalent:

$  podman images
REPOSITORY                      TAG         IMAGE ID      CREATED        SIZE
docker.io/library/ubuntu        jammy       174c8c134b2a  4 weeks ago    80.4 MB
$  podman run -t ubuntu
root@78280fae6072:/# 

Comparison to other tools like singularity

Hi,

I just found sarus by accident and it looks really interesting. In my experience, Singularity has a lot of mindshare in the containers-on-HPC space. It would be great if you could add a section to the docs explaining what sarus does differently from the other available options, e.g. Singularity.

Thanks

Option to whitelist/blacklist captured host's environment variables

It would be nice if we could have an option to whitelist and/or blacklist the environment variables that are captured from the host's environment.

The use case is application managers inside the container that identify the system via environment variables; e.g. Spack decides the system is a Cray by the existence of the environment variable CRAYPE_VERSION (https://spack.readthedocs.io/en/latest/getting_started.html#using-linux-containers-on-cray-machines). In this case, one has to unset CRAYPE_VERSION before running Spack with Sarus.
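A sketch of what such a blacklist could look like, filtering the captured host environment with glob patterns before it is handed to the container (the option name and semantics are hypothetical):

```python
import fnmatch

def filter_host_env(env, blacklist):
    """Drop host variables matching any blacklist glob, e.g. 'CRAYPE_*',
    so that tools like Spack inside the container do not detect the
    host system type."""
    return {k: v for k, v in env.items()
            if not any(fnmatch.fnmatch(k, pattern) for pattern in blacklist)}
```

A whitelist would be the mirror image: keep only the variables that match a pattern.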

Runtime Flag to change working directory

This way, when using third-party container images, it would still be possible to customise the container's working directory at runtime.
It might have a syntax similar to Docker's: --workdir / -w

Preserve color output

It would be great if sarus could preserve color output when running e.g. sarus run [image] ctest. Docker seems to do this as well.

MWE (minimal working example):

docker run --rm ubuntu:18.04 ls --color=auto
sarus run ubuntu:18.04 ls --color=auto

FWIW, TERM seems to be fine inside the container:

$ sarus run ubuntu:18.04 echo $TERM
xterm-256color
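A plausible explanation (an assumption, not verified against the Sarus sources): `ls --color=auto` emits color codes only when stdout is a terminal, so the difference may come down to whether a pseudo-terminal is allocated for the container process, independently of TERM. The detection can be reproduced with plain `sh`:

```shell
# --color=auto keys off isatty(stdout): inside a pipe or a command
# substitution, stdout is not a tty, so color is suppressed even
# though TERM is set correctly.
sh -c 'if [ -t 1 ]; then echo "stdout is a tty"; else echo "stdout is not a tty"; fi'
# Forcing color bypasses the tty check entirely:
#   sarus run ubuntu:18.04 ls --color=always
```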

Support for insecure registries

Currently, when attempting to pull from an insecure registry, the pull fails like so:

tdhooks@pop-os:~$ sarus pull localhost:5000/sarus/alpine:latest
# image            : localhost:5000/sarus/alpine:latest
# cache directory  : "/home/tdhooks/.sarus/cache"
# temp directory   : "/tmp"
# images directory : "/home/tdhooks/.sarus/images"
[1674.682032881] [pop-os-30306] [main] [ERROR] Error trace (most nested error last):
#0   retrieveImageManifest at "Puller.cpp":324 Error while sending request for manifest to remote registry
#1   "unknown function" at "unknown file":-1 Error in SSL handshake

Having skimmed through Puller.cpp, it looks like Sarus is hard-coded to support pulls only from registries with valid HTTPS (see Puller.cpp:549).

Docker allows the use of insecure registries mainly through the insecure-registries config field in /etc/docker/daemon.json. This is very useful for those who want to use LAN-secured registries without certificates, or simply for testing. Could/should this become a new feature, configurable through a similar field in sarus.json or a sarus pull --insecure flag?

Note: I tested this with certificate-less localhost and non-localhost registries, with the same result. I haven't tested with a self-signed certificate, but I suspect that will fail as well, if not with the same error.
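For reference, the corresponding Docker configuration is the fragment below in /etc/docker/daemon.json; a Sarus counterpart could follow the same shape in sarus.json (any such Sarus field would be new, so this is illustration only, and the registry addresses are placeholders):

```json
{
    "insecure-registries": ["localhost:5000", "registry.lan:5000"]
}
```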

Help needed configuring the OCI NVIDIA hook

Hello Team,

Could you help me out with the following error? What am I doing wrong? Thank you.

Error

$ sarus run nvidia/cuda:10.0-base nvidia-smi
ERRO[0000] container_linux.go:349: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH"
container_linux.go:349: starting container process caused "exec: \"nvidia-smi\": executable file not found in $PATH"

Hook configuration

$ cat /opt/sarus/1.3.0-Release/etc/hooks.d/oci-nvidia-hook.json
{
    "version": "1.0.0",
    "hook": {
        "path": "/usr/bin/nvidia-container-toolkit",
        "args": ["nvidia-container-toolkit", "prestart"],
        "env": [
            "PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin"
        ]
    },
    "when": {
        "always": true,
        "commands": [".*"]
    },
    "stages": ["prestart"]
}
$ which nvidia-smi
/usr/bin/nvidia-smi
$ which nvidia-container-toolkit
/usr/bin/nvidia-container-toolkit
$ sudo docker run --rm -e NVIDIA_VISIBLE_DEVICES=all nvidia/cuda:10.0-base nvidia-smi
Thu Oct 22 16:11:40 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.44       Driver Version: 440.44       CUDA Version: 10.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Quadro K2000        Off  | 00000000:07:00.0 Off |                  N/A |
| 30%   38C    P0    N/A /  N/A |      0MiB /  1999MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Stopping Sarus containers

Hi!

We are using Sarus containers to host serverless functions on HPC machines, so our case differs from a typical MPI job: the application running inside the container does not run to completion. Instead, it keeps running until the serverless resource manager kills it.

With Docker, we can stop containers easily by sending signals, either via the CLI or via an HTTP request to the daemon. This matters because we want to send SIGINT or SIGTERM to allow a graceful shutdown. In Sarus, we have no such option. When we start a new container using fork + exec, the new process runs as root, and we are not allowed to send any signals to it. Using Sarus on Piz Daint, we noticed that this process spawns another root-owned process, and only that one starts the user-owned process executing our containerized application. Only then can we send signals and properly terminate our containers. We developed a method that queries the children of the main container process until it finds our application, but this method does not seem very robust, and it depends on the internal implementation of Sarus.

Is there a more reliable method of stopping running containers that does not depend on Sarus internals?
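For context, the workaround described above can be sketched as a short script. This is our own illustration, not a supported Sarus interface, and it is fragile by design because it depends on Sarus' internal process layout:

```shell
#!/bin/sh
# Descend the process tree below a `sarus run` PID until a process
# owned by the invoking user is found, then signal it so the
# containerized application can shut down gracefully.
find_user_child() {
    for child in $(ps -o pid= --ppid "$1"); do
        # Compare numeric uids to avoid username truncation in ps output.
        if [ "$(ps -o uid= -p "$child" | tr -d ' ')" = "$(id -u)" ]; then
            echo "$child"
            return 0
        fi
        find_user_child "$child" && return 0
    done
    return 1
}

# Usage: stop_container.sh <pid-of-sarus-run>
if [ -n "$1" ]; then
    target=$(find_user_child "$1") && kill -TERM "$target"
fi
```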

CI with sarus requires docker-in-docker style containers

I'm migrating some projects to use sarus as part of their CI pipelines, which so far works well.

However, some projects have tests set up to generate and run srun [options] sarus [image] [test_command] commands. Unfortunately, when the entrypoint of CI is a container, this cannot work, as Slurm and Sarus are not available inside the container itself. The current workarounds are (1) to extract the generated srun commands from the container and execute them in a normal shell outside of it, or (2) to collect the srun commands outside of CI and have a trusted person review and update them by hand. But option (1) defeats the purpose of containers, and option (2) is not dynamic enough when new tests are added frequently, especially when pull requests are open and multiple versions of the software are around.

In many CI environments it is common practice to let the user specify the container in which execution starts; this can in fact be a Docker-in-Docker container, so that the user can run docker run ... inside the container without having to escape it.

I think a similar solution for Sarus + Slurm would be incredibly useful as well: we could make the entrypoint of CI a sarus-in-sarus container, in which the user could allocate resources, download images, and run them through Sarus.

What I would like to know is whether this would be easy to manage, and whether this comes with performance or security penalties.
