milantracy commented on June 30, 2024

Do you have a memory limit set for the container?

from gvisor.

danielnorberg commented on June 30, 2024

> do you have a memory limit set for the container?

No, there is no memory limit set for the container.

There are also no oom log entries and no memory pressure on the host.

danielnorberg commented on June 30, 2024

On an m7i with Ubuntu 22.04, docker instead hangs when starting a container using runsc:

$ docker run --runtime=runsc-debug --rm hello-world
<hang>
^C^C^C^Z^Z^Z
<still hung>
$ ll /tmp/runsc-debug/
...
-rw-r--r--  1 root root    3076 Nov 29 09:43 runsc.log.20231129-094346.714655.kill.txt
-rw-r--r--  1 root root    3076 Nov 29 09:43 runsc.log.20231129-094346.966621.kill.txt
-rw-r--r--  1 root root    3076 Nov 29 09:43 runsc.log.20231129-094347.022673.kill.txt
$ cat /tmp/runsc-debug/runsc.log.20231129-094347.022673.kill.txt
I1129 09:43:47.022701    5663 main.go:189] ***************************
I1129 09:43:47.022727    5663 main.go:190] Args: [/usr/local/bin/runsc --debug --debug-log=/tmp/runsc-debug/ --strace --log-packets --root /var/run/docker/runtime-runc/moby --log /run/containerd/io.containerd.runtime.v2.task/moby/f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2/log.json --log-format json --systemd-cgroup kill f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2 20]
I1129 09:43:47.022737    5663 main.go:191] Version release-20231113.0
I1129 09:43:47.022741    5663 main.go:192] GOOS: linux
I1129 09:43:47.022744    5663 main.go:193] GOARCH: amd64
I1129 09:43:47.022747    5663 main.go:194] PID: 5663
I1129 09:43:47.022751    5663 main.go:195] UID: 0, GID: 0
I1129 09:43:47.022754    5663 main.go:196] Configuration:
I1129 09:43:47.022758    5663 main.go:197] 		RootDir: /var/run/docker/runtime-runc/moby
I1129 09:43:47.022761    5663 main.go:198] 		Platform: systrap
I1129 09:43:47.022765    5663 main.go:199] 		FileAccess: exclusive
I1129 09:43:47.022769    5663 main.go:200] 		Directfs: true
I1129 09:43:47.022772    5663 main.go:201] 		Overlay: root:self
I1129 09:43:47.022776    5663 main.go:202] 		Network: sandbox, logging: true
I1129 09:43:47.022780    5663 main.go:203] 		Strace: true, max size: 1024, syscalls:
I1129 09:43:47.022784    5663 main.go:204] 		IOURING: false
I1129 09:43:47.022787    5663 main.go:205] 		Debug: true
I1129 09:43:47.022793    5663 main.go:206] 		Systemd: true
I1129 09:43:47.022796    5663 main.go:207] ***************************
D1129 09:43:47.022806    5663 state_file.go:78] Load container, rootDir: "/var/run/docker/runtime-runc/moby", id: {SandboxID: ContainerID:f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2}, opts: {Exact:false SkipCheck:false TryLock:false RootContainer:false}
D1129 09:43:47.023613    5663 container.go:673] Signal container, cid: f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2, signal: signal 0 (0)
D1129 09:43:47.023622    5663 sandbox.go:1211] Signal sandbox "f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2"
D1129 09:43:47.023626    5663 sandbox.go:613] Connecting to sandbox "f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2"
D1129 09:43:47.023672    5663 urpc.go:568] urpc: successfully marshalled 144 bytes.
D1129 09:43:47.023918    5663 urpc.go:611] urpc: unmarshal success.
D1129 09:43:47.023928    5663 container.go:673] Signal container, cid: f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2, signal: stopped (20)
D1129 09:43:47.023933    5663 sandbox.go:1211] Signal sandbox "f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2"
D1129 09:43:47.023936    5663 sandbox.go:613] Connecting to sandbox "f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2"
D1129 09:43:47.023951    5663 urpc.go:568] urpc: successfully marshalled 145 bytes.
D1129 09:43:47.024237    5663 urpc.go:611] urpc: unmarshal success.
I1129 09:43:47.024244    5663 main.go:224] Exiting with status: 0
$ ps aux | grep runsc
...
root        3883  5.5  0.0 2342892 32288 ?       Ssl  09:38   0:26 runsc-sandbox --log-format=json --debug-log=/tmp/runsc-debug/ --log-packets=true --strace=true --systemd-cgroup=true --root=/var/run/docker/runtime-runc/moby --debug=true --log=/run/containerd/io.containerd.runtime.v2.task/moby/f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2/log.json --log-fd=3 --debug-log-fd=4 boot --apply-caps=false --bundle=/run/containerd/io.containerd.runtime.v2.task/moby/f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2 --controller-fd=12 --cpu-num=16 --dev-io-fd=-1 --gofer-filestore-fds=9 --gofer-mount-confs=lisafs:self,lisafs:none,lisafs:none,lisafs:none --io-fds=5,6,7,8 --mounts-fd=10 --setup-root=false --spec-fd=13 --start-sync-fd=11 --stdio-fds=14,15,16 --total-host-memory=66326761472 --total-memory=66326761472 --product-name=m7i.4xlarge --proc-mount-sync-fd=23 f84bc25ff786a53ae655439d1ec56ad6cfb1966b089a5dbd250aada14ed1fee2
$ sudo kill 3883
<docker still hung>
$ sudo kill -9 3883
<docker cli exits>

ayushr2 commented on June 30, 2024

Could you share boot.txt logs (or rather all logs like start, create, etc) on m7i with Ubuntu 20.04?

danielnorberg commented on June 30, 2024

Sure, here are all the runsc debug logs from an m7i with Ubuntu 20.04: https://gist.github.com/danielnorberg/db9460552d393ae15e7d66588792ec3b (the same logs linked in the original issue description; too large to include verbatim)

ayushr2 commented on June 30, 2024

Whoops, missed the original link. Thanks.

Hmm seems like the boot process just dies suddenly. Smells like a seccomp violation (since the boot process is sigkilled). Could you check seccomp logs? (sudo ausearch --start today --end now | grep -i seccomp)
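
A small sketch of what that check is looking for, assuming auditd is running and writes seccomp events as records tagged `type=SECCOMP` with a `syscall=` field. `seccomp_syscalls` is a hypothetical helper name introduced here for illustration; it is not part of runsc or auditd:

```shell
# Hypothetical helper: read audit records on stdin and print the number in the
# syscall= field of every SECCOMP record, one per line.
seccomp_syscalls() {
  # Keep only seccomp records, then extract the digits after " syscall=".
  grep 'type=SECCOMP' | sed -n 's/.* syscall=\([0-9][0-9]*\).*/\1/p'
}
```

It could be fed from the same search, e.g. `sudo ausearch --start today --end now | seccomp_syscalls`. Empty output would mean no seccomp records were logged in that window, which is not the same as proving seccomp was not involved if auditing was not active at the time.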

kevinGC commented on June 30, 2024

Per what you posted above, AppArmor may be running too:

 Security Options:
  apparmor
  seccomp
   Profile: builtin

I'm not familiar with AppArmor, but is it possible that's killing runsc? Maybe try running with it disabled. FWIW if you're already sandboxing jobs inside of gVisor, AppArmor shouldn't be necessary.

danielnorberg commented on June 30, 2024

> Hmm seems like the boot process just dies suddenly. Smells like a seccomp violation (since the boot process is sigkilled). Could you check seccomp logs? (sudo ausearch --start today --end now | grep -i seccomp)

After running apt install auditd to make ausearch available:

$ docker run --rm --runtime=runsc-debug hello-world; echo $?
137

$ sudo ausearch --start today --end now | grep -i seccomp; echo $?
1
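
For anyone reading the two exit codes above: by shell convention a status above 128 means the process was terminated by signal `status - 128`, so 137 corresponds to signal 9 (SIGKILL), and the `1` from the second command is `grep` reporting that no line matched. A minimal sketch of that arithmetic, where `signal_from_status` is a hypothetical helper name and not something docker or runsc provides:

```shell
# Hypothetical helper: map an exit status to the name of the signal that
# terminated the process, using the 128 + N convention.
signal_from_status() {
  local status="$1"
  if [ "$status" -gt 128 ]; then
    # kill -l N prints the name of signal number N.
    kill -l "$((status - 128))"
  else
    echo "none"
  fi
}

signal_from_status 137   # prints KILL
```

This only identifies which signal was delivered; it does not say who sent it.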

danielnorberg commented on June 30, 2024

> I'm not familiar with AppArmor, but is it possible that's killing runsc? Maybe try running with it disabled. FWIW if you're already sandboxing jobs inside of gVisor, AppArmor shouldn't be necessary.

AppArmor seems to be loaded by default on the AWS-provided Ubuntu 20.04 image, and FWIW it does not seem to cause issues on an m6i instance using the same Ubuntu image.

Still, brute-force disabling AppArmor on the m7i instance:

$ sudo apt remove apparmor
...
Removing apparmor (2.13.3-7ubuntu5.2) ...
...
$ sudo apparmor_status
sudo: apparmor_status: command not found

And then after reboot:

$ docker run --rm --runtime=runsc-debug hello-world; echo $?
137

danielnorberg commented on June 30, 2024

I can confirm that the same issue manifests on GCP C3 Intel Sapphire Rapids instances.

$ curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/zone
projects/<redacted>/zones/europe-west4-a 
$ curl -H "Metadata-Flavor: Google" http://169.254.169.254/computeMetadata/v1/instance/machine-type
projects/<redacted>/machineTypes/c3-highcpu-44
$ docker run --rm --runtime=runsc hello-world; echo $?
137
