A small Docker image for experimenting with the CAP_SYS_NICE capability.
The entry point inspects /proc/1/status for Cap*: entries and then executes a small C program that tries to call sched_setscheduler() to set the SCHED_FIFO "real-time" scheduling policy.
An x86-64 image is published to Docker Hub. To build locally:
$ docker build --tag fredrikfornwall/cap-sys-nice-docker:0.1 .
Run using docker directly:
$ docker run fredrikfornwall/cap-sys-nice-docker:0.1
Inspecting /proc/1/status
CapInh: 0000000000000000
CapPrm: 00000000a80425fb
CapEff: 00000000a80425fb
CapBnd: 00000000a80425fb
CapAmb: 0000000000000000
sched_setscheduler: Operation not permitted
As CAP_SYS_NICE is bit 23, we can verify that this bit is not set in CapEff here:
$ python3 -c 'print(0x00000000a80425fb & (1 << 23))'
0
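The two steps the entry point performs (print the Cap* entries, then attempt to switch to SCHED_FIFO) can be approximated in Python. This is only a rough sketch and not the image's actual C program: the function names `cap_lines`, `try_sched_fifo` and `main` are made up for illustration, and the output format merely mimics the log shown above. It relies on `os.sched_setscheduler()`, which is only available on Linux.

```python
import os


def cap_lines(status_text):
    """Return the Cap* entries of a /proc/<pid>/status dump as {name: int}."""
    caps = {}
    for line in status_text.splitlines():
        if line.startswith("Cap"):
            name, value = line.split(":", 1)
            caps[name] = int(value.strip(), 16)  # masks are printed as hex
    return caps


def try_sched_fifo():
    """Try to switch the calling process to SCHED_FIFO; return a status line."""
    # Use the lowest valid SCHED_FIFO priority to keep the experiment harmless.
    priority = os.sched_get_priority_min(os.SCHED_FIFO)
    try:
        # pid 0 means "the calling process".
        os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(priority))
    except OSError as err:
        return f"sched_setscheduler: {err.strerror}"
    return "sched_setscheduler: Ok"


def main():
    print("Inspecting /proc/1/status")
    with open("/proc/1/status") as status_file:
        for name, mask in cap_lines(status_file.read()).items():
            print(f"{name}: {mask:016x}")
    print(try_sched_fifo())
```

Calling `main()` in a process without CAP_SYS_NICE should report a permission error from `sched_setscheduler`, much like the C program in the image does.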
Adding the CAP_SYS_NICE capability allows the sched_setscheduler() call to succeed, and we can see the bit being set in CapEff:
$ docker run --cap-add SYS_NICE fredrikfornwall/cap-sys-nice-docker:0.1
Inspecting /proc/1/status
CapInh: 0000000000000000
CapPrm: 00000000a88425fb
CapEff: 00000000a88425fb
CapBnd: 00000000a88425fb
CapAmb: 0000000000000000
sched_setscheduler: Ok
$ python3 -c 'print(0x00000000a88425fb & (1 << 23))'
8388608
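The two one-liners can be combined into a small comparison of the CapEff masks from both runs. The sketch below is illustrative only: the helper names `has_cap` and `set_bits` are hypothetical, and the two mask values are copied from the outputs above. XOR-ing the masks shows that bit 23 is the only difference between the runs.

```python
CAP_SYS_NICE = 23  # capability number, as stated above


def has_cap(mask, cap):
    """Return True if capability number `cap` is set in `mask`."""
    return bool(mask & (1 << cap))


def set_bits(mask):
    """Return the positions of all bits that are set in `mask`."""
    return [bit for bit in range(mask.bit_length()) if (mask >> bit) & 1]


without_cap = 0x00000000A80425FB  # CapEff from the plain `docker run`
with_cap = 0x00000000A88425FB     # CapEff with `--cap-add SYS_NICE`

print(has_cap(without_cap, CAP_SYS_NICE))  # False
print(has_cap(with_cap, CAP_SYS_NICE))     # True
print(set_bits(without_cap ^ with_cap))    # [23]
```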
Deploy as a Kubernetes pod:
$ kubectl apply -f https://raw.githubusercontent.com/fornwall/cap-sys-nice-docker/main/cap-sys-nice-docker.yml
[..]
$ kubectl logs cap-sys-nice-docker
[..]
Note that this pod specifies the following security context for the container:
securityContext:
  capabilities:
    add: ["SYS_NICE"]