A situation came up in IRC this morning where a user (@bitkeks) wanted to enter a network namespace in their own code using setns(2) and then manipulate WireGuard devices using wgctrl.
The goroutine with a modified namespace called into wgctrl and ultimately netlink, but because my netlink package spins up an internal goroutine for dealing with system calls, that internal goroutine was not part of the same network namespace as its calling goroutine. To remedy this, I've added mdlayher/netlink#141, which enables the caller to explicitly opt in to setting the namespace of the calling thread on the internal goroutine/syscall thread.
On IRC, we also discussed that EPERM from setns(2) (due to lack of CAP_SYS_ADMIN/root) could be non-fatal, but then we can end up in a state where the caller attempted to configure a namespace and the namespace was not actually applied, yet no error is returned.
In order to integrate these ideas into wgctrl, we have a few options. At this point, I'm leaning toward option 2, but I'd be happy to hear from others as well.
/cc @bitkeks @zx2c4
- set netlink.CallingThreadNetNS in genetlink.Dial config and do nothing else
This means that the calling and netlink socket goroutines will share the same network namespace, but either root or CAP_SYS_ADMIN is now required at all times to set the network namespace. root or CAP_NET_ADMIN is already required to manipulate WireGuard devices.
I consider this the least favorable option, although it is the simplest.
- expose a config struct for wgctrl.New and add an option to use the calling thread's namespace
This option is explicit and requires the caller to actually opt in to calling setns(2) under the hood, so root/CAP_SYS_ADMIN is only required if the user explicitly says so.
This pattern is pretty common in Go applications, and could look something like:
package wgctrl

type Config struct {
	// Make Client retrieve the network namespace of the calling thread and use it
	// internally, so WireGuard devices in that namespace can be manipulated.
	LinuxCallingThreadNetNS bool

	// Alternative: actually specify a namespace FD for the netlink socket to enter,
	// with a constant for "calling thread". If set to zero, nothing happens.
	// LinuxNetNS int

	// Room for future options and extensibility...
	// OpenBSDFoo int
}
Then the caller can either use an explicit config, or nil for none:
// Use all defaults.
c, err := wgctrl.New(nil)
// Configure only what is needed. Empty fields are defaults.
c, err := wgctrl.New(&wgctrl.Config{
	LinuxCallingThreadNetNS: true,
})
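As a rough sketch of how a constructor could treat nil as "all defaults", assuming a stand-in `Client` type and a trimmed-down `Config` (neither is wgctrl's actual implementation):

```go
package main

import "fmt"

// Config mirrors the proposed struct; only one field is shown here.
type Config struct {
	LinuxCallingThreadNetNS bool
}

// Client is a hypothetical stand-in for the real client type.
type Client struct {
	cfg Config
}

// New treats a nil config as "use all defaults" by substituting the
// zero value, so callers never need to build an empty struct.
func New(cfg *Config) (*Client, error) {
	if cfg == nil {
		cfg = &Config{}
	}
	// Copy the config so later changes by the caller have no effect.
	return &Client{cfg: *cfg}, nil
}

func main() {
	defaults, _ := New(nil)
	custom, _ := New(&Config{LinuxCallingThreadNetNS: true})
	fmt.Println(defaults.cfg.LinuxCallingThreadNetNS, custom.cfg.LinuxCallingThreadNetNS)
}
```

Because the zero value of every field means "default", new fields can be added later without breaking callers that pass nil or a partially filled struct.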
I'm currently in favor of doing this because it is simple to implement and explicit. It is slightly unfortunate that:
- it would break existing callers, but it won't be hard for them to replace wgctrl.New() with wgctrl.New(nil) (however, we make zero API stability guarantees at this time anyway)
- the user has to explicitly opt in to this to get the behavior they might assume would already work, if they're already using setns(2) (as @bitkeks is)
- set netlink.CallingThreadNetNS and add additional logic in package netlink to figure out if we're already in the right namespace in both the calling thread and internal syscall thread
If we create the internal syscall goroutine and lock it to its OS thread, then determine it has an identical network namespace to the calling thread, there's no need to ever invoke setns(2). This means the code could continue to work without privileges, unless a network namespace was explicitly set in either the calling thread or configuration. If the calling thread was able to set a namespace, package netlink will already have permission to do so as well.
There is the potential to use something like kcmp(2) to see if the namespaces match, but it appears to have some caveats according to the manpage:
Note the kcmp() is not protected against false positives which may occur if the processes are currently running. One should stop the
processes by sending SIGSTOP (see signal(7)) prior to inspection with this system call to obtain meaningful results.
This system call is available only if the kernel was configured with CONFIG_CHECKPOINT_RESTORE. The main use of the system call is for
the checkpoint/restore in user space (CRIU) feature. The alternative to this system call would have been to expose suitable process
information via the proc(5) filesystem; this was deemed to be unsuitable for security reasons.
I did a couple of quick experiments with this and didn't seem to come up with a meaningful result, although perhaps I'm doing it wrong.
func kcmp(pid1, pid2, typ, idx1, idx2 int) (int, error) {
	r0, r1, errno := unix.Syscall6(
		unix.SYS_KCMP,
		uintptr(pid1),
		uintptr(pid2),
		uintptr(typ),
		uintptr(idx1),
		uintptr(idx2),
		0,
	)
	log.Println("kcmp", r0, r1, errno)
	if errno != 0 {
		return 0, os.NewSyscallError("kcmp", errno)
	}
	return int(r0), nil
}
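pid := os.Getpid()
// KCMP_FILE checks whether two file descriptors refer to the same
// open file description; a result of 0 means they are equal.
const kcmpFile = 0
res, err := kcmp(pid, pid, kcmpFile, netNS, origNetNS)
if err != nil {
	errC <- err
	return
}
log.Println("KCMP:", res)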
$ go run main.go
2019/06/04 13:26:40 CALLER NETNS FD: 3
2019/06/04 13:26:40 SYSCALL NETNS FD: 5
2019/06/04 13:26:40 kcmp 2 0 errno 0
2019/06/04 13:26:40 KCMP: 2
2019/06/04 13:26:40 &os.SyscallError{Syscall:"setns", Err:0x1}
2019/06/04 13:26:40 failed to open wgctrl: setns: operation not permitted
exit status 1
Perhaps this merits more investigation, but due to the caveats listed on the manpage and potential complexity, I'm still leaning toward option 2.