This document provides tips on how to debug Sysbox.
Before starting a debugging session, make sure that the binaries you will be operating on have been built with compiler optimizations disabled. The following Sysbox Makefile targets exist for this purpose:
sysbox-debug
sysbox-runc-debug
sysbox-fs-debug
sysbox-mgr-debug
Example:
$ make sysbox-debug && sudo make install
In some cases it's desirable to debug the process initialization phases. In
those cases we must pick a convenient location for a sleep instruction
that gives the user enough time to launch the debugger.
Example (sysbox-runc):
diff --git a/create.go b/create.go
index bb551950..a2b29beb 100644
--- a/create.go
+++ b/create.go
@@ -2,6 +2,7 @@ package main
import (
"os"
+ "time"
@@ -59,6 +60,7 @@ command(s) that get executed on start, edit the args parameter of the spec. See
if err := revisePidFile(context); err != nil {
return err
}
+ time.Sleep(time.Second * 30)
if err := sysbox.CheckHostConfig(context); err != nil {
return err
}
Even though GDB offers Golang support, a few key features are still missing today, such as a proper understanding of Golang's concurrency constructs (e.g. goroutines). Consequently, this document focuses on the Delve debugger, which is not as feature-rich as regular GDB but fully supports Golang's runtime. Luckily, most Delve instructions match their GDB counterparts, so I will mainly concentrate on those that (slightly) deviate.
- Installation:
rodny@vm-1:~$ go install github.com/go-delve/delve/cmd/dlv@latest
- Change working directory to the sysbox workspace location:
rodny@vm-1:~$ cd ~/wsp/sysbox
- Attaching to a running process:
Let's pick sysbox-runc as an example. First, we need to find the PID of the running sysbox-runc process. Use `pstree -SlpgT | grep sysbox` or `ps -ef | grep sysbox` to help with this.
Then start the debugger and attach it to the sysbox-runc process via the PID:
rodny@vm-1:~/wsp/sysbox/sysbox$ sudo env "PATH=$PATH" env "GOROOT=$GOROOT" env "GOPATH=$GOPATH" env "PWD=$PWD" $(which dlv) attach $(pidof sysbox-runc)
Notice that, for the Golang runtime to operate as expected, we must pass the existing Golang-related env-vars to the newly spawned Delve process.
- Delve command reference:
https://github.com/go-delve/delve/blob/master/Documentation/cli/README.md
- Setting break-points:
Depending on the level of granularity required, we can set breakpoints using either of these approaches:
- Package + Receiver + Method: (dlv) b libcontainer.(*initProcess).start
- File + Line: (dlv) b process_linux.go:290
Example:
(dlv) b libcontainer.(*initProcess).start
Breakpoint 1 set at 0x55731c80152d for github.com/opencontainers/runc/libcontainer.(*initProcess).start() /home/rodny/go/src/github.com/opencontainers/runc/libcontainer/process_linux.go:263
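To see how the two location forms map onto source code, consider the self-contained toy program below. The type and method names mirror the example above, but the program is purely illustrative and unrelated to the real libcontainer code. When built as package `main`, the method can be targeted with `b main.(*initProcess).start`, or by file and line using whatever file name the program is saved under.

```go
package main

import "fmt"

// initProcess is a stand-in type, not the real libcontainer one.
type initProcess struct {
	name    string
	started bool
}

// start has a pointer receiver, so Delve addresses it as
// <package>.(*initProcess).start.
func (p *initProcess) start() error {
	if p.name == "" {
		return fmt.Errorf("process has no name")
	}
	p.started = true
	return nil
}

func main() {
	p := &initProcess{name: "demo"}
	if err := p.start(); err != nil {
		fmt.Println("start failed:", err)
		return
	}
	fmt.Println("started:", p.started)
}
```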
- Process iteration:
We can use the typical `n` (next), `s` (step), and `c` (continue) instructions to step through a process' code.
Example:
(dlv) c
> github.com/opencontainers/runc/libcontainer.(*initProcess).start() /home/rodny/go/src/github.com/opencontainers/runc/libcontainer/process_linux.go:263 (hits goroutine(1):1 total:1) (PC: 0x55731c80152d)
   258:     p.cmd.Process = process
   259:     p.process.ops = p
   260:     return nil
   261: }
   262:
=> 263: func (p *initProcess) start() error {
   264:     defer p.parentPipe.Close()
   265:     err := p.cmd.Start()
   266:     p.process.ops = p
   267:     p.childPipe.Close()
   268:     if err != nil {
- Inspecting the stack-trace:
(dlv) bt
 0  0x00000000004ead9a in syscall.Syscall6
    at /usr/local/go/src/syscall/asm_linux_amd64.s:53
 1  0x0000000000524f55 in os.(*Process).blockUntilWaitable
    at /usr/local/go/src/os/wait_waitid.go:31
 2  0x00000000005194ae in os.(*Process).wait
    at /usr/local/go/src/os/exec_unix.go:22
 3  0x00000000005180a1 in os.(*Process).Wait
    at /usr/local/go/src/os/exec.go:125
 4  0x00000000007d870f in os/exec.(*Cmd).Wait
    at /usr/local/go/src/os/exec/exec.go:501
 5  0x0000000000d6c2fa in github.com/opencontainers/runc/libcontainer.(*initProcess).wait
    at /root/nestybox/sysbox/sysbox-runc/libcontainer/process_linux.go:655
 6  0x0000000000d6c43f in github.com/opencontainers/runc/libcontainer.(*initProcess).terminate
    at /root/nestybox/sysbox/sysbox-runc/libcontainer/process_linux.go:668
 7  0x0000000000d89f35 in github.com/opencontainers/runc/libcontainer.(*initProcess).start.func1
    at /root/nestybox/sysbox/sysbox-runc/libcontainer/process_linux.go:353
 8  0x0000000000d6bace in github.com/opencontainers/runc/libcontainer.(*initProcess).start
    at /root/nestybox/sysbox/sysbox-runc/libcontainer/process_linux.go:592
 9  0x0000000000d3f3ae in github.com/opencontainers/runc/libcontainer.(*linuxContainer).start
    at /root/nestybox/sysbox/sysbox-runc/libcontainer/container_linux.go:390
10  0x0000000000d3e426 in github.com/opencontainers/runc/libcontainer.(*linuxContainer).Start
    at /root/nestybox/sysbox/sysbox-runc/libcontainer/container_linux.go:287
11  0x0000000000e1da2e in main.(*runner).run
    at /root/nestybox/sysbox/sysbox-runc/utils_linux.go:383
12  0x0000000000e1f08f in main.startContainer
    at /root/nestybox/sysbox/sysbox-runc/utils_linux.go:553
13  0x0000000000e1f78c in main.glob..func2
    at /root/nestybox/sysbox/sysbox-runc/create.go:108
14  0x0000000000bac838 in github.com/urfave/cli.HandleAction
    at /go/pkg/mod/github.com/urfave/cli@v1.22.1/app.go:523
15  0x0000000000bade00 in github.com/urfave/cli.Command.Run
    at /go/pkg/mod/github.com/urfave/cli@v1.22.1/command.go:174
16  0x0000000000baa123 in github.com/urfave/cli.(*App).Run
    at /go/pkg/mod/github.com/urfave/cli@v1.22.1/app.go:276
17  0x0000000000e11880 in main.main
    at /root/nestybox/sysbox/sysbox-runc/main.go:145
18  0x000000000043ad24 in runtime.main
    at /usr/local/go/src/runtime/proc.go:203
19  0x000000000046c0b1 in runtime.goexit
    at /usr/local/go/src/runtime/asm_amd64.s:1357
(dlv)
- Configure the source-code path:
The Sysbox compilation process is carried out inside a Docker container. To do this, we bind-mount the user's Sysbox workspace (i.e. the "sysbox" folder) into this path within the container: `/root/nestybox/sysbox`. The Golang compiler embeds this path in the generated Sysbox binaries. Therefore, if you are debugging a Sysbox daemon on your host, unless your workspace path exactly matches the one above (unlikely), Delve will not be able to display the Sysbox source code.
The typical solution in these cases is to modify Delve's configuration to replace the containerized path with the one of your local environment.
(dlv) config substitute-path /root/nestybox/sysbox /home/rodny/wsp/sysbox
The source code should now be displayed properly:
(dlv) frame 10
> syscall.Syscall6() /usr/local/go/src/syscall/asm_linux_amd64.s:53 (PC: 0x4ead9a)
Frame 10: /root/nestybox/sysbox/sysbox-runc/libcontainer/container_linux.go:287 (PC: d3e426)
   282:             if err := c.setupShiftfsMarks(); err != nil {
   283:                 return err
   284:             }
   285:         }
   286:     }
=> 287:     if err := c.start(process); err != nil {
   288:         if process.Init {
   289:             c.deleteExecFifo()
   290:         }
   291:         return err
   292:     }
(dlv)
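Conceptually, the substitution is a prefix rewrite applied to the source paths recorded in the binary. The sketch below only illustrates that mapping with a hypothetical `substitutePath` function; it is not Delve's implementation, and Delve's exact matching rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// substitutePath rewrites path when it lies under the directory from,
// replacing that prefix with to. Paths outside from are returned unchanged.
func substitutePath(path, from, to string) string {
	from = strings.TrimSuffix(from, "/")
	to = strings.TrimSuffix(to, "/")
	if path == from {
		return to
	}
	// Require a path separator after the prefix so that sibling
	// directories such as "/root/nestybox/sysbox-other" are not rewritten.
	if strings.HasPrefix(path, from+"/") {
		return to + strings.TrimPrefix(path, from)
	}
	return path
}

func main() {
	compiled := "/root/nestybox/sysbox/sysbox-runc/libcontainer/container_linux.go"
	fmt.Println(substitutePath(compiled, "/root/nestybox/sysbox", "/home/rodny/wsp/sysbox"))
}
```

Paths outside the containerized workspace, such as the Go standard library under `/usr/local/go`, are left untouched by such a rule.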
- Inspecting POSIX threads:
(dlv) threads
  Thread 2955507 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
  Thread 2955508 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
  Thread 2955509 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
  Thread 2955510 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
  Thread 2955511 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
  Thread 2955512 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
  Thread 2955517 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
* Thread 2955520 at 0x4ead9a /usr/local/go/src/syscall/asm_linux_amd64.s:53 syscall.Syscall6
  Thread 2955523 at 0x46dfd3 /usr/local/go/src/runtime/sys_linux_amd64.s:536 runtime.futex
  Thread 2955564 at 0x46e180 /usr/local/go/src/runtime/sys_linux_amd64.s:673 runtime.epollwait
(dlv)
- Inspecting goroutines:
(dlv) goroutines
* Goroutine 1 - User: /usr/local/go/src/syscall/asm_linux_amd64.s:53 syscall.Syscall6 (0x4ead9a) (thread 2955520)
  Goroutine 2 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 3 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 4 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 5 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 9 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 18 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 20 - User: /usr/local/go/src/runtime/sigqueue.go:147 os/signal.signal_recv (0x450dec)
  Goroutine 23 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 26 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
  Goroutine 32 - User: /usr/local/go/src/runtime/proc.go:305 runtime.gopark (0x43b0db)
[11 goroutines]
(dlv)
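For experimenting with the `threads` and `goroutines` commands outside of Sysbox, a small program with several blocked goroutines is enough. The sketch below is a hypothetical toy: each worker blocks on a channel receive until the main goroutine releases it, so workers that have reached the receive should show up as parked when the program is paused before the release.

```go
package main

import (
	"fmt"
	"sync"
)

// runWorkers starts n goroutines that block on a channel receive, then
// releases them all and returns how many completed.
func runWorkers(n int) int {
	var (
		wg      sync.WaitGroup
		mu      sync.Mutex
		done    int
		release = make(chan struct{})
	)
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			<-release // blocks until release is closed
			mu.Lock()
			done++
			mu.Unlock()
		}()
	}
	// A breakpoint on the next line stops the program before the
	// workers are released.
	close(release)
	wg.Wait()
	return done
}

func main() {
	fmt.Println("workers completed:", runWorkers(4))
}
```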
NOTE: Use `goroutines -t` to show a full stack trace for each goroutine. Then use `frame X` to switch to the desired frame.
- Get a list of DLV configs:
(dlv) config -list
- Configure print length of strings:
(dlv) config max-string-len 1000
- Configure max array size:
(dlv) config max-array-values 600
- Configure depth of variable recursion:
(dlv) config max-variable-recurse 2
For unit tests, use `dlv test <package> -test.run <test-name>`:
dlv test github.com/nestybox/sysbox-runc/libcontainer/integration -test.run TestSysctl
Then set a breakpoint at the desired test line and press `c` (continue).
In some cases you need the test to be built with special tags:
go test -tags idmapped_mnt
Or, if you want to run it under the Delve debugger, first build the test and then launch the compiled test binary with the debugger:
go test -c -tags idmapped_mnt -gcflags="all=-N -l"
sudo env "PATH=$PATH" env "GOROOT=$GOROOT" env "GOPATH=$GOPATH" env "PWD=$PWD" $(which dlv) --check-go-version=false exec <path-to-compiled-test>.test
As is usually the case, core-dumps can be generated either through the `gcore` tool (provided as part of the `gdb` package), or from within the `dlv` debugger itself.
Refer to this link for details about the former procedure. For the latter, proceed as below.
Let's pick sysbox-fs as an example ...
```console
dev-vm1:~/wsp/sysbox$ sudo env "PATH=$PATH" env "GOROOT=$GOROOT" env "GOPATH=$GOPATH" env "PWD=$PWD" $(which dlv) attach `pidof sysbox-fs`
Type 'help' for list of commands.
# Generate core-dump
(dlv) dump core.sysbox-fs.1
Dumping memory 203239424 / 203239424...
# Exit delve without killing the process to avoid any service disruption.
(dlv) quit
Would you like to kill the process? [Y/n] n
dev-vm1:~/wsp/04-14-2021/sysbox$ ls -lh core.sysbox-fs.1
-rw-r--r-- 1 root root 194M Apr 24 00:04 core.sysbox-fs.1
# Compress the obtained core-dump file.
dev-vm1:~/wsp/sysbox$ sudo tar -zcvf core.sysbox-fs.1.tar.gz core.sysbox-fs.1
core.sysbox-fs.1
dev-vm1:~/wsp/sysbox$ ls -lh core.sysbox-fs.1.tar.gz
-rw-r--r-- 1 root root 7.2M Apr 24 00:05 core.sysbox-fs.1.tar.gz
```
To load and debug a previously generated core-dump do the following.
- Install the Sysbox debugging package corresponding to the release being executed (no symbols are included in the official / production binaries):
sudo apt-get install ~/sysbox-ce-dbgsym_0.4.0-0.ubuntu-focal_amd64.ddeb
- If debugging from your host:
$ sudo env "PATH=$PATH" env "GOROOT=$GOROOT" env "GOPATH=$GOPATH" env "PWD=$PWD" $(which dlv) core $(which sysbox-fs) ./core.sysbox-fs.1
- If debugging from Sysbox's dev/test container:
$ dlv core $(which sysbox-fs) ./core.sysbox-fs.1
In both cases above, `$(which sysbox-fs)` resolves to the path of the
binary being debugged. This binary must exactly match the one that was running
when the original core-dump was generated.
To debug cgo code, you must use gdb (delve does not work).
Instructions:
- Build the cgo code with "go build --buildmode=exe"; do not use "--buildmode=pie", as the position independent code confuses gdb.
- There may be a gdb option/command to get around this, but it's easier to just build with "--buildmode=exe" during debug.
- In the Golang file that calls cgo, add the "-g" switch to CFLAGS to tell the C compiler to generate debug symbols.
#cgo CFLAGS: -Wall -g
- If needed, instrument the binary to allow you time to attach the debugger to it.
- For example, to attach to the sysbox-runc nsenter child process, which is normally ephemeral, add a debug "sleep()" at an appropriate location within the nsenter code (to give you time to find the nsenter pid and attach the debugger to it), then execute sysbox-runc, find the pid with pstree, and attach gdb to it (next step).
- Attach gdb to the target process (need root access):
# gdb --pid 17089
GNU gdb (Ubuntu 8.3-0ubuntu1) 8.3
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.
Type "show copying" and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word".
Attaching to process 17089
No executable file now.
warning: Could not load vsyscall page because no executable was specified
0x00007f0d78abf2e2 in ?? ()
- Point gdb to the sysbox-runc binary so it can load the symbols:
(gdb) file /usr/bin/sysbox-runc
A program is being debugged already.
Are you sure you want to change the file? (y or n) y
Reading symbols from /usr/bin/sysbox-runc...
Loading Go Runtime support.
(gdb) bt
#0 0x00007f0d78abf2e2 in ?? ()
#1 0x0000000000bb4b23 in read (__nbytes=16, __buf=0x7fff3fcba260, __fd=4) at /usr/include/x86_64-linux-gnu/bits/unistd.h:44
#2 nl_parse (config=0x7fff3fcba270, fd=4) at nsexec.c:422
#3 nsexec () at nsexec.c:634
#4 0x0000000000bc3fbd in __libc_csu_init ()
#5 0x00007f0d788db16e in ?? ()
#6 0x0000000000000000 in ?? ()
- Then use gdb as usual:
(gdb) break nsexec.c:650
Breakpoint 1 at 0xbb4f68: file nsexec.c, line 650.
(gdb) c
Continuing.
Breakpoint 1, update_oom_score_adj (len=4, data=0xe2f62e "-999") at nsexec.c:650
650 update_oom_score_adj("-999", 4);
(gdb) n
nsexec () at nsexec.c:662
662 if (config.namespaces) {
(gdb) p config
$1 = {data = 0x1b552a0 "\b", cloneflags = 2114060288, oom_score_adj = 0x1b552dc "0", oom_score_adj_len = 2, uidmap = 0x1b552ac "0 165536 65536\n", uidmap_len = 16, gidmap = 0x1b552c0 "0 165536 65536\n", gidmap_len = 16, namespaces = 0x0, namespaces_len = 0, is_setgroup = 1 '\001', is_rootless_euid = 0 '\000',
uidmappath = 0x0, uidmappath_len = 0, gidmappath = 0x0, gidmappath_len = 0, prep_rootfs = 1 '\001', use_shiftfs = 1 '\001', make_parent_priv = 0 '\000', rootfs_prop = 540672, rootfs = 0x1b5530c "/var/lib/docker/overlay2/d764bae04e3e81674c0f0c8ccfc8dec1ef2483393027723bac6519133fa7a4a2/merged", rootfs_len = 97,
parent_mount = 0x0, parent_mount_len = 0, shiftfs_mounts = 0x1b55374 "/lib/modules/5.3.0-46-generic,/usr/src/linux-headers-5.3.0-46,/usr/src/linux-headers-5.3.0-46-generic,/var/lib/docker/containers/cbf6dfe2bef0563532770ed664829032d00eb278367176de32cd03b7290ea1ac", shiftfs_mounts_len = 194}
(gdb) set print pretty
(gdb) p config
$2 = {
data = 0x1b552a0 "\b",
cloneflags = 2114060288,
oom_score_adj = 0x1b552dc "0",
oom_score_adj_len = 2,
uidmap = 0x1b552ac "0 165536 65536\n",
uidmap_len = 16,
gidmap = 0x1b552c0 "0 165536 65536\n",
gidmap_len = 16,
namespaces = 0x0,
namespaces_len = 0,
is_setgroup = 1 '\001',
is_rootless_euid = 0 '\000',
uidmappath = 0x0,
uidmappath_len = 0,
gidmappath = 0x0,
gidmappath_len = 0,
prep_rootfs = 1 '\001',
--Type <RET> for more, q to quit, c to continue without paging--q
Quit
Tip: if you are running sysbox-runc inside the test container, run gdb at host level,
use pstree to figure out the pid of the sysbox-runc nsenter child process inside the test container,
and point gdb to the sysbox-runc binary inside the test container (e.g., `file /var/lib/docker/overlay2/860f62b3bd74c36be6754c8ed8e3f77a63744a2c6b16bef058b22ba0185e2877/merged/usr/bin/sysbox-runc`).