systemd / systemd-bootchart Goto Github PK
View Code? Open in Web Editor NEWBoot performance graphing tool
License: GNU General Public License v2.0
Boot performance graphing tool
License: GNU General Public License v2.0
m4/intltool.m4 is a symlink pointing to /usr/share/aclocal/intltool.m4
lintian complains about this
$ lintian -i systemd-bootchart_231-1.dsc
E: systemd-bootchart source: source-contains-unsafe-symlink m4/intltool.m4
N:
N: The source contains an unsafe symlink. If followed, the link will escape
N: the source root. Note that all absolute symlinks are unconditionally
N: considered "unsafe" in this case (unlike in binary packages).
N:
N: If it is a part of the package's testsuite, Lintian may have failed to
N: recognise it as a test. In that case, please override the tag.
N:
N: Severity: serious, Certainty: possible
N:
N: Check: cruft, Type: source
N:
I think this symlink should simply be removed. systemd-bootchart doesn't actually use intltool
[Moved from https://github.com/systemd/systemd/issues/8502.]
238
Debian Sid/unstable
An explanation, what the vertical dashed line in the SVG means.
No explanation.
Start system with
init=/lib/systemd/systemd-bootchart
.
I have on my system / , /usr , /var and / home
(all ext4) on different partitions.
It is an UEFI system with GPT using systemd-boot to boot.
systemd-bootchart is invoked via kernel parameters ("initcall_debug printk.time=y init=/usr/lib/systemd/systemd-bootchart
").
With this setup I get a bootchart which only spans ~ 3 seconds, here is an example:
http://imgh.us/before-bootchart-20160306-1332.svg
It would be nice if systemd-bootchart could properly support this case and chart all data / up to the login.
Showing comdline is important debugging tool.
Passing -C parameter from kernel command line is not straightforward.
Please add this functionality to bootchart.conf parser.
configure.ac looks for the libsystemd-journal compat lib, which no longer exists.
Something is not quite right with this graph:
bootchart-20210221-1726.svg.gz
Had no luck getting a sane result.. with systemd-bootchart 233-2 in a debian VM and
PlotMemoryUsage=yes
PlotEntropyGraph=yes
Especially noticeable in the init=.../systemd-bootchart
case where /proc
is not mounted early enough unless some special setup (a pre-init prior to systemd-bootchart that mounts /proc
) is done.
Note that the comment on
systemd-bootchart/src/bootchart.c
Lines 424 to 429 in 6953367
/proc
is not mounted log_sample
returns the -errno
of the failed openat()
call on /proc/vmstat
, causing the program to exit immediately.
This regression appears to have been introduced back in commit f91781329c6d8a760e3c1b88b66b0e2137c2e5ab of systemd (prior to the split between this and that repo).
As a workaround, one can pre-mount /proc
.
Originally it seems as if src/architecture.h
was a copy of systemd's https://github.com/systemd/systemd/blob/master/src/basic/architecture.h
However it's an old file which doesn't recognize riscv64 amongst other new architectures. This causes a build failure:
In file included from src/svg.c:32:0:
src/architecture.h:198:4: error: #error "Please register your architecture here!"
# error "Please register your architecture here!"
^~~~~
Unfortunately just copying the new systemd file isn't sufficient to fix this because src/architecture.h
has diverged by adding PROC_CPUINFO_MODEL
definitions for each architecture.
So I'm not sure how to solve this. I could just add riscv64 to the new file, making it diverge properly. Or I could try to factor out the PROC_CPUINFO_MODEL
definition so that the two files can be identical ...
The docs subsection of systemd-bootchart.service includes a reference to the nonexistent man:systemd-bootchart.service(1). This man page should either be written (ideal) or the reference in the service file should be remove (less than ideal).
The bootchart process leaks file handles to /proc/PID/schedstat
, /proc/PID/shed
, and to /proc/PID/smaps
(when arg_pss is set). It seems that this is known, as bootchart increases it's maximum file descriptor count to 4096. Unfortunately, this strategy fails when we have a boot with lots of launched processes.
At every sample, bootchart loops through the entries in /proc
(ignoring those that aren't numbers), and traverses it's linked list of known processes. If the process is a new one, it allocates a new ps_struct
. When it allocates this structure, it opens the file /proc/PID/sched
and sets ps_struct->sched
to that file descriptor. If the process is one that's already known, it just uses the existing ps_struct
in the linked list; no new files are opened.
It only ever closes the file if it fails to read from it in a sample. If the process dies any time between when it find's the entry in /proc
and when it gets to reading the file, bootchart won't close the file handle until the end of the boot. For sample periods that aren't taxing the CPU, this means that most of the file handles won't be closed.
The trivial solution would be to open and close the file descriptor on every sample. I'm not sure how this would affect performance though, as I suspect the reason it wasn't already done this way was to minimize bootchart's overhead.
I've thought of an alternate solution that would keep a separate list of "active processes", and look through them when looping over the entries of /proc/
. It would then be easy keep track of which processes need to have their file handles closed. I think this would have the added benefit that it could improve overall performance, since the length of the list of ps_struct
s that we need to walk through for every entry in /proc
would be much smaller for many systems. Would you welcome a pull request with this alternate solution?
See systemd/systemd#657.
While per-process cpu wait is a useful measure, the total or per-cpu wait charts are a different story. They can make it look like the system is io-limited when it isn't.
They're populated from /proc/schedstat and show the "sum of all time spent waiting to run by tasks on [each] processor". So two tasks waiting at the same time are counted twice, right? And while knowing that one task waits while another runs is only interesting on a per-task level, not per-cpu. So I often see both the CPU utilization and CPU wait charts saturated. As far as I can tell, that just means I'm cpu-limited but would be io-limited with a faster processor and that parallel service startup is doing me a favour? Only the first of those insights is really useful, the last two are academic at best and confusing at worst (it's easy to interpret that as meaning I'm io-limited).
iowait from /proc/stat seems a more useful measure for a global or per-cpu chart - as it only shows time spent waiting for tasks when there's no other work to do.
Finally, I see that the svg graphing code normalises waittime to the scheduling interval, but the double counting etc. means that waittime can easily exceed the scheduling interval and be truncated.
Maybe I'm misunderstanding something - I don't see where to find the bootchart history before it was merged in systemd 83fdc450aa8f79941bec84488ffd5bf8eadab18e so maybe the motivation for using /proc/schedstat is explained in the original repo?
Using dracut as the framework for the initrd, systemd is used in the initrd image, and systemd-bootchart only seems to shows that execution, and not the second one.
[Copied from https://github.com/systemd/systemd/issues/8497.]
238
Debian Sid/unstable
systemd-backlight is listed in the SVG file in
/run/log
created by systemd-analyze.
systemd-backlight is not listed.
Start system where systemd-backlight is run with
inti=/lib/systemd/systemd-bootchart
.
I guess it’s due to the special characters colon (:) and @ in the service unit name.
With systemd-bootchart 233-2 from Debian Sid/unstable, process names in parentheses sometimes get their start characters stripped.
For systemctl
:
<!-- (ystemctl) [224] ppid=212 runtime=2.336ms -->
<rect class="ps" x="279.993" y="1740.000" width="0.000" height="20.000" />
<rect class="wait" x="279.993" y="1740.000" width="0.000" height="20.000" />
<rect class="cpu" x="279.993" y="1740.000" width="0.000" height="20.000" />
<text x="284.993" y="1754.000"><![CDATA[(ystemctl)]]> [224]<tspan class="run">2.3ms</tspan> </text>
<line class="dot" x1="279.993" y1="1750.000" x2="251.807" y2="1750.000" />
<line class="dot" x1="251.807" y1="1750.000" x2="251.807" y2="1600.000" />
e2scrub becomes crub. But direxec seems fine.
The value which ought to be set by this is referenced in configure.ac and Makefile.am, but there's never any AC_ARG_WITH called for rootprefix. This makes it impossible to install some files to /usr/lib.
As of kernel 4.6, there is now a run-time switch for schedstats support, which defaults to disabled.
It can be switched on via a kernel command line parameter or sysctl [1]
Please update systemd-bootchart to automatically flip that runtime switch via sysctl so fiddling with the kernel command line is not necessary.
make distcheck
fails, presumably due to some missing files in the tarball:
make[3]: *** No rule to make target 'man/bootchart.conf.5', needed by 'all-am'. Stop.
make[3]: *** Waiting for unfinished jobs....
Makefile:1324: recipe for target 'all-recursive' failed
src/fileio.c:51]: (error) Resource leak: f
src/fileio.c:58]: (error) Resource leak: f
src/fileio.c:62]: (error) Resource leak: f
Suggest add calls to fclose.
This is a copy of systemd/systemd#656.
As a summary: I am trying to get a detailed report on a LEGO mindstorms EV3 running debian jessie. The system is quite slow (400MHz Arm9 and 64 Mb memory). The boot is quite busy.
Due to extensive borrowing from systemd, the make distcheck
target fails on make install
due to DESTDIR
not being properly preserved to include $$dc_install_base
. I've spent some time trying to figure out how to coax it that way, but been unable to find the issue yet.
Making install in .
/usr/bin/mkdir -p '/usr/lib/systemd'
/bin/sh ./libtool --mode=install /usr/bin/install -c systemd-bootchart '/usr/lib/systemd'
libtool: install: /usr/bin/install -c systemd-bootchart /usr/lib/systemd/systemd-bootchart
/usr/bin/install: cannot create regular file '/usr/lib/systemd/systemd-bootchart': Permission denied
Building on Arch from https://aur.archlinux.org/packages/systemd-bootchart-git fails if docbook-xsl
isn't installed. I guess it should be somehow flagged as a build dependency?
/usr/bin/xsltproc -o man/bootchart.conf.5 --nonet --xinclude --stringparam man.output.quietly 1 --stringparam funcsynopsis.style ansi --stringparam man.authors.section.enabled 0 --stringparam man.copyright.section.enabled 0 --stringparam systemd.version 231 --path './man:./man' ./man/custom-man.xsl man/bootchart.conf.xml
I/O error : Attempt to load network entity http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl
warning: failed to load external entity "http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl"
compilation error: file ./man/custom-man.xsl line 27 element import
xsl:import : unable to load http://docbook.sourceforge.net/release/xsl/current/manpages/docbook.xsl
make[2]: *** [Makefile:1765: man/bootchart.conf.5] Error 5
make[1]: *** [Makefile:1309: all-recursive] Error 1
make: *** [Makefile:648: all] Error 2
Found a similar issue here with a fix: https://github.com/hughsie/appdata-tools/issues/10
Do builds on Ubuntu & Alpine Linux / other OSes as done in systemd/systemd CI
I have a reproducible assertion failure with bootchart v231 when invoked as pid1, citing the above message.
The problem seems to only occur when the initrd does not mount / as rw, and it is instead done by systemd-remount-fs.
I have managed to reproduce as follows:
# Build a bootable debian image including systemd-bootchart
mkosi -p systemd-bootchart -b -d debian
# Boot into it and set the kernel command line
qemu-system-x86_64 -m 512 -bios /usr/share/ovmf/OVMF.fd --enable-kvm -drive file=image.raw,format=raw -vga qxl
# this is inside the vm
mount /dev/sda1 /boot/efi
echo "quiet init=/lib/systemd/systemd-bootchart" > /etc/kernel/cmdline
# Ensure systemd remounts / as rw to avoid unrelated failures
echo "/dev/sda2 / ext4 errors=remount-ro 0 1" > /etc/fstab
kernel-install add $(uname -r) /boot/vmlinuz-$(uname -r)
# Disable getty on tty1 so that we do not loose the boot messages
systemctl mask getty@tty1
reboot
# Now we have the assertion error during boot
I notice that the errors do not present themselves if the initrd mounts the root filesystem as rw (ie, add rw to the kernel cmdline).
Right now on systems without this:
0.000s [0]
, 0K - [0]
, and a single 393K - [1]
.[1]-0.021s (null)
(no other processes listed).[123]
, etc)Things to do:
/proc/<pid>/comm
, /proc/<pid>/exe
, /proc/<pid>/cmdline
, etc for namingI'm proposing to return to X.Y versioning scheme numbering, starting with "3.0".
Rationale:
systemd-bootchart exits with an error:
matteo@turbo:~$ sudo systemctl status systemd-bootchart.service
● systemd-bootchart.service - Boot Process Profiler
Loaded: loaded (/usr/lib/systemd/system/systemd-bootchart.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Sun 2020-05-10 02:18:56 CEST; 1min 55s ago
Docs: man:systemd-bootchart.service(1)
man:bootchart.conf(5)
Process: 589 ExecStart=/usr/lib/systemd/systemd-bootchart -r (code=exited, status=1/FAILURE)
Main PID: 589 (code=exited, status=1/FAILURE)
CPU: 3.690s
mag 10 02:18:56 turbo systemd-bootchart[589]: Error opening output file '/run/log/bootchart-20200510-0218.svg': Permission denied
mag 10 02:18:56 turbo systemd[1]: systemd-bootchart.service: Main process exited, code=exited, status=1/FAILURE
mag 10 02:18:56 turbo systemd[1]: systemd-bootchart.service: Failed with result 'exit-code'.
mag 10 02:18:56 turbo systemd[1]: systemd-bootchart.service: Consumed 3.690s CPU time.
but actually the file does exist and is writable:
matteo@turbo:~$ ll /run/log/bootchart-20200510-0218.svg
-rw-rw-rw-. 1 root root 48K mag 10 02:18 /run/log/bootchart-20200510-0218.svg
I'm using Fedora 32:
matteo@turbo:~$ uname -a
Linux turbo 5.6.10-300.fc32.x86_64 #1 SMP Mon May 4 14:29:45 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux
matteo@turbo:~$ rpm -q systemd
systemd-245.4-1.fc32.x86_64
matteo@turbo:~$ rpm -q systemd-bootchart
systemd-bootchart-233-6.fc32.x86_64
This on an arm system with fairly high cpu load. systemd-bootchart is v232. I'm observing this when running systemd-bootchart manually as root.
Backtrace:
(gdb) bt full
#0 0x76e164dc in __GI_rewind (fp=0x54ad7ac0) at /usr/src/debug/glibc/2.24-r0/git/libio/rewind.c:34
__self = <optimized out>
_IO_acquire_lock_file = 0x54ad7ac0
#1 0x54aac9e4 in log_sample (ptr=0x54ac0090 <sampledata>, cpus=<synthetic pointer>, pscount=<synthetic pointer>,
ps_first=<optimized out>, sample=4, proc=<optimized out>)
at /usr/src/debug/systemd-bootchart/232+gitAUTOINC+799b7ffa8c-r0/git/src/store.c:495
filename = "785/task\000stat", '\000' <repeats 2107 times>...
ps = 0x54ad78f0
buf = "470984590 110846723 95\n\000 kB\nRss:", ' ' <repeats 19 times>, "0 kB\nPss:", ' ' <repeats 19 times>, "0 kB\nShared_Clean: 0 kB\nShared_Dirty: 0 kB\nPrivate_Clean: 0 kB\nPrivate_Dirty: 0 kB\nRefe"...
key = "cpu0\000tamp\000ackground_threshold", '\000' <repeats 226 times>
m = <optimized out>
ps_prev = 0x0
procfd = 2130696836
r = <optimized out>
n = <optimized out>
taskfd = -1705892290
vmstat = 5
val = "0\000\062\064", '\000' <repeats 251 times>
c = 0
s = <optimized out>
wt = "110846723\000\066", '\000' <repeats 244 times>
p = 0
mod = <optimized out>
buf_schedstat = 0x54adeda8 "version 15\ntimestamp 4294950060\ncpu0 0 0 136249 20668 87765 0 86832255825 49660622690 114610\n"
rt = "470984590\000\070", '\000' <repeats 244 times>
ent = <optimized out>
fd = <optimized out>
sampledata = 0x0
e_fd = 6
vmstat = 5
e_fd = 6
#2 main (argc=<optimized out>, argv=<optimized out>)
at /usr/src/debug/systemd-bootchart/232+gitAUTOINC+799b7ffa8c-r0/git/src/bootchart.c:430
res = <optimized out>
sample_stop = <optimized out>
elapsed = <optimized out>
timeleft = <optimized out>
sampledata = 0x54ad7ac0
proc = <optimized out>
build = 0x0
of = 0x0
sysfd = 2130695624
schfd = <optimized out>
ps_first = 0x54ab9414
graph_start = 7.4820000008912757e-06
log_start = 5.4580178957575345e+303
interval = <optimized out>
output_file = "785/task\000stat", '\000' <repeats 2107 times>...
datestr = "470984590 110846723 95\n\000 kB\nRss:", ' ' <repeats 19 times>, "0 kB\nPss:", ' ' <repeats 19 times>, "0 kB\nShared_Clean: 0 kB\nShared_Dirty: 0 kB\nPrivate_Clean: 0 kB\nPrivate_Dirty: 0 kB\nRefe"
pscount = <optimized out>
n_cpus = <optimized out>
overrun = 3
t = 0
r = <optimized out>
samples = 4
ps = <optimized out>
head = 0x54ad8808
sig = {__sigaction_handler = {sa_handler = 0x54ab7258 <signal_handler.lto_priv.66>,
sa_sigaction = 0x54ab7258 <signal_handler.lto_priv.66>}, sa_mask = {__val = {0 <repeats 32 times>}},
sa_flags = 0, sa_restorer = 0x0}
has_procfs = 16
__func__ = "main"
__PRETTY_FUNCTION__ = "main"
A declarative, efficient, and flexible JavaScript library for building user interfaces.
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google ❤️ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.