Using `perf` to ensure realtime safety
`perf` is a tool built into the Linux kernel that allows profiling and tracing. It can be quite handy for finding operations that may compromise real-time applications, such as system calls, memory allocations, or lock waits in a real-time thread. This is a TL;DR guide on how to find offending system calls via `perf trace`.
Identify your threads
The first step is to identify the threads that we want to trace:
ps -eo pid,tid,rtprio,command,comm -L
This lists all threads, including their real-time priority (`rtprio`). If the threads are named, it is easy to identify the thread IDs (`tid`) that we want to trace. For the rest of this entry, let's assume the thread IDs are `1234,1235,1236`.
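To avoid picking the thread IDs by eye, the `ps` output can be filtered for threads that have a real-time priority. The following is a minimal sketch rather than part of the original workflow: the helper name `rt_tids` is hypothetical, and it assumes a procps-style `ps` that prints `-` in the `rtprio` column for threads without a real-time priority.

```shell
# rt_tids: read "TID RTPRIO NAME" lines on stdin and print the TIDs of all
# threads with a real-time priority as one comma-separated list.
# Assumption: non-real-time threads show "-" in the RTPRIO column.
rt_tids() {
    awk '
        # skip the header line and every thread without a real-time priority
        $1 ~ /^[0-9]+$/ && $2 != "-" { printf "%s%s", sep, $1; sep = "," }
        END { if (sep != "") print "" }
    '
}

# Example (hypothetical process ID 1234):
#   ps -L -p 1234 -o tid,rtprio,comm | rt_tids
```

The resulting list has the same comma-separated form as `1234,1235,1236` and can be passed to the `-t` option of `perf trace`.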
Trace the thread(s)
Let’s see what we get:
sudo perf trace -t 1234,1235,1236
This prints the trace events to the terminal in real time. By default it traces system calls (which may or may not be acceptable in a real-time thread).
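The live output can get long, so it may help to condense a captured trace into per-syscall counts. This is a minimal sketch, assuming the usual `perf trace` line layout of `timestamp ( duration ms): comm/tid syscall(args) = result`. The helper name `syscall_counts` is hypothetical, and the exact layout may differ between perf versions.

```shell
# syscall_counts: read perf trace output on stdin and print one
# "COUNT NAME" line per system call, most frequent first.
# Assumption: each event line contains the call as "name(args...) = result".
syscall_counts() {
    awk '
        {
            # the syscall name is the first field that looks like "name("
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^[a-z_0-9]+\(/) {
                    split($i, parts, "(")
                    count[parts[1]]++
                    break
                }
            }
        }
        END { for (name in count) printf "%d %s\n", count[name], name }
    ' | sort -k1,1nr -k2,2
}

# Example: write the trace to a file first, then summarise it.
#   sudo perf trace -t 1234,1235,1236 -o trace.txt
#   syscall_counts < trace.txt
```

`perf trace` also has a built-in `-s` (`--summary`) option that reports per-thread syscall statistics. The sketch above is only meant for cases where a plain count over an already captured trace is enough.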
Record interesting tracepoints
`perf` supports quite a number of tracepoints. Some of them are available to regular users, some require superuser permissions:
# show user-accessible tracepoints
perf list
# show tracepoints that require super user access
sudo perf list
Some of the more interesting tracepoints concern memory allocations or waiting for a mutex in the real-time thread. We first have to install the probes:
# probes are installed system-wide; the threads are selected later, when recording
# memory subsystem (not only, but mainly malloc/free)
sudo perf probe 'sdt_libc:memory*'
# waiting via pthread
sudo perf probe 'sdt_libpthread:*wait'
Now we can record some traces:
sudo perf trace record -t 1234,1235,1236 -e 'sdt_libc:memory*' -e 'sdt_libpthread:*wait' --call-graph dwarf
# records until we press Ctrl-C to stop
And analyse the result:
sudo perf report -g --stdio
Shoulders of giants
Interesting articles:
http://www.brendangregg.com/perf.html
https://leezhenghui.github.io/linux/2019/03/05/exploring-usdt-on-linux.html