Using `perf` to ensure realtime safety
perf is a profiling and tracing tool that ships with the Linux kernel. It can be quite handy for finding evil operations that may compromise real-time applications. This is a tl;dr guide on how to find offending system calls via perf trace.
Identify your threads
The first step is to identify the threads that we want to trace:
ps -eo pid,tid,rtprio,command,comm -L
gives you the list of all threads, including their real-time priority (rtprio). If the threads are named, it’s easy to pick out the thread IDs (tid) that we want to trace. For the rest of this entry, let’s assume the threads are 1234, 1235 and 1236.
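Picking the tids out by eye gets tedious with many threads. Here is a minimal sketch of a filter that collects them, assuming the tid is in the first column and rtprio in the second (threads without a real-time priority show `-` there). The function name `rt_tids` and the PID 4242 are made up for illustration.

```shell
# rt_tids: read `ps` output with TID in column 1 and RTPRIO in column 2
# and print the TIDs of all threads with a numeric rtprio, comma-separated.
# The header line and threads with rtprio "-" are skipped.
rt_tids() {
    awk '$2 ~ /^[0-9]+$/ { printf "%s%s", sep, $1; sep = "," }
         END { if (sep != "") print "" }'
}

# Example (4242 is a placeholder PID):
#   ps -L -o tid,rtprio,comm -p 4242 | rt_tids
```

The comma-separated result has the same shape as the thread list passed to `-t` in the commands that follow.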
Trace the thread(s)
Let’s see what we get:
sudo perf trace -t 1234,1235,1236
This prints the trace points to the terminal in real time. By default it traces system calls (which may or may not be acceptable in a real-time thread).
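A live trace scrolls by quickly, so it can help to save it with `-o trace.txt` and count which system calls show up most often. The following is only a sketch: it assumes each line looks roughly like `0.000 ( 0.004 ms): comm/tid name(args) = ret`, which may differ between perf versions, and the function name `count_syscalls` is made up.

```shell
# count_syscalls: read `perf trace` text output on stdin and print
# "<count> <syscall>" lines, most frequent first.
# Assumed line format: "<time> ( <duration> ms): <comm>/<tid> <name>(<args>) = <ret>"
count_syscalls() {
    awk '
        /\[continued\]/ { next }              # second half of an interrupted call
        {
            i = index($0, "): ")              # end of the duration column
            if (i == 0) next
            rest = substr($0, i + 3)
            if (match(rest, /[a-z_0-9]+\(/))  # first "name(" after the duration
                count[substr(rest, RSTART, RLENGTH - 1)]++
        }
        END { for (name in count) print count[name], name }
    ' | sort -k1,1nr -k2,2
}

# Example:
#   sudo perf trace -t 1234,1235,1236 -o trace.txt   # stop with Ctrl-C
#   count_syscalls < trace.txt
```

If the format assumption holds for your perf version, calls such as futex, mmap or write near the top of the list are good candidates for a closer look.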
Record interesting tracepoints
perf supports quite a number of trace points. Some of them are available to regular users, some require superuser permissions:
# show user-accessible tracepoints
perf list
# show tracepoints that require super user access
sudo perf list
Some of the more interesting trace points cover memory allocations or waiting for a mutex from the RT thread. We first have to install probes:
# probes are installed system-wide; the threads are selected later when recording
# memory subsystem (not only, but mainly malloc/free)
sudo perf probe 'sdt_libc:memory*'
# waiting via pthread
sudo perf probe 'sdt_libpthread:*wait'
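Which SDT probes exist depends on how your libc was built, so it can be worth checking before probing. A minimal sketch, assuming `readelf -n` prints the SystemTap notes with `Provider:` and `Name:` lines. The function name `list_sdt_probes` is made up and the library path is only a placeholder.

```shell
# list_sdt_probes: read `readelf -n` output on stdin and print one
# "provider:name" line per SystemTap SDT note found.
list_sdt_probes() {
    awk '$1 == "Provider:" { provider = $2 }
         $1 == "Name:" && provider != "" { print provider ":" $2; provider = "" }'
}

# Example (the libc path is distribution-specific and only a placeholder):
#   readelf -n /lib/x86_64-linux-gnu/libc.so.6 | list_sdt_probes | grep memory
```

An entry such as `libc:memory_heap_new` corresponds to the event perf calls `sdt_libc:memory_heap_new`.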
Now we can record some traces:
sudo perf trace record -t 1234,1235,1236 -e 'sdt_libc:memory*' -e 'sdt_libpthread:*wait' --call-graph dwarf
# records until we stop it with Ctrl-C
And analyse the result:
sudo perf report -g --stdio
Shoulders of giants
Interesting articles: http://www.brendangregg.com/perf.html https://leezhenghui.github.io/linux/2019/03/05/exploring-usdt-on-linux.html