tracing/tools: Add the latency-collector to tools directory

This is a tool that is intended to work around the fact that the
preemptoff, irqsoff, and preemptirqsoff tracers only work in
overwrite mode. The idea is to act randomly in such a way that we
do not systematically lose any latencies, so that if enough testing
is done, all latencies will be captured. If the same burst of
latencies is repeated, then sooner or later we will have captured all
the latencies.

It also works with the wakeup_dl, wakeup_rt, and wakeup tracers.
However, in that case it is probably not useful to use the random
sleep functionality.

The reason why it may be desirable to catch all latencies with a long
test campaign is that for some organizations, it's necessary to test
the kernel in the field and not practical for developers to work
iteratively with field testers. Because of cost and project schedules
it is not possible to start a new test campaign every time a latency
problem has been fixed.

It uses inotify to detect changes to /sys/kernel/tracing/trace.
When a latency is detected, it will either sleep or print
immediately, depending on a function that act as an unfair coin
toss.

If immediate print is chosen, it means that we open
/sys/kernel/tracing/trace and thereby cause a blackout period
that will hide any subsequent latencies.

If sleep is chosen, it means that we wait before opening
/sys/kernel/tracing/trace, by default for 1000 ms, to see if
there is another latency during this period. If there is, then we will
lose the previous latency. The coin will be tossed again with a
different probability, and we will either print the new latency, or
possibly a subsequent one.

The probability for the unfair coin toss is chosen so that there
is equal probability to obtain any of the latencies in a burst.
However, this assumes that we make an assumption of how many
latencies there can be. By default  the program assumes that there
are no more than 2 latencies in a burst, the probability of immediate
printout will be:

1/2 and 1

Thus, the probability of getting each of the two latencies will be 1/2.

If we ever find that there is more than one latency in a series,
meaning that we reach the probability of 1, then the table will be
expanded to:

1/3, 1/2, and 1

Thus, we assume that there are no more than three latencies and each
with a probability of 1/3 of being captured. If the probability of 1
is reached in the new table, that is we see more than two closely
occurring latencies, then the table will again be extended, and so
on.

On my systems, it seems like this scheme works fairly well, as
long as the latencies we trace are long enough, 300 us seems to be
enough. This userspace program receive the inotify event at the end
of a latency, and it has time until the end of the next latency
to react, that is to open /sys/kernel/tracing/trace. Thus,
if we trace latencies that are >300 us, then we have at least 300 us
to react.

The minimum latency will of course not be 300 us on all systems, it
will depend on the hardware, kernel version, workload and
configuration.

Example usage:

In one shell, give the following command:
sudo latency-collector -rvv -t preemptirqsoff -s 2000 -a 3

This will trace latencies > 2000us with the preemptirqsoff tracer,
using random sleep with maximum verbosity, with a probability
table initialized to a size of 3.

In another shell, generate a few bursts of latencies:

root@host:~# modprobe preemptirq_delay_test delay=3000 test_mode=alternate
burst_size=3
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger
root@host:~# echo 1 > /sys/kernel/preemptirq_delay_test/trigger

If all goes well, you should be getting stack traces that shows
all the different latencies, i.e. you should see all the three
functions preemptirqtest_0, preemptirqtest_1, preemptirqtest_2 in the
stack traces.

Link: https://lkml.kernel.org/r/20210212134421.172750-2-Viktor.Rosendahl@bmw.de

Signed-off-by: Viktor Rosendahl <Viktor.Rosendahl@bmw.de>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
This commit is contained in:
Viktor Rosendahl 2021-02-12 14:44:21 +01:00 committed by Steven Rostedt (VMware)
parent 99e22ce73c
commit e23db805da
5 changed files with 2161 additions and 6 deletions

View File

@ -31,6 +31,7 @@ help:
@echo ' bootconfig - boot config tool'
@echo ' spi - spi tools'
@echo ' tmon - thermal monitoring and tuning tool'
@echo ' tracing - misc tracing tools'
@echo ' turbostat - Intel CPU idle stats and freq reporting tool'
@echo ' usb - USB testing tools'
@echo ' virtio - vhost test module'
@ -64,7 +65,7 @@ acpi: FORCE
cpupower: FORCE
$(call descend,power/$@)
cgroup firewire hv guest bootconfig spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging: FORCE
cgroup firewire hv guest bootconfig spi usb virtio vm bpf iio gpio objtool leds wmi pci firmware debugging tracing: FORCE
$(call descend,$@)
bpf/%: FORCE
@ -103,7 +104,7 @@ all: acpi cgroup cpupower gpio hv firewire liblockdep \
perf selftests bootconfig spi turbostat usb \
virtio vm bpf x86_energy_perf_policy \
tmon freefall iio objtool kvm_stat wmi \
pci debugging
pci debugging tracing
acpi_install:
$(call descend,power/$(@:_install=),install)
@ -111,7 +112,7 @@ acpi_install:
cpupower_install:
$(call descend,power/$(@:_install=),install)
cgroup_install firewire_install gpio_install hv_install iio_install perf_install bootconfig_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install:
cgroup_install firewire_install gpio_install hv_install iio_install perf_install bootconfig_install spi_install usb_install virtio_install vm_install bpf_install objtool_install wmi_install pci_install debugging_install tracing_install:
$(call descend,$(@:_install=),install)
liblockdep_install:
@ -137,7 +138,8 @@ install: acpi_install cgroup_install cpupower_install gpio_install \
perf_install selftests_install turbostat_install usb_install \
virtio_install vm_install bpf_install x86_energy_perf_policy_install \
tmon_install freefall_install objtool_install kvm_stat_install \
wmi_install pci_install debugging_install intel-speed-select_install
wmi_install pci_install debugging_install intel-speed-select_install \
tracing_install
acpi_clean:
$(call descend,power/acpi,clean)
@ -145,7 +147,7 @@ acpi_clean:
cpupower_clean:
$(call descend,power/cpupower,clean)
cgroup_clean hv_clean firewire_clean bootconfig_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean:
cgroup_clean hv_clean firewire_clean bootconfig_clean spi_clean usb_clean virtio_clean vm_clean wmi_clean bpf_clean iio_clean gpio_clean objtool_clean leds_clean pci_clean firmware_clean debugging_clean tracing_clean:
$(call descend,$(@:_clean=),clean)
liblockdep_clean:
@ -184,6 +186,6 @@ clean: acpi_clean cgroup_clean cpupower_clean hv_clean firewire_clean \
vm_clean bpf_clean iio_clean x86_energy_perf_policy_clean tmon_clean \
freefall_clean build_clean libbpf_clean libsubcmd_clean liblockdep_clean \
gpio_clean objtool_clean leds_clean wmi_clean pci_clean firmware_clean debugging_clean \
intel-speed-select_clean
intel-speed-select_clean tracing_clean
.PHONY: FORCE

19
tools/tracing/Makefile Normal file
View File

@ -0,0 +1,19 @@
# SPDX-License-Identifier: GPL-2.0
include ../scripts/Makefile.include
all: latency
clean: latency_clean
install: latency_install
latency:
$(call descend,latency)
latency_install:
$(call descend,latency,install)
latency_clean:
$(call descend,latency,clean)
.PHONY: all install clean latency latency_install latency_clean

2
tools/tracing/latency/.gitignore vendored Normal file
View File

@ -0,0 +1,2 @@
# SPDX-License-Identifier: GPL-2.0
latency-collector

View File

@ -0,0 +1,24 @@
# SPDX-License-Identifier: GPL-2.0
# Makefile for vm tools
#
VAR_CFLAGS := $(shell pkg-config --cflags libtracefs 2>/dev/null)
VAR_LDLIBS := $(shell pkg-config --libs libtracefs 2>/dev/null)
TARGETS = latency-collector
CFLAGS = -Wall -Wextra -g -O2 $(VAR_CFLAGS)
LDFLAGS = -lpthread $(VAR_LDLIBS)
all: $(TARGETS)
%: %.c
$(CC) $(CFLAGS) -o $@ $< $(LDFLAGS)
clean:
$(RM) latency-collector
prefix ?= /usr/local
sbindir ?= ${prefix}/sbin
install: all
install -d $(DESTDIR)$(sbindir)
install -m 755 -p $(TARGETS) $(DESTDIR)$(sbindir)

File diff suppressed because it is too large Load Diff