Merge tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace

Pull tracing updates from Steven Rostedt:

 - Added option for per CPU threads to the hwlat tracer

 - Have hwlat tracer handle hotplug CPUs

 - New tracer: osnoise, that detects latency caused by interrupts,
   softirqs and scheduling of other tasks.

 - Added timerlat tracer that creates a thread and measures in detail
   what sources of latency it has for wake ups.

 - Removed the "success" field of the sched_wakeup trace event. This has
   been hardcoded as "1" since 2015, no tooling should be looking at it
   now. If one exists, we can revert this commit, fix that tool and try
   to remove it again in the future.

 - tgid mapping fixed to handle more than PID_MAX_DEFAULT pids/tgids.

 - New boot command line option "tp_printk_stop_on_boot". tp_printk causes
   trace events to be written to the console, and once user space starts
   this can easily live lock the system. A boot option that stops the
   printing just after boot up prevents that from happening.

 - Have ftrace_dump_on_oops boot command line option take numbers that
   match the numbers shown in /proc/sys/kernel/ftrace_dump_on_oops.

 - Bootconfig clean ups, fixes and enhancements.

 - New ktest script that tests bootconfig options.

 - Add tracepoint_probe_register_may_exist() to register a tracepoint
   without triggering a WARN*() if it already exists. BPF has a path
   from user space that can do this. All other paths are considered a
   bug.

 - Small clean ups and fixes

* tag 'trace-v5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-trace: (49 commits)
  tracing: Resize tgid_map to pid_max, not PID_MAX_DEFAULT
  tracing: Simplify & fix saved_tgids logic
  treewide: Add missing semicolons to __assign_str uses
  tracing: Change variable type as bool for clean-up
  trace/timerlat: Fix indentation on timerlat_main()
  trace/osnoise: Make 'noise' variable s64 in run_osnoise()
  tracepoint: Add tracepoint_probe_register_may_exist() for BPF tracing
  tracing: Fix spelling in osnoise tracer "interferences" -> "interference"
  Documentation: Fix a typo on trace/osnoise-tracer
  trace/osnoise: Fix return value on osnoise_init_hotplug_support
  trace/osnoise: Make interval u64 on osnoise_main
  trace/osnoise: Fix 'no previous prototype' warnings
  tracing: Have osnoise_main() add a quiescent state for task rcu
  seq_buf: Make trace_seq_putmem_hex() support data longer than 8
  seq_buf: Fix overflow in seq_buf_putmem_hex()
  trace/osnoise: Support hotplug operations
  trace/hwlat: Support hotplug operations
  trace/hwlat: Protect kdata->kthread with get/put_online_cpus
  trace: Add timerlat tracer
  trace: Add osnoise tracer
  ...
Linus Torvalds 2021-07-03 11:13:22 -07:00
commit 757fa80f4e
74 changed files with 4464 additions and 394 deletions


@ -89,13 +89,35 @@ you can use ``+=`` operator. For example::
In this case, the key ``foo`` has ``bar``, ``baz`` and ``qux``.
-However, a sub-key and a value can not co-exist under a parent key.
-For example, following config is NOT allowed.::
+Moreover, sub-keys and a value can coexist under a parent key.
+For example, following config is allowed.::
foo = value1
-foo.bar = value2 # !ERROR! subkey "bar" and value "value1" can NOT co-exist
-foo.bar := value2 # !ERROR! even with the override operator, this is NOT allowed.
+foo.bar = value2
+foo := value3 # This will update foo's value.
+Note, since there is no syntax to put a raw value directly under a
+structured key, you have to define it outside of the brace. For example::
+foo {
+bar = value1
+bar {
+baz = value2
+qux = value3
+}
+}
+Also, the order of the value node under a key is fixed. If there
+are a value and subkeys, the value is always the first child node
+of the key. Thus if user specifies subkeys first, e.g.::
+foo.bar = value1
+foo = value2
+In the program (and /proc/bootconfig), it will be shown as below::
+foo = value2
+foo.bar = value1
Comments
--------


@ -5672,12 +5672,25 @@
Note, echoing 1 into this file without the
tracepoint_printk kernel cmdline option has no effect.
The tp_printk_stop_on_boot (see below) can also be used
to stop the printing of events to console at
late_initcall_sync.
** CAUTION **
Having tracepoints sent to printk() and activating high
frequency tracepoints such as irq or sched, can cause
the system to live lock.
tp_printk_stop_on_boot[FTRACE]
When tp_printk (above) is set, it can cause a lot of noise
on the console. It may be useful to only include the
printing of events during boot up, as user space may
make the system inoperable.
This command line option will stop the printing of events
to console at the late_initcall_sync() time frame.
traceoff_on_warning
[FTRACE] enable this option to disable tracing when a
warning is hit. This turns off "tracing_on". Tracing can


@ -99,6 +99,12 @@ These options are setting per-event options.
ftrace.[instance.INSTANCE.]event.GROUP.EVENT.enable
Enable GROUP:EVENT tracing.
ftrace.[instance.INSTANCE.]event.GROUP.enable
Enable all event tracing within GROUP.
ftrace.[instance.INSTANCE.]event.enable
Enable all event tracing.
ftrace.[instance.INSTANCE.]event.GROUP.EVENT.filter = FILTER
Set FILTER rule to the GROUP:EVENT.


@ -76,8 +76,13 @@ in /sys/kernel/tracing:
 - tracing_cpumask - the CPUs to move the hwlat thread across
 - hwlat_detector/width - specified amount of time to spin within window (usecs)
 - hwlat_detector/window - amount of time between (width) runs (usecs)
+- hwlat_detector/mode - the thread mode
-The hwlat detector's kernel thread will migrate across each CPU specified in
-tracing_cpumask between each window. To limit the migration, either modify
-tracing_cpumask, or modify the hwlat kernel thread (named [hwlatd]) CPU
-affinity directly, and the migration will stop.
+By default, one hwlat detector's kernel thread will migrate across each CPU
+specified in cpumask at the beginning of a new window, in a round-robin
+fashion. This behavior can be changed by changing the thread mode,
+the available options are:
+- none: do not force migration
+- round-robin: migrate across each CPU specified in cpumask [default]
+- per-cpu: create one thread for each cpu in tracing_cpumask


@ -23,6 +23,8 @@ Linux Tracing Technologies
histogram-design
boottime-trace
hwlat_detector
osnoise-tracer
timerlat-tracer
intel_th
ring-buffer-design
stm


@ -0,0 +1,152 @@
==============
OSNOISE Tracer
==============
In the context of high-performance computing (HPC), the Operating System
Noise (*osnoise*) refers to the interference experienced by an application
due to activities inside the operating system. In the context of Linux,
NMIs, IRQs, SoftIRQs, and any other system thread can cause noise to the
system. Moreover, hardware-related jobs can also cause noise, for example,
via SMIs.
hwlat_detector is one of the tools used to identify the most complex
source of noise: *hardware noise*.
In a nutshell, the hwlat_detector creates a thread that runs
periodically for a given period. At the beginning of a period, the thread
disables interrupts and starts sampling. While running, the hwlatd
thread reads the time in a loop. As interrupts are disabled, threads,
IRQs, and SoftIRQs cannot interfere with the hwlatd thread. Hence, the
cause of any gap between two different reads of the time lies either in
an NMI or in the hardware itself. At the end of the period, hwlatd enables
interrupts and reports the max observed gap between the reads. It also
prints an NMI occurrence counter. If the output does not report NMI
executions, the user can conclude that the hardware is the culprit for
the latency. The hwlat detects the NMI execution by observing
the entry and exit of an NMI.
The osnoise tracer leverages the hwlat_detector by running a
similar loop with preemption, SoftIRQs and IRQs enabled, thus allowing
all the sources of *osnoise* during its execution. Using the same approach
as hwlat, osnoise takes note of the entry and exit point of any
source of interference, increasing a per-cpu interference counter. The
osnoise tracer also saves an interference counter for each source of
interference. The interference counter for NMI, IRQs, SoftIRQs, and
threads is increased anytime the tool observes these interferences' entry
events. When a noise happens without any interference from the operating
system level, the hardware noise counter increases, pointing to a
hardware-related noise. In this way, osnoise can account for any
source of interference. At the end of the period, the osnoise tracer
prints the sum of all noise, the max single noise, the percentage of CPU
available for the thread, and the counters for the noise sources.
Usage
-----
Write the ASCII text "osnoise" into the current_tracer file of the
tracing system (generally mounted at /sys/kernel/tracing).
For example::
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo osnoise > current_tracer
It is possible to follow the trace by reading the trace file::
[root@f32 tracing]# cat trace
# tracer: osnoise
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth MAX
# || / SINGLE Interference counters:
# |||| RUNTIME NOISE % OF CPU NOISE +-----------------------------+
# TASK-PID CPU# |||| TIMESTAMP IN US IN US AVAILABLE IN US HW NMI IRQ SIRQ THREAD
# | | | |||| | | | | | | | | | |
<...>-859 [000] .... 81.637220: 1000000 190 99.98100 9 18 0 1007 18 1
<...>-860 [001] .... 81.638154: 1000000 656 99.93440 74 23 0 1006 16 3
<...>-861 [002] .... 81.638193: 1000000 5675 99.43250 202 6 0 1013 25 21
<...>-862 [003] .... 81.638242: 1000000 125 99.98750 45 1 0 1011 23 0
<...>-863 [004] .... 81.638260: 1000000 1721 99.82790 168 7 0 1002 49 41
<...>-864 [005] .... 81.638286: 1000000 263 99.97370 57 6 0 1006 26 2
<...>-865 [006] .... 81.638302: 1000000 109 99.98910 21 3 0 1006 18 1
<...>-866 [007] .... 81.638326: 1000000 7816 99.21840 107 8 0 1016 39 19
In addition to the regular trace fields (from TASK-PID to TIMESTAMP), the
tracer prints a message at the end of each period for each CPU that is
running an osnoise/ thread. The osnoise specific fields report:
- The RUNTIME IN US reports the amount of time in microseconds that
the osnoise thread kept looping reading the time.
- The NOISE IN US reports the sum of noise in microseconds observed
by the osnoise tracer during the associated runtime.
- The % OF CPU AVAILABLE reports the percentage of CPU available for
the osnoise thread during the runtime window.
- The MAX SINGLE NOISE IN US reports the maximum single noise observed
during the runtime window.
- The Interference counters display how many times each of the respective
interferences happened during the runtime window.
Note that the example above shows a high number of HW noise samples.
The reason is that this sample was taken on a virtual machine,
and the host interference is detected as a hardware interference.
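As a rough illustration of how the % OF CPU AVAILABLE column relates to the RUNTIME and NOISE columns, the Python sketch below recomputes it for the first three rows of the sample output above. The helper name ``cpu_available_pct`` is hypothetical and not part of the tracer; the sketch only assumes that the percentage is (runtime - noise) / runtime.

```python
def cpu_available_pct(runtime_us: int, noise_us: int) -> float:
    """Percentage of the runtime window that was not consumed by noise.

    Assumption for illustration: available = (runtime - noise) / runtime.
    """
    if runtime_us <= 0:
        raise ValueError("runtime_us must be positive")
    return (runtime_us - noise_us) * 100.0 / runtime_us


# (RUNTIME IN US, NOISE IN US, % OF CPU AVAILABLE) copied from the sample trace
samples = [
    (1000000, 190, 99.98100),
    (1000000, 656, 99.93440),
    (1000000, 5675, 99.43250),
]

for runtime_us, noise_us, reported in samples:
    computed = cpu_available_pct(runtime_us, noise_us)
    print(f"computed {computed:.5f} reported {reported:.5f}")
```

For these rows the recomputed value agrees with the reported column to five decimal places.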
Tracer options
---------------------
The tracer has a set of options inside the osnoise directory, they are:
- osnoise/cpus: CPUs at which an osnoise thread will execute.
- osnoise/period_us: the period of the osnoise thread.
- osnoise/runtime_us: how long an osnoise thread will look for noise.
- osnoise/stop_tracing_us: stop the system tracing if a single noise
higher than the configured value happens. Writing 0 disables this
option.
- osnoise/stop_tracing_total_us: stop the system tracing if total noise
higher than the configured value happens. Writing 0 disables this
option.
- tracing_threshold: the minimum delta between two time() reads to be
considered as noise, in us. When set to 0, the default value will
be used, which is currently 5 us.
Additional Tracing
------------------
In addition to the tracer, a set of tracepoints were added to
facilitate the identification of the osnoise source.
- osnoise:sample_threshold: printed anytime a noise is higher than
the configurable tolerance_ns.
- osnoise:nmi_noise: noise from NMI, including the duration.
- osnoise:irq_noise: noise from an IRQ, including the duration.
- osnoise:softirq_noise: noise from a SoftIRQ, including the
duration.
- osnoise:thread_noise: noise from a thread, including the duration.
Note that all the values are *net values*. For example, if while osnoise
is running, another thread preempts the osnoise thread, it will start a
thread_noise duration at the start. Then, an IRQ takes place, preempting
the thread_noise, starting an irq_noise. When the IRQ ends its execution,
it will compute its duration, and this duration will be subtracted from
the thread_noise, in such a way as to avoid the double accounting of the
IRQ execution. This logic is valid for all sources of noise.
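The net-value accounting described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the kernel implementation: the function name ``net_durations``, the interval tuples and the numbers are all hypothetical, and the sketch assumes that every nested interval lies completely inside the one it preempts.

```python
def net_durations(intervals):
    """Return {name: net_ns} for properly nested (name, start_ns, end_ns) tuples.

    The gross duration of each interval is subtracted from the interval it
    directly preempted, so no nanosecond is counted twice.
    Names are assumed to be unique in this sketch.
    """
    net = {}
    stack = []  # intervals still running: (name, end_ns)
    # Sort by start time; on ties the longer (outer) interval comes first.
    for name, start, end in sorted(intervals, key=lambda i: (i[1], -i[2])):
        # Drop intervals that already finished before this one started.
        while stack and stack[-1][1] <= start:
            stack.pop()
        net[name] = end - start
        if stack:
            # This interval preempted the one on top of the stack.
            net[stack[-1][0]] -= end - start
        stack.append((name, end))
    return net


# Hypothetical timeline: a thread preempts osnoise for 10000 ns, and an IRQ
# of 3000 ns fires while that thread is running.
result = net_durations([("thread_noise", 0, 10000), ("irq_noise", 2000, 5000)])
print(result)
```

With these made-up numbers the thread is charged 7000 ns and the IRQ 3000 ns, so the net values still add up to the 10000 ns of gross interference.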
Here is one example of the usage of these tracepoints::
osnoise/8-961 [008] d.h. 5789.857532: irq_noise: local_timer:236 start 5789.857529929 duration 1845 ns
osnoise/8-961 [008] dNh. 5789.858408: irq_noise: local_timer:236 start 5789.858404871 duration 2848 ns
migration/8-54 [008] d... 5789.858413: thread_noise: migration/8:54 start 5789.858409300 duration 3068 ns
osnoise/8-961 [008] .... 5789.858413: sample_threshold: start 5789.858404555 duration 8812 ns interferences 2
In this example, a noise sample of 8 microseconds was reported in the last
line, pointing to two interferences. Looking backward in the trace, the
two previous entries were about the migration thread running after a
timer IRQ execution. The first event is not part of the noise because
it took place one millisecond before.
It is worth noticing that the sum of the duration reported in the
tracepoints is smaller than eight us reported in the sample_threshold.
The reason lies in the overhead of the entry and exit code that happens
before and after any interference execution. This justifies the dual
approach: measuring thread and tracing.


@ -0,0 +1,181 @@
###############
Timerlat tracer
###############
The timerlat tracer aims to help preemptive kernel developers
find sources of wakeup latencies of real-time threads. Like cyclictest,
the tracer sets a periodic timer that wakes up a thread. The thread then
computes a *wakeup latency* value as the difference between the *current
time* and the *absolute time* that the timer was set to expire. The main
goal of timerlat is to trace in a way that helps kernel developers.
Usage
-----
Write the ASCII text "timerlat" into the current_tracer file of the
tracing system (generally mounted at /sys/kernel/tracing).
For example::
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo timerlat > current_tracer
It is possible to follow the trace by reading the trace file::
[root@f32 tracing]# cat trace
# tracer: timerlat
#
# _-----=> irqs-off
# / _----=> need-resched
# | / _---=> hardirq/softirq
# || / _--=> preempt-depth
# || /
# |||| ACTIVATION
# TASK-PID CPU# |||| TIMESTAMP ID CONTEXT LATENCY
# | | | |||| | | | |
<idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns
<...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns
<idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns
<...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns
<idle>-0 [000] d.h1 54.030328: #2 context irq timer_latency 769 ns
<...>-867 [000] .... 54.030330: #2 context thread timer_latency 3070 ns
<idle>-0 [001] d.h1 54.030344: #2 context irq timer_latency 935 ns
<...>-868 [001] .... 54.030347: #2 context thread timer_latency 4351 ns
The tracer creates a per-cpu kernel thread with real-time priority that
prints two lines at every activation. The first is the *timer latency*
observed at the *hardirq* context before the activation of the thread.
The second is the *timer latency* observed by the thread. The ACTIVATION
ID field serves to relate the *irq* execution to its respective *thread*
execution.
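To show how the ACTIVATION ID ties the two lines together, here is a small Python sketch that pairs the *irq* and *thread* latencies per CPU and activation. It is an assumption-laden example: the regular expression only targets lines shaped like the sample output above, and the names ``LINE_RE`` and ``pair_latencies`` are hypothetical, not part of any kernel tooling.

```python
import re

# Matches e.g. "<idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns"
LINE_RE = re.compile(
    r"\[(?P<cpu>\d+)\]\s+\S+\s+[\d.]+:\s+#(?P<id>\d+)\s+context\s+"
    r"(?P<ctx>irq|thread)\s+timer_latency\s+(?P<lat>\d+)\s+ns"
)


def pair_latencies(lines):
    """Group latencies as {(cpu, activation_id): {"irq": ns, "thread": ns}}."""
    pairs = {}
    for line in lines:
        m = LINE_RE.search(line)
        if m is None:
            continue  # header or unrelated event
        key = (int(m["cpu"]), int(m["id"]))
        pairs.setdefault(key, {})[m["ctx"]] = int(m["lat"])
    return pairs


sample = [
    "<idle>-0 [000] d.h1 54.029328: #1 context irq timer_latency 932 ns",
    "<...>-867 [000] .... 54.029339: #1 context thread timer_latency 11700 ns",
    "<idle>-0 [001] dNh1 54.029346: #1 context irq timer_latency 2833 ns",
    "<...>-868 [001] .... 54.029353: #1 context thread timer_latency 9820 ns",
]

for (cpu, act_id), lat in sorted(pair_latencies(sample).items()):
    print(f"cpu {cpu} #{act_id}: irq {lat['irq']} ns, thread {lat['thread']} ns")
```

Keying on the CPU as well as the ID matters because, as the sample shows, each per-cpu thread counts its own activations.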
The *irq*/*thread* splitting is important to clarify at which context
the unexpected high value is coming from. The *irq* context can be
delayed by hardware related actions, such as SMIs, NMIs, IRQs
or by a thread masking interrupts. Once the timer happens, the delay
can also be influenced by blocking caused by threads. For example, by
postponing the scheduler execution via preempt_disable(), by the
scheduler execution, or by masking interrupts. Threads can
also be delayed by the interference from other threads and IRQs.
Tracer options
---------------------
The timerlat tracer is built on top of osnoise tracer.
So its configuration is also done in the osnoise/ config
directory. The timerlat configs are:
- cpus: CPUs at which a timerlat thread will execute.
- timerlat_period_us: the period of the timerlat thread.
- osnoise/stop_tracing_us: stop the system tracing if a
timer latency at the *irq* context higher than the configured
value happens. Writing 0 disables this option.
- stop_tracing_total_us: stop the system tracing if a
timer latency at the *thread* context higher than the configured
value happens. Writing 0 disables this option.
- print_stack: save the stack of the IRQ occurrence, and print
it after the *thread context* event.
timerlat and osnoise
----------------------------
The timerlat can also take advantage of the osnoise: traceevents.
For example::
[root@f32 ~]# cd /sys/kernel/tracing/
[root@f32 tracing]# echo timerlat > current_tracer
[root@f32 tracing]# echo 1 > events/osnoise/enable
[root@f32 tracing]# echo 25 > osnoise/stop_tracing_total_us
[root@f32 tracing]# tail -10 trace
cc1-87882 [005] d..h... 548.771078: #402268 context irq timer_latency 13585 ns
cc1-87882 [005] dNLh1.. 548.771082: irq_noise: local_timer:236 start 548.771077442 duration 7597 ns
cc1-87882 [005] dNLh2.. 548.771099: irq_noise: qxl:21 start 548.771085017 duration 7139 ns
cc1-87882 [005] d...3.. 548.771102: thread_noise: cc1:87882 start 548.771078243 duration 9909 ns
timerlat/5-1035 [005] ....... 548.771104: #402268 context thread timer_latency 39960 ns
In this case, the root cause of the timer latency does not point to a
single cause, but to multiple ones. Firstly, the timer IRQ was delayed
for 13 us, which may point to a long IRQ disabled section (see IRQ
stacktrace section). Then the timer interrupt that wakes up the timerlat
thread took 7597 ns, and the qxl:21 device IRQ took 7139 ns. Finally,
the cc1 thread noise took 9909 ns of time before the context switch.
Such pieces of evidence are useful for the developer to use other
tracing methods to figure out how to debug and optimize the system.
It is worth mentioning that the *duration* values reported
by the osnoise: events are *net* values. For example, the
thread_noise does not include the duration of the overhead caused
by the IRQ execution (which indeed accounted for 12736 ns). But
the values reported by the timerlat tracer (timer_latency)
are *gross* values.
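A rough way to relate the gross and net numbers is to add the *irq* timer latency to the net osnoise: durations and compare the total with the *thread* timer latency. The Python sketch below does this bookkeeping with the values from the trace above. It assumes the reported intervals are roughly disjoint, which is an approximation, and the variable names are made up for the example.

```python
# Gross values reported by the timerlat tracer in the trace above.
irq_latency_ns = 13585
thread_latency_ns = 39960

# Net durations reported by the osnoise: events in the same trace.
net_noise_ns = {
    "irq_noise local_timer:236": 7597,
    "irq_noise qxl:21": 7139,
    "thread_noise cc1:87882": 9909,
}

# Assumption: treat the IRQ delay and the net noise as non-overlapping pieces.
accounted_ns = irq_latency_ns + sum(net_noise_ns.values())
unaccounted_ns = thread_latency_ns - accounted_ns

print(f"accounted:   {accounted_ns} ns")    # 38230 ns
print(f"unaccounted: {unaccounted_ns} ns")  # 1730 ns
```

Under that assumption most of the thread latency is covered by the traced events, and the small remainder is plausibly time the events do not cover, such as entry and exit code and the context switch.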
The art below illustrates a CPU timeline and how the timerlat tracer
observes it at the top and the osnoise: events at the bottom. Each "-"
in the timelines means circa 1 us, and the time moves ==>::
External timer irq thread
clock latency latency
event 13585 ns 39960 ns
| ^ ^
v | |
|-------------| |
|-------------+-------------------------|
^ ^
========================================================================
[tmr irq] [dev irq]
[another thread...^ v..^ v.......][timerlat/ thread] <-- CPU timeline
=========================================================================
|-------| |-------|
|--^ v-------|
| | |
| | + thread_noise: 9909 ns
| +-> irq_noise: 7139 ns
+-> irq_noise: 7597 ns
IRQ stacktrace
---------------------------
The osnoise/print_stack option is helpful for the cases in which a thread
noise is the major factor in the timer latency, because preemption or
IRQs were disabled. For example::
[root@f32 tracing]# echo 500 > osnoise/stop_tracing_total_us
[root@f32 tracing]# echo 500 > osnoise/print_stack
[root@f32 tracing]# echo timerlat > current_tracer
[root@f32 tracing]# tail -21 per_cpu/cpu7/trace
insmod-1026 [007] dN.h1.. 200.201948: irq_noise: local_timer:236 start 200.201939376 duration 7872 ns
insmod-1026 [007] d..h1.. 200.202587: #29800 context irq timer_latency 1616 ns
insmod-1026 [007] dN.h2.. 200.202598: irq_noise: local_timer:236 start 200.202586162 duration 11855 ns
insmod-1026 [007] dN.h3.. 200.202947: irq_noise: local_timer:236 start 200.202939174 duration 7318 ns
insmod-1026 [007] d...3.. 200.203444: thread_noise: insmod:1026 start 200.202586933 duration 838681 ns
timerlat/7-1001 [007] ....... 200.203445: #29800 context thread timer_latency 859978 ns
timerlat/7-1001 [007] ....1.. 200.203446: <stack trace>
=> timerlat_irq
=> __hrtimer_run_queues
=> hrtimer_interrupt
=> __sysvec_apic_timer_interrupt
=> asm_call_irq_on_stack
=> sysvec_apic_timer_interrupt
=> asm_sysvec_apic_timer_interrupt
=> delay_tsc
=> dummy_load_1ms_pd_init
=> do_one_initcall
=> do_init_module
=> __do_sys_finit_module
=> do_syscall_64
=> entry_SYSCALL_64_after_hwframe
In this case, it is possible to see that the thread added the highest
contribution to the *timer latency* and the stack trace, saved during
the timerlat IRQ handler, points to a function named
dummy_load_1ms_pd_init, which had the following code (on purpose)::
static int __init dummy_load_1ms_pd_init(void)
{
preempt_disable();
mdelay(1);
preempt_enable();
return 0;
}


@ -102,6 +102,7 @@ obj-$(CONFIG_FUNCTION_TRACER) += ftrace_$(BITS).o
obj-$(CONFIG_FUNCTION_GRAPH_TRACER) += ftrace.o
obj-$(CONFIG_FTRACE_SYSCALLS) += ftrace.o
obj-$(CONFIG_X86_TSC) += trace_clock.o
obj-$(CONFIG_TRACING) += trace.o
obj-$(CONFIG_CRASH_CORE) += crash_core_$(BITS).o
obj-$(CONFIG_KEXEC_CORE) += machine_kexec_$(BITS).o
obj-$(CONFIG_KEXEC_CORE) += relocate_kernel_$(BITS).o crash.o

arch/x86/kernel/trace.c Normal file

@ -0,0 +1,234 @@
#include <asm/trace/irq_vectors.h>
#include <linux/trace.h>
#if defined(CONFIG_OSNOISE_TRACER) && defined(CONFIG_X86_LOCAL_APIC)
/*
* trace_intel_irq_entry - record intel specific IRQ entry
*/
static void trace_intel_irq_entry(void *data, int vector)
{
osnoise_trace_irq_entry(vector);
}
/*
* trace_intel_irq_exit - record intel specific IRQ exit
*/
static void trace_intel_irq_exit(void *data, int vector)
{
char *vector_desc = (char *) data;
osnoise_trace_irq_exit(vector, vector_desc);
}
/*
* osnoise_arch_register - Register intel specific IRQ entry and exit tracepoints
*/
int osnoise_arch_register(void)
{
int ret;
ret = register_trace_local_timer_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_err;
ret = register_trace_local_timer_exit(trace_intel_irq_exit, "local_timer");
if (ret)
goto out_timer_entry;
#ifdef CONFIG_X86_THERMAL_VECTOR
ret = register_trace_thermal_apic_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_timer_exit;
ret = register_trace_thermal_apic_exit(trace_intel_irq_exit, "thermal_apic");
if (ret)
goto out_thermal_entry;
#endif /* CONFIG_X86_THERMAL_VECTOR */
#ifdef CONFIG_X86_MCE_AMD
ret = register_trace_deferred_error_apic_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_thermal_exit;
ret = register_trace_deferred_error_apic_exit(trace_intel_irq_exit, "deferred_error");
if (ret)
goto out_deferred_entry;
#endif
#ifdef CONFIG_X86_MCE_THRESHOLD
ret = register_trace_threshold_apic_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_deferred_exit;
ret = register_trace_threshold_apic_exit(trace_intel_irq_exit, "threshold_apic");
if (ret)
goto out_threshold_entry;
#endif /* CONFIG_X86_MCE_THRESHOLD */
#ifdef CONFIG_SMP
ret = register_trace_call_function_single_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_threshold_exit;
ret = register_trace_call_function_single_exit(trace_intel_irq_exit,
"call_function_single");
if (ret)
goto out_call_function_single_entry;
ret = register_trace_call_function_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_call_function_single_exit;
ret = register_trace_call_function_exit(trace_intel_irq_exit, "call_function");
if (ret)
goto out_call_function_entry;
ret = register_trace_reschedule_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_call_function_exit;
ret = register_trace_reschedule_exit(trace_intel_irq_exit, "reschedule");
if (ret)
goto out_reschedule_entry;
#endif /* CONFIG_SMP */
#ifdef CONFIG_IRQ_WORK
ret = register_trace_irq_work_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_reschedule_exit;
ret = register_trace_irq_work_exit(trace_intel_irq_exit, "irq_work");
if (ret)
goto out_irq_work_entry;
#endif
ret = register_trace_x86_platform_ipi_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_irq_work_exit;
ret = register_trace_x86_platform_ipi_exit(trace_intel_irq_exit, "x86_platform_ipi");
if (ret)
goto out_x86_ipi_entry;
ret = register_trace_error_apic_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_x86_ipi_exit;
ret = register_trace_error_apic_exit(trace_intel_irq_exit, "error_apic");
if (ret)
goto out_error_apic_entry;
ret = register_trace_spurious_apic_entry(trace_intel_irq_entry, NULL);
if (ret)
goto out_error_apic_exit;
ret = register_trace_spurious_apic_exit(trace_intel_irq_exit, "spurious_apic");
if (ret)
goto out_spurious_apic_entry;
return 0;
out_spurious_apic_entry:
unregister_trace_spurious_apic_entry(trace_intel_irq_entry, NULL);
out_error_apic_exit:
unregister_trace_error_apic_exit(trace_intel_irq_exit, "error_apic");
out_error_apic_entry:
unregister_trace_error_apic_entry(trace_intel_irq_entry, NULL);
out_x86_ipi_exit:
unregister_trace_x86_platform_ipi_exit(trace_intel_irq_exit, "x86_platform_ipi");
out_x86_ipi_entry:
unregister_trace_x86_platform_ipi_entry(trace_intel_irq_entry, NULL);
out_irq_work_exit:
#ifdef CONFIG_IRQ_WORK
unregister_trace_irq_work_exit(trace_intel_irq_exit, "irq_work");
out_irq_work_entry:
unregister_trace_irq_work_entry(trace_intel_irq_entry, NULL);
out_reschedule_exit:
#endif
#ifdef CONFIG_SMP
unregister_trace_reschedule_exit(trace_intel_irq_exit, "reschedule");
out_reschedule_entry:
unregister_trace_reschedule_entry(trace_intel_irq_entry, NULL);
out_call_function_exit:
unregister_trace_call_function_exit(trace_intel_irq_exit, "call_function");
out_call_function_entry:
unregister_trace_call_function_entry(trace_intel_irq_entry, NULL);
out_call_function_single_exit:
unregister_trace_call_function_single_exit(trace_intel_irq_exit, "call_function_single");
out_call_function_single_entry:
unregister_trace_call_function_single_entry(trace_intel_irq_entry, NULL);
out_threshold_exit:
#endif
#ifdef CONFIG_X86_MCE_THRESHOLD
unregister_trace_threshold_apic_exit(trace_intel_irq_exit, "threshold_apic");
out_threshold_entry:
unregister_trace_threshold_apic_entry(trace_intel_irq_entry, NULL);
out_deferred_exit:
#endif
#ifdef CONFIG_X86_MCE_AMD
unregister_trace_deferred_error_apic_exit(trace_intel_irq_exit, "deferred_error");
out_deferred_entry:
unregister_trace_deferred_error_apic_entry(trace_intel_irq_entry, NULL);
out_thermal_exit:
#endif /* CONFIG_X86_MCE_AMD */
#ifdef CONFIG_X86_THERMAL_VECTOR
unregister_trace_thermal_apic_exit(trace_intel_irq_exit, "thermal_apic");
out_thermal_entry:
unregister_trace_thermal_apic_entry(trace_intel_irq_entry, NULL);
out_timer_exit:
#endif /* CONFIG_X86_THERMAL_VECTOR */
unregister_trace_local_timer_exit(trace_intel_irq_exit, "local_timer");
out_timer_entry:
unregister_trace_local_timer_entry(trace_intel_irq_entry, NULL);
out_err:
return -EINVAL;
}
void osnoise_arch_unregister(void)
{
unregister_trace_spurious_apic_exit(trace_intel_irq_exit, "spurious_apic");
unregister_trace_spurious_apic_entry(trace_intel_irq_entry, NULL);
unregister_trace_error_apic_exit(trace_intel_irq_exit, "error_apic");
unregister_trace_error_apic_entry(trace_intel_irq_entry, NULL);
unregister_trace_x86_platform_ipi_exit(trace_intel_irq_exit, "x86_platform_ipi");
unregister_trace_x86_platform_ipi_entry(trace_intel_irq_entry, NULL);
#ifdef CONFIG_IRQ_WORK
unregister_trace_irq_work_exit(trace_intel_irq_exit, "irq_work");
unregister_trace_irq_work_entry(trace_intel_irq_entry, NULL);
#endif
#ifdef CONFIG_SMP
unregister_trace_reschedule_exit(trace_intel_irq_exit, "reschedule");
unregister_trace_reschedule_entry(trace_intel_irq_entry, NULL);
unregister_trace_call_function_exit(trace_intel_irq_exit, "call_function");
unregister_trace_call_function_entry(trace_intel_irq_entry, NULL);
unregister_trace_call_function_single_exit(trace_intel_irq_exit, "call_function_single");
unregister_trace_call_function_single_entry(trace_intel_irq_entry, NULL);
#endif
#ifdef CONFIG_X86_MCE_THRESHOLD
unregister_trace_threshold_apic_exit(trace_intel_irq_exit, "threshold_apic");
unregister_trace_threshold_apic_entry(trace_intel_irq_entry, NULL);
#endif
#ifdef CONFIG_X86_MCE_AMD
unregister_trace_deferred_error_apic_exit(trace_intel_irq_exit, "deferred_error");
unregister_trace_deferred_error_apic_entry(trace_intel_irq_entry, NULL);
#endif
#ifdef CONFIG_X86_THERMAL_VECTOR
unregister_trace_thermal_apic_exit(trace_intel_irq_exit, "thermal_apic");
unregister_trace_thermal_apic_entry(trace_intel_irq_entry, NULL);
#endif /* CONFIG_X86_THERMAL_VECTOR */
unregister_trace_local_timer_exit(trace_intel_irq_exit, "local_timer");
unregister_trace_local_timer_entry(trace_intel_irq_entry, NULL);
}
#endif /* CONFIG_OSNOISE_TRACER && CONFIG_X86_LOCAL_APIC */


@ -176,10 +176,10 @@ TRACE_EVENT(amdgpu_cs_ioctl,
TP_fast_assign(
__entry->sched_job_id = job->base.id;
-__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
+__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job));
__entry->context = job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-__assign_str(ring, to_amdgpu_ring(job->base.sched)->name)
+__assign_str(ring, to_amdgpu_ring(job->base.sched)->name);
__entry->num_ibs = job->num_ibs;
),
TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
@ -201,10 +201,10 @@ TRACE_EVENT(amdgpu_sched_run_job,
TP_fast_assign(
__entry->sched_job_id = job->base.id;
-__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job))
+__assign_str(timeline, AMDGPU_JOB_GET_TIMELINE_NAME(job));
__entry->context = job->base.s_fence->finished.context;
__entry->seqno = job->base.s_fence->finished.seqno;
-__assign_str(ring, to_amdgpu_ring(job->base.sched)->name)
+__assign_str(ring, to_amdgpu_ring(job->base.sched)->name);
__entry->num_ibs = job->num_ibs;
),
TP_printk("sched_job=%llu, timeline=%s, context=%u, seqno=%u, ring_name=%s, num_ibs=%u",
@ -229,7 +229,7 @@ TRACE_EVENT(amdgpu_vm_grab_id,
TP_fast_assign(
__entry->pasid = vm->pasid;
__assign_str(ring, ring->name)
__assign_str(ring, ring->name);
__entry->vmid = job->vmid;
__entry->vm_hub = ring->funcs->vmhub,
__entry->pd_addr = job->vm_pd_addr;
@ -424,7 +424,7 @@ TRACE_EVENT(amdgpu_vm_flush,
),
TP_fast_assign(
__assign_str(ring, ring->name)
__assign_str(ring, ring->name);
__entry->vmid = vmid;
__entry->vm_hub = ring->funcs->vmhub;
__entry->pd_addr = pd_addr;
@ -525,7 +525,7 @@ TRACE_EVENT(amdgpu_ib_pipe_sync,
),
TP_fast_assign(
__assign_str(ring, sched_job->base.sched->name)
__assign_str(ring, sched_job->base.sched->name);
__entry->id = sched_job->base.id;
__entry->fence = fence;
__entry->ctx = fence->context;


@ -24,7 +24,7 @@ DECLARE_EVENT_CLASS(lima_task,
__entry->task_id = task->base.id;
__entry->context = task->base.s_fence->finished.context;
__entry->seqno = task->base.s_fence->finished.seqno;
__assign_str(pipe, task->base.sched->name)
__assign_str(pipe, task->base.sched->name);
),
TP_printk("task=%llu, context=%u seqno=%u pipe=%s",


@ -63,7 +63,7 @@ TRACE_EVENT(hfi1_interrupt,
__array(char, buf, 64)
__field(int, src)
),
TP_fast_assign(DD_DEV_ASSIGN(dd)
TP_fast_assign(DD_DEV_ASSIGN(dd);
is_entry->is_name(__entry->buf, 64,
src - is_entry->start);
__entry->src = src;
@ -100,7 +100,7 @@ TRACE_EVENT(hfi1_fault_opcode,
__field(u32, qpn)
__field(u8, opcode)
),
TP_fast_assign(DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
TP_fast_assign(DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->opcode = opcode;
),


@ -70,7 +70,7 @@ DECLARE_EVENT_CLASS(hfi1_rc_template,
__field(u32, r_psn)
),
TP_fast_assign(
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->s_flags = qp->s_flags;
__entry->psn = psn;
@ -130,7 +130,7 @@ DECLARE_EVENT_CLASS(/* rc_ack */
__field(u32, lpsn)
),
TP_fast_assign(/* assign */
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->aeth = aeth;
__entry->psn = psn;


@ -886,7 +886,7 @@ DECLARE_EVENT_CLASS(/* sender_info */
__field(u8, s_retry)
),
TP_fast_assign(/* assign */
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->state = qp->state;
__entry->s_cur = qp->s_cur;
@ -1285,7 +1285,7 @@ DECLARE_EVENT_CLASS(/* rc_rcv_err */
__field(int, diff)
),
TP_fast_assign(/* assign */
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->s_flags = qp->s_flags;
__entry->state = qp->state;
@ -1574,7 +1574,7 @@ DECLARE_EVENT_CLASS(/* tid_ack */
__field(u32, resync_psn)
),
TP_fast_assign(/* assign */
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->aeth = aeth;
__entry->psn = psn;


@ -120,7 +120,7 @@ DECLARE_EVENT_CLASS(hfi1_qpsleepwakeup_template,
__field(unsigned long, iow_flags)
),
TP_fast_assign(
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->flags = flags;
__entry->qpn = qp->ibqp.qp_num;
__entry->s_flags = qp->s_flags;
@ -868,7 +868,7 @@ TRACE_EVENT(
__field(int, send_flags)
),
TP_fast_assign(
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->wqe = wqe;
__entry->wr_id = wqe->wr.wr_id;
__entry->qpn = qp->ibqp.qp_num;
@ -904,7 +904,7 @@ DECLARE_EVENT_CLASS(
__field(bool, flag)
),
TP_fast_assign(
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device))
DD_DEV_ASSIGN(dd_from_ibdev(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->flag = flag;
),
@ -952,7 +952,7 @@ DECLARE_EVENT_CLASS(/* AIP */
__field(u8, stopped)
),
TP_fast_assign(/* assign */
DD_DEV_ASSIGN(txq->priv->dd)
DD_DEV_ASSIGN(txq->priv->dd);
__entry->txq = txq;
__entry->sde = txq->sde;
__entry->head = txq->tx_ring.head;


@ -85,7 +85,7 @@ DECLARE_EVENT_CLASS(rvt_cq_template,
__field(int, comp_vector_cpu)
__field(u32, flags)
),
TP_fast_assign(RDI_DEV_ASSIGN(cq->rdi)
TP_fast_assign(RDI_DEV_ASSIGN(cq->rdi);
__entry->ip = cq->ip;
__entry->cqe = attr->cqe;
__entry->comp_vector = attr->comp_vector;
@ -123,7 +123,7 @@ DECLARE_EVENT_CLASS(
__field(u32, imm)
),
TP_fast_assign(
RDI_DEV_ASSIGN(cq->rdi)
RDI_DEV_ASSIGN(cq->rdi);
__entry->wr_id = wc->wr_id;
__entry->status = wc->status;
__entry->opcode = wc->opcode;


@ -195,7 +195,7 @@ TRACE_EVENT(
__field(uint, sg_offset)
),
TP_fast_assign(
RDI_DEV_ASSIGN(ib_to_rvt(to_imr(ibmr)->mr.pd->device))
RDI_DEV_ASSIGN(ib_to_rvt(to_imr(ibmr)->mr.pd->device));
__entry->ibmr_iova = ibmr->iova;
__entry->iova = to_imr(ibmr)->mr.iova;
__entry->user_base = to_imr(ibmr)->mr.user_base;


@ -65,7 +65,7 @@ DECLARE_EVENT_CLASS(rvt_qphash_template,
__field(u32, bucket)
),
TP_fast_assign(
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->bucket = bucket;
),
@ -97,7 +97,7 @@ DECLARE_EVENT_CLASS(
__field(u32, to)
),
TP_fast_assign(
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->hrtimer = &qp->s_rnr_timer;
__entry->s_flags = qp->s_flags;


@ -71,7 +71,7 @@ DECLARE_EVENT_CLASS(rvt_rc_template,
__field(u32, r_psn)
),
TP_fast_assign(
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
__entry->qpn = qp->ibqp.qp_num;
__entry->s_flags = qp->s_flags;
__entry->psn = psn;


@ -111,7 +111,7 @@ TRACE_EVENT(
__field(int, wr_num_sge)
),
TP_fast_assign(
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
__entry->wqe = wqe;
__entry->wr_id = wqe->wr.wr_id;
__entry->qpn = qp->ibqp.qp_num;
@ -170,7 +170,7 @@ TRACE_EVENT(
__field(int, send_flags)
),
TP_fast_assign(
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device))
RDI_DEV_ASSIGN(ib_to_rvt(qp->ibqp.device));
__entry->wqe = wqe;
__entry->wr_id = wqe->wr.wr_id;
__entry->qpn = qp->ibqp.qp_num;


@ -26,7 +26,7 @@ TRACE_EVENT(mei_reg_read,
__field(u32, val)
),
TP_fast_assign(
__assign_str(dev, dev_name(dev))
__assign_str(dev, dev_name(dev));
__entry->reg = reg;
__entry->offs = offs;
__entry->val = val;
@ -45,7 +45,7 @@ TRACE_EVENT(mei_reg_write,
__field(u32, val)
),
TP_fast_assign(
__assign_str(dev, dev_name(dev))
__assign_str(dev, dev_name(dev));
__entry->reg = reg;
__entry->offs = offs;
__entry->val = val;
@ -64,7 +64,7 @@ TRACE_EVENT(mei_pci_cfg_read,
__field(u32, val)
),
TP_fast_assign(
__assign_str(dev, dev_name(dev))
__assign_str(dev, dev_name(dev));
__entry->reg = reg;
__entry->offs = offs;
__entry->val = val;


@ -21,7 +21,7 @@ TRACE_EVENT(otx2_msg_alloc,
__field(u16, id)
__field(u64, size)
),
TP_fast_assign(__assign_str(dev, pci_name(pdev))
TP_fast_assign(__assign_str(dev, pci_name(pdev));
__entry->id = id;
__entry->size = size;
),
@ -36,7 +36,7 @@ TRACE_EVENT(otx2_msg_send,
__field(u16, num_msgs)
__field(u64, msg_size)
),
TP_fast_assign(__assign_str(dev, pci_name(pdev))
TP_fast_assign(__assign_str(dev, pci_name(pdev));
__entry->num_msgs = num_msgs;
__entry->msg_size = msg_size;
),
@ -52,7 +52,7 @@ TRACE_EVENT(otx2_msg_check,
__field(u16, rspid)
__field(int, rc)
),
TP_fast_assign(__assign_str(dev, pci_name(pdev))
TP_fast_assign(__assign_str(dev, pci_name(pdev));
__entry->reqid = reqid;
__entry->rspid = rspid;
__entry->rc = rc;
@ -69,8 +69,8 @@ TRACE_EVENT(otx2_msg_interrupt,
__string(str, msg)
__field(u64, intr)
),
TP_fast_assign(__assign_str(dev, pci_name(pdev))
__assign_str(str, msg)
TP_fast_assign(__assign_str(dev, pci_name(pdev));
__assign_str(str, msg);
__entry->intr = intr;
),
TP_printk("[%s] mbox interrupt %s (0x%llx)\n", __get_str(dev),
@ -84,7 +84,7 @@ TRACE_EVENT(otx2_msg_process,
__field(u16, id)
__field(int, err)
),
TP_fast_assign(__assign_str(dev, pci_name(pdev))
TP_fast_assign(__assign_str(dev, pci_name(pdev));
__entry->id = id;
__entry->err = err;
),


@ -232,7 +232,7 @@ TRACE_EVENT(fjes_hw_start_debug_err,
__string(err, err)
),
TP_fast_assign(
__assign_str(err, err)
__assign_str(err, err);
),
TP_printk("%s", __get_str(err))
);
@ -258,7 +258,7 @@ TRACE_EVENT(fjes_hw_stop_debug_err,
__string(err, err)
),
TP_fast_assign(
__assign_str(err, err)
__assign_str(err, err);
),
TP_printk("%s", __get_str(err))
);


@ -138,7 +138,7 @@ DECLARE_EVENT_CLASS(cdnsp_log_simple,
__string(text, msg)
),
TP_fast_assign(
__assign_str(text, msg)
__assign_str(text, msg);
),
TP_printk("%s", __get_str(text))
);


@ -625,7 +625,7 @@ TRACE_EVENT(nfs4_state_mgr,
TP_fast_assign(
__entry->state = clp->cl_state;
__assign_str(hostname, clp->cl_hostname)
__assign_str(hostname, clp->cl_hostname);
),
TP_printk(
@ -1637,7 +1637,7 @@ DECLARE_EVENT_CLASS(nfs4_inode_callback_event,
__entry->fileid = 0;
__entry->dev = 0;
}
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown")
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown");
),
TP_printk(
@ -1694,7 +1694,7 @@ DECLARE_EVENT_CLASS(nfs4_inode_stateid_callback_event,
__entry->fileid = 0;
__entry->dev = 0;
}
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown")
__assign_str(dstaddr, clp ? clp->cl_hostname : "unknown");
__entry->stateid_seq =
be32_to_cpu(stateid->seqid);
__entry->stateid_hash =


@ -1427,8 +1427,8 @@ DECLARE_EVENT_CLASS(nfs_xdr_event,
__entry->version = task->tk_client->cl_vers;
__entry->error = error;
__assign_str(program,
task->tk_client->cl_program->name)
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
task->tk_client->cl_program->name);
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
),
TP_printk(


@ -49,7 +49,7 @@ static int __init copy_xbc_key_value_list(char *dst, size_t size)
else
q = '"';
ret = snprintf(dst, rest(dst, end), "%c%s%c%s",
q, val, q, vnode->next ? ", " : "\n");
q, val, q, xbc_node_is_array(vnode) ? ", " : "\n");
if (ret < 0)
goto out;
dst += ret;


@ -16,6 +16,26 @@
#define BOOTCONFIG_ALIGN (1 << BOOTCONFIG_ALIGN_SHIFT)
#define BOOTCONFIG_ALIGN_MASK (BOOTCONFIG_ALIGN - 1)
/**
* xbc_calc_checksum() - Calculate checksum of bootconfig
* @data: Bootconfig data.
* @size: The size of the bootconfig data.
*
* Calculate the checksum value of the bootconfig data.
* The checksum will be used with the BOOTCONFIG_MAGIC and the size for
* embedding the bootconfig in the initrd image.
*/
static inline __init u32 xbc_calc_checksum(void *data, u32 size)
{
unsigned char *p = data;
u32 ret = 0;
while (size--)
ret += *p++;
return ret;
}
/* XBC tree node */
struct xbc_node {
u16 next;
@ -71,7 +91,7 @@ static inline __init bool xbc_node_is_key(struct xbc_node *node)
*/
static inline __init bool xbc_node_is_array(struct xbc_node *node)
{
return xbc_node_is_value(node) && node->next != 0;
return xbc_node_is_value(node) && node->child != 0;
}
/**
@ -80,6 +100,8 @@ static inline __init bool xbc_node_is_array(struct xbc_node *node)
*
* Test the @node is a leaf key node which is a key node and has a value node
* or no child. Returns true if it is a leaf node, or false if not.
* Note that the leaf node can have subkey nodes in addition to the
* value node.
*/
static inline __init bool xbc_node_is_leaf(struct xbc_node *node)
{
@ -129,6 +151,23 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
return xbc_node_find_child(NULL, key);
}
/**
* xbc_node_get_subkey() - Return the first subkey node if exists
* @node: Parent node
*
* Return the first subkey node of the @node. If the @node has no child
* or only value node, this will return NULL.
*/
static inline struct xbc_node * __init xbc_node_get_subkey(struct xbc_node *node)
{
struct xbc_node *child = xbc_node_get_child(node);
if (child && xbc_node_is_value(child))
return xbc_node_get_next(child);
else
return child;
}
/**
* xbc_array_for_each_value() - Iterate value nodes on an array
* @anode: An XBC arrayed value node
@ -140,7 +179,7 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
*/
#define xbc_array_for_each_value(anode, value) \
for (value = xbc_node_get_data(anode); anode != NULL ; \
anode = xbc_node_get_next(anode), \
anode = xbc_node_get_child(anode), \
value = anode ? xbc_node_get_data(anode) : NULL)
/**
@ -149,11 +188,24 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
* @child: Iterated XBC node.
*
* Iterate child nodes of @parent. Each child node is stored to @child.
* The @child can be a mixture of a value node and subkey nodes.
*/
#define xbc_node_for_each_child(parent, child) \
for (child = xbc_node_get_child(parent); child != NULL ; \
child = xbc_node_get_next(child))
/**
* xbc_node_for_each_subkey() - Iterate child subkey nodes
* @parent: An XBC node.
* @child: Iterated XBC node.
*
* Iterate subkey nodes of @parent. Each child node is stored to @child.
* The @child is only the subkey node.
*/
#define xbc_node_for_each_subkey(parent, child) \
for (child = xbc_node_get_subkey(parent); child != NULL ; \
child = xbc_node_get_next(child))
/**
* xbc_node_for_each_array_value() - Iterate array entries of given key
* @node: An XBC node.
@ -171,7 +223,7 @@ static inline struct xbc_node * __init xbc_find_node(const char *key)
*/
#define xbc_node_for_each_array_value(node, key, anode, value) \
for (value = xbc_node_find_value(node, key, &anode); value != NULL; \
anode = xbc_node_get_next(anode), \
anode = xbc_node_get_child(anode), \
value = anode ? xbc_node_get_data(anode) : NULL)
/**
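
The hunks above change array values to be chained through the child link, which frees the next link of a value node to point at subkey siblings. A minimal userspace sketch of that layout follows. The toy_* names, the table, and the use of index 0 as "no node" are assumptions made for illustration; they are not the kernel's struct xbc_node API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum toy_kind { TOY_KEY, TOY_VALUE };

/* Hypothetical stand-in for an XBC node; links are table indices, 0 = none. */
struct toy_node {
	enum toy_kind kind;
	const char *data;
	uint16_t next;	/* next sibling */
	uint16_t child;	/* first child, or next array element for a value */
};

/*
 * Sample tree for: foo = a, b  plus a subkey foo.bar
 * Slot 0 is unused so that index 0 can mean "no node".
 */
static const struct toy_node toy_sample[] = {
	[0] = { TOY_KEY,   "",    0, 0 },
	[1] = { TOY_KEY,   "foo", 0, 2 },
	[2] = { TOY_VALUE, "a",   4, 3 },	/* child -> "b", next -> "bar" */
	[3] = { TOY_VALUE, "b",   0, 0 },
	[4] = { TOY_KEY,   "bar", 0, 0 },
};

static const struct toy_node *toy_get(const struct toy_node *tbl, uint16_t idx)
{
	return idx ? &tbl[idx] : NULL;
}

/* Skip a leading value node, mirroring the logic of xbc_node_get_subkey(). */
static const struct toy_node *toy_first_subkey(const struct toy_node *tbl,
					       const struct toy_node *parent)
{
	const struct toy_node *child;

	assert(parent != NULL);
	child = toy_get(tbl, parent->child);
	if (child && child->kind == TOY_VALUE)
		return toy_get(tbl, child->next);
	return child;
}

/* Walk an array by following the child link of each value node. */
static size_t toy_count_array(const struct toy_node *tbl,
			      const struct toy_node *value)
{
	size_t n = 0;

	while (value) {
		n++;
		value = toy_get(tbl, value->child);
	}
	return n;
}
```

In this sketch, counting the array that starts at slot 2 visits "a" and "b", while the first subkey of "foo" is "bar" because the leading value node is skipped.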


@ -7,12 +7,21 @@ extern bool trace_hwlat_callback_enabled;
extern void trace_hwlat_callback(bool enter);
#endif
#ifdef CONFIG_OSNOISE_TRACER
extern bool trace_osnoise_callback_enabled;
extern void trace_osnoise_callback(bool enter);
#endif
static inline void ftrace_nmi_enter(void)
{
#ifdef CONFIG_HWLAT_TRACER
if (trace_hwlat_callback_enabled)
trace_hwlat_callback(true);
#endif
#ifdef CONFIG_OSNOISE_TRACER
if (trace_osnoise_callback_enabled)
trace_osnoise_callback(true);
#endif
}
static inline void ftrace_nmi_exit(void)
@ -21,6 +30,10 @@ static inline void ftrace_nmi_exit(void)
if (trace_hwlat_callback_enabled)
trace_hwlat_callback(false);
#endif
#ifdef CONFIG_OSNOISE_TRACER
if (trace_osnoise_callback_enabled)
trace_osnoise_callback(false);
#endif
}
#endif /* _LINUX_FTRACE_IRQ_H */


@ -41,6 +41,13 @@ int trace_array_init_printk(struct trace_array *tr);
void trace_array_put(struct trace_array *tr);
struct trace_array *trace_array_get_by_name(const char *name);
int trace_array_destroy(struct trace_array *tr);
/* For osnoise tracer */
int osnoise_arch_register(void);
void osnoise_arch_unregister(void);
void osnoise_trace_irq_entry(int id);
void osnoise_trace_irq_exit(int id, const char *desc);
#endif /* CONFIG_TRACING */
#endif /* _LINUX_TRACE_H */


@ -41,7 +41,17 @@ extern int
tracepoint_probe_register_prio(struct tracepoint *tp, void *probe, void *data,
int prio);
extern int
tracepoint_probe_register_prio_may_exist(struct tracepoint *tp, void *probe, void *data,
int prio);
extern int
tracepoint_probe_unregister(struct tracepoint *tp, void *probe, void *data);
static inline int
tracepoint_probe_register_may_exist(struct tracepoint *tp, void *probe,
void *data)
{
return tracepoint_probe_register_prio_may_exist(tp, probe, data,
TRACEPOINT_DEFAULT_PRIO);
}
extern void
for_each_kernel_tracepoint(void (*fct)(struct tracepoint *tp, void *priv),
void *priv);
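
The difference between the plain and the may_exist registration variants can be illustrated with a toy probe registry. This is only a sketch under assumptions: the toy_* names, the fixed-size table, and the warning counter are hypothetical, and the -EEXIST return value is how this sketch reports a duplicate, not a statement about the kernel's exact return path.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_MAX_PROBES 8

typedef void (*toy_probe_fn)(void);

/* Hypothetical registry standing in for a tracepoint's probe list. */
struct toy_tracepoint {
	toy_probe_fn probes[TOY_MAX_PROBES];
	size_t nr;
	unsigned int warnings;	/* counts what would be a WARN*() */
};

static void toy_probe_a(void) { }

/*
 * Register @probe. A duplicate is always rejected; it is counted as a
 * warning only when the caller did not say duplicates may exist.
 */
static int toy_register(struct toy_tracepoint *tp, toy_probe_fn probe,
			int may_exist)
{
	size_t i;

	assert(tp != NULL && probe != NULL);
	for (i = 0; i < tp->nr; i++) {
		if (tp->probes[i] == probe) {
			if (!may_exist)
				tp->warnings++;
			return -EEXIST;
		}
	}
	if (tp->nr == TOY_MAX_PROBES)
		return -ENOMEM;
	tp->probes[tp->nr++] = probe;
	return 0;
}
```

A caller that can legitimately hit the duplicate case from user space (BPF, per the pull message) would pass may_exist = 1 and just handle the error; every other caller keeps the warning.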


@ -1092,7 +1092,7 @@ TRACE_EVENT(btrfs_trigger_flush,
__entry->flags = flags;
__entry->bytes = bytes;
__entry->flush = flush;
__assign_str(reason, reason)
__assign_str(reason, reason);
),
TP_printk_btrfs("%s: flush=%d(%s) flags=%llu(%s) bytes=%llu",


@ -23,8 +23,8 @@ DECLARE_EVENT_CLASS(dma_fence,
),
TP_fast_assign(
__assign_str(driver, fence->ops->get_driver_name(fence))
__assign_str(timeline, fence->ops->get_timeline_name(fence))
__assign_str(driver, fence->ops->get_driver_name(fence));
__assign_str(timeline, fence->ops->get_timeline_name(fence));
__entry->context = fence->context;
__entry->seqno = fence->seqno;
),


@ -0,0 +1,142 @@
/* SPDX-License-Identifier: GPL-2.0 */
#undef TRACE_SYSTEM
#define TRACE_SYSTEM osnoise
#if !defined(_OSNOISE_TRACE_H) || defined(TRACE_HEADER_MULTI_READ)
#define _OSNOISE_TRACE_H
#include <linux/tracepoint.h>
TRACE_EVENT(thread_noise,
TP_PROTO(struct task_struct *t, u64 start, u64 duration),
TP_ARGS(t, start, duration),
TP_STRUCT__entry(
__array( char, comm, TASK_COMM_LEN)
__field( u64, start )
__field( u64, duration)
__field( pid_t, pid )
),
TP_fast_assign(
memcpy(__entry->comm, t->comm, TASK_COMM_LEN);
__entry->pid = t->pid;
__entry->start = start;
__entry->duration = duration;
),
TP_printk("%8s:%d start %llu.%09u duration %llu ns",
__entry->comm,
__entry->pid,
__print_ns_to_secs(__entry->start),
__print_ns_without_secs(__entry->start),
__entry->duration)
);
TRACE_EVENT(softirq_noise,
TP_PROTO(int vector, u64 start, u64 duration),
TP_ARGS(vector, start, duration),
TP_STRUCT__entry(
__field( u64, start )
__field( u64, duration)
__field( int, vector )
),
TP_fast_assign(
__entry->vector = vector;
__entry->start = start;
__entry->duration = duration;
),
TP_printk("%8s:%d start %llu.%09u duration %llu ns",
show_softirq_name(__entry->vector),
__entry->vector,
__print_ns_to_secs(__entry->start),
__print_ns_without_secs(__entry->start),
__entry->duration)
);
TRACE_EVENT(irq_noise,
TP_PROTO(int vector, const char *desc, u64 start, u64 duration),
TP_ARGS(vector, desc, start, duration),
TP_STRUCT__entry(
__field( u64, start )
__field( u64, duration)
__string( desc, desc )
__field( int, vector )
),
TP_fast_assign(
__assign_str(desc, desc);
__entry->vector = vector;
__entry->start = start;
__entry->duration = duration;
),
TP_printk("%s:%d start %llu.%09u duration %llu ns",
__get_str(desc),
__entry->vector,
__print_ns_to_secs(__entry->start),
__print_ns_without_secs(__entry->start),
__entry->duration)
);
TRACE_EVENT(nmi_noise,
TP_PROTO(u64 start, u64 duration),
TP_ARGS(start, duration),
TP_STRUCT__entry(
__field( u64, start )
__field( u64, duration)
),
TP_fast_assign(
__entry->start = start;
__entry->duration = duration;
),
TP_printk("start %llu.%09u duration %llu ns",
__print_ns_to_secs(__entry->start),
__print_ns_without_secs(__entry->start),
__entry->duration)
);
TRACE_EVENT(sample_threshold,
TP_PROTO(u64 start, u64 duration, u64 interference),
TP_ARGS(start, duration, interference),
TP_STRUCT__entry(
__field( u64, start )
__field( u64, duration)
__field( u64, interference)
),
TP_fast_assign(
__entry->start = start;
__entry->duration = duration;
__entry->interference = interference;
),
TP_printk("start %llu.%09u duration %llu ns interference %llu",
__print_ns_to_secs(__entry->start),
__print_ns_without_secs(__entry->start),
__entry->duration,
__entry->interference)
);
#endif /* _OSNOISE_TRACE_H */
/* This part must be outside protection */
#include <trace/define_trace.h>


@ -152,7 +152,7 @@ DECLARE_EVENT_CLASS(rpcgss_ctx_class,
TP_fast_assign(
__entry->cred = gc;
__entry->service = gc->gc_service;
__assign_str(principal, gc->gc_principal)
__assign_str(principal, gc->gc_principal);
),
TP_printk("cred=%p service=%s principal='%s'",
@ -535,7 +535,7 @@ TRACE_EVENT(rpcgss_upcall_msg,
),
TP_fast_assign(
__assign_str(msg, buf)
__assign_str(msg, buf);
),
TP_printk("msg='%s'", __get_str(msg))


@ -148,7 +148,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
__array( char, comm, TASK_COMM_LEN )
__field( pid_t, pid )
__field( int, prio )
__field( int, success )
__field( int, target_cpu )
),
@ -156,7 +155,6 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
memcpy(__entry->comm, p->comm, TASK_COMM_LEN);
__entry->pid = p->pid;
__entry->prio = p->prio; /* XXX SCHED_DEADLINE */
__entry->success = 1; /* rudiment, kill when possible */
__entry->target_cpu = task_cpu(p);
),


@ -154,8 +154,8 @@ TRACE_EVENT(rpc_clnt_new,
__entry->client_id = clnt->cl_clid;
__assign_str(addr, xprt->address_strings[RPC_DISPLAY_ADDR]);
__assign_str(port, xprt->address_strings[RPC_DISPLAY_PORT]);
__assign_str(program, program)
__assign_str(server, server)
__assign_str(program, program);
__assign_str(server, server);
),
TP_printk("client=%u peer=[%s]:%s program=%s server=%s",
@ -180,8 +180,8 @@ TRACE_EVENT(rpc_clnt_new_err,
TP_fast_assign(
__entry->error = error;
__assign_str(program, program)
__assign_str(server, server)
__assign_str(program, program);
__assign_str(server, server);
),
TP_printk("program=%s server=%s error=%d",
@ -284,8 +284,8 @@ TRACE_EVENT(rpc_request,
__entry->client_id = task->tk_client->cl_clid;
__entry->version = task->tk_client->cl_vers;
__entry->async = RPC_IS_ASYNC(task);
__assign_str(progname, task->tk_client->cl_program->name)
__assign_str(procname, rpc_proc_name(task))
__assign_str(progname, task->tk_client->cl_program->name);
__assign_str(procname, rpc_proc_name(task));
),
TP_printk("task:%u@%u %sv%d %s (%ssync)",
@ -494,10 +494,10 @@ DECLARE_EVENT_CLASS(rpc_reply_event,
__entry->task_id = task->tk_pid;
__entry->client_id = task->tk_client->cl_clid;
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
__assign_str(progname, task->tk_client->cl_program->name)
__assign_str(progname, task->tk_client->cl_program->name);
__entry->version = task->tk_client->cl_vers;
__assign_str(procname, rpc_proc_name(task))
__assign_str(servername, task->tk_xprt->servername)
__assign_str(procname, rpc_proc_name(task));
__assign_str(servername, task->tk_xprt->servername);
),
TP_printk("task:%u@%d server=%s xid=0x%08x %sv%d %s",
@ -622,8 +622,8 @@ TRACE_EVENT(rpc_stats_latency,
__entry->task_id = task->tk_pid;
__entry->xid = be32_to_cpu(task->tk_rqstp->rq_xid);
__entry->version = task->tk_client->cl_vers;
__assign_str(progname, task->tk_client->cl_program->name)
__assign_str(procname, rpc_proc_name(task))
__assign_str(progname, task->tk_client->cl_program->name);
__assign_str(procname, rpc_proc_name(task));
__entry->backlog = ktime_to_us(backlog);
__entry->rtt = ktime_to_us(rtt);
__entry->execute = ktime_to_us(execute);
@ -669,15 +669,15 @@ TRACE_EVENT(rpc_xdr_overflow,
__entry->task_id = task->tk_pid;
__entry->client_id = task->tk_client->cl_clid;
__assign_str(progname,
task->tk_client->cl_program->name)
task->tk_client->cl_program->name);
__entry->version = task->tk_client->cl_vers;
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
} else {
__entry->task_id = 0;
__entry->client_id = 0;
__assign_str(progname, "unknown")
__assign_str(progname, "unknown");
__entry->version = 0;
__assign_str(procedure, "unknown")
__assign_str(procedure, "unknown");
}
__entry->requested = requested;
__entry->end = xdr->end;
@ -735,9 +735,9 @@ TRACE_EVENT(rpc_xdr_alignment,
__entry->task_id = task->tk_pid;
__entry->client_id = task->tk_client->cl_clid;
__assign_str(progname,
task->tk_client->cl_program->name)
task->tk_client->cl_program->name);
__entry->version = task->tk_client->cl_vers;
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
__entry->offset = offset;
__entry->copied = copied;
@ -1107,9 +1107,9 @@ TRACE_EVENT(xprt_retransmit,
__entry->xid = be32_to_cpu(rqst->rq_xid);
__entry->ntrans = rqst->rq_ntrans;
__assign_str(progname,
task->tk_client->cl_program->name)
task->tk_client->cl_program->name);
__entry->version = task->tk_client->cl_vers;
__assign_str(procedure, task->tk_msg.rpc_proc->p_name)
__assign_str(procedure, task->tk_msg.rpc_proc->p_name);
),
TP_printk(
@ -1842,7 +1842,7 @@ TRACE_EVENT(svc_xprt_accept,
TP_fast_assign(
__assign_str(addr, xprt->xpt_remotebuf);
__assign_str(protocol, xprt->xpt_class->xcl_name)
__assign_str(protocol, xprt->xpt_class->xcl_name);
__assign_str(service, service);
),


@ -36,7 +36,8 @@
EM( WB_REASON_PERIODIC, "periodic") \
EM( WB_REASON_LAPTOP_TIMER, "laptop_timer") \
EM( WB_REASON_FS_FREE_SPACE, "fs_free_space") \
EMe(WB_REASON_FORKER_THREAD, "forker_thread")
EM( WB_REASON_FORKER_THREAD, "forker_thread") \
EMe(WB_REASON_FOREIGN_FLUSH, "foreign_flush")
WB_WORK_REASON


@ -358,6 +358,21 @@ TRACE_MAKE_SYSTEM_STR();
trace_print_hex_dump_seq(p, prefix_str, prefix_type, \
rowsize, groupsize, buf, len, ascii)
#undef __print_ns_to_secs
#define __print_ns_to_secs(value) \
({ \
u64 ____val = (u64)(value); \
do_div(____val, NSEC_PER_SEC); \
____val; \
})
#undef __print_ns_without_secs
#define __print_ns_without_secs(value) \
({ \
u64 ____val = (u64)(value); \
(u32) do_div(____val, NSEC_PER_SEC); \
})
#undef DECLARE_EVENT_CLASS
#define DECLARE_EVENT_CLASS(call, proto, args, tstruct, assign, print) \
static notrace enum print_line_t \
@ -736,6 +751,16 @@ static inline void ftrace_test_probe_##call(void) \
#undef __print_array
#undef __print_hex_dump
/*
* The below is not executed in the kernel. It is only what is
* displayed in the print format for userspace to parse.
*/
#undef __print_ns_to_secs
#define __print_ns_to_secs(val) (val) / 1000000000UL
#undef __print_ns_without_secs
#define __print_ns_without_secs(val) (val) % 1000000000UL
#undef TP_printk
#define TP_printk(fmt, args...) "\"" fmt "\", " __stringify(args)
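
The two __print_ns_* helpers above split a nanosecond timestamp into whole seconds and the leftover nanoseconds, which the osnoise events print as "%llu.%09u". A userspace sketch of the same arithmetic, using plain division instead of do_div(); the function names and TOY_NSEC_PER_SEC are assumptions for illustration:

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

#define TOY_NSEC_PER_SEC 1000000000ULL

/* Whole seconds contained in a nanosecond value. */
static uint64_t ns_to_secs(uint64_t ns)
{
	return ns / TOY_NSEC_PER_SEC;
}

/* Nanoseconds left over once the whole seconds are removed. */
static uint32_t ns_without_secs(uint64_t ns)
{
	return (uint32_t)(ns % TOY_NSEC_PER_SEC);
}

/* Format as "<secs>.<9-digit nsecs>"; returns what snprintf() returns. */
static int format_start(char *buf, size_t len, uint64_t ns)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%" PRIu64 ".%09" PRIu32,
			ns_to_secs(ns), ns_without_secs(ns));
}
```

The zero-padded width of 9 matters: without it, a remainder of 42 ns would print as ".42" and read as 420 ms.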


@ -386,16 +386,6 @@ static char * __init xbc_make_cmdline(const char *key)
return new_cmdline;
}
static u32 boot_config_checksum(unsigned char *p, u32 size)
{
u32 ret = 0;
while (size--)
ret += *p++;
return ret;
}
static int __init bootconfig_params(char *param, char *val,
const char *unused, void *arg)
{
@ -439,7 +429,7 @@ static void __init setup_boot_config(void)
return;
}
if (boot_config_checksum((unsigned char *)data, size) != csum) {
if (xbc_calc_checksum(data, size) != csum) {
pr_err("bootconfig checksum failed\n");
return;
}
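
The checksum that this hunk now takes from the shared header is a plain sum of the data bytes. A standalone sketch of that calculation and of the comparison against a stored value; byte_sum_checksum() and checksum_matches() are hypothetical userspace names, not kernel functions:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Add every byte of @data into a 32-bit sum (wraps modulo 2^32). */
static uint32_t byte_sum_checksum(const void *data, size_t size)
{
	const unsigned char *p = data;
	uint32_t sum = 0;

	assert(data != NULL || size == 0);
	while (size--)
		sum += *p++;
	return sum;
}

/* Returns 1 when the computed checksum equals the stored one. */
static int checksum_matches(const void *data, size_t size, uint32_t stored)
{
	return byte_sum_checksum(data, size) == stored;
}
```

An additive checksum like this catches truncation and most corruption, but it cannot detect reordered bytes, which is acceptable for a sanity check on data appended to an initrd.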


@ -356,6 +356,68 @@ config HWLAT_TRACER
file. Every time a latency is greater than tracing_thresh, it will
be recorded into the ring buffer.
config OSNOISE_TRACER
bool "OS Noise tracer"
select GENERIC_TRACER
help
In the context of high-performance computing (HPC), the Operating
System Noise (osnoise) refers to the interference experienced by an
application due to activities inside the operating system. In the
context of Linux, NMIs, IRQs, SoftIRQs, and any other system thread
can cause noise to the system. Moreover, hardware-related jobs can
also cause noise, for example, via SMIs.
The osnoise tracer leverages the hwlat_detector by running a similar
loop with preemption, SoftIRQs and IRQs enabled, thus allowing all
the sources of osnoise during its execution. The osnoise tracer takes
note of the entry and exit point of any source of interferences,
increasing a per-cpu interference counter. It saves an interference
counter for each source of interference. The interference counter for
NMI, IRQs, SoftIRQs, and threads is increased anytime the tool
observes these interferences' entry events. When a noise happens
without any interference from the operating system level, the
hardware noise counter increases, pointing to a hardware-related
noise. In this way, osnoise can account for any source of
interference. At the end of the period, the osnoise tracer prints
the sum of all noise, the max single noise, the percentage of CPU
available for the thread, and the counters for the noise sources.
In addition to the tracer, a set of tracepoints were added to
facilitate the identification of the osnoise source.
The output will appear in the trace and trace_pipe files.
To enable this tracer, echo in "osnoise" into the current_tracer
file.
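
The end-of-period summary described in the help text above (sum of noise, max single noise, percentage of CPU available) can be sketched in userspace as a pass over timestamps sampled by a busy loop. This is a simplified illustration only: the struct, the function name, and the idea of treating any gap above a caller-supplied threshold as noise are assumptions, and the per-source counters of the real tracer are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical summary of one sampling period. */
struct noise_summary {
	uint64_t total_noise_ns;
	uint64_t max_noise_ns;
	uint64_t noise_events;
	double cpu_available_pct;
};

/*
 * @ts holds @n increasing timestamps (ns) taken back to back by the
 * sampling loop. A gap larger than @threshold_ns means something else
 * ran in between, so the whole gap is accounted as noise.
 */
static struct noise_summary summarize_noise(const uint64_t *ts, size_t n,
					    uint64_t threshold_ns)
{
	struct noise_summary s = { 0, 0, 0, 100.0 };
	uint64_t runtime;
	size_t i;

	assert(ts != NULL || n == 0);
	if (n < 2)
		return s;

	for (i = 1; i < n; i++) {
		uint64_t gap = ts[i] - ts[i - 1];

		if (gap > threshold_ns) {
			s.total_noise_ns += gap;
			s.noise_events++;
			if (gap > s.max_noise_ns)
				s.max_noise_ns = gap;
		}
	}

	runtime = ts[n - 1] - ts[0];
	if (runtime)
		s.cpu_available_pct = 100.0 *
			(double)(runtime - s.total_noise_ns) / (double)runtime;
	return s;
}
```

Taking the timestamps as input keeps the accounting deterministic; a live version would fill the array from a monotonic clock inside the loop.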
config TIMERLAT_TRACER
bool "Timerlat tracer"
select OSNOISE_TRACER
select GENERIC_TRACER
help
The timerlat tracer aims to help the preemptive kernel developers
to find sources of wakeup latencies of real-time threads.
The tracer creates a per-cpu kernel thread with real-time priority.
The tracer thread sets a periodic timer to wakeup itself, and goes
to sleep waiting for the timer to fire. At the wakeup, the thread
then computes a wakeup latency value as the difference between
the current time and the absolute time that the timer was set
to expire.
The tracer prints two lines at every activation. The first is the
timer latency observed at the hardirq context before the
activation of the thread. The second is the timer latency observed
by the thread, which is the same level that cyclictest reports. The
ACTIVATION ID field serves to relate the irq execution to its
respective thread execution.
The tracer is built on top of the osnoise tracer, and the osnoise:
events can be used to trace the source of interference from NMI,
IRQs and other threads. It also enables the capture of the
stacktrace at the IRQ context, which helps to identify the code
path that can cause thread delay.
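
The latency value described above is the current time minus the absolute time at which the timer was programmed to expire, sampled once in IRQ context and once in the woken thread. A sketch of that arithmetic; the struct and function names are hypothetical, and timestamps are passed in rather than read from a clock so the example stays deterministic:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical pair of values matching the two lines printed per activation. */
struct timerlat_sample {
	int64_t irq_latency_ns;		/* seen when the timer IRQ runs */
	int64_t thread_latency_ns;	/* seen when the thread runs */
};

/* Latency relative to the absolute expiry time. */
static int64_t wakeup_latency_ns(uint64_t now_ns, uint64_t expiry_ns)
{
	return (int64_t)(now_ns - expiry_ns);
}

/* The next activation is one period after the previous absolute expiry. */
static uint64_t next_expiry_ns(uint64_t expiry_ns, uint64_t period_ns)
{
	assert(period_ns > 0);
	return expiry_ns + period_ns;
}

static struct timerlat_sample measure(uint64_t expiry_ns, uint64_t irq_now_ns,
				      uint64_t thread_now_ns)
{
	struct timerlat_sample s;

	s.irq_latency_ns = wakeup_latency_ns(irq_now_ns, expiry_ns);
	s.thread_latency_ns = wakeup_latency_ns(thread_now_ns, expiry_ns);
	return s;
}
```

Advancing from the previous expiry rather than from the observed wakeup time keeps the period from drifting when one activation is late.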
config MMIOTRACE
bool "Memory mapped IO tracing"
depends on HAVE_MMIOTRACE_SUPPORT && PCI


@ -58,6 +58,7 @@ obj-$(CONFIG_IRQSOFF_TRACER) += trace_irqsoff.o
obj-$(CONFIG_PREEMPT_TRACER) += trace_irqsoff.o
obj-$(CONFIG_SCHED_TRACER) += trace_sched_wakeup.o
obj-$(CONFIG_HWLAT_TRACER) += trace_hwlat.o
obj-$(CONFIG_OSNOISE_TRACER) += trace_osnoise.o
obj-$(CONFIG_NOP_TRACER) += trace_nop.o
obj-$(CONFIG_STACK_TRACER) += trace_stack.o
obj-$(CONFIG_MMIOTRACE) += trace_mmiotrace.o


@ -1842,7 +1842,8 @@ static int __bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *
if (prog->aux->max_tp_access > btp->writable_size)
return -EINVAL;
return tracepoint_probe_register(tp, (void *)btp->bpf_func, prog);
return tracepoint_probe_register_may_exist(tp, (void *)btp->bpf_func,
prog);
}
int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog)


@ -3391,7 +3391,7 @@ static void check_buffer(struct ring_buffer_per_cpu *cpu_buffer,
case RINGBUF_TYPE_PADDING:
if (event->time_delta == 1)
break;
/* fall through */
fallthrough;
case RINGBUF_TYPE_DATA:
ts += event->time_delta;
break;


@ -87,6 +87,7 @@ void __init disable_tracing_selftest(const char *reason)
/* Pipe tracepoints to printk */
struct trace_iterator *tracepoint_print_iter;
int tracepoint_printk;
static bool tracepoint_printk_stop_on_boot __initdata;
static DEFINE_STATIC_KEY_FALSE(tracepoint_printk_key);
/* For tracers that don't implement custom flags */
@ -197,12 +198,12 @@ __setup("ftrace=", set_cmdline_ftrace);
static int __init set_ftrace_dump_on_oops(char *str)
{
if (*str++ != '=' || !*str) {
if (*str++ != '=' || !*str || !strcmp("1", str)) {
ftrace_dump_on_oops = DUMP_ALL;
return 1;
}
if (!strcmp("orig_cpu", str)) {
if (!strcmp("orig_cpu", str) || !strcmp("2", str)) {
ftrace_dump_on_oops = DUMP_ORIG;
return 1;
}
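
The parsing change in this hunk makes the numeric forms equivalent to the existing spellings: no value or "=1" selects the dump-all mode, and "=orig_cpu" or "=2" selects the originating-CPU mode. A userspace sketch of the same decision table; the toy enum and parse_dump_on_oops() are hypothetical names, and @str is assumed to be the text that follows the option name:

```c
#include <assert.h>
#include <string.h>

enum toy_dump_mode { TOY_DUMP_NONE, TOY_DUMP_ALL, TOY_DUMP_ORIG };

/*
 * @str is what follows "ftrace_dump_on_oops" on the command line,
 * e.g. "", "=1", "=2" or "=orig_cpu". Returns 1 if the value was
 * recognized and stored in @mode, 0 otherwise.
 */
static int parse_dump_on_oops(const char *str, enum toy_dump_mode *mode)
{
	assert(str != NULL && mode != NULL);

	/* Bare option, empty value, or "1": dump all CPUs. */
	if (*str++ != '=' || !*str || !strcmp("1", str)) {
		*mode = TOY_DUMP_ALL;
		return 1;
	}

	/* "orig_cpu" or "2": dump only the CPU that triggered the oops. */
	if (!strcmp("orig_cpu", str) || !strcmp("2", str)) {
		*mode = TOY_DUMP_ORIG;
		return 1;
	}

	return 0;
}
```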
@ -257,6 +258,13 @@ static int __init set_tracepoint_printk(char *str)
}
__setup("tp_printk", set_tracepoint_printk);
static int __init set_tracepoint_printk_stop(char *str)
{
tracepoint_printk_stop_on_boot = true;
return 1;
}
__setup("tp_printk_stop_on_boot", set_tracepoint_printk_stop);
unsigned long long ns2usecs(u64 nsec)
{
nsec += 500;
@ -1683,8 +1691,7 @@ static ssize_t trace_seq_to_buffer(struct trace_seq *s, void *buf, size_t cnt)
unsigned long __read_mostly tracing_thresh;
static const struct file_operations tracing_max_lat_fops;
#if (defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)) && \
defined(CONFIG_FSNOTIFY)
#ifdef LATENCY_FS_NOTIFY
static struct workqueue_struct *fsnotify_wq;
@ -2185,8 +2192,15 @@ void tracing_reset_all_online_cpus(void)
}
}
/*
* The tgid_map array maps from pid to tgid; i.e. the value stored at index i
* is the tgid last observed corresponding to pid=i.
*/
static int *tgid_map;
/* The maximum valid index into tgid_map. */
static size_t tgid_map_max;
#define SAVED_CMDLINES_DEFAULT 128
#define NO_CMDLINE_MAP UINT_MAX
static arch_spinlock_t trace_cmdline_lock = __ARCH_SPIN_LOCK_UNLOCKED;
@ -2459,24 +2473,41 @@ void trace_find_cmdline(int pid, char comm[])
preempt_enable();
}
static int *trace_find_tgid_ptr(int pid)
{
/*
* Pairs with the smp_store_release in set_tracer_flag() to ensure that
* if we observe a non-NULL tgid_map then we also observe the correct
* tgid_map_max.
*/
int *map = smp_load_acquire(&tgid_map);
if (unlikely(!map || pid > tgid_map_max))
return NULL;
return &map[pid];
}
int trace_find_tgid(int pid)
{
if (unlikely(!tgid_map || !pid || pid > PID_MAX_DEFAULT))
return 0;
int *ptr = trace_find_tgid_ptr(pid);
return tgid_map[pid];
return ptr ? *ptr : 0;
}
static int trace_save_tgid(struct task_struct *tsk)
{
int *ptr;
/* treat recording of idle task as a success */
if (!tsk->pid)
return 1;
if (unlikely(!tgid_map || tsk->pid > PID_MAX_DEFAULT))
ptr = trace_find_tgid_ptr(tsk->pid);
if (!ptr)
return 0;
tgid_map[tsk->pid] = tsk->tgid;
*ptr = tsk->tgid;
return 1;
}
@ -2730,9 +2761,45 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb,
if (!tr->no_filter_buffering_ref &&
(trace_file->flags & (EVENT_FILE_FL_SOFT_DISABLED | EVENT_FILE_FL_FILTERED)) &&
(entry = this_cpu_read(trace_buffered_event))) {
/* Try to use the per cpu buffer first */
/*
* Filtering is on, so try to use the per cpu buffer first.
* This buffer will simulate a ring_buffer_event,
* where the type_len is zero and the array[0] will
* hold the full length.
* (see include/linux/ring-buffer.h for details on
* how the ring_buffer_event is structured).
*
* Using a temp buffer during filtering and copying it
* on a matched filter is quicker than writing directly
* into the ring buffer and then discarding it when
* it doesn't match. That is because the discard
* requires several atomic operations to get right.
* Copying on match and doing nothing on a failed match
* is still quicker than no copy on match, but having
* to discard out of the ring buffer on a failed match.
*/
int max_len = PAGE_SIZE - struct_size(entry, array, 1);
val = this_cpu_inc_return(trace_buffered_event_cnt);
if ((len < (PAGE_SIZE - sizeof(*entry) - sizeof(entry->array[0]))) && val == 1) {
/*
* Preemption is disabled, but interrupts and NMIs
* can still come in now. If that happens after
* the above increment, then it will have to go
* back to the old method of allocating the event
* on the ring buffer, and if the filter fails, it
* will have to call ring_buffer_discard_commit()
* to remove it.
*
* Need to also check the unlikely case that the
* length is bigger than the temp buffer size.
* If that happens, then the reserve is pretty much
* guaranteed to fail, as the ring buffer currently
* only allows events less than a page. But that may
* change in the future, so let the ring buffer reserve
* handle the failure in that case.
*/
if (val == 1 && likely(len <= max_len)) {
trace_event_setup(entry, type, trace_ctx);
entry->array[0] = len;
return entry;
@ -5172,6 +5239,8 @@ int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set)
int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
{
int *map;
if ((mask == TRACE_ITER_RECORD_TGID) ||
(mask == TRACE_ITER_RECORD_CMD))
lockdep_assert_held(&event_mutex);
@ -5194,10 +5263,19 @@ int set_tracer_flag(struct trace_array *tr, unsigned int mask, int enabled)
trace_event_enable_cmd_record(enabled);
if (mask == TRACE_ITER_RECORD_TGID) {
if (!tgid_map)
tgid_map = kvcalloc(PID_MAX_DEFAULT + 1,
sizeof(*tgid_map),
GFP_KERNEL);
if (!tgid_map) {
tgid_map_max = pid_max;
map = kvcalloc(tgid_map_max + 1, sizeof(*tgid_map),
GFP_KERNEL);
/*
* Pairs with smp_load_acquire() in
* trace_find_tgid_ptr() to ensure that if it observes
* the tgid_map we just allocated then it also observes
* the corresponding tgid_map_max value.
*/
smp_store_release(&tgid_map, map);
}
if (!tgid_map) {
tr->trace_flags &= ~TRACE_ITER_RECORD_TGID;
return -ENOMEM;
@ -5609,37 +5687,16 @@ static const struct file_operations tracing_readme_fops = {
static void *saved_tgids_next(struct seq_file *m, void *v, loff_t *pos)
{
int *ptr = v;
int pid = ++(*pos);
if (*pos || m->count)
ptr++;
(*pos)++;
for (; ptr <= &tgid_map[PID_MAX_DEFAULT]; ptr++) {
if (trace_find_tgid(*ptr))
return ptr;
}
return NULL;
return trace_find_tgid_ptr(pid);
}
static void *saved_tgids_start(struct seq_file *m, loff_t *pos)
{
void *v;
loff_t l = 0;
int pid = *pos;
if (!tgid_map)
return NULL;
v = &tgid_map[0];
while (l <= *pos) {
v = saved_tgids_next(m, v, &l);
if (!v)
return NULL;
}
return v;
return trace_find_tgid_ptr(pid);
}
static void saved_tgids_stop(struct seq_file *m, void *v)
@ -5648,9 +5705,14 @@ static void saved_tgids_stop(struct seq_file *m, void *v)
static int saved_tgids_show(struct seq_file *m, void *v)
{
int pid = (int *)v - tgid_map;
int *entry = (int *)v;
int pid = entry - tgid_map;
int tgid = *entry;
seq_printf(m, "%d %d\n", pid, trace_find_tgid(pid));
if (tgid == 0)
return SEQ_SKIP;
seq_printf(m, "%d %d\n", pid, tgid);
return 0;
}
@ -6135,7 +6197,7 @@ static int __tracing_resize_ring_buffer(struct trace_array *tr,
ssize_t tracing_resize_ring_buffer(struct trace_array *tr,
unsigned long size, int cpu_id)
{
int ret = size;
int ret;
mutex_lock(&trace_types_lock);
@ -7529,6 +7591,91 @@ static const struct file_operations snapshot_raw_fops = {
#endif /* CONFIG_TRACER_SNAPSHOT */
/*
* trace_min_max_write - Write a u64 value to a trace_min_max_param struct
* @filp: The active open file structure
* @ubuf: The userspace provided buffer holding the value to write
* @cnt: The number of bytes available in @ubuf
* @ppos: The current "file" position
*
* This function implements the write interface for a struct trace_min_max_param.
* The filp->private_data must point to a trace_min_max_param structure that
* defines where to write the value, the min and the max acceptable values,
* and a lock to protect the write.
*/
static ssize_t
trace_min_max_write(struct file *filp, const char __user *ubuf, size_t cnt, loff_t *ppos)
{
struct trace_min_max_param *param = filp->private_data;
u64 val;
int err;
if (!param)
return -EFAULT;
err = kstrtoull_from_user(ubuf, cnt, 10, &val);
if (err)
return err;
if (param->lock)
mutex_lock(param->lock);
if (param->min && val < *param->min)
err = -EINVAL;
if (param->max && val > *param->max)
err = -EINVAL;
if (!err)
*param->val = val;
if (param->lock)
mutex_unlock(param->lock);
if (err)
return err;
return cnt;
}
/*
* trace_min_max_read - Read a u64 value from a trace_min_max_param struct
* @filp: The active open file structure
* @ubuf: The userspace provided buffer to read value into
* @cnt: The maximum number of bytes to read
* @ppos: The current "file" position
*
* This function implements the read interface for a struct trace_min_max_param.
* The filp->private_data must point to a trace_min_max_param struct with valid
* data.
*/
static ssize_t
trace_min_max_read(struct file *filp, char __user *ubuf, size_t cnt, loff_t *ppos)
{
struct trace_min_max_param *param = filp->private_data;
char buf[U64_STR_SIZE];
int len;
u64 val;
if (!param)
return -EFAULT;
val = *param->val;
if (cnt > sizeof(buf))
cnt = sizeof(buf);
len = snprintf(buf, sizeof(buf), "%llu\n", val);
return simple_read_from_buffer(ubuf, cnt, ppos, buf, len);
}
const struct file_operations trace_min_max_fops = {
.open = tracing_open_generic,
.read = trace_min_max_read,
.write = trace_min_max_write,
};
#define TRACING_LOG_ERRS_MAX 8
#define TRACING_LOG_LOC_MAX 128
@ -9532,6 +9679,8 @@ static __init int tracer_init_tracefs(void)
return 0;
}
fs_initcall(tracer_init_tracefs);
static int trace_panic_handler(struct notifier_block *this,
unsigned long event, void *unused)
{
@ -9952,7 +10101,7 @@ void __init trace_init(void)
trace_event_init();
}
__init static int clear_boot_tracer(void)
__init static void clear_boot_tracer(void)
{
/*
* The default tracer at boot buffer is an init section.
@ -9962,26 +10111,21 @@ __init static int clear_boot_tracer(void)
* about to be freed.
*/
if (!default_bootup_tracer)
return 0;
return;
printk(KERN_INFO "ftrace bootup tracer '%s' not registered.\n",
default_bootup_tracer);
default_bootup_tracer = NULL;
return 0;
}
fs_initcall(tracer_init_tracefs);
late_initcall_sync(clear_boot_tracer);
#ifdef CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
__init static int tracing_set_default_clock(void)
__init static void tracing_set_default_clock(void)
{
/* sched_clock_stable() is determined in late_initcall */
if (!trace_boot_clock && !sched_clock_stable()) {
if (security_locked_down(LOCKDOWN_TRACEFS)) {
pr_warn("Can not set tracing clock due to lockdown\n");
return -EPERM;
return;
}
printk(KERN_WARNING
@ -9991,8 +10135,21 @@ __init static int tracing_set_default_clock(void)
"on the kernel command line\n");
tracing_set_clock(&global_trace, "global");
}
}
#else
static inline void tracing_set_default_clock(void) { }
#endif
__init static int late_trace_init(void)
{
if (tracepoint_printk && tracepoint_printk_stop_on_boot) {
static_key_disable(&tracepoint_printk_key.key);
tracepoint_printk = 0;
}
tracing_set_default_clock();
clear_boot_tracer();
return 0;
}
late_initcall_sync(tracing_set_default_clock);
#endif
late_initcall_sync(late_trace_init);
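
The tgid_map hunks above rely on a release/acquire pairing: the writer sets tgid_map_max before publishing the array pointer, so a reader that sees a non-NULL pointer also sees a matching bound. A minimal user-space sketch of that pattern follows, using C11 atomics in place of smp_store_release()/smp_load_acquire(). The names map_ptr, map_max, map_alloc(), map_find_ptr() and map_find() are hypothetical stand-ins, not kernel symbols, and the explicit negative-pid check is an addition made for the sketch.

```c
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical user-space stand-ins for tgid_map and tgid_map_max. */
static _Atomic(int *) map_ptr;
static size_t map_max;

/* Writer: set the bound first, then publish the array with release. */
static int map_alloc(size_t max)
{
	int *map = calloc(max + 1, sizeof(*map));

	if (!map)
		return -1;
	map_max = max;
	atomic_store_explicit(&map_ptr, map, memory_order_release);
	return 0;
}

/* Reader: acquire the pointer; only then is map_max safe to trust. */
static int *map_find_ptr(int pid)
{
	int *map = atomic_load_explicit(&map_ptr, memory_order_acquire);

	/* The pid < 0 check is an assumption added for this sketch. */
	if (!map || pid < 0 || (size_t)pid > map_max)
		return NULL;
	return &map[pid];
}

/* Lookup that treats "no map" and "out of range" as tgid 0. */
static int map_find(int pid)
{
	int *ptr = map_find_ptr(pid);

	return ptr ? *ptr : 0;
}
```

Keeping the bounds check in a single helper mirrors how the diff routes trace_find_tgid(), trace_save_tgid() and the saved_tgids seq_file callbacks through trace_find_tgid_ptr().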


@ -45,6 +45,8 @@ enum trace_type {
TRACE_BLK,
TRACE_BPUTS,
TRACE_HWLAT,
TRACE_OSNOISE,
TRACE_TIMERLAT,
TRACE_RAW_DATA,
TRACE_FUNC_REPEATS,
@ -290,7 +292,8 @@ struct trace_array {
struct array_buffer max_buffer;
bool allocated_snapshot;
#endif
#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)
#if defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER) \
|| defined(CONFIG_OSNOISE_TRACER)
unsigned long max_latency;
#ifdef CONFIG_FSNOTIFY
struct dentry *d_max_latency;
@ -438,6 +441,8 @@ extern void __ftrace_bad_type(void);
IF_ASSIGN(var, ent, struct bprint_entry, TRACE_BPRINT); \
IF_ASSIGN(var, ent, struct bputs_entry, TRACE_BPUTS); \
IF_ASSIGN(var, ent, struct hwlat_entry, TRACE_HWLAT); \
IF_ASSIGN(var, ent, struct osnoise_entry, TRACE_OSNOISE);\
IF_ASSIGN(var, ent, struct timerlat_entry, TRACE_TIMERLAT);\
IF_ASSIGN(var, ent, struct raw_data_entry, TRACE_RAW_DATA);\
IF_ASSIGN(var, ent, struct trace_mmiotrace_rw, \
TRACE_MMIO_RW); \
@ -668,15 +673,15 @@ void update_max_tr_single(struct trace_array *tr,
struct task_struct *tsk, int cpu);
#endif /* CONFIG_TRACER_MAX_TRACE */
#if (defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER)) && \
defined(CONFIG_FSNOTIFY)
#if (defined(CONFIG_TRACER_MAX_TRACE) || defined(CONFIG_HWLAT_TRACER) \
|| defined(CONFIG_OSNOISE_TRACER)) && defined(CONFIG_FSNOTIFY)
#define LATENCY_FS_NOTIFY
#endif
#ifdef LATENCY_FS_NOTIFY
void latency_fsnotify(struct trace_array *tr);
#else
static inline void latency_fsnotify(struct trace_array *tr) { }
#endif
#ifdef CONFIG_STACKTRACE
@ -1945,4 +1950,22 @@ static inline bool is_good_name(const char *name)
return true;
}
/*
* This is a generic way to read and write a u64 value from a file in tracefs.
*
* The value is stored in the variable pointed to by val. It must be
* at least *min and at most *max. The write is protected by an
* existing lock.
*/
struct trace_min_max_param {
struct mutex *lock;
u64 *val;
u64 *min;
u64 *max;
};
#define U64_STR_SIZE 24 /* 20 digits max */
extern const struct file_operations trace_min_max_fops;
#endif /* _LINUX_KERNEL_TRACE_H */
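
The bounds check that trace_min_max_write() applies through struct trace_min_max_param can be sketched in user space as below. This is an illustration under stated assumptions, not the kernel code: struct min_max_param and min_max_store() are hypothetical names, and the optional mutex is left out to keep the example self-contained.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space analogue of struct trace_min_max_param (no lock). */
struct min_max_param {
	uint64_t *val;
	uint64_t *min;	/* NULL means no lower bound */
	uint64_t *max;	/* NULL means no upper bound */
};

/* Store val if it is within bounds; return 0 on success, -EINVAL otherwise. */
static int min_max_store(struct min_max_param *p, uint64_t val)
{
	if (p->min && val < *p->min)
		return -EINVAL;
	if (p->max && val > *p->max)
		return -EINVAL;
	*p->val = val;
	return 0;
}
```

Because the bounds are pointers, two parameters can constrain each other, as the hwlat width (upper bounded by window) and window (lower bounded by width) do. Note that both comparisons are strict rejections, so a value equal to the bound is accepted.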


@ -225,14 +225,37 @@ static void __init
trace_boot_init_events(struct trace_array *tr, struct xbc_node *node)
{
struct xbc_node *gnode, *enode;
bool enable, enable_all = false;
const char *data;
node = xbc_node_find_child(node, "event");
if (!node)
return;
/* per-event key starts with "event.GROUP.EVENT" */
xbc_node_for_each_child(node, gnode)
xbc_node_for_each_child(gnode, enode)
xbc_node_for_each_child(node, gnode) {
data = xbc_node_get_data(gnode);
if (!strcmp(data, "enable")) {
enable_all = true;
continue;
}
enable = false;
xbc_node_for_each_child(gnode, enode) {
data = xbc_node_get_data(enode);
if (!strcmp(data, "enable")) {
enable = true;
continue;
}
trace_boot_init_one_event(tr, gnode, enode);
}
/* Event enablement must be done after event settings */
if (enable) {
data = xbc_node_get_data(gnode);
trace_array_set_clr_event(tr, data, NULL, true);
}
}
/* Ditto */
if (enable_all)
trace_array_set_clr_event(tr, NULL, NULL, true);
}
#else
#define trace_boot_enable_events(tr, node) do {} while (0)


@ -360,3 +360,44 @@ FTRACE_ENTRY(func_repeats, func_repeats_entry,
__entry->count,
FUNC_REPEATS_GET_DELTA_TS(__entry))
);
FTRACE_ENTRY(osnoise, osnoise_entry,
TRACE_OSNOISE,
F_STRUCT(
__field( u64, noise )
__field( u64, runtime )
__field( u64, max_sample )
__field( unsigned int, hw_count )
__field( unsigned int, nmi_count )
__field( unsigned int, irq_count )
__field( unsigned int, softirq_count )
__field( unsigned int, thread_count )
),
F_printk("noise:%llu\tmax_sample:%llu\thw:%u\tnmi:%u\tirq:%u\tsoftirq:%u\tthread:%u\n",
__entry->noise,
__entry->max_sample,
__entry->hw_count,
__entry->nmi_count,
__entry->irq_count,
__entry->softirq_count,
__entry->thread_count)
);
FTRACE_ENTRY(timerlat, timerlat_entry,
TRACE_TIMERLAT,
F_STRUCT(
__field( unsigned int, seqnum )
__field( int, context )
__field( u64, timer_latency )
),
F_printk("seq:%u\tcontext:%d\ttimer_latency:%llu\n",
__entry->seqnum,
__entry->context,
__entry->timer_latency)
);


@ -2434,12 +2434,12 @@ create_field_var_hist(struct hist_trigger_data *target_hist_data,
char *subsys_name, char *event_name, char *field_name)
{
struct trace_array *tr = target_hist_data->event_file->tr;
struct hist_field *event_var = ERR_PTR(-EINVAL);
struct hist_trigger_data *hist_data;
unsigned int i, n, first = true;
struct field_var_hist *var_hist;
struct trace_event_file *file;
struct hist_field *key_field;
struct hist_field *event_var;
char *saved_filter;
char *cmd;
int ret;
@ -5232,6 +5232,7 @@ static void unregister_field_var_hists(struct hist_trigger_data *hist_data)
cmd = hist_data->field_var_hists[i]->cmd;
ret = event_hist_trigger_func(&trigger_hist_cmd, file,
"!hist", "hist", cmd);
WARN_ON_ONCE(ret < 0);
}
}


@ -916,7 +916,8 @@ void unpause_named_trigger(struct event_trigger_data *data)
/**
* set_named_trigger_data - Associate common named trigger data
* @data: The trigger data of a named trigger to unpause
* @data: The trigger data to associate
* @named_data: The common named trigger to be associated
*
* Named triggers are sets of triggers that share a common set of
* trigger data. The first named trigger registered with a given name


@ -34,7 +34,7 @@
* Copyright (C) 2008-2009 Jon Masters, Red Hat, Inc. <jcm@redhat.com>
* Copyright (C) 2013-2016 Steven Rostedt, Red Hat, Inc. <srostedt@redhat.com>
*
* Includes useful feedback from Clark Williams <clark@redhat.com>
* Includes useful feedback from Clark Williams <williams@redhat.com>
*
*/
#include <linux/kthread.h>
@ -54,20 +54,33 @@ static struct trace_array *hwlat_trace;
#define DEFAULT_SAMPLE_WIDTH 500000 /* 0.5s */
#define DEFAULT_LAT_THRESHOLD 10 /* 10us */
/* sampling thread*/
static struct task_struct *hwlat_kthread;
static struct dentry *hwlat_sample_width; /* sample width us */
static struct dentry *hwlat_sample_window; /* sample window us */
static struct dentry *hwlat_thread_mode; /* hwlat thread mode */
enum {
MODE_NONE = 0,
MODE_ROUND_ROBIN,
MODE_PER_CPU,
MODE_MAX
};
static char *thread_mode_str[] = { "none", "round-robin", "per-cpu" };
/* Save the previous tracing_thresh value */
static unsigned long save_tracing_thresh;
/* NMI timestamp counters */
static u64 nmi_ts_start;
static u64 nmi_total_ts;
static int nmi_count;
static int nmi_cpu;
/* runtime kthread data */
struct hwlat_kthread_data {
struct task_struct *kthread;
/* NMI timestamp counters */
u64 nmi_ts_start;
u64 nmi_total_ts;
int nmi_count;
int nmi_cpu;
};
struct hwlat_kthread_data hwlat_single_cpu_data;
DEFINE_PER_CPU(struct hwlat_kthread_data, hwlat_per_cpu_data);
/* Tells NMIs to call back to the hwlat tracer to record timestamps */
bool trace_hwlat_callback_enabled;
@ -96,11 +109,24 @@ static struct hwlat_data {
u64 sample_window; /* total sampling window (on+off) */
u64 sample_width; /* active sampling portion of window */
int thread_mode; /* thread mode */
} hwlat_data = {
.sample_window = DEFAULT_SAMPLE_WINDOW,
.sample_width = DEFAULT_SAMPLE_WIDTH,
.thread_mode = MODE_ROUND_ROBIN
};
static struct hwlat_kthread_data *get_cpu_data(void)
{
if (hwlat_data.thread_mode == MODE_PER_CPU)
return this_cpu_ptr(&hwlat_per_cpu_data);
else
return &hwlat_single_cpu_data;
}
static bool hwlat_busy;
static void trace_hwlat_sample(struct hwlat_sample *sample)
{
struct trace_array *tr = hwlat_trace;
@ -136,7 +162,9 @@ static void trace_hwlat_sample(struct hwlat_sample *sample)
void trace_hwlat_callback(bool enter)
{
if (smp_processor_id() != nmi_cpu)
struct hwlat_kthread_data *kdata = get_cpu_data();
if (!kdata->kthread)
return;
/*
@ -145,15 +173,24 @@ void trace_hwlat_callback(bool enter)
*/
if (!IS_ENABLED(CONFIG_GENERIC_SCHED_CLOCK)) {
if (enter)
nmi_ts_start = time_get();
kdata->nmi_ts_start = time_get();
else
nmi_total_ts += time_get() - nmi_ts_start;
kdata->nmi_total_ts += time_get() - kdata->nmi_ts_start;
}
if (enter)
nmi_count++;
kdata->nmi_count++;
}
/*
* hwlat_err - report a hwlat error.
*/
#define hwlat_err(msg) ({ \
struct trace_array *tr = hwlat_trace; \
\
trace_array_printk_buf(tr->array_buffer.buffer, _THIS_IP_, msg); \
})
/**
* get_sample - sample the CPU TSC and look for likely hardware latencies
*
@ -163,6 +200,7 @@ void trace_hwlat_callback(bool enter)
*/
static int get_sample(void)
{
struct hwlat_kthread_data *kdata = get_cpu_data();
struct trace_array *tr = hwlat_trace;
struct hwlat_sample s;
time_type start, t1, t2, last_t2;
@ -175,9 +213,8 @@ static int get_sample(void)
do_div(thresh, NSEC_PER_USEC); /* modifies interval value */
nmi_cpu = smp_processor_id();
nmi_total_ts = 0;
nmi_count = 0;
kdata->nmi_total_ts = 0;
kdata->nmi_count = 0;
/* Make sure NMIs see this first */
barrier();
@ -197,7 +234,7 @@ static int get_sample(void)
outer_diff = time_to_us(time_sub(t1, last_t2));
/* This shouldn't happen */
if (outer_diff < 0) {
pr_err(BANNER "time running backwards\n");
hwlat_err(BANNER "time running backwards\n");
goto out;
}
if (outer_diff > outer_sample)
@ -209,7 +246,7 @@ static int get_sample(void)
/* Check for possible overflows */
if (total < last_total) {
pr_err("Time total overflowed\n");
hwlat_err("Time total overflowed\n");
break;
}
last_total = total;
@ -225,7 +262,7 @@ static int get_sample(void)
/* This shouldn't happen */
if (diff < 0) {
pr_err(BANNER "time running backwards\n");
hwlat_err(BANNER "time running backwards\n");
goto out;
}
@ -247,15 +284,15 @@ static int get_sample(void)
ret = 1;
/* We read in microseconds */
if (nmi_total_ts)
do_div(nmi_total_ts, NSEC_PER_USEC);
if (kdata->nmi_total_ts)
do_div(kdata->nmi_total_ts, NSEC_PER_USEC);
hwlat_data.count++;
s.seqnum = hwlat_data.count;
s.duration = sample;
s.outer_duration = outer_sample;
s.nmi_total_ts = nmi_total_ts;
s.nmi_count = nmi_count;
s.nmi_total_ts = kdata->nmi_total_ts;
s.nmi_count = kdata->nmi_count;
s.count = count;
trace_hwlat_sample(&s);
@ -273,7 +310,6 @@ static int get_sample(void)
}
static struct cpumask save_cpumask;
static bool disable_migrate;
static void move_to_next_cpu(void)
{
@ -281,15 +317,13 @@ static void move_to_next_cpu(void)
struct trace_array *tr = hwlat_trace;
int next_cpu;
if (disable_migrate)
return;
/*
* If for some reason the user modifies the CPU affinity
* of this thread, then stop migrating for the duration
* of the current test.
*/
if (!cpumask_equal(current_mask, current->cpus_ptr))
goto disable;
goto change_mode;
get_online_cpus();
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
@ -300,7 +334,7 @@ static void move_to_next_cpu(void)
next_cpu = cpumask_first(current_mask);
if (next_cpu >= nr_cpu_ids) /* Shouldn't happen! */
goto disable;
goto change_mode;
cpumask_clear(current_mask);
cpumask_set_cpu(next_cpu, current_mask);
@ -308,8 +342,9 @@ static void move_to_next_cpu(void)
sched_setaffinity(0, current_mask);
return;
disable:
disable_migrate = true;
change_mode:
hwlat_data.thread_mode = MODE_NONE;
pr_info(BANNER "cpumask changed while in round-robin mode, switching to mode none\n");
}
/*
@ -328,7 +363,8 @@ static int kthread_fn(void *data)
while (!kthread_should_stop()) {
move_to_next_cpu();
if (hwlat_data.thread_mode == MODE_ROUND_ROBIN)
move_to_next_cpu();
local_irq_disable();
get_sample();
@ -351,178 +387,380 @@ static int kthread_fn(void *data)
return 0;
}
/**
* start_kthread - Kick off the hardware latency sampling/detector kthread
/*
* stop_single_kthread - Inform the hardware latency sampling/detector kthread to stop
*
* This kicks the running hardware latency sampling/detector kernel thread and
* tells it to stop sampling now. Use this on unload and at system shutdown.
*/
static void stop_single_kthread(void)
{
struct hwlat_kthread_data *kdata = get_cpu_data();
struct task_struct *kthread;
get_online_cpus();
kthread = kdata->kthread;
if (!kthread)
goto out_put_cpus;
kthread_stop(kthread);
kdata->kthread = NULL;
out_put_cpus:
put_online_cpus();
}
/*
* start_single_kthread - Kick off the hardware latency sampling/detector kthread
*
* This starts the kernel thread that will sit and sample the CPU timestamp
* counter (TSC or similar) and look for potential hardware latencies.
*/
static int start_kthread(struct trace_array *tr)
static int start_single_kthread(struct trace_array *tr)
{
struct hwlat_kthread_data *kdata = get_cpu_data();
struct cpumask *current_mask = &save_cpumask;
struct task_struct *kthread;
int next_cpu;
if (hwlat_kthread)
return 0;
/* Just pick the first CPU on first iteration */
get_online_cpus();
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
put_online_cpus();
next_cpu = cpumask_first(current_mask);
if (kdata->kthread)
goto out_put_cpus;
kthread = kthread_create(kthread_fn, NULL, "hwlatd");
if (IS_ERR(kthread)) {
pr_err(BANNER "could not start sampling thread\n");
put_online_cpus();
return -ENOMEM;
}
/* Just pick the first CPU on first iteration */
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
if (hwlat_data.thread_mode == MODE_ROUND_ROBIN) {
next_cpu = cpumask_first(current_mask);
cpumask_clear(current_mask);
cpumask_set_cpu(next_cpu, current_mask);
}
sched_setaffinity(kthread->pid, current_mask);
kdata->kthread = kthread;
wake_up_process(kthread);
out_put_cpus:
put_online_cpus();
return 0;
}
/*
* stop_cpu_kthread - Stop a hwlat cpu kthread
*/
static void stop_cpu_kthread(unsigned int cpu)
{
struct task_struct *kthread;
kthread = per_cpu(hwlat_per_cpu_data, cpu).kthread;
if (kthread)
kthread_stop(kthread);
per_cpu(hwlat_per_cpu_data, cpu).kthread = NULL;
}
/*
* stop_per_cpu_kthreads - Inform the hardware latency sampling/detector kthread to stop
*
* This kicks the running hardware latency sampling/detector kernel threads and
* tells them to stop sampling now. Use this on unload and at system shutdown.
*/
static void stop_per_cpu_kthreads(void)
{
unsigned int cpu;
get_online_cpus();
for_each_online_cpu(cpu)
stop_cpu_kthread(cpu);
put_online_cpus();
}
/*
* start_cpu_kthread - Start a hwlat cpu kthread
*/
static int start_cpu_kthread(unsigned int cpu)
{
struct task_struct *kthread;
char comm[24];
snprintf(comm, 24, "hwlatd/%d", cpu);
kthread = kthread_create_on_cpu(kthread_fn, NULL, cpu, comm);
if (IS_ERR(kthread)) {
pr_err(BANNER "could not start sampling thread\n");
return -ENOMEM;
}
cpumask_clear(current_mask);
cpumask_set_cpu(next_cpu, current_mask);
sched_setaffinity(kthread->pid, current_mask);
hwlat_kthread = kthread;
per_cpu(hwlat_per_cpu_data, cpu).kthread = kthread;
wake_up_process(kthread);
return 0;
}
/**
* stop_kthread - Inform the hardware latency sampling/detector kthread to stop
*
* This kicks the running hardware latency sampling/detector kernel thread and
* tells it to stop sampling now. Use this on unload and at system shutdown.
*/
static void stop_kthread(void)
#ifdef CONFIG_HOTPLUG_CPU
static void hwlat_hotplug_workfn(struct work_struct *dummy)
{
if (!hwlat_kthread)
return;
kthread_stop(hwlat_kthread);
hwlat_kthread = NULL;
struct trace_array *tr = hwlat_trace;
unsigned int cpu = smp_processor_id();
mutex_lock(&trace_types_lock);
mutex_lock(&hwlat_data.lock);
get_online_cpus();
if (!hwlat_busy || hwlat_data.thread_mode != MODE_PER_CPU)
goto out_unlock;
if (!cpumask_test_cpu(cpu, tr->tracing_cpumask))
goto out_unlock;
start_cpu_kthread(cpu);
out_unlock:
put_online_cpus();
mutex_unlock(&hwlat_data.lock);
mutex_unlock(&trace_types_lock);
}
static DECLARE_WORK(hwlat_hotplug_work, hwlat_hotplug_workfn);
/*
* hwlat_cpu_init - CPU hotplug online callback function
*/
static int hwlat_cpu_init(unsigned int cpu)
{
schedule_work_on(cpu, &hwlat_hotplug_work);
return 0;
}
/*
* hwlat_read - Wrapper read function for reading both window and width
* @filp: The active open file structure
* @ubuf: The userspace provided buffer to read value into
* @cnt: The maximum number of bytes to read
* @ppos: The current "file" position
*
* This function provides a generic read implementation for the global state
* "hwlat_data" structure filesystem entries.
* hwlat_cpu_die - CPU hotplug offline callback function
*/
static ssize_t hwlat_read(struct file *filp, char __user *ubuf,
size_t cnt, loff_t *ppos)
static int hwlat_cpu_die(unsigned int cpu)
{
char buf[U64STR_SIZE];
u64 *entry = filp->private_data;
u64 val;
int len;
stop_cpu_kthread(cpu);
return 0;
}
if (!entry)
static void hwlat_init_hotplug_support(void)
{
int ret;
ret = cpuhp_setup_state(CPUHP_AP_ONLINE_DYN, "trace/hwlat:online",
hwlat_cpu_init, hwlat_cpu_die);
if (ret < 0)
pr_warn(BANNER "Error to init cpu hotplug support\n");
return;
}
#else /* CONFIG_HOTPLUG_CPU */
static void hwlat_init_hotplug_support(void)
{
return;
}
#endif /* CONFIG_HOTPLUG_CPU */
/*
* start_per_cpu_kthreads - Kick off the hardware latency sampling/detector kthreads
*
* This starts the kernel threads that will sit on potentially all cpus and
* sample the CPU timestamp counter (TSC or similar) and look for potential
* hardware latencies.
*/
static int start_per_cpu_kthreads(struct trace_array *tr)
{
struct cpumask *current_mask = &save_cpumask;
unsigned int cpu;
int retval;
get_online_cpus();
/*
* Run only on CPUs in which hwlat is allowed to run.
*/
cpumask_and(current_mask, cpu_online_mask, tr->tracing_cpumask);
for_each_online_cpu(cpu)
per_cpu(hwlat_per_cpu_data, cpu).kthread = NULL;
for_each_cpu(cpu, current_mask) {
retval = start_cpu_kthread(cpu);
if (retval)
goto out_error;
}
put_online_cpus();
return 0;
out_error:
put_online_cpus();
stop_per_cpu_kthreads();
return retval;
}
static void *s_mode_start(struct seq_file *s, loff_t *pos)
{
int mode = *pos;
mutex_lock(&hwlat_data.lock);
if (mode >= MODE_MAX)
return NULL;
return pos;
}
static void *s_mode_next(struct seq_file *s, void *v, loff_t *pos)
{
int mode = ++(*pos);
if (mode >= MODE_MAX)
return NULL;
return pos;
}
static int s_mode_show(struct seq_file *s, void *v)
{
loff_t *pos = v;
int mode = *pos;
if (mode == hwlat_data.thread_mode)
seq_printf(s, "[%s]", thread_mode_str[mode]);
else
seq_printf(s, "%s", thread_mode_str[mode]);
if (mode != MODE_MAX)
seq_puts(s, " ");
return 0;
}
static void s_mode_stop(struct seq_file *s, void *v)
{
seq_puts(s, "\n");
mutex_unlock(&hwlat_data.lock);
}
static const struct seq_operations thread_mode_seq_ops = {
.start = s_mode_start,
.next = s_mode_next,
.show = s_mode_show,
.stop = s_mode_stop
};
static int hwlat_mode_open(struct inode *inode, struct file *file)
{
return seq_open(file, &thread_mode_seq_ops);
};
static void hwlat_tracer_start(struct trace_array *tr);
static void hwlat_tracer_stop(struct trace_array *tr);
/**
* hwlat_mode_write - Write function for "mode" entry
* @filp: The active open file structure
* @ubuf: The user buffer that contains the value to write
* @cnt: The maximum number of bytes to write to "file"
* @ppos: The current position in @file
*
* This function provides a write implementation for the "mode" interface
* to the hardware latency detector. hwlatd has different operation modes.
* The "none" sets the allowed cpumask for a single hwlatd thread at the
* startup and lets the scheduler handle the migration. The default mode is
* the "round-robin" one, in which a single hwlatd thread runs, migrating
* among the allowed CPUs in a round-robin fashion. The "per-cpu" mode
* creates one hwlatd thread per allowed CPU.
*/
static ssize_t hwlat_mode_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
struct trace_array *tr = hwlat_trace;
const char *mode;
char buf[64];
int ret, i;
if (cnt >= sizeof(buf))
return -EINVAL;
if (copy_from_user(buf, ubuf, cnt))
return -EFAULT;
if (cnt > sizeof(buf))
cnt = sizeof(buf);
buf[cnt] = 0;
val = *entry;
mode = strstrip(buf);
len = snprintf(buf, sizeof(buf), "%llu\n", val);
ret = -EINVAL;
return simple_read_from_buffer(ubuf, cnt, ppos, buf, len);
}
/**
* hwlat_width_write - Write function for "width" entry
* @filp: The active open file structure
* @ubuf: The user buffer that contains the value to write
* @cnt: The maximum number of bytes to write to "file"
* @ppos: The current position in @file
*
* This function provides a write implementation for the "width" interface
* to the hardware latency detector. It can be used to configure
* for how many us of the total window us we will actively sample for any
* hardware-induced latency periods. Obviously, it is not possible to
* sample constantly and have the system respond to a sample reader, or,
* worse, without having the system appear to have gone out to lunch. It
* is enforced that width is less that the total window size.
*/
static ssize_t
hwlat_width_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
u64 val;
int err;
err = kstrtoull_from_user(ubuf, cnt, 10, &val);
if (err)
return err;
/*
* trace_types_lock is taken to avoid concurrency on start/stop
* and hwlat_busy.
*/
mutex_lock(&trace_types_lock);
if (hwlat_busy)
hwlat_tracer_stop(tr);
mutex_lock(&hwlat_data.lock);
if (val < hwlat_data.sample_window)
hwlat_data.sample_width = val;
else
err = -EINVAL;
for (i = 0; i < MODE_MAX; i++) {
if (strcmp(mode, thread_mode_str[i]) == 0) {
hwlat_data.thread_mode = i;
ret = cnt;
}
}
mutex_unlock(&hwlat_data.lock);
if (err)
return err;
if (hwlat_busy)
hwlat_tracer_start(tr);
mutex_unlock(&trace_types_lock);
return cnt;
*ppos += cnt;
return ret;
}
/**
* hwlat_window_write - Write function for "window" entry
* @filp: The active open file structure
* @ubuf: The user buffer that contains the value to write
* @cnt: The maximum number of bytes to write to "file"
* @ppos: The current position in @file
*
* This function provides a write implementation for the "window" interface
* to the hardware latency detector. The window is the total time
* in us that will be considered one sample period. Conceptually, windows
* occur back-to-back and contain a sample width period during which
* actual sampling occurs. Can be used to write a new total window size. It
* is enforced that any value written must be greater than the sample width
* size, or an error results.
/*
* The width parameter is read/write using the generic trace_min_max_param
* method. The *val is protected by the hwlat_data lock and is upper
* bounded by the window parameter.
*/
static ssize_t
hwlat_window_write(struct file *filp, const char __user *ubuf,
size_t cnt, loff_t *ppos)
{
u64 val;
int err;
err = kstrtoull_from_user(ubuf, cnt, 10, &val);
if (err)
return err;
mutex_lock(&hwlat_data.lock);
if (hwlat_data.sample_width < val)
hwlat_data.sample_window = val;
else
err = -EINVAL;
mutex_unlock(&hwlat_data.lock);
if (err)
return err;
return cnt;
}
static const struct file_operations width_fops = {
.open = tracing_open_generic,
.read = hwlat_read,
.write = hwlat_width_write,
static struct trace_min_max_param hwlat_width = {
.lock = &hwlat_data.lock,
.val = &hwlat_data.sample_width,
.max = &hwlat_data.sample_window,
.min = NULL,
};
static const struct file_operations window_fops = {
.open = tracing_open_generic,
.read = hwlat_read,
.write = hwlat_window_write,
/*
* The window parameter is read/write using the generic trace_min_max_param
* method. The *val is protected by the hwlat_data lock and is lower
* bounded by the width parameter.
*/
static struct trace_min_max_param hwlat_window = {
.lock = &hwlat_data.lock,
.val = &hwlat_data.sample_window,
.max = NULL,
.min = &hwlat_data.sample_width,
};
static const struct file_operations thread_mode_fops = {
.open = hwlat_mode_open,
.read = seq_read,
.llseek = seq_lseek,
.release = seq_release,
.write = hwlat_mode_write
};
/**
* init_tracefs - A function to initialize the tracefs interface files
*
@ -546,18 +784,25 @@ static int init_tracefs(void)
hwlat_sample_window = tracefs_create_file("window", 0640,
top_dir,
&hwlat_data.sample_window,
&window_fops);
&hwlat_window,
&trace_min_max_fops);
if (!hwlat_sample_window)
goto err;
hwlat_sample_width = tracefs_create_file("width", 0644,
top_dir,
&hwlat_data.sample_width,
&width_fops);
&hwlat_width,
&trace_min_max_fops);
if (!hwlat_sample_width)
goto err;
hwlat_thread_mode = trace_create_file("mode", 0644,
top_dir,
NULL,
&thread_mode_fops);
if (!hwlat_thread_mode)
goto err;
return 0;
err:
@ -569,18 +814,22 @@ static void hwlat_tracer_start(struct trace_array *tr)
{
int err;
err = start_kthread(tr);
if (hwlat_data.thread_mode == MODE_PER_CPU)
err = start_per_cpu_kthreads(tr);
else
err = start_single_kthread(tr);
if (err)
pr_err(BANNER "Cannot start hwlat kthread\n");
}
static void hwlat_tracer_stop(struct trace_array *tr)
{
stop_kthread();
if (hwlat_data.thread_mode == MODE_PER_CPU)
stop_per_cpu_kthreads();
else
stop_single_kthread();
}
static bool hwlat_busy;
static int hwlat_tracer_init(struct trace_array *tr)
{
/* Only allow one instance to enable this */
@ -589,7 +838,6 @@ static int hwlat_tracer_init(struct trace_array *tr)
hwlat_trace = tr;
disable_migrate = false;
hwlat_data.count = 0;
tr->max_latency = 0;
save_tracing_thresh = tracing_thresh;
@ -608,7 +856,7 @@ static int hwlat_tracer_init(struct trace_array *tr)
static void hwlat_tracer_reset(struct trace_array *tr)
{
stop_kthread();
hwlat_tracer_stop(tr);
/* the tracing threshold is static between runs */
last_tracing_thresh = tracing_thresh;
@ -637,6 +885,8 @@ __init static int init_hwlat_tracer(void)
if (ret)
return ret;
hwlat_init_hotplug_support();
init_tracefs();
return 0;
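
The comments in this file describe `width` as upper bounded by `window`, and `window` as lower bounded by `width`. A minimal userspace sketch of that check follows. `struct bounded_param` and `bounded_param_set()` are hypothetical names chosen for illustration, not the kernel's `trace_min_max_param` code. Inclusive bounds are an assumption, and locking is left out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a min/max bounded parameter. */
struct bounded_param {
	uint64_t *val;	/* value being updated */
	uint64_t *min;	/* optional lower bound, NULL if none */
	uint64_t *max;	/* optional upper bound, NULL if none */
};

/* Store new_val if it is within the (assumed inclusive) bounds. */
static int bounded_param_set(struct bounded_param *p, uint64_t new_val)
{
	assert(p && p->val);
	if (p->min && new_val < *p->min)
		return -EINVAL;
	if (p->max && new_val > *p->max)
		return -EINVAL;
	*p->val = new_val;
	return 0;
}
```

Because each parameter points at the other's value as its bound, the pair stays consistent without any file-specific write handler.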

kernel/trace/trace_osnoise.c (new file, 2059 lines; diff suppressed because it is too large)

@ -1202,7 +1202,6 @@ trace_hwlat_print(struct trace_iterator *iter, int flags,
return trace_handle_return(s);
}
static enum print_line_t
trace_hwlat_raw(struct trace_iterator *iter, int flags,
struct trace_event *event)
@ -1232,6 +1231,122 @@ static struct trace_event trace_hwlat_event = {
.funcs = &trace_hwlat_funcs,
};
/* TRACE_OSNOISE */
static enum print_line_t
trace_osnoise_print(struct trace_iterator *iter, int flags,
struct trace_event *event)
{
struct trace_entry *entry = iter->ent;
struct trace_seq *s = &iter->seq;
struct osnoise_entry *field;
u64 ratio, ratio_dec;
u64 net_runtime;
trace_assign_type(field, entry);
/*
* compute the available % of cpu time.
*/
net_runtime = field->runtime - field->noise;
ratio = net_runtime * 10000000;
do_div(ratio, field->runtime);
ratio_dec = do_div(ratio, 100000);
trace_seq_printf(s, "%llu %10llu %3llu.%05llu %7llu",
field->runtime,
field->noise,
ratio, ratio_dec,
field->max_sample);
trace_seq_printf(s, " %6u", field->hw_count);
trace_seq_printf(s, " %6u", field->nmi_count);
trace_seq_printf(s, " %6u", field->irq_count);
trace_seq_printf(s, " %6u", field->softirq_count);
trace_seq_printf(s, " %6u", field->thread_count);
trace_seq_putc(s, '\n');
return trace_handle_return(s);
}
static enum print_line_t
trace_osnoise_raw(struct trace_iterator *iter, int flags,
struct trace_event *event)
{
struct osnoise_entry *field;
struct trace_seq *s = &iter->seq;
trace_assign_type(field, iter->ent);
trace_seq_printf(s, "%lld %llu %llu %u %u %u %u %u\n",
field->runtime,
field->noise,
field->max_sample,
field->hw_count,
field->nmi_count,
field->irq_count,
field->softirq_count,
field->thread_count);
return trace_handle_return(s);
}
static struct trace_event_functions trace_osnoise_funcs = {
.trace = trace_osnoise_print,
.raw = trace_osnoise_raw,
};
static struct trace_event trace_osnoise_event = {
.type = TRACE_OSNOISE,
.funcs = &trace_osnoise_funcs,
};
/* TRACE_TIMERLAT */
static enum print_line_t
trace_timerlat_print(struct trace_iterator *iter, int flags,
struct trace_event *event)
{
struct trace_entry *entry = iter->ent;
struct trace_seq *s = &iter->seq;
struct timerlat_entry *field;
trace_assign_type(field, entry);
trace_seq_printf(s, "#%-5u context %6s timer_latency %9llu ns\n",
field->seqnum,
field->context ? "thread" : "irq",
field->timer_latency);
return trace_handle_return(s);
}
static enum print_line_t
trace_timerlat_raw(struct trace_iterator *iter, int flags,
struct trace_event *event)
{
struct timerlat_entry *field;
struct trace_seq *s = &iter->seq;
trace_assign_type(field, iter->ent);
trace_seq_printf(s, "%u %d %llu\n",
field->seqnum,
field->context,
field->timer_latency);
return trace_handle_return(s);
}
static struct trace_event_functions trace_timerlat_funcs = {
.trace = trace_timerlat_print,
.raw = trace_timerlat_raw,
};
static struct trace_event trace_timerlat_event = {
.type = TRACE_TIMERLAT,
.funcs = &trace_timerlat_funcs,
};
/* TRACE_BPUTS */
static enum print_line_t
trace_bputs_print(struct trace_iterator *iter, int flags,
@ -1442,6 +1557,8 @@ static struct trace_event *events[] __initdata = {
&trace_bprint_event,
&trace_print_event,
&trace_hwlat_event,
&trace_osnoise_event,
&trace_timerlat_event,
&trace_raw_data_event,
&trace_func_repeats_event,
NULL
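
The osnoise print handler above derives the available CPU percentage with integer arithmetic only. The same calculation can be sketched in userspace as below. `available_cpu_pct()` is a hypothetical helper name, and plain `/` and `%` stand in for the kernel's `do_div()`.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Split the share of runtime not lost to noise into an integer percent
 * and a five-digit fraction, using only integer arithmetic.
 */
static void available_cpu_pct(uint64_t runtime, uint64_t noise,
			      uint64_t *pct, uint64_t *pct_dec)
{
	uint64_t ratio;

	assert(runtime > 0 && noise <= runtime);
	ratio = (runtime - noise) * 10000000ULL;	/* 100 % scaled by 10^5 */
	ratio /= runtime;
	*pct = ratio / 100000;		/* integer part of the percentage */
	*pct_dec = ratio % 100000;	/* five decimal digits */
}
```

Scaling by 10000000 before dividing keeps five decimal places of the percentage without floating point, which is not available in this kernel context.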

@ -26,9 +26,9 @@ static struct task_struct *wakeup_task;
static int wakeup_cpu;
static int wakeup_current_cpu;
static unsigned wakeup_prio = -1;
static int wakeup_rt;
static int wakeup_dl;
static int tracing_dl = 0;
static bool wakeup_rt;
static bool wakeup_dl;
static bool tracing_dl;
static arch_spinlock_t wakeup_lock =
(arch_spinlock_t)__ARCH_SPIN_LOCK_UNLOCKED;
@ -498,7 +498,7 @@ static void __wakeup_reset(struct trace_array *tr)
{
wakeup_cpu = -1;
wakeup_prio = -1;
tracing_dl = 0;
tracing_dl = false;
if (wakeup_task)
put_task_struct(wakeup_task);
@ -572,9 +572,9 @@ probe_wakeup(void *ignore, struct task_struct *p)
* another task until the first one wakes up.
*/
if (dl_task(p))
tracing_dl = 1;
tracing_dl = true;
else
tracing_dl = 0;
tracing_dl = false;
wakeup_task = get_task_struct(p);
@ -685,8 +685,8 @@ static int wakeup_tracer_init(struct trace_array *tr)
if (wakeup_busy)
return -EBUSY;
wakeup_dl = 0;
wakeup_rt = 0;
wakeup_dl = false;
wakeup_rt = false;
return __wakeup_tracer_init(tr);
}
@ -695,8 +695,8 @@ static int wakeup_rt_tracer_init(struct trace_array *tr)
if (wakeup_busy)
return -EBUSY;
wakeup_dl = 0;
wakeup_rt = 1;
wakeup_dl = false;
wakeup_rt = true;
return __wakeup_tracer_init(tr);
}
@ -705,8 +705,8 @@ static int wakeup_dl_tracer_init(struct trace_array *tr)
if (wakeup_busy)
return -EBUSY;
wakeup_dl = 1;
wakeup_rt = 0;
wakeup_dl = true;
wakeup_rt = false;
return __wakeup_tracer_init(tr);
}

@ -273,7 +273,8 @@ static void tracepoint_update_call(struct tracepoint *tp, struct tracepoint_func
* Add the probe function to a tracepoint.
*/
static int tracepoint_add_func(struct tracepoint *tp,
struct tracepoint_func *func, int prio)
struct tracepoint_func *func, int prio,
bool warn)
{
struct tracepoint_func *old, *tp_funcs;
int ret;
@ -288,7 +289,7 @@ static int tracepoint_add_func(struct tracepoint *tp,
lockdep_is_held(&tracepoints_mutex));
old = func_add(&tp_funcs, func, prio);
if (IS_ERR(old)) {
WARN_ON_ONCE(PTR_ERR(old) != -ENOMEM);
WARN_ON_ONCE(warn && PTR_ERR(old) != -ENOMEM);
return PTR_ERR(old);
}
@ -343,6 +344,32 @@ static int tracepoint_remove_func(struct tracepoint *tp,
return 0;
}
/**
* tracepoint_probe_register_prio_may_exist - Connect a probe to a tracepoint with priority
* @tp: tracepoint
* @probe: probe handler
* @data: tracepoint data
* @prio: priority of this function over other registered functions
*
* Same as tracepoint_probe_register_prio() except that it will not warn
* if the tracepoint is already registered.
*/
int tracepoint_probe_register_prio_may_exist(struct tracepoint *tp, void *probe,
void *data, int prio)
{
struct tracepoint_func tp_func;
int ret;
mutex_lock(&tracepoints_mutex);
tp_func.func = probe;
tp_func.data = data;
tp_func.prio = prio;
ret = tracepoint_add_func(tp, &tp_func, prio, false);
mutex_unlock(&tracepoints_mutex);
return ret;
}
EXPORT_SYMBOL_GPL(tracepoint_probe_register_prio_may_exist);
/**
* tracepoint_probe_register_prio - Connect a probe to a tracepoint with priority
* @tp: tracepoint
@ -366,7 +393,7 @@ int tracepoint_probe_register_prio(struct tracepoint *tp, void *probe,
tp_func.func = probe;
tp_func.data = data;
tp_func.prio = prio;
ret = tracepoint_add_func(tp, &tp_func, prio);
ret = tracepoint_add_func(tp, &tp_func, prio, true);
mutex_unlock(&tracepoints_mutex);
return ret;
}
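
The new `warn` argument lets one caller treat "already registered" as an ordinary error while every other caller still flags it as a bug. A rough userspace sketch of that pattern follows. `struct probe_table` and `probe_add()` are hypothetical, the `-EEXIST` return for duplicates is an assumption about the underlying add routine, and a counter stands in for `WARN_ON_ONCE()`.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_PROBES 4

/* Hypothetical probe table; warnings counts would-be WARN*() hits. */
struct probe_table {
	void *probes[MAX_PROBES];
	size_t nr;
	unsigned int warnings;
};

/* Add a probe; duplicates return -EEXIST and only warn when asked to. */
static int probe_add(struct probe_table *t, void *probe, bool warn)
{
	int err = 0;
	size_t i;

	assert(t && probe);
	for (i = 0; i < t->nr; i++)
		if (t->probes[i] == probe)
			err = -EEXIST;
	if (!err && t->nr == MAX_PROBES)
		err = -ENOMEM;
	if (err) {
		/* Running out of space is never treated as a caller bug. */
		if (warn && err != -ENOMEM)
			t->warnings++;
		return err;
	}
	t->probes[t->nr++] = probe;
	return 0;
}
```

The error code returned to the caller is the same in both modes; only the diagnostic differs.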

@ -156,7 +156,7 @@ xbc_node_find_child(struct xbc_node *parent, const char *key)
struct xbc_node *node;
if (parent)
node = xbc_node_get_child(parent);
node = xbc_node_get_subkey(parent);
else
node = xbc_root_node();
@ -164,7 +164,7 @@ xbc_node_find_child(struct xbc_node *parent, const char *key)
if (!xbc_node_match_prefix(node, &key))
node = xbc_node_get_next(node);
else if (*key != '\0')
node = xbc_node_get_child(node);
node = xbc_node_get_subkey(node);
else
break;
}
@ -274,6 +274,8 @@ int __init xbc_node_compose_key_after(struct xbc_node *root,
struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
struct xbc_node *node)
{
struct xbc_node *next;
if (unlikely(!xbc_data))
return NULL;
@ -282,6 +284,13 @@ struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
if (!node)
node = xbc_nodes;
} else {
/* Leaf node may have a subkey */
next = xbc_node_get_subkey(node);
if (next) {
node = next;
goto found;
}
if (node == root) /* @root was a leaf, no child node. */
return NULL;
@ -296,6 +305,7 @@ struct xbc_node * __init xbc_node_find_next_leaf(struct xbc_node *root,
node = xbc_node_get_next(node);
}
found:
while (node && !xbc_node_is_leaf(node))
node = xbc_node_get_child(node);
@ -367,18 +377,28 @@ static inline __init struct xbc_node *xbc_last_sibling(struct xbc_node *node)
return node;
}
static struct xbc_node * __init xbc_add_sibling(char *data, u32 flag)
static inline __init struct xbc_node *xbc_last_child(struct xbc_node *node)
{
while (node->child)
node = xbc_node_get_child(node);
return node;
}
static struct xbc_node * __init __xbc_add_sibling(char *data, u32 flag, bool head)
{
struct xbc_node *sib, *node = xbc_add_node(data, flag);
if (node) {
if (!last_parent) {
/* Ignore @head in this case */
node->parent = XBC_NODE_MAX;
sib = xbc_last_sibling(xbc_nodes);
sib->next = xbc_node_index(node);
} else {
node->parent = xbc_node_index(last_parent);
if (!last_parent->child) {
if (!last_parent->child || head) {
node->next = last_parent->child;
last_parent->child = xbc_node_index(node);
} else {
sib = xbc_node_get_child(last_parent);
@ -392,6 +412,16 @@ static struct xbc_node * __init xbc_add_sibling(char *data, u32 flag)
return node;
}
static inline struct xbc_node * __init xbc_add_sibling(char *data, u32 flag)
{
return __xbc_add_sibling(data, flag, false);
}
static inline struct xbc_node * __init xbc_add_head_sibling(char *data, u32 flag)
{
return __xbc_add_sibling(data, flag, true);
}
static inline __init struct xbc_node *xbc_add_child(char *data, u32 flag)
{
struct xbc_node *node = xbc_add_sibling(data, flag);
@ -517,17 +547,20 @@ static int __init xbc_parse_array(char **__v)
char *next;
int c = 0;
if (last_parent->child)
last_parent = xbc_node_get_child(last_parent);
do {
c = __xbc_parse_value(__v, &next);
if (c < 0)
return c;
node = xbc_add_sibling(*__v, XBC_VALUE);
node = xbc_add_child(*__v, XBC_VALUE);
if (!node)
return -ENOMEM;
*__v = next;
} while (c == ',');
node->next = 0;
node->child = 0;
return c;
}
@ -557,8 +590,9 @@ static int __init __xbc_add_key(char *k)
node = find_match_node(xbc_nodes, k);
else {
child = xbc_node_get_child(last_parent);
/* Since the value node is the first child, skip it. */
if (child && xbc_node_is_value(child))
return xbc_parse_error("Subkey is mixed with value", k);
child = xbc_node_get_next(child);
node = find_match_node(child, k);
}
@ -601,23 +635,29 @@ static int __init xbc_parse_kv(char **k, char *v, int op)
if (ret)
return ret;
child = xbc_node_get_child(last_parent);
if (child) {
if (xbc_node_is_key(child))
return xbc_parse_error("Value is mixed with subkey", v);
else if (op == '=')
return xbc_parse_error("Value is redefined", v);
}
c = __xbc_parse_value(&v, &next);
if (c < 0)
return c;
if (op == ':' && child) {
xbc_init_node(child, v, XBC_VALUE);
} else if (!xbc_add_sibling(v, XBC_VALUE))
child = xbc_node_get_child(last_parent);
if (child && xbc_node_is_value(child)) {
if (op == '=')
return xbc_parse_error("Value is redefined", v);
if (op == ':') {
unsigned short nidx = child->next;
xbc_init_node(child, v, XBC_VALUE);
child->next = nidx; /* keep subkeys */
goto array;
}
/* op must be '+' */
last_parent = xbc_last_child(child);
}
/* The value node should always be the first child */
if (!xbc_add_head_sibling(v, XBC_VALUE))
return -ENOMEM;
array:
if (c == ',') { /* Array */
c = xbc_parse_array(&next);
if (c < 0)
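
The parser changes above rely on one invariant: when a key has both a value and subkeys, the value node is always the first child. A simplified sketch of head insertion that preserves this invariant is shown below. `struct cfg_node` and `cfg_add_child()` are hypothetical, and the sketch links nodes by pointer, whereas the parser itself stores node indexes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical simplified node; the real parser links nodes by index. */
struct cfg_node {
	const char *data;
	bool is_value;
	struct cfg_node *next;	/* next sibling */
	struct cfg_node *child;	/* first child */
};

/* Link node under parent, at the head of the child list or at its tail. */
static void cfg_add_child(struct cfg_node *parent, struct cfg_node *node,
			  bool head)
{
	struct cfg_node *sib;

	assert(parent && node);
	if (!parent->child || head) {
		/* Head insert: existing children become later siblings. */
		node->next = parent->child;
		parent->child = node;
		return;
	}
	for (sib = parent->child; sib->next; sib = sib->next)
		;
	node->next = NULL;
	sib->next = node;
}
```

With the value pinned to the head, a lookup for subkeys only needs to skip the first child when it is a value node.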

@ -229,8 +229,10 @@ int seq_buf_putmem_hex(struct seq_buf *s, const void *mem,
WARN_ON(s->size == 0);
BUILD_BUG_ON(MAX_MEMHEX_BYTES * 2 >= HEX_CHARS);
while (len) {
start_len = min(len, HEX_CHARS - 1);
start_len = min(len, MAX_MEMHEX_BYTES);
#ifdef __BIG_ENDIAN
for (i = 0, j = 0; i < start_len; i++) {
#else
@ -243,12 +245,14 @@ int seq_buf_putmem_hex(struct seq_buf *s, const void *mem,
break;
/* j increments twice per loop */
len -= j / 2;
hex[j++] = ' ';
seq_buf_putmem(s, hex, j);
if (seq_buf_has_overflowed(s))
return -1;
len -= start_len;
data += start_len;
}
return 0;
}
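
The fix above bounds each chunk by a byte count and advances the input by the number of bytes consumed rather than by the number of hex characters written. A standalone sketch of that loop follows. `hex_chunks()` is a hypothetical helper, the chunk size of 8 bytes is an assumption, and bytes are emitted in memory order only, so the endian-specific ordering of the original is not reproduced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CHUNK_BYTES 8	/* assumed chunk size for this sketch */

/*
 * Write len bytes of mem as hex into out, with a space after each chunk.
 * Returns 0 on success, or -1 if out is too small.
 */
static int hex_chunks(char *out, size_t out_size,
		      const unsigned char *mem, size_t len)
{
	static const char digits[] = "0123456789abcdef";
	size_t pos = 0, chunk, i;

	assert(out && mem);
	while (len) {
		chunk = len < CHUNK_BYTES ? len : CHUNK_BYTES;
		/* Room for the hex digits, the space and a final NUL. */
		if (pos + chunk * 2 + 2 > out_size)
			return -1;
		for (i = 0; i < chunk; i++) {
			out[pos++] = digits[mem[i] >> 4];
			out[pos++] = digits[mem[i] & 0x0f];
		}
		out[pos++] = ' ';
		/* Advance by the bytes consumed, not by characters written. */
		len -= chunk;
		mem += chunk;
	}
	if (pos >= out_size)
		return -1;
	out[pos] = '\0';
	return 0;
}
```

Keeping the byte count and the character count as separate variables makes the loop bookkeeping easy to check.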

@ -27,7 +27,7 @@ static int xbc_show_value(struct xbc_node *node, bool semicolon)
q = '\'';
else
q = '"';
printf("%c%s%c%s", q, val, q, node->next ? ", " : eol);
printf("%c%s%c%s", q, val, q, xbc_node_is_array(node) ? ", " : eol);
i++;
}
return i;
@ -35,30 +35,55 @@ static int xbc_show_value(struct xbc_node *node, bool semicolon)
static void xbc_show_compact_tree(void)
{
struct xbc_node *node, *cnode;
struct xbc_node *node, *cnode = NULL, *vnode;
int depth = 0, i;
node = xbc_root_node();
while (node && xbc_node_is_key(node)) {
for (i = 0; i < depth; i++)
printf("\t");
cnode = xbc_node_get_child(node);
if (!cnode)
cnode = xbc_node_get_child(node);
while (cnode && xbc_node_is_key(cnode) && !cnode->next) {
vnode = xbc_node_get_child(cnode);
/*
* If @cnode has value and subkeys, this
* should show it as below.
*
* key(@node) {
* key(@cnode) = value;
* key(@cnode) {
* subkeys;
* }
* }
*/
if (vnode && xbc_node_is_value(vnode) && vnode->next)
break;
printf("%s.", xbc_node_get_data(node));
node = cnode;
cnode = xbc_node_get_child(node);
cnode = vnode;
}
if (cnode && xbc_node_is_key(cnode)) {
printf("%s {\n", xbc_node_get_data(node));
depth++;
node = cnode;
cnode = NULL;
continue;
} else if (cnode && xbc_node_is_value(cnode)) {
printf("%s = ", xbc_node_get_data(node));
xbc_show_value(cnode, true);
/*
* If @node has value and subkeys, continue
* looping on subkeys with same node.
*/
if (cnode->next) {
cnode = xbc_node_get_next(cnode);
continue;
}
} else {
printf("%s;\n", xbc_node_get_data(node));
}
cnode = NULL;
if (node->next) {
node = xbc_node_get_next(node);
@ -70,10 +95,12 @@ static void xbc_show_compact_tree(void)
return;
if (!xbc_node_get_child(node)->next)
continue;
depth--;
for (i = 0; i < depth; i++)
printf("\t");
printf("}\n");
if (depth) {
depth--;
for (i = 0; i < depth; i++)
printf("\t");
printf("}\n");
}
}
node = xbc_node_get_next(node);
}
@ -84,12 +111,12 @@ static void xbc_show_list(void)
char key[XBC_KEYLEN_MAX];
struct xbc_node *leaf;
const char *val;
int ret = 0;
xbc_for_each_key_value(leaf, val) {
ret = xbc_node_compose_key(leaf, key, XBC_KEYLEN_MAX);
if (ret < 0)
if (xbc_node_compose_key(leaf, key, XBC_KEYLEN_MAX) < 0) {
fprintf(stderr, "Failed to compose key %d\n", ret);
break;
}
printf("%s = ", key);
if (!val || val[0] == '\0') {
printf("\"\"\n");
@ -99,17 +126,6 @@ static void xbc_show_list(void)
}
}
/* Simple real checksum */
static int checksum(unsigned char *buf, int len)
{
int i, sum = 0;
for (i = 0; i < len; i++)
sum += buf[i];
return sum;
}
#define PAGE_SIZE 4096
static int load_xbc_fd(int fd, char **buf, int size)
@ -205,7 +221,7 @@ static int load_xbc_from_initrd(int fd, char **buf)
return ret;
/* Wrong Checksum */
rcsum = checksum((unsigned char *)*buf, size);
rcsum = xbc_calc_checksum(*buf, size);
if (csum != rcsum) {
pr_err("checksum error: %d != %d\n", csum, rcsum);
return -EINVAL;
@ -354,7 +370,7 @@ static int apply_xbc(const char *path, const char *xbc_path)
return ret;
}
size = strlen(buf) + 1;
csum = checksum((unsigned char *)buf, size);
csum = xbc_calc_checksum(buf, size);
/* Backup the bootconfig data */
data = calloc(size + BOOTCONFIG_ALIGN +

@ -1,3 +0,0 @@
key.subkey = value
# We can not override pre-defined subkeys with value
key := value

@ -1,3 +0,0 @@
key = value
# We can not override pre-defined value with subkey
key.subkey := value

@ -0,0 +1,4 @@
key = foo
keyx.subkey = value
key += bar

@ -0,0 +1,6 @@
# mixed key and subkeys with braces
key = value
key {
subkey1
subkey2 = foo
}

@ -0,0 +1,4 @@
key.foo = bar
key = value
# mixed key value can be overridden
key := value2

@ -0,0 +1,49 @@
ftrace.event {
task.task_newtask {
filter = "pid < 128"
enable
}
kprobes.vfs_read {
probes = "vfs_read $arg1 $arg2"
filter = "common_pid < 200"
enable
}
synthetic.initcall_latency {
fields = "unsigned long func", "u64 lat"
actions = "hist:keys=func.sym,lat:vals=lat:sort=lat"
}
initcall.initcall_start {
actions = "hist:keys=func:ts0=common_timestamp.usecs"
}
initcall.initcall_finish {
actions = "hist:keys=func:lat=common_timestamp.usecs-$ts0:onmatch(initcall.initcall_start).initcall_latency(func,$lat)"
}
}
ftrace.instance {
foo {
tracer = "function"
ftrace.filters = "user_*"
cpumask = 1
options = nosym-addr
buffer_size = 512KB
trace_clock = mono
event.signal.signal_deliver.actions=snapshot
}
bar {
tracer = "function"
ftrace.filters = "kernel_*"
cpumask = 2
trace_clock = x86-tsc
}
}
ftrace.alloc_snapshot
kernel {
trace_options = sym-addr
trace_event = "initcall:*"
trace_buf_size = 1M
ftrace = function
ftrace_filter = "vfs*"
}

@ -0,0 +1 @@
CONFIG_CMDLINE="bootconfig"

@ -0,0 +1,15 @@
ftrace {
tracing_on = 0 # off by default
tracer = function_graph
event.kprobes {
start_event {
probes = "pci_proc_init"
actions = "traceon"
}
end_event {
probes = "pci_proc_init%return"
actions = "traceoff"
}
}
}

@ -0,0 +1,33 @@
ftrace {
tracer = function_graph;
options = event-fork, sym-addr, stacktrace;
buffer_size = 1M;
alloc_snapshot;
trace_clock = global;
events = "task:task_newtask", "initcall:*";
event.sched.sched_process_exec {
filter = "pid < 128";
}
instance.bar {
event.kprobes {
myevent {
probes = "vfs_read $arg2 $arg3";
}
myevent2 {
probes = "vfs_write $arg2 +0($arg2):ustring $arg3";
}
myevent3 {
probes = "initrd_load";
}
enable
}
}
instance.foo {
tracer = function;
tracing_on = false;
};
}
kernel {
ftrace_dump_on_oops = "orig_cpu"
traceoff_on_warning
}

@ -0,0 +1,84 @@
#!/bin/sh
cd /sys/kernel/tracing
compare_file() {
file="$1"
val="$2"
content=`cat $file`
if [ "$content" != "$val" ]; then
echo "FAILED: $file has '$content', expected '$val'"
exit 1
fi
}
compare_file_partial() {
file="$1"
val="$2"
content=`cat $file | sed -ne "/^$val/p"`
if [ -z "$content" ]; then
echo "FAILED: $file does not contain '$val'"
cat $file
exit 1
fi
}
file_contains() {
file=$1
val="$2"
if ! grep -q "$val" $file ; then
echo "FAILED: $file does not contain $val"
cat $file
exit 1
fi
}
compare_mask() {
file=$1
val="$2"
content=`cat $file | sed -ne "/^[0 ]*$val/p"`
if [ -z "$content" ]; then
echo "FAILED: $file does not have mask '$val'"
cat $file
exit 1
fi
}
compare_file "events/task/task_newtask/filter" "pid < 128"
compare_file "events/task/task_newtask/enable" "1"
compare_file "events/kprobes/vfs_read/filter" "common_pid < 200"
compare_file "events/kprobes/vfs_read/enable" "1"
compare_file_partial "events/synthetic/initcall_latency/trigger" "hist:keys=func.sym,lat:vals=hitcount,lat:sort=lat"
compare_file_partial "events/synthetic/initcall_latency/enable" "0"
compare_file_partial "events/initcall/initcall_start/trigger" "hist:keys=func:vals=hitcount:ts0=common_timestamp.usecs"
compare_file_partial "events/initcall/initcall_start/enable" "1"
compare_file_partial "events/initcall/initcall_finish/trigger" 'hist:keys=func:vals=hitcount:lat=common_timestamp.usecs-\$ts0:sort=hitcount:size=2048:clock=global:onmatch(initcall.initcall_start).initcall_latency(func,\$lat)'
compare_file_partial "events/initcall/initcall_finish/enable" "1"
compare_file "instances/foo/current_tracer" "function"
file_contains "instances/foo/set_ftrace_filter" "^user"
compare_file "instances/foo/buffer_size_kb" "512"
compare_mask "instances/foo/tracing_cpumask" "1"
compare_file "instances/foo/options/sym-addr" "0"
file_contains "instances/foo/trace_clock" '\[mono\]'
compare_file_partial "instances/foo/events/signal/signal_deliver/trigger" "snapshot"
compare_file "instances/bar/current_tracer" "function"
file_contains "instances/bar/set_ftrace_filter" "^kernel"
compare_mask "instances/bar/tracing_cpumask" "2"
file_contains "instances/bar/trace_clock" '\[x86-tsc\]'
file_contains "snapshot" "Snapshot is allocated"
compare_file "options/sym-addr" "1"
compare_file "events/initcall/enable" "1"
compare_file "buffer_size_kb" "1027"
compare_file "current_tracer" "function"
file_contains "set_ftrace_filter" '^vfs'
exit 0

@ -0,0 +1,61 @@
#!/bin/sh
cd /sys/kernel/tracing
compare_file() {
file="$1"
val="$2"
content=`cat $file`
if [ "$content" != "$val" ]; then
echo "FAILED: $file has '$content', expected '$val'"
exit 1
fi
}
compare_file_partial() {
file="$1"
val="$2"
content=`cat $file | sed -ne "/^$val/p"`
if [ -z "$content" ]; then
echo "FAILED: $file does not contain '$val'"
cat $file
exit 1
fi
}
file_contains() {
file=$1
val="$2"
if ! grep -q "$val" $file ; then
echo "FAILED: $file does not contain $val"
cat $file
exit 1
fi
}
compare_mask() {
file=$1
val="$2"
content=`cat $file | sed -ne "/^[0 ]*$val/p"`
if [ -z "$content" ]; then
echo "FAILED: $file does not have mask '$val'"
cat $file
exit 1
fi
}
compare_file "tracing_on" "0"
compare_file "current_tracer" "function_graph"
compare_file_partial "events/kprobes/start_event/enable" "1"
compare_file_partial "events/kprobes/start_event/trigger" "traceon"
file_contains "kprobe_events" 'start_event.*pci_proc_init'
compare_file_partial "events/kprobes/end_event/enable" "1"
compare_file_partial "events/kprobes/end_event/trigger" "traceoff"
file_contains "kprobe_events" '^r.*end_event.*pci_proc_init'
exit 0

@ -0,0 +1,72 @@
#!/bin/sh
cd /sys/kernel/tracing
compare_file() {
file="$1"
val="$2"
content=`cat $file`
if [ "$content" != "$val" ]; then
echo "FAILED: $file has '$content', expected '$val'"
exit 1
fi
}
compare_file_partial() {
file="$1"
val="$2"
content=`cat $file | sed -ne "/^$val/p"`
if [ -z "$content" ]; then
echo "FAILED: $file does not contain '$val'"
cat $file
exit 1
fi
}
file_contains() {
file=$1
val="$2"
if ! grep -q "$val" $file ; then
echo "FAILED: $file does not contain $val"
cat $file
exit 1
fi
}
compare_mask() {
file=$1
val="$2"
content=`cat $file | sed -ne "/^[0 ]*$val/p"`
if [ -z "$content" ]; then
echo "FAILED: $file does not have mask '$val'"
cat $file
exit 1
fi
}
compare_file "current_tracer" "function_graph"
compare_file "options/event-fork" "1"
compare_file "options/sym-addr" "1"
compare_file "options/stacktrace" "1"
compare_file "buffer_size_kb" "1024"
file_contains "snapshot" "Snapshot is allocated"
file_contains "trace_clock" '\[global\]'
compare_file "events/initcall/enable" "1"
compare_file "events/task/task_newtask/enable" "1"
compare_file "events/sched/sched_process_exec/filter" "pid < 128"
compare_file "events/kprobes/enable" "1"
compare_file "instances/bar/events/kprobes/myevent/enable" "1"
compare_file "instances/bar/events/kprobes/myevent2/enable" "1"
compare_file "instances/bar/events/kprobes/myevent3/enable" "1"
compare_file "instances/foo/current_tracer" "function"
compare_file "instances/foo/tracing_on" "0"
compare_file "/proc/sys/kernel/ftrace_dump_on_oops" "2"
compare_file "/proc/sys/kernel/traceoff_on_warning" "1"
exit 0

@ -0,0 +1,69 @@
# bootconfig.conf
#
# Tests to test some bootconfig scripts
# List where on the target machine the initrd is used
INITRD := /boot/initramfs-test.img
# Install bootconfig on the target machine and define the path here.
BOOTCONFIG := /usr/bin/bootconfig
# Currently we just build the .config in the BUILD_DIR
BUILD_TYPE := oldconfig
# Helper macro to run bootconfig on the target
# SSH is defined in include/defaults.conf
ADD_BOOTCONFIG := ${SSH} "${BOOTCONFIG} -d ${INITRD} && ${BOOTCONFIG} -a /tmp/${BOOTCONFIG_FILE} ${INITRD}"
# This copies a bootconfig script to the target and then will
# add it to the initrd. SSH_USER is defined in include/defaults.conf
# and MACHINE is defined in the example configs.
BOOTCONFIG_TEST_PREP = scp ${BOOTCONFIG_PATH}${BOOTCONFIG_FILE} ${SSH_USER}@${MACHINE}:/tmp && ${ADD_BOOTCONFIG}
# When a test is complete, remove the bootconfig from the initrd.
CLEAR_BOOTCONFIG := ${SSH} "${BOOTCONFIG} -d ${INITRD}"
# Run a verifier on the target after it has booted, to make sure that the
# bootconfig script did what it was expected to do
DO_TEST = scp ${BOOTCONFIG_PATH}${BOOTCONFIG_VERIFY} ${SSH_USER}@${MACHINE}:/tmp && ${SSH} /tmp/${BOOTCONFIG_VERIFY}
# Comment this out to not run the boot configs
RUN_BOOTCONFIG := 1
TEST_START IF DEFINED RUN_BOOTCONFIG
TEST_TYPE = test
TEST_NAME = bootconfig boottrace
# Just testing the bootconfig on initrd, no need to build the kernel
BUILD_TYPE = nobuild
BOOTCONFIG_FILE = boottrace.bconf
BOOTCONFIG_VERIFY = verify-boottrace.sh
ADD_CONFIG = ${ADD_CONFIG} ${BOOTCONFIG_PATH}/config-bootconfig
PRE_TEST = ${BOOTCONFIG_TEST_PREP}
PRE_TEST_DIE = 1
TEST = ${DO_TEST}
POST_TEST = ${CLEAR_BOOTCONFIG}
TEST_START IF DEFINED RUN_BOOTCONFIG
TEST_TYPE = test
TEST_NAME = bootconfig function graph
BUILD_TYPE = nobuild
BOOTCONFIG_FILE = functiongraph.bconf
BOOTCONFIG_VERIFY = verify-functiongraph.sh
ADD_CONFIG = ${ADD_CONFIG} ${BOOTCONFIG_PATH}/config-bootconfig
PRE_TEST = ${BOOTCONFIG_TEST_PREP}
PRE_TEST_DIE = 1
TEST = ${DO_TEST}
POST_TEST = ${CLEAR_BOOTCONFIG}
TEST_START IF DEFINED RUN_BOOTCONFIG
TEST_TYPE = test
TEST_NAME = bootconfig tracing
BUILD_TYPE = nobuild
BOOTCONFIG_FILE = tracing.bconf
BOOTCONFIG_VERIFY = verify-tracing.sh
ADD_CONFIG = ${ADD_CONFIG} ${BOOTCONFIG_PATH}/config-bootconfig
PRE_TEST = ${BOOTCONFIG_TEST_PREP}
PRE_TEST_DIE = 1
TEST = ${DO_TEST}
POST_TEST = ${CLEAR_BOOTCONFIG}

@ -90,3 +90,4 @@ INCLUDE include/patchcheck.conf
INCLUDE include/tests.conf
INCLUDE include/bisect.conf
INCLUDE include/min-config.conf
INCLUDE include/bootconfig.conf