linux/kernel
Steven Rostedt (Red Hat) 03274a3ffb tracing/fgraph: Adjust fgraph depth before calling trace return callback
While debugging the virtual cputime with the function graph tracer
with a max_depth of 1 (most common use of the max_depth so far),
I found that I was missing kernel execution because of a race condition.

The code for the return side of the function has a slight race:

	ftrace_pop_return_trace(&trace, &ret, frame_pointer);
	trace.rettime = trace_clock_local();
	ftrace_graph_return(&trace);
	barrier();
	current->curr_ret_stack--;

The ftrace_pop_return_trace() initializes the trace structure for
the callback. The ftrace_graph_return() uses the trace structure
for its own use as that structure is on the stack and is local
to this function. Then the curr_ret_stack is decremented which
is what the trace.depth is set to.

If an interrupt comes in after the ftrace_graph_return() but
before the curr_ret_stack, then the called function will get
a depth of 2. If max_depth is set to 1 this function will be
ignored.

The problem is that the trace has already been called, and the
timestamp for that trace will not reflect the time the function
was about to re-enter userspace. Calls to the interrupt will not
be traced because the max_depth has prevented this.

To solve this issue, the ftrace_graph_return() can safely be
moved after the current->curr_ret_stack has been updated.
This way the timestamp for the return callback will reflect
the actual time.

If an interrupt comes in after the curr_ret_stack update and
ftrace_graph_return(), it will be traced. It may look a little
confusing to see it within the other function, but at least
it will not be lost.

Cc: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
2013-01-29 17:30:31 -05:00
..
debug module: add new state MODULE_STATE_UNFORMED. 2013-01-12 13:27:05 +10:30
events Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
gcov
irq irq: tsk->comm is an array 2012-12-18 15:02:11 -08:00
power Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2012-12-12 08:18:24 -08:00
sched wake_up_process() should be never used to wakeup a TASK_STOPPED/TRACED task 2013-01-22 10:08:17 -08:00
time Merge tag 'kvm-3.8-1' of git://git.kernel.org/pub/scm/virt/kvm/kvm 2012-12-13 15:31:08 -08:00
trace tracing/fgraph: Adjust fgraph depth before calling trace return callback 2013-01-29 17:30:31 -05:00
.gitignore
acct.c vfs: make path_openat take a struct filename pointer 2012-10-12 20:15:09 -04:00
async.c async: fix __lowest_in_progress() 2013-01-22 16:21:24 -08:00
audit.c kernel/audit.c: avoid negative sleep durations 2013-01-11 14:54:56 -08:00
audit.h audit: optimize audit_compare_dname_path 2012-10-12 00:32:02 -04:00
audit_tree.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
audit_watch.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
auditfilter.c audit: fix auditfilter.c kernel-doc warnings 2013-01-10 14:35:23 -08:00
auditsc.c audit: catch possible NULL audit buffers 2013-01-11 14:54:55 -08:00
backtracetest.c
bounds.c
capability.c userns: Teach inode_capable to understand inodes whose uids map to other namespaces. 2012-05-15 14:59:24 -07:00
cgroup.c Merge branch 'akpm' (Andrew's patch-bomb) 2012-12-17 20:58:12 -08:00
cgroup_freezer.c cgroup: rename ->create/post_create/pre_destroy/destroy() to ->css_alloc/online/offline/free() 2012-11-19 08:13:38 -08:00
compat.c x32: fix sigtimedwait 2012-12-26 01:15:03 -05:00
configs.c
context_tracking.c context_tracking: New context tracking susbsystem 2012-11-30 11:40:07 -08:00
cpu.c Merge branch 'x86-bsp-hotplug-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-11 19:56:33 -08:00
cpu_pm.c kernel/cpu_pm.c: fix various typos 2012-05-31 17:49:27 -07:00
cpuset.c cpuset: use N_MEMORY instead N_HIGH_MEMORY 2012-12-12 17:38:32 -08:00
crash_dump.c
cred.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-18 10:55:28 -08:00
delayacct.c
dma.c
elfcore.c
exec_domain.c
exit.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2012-12-17 15:44:47 -08:00
extable.c extable: Skip sorting if sorted at build time. 2012-04-19 15:06:55 -07:00
fork.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/signal 2013-01-20 13:58:48 -08:00
freezer.c freezer: change ptrace_stop/do_signal_stop to use freezable_schedule() 2012-10-26 14:27:49 -07:00
futex.c futex: avoid wake_futex() for a PI futex_q 2012-11-26 17:41:24 -08:00
futex_compat.c
groups.c userns: Convert in_group_p and in_egroup_p to use kgid_t 2012-05-03 03:29:33 -07:00
hrtimer.c hrtimer: Update hrtimer base offsets each hrtimer_interrupt 2012-07-11 23:34:39 +02:00
hung_task.c hung task debugging: Inject NMI when hung and going to panic 2012-04-25 12:39:25 +02:00
irq_work.c irq_work: fix compile failure on tile from missing include 2012-04-13 13:15:16 -04:00
itimer.c itimer: Use printk_once instead of WARN_ONCE 2012-04-10 11:00:30 +02:00
jump_label.c jump_label: Export jump_label_rate_limit() 2012-08-06 19:00:35 +03:00
kallsyms.c vsprintf: fix %ps on non symbols when using kallsyms 2012-05-29 16:22:32 -07:00
kcmp.c kcmp: include linux/ptrace.h 2012-12-20 17:40:19 -08:00
Kconfig.freezer
Kconfig.hz
Kconfig.locks locking: Adjust spin lock inlining Kconfig options 2012-09-13 17:56:13 +02:00
Kconfig.preempt
kexec.c kdump: remove unneeded include 2012-10-06 03:05:19 +09:00
kfifo.c [media] kernel:kfifo: export __kfifo_max_r symbol 2012-04-11 18:24:37 -03:00
kmod.c Bury the conditionals from kernel_thread/kernel_execve series 2012-12-19 18:07:38 -05:00
kprobes.c kprobes/x86: Move ftrace-based kprobe code into kprobes-ftrace.c 2013-01-21 13:22:36 -05:00
ksysfs.c Merge branch 'core-rcu-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip 2012-12-11 18:10:49 -08:00
kthread.c kthread: use N_MEMORY instead N_HIGH_MEMORY 2012-12-12 17:38:33 -08:00
latencytop.c
lglock.c brlocks/lglocks: turn into functions 2012-05-29 23:28:41 -04:00
lockdep.c lockdep: Check if nested lock is actually held 2012-09-13 17:00:44 +02:00
lockdep_internals.h
lockdep_proc.c lockdep: Use KSYM_NAME_LEN'ed buffer for __get_key_name() 2012-10-24 12:39:09 +02:00
lockdep_states.h
Makefile Nothing all that exciting; a new module-from-fd syscall for those who want 2012-12-19 07:55:08 -08:00
modsign_certificate.S MODSIGN: Avoid using .incbin in C source 2012-12-14 13:06:44 +10:30
modsign_pubkey.c keys: use keyring_alloc() to create module signing keyring 2012-12-20 17:40:21 -08:00
module-internal.h MODSIGN: Move the magic string to the end of a module and eliminate the search 2012-10-19 17:30:40 -07:00
module.c module: fix missing module_mutex unlock 2013-01-20 20:22:58 -08:00
module_signing.c MODSIGN: Don't use enum-type bitfields in module signature info block 2012-12-05 11:27:24 +10:30
mutex-debug.c
mutex-debug.h
mutex.c
mutex.h
notifier.c
nsproxy.c userns: Implement unshare of the user namespace 2012-11-20 04:18:14 -08:00
padata.c padata: use __this_cpu_read per-cpu helper 2012-12-06 17:16:23 +08:00
panic.c panic: fix a possible deadlock in panic() 2012-07-30 17:25:13 -07:00
params.c params: replace printk(KERN_<LVL>...) with pr_<lvl>(...) 2012-05-04 17:28:18 -07:00
pid.c pidns: Stop pid allocation when init dies 2012-12-25 16:10:05 -08:00
pid_namespace.c pidns: Stop pid allocation when init dies 2012-12-25 16:10:05 -08:00
posix-cpu-timers.c A few /dev/random improvements for the v3.8 merge window. 2012-12-19 20:23:37 -08:00
posix-timers.c
printk.c printk: fix incorrect length from print_time() when seconds > 99999 2013-01-04 16:11:48 -08:00
profile.c profiling: Remove unused timer hook 2013-01-24 15:37:26 +01:00
ptrace.c ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL 2013-01-22 10:08:00 -08:00
range.c
rcu.h rcu: Add a module parameter to force use of expedited RCU primitives 2012-10-23 14:54:08 -07:00
rcupdate.c rcu: Add a module parameter to force use of expedited RCU primitives 2012-10-23 14:54:08 -07:00
rcutiny.c rcu: Fix TINY_RCU rcu_is_cpu_rrupt_from_idle check 2012-11-13 14:08:34 -08:00
rcutiny_plugin.h rcu: Add a module parameter to force use of expedited RCU primitives 2012-10-23 14:54:08 -07:00
rcutorture.c Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD 2012-11-16 09:59:58 -08:00
rcutree.c context_tracking: New context tracking susbsystem 2012-11-30 11:40:07 -08:00
rcutree.h rcu: Separate accounting of callbacks from callback-free CPUs 2012-11-16 10:05:57 -08:00
rcutree_plugin.h rcu: Separate accounting of callbacks from callback-free CPUs 2012-11-16 10:05:57 -08:00
rcutree_trace.c rcu: Separate accounting of callbacks from callback-free CPUs 2012-11-16 10:05:57 -08:00
relay.c splice: fix racy pipe->buffers uses 2012-06-13 21:16:42 +02:00
res_counter.c res_counter: return amount of charges after res_counter_uncharge() 2012-12-18 15:02:12 -08:00
resource.c kernel/resource.c: fix stack overflow in __reserve_region_with_split() 2012-10-06 03:05:31 +09:00
rtmutex-debug.c
rtmutex-debug.h
rtmutex-tester.c
rtmutex.c
rtmutex.h
rtmutex_common.h
rwsem.c lockdep, rwsem: provide down_write_nest_lock() 2013-01-11 14:54:55 -08:00
seccomp.c seccomp: Make syscall skipping and nr changes more consistent 2012-10-02 21:14:29 +10:00
semaphore.c semaphore: fix improper comment reference to mutex 2012-04-05 17:15:55 -07:00
signal.c ptrace: ensure arch_ptrace/ptrace_request can never race with SIGKILL 2013-01-22 10:08:00 -08:00
smp.c smp: Remove ipi_call_lock[_irq]()/ipi_call_unlock[_irq]() 2012-06-05 17:27:14 +02:00
smpboot.c hotplug: Fix UP bug in smpboot hotplug code 2012-08-13 17:01:07 +02:00
smpboot.h smpboot: Provide infrastructure for percpu hotplug threads 2012-08-13 17:01:07 +02:00
softirq.c cputime: Specialize irq vtime hooks 2012-10-29 21:31:32 +01:00
spinlock.c
srcu.c Merge branches 'urgent.2012.10.27a', 'doc.2012.11.16a', 'fixes.2012.11.13a', 'srcu.2012.10.27a', 'stall.2012.11.13a', 'tracing.2012.11.08a' and 'idle.2012.10.24a' into HEAD 2012-11-16 09:59:58 -08:00
stacktrace.c
stop_machine.c
sys.c cputime: Rename thread_group_times to thread_group_cputime_adjusted 2012-11-28 17:07:57 +01:00
sys_ni.c module: add syscall to load module from fd 2012-12-14 13:05:22 +10:30
sysctl.c Automatic NUMA Balancing V11 2012-12-16 15:18:08 -08:00
sysctl_binary.c pidns: Use task_active_pid_ns where appropriate 2012-11-19 05:59:09 -08:00
task_work.c task_work: task_work_add() should not succeed after exit_task_work() 2012-09-13 16:47:34 +02:00
taskstats.c taskstats: cgroupstats_user_cmd() may leak on error 2012-10-06 03:05:31 +09:00
test_kprobes.c
time.c time: Move update_vsyscall definitions to timekeeper_internal.h 2012-09-24 12:38:06 -04:00
timeconst.pl
timer.c timers: Fix endless looping between cascade() and internal_add_timer() 2012-10-09 21:27:14 +02:00
tracepoint.c
tsacct.c userns: Convert taskstats to handle the user and pid namespaces. 2012-09-18 01:01:32 -07:00
uid16.c userns: Convert setting and getting uid and gid system calls to use kuid and kgid 2012-05-03 03:28:41 -07:00
up.c
user-return-notifier.c
user.c proc: Usable inode numbers for the namespace file descriptors. 2012-11-20 04:19:49 -08:00
user_namespace.c userns: Fix typo in description of the limitation of userns_install 2012-12-14 18:36:36 -08:00
utsname.c userns: Require CAP_SYS_ADMIN for most uses of setns. 2012-12-14 16:12:03 -08:00
utsname_sysctl.c
wait.c propagate name change to comments in kernel source 2012-12-06 10:39:54 +01:00
watchdog.c watchdog: Fix disable/enable regression 2012-12-19 12:10:33 -08:00
workqueue.c Merge branch 'for-3.8' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/wq 2012-12-12 08:15:13 -08:00
workqueue_sched.h