linux/include
Peter Zijlstra (Intel) 86038c5ea8 perf: Avoid horrible stack usage
Both Linus (most recent) and Steve (a while ago) reported that perf
related callbacks have massive stack bloat.

The problem is that software events need a pt_regs in order to
properly report the event location and unwind stack. And because we
could not assume one was present we allocated one on stack and filled
it with minimal bits required for operation.

Now, pt_regs is quite large, so this is undesirable. Furthermore it
turns out that most sites actually have a pt_regs pointer available,
making this even more onerous, as the stack space is pointless waste.

This patch addresses the problem by observing that software events
have well defined nesting semantics, therefore we can use static
per-cpu storage instead of on-stack.

Linus made the further observation that all but the scheduler callers
of perf_sw_event() have a pt_regs available, so we change the regular
perf_sw_event() to require a valid pt_regs (where it used to be
optional) and add perf_sw_event_sched() for the scheduler.

We have a scheduler specific call instead of a more generic _noregs()
like construct because we can assume non-recursion from the scheduler
and thereby simplify the code further (_noregs would have to put the
recursion context call inline in order to assertain which __perf_regs
element to use).

One last note on the implementation of perf_trace_buf_prepare(); we
allow .regs = NULL for those cases where we already have a pt_regs
pointer available and do not need another.

Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Javi Merino <javi.merino@arm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Petr Mladek <pmladek@suse.cz>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Tom Zanussi <tom.zanussi@linux.intel.com>
Cc: Vaibhav Nagarnaik <vnagarnaik@google.com>
Link: http://lkml.kernel.org/r/20141216115041.GW3337@twins.programming.kicks-ass.net
Signed-off-by: Ingo Molnar <mingo@kernel.org>
2015-01-14 15:11:45 +01:00
..
acpi ACPI / processor: Convert apic_id to phys_id to make it arch agnostic 2015-01-05 23:32:42 +01:00
asm-generic mm: mmu_gather: use tlb->end != 0 only for TLB invalidation 2015-01-13 15:20:40 +13:00
clocksource
crypto crypto: af_alg - add user space interface for AEAD 2014-12-05 23:56:55 +08:00
drm Revert "drm/gem: Warn on illegal use of the dumb buffer interface v2" 2014-12-24 13:13:22 +10:00
dt-bindings Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/evalenti/linux-soc-thermal into thermal-soc 2014-12-21 22:49:12 +08:00
keys
kvm arm/arm64: KVM: Require in-kernel vgic for the arch timers 2014-12-15 11:50:42 +01:00
linux perf: Avoid horrible stack usage 2015-01-14 15:11:45 +01:00
math-emu
media [media] media: v4l2-image-sizes.h: correct the SVGA height definition 2014-12-04 13:56:56 -02:00
memory
misc
net Here's just a single fix - a revert of a patch that broke the 2015-01-06 13:29:27 -05:00
pcmcia
ras
rdma IB/core: Implement support for MMU notifiers regarding on demand paging regions 2014-12-15 18:13:36 -08:00
rxrpc
scsi SCSI for-linus on 20141220 2014-12-20 13:42:57 -08:00
soc Merge branch 'at91/cleanup5' into next/drivers 2014-12-08 18:29:20 +01:00
sound ALSA: pcm: Fix kerneldoc for params_*() functions 2014-12-30 16:41:11 +01:00
target target: Drop left-over fabric_max_sectors attribute 2015-01-09 15:22:05 -08:00
trace perf: Avoid horrible stack usage 2015-01-14 15:11:45 +01:00
uapi Merge branch 'drm-fixes' of git://people.freedesktop.org/~airlied/linux 2015-01-09 21:13:34 -08:00
video OMAPDSS: Remove all references to obsolete HDMI audio callbacks 2014-12-01 11:09:59 +02:00
xen x86/xen: properly retrieve NMI reason 2015-01-13 09:39:50 +00:00
Kbuild