Commit graph

199472 commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo bafb67470b perf tools: Allow building perf source tarballs on non-configured tree
So that we don't require that the kernel be configured first, and as we
don't use KERNELRELEASE at all in the -src-pkg targets, we need o add a
new wildcard for targets ending in src-pkg:

On a make mrproper'ed kernel we get this without this patch:

[linux-2.6-tip]$ LANG= make perf-tarbz2-src-pkg
/bin/sh: include/config/kernel.release: No such file or directory
make: *** [include/config/kernel.release] Error 1
[acme@emilia linux-2.6-tip]$

Acked-by: Michal Marek <mmarek@suse.cz>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100604173552.GA875@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-07 07:49:24 -03:00
Arun Sharma f60f359383 perf report: Implement --sort cpu
In a shared multi-core environment, users want to analyze why their
program was slow. In particular, if the code ran slower only on certain
CPUs due to interference from other programs or kernel threads, the user
should be able to notice that.

Sample usage:

perf record -f -a -- sleep 3
perf report --sort cpu,comm

Workload:

program is running on 16 CPUs
Experiencing interference from an antagonist only on 4 CPUs.

  Samples: 106218177676 cycles

  Overhead  CPU          Command
  ........  ...  ...............

     6.25%  2            program
     6.24%  6            program
     6.24%  11           program
     6.24%  5            program
     6.24%  9            program
     6.24%  10           program
     6.23%  15           program
     6.23%  7            program
     6.23%  3            program
     6.23%  14           program
     6.22%  1            program
     6.20%  13           program
     3.17%  12           program
     3.15%  8            program
     3.14%  0            program
     3.13%  4            program
     3.11%  4         antagonist
     3.11%  0         antagonist
     3.10%  8         antagonist
     3.07%  12        antagonist

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100505181612.GA5091@sharma-home.net>
Signed-off-by: Arun Sharma <aruns@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:35:53 -03:00
Arnaldo Carvalho de Melo 41a37e2017 perf tools: Make event__preprocess_sample parse the sample
Simplifying the tools that were using both in sequence and allowing
upcoming simplifications, such as Arun's patch to sort by cpus.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:35:19 -03:00
Stephane Eranian 45d8e8025a perf annotate: Ask objdump to demangle symbols
Perf report is demangling symbols but not annotate.

The former uses internal demangling via libbdf or libiberty. The latter
executes objdump which by default does not demangle symbols.

This patch adds the -C option to the objdump cmdline to enable symbol
demangling.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4c07b323.2126e30a.6245.0e1e@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:34:59 -03:00
Stephane Eranian 45de34bbe3 perf buildid: add perfconfig option to specify buildid cache dir
This patch adds the ability to specify an alternate directory to store the
buildid cache (buildids, copy of binaries). By default, it is hardcoded to
$HOME/.debug. This directory contains immutable data. The layout of the
directory is such that no conflicts in filenames are possible. A modification
in a file, yields a different buildid and thus a different location in the
subdir hierarchy.

You may want to put the buildid cache elsewhere because of disk space
limitation or simply to share the cache between users. It is also useful for
remote collect vs. local analysis of profiles.

This patch adds a new config option to the perfconfig file.  Under the tag
'buildid', there is a dir option. For instance, if you have:

$ cat /etc/perfconfig
[buildid]
dir = /var/cache/perf-buildid

All buildids and binaries are be saved in the directory specified. The perf
record, buildid-list, buildid-cache, report, annotate, and archive commands
will it to pull information out.

The option can be set in the system-wide perfconfig file or in the
$HOME/.perfconfig file.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4c055fb7.df0ce30a.5f0d.ffffae52@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:34:04 -03:00
Arnaldo Carvalho de Melo 8e5564e6c7 perf tools: Make target to generate self contained source tarball
Useful for when people want to try some version of the perf tools and don't
wants to download the kernel tarball.

Here is a session using this new target:

  [root@emilia linux-2.6-tip]# make help | grep -i perf
    perf-tar-src-pkg    - Build perf-2.6.35-rc1.tar source tarball
    perf-targz-src-pkg  - Build perf-2.6.35-rc1.tar.gz source tarball
    perf-tarbz2-src-pkg - Build perf-2.6.35-rc1.tar.bz2 source tarball
  [root@emilia linux-2.6-tip]# make perf-tarbz2-src-pkg
    TAR
  [root@emilia linux-2.6-tip]# ls -la perf-2.6.35-rc1.tar.bz2
  -rw-r--r-- 1 root root 295731 May 31 11:18 perf-2.6.35-rc1.tar.bz2
  [root@emilia linux-2.6-tip]# tar xf perf-2.6.35-rc1.tar.bz2
  [root@emilia linux-2.6-tip]# cd perf-2.6.35-rc1
  [root@emilia perf-2.6.35-rc1]# ls
  arch  HEAD  include  lib  tools
  [root@emilia perf-2.6.35-rc1]# cd tools/perf
  [root@emilia perf]# make -j9 2>&1 | tail
      CC arch/x86/util/dwarf-regs.o
      CC util/probe-finder.o
      CC util/newt.o
      CC util/scripting-engines/trace-event-perl.o
      CC scripts/perl/Perf-Trace-Util/Context.o
      CC perf.o
      CC builtin-help.o
      AR libperf.a
      LINK perf
  rm .perf.dev.null
  [root@emilia perf]# ./perf record -a sleep 1
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.262 MB perf.data (~11457 samples) ]
  [root@emilia perf]# ./perf report | head -12
  # Events: 6K cycles
  #
  # Overhead          Command       Shared Object  Symbol
  # ........  ...............  ..................  ......
  #
       4.73%             perf  [kernel.kallsyms]   [k] format_decode
       4.49%             perf  libc-2.12.so        [.] _IO_file_underflow_internal
       4.38%             init  [kernel.kallsyms]   [k] mwait_idle
       3.29%             perf  [kernel.kallsyms]   [k] vsnprintf
       2.38%             init  [kernel.kallsyms]   [k] sched_clock_local
       2.35%             init  [kernel.kallsyms]   [k] apic_timer_interrupt
       1.86%     sirq-timer/5  [kernel.kallsyms]   [k] find_busiest_group
  [root@emilia perf]#

Acked-by: Michal Marek <mmarek@suse.cz>
Acked-by: Sam Ravnborg <sam@ravnborg.org>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Michal Marek <mmarek@suse.cz>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sam Ravnborg <sam@ravnborg.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100528185357.GA28009@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:33:35 -03:00
Stephane Eranian c45c6ea2e5 perf tools: Add the ability to specify list of cpus to monitor
This patch adds a -C option to stat, record, top to designate a list of CPUs to
monitor. CPUs can be specified as a comma-separated list or ranges, no space
allowed.

Examples:
$ perf record -a -C0-1,4-7 sleep 1
$ perf top -C0-4
$ perf stat -a -C1,2,3,4 sleep 1

With perf record in per-thread mode with inherit mode on, samples are collected
only when the thread runs on the designated CPUs.

The -C option does not turn on system-wide mode automatically.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4bff9496.d345d80a.41fe.7b00@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:33:01 -03:00
Stephane Eranian 761844b9c6 perf report: Make -D print sampled CPU
It is useful to know on which CPU a sample was captured on.
The information is captured with perf record -R but it was
not printed out by perf report -D. This patch adds this.

When -R is not used, cpu is set to -1to indicate that
the CPU is unknown (it is not captured).

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <4bff964c.e88cd80a.3106.7d31@mx.google.com>
Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-05 09:32:24 -03:00
Ingo Molnar 58cc1a9e3b Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent 2010-06-04 16:55:19 +02:00
Arnaldo Carvalho de Melo e7dadc0089 perf symbols: Set the DSO long name when using symbol_conf.vmlinux_name
We need to set the long name to the name specified via, for instance,
'perf annotate --vmlinux /path/to/vmlinux', if not it will remain as
'[kernel.kallsyms]' and that will make annotate fail when passing this
as the vmlinux name in the call to objdump.

The way this is setup grew unwieldly and dso__load_vmlinux is the
function that should allocate space for the long name, with callers not
assuming that filenames should be allocated somehow by then (strdup,
dso__build_id_filename, etc).

For now this is the minimalistic patch, a proper fix for .36 will be
made.

Reported-by: Stephane Eranian <eranian@google.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100604003900.GD10469@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-04 07:07:52 -03:00
Peter Zijlstra c6df8d5ab8 perf: Fix crash in swevents
Frederic reported that because swevents handling doesn't disable IRQs
anymore, we can get a recursion of perf_adjust_period(), once from
overflow handling and once from the tick.

If both call ->disable, we get a double hlist_del_rcu() and trigger
a LIST_POISON2 dereference.

Since we don't actually need to stop/start a swevent to re-programm
the hardware (lack of hardware to program), simply nop out these
callbacks for the swevent pmu.

Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1275557609.27810.35218.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-06-03 17:03:08 +02:00
Ingo Molnar da3fd1a001 Merge branch 'perf/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux-2.6 into perf/urgent 2010-06-02 09:13:12 +02:00
Arnaldo Carvalho de Melo b5c874f14c perf buildid-list: Fix --with-hits event processing
When we use plain 'perf buildid-list' we use only what is in the buildid
table in the perf.data header. And those have absolute pathnames because
at 'perf record' time we used __perf_session__process_events and that
doesn't sets up the path shortening code in map__new() that happens if
symbol_conf.full_paths is false, the default.

On the other hand, when we use 'perf buildid-list --with-hits' we
process all the events using perf_session__process_events, adding
entries to the global DSO list _after_ removing the current directory
from the DSO name, for presentation purposes.

Because of that we end up having two entries in the DSO list when
recording events for binaries using relative pathnames.

Fix it minimally by setting symbol_conf.full_paths to true when marking
the DSOs with hits in 'perf buildid-list --with-hits', as used by 'perf
archive'

Right fix longer term is to shorten the path only at presentation time.
Will be done for 2.6.36.

Reported-by: Stephane Eranian <eranian@google.com>
Tested-by: Stephane Eranian <eranian@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <20100601183837.GC4093@ghostprotocols.net>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-01 16:16:11 -03:00
Pierre Tardy c02514850d perf scripts python: Give field dict to unhandled callback
trace_unhandled() callback does not allow to access event fields, this patch
resolves the problem.

It can also been used as a more pythonic and flexible way for script writters
to demux event types

This will for example greatly simplify pytimechart event demux.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Acked-by: Tom Zanussi <tzanussi@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>,
Cc: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <1275340329-2397-1-git-send-email-tardyp@gmail.com>
Signed-off-by: Pierre Tardy <tardyp@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-01 06:12:35 -03:00
Konstantin Stepanyuk 75d9ef1707 perf hist: fix objdump output parsing
hist_entry__annotate() runs objdump with -S option so the output may contain
lines of any format. If a line starts with a colon strtoull() returns 0 and
calculated offset will be negative. This causes perf annotate segfaults.

Make sure that strtoull() has parsed at least one digit.

Cc: David S. Miller <davem@davemloft.net>
Cc: Frédéric Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Mike Galbraith <efault@gmx.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Tom Zanussi <tzanussi@gmail.com>
LKML-Reference: <new-submission>
Signed-off-by: Konstantin Stepanyuk <konstantin.stepanyuk@gmail.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2010-06-01 05:44:36 -03:00
Borislav Petkov 2fb750e825 perf-record: Check correct pid when forking
When forking the child to be traced, we should check the correct
return value from fork() and not a local variable which is otherwise
unused.

Signed-off-by: Borislav Petkov <bp@alien8.de>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
LKML-Reference: <20100531211818.GA30175@liondog.tnic>
Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
2010-06-01 00:57:14 +02:00
Frederic Weisbecker dd833d713c perf: Do the comm inheritance per thread in event__process_task
event__process_task() doesn't propagate the comm copy on clone,
but only on process fork. So we loose all the tid:comm resolution
for tasks that aren't a main process thread.

Progragate the per thread granularity to event__process_task for
pid resolution.

This fixes various unresolved pids in perf sched, especially when
we trace multithread processes. The problem is quickly reproducible
with the messaging benchmark using the multithread mode "-t" :

	perf sched record perf bench sched messaging -t

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
2010-06-01 00:43:07 +02:00
Frederic Weisbecker af64865ba6 perf: Use event__process_task from perf sched
perf sched uses event__process_comm(), which means it can resolve
comms from:

- tasks that have exec'ed (kernel comm events)
- tasks that were running when perf record started the actual
  recording (synthetized comm events)

But perf sched can't resolve the pids of tasks that were created
after the recording started.

To solve this, we need to inherit the comms on fork events using
event__process_task().

This fixes various unresolved pids in perf sched, easily visible
with:
	perf sched record perf bench sched messaging

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
2010-06-01 00:10:32 +02:00
Frederic Weisbecker 13eb04fdbe perf: Process comm events by tid
When we synthetize the existing running tasks though procfs,
we walk through every threads of a process, queuing one comm
events per tid.

But then on report time, event__process_comm() only creates and
sets the comm on a per process granularity. This is the right
thing for comm events that came from the kernel, as they are
only created on exec. Sub-threads then inherit their comm
from fork events. But that doesn't work with our synthetized
comm events taken from procfs informations as the per thread
granularity is done on comm events directly there.

Hence we need event__process_comm() to work with the tid rather
than the pid. It won't change anything for comm events coming
from the kernel but this will fix the synthetized ones.

Before:

	$ ./perf report -D | grep COMM | grep firefox

	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5297

After:
	$ ./perf report -D | grep COMM | grep firefox

	0x2c7b8 [0x18]: PERF_RECORD_COMM: firefox:5297
	0x2c7d0 [0x18]: PERF_RECORD_COMM: firefox:5299
	0x2c7e8 [0x18]: PERF_RECORD_COMM: firefox:5300
	0x2c800 [0x18]: PERF_RECORD_COMM: firefox:5308
	0x2c818 [0x18]: PERF_RECORD_COMM: firefox:5309
	0x2c830 [0x18]: PERF_RECORD_COMM: firefox:5312

This fixes various unresolved pid on perf sched.

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Tom Zanussi <tzanussi@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
2010-05-31 23:59:50 +02:00
Randy Dunlap 546cf44a1b blktrace: Fix new kernel-doc warnings
Fix blktrace.c kernel-doc warnings:
 Warning(kernel/trace/blktrace.c:858): No description found for parameter 'ignore'
 Warning(kernel/trace/blktrace.c:890): No description found for parameter 'ignore'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Jens Axboe <jens.axboe@oracle.com>
Cc: Steven Rostedt <rostedt@goodmis.org>
Cc: Frederic Weisbecker <fweisbec@gmail.com>
LKML-Reference: <20100529114507.c466fc1e.randy.dunlap@oracle.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 09:58:20 +02:00
Frederic Weisbecker 74048f895f perf_events: Fix unincremented buffer base on partial copy
If a sample size crosses to the next page boundary, the copy
will be made in more than one step. However we forget to advance
the source offset for the next copy, leading to unexpected double
copies that completely mess up the traces.

This fixes various kinds of bad traces that have irrelevant
data inside, as an example:

	geany-4979  [001]  5758.077775: sched_switch: prev_comm=! prev_pid=121
		prev_prio=0 prev_state=S|D|Z|X|x ==> next_comm= next_pid=7497072
		next_prio=0

Signed-off-by: Frederic Weisbecker <fweisbec@gmail.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Paul Mackerras <paulus@samba.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1274988898-5639-1-git-send-regression-fweisbec@gmail.com>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 08:46:10 +02:00
Stephane Eranian 90151c35b1 perf_events: Fix event scheduling issues introduced by transactional API
The transactional API patch between the generic and model-specific
code introduced several important bugs with event scheduling, at
least on X86. If you had pinned events, e.g., watchdog,  and were
over-committing the PMU, you would get bogus counts. The bug was
showing up on Intel CPU because events would move around more
often that on AMD. But the problem also existed on AMD, though
harder to expose.

The issues were:

 - group_sched_in() was missing a cancel_txn() in the error path

 - cpuc->n_added was not properly maintained, leading to missing
   actions in hw_perf_enable(), i.e., n_running being 0. You cannot
   update n_added until you know the transaction has succeeded. In
   case of failed transaction n_added was not adjusted back.

 - in case of failed transactions, event_sched_out() was called
   and eventually invoked x86_disable_event() to touch the HW reg.
   But with transactions, on X86, event_sched_in() does not touch
   HW registers, it simply collects events into a list. Thus, you
   could end up calling x86_disable_event() on a counter which
   did not correspond to the current event when idx != -1.

The patch modifies the generic and X86 code to avoid all those problems.

First, we keep track of the number of events added last. In case the
transaction fails, we substract them from n_added. This approach is
necessary (as opposed to delaying updates to n_added) because not all
event updates use the transaction API, e.g., single events.

Second, we encapsulate the event_sched_in() and event_sched_out() in
group_sched_in() inside the transaction. That makes the operations
symmetrical and you can also detect that you are inside a transaction
and skip the HW reg access by checking cpuc->group_flag.

With this patch, you can now overcommit the PMU even with pinned
system-wide events present and still get valid counts.

Signed-off-by: Stephane Eranian <eranian@google.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1274796225.5882.1389.camel@twins>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 08:46:10 +02:00
Peter Zijlstra 2e97942fe5 perf_events, trace: Fix perf_trace_destroy(), mutex went missing
Steve spotted I forgot to do the destroy under event_mutex.

Reported-by: Steven Rostedt <rostedt@goodmis.org>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1274451913.1674.1707.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 08:46:09 +02:00
Peter Zijlstra 3771f07711 perf_events, trace: Fix probe unregister race
tracepoint_probe_unregister() does not synchronize against the probe
callbacks, so do that explicitly. This properly serializes the callbacks
and the free of the data used therein.

Also, use this_cpu_ptr() where possible.

Acked-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <1274438476.1674.1702.camel@laptop>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 08:46:09 +02:00
Peter Zijlstra 8a49542c05 perf_events: Fix races in group composition
Group siblings don't pin each-other or the parent, so when we destroy
events we must make sure to clean up all cross referencing pointers.

In particular, for destruction of a group leader we must be able to
find all its siblings and remove their reference to it.

This means that detaching an event from its context must not detach it
from the group, otherwise we can end up failing to clear all pointers.

Solve this by clearly separating the attachment to a context and
attachment to a group, and keep the group composed until we destroy
the events.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 08:46:09 +02:00
Peter Zijlstra ac9721f3f5 perf_events: Fix races and clean up perf_event and perf_mmap_data interaction
In order to move toward separate buffer objects, rework the whole
perf_mmap_data construct to be a more self-sufficient entity, one
with its own lifetime rules.

This greatly sanitizes the whole output redirection code, which
was riddled with bugs and races.

Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: <stable@kernel.org>
LKML-Reference: <new-submission>
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2010-05-31 08:46:08 +02:00
Linus Torvalds 67a3e12b05 Linux 2.6.35-rc1
.. and thus endeth the merge window.
2010-05-30 13:21:02 -07:00
Linus Torvalds 3b03117c5c Merge branch 'slub/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6
* 'slub/urgent' of git://git.kernel.org/pub/scm/linux/kernel/git/penberg/slab-2.6:
  SLUB: Allow full duplication of kmalloc array for 390
  slub: move kmem_cache_node into it's own cacheline
2010-05-30 12:46:17 -07:00
Linus Torvalds fa7eadab4b Merge branch 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'core-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  mutex: Fix optimistic spinning vs. BKL
2010-05-30 12:35:15 -07:00
Linus Torvalds bc7d352c5e Merge branch 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'perf-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  perf tui: Fix last use_browser problem related to .perfconfig
  perf symbols: Add the build id cache to the vmlinux path
  perf tui: Reset use_browser if stdout is not a tty
  ring-buffer: Move zeroing out excess in page to ring buffer code
  ring-buffer: Reset "real_end" when page is filled
2010-05-30 12:35:01 -07:00
Linus Torvalds b3f2f6cd1f ia64: revert __node_random addition
This partially reverts commit 4ec37de89d
("[IA64] Fix build breakage"), since the commit that made it necessary
got reverted earlier (see commit 35926ff5fb, 'Revert "cpusets:
randomize node rotor used in cpuset_mem_spread_node()"')

Even if we ever re-introduce this, there is no reason to make
__node_random be some architecture-specific function.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-30 10:08:03 -07:00
Linus Torvalds 003386fff3 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/fuse:
  mm: export generic_pipe_buf_*() to modules
  fuse: support splice() reading from fuse device
  fuse: allow splice to move pages
  mm: export remove_from_page_cache() to modules
  mm: export lru_cache_add_*() to modules
  fuse: support splice() writing to fuse device
  fuse: get page reference for readpages
  fuse: use get_user_pages_fast()
  fuse: remove unneeded variable
2010-05-30 09:16:14 -07:00
Linus Torvalds 092405cdb6 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-kconfig
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/rostedt/linux-2.6-kconfig:
  kconfig: Hide error output in find command in streamline_config.pl
  kconfig: Fix typo in comment in streamline_config.pl
  kconfig: Make a variable local in streamline_config.pl
2010-05-30 09:13:43 -07:00
Linus Torvalds 17d30ac077 Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6
* 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/sameo/mfd-2.6: (47 commits)
  mfd: Rename twl5031 sih modules
  mfd: Storage class for timberdale should be before const qualifier
  mfd: Remove unneeded and dangerous clearing of clientdata
  mfd: New AB8500 driver
  gpio: Fix inverted rdc321x gpio data out registers
  mfd: Change rdc321x resources flags to IORESOURCE_IO
  mfd: Move pcf50633 irq related functions to its own file.
  mfd: Use threaded irq for pcf50633
  mfd: pcf50633-adc: Fix potential race in pcf50633_adc_sync_read
  mfd: Fix pcf50633 bitfield logic in interrupt handler
  gpio: rdc321x needs to select MFD_CORE
  mfd: Use menuconfig for quicker config editing
  ARM: AB3550 board configuration and irq for U300
  mfd: AB3550 core driver
  mfd: AB3100 register access change to abx500 API
  mfd: Renamed ab3100.h to abx500.h
  gpio: Add TC35892 GPIO driver
  mfd: Add Toshiba's TC35892 MFD core
  mfd: Delay to mask tsc irq in max8925
  mfd: Remove incorrect wm8350 kfree
  ...
2010-05-30 09:13:08 -07:00
Linus Torvalds e38c1e54ce Merge branch 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx
* 'next' of git://git.kernel.org/pub/scm/linux/kernel/git/djbw/async_tx:
  DMAENGINE: DMA40 U8500 platform configuration
  DMA: PL330: Add dma api driver
2010-05-30 09:12:43 -07:00
Linus Torvalds 3e9345edd8 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/roland/infiniband:
  IB/qib: Remove DCA support until feature is finished
  IB/qib: Use a single txselect module parameter for serdes tuning
  IB/qib: Don't rely on (undefined) order of function parameter evaluation
  IB/ucm: Use memdup_user()
  IB/qib: Fix undefined symbol error when CONFIG_PCI_MSI=n
2010-05-30 09:12:16 -07:00
Linus Torvalds d28619f156 Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6
* 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs-2.6:
  quota: Convert quota statistics to generic percpu_counter
  ext3 uses rb_node = NULL; to zero rb_root.
  quota: Fixup dquot_transfer
  reiserfs: Fix resuming of quotas on remount read-write
  pohmelfs: Remove dead quota code
  ufs: Remove dead quota code
  udf: Remove dead quota code
  quota: rename default quotactl methods to dquot_
  quota: explicitly set ->dq_op and ->s_qcop
  quota: drop remount argument to ->quota_on and ->quota_off
  quota: move unmount handling into the filesystem
  quota: kill the vfs_dq_off and vfs_dq_quota_on_remount wrappers
  quota: move remount handling into the filesystem
  ocfs2: Fix use after free on remount read-only

Fix up conflicts in fs/ext4/super.c and fs/ufs/file.c
2010-05-30 09:11:11 -07:00
Linus Torvalds 021fad8b70 Merge branch 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip
* 'x86-fixes-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/linux-2.6-tip:
  x86, cpufeature: Unbreak compile with gcc 3.x
  x86, pat: Fix memory leak in free_memtype
  x86, k8: Fix section mismatch for powernowk8_exit()
  lib/atomic64_test: fix missing include of linux/kernel.h
  x86: remove last traces of quicklist usage
  x86, setup: Phoenix BIOS fixup is needed on Dell Inspiron Mini 1012
  x86: "nosmp" command line option should force the system into UP mode
  arch/x86/pci: use kasprintf
  x86, apic: ack all pending irqs when crashed/on kexec
2010-05-30 09:06:13 -07:00
Rafael J. Wysocki e9a5f426b8 CPU: Avoid using unititialized error variable in disable_nonboot_cpus()
If there's only one CPU online when disable_nonboot_cpus() is called,
the error variable will not be initialized and that may lead to
erroneous behavior.  Fix this issue by initializing error in
disable_nonboot_cpus() as appropriate.

Signed-off-by: Rafael J. Wysocki <rjw@sisk.pl>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-30 09:06:00 -07:00
Randy Dunlap 97ef6f7449 rapidio: fix new kernel-doc warnings
Fix a bunch of new rapidio kernel-doc warnings:

Warning(include/linux/rio.h:123): No description found for parameter 'comp_tag'
Warning(include/linux/rio.h:123): No description found for parameter 'phys_efptr'
Warning(include/linux/rio.h:123): No description found for parameter 'em_efptr'
Warning(include/linux/rio.h:123): No description found for parameter 'pwcback'
Warning(include/linux/rio.h:247): No description found for parameter 'set_domain'
Warning(include/linux/rio.h:247): No description found for parameter 'get_domain'
Warning(drivers/rapidio/rio-scan.c:1133): No description found for parameter 'rdev'
Warning(drivers/rapidio/rio-scan.c:1133): Excess function parameter 'port' description in 'rio_init_em'
Warning(drivers/rapidio/rio.c:349): No description found for parameter 'rdev'
Warning(drivers/rapidio/rio.c:349): Excess function parameter 'mport' description in 'rio_request_inb_pwrite'
Warning(drivers/rapidio/rio.c:393): No description found for parameter 'port'
Warning(drivers/rapidio/rio.c:393): No description found for parameter 'local'
Warning(drivers/rapidio/rio.c:393): No description found for parameter 'destid'
Warning(drivers/rapidio/rio.c:393): No description found for parameter 'hopcount'
Warning(drivers/rapidio/rio.c:393): Excess function parameter 'rdev' description in 'rio_mport_get_physefb'
Warning(drivers/rapidio/rio.c:845): Excess function parameter 'local' description in 'rio_std_route_clr_table'

Signed-off-by: Randy Dunlap <randy.dunlap@oracle.com>
Cc: Alexandre Bounine <alexandre.bounine@idt.com>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-30 09:02:47 -07:00
Linus Torvalds 06b2e9886e Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6
* 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/kyle/parisc-2.6:
  parisc: Call pagefault_disable/pagefault_enable in kmap_atomic/kunmap_atomic
  parisc: Remove unnecessary macros from entry.S
  parisc: LWS fixes for syscall.S
  parisc: Delete unnecessary nop's in entry.S
  parisc: Avoid interruption in critical region in entry.S
  parisc: invoke oom-killer from page fault
  parisc: clear floating point exception flag on SIGFPE signal
  parisc: Use of align_frame provides stack frame.
2010-05-30 09:02:02 -07:00
Linus Torvalds 35926ff5fb Revert "cpusets: randomize node rotor used in cpuset_mem_spread_node()"
This reverts commit 0ac0c0d0f8, which
caused cross-architecture build problems for all the wrong reasons.
IA64 already added its own version of __node_random(), but the fact is,
there is nothing architectural about the function, and the original
commit was just badly done. Revert it, since no fix is forthcoming.

Requested-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
2010-05-30 09:00:03 -07:00
Linus Torvalds b612a05537 Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client
* 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/sage/ceph-client:
  ceph: clean up on forwarded aborted mds request
  ceph: fix leak of osd authorizer
  ceph: close out mds, osd connections before stopping auth
  ceph: make lease code DN specific
  fs/ceph: Use ERR_CAST
  ceph: renew auth tickets before they expire
  ceph: do not resend mon requests on auth ticket renewal
  ceph: removed duplicated #includes
  ceph: avoid possible null dereference
  ceph: make mds requests killable, not interruptible
  sched: add wait_for_completion_killable_timeout
2010-05-30 08:56:39 -07:00
Christoph Lameter 0f1f694260 SLUB: Allow full duplication of kmalloc array for 390
Commit 756dee7587 ("SLUB: Get rid of dynamic DMA
kmalloc cache allocation") makes S390 run out of kmalloc caches.  Increase the
number of kmalloc caches to a safe size.

Cc: <stable@kernel.org> [ .33 and .34 ]
Reported-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Tested-by: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Christoph Lameter <cl@linux-foundation.org>
Signed-off-by: Pekka Enberg <penberg@cs.helsinki.fi>
2010-05-30 13:02:08 +03:00
John David Anglin 210501aa57 parisc: Call pagefault_disable/pagefault_enable in kmap_atomic/kunmap_atomic
Based on the generic implementation of kmap_atomic and kunmap_atomic,
we should call pagefault_disable and pagefault_enable in our PA8000
implementation.

The define for kmap_atomic_prot was also missing, and I updated
kmap_atomic_pfn to use the generic implementation because of the
change to kmap_atomic.

I believe that this change is needed to fix the fork copy-on-write
bug.

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2010-05-30 05:48:32 -04:00
John David Anglin 9b437bca16 parisc: Remove unnecessary macros from entry.S
The EXTR, DEP and DEPI macros are unnecessary.  There are PA 1.X
pneumonics available with the same functionality, and the DEP and DEPI
macros conflict with assembler pneumonics.

Tested on a variety of 32 and 64-bit systems.

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2010-05-30 05:47:28 -04:00
John David Anglin f4c0346c6f parisc: LWS fixes for syscall.S
1) Gate immediately and save a branch.
2) Fix off by one error in checking entry number.
3) Use sr7 instead of sr3 in error return path as sr3 might not
   contain correct value.
4) Enable locking on UP systems to prevent incorrect operation of
   the cas_action critical region on page faults.

Tested on several systems, including UP c3750 with 2.6.33.2 kernel.

Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2010-05-30 05:46:37 -04:00
John David Anglin c2dc988ec5 parisc: Delete unnecessary nop's in entry.S
Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2010-05-30 05:45:22 -04:00
John David Anglin 8f6c0c2bf1 parisc: Avoid interruption in critical region in entry.S
Signed-off-by: John David Anglin <dave.anglin@nrc-cnrc.gc.ca>
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
2010-05-30 05:44:36 -04:00
Nick Piggin 53e30d0227 parisc: invoke oom-killer from page fault
As explained in commit 1c0fe6e3bd, we want to call the architecture independent
oom killer when getting an unexplained OOM from handle_mm_fault, rather than
simply killing current.

Cc: linux-parisc@vger.kernel.org
Cc: linux-arch@vger.kernel.org
Signed-off-by: Nick Piggin <npiggin@suse.de>
Acked-by: David Rientjes <rientjes@google.com>
Signed-off-by: Kyle McMartin <kyle@mcmartin.ca>
2010-05-30 05:41:30 -04:00