linux/tools/perf
Roberto Agostino Vitillo b50311dc2a perf report: Add support for taken branch sampling
This patch adds support for taken branch sampling, i.e, the
PERF_SAMPLE_BRANCH_STACK feature to perf report. In other
words, to display histograms based on taken branches rather
than executed instructions addresses.

The new option is called -b and it takes no argument. To
generate meaningful output, the perf.data must have been
obtained using perf record -b xxx ... where xxx is a branch
filter option.

The output shows symbols, modules, sorted by 'who branches
where' the most often. The percentages reported in the first
column refer to the total number of branches captured and
not the usual number of samples.

Here is a quick example.
Here branchy is simple test program which looks as follows:

void f2(void)
{}
void f3(void)
{}
void f1(unsigned long n)
{
  if (n & 1UL)
    f2();
  else
    f3();
}
int main(void)
{
  unsigned long i;

  for (i=0; i < N; i++)
   f1(i);
  return 0;
}

Here is the output captured on Nehalem, if we are
only interested in user level function calls.

$ perf record -b any_call,u -e cycles:u branchy

$ perf report -b --sort=symbol
    52.34%  [.] main                   [.] f1
    24.04%  [.] f1                     [.] f3
    23.60%  [.] f1                     [.] f2
     0.01%  [k] _IO_new_file_xsputn    [k] _IO_file_overflow
     0.01%  [k] _IO_vfprintf_internal  [k] _IO_new_file_xsputn
     0.01%  [k] _IO_vfprintf_internal  [k] strchrnul
     0.01%  [k] __printf               [k] _IO_vfprintf_internal
     0.01%  [k] main                   [k] __printf

About half (52%) of the call branches captured are from main()
-> f1(). The second half (24%+23%) is split in two equal shares
between f1() -> f2(), f1() ->f3(). The output is as expected
given the code.

It should be noted, that using -b in perf record does not
eliminate information in the perf.data file. Consequently, a
typical profile can also be obtained by perf report by simply
not using its -b option.

It is possible to sort on branch related columns:

   - dso_from, symbol_from
   - dso_to, symbol_to
   - mispredict

Signed-off-by: Roberto Agostino Vitillo <ravitillo@lbl.gov>
Signed-off-by: Stephane Eranian <eranian@google.com>
Cc: peterz@infradead.org
Cc: acme@redhat.com
Cc: robert.richter@amd.com
Cc: ming.m.lin@intel.com
Cc: andi@firstfloor.org
Cc: asharma@fb.com
Cc: vweaver1@eecs.utk.edu
Cc: khandual@linux.vnet.ibm.com
Cc: dsahern@gmail.com
Link: http://lkml.kernel.org/r/1328826068-11713-14-git-send-email-eranian@google.com
Signed-off-by: Ingo Molnar <mingo@elte.hu>
2012-03-09 08:26:05 +01:00
..
arch perf/powerpc: Fix build for PowerPC with uclibc toolchains 2011-11-25 14:11:27 +11:00
bench perf tool: Fix perf stack to non executable on x86_64 2012-02-06 19:14:17 -02:00
config perf tools: git mv tools/perf/{features-tests.mak,config/} 2011-04-19 08:18:36 -03:00
Documentation perf report: Add support for taken branch sampling 2012-03-09 08:26:05 +01:00
python perf python: Use attr.watermark in twatch.py 2012-01-30 18:38:23 -02:00
scripts perf script: Add drop monitor script 2011-09-29 16:41:37 -03:00
util perf record: Add support for sampling taken branch 2012-03-09 08:26:05 +01:00
.gitignore perf tools: Makefile: Remove various and sundry cruft 2011-02-18 07:43:06 -02:00
builtin-annotate.c perf annotate: Get rid of field_sep check 2012-01-08 13:29:34 -02:00
builtin-bench.c perf bench: Also allow measuring memset() 2012-01-24 20:25:32 -02:00
builtin-buildid-cache.c perf buildid: add perfconfig option to specify buildid cache dir 2010-06-05 09:34:04 -03:00
builtin-buildid-list.c perf report: Accept fifos as input file 2011-12-23 17:01:03 -02:00
builtin-diff.c perf tools: Rename perf_event_ops to perf_tool 2011-11-28 10:39:28 -02:00
builtin-evlist.c perf report: Accept fifos as input file 2011-12-23 17:01:03 -02:00
builtin-help.c perf options: Type check all the remaining OPT_ variants 2010-05-17 16:22:41 -03:00
builtin-inject.c perf tools: Rename perf_event_ops to perf_tool 2011-11-28 10:39:28 -02:00
builtin-kmem.c perf kmem: Fix a memory leak 2012-01-08 13:27:54 -02:00
builtin-kvm.c perf kvm: Do guest-only counting by default 2012-01-06 15:47:37 -02:00
builtin-list.c perf list: Allow filtering list of events 2011-02-17 15:38:58 -02:00
builtin-lock.c perf lock: Document lock info subcommand 2012-01-30 18:30:48 -02:00
builtin-probe.c perf probe: Rename target_module to target 2012-02-02 17:39:15 -02:00
builtin-record.c perf record: Add support for sampling taken branch 2012-03-09 08:26:05 +01:00
builtin-report.c perf report: Add support for taken branch sampling 2012-03-09 08:26:05 +01:00
builtin-sched.c perf report: Accept fifos as input file 2011-12-23 17:01:03 -02:00
builtin-script.c perf script: Add option resolving vmlinux path 2012-01-30 18:13:07 -02:00
builtin-stat.c perf tools: Allow multiple threads or processes in record, stat, top 2012-02-13 22:54:11 -02:00
builtin-test.c perf tools: Invert the sample_id_all logic 2012-02-14 14:18:57 -02:00
builtin-timechart.c perf report: Accept fifos as input file 2011-12-23 17:01:03 -02:00
builtin-top.c perf tools: Handle kernels that don't support attr.exclude_{guest,host} 2012-03-03 12:19:56 -03:00
builtin.h perf tools: Make perf.data more self-descriptive (v8) 2011-10-07 17:01:24 -03:00
command-list.txt perf evlist: New command to list the names of events present in a perf.data file 2011-03-15 11:10:48 -03:00
CREDITS perf_counter tools: Add CREDITS file for Git contributors 2009-06-24 19:54:29 +02:00
design.txt perf: Fix few typos + cosmetics 2010-01-13 17:39:44 +01:00
Makefile perf tools: Add sysfs mountpoint interface 2012-02-13 23:27:15 -02:00
MANIFEST perf tools: Fix out of tree compiles 2012-02-13 22:46:41 -02:00
perf-archive.sh perf buildid: add perfconfig option to specify buildid cache dir 2010-06-05 09:34:04 -03:00
perf.c perf tools: Simplify debugfs mountpoint handling code 2011-11-28 10:11:28 -02:00
perf.h perf record: Add support for sampling taken branch 2012-03-09 08:26:05 +01:00