Commit graph

4675 commits

Author SHA1 Message Date
Kan Liang 8780fb25ab perf sort: Add sort option for physical address
Add a new sort option "phys_daddr" for --mem-mode sort.  With this
option applied, perf can sort and report by sample's physical address.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1504026672-7304-3-git-send-email-kan.liang@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-01 14:46:11 -03:00
Kan Liang 3b0a5daa06 perf tools: Support new sample type for physical address
Support new sample type PERF_SAMPLE_PHYS_ADDR for physical address.

Add new option --phys-data to record sample physical address.

Signed-off-by: Kan Liang <kan.liang@intel.com>
Tested-by: Jiri Olsa <jolsa@redhat.com>
Acked-by: Stephane Eranian <eranian@google.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Madhavan Srinivasan <maddy@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: http://lkml.kernel.org/r/1504026672-7304-2-git-send-email-kan.liang@intel.com
[ Added missing printing in evsel.c patch sent by Jiri Olsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-01 14:46:00 -03:00
Arnaldo Carvalho de Melo 89be3f8ab7 perf syscalltbl: Support glob matching on syscall names
With two new methods, one to find the first match, returning its syscall
id and its index in whatever internal database it keeps the syscall
into, then one to find the next match, if any.

Implemented only on arches where we actually read the syscall table from
the kernel sources, i.e. x86-64 for now, all the others use the libaudit
method for which this returns -1, i.e. just stubs were added, with the
actual implementation using whatever libaudit functions for matching
that may be available.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-i0sj4rxk1a63pfe9gl8z8irs@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-09-01 14:45:48 -03:00
Jin Yao c4ee06251d perf report: Calculate the average cycles of iterations
The branch history code has a loop detection function. With this, we can
get the number of iterations by calculating the removed loops.

While it would be nice for knowing the average cycles of iterations.
This patch adds up the cycles in branch entries of removed loops and
save the result to the next branch entry (e.g. branch entry A).

Finally it will display the iteration number and average cycles at the
"from" of branch entry A.

For example:
perf record -g -j any,save_type ./div
perf report --branch-history --no-children --stdio

--22.63%--main div.c:42 (RET CROSS_2M)
          compute_flag div.c:28 (cycles:2 iter:173115 avg_cycles:2)
          |
           --10.73%--compute_flag div.c:27 (RET CROSS_2M)
                     rand rand.c:28 (cycles:1)
                     rand rand.c:28 (RET CROSS_2M)
                     __random random.c:298 (cycles:1)
                     __random random.c:297 (COND_BWD CROSS_2M)
                     __random random.c:295 (cycles:1)
                     __random random.c:295 (COND_BWD CROSS_2M)
                     __random random.c:295 (cycles:1)
                     __random random.c:295 (RET CROSS_2M)

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1502111115-18305-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-30 10:03:27 -03:00
Li Bin b2f7605076 perf symbols: Fix plt entry calculation for ARM and AARCH64
On x86, the plt header size is as same as the plt entry size, and can be
identified from shdr's sh_entsize of the plt.

But we can't assume that the sh_entsize of the plt shdr is always the
plt entry size in all architecture, and the plt header size may be not
as same as the plt entry size in some architecure.

On ARM, the plt header size is 20 bytes and the plt entry size is 12
bytes (don't consider the FOUR_WORD_PLT case) that refer to the binutils
implementation. The plt section is as follows:

Disassembly of section .plt:
000004a0 <__cxa_finalize@plt-0x14>:
 4a0:   e52de004        push    {lr}            ; (str lr, [sp, #-4]!)
 4a4:   e59fe004        ldr     lr, [pc, #4]    ; 4b0 <_init+0x1c>
 4a8:   e08fe00e        add     lr, pc, lr
 4ac:   e5bef008        ldr     pc, [lr, #8]!
 4b0:   00008424        .word   0x00008424

000004b4 <__cxa_finalize@plt>:
 4b4:   e28fc600        add     ip, pc, #0, 12
 4b8:   e28cca08        add     ip, ip, #8, 20  ; 0x8000
 4bc:   e5bcf424        ldr     pc, [ip, #1060]!        ; 0x424

000004c0 <printf@plt>:
 4c0:   e28fc600        add     ip, pc, #0, 12
 4c4:   e28cca08        add     ip, ip, #8, 20  ; 0x8000
 4c8:   e5bcf41c        ldr     pc, [ip, #1052]!        ; 0x41c

On AARCH64, the plt header size is 32 bytes and the plt entry size is 16
bytes.  The plt section is as follows:

Disassembly of section .plt:
0000000000000560 <__cxa_finalize@plt-0x20>:
 560:   a9bf7bf0        stp     x16, x30, [sp,#-16]!
 564:   90000090        adrp    x16, 10000 <__FRAME_END__+0xf8a8>
 568:   f944be11        ldr     x17, [x16,#2424]
 56c:   9125e210        add     x16, x16, #0x978
 570:   d61f0220        br      x17
 574:   d503201f        nop
 578:   d503201f        nop
 57c:   d503201f        nop

0000000000000580 <__cxa_finalize@plt>:
 580:   90000090        adrp    x16, 10000 <__FRAME_END__+0xf8a8>
 584:   f944c211        ldr     x17, [x16,#2432]
 588:   91260210        add     x16, x16, #0x980
 58c:   d61f0220        br      x17

0000000000000590 <__gmon_start__@plt>:
 590:   90000090        adrp    x16, 10000 <__FRAME_END__+0xf8a8>
 594:   f944c611        ldr     x17, [x16,#2440]
 598:   91262210        add     x16, x16, #0x988
 59c:   d61f0220        br      x17

NOTES:

In addition to ARM and AARCH64, other architectures, such as
s390/alpha/mips/parisc/poperpc/sh/sparc/xtensa also need to consider
this issue.

Signed-off-by: Li Bin <huawei.libin@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexis Berlemont <alexis.berlemont@gmail.com>
Cc: David Tolnay <dtolnay@gmail.com>
Cc: Hanjun Guo <guohanjun@huawei.com>
Cc: Hemant Kumar <hemant@linux.vnet.ibm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: zhangmengting@huawei.com
Link: http://lkml.kernel.org/r/1496622849-21877-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-29 11:41:27 -03:00
Li Bin 2c29461e27 perf probe: Fix kprobe blacklist checking condition
The commit 9aaf5a5f47 ("perf probe: Check kprobes blacklist when
adding new events"), 'perf probe' supports checking the blacklist of the
fuctions which can not be probed.  But the checking condition is wrong,
that the end_addr of the symbol which is the start_addr of the next
symbol can't be included.

Committer notes:

IOW make it match its kernel counterpart in kernel/kprobes.c:

  bool within_kprobe_blacklist(unsigned long addr)

Each entry have as its end address not its end address, but the first
address _outside_ that symbol, which for related functions, is the first
address of the next symbol, like these from kernel/trace/trace_probe.c:

0xffffffffbd198df0-0xffffffffbd198e40	print_type_u8
0xffffffffbd198e40-0xffffffffbd198e90	print_type_u16
0xffffffffbd198e90-0xffffffffbd198ee0	print_type_u32
0xffffffffbd198ee0-0xffffffffbd198f30	print_type_u64
0xffffffffbd198f30-0xffffffffbd198f80	print_type_s8
0xffffffffbd198f80-0xffffffffbd198fd0	print_type_s16
0xffffffffbd198fd0-0xffffffffbd199020	print_type_s32
0xffffffffbd199020-0xffffffffbd199070	print_type_s64
0xffffffffbd199070-0xffffffffbd1990c0	print_type_x8
0xffffffffbd1990c0-0xffffffffbd199110	print_type_x16
0xffffffffbd199110-0xffffffffbd199160	print_type_x32
0xffffffffbd199160-0xffffffffbd1991b0	print_type_x64

But not always:

0xffffffffbd1997b0-0xffffffffbd1997c0	fetch_kernel_stack_address (kernel/trace/trace_probe.c)
0xffffffffbd1c57f0-0xffffffffbd1c58b0	__context_tracking_enter   (kernel/context_tracking.c)

Signed-off-by: Li Bin <huawei.libin@huawei.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: zhangmengting@huawei.com
Fixes: 9aaf5a5f47 ("perf probe: Check kprobes blacklist when adding new events")
Link: http://lkml.kernel.org/r/1504011443-7269-1-git-send-email-huawei.libin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-29 11:14:12 -03:00
David Carrillo-Cisneros 3866058ef1 perf tools: Robustify detection of clang binary
Prior to this patch, make scripts tested for CLANG with ifeq ($(CC),
clang), failing to detect CLANG binaries with different names. Fix it by
testing for the existence of __clang__ macro in the list of compiler
defined macros.

Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Paul Turner <pjt@google.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170827075442.108534-5-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 16:44:46 -03:00
Jiri Olsa 9933183e36 perf report: Group stat values on global event id
There's no big value on displaying counts for every event ID, which is
one per every CPU. Rather than that, displaying the whole sum for the
event.

  $ perf record -c 100000 -e cycles:u -s test
  $ perf report -T

Before:
  #  PID   TID  cycles:u  cycles:u  cycles:u  cycles:u  ... [20 more columns of 'cycles:u']
    3339  3339         0         0         0         0
    3340  3340         0         0         0         0
    3341  3341         0         0         0         0
    3342  3342         0         0         0         0

Now:
  #  PID   TID  cycles:u
    3339  3339     19678
    3340  3340     18744
    3341  3341     17335
    3342  3342     26414

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-10-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 16:44:44 -03:00
Jiri Olsa a1834fc938 perf values: Zero value buffers
We need to make sure the array of value pointers are zero initialized,
because we use them in realloc later on and uninitialized non zero value
will cause allocation error and aborted execution.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-9-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 16:44:43 -03:00
Jiri Olsa f4ef3b7c18 perf values: Fix allocation check
Bailing out in case the allocation failed, not the other way round.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-8-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 16:44:43 -03:00
Jiri Olsa 64eed1deb6 perf values: Fix thread index bug
We are taking wrong index (+1) for first thread, which leaves thread
with index 0 unused and uninitialized.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-7-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 16:44:42 -03:00
Jiri Olsa dac7f6b7ed perf report: Add dump_read function
Adding dump_read function to gather all the dump output of read
function. Adding output of enabled and running times and id if enabled
(3 new lines with '...' prefix below).

  $ perf record -s ...
  $ perf report -D

  958358311769 0x91f8 [0x40]: PERF_RECORD_READ: 3339 3339 cycles:u 0
  ... time enabled : 958358313731
  ... time running : 958358313731
  ... id           : 80

Committer note:

Do not use 'read' as a variable name as it breaks the build on older
systems, such as RHEL6:

    CC       /tmp/build/perf/util/session.o
  cc1: warnings being treated as errors
  util/session.c: In function 'dump_read':
  util/session.c:1132: error: declaration of 'read' shadows a global declaration
  /usr/include/bits/unistd.h:35: error: shadowed declaration is here
  mv: cannot stat `/tmp/build/perf/util/.session.o.tmp': No such file or directory

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-6-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 16:43:50 -03:00
Jiri Olsa a17f069787 perf record: Set read_format for inherit_stat
Set read_format for what we expect to get from read event generated by
perf_event_attr::inherit_stat.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824162737.7813-5-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 11:05:10 -03:00
Jiri Olsa 12c15302dd perf c2c: Fix remote HITM detection for Skylake
Skylake introduced new mem_remote bit in union perf_mem_data_src [1].
It applies to any other memory level to express Remote unknown level, as
is reported by Skylake.

Adding this extra check to c2c_decode_stats to properly decode remote
HITMs on Skylake.

[1] http://lkml.kernel.org/r/20170816222156.19953-4-andi@firstfloor.org

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Joe Mario <jmario@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170824085732.28481-1-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-28 11:05:10 -03:00
Konstantin Khlebnikov ac0bb6b72f perf: Fix documentation for sysctls perf_event_paranoid and perf_event_mlock_kb
Fix misprint CAP_IOC_LOCK -> CAP_IPC_LOCK. This capability have nothing
to do with raw tracepoints. This part is about bypassing mlock limits.

Sysctl kernel.perf_event_paranoid = -1 allows raw and ftrace function
tracepoints without CAP_SYS_ADMIN.

Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/150322916080.129746.11285255474738558340.stgit@buzz
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 13:24:54 -03:00
Andi Kleen 52839e653b perf tools: Add support for printing new mem_info encodings
Add decoding for the new "lvlx" and "snoopx" meminfo fields added
earlier to the kernel so that "perf mem report" and other tools can
print it properly.

v2: Merge with persistent memory patch.
Switch to new bit encoding for each combination.

v3: Switch to generic lvlnum field.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Peter Zijlstra <peterz@infradead.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170816222156.19953-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 12:30:25 -03:00
Andi Kleen d66dccdb13 perf tools: Dedup events in expression parsing
Avoid adding redundant events while parsing an expression.  When we add
an "other" event check first if it already exists.

v2: Fix perf test failure.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-10-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 12:19:08 -03:00
Andi Kleen 8d3db2b97f perf tools: Increase maximum number of events in expressions
Some of the upcoming metrics need more than 8 events. Increase the maximum
number the parser supports.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-9-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 12:19:05 -03:00
Andi Kleen d73bad0685 perf tools: Expression parser enhancements for metrics
Enhance the expression parser for more complex metric formulas.

- Support python style IF ELSE operators
- Add an #SMT_On magic variable for formulas that depend on the SMT
status.

Example: 4 *( CPU_CLK_UNHALTED.THREAD_ANY / 2 ) if #SMT_on else cycles

- Support MIN/MAX operations

Example: min(1 , IDQ.MITE_UOPS / ( UPI * 16 * ( ICACHE.HIT + ICACHE.MISSES ) / 4.0 ) )

This is useful to fix up problems caused by multiplexing.

- Support | & ^ operators
- Minor cleanups and fixes
- Support an \ escape for operators. This allows to specify event names
like c2-residency
- Support @ as an alternative for / to be able to specify pmus without
conflicts with operators (like msr/tsc/ as msr@tsc@)

Example: (cstate_core@c3\\-residency@ / msr@tsc@) * 100

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-8-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 12:15:53 -03:00
Andi Kleen de5077c4e3 perf tools: Add utility function to detect SMT status
Add an smt_on() function to return if SMT is enabled or disabled.  Used
in the next patch.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-7-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 12:09:04 -03:00
Andi Kleen 77d0871c76 perf bpf: Tighten detection of BPF events
perf stat -e cpu/uops_executed.core,cmask=1/

would be detected as a BPF source event because the .c matches the .c
source BPF pattern.

v2:

Originally I tried to use lex lookahead, but it doesn't seem to work.

This now extends the BPF pattern to match longer events, but then does
an extra check in the C code to reject BPF matches that do not end with
.c/.o/.obj

This uses REJECT, which makes the flex scanner slower, but that
shouldn't be a big problem for the perf events.

Committer testing:

  # perf trace -e write -e /home/acme/bpf/tracepoint.c cat /etc/passwd > /dev/null
     0.000 ( 0.006 ms): cat/18485 write(fd: 1, buf: 0x7f59eebe1000, count: 3494                         ) ...
     0.006 (         ): raw_syscalls:sys_enter:NR 1 (1, 7f59eebe1000, da6, 22, 7f59eebe0010, 0))
     0.008 (         ): perf_bpf_probe:_write:(ffffffff9626b2c0))
     0.000 ( 0.010 ms): cat/18485  ... [continued]: write()) = 3494
  #

It continues doing what was expected, i.e. identifying
/home/acme/bpf/tracepoint.c as a BPF event and activates the clang
machinery to build an eBPF object and then uses sys_bpf() to hook it up
to the raw_syscalls:sys_enter tracepoint, etc.

Andi forgot to add Wang to the CC list, fix it.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/r/20170811232634.30465-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 11:56:22 -03:00
Andi Kleen 475fb533fb perf evsel: Fix buffer overflow while freeing events
Fix buffer overflow for:

  % perf stat -e msr/tsc/,cstate_core/c7-residency/ true

that causes glibc free list corruption. For some reason it doesn't
trigger in valgrind, but it is visible in AS:

  =================================================================
  ==32681==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x603000003f5c at pc 0x0000005671ef bp 0x7ffdaaac9ac0 sp 0x7ffdaaac9ab0
  READ of size 4 at 0x603000003f5c thread T0
    #0 0x5671ee in perf_evsel__close_fd util/evsel.c:1196
    #1 0x56c57a in perf_evsel__close util/evsel.c:1717
    #2 0x55ed5f in perf_evlist__close util/evlist.c:1631
    #3 0x4647e1 in __run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:749
    #4 0x4648e3 in run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:767
    #5 0x46e1bc in cmd_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:2785
    #6 0x52f83d in run_builtin /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:296
    #7 0x52fd49 in handle_internal_command /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:348
    #8 0x5300de in run_argv /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:392
    #9 0x5308f3 in main /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:530
    #10 0x7f0672d13400 in __libc_start_main (/lib64/libc.so.6+0x20400)
    #11 0x428419 in _start (/home/ak/hle/obj-perf/perf+0x428419)

  0x603000003f5c is located 0 bytes to the right of 28-byte region [0x603000003f40,0x603000003f5c)
  allocated by thread T0 here:
    #0 0x7f0675139020 in calloc (/lib64/libasan.so.3+0xc7020)
    #1 0x648a2d in zalloc util/util.h:23
    #2 0x648a88 in xyarray__new util/xyarray.c:9
    #3 0x566419 in perf_evsel__alloc_fd util/evsel.c:1039
    #4 0x56b427 in perf_evsel__open util/evsel.c:1529
    #5 0x56c620 in perf_evsel__open_per_thread util/evsel.c:1730
    #6 0x461dea in create_perf_stat_counter /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:263
    #7 0x4637d7 in __run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:600
    #8 0x4648e3 in run_perf_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:767
    #9 0x46e1bc in cmd_stat /home/ak/hle/linux-hle-2.6/tools/perf/builtin-stat.c:2785
    #10 0x52f83d in run_builtin /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:296
    #11 0x52fd49 in handle_internal_command /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:348
    #12 0x5300de in run_argv /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:392
    #13 0x5308f3 in main /home/ak/hle/linux-hle-2.6/tools/perf/perf.c:530
    #14 0x7f0672d13400 in __libc_start_main (/lib64/libc.so.6+0x20400)

The event is allocated with cpus == 1, but freed with cpus == real number
When the evsel close function walks the file descriptors it exceeds the
fd xyarray boundaries and reads random memory.

v2:

Now that xyarrays save their original dimensions we can use these to
iterate the two dimensional fd arrays. Fix some users (close, ioctl) in
evsel.c to use these fields directly. This allows simplifying the code
and dropping quite a few function arguments. Adjust all callers by
removing the unneeded arguments.

The actual perf event reading still uses the original values from the
evsel list.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-2-andi@firstfloor.org
[ Fix up xy_max_[xy]() -> xyarray__max_[xy]() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 11:51:31 -03:00
Andi Kleen d74be47673 perf xyarray: Save max_x, max_y
Save the original array dimensions in xyarrays, so that users can
retrieve them later. Add some inline functions to access these fields.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170811232634.30465-1-andi@firstfloor.org
[ As noticed by Jiri, fix up namespacing: xy__method() -> xyarray__method() ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-22 11:51:28 -03:00
Taeung Song 1ac39372e0 perf annotate stdio: Support --show-nr-samples option
Add --show-nr-samples option to "perf annotate" so that it matches "perf
report".

Committer note:

Note that it can't be used together with --show-total-period, which
seems like a silly limitation, that can be lifted at some point.

Made it bail out if not on --stdio.

Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/1503046008-5511-1-git-send-email-treeze.taeung@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-18 10:31:53 -03:00
Arnaldo Carvalho de Melo 9a57eaf1d2 perf tools: Use default CPUINFO_PROC where it fits
Several architectures don't need to define it since the string is the
same as the default one, so nuke them.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-v1e1jr1u474w9xcelpaoxamu@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-17 16:58:21 -03:00
Arnaldo Carvalho de Melo 5d9cdc1181 perf events parse: Rename parse_events_parse arguments
Calling them just "data" is too vague, call it 'perf_state', to make it
clearer, for instance, when looking at patch hunks.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-rnhk5yb05wem77rjpclrh7so@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-17 16:39:15 -03:00
Arnaldo Carvalho de Melo d17d0878f4 perf events parse: Use just one parse events state struct
Andi reported problems when parse errors were detected with vendor
events (json), because in the yyparse/parse_events_parse function we
dereferenced the _data parameter to two different structs, with
different layouts, which ended up making parse_events_evlist->error to
point to random stack addresses.

Fix it by making _data to always be struct parse_events_state, changing
the only place where 'struct parse_events_term' was used in
parse_events.y.

Reported-by: Andi Kleen <ak@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-bc27lshz823hxl8n9nkelcgh@git.kernel.org
Fixes: 90e2b22dee ("perf/tool: Add support to reuse event grammar to parse out terms")
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-17 16:39:15 -03:00
Arnaldo Carvalho de Melo 5d369a75ed perf events parse: Rename parsing state struct to clearer name
Rename it from 'parse_events_evlist' to 'parse_events_state' to better
state that this is parsing state that has to be passed around.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-dursqtg2h2w98ztaa297u43x@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-17 16:39:15 -03:00
Arnaldo Carvalho de Melo 07806a1df1 perf events parse: Remove some needless local variables
Those are just casting a void pointer to a struct to then pass them to
functions, i.e. remove the local variables and pass the void pointer
directly, the casting will be done and the code will be shorter.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Wang Nan <wangnan0@huawei.com>
Link: http://lkml.kernel.org/n/tip-bzfodzr3mb46gy7u7v0mqad6@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-17 16:39:15 -03:00
Wang Nan db26984a36 perf bpf: Fix endianness problem when loading parameters in prologue
Perf's BPF prologue generator unconditionally fetches 8 bytes for
function parameters, which causes problems on big endian machines. Thomas
gives a detailed analysis for this problem:

 http://lkml.kernel.org/r/968ebda5-abe4-8830-8d69-49f62529d151@linux.vnet.ibm.com

 ---- 8< ----
  I investigated perf test BPF for s390x and have a question regarding
  the 38.3 subtest (bpf-prologue test) which fails on s390x.

  When I turn on trace_printk in tests/bpf-script-test-prologue.c
  I see this output in /sys/kernel/debug/tracing/trace:

  [root@s8360047 perf]# cat /sys/kernel/debug/tracing/trace
  perf-30229 [000] d..2 170161.535791: : f_mode 2001d00000000 offset:0 orig:0
  perf-30229 [000] d..2 170161.535809: : f_mode 6001f00000000 offset:0 orig:0
  perf-30229 [000] d..2 170161.535815: : f_mode 6001f00000000 offset:1 orig:0
  perf-30229 [000] d..2 170161.535819: : f_mode 2001d00000000 offset:1 orig:0
  perf-30229 [000] d..2 170161.535822: : f_mode 2001d00000000 offset:2 orig:1
  perf-30229 [000] d..2 170161.535825: : f_mode 6001f00000000 offset:2 orig:1
  perf-30229 [000] d..2 170161.535828: : f_mode 6001f00000000 offset:3 orig:1
  perf-30229 [000] d..2 170161.535832: : f_mode 2001d00000000 offset:3 orig:1
  perf-30229 [000] d..2 170161.535835: : f_mode 2001d00000000 offset:4 orig:0
  perf-30229 [000] d..2 170161.535841: : f_mode 6001f00000000 offset:4 orig:0

  [...]

  There are 3 parameters the eBPF program tests/bpf-script-test-prologue.c
  accesses: f_mode (member of struct file at offset 140) offset and orig.  They
  are parameters of the lseek() system call triggered in this test case in
  function llseek_loop().

  What is really strange is the value of f_mode. It is an 8 byte value, whereas
  in the probe event it is defined as a 4 byte value.  The lower 4 bytes are all
  zero and do not belong to member f_mode.  The correct value should be 2001d for
  read-only and 6001f for read-write open mode.

  Here is the output of the 'perf test -vv bpf' trace:
  Try to find probe point from debuginfo.
  Matched function: null_lseek [2d9310d]
   Probe point found: null_lseek+0
  Searching 'file' variable in context.
  Converting variable file into trace event.
  converting f_mode in file
  f_mode type is unsigned int.
  Opening /sys/kernel/debug/tracing//README write=0
  Searching 'offset' variable in context.
  Converting variable offset into trace event.
  offset type is long long int.
  Searching 'orig' variable in context.
  Converting variable orig into trace event.
  orig type is int.
  Found 1 probe_trace_events.
  Opening /sys/kernel/debug/tracing//kprobe_events write=1
  Writing event: p:perf_bpf_probe/func _text+8794224 f_mode=+140(%r2):x32
 ---- 8< ----

This patch parses the type of each argument and converts data from memory to
expected type.

Now the test runs successfully on 4.13.0-rc5:

  [root@s8360046 perf]# ./perf test  bpf
  38: BPF filter                                 :
  38.1: Basic BPF filtering                      : Ok
  38.2: BPF pinning                              : Ok
  38.3: BPF prologue generation                  : Ok
  38.4: BPF relocation checker                   : Ok
  [root@s8360046 perf]#

Signed-off-by: Wang Nan <wangnan0@huawei.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Link: http://lkml.kernel.org/r/20170815092159.31912-1-tmricht@linux.vnet.ibm.com
Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-16 10:31:11 -03:00
Thomas Richter 4a084ecfc8 perf report: Fix module symbol adjustment for s390x
The 'perf report' tool does not display the addresses of kernel module
symbols correctly.

For example symbol qeth_send_ipa_cmd in kernel module qeth.ko has this
relative address for function qeth_send_ipa_cmd():

  [root@s8360047 linux]# nm -g drivers/s390/net/qeth.ko | fgrep send_ipa_cmd
  0000000000013088 T qeth_send_ipa_cmd

The module is loaded at address:

  [root@s8360047 linux]# cat /sys/module/qeth/sections/.text
  0x000003ff80296d20
  [root@s8360047 linux]#

This should result in a start address of:

  0x13088 + 0x3ff80296d20 = 0x3ff802a9da8

Using crash to verify the address on a live system:

  [root@s8360046 linux]# crash vmlinux

  crash 7.1.9++
  Copyright (C) 2002-2016  Red Hat, Inc.
  Copyright (C) 2004, 2005, 2006, 2010  IBM Corporation

  [...]

  crash> mod -s qeth drivers/s390/net/qeth.ko
       MODULE       NAME        SIZE  OBJECT FILE
       3ff8028d700  qeth      151552  drivers/s390/net/qeth.ko
  crash> sym qeth_send_ipa_cmd
  3ff802a9da8 (T) qeth_send_ipa_cmd [qeth] /root/linux/drivers/s390/net/qeth_core_main.c: 2944
  crash>

Now perf report displays the address of symbol qeth_send_ipa_cmd:
symbol__new:

  qeth_send_ipa_cmd 0x130f0-0x132ce

There is a difference of 0x68 between the entry in the symbol table (see
nm command above) and perf. The difference is from the offset the .text
segment of qeth.ko:

  [root@s8360047 perf]# readelf -a drivers/s390/net/qeth.ko
  Section Headers:
  [Nr] Name              Type             Address           Offset
       Size              EntSize          Flags  Link  Info  Align
  [ 0]                   NULL             0000000000000000  00000000
       0000000000000000  0000000000000000           0     0     0
  [ 1] .note.gnu.build-i NOTE             0000000000000000  00000040
       0000000000000024  0000000000000000   A       0     0     4
  [ 2] .text             PROGBITS         0000000000000000  00000068
       000000000001c8a0  0000000000000000  AX       0     0     8

As seen the .text segment has an offset of 0x68 with start address 0x0.
Therefore 0x68 is added to the address of qeth_send_ipa_cmd and thus
0x13088 + 0x68 = 0x130f0 is displayed.

This is wrong, perf report needs to display the start address of symbol
qeth_send_ipa_cmd at 0x13088 + qeth.ko.text section start address.

The qeth.ko module .text start address is available in the qeth.ko DSO
map. Just identify the kernel module symbols and correct the addresses.

With the fix I see this correct address for symbol: symbol__new:
qeth_send_ipa_cmd 0x3ff802a9da8-0x3ff802a9f86

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
LPU-Reference: 20170803134902.47207-1-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-q8lktlpoxb5e3dj52u1s1rw4@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11 16:06:32 -03:00
Thomas Richter 9ad4652b66 perf record: Fix wrong size in perf_record_mmap for last kernel module
During work on perf report for s390 I ran into the following issue:

0 0x318 [0x78]: PERF_RECORD_MMAP -1/0:
        [0x3ff804d6990(0xfffffc007fb2966f) @ 0]:
        x /lib/modules/4.12.0perf1+/kernel/drivers/s390/net/qeth_l2.ko

This is a PERF_RECORD_MMAP entry of the perf.data file with an invalid
module size for qeth_l2.ko (the s390 ethernet device driver).

Even a mainframe does not have 0xfffffc007fb2966f bytes of main memory.

It turned out that this wrong size is created by the perf record
command.  What happens is this function call sequence from
__cmd_record():

  perf_session__new():
    perf_session__create_kernel_maps():
      machine__create_kernel_maps():
        machine__create_modules():   Creates map for all loaded kernel modules.
          modules__parse():   Reads /proc/modules and extracts module name and
                              load address (1st and last column)
            machine__create_module():   Called for every module found in /proc/modules.
                              Creates a new map for every module found and enters
                              module name and start address into the map. Since the
                              module end address is unknown it is set to zero.

This ends up with a kernel module map list sorted by module start
addresses.  All module end addresses are zero.

Last machine__create_kernel_maps() calls function map_groups__fixup_end().
This function iterates through the maps and assigns each map entry's
end address the successor map entry start address. The last entry of the
map group has no successor, so ~0 is used as end to consume the remaining
memory.

Later __cmd_record calls function record__synthesize() which in turn calls
perf_event__synthesize_kernel_mmap() and perf_event__synthesize_modules()
to create PERF_REPORT_MMAP entries into the perf.data file.

On s390 this results in the last module qeth_l2.ko
(which has highest start address, see module table:
        [root@s8360047 perf]# cat /proc/modules
        qeth_l2 86016 1 - Live 0x000003ff804d6000
        qeth 266240 1 qeth_l2, Live 0x000003ff80296000
        ccwgroup 24576 1 qeth, Live 0x000003ff80218000
        vmur 36864 0 - Live 0x000003ff80182000
        qdio 143360 2 qeth_l2,qeth, Live 0x000003ff80002000
        [root@s8360047 perf]# )
to be the last entry and its map has an end address of ~0.

When the PERF_RECORD_MMAP entry is created for kernel module qeth_l2.ko
its start address and length is written. The length is calculated in line:
    event->mmap.len   = pos->end - pos->start;
and results in 0xffffffffffffffff - 0x3ff804d6990(*) = 0xfffffc007fb2966f

(*) On s390 the module start address is actually determined by a __weak function
named arch__fix_module_text_start() in machine__create_module().

I think this improvable. We can use the module size (2nd column of /proc/modules)
to get each loaded kernel module size and calculate its end address.
Only for map entries which do not have a valid end address (end is still zero)
we can use the heuristic we have now, that is use successor start address or ~0.

Signed-off-by: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Hendrik Brueckner <brueckner@linux.vnet.ibm.com>
Cc: Thomas-Mich Richter <tmricht@linux.vnet.ibm.com>
Cc: Zvonko Kosic <zvonko.kosic@de.ibm.com>
LPU-Reference: 20170803134902.47207-2-tmricht@linux.vnet.ibm.com
Link: http://lkml.kernel.org/n/tip-nmoqij5b5vxx7rq2ckwu8iaj@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11 16:06:32 -03:00
Milian Wolff d964b1cdbd perf srcline: Do not consider empty files as valid srclines
Sometimes we get a non-null, but empty, string for the filename from
bfd. This then results in srclines of the form ":0", which is different
from the canonical SRCLINE_UNKNOWN in the form "??:0".  Set the file to
NULL if it is empty to fix this.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170806212446.24925-14-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11 16:06:31 -03:00
Milian Wolff 80c345b255 perf util: Take elf_name as const string in dso__demangle_sym
The input string is not modified and thus can be passed in as a pointer
to const data.

Signed-off-by: Milian Wolff <milian.wolff@kdab.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Cc: Yao Jin <yao.jin@linux.intel.com>
Link: http://lkml.kernel.org/r/20170806212446.24925-3-milian.wolff@kdab.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11 16:06:31 -03:00
Andi Kleen c295036b6a perf tools: Add missing newline to expr parser error messages
Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170724234015.5165-6-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11 10:42:53 -03:00
Andi Kleen 5e97665f91 perf stat: Fix saved values rbtree lookup
The stat shadow saved values rbtree is indexed by a pointer.  Fix the
comparison function:

- We cannot return a pointer delta as an int because that loses bits on
  64bit.

- Doing pointer arithmetic on the struct pointer only works if the
  objects are spaced by the multiple of the object size, which is not
  guaranteed for individual malloc'ed object

Replace it with a proper comparison.

This fixes various problems with values not being found.

Signed-off-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Link: http://lkml.kernel.org/r/20170724234015.5165-4-andi@firstfloor.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-08-11 10:42:52 -03:00
Geneviève Bastien f9f6f2a903 perf data: Add mmap[2] events to CTF conversion
This adds the mmap and mmap2 events to the CTF trace obtained from perf
data.

These events will allow CTF trace visualization tools like Trace Compass
to automatically resolve the symbols of the callchain to the
corresponding function or origin library.

To include those events, one needs to convert with the --all option.
Here follows an output of babeltrace:

  $ sudo perf data convert --all --to-ctf myctftrace
  $ babeltrace ./myctftrace
  [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 },
 { pid = 638, tid = 638, start = 0x7F54AE39E000, filename =
 "/usr/lib/ld-2.25.so" }
  [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 }, { pid =
 638, tid = 638, start = 0x7F54AE565000, filename =
 "/usr/lib/libudev.so.1.6.6" }
  [19:00:00.000000000] (+0.000000000) perf_mmap2: { cpu_id = 0 }, { pid =
 638, tid = 638, start = 0x7FFC093EA000, filename = "[vdso]" }

Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Julien Desfossez <jdesfossez@efficios.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170727181205.24843-2-gbastien@versatic.net
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 16:26:06 -03:00
Geneviève Bastien a3073c8e59 perf data: Add callchain to CTF conversion
The field perf_callchain, if available, is added to the sampling events
during the CTF conversion. It is an array of u64 values.  The
perf_callchain_size field contains the size of the array.

It will allow the analysis of sampling data in trace visualization tools
like Trace Compass. Possible analyses with those data: dynamic
flamegraphs, correlation with other tracing data like a userspace trace.

Here follows a babeltrace CTF output of a trace with callchain:

  $ babeltrace ./myctftrace
  [17:38:45.672760285] (+?.?????????) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 1, perf_callchain_size = 7, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770, [3] = 0xFFFFFFFF81006EC6, [4] = 0xFFFFFFFF8118245E, [5] = 0xFFFFFFFF810A9224, [6] = 0xFFFFFFFF8164A4C6 ] }
  [17:38:45.672777672] (+0.000017387) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 1, perf_callchain_size = 8, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770, [3] = 0xFFFFFFFF81006EC6, [4] = 0xFFFFFFFF8118245E, [5] = 0xFFFFFFFF810A9224, [6] = 0xFFFFFFFF8164A4C6, [7] = 0xFFFFFFFF8164ABAD ] }
  [17:38:45.672786700] (+0.000009028) cycles:ppp: { cpu_id = 0 }, { perf_ip = 0xFFFFFFFF81063EE4, perf_tid = 25841, perf_pid = 25774, perf_period = 70, perf_callchain_size = 3, perf_callchain = [ [0] = 0xFFFFFFFFFFFFFF80, [1] = 0xFFFFFFFF81063EE4, [2] = 0xFFFFFFFF8100C770 ] }

Signed-off-by: Geneviève Bastien <gbastien@versatic.net>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Francis Deslauriers <francis.deslauriers@efficios.com>
Cc: Julien Desfossez <jdesfossez@efficios.com>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/20170727181205.24843-1-gbastien@versatic.net
[ Removed PERF_SAMPLE_CALLCHAIN from the TODO list, jolsa ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 16:25:07 -03:00
Arnaldo Carvalho de Melo 48cc330852 perf annotate: Fix storing per line sym_hist_entry
The existing loop incremented the offset while using it as the array
index, when we went to an array of sym_hist_entry instances, we
should've moved the increment to outside of the array element reference,
oops, fix it.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Taeung Song <treeze.taeung@gmail.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: 461c17f00f ("perf annotate: Store the sample period in each histogram bucket")
Link: http://lkml.kernel.org/n/tip-s3dm6uyrazlpag3f0psfia07@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-28 12:53:05 -03:00
Arnaldo Carvalho de Melo ce9ee4a2de perf annotate stdio: Set enough columns for --show-total-period
Now that we set the first column header according to wether
--show-total-period is being used, we need to size it accordingly.

Based-on-a-patch-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Milian Wolff <milian.wolff@kdab.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/n/tip-pu504ffnit4m334k09hxcbs3@git.kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 17:16:46 -03:00
David Carrillo-Cisneros 64831a21db perf sort: Use default sort if evlist is empty
Fixes bug noted by Jiri in https://lkml.org/lkml/2017/6/13/755 and
caused by commit d49dadea78 ("perf tools: Make 'trace' or
'trace_fields' sort key default for tracepoint events") not taking into
account that evlist is empty in pipe-mode.

Before this commit, pipe mode will only show bogus "100.00%  N/A"
instead of correct output as follows:

  $ perf record -o - sleep 1 | perf report -i -
  # To display the perf.data header info, please use --header/--header-only options.
  #
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  #
  # Total Lost Samples: 0
  #
  # Samples: 8  of event 'cycles:ppH'
  # Event count (approx.): 145658
  #
  # Overhead  Trace output
  # ........  ............
  #
     100.00%  N/A

Correct output, after patch:

  $ perf record -o - sleep 1 | perf report -i -
  # To display the perf.data header info, please use --header/--header-only options.
  #
  [ perf record: Woken up 1 times to write data ]
  [ perf record: Captured and wrote 0.000 MB - ]
  #
  # Total Lost Samples: 0
  #
  # Samples: 8  of event 'cycles:ppH'
  # Event count (approx.): 191331
  #
  # Overhead  Command  Shared Object      Symbol
  # ........  .......  .................  .................................
  #
      81.63%  sleep    libc-2.19.so       [.] _exit
      13.58%  sleep    ld-2.19.so         [.] do_lookup_x
       2.34%  sleep    [kernel.kallsyms]  [k] context_switch
       2.34%  sleep    libc-2.19.so       [.] __GI___libc_nanosleep
       0.11%  perf     [kernel.kallsyms]  [k] __intel_pmu_enable_a

Reported-by: Jiri Olsa <jolsa@kernel.org>
Report-Link: https://lkml.kernel.org/r/20170613185422.GA6092@krava
Signed-off-by: David Carrillo-Cisneros <davidcc@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paul Turner <pjt@google.com>
Cc: Simon Que <sque@chromium.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Wang Nan <wangnan0@huawei.com>
Fixes: d49dadea78 ("perf tools: Make 'trace' or 'trace_fields' sort key default for tracepoint events")
Link: https://lkml.kernel.org/r/20170721051157.47331-1-davidcc@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 17:00:07 -03:00
Jiri Olsa 82bf311e15 perf stat: Use group read for event groups
Make perf stat use  group read if there  are groups defined. The group
read will get the values for all member of groups within a single
syscall instead of calling read syscall for every event.

We can see considerable less amount of kernel cycles spent on single
group read, than reading each event separately, like for following perf
stat command:

  # perf stat -e {cycles,instructions} -I 10 -a sleep 1

Monitored with "perf stat -r 5 -e '{cycles:u,cycles:k}'"

Before:

        24,325,676      cycles:u
       297,040,775      cycles:k

       1.038554134 seconds time elapsed

After:
        25,034,418      cycles:u
       158,256,395      cycles:k

       1.036864497 seconds time elapsed

The perf_evsel__open fallback changes contributed by Andi Kleen.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170726120206.9099-4-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 14:25:44 -03:00
Jiri Olsa f7794d5254 perf evsel: Add read_counter()
Add perf_evsel__read_counter() to read single or group counter. After
calling this function the counter's evsel::counts struct is filled with
values for the counter and member of its group if there are any.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170726120206.9099-3-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 14:21:59 -03:00
Jiri Olsa de63403bfd perf tools: Add perf_evsel__read_size function
Currently we use the size of struct perf_counts_values to read the
event, which prevents us to put any new member to the struct.

Adding perf_evsel__read_size to return size of the buffer needed for
event read.

Signed-off-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <andi@firstfloor.org>
Cc: David Ahern <dsahern@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Link: http://lkml.kernel.org/r/20170726120206.9099-2-jolsa@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-26 14:20:28 -03:00
Taeung Song 38d2dcd0cc perf annotate stdio: Fix column header when using --show-total-period
Currently the first column header is always "Percent", fix it to show
correct column name based on given options, i.e. if using
--show-total-period, show "Event count" as a first column.

Reported-by: Milian Wolff <milian.wolff@kdab.com>
Signed-off-by: Taeung Song <treeze.taeung@gmail.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lkml.kernel.org/r/c3c902e7-95bc-16d4-366f-12eb034c5c8d@gmail.com
[ Extracted from a larger patch ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:37 -03:00
Jin Yao a1a8bed32d perf report: Tag branch type/flag on "to" and tag cycles on "from"
Current --branch-history LBR annotation displays confused data. For
example, each cycles report is duplicated on both "from" and "to"
entries.

For example:

  perf report --branch-history --no-children --stdio

  --2.32%--main div.c:39 (COND_BWD CROSS_2M predicted:49.7% cycles:1)
            main div.c:44 (predicted:49.7% cycles:1)
            main div.c:42 (RET CROSS_2M cycles:2)
            compute_flag div.c:28 (cycles:2)
            compute_flag div.c:27 (RET CROSS_2M cycles:1)
            rand rand.c:28 (cycles:1)
            rand rand.c:28 (RET CROSS_2M cycles:1)
            __random random.c:298 (cycles:1)
            __random random.c:297 (COND_BWD CROSS_2M cycles:1)
            __random random.c:295 (cycles:1)
            __random random.c:295 (COND_BWD CROSS_2M cycles:1)
            __random random.c:295 (cycles:1)
            __random random.c:295 (RET CROSS_2M cycles:9)

The cycles should be tagged only on the "from". It's for the code block
that ends with "from", not for "to".

Another issue is the "predicted:49.7%" is duplicated too (tag on both
"from" and "to").

This patch tags the branch type/flag on "to" and tag the cycles on
"from".

For example:

  --2.32%--main div.c:39 (COND_BWD CROSS_2M predicted:49.7%)
            main div.c:44 (cycles:1)
            main div.c:42 (RET CROSS_2M)
            compute_flag div.c:28 (cycles:2)
            compute_flag div.c:27 (RET CROSS_2M)
            rand rand.c:28 (cycles:1)
            rand rand.c:28 (RET CROSS_2M)
            __random random.c:298 (cycles:1)
            __random random.c:297 (COND_BWD CROSS_2M)
            __random random.c:295 (cycles:1)
            __random random.c:295 (COND_BWD CROSS_2M)
            __random random.c:295 (cycles:1)
            __random random.c:295 (RET CROSS_2M)
            |
             --2.23%--__random_r random_r.c:392 (cycles:9)

In this example, The "main div.c:39 (COND_BWD CROSS_2M predicted:49.7%)"
is "to" of branch and "main div.c:44 (cycles:1)" is "from" of branch.
It should be easier for understanding than before.

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1500894547-18411-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:35 -03:00
Jin Yao b49a821ed9 perf report: Make --branch-history work without callgraphs(-g) option in perf record
perf record -b -g <command>
  perf report --branch-history

This merges the LBRs with the callgraphs.

However it would be nice if it also works without callgraphs (-g) set in
perf record, so that only the LBRs are displayed.  But currently perf
report errors in this case. For example,

  perf record -b <command>
  perf report --branch-history

  Error:
  Selected -g or --branch-history but no callchain data. Did
  you call 'perf record' without -g?

This patch displays the LBRs only even if callgraphs(-g) is not enabled
in perf record.

Change log:

v2: According to Milian Wolff's comment, change the obsolete error
message. Now the error message is:

                 ┌─Error:─────────────────────────────────────┐
                 │Selected -g or --branch-history.            │
                 │But no callchain or branch data.            │
                 │Did you call 'perf record' without -g or -b?│
                 │                                            │
                 │                                            │
                 │Press any key...                            │
                 └────────────────────────────────────────────┘

When passing the last parameter to hists__fprintf,
changes "|" to "||".

  hists__fprintf(hists, !quiet, 0, 0, rep->min_percent, stdout,
                 symbol_conf.use_callchain || symbol_conf.show_branchflag_count);

Signed-off-by: Yao Jin <yao.jin@linux.intel.com>
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lkml.kernel.org/r/1494240182-28899-1-git-send-email-yao.jin@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:46:03 -03:00
Arun Kalyanasundaram a641860550 perf script python: Generate hooks with additional argument
Modify the signature of tracepoint specific and trace_unhandled hooks to
add the perf_sample dict as a new argument.
Create a python helper function to print a dictionary.

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-6-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:21 -03:00
Arun Kalyanasundaram f38d281663 perf script python: Add perf_sample dict to tracepoint handlers
The process_event python hook receives a dict with all perf_sample
entries, but the tracepoint specific and trace_unhandled hooks predate
the introduction of this dict, and do not receive it.

Add the aforementioned dict as an additional argument to the affected
handlers. To keep backwards compatibility (and avoid unnecessary work),
do not pass the dict if the number of arguments signals that handler
version predates this change.

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-5-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:20 -03:00
Arun Kalyanasundaram 74ec14f389 perf script python: Add sample_read to dict
Provide time_enabled, time_running and counter value in the perf_sample
dict.

Signed-off-by: Arun Kalyanasundaram <arunkaly@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Daniel Borkmann <daniel@iogearbox.net>
Cc: David Carrillo-Cisneros <davidcc@google.com>
Cc: David S. Miller <davem@davemloft.net>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Seongjae Park <sj38.park@gmail.com>
Cc: Stephane Eranian <eranian@google.com>
Link: http://lkml.kernel.org/r/20170721220422.63962-4-arunkaly@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2017-07-25 22:43:19 -03:00