Commit graph

759 commits

Author SHA1 Message Date
Ian Rogers ee756ef749 perf dso: Add reference count checking and accessor functions
Add reference count checking to struct dso, this can help with
implementing correct reference counting discipline. To avoid
RC_CHK_ACCESS everywhere, add accessor functions for the variables in
struct dso.

The majority of the change is mechanical in nature and not easy to
split up.

Committer testing:

'perf test' up to this patch shows no regressions.

But:

  util/symbol.c: In function ‘dso__load_bfd_symbols’:
  util/symbol.c:1683:9: error: too few arguments to function ‘dso__set_adjust_symbols’
   1683 |         dso__set_adjust_symbols(dso);
        |         ^~~~~~~~~~~~~~~~~~~~~~~
  In file included from util/symbol.c:21:
  util/dso.h:268:20: note: declared here
    268 | static inline void dso__set_adjust_symbols(struct dso *dso, bool val)
        |                    ^~~~~~~~~~~~~~~~~~~~~~~
  make[6]: *** [/home/acme/git/perf-tools-next/tools/build/Makefile.build:106: /tmp/tmp.ZWHbQftdN6/util/symbol.o] Error 1
    MKDIR   /tmp/tmp.ZWHbQftdN6/tests/workloads/
  make[6]: *** Waiting for unfinished jobs....

This was updated:

  -       symbols__fixup_end(&dso->symbols, false);
  -       symbols__fixup_duplicate(&dso->symbols);
  -       dso->adjust_symbols = 1;
  +       symbols__fixup_end(dso__symbols(dso), false);
  +       symbols__fixup_duplicate(dso__symbols(dso));
  +       dso__set_adjust_symbols(dso);

But not build tested with BUILD_NONDISTRO and libbfd devel files installed
(binutils-devel on fedora).

Add the missing argument:

   	symbols__fixup_end(dso__symbols(dso), false);
   	symbols__fixup_duplicate(dso__symbols(dso));
  -	dso__set_adjust_symbols(dso);
  +	dso__set_adjust_symbols(dso, true);

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahelenia Ziemiańska <nabijaczleweli@nabijaczleweli.xyz>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ben Gainey <ben.gainey@arm.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Chengen Du <chengen.du@canonical.com>
Cc: Colin Ian King <colin.i.king@gmail.com>
Cc: Dima Kogan <dima@secretsauce.net>
Cc: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Li Dong <lidong@vivo.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paran Lee <p4ranlee@gmail.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Song Liu <song@kernel.org>
Cc: Sun Haiyong <sunhaiyong@loongson.cn>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yanteng Si <siyanteng@loongson.cn>
Cc: zhaimingbing <zhaimingbing@cmss.chinamobile.com>
Link: https://lore.kernel.org/r/20240504213803.218974-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-06 15:28:49 -03:00
Ian Rogers 8f283fb7b8 perf trace: Disable syscall augmentation with record
Syscall augmentation is causing samples not to be written to the
perf.data file with "perf trace record". Disabling augmentation is
sub-optimal, but it beats having a totally broken perf trace record.

Closes: https://lore.kernel.org/lkml/CAP-5=fV9Gd1Teak+EOcUSxe13KqSyfZyPNagK97GbLiOQRgGaw@mail.gmail.com/
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240216172357.65037-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-05-04 15:03:58 -03:00
Yang Jihong 09d2056efe perf evsel: Use evsel__name_is() helper
Code cleanup, replace strcmp(evsel__name(evsel, {NAME})) with
evsel__name_is() helper.

No functional change.

Committer notes:

Fix this build error:

          trace.syscalls.events.bpf_output = evlist__last(trace.evlist);
  -       assert(evsel__name_is(trace.syscalls.events.bpf_output), "__augmented_syscalls__");
  +       assert(evsel__name_is(trace.syscalls.events.bpf_output, "__augmented_syscalls__"));

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yang Jihong <yangjihong@bytedance.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240401062724.1006010-3-yangjihong@bytedance.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-04-03 11:48:56 -03:00
Arnaldo Carvalho de Melo 0831638e8c perf trace: Fix 'newfstatat'/'fstatat' argument pretty printing
There were needless two entries, one for 'newfstatat' and another for
'fstatat', keep just one and pretty print its 'flags' argument using the
fs_at_flags scnprintf that is also used by other FS syscalls such as
'stat', now:

  root@number:~# perf trace -e newfstatat --max-events=5
       0.000 ( 0.010 ms): abrt-dump-jour/1400 newfstatat(dfd: 7, filename: "", statbuf: 0x7fff0d127000, flag: EMPTY_PATH) = 0
       0.020 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 9, filename: "", statbuf: 0x55752507b0e8, flag: EMPTY_PATH) = 0
       0.039 ( 0.004 ms): abrt-dump-jour/1400 newfstatat(dfd: 19, filename: "", statbuf: 0x557525061378, flag: EMPTY_PATH) = 0
       0.047 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 20, filename: "", statbuf: 0x5575250b8cc8, flag: EMPTY_PATH) = 0
       0.053 ( 0.003 ms): abrt-dump-jour/1400 newfstatat(dfd: 22, filename: "", statbuf: 0x5575250535d8, flag: EMPTY_PATH) = 0
  root@number:~#

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240320193115.811899-6-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 13:54:40 -03:00
Arnaldo Carvalho de Melo 4d92328290 perf trace: Beautify the 'flags' arg of unlinkat
Reusing the fs_at_flags array done for the 'stat' syscall.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240320193115.811899-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 13:54:40 -03:00
Arnaldo Carvalho de Melo b8171a8406 perf beauty: Introduce faccessat2 flags scnprintf routine
The fsaccessat and fsaccessat2 now have beautifiers for its arguments.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240320193115.811899-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 13:54:40 -03:00
Arnaldo Carvalho de Melo 3d6cfbaf27 perf beauty: Introduce scrape script for various fs syscalls 'flags' arguments
It was using the first variation on producing a string representation
for a binary flag, one that used the system's fcntl.h and preprocessor
tricks that had to be updated everytime a new flag was introduced.

Use the more recent scrape script + strarray + strarray__scnprintf_flags() combo.

  $ tools/perf/trace/beauty/fs_at_flags.sh
  static const char *fs_at_flags[] = {
  	[ilog2(0x100) + 1] = "SYMLINK_NOFOLLOW",
  	[ilog2(0x200) + 1] = "REMOVEDIR",
  	[ilog2(0x400) + 1] = "SYMLINK_FOLLOW",
  	[ilog2(0x800) + 1] = "NO_AUTOMOUNT",
  	[ilog2(0x1000) + 1] = "EMPTY_PATH",
  	[ilog2(0x0000) + 1] = "STATX_SYNC_AS_STAT",
  	[ilog2(0x2000) + 1] = "STATX_FORCE_SYNC",
  	[ilog2(0x4000) + 1] = "STATX_DONT_SYNC",
  	[ilog2(0x8000) + 1] = "RECURSIVE",
  	[ilog2(0x80000000) + 1] = "GETATTR_NOSEC",
  };
  $

Now we need a copy of uapi/linux/fcntl.h from tools/include/ in the
scrape only directory tools/perf/trace/beauty/include and will use that
fs_at_flags array for other fs syscalls.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240320193115.811899-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 13:54:40 -03:00
Arnaldo Carvalho de Melo a9f4c6c999 perf trace: Collect sys_nanosleep first argument
That is a 'struct timespec' passed from userspace to the kernel as we
can see with a system wide syscall tracing:

  root@number:~# perf trace -e nanosleep
       0.000 (10.102 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
      38.924 (10.077 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     100.177 (10.107 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     139.171 (10.063 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     200.603 (10.105 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     239.399 (10.064 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     300.994 (10.096 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     339.584 (10.067 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     401.335 (10.057 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     439.758 (10.166 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     501.814 (10.110 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     539.983 (10.227 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     602.284 (10.199 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     640.208 (10.105 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     702.662 (10.163 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     740.440 (10.107 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     802.993 (10.159 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
  ^Croot@number:~# strace -p 9150 -e nanosleep

If we then use the ptrace method to look at that podman process:

  root@number:~# strace -p 9150 -e nanosleep
  strace: Process 9150 attached
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  ^Cstrace: Process 9150 detached
  root@number:~#

With some changes we can get something closer to the strace output,
still in system wide mode:

  root@number:~# perf config trace.show_arg_names=false
  root@number:~# perf config trace.show_duration=false
  root@number:~# perf config trace.show_timestamp=false
  root@number:~# perf config trace.show_zeros=true
  root@number:~# perf config trace.args_alignment=0
  root@number:~# perf trace -e nanosleep --max-events=10
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  root@number:~#
  root@number:~# perf config
  trace.show_arg_names=false
  trace.show_duration=false
  trace.show_timestamp=false
  trace.show_zeros=true
  trace.args_alignment=0
  root@number:~# cat ~/.perfconfig
  # this file is auto-generated.
  [trace]
  	show_arg_names = false
  	show_duration = false
  	show_timestamp = false
  	show_zeros = true
  	args_alignment = 0
  root@number:~#

This will not get reused by any other syscall as nanosleep is the only
one to have as its first argument a 'struct timespec" pointer argument
passed from userspace to the kernel:

  root@number:~# grep timespec /sys/kernel/tracing/events/syscalls/sys_enter_*/format | grep offset:16
  /sys/kernel/tracing/events/syscalls/sys_enter_nanosleep/format:	field:struct __kernel_timespec * rqtp;	offset:16;	size:8;	signed:0;
  root@number:~#

BTF based pretty printing will simplify all this, but then lets just get
the low hanging fruits first.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zbq72dJRpOlfRWnf@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:26 -03:00
Ian Rogers f178ffdf7e perf trace: Ignore thread hashing in summary
Commit 91e467bc56 ("perf machine: Use hashtable for machine
threads") made the iteration of thread tids unordered. The perf trace
--summary output sorts and prints each hash bucket, rather than all
threads globally. Change this behavior by turn all threads into a
list, sort the list by number of trace events then by tids, finally
print the list. This also allows the rbtree in threads to be not
accessed outside of machine.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-3-irogers@google.com
2024-03-03 22:51:18 -08:00
Arnaldo Carvalho de Melo 54373b5d53 perf env: Introduce perf_env__arch_strerrno()
That will cache the arch specific function translating error numbers to
strings.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: David Laight <David.Laight@ACULAB.COM>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20231201203046.486596-2-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-12-04 16:42:09 -03:00
Arnaldo Carvalho de Melo 64917f4df0 perf trace: Use heuristic when deciding if a syscall tracepoint "const char *" field is really a string
'perf trace' tries to find BPF progs associated with a syscall that have
a signature that is similar to syscalls without one to try and reuse,
so, for instance, the 'open' signature can be reused with many other
syscalls that have as its first arg a string.

It uses the tracefs events format file for finding a signature that can
be reused, but then comes the "write" syscall with its second argument
as a "const char *":

  # cat /sys/kernel/debug/tracing/events/syscalls/sys_enter_write/format
  name: sys_enter_write
  ID: 746
  format:
  	field:unsigned short common_type;	offset:0;	size:2;	signed:0;
  	field:unsigned char common_flags;	offset:2;	size:1;	signed:0;
  	field:unsigned char common_preempt_count;	offset:3;	size:1;	signed:0;
  	field:int common_pid;	offset:4;	size:4;	signed:1;

  	field:int __syscall_nr;	offset:8;	size:4;	signed:1;
  	field:unsigned int fd;	offset:16;	size:8;	signed:0;
  	field:const char * buf;	offset:24;	size:8;	signed:0;
  	field:size_t count;	offset:32;	size:8;	signed:0;

  print fmt: "fd: 0x%08lx, buf: 0x%08lx, count: 0x%08lx", ((unsigned long)(REC->fd)), ((unsigned long)(REC->buf)), ((unsigned long)(REC->count))
  #

Which isn't a string (the man page for glibc has buf as "void *"), so we
have to use the name of the argument as an heuristic, to consider a
string just args that are "const char *" and that have in its name  the
"path", "file", etc substrings.

With that now it reuses:

  [root@quaco ~]# perf trace -v --max-events=1 |& grep Reus
  Reusing "open" BPF sys_enter augmenter for "stat"
  Reusing "open" BPF sys_enter augmenter for "lstat"
  Reusing "open" BPF sys_enter augmenter for "access"
  Reusing "connect" BPF sys_enter augmenter for "accept"
  Reusing "sendto" BPF sys_enter augmenter for "recvfrom"
  Reusing "connect" BPF sys_enter augmenter for "bind"
  Reusing "connect" BPF sys_enter augmenter for "getsockname"
  Reusing "connect" BPF sys_enter augmenter for "getpeername"
  Reusing "open" BPF sys_enter augmenter for "execve"
  Reusing "open" BPF sys_enter augmenter for "truncate"
  Reusing "open" BPF sys_enter augmenter for "chdir"
  Reusing "open" BPF sys_enter augmenter for "mkdir"
  Reusing "open" BPF sys_enter augmenter for "rmdir"
  Reusing "open" BPF sys_enter augmenter for "creat"
  Reusing "open" BPF sys_enter augmenter for "link"
  Reusing "open" BPF sys_enter augmenter for "unlink"
  Reusing "open" BPF sys_enter augmenter for "symlink"
  Reusing "open" BPF sys_enter augmenter for "readlink"
  Reusing "open" BPF sys_enter augmenter for "chmod"
  Reusing "open" BPF sys_enter augmenter for "chown"
  Reusing "open" BPF sys_enter augmenter for "lchown"
  Reusing "open" BPF sys_enter augmenter for "mknod"
  Reusing "open" BPF sys_enter augmenter for "statfs"
  Reusing "open" BPF sys_enter augmenter for "pivot_root"
  Reusing "open" BPF sys_enter augmenter for "chroot"
  Reusing "open" BPF sys_enter augmenter for "acct"
  Reusing "open" BPF sys_enter augmenter for "swapon"
  Reusing "open" BPF sys_enter augmenter for "swapoff"
  Reusing "open" BPF sys_enter augmenter for "delete_module"
  Reusing "open" BPF sys_enter augmenter for "setxattr"
  Reusing "open" BPF sys_enter augmenter for "lsetxattr"
  Reusing "openat" BPF sys_enter augmenter for "fsetxattr"
  Reusing "open" BPF sys_enter augmenter for "getxattr"
  Reusing "open" BPF sys_enter augmenter for "lgetxattr"
  Reusing "openat" BPF sys_enter augmenter for "fgetxattr"
  Reusing "open" BPF sys_enter augmenter for "listxattr"
  Reusing "open" BPF sys_enter augmenter for "llistxattr"
  Reusing "open" BPF sys_enter augmenter for "removexattr"
  Reusing "open" BPF sys_enter augmenter for "lremovexattr"
  Reusing "fsetxattr" BPF sys_enter augmenter for "fremovexattr"
  Reusing "open" BPF sys_enter augmenter for "mq_open"
  Reusing "open" BPF sys_enter augmenter for "mq_unlink"
  Reusing "fsetxattr" BPF sys_enter augmenter for "add_key"
  Reusing "fremovexattr" BPF sys_enter augmenter for "request_key"
  Reusing "fremovexattr" BPF sys_enter augmenter for "inotify_add_watch"
  Reusing "fremovexattr" BPF sys_enter augmenter for "mkdirat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "mknodat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "fchownat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "futimesat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "newfstatat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "unlinkat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "linkat"
  Reusing "open" BPF sys_enter augmenter for "symlinkat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "readlinkat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "fchmodat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "faccessat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "utimensat"
  Reusing "connect" BPF sys_enter augmenter for "accept4"
  Reusing "fremovexattr" BPF sys_enter augmenter for "name_to_handle_at"
  Reusing "fremovexattr" BPF sys_enter augmenter for "renameat2"
  Reusing "open" BPF sys_enter augmenter for "memfd_create"
  Reusing "fremovexattr" BPF sys_enter augmenter for "execveat"
  Reusing "fremovexattr" BPF sys_enter augmenter for "statx"
  [root@quaco ~]#

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alan Maguire <alan.maguire@oracle.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZN5lrdeEdSMCn7hk@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-18 16:33:28 -03:00
Arnaldo Carvalho de Melo 83a0943b18 perf trace: Use the augmented_raw_syscall BPF skel only for tracing syscalls
It is possible to use 'perf trace' with tracepoints and in that case we
can't initialize/use the augmented_raw_syscalls BPF skel.

For instance, this usecase:

  # perf trace -e sched:*exec --max-events=5
         ? (         ): NetworkManager/1183  ... [continued]: poll())                                             = 1
     0.043 ( 0.007 ms): NetworkManager/1183 epoll_wait(epfd: 17<anon_inode:[eventpoll]>, events: 0x55555f90e920, maxevents: 6) = 0
     0.060 ( 0.007 ms): NetworkManager/1183 write(fd: 3<anon_inode:[eventfd]>, buf: 0x7ffc5a27cd30, count: 8)     = 8
     0.073 ( 0.005 ms): NetworkManager/1183 epoll_wait(epfd: 24<anon_inode:[eventpoll]>, events: 0x7ffc5a27cd20, maxevents: 2) = 1
     0.082 ( 0.010 ms): NetworkManager/1183 recvmmsg(fd: 26<socket:[30298]>, mmsg: 0x7ffc5a27caa0, vlen: 8)       = 1
  #

Where we want to trace just some sched tracepoints ending in 'exec' ends
up tracing all syscalls.

Fix it by checking existing trace->trace_syscalls boolean to see if we
need the augmenter.

A followup patch will move those sections of code used only with the
augmenter to separate functions, to get it cleaner and remove the goto,
done just for reviewing purposes.

With this patch in place the previous behaviour is restored: no syscalls
when we have other events and no syscall names:

  [root@quaco ~]# perf probe do_filp_open "filename=pathname->name:string"
  Added new event:
    probe:do_filp_open   (on do_filp_open with filename=pathname->name:string)

  You can now use it in all perf tools, such as:

	  perf record -e probe:do_filp_open -aR sleep 1

  [root@quaco ~]# perf trace --max-events=10 -e probe:do_filp_open sleep 1
     0.000 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/etc/ld.so.cache")
     0.056 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/lib64/libc.so.6")
     0.481 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/locale-archive")
     0.501 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/share/locale/locale.alias")
     0.572 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.UTF-8/LC_IDENTIFICATION")
     0.581 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.utf8/LC_IDENTIFICATION")
     0.616 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib64/gconv/gconv-modules.cache")
     0.656 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.UTF-8/LC_MEASUREMENT")
     0.664 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.utf8/LC_MEASUREMENT")
     0.696 sleep/455122 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/en_US.UTF-8/LC_TELEPHONE")
  [root@quaco ~]#

As well as mixing syscalls with tracepoints, getting the syscall
tracepoints used augmented using the BPF skel:

  [root@quaco ~]# perf trace --max-events=10 -e open*,probe:do_filp_open sleep 1
     0.000 (         ): sleep/455124 openat(dfd: CWD, filename: "/etc/ld.so.cache", flags: RDONLY|CLOEXEC) ...
     0.005 (         ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/etc/ld.so.cache")
     0.000 ( 0.011 ms): sleep/455124  ... [continued]: openat())                                           = 3
     0.031 (         ): sleep/455124 openat(dfd: CWD, filename: "/lib64/libc.so.6", flags: RDONLY|CLOEXEC) ...
     0.033 (         ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/lib64/libc.so.6")
     0.031 ( 0.006 ms): sleep/455124  ... [continued]: openat())                                           = 3
     0.258 (         ): sleep/455124 openat(dfd: CWD, filename: "/usr/lib/locale/locale-archive", flags: RDONLY|CLOEXEC) ...
     0.261 (         ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/lib/locale/locale-archive")
     0.258 ( 0.006 ms): sleep/455124  ... [continued]: openat())                                           = -1 ENOENT (No such file or directory)
     0.272 (         ): sleep/455124 openat(dfd: CWD, filename: "/usr/share/locale/locale.alias", flags: RDONLY|CLOEXEC) ...
     0.273  (        ): sleep/455124 probe:do_filp_open(__probe_ip: -1186560412, filename: "/usr/share/locale/locale.alias")

A final note: the probe:do_filp_open uses a kprobe (probably optimized
as its in the start of a function) that uses the kprobe_tracer mechanism
in the kernel to collect the pathname->name string and stash it into the
tracepoint created by 'perf probe' for that:

  [root@quaco ~]# cat /sys/kernel/debug/tracing/kprobe_events
  p:probe/do_filp_open _text+4621920 filename=+0(+0(%si)):string
  [root@quaco ~]#

While the syscalls:sys_enter_openat tracepoint gets its string from a
BPF program attached to raw_syscalls:sys_enter that tail calls into
another BPF program that knows the types for the openat syscall args and
thus can bpf_probe_read it right after the normal
sys_enter/sys_enter_openat tracepoint payload that comes prefixed with
whatever perf_event_open asked for (CPU, timestamp, etc):

  [root@quaco ~]# bpftool prog | grep -E "sys_enter |sys_enter_opena" -A3
  3176: tracepoint  name sys_enter  tag 0bc3fc9d11754ba1  gpl
	loaded_at 2023-08-17T12:32:20-0300  uid 0
	xlated 272B  jited 257B  memlock 4096B  map_ids 2462,2466,2463
	btf_id 2976
  --
  3180: tracepoint  name sys_enter_opena  tag 19dd077f00ec2f58  gpl
	  loaded_at 2023-08-17T12:32:20-0300  uid 0
	  xlated 328B  jited 206B  memlock 4096B  map_ids 2466,2465
	  btf_id 2976
  [root@quaco ~]#

Fixes: 5e6da6be30 ("perf trace: Migrate BPF augmentation to use a skeleton")
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wang ShaoBo <bobo.shaobowang@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Link: https://lore.kernel.org/lkml/ZN4+s2Wl+zYmXTDj@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-18 16:33:12 -03:00
Ian Rogers 5e6da6be30 perf trace: Migrate BPF augmentation to use a skeleton
Previously a BPF event of augmented_raw_syscalls.c could be used to
enable augmentation of syscalls by perf trace. As BPF events are no
longer supported, switch to using a BPF skeleton which when attached
explicitly opens the sysenter and sysexit tracepoints.

The dump map is removed as debugging wasn't supported by the
augmentation and bpf_printk can be used when necessary.

Remove tools/perf/examples/bpf/augmented_raw_syscalls.c so that the
rename/migration to a BPF skeleton captures that this was the source.

Committer notes:

Some minor stylistic changes to help visualizing the diff.

Use libbpf_strerror when failing to load the augmented raw syscalls BPF.

Use  bpf_object__for_each_program(prog, trace.skel->obj) to disable auto
attachment for all but the sys_enter, sys_exit tracepoints, to avoid
having to add extra lines as we go adding support for more pointer
receiving syscalls.

Committer testing:

  # perf trace -e open*  --max-events=10
     0.000 ( 0.022 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC)    = 11
   208.833 (         ): gnome-terminal/3223 openat(dfd: CWD, filename: "/proc/51250/cmdline")                  ...
   249.993 ( 0.024 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/proc/meminfo", flags: RDONLY|CLOEXEC)    = 11
   250.118 ( 0.030 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.pressure", flags: RDONLY|CLOEXEC) = 11
   250.205 ( 0.016 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.current", flags: RDONLY|CLOEXEC) = 11
   250.244 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.min", flags: RDONLY|CLOEXEC) = 11
   250.282 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.low", flags: RDONLY|CLOEXEC) = 11
   250.320 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.swap.current", flags: RDONLY|CLOEXEC) = 11
   250.355 ( 0.014 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/memory.stat", flags: RDONLY|CLOEXEC) = 11
   250.717 ( 0.016 ms): systemd-oomd/1151 openat(dfd: CWD, filename: "/sys/fs/cgroup/user.slice/user-1001.slice/user@1001.service/memory.pressure", flags: RDONLY|CLOEXEC) = 11
  #
  # perf trace -e *nanosleep*  --max-events=10
         ? (         ): SCTP timer/28304  ... [continued]: clock_nanosleep())                                  = 0
     0.007 (10.058 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0
    10.069 (         ): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) ...
    10.069 (10.056 ms): SCTP timer/28304  ... [continued]: clock_nanosleep())                                  = 0
    17.059 (         ): podman/3572 nanosleep(rqtp: 0x7fc4f4d75be0)                                    ...
    17.059 (10.061 ms): podman/3572  ... [continued]: nanosleep())                                        = 0
    20.131 (10.059 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0
    30.195 (10.038 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0
    40.238 (10.057 ms): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) = 0
    50.301 (         ): SCTP timer/28304 clock_nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 }, rmtp: 0x7f0466b78de0) ...
  #

  # perf trace -e perf_event*  -- perf stat -e instructions,cycles,cache-misses sleep 0.1
     0.000 ( 0.011 ms): perf/51331 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x1 (PERF_COUNT_HW_INSTRUCTIONS), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 51332 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
     0.013 ( 0.003 ms): perf/51331 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0 (PERF_COUNT_HW_CPU_CYCLES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 51332 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
     0.017 ( 0.002 ms): perf/51331 perf_event_open(attr_uptr: { type: 0 (PERF_TYPE_HARDWARE), size: 136, config: 0x3 (PERF_COUNT_HW_CACHE_MISSES), sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 51332 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5

 Performance counter stats for 'sleep 0.1':

         1,495,051      instructions                     #    1.11  insn per cycle
         1,347,641      cycles
            35,424      cache-misses

       0.100935279 seconds time elapsed

       0.000924000 seconds user
       0.000000000 seconds sys

  #

  # perf trace -e connect*  ssh localhost
       0.000 ( 0.012 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.118 ( 0.004 ms): ssh/51346 connect(fd: 6, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.399 ( 0.007 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.426 ( 0.003 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.754 ( 0.009 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: INET, port: 22, addr: 127.0.0.1 }, addrlen: 16) = 0
       0.771 ( 0.010 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
       0.798 ( 0.053 ms): ssh/51346 connect(fd: 4, uservaddr: { .family: INET6, port: 22, addr: ::1 }, addrlen: 28) = 0
       0.870 ( 0.004 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.904 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.930 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.957 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       0.981 ( 0.003 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       1.006 ( 0.004 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
       1.036 ( 0.005 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/lib/sss/pipes/nss }, addrlen: 110) = -1 ECONNREFUSED (Connection refused)
      65.077 ( 0.022 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
      66.608 ( 0.014 ms): ssh/51346 connect(fd: 5, uservaddr: { .family: LOCAL, path: /var/run/.heim_org.h5l.kcm-socket }, addrlen: 110) = 0
  root@localhost's password:
  #

  # perf trace -e sendto*  ping -c 2 localhost
  PING localhost(localhost (::1)) 56 data bytes
  64 bytes from localhost (::1): icmp_seq=1 ttl=64 time=0.024 ms
       0.000 ( 0.011 ms): ping/51357 sendto(fd: 5, buff: 0x7ffcca35e620, len: 20, addr: { .family: NETLINK }, addr_len: 0xc) = 20
       0.135 ( 0.026 ms): ping/51357 sendto(fd: 4, buff: 0x5601398f7b20, len: 64, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 0x1c) = 64
    1014.929 ( 0.050 ms): ping/51357 sendto(fd: 4, buff: 0x5601398f7b20, len: 64, flags: CONFIRM, addr: { .family: INET6, port: 58, addr: ::1 }, addr_len: 0x1c) = 64
  64 bytes from localhost (::1): icmp_seq=2 ttl=64 time=0.046 ms

  --- localhost ping statistics ---
  2 packets transmitted, 2 received, 0% packet loss, time 1015ms
  rtt min/avg/max/mdev = 0.024/0.035/0.046/0.011 ms
  #

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wang ShaoBo <bobo.shaobowang@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230810184853.2860737-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-15 16:41:48 -03:00
Ian Rogers 3d6dfae889 perf parse-events: Remove BPF event support
New features like the BPF --filter support in perf record have made the
BPF event functionality somewhat redundant. As shown by commit
fcb027c1a4f6 ("perf tools: Revert enable indices setting syntax for BPF
map") and commit 14e4b9f428 ("perf trace: Raw augmented syscalls fix
libbpf 1.0+ compatibility") the BPF event support hasn't been well
maintained and it adds considerable complexity in areas like event
parsing, not least as '/' is a separator for event modifiers as well as
in paths.

This patch removes support in the event parser for BPF events and then
the associated functions are removed. This leads to the removal of whole
source files like bpf-loader.c.  Removing support means that augmented
syscalls in perf trace is broken, this will be fixed in a later commit
adding support using BPF skeletons.

The removal of BPF events causes an unused label warning from flex
generated code, so update build to ignore it:

  ```
  util/parse-events-flex.c:2704:1: error: label ‘find_rule’ defined but not used [-Werror=unused-label]
  2704 | find_rule: /* we branch to this label when backing up */
  ```

Committer notes:

Extracted from a larger patch that was also removing the support for
linking with libllvm and libclang, that were an alternative to using an
external clang execution to compile the .c event source code into BPF
bytecode.

Testing it:

  # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c
  event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c'
                        \___ Bad event or PMU

  Unabled to find PMU or event on a PMU of 'home'

  Initial error:
  event syntax error: '/home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c'
                        \___ Cannot find PMU `home'. Missing kernel support?
  Run 'perf list' for a list of valid events

   Usage: perf trace [<options>] [<command>]
      or: perf trace [<options>] -- <command> [<options>]
      or: perf trace record [<options>] [<command>]
      or: perf trace record [<options>] -- <command> [<options>]

      -e, --event <event>   event/syscall selector. use 'perf list' to list available events
  #

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrii Nakryiko <andrii@kernel.org>
Cc: Anshuman Khandual <anshuman.khandual@arm.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brendan Gregg <brendan.d.gregg@gmail.com>
Cc: Carsten Haitzler <carsten.haitzler@arm.com>
Cc: Eduard Zingerman <eddyz87@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: He Kuang <hekuang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Tom Rix <trix@redhat.com>
Cc: Wang Nan <wangnan0@huawei.com>
Cc: Wang ShaoBo <bobo.shaobowang@huawei.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Yonghong Song <yhs@fb.com>
Cc: YueHaibing <yuehaibing@huawei.com>
Cc: bpf@vger.kernel.org
Cc: llvm@lists.linux.dev
Link: https://lore.kernel.org/r/20230810184853.2860737-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-08-15 16:41:48 -03:00
Arnaldo Carvalho de Melo fcca1faf11 perf trace: Free thread_trace->files table
The fd->pathname table that is kept in 'struct thread_trace' and thus in
thread->priv must be freed when a thread is deleted.

This was also detected using -fsanitize=address.

Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20230719202951.534582-6-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-07-20 11:30:26 -03:00
Arnaldo Carvalho de Melo 7962ef1365 perf trace: Really free the evsel->priv area
In 3cb4d5e00e ("perf trace: Free syscall tp fields in
evsel->priv") it only was freeing if strcmp(evsel->tp_format->system,
"syscalls") returned zero, while the corresponding initialization of
evsel->priv was being performed if it was _not_ zero, i.e. if the tp
system wasn't 'syscalls'.

Just stop looking for that and free it if evsel->priv was set, which
should be equivalent.

Also use the pre-existing evsel_trace__delete() function.

This resolves these leaks, detected with:

  $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin

  =================================================================
  ==481565==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540e8b in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3212
      #7 0x540e8b in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540e8b in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Direct leak of 40 byte(s) in 1 object(s) allocated from:
      #0 0x7f7343cba097 in calloc (/lib64/libasan.so.8+0xba097)
      #1 0x987966 in zalloc (/home/acme/bin/perf+0x987966)
      #2 0x52f9b9 in evsel_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:307
      #3 0x52f9b9 in evsel__syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:333
      #4 0x52f9b9 in evsel__init_raw_syscall_tp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:458
      #5 0x52f9b9 in perf_evsel__raw_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:480
      #6 0x540dd1 in trace__add_syscall_newtp /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3205
      #7 0x540dd1 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3891
      #8 0x540dd1 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5156
      #9 0x5ef262 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
      #10 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
      #11 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
      #12 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
      #13 0x7f7342c4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  SUMMARY: AddressSanitizer: 80 byte(s) leaked in 2 allocation(s).
  [root@quaco ~]#

With this we plug all leaks with "perf trace sleep 1".

Fixes: 3cb4d5e00e ("perf trace: Free syscall tp fields in evsel->priv")
Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Link: https://lore.kernel.org/lkml/20230719202951.534582-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-07-20 11:29:50 -03:00
Arnaldo Carvalho de Melo 9de251cb50 perf trace: Register a thread priv destructor
To plug these leaks detected with:

  $ make EXTRA_CFLAGS="-fsanitize=address" BUILD_BPF_SKEL=1 CORESIGHT=1 O=/tmp/build/perf-tools-next -C tools/perf install-bin

  =================================================================
  ==473890==ERROR: LeakSanitizer: detected memory leaks

  Direct leak of 112 byte(s) in 1 object(s) allocated from:
    #0 0x7fdf19aba097 in calloc (/lib64/libasan.so.8+0xba097)
    #1 0x987836 in zalloc (/home/acme/bin/perf+0x987836)
    #2 0x5367ae in thread_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1289
    #3 0x5367ae in thread__trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1307
    #4 0x5367ae in trace__sys_exit /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:2468
    #5 0x52bf34 in trace__handle_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3177
    #6 0x52bf34 in __trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3685
    #7 0x542927 in trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3712
    #8 0x542927 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:4055
    #9 0x542927 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5141
    #10 0x5ef1a2 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
    #11 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
    #12 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
    #13 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
    #14 0x7fdf18a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Direct leak of 2048 byte(s) in 1 object(s) allocated from:
    #0 0x7f788fcba6af in __interceptor_malloc (/lib64/libasan.so.8+0xba6af)
    #1 0x5337c0 in trace__sys_enter /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:2342
    #2 0x52bfb4 in trace__handle_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3191
    #3 0x52bfb4 in __trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3699
    #4 0x542883 in trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3726
    #5 0x542883 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:4069
    #6 0x542883 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5155
    #7 0x5ef232 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
    #8 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
    #9 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
    #10 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
    #11 0x7f788ec4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

  Indirect leak of 48 byte(s) in 1 object(s) allocated from:
    #0 0x7fdf19aba6af in __interceptor_malloc (/lib64/libasan.so.8+0xba6af)
    #1 0x77b335 in intlist__new util/intlist.c:116
    #2 0x5367fd in thread_trace__new /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1293
    #3 0x5367fd in thread__trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:1307
    #4 0x5367fd in trace__sys_exit /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:2468
    #5 0x52bf34 in trace__handle_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3177
    #6 0x52bf34 in __trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3685
    #7 0x542927 in trace__deliver_event /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:3712
    #8 0x542927 in trace__run /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:4055
    #9 0x542927 in cmd_trace /home/acme/git/perf-tools-next/tools/perf/builtin-trace.c:5141
    #10 0x5ef1a2 in run_builtin /home/acme/git/perf-tools-next/tools/perf/perf.c:323
    #11 0x4196da in handle_internal_command /home/acme/git/perf-tools-next/tools/perf/perf.c:377
    #12 0x4196da in run_argv /home/acme/git/perf-tools-next/tools/perf/perf.c:421
    #13 0x4196da in main /home/acme/git/perf-tools-next/tools/perf/perf.c:537
    #14 0x7fdf18a4a50f in __libc_start_call_main (/lib64/libc.so.6+0x2750f)

Acked-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20230719202951.534582-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-07-20 11:23:48 -03:00
Ian Rogers 8ab12a2038 perf callchain: Use pthread keys for tls callchain_cursor
Pthread keys are more portable than __thread and allow the association
of a destructor with the key. Use the destructor to clean up TLS
callchain cursors to aid understanding memory leaks.

Committer notes:

Had to fixup a series of unconverted places and also check for the
return of get_tls_callchain_cursor() as it may fail and return NULL.

In that unlikely case we now either print something to a file, if the
caller was expecting to print a callchain, or return an error code to
state that resolving the callchain isn't possible.

In some cases this was made easier because thread__resolve_callchain()
already can fail for other reasons, so this new one (cursor == NULL) can
be added and the callers don't have to explicitely check for this new
condition.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-25-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12 15:57:54 -03:00
Ian Rogers 0dd5041c9a perf addr_location: Add init/exit/copy functions
struct addr_location holds references to multiple reference counted
objects. Add init/exit functions to make maintenance of those more
consistent with the rest of the code and to try to avoid
leaks. Modification of thread reference counts isn't included in this
change.

Committer notes:

I needed to initialize result to sample->ip to make sure is set to
something, fixing a compile time error, mostly keeping the previous
logic as build_alloc_func_list() already does debugging/error prints
about what went wrong if it takes the 'goto out'.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-7-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12 15:57:53 -03:00
Ian Rogers ee84a3032b perf thread: Add accessor functions for thread
Using accessors will make it easier to add reference count checking in
later patches.

Committer notes:

thread->nsinfo wasn't wrapped as it is used together with
nsinfo__zput(), where does a trick to set the field with a refcount
being dropped to NULL, and that doesn't work well with using
thread__nsinfo(thread), that loses the &thread->nsinfo pointer.

When refcount checking is added to 'struct thread', later in this
series, nsinfo__zput(RC_CHK_ACCESS(thread)->nsinfo) will be used to
check the thread pointer.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12 15:57:53 -03:00
Ian Rogers 7ee227f674 perf thread: Make threads rbtree non-invasive
Separate the rbtree out of thread and into a new struct
thread_rb_node. The refcnt is in thread and the rbtree is responsible
for a single count.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ali Saidi <alisaidi@amazon.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Brian Robbins <brianrob@linux.microsoft.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Dmitrii Dolgov <9erthalion6@gmail.com>
Cc: Fangrui Song <maskray@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Ivan Babrou <ivan@cloudflare.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jing Zhang <renyu.zj@linux.alibaba.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Liam Howlett <liam.howlett@oracle.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Miguel Ojeda <ojeda@kernel.org>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steinar H. Gunderson <sesse@google.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Wenyu Liu <liuwenyu7@huawei.com>
Cc: Will Deacon <will@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Ye Xingchen <ye.xingchen@zte.com.cn>
Cc: Yuan Can <yuancan@huawei.com>
Cc: coresight@lists.linaro.org
Cc: linux-arm-kernel@lists.infradead.org
Link: https://lore.kernel.org/r/20230608232823.4027869-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-06-12 15:57:53 -03:00
Ian Rogers 60995604d1 perf trace: Make some large static arrays const to move it to .data.rel.ro
Allows the movement of 33,128 bytes from .data to .data.rel.ro.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230526183401.2326121-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-28 10:21:53 -03:00
Ian Rogers 411ad22ecf perf parse-events: Add pmu filter
To support the cputype argument added to "perf stat" for hybrid it is
necessary to filter events during wildcard matching. Add a scanner
argument for the filter and checking it when wildcard matching.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ahmad Yasin <ahmad.yasin@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Caleb Biggers <caleb.biggers@intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kang Minchul <tegongkang@gmail.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Perry Taylor <perry.taylor@intel.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Rob Herring <robh@kernel.org>
Cc: Samantha Alt <samantha.alt@intel.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Suzuki Poulouse <suzuki.poulose@arm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Tiezhu Yang <yangtiezhu@loongson.cn>
Cc: Weilin Wang <weilin.wang@intel.com>
Cc: Xing Zhengjun <zhengjun.xing@linux.intel.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230502223851.2234828-30-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-05-15 09:12:14 -03:00
Arnaldo Carvalho de Melo 9997d5dd17 perf trace: Use zfree() to reduce chances of use after free
Do defensive programming by using zfree() to initialize freed pointers
to NULL, so that eventual use after free result in a NULL pointer deref
instead of more subtle behaviour.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-12 09:54:32 -03:00
Ian Rogers 63df0e4bc3 perf map: Add accessor for dso
Later changes will add reference count checking for struct map, with
dso being the most frequently accessed variable. Add an accessor so
that the reference count check is only necessary in one place.

Additional changes:
 - add a dso variable to avoid repeated map__dso calls.
 - in builtin-mem.c dump_raw_samples, code only partially tested for
   dso == NULL. Make the possibility of NULL consistent.
 - in thread.c thread__memcpy fix use of spaces and use tabs.

Committer notes:

Did missing conversions on these files:

   tools/perf/arch/powerpc/util/skip-callchain-idx.c
   tools/perf/arch/powerpc/util/sym-handling.c
   tools/perf/ui/browsers/hists.c
   tools/perf/ui/gtk/annotate.c
   tools/perf/util/cs-etm.c
   tools/perf/util/thread.c
   tools/perf/util/unwind-libunwind-local.c
   tools/perf/util/unwind-libunwind.c

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Cc: Darren Hart <dvhart@infradead.org>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Dmitriy Vyukov <dvyukov@google.com>
Cc: Eric Dumazet <edumazet@google.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Hao Luo <haoluo@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Madhavan Srinivasan <maddy@linux.ibm.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Miaoqian Lin <linmq006@gmail.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Riccardo Mancini <rickyman7@gmail.com>
Cc: Shunsuke Nakamura <nakamura.shun@fujitsu.com>
Cc: Song Liu <song@kernel.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Stephen Brennan <stephen.s.brennan@oracle.com>
Cc: Steven Rostedt (VMware) <rostedt@goodmis.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Yury Norov <yury.norov@gmail.com>
Link: https://lore.kernel.org/r/20230320212248.1175731-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-04-04 16:41:57 -03:00
Changbin Du cb4b9e6813 perf record: Reuse target::initial_delay
This just simply replace record_opts::initial_delay with
target::initial_delay. Nothing else is changed.

Signed-off-by: Changbin Du <changbin.du@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Hui Wang <hw.huiwang@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20230302031146.2801588-3-changbin.du@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-03-13 14:52:14 -03:00
Ian Rogers 1634bad320 perf trace: Reduce #ifdefs for TEP_FIELD_IS_RELATIVE
Add a helper function that applies the mask to test, or returns false
if libtraceevent is too old or not present.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/r/20230111070641.1728726-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-19 13:26:28 -03:00
Ian Rogers 1784eeaeb3 perf tools: Remove HAVE_LIBTRACEEVENT_TEP_FIELD_IS_RELATIVE
Switch HAVE_LIBTRACEEVENT_TEP_FIELD_IS_RELATIVE to be a version number
test on libtraceevent being >= to version 1.5.0. This also corrects a
greater-than test to be greater-than-or-equal.

Fixes: b9a49f8cb0 ("perf tools: Check if libtracevent has TEP_FIELD_IS_RELATIVE")
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Eelco Chaudron <echaudro@redhat.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Kim Phillips <kim.phillips@amd.com>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Link: https://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-19 13:24:56 -03:00
Ian Rogers d891f2b724 perf build: Properly guard libbpf includes
Including libbpf header files should be guarded by HAVE_LIBBPF_SUPPORT.
In bpf_counter.h, move the skeleton utilities under HAVE_BPF_SKEL.

Fixes: d6a735ef32 ("perf bpf_counter: Move common functions to bpf_counter.h")
Reported-by: Mike Leach <mike.leach@linaro.org>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Mike Leach <mike.leach@linaro.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20230105172243.7238-1-mike.leach@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2023-01-10 10:51:39 -03:00
Tiezhu Yang 818448e9cf perf tools: Use "grep -E" instead of "egrep"
The latest version of grep claims the egrep is now obsolete so the build
now contains warnings that look like:

	egrep: warning: egrep is obsolescent; using grep -E

fix this up by moving the related file to use "grep -E" instead.

  sed -i "s/egrep/grep -E/g" `grep egrep -rwl tools/perf`

Here are the steps to install the latest grep:

  wget http://ftp.gnu.org/gnu/grep/grep-3.8.tar.gz
  tar xf grep-3.8.tar.gz
  cd grep-3.8 && ./configure && make
  sudo make install
  export PATH=/usr/local/bin:$PATH

Signed-off-by: Tiezhu Yang <yangtiezhu@loongson.cn>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/1668762999-9297-1-git-send-email-yangtiezhu@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 15:28:19 -03:00
Arnaldo Carvalho de Melo b9a49f8cb0 perf tools: Check if libtracevent has TEP_FIELD_IS_RELATIVE
Some distros have older versions of libtraceevent where
TEP_FIELD_IS_RELATIVE and its associated semantics are not present, so
we need to check if the version has it, it was introduced in
libtraceevent 1.5.0.

Reported-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Tested-by: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Cc: Stephane Eranian <eranian@google.com>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Ian Rogers 378ef0f5d9 perf build: Use libtraceevent from the system
Remove the LIBTRACEEVENT_DYNAMIC and LIBTRACEFS_DYNAMIC make command
line variables.

If libtraceevent isn't installed or NO_LIBTRACEEVENT=1 is passed to the
build, don't compile in libtraceevent and libtracefs support.

This also disables CONFIG_TRACE that controls "perf trace".

CONFIG_LIBTRACEEVENT is used to control enablement in Build/Makefiles,
HAVE_LIBTRACEEVENT is used in C code.

Without HAVE_LIBTRACEEVENT tracepoints are disabled and as such the
commands kmem, kwork, lock, sched and timechart are removed.  The
majority of commands continue to work including "perf test".

Committer notes:

Fixed up a tools/perf/util/Build reject and added:

  #include <traceevent/event-parse.h>

to tools/perf/util/scripting-engines/trace-event-perl.c.

Committer testing:

  $ rpm -qi libtraceevent-devel
  Name        : libtraceevent-devel
  Version     : 1.5.3
  Release     : 2.fc36
  Architecture: x86_64
  Install Date: Mon 25 Jul 2022 03:20:19 PM -03
  Group       : Unspecified
  Size        : 27728
  License     : LGPLv2+ and GPLv2+
  Signature   : RSA/SHA256, Fri 15 Apr 2022 02:11:58 PM -03, Key ID 999f7cbf38ab71f4
  Source RPM  : libtraceevent-1.5.3-2.fc36.src.rpm
  Build Date  : Fri 15 Apr 2022 10:57:01 AM -03
  Build Host  : buildvm-x86-05.iad2.fedoraproject.org
  Packager    : Fedora Project
  Vendor      : Fedora Project
  URL         : https://git.kernel.org/pub/scm/libs/libtrace/libtraceevent.git/
  Bug URL     : https://bugz.fedoraproject.org/libtraceevent
  Summary     : Development headers of libtraceevent
  Description :
  Development headers of libtraceevent-libs
  $

Default build:

  $ ldd ~/bin/perf | grep tracee
  	libtraceevent.so.1 => /lib64/libtraceevent.so.1 (0x00007f1dcaf8f000)
  $

  # perf trace -e sched:* --max-events 10
       0.000 migration/0/17 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, dest_cpu: 1)
       0.005 migration/0/17 sched:sched_wake_idle_without_ipi(cpu: 1)
       0.011 migration/0/17 sched:sched_switch(prev_comm: "", prev_pid: 17 (migration/0), prev_state: 1, next_comm: "", next_prio: 120)
       1.173 :0/0 sched:sched_wakeup(comm: "", pid: 3138 (gnome-terminal-), prio: 120)
       1.180 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 3138 (gnome-terminal-), next_prio: 120)
       0.156 migration/1/21 sched:sched_migrate_task(comm: "", pid: 1603763 (perf), prio: 120, orig_cpu: 1, dest_cpu: 2)
       0.160 migration/1/21 sched:sched_wake_idle_without_ipi(cpu: 2)
       0.166 migration/1/21 sched:sched_switch(prev_comm: "", prev_pid: 21 (migration/1), prev_state: 1, next_comm: "", next_prio: 120)
       1.183 :0/0 sched:sched_wakeup(comm: "", pid: 1602985 (kworker/u16:0-f), prio: 120, target_cpu: 1)
       1.186 :0/0 sched:sched_switch(prev_comm: "", prev_prio: 120, next_comm: "", next_pid: 1602985 (kworker/u16:0-f), next_prio: 120)
  #

Had to tweak tools/perf/util/setup.py to make sure the python binding
shared object links with libtraceevent if -DHAVE_LIBTRACEEVENT is
present in CFLAGS.

Building with NO_LIBTRACEEVENT=1 uncovered some more build failures:

- Make building of data-convert-bt.c to CONFIG_LIBTRACEEVENT=y

- perf-$(CONFIG_LIBTRACEEVENT) += scripts/

- bpf_kwork.o needs also to be dependent on CONFIG_LIBTRACEEVENT=y

- The python binding needed some fixups and util/trace-event.c can't be
  built and linked with the python binding shared object, so remove it
  in tools/perf/util/setup.py and exclude it from the list of
  dependencies in the python/perf.so Makefile.perf target.

Building without libtraceevent-devel installed uncovered more build
failures:

- The python binding tools/perf/util/python.c was assuming that
  traceevent/parse-events.h was always available, which was the case
  when we defaulted to using the in-kernel tools/lib/traceevent/ files,
  now we need to enclose it under ifdef HAVE_LIBTRACEEVENT, just like
  the other parts of it that deal with tracepoints.

- We have to ifdef the rules in the Build files with
  CONFIG_LIBTRACEEVENT=y to build builtin-trace.c and
  tools/perf/trace/beauty/ as we only ifdef setting CONFIG_TRACE=y when
  setting NO_LIBTRACEEVENT=1 in the make command line, not when we don't
  detect libtraceevent-devel installed in the system. Simplification here
  to avoid these two ways of disabling builtin-trace.c and not having
  CONFIG_TRACE=y when libtraceevent-devel isn't installed is the clean
  way.

From Athira:

<quote>
tools/perf/arch/powerpc/util/Build
-perf-y += kvm-stat.o
+perf-$(CONFIG_LIBTRACEEVENT) += kvm-stat.o
</quote>

Then, ditto for arm64 and s390, detected by container cross build tests.

- s/390 uses test__checkevent_tracepoint() that is now only available if
  HAVE_LIBTRACEEVENT is defined, enclose the callsite with ifder HAVE_LIBTRACEEVENT.

Also from Athira:

<quote>
With this change, I could successfully compile in these environment:
- Without libtraceevent-devel installed
- With libtraceevent-devel installed
- With “make NO_LIBTRACEEVENT=1”
</quote>

Then, finally rename CONFIG_TRACEEVENT to CONFIG_LIBTRACEEVENT for
consistency with other libraries detected in tools/perf/.

Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221205225940.3079667-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-12-14 11:16:12 -03:00
Leo Yan 8daf87f592 perf trace: Remove unused bpf map 'syscalls'
augmented_raw_syscalls.c defines the bpf map 'syscalls' which is
initialized by perf tool in user space to indicate which system calls
are enabled for tracing, on the other flip eBPF program relies on the
map to filter out the trace events which are not enabled.

The map also includes a field 'string_args_len[6]' which presents the
string length if the corresponding argument is a string type.

Now the map 'syscalls' is not used, bpf program doesn't use it as filter
anymore, this is replaced by using the function bpf_tail_call() and
PROG_ARRAY syscalls map.  And we don't need to explicitly set the string
length anymore, bpf_probe_read_str() is smart to copy the string and
return string length.

Therefore, it's safe to remove the bpf map 'syscalls'.

To consolidate the code, this patch removes the definition of map
'syscalls' from augmented_raw_syscalls.c and drops code for using
the map in the perf trace.

Note, since function trace__set_ev_qualifier_bpf_filter() is removed,
calling trace__init_syscall_bpf_progs() from it is also removed.  We
don't need to worry it because trace__init_syscall_bpf_progs() is
still invoked from trace__init_syscalls_bpf_prog_array_maps() for
initialization the system call's bpf program callback.

After:

  # perf trace -e examples/bpf/augmented_raw_syscalls.c,open* --max-events 10 perf stat --quiet sleep 0.001
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libelf.so.1", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libdw.so.1", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind.so.8", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libunwind-aarch64.so.8", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libcrypto.so.3", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libslang.so.2", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libperl.so.5.34", O_RDONLY|O_CLOEXEC) = 3
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3

  # perf trace -e examples/bpf/augmented_raw_syscalls.c --max-events 10 perf stat --quiet sleep 0.001
  ... [continued]: execve())             = 0
  brk(NULL)                               = 0xaaaab1d28000
  faccessat(-100, "/etc/ld.so.preload", 4) = -1 ENOENT (No such file or directory)
  openat(AT_FDCWD, "/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
  close(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>) = 0
  openat(AT_FDCWD, "/lib/aarch64-linux-gnu/libm.so.6", O_RDONLY|O_CLOEXEC) = 3
  read(3</usr/lib/aarch64-linux-gnu/libcrypto.so.3>, 0xfffff33f70d0, 832) = 832
  munmap(0xffffb5519000, 28672)           = 0
  munmap(0xffffb55b7000, 32880)           = 0
  mprotect(0xffffb55a6000, 61440, PROT_NONE) = 0

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221121075237.127706-6-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23 10:30:00 -03:00
Leo Yan 03e9a5d8eb perf trace: Handle failure when trace point folder is missed
On Arm64 a case is perf tools fails to find the corresponding trace
point folder for system calls listed in the table 'syscalltbl_arm64',
e.g. the generated system call table contains "lookup_dcookie" but we
cannot find out the matched trace point folder for it.

We need to figure out if there have any issue for the generated system
call table, on the other hand, we need to handle the case when trace
point folder is missed under sysfs, this patch sets the flag
syscall::nonexistent as true and returns the error from
trace__read_syscall_info().

Another problem is for trace__syscall_info(), it returns two different
values if a system call doesn't exist: at the first time calling
trace__syscall_info() it returns NULL when the system call doesn't exist,
later if call trace__syscall_info() again for the same missed system
call, it returns pointer of syscall.  trace__syscall_info() checks the
condition 'syscalls.table[id].name == NULL', but the name will be
assigned in the first invoking even the system call is not found.

So checking system call's name in trace__syscall_info() is not the right
thing to do, this patch simply checks flag syscall::nonexistent to make
decision if a system call exists or not, finally trace__syscall_info()
returns the consistent result (NULL) if a system call doesn't existed.

Fixes: b8b1033fca ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: bpf@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221121075237.127706-4-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23 10:29:59 -03:00
Leo Yan d4223e1776 perf trace: Return error if a system call doesn't exist
When a system call is not detected, the reason is either because the
system call ID is out of scope or failure to find the corresponding path
in the sysfs, trace__read_syscall_info() returns zero.  Finally, without
returning an error value it introduces confusion for the caller.

This patch lets the function trace__read_syscall_info() to return
-EEXIST when a system call doesn't exist.

Fixes: b8b1033fca ("perf trace: Mark syscall ids that are not allocated to avoid unnecessary error messages")
Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: bpf@vger.kernel.org
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20221121075237.127706-3-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23 10:29:59 -03:00
Leo Yan eadcab4c7a perf trace: Use macro RAW_SYSCALL_ARGS_NUM to replace number
This patch defines a macro RAW_SYSCALL_ARGS_NUM to replace the open
coded number '6'.

Signed-off-by: Leo Yan <leo.yan@linaro.org>
Acked-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20221121075237.127706-2-leo.yan@linaro.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-23 10:29:59 -03:00
Ian Rogers fd3f518fc1 perf thread_map: Reduce exposure of libperf internal API
Remove unnecessary include of internal threadmap.h and refcount.h in
thread_map.h. Switch to using public APIs when possible or including
the internal header file in the C file. Fix a transitive dependency in
openat-syscall.c broken by the clean up.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andrii Nakryiko <andrii.nakryiko@gmail.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Nicolas Schier <nicolas@fjasle.eu>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: bpf@vger.kernel.org
Link: http://lore.kernel.org/lkml/20221109184914.1357295-13-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-16 12:17:15 -03:00
Arnaldo Carvalho de Melo 6ac7382099 perf trace: Add augmenter for clock_gettime's rqtp timespec arg
One more before going the BTF way:

  # perf trace -e /home/acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.o,*nanosleep
         ? pool-gsd-smart/2893  ... [continued]: clock_nanosleep())    = 0
         ? gpm/1042  ... [continued]: clock_nanosleep())    = 0
     1.232 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ...
     1.232 pool-gsd-smart/2893  ... [continued]: clock_nanosleep())    = 0
   327.329 gpm/1042 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffddfd1cf20) ...
  1002.482 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) = 0
   327.329 gpm/1042  ... [continued]: clock_nanosleep())    = 0
  2003.947 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ...
  2003.947 pool-gsd-smart/2893  ... [continued]: clock_nanosleep())    = 0
  2327.858 gpm/1042 clock_nanosleep(rqtp: { .tv_sec: 2, .tv_nsec: 0 }, rmtp: 0x7ffddfd1cf20) ...
         ? crond/1384  ... [continued]: clock_nanosleep())    = 0
  3005.382 pool-gsd-smart/2893 clock_nanosleep(rqtp: { .tv_sec: 1, .tv_nsec: 0 }, rmtp: 0x7f64d7ffec50) ...
  3005.382 pool-gsd-smart/2893  ... [continued]: clock_nanosleep())    = 0
  3675.633 crond/1384 clock_nanosleep(rqtp: { .tv_sec: 60, .tv_nsec: 0 }, rmtp: 0x7ffcc02b66b0) ...
^C#

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-10 15:30:10 -03:00
Arnaldo Carvalho de Melo a9cd6c6766 perf trace: Add BPF augmenter to perf_event_open()'s 'struct perf_event_attr' arg
Using BPF for that, doing a cleverish reuse of perf_event_attr__fprintf(),
that really needs to be turned into __snprintf(), etc.

But since the plan is to go the BTF way probably use libbpf's
btf_dump__dump_type_data().

Example:

[root@quaco ~]# perf trace -e ~acme/git/perf/tools/perf/examples/bpf/augmented_raw_syscalls.c,perf_event_open --max-events 10 perf stat --quiet sleep 0.001
fg
     0.000 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x1, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 3
     0.067 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x3, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 4
     0.120 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x4, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 5
     0.172 perf_event_open(attr_uptr: { type: 1, size: 128, config: 0x2, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 7
     0.190 perf_event_open(attr_uptr: { size: 128, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 8
     0.199 perf_event_open(attr_uptr: { size: 128, config: 0x1, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 9
     0.204 perf_event_open(attr_uptr: { size: 128, config: 0x4, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 10
     0.210 perf_event_open(attr_uptr: { size: 128, config: 0x5, sample_type: IDENTIFIER, read_format: TOTAL_TIME_ENABLED|TOTAL_TIME_RUNNING, disabled: 1, inherit: 1, enable_on_exec: 1, exclude_guest: 1 }, pid: 258859 (perf), cpu: -1, group_fd: -1, flags: FD_CLOEXEC) = 11
[root@quaco ~]#

Suggested-by: Ian Rogers <irogers@google.com>
Tested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/Y2V2Tpu+2vzJyon2@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-07 10:56:40 -03:00
Ian Rogers 92ea0720ba perf trace: Use sig_atomic_t to avoid undefined behaviour in a signal handler
Use sig_atomic_t for variables written/accessed in signal
handlers. This is undefined behavior as per:

  https://wiki.sei.cmu.edu/confluence/display/c/SIG31-C.+Do+not+access+shared+objects+in+signal+handlers

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Cc: German Gomez <german.gomez@arm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Leo Yan <leo.yan@linaro.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20221024181913.630986-9-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-11-03 11:44:33 -03:00
Chen Zhongjin 96b731412d perf trace: Fix incorrectly parsed hexadecimal value for flags in filter
When parsing flags in filter, the strtoul function uses wrong parsing
condition (tok[1] = 'x'), which can make the flags be corrupted and
treat all numbers start with 0 as hex.

In fact strtoul() will auto test hex format when base == 0 (See
_parse_integer_fixup_radix). So there is no need to test this again.

Remove the unnessesary is_hexa test.

Fixes: 154c978d48 ("libbeauty: Introduce strarray__strtoul_flags()")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220926031440.28275-3-chenzhongjin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04 08:55:23 -03:00
Chen Zhongjin 888964a05d perf trace: Fix show_arg_names not working for tp arg names
trace__fprintf_tp_fields() will always print arg names because when
implemented it is forced to print arg_names with:

  (1 || trace->show_arg_names)

So the printing looks like:

> cat ~/.perfconfig
    [trace]
        show_arg_names = no

> perf trace -e syscalls:*mmap sleep 1
    0.000 sleep/1119 syscalls:sys_enter_mmap(NULL, 8192, READ|WRITE, PRIVATE|ANONYMOUS)
    0.179 sleep/1119 syscalls:sys_exit_mmap(__syscall_nr: 9, ret: 140535426170880)
    ...

Although the comment said that perhaps we need a show_tp_arg_names.

I don't think it's necessary to control them separately because it's not
so clean that part of the log shows arg names but other not.

Also when we are tracing functions it's rare to especially distinguish
syscalls and tp trace.

Only use one option to control arg names printing is more resonable and
simple. So remove the force condition and commit.

After fix:

> perf trace -e syscalls:*mmap sleep 1
    0.000 sleep/1121 syscalls:sys_enter_mmap(NULL, 8192, READ|WRITE, PRIVATE|ANONYMOUS)
    0.163 sleep/1121 syscalls:sys_exit_mmap(9, 140454467661824)
    ...

Fixes: f11b2803bb ("perf trace: Allow choosing how to augment the tracepoint arguments")
Signed-off-by: Chen Zhongjin <chenzhongjin@huawei.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Florian Fischer <florian.fischer@muhq.space>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: John Garry <john.garry@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220926031440.28275-2-chenzhongjin@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04 08:55:23 -03:00
Shang XiaoJing e3e7572fa8 perf trace: Use zalloc() to save initialization of syscall_stats
As most members of syscall_stats is set to 0 in thread__update_stats,
using zalloc() directly.

Signed-off-by: Shang XiaoJing <shangxiaojing@huawei.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220908021141.27134-2-shangxiaojing@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-10-04 08:55:21 -03:00
shaomin Deng 632f5c224e perf trace: Fix double word in comments
Delete repeated word "and" in comments.

Signed-off-by: shaomin Deng <dengshaomin@cdjrlc.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: http://lore.kernel.org/lkml/20220807084629.23121-1-dengshaomin@cdjrlc.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-12 16:44:56 -03:00
Ian Rogers 9b7c7728f4 perf parse-events: Break out tracepoint and printing
Move print_*_events functions out of parse-events.c into a new
print-events.c. Move tracepoint code into tracepoint.c or
trace-event-info.c (sole user). This reduces the dependencies of
parse-events.c and makes it more amenable to being a library in the
future.

Remove some unnecessary definitions from parse-events.h. Fix a
checkpatch.pl warning on using unsigned rather than unsigned int.  Fix
some line length warnings too.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Link: https://lore.kernel.org/r/20220729204217.250166-3-irogers@google.com
[ Add include linux/stddef.h before perf_events.h for systems where __always_inline isn't pulled in before used, such as older Alpine Linux ]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-08-02 16:32:26 -03:00
Naveen N. Rao 4b335e1e0d perf trace: Fix SIGSEGV when processing syscall args
On powerpc, 'perf trace' is crashing with a SIGSEGV when trying to
process a perf.data file created with 'perf trace record -p':

  #0  0x00000001225b8988 in syscall_arg__scnprintf_augmented_string <snip> at builtin-trace.c:1492
  #1  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1492
  #2  syscall_arg__scnprintf_filename <snip> at builtin-trace.c:1486
  #3  0x00000001225bdd9c in syscall_arg_fmt__scnprintf_val <snip> at builtin-trace.c:1973
  #4  syscall__scnprintf_args <snip> at builtin-trace.c:2041
  #5  0x00000001225bff04 in trace__sys_enter <snip> at builtin-trace.c:2319

That points to the below code in tools/perf/builtin-trace.c:
	/*
	 * If this is raw_syscalls.sys_enter, then it always comes with the 6 possible
	 * arguments, even if the syscall being handled, say "openat", uses only 4 arguments
	 * this breaks syscall__augmented_args() check for augmented args, as we calculate
	 * syscall->args_size using each syscalls:sys_enter_NAME tracefs format file,
	 * so when handling, say the openat syscall, we end up getting 6 args for the
	 * raw_syscalls:sys_enter event, when we expected just 4, we end up mistakenly
	 * thinking that the extra 2 u64 args are the augmented filename, so just check
	 * here and avoid using augmented syscalls when the evsel is the raw_syscalls one.
	 */
	if (evsel != trace->syscalls.events.sys_enter)
		augmented_args = syscall__augmented_args(sc, sample, &augmented_args_size, trace->raw_augmented_syscalls_args_size);

As the comment points out, we should not be trying to augment the args
for raw_syscalls. However, when processing a perf.data file, we are not
initializing those properly. Fix the same.

Reported-by: Claudio Carvalho <cclaudio@linux.ibm.com>
Signed-off-by: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: http://lore.kernel.org/lkml/20220707090900.572584-1-naveen.n.rao@linux.vnet.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-07-17 10:59:52 -03:00
Arnaldo Carvalho de Melo 859f7e4554 Merge remote-tracking branch 'torvalds/master' into perf/core
To pick up fixes from perf/urgent that recently got merged.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-17 18:40:54 -03:00
Changbin Du de9f498d2b perf trace: Avoid early exit due SIGCHLD from non-workload processes
The function trace__symbols_init() runs "perf-read-vdso32" and that ends up
with a SIGCHLD delivered to 'perf'. And this SIGCHLD make perf exit early.

'perf trace' should exit only if the SIGCHLD is from our workload process.
So let's use sigaction() instead of signal() to match such condition.

Committer notes:

Use memset to zero the 'struct sigaction' variable as the '= { 0 }'
method isn't accepted in many compiler versions, e.g.:

   4    34.02 alpine:3.6                    : FAIL clang version 4.0.0 (tags/RELEASE_400/final)
    builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
            struct sigaction sigchld_act = { 0 };
                                             ^
                                             {}
    builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers]
            struct sigaction sigchld_act = { 0 };
                                               ^
    2 errors generated.
   6    32.60 alpine:3.8                    : FAIL gcc version 6.4.0 (Alpine 6.4.0)
    builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
            struct sigaction sigchld_act = { 0 };
                                             ^
                                             {}
    builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers]
            struct sigaction sigchld_act = { 0 };
                                               ^
    2 errors generated.
   7    34.82 alpine:3.9                    : FAIL gcc version 8.3.0 (Alpine 8.3.0)
    builtin-trace.c:4897:35: error: suggest braces around initialization of subobject [-Werror,-Wmissing-braces]
            struct sigaction sigchld_act = { 0 };
                                             ^
                                             {}
    builtin-trace.c:4897:37: error: missing field 'sa_mask' initializer [-Werror,-Wmissing-field-initializers]
            struct sigaction sigchld_act = { 0 };
                                               ^
    2 errors generated.

Signed-off-by: Changbin Du <changbin.du@gmail.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20220208140725.3947-1-changbin.du@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-16 13:47:12 -03:00
Alexey Bayduraev 2292083f59 perf report: Output data file name in raw trace dump
Print path and name of a data file into raw dump (-D)
<file_offset>@<path/file>:

  0x2226a@perf.data [0x30]: event: 9
or
  0x15cc36@perf.data/data.7 [0x30]: event: 9

Reviewed-by: Riccardo Mancini <rickyman7@gmail.com>
Signed-off-by: Alexey Bayduraev <alexey.v.bayduraev@linux.intel.com>
Tested-by: Jiri Olsa <jolsa@kernel.org>
Tested-by: Riccardo Mancini <rickyman7@gmail.com>
Acked-by: Andi Kleen <ak@linux.intel.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Antonov <alexander.antonov@linux.intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Alexei Budankov <abudankov@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/e8378fd4910c10751b001be880705653989283c2.1642440724.git.alexey.v.bayduraev@linux.intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2022-02-10 16:27:34 -03:00
Linus Torvalds 57d17378a4 perf tools changes for v5.17: 1st batch
New features:
 
 - Add 'trace' subcommand for 'perf ftrace', setting the stage for more
   'perf ftrace' subcommands. Not using a subcommand yields the previous
   behaviour of 'perf ftrace'.
 
 - Add 'latency' subcommand to 'perf ftrace', that can use the function
   graph tracer or a BPF optimized one, via the -b/--use-bpf option.
 
   E.g.:
 
   $ sudo perf ftrace latency -a -T mutex_lock sleep 1
   #   DURATION     |      COUNT | GRAPH                          |
        0 - 1    us |       4596 | ########################       |
        1 - 2    us |       1680 | #########                      |
        2 - 4    us |       1106 | #####                          |
        4 - 8    us |        546 | ##                             |
        8 - 16   us |        562 | ###                            |
       16 - 32   us |          1 |                                |
       32 - 64   us |          0 |                                |
       64 - 128  us |          0 |                                |
      128 - 256  us |          0 |                                |
      256 - 512  us |          0 |                                |
      512 - 1024 us |          0 |                                |
        1 - 2    ms |          0 |                                |
        2 - 4    ms |          0 |                                |
        4 - 8    ms |          0 |                                |
        8 - 16   ms |          0 |                                |
       16 - 32   ms |          0 |                                |
       32 - 64   ms |          0 |                                |
       64 - 128  ms |          0 |                                |
      128 - 256  ms |          0 |                                |
      256 - 512  ms |          0 |                                |
      512 - 1024 ms |          0 |                                |
        1 - ...   s |          0 |                                |
 
   The original implementation of this command was in the bcc tool.
 
 - Support --cputype option for hybrid events in 'perf stat'.
 
 Improvements:
 
 - Call chain improvements for ARM64.
 
 - No need to do any affinity setup when profiling pids.
 
 - Reduce multiplexing with duration_time in 'perf stat' metrics.
 
 - Improve error message for uncore events, stating that some event groups are
   can only be used in system wide (-a) mode.
 
 - perf stat metric group leader fixes/improvements, including arch specific
   changes to better support Intel topdown events.
 
 - Probe non-deprecated sysfs path 1st, i.e. try /sys/devices/system/cpu/cpuN/topology/thread_siblings
   first, then the old /sys/devices/system/cpu/cpuN/topology/core_cpus.
 
 - Disable debuginfod by default in 'perf record', to avoid stalls on distros
   such as Fedora 35.
 
 - Use unbuffered output in 'perf bench' when pipe/tee'ing to a file.
 
 - Enable ignore_missing_thread in 'perf trace'
 
 Fixes:
 
 - Avoid TUI crash when navigating in the annotation of recursive functions.
 
 - Fix hex dump character output in 'perf script'.
 
 - Fix JSON indentation to 4 spaces standard in the ARM vendor event files.
 
 - Fix use after free in metric__new().
 
 - Fix IS_ERR_OR_NULL() usage in the perf BPF loader.
 
 - Fix up cross-arch register support, i.e. when printing register names take
   into account the architecture where the perf.data file was collected.
 
 - Fix SMT fallback with large core counts.
 
 - Don't lower case MetricExpr when parsing JSON files so as not to lose info
   such as the ":G" event modifier in metrics.
 
 perf test:
 
 - Add basic stress test for sigtrap handling to 'perf test'.
 
 - Fix 'perf test' failures on s/390
 
 - Enable system wide for metricgroups test in 'perf test´.
 
 - Use 3 digits for test numbering now we can have more tests.
 
 Arch specific:
 
 - Add events for Arm Neoverse N2 in the ARM JSON vendor event files
 
 - Support PERF_MEM_LVLNUM encodings in powerpc, that came from a single
   patch series, where I incorrectly merged the kernel bits, that were then
   reverted after coordination with Michael Ellerman and Stephen Rothwell.
 
 - Add ARM SPE total latency as PERF_SAMPLE_WEIGHT.
 
 - Update AMD documentation, with info on raw event encoding.
 
 - Add support for global and local variants of the "p_stage_cyc" sort key,
   applicable to perf.data files collected on powerpc.
 
 - Remove duplicate and incorrect aux size checks in the ARM CoreSight ETM code.
 
 Refactorings:
 
 - Add a perf_cpu abstraction to disambiguate CPUs and CPU map indexes, fixing
   problems along the way.
 
 - Document CPU map methods.
 
 UAPI sync:
 
 - Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
 
 - Sync UAPI files with the kernel sources: drm, msr-index, cpufeatures.
 
 Build system
 
 - Enable warnings through HOSTCFLAGS.
 
 - Drop requirement for libstdc++.so for libopencsd check
 
 libperf:
 
 - Make libperf adopt perf_counts_values__scale() from tools/perf/util/.
 
 - Add a stat multiplexing test to libperf.
 
 Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQR2GiIUctdOfX2qHhGyPKLppCJ+JwUCYeQj6AAKCRCyPKLppCJ+
 JwyWAQCBmU8OJxhSJQnNCwTB9zNkPPBbihvIztepOJ7zsw7JcQD+KfAidHGQvI/Y
 EmXIYkmdNkWPYJafONllnKK5cckjxgI=
 =aj9V
 -----END PGP SIGNATURE-----

Merge tag 'perf-tools-for-v5.17-2022-01-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux

Pull perf tool updates from Arnaldo Carvalho de Melo:
 "New features:

   - Add 'trace' subcommand for 'perf ftrace', setting the stage for
     more 'perf ftrace' subcommands. Not using a subcommand yields the
     previous behaviour of 'perf ftrace'.

   - Add 'latency' subcommand to 'perf ftrace', that can use the
     function graph tracer or a BPF optimized one, via the -b/--use-bpf
     option.

     E.g.:

	$ sudo perf ftrace latency -a -T mutex_lock sleep 1
	#   DURATION     |      COUNT | GRAPH                          |
	     0 - 1    us |       4596 | ########################       |
	     1 - 2    us |       1680 | #########                      |
	     2 - 4    us |       1106 | #####                          |
	     4 - 8    us |        546 | ##                             |
	     8 - 16   us |        562 | ###                            |
	    16 - 32   us |          1 |                                |
	    32 - 64   us |          0 |                                |
	    64 - 128  us |          0 |                                |
	   128 - 256  us |          0 |                                |
	   256 - 512  us |          0 |                                |
	   512 - 1024 us |          0 |                                |
	     1 - 2    ms |          0 |                                |
	     2 - 4    ms |          0 |                                |
	     4 - 8    ms |          0 |                                |
	     8 - 16   ms |          0 |                                |
	    16 - 32   ms |          0 |                                |
	    32 - 64   ms |          0 |                                |
	    64 - 128  ms |          0 |                                |
	   128 - 256  ms |          0 |                                |
	   256 - 512  ms |          0 |                                |
	   512 - 1024 ms |          0 |                                |
	     1 - ...   s |          0 |                                |

     The original implementation of this command was in the bcc tool.

   - Support --cputype option for hybrid events in 'perf stat'.

  Improvements:

   - Call chain improvements for ARM64.

   - No need to do any affinity setup when profiling pids.

   - Reduce multiplexing with duration_time in 'perf stat' metrics.

   - Improve error message for uncore events, stating that some event
     groups are can only be used in system wide (-a) mode.

   - perf stat metric group leader fixes/improvements, including arch
     specific changes to better support Intel topdown events.

   - Probe non-deprecated sysfs path first, i.e. try the path
     /sys/devices/system/cpu/cpuN/topology/thread_siblings first, then
     the old /sys/devices/system/cpu/cpuN/topology/core_cpus.

   - Disable debuginfod by default in 'perf record', to avoid stalls on
     distros such as Fedora 35.

   - Use unbuffered output in 'perf bench' when pipe/tee'ing to a file.

   - Enable ignore_missing_thread in 'perf trace'

  Fixes:

   - Avoid TUI crash when navigating in the annotation of recursive
     functions.

   - Fix hex dump character output in 'perf script'.

   - Fix JSON indentation to 4 spaces standard in the ARM vendor event
     files.

   - Fix use after free in metric__new().

   - Fix IS_ERR_OR_NULL() usage in the perf BPF loader.

   - Fix up cross-arch register support, i.e. when printing register
     names take into account the architecture where the perf.data file
     was collected.

   - Fix SMT fallback with large core counts.

   - Don't lower case MetricExpr when parsing JSON files so as not to
     lose info such as the ":G" event modifier in metrics.

  perf test:

   - Add basic stress test for sigtrap handling to 'perf test'.

   - Fix 'perf test' failures on s/390

   - Enable system wide for metricgroups test in 'perf test´.

   - Use 3 digits for test numbering now we can have more tests.

  Arch specific:

   - Add events for Arm Neoverse N2 in the ARM JSON vendor event files

   - Support PERF_MEM_LVLNUM encodings in powerpc, that came from a
     single patch series, where I incorrectly merged the kernel bits,
     that were then reverted after coordination with Michael Ellerman
     and Stephen Rothwell.

   - Add ARM SPE total latency as PERF_SAMPLE_WEIGHT.

   - Update AMD documentation, with info on raw event encoding.

   - Add support for global and local variants of the "p_stage_cyc" sort
     key, applicable to perf.data files collected on powerpc.

   - Remove duplicate and incorrect aux size checks in the ARM CoreSight
     ETM code.

  Refactorings:

   - Add a perf_cpu abstraction to disambiguate CPUs and CPU map
     indexes, fixing problems along the way.

   - Document CPU map methods.

  UAPI sync:

   - Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench
     mem memcpy'

   - Sync UAPI files with the kernel sources: drm, msr-index,
     cpufeatures.

  Build system

   - Enable warnings through HOSTCFLAGS.

   - Drop requirement for libstdc++.so for libopencsd check

  libperf:

   - Make libperf adopt perf_counts_values__scale() from tools/perf/util/.

   - Add a stat multiplexing test to libperf"

* tag 'perf-tools-for-v5.17-2022-01-16' of git://git.kernel.org/pub/scm/linux/kernel/git/acme/linux: (115 commits)
  perf record: Disable debuginfod by default
  perf evlist: No need to do any affinity setup when profiling pids
  perf cpumap: Add is_dummy() method
  perf metric: Fix metric_leader
  perf cputopo: Fix CPU topology reading on s/390
  perf metricgroup: Fix use after free in metric__new()
  libperf tests: Update a use of the new cpumap API
  perf arm: Fix off-by-one directory path
  tools arch x86: Sync the msr-index.h copy with the kernel sources
  tools headers cpufeatures: Sync with the kernel sources
  tools headers UAPI: Update tools's copy of drm.h header
  tools arch: Update arch/x86/lib/mem{cpy,set}_64.S copies used in 'perf bench mem memcpy'
  perf pmu-events: Don't lower case MetricExpr
  perf expr: Add debug logging for literals
  perf tools: Probe non-deprecated sysfs path 1st
  perf tools: Fix SMT fallback with large core counts
  perf cpumap: Give CPUs their own type
  perf stat: Correct first_shadow_cpu to return index
  perf script: Fix flipped index and cpu
  perf c2c: Use more intention revealing iterator
  ...
2022-01-18 06:32:11 +02:00