Commit graph

15927 commits

Author SHA1 Message Date
Adrian Hunter d4a98b45fb perf script: Show also errors for --insn-trace option
The trace could be misleading if trace errors are not taken into
account, so display them also by adding the itrace "e" option.

Note --call-trace and --call-ret-trace already add the itrace "e"
option.

Fixes: b585ebdb59 ("perf script: Add --insn-trace for instruction decoding")
Reviewed-by: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: Andi Kleen <ak@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: stable@vger.kernel.org
Link: https://lore.kernel.org/r/20240315071334.3478-1-adrian.hunter@intel.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
James Clark 36f65f9b7a perf docs arm_spe: Clarify more SPE requirements related to KPTI
The question of exactly when KPTI needs to be disabled comes up a lot
because it doesn't always need to be done. Add the relevant kernel
function and some examples that describe the behavior.

Also describe the interrupt requirement and that no error message will
be printed if this isn't met.

Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240312132508.423320-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo a672af9139 tools headers: Remove almost unused copy of uapi/stat.h, add few conditional defines
These were used to build perf to provide defines not available in older
distros, but this was back in 2017, nowadays most the distros that are
supported and I have build containers for work using just the system
headers, so ditch them.

For the few that don't have STATX_MNT_ID{_UNIQUE}, or STATX_MNT_DIOALIGN
add them conditionally.

Some of these older distros may not have things that are used in 'perf
trace', but then they also don't have libtraceevent packages, so don't
build 'perf trace'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240315204835.748716-6-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo 8a1ad44135 tools headers: Remove now unused copies of uapi/{fcntl,openat2}.h and asm/fcntl.h
These were used to build perf to provide defines not available in older
distros, but this was back in 2017, nowadays all the distros that are
supported and I have build containers for work using just the system
headers, so ditch them.

Some of these older distros may not have things that are used in 'perf
trace', but then they also don't have libtraceevent packages, so don't
build 'perf trace'.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240315204835.748716-5-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo 6652830c87 perf beauty: Use the system linux/fcntl.h instead of a copy from the kernel
Builds ok all the way back to these older distros:

   1  almalinux:8    : Ok  gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-20) , clang version 16.0.6 (Red Hat 16.0.6-2.module_el8.9.0+3621+df7f7146) flex 2.6.1
   3  alpine:3.15    : Ok  gcc (Alpine 10.3.1_git20211027) 10.3.1 20211027 , Alpine clang version 12.0.1 flex 2.6.4
  15  debian:10      : Ok  gcc (Debian 8.3.0-6) 8.3.0 , Debian clang version 11.0.1-2~deb10u1 flex 2.6.4
  32  opensuse:15.4  : Ok  gcc (SUSE Linux) 7.5.0 , clang version 15.0.7 flex 2.6.4
  23  fedora:35      : Ok  gcc (GCC) 11.3.1 20220421 (Red Hat 11.3.1-3) , clang version 13.0.1 (Fedora 13.0.1-1.fc35) flex 2.6.4
  38  ubuntu:18.04   : Ok  gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0  flex 2.6.4

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240315204835.748716-4-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo eb01fe7abb perf beauty: Move prctl.h files (uapi/linux and x86's) copy out of the directory used to build perf
It is used only to generate string tables, not to build perf, so move it
to the tools/perf/trace/beauty/{include,arch}/ hierarchies, that is used
just for scraping.

This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.

No other tools/ living code uses it, just <linux/usbdevice_fs.h> coming
from either 'make install_headers' or from the system /usr/include/
directory.

Suggested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/20240315204835.748716-3-acme@kernel.org
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo f324b73c2c perf beauty: Stop using the copy of uapi/linux/prctl.h
Use the system one, nothing used in that file isn't available in the
supported, active distros.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
To: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/lkml/20240315204835.748716-3-acme@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo c8bfe3fad4 perf beauty: Move arch/x86/include/asm/irq_vectors.h copy out of the directory used to build perf
It is used only to generate string tables, not to build perf, so move it
to the tools/perf/trace/beauty/include/ hierarchy, that is used just for
scraping.

This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.

No other tools/ living code uses it.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo 7050e33e86 perf beauty: Move uapi/sound/asound.h copy out of the directory used to build perf
It is used only to generate string tables, not to build perf, so move it
to the tools/perf/trace/beauty/include/ hierarchy, that is used just for
scraping.

This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo 44512bd613 perf beauty: Move uapi/linux/usbdevice_fs.h copy out of the directory used to build perf
It is mostly used only to generate string tables, not to build perf, so
move it to the tools/perf/trace/beauty/include/ hierarchy, that is used
just for scraping.

This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.

No other tools/ living code uses it, just <linux/usbdevice_fs.h> coming
from either 'make install_headers' or from the system /usr/include/
directory.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo ab3316119f perf beauty: Move uapi/linux/mount.h copy out of the directory used to build perf
It is mostly used only to generate string tables, not to build perf, so
move it to the tools/perf/trace/beauty/include/ hierarchy, that is used
just for scraping.

This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.

No other tools/ living code uses it, just <linux/mount.h> coming from
either 'make install_headers' or from the system /usr/include/
directory.

Suggested-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo 22916d2cba perf beauty: Don't include uapi/linux/mount.h, use sys/mount.h instead
The tools/include/uapi/linux/mount.h file is mostly used for scrapping
defines into id->string tables, this is the only place were it is being
directly used, stop doing so.

Define MOUNT_ATTR_RELATIME and MOUNT_ATTR__ATIME if not available in the
system's headers.

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo faf7217a39 perf beauty: Move uapi/linux/fs.h copy out of the directory used to build perf
It is mostly used only to generate string tables, not to build perf, so
move it to the tools/perf/trace/beauty/include/ hierarchy, that is used
just for scraping.

The only case where it was being used to build was in
tools/perf/trace/beauty/sync_file_range.c, because some older systems
doesn't have the SYNC_FILE_RANGE_WRITE_AND_WAIT define, just use the
system's linux/fs.h header instead, defining it if not available.

This is a something that should've have happened, as happened with the
linux/socket.h scrapper, do it now as Ian suggested while doing an
audit/refactor session in the headers used by perf.

No other tools/ living code uses it, just <linux/fs.h> coming from
either 'make install_headers' or from the system /usr/include/
directory.

Suggested-by: Ian Rogers <irogers@google.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/CAP-5=fWZVrpRufO4w-S4EcSi9STXcTAN2ERLwTSN7yrSSA-otQ@mail.gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo 5d8c646038 perf beauty: Fix dependency of tables using uapi/linux/mount.h
Several such tables were depending on uapi/linux/fs.h, cut and paste
error when they were introduced, fix it.

Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Ze9vjxv42PN_QGZv@x1
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Bhaskar Chowdhury 4b3761eebb perf c2c: Fix a punctuation
s/dont/don\'t/

Signed-off-by: Bhaskar Chowdhury <unixbhaskar@gmail.com>
Acked-by: Randy Dunlap <rdunlap@infradead.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20210319232824.742-1-unixbhaskar@gmail.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:27 -03:00
Arnaldo Carvalho de Melo a9f4c6c999 perf trace: Collect sys_nanosleep first argument
That is a 'struct timespec' passed from userspace to the kernel as we
can see with a system wide syscall tracing:

  root@number:~# perf trace -e nanosleep
       0.000 (10.102 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
      38.924 (10.077 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     100.177 (10.107 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     139.171 (10.063 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     200.603 (10.105 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     239.399 (10.064 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     300.994 (10.096 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     339.584 (10.067 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     401.335 (10.057 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     439.758 (10.166 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     501.814 (10.110 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     539.983 (10.227 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     602.284 (10.199 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     640.208 (10.105 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     702.662 (10.163 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     740.440 (10.107 ms): podman/2195174 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
     802.993 (10.159 ms): podman/9150 nanosleep(rqtp: { .tv_sec: 0, .tv_nsec: 10000000 })                   = 0
  ^Croot@number:~# strace -p 9150 -e nanosleep

If we then use the ptrace method to look at that podman process:

  root@number:~# strace -p 9150 -e nanosleep
  strace: Process 9150 attached
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  nanosleep({tv_sec=0, tv_nsec=10000000}, NULL) = 0
  ^Cstrace: Process 9150 detached
  root@number:~#

With some changes we can get something closer to the strace output,
still in system wide mode:

  root@number:~# perf config trace.show_arg_names=false
  root@number:~# perf config trace.show_duration=false
  root@number:~# perf config trace.show_timestamp=false
  root@number:~# perf config trace.show_zeros=true
  root@number:~# perf config trace.args_alignment=0
  root@number:~# perf trace -e nanosleep --max-events=10
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/2195174 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  podman/9150 nanosleep({ .tv_sec: 0, .tv_nsec: 10000000 }, NULL) = 0
  root@number:~#
  root@number:~# perf config
  trace.show_arg_names=false
  trace.show_duration=false
  trace.show_timestamp=false
  trace.show_zeros=true
  trace.args_alignment=0
  root@number:~# cat ~/.perfconfig
  # this file is auto-generated.
  [trace]
  	show_arg_names = false
  	show_duration = false
  	show_timestamp = false
  	show_zeros = true
  	args_alignment = 0
  root@number:~#

This will not get reused by any other syscall as nanosleep is the only
one to have as its first argument a 'struct timespec" pointer argument
passed from userspace to the kernel:

  root@number:~# grep timespec /sys/kernel/tracing/events/syscalls/sys_enter_*/format | grep offset:16
  /sys/kernel/tracing/events/syscalls/sys_enter_nanosleep/format:	field:struct __kernel_timespec * rqtp;	offset:16;	size:8;	signed:0;
  root@number:~#

BTF based pretty printing will simplify all this, but then lets just get
the low hanging fruits first.

Reviewed-by: Ian Rogers <irogers@google.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zbq72dJRpOlfRWnf@kernel.org/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-03-21 10:41:26 -03:00
Namhyung Kim 0f66dfe7b9 perf annotate: Add comments in the data structures
Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240304230815.1440583-5-namhyung@kernel.org
2024-03-06 20:25:48 -08:00
Namhyung Kim f59e3660cd perf annotate: Remove sym_hist.addr[] array
It's not used anymore and the code is coverted to use a hash map.  Now
sym_hist has a static size, so no need to have sizeof_sym_hist in the
struct annotated_source.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240304230815.1440583-4-namhyung@kernel.org
2024-03-06 20:25:36 -08:00
Namhyung Kim 8015457584 perf annotate: Calculate instruction overhead using hashmap
Use annotated_source.samples hashmap instead of addr array in the
struct sym_hist.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240304230815.1440583-3-namhyung@kernel.org
2024-03-06 20:25:20 -08:00
Namhyung Kim d3e7cad6f3 perf annotate: Add a hashmap for symbol histogram
Now symbol histogram uses an array to save per-offset sample counts.
But it wastes a lot of memory if the symbol has a few samples only.
Add a hashmap to save values only for actual samples.

For now, it has duplicate histogram (one in the existing array and
another in the new hash map).  Once it can convert to use the hash
in all places, we can get rid of the array later.

Reviewed-by: Ian Rogers <irogers@google.com>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Andi Kleen <ak@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240304230815.1440583-2-namhyung@kernel.org
2024-03-06 20:24:55 -08:00
Ian Rogers 7bfc84b23e perf threads: Reduce table size from 256 to 8
The threads data structure is an array of hashmaps, previously
rbtrees. The two levels allows for a fixed outer array where access is
guarded by rw_semaphores. Commit 91e467bc56 ("perf machine: Use
hashtable for machine threads") sized the outer table at 256 entries
to avoid future scalability problems, however, this means the threads
struct is sized at 30,720 bytes. As the hashmaps allow O(1) access for
the common find/insert/remove operations, lower the number of entries
to 8. This reduces the size overhead to 960 bytes.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-8-irogers@google.com
2024-03-03 22:52:13 -08:00
Ian Rogers 412a2ff473 perf threads: Switch from rbtree to hashmap
The rbtree provides a sorting on entries but this is unused. Switch to
using hashmap for O(1) rather than O(log n) find/insert/remove
complexity.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-7-irogers@google.com
2024-03-03 22:52:04 -08:00
Ian Rogers 93bb5b0d93 perf threads: Move threads to its own files
Move threads out of machine and into its own file.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-6-irogers@google.com
2024-03-03 22:51:55 -08:00
Ian Rogers d436f90a64 perf machine: Move machine's threads into its own abstraction
Move thread_rb_node into the machine.c file. This hides the
implementation of threads from the rest of the code allowing for it to
be refactored.

Locking discipline is tightened up in this change. As the lock is now
encapsulated in threads, the findnew function requires holding it (as
it already did in machine). Rather than do conditionals with locks
based on whether the thread should be created (which could potentially
be error prone with a read lock match with a write unlock), have a
separate threads__find that won't create the thread and only holds the
read lock. This effectively duplicates the findnew logic, with the
existing findnew logic only operating under a write lock assuming
creation is necessary as a previous find failed. The creation may
still fail with the write lock due to another thread. The duplication
is removed in a later next patch that delegates the implementation to
hashtable.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-5-irogers@google.com
2024-03-03 22:51:44 -08:00
Ian Rogers 45ac4960d7 perf machine: Move fprintf to for_each loop and a callback
Avoid exposing the threads data structure by switching to the callback
machine__for_each_thread approach. machine__fprintf is only used in
tests and verbose >3 output so don't turn to list and sort. Add
machine__threads_nr to be refactored later.

Note, all existing *_fprintf routines ignore fprintf errors.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-4-irogers@google.com
2024-03-03 22:51:31 -08:00
Ian Rogers f178ffdf7e perf trace: Ignore thread hashing in summary
Commit 91e467bc56 ("perf machine: Use hashtable for machine
threads") made the iteration of thread tids unordered. The perf trace
--summary output sorts and prints each hash bucket, rather than all
threads globally. Change this behavior by turn all threads into a
list, sort the list by number of trace events then by tids, finally
print the list. This also allows the rbtree in threads to be not
accessed outside of machine.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-3-irogers@google.com
2024-03-03 22:51:18 -08:00
Ian Rogers 2f1e20feb9 perf report: Sort child tasks by tid
Commit 91e467bc56 ("perf machine: Use hashtable for machine
threads") made the iteration of thread tids unordered. The perf report
--tasks output now shows child threads in an order determined by the
hashing. For example, in this snippet tid 3 appears after tid 256 even
though they have the same ppid 2:

```
$ perf report --tasks
%      pid      tid     ppid  comm
         0        0       -1 |swapper
         2        2        0 | kthreadd
       256      256        2 |  kworker/12:1H-k
    693761   693761        2 |  kworker/10:1-mm
   1301762  1301762        2 |  kworker/1:1-mm_
   1302530  1302530        2 |  kworker/u32:0-k
         3        3        2 |  rcu_gp
...
```

The output is easier to read if threads appear numerically
increasing. To allow for this, read all threads into a list then sort
with a comparator that orders by the child task's of the first common
parent. The list creation and deletion are created as utilities on
machine.  The indentation is possible by counting the number of
parents a child has.

With this change the output for the same data file is now like:
```
$ perf report --tasks
%      pid      tid     ppid  comm
         0        0       -1 |swapper
         1        1        0 | systemd
       823      823        1 |  systemd-journal
       853      853        1 |  systemd-udevd
      3230     3230        1 |  systemd-timesyn
      3236     3236        1 |  auditd
      3239     3239     3236 |   audisp-syslog
      3321     3321        1 |  accounts-daemon
...
```

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Oliver Upton <oliver.upton@linux.dev>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301053646.1449657-2-irogers@google.com
2024-03-03 22:50:55 -08:00
Sandipan Das 498d348637 perf vendor events amd: Fix Zen 4 cache latency events
L3PMCx0AC and L3PMCx0AD, used in l3_xi_sampled_latency* events, have a
quirk that requires them to be programmed with SliceId set to 0x3.
Without this, the events do not count at all and affects dependent
metrics such as l3_read_miss_latency.

If ThreadMask is not specified, the amd-uncore driver internally sets
ThreadMask to 0x3, EnAllCores to 0x1 and EnAllSlices to 0x1 but does
not set SliceId. Since SliceId must also be set to 0x3 in this case,
specify all the other fields explicitly.

E.g.

  $ sudo perf stat -e l3_xi_sampled_latency.all,l3_xi_sampled_latency_requests.all -a sleep 1

Before:

   Performance counter stats for 'system wide':

                   0      l3_xi_sampled_latency.all
                   0      l3_xi_sampled_latency_requests.all

         1.005155399 seconds time elapsed

After:

   Performance counter stats for 'system wide':

             921,446      l3_xi_sampled_latency.all
              54,210      l3_xi_sampled_latency_requests.all

         1.005664472 seconds time elapsed

Fixes: 5b2ca349c3 ("perf vendor events amd: Add Zen 4 uncore events")
Signed-off-by: Sandipan Das <sandipan.das@amd.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: ananth.narayan@amd.com
Cc: ravi.bangoria@amd.com
Cc: eranian@google.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301084431.646221-1-sandipan.das@amd.com
2024-03-03 22:49:37 -08:00
James Clark 507ad2bde3 perf version: Display availability of OpenCSD support
This is useful for scripts that work with Perf and ETM trace. Rather
than them trying to parse Perf's error output at runtime to see if it
was linked or not.

Signed-off-by: James Clark <james.clark@arm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: al.grant@arm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240301133829.346286-1-james.clark@arm.com
2024-03-03 22:48:40 -08:00
Ian Rogers dd267d056f perf vendor events intel: Add umasks/occ_sel to PCU events.
UMasks were being dropped leading to all PCU
UNC_P_POWER_STATE_OCCUPANCY events having the same encoding. Don't
drop the umask trying to be consistent with other sources of events
like libpfm4 [1]. Older models need to use occ_sel rather than umask,
correct these values too. This applies the change from [2].

[1] https://sourceforge.net/p/perfmon2/libpfm4/ci/master/tree/lib/events/intel_skx_unc_pcu_events.h#l30
[2] cbd4aee810

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240228170529.4035675-1-irogers@google.com
2024-02-29 18:08:13 -08:00
Ian Rogers ec42d3d568 perf map: Fix map reference count issues
The find will get the map, ensure puts are done on all paths.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240229062048.558799-1-irogers@google.com
2024-02-29 18:06:00 -08:00
Namhyung Kim b44d665368 perf lock contention: Account contending locks too
Currently it accounts the contention using delta between timestamps in
lock:contention_begin and lock:contention_end tracepoints.  But it means
the lock should see the both events during the monitoring period.

Actually there are 4 cases that happen with the monitoring:

                monitoring period
            /                       \
            |                       |
 1:  B------+-----------------------+--------E
 2:    B----+-------------E         |
 3:         |           B-----------+----E
 4:         |     B-------------E   |
            |                       |
            t0                      t1

where B and E mean contention BEGIN and END, respectively.  So it only
accounts the case 4 for now.  It seems there's no way to handle the case
1.  The case 2 might be handled if it saved the timestamp (t0), but it
lacks the information from the B notably the flags which shows the lock
types.  Also it could be a nested lock which it currently ignores.  So
I think we should ignore the case 2.

However we can handle the case 3 if we save the timestamp (t1) at the
end of the period.  And then it can iterate the map entries in the
userspace and update the lock stat accordinly.

Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Reviewed-by: Ian Rogers <irogers@google.com>
Reviwed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Song Liu <song@kernel.org>
Cc: bpf@vger.kernel.org
Link: https://lore.kernel.org/r/20240228053335.312776-1-namhyung@kernel.org
2024-02-29 13:53:56 -08:00
Ian Rogers 97b6b4ac1c perf metrics: Fix segv for metrics with no events
A metric may have no events, for example, the transaction metrics on
x86 are dependent on there being TSX events. Fix a segv where an evsel
of NULL is dereferenced for a metric leader value.

Fixes: a59fb796a3 ("perf metrics: Compute unmerged uncore metrics individually")
Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240224011420.3066322-2-irogers@google.com
2024-02-29 13:40:13 -08:00
Ian Rogers d4be39cade perf metrics: Fix metric matching
The metric match function fails for cases like looking for "metric" in
the string "all;foo_metric;metric" as the "metric" in "foo_metric"
matches but isn't preceeded by a ';'. Fix this by matching the first
list item and recursively matching on failure the next item after a
semicolon.

Signed-off-by: Ian Rogers <irogers@google.com>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240224011420.3066322-1-irogers@google.com
2024-02-29 13:39:54 -08:00
Christophe JAILLET ef5de1613d perf pmu: Fix a potential memory leak in perf_pmu__lookup()
The commit in Fixes has reordered some code, but missed an error handling
path.

'goto err' now, in order to avoid a memory leak in case of error.

Fixes: f63a536f03 ("perf pmu: Merge JSON events with sysfs at load time")
Signed-off-by: Christophe JAILLET <christophe.jaillet@wanadoo.fr>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: kernel-janitors@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/9538b2b634894c33168dfe9d848d4df31fd4d801.1693085544.git.christophe.jaillet@wanadoo.fr
2024-02-26 21:43:00 -08:00
Colin Ian King eb94225eb4 perf test: Fix spelling mistake "curent" -> "current"
There is a spelling mistake in a pr_debug message. Fix it.

Signed-off-by: Colin Ian King <colin.i.king@gmail.com>
Reviewed-by: Adrian Hunter <adrian.hunter@intel.com>
Cc: kernel-janitors@vger.kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240226105326.3944887-1-colin.i.king@gmail.com
2024-02-26 21:41:27 -08:00
Arnaldo Carvalho de Melo 8680999dbe perf test: Use TEST_FAIL in the TEST_ASSERT macros instead of -1
Just to make things clearer, return TEST_FAIL (-1) instead of an open
coded -1.

Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/ZdepeMsjagbf1ufD@x1
2024-02-26 08:31:24 -08:00
Ilkka Koskinen bae4d1f86e perf data convert: Fix segfault when converting to json when cpu_desc isn't set
Arm64 doesn't have Model in /proc/cpuinfo and, thus, cpu_desc doesn't get
assigned.

Running
	$ perf data convert --to-json perf.data.json

ends up calling output_json_string() with NULL pointer, which causes a
segmentation fault.

Signed-off-by: Ilkka Koskinen <ilkka@os.amperecomputing.com>
Acked-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: James Clark <james.clark@arm.com>
Cc: Evgeny Pistun <kotborealis@awooo.ru>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240223220458.15282-1-ilkka@os.amperecomputing.com
2024-02-26 08:30:17 -08:00
Arnaldo Carvalho de Melo 529d5818a3 perf bpf: Check that the minimal vmlinux.h installed is the latest one
When building BPF skels perf will, by default, install a minimalistic
vmlinux.h file with the types needed by the BPF skels in
tools/perf/util/bpf_skel/ in its build directory.

When 29d16de26d ("perf augmented_raw_syscalls.bpf: Move 'struct
timespec64' to vmlinux.h") was added, a type used in the augmented_raw_syscalls
BPF skel, 'struct timespec64' was not found when building from a pre-existing
build directory, because the vmlinux.h there didn't contain that type,
ending up with this error, spotted in linux-next:

    CLANG   /tmp/build/perf-tools-next/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o
  util/bpf_skel/augmented_raw_syscalls.bpf.c:329:15: error: invalid application of 'sizeof' to an incomplete type 'struct timespec64'
    329 |         __u32 size = sizeof(struct timespec64);
        |                      ^     ~~~~~~~~~~~~~~~~~~~
  util/bpf_skel/augmented_raw_syscalls.bpf.c:329:29: note: forward declaration of 'struct timespec64'
    329 |         __u32 size = sizeof(struct timespec64);
        |                                    ^
  util/bpf_skel/augmented_raw_syscalls.bpf.c:350:15: error: invalid application of 'sizeof' to an incomplete type 'struct timespec64'
    350 |         __u32 size = sizeof(struct timespec64);
        |                      ^     ~~~~~~~~~~~~~~~~~~~
  util/bpf_skel/augmented_raw_syscalls.bpf.c:350:29: note: forward declaration of 'struct timespec64'
    350 |         __u32 size = sizeof(struct timespec64);
        |                                    ^
  2 errors generated.
  make[2]: *** [Makefile.perf:1158: /tmp/build/perf-tools-next/util/bpf_skel/.tmp/augmented_raw_syscalls.bpf.o] Error 1
  make[2]: *** Waiting for unfinished jobs....
  make[1]: *** [Makefile.perf:261: sub-make] Error 2
  make: *** [Makefile:113: install-bin] Error 2
  make: Leaving directory '/home/acme/git/perf-tools-next/tools/perf'

So add a Makefile dependency (Namhyung's suggestion) to make sure that
the new tools/perf/util/bpf_skel/vmlinux/vmlinux.h minimal vmlinux is
updated in the build directory, providing the moved 'struct timespec64'
type.

Fixes: 29d16de26d ("perf augmented_raw_syscalls.bpf: Move 'struct timespec64' to vmlinux.h")
Reported-by: Stephen Rothwell <sfr@canb.auug.org.au>
Reviewed-by: Ian Rogers <irogers@google.com>
Suggested-by: Namhyung Kim <namhyung@kernel.org>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/ZdoPrWg-qYFpBJbz@x1
2024-02-26 08:16:08 -08:00
Masahiro Yamada c2bd08ba20 treewide: remove meaningless assignments in Makefiles
In Makefiles, $(error ), $(warning ), and $(info ) expand to the empty
string, as explained in the GNU Make manual [1]:
 "The result of the expansion of this function is the empty string."

Therefore, they are no-op except for logging purposes.

$(shell ...) expands to the output of the command. It expands to the
empty string when the command does not print anything to stdout.
Hence, $(shell mkdir ...) is no-op except for creating the directory.

Remove meaningless assignments.

[1]: https://www.gnu.org/software/make/manual/make.html#Make-Control-Functions

Signed-off-by: Masahiro Yamada <masahiroy@kernel.org>
Reviewed-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Link: https://lore.kernel.org/r/20240221134201.2656908-1-masahiroy@kernel.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: linux-kbuild@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Cc: linux-perf-users@vger.kernel.org
2024-02-23 14:19:07 -08:00
Mark Rutland 25412c0364 perf print-events: make is_event_supported() more robust
Currently the perf tool doesn't detect support for extended event types
on Apple M1/M2 systems, and will not auto-expand plain PERF_EVENT_TYPE
hardware events into per-PMU events. This is due to the detection of
extended event types not handling mandatory filters required by the
M1/M2 PMU driver.

PMU drivers and the core perf_events code can require that
perf_event_attr::exclude_* filters are configured in a specific way and
may reject certain configurations of filters, for example:

(a) Many PMUs lack support for any event filtering, and require all
    perf_event_attr::exclude_* bits to be clear. This includes Alpha's
    CPU PMU, and ARM CPU PMUs prior to the introduction of PMUv2 in
    ARMv7,

(b) When /proc/sys/kernel/perf_event_paranoid >= 2, the perf core
    requires that perf_event_attr::exclude_kernel is set.

(c) The Apple M1/M2 PMU requires that perf_event_attr::exclude_guest is
    set as the hardware PMU does not count while a guest is running (but
    might be extended in future to do so).

In is_event_supported(), we try to account for cases (a) and (b), first
attempting to open an event without any filters, and if this fails,
retrying with perf_event_attr::exclude_kernel set. We do not account for
case (c), or any other filters that drivers could theoretically require
to be set.

Thus is_event_supported() will fail to detect support for any events
targeting an Apple M1/M2 PMU, even where events would be supported with
perf_event_attr:::exclude_guest set.

Since commit:

  82fe2e45cd ("perf pmus: Check if we can encode the PMU number in perf_event_attr.type")

... we use is_event_supported() to detect support for extended types,
with the PMU ID encoded into the perf_event_attr::type. As above, on an
Apple M1/M2 system this will always fail to detect that the event is
supported, and consequently we fail to detect support for extended types
even when these are supported, as they have been since commit:

  5c81672865 ("arm_pmu: Add PERF_PMU_CAP_EXTENDED_HW_TYPE capability")

Due to this, the perf tool will not automatically expand plain
PERF_TYPE_HARDWARE events into per-PMU events, even when all the
necessary kernel support is present.

This patch updates is_event_supported() to additionally try opening
events with perf_event_attr::exclude_guest set, allowing support for
events to be detected on Apple M1/M2 systems. I believe that this is
sufficient for all contemporary CPU PMU drivers, though in future it may
be necessary to check for other combinations of filter bits.

I've deliberately changed the check to not expect a specific error code
for missing filters, as today ;the kernel may return a number of
different error codes for missing filters (e.g. -EACCESS, -EINVAL, or
-EOPNOTSUPP) depending on why and where the filter configuration is
rejected, and retrying for any error is more robust.

Note that this does not remove the need for commit:

  a24d9d9dc0 ("perf parse-events: Make legacy events lower priority than sysfs/JSON")

... which is still necessary so that named-pmu/event/ events work on
kernels without extended type support, even if the event name happens to
be the same as a PERF_EVENT_TYPE_HARDWARE event (e.g. as is the case for
the M1/M2 PMU's 'cycles' and 'instructions' events).

Fixes: 82fe2e45cd ("perf pmus: Check if we can encode the PMU number in perf_event_attr.type")
Signed-off-by: Mark Rutland <mark.rutland@arm.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: James Clark <james.clark@arm.com>
Tested-by: Marc Zyngier <maz@kernel.org>
Cc: Hector Martin <marcan@marcan.st>
Cc: James Clark <james.clark@arm.com>
Cc: John Garry <john.g.garry@oracle.com>
Cc: Leo Yan <leo.yan@linux.dev>
Cc: Mike Leach <mike.leach@linaro.org>
Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
Cc: Thomas Richter <tmricht@linux.ibm.com>
Cc: Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240126145605.1005472-1-mark.rutland@arm.com
2024-02-23 14:16:33 -08:00
Ian Rogers b482f5f8e0 perf tests: Add option to run tests in parallel
By default tests are forked, add an option (-p or --parallel) so that
the forked tests are all started in parallel and then their output
gathered serially. This is opt-in as running in parallel can cause
test flakes.

Rather than fork within the code, the start_command/finish_command
from libsubcmd are used. This changes how stderr and stdout are
handled. The child stderr and stdout are always read to avoid the
child blocking. If verbose is 1 (-v) then if the test fails the child
stdout and stderr are displayed. If the verbose is >1 (e.g. -vv) then
the stdout and stderr from the child are immediately displayed.

An unscientific test on my laptop shows the wall clock time for perf
test without parallel being 5 minutes 21 seconds and with parallel
(-p) being 1 minute 50 seconds.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-9-irogers@google.com
2024-02-22 09:13:20 -08:00
Ian Rogers 964461ee37 perf tests: Run time generate shell test suites
Rather than special shell test logic, do a single pass to create an
array of test suites. Hold the shell test file name in the test suite
priv field. This makes the special shell test logic in builtin-test.c
redundant so remove it.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-8-irogers@google.com
2024-02-22 09:13:06 -08:00
Ian Rogers f3295f5b06 perf tests: Use scandirat for shell script finding
Avoid filename appending buffers by using openat, faccessat and
scandirat more widely. Turn the script's path back to a file name
using readlink from /proc/<pid>/fd/<fd>.

Read the script's description using api/io.h to avoid fdopen
conversions. Whilst reading perform additional sanity checks on the
script's contents.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-7-irogers@google.com
2024-02-22 09:12:53 -08:00
Ian Rogers d5bcade989 perf test: Rename builtin-test-list and add missed header guard
builtin-test-list is primarily concerned with shell script
tests. Rename the file to better reflect this and add a missed header
guard.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-6-irogers@google.com
2024-02-22 09:12:40 -08:00
Ian Rogers 526f2ac9f6 perf tests: Avoid fork in perf_has_symbol test
perf test -vv Symbols is used to indentify symbols within the perf
binary. Add the -F flag so that the test command doesn't fork the test
before running. This removes a little overhead.

Acked-by: Adrian Hunter <adrian.hunter@intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-4-irogers@google.com
2024-02-22 09:12:04 -08:00
Ian Rogers 8ece26ad5a perf list: Add scandirat compatibility function
scandirat is used during the printing of tracepoint events but may be
missing from certain libcs. Add a compatibility implementation that
uses the symlink of an fd in /proc as a path for the reliably present
scandir.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-3-irogers@google.com
2024-02-22 09:11:41 -08:00
Ian Rogers 510e528786 perf thread_map: Skip exited threads when scanning /proc
Scanning /proc is inherently racy. Scanning /proc/pid/task within that
is also racy as the pid can terminate. Rather than failing in
__thread_map__new_all_cpus, skip pids for such failures.

Signed-off-by: Ian Rogers <irogers@google.com>
Cc: James Clark <james.clark@arm.com>
Cc: Justin Stitt <justinstitt@google.com>
Cc: Bill Wendling <morbo@google.com>
Cc: Nick Desaulniers <ndesaulniers@google.com>
Cc: Yang Jihong <yangjihong1@huawei.com>
Cc: Nathan Chancellor <nathan@kernel.org>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Athira Jajeev <atrajeev@linux.vnet.ibm.com>
Cc: llvm@lists.linux.dev
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221034155.1500118-2-irogers@google.com
2024-02-22 09:11:03 -08:00
Thomas Richter b6968f9b50 perf list: fix short description for some cache events
Correct the short description of the following events:
DCW_REQ, DCW_REQ_CHIP_HIT, DCW_REQ_DRAWER_HIT, DCW_REQ_IV,
DCW_ON_CHIP, DCW_ON_CHIP_IV, DCW_ON_CHIP_CHIP_HIT,
DCW_ON_CHIP_DRAWER_HIT, CW_ON_MODULE, DCW_ON_DRAWER,
DCW_OFF_DRAWER, IDCW_ON_MODULE_IV, IDCW_ON_MODULE_CHIP_HIT,
IDCW_ON_MODULE_DRAWER_HIT, IDCW_ON_DRAWER_IV, IDCW_ON_DRAWER_CHIP_HIT,
IDCW_ON_DRAWER_DRAWER_HIT, IDCW_OFF_DRAWER_IV, IDCW_OFF_DRAWER_CHIP_HIT,
IDCW_OFF_DRAWER_DRAWER_HIT, ICW_REQ, ICW_REQ_IV, CW_REQ_CHIP_HIT,
ICW_REQ_DRAWER_HIT, ICW_ON_CHIP, ICW_ON_CHIP_IV, ICW_ON_CHIP_CHIP_HIT,
ICW_ON_CHIP_DRAWER_HIT, ICW_ON_MODULE and ICW_OFF_DRAWER.

The second Cache should be L2-Cache.

Output before (display diff of the first four events)
  # perf list -d
  DCW_REQ
       [Directory Write Level 1 Data Cache from Cache. Unit: cpum_cf]
  DCW_REQ_CHIP_HIT
       [Directory Write Level 1 Data Cache from Cache with Chip HP \
	       Hit. Unit: cpum_cf]
  DCW_REQ_DRAWER_HIT
       [Directory Write Level 1 Data Cache from Cache with Drawer \
	       HP Hit. Unit: cpum_cf]
  DCW_REQ_IV
       [Directory Write Level 1 Data Cache from Cache with Intervention. \
	       Unit: cpum_cf]

Output after:
  # perf list -d
  DCW_REQ
       [Directory Write Level 1 Data Cache from L2-Cache. Unit: cpum_cf]
  DCW_REQ_CHIP_HIT
       [Directory Write Level 1 Data Cache from L2-Cache with Chip HP \
	       Hit. Unit: cpum_cf]
  DCW_REQ_DRAWER_HIT
       [Directory Write Level 1 Data Cache from L2-Cache with Drawer \
	       HP Hit. Unit: cpum_cf]
  DCW_REQ_IV
       [Directory Write Level 1 Data Cache from L2-Cache with \
	       Intervention. Unit: cpum_cf]

Fixes: 7f76b31130 ("perf list: Add IBM z16 event description for s390")
Reported-by: Andreas Krebbel <krebbel@linux.ibm.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Acked-by: Andreas Krebbel <krebbel@linux.ibm.com>
Reviewed-by: Ian Rogers <irogers@google.com>
Cc: gor@linux.ibm.com
Cc: hca@linux.ibm.com
Cc: sumanthk@linux.ibm.com
Cc: svens@linux.ibm.com
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221091908.1759083-1-tmricht@linux.ibm.com
2024-02-22 09:02:59 -08:00
Ian Rogers bafd4e75c1 perf stat: Fix metric-only aggregation index
Aggregation index was being computed using the evsel's cpumap which
may have a different (typically the same or fewer) entries.

Before:
```
$ perf stat --metric-only -A -M memory_bandwidth_total -a sleep 1

 Performance counter stats for 'system wide':

       MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total
CPU0                            12.8                           0.0                          12.9                          12.7                           0.0                          12.6
CPU1

       1.007806367 seconds time elapsed
```

After:
```
$ perf stat --metric-only -A -M memory_bandwidth_total -a sleep 1

 Performance counter stats for 'system wide':

       MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total MB/s  memory_bandwidth_total
CPU0                            15.4                           0.0                          15.3                          15.0                           0.0                          14.9
CPU18                            0.0                           0.0                          13.5                           5.2                           0.0                          11.9

       1.007858736 seconds time elapsed
```

Signed-off-by: Ian Rogers <irogers@google.com>                                  |
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: K Prateek Nayak <kprateek.nayak@amd.com>
Cc: Stephane Eranian <eranian@google.com>
Cc: Kaige Ye <ye@kaige.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: John Garry <john.g.garry@oracle.com>
Signed-off-by: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/r/20240221070754.4163916-3-irogers@google.com
2024-02-22 08:57:29 -08:00