Commit graph

1248891 commits

Author SHA1 Message Date
Arnaldo Carvalho de Melo fdd0ae72b3 perf tools headers: update the asm-generic/unaligned.h copy with the kernel sources
To pick up the changes in:

  1ab33c0314 ("asm-generic: make sparse happy with odd-sized put_unaligned_*()")

Addressing this perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/asm-generic/unaligned.h include/asm-generic/unaligned.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitry Torokhov <dmitry.torokhov@gmail.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/Zbp9I7rmFj1Owhug@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-31 14:02:41 -03:00
Arnaldo Carvalho de Melo 1f8c43b09e tools include UAPI: Sync linux/mount.h copy with the kernel sources
To pick the changes from:

  35e27a5744 ("fs: keep struct mnt_id_req extensible")
  b4c2bea8ce ("add listmount(2) syscall")
  46eae99ef7 ("add statmount(2) syscall")

That doesn't change anything in tools this time as nothing that is
harvested by the beauty scripts got changed:

  $ ls -1 tools/perf/trace/beauty/*mount*sh
  tools/perf/trace/beauty/fsmount.sh
  tools/perf/trace/beauty/mount_flags.sh
  tools/perf/trace/beauty/move_mount_flags.sh
  $

This addresses this perf build warning.

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/mount.h include/uapi/linux/mount.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZbkMiB7ZcOsLP2V5@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-30 11:49:54 -03:00
James Clark 7814fe24a6 perf evlist: Fix evlist__new_default() for > 1 core PMU
The 'Session topology' test currently fails with this message when
evlist__new_default() opens more than one event:

  32: Session topology                                                :
  --- start ---
  templ file: /tmp/perf-test-vv5YzZ
  Using CPUID 0x00000000410fd070
  Opening: unknown-hardware:HG
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    config                           0xb00000000
    disabled                         1
  ------------------------------------------------------------
  sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 4
  Opening: unknown-hardware:HG
  ------------------------------------------------------------
  perf_event_attr:
    type                             0 (PERF_TYPE_HARDWARE)
    config                           0xa00000000
    disabled                         1
  ------------------------------------------------------------
  sys_perf_event_open: pid 0  cpu -1  group_fd -1  flags 0x8 = 5
  non matching sample_type
  FAILED tests/topology.c:73 can't get session
  ---- end ----
  Session topology: FAILED!

This is because when re-opening the file and parsing the header, Perf
expects that any file that has more than one event has the sample ID
flag set. Perf record already sets the flag in a similar way when there
is more than one event, so add the same logic to evlist__new_default().

evlist__new_default() is only currently used in tests, so I don't
expect this change to have any other side effects. The other tests that
use it don't save and re-open the file so don't hit this issue.

The session topology test has been failing on Arm big.LITTLE platforms
since commit 251aa04024 ("perf parse-events: Wildcard most
"numeric" events") when evlist__new_default() started opening multiple
events for 'cycles'.

Fixes: 251aa04024 ("perf parse-events: Wildcard most "numeric" events")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: James Clark <james.clark@arm.com>
[ This was failing as well on a Rocket Lake Refresh/14700k Intel hybrid system - Arnaldo ]
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Tested-by: Ian Rogers <irogers@google.com>
Tested-by: Kan Liang <kan.liang@linux.intel.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Changbin Du <changbin.du@huawei.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Yang Jihong <yangjihong1@huawei.com>
Closes: https://lore.kernel.org/lkml/CAP-5=fWVQ-7ijjK3-w1q+k2WYVNHbAcejb-xY0ptbjRw476VKA@mail.gmail.com/
Link: https://lore.kernel.org/r/20240124094358.489372-1-james.clark@arm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-30 11:40:28 -03:00
Arnaldo Carvalho de Melo efe80f9c90 tools headers: Update the copy of x86's mem{cpy,set}_64.S used in 'perf bench'
This is to get the changes from:

  94ea9c0521 ("x86/headers: Replace #include <asm/export.h> with #include <linux/export.h>")
  10f4c9b9a3 ("x86/asm: Fix build of UML with KASAN")

That addresses these perf tools build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/lib/memcpy_64.S arch/x86/lib/memcpy_64.S
    diff -u tools/arch/x86/lib/memset_64.S arch/x86/lib/memset_64.S

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Masahiro Yamada <masahiroy@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Vincent Whitchurch <vincent.whitchurch@axis.com>
Link: https://lore.kernel.org/lkml/ZbkIKpKdNqOFdMwJ@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-30 11:31:15 -03:00
Arnaldo Carvalho de Melo 15d6daad8f tools headers x86 cpufeatures: Sync with the kernel sources to pick TDX, Zen, APIC MSR fence changes
To pick the changes from:

  1e536e1068 ("x86/cpu: Detect TDX partial write machine check erratum")
  765a0542fd ("x86/virt/tdx: Detect TDX during kernel boot")
  30fa92832f ("x86/CPU/AMD: Add ZenX generations flags")
  04c3024560 ("x86/barrier: Do not serialize MSR accesses on AMD")

This causes these perf files to be rebuilt and brings some X86_FEATURE
that will be used when updating the copies of
tools/arch/x86/lib/mem{cpy,set}_64.S with the kernel sources:

      CC       /tmp/build/perf/bench/mem-memcpy-x86-64-asm.o
      CC       /tmp/build/perf/bench/mem-memset-x86-64-asm.o

And addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/cpufeatures.h arch/x86/include/asm/cpufeatures.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Borislav Petkov <bp@alien8.de>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-30 10:09:18 -03:00
Arnaldo Carvalho de Melo 21fdd8dd37 tools headers UAPI: Sync unistd.h to pick {list,stat}mount, lsm_{[gs]et_self_attr,list_modules} syscall numbers
To pick the changes in these csets:

  d8b0f54650 ("wire up syscalls for statmount/listmount")
  5f42375904 ("LSM: wireup Linux Security Module syscalls")

Used in some architectures to create syscall tables.

This addresses this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/asm-generic/unistd.h include/uapi/asm-generic/unistd.h

Cc: Casey Schaufler <casey@schaufler-ca.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Paul Moore <paul@paul-moore.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZbfMuAlUMRO9Hqa6@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-29 13:05:44 -03:00
Ian Rogers becc24e96a perf vendor events intel: Alderlake/sapphirerapids metric fixes
As events are deduplicated by name, ensure PMU prefixes are always
used in metrics. Previously they may be missed on the first event in a
formula.

Update metric constraints for architectures with topdown l2 events.

Conversion script updated in:
https://github.com/intel/perfmon/pull/128

Reported-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Reviewed-by: Kan Liang <kan.liang@linux.intel.com>
Signed-off-by: Ian Rogers <irogers@google.com>
Tested-by: Arnaldo Carvalho de Melo <acme@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Edward Baker <edward.baker@intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Closes: https://lore.kernel.org/lkml/ZZam-EG-UepcXtWw@kernel.org/
Link: https://lore.kernel.org/r/20240104231903.775717-1-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-27 16:31:41 -03:00
Arnaldo Carvalho de Melo e30dca91e5 tools headers UAPI: Sync kvm headers with the kernel sources
To pick the changes in:

  a5d3df8ae1 ("KVM: remove deprecated UAPIs")
  6d72283526 ("KVM x86/xen: add an override for PVCLOCK_TSC_STABLE_BIT")
  89ea60c2c7 ("KVM: x86: Add support for "protected VMs" that can utilize private memory")
  8dd2eee9d5 ("KVM: x86/mmu: Handle page fault for private memory")
  a7800aa80e ("KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory")
  5a475554db ("KVM: Introduce per-page memory attributes")
  16f95f3b95 ("KVM: Add KVM_EXIT_MEMORY_FAULT exit to report faults to userspace")
  bb58b90b1a ("KVM: Introduce KVM_SET_USER_MEMORY_REGION2")
  3f9cd0ca84 ("KVM: arm64: Allow userspace to get the writable masks for feature ID registers")

That automatically adds support for some new ioctls and remove a bunch
of deprecated ones.

This ends up making the new binary to forget about the deprecated one,
so when used in an older system it will not be able to resolve those
codes to strings.

  $ tools/perf/trace/beauty/kvm_ioctl.sh > before
  $ cp include/uapi/linux/kvm.h tools/include/uapi/linux/kvm.h
  $ tools/perf/trace/beauty/kvm_ioctl.sh > after
  $ diff -u before after
  --- before	2024-01-27 14:48:16.523014020 -0300
  +++ after	2024-01-27 14:48:24.183932866 -0300
  @@ -14,6 +14,7 @@
   	[0x46] = "SET_USER_MEMORY_REGION",
   	[0x47] = "SET_TSS_ADDR",
   	[0x48] = "SET_IDENTITY_MAP_ADDR",
  +	[0x49] = "SET_USER_MEMORY_REGION2",
   	[0x60] = "CREATE_IRQCHIP",
   	[0x61] = "IRQ_LINE",
   	[0x62] = "GET_IRQCHIP",
  @@ -22,14 +23,8 @@
   	[0x65] = "GET_PIT",
   	[0x66] = "SET_PIT",
   	[0x67] = "IRQ_LINE_STATUS",
  -	[0x69] = "ASSIGN_PCI_DEVICE",
   	[0x6a] = "SET_GSI_ROUTING",
  -	[0x70] = "ASSIGN_DEV_IRQ",
   	[0x71] = "REINJECT_CONTROL",
  -	[0x72] = "DEASSIGN_PCI_DEVICE",
  -	[0x73] = "ASSIGN_SET_MSIX_NR",
  -	[0x74] = "ASSIGN_SET_MSIX_ENTRY",
  -	[0x75] = "DEASSIGN_DEV_IRQ",
   	[0x76] = "IRQFD",
   	[0x77] = "CREATE_PIT2",
   	[0x78] = "SET_BOOT_CPU_ID",
  @@ -66,7 +61,6 @@
   	[0x9f] = "GET_VCPU_EVENTS",
   	[0xa0] = "SET_VCPU_EVENTS",
   	[0xa3] = "ENABLE_CAP",
  -	[0xa4] = "ASSIGN_SET_INTX_MASK",
   	[0xa5] = "SIGNAL_MSI",
   	[0xa6] = "GET_XCRS",
   	[0xa7] = "SET_XCRS",
  @@ -97,6 +91,8 @@
   	[0xcd] = "SET_SREGS2",
   	[0xce] = "GET_STATS_FD",
   	[0xd0] = "XEN_HVM_EVTCHN_SEND",
  +	[0xd2] = "SET_MEMORY_ATTRIBUTES",
  +	[0xd4] = "CREATE_GUEST_MEMFD",
   	[0xe0] = "CREATE_DEVICE",
   	[0xe1] = "SET_DEVICE_ATTR",
   	[0xe2] = "GET_DEVICE_ATTR",
  $

This silences these perf build warnings:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/kvm.h include/uapi/linux/kvm.h
    diff -u tools/arch/x86/include/uapi/asm/kvm.h arch/x86/include/uapi/asm/kvm.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Chao Peng <chao.p.peng@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jing Zhang <jingzhangos@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Oliver Upton <oliver.upton@linux.dev>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Paul Durrant <pdurrant@amazon.com>
Cc: Sean Christopherson <seanjc@google.com>
Link: https://lore.kernel.org/lkml/ZbVLbkngp4oq13qN@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-27 15:29:47 -03:00
Sun Haiyong 7bbe8f0071 perf tools: Fix calloc() arguments to address error introduced in gcc-14
the definition of calloc is as follows:

    void *calloc(size_t nmemb, size_t size);

number of members is in the first parameter and the size is in the
second parameter.

Fix error messages on gcc 14 20240102:

  error: 'calloc' sizes specified with 'sizeof' in the earlier argument and
  not in the later argument [-Werror=calloc-transposed-args]

Committer notes:

I noticed this on fedora 40 and rawhide.

Signed-off-by: Sun Haiyong <sunhaiyong@loongson.cn>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Tested-by: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20240106094129.3337057-1-siyanteng@loongson.cn
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 12:56:01 -03:00
Sun Haiyong 79baac8acf perf top: Remove needless malloc(0) call that triggers -Walloc-size
GCC 14 introduces a new -Walloc-size included in -Wextra which errors out
like:

  builtin-top.c: In function ‘prompt_integer’:
  builtin-top.c:360:21: error: allocation of insufficient size ‘0’ for
  type ‘char’ with size ‘1’ [-Werror=alloc-size]
    360 |         char *buf = malloc(0), *p;
        |                     ^~~~~~

Just set it to NULL, getline() will do the allocation.

Signed-off-by: Sun Haiyong <sunhaiyong@loongson.cn>
Signed-off-by: Yanteng Si <siyanteng@loongson.cn>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Link: https://lore.kernel.org/r/20231204082055.91877-1-siyanteng@loongson.cn
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 12:06:12 -03:00
Yicong Yang 39af674139 perf build: Make minimal shellcheck version to v0.6.0
The perf build failed due to the shellcheck on my machine (v0.4.6 on Ubuntu
18.04.1 LTS) doesn't support -a/--check-sourced and -S/--severity option.

These two options are introduced in shellcheck v0.4.7 and v0.6.0
respectively. So restrict the minimal version of shellcheck to v0.6.0.

Fixes: b809fc656e ("perf build: Shellcheck support for OUTPUT directory")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Junhao He <hejunhao3@huawei.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: linuxarm@huawei.com
Link: https://lore.kernel.org/r/20240122080406.28678-1-yangyicong@huawei.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 12:06:12 -03:00
Arnaldo Carvalho de Melo 1233d1d54b tools headers UAPI: Update tools's copy of drm.h headers to pick DRM_IOCTL_MODE_CLOSEFB
Picking the changes from:

  8570c27932 ("drm/syncobj: Add deadline support for syncobj waits")
  9724ed6c1b ("drm: Introduce DRM_CLIENT_CAP_CURSOR_PLANE_HOTSPOT")
  e4d983acff ("drm: introduce DRM_CAP_ATOMIC_ASYNC_PAGE_FLIP")
  d208d87566 ("drm: introduce CLOSEFB IOCTL")
  afa5cf3175 ("drm/i915/uapi: fix typos/spellos and punctuation")

Addressing these perf build warnings:

  Warning: Kernel ABI header differences:

Now 'perf trace' and other code that might use the
tools/perf/trace/beauty autogenerated tables will be able to translate
this new ioctl command into a string:

  $ tools/perf/trace/beauty/drm_ioctl.sh > before
  $ cp include/uapi/drm/drm.h tools/include/uapi/drm/drm.h
  $ tools/perf/trace/beauty/drm_ioctl.sh > after
  $ diff -u before after
  --- before	2024-01-26 10:54:23.486381862 -0300
  +++ after	2024-01-26 10:54:35.767902442 -0300
  @@ -109,6 +109,7 @@
   	[0xCD] = "SYNCOBJ_TIMELINE_SIGNAL",
   	[0xCE] = "MODE_GETFB2",
   	[0xCF] = "SYNCOBJ_EVENTFD",
  +	[0xD0] = "MODE_CLOSEFB",
   	[DRM_COMMAND_BASE + 0x00] = "I915_INIT",
   	[DRM_COMMAND_BASE + 0x01] = "I915_FLUSH",
   	[DRM_COMMAND_BASE + 0x02] = "I915_FLIP",
  $

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dmitry Baryshkov <dmitry.baryshkov@linaro.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Javier Martinez Canillas <javierm@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Randy Dunlap <rdunlap@infradead.org>
Cc: Rob Clark <robdclark@chromium.org>
Cc: Simon Ser <contact@emersion.fr>
Cc: Tvrtko Ursulin <tvrtko.ursulin@intel.com>
Cc: Zack Rusin <zack.rusin@broadcom.com>
Link: https://lore.kernel.org/lkml/ZbPIN9Dcc5AM0uxo@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 11:57:34 -03:00
Ian Rogers 9a8dd2f24d perf test shell daemon: Make signal test less racy
The daemon signal test sends signals and then expects files to be
written. It was observed on an Intel Alderlake that the signals were
sent too quickly leading to the 3 expected files not appearing.

To avoid this send the next signal only after the expected previous file
has appeared. To avoid an infinite loop the number of retries is
limited.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Jiri Olsa <jolsa@kernel.org>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-6-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:49 -03:00
Ian Rogers 1c2124ec84 perf test shell script: Fix test for python being disabled
"grep -cv" can exit with an error code that causes the "set -e" to abort
the script. Switch to using the grep exit code in the if condition to
avoid this.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-5-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:49 -03:00
Ian Rogers a734c7f969 perf test: Workaround debug output in list test
Write the JSON output to a specific file to avoid debug output
breaking it.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-4-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:49 -03:00
Ian Rogers 79bacb6ad7 perf list: Add output file option
Add an option to write the 'perf list' output to a specific file. This
can avoid issues with debug output being written into the output stream.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-3-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:49 -03:00
Ian Rogers 9d95c6be48 perf list: Switch error message to pr_err() to respect debug settings (-v)
Using printf() can interrupt 'perf list output', use pr_err() which can
respect debug settings and the debug file.

Signed-off-by: Ian Rogers <irogers@google.com>
Acked-by: Namhyung Kim <namhyung@kernel.org>
Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Athira Rajeev <atrajeev@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kajol Jain <kjain@linux.ibm.com>
Cc: Kan Liang <kan.liang@linux.intel.com>
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Ravi Bangoria <ravi.bangoria@amd.com>
Cc: Ross Zwisler <zwisler@chromium.org>
Cc: Shirisha G <shirisha@linux.ibm.com>
Link: https://lore.kernel.org/r/20240124043015.1388867-2-irogers@google.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:48 -03:00
Thomas Richter 2dac1f089a perf test: Fix 'perf script' tests on s390
In linux next repo, test case 'perf script tests' fails on s390.

The root case is a command line invocation of 'perf record' with
call-graph information. On s390 only DWARF formatted call-graphs are
supported and only on software events.

Change the command line parameters for s390.

Output before:

  # perf test 89
  89: perf script tests              : FAILED!
  #

Output after:

  # perf test 89
  89: perf script tests              : Ok
  #

Fixes: 0dd5041c9a ("perf addr_location: Add init/exit/copy functions")
Reviewed-by: Ian Rogers <irogers@google.com>
Signed-off-by: Thomas Richter <tmricht@linux.ibm.com>
Cc: Heiko Carstens <hca@linux.ibm.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Sumanth Korikkar <sumanthk@linux.ibm.com>
Cc: Sven Schnelle <svens@linux.ibm.com>
Cc: Vasily Gorbik <gor@linux.ibm.com>
Link: https://lore.kernel.org/r/20240125100351.936262-1-tmricht@linux.ibm.com
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:48 -03:00
Arnaldo Carvalho de Melo b0dc992155 tools headers UAPI: Sync linux/fcntl.h with the kernel sources
To get the changes in:

  8a924db2d7 ("fs: Pass AT_GETATTR_NOSEC flag to getattr interface function")

That don't add anything that is handled by existing hard coded tables or
table generation scripts.

This silences this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/fcntl.h include/uapi/linux/fcntl.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Namhyung Kim <namhyung@kernel.org>
Cc: Stefan Berger <stefanb@linux.ibm.com>
Link: https://lore.kernel.org/lkml/ZbJv9fGF_k2xXEdr@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:48 -03:00
Arnaldo Carvalho de Melo 1743726689 tools arch x86: Sync the msr-index.h copy with the kernel sources to pick IA32_MKTME_KEYID_PARTITIONING
To pick up the changes in:

  765a0542fd ("x86/virt/tdx: Detect TDX during kernel boot")

Addressing this tools/perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/arch/x86/include/asm/msr-index.h arch/x86/include/asm/msr-index.h

That makes the beautification scripts to pick some new entries:

  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > before
  $ cp arch/x86/include/asm/msr-index.h tools/arch/x86/include/asm/msr-index.h
  $ tools/perf/trace/beauty/tracepoints/x86_msr.sh > after
  $ diff -u before after
  --- before	2024-01-25 11:08:12.363223880 -0300
  +++ after	2024-01-25 11:08:24.839307699 -0300
  @@ -21,6 +21,7 @@
   	[0x0000004f] = "PPIN",
   	[0x00000060] = "LBR_CORE_TO",
   	[0x00000079] = "IA32_UCODE_WRITE",
  +	[0x00000087] = "IA32_MKTME_KEYID_PARTITIONING",
   	[0x0000008b] = "AMD64_PATCH_LEVEL",
   	[0x0000008C] = "IA32_SGXLEPUBKEYHASH0",
   	[0x0000008D] = "IA32_SGXLEPUBKEYHASH1",
  $

Now one can trace systemwide asking to see backtraces to where that MSR
is being read/written, see this example with a previous update:

  # perf trace -e msr:*_msr/max-stack=32/ --filter="msr==IA32_MKTME_KEYID_PARTITIONING"
  ^C#

If we use -v (verbose mode) we can see what it does behind the scenes:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_MKTME_KEYID_PARTITIONING"
  Using CPUID GenuineIntel-6-8E-A
  0x87
  New filter for msr:read_msr: (msr==0x87) && (common_pid != 58627 && common_pid != 3792)
  0x87
  New filter for msr:write_msr: (msr==0x87) && (common_pid != 58627 && common_pid != 3792)
  mmap size 528384B
  ^C#

Example with a frequent msr:

  # perf trace -v -e msr:*_msr/max-stack=32/ --filter="msr==IA32_SPEC_CTRL" --max-events 2
  Using CPUID AuthenticAMD-25-21-0
  0x48
  New filter for msr:read_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  0x48
  New filter for msr:write_msr: (msr==0x48) && (common_pid != 2612129 && common_pid != 3841)
  mmap size 528384B
  Looking at the vmlinux_path (8 entries long)
  symsrc__init: build id mismatch for vmlinux.
  Using /proc/kcore for kernel data
  Using /proc/kallsyms for symbols
   0.000 Timer/2525383 msr:write_msr(msr: IA32_SPEC_CTRL, val: 6)
  				   do_trace_write_msr ([kernel.kallsyms])
  				   do_trace_write_msr ([kernel.kallsyms])
  				   __switch_to_xtra ([kernel.kallsyms])
  				   __switch_to ([kernel.kallsyms])
  				   __schedule ([kernel.kallsyms])
  				   schedule ([kernel.kallsyms])
  				   futex_wait_queue_me ([kernel.kallsyms])
  				   futex_wait ([kernel.kallsyms])
  				   do_futex ([kernel.kallsyms])
  				   __x64_sys_futex ([kernel.kallsyms])
  				   do_syscall_64 ([kernel.kallsyms])
  				   entry_SYSCALL_64_after_hwframe ([kernel.kallsyms])
  				   __futex_abstimed_wait_common64 (/usr/lib64/libpthread-2.33.so)
   0.030 :0/0 msr:write_msr(msr: IA32_SPEC_CTRL, val: 2)
  				   do_trace_write_msr ([kernel.kallsyms])
  				   do_trace_write_msr ([kernel.kallsyms])
  				   __switch_to_xtra ([kernel.kallsyms])
  				   __switch_to ([kernel.kallsyms])
  				   __schedule ([kernel.kallsyms])
  				   schedule_idle ([kernel.kallsyms])
  				   do_idle ([kernel.kallsyms])
  				   cpu_startup_entry ([kernel.kallsyms])
  				   secondary_startup_64_no_verify ([kernel.kallsyms])
  #

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Kai Huang <kai.huang@intel.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZbJt27rjkQVU1YoP@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:48 -03:00
Arnaldo Carvalho de Melo 690811f012 tools headers uapi: Sync linux/stat.h with the kernel sources to pick STATX_MNT_ID_UNIQUE
To pick the changes from:

  98d2b43081 ("add unique mount ID")

That add STATX_MNT_ID_UNIQUE that was manually added to
tools/perf/trace/beauty/statx.c, at some point this should move to the
shell based automated way.

This silences this perf build warning:

  Warning: Kernel ABI header differences:
    diff -u tools/include/uapi/linux/stat.h include/uapi/linux/stat.h

Cc: Adrian Hunter <adrian.hunter@intel.com>
Cc: Christian Brauner <brauner@kernel.org>
Cc: Ian Rogers <irogers@google.com>
Cc: Jiri Olsa <jolsa@kernel.org>
Cc: Miklos Szeredi <mszeredi@redhat.com>
Cc: Namhyung Kim <namhyung@kernel.org>
Link: https://lore.kernel.org/lkml/ZbJq08s19890WDo-@kernel.org
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
2024-01-26 10:51:48 -03:00
Linus Torvalds ecb1b8288d Including fixes from bpf, netfilter and WiFi.
Current release - regressions:
 
   - bpf: fix a kernel crash for the riscv 64 JIT
 
   - bnxt_en: fix memory leak in bnxt_hwrm_get_rings()
 
   - revert "net: macsec: use skb_ensure_writable_head_tail to expand the skb"
 
 Previous releases - regressions:
 
   - core: fix removing a namespace with conflicting altnames
 
   - tc/flower: fix chain template offload memory leak
 
   - tcp:
     - make sure init the accept_queue's spinlocks once
     - fix autocork on CPUs with weak memory model
 
   - udp: fix busy polling
 
   - mlx5e:
     - fix out-of-bound read in port timestamping
     - fix peer flow lists corruption
 
   - iwlwifi: fix a memory corruption
 
 Previous releases - always broken:
 
   - netfilter:
     - nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
     - nft_limit: reject configurations that cause integer overflow
 
   - bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding
     a NULL pointer dereference upon shrinking
 
   - llc: make llc_ui_sendmsg() more robust against bonding changes
 
   - smc: fix illegal rmb_desc access in SMC-D connection dump
 
   - dpll: fix pin dump crash for rebound module
 
   - bnxt_en: fix possible crash after creating sw mqprio TCs
 
   - hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4 Kbytes
 
 Misc:
 
   - several self-tests fixes for better integration with the netdev CI
 
   - added several missing modules descriptions
 
 Signed-off-by: Paolo Abeni <pabeni@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEg1AjqC77wbdLX2LbKSR5jcyPE6QFAmWyUSISHHBhYmVuaUBy
 ZWRoYXQuY29tAAoJECkkeY3MjxOkiuIP/0IChNqw3KtJJQOb4eIu12qRTblMmucU
 Yf+Q4ZBHI7Epz3HFWrmqoj7N7CaWAj+U8b9JEHbcZxP7cwo4mc7ScXmm78IOvpEl
 ypWdjWVW89UVz6GI/Yz/MNC2H7By51NUkiDpbbAA4pZVK6N1+rO8oAU9Fy2IiFTb
 ixt61S1zsVYdmUjHOD/PiU9b8i5cRBukaXF8jnznRj8nAT/cU9XUV/YyFixj33vQ
 Rbs3HoKxD9Mk9KdJ7jgEMi7Vazb40w5TmfMLWyNglJQdNz4DUg+9tqQGCkEf5UGU
 CpRKu4RVr2uzAn9N5Hav4O0He2kDUVH1MoPqgS6MnJAERzCDKoDFxo8ljTmHBk6b
 ISmstRzRon/AXcp+94pwU5RT78B7HekXYlZPcj5tGVKiM7HMdgLiodOcZcsG5fdW
 8okeYhpCL5ew/fxGOZnbNS/BiODZBaa+/e6ns8NasmWZHgGJa9uHiO865g5I53/H
 jWnm53Bi1Zmkgu/+NwXEx1I1vWSa9GCBA8ia5oSuEQDWhHhm3EcUcr44kjrD/R+S
 6elNScjrJ2kMF+fvOb1BEITUf77fk7/ATJarCk8oybJCAt7do+DZ1E47NN0M9Km8
 gARKvThije9rmc4OZ7RYR7R9iQgvwpb7fgGJZw+SI3XFK/WxmcDJHV4dXLiA6+Mu
 vvt+x7EmWiWf
 =ER+y
 -----END PGP SIGNATURE-----

Merge tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net

Pull networking fixes from Paolo Abeni:
 "Including fixes from bpf, netfilter and WiFi.

  Jakub is doing a lot of work to include the self-tests in our CI, as a
  result a significant amount of self-tests related fixes is flowing in
  (and will likely continue in the next few weeks).

  Current release - regressions:

   - bpf: fix a kernel crash for the riscv 64 JIT

   - bnxt_en: fix memory leak in bnxt_hwrm_get_rings()

   - revert "net: macsec: use skb_ensure_writable_head_tail to expand
     the skb"

  Previous releases - regressions:

   - core: fix removing a namespace with conflicting altnames

   - tc/flower: fix chain template offload memory leak

   - tcp:
      - make sure init the accept_queue's spinlocks once
      - fix autocork on CPUs with weak memory model

   - udp: fix busy polling

   - mlx5e:
      - fix out-of-bound read in port timestamping
      - fix peer flow lists corruption

   - iwlwifi: fix a memory corruption

  Previous releases - always broken:

   - netfilter:
      - nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress
        basechain
      - nft_limit: reject configurations that cause integer overflow

   - bpf: fix bpf_xdp_adjust_tail() with XSK zero-copy mbuf, avoiding a
     NULL pointer dereference upon shrinking

   - llc: make llc_ui_sendmsg() more robust against bonding changes

   - smc: fix illegal rmb_desc access in SMC-D connection dump

   - dpll: fix pin dump crash for rebound module

   - bnxt_en: fix possible crash after creating sw mqprio TCs

   - hv_netvsc: calculate correct ring size when PAGE_SIZE is not 4kB

  Misc:

   - several self-tests fixes for better integration with the netdev CI

   - added several missing modules descriptions"

* tag 'net-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net: (88 commits)
  tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring
  tsnep: Remove FCS for XDP data path
  net: fec: fix the unhandled context fault from smmu
  selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
  fjes: fix memleaks in fjes_hw_setup
  i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
  i40e: set xdp_rxq_info::frag_size
  xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
  ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
  intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
  ice: remove redundant xdp_rxq_info registration
  i40e: handle multi-buffer packets that are shrunk by xdp prog
  ice: work on pre-XDP prog frag count
  xsk: fix usage of multi-buffer BPF helpers for ZC XDP
  xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
  xsk: recycle buffer in case Rx queue was full
  net: fill in MODULE_DESCRIPTION()s for rvu_mbox
  net: fill in MODULE_DESCRIPTION()s for litex
  net: fill in MODULE_DESCRIPTION()s for fsl_pq_mdio
  net: fill in MODULE_DESCRIPTION()s for fec
  ...
2024-01-25 10:58:35 -08:00
Linus Torvalds bdc010200e overlayfs fixes for 6.8-rc2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE9zuTYTs0RXF+Ke33EVvVyTe/1WoFAmWyQlcACgkQEVvVyTe/
 1Wqb+g/9GpOV1M7t+PytCtB6od0lSeKwLF7YL4GjIJ/6ms/wE5MdLuvtDESaN3l0
 26xL8025jD2cy72+8vYeoa3Brhy4JniknlR5hpKHVcj6TYu1i8BmDTt+2kfqU6eP
 v5mXRtu3xIVyxexGZtOcbgQB5aca7zw/cGw+2nwdHbUy1eFs3ySfUGopGvkGJtlf
 DxsNepHtA6lt1IvoE80ZDIGOfTfZW24kZIbGiSjxHJ92koEEgjHCRJLvr4guxsV0
 oo/fiiEM32nNjc0VCdRS0Q1aVqHGRdAOeAJkZWp1+YTKEMW3B89Q0RA6iYcTWxQ1
 s6+mg2/u2v7xyaNFD/QQZv7bwUK+ckKUY90EQwpvyJBSjE4pHoYB9csNF91vDrpm
 e0GH1FpTjLXlXB9OJsiec9LHScyS4v69fkPSWyJCmdi4qkvZbEsueMtAFeHvf7IS
 Mg9XUx2/9KZlHBLFWaPS6qUkYxrp7cQO+LbVuEbYFDBSUE2ulI6wEdKesj7IiuIq
 NiS58Cg/i77uUkXd/v2lOe9W0D17pvpCBXBbifNGDQNPWi6pAZT6aYCZOTBF8T2G
 LRWyW3snHBYImzze2yb0g6Ud6mXzuhEOOZtQEPwPR4FtRfDcK94dC/ylChAUM5KE
 CNy4hVaPb1hL2S2NLHTjm9bAduov2CFLUIKk/RJ91V3fkniKzas=
 =Xo8t
 -----END PGP SIGNATURE-----

Merge tag 'ovl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs

Pull overlayfs fix from Amir Goldstein:
 "Change the on-disk format for the new "xwhiteouts" feature introduced
  in v6.7

  The change reduces unneeded overhead of an extra getxattr per readdir.
  The only user of the "xwhiteout" feature is the external composefs
  tool, which has been updated to support the new on-disk format.

  This change is also designated for 6.7.y"

* tag 'ovl-fixes-6.8-rc2' of git://git.kernel.org/pub/scm/linux/kernel/git/overlayfs/vfs:
  ovl: mark xwhiteouts directory with overlay.opaque='x'
2024-01-25 10:52:30 -08:00
Linus Torvalds a658e0e986 vfs-6.8-rc2.netfs
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYKAB0WIQRAhzRXHqcMeLMyaSiRxhvAZXjcogUCZbFEvgAKCRCRxhvAZXjc
 oiv6AP44QuZZP0qp+7YQrIn4jcpRcMowOjGsa9n9c5TYQna+ggEA+JLencaRkihi
 NjsT0McUPKzfi58pKW+6a8AOudwNrgc=
 =zpMk
 -----END PGP SIGNATURE-----

Merge tag 'vfs-6.8-rc2.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs

Pull netfs fixes from Christian Brauner:
 "This contains various fixes for the netfs work merged earlier this
  cycle:

  afs:
   - Fix locking imbalance in afs_proc_addr_prefs_show()
   - Remove afs_dynroot_d_revalidate() which is redundant
   - Fix error handling during lookup
   - Hide sillyrenames from userspace. This fixes a race between
     silly-rename files being created/removed and userspace iterating
     over directory entries
   - Don't use unnecessary folio_*() functions

  cifs:
   - Don't use unnecessary folio_*() functions

  cachefiles:
   - erofs: Fix Null dereference when cachefiles are not doing
     ondemand-mode
   - Update mailing list

  netfs library:
   - Add Jeff Layton as reviewer
   - Update mailing list
   - Fix a error checking in netfs_perform_write()
   - fscache: Check error before dereferencing
   - Don't use unnecessary folio_*() functions"

* tag 'vfs-6.8-rc2.netfs' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs:
  afs: Fix missing/incorrect unlocking of RCU read lock
  afs: Remove afs_dynroot_d_revalidate() as it is redundant
  afs: Fix error handling with lookup via FS.InlineBulkStatus
  afs: Hide silly-rename files from userspace
  cachefiles, erofs: Fix NULL deref in when cachefiles is not doing ondemand-mode
  netfs: Fix a NULL vs IS_ERR() check in netfs_perform_write()
  netfs, fscache: Prevent Oops in fscache_put_cache()
  cifs: Don't use certain unnecessary folio_*() functions
  afs: Don't use certain unnecessary folio_*() functions
  netfs: Don't use certain unnecessary folio_*() functions
  netfs: Add Jeff Layton as reviewer
  netfs, cachefiles: Change mailing list
2024-01-25 10:41:29 -08:00
Linus Torvalds b9fa4cbd84 nfsd-6.8 fixes:
- Fix in-kernel RPC UDP transport
 - Fix NFSv4.0 RELEASE_LOCKOWNER
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEEKLLlsBKG3yQ88j7+M2qzM29mf5cFAmWxLmUACgkQM2qzM29m
 f5eCJBAAqTyYweun0ca9wisypEVVWgfoHN8wNpxZR3iaPkL/pWudkk+ImHuSGFCC
 fxdBU+xpBbxTHJ5yrrjyIVBQwHXRx3HQyedxs5SpFxeEr6h4z4fTzC3oX69AxHvr
 zt4ynccUjZ4vjcyASQafONtPUoeCoTUEP9hkrF/WYNdETuKXhGuJ7rM969f3eLF/
 V3oPz0l8PoCsyw6IrFgUOGsNYSdBfydR73WqMJTUP9Avc5l9yG2tKdNfV5n87xTg
 xPuAxyxBEKuzcK1N7qBjiBDt+MRn+QKSid/kVelhTEwb+vgW4DEqJwmBMNHjFzqN
 gy3x8xVatIlbuWvKslSE4MbrAplTdEZrSHK0Q7CNeREHptlf41QS5dgYdVCWcs15
 vgYb18qTipkErdQ4sYGK7oN0PhPTlOkpit/28Vf8fiiSEXZTPBr9bcYvAGD3pZUS
 kzoA0JrZkZQB+TNHQ3j5T2r+XKunF3KkPVXmI7c6RSf8yWcJ1kqe29zpstIfll71
 K157NBqeVM8FyiPOxZqT0AROW9aIPGml7f6SQ4DkUUp84LbEFaz8svkFMu9a7ONs
 oEkc5xbEuvIvyHdY1liBhxw5ugg1nNMIbchVwRHqQIIZGfy76p+KLRSZ2neidhxE
 pl0QNQchT3XsxZJOwcp5YboyBB+gVcKMpFB1rgX4NBcXTh1buWo=
 =t62e
 -----END PGP SIGNATURE-----

Merge tag 'nfsd-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux

Pull nfsd fixes from Chuck Lever:

 - Fix in-kernel RPC UDP transport

 - Fix NFSv4.0 RELEASE_LOCKOWNER

* tag 'nfsd-6.8-1' of git://git.kernel.org/pub/scm/linux/kernel/git/cel/linux:
  nfsd: fix RELEASE_LOCKOWNER
  SUNRPC: use request size to initialize bio_vec in svc_udp_sendto()
2024-01-25 10:26:52 -08:00
Linus Torvalds 3cb9871f81 Urgent RCU pull request for v6.8
This commit fixes RCU grace period stalls, which are observed when
 an outgoing CPU's quiescent state reporting results in wakeup of
 one of the grace period kthreads, to complete the grace period. If
 those kthreads have SCHED_FIFO policy, the wake up can indirectly
 arm the RT bandwith timer to the local offline CPU. Earlier migration
 of the hrtimers from the CPU introduced in commit 5c0930ccaa
 ("hrtimers: Push pending hrtimers away from outgoing CPU earlier")
 results in this timer getting ignored. If the RCU grace period
 kthreads are waiting for RT bandwidth to be available, they may
 never be actually scheduled, resulting in RCU stall warnings.
 -----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQSi2tPIQIc2VEtjarIAHS7/6Z0wpQUCZbFOUgAKCRAAHS7/6Z0w
 pQjcAQCg/tJYRjwGUPebKLUgkmXlR+IIANzEgvES/RgWTOld5gEAklZVTjf3J0qt
 QeU9WC3My2cVPKvv6kqnuQ9rrqMQ3g0=
 =U5an
 -----END PGP SIGNATURE-----

Merge tag 'urgent-rcu.2024.01.24a' of https://github.com/neeraju/linux

Pull RCU fix from Neeraj Upadhyay:
 "This fixes RCU grace period stalls, which are observed when an
  outgoing CPU's quiescent state reporting results in wakeup of one of
  the grace period kthreads, to complete the grace period.

  If those kthreads have SCHED_FIFO policy, the wake up can indirectly
  arm the RT bandwith timer to the local offline CPU.

  Earlier migration of the hrtimers from the CPU introduced in commit
  5c0930ccaa ("hrtimers: Push pending hrtimers away from outgoing CPU
  earlier") results in this timer getting ignored.

  If the RCU grace period kthreads are waiting for RT bandwidth to be
  available, they may never be actually scheduled, resulting in RCU
  stall warnings"

* tag 'urgent-rcu.2024.01.24a' of https://github.com/neeraju/linux:
  rcu: Defer RCU kthreads wakeup when CPU is dying
2024-01-25 10:21:21 -08:00
Paolo Abeni 0a5bd0ffe7 Merge branch 'tsnep-xdp-fixes'
Gerhard Engleder says:

====================
tsnep: XDP fixes

Found two driver specific problems during XDP and XSK testing.
====================

Link: https://lore.kernel.org/r/20240123200918.61219-1-gerhard@engleder-embedded.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 11:59:44 +01:00
Gerhard Engleder 9a91c05f4b tsnep: Fix XDP_RING_NEED_WAKEUP for empty fill ring
The fill ring of the XDP socket may contain not enough buffers to
completey fill the RX queue during socket creation. In this case the
flag XDP_RING_NEED_WAKEUP is not set as this flag is only set if the RX
queue is not completely filled during polling.

Set XDP_RING_NEED_WAKEUP flag also if RX queue is not completely filled
during XDP socket creation.

Fixes: 3fc2333933 ("tsnep: Add XDP socket zero-copy RX support")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 11:59:42 +01:00
Gerhard Engleder 50bad6f797 tsnep: Remove FCS for XDP data path
The RX data buffer includes the FCS. The FCS is already stripped for the
normal data path. But for the XDP data path the FCS is included and
acts like additional/useless data.

Remove the FCS from the RX data buffer also for XDP.

Fixes: 65b28c8100 ("tsnep: Add XDP RX support")
Fixes: 3fc2333933 ("tsnep: Add XDP socket zero-copy RX support")
Signed-off-by: Gerhard Engleder <gerhard@engleder-embedded.com>
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 11:59:42 +01:00
Paolo Abeni 5da4597163 mlx5-fixes-2024-01-24
-----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEGhZs6bAKwk/OTgTpSD+KveBX+j4FAmWwxyoACgkQSD+KveBX
 +j5yIwgAm37qO63rmb+1SVWy8FHmY7W4UUq6NEnReDRncVfnCazZ1Vh/ki0Qzzck
 4fEjqIfacV2NSQdCuRlQzWU6SH/N+QlLUHF0gL8fSAgDq/tCuMg8Ln9ZCMbobRpZ
 31qLKkkPQ1HIcadmoOuHNN8nslOxWm1Vlkqxj9CqUkV9OT3S6Hu1SWmBCzlSN8sN
 vvoDNkR8uTg+229HEf3visTpuyezu2fnBL5rNzkByb4TMn0ns0lcF79bYPh8Xa7p
 dszLpQw/gYQqzmo+Kh3bG6lUOKEZjbN8IUoa+nlNKQlyQIhxAegNzakSQ4ocgcLd
 aRZMqqfFpk5OhukevnLa/0wYTy1ScA==
 =bU/e
 -----END PGP SIGNATURE-----

Merge tag 'mlx5-fixes-2024-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux

Saeed Mahameed says:

====================
mlx5 fixes 2024-01-24

This series provides bug fixes to mlx5 driver.
Please pull and let me know if there is any problem.

* tag 'mlx5-fixes-2024-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/saeed/linux:
  net/mlx5e: fix a potential double-free in fs_any_create_groups
  net/mlx5e: fix a double-free in arfs_create_groups
  net/mlx5e: Ignore IPsec replay window values on sender side
  net/mlx5e: Allow software parsing when IPsec crypto is enabled
  net/mlx5: Use mlx5 device constant for selecting CQ period mode for ASO
  net/mlx5: DR, Can't go to uplink vport on RX rule
  net/mlx5: DR, Use the right GVMI number for drop action
  net/mlx5: Bridge, fix multicast packets sent to uplink
  net/mlx5: Fix a WARN upon a callback command failure
  net/mlx5e: Fix peer flow lists handling
  net/mlx5e: Fix inconsistent hairpin RQT sizes
  net/mlx5e: Fix operation precedence bug in port timestamping napi_poll context
  net/mlx5: Fix query of sd_group field
  net/mlx5e: Use the correct lag ports number when creating TISes
====================

Link: https://lore.kernel.org/r/20240124081855.115410-1-saeed@kernel.org
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 11:42:27 +01:00
Paolo Abeni fdf8e6d18c bpf-for-netdev
-----BEGIN PGP SIGNATURE-----
 
 iHUEABYIAB0WIQTFp0I1jqZrAX+hPRXbK58LschIgwUCZbIcmAAKCRDbK58LschI
 gw7OAP4xKTkfcHSZUWcIOWgmdCX41B1zUPdyH2w3woHVH8kKRgEApWisrUKZq30Q
 7seUBxVNK1HXBS+egqqRT7Rgs4iFhQs=
 =iBo/
 -----END PGP SIGNATURE-----

Merge tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf

Daniel Borkmann says:

====================
pull-request: bpf 2024-01-25

The following pull-request contains BPF updates for your *net* tree.

We've added 12 non-merge commits during the last 2 day(s) which contain
a total of 13 files changed, 190 insertions(+), 91 deletions(-).

The main changes are:

1) Fix bpf_xdp_adjust_tail() in context of XSK zero-copy drivers which
   support XDP multi-buffer. The former triggered a NULL pointer
   dereference upon shrinking, from Maciej Fijalkowski & Tirthendu Sarkar.

2) Fix a bug in riscv64 BPF JIT which emitted a wrong prologue and
   epilogue for struct_ops programs, from Pu Lehui.

* tag 'for-netdev' of https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf:
  i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
  i40e: set xdp_rxq_info::frag_size
  xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
  ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
  intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
  ice: remove redundant xdp_rxq_info registration
  i40e: handle multi-buffer packets that are shrunk by xdp prog
  ice: work on pre-XDP prog frag count
  xsk: fix usage of multi-buffer BPF helpers for ZC XDP
  xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
  xsk: recycle buffer in case Rx queue was full
  riscv, bpf: Fix unpredictable kernel crash about RV64 struct_ops
====================

Link: https://lore.kernel.org/r/20240125084416.10876-1-daniel@iogearbox.net
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 11:30:31 +01:00
Shenwei Wang 5e34480773 net: fec: fix the unhandled context fault from smmu
When repeatedly changing the interface link speed using the command below:

ethtool -s eth0 speed 100 duplex full
ethtool -s eth0 speed 1000 duplex full

The following errors may sometimes be reported by the ARM SMMU driver:

[ 5395.035364] fec 5b040000.ethernet eth0: Link is Down
[ 5395.039255] arm-smmu 51400000.iommu: Unhandled context fault:
fsr=0x402, iova=0x00000000, fsynr=0x100001, cbfrsynra=0x852, cb=2
[ 5398.108460] fec 5b040000.ethernet eth0: Link is Up - 100Mbps/Full -
flow control off

It is identified that the FEC driver does not properly stop the TX queue
during the link speed transitions, and this results in the invalid virtual
I/O address translations from the SMMU and causes the context faults.

Fixes: dbc64a8ea2 ("net: fec: move calls to quiesce/resume packet processing out of fec_restart()")
Signed-off-by: Shenwei Wang <shenwei.wang@nxp.com>
Link: https://lore.kernel.org/r/20240123165141.2008104-1-shenwei.wang@nxp.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 11:22:27 +01:00
Hangbin Liu a2933a8759 selftests: bonding: do not test arp/ns target with mode balance-alb/tlb
The prio_arp/ns tests hard code the mode to active-backup. At the same
time, The balance-alb/tlb modes do not support arp/ns target. So remove
the prio_arp/ns tests from the loop and only test active-backup mode.

Fixes: 481b56e039 ("selftests: bonding: re-format bond option tests")
Reported-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Closes: https://lore.kernel.org/netdev/17415.1705965957@famine/
Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
Acked-by: Jay Vosburgh <jay.vosburgh@canonical.com>
Link: https://lore.kernel.org/r/20240123075917.1576360-1-liuhangbin@gmail.com
Signed-off-by: Paolo Abeni <pabeni@redhat.com>
2024-01-25 09:50:54 +01:00
Jakub Kicinski a717932db1 netfilter pull request 24-01-24
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCgAdFiEEN9lkrMBJgcdVAPub1V2XiooUIOQFAmWxX0UACgkQ1V2XiooU
 IOTN2xAAq60JXo1A05dnma3DzXxIsCQKmum+ph1ii8gvZ2qqJT5+CVk0pMuHXEfi
 UPt/FfGIC3WZQ7yOLLxkGeUv8g7rxCncIsJtjxQSTh1gaQhePiATMUpwIhJ5W5tq
 QUw6DrZfv3Y95Gth61BDokBEhGVntWTV2ra608gx5PrpXQvCid7CJqKeg3hzaoSr
 cWonnRsxlwHdu1R4vqrZwnEMj6BBJfviOvS9HPGEul9LRQneXNRMuEJ0L73vjU9h
 gdh8oQHxoAgOHsK1KczNK9no9rnGgmLS9K98tBjRYdJwfxsI4YSjL3/+FnxTgWgp
 GsGZP+/+aQD+GKAN/eAO7cjfaSZweqhr8JxaUdyOZRqeQYw2v3SV2zyEKTw2hahM
 1JvY7oEtthMMylPxJac075jbeizoNRU9J5AeJNqWexMZK5P6Yb0BC5DRUeHC0+RN
 y99Yj5z4fwMmBgWdTBXnIm0k87bouQsQBDS3Rn7QTD2b1RhQTTEar+W816QyOBng
 8tllV25b4SOtXc6xouqP4YgCB1wuYN74Bvxb+vgGK2mj9oTLpTMEtFXXduenqoGl
 wQutx0fumJMAHTIHz77RDlTAanEbFOEq35R0PV+cfm9173inI77bZyGcznWlM+Zn
 fM+Gr1ka4JmO6RegYJHh3kUrM2l43zVmN8O1gQCnkRuJJZ6BwVk=
 =SirU
 -----END PGP SIGNATURE-----

Merge tag 'nf-24-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf

Pablo Neira Ayuso says:

====================
Netfilter fixes for net

The following patchset contains Netfilter fixes for net:

1) Update nf_tables kdoc to keep it in sync with the code, from George Guo.

2) Handle NETDEV_UNREGISTER event for inet/ingress basechain.

3) Reject configuration that cause nft_limit to overflow,
   from Florian Westphal.

4) Restrict anonymous set/map names to 16 bytes, from Florian Westphal.

5) Disallow to encode queue number and error in verdicts. This reverts
   a patch which seems to have introduced an early attempt to support for
   nfqueue maps, which is these days supported via nft_queue expression.

6) Sanitize family via .validate for expressions that explicitly refer
   to NF_INET_* hooks.

* tag 'nf-24-01-24' of git://git.kernel.org/pub/scm/linux/kernel/git/netfilter/nf:
  netfilter: nf_tables: validate NFPROTO_* family
  netfilter: nf_tables: reject QUEUE/DROP verdict parameters
  netfilter: nf_tables: restrict anonymous set and map names to 16 bytes
  netfilter: nft_limit: reject configurations that cause integer overflow
  netfilter: nft_chain_filter: handle NETDEV_UNREGISTER for inet/ingress basechain
  netfilter: nf_tables: cleanup documentation
====================

Link: https://lore.kernel.org/r/20240124191248.75463-1-pablo@netfilter.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24 21:03:17 -08:00
Zhipeng Lu f6cc4b6a3a fjes: fix memleaks in fjes_hw_setup
In fjes_hw_setup, it allocates several memory and delay the deallocation
to the fjes_hw_exit in fjes_probe through the following call chain:

fjes_probe
  |-> fjes_hw_init
        |-> fjes_hw_setup
  |-> fjes_hw_exit

However, when fjes_hw_setup fails, fjes_hw_exit won't be called and thus
all the resources allocated in fjes_hw_setup will be leaked. In this
patch, we free those resources in fjes_hw_setup and prevents such leaks.

Fixes: 2fcbca6877 ("fjes: platform_driver's .probe and .remove routine")
Signed-off-by: Zhipeng Lu <alexious@zju.edu.cn>
Reviewed-by: Simon Horman <horms@kernel.org>
Link: https://lore.kernel.org/r/20240122172445.3841883-1-alexious@zju.edu.cn
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24 18:03:53 -08:00
Linus Torvalds 6098d87eaf A fix to avoid triggering an assert in some cases where RBD exclusive
mappings are involved and a deprecated API cleanup.
 -----BEGIN PGP SIGNATURE-----
 
 iQFHBAABCAAxFiEEydHwtzie9C7TfviiSn/eOAIR84sFAmWxcDcTHGlkcnlvbW92
 QGdtYWlsLmNvbQAKCRBKf944AhHzi63lB/4/LCzIsbGLi52U2exy4yObQ4a7gpQB
 /zTrf7jlJAc4f+5zNUd4OgFUGK1UcLr3SC3R4CCXwHI8tqepEG1v9NtmtILBxIzu
 9r7NjVAwvi7qUirRZvsXwvB+uQJeSfOCtOe7+JFNi+BpT303NbvOiPIXHE1YqATs
 rXZo+7Z7kHxSP9lRTvJu0q/DRS1EBZSJy2eOgNIj6uwLFe49h++XACHnpsOQaO0I
 OcuMcQgol5evBbd7hQyrOb4h5BYlLEFY8iMrwaRQ04KCW4yJxvlTviom7tYe9GaU
 xeJtADE1c6SdeV2orU5OsPEQ7BsLoUWcc+vSaPYq/1WmH+IJ5wtNP8RI
 =wGB5
 -----END PGP SIGNATURE-----

Merge tag 'ceph-for-6.8-rc2' of https://github.com/ceph/ceph-client

Pull ceph fixes from Ilya Dryomov:
 "A fix to avoid triggering an assert in some cases where RBD exclusive
  mappings are involved and a deprecated API cleanup"

* tag 'ceph-for-6.8-rc2' of https://github.com/ceph/ceph-client:
  rbd: don't move requests to the running list on errors
  rbd: remove usage of the deprecated ida_simple_*() API
2024-01-24 16:59:52 -08:00
Linus Torvalds f22face166 integrity-6.8-rc1
-----BEGIN PGP SIGNATURE-----
 
 iIoEABYIADIWIQQdXVVFGN5XqKr1Hj7LwZzRsCrn5QUCZbGRzRQcem9oYXJAbGlu
 dXguaWJtLmNvbQAKCRDLwZzRsCrn5UNlAP9NrD1Je62umVMs6fNJC03R1aEnalcB
 S+BzZ5jvX+uaewEAsZQh7mbfiFjQDZsptKdeXbMC5KgKVpIbUqEDeiTeFwI=
 =ItFY
 -----END PGP SIGNATURE-----

Merge tag 'integrity-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity

Pull integrity fix from Mimi Zohar:
 "Revert patch that required user-provided key data, since keys can be
  created from kernel-generated random numbers"

* tag 'integrity-v6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/zohar/linux-integrity:
  Revert "KEYS: encrypted: Add check for strsep"
2024-01-24 16:51:59 -08:00
Alexei Starovoitov 9d71bc833f Merge branch 'net-bpf_xdp_adjust_tail-and-intel-mbuf-fixes'
Maciej Fijalkowski says:

====================
net: bpf_xdp_adjust_tail() and Intel mbuf fixes

Hey,

after a break followed by dealing with sickness, here is a v6 that makes
bpf_xdp_adjust_tail() actually usable for ZC drivers that support XDP
multi-buffer. Since v4 I tried also using bpf_xdp_adjust_tail() with
positive offset which exposed yet another issues, which can be observed
by increased commit count when compared to v3.

John, in the end I think we should remove handling
MEM_TYPE_XSK_BUFF_POOL from __xdp_return(), but it is out of the scope
for fixes set, IMHO.

Thanks,
Maciej

v6:
- add acks [Magnus]
- fix spelling mistakes [Magnus]
- avoid touching xdp_buff in xp_alloc_{reused,new_from_fq}() [Magnus]
- s/shrink_data/bpf_xdp_shrink_data [Jakub]
- remove __shrink_data() [Jakub]
- check retvals from __xdp_rxq_info_reg() [Magnus]

v5:
- pick correct version of patch 5 [Simon]
- elaborate a bit more on what patch 2 fixes

v4:
- do not clear frags flag when deleting tail; xsk_buff_pool now does
  that
- skip some NULL tests for xsk_buff_get_tail [Martin, John]
- address problems around registering xdp_rxq_info
- fix bpf_xdp_frags_increase_tail() for ZC mbuf

v3:
- add acks
- s/xsk_buff_tail_del/xsk_buff_del_tail
- address i40e as well (thanks Tirthendu)

v2:
- fix !CONFIG_XDP_SOCKETS builds
- add reviewed-by tag to patch 3
====================

Link: https://lore.kernel.org/r/20240124191602.566724-1-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:07 -08:00
Maciej Fijalkowski 0cbb08707c i40e: update xdp_rxq_info::frag_size for ZC enabled Rx queue
Now that i40e driver correctly sets up frag_size in xdp_rxq_info, let us
make it work for ZC multi-buffer as well. i40e_ring::rx_buf_len for ZC
is being set via xsk_pool_get_rx_frame_size() and this needs to be
propagated up to xdp_rxq_info.

Fixes: 1c9ba9c146 ("i40e: xsk: add RX multi-buffer support")
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-12-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:07 -08:00
Maciej Fijalkowski a045d2f2d0 i40e: set xdp_rxq_info::frag_size
i40e support XDP multi-buffer so it is supposed to use
__xdp_rxq_info_reg() instead of xdp_rxq_info_reg() and set the
frag_size. It can not be simply converted at existing callsite because
rx_buf_len could be un-initialized, so let us register xdp_rxq_info
within i40e_configure_rx_ring(), which happen to be called with already
initialized rx_buf_len value.

Commit 5180ff1364 ("i40e: use int for i40e_status") converted 'err' to
int, so two variables to deal with return codes are not needed within
i40e_configure_rx_ring(). Remove 'ret' and use 'err' to handle status
from xdp_rxq_info registration.

Fixes: e213ced19b ("i40e: add support for XDP multi-buffer Rx")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-11-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:07 -08:00
Maciej Fijalkowski fbadd83a61 xdp: reflect tail increase for MEM_TYPE_XSK_BUFF_POOL
XSK ZC Rx path calculates the size of data that will be posted to XSK Rx
queue via subtracting xdp_buff::data_end from xdp_buff::data.

In bpf_xdp_frags_increase_tail(), when underlying memory type of
xdp_rxq_info is MEM_TYPE_XSK_BUFF_POOL, add offset to data_end in tail
fragment, so that later on user space will be able to take into account
the amount of bytes added by XDP program.

Fixes: 24ea50127e ("xsk: support mbuf on ZC RX")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-10-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:07 -08:00
Maciej Fijalkowski 3de38c8717 ice: update xdp_rxq_info::frag_size for ZC enabled Rx queue
Now that ice driver correctly sets up frag_size in xdp_rxq_info, let us
make it work for ZC multi-buffer as well. ice_rx_ring::rx_buf_len for ZC
is being set via xsk_pool_get_rx_frame_size() and this needs to be
propagated up to xdp_rxq_info.

Use a bigger hammer and instead of unregistering only xdp_rxq_info's
memory model, unregister it altogether and register it again and have
xdp_rxq_info with correct frag_size value.

Fixes: 1bbc04de60 ("ice: xsk: add RX multi-buffer support")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-9-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:07 -08:00
Maciej Fijalkowski 290779905d intel: xsk: initialize skb_frag_t::bv_offset in ZC drivers
Ice and i40e ZC drivers currently set offset of a frag within
skb_shared_info to 0, which is incorrect. xdp_buffs that come from
xsk_buff_pool always have 256 bytes of a headroom, so they need to be
taken into account to retrieve xdp_buff::data via skb_frag_address().
Otherwise, bpf_xdp_frags_increase_tail() would be starting its job from
xdp_buff::data_hard_start which would result in overwriting existing
payload.

Fixes: 1c9ba9c146 ("i40e: xsk: add RX multi-buffer support")
Fixes: 1bbc04de60 ("ice: xsk: add RX multi-buffer support")
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-8-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:07 -08:00
Maciej Fijalkowski 2ee788c064 ice: remove redundant xdp_rxq_info registration
xdp_rxq_info struct can be registered by drivers via two functions -
xdp_rxq_info_reg() and __xdp_rxq_info_reg(). The latter one allows
drivers that support XDP multi-buffer to set up xdp_rxq_info::frag_size
which in turn will make it possible to grow the packet via
bpf_xdp_adjust_tail() BPF helper.

Currently, ice registers xdp_rxq_info in two spots:
1) ice_setup_rx_ring() // via xdp_rxq_info_reg(), BUG
2) ice_vsi_cfg_rxq()   // via __xdp_rxq_info_reg(), OK

Cited commit under fixes tag took care of setting up frag_size and
updated registration scheme in 2) but it did not help as
1) is called before 2) and as shown above it uses old registration
function. This means that 2) sees that xdp_rxq_info is already
registered and never calls __xdp_rxq_info_reg() which leaves us with
xdp_rxq_info::frag_size being set to 0.

To fix this misbehavior, simply remove xdp_rxq_info_reg() call from
ice_setup_rx_ring().

Fixes: 2fba7dc515 ("ice: Add support for XDP multi-buffer on Rx side")
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-7-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:06 -08:00
Tirthendu Sarkar 83014323c6 i40e: handle multi-buffer packets that are shrunk by xdp prog
XDP programs can shrink packets by calling the bpf_xdp_adjust_tail()
helper function. For multi-buffer packets this may lead to reduction of
frag count stored in skb_shared_info area of the xdp_buff struct. This
results in issues with the current handling of XDP_PASS and XDP_DROP
cases.

For XDP_PASS, currently skb is being built using frag count of
xdp_buffer before it was processed by XDP prog and thus will result in
an inconsistent skb when frag count gets reduced by XDP prog. To fix
this, get correct frag count while building the skb instead of using
pre-obtained frag count.

For XDP_DROP, current page recycling logic will not reuse the page but
instead will adjust the pagecnt_bias so that the page can be freed. This
again results in inconsistent behavior as the page refcnt has already
been changed by the helper while freeing the frag(s) as part of
shrinking the packet. To fix this, only adjust pagecnt_bias for buffers
that are stillpart of the packet post-xdp prog run.

Fixes: e213ced19b ("i40e: add support for XDP multi-buffer Rx")
Reported-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Signed-off-by: Tirthendu Sarkar <tirthendu.sarkar@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-6-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:06 -08:00
Maciej Fijalkowski ad2047cf5d ice: work on pre-XDP prog frag count
Fix an OOM panic in XDP_DRV mode when a XDP program shrinks a
multi-buffer packet by 4k bytes and then redirects it to an AF_XDP
socket.

Since support for handling multi-buffer frames was added to XDP, usage
of bpf_xdp_adjust_tail() helper within XDP program can free the page
that given fragment occupies and in turn decrease the fragment count
within skb_shared_info that is embedded in xdp_buff struct. In current
ice driver codebase, it can become problematic when page recycling logic
decides not to reuse the page. In such case, __page_frag_cache_drain()
is used with ice_rx_buf::pagecnt_bias that was not adjusted after
refcount of page was changed by XDP prog which in turn does not drain
the refcount to 0 and page is never freed.

To address this, let us store the count of frags before the XDP program
was executed on Rx ring struct. This will be used to compare with
current frag count from skb_shared_info embedded in xdp_buff. A smaller
value in the latter indicates that XDP prog freed frag(s). Then, for
given delta decrement pagecnt_bias for XDP_DROP verdict.

While at it, let us also handle the EOP frag within
ice_set_rx_bufs_act() to make our life easier, so all of the adjustments
needed to be applied against freed frags are performed in the single
place.

Fixes: 2fba7dc515 ("ice: Add support for XDP multi-buffer on Rx side")
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-5-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:06 -08:00
Maciej Fijalkowski c5114710c8 xsk: fix usage of multi-buffer BPF helpers for ZC XDP
Currently when packet is shrunk via bpf_xdp_adjust_tail() and memory
type is set to MEM_TYPE_XSK_BUFF_POOL, null ptr dereference happens:

[1136314.192256] BUG: kernel NULL pointer dereference, address:
0000000000000034
[1136314.203943] #PF: supervisor read access in kernel mode
[1136314.213768] #PF: error_code(0x0000) - not-present page
[1136314.223550] PGD 0 P4D 0
[1136314.230684] Oops: 0000 [#1] PREEMPT SMP NOPTI
[1136314.239621] CPU: 8 PID: 54203 Comm: xdpsock Not tainted 6.6.0+ #257
[1136314.250469] Hardware name: Intel Corporation S2600WFT/S2600WFT,
BIOS SE5C620.86B.02.01.0008.031920191559 03/19/2019
[1136314.265615] RIP: 0010:__xdp_return+0x6c/0x210
[1136314.274653] Code: ad 00 48 8b 47 08 49 89 f8 a8 01 0f 85 9b 01 00 00 0f 1f 44 00 00 f0 41 ff 48 34 75 32 4c 89 c7 e9 79 cd 80 ff 83 fe 03 75 17 <f6> 41 34 01 0f 85 02 01 00 00 48 89 cf e9 22 cc 1e 00 e9 3d d2 86
[1136314.302907] RSP: 0018:ffffc900089f8db0 EFLAGS: 00010246
[1136314.312967] RAX: ffffc9003168aed0 RBX: ffff8881c3300000 RCX:
0000000000000000
[1136314.324953] RDX: 0000000000000000 RSI: 0000000000000003 RDI:
ffffc9003168c000
[1136314.336929] RBP: 0000000000000ae0 R08: 0000000000000002 R09:
0000000000010000
[1136314.348844] R10: ffffc9000e495000 R11: 0000000000000040 R12:
0000000000000001
[1136314.360706] R13: 0000000000000524 R14: ffffc9003168aec0 R15:
0000000000000001
[1136314.373298] FS:  00007f8df8bbcb80(0000) GS:ffff8897e0e00000(0000)
knlGS:0000000000000000
[1136314.386105] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[1136314.396532] CR2: 0000000000000034 CR3: 00000001aa912002 CR4:
00000000007706f0
[1136314.408377] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[1136314.420173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7:
0000000000000400
[1136314.431890] PKRU: 55555554
[1136314.439143] Call Trace:
[1136314.446058]  <IRQ>
[1136314.452465]  ? __die+0x20/0x70
[1136314.459881]  ? page_fault_oops+0x15b/0x440
[1136314.468305]  ? exc_page_fault+0x6a/0x150
[1136314.476491]  ? asm_exc_page_fault+0x22/0x30
[1136314.484927]  ? __xdp_return+0x6c/0x210
[1136314.492863]  bpf_xdp_adjust_tail+0x155/0x1d0
[1136314.501269]  bpf_prog_ccc47ae29d3b6570_xdp_sock_prog+0x15/0x60
[1136314.511263]  ice_clean_rx_irq_zc+0x206/0xc60 [ice]
[1136314.520222]  ? ice_xmit_zc+0x6e/0x150 [ice]
[1136314.528506]  ice_napi_poll+0x467/0x670 [ice]
[1136314.536858]  ? ttwu_do_activate.constprop.0+0x8f/0x1a0
[1136314.546010]  __napi_poll+0x29/0x1b0
[1136314.553462]  net_rx_action+0x133/0x270
[1136314.561619]  __do_softirq+0xbe/0x28e
[1136314.569303]  do_softirq+0x3f/0x60

This comes from __xdp_return() call with xdp_buff argument passed as
NULL which is supposed to be consumed by xsk_buff_free() call.

To address this properly, in ZC case, a node that represents the frag
being removed has to be pulled out of xskb_list. Introduce
appropriate xsk helpers to do such node operation and use them
accordingly within bpf_xdp_adjust_tail().

Fixes: 24ea50127e ("xsk: support mbuf on ZC RX")
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com> # For the xsk header part
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-4-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:06 -08:00
Maciej Fijalkowski f7f6aa8e24 xsk: make xsk_buff_pool responsible for clearing xdp_buff::flags
XDP multi-buffer support introduced XDP_FLAGS_HAS_FRAGS flag that is
used by drivers to notify data path whether xdp_buff contains fragments
or not. Data path looks up mentioned flag on first buffer that occupies
the linear part of xdp_buff, so drivers only modify it there. This is
sufficient for SKB and XDP_DRV modes as usually xdp_buff is allocated on
stack or it resides within struct representing driver's queue and
fragments are carried via skb_frag_t structs. IOW, we are dealing with
only one xdp_buff.

ZC mode though relies on list of xdp_buff structs that is carried via
xsk_buff_pool::xskb_list, so ZC data path has to make sure that
fragments do *not* have XDP_FLAGS_HAS_FRAGS set. Otherwise,
xsk_buff_free() could misbehave if it would be executed against xdp_buff
that carries a frag with XDP_FLAGS_HAS_FRAGS flag set. Such scenario can
take place when within supplied XDP program bpf_xdp_adjust_tail() is
used with negative offset that would in turn release the tail fragment
from multi-buffer frame.

Calling xsk_buff_free() on tail fragment with XDP_FLAGS_HAS_FRAGS would
result in releasing all the nodes from xskb_list that were produced by
driver before XDP program execution, which is not what is intended -
only tail fragment should be deleted from xskb_list and then it should
be put onto xsk_buff_pool::free_list. Such multi-buffer frame will never
make it up to user space, so from AF_XDP application POV there would be
no traffic running, however due to free_list getting constantly new
nodes, driver will be able to feed HW Rx queue with recycled buffers.
Bottom line is that instead of traffic being redirected to user space,
it would be continuously dropped.

To fix this, let us clear the mentioned flag on xsk_buff_pool side
during xdp_buff initialization, which is what should have been done
right from the start of XSK multi-buffer support.

Fixes: 1bbc04de60 ("ice: xsk: add RX multi-buffer support")
Fixes: 1c9ba9c146 ("i40e: xsk: add RX multi-buffer support")
Fixes: 24ea50127e ("xsk: support mbuf on ZC RX")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-3-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:06 -08:00
Maciej Fijalkowski 2690098931 xsk: recycle buffer in case Rx queue was full
Add missing xsk_buff_free() call when __xsk_rcv_zc() failed to produce
descriptor to XSK Rx queue.

Fixes: 24ea50127e ("xsk: support mbuf on ZC RX")
Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
Link: https://lore.kernel.org/r/20240124191602.566724-2-maciej.fijalkowski@intel.com
Signed-off-by: Alexei Starovoitov <ast@kernel.org>
2024-01-24 16:24:06 -08:00
Jakub Kicinski 77be22473b Merge branch 'fix-module_description-for-net-p2'
Breno Leitao says:

====================
Fix MODULE_DESCRIPTION() for net (p2)

There are hundreds of network modules that misses MODULE_DESCRIPTION(),
causing a warnning when compiling with W=1. Example:

        WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/com90io.o
        WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/arc-rimi.o
        WARNING: modpost: missing MODULE_DESCRIPTION() in drivers/net/arcnet/com20020.o

This part2 of the patchset focus on the drivers/net/ethernet drivers.
There are still some missing warnings in drivers/net/ethernet that will
be fixed in an upcoming patchset.

v1: https://lore.kernel.org/all/20240122184543.2501493-2-leitao@debian.org/
====================

Link: https://lore.kernel.org/r/20240123190332.677489-1-leitao@debian.org
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
2024-01-24 15:12:56 -08:00