linux/Documentation
Eric DeVolder 88a6f89944 crash: memory and CPU hotplug sysfs attributes
Introduce the crash_hotplug attribute for memory and CPUs for use by
userspace.  These attributes directly facilitate the udev rule for
managing userspace re-loading of the crash kernel upon hot un/plug
changes.

For memory, expose the crash_hotplug attribute to the
/sys/devices/system/memory directory.  For example:

 # udevadm info --attribute-walk /sys/devices/system/memory/memory81
  looking at device '/devices/system/memory/memory81':
    KERNEL=="memory81"
    SUBSYSTEM=="memory"
    DRIVER==""
    ATTR{online}=="1"
    ATTR{phys_device}=="0"
    ATTR{phys_index}=="00000051"
    ATTR{removable}=="1"
    ATTR{state}=="online"
    ATTR{valid_zones}=="Movable"

  looking at parent device '/devices/system/memory':
    KERNELS=="memory"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{auto_online_blocks}=="offline"
    ATTRS{block_size_bytes}=="8000000"
    ATTRS{crash_hotplug}=="1"

For CPUs, expose the crash_hotplug attribute to the
/sys/devices/system/cpu directory. For example:

 # udevadm info --attribute-walk /sys/devices/system/cpu/cpu0
  looking at device '/devices/system/cpu/cpu0':
    KERNEL=="cpu0"
    SUBSYSTEM=="cpu"
    DRIVER=="processor"
    ATTR{crash_notes}=="277c38600"
    ATTR{crash_notes_size}=="368"
    ATTR{online}=="1"

  looking at parent device '/devices/system/cpu':
    KERNELS=="cpu"
    SUBSYSTEMS==""
    DRIVERS==""
    ATTRS{crash_hotplug}=="1"
    ATTRS{isolated}==""
    ATTRS{kernel_max}=="8191"
    ATTRS{nohz_full}=="  (null)"
    ATTRS{offline}=="4-7"
    ATTRS{online}=="0-3"
    ATTRS{possible}=="0-7"
    ATTRS{present}=="0-3"

With these sysfs attributes in place, it is possible to efficiently
instruct the udev rule to skip crash kernel reloading for kernels
configured with crash hotplug support.

For example, the following is the proposed udev rule change for RHEL
system 98-kexec.rules (as the first lines of the rule file):

 # The kernel updates the crash elfcorehdr for CPU and memory changes
 SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
 SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

When examined in the context of 98-kexec.rules, the above rules test if
crash_hotplug is set, and if so, the userspace initiated
unload-then-reload of the crash kernel is skipped.

CPU and memory checks are separated in accordance with CONFIG_HOTPLUG_CPU
and CONFIG_MEMORY_HOTPLUG kernel config options.  If an architecture
supports, for example, memory hotplug but not CPU hotplug, then the
/sys/devices/system/memory/crash_hotplug attribute file is present, but
the /sys/devices/system/cpu/crash_hotplug attribute file will NOT be
present.  Thus the udev rule skips userspace processing of memory hot
un/plug events, but the udev rule will evaluate false for CPU events, thus
allowing userspace to process CPU hot un/plug events (ie the
unload-then-reload of the kdump capture kernel).

Link: https://lkml.kernel.org/r/20230814214446.6659-5-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Reviewed-by: Sourabh Jain <sourabhjain@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
2023-08-24 16:25:14 -07:00
..
ABI crash: memory and CPU hotplug sysfs attributes 2023-08-24 16:25:14 -07:00
accel
accounting
admin-guide crash: memory and CPU hotplug sysfs attributes 2023-08-24 16:25:14 -07:00
arch - Work around an erratum on GIC700, where a race between a CPU 2023-07-30 10:59:19 -07:00
block
bpf sock: Remove ->sendpage*() in favour of sendmsg(MSG_SPLICE_PAGES) 2023-06-24 15:50:13 -07:00
cdrom
core-api crash: memory and CPU hotplug sysfs attributes 2023-08-24 16:25:14 -07:00
cpu-freq
crypto
dev-tools - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
devicetree Devicetree fixes for v6.5: 2023-07-22 10:28:22 -07:00
doc-guide
driver-api Fixes for pci_clean_master, error handling in driver inits, and various 2023-07-09 09:35:51 -07:00
fault-injection
fb
features LoongArch: Add jump-label implementation 2023-06-29 20:58:44 +08:00
filesystems tmpfs: fix Documentation of noswap and huge mount options 2023-07-27 13:07:03 -07:00
firmware-guide
firmware_class
fpga
gpu Merge tag 'amd-drm-next-6.5-2023-06-09' of https://gitlab.freedesktop.org/agd5f/linux into drm-next 2023-06-15 14:11:22 +10:00
hid
hwmon hwmon: (oxp-sensors) Add support for AOKZOE A1 PRO 2023-06-24 20:17:18 -07:00
i2c
iio
images
infiniband
input
isdn
kbuild
kernel-hacking
leds - New Drivers 2023-07-03 11:26:05 -07:00
litmus-tests
livepatch
locking
loongarch
maintainer Documentation: update git configuration for Link: tag 2023-06-21 09:15:15 -06:00
mhi
mips
misc-devices Documentation: Add TI TPS6594 PFSM 2023-06-15 13:41:53 +02:00
mm - Yosry Ahmed brought back some cgroup v1 stats in OOM logs. 2023-06-28 10:28:11 -07:00
netlabel
netlink netlink: specs: add display hints to ovs_flow 2023-06-24 15:45:49 -07:00
networking docs: net: clarify the NAPI rules around XDP Tx 2023-07-21 18:51:37 -07:00
nvdimm
nvme
PCI Merge branch 'pci/controller/endpoint' 2023-06-26 13:00:00 -05:00
pcmcia
peci
power
powerpc Documentation: Document PowerPC kernel DEXCR interface 2023-06-19 17:36:27 +10:00
process Documentation: embargoed-hardware-issues.rst: add AMD to the list 2023-07-26 09:39:34 +02:00
RCU
riscv Documentation: RISC-V: hwprobe: Fix a formatting error 2023-07-11 10:43:51 -07:00
rust
s390
scheduler sched/deadline: Update GRUB description in the documentation 2023-06-16 22:08:12 +02:00
scsi
security
sound ALSA: compress: allow setting codec params after next track 2023-06-21 07:28:31 +02:00
sphinx
sphinx-static
spi
staging
target scsi: target: docs: Remove tcm_mod_builder.py 2023-06-28 22:01:32 -04:00
timers
tools Documentation: Add tools/rtla timerlat -u option documentation 2023-06-13 16:43:37 -04:00
trace Char/Misc and other driver subsystem updates for 6.5-rc1 2023-07-03 12:46:47 -07:00
translations A half-dozen late arriving docs patches. They are mostly fixes, but we 2023-07-06 22:15:38 -07:00
usb
userspace-api media updates for v6.5-rc1 2023-07-05 10:42:32 -07:00
virt A half-dozen late arriving docs patches. They are mostly fixes, but we 2023-07-06 22:15:38 -07:00
w1
watchdog
wmi platform/x86: dell-ddv: Fix mangled list in documentation 2023-07-11 12:15:30 +02:00
.gitignore
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py
docutils.conf
dontdiff
index.rst
Kconfig
Makefile
memory-barriers.txt
SubmittingPatches
subsystem-apis.rst platform-drivers-x86 for v6.5-1 2023-06-30 14:50:00 -07:00