linux/Documentation
Feng Tang 2ed08e4bc5 clocksource: Scale the watchdog read retries automatically
On a 8-socket server the TSC is wrongly marked as 'unstable' and disabled
during boot time on about one out of 120 boot attempts:

    clocksource: timekeeping watchdog on CPU227: wd-tsc-wd excessive read-back delay of 153560ns vs. limit of 125000ns,
    wd-wd read-back delay only 11440ns, attempt 3, marking tsc unstable
    tsc: Marking TSC unstable due to clocksource watchdog
    TSC found unstable after boot, most likely due to broken BIOS. Use 'tsc=unstable'.
    sched_clock: Marking unstable (119294969739, 159204297)<-(125446229205, -5992055152)
    clocksource: Checking clocksource tsc synchronization from CPU 319 to CPUs 0,99,136,180,210,542,601,896.
    clocksource: Switched to clocksource hpet

The reason is that for platform with a large number of CPUs, there are
sporadic big or huge read latencies while reading the watchog/clocksource
during boot or when system is under stress work load, and the frequency and
maximum value of the latency goes up with the number of online CPUs.

The cCurrent code already has logic to detect and filter such high latency
case by reading the watchdog twice and checking the two deltas. Due to the
randomness of the latency, there is a low probabilty that the first delta
(latency) is big, but the second delta is small and looks valid. The
watchdog code retries the readouts by default twice, which is not
necessarily sufficient for systems with a large number of CPUs.

There is a command line parameter 'max_cswd_read_retries' which allows to
increase the number of retries, but that's not user friendly as it needs to
be tweaked per system. As the number of required retries is proportional to
the number of online CPUs, this parameter can be calculated at runtime.

Scale and enlarge the number of retries according to the number of online
CPUs and remove the command line parameter completely.

[ tglx: Massaged change log and comments ]

Signed-off-by: Feng Tang <feng.tang@intel.com>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Tested-by: Jin Wang <jin1.wang@intel.com>
Tested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Waiman Long <longman@redhat.com>
Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
Link: https://lore.kernel.org/r/20240221060859.1027450-1-feng.tang@intel.com
2024-02-21 12:00:42 +01:00
..
ABI Char/Misc changes for 6.8-rc5 2024-02-17 08:52:38 -08:00
accel docs/accel: correct links to mailing list archives 2024-01-23 14:45:50 -07:00
accounting
admin-guide clocksource: Scale the watchdog read retries automatically 2024-02-21 12:00:42 +01:00
arch arm64: Subscribe Microsoft Azure Cobalt 100 to ARM Neoverse N2 errata 2024-02-15 11:47:22 +00:00
block Documentation: block: ioprio: Update schedulers 2024-01-18 08:21:14 -07:00
bpf Another moderately busy cycle for documentation, including: 2024-01-11 19:46:52 -08:00
cdrom
core-api A handful of late-arriving documentation fixes. 2024-01-17 11:49:11 -08:00
cpu-freq
crypto Another moderately busy cycle for documentation, including: 2024-01-11 19:46:52 -08:00
dev-tools Documentation: KUnit: Update the instructions on how to test static functions 2024-01-22 07:59:03 -07:00
devicetree sound fixes for 6.8-rc5 2024-02-16 09:02:19 -08:00
doc-guide docs: Raise the minimum Sphinx requirement to 2.4.4 2023-12-15 08:36:33 -07:00
driver-api pci-v6.8-changes 2024-01-17 16:23:17 -08:00
fault-injection
fb fbdev/intelfb: Remove driver 2024-01-12 12:38:37 +01:00
features riscv: Add support for BATCHED_UNMAP_TLB_FLUSH 2024-01-11 08:01:53 -08:00
filesystems ovl: mark xwhiteouts directory with overlay.opaque='x' 2024-01-23 12:39:48 +02:00
firmware-guide
firmware_class
fpga
gpu amd-drm-next-6.8-2024-01-05: 2024-01-09 09:07:50 +10:00
hid
hwmon hwmon: (lm75) Add AMS AS6200 temperature sensor 2024-01-02 08:44:57 -08:00
i2c Documentation/i2c: fix spelling error in i2c-address-translators 2023-12-27 20:05:44 +01:00
iio
images
infiniband
input
isdn
kbuild docs: kconfig: Fix grammar and formatting 2024-02-15 06:55:47 +09:00
kernel-hacking
leds
litmus-tests
livepatch
locking locking/mutex: Clarify that mutex_unlock(), and most other sleeping locks, can still use the lock object after it's unlocked 2024-01-08 09:55:31 +01:00
maintainer
mhi
misc-devices
mm mm/rmap: rename COMPOUND_MAPPED to ENTIRELY_MAPPED 2023-12-29 11:58:56 -08:00
netlabel
netlink dpll: fix possible deadlock during netlink dump operation 2024-02-08 18:29:21 -08:00
networking net-device: move lstats in net_device_read_txrx 2024-02-12 09:51:26 +00:00
nvdimm
nvme
PCI docs: PCI: Fix typos 2023-12-28 17:37:36 -06:00
pcmcia
peci
power Documentation: PM: Adjust freezing-of-tasks.rst to the freezer changes 2023-12-19 21:14:32 +01:00
process Documentation: Document the Linux Kernel CVE process 2024-02-17 14:46:39 +01:00
RAS
RCU
rust LoongArch changes for v6.8 2024-01-19 13:30:49 -08:00
scheduler sched/fair: Remove SCHED_FEAT(UTIL_EST_FASTUP, true) 2023-12-23 15:59:56 +01:00
scsi
security
sound
sphinx docs: kernel_feat.py: fix build error for missing files 2024-02-08 11:05:35 -07:00
sphinx-static docs: translations: add translations links when they exist 2023-12-19 14:34:59 -07:00
spi
staging rpmsg updates for v6.8 2024-01-17 15:05:27 -08:00
target
tee
timers
tools
trace tracing updates for 6.8: 2024-01-18 14:35:29 -08:00
translations A handful of late-arriving documentation fixes. 2024-01-17 11:49:11 -08:00
usb usb: gadget: ncm: Fix indentations in documentation of NCM section 2024-01-27 16:27:58 -08:00
userspace-api fbdev fixes and cleanups for 6.8-rc1: 2024-01-12 14:38:08 -08:00
virt KVM x86 MMU changes for 6.8: 2024-01-08 08:10:32 -05:00
w1
watchdog
wmi
.gitignore
atomic_bitops.txt
atomic_t.txt
Changes
CodingStyle
conf.py docs: translations: add translations links when they exist 2023-12-19 14:34:59 -07:00
docutils.conf
dontdiff
index.rst
Kconfig
Makefile doc/netlink: Regenerate netlink .rst files if ynl-gen-rst changes 2023-12-18 14:39:44 -08:00
memory-barriers.txt
SubmittingPatches
subsystem-apis.rst