linux/kernel/time
Frederic Weisbecker f55acb1e44 timers/migration: Fix endless timer requeue after idle interrupts
When a CPU is an idle migrator, but another CPU wakes up before it,
becomes an active migrator and handles the queue, the initial idle
migrator may end up endlessly reprogramming its clockevent, chasing ghost
timers forever such as in the following scenario:

               [GRP0:0]
             migrator = 0
             active   = 0
             nextevt  = T1
              /         \
             0           1
          active        idle (T1)

0) CPU 1 is idle and has a timer queued (T1), CPU 0 is active and is
the active migrator.

               [GRP0:0]
             migrator = NONE
             active   = NONE
             nextevt  = T1
              /         \
             0           1
          idle        idle (T1)
          wakeup = T1

1) CPU 0 is now idle and is therefore the idle migrator. It has
programmed its next timer interrupt to handle T1.

                [GRP0:0]
             migrator = 1
             active   = 1
             nextevt  = KTIME_MAX
              /         \
             0           1
          idle        active
          wakeup = T1

2) CPU 1 has woken up, it is now active and it has just handled its own
timer T1.

3) CPU 0 gets a timer interrupt to handle T1 but tmigr_handle_remote()
realize it is not the migrator anymore. So it early returns without
observing that T1 has been expired already and therefore without
updating its ->wakeup value.

4) CPU 0 goes into tmigr_cpu_new_timer() which also early returns
because it doesn't queue a timer of its own. So ->wakeup is left
unchanged and the next timer is programmed to fire now.

5) goto 3) forever

This results in timer interrupt storms in idle and also in nohz_full (as
observed in rcutorture's TREE07 scenario).

Fix this with forcing a re-evaluation of tmc->wakeup while trying
remote timer handling when the CPU isn't the migrator anymmore. The
check is inherently racy but in the worst case the CPU just races setting
the KTIME_MAX value that a remote expiry also tries to set.

Fixes: 7ee9887703 ("timers: Implement the hierarchical pull model")
Reported-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240318230729.15497-2-frederic@kernel.org
2024-03-19 10:14:55 +01:00
..
alarmtimer.c alarmtimer: Use maximum alarm time for suspend 2023-10-09 15:03:28 +02:00
clockevents.c clockevents: Make clockevents_subsys const 2024-02-07 15:11:24 +01:00
clocksource-wdtest.c clocksource: Scale the watchdog read retries automatically 2024-02-21 12:00:42 +01:00
clocksource.c clocksource: Scale the watchdog read retries automatically 2024-02-21 12:00:42 +01:00
hrtimer.c tick: Split nohz and highres features from nohz_mode 2024-02-26 11:37:32 +01:00
itimer.c time: Prevent undefined behaviour in timespec64_to_ns() 2020-10-26 11:48:11 +01:00
jiffies.c clocksource: Make clocksource watchdog test safe for slow-HZ systems 2021-08-28 17:01:32 +02:00
Kconfig clocksource: Loosen clocksource watchdog constraints 2023-01-03 20:43:45 -08:00
Makefile timers: Implement the hierarchical pull model 2024-02-22 17:52:32 +01:00
namespace.c vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy 2022-12-01 11:35:40 +01:00
ntp.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
ntp_internal.h ntp: Make the RTC synchronization more reliable 2020-12-11 10:40:52 +01:00
posix-clock.c posix-clock: introduce posix_clock_context concept 2023-10-15 20:07:52 +01:00
posix-cpu-timers.c posix-cpu-timers: Implement the missing timer_wait_running callback 2023-04-21 15:34:33 +02:00
posix-stubs.c posix-timers: Get rid of [COMPAT_]SYS_NI() uses 2023-12-20 21:30:27 -08:00
posix-timers.c posix-timers: Refer properly to CONFIG_HIGH_RES_TIMERS 2023-06-18 22:41:53 +02:00
posix-timers.h posix-clocks: Introduce clock_get_ktime() callback 2020-01-14 12:20:51 +01:00
sched_clock.c time/sched_clock: Provide sched_clock_noinstr() 2023-06-05 21:11:04 +02:00
test_udelay.c time/debug: Fix memory leak with using debugfs_lookup() 2023-02-09 20:12:27 +01:00
tick-broadcast-hrtimer.c time/tick-broadcast: Remove RCU_NONIDLE() usage 2023-01-13 11:48:16 +01:00
tick-broadcast.c tick/broadcast: Make broadcast device replacement work correctly 2023-05-08 23:18:16 +02:00
tick-common.c tick: Assume timekeeping is correctly handed over upon last offline idle call 2024-02-26 11:37:32 +01:00
tick-internal.h tick: Move broadcast cancellation up to CPUHP_AP_TICK_DYING 2024-02-26 11:37:32 +01:00
tick-legacy.c timekeeping: remove xtime_update 2020-10-30 21:57:07 +01:00
tick-oneshot.c time: Fix various kernel-doc problems 2023-01-03 11:07:58 +01:00
tick-sched.c tick: Assume timekeeping is correctly handed over upon last offline idle call 2024-02-26 11:37:32 +01:00
tick-sched.h tick/sched: Fix build failure for CONFIG_NO_HZ_COMMON=n 2024-02-29 17:41:29 +01:00
time.c time: add kernel-doc in time.c 2023-07-14 13:47:07 -06:00
time_test.c time/kunit: Use correct format specifier 2024-02-21 12:00:42 +01:00
timeconst.bc
timeconv.c time: Improve performance of time64_to_tm() 2021-06-24 11:51:59 +02:00
timecounter.c time/timecounter: Mark 1st argument of timecounter_cyc2time() as const 2021-04-16 21:03:50 +02:00
timekeeping.c timekeeping: Fix cross-timestamp interpolation for non-x86 2024-02-19 12:18:51 +01:00
timekeeping.h asm-generic: cross-architecture timer cleanup 2020-12-16 00:07:17 -08:00
timekeeping_debug.c
timekeeping_internal.h timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00
timer.c timers: Assert no next dyntick timer look-up while CPU is offline 2024-02-26 11:37:32 +01:00
timer_list.c tick: Split nohz and highres features from nohz_mode 2024-02-26 11:37:32 +01:00
timer_migration.c timers/migration: Fix endless timer requeue after idle interrupts 2024-03-19 10:14:55 +01:00
timer_migration.h timers: Implement the hierarchical pull model 2024-02-22 17:52:32 +01:00
vsyscall.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00