linux/kernel/time
Frederic Weisbecker 8ca1836769 timer/migration: Fix quick check reporting late expiry
When a CPU is the last active in the hierarchy and it tries to enter
into idle, the quick check looking up the next event towards cpuidle
heuristics may report a too late expiry, such as in the following
scenario:

                        [GRP1:0]
                     migrator = NONE
                     active   = NONE
                     nextevt  = T0:0, T0:1
                     /              \
          [GRP0:0]                  [GRP0:1]
       migrator = NONE           migrator = NONE
       active   = NONE           active   = NONE
       nextevt  = T0, T1         nextevt  = T2
       /         \                /         \
      0           1              2           3
    idle       idle           idle         idle

0) The whole system is idle, and CPU 0 was the last migrator. CPU 0 has
a timer (T0), CPU 1 has a timer (T1) and CPU 2 has a timer (T2). The
expire order is T0 < T1 < T2.

                        [GRP1:0]
                     migrator = GRP0:0
                     active   = GRP0:0
                     nextevt  = T0:0(i), T0:1
                   /              \
          [GRP0:0]                  [GRP0:1]
       migrator = CPU0           migrator = NONE
       active   = CPU0           active   = NONE
       nextevt  = T0(i), T1      nextevt  = T2
       /         \                /         \
      0           1              2           3
    active       idle           idle         idle

1) CPU 0 becomes active. The (i) means a now ignored timer.

                        [GRP1:0]
                     migrator = GRP0:0
                     active   = GRP0:0
                     nextevt  = T0:1
                     /              \
          [GRP0:0]                  [GRP0:1]
       migrator = CPU0           migrator = NONE
       active   = CPU0           active   = NONE
       nextevt  = T1             nextevt  = T2
       /         \                /         \
      0           1              2           3
    active       idle           idle         idle

2) CPU 0 handles remote. No timer actually expired but ignored timers
   have been cleaned out and their sibling's timers haven't been
   propagated. As a result the top level's next event is T2 and not T1.

3) CPU 0 tries to enter idle without any global timer enqueued and calls
   tmigr_quick_check(). The expiry of T2 is returned instead of the
   expiry of T1.

When the quick check returns an expiry that is too late, the cpuidle
governor may pick up a C-state that is too deep. This may be result into
undesired CPU wake up latency if the next timer is actually close enough.

Fix this with assuming that expiries aren't sorted top-down while
performing the quick check. Pick up instead the earliest encountered one
while walking up the hierarchy.

7ee9887703 ("timers: Implement the hierarchical pull model")
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Thomas Gleixner <tglx@linutronix.de>
Link: https://lore.kernel.org/r/20240305002822.18130-1-frederic@kernel.org
2024-03-06 15:02:09 +01:00
..
alarmtimer.c alarmtimer: Use maximum alarm time for suspend 2023-10-09 15:03:28 +02:00
clockevents.c clockevents: Make clockevents_subsys const 2024-02-07 15:11:24 +01:00
clocksource-wdtest.c clocksource: Scale the watchdog read retries automatically 2024-02-21 12:00:42 +01:00
clocksource.c clocksource: Scale the watchdog read retries automatically 2024-02-21 12:00:42 +01:00
hrtimer.c tick: Split nohz and highres features from nohz_mode 2024-02-26 11:37:32 +01:00
itimer.c time: Prevent undefined behaviour in timespec64_to_ns() 2020-10-26 11:48:11 +01:00
jiffies.c clocksource: Make clocksource watchdog test safe for slow-HZ systems 2021-08-28 17:01:32 +02:00
Kconfig clocksource: Loosen clocksource watchdog constraints 2023-01-03 20:43:45 -08:00
Makefile timers: Implement the hierarchical pull model 2024-02-22 17:52:32 +01:00
namespace.c vdso/timens: Refactor copy-pasted find_timens_vvar_page() helper into one copy 2022-12-01 11:35:40 +01:00
ntp.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00
ntp_internal.h ntp: Make the RTC synchronization more reliable 2020-12-11 10:40:52 +01:00
posix-clock.c posix-clock: introduce posix_clock_context concept 2023-10-15 20:07:52 +01:00
posix-cpu-timers.c posix-cpu-timers: Implement the missing timer_wait_running callback 2023-04-21 15:34:33 +02:00
posix-stubs.c posix-timers: Get rid of [COMPAT_]SYS_NI() uses 2023-12-20 21:30:27 -08:00
posix-timers.c posix-timers: Refer properly to CONFIG_HIGH_RES_TIMERS 2023-06-18 22:41:53 +02:00
posix-timers.h posix-clocks: Introduce clock_get_ktime() callback 2020-01-14 12:20:51 +01:00
sched_clock.c time/sched_clock: Provide sched_clock_noinstr() 2023-06-05 21:11:04 +02:00
test_udelay.c time/debug: Fix memory leak with using debugfs_lookup() 2023-02-09 20:12:27 +01:00
tick-broadcast-hrtimer.c time/tick-broadcast: Remove RCU_NONIDLE() usage 2023-01-13 11:48:16 +01:00
tick-broadcast.c tick/broadcast: Make broadcast device replacement work correctly 2023-05-08 23:18:16 +02:00
tick-common.c tick: Assume timekeeping is correctly handed over upon last offline idle call 2024-02-26 11:37:32 +01:00
tick-internal.h tick: Move broadcast cancellation up to CPUHP_AP_TICK_DYING 2024-02-26 11:37:32 +01:00
tick-legacy.c timekeeping: remove xtime_update 2020-10-30 21:57:07 +01:00
tick-oneshot.c time: Fix various kernel-doc problems 2023-01-03 11:07:58 +01:00
tick-sched.c tick: Assume timekeeping is correctly handed over upon last offline idle call 2024-02-26 11:37:32 +01:00
tick-sched.h tick/sched: Fix build failure for CONFIG_NO_HZ_COMMON=n 2024-02-29 17:41:29 +01:00
time.c time: add kernel-doc in time.c 2023-07-14 13:47:07 -06:00
time_test.c time/kunit: Use correct format specifier 2024-02-21 12:00:42 +01:00
timeconst.bc time: Add SPDX license identifiers 2018-11-23 11:51:20 +01:00
timeconv.c time: Improve performance of time64_to_tm() 2021-06-24 11:51:59 +02:00
timecounter.c time/timecounter: Mark 1st argument of timecounter_cyc2time() as const 2021-04-16 21:03:50 +02:00
timekeeping.c timekeeping: Fix cross-timestamp interpolation for non-x86 2024-02-19 12:18:51 +01:00
timekeeping.h asm-generic: cross-architecture timer cleanup 2020-12-16 00:07:17 -08:00
timekeeping_debug.c timekeeping/debug: No need to check return value of debugfs_create functions 2019-01-29 20:08:41 +01:00
timekeeping_internal.h timekeeping/vsyscall: Provide vdso_update_begin/end() 2020-08-06 10:57:30 +02:00
timer.c timers: Assert no next dyntick timer look-up while CPU is offline 2024-02-26 11:37:32 +01:00
timer_list.c tick: Split nohz and highres features from nohz_mode 2024-02-26 11:37:32 +01:00
timer_migration.c timer/migration: Fix quick check reporting late expiry 2024-03-06 15:02:09 +01:00
timer_migration.h timers: Implement the hierarchical pull model 2024-02-22 17:52:32 +01:00
vsyscall.c timekeeping, clocksource: Fix various typos in comments 2021-03-22 23:06:48 +01:00