qemu/replay
Jamie Iles 83ecdb18eb accel/tcg/tcg-accel-ops-rr: ensure fairness with icount
The round-robin scheduler will iterate over the CPU list with an
assigned budget until the next timer expiry and may exit early because
of a TB exit.  This is fine under normal operation, but with icount
enabled and SMP it is possible for a CPU to be starved of run time, at
which point the system livelocks.

For example, booting a riscv64 platform with '-icount
shift=0,align=off,sleep=on -smp 2' we observe a livelock once the kernel
has timers enabled and starts performing TLB shootdowns.  In this case
we have CPU 0 in M-mode with interrupts disabled sending an IPI to CPU
1.  As we enter the TCG loop, we assign the whole icount budget up to
the next timer interrupt to CPU 0 and begin executing.  The guest sits
in a busy loop and exhausts all of the budget before we try to execute
CPU 1, which is the target of the IPI.  CPU 1 is left with no budget
with which to execute, and the process repeats.

We try here to add some fairness by splitting the budget evenly across
all of the CPUs on the thread before entering each one.  The CPU count
is cached and keyed on the CPU list generation ID to avoid iterating the
list on each loop iteration.  With this change it is possible to boot an
SMP rv64 guest with icount enabled and no hangs.
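
As an illustration only, the budget split and the cached CPU count
described above might be sketched as follows.  This is not the code from
the patch: the names percpu_budget, cached_cpu_count, struct cpu_list and
struct cpu_count_cache are hypothetical, and the fallback for a budget
smaller than the CPU count is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the scheduler's view of the CPU list. */
struct cpu_list {
    unsigned generation;   /* bumped whenever a CPU is added or removed */
    int count;             /* number of CPUs currently on the list */
};

/* Hypothetical cache of the CPU count, keyed on the list generation. */
struct cpu_count_cache {
    int valid;
    unsigned generation;
    int count;
};

/*
 * Split the instruction budget up to the next timer expiry across the
 * CPUs, so that one busy-looping CPU cannot consume all of it.
 */
static int64_t percpu_budget(int64_t limit, int cpu_count)
{
    assert(cpu_count > 0);

    int64_t slice = limit / cpu_count;

    /*
     * Assumption of this sketch: if the budget is smaller than the
     * CPU count, hand out the whole budget rather than a zero slice.
     */
    return slice == 0 ? limit : slice;
}

/*
 * Return the CPU count, refreshing it only when the list generation
 * has changed since the last call.
 */
static int cached_cpu_count(struct cpu_count_cache *cache,
                            const struct cpu_list *list)
{
    if (!cache->valid || cache->generation != list->generation) {
        /* Reading list->count stands in for walking the real list. */
        cache->count = list->count;
        cache->generation = list->generation;
        cache->valid = 1;
    }
    return cache->count;
}
```

With two CPUs and a budget of 1000 instructions, percpu_budget(1000, 2)
gives each CPU a slice of 500, so in the scenario above CPU 1 would
still have budget left to handle the IPI after CPU 0 has spun through
its share.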

Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jamie Iles <quic_jiles@quicinc.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230427020925.51003-3-quic_jiles@quicinc.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-05-11 09:53:41 +01:00
meson.build replay: do not build if TCG is not available 2020-10-22 11:53:54 -04:00
replay-audio.c audio: use size_t where makes sense 2019-08-21 09:13:37 +02:00
replay-char.c chardev: src buffer const for write functions 2022-09-29 14:38:05 +04:00
replay-debugging.c qapi replay: Elide redundant has_FOO in generated C 2022-12-14 20:05:07 +01:00
replay-events.c replay: simplify async event processing 2022-06-06 09:26:53 +02:00
replay-input.c Include qemu-common.h exactly where needed 2019-06-12 13:20:20 +02:00
replay-internal.c replay: fix icount request when replaying clock access 2021-02-16 17:15:39 +01:00
replay-internal.h replay: Fix declaration of replay_read_next_clock 2022-11-29 11:09:11 -05:00
replay-net.c Clean up inclusion of sysemu/sysemu.h 2019-08-16 13:31:53 +02:00
replay-random.c replay: record and replay random number sources 2020-01-07 12:08:39 +01:00
replay-snapshot.c replay: simplify async event processing 2022-06-06 09:26:53 +02:00
replay-time.c cleanup: Tweak and re-run return_directly.cocci 2022-12-14 16:19:35 +01:00
replay.c accel/tcg/tcg-accel-ops-rr: ensure fairness with icount 2023-05-11 09:53:41 +01:00
stubs-system.c replay: Simplify setting replay blockers 2023-02-23 14:10:17 +01:00