qemu/qapi
Peter Xu 4146b77ec7 migration/postcopy: Add postcopy-recover-setup phase
This patch adds a migration state on src called "postcopy-recover-setup".
The new state will describe the intermediate step starting from when the
src QEMU received a postcopy recovery request, until the migration channels
are properly established, but before the recovery process takes place.

The request came from Libvirt, which currently relies on migration state
events to detect migration state changes.  That works for most of the
migration process, except for postcopy recovery failures at the beginning.

Currently postcopy recovery only has two major states:

  - postcopy-paused: this is the state that both sides of QEMU stay in
    for as long as the migration channel is interrupted.

  - postcopy-recover: this is the state where both sides of QEMU handshake
    with each other, preparing to continue the postcopy that was
    interrupted.

The issue here is that when the recovery port is invalid, the src QEMU will
take the URI/channels, notice that the ports are not valid, and silently
stay in the postcopy-paused state, with no event sent to Libvirt.  In this
case, the only thing Libvirt can do is poll the migration status at a
suitable interval, which is suboptimal.

Considering that this is the only case where Libvirt won't get a
notification from QEMU on such events, let's add postcopy-recover-setup
state to mimic what we have with the "setup" state of a newly initialized
migration, describing the phase of connection establishment.

With that, postcopy recovery now has two paths to go, and either path
guarantees that an event is generated.  The events now look like this
during a recovery process on src QEMU:

  - Initially when the recovery is initiated on src, QEMU will go from
    "postcopy-paused" -> "postcopy-recover-setup".  Old QEMUs don't have
    this event.

  - Depending on whether the channel re-establishment succeeds:

    - On success, src QEMU will move from "postcopy-recover-setup"
      to "postcopy-recover".  Old QEMUs also have this event.

    - On failure, src QEMU will move from "postcopy-recover-setup" to
      "postcopy-paused" again.  Old QEMUs don't have this event.

This guarantees that Libvirt will always receive a notification for the
recovery process.
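
The two paths above can be modelled with a small sketch.  This is not QEMU
code: the State enum and the recover_events() function are hypothetical
names used only for illustration, and only the state strings come from the
description above.  It shows that either outcome of channel
re-establishment yields at least one state-change event on src.

```python
from enum import Enum


class State(Enum):
    # State names as described above; the enum itself is illustrative.
    POSTCOPY_PAUSED = "postcopy-paused"
    POSTCOPY_RECOVER_SETUP = "postcopy-recover-setup"
    POSTCOPY_RECOVER = "postcopy-recover"


def recover_events(channel_ok):
    """Return the state-change events src would emit for one recovery attempt.

    channel_ok is a hypothetical flag standing in for whether the channel
    re-establishment succeeded.
    """
    events = []

    # Recovery always starts by leaving postcopy-paused for the new state.
    events.append(State.POSTCOPY_RECOVER_SETUP.value)

    if channel_ok:
        # Success path: proceed to the handshake phase.
        events.append(State.POSTCOPY_RECOVER.value)
    else:
        # Failure path: fall back to paused, which now also emits an event.
        events.append(State.POSTCOPY_PAUSED.value)
    return events
```

Under this model a management application never has to poll: a failed
attempt is visible as "postcopy-recover-setup" followed by
"postcopy-paused".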

One thing to mention is that the new status is only needed on src QEMU, not
on both sides.  On dest QEMU, the state machine doesn't change, hence the
events don't change either.  It's done this way because dest QEMU may not
have an explicit point where setup starts.  E.g., dest QEMU may not use the
migrate-recover command to set a new URI/channel at all; instead the old
URI/channels can be reused in recovery, in which case the old ports simply
work again after the network routes are fixed up.

Add a new helper postcopy_is_paused() that detects whether postcopy is still
paused, taking RECOVER_SETUP into account too.  While using it on both
src/dst, also make a slight change to always wait for the semaphore before
checking the status, because on both sides a sem_post() is required for a
recovery.
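
As a rough illustration of the helper and the wait-then-check ordering, here
is a minimal Python model.  It is a sketch under assumptions, not the C
implementation: PauseWaiter, kick() and wait_until_resumed() are
hypothetical names, and threading.Semaphore stands in for the semaphore
that sem_post() signals.

```python
import threading

# Both states count as "still paused" for the purpose of the check.
PAUSED_STATES = {"postcopy-paused", "postcopy-recover-setup"}


def postcopy_is_paused(status):
    """Model of the helper: True while recovery has not yet resumed."""
    return status in PAUSED_STATES


class PauseWaiter:
    """Hypothetical stand-in for a migration thread blocked in the pause."""

    def __init__(self, status="postcopy-paused"):
        self.status = status
        self.sem = threading.Semaphore(0)

    def kick(self, new_status):
        # Update the status first, then post the semaphore (like sem_post()).
        self.status = new_status
        self.sem.release()

    def wait_until_resumed(self):
        # Always wait for the semaphore first, then check the status.
        while True:
            self.sem.acquire()
            if not postcopy_is_paused(self.status):
                return self.status
```

Waiting before checking means a wakeup that leaves the status in
"postcopy-recover-setup" or back in "postcopy-paused" simply loops and
blocks again.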

Cc: Jiri Denemark <jdenemar@redhat.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Reviewed-by: Fabiano Rosas <farosas@suse.de>
Buglink: https://issues.redhat.com/browse/RHEL-38485
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fabiano Rosas <farosas@suse.de>
2024-06-21 09:47:59 -03:00
..
acpi.json qapi: Require descriptions and tagged sections to be indented 2024-02-26 10:43:56 +01:00
audio.json audio/pw: Pipewire->PipeWire case fix for user-visible text 2023-07-17 15:22:56 +04:00
authz.json
block-core.json qapi: blockdev-backup: add discard-source parameter 2024-05-28 15:52:15 +03:00
block-export.json qapi: Move error documentation to new "Errors" sections 2024-03-04 07:12:40 +01:00
block.json qapi: Refill doc comments to conform to current conventions 2024-03-26 06:36:08 +01:00
char.json qapi: Delete useless "Returns" sections 2024-03-04 07:12:40 +01:00
common.json qapi: document PCIe Gen5/Gen6 speeds since 9.0 2024-03-18 04:57:45 -04:00
compat.json qapi: Belatedly update CompatPolicy documentation for unstable 2023-10-19 07:02:29 +02:00
control.json qapi: Drop stray Arguments: line from qmp_capabilities docs 2024-03-26 06:36:08 +01:00
crypto.json qapi: Correct documentation indentation and whitespace 2024-03-26 06:36:08 +01:00
cryptodev.json
cxl.json qapi: Refill doc comments to conform to current conventions 2024-03-26 06:36:08 +01:00
dump.json qapi: Correct documentation indentation and whitespace 2024-03-26 06:36:08 +01:00
ebpf.json qapi: Refill doc comments to conform to current conventions 2024-03-26 06:36:08 +01:00
error.json
introspect.json qapi: Drop redundant documentation of inherited members 2024-02-03 09:19:25 +01:00
job.json blockjob: introduce block-job-change QMP command 2023-10-31 18:20:25 +01:00
machine-common.json CPU topology: extend with s390 specifics 2023-10-20 07:16:53 +02:00
machine-target.json target/s390x: report deprecated-props in cpu-model-expansion reply 2024-05-10 08:34:20 +02:00
machine.json hw/intc: Introduce x-query-interrupt-controllers QMP command 2024-06-19 12:40:49 +02:00
meson.build qapi/vfio: Add VFIO migration QAPI event 2024-05-16 16:59:19 +02:00
migration.json migration/postcopy: Add postcopy-recover-setup phase 2024-06-21 09:47:59 -03:00
misc-target.json i386/sev: Update query-sev QAPI format to handle SEV-SNP 2024-06-05 11:01:06 +02:00
misc.json qapi: Correct documentation indentation and whitespace 2024-03-26 06:36:08 +01:00
net.json qapi: Refill doc comments to conform to current conventions 2024-03-26 06:36:08 +01:00
opts-visitor.c qapi: Inline and remove QERR_INVALID_PARAMETER definition 2024-04-24 09:50:58 +02:00
pci.json qapi: Require descriptions and tagged sections to be indented 2024-02-26 10:43:56 +01:00
pragma.json qapi: document parameters of query-cpu-model-* QAPI commands 2024-03-26 06:36:08 +01:00
qapi-clone-visitor.c qapi: Do not cast function pointers 2024-05-29 12:41:56 +02:00
qapi-dealloc-visitor.c
qapi-forward-visitor.c
qapi-schema.json qapi/vfio: Add VFIO migration QAPI event 2024-05-16 16:59:19 +02:00
qapi-type-helpers.c qapi: New strv_from_str_list() 2024-03-04 07:12:40 +01:00
qapi-util.c qapi: Fix dangling references to docs/devel/qapi-code-gen.txt 2024-01-26 07:04:53 +01:00
qapi-visit-core.c
qdev.json qapi: Delete useless "Returns" sections 2024-03-04 07:12:40 +01:00
qmp-dispatch.c Revert "monitor: use aio_co_reschedule_self()" 2024-06-10 11:05:43 +02:00
qmp-event.c
qmp-registry.c
qobject-input-visitor.c qapi: Inline QERR_INVALID_PARAMETER_TYPE definition (constant value) 2024-04-24 09:50:58 +02:00
qobject-output-visitor.c
qom.json i386/sev: Introduce 'sev-snp-guest' object 2024-06-05 11:01:06 +02:00
replay.json qapi: Expand a few awkward abbreviations in documentation 2024-03-26 06:36:08 +01:00
rocker.json qapi: Require descriptions and tagged sections to be indented 2024-02-26 10:43:56 +01:00
run-state.json qapi: document leftover members in qapi/run-state.json 2024-03-26 06:36:08 +01:00
sockets.json qapi: Correct documentation indentation and whitespace 2024-03-26 06:36:08 +01:00
stats.json qapi: document leftover members in qapi/stats.json 2024-03-26 06:36:08 +01:00
string-input-visitor.c qapi: Inline QERR_INVALID_PARAMETER_TYPE definition (constant value) 2024-04-24 09:50:58 +02:00
string-output-visitor.c string-output-visitor: Fix (pseudo) struct handling 2024-01-26 11:16:58 +01:00
tpm.json qapi: Delete useless "Returns" sections 2024-03-04 07:12:40 +01:00
trace-events
trace.h
trace.json trace: Remove deprecated 'vcpu' field from QMP trace events 2024-06-04 11:53:43 +02:00
transaction.json qapi: Delete useless "Returns" sections 2024-03-04 07:12:40 +01:00
ui.json qapi: document InputMultiTouchType 2024-03-26 06:36:08 +01:00
vfio.json qapi/vfio: Add VFIO migration QAPI event 2024-05-16 16:59:19 +02:00
virtio.json qapi: Refill doc comments to conform to current conventions 2024-03-26 06:36:08 +01:00
yank.json qapi/yank: Tweak @yank's error description for consistency 2024-03-04 07:12:40 +01:00