docs/migration: Organize "Postcopy" page

Reorganize the page, moving things around, and add a few
headlines ("Postcopy internals", "Postcopy features") to cover sub-areas.

Reviewed-by: Cédric Le Goater <clg@redhat.com>
Link: https://lore.kernel.org/r/20240109064628.595453-9-peterx@redhat.com
Signed-off-by: Peter Xu <peterx@redhat.com>
This commit is contained in:
Peter Xu 2024-01-09 14:46:26 +08:00
parent 4c6f8a79ae
commit 21b17cd011

View file

@ -1,6 +1,9 @@
========
Postcopy
========
.. contents::
'Postcopy' migration is a way to deal with migrations that refuse to converge
(or take too long to converge) its plus side is that there is an upper bound on
the amount of migration traffic and time it takes, the down side is that during
@ -14,7 +17,7 @@ Postcopy can be combined with precopy (i.e. normal migration) so that if precopy
doesn't finish in a given time the switch is made to postcopy.
Enabling postcopy
-----------------
=================
To enable postcopy, issue this command on the monitor (both source and
destination) prior to the start of migration:
@ -49,8 +52,71 @@ time per vCPU.
``migrate_set_parameter`` is ignored (to avoid delaying requested pages that
the destination is waiting for).
Postcopy device transfer
------------------------
Postcopy internals
==================
State machine
-------------
Postcopy moves through a series of states (see postcopy_state) from
ADVISE->DISCARD->LISTEN->RUNNING->END
- Advise
Set at the start of migration if postcopy is enabled, even
if it hasn't had the start command; here the destination
checks that its OS has the support needed for postcopy, and performs
setup to ensure the RAM mappings are suitable for later postcopy.
The destination will fail early in migration at this point if the
required OS support is not present.
(Triggered by reception of POSTCOPY_ADVISE command)
- Discard
Entered on receipt of the first 'discard' command; prior to
the first Discard being performed, hugepages are switched off
(using madvise) to ensure that no new huge pages are created
during the postcopy phase, and to cause any huge pages that
have discards on them to be broken.
- Listen
The first command in the package, POSTCOPY_LISTEN, switches
the destination state to Listen, and starts a new thread
(the 'listen thread') which takes over the job of receiving
pages off the migration stream, while the main thread carries
on processing the blob. With this thread able to process page
reception, the destination now 'sensitises' the RAM to detect
any access to missing pages (on Linux using the 'userfault'
system).
- Running
POSTCOPY_RUN causes the destination to synchronise all
state and start the CPUs and IO devices running. The main
thread now finishes processing the migration package and
now carries on as it would for normal precopy migration
(although it can't do the cleanup it would do as it
finishes a normal migration).
- Paused
Postcopy can run into a paused state (normally on both sides when
happens), where all threads will be temporarily halted mostly due to
network errors. When reaching paused state, migration will make sure
the qemu binary on both sides maintain the data without corrupting
the VM. To continue the migration, the admin needs to fix the
migration channel using the QMP command 'migrate-recover' on the
destination node, then resume the migration using QMP command 'migrate'
again on source node, with resume=true flag set.
- End
The listen thread can now quit, and perform the cleanup of migration
state, the migration is now complete.
Device transfer
---------------
Loading of device data may cause the device emulation to access guest RAM
that may trigger faults that have to be resolved by the source, as such
@ -130,7 +196,20 @@ processing.
is no longer used by migration, while the listen thread carries on servicing
page data until the end of migration.
Postcopy Recovery
Source side page bitmap
-----------------------
The 'migration bitmap' in postcopy is basically the same as in the precopy,
where each of the bit to indicate that page is 'dirty' - i.e. needs
sending. During the precopy phase this is updated as the CPU dirties
pages, however during postcopy the CPUs are stopped and nothing should
dirty anything any more. Instead, dirty bits are cleared when the relevant
pages are sent during postcopy.
Postcopy features
=================
Postcopy recovery
-----------------
Comparing to precopy, postcopy is special on error handlings. When any
@ -166,76 +245,6 @@ configurations of the guest. For example, when with async page fault
enabled, logically the guest can proactively schedule out the threads
accessing missing pages.
Postcopy states
---------------
Postcopy moves through a series of states (see postcopy_state) from
ADVISE->DISCARD->LISTEN->RUNNING->END
- Advise
Set at the start of migration if postcopy is enabled, even
if it hasn't had the start command; here the destination
checks that its OS has the support needed for postcopy, and performs
setup to ensure the RAM mappings are suitable for later postcopy.
The destination will fail early in migration at this point if the
required OS support is not present.
(Triggered by reception of POSTCOPY_ADVISE command)
- Discard
Entered on receipt of the first 'discard' command; prior to
the first Discard being performed, hugepages are switched off
(using madvise) to ensure that no new huge pages are created
during the postcopy phase, and to cause any huge pages that
have discards on them to be broken.
- Listen
The first command in the package, POSTCOPY_LISTEN, switches
the destination state to Listen, and starts a new thread
(the 'listen thread') which takes over the job of receiving
pages off the migration stream, while the main thread carries
on processing the blob. With this thread able to process page
reception, the destination now 'sensitises' the RAM to detect
any access to missing pages (on Linux using the 'userfault'
system).
- Running
POSTCOPY_RUN causes the destination to synchronise all
state and start the CPUs and IO devices running. The main
thread now finishes processing the migration package and
now carries on as it would for normal precopy migration
(although it can't do the cleanup it would do as it
finishes a normal migration).
- Paused
Postcopy can run into a paused state (normally on both sides when
happens), where all threads will be temporarily halted mostly due to
network errors. When reaching paused state, migration will make sure
the qemu binary on both sides maintain the data without corrupting
the VM. To continue the migration, the admin needs to fix the
migration channel using the QMP command 'migrate-recover' on the
destination node, then resume the migration using QMP command 'migrate'
again on source node, with resume=true flag set.
- End
The listen thread can now quit, and perform the cleanup of migration
state, the migration is now complete.
Source side page map
--------------------
The 'migration bitmap' in postcopy is basically the same as in the precopy,
where each of the bit to indicate that page is 'dirty' - i.e. needs
sending. During the precopy phase this is updated as the CPU dirties
pages, however during postcopy the CPUs are stopped and nothing should
dirty anything any more. Instead, dirty bits are cleared when the relevant
pages are sent during postcopy.
Postcopy with hugepages
-----------------------
@ -293,7 +302,7 @@ Retro-fitting postcopy to existing clients is possible:
guest memory access is made while holding a lock then all other
threads waiting for that lock will also be blocked.
Postcopy Preemption Mode
Postcopy preemption mode
------------------------
Postcopy preempt is a new capability introduced in 8.0 QEMU release, it