Support guest-fsfreeze-freeze and guest-fsfreeze-thaw commands for Windows
guests. When fsfreeze command is issued, it calls the VSS requester to
freeze filesystems and applications. On thaw command, it again tells the VSS
requester to thaw them.
This also adds calling of initialize functions for the VSS requester.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Adds VSS provider and requester as a qga-vss.dll, which is loaded by
Windows VSS service as well as by qemu-ga.
"provider.cpp" implements a basic stub of a software VSS provider.
Currently, this module only relays a frozen event from VSS service to the
agent, and thaw event from the agent to VSS service, to block VSS process
to keep the system frozen while snapshots are taken at the host.
To register the provider to the guest system as COM+ application, the type
library (.tlb) for qga-vss.dll is required. To build it from COM IDL (.idl),
VisualC++, MIDL and stdole2.tlb in Windows SDK are required. This patch also
adds pre-compiled .tlb file in the repository in order to enable
cross-compile qemu-ga.exe for Windows with VSS support.
"requester.cpp" provides the VSS requester to kick the VSS snapshot process.
Qemu-ga.exe works without the DLL, although fsfreeze features are disabled.
These functions are only supported in Windows 2003 or later. In older
systems, fsfreeze features are disabled.
In several versions of Windows which don't support attribute
VSS_VOLSNAP_ATTR_NO_AUTORECOVERY, DoSnapshotSet fails with error
VSS_E_OBJECT_NOT_FOUND. In this patch, we just ignore this error.
To solve this fundamentally, we need a framework to handle mount writable
snapshot on guests, which is required by VSS auto-recovery feature
(cleanup phase after a snapshot is taken).
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
These functions help maintaining homogeneous formatting of error messages
with Windows error code and description (generated by
g_win32_error_message()).
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
To enable VSS support in qemu-ga for Windows, header files included in
VSS SDK are required.
The VSS support is enabled by the configure option like below:
./configure --with-vss-sdk="/path/to/VSS SDK"
If the path is omitted, it tries to search the headers from default paths
and VSS support is enabled only if the SDK is found.
VSS support is disabled if --without-vss-sdk or --with-vss-sdk=no is
specified.
VSS SDK is available from:
http://www.microsoft.com/en-us/download/details.aspx?id=23490
To cross-compile using mingw, you need to setup the SDK on Windows
environments to extract headers. You can also extract the SDK headers on
POSIX environments using scripts/extract-vss-headers and msitools.
In addition, --with-win-sdk="/path/to/Windows SDK" option is also added to
specify path to Windows SDK, which may be used for native-compile of .tlb
file of qemu-ga VSS provider. However, this is usually unnecessary because
pre-compiled .tlb file is included.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
VSS SDK(*) setup.exe is only runnable on Windows. This adds a script
to extract VSS SDK headers on POSIX-systems using msitools.
* http://www.microsoft.com/en-us/download/details.aspx?id=23490
From: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Enable checkpatch.pl to apply the same checks as C source files for
C++ files with .cpp extensions. It also adds some exceptions for C++
sources to suppress errors for:
- <> used in C++ template arguments (e.g. template <class T>)
- :: used to represent namespaces (e.g. SomeClass::method())
- : used in class declaration (e.g. class T : public Super)
- ~ used in destructor method name (e.g. T::~T())
- spacing around 'catch' (e.g. catch (...))
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add c++ keywords to avoid errors in compiling with c++ compiler.
This also renames class member of PciDeviceInfo to q_class.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add configuration for C++ compiler in configure and Makefiles.
The C++ compiler is choosed as following:
- ${CXX}, if it is specified.
- ${cross_prefix}g++, if ${cross_prefix} is specified.
- Otherwise, c++ is used.
Currently, usage of C++ language is only for access to Windows VSS
using COM+ services in qemu-guest-agent for Windows.
Signed-off-by: Tomoki Sekiyama <tomoki.sekiyama@hds.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Micael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
A Malta board can support up to 2GiB of RAM. Since the unmapped kseg0/1
regions are only 512MiB large & the latter 256MiB of those are taken up
by the IO region, access to RAM beyond 256MiB must be done through a
mapped region. In the case of a Linux guest this means we need to use
highmem.
The mainline Linux kernel does not support highmem for Malta at this
time, however this can be tested using the linux-mti-3.8 kernel branch
available from:
git://git.linux-mips.org/pub/scm/linux-mti.git
You should be able to boot a Linux kernel built from the linux-mti-3.8
branch, with CONFIG_HIGHMEM enabled, using 2GiB RAM by passing "-m 2G"
to QEMU and appending the following kernel parameters:
mem=256m@0x0 mem=256m@0x90000000 mem=1536m@0x20000000
Note that the upper half of the physical address space of a Malta
mirrors the lower half (hence the 2GiB limit) except that the IO region
(0x10000000-0x1fffffff in the lower half) is not mirrored in the upper
half. That is, physical addresses 0x90000000-0x9fffffff access RAM
rather than the IO region, resulting in a physical address space
resembling the following:
0x00000000 -> 0x0fffffff RAM
0x10000000 -> 0x1fffffff I/O
0x20000000 -> 0x7fffffff RAM
0x80000000 -> 0x8fffffff RAM (mirror of 0x00000000 -> 0x0fffffff)
0x90000000 -> 0x9fffffff RAM
0xa0000000 -> 0xffffffff RAM (mirror of 0x20000000 -> 0x7fffffff)
The second mem parameter provided to the kernel above accesses the
second 256MiB of RAM through the upper half of the physical address
space, making use of the aliasing described above in order to avoid
the IO region and use the whole 2GiB RAM.
The memory setup may be seen as 'backwards' in this commit since the
'real' memory is mapped in the upper half of the physical address space
and the lower half contains the aliases. On real hardware it would be
typical to see the upper half of the physical address space as the alias
since the bus addresses generated match the lower half of the physical
address space. However since the memory accessible in the upper half of
the physical address space is uninterrupted by the IO region it is
easiest to map the RAM as a whole there, and functionally it makes no
difference to the target code.
Due to the requirements of accessing the second 256MiB of RAM through
a mapping to the upper half of the physical address space it is usual
for the bootloader to indicate a maximum of 256MiB memory to a kernel.
This allows kernels which do not support such access to boot on systems
with more than 256MiB of RAM. It is also the behaviour assumed by Linux.
QEMUs small generated bootloader is modified to provide this behaviour.
Signed-off-by: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
CC: qemu-stable@nongnu.org
Now that the memory subsystem is propagating the endianness correctly,
the ne2000 device should have its I/O ports marked as LITTLE_ENDIAN, as
PCI devices are little endian.
This makes the ne2000 NIC to work again on PowerPC.
Cc: qemu-stable@nongnu.org
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This eliminates a warning about __packed being redefined as exposed by the
vmxnet3 code. __packed is not used anywhere in the vmxnet3 code.
CC hw/net/vmxnet3.o
In file included from hw/net/vmxnet3.c:29:
hw/net/vmxnet3.h:37:1: warning: "__packed" redefined
In file included from /usr/include/stdlib.h:38,
from /buildbot-qemu/default_openbsd_current/build/include/qemu-common.h:26,
from /buildbot-qemu/default_openbsd_current/build/include/hw/hw.h:5,
from hw/net/vmxnet3.c:18:
/usr/include/sys/cdefs.h:209:1: warning: this is the location of the previous definition
Signed-off-by: Brad Smith <brad@comstyle.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch partially implements the e1000 interrupt mitigation mechanisms.
Using a single QEMUTimer, it emulates the ITR register (which is the newer
mitigation register, recommended by Intel) and approximately emulates
RADV and TADV registers. TIDV and RDTR register functionalities are not
emulated (RDTR is only used to validate RADV, according to the e1000 specs).
RADV, TADV, TIDV and RDTR registers make up the older e1000 mitigation
mechanism and would need a timer each to be completely emulated. However,
a single timer has been used in order to reach a good compromise between
emulation accuracy and simplicity/efficiency.
The implemented mechanism can be enabled/disabled specifying the command
line e1000-specific boolean parameter "mitigation", e.g.
qemu-system-x86_64 -device e1000,mitigation=on,... ...
For more information, see the Software developer's manual at
http://download.intel.com/design/network/manuals/8254x_GBe_SDM.pdf.
Interrupt mitigation boosts performance when the guest suffers from
an high interrupt rate (i.e. receiving short UDP packets at high packet
rate). For some numerical results see the following link
http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf
Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Reviewed-by: Andreas Färber <afaerber@suse.de> (for pc-* machines)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Each networking client has a queue for packets that could not yet be
delivered to that client. Calling this queue "send_queue" is highly
confusing as it has nothing to to with packets send from this client but
to it. Avoid this confusing by renaming it to "incoming_queue".
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The following patch simplifies the *BSD tap/tun code and makes use of numbered
tap/tun interfaces on all *BSD OS's. NetBSD has a patch in their pkgsrc tree
to make use of this feature and DragonFly also supports this as well.
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The reference output for test case 026 hasn't been updated in a long
time and it's one of the "known failing" cases. This patch updates the
reference output so that unintentional changes can be reliably detected
again.
The problem with this test case is that it produces different output
depending on whether -nocache is used or not. The solution of this patch
is to actually have two different reference outputs. If nnn.out.nocache
exists, it is used as the reference output for -nocache; otherwise,
nnn.out stays valid for both cases.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
These scripts used to have a four characters indentation, with eight
consecutive spaces converted into a tab. Convert everything into spaces.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Avoid trying to setup dataplane again if dataplane setup is already in
progress. This may happen if an eventfd is triggered during setup.
I saw this occasionally with an experimental s390 irqfd implementation:
virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> virtio_ccw_set_host_notifier
...
-> virtio_queue_set_host_notifier_fd_handler
-> virtio_queue_host_notifier_read
-> virtio_queue_notify_vq
-> virtio_blk_handle_output
-> virtio_blk_data_plane_start
-> vring_setup
-> hostmem_init
-> memory_listener_register
-> BOOM
As virtio-ccw tries to follow what virtio-pci does, it might be triggerable
for other platforms as well.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Protocols return raw data, so you can assume the offsets to pass
through unchanged.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
These are created for example with XFS_IOC_ZERO_RANGE.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Eric Blake also requested including the output in qapi-schema.json,
so that it is published through the introspection mechanism.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This command dumps the metadata of an entire chain, in either tabular or JSON
format.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the sectors are unallocated and we are past the end of the
backing file, they will read as zero.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Alternatively, this could use a "discard zeroes data" flag returned
by bdrv_get_info.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Define the return value of get_block_status. Bits 0, 1, 2 and 9-62
are valid; bit 63 (the sign bit) is reserved for errors. Bits 3-8
are left for future extensions.
The return code is compatible with the old is_allocated API: if a driver
only returns 0 or 1 (aka BDRV_BLOCK_DATA) like is_allocated used to,
clients of is_allocated will not have any change in behavior. Still,
we will return more precise information in the next patches and the
new definition of bdrv_is_allocated is already prepared for this.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
For now, bdrv_get_block_status is just another name for bdrv_is_allocated.
The next patches will add more flags.
This also touches all block drivers with a mostly mechanical rename. The
sole exception is cow; because it calls cow_co_is_allocated from the read
code, we keep that function and make cow_co_get_block_status a wrapper.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This helps implementing is_allocated on top of get_block_status.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qemu-img convert can assume "that sectors which are unallocated in the
input image are present in both the output's and input's base images".
However it is only doing this if the output image returns true for
bdrv_has_zero_init(). Testing bdrv_has_zero_init() does not make much
sense if the output image is copy-on-write, because a copy-on-write
image is never initialized to zero (it is initialized to the content
of the backing file).
There is nothing here that makes has_zero_init images special. The
input and output must be equal for the operation to make sense, and
that's it.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Some bdrv_is_allocated callers do not expect errors, but the fallback
in qcow2.c might make other callers trip on assertion failures or
infinite loops.
Fix the callers to always look for errors.
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that bdrv_is_allocated detects coroutine context, the two can
use the same code.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is more robust when the device has removable media.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_is_allocated can detect coroutine context and go through a fast
path, similar to other block layer functions.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If a BlockDriverState is growable, after every write we need to
check if bs->total_sectors might have changed. With this change,
bdrv_getlength does not need anymore a system call.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
As we change bdrv_is_allocated to gather more information from bs and
bs->file, it will become a bit slower. It is still appropriate for online
jobs, but not for reads/writes. Call the internal function instead.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Only sync once per write, rather than once per sector.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Do not do two reads for each sector; load each sector of the bitmap
and use bitmap operations to process it.
Writes are still dog slow!
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add an appropriate entry describing this event and its parameters into
qmp-events.txt.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Block jobs used drive_get_ref(drive_get_by_blockdev(bs)) to avoid BDS
being deleted. Now we have BDS reference count, and block jobs don't
care about dinfo, so replace them to get cleaner code. It is also the
safe way when BDS has no drive info.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Previously, nbd calls drive_get_ref() on the drive of bs. A BDS doesn't
always have associated dinfo, which nbd doesn't care either. We already
have BDS ref count, so use it to make it safe for a BDS w/o blockdev.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We call bdrv_attach_dev when initializing whether or not bs is created
locally, so call bdrv_detach_dev and let the refcnt handle the
lifecycle.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
block-migration.c does not actually use DriveInfo anywhere. Hence it's
safe to drive ref code, we really only care about referencing BDS.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Manage BlockDriverState lifecycle with refcnt, so bdrv_delete() is no
longer public and should be called by bdrv_unref() if refcnt is
decreased to 0.
This is an identical change because effectively, there's no multiple
reference of BDS now: no caller of bdrv_ref() yet, only bdrv_new() sets
bs->refcnt to 1, so all bdrv_unref() now actually delete the BDS.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce bdrv_ref/bdrv_unref to manage the lifecycle of
BlockDriverState. They are unused for now but will used to replace
bdrv_delete() later.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
BlockDriverState structure needs bdrv_new() to initialize refcnt, don't
allocate a local structure variable and memset to 0, becasue with coming
refcnt implementation, bdrv_unref will crash if bs->refcnt not
initialized to 1.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
we need bdrv_new() to properly initialize BDS, don't allocate memory
manually.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
tests/test-aio.c used pipe2 which is Linux only. Use qemu_pipe
and qemu_set_nonblock for portabillity. Addition of O_CLOEXEC
is a harmless bonus.
Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>