Commit graph

890 commits

Author SHA1 Message Date
Kevin Wolf
f468121290 blkdebug: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
8a79380b8e raw-win32: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
c66a615723 raw-posix: Use bdrv_open options instead of filename
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Kevin Wolf
31ca6d077c block: Add driver-specific options for backing files
Options starting in "backing." are passed to the backing file now. If
you don't need to specify the filename for the backing file, you can add
it on the command line instead of in the image file:

$ qemu-nbd -t /tmp/test.img
$ qemu-img create -f qcow2 empty.qcow2 1G
$ qemu-system-x86_64 -drive file=empty.qcow2,backing.file.driver=nbd,\
    backing.file.host=localhost

Note that this doesn't override the backing filename from the image. If
the image has one, this will fail because NBD doesn't want the options
and a filename at the same time.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-04-22 10:27:59 +02:00
Stefan Hajnoczi
16b3c5cd9f qcow: allow sub-cluster compressed write to last cluster
Compression in qcow requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Note that this is almost identical to the qcow2 version of this code.
qcow2's compression code is drawn from qcow.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Stefan Hajnoczi
f4d38bef7c qcow2: allow sub-cluster compressed write to last cluster
Compression in qcow2 requires image length to be a multiple of the
cluster size.  Lift this requirement by zero-padding the final cluster
when necessary.  The virtual disk size is still not cluster-aligned, so
the guest cannot access the zero sectors.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-22 10:27:58 +02:00
Richard W.M. Jones
c7a101f529 ssh: Remove unnecessary use of strlen function.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-19 11:45:38 +02:00
Stefan Weil
6ae7d660a0 block/ssh: Add missing gcc format attributes
Now gcc will check whether format string and variable arguments match.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-19 11:44:16 +02:00
Josh Durgin
dc7588c1eb rbd: add an asynchronous flush
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
which is sychronous and causes the main qemu thread to block until it
is complete. This results in unresponsiveness and extra latency for
the guest.

Fix this by using an asynchronous version of flush.  This was added to
librbd with a special #define to indicate its presence, since it will
be backported to stable versions. Thus, there is no need to check the
version of librbd.

Implement this as bdrv_aio_flush, since it matches other aio functions
in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
asynchronous version is available.

Reported-by: Oliver Francke <oliver@filoo.de>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Richard W.M. Jones
9a2d462e7b block: ssh: Use libssh2_sftp_fsync (if supported by libssh2) to flush to disk.
libssh2_sftp_fsync is an extension to libssh2 to support fsync(2) over
sftp, which is itself an extension of OpenSSH.

If both libssh2 and the ssh daemon support it, this will allow
bdrv_flush_to_disk to commit changes through to disk on the remote
server.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Richard W.M. Jones
0a12ec87a5 block: Add support for Secure Shell (ssh) block device.
qemu-system-x86_64 -drive file=ssh://hostname/some/image

QEMU will ssh into 'hostname' and open '/some/image' which is made
available as a standard block device.

You can specify a username (ssh://user@host/...) and/or a port number
(ssh://host:port/...).  You can also use an alternate syntax using
properties (file.user, file.host, file.port, file.path).

Current limitations:

- Authentication must be done without passwords or passphrases, using
  ssh-agent.  Other authentication methods are not supported.

- Uses a single connection, instead of concurrent AIO with multiple
  SSH connections.

This is implemented using libssh2 on the client side.  The server just
requires a regular ssh daemon with sftp-server support.  Most ssh
daemons on Unix/Linux systems will work out of the box.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 10:18:05 +02:00
Kevin Wolf
8d3b1a2d0b block: Introduce bdrv_pwritev() for qcow2_save_vmstate
Directly pass the QEMUIOVector on instead of linearising it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Kevin Wolf
cf8074b382 block: Introduce bdrv_writev_vmstate
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-04-15 08:26:18 +02:00
Aurelien Jarno
753d9b82c5 aes: move aes.h from include/block to include/qemu
Move aes.h from include/block to include/qemu to show it can be reused
by other subsystems.

Cc: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2013-04-13 13:51:57 +02:00
Paolo Bonzini
0d09e41a51 hw: move headers to include/
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2013-04-08 18:13:10 +02:00
Kevin Wolf
c2b6ff51e4 qcow2: Fix L1 write error handling in qcow2_update_snapshot_refcount
It ignored the error code, and at least the 'goto fail' is obvious
nonsense as it creates an endless loop (if the next attempt doesn't
magically succeed) and leaves the in-memory L1 table in big-endian
instead of converting it back.

In error cases, there's no point in writing an updated L1 table, so
skip this part for them.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-05 18:58:05 +02:00
Kevin Wolf
c2bc78b6a9 qcow2: Return real error in qcow2_update_snapshot_refcount
This fixes the error message triggered by the following script:

    cat > /tmp/blkdebug.cfg <<EOF
    [inject-error]
    event = "cluster_free"
    errno = "28"
    immediately = "off"
    EOF

    $qemu_img create -f qcow2 test.qcow2 10G
    $qemu_img snapshot -c snap test.qcow2
    $qemu_img snapshot -d snap blkdebug:/tmp/blkdebug.cfg:test.qcow2

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-04-05 18:58:05 +02:00
Stefan Hajnoczi
f9e8cacc55 oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets.
Rename to qemu_set_nonblock() just like qemu_set_cloexec().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
2013-04-02 11:47:37 -04:00
Kevin Wolf
ecdd5333ab qcow2: Gather clusters in a looping loop
Instead of just checking once in exactly this order if there are
dependendies, non-COW clusters and new allocation, this starts looping
around these. This way we can, for example, gather non-COW clusters after
new allocations as long as the host cluster offsets stay contiguous.

Once handle_dependencies() is extended so that COW areas of in-flight
allocations can be overwritten, this allows to continue with gathering
other clusters (we wouldn't be able to do that without this change
because we would have missed a possible second dependency in one of the
next clusters).

This means that in the typical sequential write case, we can combine the
COW overwrite of one cluster with the allocation of the next cluster as
soon as something like Delayed COW gets actually implemented. It is only
by avoiding splitting requests this way that Delayed COW actually starts
improving performance noticably.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
2c3b32d256 qcow2: Move cluster gathering to a non-looping loop
This patch is mainly to separate the indentation change from the
semantic changes. All that really changes here is that everything moves
into a while loop, all 'goto done' become 'break' and at the end of the
loop a new 'break is inserted.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
88c6588c51 qcow2: Allow requests with multiple l2metas
Instead of expecting a single l2meta, have a list of them. This allows
to still have a single I/O request for the guest data, even though
multiple l2meta may be needed in order to describe both a COW overwrite
and a new cluster allocation (typical sequential write case).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
710c2496d8 qcow2: Use byte granularity in qcow2_alloc_cluster_offset()
This gets rid of the nb_clusters and keep_clusters and the associated
complicated calculations. Just advance the number of bytes that have
been processed and everything is fine.

This patch advances the variables even after the last operation even
though they aren't used any more afterwards to make things look more
uniform. A later patch will turn the whole thing into a loop and then
it actually starts making sense.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:44 +01:00
Kevin Wolf
411d62b04b qcow2: Prepare handle_alloc/copied() for byte granularity
This makes handle_alloc() and handle_copied() return byte-granularity
host offsets instead of returning always the cluster start. This is
required so that qcow2_alloc_cluster_offset() can stop aligning
everything to cluster boundaries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
e62daaf679 qcow2: handle_copied(): Implement non-zero host_offset
Look only for clusters that start at a given physical offset.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
c53ede9f6d qcow2: handle_copied(): Get rid of keep_clusters parameter
Now *bytes is used to return the length of the area that can be written
to without performing an allocation or COW.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
acb0467f8d qcow2: handle_copied(): Get rid of nb_clusters parameter
handle_copied() uses its bytes parameter now to determine how many
clusters it should try to find.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
0af729ec00 qcow2: Factor out handle_copied()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
83baa9a471 qcow2: Clean up handle_alloc()
Things can be simplified a bit now. No semantic changes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
c37f4cd71d qcow2: Finalise interface of handle_alloc()
The interface works completely on a byte granularity now and duplicated
parameters are removed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
3b8e2e260c qcow2: handle_alloc(): Get rid of keep_clusters parameter
handle_alloc() is now called with the offset at which the actual new
allocation starts instead of the offset at which the whole write request
starts, part of which may already be processed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
f5bc635094 qcow2: handle_alloc(): Get rid of nb_clusters parameter
We already communicate the same information in *bytes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
10f0ed8b2f qcow2: Factor out handle_alloc()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
037689d896 qcow2: Decouple cluster allocation from cluster reuse code
This moves some code that prepares the allocation of new clusters to
where the actual allocation happens. This is the minimum required to be
able to move it to a separate function in the next patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
65eb2e35c0 qcow2: Change handle_dependency to byte granularity
This is a more precise description of what really constitutes a
dependency. The behaviour doesn't change at this point because the COW
area of the old request is still aligned to cluster boundaries and
therefore an overlap is detected wheneven the requests touch any part of
the same cluster.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
d9d74f4177 qcow2: Improve check for overlapping allocations
The old code detected an overlapping allocation even when the
allocations didn't actually overlap, but were only adjacent.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:43 +01:00
Kevin Wolf
17a71e5823 qcow2: Handle dependencies earlier
Handling overlapping allocations isn't just a detail of cluster
allocation. It is rather one of three ways to get the host cluster
offset for a write request:

1. If a request overlaps an in-flight allocations, the cluster offset
   can be taken from there (this is what handle_dependencies will evolve
   into) or the request must just wait until the allocation has
   completed. Accessing the L2 is not valid in this case, it has
   outdated information.

2. Outside overlapping areas, check the clusters that can be written to
   as they are, with no COW involved.

3. If a COW is required, allocate new clusters

Changing the code to reflect this doesn't change the behaviour because
overlaps cannot exist for clusters that are kept in step 2. It does
however make it easier for later patches to work on clusters that belong
to an allocation that is still in flight.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Kevin Wolf
9ee6439e27 qcow2: Remove bogus unlock of s->lock
The unlock wakes up the next coroutine, but the currently running
coroutine will lock it again before it yields, so this doesn't make a
lot of sense.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Kevin Wolf
c349ca4bb2 qcow2: Fix "total clusters" number in bdrv_check
This should be based on the virtual disk size, not on the size of the
image.

Interesting observation: With some VM state stored in the image file,
percentages higher than 100% are possible, even though snapshots
themselves are ignored. This is a qcow2 bug to be fixed another day: The
VM state should be discarded in the active L2 tables after completing
the snapshot creation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-28 11:52:42 +01:00
Stefan Weil
ea804cadf8 block: Add options QDict to bdrv_file_open() prototypes (fix MinGW build)
The new parameter is unused yet.

This part was missing in commit 787e4a8500.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-25 09:53:04 +01:00
Liu Yuan
d43731c758 rbd: fix compile error
Commit 787e4a85 [block: Add options QDict to bdrv_file_open() prototypes] didn't
update rbd.c accordingly.

Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2013-03-25 09:51:43 +01:00
Kevin Wolf
681e7ad024 nbd: Check against invalid option combinations
A file name may only specified if no host or socket path is specified.
The latter two may not appear at the same time either.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
bebbf7fa9c nbd: Use default port if only host is specified
The URL method already takes care to apply the default port when none is
specfied. Directly specifying driver-specific options required the port
number until now. Allow leaving it out and apply the default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
f5866fa438 block: Make find_image_format safe with NULL filename
In order to achieve this, the .bdrv_probe callbacks of all drivers must
cope with this. The DMG driver is the only one that bases its decision
on the filename and it needs to be changed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
6963a30d82 block: Introduce .bdrv_parse_filename callback
If a driver needs structured data and not just a string, it can provide
a .bdrv_parse_filename callback now that parses the command line string
into separate options. Keeping this separate from .bdrv_open_filename
ensures that the preferred way of directly specifying the options always
works as well if parsing the string works.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
f53a1febcd nbd: Accept -drive options for the network connection
The existing parsers for the file name now parse everything into the
bdrv_open() options QDict. Instead of using these parsers, you can now
directly specify the options on the command line, like this:

    qemu-system-x86_64 -drive file=nbd:,file.port=1234,file.host=::1

Clearly the file=... part could use further improvement, but it's a
start.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:32 +01:00
Kevin Wolf
f17c90bed1 nbd: Keep hostname and port separate
The NBD block supports an URL syntax, for which a URL parser returns
separate hostname and port fields. It also supports the traditional qemu
syntax encoded in a filename. Until now, after parsing the URL to get
each piece of information, a new string is built to be fed to socket
functions.

Instead of building a string in the URL case that is immediately parsed
again, parse the string in both cases and use the QemuOpts interface to
qemu-sockets.c.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:31 +01:00
Kevin Wolf
787e4a8500 block: Add options QDict to bdrv_file_open() prototypes
The new parameter is unused yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2013-03-22 17:51:31 +01:00
Kevin Wolf
acdfb480ba qcow2: Fix segfault in qcow2_invalidate_cache
Need to pass an options QDict to qcow2_open() now. This fixes a segfault
on the migration target with qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-19 11:48:36 +01:00
Liu Yuan
fca23f0ad2 sheepdog: show error message for halt status
Sheepdog (neither quorum nor unsafe mode) will refuse to serve IO requests when
number of alive nodes is less than that of copies specified by users. This will
return 0x19 to QEMU client which currently doesn't recognize it.

This patch adds an error description when QEMU client receives it, other than
plainly printing 'Invalid error code'

Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Liu Yuan <tailai.ly@taobao.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2013-03-19 11:48:36 +01:00
Stefan Hajnoczi
c4d9d19645 threadpool: drop global thread pool
Now that each AioContext has a ThreadPool and the main loop AioContext
can be fetched with bdrv_get_aio_context(), we can eliminate the concept
of a global thread pool from thread-pool.c.

The submit functions must take a ThreadPool* argument.

block/raw-posix.c and block/raw-win32.c use
aio_get_thread_pool(bdrv_get_aio_context(bs)) to fetch the main loop's
ThreadPool.

tests/test-thread-pool.c must be updated to reflect the new
thread_pool_submit() function prototypes.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2013-03-15 16:07:51 +01:00