Use Neon instructions to perform zero checking of
buffer. This is helps in reducing total migration time.
Use case: Idle VM live migration with 4 VCPUS and 8GB ram
running CentOS 7.
Without Neon, the Total migration time is 3.5 Sec
Migration status: completed
total time: 3560 milliseconds
downtime: 33 milliseconds
setup: 5 milliseconds
transferred ram: 297907 kbytes
throughput: 685.76 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2062760 pages
skipped: 0 pages
normal: 69808 pages
normal bytes: 279232 kbytes
dirty sync count: 3
With Neon, the total migration time is 2.9 Sec
Migration status: completed
total time: 2960 milliseconds
downtime: 65 milliseconds
setup: 4 milliseconds
transferred ram: 299869 kbytes
throughput: 830.19 mbps
remaining ram: 0 kbytes
total ram: 8519872 kbytes
duplicate: 2064313 pages
skipped: 0 pages
normal: 70294 pages
normal bytes: 281176 kbytes
dirty sync count: 3
Signed-off-by: Vijaya Kumar K <vijayak@cavium.com>
Signed-off-by: Suresh <ksuresh@cavium.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1467190029-694-2-git-send-email-vijayak@cavium.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the avx2 primitives during the test, thus making sure that the
compiler and assembler could actually use avx2.
This also detects the failure case on gcc 4.8.x with -save-temps
and avoids the need for the gcc version check in cutils.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1465557378-24105-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
buffer_find_nonzero_offset() is a hot function during live migration.
Now it use SSE2 instructions for optimization. For platform supports
AVX2 instructions, use AVX2 instructions for optimization can help
to improve the performance of buffer_find_nonzero_offset() about 30%
comparing to SSE2.
Live migration can be faster with this optimization, the test result
shows that for an 8GiB RAM idle guest just boots, this patch can help
to shorten the total live migration time about 6%.
This patch use the ifunc mechanism to select the proper function when
running, for platform supports AVX2, execute the AVX2 instructions,
else, execute the original instructions.
Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1457416397-26671-3-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-6-git-send-email-peter.maydell@linaro.org
Not only it makes sense, but it gets rid of checkpatch warning:
WARNING: consider using qemu_strtosz in preference to strtosz
Also remove get rid of tabs to please checkpatch.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1442419377-9309-1-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJV8Tk7AAoJEL/70l94x66Ds3QH/3bi0RRR2NtKIXAQrGo5tfuD
NPMu1K5Hy+/26AC6mEVNRh4kh7dPH5E4NnDGbxet1+osvmpjxAjc2JrxEybhHD0j
fkpzqynuBN6cA2Gu5GUNoKzxxTmi2RrEYigWDZqCftRXBeO2Hsr1etxJh9UoZw5H
dgpU3j/n0Q8s08jUJ1o789knZI/ckwL4oXK4u2KhSC7ZTCWhJT7Qr7c0JmiKReaF
JEYAsKkQhICVKRVmC8NxML8U58O8maBjQ62UN6nQpVaQd0Yo/6cstFTZsRrHMHL3
7A2Tyg862cMvp+1DOX3Bk02yXA+nxnzLF8kUe0rYo6llqDBDStzqyn1j9R0qeqA=
=nB06
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Support for jemalloc
* qemu_mutex_lock_iothread "No such process" fix
* cutils: qemu_strto* wrappers
* iohandler.c simplification
* Many other fixes and misc patches.
And some MTTCG work (with Emilio's fixes squashed):
* Signal-free TCG kick
* Removing spinlock in favor of QemuMutex
* User-mode emulation multi-threading fixes/docs
# gpg: Signature made Thu 10 Sep 2015 09:03:07 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (44 commits)
cutils: work around platform differences in strto{l,ul,ll,ull}
cpu-exec: fix lock hierarchy for user-mode emulation
exec: make mmap_lock/mmap_unlock globally available
tcg: comment on which functions have to be called with mmap_lock held
tcg: add memory barriers in page_find_alloc accesses
remove unused spinlock.
replace spinlock by QemuMutex.
cpus: remove tcg_halt_cond and tcg_cpu_thread globals
cpus: protect work list with work_mutex
scripts/dump-guest-memory.py: fix after RAMBlock change
configure: Add support for jemalloc
add macro file for coccinelle
configure: factor out adding disas configure
vhost-scsi: fix wrong vhost-scsi firmware path
checkpatch: remove tests that are not relevant outside the kernel
checkpatch: adapt some tests to QEMU
CODING_STYLE: update mixed declaration rules
qmp: Add example usage of strto*l() qemu wrapper
cutils: Add qemu_strtoull() wrapper
cutils: Add qemu_strtoll() wrapper
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Linux returns 0 if no conversion was made, while OS X and presumably
the BSDs return EINVAL. The OS X convention rejects more invalid
inputs, so convert to it and adjust the test case.
Windows returns 1 from strtoul and strtoull (instead of -1) for
negative out-of-range input; fix it up.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add wrapper for strtoull() function. Include unit tests.
Signed-off-by: Carlos L. Torres <carlos.torres@rackspace.com>
Message-Id: <e0f0f611c9a81f3c29f451d0b17d755dfab1e90a.1437346779.git.carlos.torres@rackspace.com>
[Use uint64_t in prototype. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add wrapper for strtoll() function. Include unit tests.
Signed-off-by: Carlos L. Torres <carlos.torres@rackspace.com>
Message-Id: <7454a6bb9ec03b629e8beb4f109dd30dc2c9804c.1437346779.git.carlos.torres@rackspace.com>
[Use int64_t in prototype, since that's what QEMU uses. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add wrapper for strtoul() function. Include unit tests.
Signed-off-by: Carlos L. Torres <carlos.torres@rackspace.com>
Message-Id: <9621b4ae8e35fded31c715c2ae2a98f904f07ad0.1437346779.git.carlos.torres@rackspace.com>
[Fix tests for 32-bit build. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add wrapper for strtol() function. Include unit tests.
Signed-off-by: Carlos L. Torres <carlos.torres@rackspace.com>
Message-Id: <07199f1c0ff3892790c6322123aee1e92f580550.1437346779.git.carlos.torres@rackspace.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the pow2floor() function is now used in a hot code path,
make it inline; for consistency, provide pow2ceil() as an inline
function too.
Because these functions use ctz64() we have to put the inline
versions into host-utils.h, so they have access to ctz64(),
and move the inline is_power_of_2() along with them.
We then need to include host-utils.h from qemu-common.h so that
the files which use these functions via qemu-common.h still have
access to them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1437741192-20955-7-git-send-email-peter.maydell@linaro.org
Use VEC_OR macro for operations on VECTYPE operands
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Message-Id: <3f62d7a3a265f7dd99e50d016a0333a99a4a082a.1435062067.git.atar4qemu@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid truncation of a 64-bit long to a 32-bit int, and check for errno
(especially ERANGE).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We already have pow2floor, mirror it and use instead of a function with
similar results (same in used domain), to clarify our intent.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
qemu_parse_fd() used to handle at least the following strings incorrectly:
o "-2": simply let through
o "2147483648": returned as LONG_MAX==INT_MAX on ILP32 (with ERANGE
ignored); implementation-defined behavior on LP64
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
This adds a helper to format ethernet MAC address.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Introduces a new utility function: parse_debug_env to avoid code
duplication.
This overrides whatever debug value is set on the corresponding devices
from the command line, and is meant to ease the usage with any
management stack. For libvirt you can set environment variables by
extending the dom namespace, i.e:
<domain type='kvm' id='3' xmlns:qemu='http://libvirt.org/schemas/domain/qemu/1.0'>
<qemu:commandline>
<qemu:env name='QEMU_CCID_PASSTHRU_DEBUG' value='4'/>
<qemu:env name='QEMU_CCID_DEBUG' value='4'/>
</qemu:commandline>
</domain>
Signed-off-by: Alon Levy <alevy@redhat.com>
Reviewed-by: Marc-André Lureau <mlureau@redhat.com>
performance gain on SSE2 is approx. 20-25%. altivec
is not tested. performance for unsigned long arithmetic
is unchanged.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
optimized function that searches for non-zero content in a
buffer.
the function starts full unrolling only after the first few chunks have
been checked one by one. analyzing real memory page data has revealed
that non-zero pages are non-zero within the first 256-512 bits in
most cases. as this function is also heavily used to check for zero memory
pages this tweak has been made to avoid the high setup costs of the fully
unrolled check for non-zero pages.
due to the optimizations used in the function there are restrictions
on buffer address and search length. the function
can_use_buffer_find_nonzero_content() can be used to check if
the function can be used safely.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
There are lots of duplicate parsing code using strto*() in QEMU, and
most of that code is broken in one way or another. Even the visitors
code have duplicate integer parsing code[1]. This introduces functions
to help parsing unsigned int values: parse_uint() and parse_uint_full().
Parsing functions for signed ints and floats will be submitted later.
parse_uint_full() has all the checks made by opts_type_uint64() at
opts-visitor.c:
- Check for NULL (returns -EINVAL)
- Check for negative numbers (returns -EINVAL)
- Check for empty string (returns -EINVAL)
- Check for overflow or other errno values set by strtoll() (returns
-errno)
- Check for end of string (reject invalid characters after number)
(returns -EINVAL)
parse_uint() does everything above except checking for the end of the
string, so callers can continue parsing the remainder of string after
the number.
Unit tests included.
[1] string-input-visitor.c:parse_int() could use the same parsing code
used by opts-visitor.c:opts_type_int(), instead of duplicating that
logic.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>