Commit graph

1848 commits

Author SHA1 Message Date
Stefan Hajnoczi
9352f80cd9 coroutine: reserve 5,000 mappings
Daniel P. Berrangé <berrange@redhat.com> pointed out that the coroutine
pool size heuristic is very conservative. Instead of halving
max_map_count, he suggested reserving 5,000 mappings for non-coroutine
users based on observations of guests he has access to.

Fixes: 86a637e481 ("coroutine: cap per-thread local pool size")
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20240320181232.1464819-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-03-21 13:14:30 -04:00
Stefan Hajnoczi
86a637e481 coroutine: cap per-thread local pool size
The coroutine pool implementation can hit the Linux vm.max_map_count
limit, causing QEMU to abort with "failed to allocate memory for stack"
or "failed to set up stack guard page" during coroutine creation.

This happens because per-thread pools can grow to tens of thousands of
coroutines. Each coroutine causes 2 virtual memory areas to be created.
Eventually vm.max_map_count is reached and memory-related syscalls fail.
The per-thread pool sizes are non-uniform and depend on past coroutine
usage in each thread, so it's possible for one thread to have a large
pool while another thread's pool is empty.

Switch to a new coroutine pool implementation with a global pool that
grows to a maximum number of coroutines and per-thread local pools that
are capped at hardcoded small number of coroutines.

This approach does not leave large numbers of coroutines pooled in a
thread that may not use them again. In order to perform well it
amortizes the cost of global pool accesses by working in batches of
coroutines instead of individual coroutines.

The global pool is a list. Threads donate batches of coroutines to when
they have too many and take batches from when they have too few:

.-----------------------------------.
| Batch 1 | Batch 2 | Batch 3 | ... | global_pool
`-----------------------------------'

Each thread has up to 2 batches of coroutines:

.-------------------.
| Batch 1 | Batch 2 | per-thread local_pool (maximum 2 batches)
`-------------------'

The goal of this change is to reduce the excessive number of pooled
coroutines that cause QEMU to abort when vm.max_map_count is reached
without losing the performance of an adequately sized coroutine pool.

Here are virtio-blk disk I/O benchmark results:

      RW BLKSIZE IODEPTH    OLD    NEW CHANGE
randread      4k       1 113725 117451 +3.3%
randread      4k       8 192968 198510 +2.9%
randread      4k      16 207138 209429 +1.1%
randread      4k      32 212399 215145 +1.3%
randread      4k      64 218319 221277 +1.4%
randread    128k       1  17587  17535 -0.3%
randread    128k       8  17614  17616 +0.0%
randread    128k      16  17608  17609 +0.0%
randread    128k      32  17552  17553 +0.0%
randread    128k      64  17484  17484 +0.0%

See files/{fio.sh,test.xml.j2} for the benchmark configuration:
https://gitlab.com/stefanha/virt-playbooks/-/tree/coroutine-pool-fix-sizing

Buglink: https://issues.redhat.com/browse/RHEL-28947
Reported-by: Sanjay Rao <srao@redhat.com>
Reported-by: Boaz Ben Shabat <bbenshab@redhat.com>
Reported-by: Joe Mario <jmario@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240318183429.1039340-1-stefanha@redhat.com>
2024-03-19 10:49:31 -04:00
Paolo Bonzini
44a90c0875 oslib-posix: fix memory leak in touch_all_pages
touch_all_pages() can return early, before creating threads.  In this case,
however, it leaks the MemsetContext that it has allocated at the
beginning of the function.

Reported by Coverity as CID 1534922.

Fixes: 04accf43df ("oslib-posix: initialize backend memory objects in parallel", 2024-02-06)
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-03-08 15:51:22 +01:00
Steve Sistare
be19d836cd notify: pass error to notifier with return
Pass an error object as the third parameter to "notifier with return"
notifiers, so clients no longer need to bundle an error object in the
opaque data.  The new parameter is used in a later patch.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Link: https://lore.kernel.org/r/1708622920-68779-2-git-send-email-steven.sistare@oracle.com
Signed-off-by: Peter Xu <peterx@redhat.com>
2024-02-28 11:31:28 +08:00
Markus Armbruster
4edb196e20 qapi: Improve documentation of file descriptor socket addresses
SocketAddress branch @fd is documented in enum SocketAddressType,
unlike the other branches.  That's because the branch's type is String
from common.json.

Use a local copy of String, so we can put the documentation in the
usual place.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240205074709.3613229-14-armbru@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
2024-02-12 10:04:32 +01:00
Peter Maydell
9e34f127f4 * Emulate CVB, CVBY, CVBG and CVDG s390x instructions
* Fix bug in lsi53c895a reentrancy counter
 * Deprecate the "power5+" and "power7+" CPU names
 * Fix problems in the freebsd VM test
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmXCCXURHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbXtEA/9HKWMHbWqDAdlrpmfW8lCFaBHgV0+Fqsy
 GlxJykni2BxIWNoR7J6SdAqbgx3E2/7i8IMIUwYXlNBjEs/UQ0ZcnI5k6OfUS24p
 qfbdH717SgsaB9R1vCBhmOGGWYBfe/RqPGIcni/eg+jSxB5cn2XvEv3+ZBckvDsh
 KFuuAa6vvuBVhyXLbkP8Z+LEe27ttIYi5v1dvJ1an4UbFESqxVb0knyuFYpZpY8Y
 h7dZ0hyCid7YT03zVmSADK7anO+epBdzUU3SsKXj2dB9nebSjmkav6lQQBKYHHUg
 THojcWKwFPNK0AojhBuBCqFYgkGGt/9kjwlUt7jfm1TcSemN65XLNYHThRekPuAJ
 Jcze8dcEerbj1xsNWYh4hPvB92laEiyVR5BYFfUkJ9m2IAamPQLHvOT7jzhC3Y9k
 4wvVcf9QKVtKW0QO54SQjD4A/qQu/4777oH5w83nGuxjUthmHDqZmjDlIRe6lKJt
 gsA+mKn+w9HrtiXOSkoMhK8PAyvCoAef/N7kvHZoHmp6TtfQAjPs4/v2uZMpnd60
 z7Cw50giHpo9lmiZ1Ey2fQvw9orYhNoXAc4XfYGHuYdQFWpCGz1PB2Km8uTPTEUe
 as364ULBqWoFBCRuRndy2+z2e3zhK5THTPCAyHf48M6teMEPa4KTsTCk7MzmfVfx
 C8RsLcmrFPI=
 =eQNc
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2024-02-06' of https://gitlab.com/thuth/qemu into staging

* Emulate CVB, CVBY, CVBG and CVDG s390x instructions
* Fix bug in lsi53c895a reentrancy counter
* Deprecate the "power5+" and "power7+" CPU names
* Fix problems in the freebsd VM test

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmXCCXURHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbXtEA/9HKWMHbWqDAdlrpmfW8lCFaBHgV0+Fqsy
# GlxJykni2BxIWNoR7J6SdAqbgx3E2/7i8IMIUwYXlNBjEs/UQ0ZcnI5k6OfUS24p
# qfbdH717SgsaB9R1vCBhmOGGWYBfe/RqPGIcni/eg+jSxB5cn2XvEv3+ZBckvDsh
# KFuuAa6vvuBVhyXLbkP8Z+LEe27ttIYi5v1dvJ1an4UbFESqxVb0knyuFYpZpY8Y
# h7dZ0hyCid7YT03zVmSADK7anO+epBdzUU3SsKXj2dB9nebSjmkav6lQQBKYHHUg
# THojcWKwFPNK0AojhBuBCqFYgkGGt/9kjwlUt7jfm1TcSemN65XLNYHThRekPuAJ
# Jcze8dcEerbj1xsNWYh4hPvB92laEiyVR5BYFfUkJ9m2IAamPQLHvOT7jzhC3Y9k
# 4wvVcf9QKVtKW0QO54SQjD4A/qQu/4777oH5w83nGuxjUthmHDqZmjDlIRe6lKJt
# gsA+mKn+w9HrtiXOSkoMhK8PAyvCoAef/N7kvHZoHmp6TtfQAjPs4/v2uZMpnd60
# z7Cw50giHpo9lmiZ1Ey2fQvw9orYhNoXAc4XfYGHuYdQFWpCGz1PB2Km8uTPTEUe
# as364ULBqWoFBCRuRndy2+z2e3zhK5THTPCAyHf48M6teMEPa4KTsTCk7MzmfVfx
# C8RsLcmrFPI=
# =eQNc
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 06 Feb 2024 10:27:01 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-02-06' of https://gitlab.com/thuth/qemu:
  meson: Link with libinotify on FreeBSD
  test-util-filemonitor: Adapt to the FreeBSD inotify rename semantics
  tests/vm/freebsd: Reload the sshd configuration
  tests/vm: Set UseDNS=no in the sshd configuration
  target/s390x: Prefer fast cpu_env() over slower CPU QOM cast macro
  tests/tcg/s390x: Test CONVERT TO BINARY
  tests/tcg/s390x: Test CONVERT TO DECIMAL
  target/s390x: Emulate CVB, CVBY and CVBG
  target/s390x: Emulate CVDG
  docs/about: Deprecate the old "power5+" and "power7+" CPU names
  target/ppc/cpu-models: Rename power5+ and power7+ for new QOM naming rules
  hw/scsi/lsi53c895a: add missing decrement of reentrancy counter

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-02-08 11:59:28 +00:00
Ilya Leoshkevich
a1a9800e97 meson: Link with libinotify on FreeBSD
make vm-build-freebsd fails with:

    ld: error: undefined symbol: inotify_init1
    >>> referenced by filemonitor-inotify.c:183 (../src/util/filemonitor-inotify.c:183)
    >>>               util_filemonitor-inotify.c.o:(qemu_file_monitor_new) in archive libqemuutil.a

On FreeBSD the inotify functions are defined in libinotify.so. Add it
to the dependencies.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-ID: <20240206002344.12372-5-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-02-06 10:27:50 +01:00
Mark Kanda
04accf43df oslib-posix: initialize backend memory objects in parallel
QEMU initializes preallocated backend memory as the objects are parsed from
the command line. This is not optimal in some cases (e.g. memory spanning
multiple NUMA nodes) because the memory objects are initialized in series.

Allow the initialization to occur in parallel (asynchronously). In order to
ensure optimal thread placement, asynchronous initialization requires prealloc
context threads to be in use.

Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
Message-ID: <20240131165327.3154970-2-mark.kanda@oracle.com>
Tested-by: Mario Casquero <mcasquer@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
2024-02-06 08:15:22 +01:00
Peter Maydell
493bc2dbc1 misc: Clean up includes
This commit was created with scripts/clean-includes:
 ./scripts/clean-includes --git misc net/af-xdp.c plugins/*.c audio/pwaudio.c util/userfaultfd.c

All .c should include qemu/osdep.h first.  The script performs three
related cleanups:

* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c  already includes
  it.  Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
  Drop these, too.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Zhao Liu <zhao1.liu@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2024-01-30 21:20:20 +03:00
Markus Armbruster
b0b1313eb2 qapi: Fix dangling references to docs/devel/qapi-code-gen.txt
Conversion of docs/devel/qapi-code-gen.txt to ReST left several
dangling references behind.  Fix them to point to
docs/devel/qapi-code-gen.rst.

Fixes: f7aa076dbd (docs: convert qapi-code-gen.txt to ReST)
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-ID: <20240120095327.666239-4-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2024-01-26 07:04:53 +01:00
Peter Maydell
5bab95dc74 * Test timeout fixes
* Clean up URI code
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmWw6SsRHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbXsVQ//Ss33GMIu1aUEFsZTSUghUXPx8035zin/
 TugiIcLfcONxxCi+Q/jfUPowJ3TLwt0vdv3V73M94+XBDrWClLyJYuu8eew0EMZI
 zqBl5AyO2hdGXxnF/wJAtdKfleUElJDooUyGPIlsJ2gXmmLi60qkQfKR8dGl3h2r
 fLM36LVsWWtM3HaCePHlHYaYdfy917w4bNWJRf/QfBqSMX5F5mlU+EvzEFLBTkT/
 4HCaYhE1ouQnudO+rvuK78I72BgXgaPTn2oCXVdBvbEM+36heJyhYRDCW4ncf5QN
 PH8UQUih/NrU9BSrLT3aHE3VcYWzik7s8A4Nkg21bHYHhXstO/KKzhUU5//wOUp5
 BV+mwjwTxpnOAFqmgQuvH8rTx/YuXCpdkNdoLd41VX8Qa4DP1AjBWAC6LrJkDq51
 2PIKqMPjSsBaXd/itBKBFzY7JkDRLFUZQMk78l/JjFuhvhE8OfpBPtCofgYo9/OE
 cn9khZ6Oh9zxzZWb9YIdHiu4v1VP0ZtGfB0Zt4WIi2oBm3ql6+cHFkVcssaEIiNQ
 h5tI/xLviUIIRMIPpu7W+WSZBHt+w6wjBlu3O5fjoPSoHQsmNg2S9mS9+AQ2/KGJ
 4/78/Pg4XpKVd2MSLMQ6A2LlI1iQd51TV0aTqrzd/DdZYP3TBXdasQPR/WZN4eWw
 kYwt0bA5FGs=
 =1N9B
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2024-01-24' of https://gitlab.com/thuth/qemu into staging

* Test timeout fixes
* Clean up URI code

# -----BEGIN PGP SIGNATURE-----
#
# iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmWw6SsRHHRodXRoQHJl
# ZGhhdC5jb20ACgkQLtnXdP5wLbXsVQ//Ss33GMIu1aUEFsZTSUghUXPx8035zin/
# TugiIcLfcONxxCi+Q/jfUPowJ3TLwt0vdv3V73M94+XBDrWClLyJYuu8eew0EMZI
# zqBl5AyO2hdGXxnF/wJAtdKfleUElJDooUyGPIlsJ2gXmmLi60qkQfKR8dGl3h2r
# fLM36LVsWWtM3HaCePHlHYaYdfy917w4bNWJRf/QfBqSMX5F5mlU+EvzEFLBTkT/
# 4HCaYhE1ouQnudO+rvuK78I72BgXgaPTn2oCXVdBvbEM+36heJyhYRDCW4ncf5QN
# PH8UQUih/NrU9BSrLT3aHE3VcYWzik7s8A4Nkg21bHYHhXstO/KKzhUU5//wOUp5
# BV+mwjwTxpnOAFqmgQuvH8rTx/YuXCpdkNdoLd41VX8Qa4DP1AjBWAC6LrJkDq51
# 2PIKqMPjSsBaXd/itBKBFzY7JkDRLFUZQMk78l/JjFuhvhE8OfpBPtCofgYo9/OE
# cn9khZ6Oh9zxzZWb9YIdHiu4v1VP0ZtGfB0Zt4WIi2oBm3ql6+cHFkVcssaEIiNQ
# h5tI/xLviUIIRMIPpu7W+WSZBHt+w6wjBlu3O5fjoPSoHQsmNg2S9mS9+AQ2/KGJ
# 4/78/Pg4XpKVd2MSLMQ6A2LlI1iQd51TV0aTqrzd/DdZYP3TBXdasQPR/WZN4eWw
# kYwt0bA5FGs=
# =1N9B
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 24 Jan 2024 10:40:43 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2024-01-24' of https://gitlab.com/thuth/qemu:
  util/uri: Remove unused macros ISA_RESERVED() and ISA_GEN_DELIM()
  util/uri: Remove the uri_string_escape() function
  util/uri: Remove unused functions uri_resolve() and uri_resolve_relative()
  util/uri: Remove uri_string_unescape()
  tests/qtest: Bump timeouts of boot_sector_test()-based tests to 610 seconds
  tests/unit/test-iov: Fix timeout problem on NetBSD and OpenBSD
  tests/qtest: Bump timeout of the boot-serial-test to 360 seconds

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-01-25 12:33:42 +00:00
Thomas Huth
e7b991451e util/uri: Remove unused macros ISA_RESERVED() and ISA_GEN_DELIM()
They are not used anywhere, so there's no need to keep them around.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240123182247.432642-5-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-01-24 09:54:05 +01:00
Thomas Huth
8fd466737c util/uri: Remove the uri_string_escape() function
Now that uri_resolve_relative() has been removed, this function is not
used in QEMU anymore - and if somebody needs this functionality, they
can simply use g_uri_escape_string() from the glib instead.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240123182247.432642-4-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-01-24 09:54:05 +01:00
Thomas Huth
fdd16f16f4 util/uri: Remove unused functions uri_resolve() and uri_resolve_relative()
These rather complex functions have never been used since they've been
introduced in 2012, so looks like they are not really useful for QEMU.
And since the static normalize_uri_path() function is also only used by
uri_resolve(), we can remove that function now, too.

Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: "Daniel P. Berrangé" <berrange@redhat.com>
Message-ID: <20240123182247.432642-3-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-01-24 09:54:05 +01:00
Thomas Huth
7536acb426 util/uri: Remove uri_string_unescape()
uri_string_unescape() basically does the same as the glib function
g_uri_unescape_segment(). So we can get rid of our implementation
completely by simply using the glib function instead.

Suggested-by: Stefan Weil <sw@weilnetz.de>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Message-ID: <20240123182247.432642-2-thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2024-01-24 09:54:05 +01:00
Akihiko Odaki
d9945ccda0 coroutine-ucontext: Save fake stack for pooled coroutine
Coroutine may be pooled even after COROUTINE_TERMINATE if
CONFIG_COROUTINE_POOL is enabled and fake stack should be saved in
such a case to keep AddressSanitizerUseAfterReturn working. Even worse,
I'm seeing stack corruption without fake stack being saved.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20240117-asan-v2-1-26f9e1ea6e72@daynix.com>
2024-01-22 11:00:12 -05:00
Peter Maydell
3f2a357b95 HW core patch queue
. Deprecate unmaintained SH-4 models (Samuel)
 . HPET: Convert DPRINTF calls to trace events (Daniel)
 . Implement buffered block writes in Intel PFlash (Gerd)
 . Ignore ELF loadable segments with zero size (Bin)
 . ESP/NCR53C9x: PCI DMA fixes (Mark)
 . PIIX: Simplify Xen PCI IRQ routing (Bernhard)
 . Restrict CPU 'start-powered-off' property to sysemu (Phil)
 
 . target/alpha: Only build sys_helper.c on system emulation (Phil)
 . target/xtensa: Use generic instruction breakpoint API & add test (Max)
 . Restrict icount to system emulation (Phil)
 . Do not set CPUState TCG-specific flags in non-TCG accels (Phil)
 . Cleanup TCG tb_invalidate API (Phil)
 . Correct LoongArch/KVM include path (Bibo)
 . Do not ignore throttle errors in crypto backends (Phil)
 
 . MAINTAINERS updates (Raphael, Zhao)
 -----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmWqXbkACgkQ4+MsLN6t
 wN6VVBAAkP/Bs2JfQYobPZVV868wceM97KeUJMXP2YWf6dSLpHRCQN5KtuJcACM9
 y3k3R7nMeVJSGmzl/1gF1G9JhjoCLoVLX/ejeBppv4Wq//9sEdggaQfdCwkhWw2o
 IK/gPjTZpimE7Er4hPlxmuhSRuM1MX4duKFRRfuZpE7XY14Y7/Hk12VIG7LooO0x
 2Sl8CaU0DN7CWmRVDoUkwVx7JBy28UVarRDsgpBim7oKmjjBFnCJkH6B6NJXEiYr
 z1BmIcHa87S09kG1ek+y8aZpG9iPC7nUWjPIQyJGhnfrnBuO7hQHwCLIjHHp5QBR
 BoMr8YQNTI34/M/D8pBfg96LrGDjkQOfwRyRddkMP/jJcNPMAPMNGbfVaIrfij1e
 T+jFF4gQenOvy1XKCY3Uk/a11P3tIRFBEeOlzzQg4Aje9W2MhUNwK2HTlRfBbrRr
 V30R764FDmHlsyOu6/E3jqp4GVCgryF1bglPOBjVEU5uytbQTP8jshIpGVnxBbF+
 OpFwtsoDbsousNKVcO5+B0mlHcB9Ru9h11M5/YD/jfLMk95Ga90JGdgYpqQ5tO5Y
 aqQhKfCKbfgKuKhysxpsdWAwHZzVrlSf+UrObF0rl2lMXXfcppjCqNaw4QJ0oedc
 DNBxTPcCE2vWhUzP3A60VH7jLh4nLaqSTrxxQKkbx+Je1ERGrxs=
 =KmQh
 -----END PGP SIGNATURE-----

Merge tag 'hw-cpus-20240119' of https://github.com/philmd/qemu into staging

HW core patch queue

. Deprecate unmaintained SH-4 models (Samuel)
. HPET: Convert DPRINTF calls to trace events (Daniel)
. Implement buffered block writes in Intel PFlash (Gerd)
. Ignore ELF loadable segments with zero size (Bin)
. ESP/NCR53C9x: PCI DMA fixes (Mark)
. PIIX: Simplify Xen PCI IRQ routing (Bernhard)
. Restrict CPU 'start-powered-off' property to sysemu (Phil)

. target/alpha: Only build sys_helper.c on system emulation (Phil)
. target/xtensa: Use generic instruction breakpoint API & add test (Max)
. Restrict icount to system emulation (Phil)
. Do not set CPUState TCG-specific flags in non-TCG accels (Phil)
. Cleanup TCG tb_invalidate API (Phil)
. Correct LoongArch/KVM include path (Bibo)
. Do not ignore throttle errors in crypto backends (Phil)

. MAINTAINERS updates (Raphael, Zhao)

# -----BEGIN PGP SIGNATURE-----
#
# iQIzBAABCAAdFiEE+qvnXhKRciHc/Wuy4+MsLN6twN4FAmWqXbkACgkQ4+MsLN6t
# wN6VVBAAkP/Bs2JfQYobPZVV868wceM97KeUJMXP2YWf6dSLpHRCQN5KtuJcACM9
# y3k3R7nMeVJSGmzl/1gF1G9JhjoCLoVLX/ejeBppv4Wq//9sEdggaQfdCwkhWw2o
# IK/gPjTZpimE7Er4hPlxmuhSRuM1MX4duKFRRfuZpE7XY14Y7/Hk12VIG7LooO0x
# 2Sl8CaU0DN7CWmRVDoUkwVx7JBy28UVarRDsgpBim7oKmjjBFnCJkH6B6NJXEiYr
# z1BmIcHa87S09kG1ek+y8aZpG9iPC7nUWjPIQyJGhnfrnBuO7hQHwCLIjHHp5QBR
# BoMr8YQNTI34/M/D8pBfg96LrGDjkQOfwRyRddkMP/jJcNPMAPMNGbfVaIrfij1e
# T+jFF4gQenOvy1XKCY3Uk/a11P3tIRFBEeOlzzQg4Aje9W2MhUNwK2HTlRfBbrRr
# V30R764FDmHlsyOu6/E3jqp4GVCgryF1bglPOBjVEU5uytbQTP8jshIpGVnxBbF+
# OpFwtsoDbsousNKVcO5+B0mlHcB9Ru9h11M5/YD/jfLMk95Ga90JGdgYpqQ5tO5Y
# aqQhKfCKbfgKuKhysxpsdWAwHZzVrlSf+UrObF0rl2lMXXfcppjCqNaw4QJ0oedc
# DNBxTPcCE2vWhUzP3A60VH7jLh4nLaqSTrxxQKkbx+Je1ERGrxs=
# =KmQh
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 19 Jan 2024 11:32:09 GMT
# gpg:                using RSA key FAABE75E12917221DCFD6BB2E3E32C2CDEADC0DE
# gpg: Good signature from "Philippe Mathieu-Daudé (F4BUG) <f4bug@amsat.org>" [full]
# Primary key fingerprint: FAAB E75E 1291 7221 DCFD  6BB2 E3E3 2C2C DEAD C0DE

* tag 'hw-cpus-20240119' of https://github.com/philmd/qemu: (36 commits)
  configure: Add linux header compile support for LoongArch
  MAINTAINERS: Update hw/core/cpu.c entry
  MAINTAINERS: Update Raphael Norwitz email
  hw/elf_ops: Ignore loadable segments with zero size
  hw/scsi/esp-pci: set DMA_STAT_BCMBLT when BLAST command issued
  hw/scsi/esp-pci: synchronise setting of DMA_STAT_DONE with ESP completion interrupt
  hw/scsi/esp-pci: generate PCI interrupt from separate ESP and PCI sources
  hw/scsi/esp-pci: use correct address register for PCI DMA transfers
  target/riscv: Rename tcg_cpu_FOO() to include 'riscv'
  target/i386: Rename tcg_cpu_FOO() to include 'x86'
  hw/s390x: Rename cpu_class_init() to include 'sclp'
  hw/core/cpu: Rename cpu_class_init() to include 'common'
  accel: Rename accel_init_ops_interfaces() to include 'system'
  cpus: Restrict 'start-powered-off' property to system emulation
  system/watchpoint: Move TCG specific code to accel/tcg/
  system/replay: Restrict icount to system emulation
  hw/pflash: implement update buffer for block writes
  hw/pflash: use ldn_{be,le}_p and stn_{be,le}_p
  hw/pflash: refactor pflash_data_write()
  hw/i386/pc_piix: Make piix_intx_routing_notifier_xen() more device independent
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-01-19 11:39:38 +00:00
Philippe Mathieu-Daudé
72c603f82f util/async: Only call icount_notify_exit() if icount is enabled
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231208113529.74067-6-philmd@linaro.org>
2024-01-19 12:28:59 +01:00
Paolo Bonzini
592d0bc030 remove unnecessary casts from uintptr_t
uintptr_t, or unsigned long which is equivalent on Linux I32LP64 systems,
is an unsigned type and there is no need to further cast to __u64 which is
another unsigned integer type; widening casts from unsigned integers
zero-extend the value.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2024-01-18 10:43:51 +01:00
Natanael Copa
1d513e06d9 util: fix build with musl libc on ppc64le
Use PPC_FEATURE2_ISEL and PPC_FEATURE2_VEC_CRYPTO from linux headers
instead of the GNU specific PPC_FEATURE2_HAS_ISEL and
PPC_FEATURE2_HAS_VEC_CRYPTO. This fixes build with musl libc.

Cc: qemu-stable@nongnu.org
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1861
Signed-off-by: Natanael Copa <ncopa@alpinelinux.org>
Fixes: 63922f467a ("tcg/ppc: Replace HAVE_ISEL macro with a variable")
Fixes: 68f340d4cd ("tcg/ppc: Enable Altivec detection")
Message-Id: <20231219105236.7059-1-ncopa@alpinelinux.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2024-01-11 08:48:16 +11:00
Philippe Mathieu-Daudé
995d8348eb util/fifo8: Introduce fifo8_peek_buf()
To be able to peek at FIFO content without popping it,
introduce the fifo8_peek_buf() method by factoring
common content from fifo8_pop_buf().

Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231109192814.95977-3-philmd@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2024-01-10 06:58:50 +00:00
Philippe Mathieu-Daudé
cd04033dbe util/fifo8: Allow fifo8_pop_buf() to not populate popped length
There might be cases where we know the number of bytes we can
pop from the FIFO, or we simply don't care how many bytes is
returned. Allow fifo8_pop_buf() to take a NULL numptr.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Francisco Iglesias <frasse.iglesias@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20231109192814.95977-2-philmd@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2024-01-10 06:58:50 +00:00
Stefan Hajnoczi
a4a411fbaf Replace "iothread lock" with "BQL" in comments
The term "iothread lock" is obsolete. The APIs use Big QEMU Lock (BQL)
in their names. Update the code comments to use "BQL" instead of
"iothread lock".

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Paul Durrant <paul@xen.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Message-id: 20240102153529.486531-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-01-08 10:45:43 -05:00
Stefan Hajnoczi
195801d700 system/cpus: rename qemu_mutex_lock_iothread() to bql_lock()
The Big QEMU Lock (BQL) has many names and they are confusing. The
actual QemuMutex variable is called qemu_global_mutex but it's commonly
referred to as the BQL in discussions and some code comments. The
locking APIs, however, are called qemu_mutex_lock_iothread() and
qemu_mutex_unlock_iothread().

The "iothread" name is historic and comes from when the main thread was
split into into KVM vcpu threads and the "iothread" (now called the main
loop thread). I have contributed to the confusion myself by introducing
a separate --object iothread, a separate concept unrelated to the BQL.

The "iothread" name is no longer appropriate for the BQL. Rename the
locking APIs to:
- void bql_lock(void)
- void bql_unlock(void)
- bool bql_locked(void)

There are more APIs with "iothread" in their names. Subsequent patches
will rename them. There are also comments and documentation that will be
updated in later patches.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Acked-by: Fabiano Rosas <farosas@suse.de>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Peter Xu <peterx@redhat.com>
Acked-by: Eric Farman <farman@linux.ibm.com>
Reviewed-by: Harsh Prateek Bora <harshpb@linux.ibm.com>
Acked-by: Hyman Huang <yong.huang@smartx.com>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-id: 20240102153529.486531-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2024-01-08 10:45:43 -05:00
Philippe Mathieu-Daudé
897a06c6d7 iothread: Remove unused Error** argument in aio_context_set_aio_params
aio_context_set_aio_params() doesn't use its undocumented
Error** argument. Remove it to simplify.

Note this removes a use of "unchecked Error**" in
iothread_set_aio_context_params().

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20231120171806.19361-1-philmd@linaro.org>
2024-01-08 10:45:34 -05:00
Philippe Mathieu-Daudé
b622ee98bf util/oslib: Have qemu_prealloc_mem() handler return a boolean
Following the example documented since commit e3fe3988d7 ("error:
Document Error API usage rules"), have qemu_prealloc_mem()
return a boolean indicating whether an error is set or not.

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Gavin Shan <gshan@redhat.com>
Message-Id: <20231120213301.24349-19-philmd@linaro.org>
2024-01-05 16:20:15 +01:00
Peter Maydell
05470c3979 * configure: use a native non-cross compiler for linux-user
* meson: cleanups
 * target/i386: miscellaneous cleanups and optimizations
 * target/i386: implement CMPccXADD
 * target/i386: the sgx_epc_get_section stub is reachable
 * esp: check for NULL result from scsi_device_find()
 -----BEGIN PGP SIGNATURE-----
 
 iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWRImYUHHBib256aW5p
 QHJlZGhhdC5jb20ACgkQv/vSX3jHroNd7AgAgcyJGiMfUkXqhefplpm06RDXQIa8
 FuoJqPb21lO75DQKfaFRAc4xGLagjJROMJGHMm9HvMu2VlwvOydkQlfFRspENxQ/
 5XzGdb/X0A7HA/mwUfnMB1AZx0Vs32VI5IBSc6acc9fmgeZ84XQEoM3KBQHUik7X
 mSkE4eltR9gJ+4IaGo4voZtK+YoVD8nEcuqmnKihSPWizev0FsZ49aNMtaYa9qC/
 Xs3kiQd/zPibHDHJu0ulFsNZgxtUcvlLHTCf8gO4dHWxCFLXGubMush83McpRtNB
 Qoh6cTLH+PBXfrxMR3zmTZMNvo8Euls3s07Y8TkNP4vdIIE/kMeMDW1wJw==
 =mq30
 -----END PGP SIGNATURE-----

Merge tag 'for-upstream' of https://gitlab.com/bonzini/qemu into staging

* configure: use a native non-cross compiler for linux-user
* meson: cleanups
* target/i386: miscellaneous cleanups and optimizations
* target/i386: implement CMPccXADD
* target/i386: the sgx_epc_get_section stub is reachable
* esp: check for NULL result from scsi_device_find()

# -----BEGIN PGP SIGNATURE-----
#
# iQFIBAABCAAyFiEE8TM4V0tmI4mGbHaCv/vSX3jHroMFAmWRImYUHHBib256aW5p
# QHJlZGhhdC5jb20ACgkQv/vSX3jHroNd7AgAgcyJGiMfUkXqhefplpm06RDXQIa8
# FuoJqPb21lO75DQKfaFRAc4xGLagjJROMJGHMm9HvMu2VlwvOydkQlfFRspENxQ/
# 5XzGdb/X0A7HA/mwUfnMB1AZx0Vs32VI5IBSc6acc9fmgeZ84XQEoM3KBQHUik7X
# mSkE4eltR9gJ+4IaGo4voZtK+YoVD8nEcuqmnKihSPWizev0FsZ49aNMtaYa9qC/
# Xs3kiQd/zPibHDHJu0ulFsNZgxtUcvlLHTCf8gO4dHWxCFLXGubMush83McpRtNB
# Qoh6cTLH+PBXfrxMR3zmTZMNvo8Euls3s07Y8TkNP4vdIIE/kMeMDW1wJw==
# =mq30
# -----END PGP SIGNATURE-----
# gpg: Signature made Sun 31 Dec 2023 08:12:22 GMT
# gpg:                using RSA key F13338574B662389866C7682BFFBD25F78C7AE83
# gpg:                issuer "pbonzini@redhat.com"
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>" [full]
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>" [full]
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* tag 'for-upstream' of https://gitlab.com/bonzini/qemu: (46 commits)
  meson.build: report graphics backends separately
  configure, meson: rename targetos to host_os
  meson: rename config_all
  meson: remove CONFIG_ALL
  meson: remove config_targetos
  meson: remove CONFIG_POSIX and CONFIG_WIN32 from config_targetos
  meson: remove OS definitions from config_targetos
  meson: always probe u2f and canokey if the option is enabled
  meson: move subdirs to "Collect sources" section
  meson: move config-host.h definitions together
  meson: move CFI detection code with other compiler flags
  meson: keep subprojects together
  meson: move accelerator dependency checks together
  meson: move option validation together
  meson: move program checks together
  meson: add more sections to main meson.build
  configure: unify again the case arms in probe_target_compiler
  configure: remove unnecessary subshell
  Makefile: clean qemu-iotests output
  meson: use version_compare() to compare version
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2024-01-04 19:55:20 +00:00
Paolo Bonzini
d0cda6f461 configure, meson: rename targetos to host_os
This variable is about the host OS, not the target.  It is used a lot
more since the Meson conversion, but the original sin dates back to 2003.
Time to fix it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:29 +01:00
Paolo Bonzini
dc4954943d meson: remove CONFIG_POSIX and CONFIG_WIN32 from config_targetos
For consistency with other OSes, use if...endif for rules that are
target-independent.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Paolo Bonzini
53e8868d69 meson: remove OS definitions from config_targetos
CONFIG_DARWIN, CONFIG_LINUX and CONFIG_BSD are used in some rules, but
only CONFIG_LINUX has substantial use.  Convert them all to if...endif.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-12-31 09:11:28 +01:00
Richard Henderson
59c2ddedcb util/fifo8: Constify VMState
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20231221031652.119827-70-richard.henderson@linaro.org>
2023-12-30 07:38:06 +11:00
Stefan Hajnoczi
9f8d2fdcce aio: remove aio_context_acquire()/aio_context_release() API
Delete these functions because nothing calls these functions anymore.

I introduced these APIs in commit 98563fc3ec ("aio: add
aio_context_acquire() and aio_context_release()") in 2014. It's with a
sigh of relief that I delete these APIs almost 10 years later.

Thanks to Paolo Bonzini's vision for multi-queue QEMU, we got an
understanding of where the code needed to go in order to remove the
limitations that the original dataplane and the IOThread/AioContext
approach that followed it.

Emanuele Giuseppe Esposito had the splendid determination to convert
large parts of the codebase so that they no longer needed the AioContext
lock. This was a painstaking process, both in the actual code changes
required and the iterations of code review that Emanuele eked out of
Kevin and me over many months.

Kevin Wolf tackled multitudes of graph locking conversions to protect
in-flight I/O from run-time changes to the block graph as well as the
clang Thread Safety Analysis annotations that allow the compiler to
check whether the graph lock is being used correctly.

And me, well, I'm just here to add some pizzazz to the QEMU multi-queue
block layer :). Thank you to everyone who helped with this effort,
including Eric Blake, code reviewer extraordinaire, and others who I've
forgotten to mention.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-ID: <20231205182011.1976568-11-stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21 22:49:27 +01:00
Stefan Hajnoczi
b49f4755c7 block: remove AioContext locking
This is the big patch that removes
aio_context_acquire()/aio_context_release() from the block layer and
affected block layer users.

There isn't a clean way to split this patch and the reviewers are likely
the same group of people, so I decided to do it in one patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paul Durrant <paul@xen.org>
Message-ID: <20231205182011.1976568-7-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21 22:49:27 +01:00
Stefan Hajnoczi
b5f4fda4fb aio: make aio_context_acquire()/aio_context_release() a no-op
aio_context_acquire()/aio_context_release() has been replaced by
fine-grained locking to protect state shared by multiple threads. The
AioContext lock still plays the role of balancing locking in
AIO_WAIT_WHILE() and many functions in QEMU either require that the
AioContext lock is held or not held for this reason. In other words, the
AioContext lock is purely there for consistency with itself and serves
no real purpose anymore.

Stop actually acquiring/releasing the lock in
aio_context_acquire()/aio_context_release() so that subsequent patches
can remove callers across the codebase incrementally.

I have performed "make check" and qemu-iotests stress tests across
x86-64, ppc64le, and aarch64 to confirm that there are no failures as a
result of eliminating the lock.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231205182011.1976568-5-stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-12-21 22:49:27 +01:00
Yi Liu
d6b5c4c1b5 util/char_dev: Add open_cdev()
/dev/vfio/devices/vfioX may not exist. In that case it is still possible
to open /dev/char/$major:$minor instead. Add helper function to abstract
the cdev open.

Suggested-by: Jason Gunthorpe <jgg@nvidia.com>
Signed-off-by: Yi Liu <yi.l.liu@intel.com>
Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Nicolin Chen <nicolinc@nvidia.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-12-19 19:03:38 +01:00
Kevin Wolf
411132c979 export/vhost-user-blk: Fix consecutive drains
The vhost-user-blk export implement AioContext switches in its drain
implementation. This means that on drain_begin, it detaches the server
from its AioContext and on drain_end, attaches it again and schedules
the server->co_trip coroutine in the updated AioContext.

However, nothing guarantees that server->co_trip is even safe to be
scheduled. Not only is it unclear that the coroutine is actually in a
state where it can be reentered externally without causing problems, but
with two consecutive drains, it is possible that the scheduled coroutine
didn't have a chance yet to run and trying to schedule an already
scheduled coroutine a second time crashes with an assertion failure.

Following the model of NBD, this commit makes the vhost-user-blk export
shut down server->co_trip during drain so that resuming the export means
creating and scheduling a new coroutine, which is always safe.

There is one exception: If the drain call didn't poll (for example, this
happens in the context of bdrv_graph_wrlock()), then the coroutine
didn't have a chance to shut down. However, in this case the AioContext
can't have changed; changing the AioContext always involves a polling
drain. So in this case we can simply assert that the AioContext is
unchanged and just leave the coroutine running or wake it up if it has
yielded to wait for the AioContext to be attached again.

Fixes: e1054cd4aa
Fixes: https://issues.redhat.com/browse/RHEL-1708
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-ID: <20231127115755.22846-1-kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-11-28 14:56:32 +01:00
Michael Tokarev
f779357882 util/range.c: spelling fix: inbetween
Fixes: b439595a08 "range: Introduce range_inverse_array()"
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Cédric Le Goater <clg@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-11-15 12:06:05 +03:00
Michael Tokarev
f0dbe427ec util/filemonitor-inotify.c: spelling fix: kenel
Fixes: 2e12dd405c "util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow"
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-11-15 12:06:05 +03:00
Stefan Hajnoczi
74949263a5 util: Add cpuinfo for loongarch64
tcg/loongarch64: Use cpuinfo.h
 tcg/loongarch64: Improve register allocation for INDEX_op_qemu_ld_a*_i128
 host/include/loongarch64: Add atomic16 load and store
 tcg: Move expanders out of line
 tcg/mips: Always implement movcond
 tcg/mips: Implement neg opcodes
 tcg/loongarch64: Implement neg opcodes
 tcg: Make movcond and neg required opcodes
 tcg: Optimize env memory operations
 tcg: Canonicalize sub of immediate to add
 tcg/sparc64: Implement tcg_out_extrl_i64_i32
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmVJpT0dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9i7QgAtjxUB3y/caCPp0Me
 3cXYtpL1vNxx+cTESGMlmIRSji+cEOxYSpnY0itxXcKpcwP8Au8eoTe85NxyIllg
 2R/SA2jlmrmiipI+bwb0UBCy+BzUfMgmegA88K2W22J0fetwIy19PN9ORmYdLiYE
 /pWNFOSPzhYEJgOw7V2MwciUv3llolMOfxU7VT4oVaCknZRsyaGUwl4uTT4GdPuK
 p29O9nziyKDmNTqJ9SKKll5bzwCMAgkn2lUcMGf+rpl7ZxjgvysUYrGXKmOnj4Uu
 eCU2d3ZHoSspcYEjbFASlyPd7z5apGI8Iq2K35FUhURFPv06Su/bIGOOD4ujP2Qp
 vc/bFQ==
 =Mvaf
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20231106' of https://gitlab.com/rth7680/qemu into staging

util: Add cpuinfo for loongarch64
tcg/loongarch64: Use cpuinfo.h
tcg/loongarch64: Improve register allocation for INDEX_op_qemu_ld_a*_i128
host/include/loongarch64: Add atomic16 load and store
tcg: Move expanders out of line
tcg/mips: Always implement movcond
tcg/mips: Implement neg opcodes
tcg/loongarch64: Implement neg opcodes
tcg: Make movcond and neg required opcodes
tcg: Optimize env memory operations
tcg: Canonicalize sub of immediate to add
tcg/sparc64: Implement tcg_out_extrl_i64_i32

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmVJpT0dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9i7QgAtjxUB3y/caCPp0Me
# 3cXYtpL1vNxx+cTESGMlmIRSji+cEOxYSpnY0itxXcKpcwP8Au8eoTe85NxyIllg
# 2R/SA2jlmrmiipI+bwb0UBCy+BzUfMgmegA88K2W22J0fetwIy19PN9ORmYdLiYE
# /pWNFOSPzhYEJgOw7V2MwciUv3llolMOfxU7VT4oVaCknZRsyaGUwl4uTT4GdPuK
# p29O9nziyKDmNTqJ9SKKll5bzwCMAgkn2lUcMGf+rpl7ZxjgvysUYrGXKmOnj4Uu
# eCU2d3ZHoSspcYEjbFASlyPd7z5apGI8Iq2K35FUhURFPv06Su/bIGOOD4ujP2Qp
# vc/bFQ==
# =Mvaf
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 07 Nov 2023 10:47:25 HKT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20231106' of https://gitlab.com/rth7680/qemu: (35 commits)
  tcg/sparc64: Implement tcg_out_extrl_i64_i32
  tcg/optimize: Canonicalize sub2 with constants to add2
  tcg/optimize: Canonicalize subi to addi during optimization
  tcg: Canonicalize subi to addi during opcode generation
  tcg/optimize: Split out arg_new_constant
  tcg: Eliminate duplicate env store operations
  tcg/optimize: Optimize env memory operations
  tcg/optimize: Split out cmp_better_copy
  tcg/optimize: Pipe OptContext into reset_ts
  tcg: Don't free vector results
  tcg: Remove TCG_TARGET_HAS_neg_{i32,i64}
  tcg/loongarch64: Implement neg opcodes
  tcg/mips: Implement neg opcodes
  tcg: Remove TCG_TARGET_HAS_movcond_{i32,i64}
  tcg/mips: Always implement movcond
  tcg/mips: Split out tcg_out_setcond_int
  tcg: Move tcg_temp_free_* out of line
  tcg: Move tcg_temp_new_*, tcg_global_mem_new_* out of line
  tcg: Move tcg_constant_* out of line
  tcg: Unexport tcg_gen_op*_{i32,i64}
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-07 11:05:37 +08:00
Stefan Hajnoczi
f6b174ff96 target-arm queue:
* hw/arm/virt: fix PMU IRQ registration
  * hw/arm/virt: Report correct register sizes in ACPI DBG2/SPCR tables
  * hw/i386/intel_iommu: vtd_slpte_nonzero_rsvd(): assert no overflow
  * util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow
  * mc146818rtc: rtc_set_time(): initialize tm to zeroes
  * block/nvme: nvme_process_completion() fix bound for cid
  * hw/core/loader: gunzip(): initialize z_stream
  * io/channel-socket: qio_channel_socket_flush(): improve msg validation
  * hw/arm/vexpress-a9: Remove useless mapping of RAM at address 0
  * target/arm: Fix A64 LDRA immediate decode
 -----BEGIN PGP SIGNATURE-----
 
 iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmVJBtUZHHBldGVyLm1h
 eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3qYTEACYqLV57JezgRFXzMEwKX3l
 9IYbFje+lGemobdJOEHhRvXjCNb+5TwhEfQasri0FBzokw16S3WOOF7roGb6YOU1
 od1SGiS2AbrmiazlBpamVO8z0WAEgbnXIoQa/3xKAGPJXszD2zK+06KnXS5xuCuD
 nHojzIx7Gv4HEIs4huY39/YL2HMaxrqvXC8IAu51eqY+TPnETT+WI3HxlZ2OMIsn
 1Jnn+FeZfA1bhKx4JsD9MyHM1ovbjOwYkHOlzjU6fmTFFPGKRy0nxnjMNCBcXHQ+
 unemc/9BhEFup76tkX+JIlSBrPre5Mnh93DsGKSapwKPKq+fQhUDmzXY2r3OvQZX
 ryxO4PJkCNTM1wZU6GeEDPWVfhgBKHUMv+tr9Mf9iBlyXRsmXLSEl7AFUUaFlgAL
 dSMyiAaUlfvGa7Gtta9eFAJ/GeaiuJu2CYq6lvtRrNIHflLm3gVCef8gmwM5Eqxm
 3PNzEoabKyQQfz69j9RCLpoutMBq1sg2IzxW8UjAFupugcIABjLf0Sl11qA0/B89
 YX67B0ynQD9ajI2GS8ULid/tvEiJVgdZ2Ua3U3xpG54vKG1/54EUiCP8TtoIuoMy
 bKg8AU9EIPN962PxoAwS+bSSdCu7/zBjVpg4T/zIzWRdgSjRsE21Swu5Ca934ng5
 VpVUuiwtI/zvHgqaiORu+w==
 =UbqJ
 -----END PGP SIGNATURE-----

Merge tag 'pull-target-arm-20231106' of https://git.linaro.org/people/pmaydell/qemu-arm into staging

target-arm queue:
 * hw/arm/virt: fix PMU IRQ registration
 * hw/arm/virt: Report correct register sizes in ACPI DBG2/SPCR tables
 * hw/i386/intel_iommu: vtd_slpte_nonzero_rsvd(): assert no overflow
 * util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow
 * mc146818rtc: rtc_set_time(): initialize tm to zeroes
 * block/nvme: nvme_process_completion() fix bound for cid
 * hw/core/loader: gunzip(): initialize z_stream
 * io/channel-socket: qio_channel_socket_flush(): improve msg validation
 * hw/arm/vexpress-a9: Remove useless mapping of RAM at address 0
 * target/arm: Fix A64 LDRA immediate decode

# -----BEGIN PGP SIGNATURE-----
#
# iQJNBAABCAA3FiEE4aXFk81BneKOgxXPPCUl7RQ2DN4FAmVJBtUZHHBldGVyLm1h
# eWRlbGxAbGluYXJvLm9yZwAKCRA8JSXtFDYM3qYTEACYqLV57JezgRFXzMEwKX3l
# 9IYbFje+lGemobdJOEHhRvXjCNb+5TwhEfQasri0FBzokw16S3WOOF7roGb6YOU1
# od1SGiS2AbrmiazlBpamVO8z0WAEgbnXIoQa/3xKAGPJXszD2zK+06KnXS5xuCuD
# nHojzIx7Gv4HEIs4huY39/YL2HMaxrqvXC8IAu51eqY+TPnETT+WI3HxlZ2OMIsn
# 1Jnn+FeZfA1bhKx4JsD9MyHM1ovbjOwYkHOlzjU6fmTFFPGKRy0nxnjMNCBcXHQ+
# unemc/9BhEFup76tkX+JIlSBrPre5Mnh93DsGKSapwKPKq+fQhUDmzXY2r3OvQZX
# ryxO4PJkCNTM1wZU6GeEDPWVfhgBKHUMv+tr9Mf9iBlyXRsmXLSEl7AFUUaFlgAL
# dSMyiAaUlfvGa7Gtta9eFAJ/GeaiuJu2CYq6lvtRrNIHflLm3gVCef8gmwM5Eqxm
# 3PNzEoabKyQQfz69j9RCLpoutMBq1sg2IzxW8UjAFupugcIABjLf0Sl11qA0/B89
# YX67B0ynQD9ajI2GS8ULid/tvEiJVgdZ2Ua3U3xpG54vKG1/54EUiCP8TtoIuoMy
# bKg8AU9EIPN962PxoAwS+bSSdCu7/zBjVpg4T/zIzWRdgSjRsE21Swu5Ca934ng5
# VpVUuiwtI/zvHgqaiORu+w==
# =UbqJ
# -----END PGP SIGNATURE-----
# gpg: Signature made Mon 06 Nov 2023 23:31:33 HKT
# gpg:                using RSA key E1A5C593CD419DE28E8315CF3C2525ED14360CDE
# gpg:                issuer "peter.maydell@linaro.org"
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>" [full]
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>" [full]
# gpg:                 aka "Peter Maydell <peter@archaic.org.uk>" [unknown]
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* tag 'pull-target-arm-20231106' of https://git.linaro.org/people/pmaydell/qemu-arm:
  target/arm: Fix A64 LDRA immediate decode
  hw/arm/vexpress-a9: Remove useless mapping of RAM at address 0
  io/channel-socket: qio_channel_socket_flush(): improve msg validation
  hw/core/loader: gunzip(): initialize z_stream
  block/nvme: nvme_process_completion() fix bound for cid
  mc146818rtc: rtc_set_time(): initialize tm to zeroes
  util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow
  hw/i386/intel_iommu: vtd_slpte_nonzero_rsvd(): assert no overflow
  tests/qtest/bios-tables-test: Update virt SPCR and DBG2 golden references
  hw/arm/virt: Report correct register sizes in ACPI DBG2/SPCR tables.
  tests/qtest/bios-tables-test: Allow changes to virt SPCR and DBG2
  hw/arm/virt: fix PMU IRQ registration

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2023-11-07 09:42:07 +08:00
Richard Henderson
0885f1221e util: Add cpuinfo for loongarch64
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Jiajie Chen <c@jia.je>
Message-Id: <20230916220151.526140-4-richard.henderson@linaro.org>
2023-11-06 08:27:21 -08:00
Vladimir Sementsov-Ogievskiy
2e12dd405c util/filemonitor-inotify: qemu_file_monitor_watch(): assert no overflow
Prefer clear assertions instead of [im]possible array overflow.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Maksim Davydov <davydov-max@yandex-team.ru>
Message-id: 20231017125941.810461-3-vsementsov@yandex-team.ru
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-11-06 15:00:27 +00:00
Cédric Le Goater
721da0396c util/uuid: Add UUID_STR_LEN definition
qemu_uuid_unparse() includes a trailing NUL when writing the uuid
string and the buffer size should be UUID_FMT_LEN + 1 bytes. Add a
define for this size and use it where required.

Cc: Fam Zheng <fam@euphon.net>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: "Denis V. Lunev" <den@openvz.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-03 09:20:31 +01:00
Eric Auger
b439595a08 range: Introduce range_inverse_array()
This helper reverses a list of regions within a [low, high]
span, turning original regions into holes and original
holes into actual regions, covering the whole UINT64_MAX span.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Reviewed-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-03 09:20:31 +01:00
Eric Auger
c310484736 util/reserved-region: Add new ReservedRegion helpers
Introduce resv_region_list_insert() helper which inserts
a new ReservedRegion into a sorted list of reserved region.
In case of overlap, the new region has higher priority and
hides the existing overlapped segments. If the overlap is
partial, new regions are created for parts which are not
overlapped. The new region has higher priority independently
on the type of the regions.

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Jean-Philippe Brucker <jean-philippe@linaro.org>
Tested-by: Yanghang Liu <yanghliu@redhat.com>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-03 09:20:31 +01:00
Eric Auger
43f04cbeff range: Make range_compare() public
Let's expose range_compare() in the header so that it can be
reused outside of util/range.c

Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Cédric Le Goater <clg@redhat.com>
2023-11-03 09:20:31 +01:00
Stefan Hajnoczi
84d61e5f36 virtio: use defer_call() in virtio_irqfd_notify()
virtio-blk and virtio-scsi invoke virtio_irqfd_notify() to send Used
Buffer Notifications from an IOThread. This involves an eventfd
write(2) syscall. Calling this repeatedly when completing multiple I/O
requests in a row is wasteful.

Use the defer_call() API to batch together virtio_irqfd_notify() calls
made during thread pool (aio=threads), Linux AIO (aio=native), and
io_uring (aio=io_uring) completion processing.

Behavior is unchanged for emulated devices that do not use
defer_call_begin()/defer_call_end() since defer_call() immediately
invokes the callback when called outside a
defer_call_begin()/defer_call_end() region.

fio rw=randread bs=4k iodepth=64 numjobs=8 IOPS increases by ~9% with a
single IOThread and 8 vCPUs. iodepth=1 decreases by ~1% but this could
be noise. Detailed performance data and configuration specifics are
available here:
https://gitlab.com/stefanha/virt-playbooks/-/tree/blk_io_plug-irqfd

This duplicates the BH that virtio-blk uses for batching. The next
commit will remove it.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-4-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 15:42:14 +01:00
Stefan Hajnoczi
433fcea40c util/defer-call: move defer_call() to util/
The networking subsystem may wish to use defer_call(), so move the code
to util/ where it can be reused.

As a reminder of what defer_call() does:

This API defers a function call within a defer_call_begin()/defer_call_end()
section, allowing multiple calls to batch up. This is a performance
optimization that is used in the block layer to submit several I/O requests
at once instead of individually:

  defer_call_begin(); <-- start of section
  ...
  defer_call(my_func, my_obj); <-- deferred my_func(my_obj) call
  defer_call(my_func, my_obj); <-- another
  defer_call(my_func, my_obj); <-- another
  ...
  defer_call_end(); <-- end of section, my_func(my_obj) is called once

Suggested-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-ID: <20230913200045.1024233-3-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2023-10-31 15:41:42 +01:00
Akihiko Odaki
dfd3bb0a99 cutils: Fix get_relocated_path on Windows
get_relocated_path() did not have error handling for PathCchSkipRoot()
because a path given to get_relocated_path() was expected to be a valid
path containing a drive letter or UNC server/share path elements on
Windows, but sometimes it turned out otherwise.

The paths passed to get_relocated_path() are defined by macros generated
by Meson. Meson in turn uses a prefix given by the configure script to
generate them. For Windows, the script passes /qemu as a prefix to
Meson by default.

As documented in docs/about/build-platforms.rst, typically MSYS2 is used
for the build system, but it is also possible to use Linux as well. When
MSYS2 is used, its Bash variant recognizes /qemu as a MSYS2 path, and
converts it to a Windows path, adding the MSYS2 prefix including a drive
letter or UNC server/share path elements. Such a conversion does not
happen on a shell on Linux however, and /qemu will be passed as is in
the case.

Implement a proper error handling of PathCchSkipRoot() in
get_relocated_path() so that it can handle a path without a drive letter
or UNC server/share path elements.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-ID: <20231005064726.6945-1-akihiko.odaki@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2023-10-19 23:13:27 +02:00
Paolo Bonzini
655e2a778d meson, cutils: allow non-relocatable installs
Say QEMU is configured with bindir = "/usr/bin" and a firmware path
that starts with "/usr/share/qemu".  Ever since QEMU 5.2, QEMU's
install has been relocatable: if you move qemu-system-x86_64 from
/usr/bin to /home/username/bin, it will start looking for firmware in
/home/username/share/qemu.  Previously, you would get a non-relocatable
install where the moved QEMU will keep looking for firmware in
/usr/share/qemu.

Windows almost always wants relocatable installs, and in fact that
is why QEMU 5.2 introduced relocatability in the first place.
However, newfangled distribution mechanisms such as AppImage
(https://docs.appimage.org/reference/best-practices.html), and
possibly NixOS, also dislike using at runtime the absolute paths
that were established at build time.

On POSIX systems you almost never care; if you do, your usecase
dictates which one is desirable, so there's no single answer.
Obviously relocatability works fine most of the time, because not many
people have complained about QEMU's switch to relocatable install,
and that's why until now there was no way to disable relocatability.

But a non-relocatable, non-modular binary can help if you want to do
experiments with old firmware and new QEMU or vice versa (because you
can just upgrade/downgrade the firmware package, and use rpm2cpio or
similar to extract the QEMU binaries outside /usr), so allow both.
This patch allows one to build a non-relocatable install using a new
option to configure.  Why?  Because it's not too hard, and because
it helps the user double check the relocatability of their install.

Note that the same code that handles relocation also lets you run QEMU
from the build tree and pick e.g. firmware files from the source tree
transparently.  Therefore that part remains active with this patch,
even if you configure with --disable-relocatable.

Suggested-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Emmanouil Pitsidianakis <manos.pitsidianakis@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2023-10-18 10:00:57 +02:00