system/qemu - HydraGit

mirror of https://gitlab.com/qemu-project/qemu synced 2024-11-05 20:35:44 +00:00

Author	SHA1	Message	Date
Stefan Hajnoczi	06e0f098d6	io: follow coroutine AioContext in qio_channel_yield() The ongoing QEMU multi-queue block layer effort makes it possible for multiple threads to process I/O in parallel. The nbd block driver is not compatible with the multi-queue block layer yet because QIOChannel cannot be used easily from coroutines running in multiple threads. This series changes the QIOChannel API to make that possible. In the current API, calling qio_channel_attach_aio_context() sets the AioContext where qio_channel_yield() installs an fd handler prior to yielding: qio_channel_attach_aio_context(ioc, my_ctx); ... qio_channel_yield(ioc); // my_ctx is used here ... qio_channel_detach_aio_context(ioc); This API design has limitations: reading and writing must be done in the same AioContext and moving between AioContexts involves a cumbersome sequence of API calls that is not suitable for doing on a per-request basis. There is no fundamental reason why a QIOChannel needs to run within the same AioContext every time qio_channel_yield() is called. QIOChannel only uses the AioContext while inside qio_channel_yield(). The rest of the time, QIOChannel is independent of any AioContext. In the new API, qio_channel_yield() queries the AioContext from the current coroutine using qemu_coroutine_get_aio_context(). There is no need to explicitly attach/detach AioContexts anymore and qio_channel_attach_aio_context() and qio_channel_detach_aio_context() are gone. One coroutine can read from the QIOChannel while another coroutine writes from a different AioContext. This API change allows the nbd block driver to use QIOChannel from any thread. It's important to keep in mind that the block driver already synchronizes QIOChannel access and ensures that two coroutines never read simultaneously or write simultaneously. This patch updates all users of qio_channel_attach_aio_context() to the new API. Most conversions are simple, but vhost-user-server requires a new qemu_coroutine_yield() call to quiesce the vu_client_trip() coroutine when not attached to any AioContext. While the API is has become simpler, there is one wart: QIOChannel has a special case for the iohandler AioContext (used for handlers that must not run in nested event loops). I didn't find an elegant way preserve that behavior, so I added a new API called qio_channel_set_follow_coroutine_ctx(ioc, true\|false) for opting in to the new AioContext model. By default QIOChannel uses the iohandler AioHandler. Code that formerly called qio_channel_attach_aio_context() now calls qio_channel_set_follow_coroutine_ctx(ioc, true) once after the QIOChannel is created. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-ID: <20230830224802.493686-5-stefanha@redhat.com> [eblake: also fix migration/rdma.c] Signed-off-by: Eric Blake <eblake@redhat.com>	2023-09-07 20:32:11 -05:00
Stefan Hajnoczi	acd4be64b8	io: check there are no qio_channel_yield() coroutines during ->finalize() Callers must clean up their coroutines before calling object_unref(OBJECT(ioc)) to prevent an fd handler leak. Add an assertion to check this. This patch is preparation for the fd handler changes that follow. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-ID: <20230830224802.493686-4-stefanha@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2023-09-07 20:32:11 -05:00
Daniel P. Berrangé	10be627d2b	io: remove io watch if TLS channel is closed during handshake The TLS handshake make take some time to complete, during which time an I/O watch might be registered with the main loop. If the owner of the I/O channel invokes qio_channel_close() while the handshake is waiting to continue the I/O watch must be removed. Failing to remove it will later trigger the completion callback which the owner is not expecting to receive. In the case of the VNC server, this results in a SEGV as vnc_disconnect_start() tries to shutdown a client connection that is already gone / NULL. CVE-2023-3354 Reported-by: jiangyegen <jiangyegen@huawei.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2023-08-01 18:45:27 +01:00
Stefan Hajnoczi	60f782b6b7	aio: remove aio_disable_external() API All callers now pass is_external=false to aio_set_fd_handler() and aio_set_event_notifier(). The aio_disable_external() API that temporarily disables fd handlers that were registered is_external=true is therefore dead code. Remove aio_disable_external(), aio_enable_external(), and the is_external arguments to aio_set_fd_handler() and aio_set_event_notifier(). The entire test-fdmon-epoll test is removed because its sole purpose was testing aio_disable_external(). Parts of this patch were generated using the following coccinelle (https://coccinelle.lip6.fr/) semantic patch: @@ expression ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque; @@ - aio_set_fd_handler(ctx, fd, is_external, io_read, io_write, io_poll, io_poll_ready, opaque) + aio_set_fd_handler(ctx, fd, io_read, io_write, io_poll, io_poll_ready, opaque) @@ expression ctx, notifier, is_external, io_read, io_poll, io_poll_ready; @@ - aio_set_event_notifier(ctx, notifier, is_external, io_read, io_poll, io_poll_ready) + aio_set_event_notifier(ctx, notifier, io_read, io_poll, io_poll_ready) Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Message-Id: <20230516190238.8401-21-stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-30 17:37:26 +02:00
Kevin Wolf	7c1f51bf38	nbd/server: Fix drained_poll to wake coroutine in right AioContext nbd_drained_poll() generally runs in the main thread, not whatever iothread the NBD server coroutine is meant to run in, so it can't directly reenter the coroutines to wake them up. The code seems to have the right intention, it specifies the correct AioContext when it calls qemu_aio_coroutine_enter(). However, this functions doesn't schedule the coroutine to run in that AioContext, but it assumes it is already called in the home thread of the AioContext. To fix this, add a new thread-safe qio_channel_wake_read() that can be called in the main thread to wake up the coroutine in its AioContext, and use this in nbd_drained_poll(). Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Message-Id: <20230517152834.277483-3-kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2023-05-19 19:16:53 +02:00
Paolo Bonzini	1dd91b22a6	io: mark mixed functions that can suspend There should be no paths from a coroutine_fn to aio_poll, however in practice coroutine_mixed_fn will call aio_poll in the !qemu_in_coroutine() path. By marking mixed functions, we can track accurately the call paths that execute entirely in coroutine context, and find more missing coroutine_fn markers. This results in more accurate checks that coroutine code does not end up blocking. If the marking were extended transitively to all functions that call these ones, static analysis could be done much more efficiently. However, this is a start and makes it possible to use vrc's path-based searches to find potential bugs where coroutine_fns call blocking functions. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2023-04-20 11:17:35 +02:00
Peter Xu	86d063fa83	io: tls: Inherit QIO_CHANNEL_FEATURE_SHUTDOWN on server side TLS iochannel will inherit io_shutdown() from the master ioc, however we missed to do that on the server side. This will e.g. allow qemu_file_shutdown() to work on dest QEMU too for migration. Acked-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-04-12 21:19:05 +02:00
Matheus Tavares Bernardino	c3a2c84ae3	io/channel-tls: plug memory leakage on GSource This leakage can be seen through test-io-channel-tls: $ ../configure --target-list=aarch64-softmmu --enable-sanitizers $ make ./tests/unit/test-io-channel-tls $ ./tests/unit/test-io-channel-tls Indirect leak of 104 byte(s) in 1 object(s) allocated from: #0 0x7f81d1725808 in __interceptor_malloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:144 #1 0x7f81d135ae98 in g_malloc (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x57e98) #2 0x55616c5d4c1b in object_new_with_propv ../qom/object.c:795 #3 0x55616c5d4a83 in object_new_with_props ../qom/object.c:768 #4 0x55616c5c5415 in test_tls_creds_create ../tests/unit/test-io-channel-tls.c:70 #5 0x55616c5c5a6b in test_io_channel_tls ../tests/unit/test-io-channel-tls.c:158 #6 0x7f81d137d58d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a58d) Indirect leak of 32 byte(s) in 1 object(s) allocated from: #0 0x7f81d1725a06 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cc:153 #1 0x7f81d1472a20 in gnutls_dh_params_init (/lib/x86_64-linux-gnu/libgnutls.so.30+0x46a20) #2 0x55616c6485ff in qcrypto_tls_creds_x509_load ../crypto/tlscredsx509.c:634 #3 0x55616c648ba2 in qcrypto_tls_creds_x509_complete ../crypto/tlscredsx509.c:694 #4 0x55616c5e1fea in user_creatable_complete ../qom/object_interfaces.c:28 #5 0x55616c5d4c8c in object_new_with_propv ../qom/object.c:807 #6 0x55616c5d4a83 in object_new_with_props ../qom/object.c:768 #7 0x55616c5c5415 in test_tls_creds_create ../tests/unit/test-io-channel-tls.c:70 #8 0x55616c5c5a6b in test_io_channel_tls ../tests/unit/test-io-channel-tls.c:158 #9 0x7f81d137d58d (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x7a58d) ... SUMMARY: AddressSanitizer: 49143 byte(s) leaked in 184 allocation(s). The docs for `g_source_add_child_source(source, child_source)` says "source will hold a reference on child_source while child_source is attached to it." Therefore, we should unreference the child source at `qio_channel_tls_read_watch()` after attaching it to `source`. With this change, ./tests/unit/test-io-channel-tls shows no leakages. Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org> Signed-off-by: Matheus Tavares Bernardino <quic_mathbern@quicinc.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2023-03-14 13:41:21 +00:00
Marc-André Lureau	25657fc6c1	win32: replace closesocket() with close() wrapper Use a close() wrapper instead, so that we don't need to worry about closesocket() vs close() anymore, let's hope. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Message-Id: <20230221124802.4103554-17-marcandre.lureau@redhat.com>	2023-03-13 15:39:31 +04:00
Marc-André Lureau	abe34282b0	win32: avoid mixing SOCKET and file descriptor space Until now, a win32 SOCKET handle is often cast to an int file descriptor, as this is what other OS use for sockets. When necessary, QEMU eventually queries whether it's a socket with the help of fd_is_socket(). However, there is no guarantee of conflict between the fd and SOCKET space. Such conflict would have surprising consequences, we shouldn't mix them. Also, it is often forgotten that SOCKET must be closed with closesocket(), and not close(). Instead, let's make the win32 socket wrapper functions return and take a file descriptor, and let util/ wrappers do the fd/SOCKET conversion as necessary. A bit of adaptation is necessary in io/ as well. Unfortunately, we can't drop closesocket() usage, despite _open_osfhandle() documentation claiming transfer of ownership, testing shows bad behaviour if you forget to call closesocket(). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Message-Id: <20230221124802.4103554-15-marcandre.lureau@redhat.com>	2023-03-13 15:39:31 +04:00
Marc-André Lureau	a4aafea261	win32/socket: introduce qemu_socket_unselect() helper A more explicit version of qemu_socket_select() with no events. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Message-Id: <20230221124802.4103554-8-marcandre.lureau@redhat.com>	2023-03-13 15:23:37 +04:00
Marc-André Lureau	f5fd677ae7	win32/socket: introduce qemu_socket_select() helper This is a wrapper for WSAEventSelect, with Error handling. By default, it will produce a warning, so callers don't have to be modified now, and yet we can spot potential mis-use. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Berger <stefanb@linux.ibm.com> Message-Id: <20230221124802.4103554-7-marcandre.lureau@redhat.com>	2023-03-13 15:23:37 +04:00
Marc-André Lureau	651ccdfa8e	io: use closesocket() Because they are actually sockets... Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Thomas Huth <thuth@redhat.com> Message-Id: <20230221124802.4103554-4-marcandre.lureau@redhat.com>	2023-03-13 15:23:37 +04:00
Antoine Damhet	ffda5db65a	io/channel-tls: fix handling of bigger read buffers Since the TLS backend can read more data from the underlying QIOChannel we introduce a minimal child GSource to notify if we still have more data available to be read. Signed-off-by: Antoine Damhet <antoine.damhet@shadow.tech> Signed-off-by: Charles Frey <charles.frey@shadow.tech> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2023-02-15 11:01:04 -05:00
manish.mishra	84615a19dd	io: Add support for MSG_PEEK for socket channel MSG_PEEK peeks at the channel, The data is treated as unread and the next read shall still return this data. This support is currently added only for socket class. Extra parameter 'flags' is added to io_readv calls to pass extra read flags like MSG_PEEK. Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrange <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Suggested-by: Daniel P. Berrange <berrange@redhat.com> Signed-off-by: manish.mishra <manish.mishra@nutanix.com> Signed-off-by: Juan Quintela <quintela@redhat.com>	2023-02-06 19:22:56 +01:00
Bin Meng	23f77f05f2	io/channel-watch: Fix socket watch on Windows Random failure was observed when running qtests on Windows due to "Broken pipe" detected by qmp_fd_receive(). What happened is that the qtest executable sends testing data over a socket to the QEMU under test but no response is received. The errno of the recv() call from the qtest executable indicates ETIMEOUT, due to the qmp chardev's tcp_chr_read() is never called to receive testing data hence no response is sent to the other side. tcp_chr_read() is registered as the callback of the socket watch GSource. The reason of the callback not being called by glib, is that the source check fails to indicate the source is ready. There are two socket watch sources created to monitor the same socket event object from the char-socket backend in update_ioc_handlers(). During the source check phase, qio_channel_socket_source_check() calls WSAEnumNetworkEvents() to discover occurrences of network events for the indicated socket, clear internal network event records, and reset the event object. Testing shows that if we don't reset the event object by not passing the event handle to WSAEnumNetworkEvents() the symptom goes away and qtest runs very stably. It seems we don't need to call WSAEnumNetworkEvents() at all, as we don't parse the result of WSANETWORKEVENTS returned from this API. We use select() to poll the socket status. Fix this instability by dropping the WSAEnumNetworkEvents() call. Some side notes: During the testing, I removed the following codes in update_ioc_handlers(): remove_hup_source(s); s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP); g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup, chr, NULL); g_source_attach(s->hup_source, chr->gcontext); and such change also makes the symptom go away. And if I moved the above codes to the beginning, before the call to io_add_watch_poll(), the symptom also goes away. It seems two sources watching on the same socket event object is the key that leads to the instability. The order of adding a source watch seems to also play a role but I can't explain why. Hopefully a Windows and glib expert could explain this behavior. Signed-off-by: Bin Meng <bin.meng@windriver.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-26 13:32:08 +01:00
Bin Meng	6c822a031b	io/channel-watch: Drop the unnecessary cast There is no need to do a type cast on ssource->socket as it is already declared as a SOCKET. Suggested-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-26 13:32:08 +01:00
Bin Meng	985be62d44	io/channel-watch: Drop a superfluous '#ifdef WIN32' In the win32 version qio_channel_create_socket_watch() body there is no need to do a '#ifdef WIN32'. Signed-off-by: Bin Meng <bin.meng@windriver.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-10-26 13:32:08 +01:00
Marc-André Lureau	ec5b6c9c5d	io/command: implement support for win32 The initial implementation was changing the pipe state created by GLib to PIPE_NOWAIT, but it turns out it doesn't work (read/write returns an error). Since reading may return less than the requested amount, it seems to be non-blocking already. However, the IO operation may block until the FD is ready, I can't find good sources of information, to be safe we can just poll for readiness before. Alternatively, we could setup the FDs ourself, and use UNIX sockets on Windows, which can be used in blocking/non-blocking mode. I haven't tried it, as I am not sure it is necessary. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20221006113657.2656108-6-marcandre.lureau@redhat.com>	2022-10-12 19:22:01 +04:00
Marc-André Lureau	a95570e3e4	io/command: use glib GSpawn, instead of open-coding fork/exec Simplify qio_channel_command_new_spawn() with GSpawn API. This will allow to build for WIN32 in the following patches. As pointed out by Daniel Berrangé: there is a change in semantics here too. The current code only touches stdin/stdout/stderr. Any other FDs which do NOT have O_CLOEXEC set will be inherited. With the new code, all FDs except stdin/out/err will be explicitly closed, because we don't set the flag G_SPAWN_LEAVE_DESCRIPTORS_OPEN. The only place we use QIOChannelCommand today is the migration exec: protocol, and that is only declared to use stdin/stdout. Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20221006113657.2656108-5-marcandre.lureau@redhat.com>	2022-10-12 19:22:01 +04:00
Philippe Mathieu-Daudé	5e689840a1	io/channel-websock: Replace strlen(const_str) by sizeof(const_str) - 1 The combined_key[... QIO_CHANNEL_WEBSOCK_GUID_LEN ...] array in qio_channel_websock_handshake_send_res_ok() expands to a call to strlen(QIO_CHANNEL_WEBSOCK_GUID), and the compiler doesn't realize the string is const, so consider combined_key[] being a variable-length array. To remove the variable-length array, we provide it a hint to the compiler by using sizeof() - 1 instead of strlen(). Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Peter Maydell <peter.maydell@linaro.org> Message-id: 20220819153931.3147384-5-peter.maydell@linaro.org	2022-09-22 16:38:28 +01:00
Leonardo Bras	5258a7e2c0	QIOChannelSocket: Add support for MSG_ZEROCOPY + IPV6 For using MSG_ZEROCOPY, there are two steps: 1 - io_writev() the packet, which enqueues the packet for sending, and 2 - io_flush(), which gets confirmation that all packets got correctly sent Currently, if MSG_ZEROCOPY is used to send packets over IPV6, no error will be reported in (1), but it will fail in the first time (2) happens. This happens because (2) currently checks for cmsg_level & cmsg_type associated with IPV4 only, before reporting any error. Add checks for cmsg_level & cmsg_type associated with IPV6, and thus enable support for MSG_ZEROCOPY + IPV6 Fixes: `2bc58ffc29` ("QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX") Signed-off-by: Leonardo Bras <leobras@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2022-08-05 16:18:15 +01:00
Leonardo Bras	927f93e099	QIOChannelSocket: Fix zero-copy flush returning code 1 when nothing sent If flush is called when no buffer was sent with MSG_ZEROCOPY, it currently returns 1. This return code should be used only when Linux fails to use MSG_ZEROCOPY on a lot of sendmsg(). Fix this by returning early from flush if no sendmsg(...,MSG_ZEROCOPY) was attempted. Fixes: `2bc58ffc29` ("QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX") Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Message-Id: <20220711211112.18951-2-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-07-20 12:15:09 +01:00
Daniel P. Berrangé	87e4276449	io: add a QIOChannelNull equivalent to /dev/null This is for code which needs a portable equivalent to a QIOChannelFile connected to /dev/null. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-06-22 18:11:21 +01:00
Leonardo Bras	4f5a09714c	QIOChannelSocket: Fix zero-copy send so socket flush works Somewhere between v6 and v7 the of the zero-copy-send patchset a crucial part of the flushing mechanism got missing: incrementing zero_copy_queued. Without that, the flushing interface becomes a no-op, and there is no guarantee the buffer is really sent. This can go as bad as causing a corruption in RAM during migration. Fixes: `2bc58ffc29` ("QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX") Reported-by: 徐闯 <xuchuangxclwt@bytedance.com> Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-06-22 17:22:28 +01:00
Leonardo Bras	803ca43e4c	QIOChannelSocket: Introduce assert and reduce ifdefs to improve readability During implementation of MSG_ZEROCOPY feature, a lot of #ifdefs were introduced, particularly at qio_channel_socket_writev(). Rewrite some of those changes so it's easier to read. Also, introduce an assert to help detect incorrect zero-copy usage is when it's disabled on build. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Signed-off-by: Juan Quintela <quintela@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> dgilbert: Fixed up thinko'd g_assert_unreachable->g_assert_not_reached	2022-06-22 17:22:09 +01:00
Leonardo Bras	2bc58ffc29	QIOChannelSocket: Implement io_writev zero copy flag & io_flush for CONFIG_LINUX For CONFIG_LINUX, implement the new zero copy flag and the optional callback io_flush on QIOChannelSocket, but enables it only when MSG_ZEROCOPY feature is available in the host kernel, which is checked on qio_channel_socket_connect_sync() qio_channel_socket_flush() was implemented by counting how many times sendmsg(...,MSG_ZEROCOPY) was successfully called, and then reading the socket's error queue, in order to find how many of them finished sending. Flush will loop until those counters are the same, or until some error occurs. Notes on using writev() with QIO_CHANNEL_WRITE_FLAG_ZERO_COPY: 1: Buffer - As MSG_ZEROCOPY tells the kernel to use the same user buffer to avoid copying, some caution is necessary to avoid overwriting any buffer before it's sent. If something like this happen, a newer version of the buffer may be sent instead. - If this is a problem, it's recommended to call qio_channel_flush() before freeing or re-using the buffer. 2: Locked memory - When using MSG_ZERCOCOPY, the buffer memory will be locked after queued, and unlocked after it's sent. - Depending on the size of each buffer, and how often it's sent, it may require a larger amount of locked memory than usually available to non-root user. - If the required amount of locked memory is not available, writev_zero_copy will return an error, which can abort an operation like migration, - Because of this, when an user code wants to add zero copy as a feature, it requires a mechanism to disable it, so it can still be accessible to less privileged users. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20220513062836.965425-4-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-05-16 13:56:24 +01:00
Leonardo Bras	b88651cb4d	QIOChannel: Add flags on io_writev and introduce io_flush callback Add flags to io_writev and introduce io_flush as optional callback to QIOChannelClass, allowing the implementation of zero copy writes by subclasses. How to use them: - Write data using qio_channel_writev*(...,QIO_CHANNEL_WRITE_FLAG_ZERO_COPY), - Wait write completion with qio_channel_flush(). Notes: As some zero copy write implementations work asynchronously, it's recommended to keep the write buffer untouched until the return of qio_channel_flush(), to avoid the risk of sending an updated buffer instead of the buffer state during write. As io_flush callback is optional, if a subclass does not implement it, then: - io_flush will return 0 without changing anything. Also, some functions like qio_channel_writev_full_all() were adapted to receive a flag parameter. That allows shared code between zero copy and non-zero copy writev, and also an easier implementation on new flags. Signed-off-by: Leonardo Bras <leobras@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Reviewed-by: Peter Xu <peterx@redhat.com> Reviewed-by: Juan Quintela <quintela@redhat.com> Message-Id: <20220513062836.965425-3-leobras@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2022-05-16 13:56:24 +01:00
Marc-André Lureau	ff5927baa7	util: rename qemu_block() socket functions The qemu_block() functions are meant to be be used with sockets (the win32 implementation expects SOCKET) Over time, those functions where used with Win32 SOCKET or file-descriptors interchangeably. But for portability, they must only be used with socket-like file-descriptors. FDs can use g_unix_set_fd_nonblocking() instead. Rename the functions with "socket" in the name to prevent bad usages. This is effectively reverting commit `f9e8cacc55` ("oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()"). Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-05-03 15:53:20 +04:00
Marc-André Lureau	17fc124529	io: replace qemu_set{_non}block() Those calls are non-socket fd, or are POSIX-specific. Use the dedicated GLib API. (qemu_set_nonblock() is for socket-like) (this is a preliminary patch before renaming qemu_set_nonblock()) Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2022-05-03 15:52:25 +04:00
Marc-André Lureau	05e50e8fe5	io: make qio_channel_command_new_pid() static The function isn't used outside of qio_channel_command_new_spawn(), which is !win32-specific. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>	2022-05-03 15:47:59 +04:00
Marc-André Lureau	d640b59eb3	io: replace pipe() with g_unix_open_pipe(CLOEXEC) Suggested-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>	2022-05-03 15:47:25 +04:00
Marc-André Lureau	0f9668e0c1	Remove qemu-common.h include from most units Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20220323155743.1585078-33-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2022-04-06 14:31:55 +02:00
Marc-André Lureau	9edc6313da	Replace GCC_FMT_ATTR with G_GNUC_PRINTF One less qemu-specific macro. It also helps to make some headers/units only depend on glib, and thus moved in standalone projects eventually. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Richard W.M. Jones <rjones@redhat.com>	2022-03-22 14:40:51 +04:00
Marc-André Lureau	e7b7942822	Drop qemu_foo() socket API wrapper The socket API wrappers were initially introduced in commit `00aa0040` ("Wrap recv to avoid warnings"), but made redundant with commit `a2d96af4` ("osdep: add wrappers for socket functions") which fixes the win32 declarations and thus removed the earlier warnings. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>	2022-03-22 14:40:51 +04:00
Stefan Hajnoczi	826cc32423	aio-posix: split poll check from ready handler Adaptive polling measures the execution time of the polling check plus handlers called when a polled event becomes ready. Handlers can take a significant amount of time, making it look like polling was running for a long time when in fact the event handler was running for a long time. For example, on Linux the io_submit(2) syscall invoked when a virtio-blk device's virtqueue becomes ready can take 10s of microseconds. This can exceed the default polling interval (32 microseconds) and cause adaptive polling to stop polling. By excluding the handler's execution time from the polling check we make the adaptive polling calculation more accurate. As a result, the event loop now stays in polling mode where previously it would have fallen back to file descriptor monitoring. The following data was collected with virtio-blk num-queues=2 event_idx=off using an IOThread. Before: 168k IOPS, IOThread syscalls: 9837.115 ( 0.020 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 16, iocbpp: 0x7fcb9f937db0) = 16 9837.158 ( 0.002 ms): IO iothread1/620155 write(fd: 103, buf: 0x556a2ef71b88, count: 8) = 8 9837.161 ( 0.001 ms): IO iothread1/620155 write(fd: 104, buf: 0x556a2ef71b88, count: 8) = 8 9837.163 ( 0.001 ms): IO iothread1/620155 ppoll(ufds: 0x7fcb90002800, nfds: 4, tsp: 0x7fcb9f1342d0, sigsetsize: 8) = 3 9837.164 ( 0.001 ms): IO iothread1/620155 read(fd: 107, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.174 ( 0.001 ms): IO iothread1/620155 read(fd: 105, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.176 ( 0.001 ms): IO iothread1/620155 read(fd: 106, buf: 0x7fcb9f939cc0, count: 512) = 8 9837.209 ( 0.035 ms): IO iothread1/620155 io_submit(ctx_id: 140512552468480, nr: 32, iocbpp: 0x7fca7d0cebe0) = 32 174k IOPS (+3.6%), IOThread syscalls: 9809.566 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0cdd62be0) = 32 9809.625 ( 0.001 ms): IO iothread1/623061 write(fd: 103, buf: 0x5647cfba5f58, count: 8) = 8 9809.627 ( 0.002 ms): IO iothread1/623061 write(fd: 104, buf: 0x5647cfba5f58, count: 8) = 8 9809.663 ( 0.036 ms): IO iothread1/623061 io_submit(ctx_id: 140539805028352, nr: 32, iocbpp: 0x7fd0d0388b50) = 32 Notice that ppoll(2) and eventfd read(2) syscalls are eliminated because the IOThread stays in polling mode instead of falling back to file descriptor monitoring. As usual, polling is not implemented on Windows so this patch ignores the new io_poll_read() callback in aio-win32.c. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Stefano Garzarella <sgarzare@redhat.com> Message-id: 20211207132336.36627-2-stefanha@redhat.com [Fixed up aio_set_event_notifier() calls in tests/unit/test-fdmon-epoll.c added after this series was queued. --Stefan] Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2022-01-12 17:09:39 +00:00
Marc-André Lureau	653163fcbc	build-sys: add HAVE_IPPROTO_MPTCP The QAPI schema shouldn't rely on C system headers #define, but on configure-time project #define, so we can express the build condition in a C-independent way. Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <20210907121943.3498701-3-marcandre.lureau@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-09-30 15:30:25 +02:00
Daniel P. Berrangé	cfb47f2178	io: use GDateTime for formatting timestamp for websock headers The GDateTime APIs provided by GLib avoid portability pitfalls, such as some platforms where 'struct timeval.tv_sec' field is still 'long' instead of 'time_t'. When combined with automatic cleanup, GDateTime often results in simpler code too. Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2021-07-14 14:15:52 +01:00
Dr. David Alan Gilbert	8bd1078aeb	sockets: Support multipath TCP Multipath TCP allows combining multiple interfaces/routes into a single socket, with very little work for the user/admin. It's enabled by 'mptcp' on most socket addresses: ./qemu-system-x86_64 -nographic -incoming tcp:0:4444,mptcp Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Message-Id: <20210421112834.107651-6-dgilbert@redhat.com> Acked-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-06-08 19:36:22 +01:00
Dr. David Alan Gilbert	5b6116d326	io/net-listener: Call the notifier during finalize Call the notifier during finalize; it's currently only called if we change it, which is not the intent. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210421112834.107651-3-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-06-08 19:36:17 +01:00
Dr. David Alan Gilbert	d80f54ce53	channel-socket: Only set CLOEXEC if we have space for fds MSG_CMSG_CLOEXEC cleans up received fd's; it's really only for Unix sockets, but currently we enable it for everything; some socket types (IP_MPTCP) don't like this. Only enable it when we're giving the recvmsg room to receive fd's anyway. Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <20210421112834.107651-2-dgilbert@redhat.com> Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>	2021-06-08 19:36:16 +01:00
Stefano Garzarella	d0fb9657a3	docs: fix references to docs/devel/tracing.rst Commit `e50caf4a5c` ("tracing: convert documentation to rST") converted docs/devel/tracing.txt to docs/devel/tracing.rst. We still have several references to the old file, so let's fix them with the following command: sed -i s/tracing.txt/tracing.rst/ $(git grep -l docs/devel/tracing.txt) Signed-off-by: Stefano Garzarella <sgarzare@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Message-Id: <20210517151702.109066-2-sgarzare@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>	2021-06-02 06:51:09 +02:00
Jagannathan Raman	c90e3512a4	io: error_prepend() in qio_channel_readv_full_all() causes segfault Using error_prepend() in qio_channel_readv_full_all() causes a segfault as errp is not set when ret is 0. This results in the failure of iotest 83. Replacing with error_setg() fixes the problem. Additionally, removes a full stop at the end of error message Reported-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Fixes: `bebab91ebd` (io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers) Message-Id: <be476bcdb99e820fec0fa09fe8f04c9dd3e62473.1613128220.git.jag.raman@oracle.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com>	2021-02-12 07:50:59 -06:00
Elena Ufimtseva	bebab91ebd	io: add qio_channel_readv_full_all_eof & qio_channel_readv_full_all helpers Adds qio_channel_readv_full_all_eof() and qio_channel_readv_full_all() to read both data and FDs. Refactors existing code to use these helpers. Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-id: b059c4cc0fb741e794d644c144cc21372cad877d.1611938319.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-02-10 09:23:28 +00:00
Elena Ufimtseva	bfa4238750	io: add qio_channel_writev_full_all helper Adds qio_channel_writev_full_all() to transmit both data and FDs. Refactors existing code to use this helper. Signed-off-by: Elena Ufimtseva <elena.ufimtseva@oracle.com> Signed-off-by: John G Johnson <john.g.johnson@oracle.com> Signed-off-by: Jagannathan Raman <jag.raman@oracle.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Acked-by: Daniel P. Berrangé <berrange@redhat.com> Message-id: 480fbf1fe4152495d60596c9b665124549b426a5.1611938319.git.jag.raman@oracle.com Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2021-02-10 09:23:28 +00:00
Peter Maydell	45240eed4f	Yank patches patches for 2021-01-13 -----BEGIN PGP SIGNATURE----- iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAl/+vJoSHGFybWJydUBy ZWRoYXQuY29tAAoJEDhwtADrkYZTyv8P/3Dqb9sM8p2WIeoUt5KjgkgWto8anCXM /vzAvVdOrPcLXgHF1HOEkcYkp5ZDzFb1PP+LNWsbIB82HmGUGfA7CiOpCpXEoDJs Z9OYR3K8W5fvSKYTI/m+s7d+9aYqRZajI6ON5M4Eqem0ZwV93/SZMBHOcs1GmvIR diXIztaWVgHjU1Q37MlvJTM4lLN3RH1kTWEdfp3dkNMO6HxBet0B1g7xwaPxKrgb 4y8D/kk9TA1m4wnwrr9s1l3UnfDiZ7mSfsEsXMcTMmQQrAtonD/xiX2YFsHwf6+U 9cX9BCG2XP65t3ynY5goddcjVX3R6SuP4YWSgYUpJGrSx+GVxXrtF0ZumgiCk+4T uv8sOkJdPe9/aRy86UkNzJx7V50nCZnJ2if9neukVjHGW4Hw4txcYV5/7ZuuDqKl tF5NdF/LcHEZLKkBt+4g4TpJ7vxEmJc8/ukn9niLT381SkRBFnnP+Bnl9taSlHOY xNtQJ3Jrcd/cvBjAPCKgtE4Fx6wzQG3c7Yg4WHxbZzcYZBPp4fifUTLCK5XKHqhb rlqCQIO9DzGz5tqOgG7hWmlodSafGiDsHo9tJVpyF5pSSUv3A4KzX2xW4FZZLKJn 7uBrcV0bLmR4tyw+fr+u2EW0ClYrs/JxeXAnsnTp9JrzkXILf5RjEuK0Sc1ZIuZW cmuPa8027ybj =fLO1 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/armbru/tags/pull-yank-2021-01-13' into staging Yank patches patches for 2021-01-13 # gpg: Signature made Wed 13 Jan 2021 09:25:46 GMT # gpg: using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653 # gpg: issuer "armbru@redhat.com" # gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full] # gpg: aka "Markus Armbruster <armbru@pond.sub.org>" [full] # Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653 * remotes/armbru/tags/pull-yank-2021-01-13: tests/test-char.c: Wait for the chardev to connect in char_socket_client_dupid_test io: Document qmp oob suitability of qio_channel_shutdown and io_shutdown io/channel-tls.c: make qio_channel_tls_shutdown thread-safe migration: Add yank feature chardev/char-socket.c: Add yank feature block/nbd.c: Add yank feature Introduce yank feature Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2021-01-13 14:19:24 +00:00
Lukas Straub	e4d2bfb170	io/channel-tls.c: make qio_channel_tls_shutdown thread-safe Make qio_channel_tls_shutdown thread-safe by using atomics when accessing tioc->shutdown. Signed-off-by: Lukas Straub <lukasstraub2@web.de> Acked-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> Message-Id: <5bd8733f583f3558b32250fd0eb576b7aa756485.1609167865.git.lukasstraub2@web.de> Signed-off-by: Markus Armbruster <armbru@redhat.com>	2021-01-13 10:21:17 +01:00
Roman Bolshakov	3eacf70bb5	meson: Propagate gnutls dependency crypto/tlscreds.h includes GnuTLS headers if CONFIG_GNUTLS is set, but GNUTLS_CFLAGS, that describe include path, are not propagated transitively to all users of crypto and build fails if GnuTLS headers reside in non-standard directory (which is a case for homebrew on Apple Silicon). Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com> Message-Id: <20210102125213.41279-1-r.bolshakov@yadro.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2021-01-12 12:38:03 +01:00
AlexChen	77b7829e75	io: Don't use '#' flag of printf format Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: AlexChen <alex.chen@huawei.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-10-29 09:57:37 +00:00
Chetan Pant	e0622ae3ca	io: Fix Lesser GPL version number There is no "version 2" of the "Lesser" General Public License. It is either "GPL version 2.0" or "Lesser GPL version 2.1". This patch replaces all occurrences of "Lesser GPL version 2" with "Lesser GPL version 2.1" in comment section. Signed-off-by: Chetan Pant <chetan4windows@gmail.com> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>	2020-10-29 09:57:37 +00:00

1 2 3 4

183 commits