Commit graph

1062 commits

Author SHA1 Message Date
Eugenio Pérez 915bf6ccd7 vdpa: reorder vhost_vdpa_net_cvq_cmd_page_len function
We need to call it from resource cleanup context, as munmap needs the
size of the mappings.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230602143854.1879091-3-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez 8bc0049ead vdpa: do not block migration if device has cvq and x-svq=on
It was a mistake to forbid in all cases, as SVQ is already able to send
all the CVQ messages before start forwarding data vqs.  It actually
caused a regression, making impossible to migrate device previously
migratable.

Fixes: 36e4647247 ("vdpa: add vhost_vdpa_net_valid_svq_features")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230602143854.1879091-2-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
2023-06-26 09:50:00 -04:00
Eugenio Pérez 152128d646 vdpa: move CVQ isolation check to net_init_vhost_vdpa
Evaluating it at start time instead of initialization time may make the
guest capable of dynamically adding or removing migration blockers.

Also, moving to initialization reduces the number of ioctls in the
migration, reducing failure possibilities.

As a drawback we need to check for CVQ isolation twice: one time with no
MQ negotiated and another one acking it, as long as the device supports
it.  This is because Vring ASID / group management is based on vq
indexes, but we don't know the index of CVQ before negotiating MQ.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230526153143.470745-3-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-06-23 03:09:45 -04:00
Eugenio Pérez 0f2bb0bf38 vdpa: return errno in vhost_vdpa_get_vring_group error
We need to tell in the caller, as some errors are expected in a normal
workflow.  In particular, parent drivers in recent kernels with
VHOST_BACKEND_F_IOTLB_ASID may not support vring groups.  In that case,
-ENOTSUP is returned.

This is the case of vp_vdpa in Linux 6.2.

Next patches in this series will use that information to know if it must
abort or not.  Also, next patches return properly an errp instead of
printing with error_report.

Reviewed-by: Stefano Garzarella <sgarzare@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230526153143.470745-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-06-23 02:54:44 -04:00
Philippe Mathieu-Daudé de6cd7599b meson: Replace softmmu_ss -> system_ss
We use the user_ss[] array to hold the user emulation sources,
and the softmmu_ss[] array to hold the system emulation ones.
Hold the latter in the 'system_ss[]' array for parity with user
emulation.

Mechanical change doing:

  $ sed -i -e s/softmmu_ss/system_ss/g $(git grep -l softmmu_ss)

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20230613133347.82210-10-philmd@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2023-06-20 10:01:30 +02:00
Philippe Mathieu-Daudé f975033d56 cocoa: Fix warnings about invalid prototype declarations
Fix the following Cocoa trivial warnings:

  C compiler for the host machine: cc (clang 14.0.0 "Apple clang version 14.0.0 (clang-1400.0.29.202)")
  Objective-C compiler for the host machine: clang (clang 14.0.0)

  [100/334] Compiling Objective-C object libcommon.fa.p/net_vmnet-bridged.m.o
  net/vmnet-bridged.m:40:31: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
  static char* get_valid_ifnames()
                                ^
                                 void
  [742/1436] Compiling Objective-C object libcommon.fa.p/ui_cocoa.m.o
  ui/cocoa.m:1937:22: warning: a function declaration without a prototype is deprecated in all versions of C [-Wstrict-prototypes]
  static int cocoa_main()
                       ^
                        void

Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-Id: <20230425192820.34063-1-philmd@linaro.org>
2023-06-13 11:28:58 +02:00
Akihiko Odaki 7e64a9cabb igb: Strip the second VLAN tag for extended VLAN
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 907209e311 igb: Implement Rx SCTP CSO
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki aaa8a15c96 net/eth: Always add VLAN tag
It is possible to have another VLAN tag even if the packet is already
tagged.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 85427bf388 net/eth: Use void pointers
The uses of uint8_t pointers were misleading as they are never accessed
as an array of octets and it even require more strict alignment to
access as struct eth_header.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 0b11783014 net/eth: Rename eth_setup_vlan_headers_ex
The old eth_setup_vlan_headers has no user so remove it and rename
eth_setup_vlan_headers_ex.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Akihiko Odaki 2f0fa232b8 net/net_rx_pkt: Use iovec for net_rx_pkt_set_protocols()
igb does not properly ensure the buffer passed to
net_rx_pkt_set_protocols() is contiguous for the entire L2/L3/L4 header.
Allow it to pass scattered data to net_rx_pkt_set_protocols().

Fixes: 3a977deebe ("Intrdocue igb device emulation")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Reviewed-by: Sriram Yagnaraman <sriram.yagnaraman@est.tech>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-05-23 15:20:15 +08:00
Vladimir Sementsov-Ogievskiy 6c1e3906ce configure: add --disable-colo-proxy option
Add option to not build filter-rewriter and colo-compare when
they are not needed.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@yandex-team.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Zhang Chen <chen.zhang@intel.com>
Message-Id: <20230515130640.46035-2-vsementsov@yandex-team.ru>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2023-05-18 18:40:50 +02:00
Eugenio Pérez 0d74e2b785 vdpa: accept VIRTIO_NET_F_SPEED_DUPLEX in SVQ
There is no reason to block it as it has nothing to do with the vrings.
All the support of the feature comes via config space.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Suggested-by: Alvaro Karsz <alvaro.karsz@solid-run.com>
Message-Id: <20230307170018.260557-1-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-04-21 03:08:21 -04:00
Marc-André Lureau 25657fc6c1 win32: replace closesocket() with close() wrapper
Use a close() wrapper instead, so that we don't need to worry about
closesocket() vs close() anymore, let's hope.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-17-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau fd3c333315 slirp: open-code qemu_socket_(un)select()
We are about to make the QEMU socket API use file-descriptor space only,
but libslirp gives us SOCKET as fd, still.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-14-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau 21ac728498 slirp: unregister the win32 SOCKET
Presumably, this is what should happen when the SOCKET is to be removed.
(it probably worked until now because closesocket() does it implicitly,
but we never now how the slirp library could use the SOCKET later)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-13-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Marc-André Lureau faa4ec1641 main-loop: remove qemu_fd_register(), win32/slirp/socket specific
Open-code the socket registration where it's needed, to avoid
artificially used or unclear generic interface.

Furthermore, the following patches are going to make socket handling use
FD-only inside QEMU, but we need to handle win32 SOCKET from libslirp.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.ibm.com>
Message-Id: <20230221124802.4103554-12-marcandre.lureau@redhat.com>
2023-03-13 15:39:31 +04:00
Peter Maydell 7284d53f6f -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJkCvgFAAoJEO8Ells5jWIRHiUH/jhydpJHIqnAPxHQAwGtmyhb
 9Z52UOzW5V6KxfZJ+bQ4RPFkS2UwcxmeadPHY4zvvJTVBLAgG3QVgP4igj8CXKCI
 xRnwMgTNeu655kZQ5P/elTwdBTCJFODk7Egg/bH3H1ZiUhXBhVRhK7q/wMgtlZkZ
 Kexo6txCK4d941RNzEh45ZaGhdELE+B+D7cRuQgBs/DXZtJpsyEzBbP8KYSMHuER
 AXfWo0YIBYj7X3ek9D6j0pbOkB61vqtYd7W6xV4iDrJCcFBIOspJbbBb1tGCHola
 AXo5/OhRmiQnp/c/HTbJIDbrj0sq/r7LxYK4zY1x7UPbewHS9R+wz+FfqSmoBF0=
 =056y
 -----END PGP SIGNATURE-----

Merge tag 'net-pull-request' of https://github.com/jasowang/qemu into staging

# -----BEGIN PGP SIGNATURE-----
# Version: GnuPG v1
#
# iQEcBAABAgAGBQJkCvgFAAoJEO8Ells5jWIRHiUH/jhydpJHIqnAPxHQAwGtmyhb
# 9Z52UOzW5V6KxfZJ+bQ4RPFkS2UwcxmeadPHY4zvvJTVBLAgG3QVgP4igj8CXKCI
# xRnwMgTNeu655kZQ5P/elTwdBTCJFODk7Egg/bH3H1ZiUhXBhVRhK7q/wMgtlZkZ
# Kexo6txCK4d941RNzEh45ZaGhdELE+B+D7cRuQgBs/DXZtJpsyEzBbP8KYSMHuER
# AXfWo0YIBYj7X3ek9D6j0pbOkB61vqtYd7W6xV4iDrJCcFBIOspJbbBb1tGCHola
# AXo5/OhRmiQnp/c/HTbJIDbrj0sq/r7LxYK4zY1x7UPbewHS9R+wz+FfqSmoBF0=
# =056y
# -----END PGP SIGNATURE-----
# gpg: Signature made Fri 10 Mar 2023 09:27:33 GMT
# gpg:                using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>" [marginal]
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* tag 'net-pull-request' of https://github.com/jasowang/qemu: (44 commits)
  ebpf: fix compatibility with libbpf 1.0+
  docs/system/devices/igb: Add igb documentation
  tests/avocado: Add igb test
  igb: Introduce qtest for igb device
  tests/qtest/libqos/e1000e: Export macreg functions
  tests/qtest/e1000e-test: Fabricate ethernet header
  Intrdocue igb device emulation
  e1000: Split header files
  pcie: Introduce pcie_sriov_num_vfs
  net/eth: Introduce EthL4HdrProto
  e1000e: Implement system clock
  net/eth: Report if headers are actually present
  e1000e: Count CRC in Tx statistics
  e1000: Count CRC in Tx statistics
  e1000e: Combine rx traces
  MAINTAINERS: Add e1000e test files
  MAINTAINERS: Add Akihiko Odaki as a e1000e reviewer
  e1000e: Do not assert when MSI-X is disabled later
  hw/net/net_tx_pkt: Check the payload length
  hw/net/net_tx_pkt: Implement TCP segmentation
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-03-11 17:17:18 +00:00
Akihiko Odaki 65f474bbae net/eth: Introduce EthL4HdrProto
igb, a new network device emulation, will need SCTP checksum offloading.
Currently eth_get_protocols() has a bool parameter for each protocol
currently it supports, but there will be a bit too many parameters if
we add yet another protocol.

Introduce an enum type, EthL4HdrProto to represent all L4 protocols
eth_get_protocols() support with one parameter.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki 69ff5ef847 net/eth: Report if headers are actually present
The values returned by eth_get_protocols() are used to perform RSS,
checksumming and segmentation. Even when a packet signals the use of the
protocols which these operations can be applied to, the headers for them
may not be present because of too short packet or fragmentation, for
example. In such a case, the operations cannot be applied safely.

Report the presence of headers instead of whether the use of the
protocols are indicated with eth_get_protocols(). This also makes
corresponding changes to the callers of eth_get_protocols() to match
with its new signature and to remove redundant checks for fragmentation.

Fixes: 75020a7021 ("Common definitions for VMWARE devices")
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki 02ef5fdc09 hw/net/net_tx_pkt: Implement TCP segmentation
There was no proper implementation of TCP segmentation before this
change, and net_tx_pkt relied solely on IPv4 fragmentation. Not only
this is not aligned with the specification, but it also resulted in
corrupted IPv6 packets.

This is particularly problematic for the igb, a new proposed device
implementation; igb provides loopback feature for VMDq and the feature
relies on software segmentation.

Implement proper TCP segmentation in net_tx_pkt to fix such a scenario.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Akihiko Odaki 481c52320a net: Strip virtio-net header when dumping
filter-dump specifiees Ethernet as PCAP LinkType, which does not expect
virtio-net header. Having virtio-net header in such PCAP file breaks
PCAP unconsumable. Unfortunately currently there is no LinkType for
virtio-net so for now strip virtio-net header to convert the output to
Ethernet.

Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-03-10 15:35:38 +08:00
Eugenio Pérez 609ab4c3ed vdpa net: allow VHOST_F_LOG_ALL
Since some actions move to the start function instead of init, the
device features may not be the parent vdpa device's, but the one
returned by vhost backend.  If transition to SVQ is supported, the vhost
backend will return _F_LOG_ALL to signal the device is migratable.

Add VHOST_F_LOG_ALL.  HW dirty page tracking can be added on top of this
change if the device supports it in the future.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-14-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 5c1ebd4c43 vdpa: block migration if device has unsupported features
A vdpa net device must initialize with SVQ in order to be migratable at
this moment, and initialization code verifies some conditions.  If the
device is not initialized with the x-svq parameter, it will not expose
_F_LOG so the vhost subsystem will block VM migration from its
initialization.

Next patches change this, so we need to verify migration conditions
differently.

QEMU only supports a subset of net features in SVQ, and it cannot
migrate state that cannot track or restore in the destination.  Add a
migration blocker if the device offers an unsupported feature.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-12-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 9c363cf6d5 vdpa net: block migration if the device has CVQ
Devices with CVQ need to migrate state beyond vq state.  Leaving this to
future series.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-11-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 6949843046 vdpa: add vdpa net migration state notifier
This allows net to restart the device backend to configure SVQ on it.

Ideally, these changes should not be net specific and they could be done
in:
* vhost_vdpa_set_features (with VHOST_F_LOG_ALL)
* vhost_vdpa_set_vring_addr (with .enable_log)
* vhost_vdpa_set_log_base.

However, the vdpa net backend is the one with enough knowledge to
configure everything because of some reasons:
* Queues might need to be shadowed or not depending on its kind (control
  vs data).
* Queues need to share the same map translations (iova tree).

Also, there are other problems that may have solutions but complicates
the implementation at this stage:
* We're basically duplicating vhost_dev_start and vhost_dev_stop, and
  they could go out of sync.  If we want to reuse them, we need a way to
  skip some function calls to avoid recursiveness (either vhost_ops ->
  vhost_set_features, vhost_set_vring_addr, ...).
* We need to traverse all vhost_dev of a given net device twice: one to
  stop and get the vq state and another one after the reset to
  configure properties like address, fd, etc.

Because of that it is cleaner to restart the whole net backend and
configure again as expected, similar to how vhost-kernel moves between
userspace and passthrough.

If more kinds of devices need dynamic switching to SVQ we can:
* Create a callback struct like VhostOps and move most of the code
  there.  VhostOps cannot be reused since all vdpa backend share them,
  and to personalize just for networking would be too heavy.
* Add a parent struct or link all the vhost_vdpa or vhost_dev structs so
  we can traverse them.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20230303172445.1089785-9-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 00ef422e9f vdpa net: move iova tree creation from init to start
Only create iova_tree if and when it is needed.

The cleanup keeps being responsible for the last VQ but this change
allows it to merge both cleanup functions.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20230303172445.1089785-2-eperezma@redhat.com>
Tested-by: Lei Yang <leiyang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-03-07 12:38:59 -05:00
Eugenio Pérez 525ae11522 vdpa: fix VHOST_BACKEND_F_IOTLB_ASID flag check
VHOST_BACKEND_F_IOTLB_ASID is the feature bit, not the bitmask. Since
the device under test also provided VHOST_BACKEND_F_IOTLB_MSG_V2 and
VHOST_BACKEND_F_IOTLB_BATCH, this went unnoticed.

Fixes: c1a1008685 ("vdpa: always start CVQ in SVQ mode if possible")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Laurent Vivier 148fbf0d58 net: stream: add a new option to automatically reconnect
In stream mode, if the server shuts down there is currently
no way to reconnect the client to a new server without removing
the NIC device and the netdev backend (or to reboot).

This patch introduces a reconnect option that specifies a delay
to try to reconnect with the same parameters.

Add a new test in qtest to test the reconnect option and the
connect/disconnect events.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Joelle van Dyne 993f71ee33 vmnet: stop recieving events when VM is stopped
When the VM is stopped using the HMP command "stop", soon the handler will
stop reading from the vmnet interface. This causes a flood of
`VMNET_INTERFACE_PACKETS_AVAILABLE` events to arrive and puts the host CPU
at 100%. We fix this by removing the event handler from vmnet when the VM
is no longer in a running state and restore it when we return to a running
state.

Signed-off-by: Joelle van Dyne <j@getutm.app>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Christian Svensson 0c65ef4fbb net: Increase L2TPv3 buffer to fit jumboframes
Increase the allocated buffer size to fit larger packets.
Given that jumboframes can commonly be up to 9000 bytes the closest suitable
value seems to be 16 KiB.

Tested by running qemu towards a Linux L2TPv3 endpoint and pushing
jumboframe traffic through the interfaces.

Signed-off-by: Christian Svensson <blue@cmd.nu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Thomas Huth 3b0cca8e4e net: Replace "Supported NIC models" with "Available NIC models"
Just because a NIC model is compiled into the QEMU binary does not
necessary mean that it can be used with each and every machine.
So let's rather talk about "available" models instead of "supported"
models, just to avoid confusion.

Reviewed-by: Claudio Fontana <cfontana@suse.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Thomas Huth 27c819244b net: Restore printing of the help text with "-nic help"
Running QEMU with "-nic help" used to work in QEMU 5.2 and earlier versions
(it showed the available netdev backends), but this feature got broken during
some refactoring in version 6.0. Let's restore the old behavior, and while
we're at it, let's also print the available NIC models here now since this
option can be used to configure both, netdev backend and model in one go.

Fixes: ad6f932fe8 ("net: do not exit on "netdev_add help" monitor command")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Thomas Huth c6941b3b9b net: Move the code to collect available NIC models to a separate function
The code that collects the available NIC models is not really specific
to PCI anymore and will be required in the next patch, too, so let's
move this into a new separate function in net.c instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-02-17 13:31:33 +08:00
Markus Armbruster e02e085c8b net: Clean up includes
This commit was created with scripts/clean-includes.

All .c should include qemu/osdep.h first.  The script performs three
related cleanups:

* Ensure .c files include qemu/osdep.h first.
* Including it in a .h is redundant, since the .c  already includes
  it.  Drop such inclusions.
* Likewise, including headers qemu/osdep.h includes is redundant.
  Drop these, too.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20230202133830.2152150-13-armbru@redhat.com>
2023-02-08 07:28:05 +01:00
Markus Armbruster ae71d13d4e net: Move hmp_info_network() to net-hmp-cmds.c
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230124121946.1139465-17-armbru@redhat.com>
2023-02-04 07:56:54 +01:00
Markus Armbruster 2030ca36bf net: Move HMP commands from monitor to net/
This moves these commands from MAINTAINERS sections "Human
Monitor (HMP)" and "QMP" to "Network device backends".

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20230124121946.1139465-16-armbru@redhat.com>
2023-02-04 07:56:54 +01:00
Peter Maydell aa96ab7c9d * s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
 * Deprecate the -no-hpet command line option
 * Disable the qtests in the 32-bit Windows CI job again
 * Some other misc fixes here and there
 -----BEGIN PGP SIGNATURE-----
 
 iQJFBAABCAAvFiEEJ7iIR+7gJQEY8+q5LtnXdP5wLbUFAmO8It8RHHRodXRoQHJl
 ZGhhdC5jb20ACgkQLtnXdP5wLbUwbA//dXgfHy95C1r2nTMDekk09+KkmNB1f6M8
 3HK4ROmmrMT/aP9FwfqMBT7JHM/m4bwOGw0Sula8vfjg9NYGPWuSYjdObWKnrIq/
 YORoTxqak9c98Co06EQbAfWn3Pj0ifQkX+FIyzcNGhu4856FWdBsMuyq52VLi36q
 Z8ruSOmclzluoIB3mVYY/s5J7ED2A3K0h39frKLE9FGsKObX10KWj+MZyDHi9oGZ
 ucTHai12OXgNghjlrwI0BqJziih4NxfIWs0JovSo3cN0at7m57G5JChjR38zTMNT
 2Q46tDKoIXesY1GUmVuIgJ5F1Uoshc8Pz5qBSQ5mUbZUQMpivhFrEB666wsYmPd1
 M/YwnZ+PFhWjem7p28fKmnmkeATvE0S+vMDifTVZ880nmAbyUm1vFKfqV6r2mBrT
 p4iXfh/9easFfJWHueU4fBwyMndDGRaCRJnP8KQ5I9yb0WZbt+/0k/y8CQD8Oxr7
 dNFFFoY3KnIO9DCRO5Wr+3OqUgtSAQyhBDf5V2wSMCFrwPHKsvWKSbdiWR3Qe4ck
 41InWgawB3xx57+vXraDUA10+nBZ1VrM92ObqfLPTFqjLCom6Fm85cG4YFRLIvRt
 rdlOC+ScpeVpec7MwcHrScGL0HmUgPnShDAo07pRy4oKK+c89sXzdAFf2nYJTAWS
 WCuChrn7VFM=
 =D+Yw
 -----END PGP SIGNATURE-----

Merge tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu into staging

* s390x header clean-ups from Philippe
* Rework and improvements of the EINTR handling by Nikita
* Deprecate the -no-hpet command line option
* Disable the qtests in the 32-bit Windows CI job again
* Some other misc fixes here and there

# gpg: Signature made Mon 09 Jan 2023 14:21:19 GMT
# gpg:                using RSA key 27B88847EEE0250118F3EAB92ED9D774FE702DB5
# gpg:                issuer "thuth@redhat.com"
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg:                 aka "Thomas Huth <thuth@redhat.com>" [full]
# gpg:                 aka "Thomas Huth <huth@tuxfamily.org>" [full]
# gpg:                 aka "Thomas Huth <th.huth@posteo.de>" [unknown]
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3  EAB9 2ED9 D774 FE70 2DB5

* tag 'pull-request-2023-01-09' of https://gitlab.com/thuth/qemu:
  .gitlab-ci.d/windows: Do not run the qtests in the msys2-32bit job
  error handling: Use RETRY_ON_EINTR() macro where applicable
  Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
  docs/interop: Change the vnc-ledstate-Pseudo-encoding doc into .rst
  i386: Deprecate the -no-hpet QEMU command line option
  tests/qtest/bios-tables-test: Replace -no-hpet with hpet=off machine parameter
  tests/readconfig: spice doesn't support unix socket on windows yet
  target/s390x: Restrict sysemu/reset.h to system emulation
  target/s390x/tcg/excp_helper: Restrict system headers to sysemu
  target/s390x/tcg/misc_helper: Remove unused "memory.h" include
  hw/s390x/pv: Restrict Protected Virtualization to sysemu
  exec/memory: Expose memory_region_access_valid()
  MAINTAINERS: Add MIPS-related docs and configs to the MIPS architecture section
  tests/vm: Update get_default_jobs() to work on non-x86_64 non-KVM hosts
  qemu-iotests/stream-under-throttle: do not shutdown QEMU

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2023-01-09 15:54:31 +00:00
Nikita Ivanov 37b0b24e93 error handling: Use RETRY_ON_EINTR() macro where applicable
There is a defined RETRY_ON_EINTR() macro in qemu/osdep.h
which handles the same while loop.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/415
Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-3-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[thuth: Dropped the hunk that changed socket_accept() in libqtest.c]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-09 13:50:47 +01:00
Nikita Ivanov 8b6aa69365 Refactoring: refactor TFR() macro to RETRY_ON_EINTR()
Rename macro name to more transparent one and refactor
it to expression.

Signed-off-by: Nikita Ivanov <nivanov@cloudlinux.com>
Message-Id: <20221023090422.242617-2-nivanov@cloudlinux.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Bin Meng <bmeng.cn@gmail.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
2023-01-09 13:50:47 +01:00
Longpeng bf7a2ad8b6 vdpa: harden the error path if get_iova_range failed
We should stop if the GET_IOVA_RANGE ioctl failed.

Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-3-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-01-08 01:54:22 -05:00
Longpeng c672f348cb vdpa-dev: get iova range explicitly
In commit a585fad26b ("vdpa: request iova_range only once") we remove
GET_IOVA_RANGE form vhost_vdpa_init, the generic vdpa device will start
without iova_range populated, so the device won't work. Let's call
GET_IOVA_RANGE ioctl explicitly.

Fixes: a585fad26b ("vdpa: request iova_range only once")
Signed-off-by: Longpeng <longpeng2@huawei.com>
Message-Id: <20221224114848.3062-2-longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-01-08 01:54:22 -05:00
Hyman Huang(黄勇) bebcac052a vhost-user: Refactor the chr_closed_bh
Use vhost_user_save_acked_features to implemente acked features
saving.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Message-Id: <6699ee88687b62fb8152fe021e576cd2f468d7ca.1671627406.git.huangy81@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:22 -05:00
Hyman Huang(黄勇) 937b7d96e4 vhost-user: Refactor vhost acked features saving
Abstract vhost acked features saving into
vhost_user_save_acked_features, export it as util function.

Signed-off-by: Hyman Huang(黄勇) <huangy81@chinatelecom.cn>
Signed-off-by: Guoyi Tu <tugy@chinatelecom.cn>
Message-Id: <50dc9b09b0635e3052551efcc1046c2a85332fcb.1671627406.git.huangy81@chinatelecom.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:22 -05:00
Eugenio Pérez 980003debd vdpa: do not handle VIRTIO_NET_F_GUEST_ANNOUNCE in vhost-vdpa
So qemu emulates it even in case the device does not support it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221221115015.1400889-5-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2023-01-08 01:54:21 -05:00
Eugenio Pérez 3f9a3eeb7c vdpa: handle VIRTIO_NET_CTRL_ANNOUNCE in vhost_vdpa_net_handle_ctrl_avail
Since this capability is emulated by qemu shadowed CVQ cannot forward it
to the device. Process all that command within qemu.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221221115015.1400889-4-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2023-01-08 01:54:21 -05:00
Eugenio Pérez c1a1008685 vdpa: always start CVQ in SVQ mode if possible
Isolate control virtqueue in its own group, allowing to intercept control
commands but letting dataplane run totally passthrough to the guest.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221215113144.322011-13-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-12-21 06:35:28 -05:00
Eugenio Pérez 6188d78a19 vdpa: add shadow_data to vhost_vdpa
The memory listener that thells the device how to convert GPA to qemu's
va is registered against CVQ vhost_vdpa. memory listener translations
are always ASID 0, CVQ ones are ASID 1 if supported.

Let's tell the listener if it needs to register them on iova tree or
not.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-12-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-21 06:35:28 -05:00
Eugenio Pérez 7f211a28fd vdpa: store x-svq parameter in VhostVDPAState
CVQ can be shadowed two ways:
- Device has x-svq=on parameter (current way)
- The device can isolate CVQ in its own vq group

QEMU needs to check for the second condition dynamically, because CVQ
index is not known before the driver ack the features. Since this is
dynamic, the CVQ isolation could vary with different conditions, making
it possible to go from "not isolated group" to "isolated".

Saving the cmdline parameter in an extra field so we never disable CVQ
SVQ in case the device was started with x-svq cmdline.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-11-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-21 06:35:28 -05:00
Eugenio Pérez cd831ed5c4 vdpa: add asid parameter to vhost_vdpa_dma_map/unmap
So the caller can choose which ASID is destined.

No need to update the batch functions as they will always be called from
memory listener updates at the moment. Memory listener updates will
always update ASID 0, as it's the passthrough ASID.

All vhost devices's ASID are 0 at this moment.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-10-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-21 06:35:28 -05:00
Eugenio Pérez 258a03941f vdpa: move SVQ vring features check to net/
The next patches will start control SVQ if possible. However, we don't
know if that will be possible at qemu boot anymore.

Since the moved checks will be already evaluated at net/ to know if it
is ok to shadow CVQ, move them.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-8-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-21 06:35:28 -05:00
Eugenio Pérez a585fad26b vdpa: request iova_range only once
Currently iova range is requested once per queue pair in the case of
net. Reduce the number of ioctls asking it once at initialization and
reusing that value for each vhost_vdpa.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Message-Id: <20221215113144.322011-7-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasonwang@redhat.com>
2022-12-21 06:35:28 -05:00
Eugenio Pérez 36e4647247 vdpa: add vhost_vdpa_net_valid_svq_features
It will be reused at vdpa device start so let's extract in its own
function.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <20221215113144.322011-6-eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2022-12-21 06:35:28 -05:00
Peter Maydell 48804eebd4 Miscellaneous patches for 2022-12-14
-----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEENUvIs9frKmtoZ05fOHC0AOuRhlMFAmOZ6lYSHGFybWJydUBy
 ZWRoYXQuY29tAAoJEDhwtADrkYZT6VEQAKynjWh3AIZ4/qOgrVqsP0oRspevLmfH
 BbuGoldjYpEE7RbwuCaZalZ7iy7TcSySxnPfUDVsFHd7NWffJVjwKHifGC0D/Ez0
 +Ggyb1CBebN+mS7t+BNFUHdMM+wxFIlHwg4f4aTFbn2o0HKgj2a8tcNzNRonZbfa
 xURnvbD4G4u0VZEc3Jak+x193xbOJFsuuWq0BZnDuNk+XqjyW2RwfpXLPJVk+82a
 4uy/YgYuqXUqBeULwcJj+shBL4SXR9GyajTFMS64przSUle0ADUmXkPtaS2agV7e
 Pym/UQuAcxvNyw34fJsiMZxx6rZI9YU30jQUMRLoYcPRR/Q/aiPeiiHtiD6Kaid7
 IfOeH/EArXaQRFpD89xj4YcaTnRLQOEj0NXgXvAbQf6eD8JYyao/S/0lCsPZEoA2
 nibLqEQ25ncDNXoSomuwtfjVff3w68lODFbhwqfA0gf3cPtCgVZ6xQ8P/McNY6K6
 wqFHXMWTDHk1LOCTucjYz1z2TGzTnSG4iWi5Yt6FSxAc958AO+v5ALn/1pcYun+E
 azM/MF0AInKj2aJCT530zT0tpCs/Jo07YKC8k6ubi77S0ZdmGS1XLeXkRXfk1+yI
 OhuUgiVlSTHxD69DagT2vbnx1mDMM9X+OBIMvEi5nwvD9A/ghaCgkDeGFvbA1ud0
 t0mxPBZJ+tiZ
 =JJjG
 -----END PGP SIGNATURE-----

Merge tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru into staging

Miscellaneous patches for 2022-12-14

# gpg: Signature made Wed 14 Dec 2022 15:23:02 GMT
# gpg:                using RSA key 354BC8B3D7EB2A6B68674E5F3870B400EB918653
# gpg:                issuer "armbru@redhat.com"
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>" [full]
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>" [full]
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* tag 'pull-misc-2022-12-14' of https://repo.or.cz/qemu/armbru:
  ppc4xx_sdram: Simplify sdram_ddr_size() to return
  block/vmdk: Simplify vmdk_co_create() to return directly
  cleanup: Tweak and re-run return_directly.cocci
  io: Tidy up fat-fingered parameter name
  qapi: Use returned bool to check for failure (again)
  sockets: Use ERRP_GUARD() where obviously appropriate
  qemu-config: Use ERRP_GUARD() where obviously appropriate
  qemu-config: Make config_parse_qdict() return bool
  monitor: Use ERRP_GUARD() in monitor_init()
  monitor: Simplify monitor_fd_param()'s error handling
  error: Move ERRP_GUARD() to the beginning of the function
  error: Drop a few superfluous ERRP_GUARD()
  error: Drop some obviously superfluous error_propagate()
  Drop more useless casts from void * to pointer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2022-12-15 10:13:46 +00:00
Markus Armbruster 7480874a69 qapi net: Elide redundant has_FOO in generated C
The has_FOO for pointer-valued FOO are redundant, except for arrays.
They are also a nuisance to work with.  Recent commit "qapi: Start to
elide redundant has_FOO in generated C" provided the means to elide
them step by step.  This is the step for qapi/net.json.

Said commit explains the transformation in more detail.  The invariant
violations mentioned there do not occur here.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221104160712.3005652-19-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
[Fixes for MacOS squashed in]
2022-12-14 20:04:47 +01:00
Markus Armbruster d1c81c3496 qapi: Use returned bool to check for failure (again)
Commit 012d4c96e2 changed the visitor functions taking Error ** to
return bool instead of void, and the commits following it used the new
return value to simplify error checking.  Since then a few more uses
in need of the same treatment crept in.  Do that.  All pretty
mechanical except for

* balloon_stats_get_all()

  This is basically the same transformation commit 012d4c96e2 applied
  to the virtual walk example in include/qapi/visitor.h.

* set_max_queue_size()

  Additionally replace "goto end of function" by return.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20221121085054.683122-10-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
2022-12-14 16:19:35 +01:00
Stefan Weil via ac14949821 Add G_GNUC_PRINTF to function qemu_set_info_str and fix related issues
With the G_GNUC_PRINTF function attribute the compiler detects
two potential insecure format strings:

../../../net/stream.c:248:31: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
    qemu_set_info_str(&s->nc, uri);
                              ^~~
../../../net/stream.c:322:31: warning: format string is not a string literal (potentially insecure) [-Wformat-security]
    qemu_set_info_str(&s->nc, uri);
                              ^~~

There are also two other warnings:

../../../net/socket.c:182:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
  182 |         qemu_set_info_str(&s->nc, "");
      |                                   ^~
../../../net/stream.c:170:35: warning: zero-length gnu_printf format string [-Wformat-zero-length]
  170 |         qemu_set_info_str(&s->nc, "");

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221126152507.283271-7-sw@weilnetz.de>
2022-11-27 13:36:17 -05:00
Stefano Garzarella 562a7d23bf vhost: mask VIRTIO_F_RING_RESET for vhost and vhost-user devices
Commit 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
enabled VIRTIO_F_RING_RESET by default for all virtio devices.

This feature is not currently emulated by QEMU, so for vhost and
vhost-user devices we need to make sure it is supported by the offloaded
device emulation (in-kernel or in another process).
To do this we need to add VIRTIO_F_RING_RESET to the features bitmap
passed to vhost_get_features(). This way it will be masked if the device
does not support it.

This issue was initially discovered with vhost-vsock and vhost-user-vsock,
and then also tested with vhost-user-rng which confirmed the same issue.
They fail when sending features through VHOST_SET_FEATURES ioctl or
VHOST_USER_SET_FEATURES message, since VIRTIO_F_RING_RESET is negotiated
by the guest (Linux >= v6.0), but not supported by the device.

Fixes: 69e1c14aa2 ("virtio: core: vq reset feature negotation support")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1318
Signed-off-by: Stefano Garzarella <sgarzare@redhat.com>
Message-Id: <20221121101101.29400-1-sgarzare@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2022-11-22 05:19:00 -05:00
Ahmed Abouzied f469150be8 net: Replace TAB indentations with spaces
Replaces TABs with spaces, making sure to have a consistent coding style
of 4 space indentations in the net subsystem.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/377
Signed-off-by: Ahmed Abouzied <email@aabouzied.com>
Message-Id: <20210614183849.20622-1-email@aabouzied.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
[thuth: Fixed mis-aligned indentation in some of the files]
Signed-off-by: Thomas Huth <thuth@redhat.com>
2022-11-11 09:39:03 +01:00
Si-Wei Liu bc5add1dad vhost-vdpa: fix assert !virtio_net_get_subqueue(nc)->async_tx.elem in virtio_net_reset
The citing commit has incorrect code in vhost_vdpa_receive() that returns
zero instead of full packet size to the caller. This renders pending packets
unable to be freed so then get clogged in the tx queue forever. When device
is being reset later on, below assertion failure ensues:

0  0x00007f86d53bb387 in raise () from /lib64/libc.so.6
1  0x00007f86d53bca78 in abort () from /lib64/libc.so.6
2  0x00007f86d53b41a6 in __assert_fail_base () from /lib64/libc.so.6
3  0x00007f86d53b4252 in __assert_fail () from /lib64/libc.so.6
4  0x000055b8f6ff6fcc in virtio_net_reset (vdev=<optimized out>) at /usr/src/debug/qemu/hw/net/virtio-net.c:563
5  0x000055b8f7012fcf in virtio_reset (opaque=0x55b8faf881f0) at /usr/src/debug/qemu/hw/virtio/virtio.c:1993
6  0x000055b8f71f0086 in virtio_bus_reset (bus=bus@entry=0x55b8faf88178) at /usr/src/debug/qemu/hw/virtio/virtio-bus.c:102
7  0x000055b8f71f1620 in virtio_pci_reset (qdev=<optimized out>) at /usr/src/debug/qemu/hw/virtio/virtio-pci.c:1845
8  0x000055b8f6fafc6c in memory_region_write_accessor (mr=<optimized out>, addr=<optimized out>, value=<optimized out>,
   size=<optimized out>, shift=<optimized out>, mask=<optimized out>, attrs=...) at /usr/src/debug/qemu/memory.c:483
9  0x000055b8f6fadce9 in access_with_adjusted_size (addr=addr@entry=20, value=value@entry=0x7f867e7fb7e8, size=size@entry=1,
   access_size_min=<optimized out>, access_size_max=<optimized out>, access_fn=0x55b8f6fafc20 <memory_region_write_accessor>,
   mr=0x55b8faf80a50, attrs=...) at /usr/src/debug/qemu/memory.c:544
10 0x000055b8f6fb1d0b in memory_region_dispatch_write (mr=mr@entry=0x55b8faf80a50, addr=addr@entry=20, data=0, op=<optimized out>,
   attrs=attrs@entry=...) at /usr/src/debug/qemu/memory.c:1470
11 0x000055b8f6f62ada in flatview_write_continue (fv=fv@entry=0x7f86ac04cd20, addr=addr@entry=549755813908, attrs=...,
   attrs@entry=..., buf=buf@entry=0x7f86d0223028 <Address 0x7f86d0223028 out of bounds>, len=len@entry=1, addr1=20, l=1,
   mr=0x55b8faf80a50) at /usr/src/debug/qemu/exec.c:3266
12 0x000055b8f6f62c8f in flatview_write (fv=0x7f86ac04cd20, addr=549755813908, attrs=...,
   buf=0x7f86d0223028 <Address 0x7f86d0223028 out of bounds>, len=1) at /usr/src/debug/qemu/exec.c:3306
13 0x000055b8f6f674cb in address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>,
   len=<optimized out>) at /usr/src/debug/qemu/exec.c:3396
14 0x000055b8f6f67575 in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=..., attrs@entry=...,
   buf=buf@entry=0x7f86d0223028 <Address 0x7f86d0223028 out of bounds>, len=<optimized out>, is_write=<optimized out>)
   at /usr/src/debug/qemu/exec.c:3406
15 0x000055b8f6fc1cc8 in kvm_cpu_exec (cpu=cpu@entry=0x55b8f9aa0e10) at /usr/src/debug/qemu/accel/kvm/kvm-all.c:2410
16 0x000055b8f6fa5f5e in qemu_kvm_cpu_thread_fn (arg=0x55b8f9aa0e10) at /usr/src/debug/qemu/cpus.c:1318
17 0x000055b8f7336e16 in qemu_thread_start (args=0x55b8f9ac8480) at /usr/src/debug/qemu/util/qemu-thread-posix.c:519
18 0x00007f86d575aea5 in start_thread () from /lib64/libpthread.so.0
19 0x00007f86d5483b2d in clone () from /lib64/libc.so.6

Make vhost_vdpa_receive() return the size passed in as is, so that the
caller qemu_deliver_packet_iov() would eventually propagate it back to
virtio_net_flush_tx() to release pending packets from the async_tx queue.
Which corresponds to the drop path where qemu_sendv_packet_async() returns
non-zero in virtio_net_flush_tx().

Fixes: 846a1e85da ("vdpa: Add dummy receive callback")
Cc: Eugenio Perez Martin <eperezma@redhat.com>
Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20221108041929.18417-2-jasowang@redhat.com>
2022-11-08 13:38:02 -05:00
Peter Maydell 5107fd3eff net/vhost-vdpa.c: Fix clang compilation failure
Commit 8801ccd050 introduced a compilation failure with clang
version 10.0.0-4ubuntu1:

../../net/vhost-vdpa.c:654:16: error: variable 'vdpa_device_fd' is
used uninitialized whenever 'if' condition is false
[-Werror,-Wsometimes-uninitialized]
    } else if (opts->has_vhostfd) {
               ^~~~~~~~~~~~~~~~~
../../net/vhost-vdpa.c:662:33: note: uninitialized use occurs here
    r = vhost_vdpa_get_features(vdpa_device_fd, &features, errp);
                                ^~~~~~~~~~~~~~
../../net/vhost-vdpa.c:654:12: note: remove the 'if' if its condition
is always true
    } else if (opts->has_vhostfd) {
           ^~~~~~~~~~~~~~~~~~~~~~~
../../net/vhost-vdpa.c:629:23: note: initialize the variable
'vdpa_device_fd' to silence this warning
    int vdpa_device_fd;
                      ^
                       = 0
1 error generated.

It's a false positive -- the compiler doesn't manage to figure out
that the error checks further up mean that there's no code path where
vdpa_device_fd isn't initialized.  Put another way, the problem is
that we check "if (opts->has_vhostfd)" when in fact that condition
must always be true.  A cleverer static analyser would probably warn
that we were checking an always-true condition.

Fix the compilation failure by removing the unnecessary if().

Fixes: 8801ccd050 ("vhost-vdpa: allow passing opened vhostfd to vhost-vdpa")
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20221031132901.1277150-1-peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-31 13:01:31 -04:00
Laurent Vivier e506fee8b1 net: stream: add QAPI events to report connection state
The netdev reports NETDEV_STREAM_CONNECTED event when the backend
is connected, and NETDEV_STREAM_DISCONNECTED when it is disconnected.

The NETDEV_STREAM_CONNECTED event includes the destination address.

This allows a system manager like libvirt to detect when the server
fails.

For instance with passt:

{ 'execute': 'qmp_capabilities' }
{ "return": { } }
{ "timestamp": { "seconds": 1666341395, "microseconds": 505347 },
    "event": "NETDEV_STREAM_CONNECTED",
    "data": { "netdev-id": "netdev0",
        "addr": { "path": "/tmp/passt_1.socket", "type": "unix" } } }

[killing passt here]

{ "timestamp": { "seconds": 1666341430, "microseconds": 968694 },
    "event": "NETDEV_STREAM_DISCONNECTED",
    "data": { "netdev-id": "netdev0" } }

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 1f9c890fa3 net: stream: move to QIO to enable additional parameters
Use QIOChannel, QIOChannelSocket and QIONetListener.
This allows net/stream to use all the available parameters provided by
SocketAddress.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 784e7a2531 net: dgram: add unix socket
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 8ecc7f40bc net: dgram: move mcast specific code from net_socket_fd_init_dgram()
It is less complex to manage special cases directly in
net_dgram_mcast_init() and net_dgram_udp_init().

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 7c1f0c33cc net: dgram: make dgram_dst generic
dgram_dst is a sockaddr_in structure. To be able to use it with
unix socket, use a pointer to a generic sockaddr structure.

Rename it dest_addr, and store socket length in dest_len.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 13c6be9661 net: stream: add unix socket
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com> (QAPI schema)
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Stefano Brivio 80d3e4779d net: stream: Don't ignore EINVAL on netdev socket connection
Other errors are treated as failure by net_stream_client_init(),
but if connect() returns EINVAL, we'll fail silently. Remove the
related exception.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
[lvivier: applied to net/stream.c]
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Stefano Brivio daf188ff04 net: socket: Don't ignore EINVAL on netdev socket connection
Other errors are treated as failure by net_socket_connect_init(),
but if connect() returns EINVAL, we'll fail silently. Remove the
related exception.

Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 5166fe0ae4 qapi: net: add stream and dgram netdevs
Copied from socket netdev file and modified to use SocketAddress
to be able to introduce new features like unix socket.

"udp" and "mcast" are squashed into dgram netdev, multicast is detected
according to the IP address type.
"listen" and "connect" modes are managed by stream netdev. An optional
parameter "server" defines the mode (off by default)

The two new types need to be parsed the modern way with -netdev, because
with the traditional way, the "type" field of netdev structure collides with
the "type" field of SocketAddress and prevents the correct evaluation of the
command line option. Moreover the traditional way doesn't allow to use
the same type (SocketAddress) several times with the -netdev option
(needed to specify "local" and "remote" addresses).

The previous commit paved the way for parsing the modern way, but
omitted one detail: how to pick modern vs. traditional, in
netdev_is_modern().

We want to pick based on the value of parameter "type".  But how to
extract it from the option argument?

Parsing the option argument, either the modern or the traditional way,
extracts it for us, but only if parsing succeeds.

If parsing fails, there is no good option.  No matter which parser we
pick, it'll be the wrong one for some arguments, and the error
reporting will be confusing.

Fortunately, the traditional parser accepts *anything* when called in
a certain way.  This maximizes our chance to extract the value of
"type", and in turn minimizes the risk of confusing error reporting.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 53b85d9574 net: introduce qemu_set_info_str() function
Embed the setting of info_str in a function.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier f3eedcddba qapi: net: introduce a way to bypass qemu_opts_parse_noisily()
As qemu_opts_parse_noisily() flattens the QAPI structures ("type" field
of Netdev structure can collides with "type" field of SocketAddress),
we introduce a way to bypass qemu_opts_parse_noisily() and use directly
visit_type_Netdev() to parse the backend parameters.

More details from Markus:

qemu_init() passes the argument of -netdev, -nic, and -net to
net_client_parse().

net_client_parse() parses with qemu_opts_parse_noisily(), passing
QemuOptsList qemu_netdev_opts for -netdev, qemu_nic_opts for -nic, and
qemu_net_opts for -net.  Their desc[] are all empty, which means any
keys are accepted.  The result of the parse (a QemuOpts) is stored in
the QemuOptsList.

Note that QemuOpts is flat by design.  In some places, we layer non-flat
on top using dotted keys convention, but not here.

net_init_clients() iterates over the stored QemuOpts, and passes them to
net_init_netdev(), net_param_nic(), or net_init_client(), respectively.

These functions pass the QemuOpts to net_client_init().  They also do
other things with the QemuOpts, which we can ignore here.

net_client_init() uses the opts visitor to convert the (flat) QemOpts to
a (non-flat) QAPI object Netdev.  Netdev is also the argument of QMP
command netdev_add.

The opts visitor was an early attempt to support QAPI in
(QemuOpts-based) CLI.  It restricts QAPI types to a certain shape; see
commit eb7ee2cbeb "qapi: introduce OptsVisitor".

A more modern way to support QAPI is qobject_input_visitor_new_str().
It uses keyval_parse() instead of QemuOpts for KEY=VALUE,... syntax, and
it also supports JSON syntax.  The former isn't quite as expressive as
JSON, but it's a lot closer than QemuOpts + opts visitor.

This commit paves the way to use of the modern way instead.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 21fccb2cbb net: simplify net_client_parse() error management
All net_client_parse() callers exit in case of error.

Move exit(1) to net_client_parse() and remove error checking from
the callers.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier d63ef17bfc net: remove the @errp argument of net_client_inits()
The only caller passes &error_fatal, so use this directly in the function.

It's what we do for -blockdev, -device, and -object.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Laurent Vivier 30e4226b78 net: introduce convert_host_port()
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Brivio <sbrivio@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Daniel P. Berrangé 7d0e12af59 net: improve error message for missing netdev backend
The current message when using '-net user...' with SLIRP disabled at
compile time is:

  qemu-system-x86_64: -net user: Parameter 'type' expects a net backend type (maybe it is not compiled into this binary)

An observation is that we're using the 'netdev->type' field here which
is an enum value, produced after QAPI has converted from its string
form.

IOW, at this point in the code, we know that the user's specified
type name was a valid network backend. The only possible scenario that
can make the backend init function be NULL, is if support for that
backend was disabled at build time. Given this, we don't need to caveat
our error message with a 'maybe' hint, we can be totally explicit.

The use of QERR_INVALID_PARAMETER_VALUE doesn't really lend itself to
user friendly error message text. Since this is not used to set a
specific QAPI error class, we can simply stop using this pre-formatted
error text and provide something better.

Thus the new message is:

  qemu-system-x86_64: -net user: network backend 'user' is not compiled into this binary

The case of passing 'hubport' for -net is also given a message reminding
people they should have used -netdev/-nic instead, as this backend type
is only valid for the modern syntax.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Si-Wei Liu 8801ccd050 vhost-vdpa: allow passing opened vhostfd to vhost-vdpa
Similar to other vhost backends, vhostfd can be passed to vhost-vdpa
backend as another parameter to instantiate vhost-vdpa net client.
This would benefit the use case where only open file descriptors, as
opposed to raw vhost-vdpa device paths, are accessible from the QEMU
process.

(qemu) netdev_add type=vhost-vdpa,vhostfd=61,id=vhost-vdpa1

Signed-off-by: Si-Wei Liu <si-wei.liu@oracle.com>
Acked-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Eugenio Pérez 6ce262fbe7 vdpa: Remove shadow CVQ command check
The guest will see undefined behavior if it issue not negotiate
commands, bit it is expected somehow.

Simplify code deleting this check.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
Eugenio Pérez faa825ddb3 vdpa: Delete duplicated vdpa_feature_bits entry
This entry was duplicated on referenced commit. Removing it.

Fixes: 402378407d ("vhost-vdpa: multiqueue support")
Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-10-28 13:28:52 +08:00
lu zhipeng bf769f742c virtio: del net client if net_init_tap_one failed
If the net tap initializes successful, but failed during
network card hot-plugging, the net-tap will remains,
so cleanup.

Signed-off-by: lu zhipeng <luzhipeng@cestc.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-27 15:14:37 +08:00
Eugenio Pérez 72b99a87ce vdpa: Allow MQ feature in SVQ
Finally enable SVQ with MQ feature.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-27 15:14:37 +08:00
Eugenio Pérez 275ba561cd vdpa: validate MQ CVQ commands
So we are sure we can update the device model properly before sending to
the device.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-27 15:14:37 +08:00
Eugenio Pérez f64c7cda69 vdpa: Add vhost_vdpa_net_load_mq
Same way as with the MAC, restore the expected number of queues at
device's start.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-27 15:14:37 +08:00
Eugenio Pérez f73c0c43ac vdpa: extract vhost_vdpa_net_load_mac from vhost_vdpa_net_load
Since there may be many commands we need to issue to load the NIC
state, let's split them in individual functions

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-27 15:13:55 +08:00
Eugenio Pérez 17fb889f8a vdpa: Make VhostVDPAState cvq_cmd_in_buffer control ack type
This allows to simplify the code. Rename to status while we're at it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-27 15:13:55 +08:00
Zhang Chen 3772cf0d1b net/colo.c: Fix the pointer issue reported by Coverity.
When enabled the virtio-net-pci, guest network packet will
load the vnet_hdr. In COLO status, the primary VM's network
packet maybe redirect to another VM, it needs filter-redirect
enable the vnet_hdr flag at the same time, COLO-proxy will
correctly parse the original network packet. If have any
misconfiguration here, the vnet_hdr_len is wrong for parse
the packet, the data+offset will point to wrong place.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-02 10:22:39 +08:00
Eugenio Pérez 0e3fdcffea vdpa: Delete CVQ migration blocker
We can restore the device state in the destination via CVQ now. Remove
the migration blocker.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-02 10:22:39 +08:00
Eugenio Pérez dd036d8d27 vdpa: Add virtio-net mac address via CVQ at start
This is needed so the destination vdpa device see the same state a the
guest set in the source.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-02 10:22:39 +08:00
Eugenio Pérez be4278b65f vdpa: extract vhost_vdpa_net_cvq_add from vhost_vdpa_net_handle_ctrl_avail
So we can reuse it to inject state messages.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
--
v7:
* Remove double free error

v6:
* Do not assume in buffer sent to the device is sizeof(virtio_net_ctrl_ack)

v5:
* Do not use an artificial !NULL VirtQueueElement
* Use only out size instead of iovec dev_buffers for these functions.

Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-02 10:22:39 +08:00
Eugenio Pérez 7a7f87e94c vdpa: Move command buffers map to start of net device
As this series will reuse them to restore the device state at the end of
a migration (or a device start), let's allocate only once at the device
start so we don't duplicate their map and unmap.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-02 10:22:39 +08:00
Eugenio Pérez f8972b56ee vdpa: add net_vhost_vdpa_cvq_info NetClientInfo
Next patches will add a new info callback to restore NIC status through
CVQ. Since only the CVQ vhost device is needed, create it with a new
NetClientInfo.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-02 10:22:39 +08:00
Eugenio Pérez 69292a8e40 util: accept iova_tree_remove_parameter by value
It's convenient to call iova_tree_remove from a map returned from
iova_tree_find or iova_tree_find_iova. With the current code this is not
possible, since we will free it, and then we will try to search for it
again.

Fix it making accepting the map by value, forcing a copy of the
argument. Not applying a fixes tag, since there is no use like that at
the moment.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-09-02 10:22:39 +08:00
Eugenio Pérez aed5da45da vdpa: Fix file descriptor leak on get features error
File descriptor vdpa_device_fd is not free in the case of returning
error from vhost_vdpa_get_features. Fixing it by making all errors go to
the same error path.

Resolves: Coverity CID 1490785
Fixes: 8170ab3f43 ("vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs")

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20220802112447.249436-2-eperezma@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-08-04 13:46:08 +02:00
Zhang Chen 8bdab83b34 net/colo.c: fix segmentation fault when packet is not parsed correctly
When COLO use only one vnet_hdr_support parameter between
filter-redirector and filter-mirror(or colo-compare), COLO will crash
with segmentation fault. Back track as follow:

Thread 1 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555cb200b in eth_get_l2_hdr_length (p=0x0)
    at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296
296         uint16_t proto = be16_to_cpu(PKT_GET_ETH_HDR(p)->h_proto);
(gdb) bt
0  0x0000555555cb200b in eth_get_l2_hdr_length (p=0x0)
    at /home/tao/project/COLO/colo-qemu/include/net/eth.h:296
1  0x0000555555cb22b4 in parse_packet_early (pkt=0x555556a44840) at
net/colo.c:49
2  0x0000555555cb2b91 in is_tcp_packet (pkt=0x555556a44840) at
net/filter-rewriter.c:63

So wrong vnet_hdr_len will cause pkt->data become NULL. Add check to
raise error and add trace-events to track vnet_hdr_len.

Signed-off-by: Tao Xu <tao3.xu@intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-07-20 16:58:08 +08:00
Zhang Chen 94c36c4875 net/colo.c: No need to track conn_list for filter-rewriter
Filter-rewriter no need to track connection in conn_list.
This patch fix the glib g_queue_is_empty assertion when COLO guest
keep a lot of network connection.

Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Li Zhijian <lizhijian@fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-07-20 16:58:08 +08:00
Zhang Chen a18d436954 net/colo: Fix a "double free" crash to clear the conn_list
We notice the QEMU may crash when the guest has too many
incoming network connections with the following log:

15197@1593578622.668573:colo_proxy_main : colo proxy connection hashtable full, clear it
free(): invalid pointer
[1]    15195 abort (core dumped)  qemu-system-x86_64 ....

This is because we create the s->connection_track_table with
g_hash_table_new_full() which is defined as:

GHashTable * g_hash_table_new_full (GHashFunc hash_func,
                       GEqualFunc key_equal_func,
                       GDestroyNotify key_destroy_func,
                       GDestroyNotify value_destroy_func);

The fourth parameter connection_destroy() will be called to free the
memory allocated for all 'Connection' values in the hashtable when
we call g_hash_table_remove_all() in the connection_hashtable_reset().

But both connection_track_table and conn_list reference to the same
conn instance. It will trigger double free in conn_list clear. So this
patch remove free action on hash table side to avoid double free the
conn.

Signed-off-by: Like Xu <like.xu@linux.intel.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-07-20 16:58:08 +08:00
Eugenio Pérez 1576dbb5bb vdpa: Add x-svq to NetdevVhostVDPAOptions
Finally offering the possibility to enable SVQ from the command line.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-07-20 16:58:08 +08:00
Eugenio Pérez 8170ab3f43 vdpa: Extract get features part from vhost_vdpa_get_max_queue_pairs
To know the device features is needed for CVQ SVQ, so SVQ knows if it
can handle all commands or not. Extract from
vhost_vdpa_get_max_queue_pairs so we can reuse it.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-07-20 16:58:08 +08:00
Eugenio Pérez 2df4dd31e1 vdpa: Buffer CVQ support on shadow virtqueue
Introduce the control virtqueue support for vDPA shadow virtqueue. This
is needed for advanced networking features like rx filtering.

Virtio-net control VQ copies the descriptors to qemu's VA, so we avoid
TOCTOU with the guest's or device's memory every time there is a device
model change.  Otherwise, the guest could change the memory content in
the time between qemu and the device read it.

To demonstrate command handling, VIRTIO_NET_F_CTRL_MACADDR is
implemented.  If the virtio-net driver changes MAC the virtio-net device
model will be updated with the new one, and a rx filtering change event
will be raised.

More cvq commands could be added here straightforwardly but they have
not been tested.

Signed-off-by: Eugenio Pérez <eperezma@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2022-07-20 16:58:08 +08:00