qemu/meson_options.txt
Ilya Maximets cb039ef3d9 net: add initial support for AF_XDP network backend
AF_XDP is a network socket family that allows communication directly
with the network device driver in the kernel, bypassing most or all
of the kernel networking stack.  In the essence, the technology is
pretty similar to netmap.  But, unlike netmap, AF_XDP is Linux-native
and works with any network interfaces without driver modifications.
Unlike vhost-based backends (kernel, user, vdpa), AF_XDP doesn't
require access to character devices or unix sockets.  Only access to
the network interface itself is necessary.

This patch implements a network backend that communicates with the
kernel by creating an AF_XDP socket.  A chunk of userspace memory
is shared between QEMU and the host kernel.  4 ring buffers (Tx, Rx,
Fill and Completion) are placed in that memory along with a pool of
memory buffers for the packet data.  Data transmission is done by
allocating one of the buffers, copying packet data into it and
placing the pointer into Tx ring.  After transmission, device will
return the buffer via Completion ring.  On Rx, device will take
a buffer form a pre-populated Fill ring, write the packet data into
it and place the buffer into Rx ring.

AF_XDP network backend takes on the communication with the host
kernel and the network interface and forwards packets to/from the
peer device in QEMU.

Usage example:

  -device virtio-net-pci,netdev=guest1,mac=00:16:35:AF:AA:5C
  -netdev af-xdp,ifname=ens6f1np1,id=guest1,mode=native,queues=1

XDP program bridges the socket with a network interface.  It can be
attached to the interface in 2 different modes:

1. skb - this mode should work for any interface and doesn't require
         driver support.  With a caveat of lower performance.

2. native - this does require support from the driver and allows to
            bypass skb allocation in the kernel and potentially use
            zero-copy while getting packets in/out userspace.

By default, QEMU will try to use native mode and fall back to skb.
Mode can be forced via 'mode' option.  To force 'copy' even in native
mode, use 'force-copy=on' option.  This might be useful if there is
some issue with the driver.

Option 'queues=N' allows to specify how many device queues should
be open.  Note that all the queues that are not open are still
functional and can receive traffic, but it will not be delivered to
QEMU.  So, the number of device queues should generally match the
QEMU configuration, unless the device is shared with something
else and the traffic re-direction to appropriate queues is correctly
configured on a device level (e.g. with ethtool -N).
'start-queue=M' option can be used to specify from which queue id
QEMU should start configuring 'N' queues.  It might also be necessary
to use this option with certain NICs, e.g. MLX5 NICs.  See the docs
for examples.

In a general case QEMU will need CAP_NET_ADMIN and CAP_SYS_ADMIN
or CAP_BPF capabilities in order to load default XSK/XDP programs to
the network interface and configure BPF maps.  It is possible, however,
to run with no capabilities.  For that to work, an external process
with enough capabilities will need to pre-load default XSK program,
create AF_XDP sockets and pass their file descriptors to QEMU process
on startup via 'sock-fds' option.  Network backend will need to be
configured with 'inhibit=on' to avoid loading of the program.
QEMU will need 32 MB of locked memory (RLIMIT_MEMLOCK) per queue
or CAP_IPC_LOCK.

There are few performance challenges with the current network backends.

First is that they do not support IO threads.  This means that data
path is handled by the main thread in QEMU and may slow down other
work or may be slowed down by some other work.  This also means that
taking advantage of multi-queue is generally not possible today.

Another thing is that data path is going through the device emulation
code, which is not really optimized for performance.  The fastest
"frontend" device is virtio-net.  But it's not optimized for heavy
traffic either, because it expects such use-cases to be handled via
some implementation of vhost (user, kernel, vdpa).  In practice, we
have virtio notifications and rcu lock/unlock on a per-packet basis
and not very efficient accesses to the guest memory.  Communication
channels between backend and frontend devices do not allow passing
more than one packet at a time as well.

Some of these challenges can be avoided in the future by adding better
batching into device emulation or by implementing vhost-af-xdp variant.

There are also a few kernel limitations.  AF_XDP sockets do not
support any kinds of checksum or segmentation offloading.  Buffers
are limited to a page size (4K), i.e. MTU is limited.  Multi-buffer
support implementation for AF_XDP is in progress, but not ready yet.
Also, transmission in all non-zero-copy modes is synchronous, i.e.
done in a syscall.  That doesn't allow high packet rates on virtual
interfaces.

However, keeping in mind all of these challenges, current implementation
of the AF_XDP backend shows a decent performance while running on top
of a physical NIC with zero-copy support.

Test setup:

2 VMs running on 2 physical hosts connected via ConnectX6-Dx card.
Network backend is configured to open the NIC directly in native mode.
The driver supports zero-copy.  NIC is configured to use 1 queue.

Inside a VM - iperf3 for basic TCP performance testing and dpdk-testpmd
for PPS testing.

iperf3 result:
 TCP stream      : 19.1 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
 Tx only         : 3.4 Mpps
 Rx only         : 2.0 Mpps
 L2 FWD Loopback : 1.5 Mpps

In skb mode the same setup shows much lower performance, similar to
the setup where pair of physical NICs is replaced with veth pair:

iperf3 result:
  TCP stream      : 9 Gbps

dpdk-testpmd (single queue, single CPU core, 64 B packets) results:
  Tx only         : 1.2 Mpps
  Rx only         : 1.0 Mpps
  L2 FWD Loopback : 0.7 Mpps

Results in skb mode or over the veth are close to results of a tap
backend with vhost=on and disabled segmentation offloading bridged
with a NIC.

Signed-off-by: Ilya Maximets <i.maximets@ovn.org>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com> (docker/lcitool)
Signed-off-by: Jason Wang <jasowang@redhat.com>
2023-09-18 14:36:13 +08:00

356 lines
18 KiB
Meson

# These options do not correspond to a --enable/--disable-* option
# on the configure script command line. If you add more, list them in
# scripts/meson-buildoptions.py's SKIP_OPTIONS constant too.
option('qemu_suffix', type : 'string', value: 'qemu',
description: 'Suffix for QEMU data/modules/config directories (can be empty)')
option('docdir', type : 'string', value : 'share/doc',
description: 'Base directory for documentation installation (can be empty)')
option('qemu_firmwarepath', type : 'array', value : ['share/qemu-firmware'],
description: 'search PATH for firmware files')
option('pkgversion', type : 'string', value : '',
description: 'use specified string as sub-version of the package')
option('smbd', type : 'string', value : '',
description: 'Path to smbd for slirp networking')
option('iasl', type : 'string', value : '',
description: 'Path to ACPI disassembler')
option('tls_priority', type : 'string', value : 'NORMAL',
description: 'Default TLS protocol/cipher priority string')
option('default_devices', type : 'boolean', value : true,
description: 'Include a default selection of devices in emulators')
option('audio_drv_list', type: 'array', value: ['default'],
choices: ['alsa', 'coreaudio', 'default', 'dsound', 'jack', 'oss', 'pa', 'pipewire', 'sdl', 'sndio'],
description: 'Set audio driver list')
option('block_drv_rw_whitelist', type : 'string', value : '',
description: 'set block driver read-write whitelist (by default affects only QEMU, not tools like qemu-img)')
option('block_drv_ro_whitelist', type : 'string', value : '',
description: 'set block driver read-only whitelist (by default affects only QEMU, not tools like qemu-img)')
option('interp_prefix', type : 'string', value : '/usr/gnemul/qemu-%M',
description: 'where to find shared libraries etc., use %M for cpu name')
option('fuzzing_engine', type : 'string', value : '',
description: 'fuzzing engine library for OSS-Fuzz')
option('trace_file', type: 'string', value: 'trace',
description: 'Trace file prefix for simple backend')
option('coroutine_backend', type: 'combo',
choices: ['ucontext', 'sigaltstack', 'windows', 'auto'],
value: 'auto', description: 'coroutine backend to use')
# Everything else can be set via --enable/--disable-* option
# on the configure script command line. After adding an option
# here make sure to run "make update-buildoptions".
option('docs', type : 'feature', value : 'auto',
description: 'Documentations build support')
option('fuzzing', type : 'boolean', value: false,
description: 'build fuzzing targets')
option('gettext', type : 'feature', value : 'auto',
description: 'Localization of the GTK+ user interface')
option('modules', type : 'feature', value : 'disabled',
description: 'modules support (non Windows)')
option('module_upgrades', type : 'boolean', value : false,
description: 'try to load modules from alternate paths for upgrades')
option('install_blobs', type : 'boolean', value : true,
description: 'install provided firmware blobs')
option('sparse', type : 'feature', value : 'auto',
description: 'sparse checker')
option('guest_agent', type : 'feature', value : 'auto',
description: 'Build QEMU Guest Agent')
option('guest_agent_msi', type : 'feature', value : 'auto',
description: 'Build MSI package for the QEMU Guest Agent')
option('tools', type : 'feature', value : 'auto',
description: 'build support utilities that come with QEMU')
option('qga_vss', type : 'feature', value: 'auto',
description: 'build QGA VSS support (broken with MinGW)')
option('malloc_trim', type : 'feature', value : 'auto',
description: 'enable libc malloc_trim() for memory optimization')
option('malloc', type : 'combo', choices : ['system', 'tcmalloc', 'jemalloc'],
value: 'system', description: 'choose memory allocator to use')
option('kvm', type: 'feature', value: 'auto',
description: 'KVM acceleration support')
option('whpx', type: 'feature', value: 'auto',
description: 'WHPX acceleration support')
option('hvf', type: 'feature', value: 'auto',
description: 'HVF acceleration support')
option('nvmm', type: 'feature', value: 'auto',
description: 'NVMM acceleration support')
option('xen', type: 'feature', value: 'auto',
description: 'Xen backend support')
option('xen_pci_passthrough', type: 'feature', value: 'auto',
description: 'Xen PCI passthrough support')
option('tcg', type: 'feature', value: 'enabled',
description: 'TCG support')
option('plugins', type: 'boolean', value: false,
description: 'TCG plugins via shared library loading')
option('debug_tcg', type: 'boolean', value: false,
description: 'TCG debugging')
option('tcg_interpreter', type: 'boolean', value: false,
description: 'TCG with bytecode interpreter (slow)')
option('safe_stack', type: 'boolean', value: false,
description: 'SafeStack Stack Smash Protection (requires clang/llvm and coroutine backend ucontext)')
option('sanitizers', type: 'boolean', value: false,
description: 'enable default sanitizers')
option('tsan', type: 'boolean', value: false,
description: 'enable thread sanitizer')
option('stack_protector', type: 'feature', value: 'auto',
description: 'compiler-provided stack protection')
option('cfi', type: 'boolean', value: false,
description: 'Control-Flow Integrity (CFI)')
option('cfi_debug', type: 'boolean', value: false,
description: 'Verbose errors in case of CFI violation')
option('multiprocess', type: 'feature', value: 'auto',
description: 'Out of process device emulation support')
option('vfio_user_server', type: 'feature', value: 'disabled',
description: 'vfio-user server support')
option('dbus_display', type: 'feature', value: 'auto',
description: '-display dbus support')
option('tpm', type : 'feature', value : 'auto',
description: 'TPM support')
# Do not enable it by default even for Mingw32, because it doesn't
# work on Wine.
option('membarrier', type: 'feature', value: 'disabled',
description: 'membarrier system call (for Linux 4.14+ or Windows')
option('avx2', type: 'feature', value: 'auto',
description: 'AVX2 optimizations')
option('avx512f', type: 'feature', value: 'disabled',
description: 'AVX512F optimizations')
option('avx512bw', type: 'feature', value: 'auto',
description: 'AVX512BW optimizations')
option('keyring', type: 'feature', value: 'auto',
description: 'Linux keyring support')
option('af_xdp', type : 'feature', value : 'auto',
description: 'AF_XDP network backend support')
option('attr', type : 'feature', value : 'auto',
description: 'attr/xattr support')
option('auth_pam', type : 'feature', value : 'auto',
description: 'PAM access control')
option('brlapi', type : 'feature', value : 'auto',
description: 'brlapi character device driver')
option('bzip2', type : 'feature', value : 'auto',
description: 'bzip2 support for DMG images')
option('cap_ng', type : 'feature', value : 'auto',
description: 'cap_ng support')
option('blkio', type : 'feature', value : 'auto',
description: 'libblkio block device driver')
option('bpf', type : 'feature', value : 'auto',
description: 'eBPF support')
option('cocoa', type : 'feature', value : 'auto',
description: 'Cocoa user interface (macOS only)')
option('curl', type : 'feature', value : 'auto',
description: 'CURL block device driver')
option('gio', type : 'feature', value : 'auto',
description: 'use libgio for D-Bus support')
option('glusterfs', type : 'feature', value : 'auto',
description: 'Glusterfs block device driver')
option('libdw', type : 'feature', value : 'auto',
description: 'debuginfo support')
option('libiscsi', type : 'feature', value : 'auto',
description: 'libiscsi userspace initiator')
option('libnfs', type : 'feature', value : 'auto',
description: 'libnfs block device driver')
option('mpath', type : 'feature', value : 'auto',
description: 'Multipath persistent reservation passthrough')
option('numa', type : 'feature', value : 'auto',
description: 'libnuma support')
option('iconv', type : 'feature', value : 'auto',
description: 'Font glyph conversion support')
option('curses', type : 'feature', value : 'auto',
description: 'curses UI')
option('gnutls', type : 'feature', value : 'auto',
description: 'GNUTLS cryptography support')
option('nettle', type : 'feature', value : 'auto',
description: 'nettle cryptography support')
option('gcrypt', type : 'feature', value : 'auto',
description: 'libgcrypt cryptography support')
option('crypto_afalg', type : 'feature', value : 'disabled',
description: 'Linux AF_ALG crypto backend driver')
option('libdaxctl', type : 'feature', value : 'auto',
description: 'libdaxctl support')
option('libpmem', type : 'feature', value : 'auto',
description: 'libpmem support')
option('libssh', type : 'feature', value : 'auto',
description: 'ssh block device support')
option('libudev', type : 'feature', value : 'auto',
description: 'Use libudev to enumerate host devices')
option('libusb', type : 'feature', value : 'auto',
description: 'libusb support for USB passthrough')
option('linux_aio', type : 'feature', value : 'auto',
description: 'Linux AIO support')
option('linux_io_uring', type : 'feature', value : 'auto',
description: 'Linux io_uring support')
option('lzfse', type : 'feature', value : 'auto',
description: 'lzfse support for DMG images')
option('lzo', type : 'feature', value : 'auto',
description: 'lzo compression support')
option('rbd', type : 'feature', value : 'auto',
description: 'Ceph block device driver')
option('opengl', type : 'feature', value : 'auto',
description: 'OpenGL support')
option('rdma', type : 'feature', value : 'auto',
description: 'Enable RDMA-based migration')
option('pvrdma', type : 'feature', value : 'auto',
description: 'Enable PVRDMA support')
option('gtk', type : 'feature', value : 'auto',
description: 'GTK+ user interface')
option('sdl', type : 'feature', value : 'auto',
description: 'SDL user interface')
option('sdl_image', type : 'feature', value : 'auto',
description: 'SDL Image support for icons')
option('seccomp', type : 'feature', value : 'auto',
description: 'seccomp support')
option('smartcard', type : 'feature', value : 'auto',
description: 'CA smartcard emulation support')
option('snappy', type : 'feature', value : 'auto',
description: 'snappy compression support')
option('spice', type : 'feature', value : 'auto',
description: 'Spice server support')
option('spice_protocol', type : 'feature', value : 'auto',
description: 'Spice protocol support')
option('u2f', type : 'feature', value : 'auto',
description: 'U2F emulation support')
option('canokey', type : 'feature', value : 'auto',
description: 'CanoKey support')
option('usb_redir', type : 'feature', value : 'auto',
description: 'libusbredir support')
option('l2tpv3', type : 'feature', value : 'auto',
description: 'l2tpv3 network backend support')
option('netmap', type : 'feature', value : 'auto',
description: 'netmap network backend support')
option('slirp', type: 'feature', value: 'auto',
description: 'libslirp user mode network backend support')
option('vde', type : 'feature', value : 'auto',
description: 'vde network backend support')
option('vmnet', type : 'feature', value : 'auto',
description: 'vmnet.framework network backend support')
option('virglrenderer', type : 'feature', value : 'auto',
description: 'virgl rendering support')
option('png', type : 'feature', value : 'auto',
description: 'PNG support with libpng')
option('vnc', type : 'feature', value : 'auto',
description: 'VNC server')
option('vnc_jpeg', type : 'feature', value : 'auto',
description: 'JPEG lossy compression for VNC server')
option('vnc_sasl', type : 'feature', value : 'auto',
description: 'SASL authentication for VNC server')
option('vte', type : 'feature', value : 'auto',
description: 'vte support for the gtk UI')
# GTK Clipboard implementation is disabled by default, since it may cause hangs
# of the guest VCPUs. See gitlab issue 1150:
# https://gitlab.com/qemu-project/qemu/-/issues/1150
option('gtk_clipboard', type: 'feature', value : 'disabled',
description: 'clipboard support for the gtk UI (EXPERIMENTAL, MAY HANG)')
option('xkbcommon', type : 'feature', value : 'auto',
description: 'xkbcommon support')
option('zstd', type : 'feature', value : 'auto',
description: 'zstd compression support')
option('fuse', type: 'feature', value: 'auto',
description: 'FUSE block device export')
option('fuse_lseek', type : 'feature', value : 'auto',
description: 'SEEK_HOLE/SEEK_DATA support for FUSE exports')
option('trace_backends', type: 'array', value: ['log'],
choices: ['dtrace', 'ftrace', 'log', 'nop', 'simple', 'syslog', 'ust'],
description: 'Set available tracing backends')
option('alsa', type: 'feature', value: 'auto',
description: 'ALSA sound support')
option('coreaudio', type: 'feature', value: 'auto',
description: 'CoreAudio sound support')
option('dsound', type: 'feature', value: 'auto',
description: 'DirectSound sound support')
option('jack', type: 'feature', value: 'auto',
description: 'JACK sound support')
option('oss', type: 'feature', value: 'auto',
description: 'OSS sound support')
option('pa', type: 'feature', value: 'auto',
description: 'PulseAudio sound support')
option('pipewire', type: 'feature', value: 'auto',
description: 'PipeWire sound support')
option('sndio', type: 'feature', value: 'auto',
description: 'sndio sound support')
option('vhost_kernel', type: 'feature', value: 'auto',
description: 'vhost kernel backend support')
option('vhost_net', type: 'feature', value: 'auto',
description: 'vhost-net kernel acceleration support')
option('vhost_user', type: 'feature', value: 'auto',
description: 'vhost-user backend support')
option('vhost_crypto', type: 'feature', value: 'auto',
description: 'vhost-user crypto backend support')
option('vhost_vdpa', type: 'feature', value: 'auto',
description: 'vhost-vdpa kernel backend support')
option('vhost_user_blk_server', type: 'feature', value: 'auto',
description: 'build vhost-user-blk server')
option('virtfs', type: 'feature', value: 'auto',
description: 'virtio-9p support')
option('virtfs_proxy_helper', type: 'feature', value: 'auto',
description: 'virtio-9p proxy helper support')
option('libvduse', type: 'feature', value: 'auto',
description: 'build VDUSE Library')
option('vduse_blk_export', type: 'feature', value: 'auto',
description: 'VDUSE block export support')
option('capstone', type: 'feature', value: 'auto',
description: 'Whether and how to find the capstone library')
option('fdt', type: 'combo', value: 'auto',
choices: ['disabled', 'enabled', 'auto', 'system', 'internal'],
description: 'Whether and how to find the libfdt library')
option('selinux', type: 'feature', value: 'auto',
description: 'SELinux support in qemu-nbd')
option('live_block_migration', type: 'feature', value: 'auto',
description: 'block migration in the main migration stream')
option('replication', type: 'feature', value: 'auto',
description: 'replication support')
option('colo_proxy', type: 'feature', value: 'auto',
description: 'colo-proxy support')
option('bochs', type: 'feature', value: 'auto',
description: 'bochs image format support')
option('cloop', type: 'feature', value: 'auto',
description: 'cloop image format support')
option('dmg', type: 'feature', value: 'auto',
description: 'dmg image format support')
option('qcow1', type: 'feature', value: 'auto',
description: 'qcow1 image format support')
option('vdi', type: 'feature', value: 'auto',
description: 'vdi image format support')
option('vhdx', type: 'feature', value: 'auto',
description: 'vhdx image format support')
option('vmdk', type: 'feature', value: 'auto',
description: 'vmdk image format support')
option('vpc', type: 'feature', value: 'auto',
description: 'vpc image format support')
option('vvfat', type: 'feature', value: 'auto',
description: 'vvfat image format support')
option('qed', type: 'feature', value: 'auto',
description: 'qed image format support')
option('parallels', type: 'feature', value: 'auto',
description: 'parallels image format support')
option('block_drv_whitelist_in_tools', type: 'boolean', value: false,
description: 'use block whitelist also in tools instead of only QEMU')
option('rng_none', type: 'boolean', value: false,
description: 'dummy RNG, avoid using /dev/(u)random and getrandom()')
option('coroutine_pool', type: 'boolean', value: true,
description: 'coroutine freelist (better performance)')
option('debug_graph_lock', type: 'boolean', value: false,
description: 'graph lock debugging support')
option('debug_mutex', type: 'boolean', value: false,
description: 'mutex debugging support')
option('debug_stack_usage', type: 'boolean', value: false,
description: 'measure coroutine stack usage')
option('qom_cast_debug', type: 'boolean', value: true,
description: 'cast debugging support')
option('gprof', type: 'boolean', value: false,
description: 'QEMU profiling with gprof',
deprecated: true)
option('slirp_smbd', type : 'feature', value : 'auto',
description: 'use smbd (at path --smbd=*) in slirp networking')
option('hexagon_idef_parser', type : 'boolean', value : true,
description: 'use idef-parser to automatically generate TCG code for the Hexagon frontend')