system/freebsd-src

mirror of https://github.com/freebsd/freebsd-src synced 2024-10-15 04:43:53 +00:00

Author	SHA1	Message	Date
Warner Losh	95ee2897e9	sys: Remove $FreeBSD$: two-line .h pattern Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/	2023-08-16 11:54:11 -06:00
Ed Maste	a51f81c2e5	x86: move EARLY_AP_STARTUP into DEFAULTS EARLY_AP_STARTUP was introduced in 2016 (commit `fdce57a042`) with note: As a transition aid, the new behavior is moved under a new kernel option (EARLY_AP_STARTUP). This will allow the option to be turned off if need be during initial testing. I hope to enable this on x86 by default in a followup commit ... It was enabled by default, but became effectively mandatory (on x86) some time later. Move it to DEFAULTS to avoid an unbootable system if the option is left out of a custom kernel configuration file. Reported by: wollman Reviewed by: jhb Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D41352	2023-08-14 16:17:48 -04:00
Marius Strobl	37c8ee8847	ath(4): Remove MIPS AHB frontend and join PCI one w/ main support again Following the removal of general MIPS support, there's no longer a need to have the AHB bus-frontend in place, which according to Linux sources also isn't used with any non-MIPS SoCs. For simplicity, PCI bus support is only made conditional on the main one again, i. e. device ath_pci is removed, and built into the main module, i. e. if_ath_pci.ko obsoleted, respectively. Effectively, this reverts the following commits and associated changes: `dba9c85977` `e849bb3ecb` Approved by: adrian Relnotes: yes Differential Revision: https://reviews.freebsd.org/D41354	2023-08-08 22:30:13 +02:00
Mark Johnston	78cc000cba	amd64: Increase sanitizers' static shadow memory reservation Because KASAN shadows the kernel image itself (KMSAN currently does not), a shadow mapping of the boot stack must be created very early during boot. pmap_san_enter() reserves a fixed number of pages for the purpose of creating and mapping this shadow region. After commit `789df254cc` ("amd64: Use a larger boot stack"), it could happen that this reservation is insufficient; this happens when bootstack crosses a PAGE_SHIFT + KASAN_SHADOW_SCALE_SHIFT boundary. Update the calculation to take into account the new size of the boot stack. Fixes: `789df254cc` ("amd64: Use a larger boot stack") Sponsored by: The FreeBSD Foundation	2023-08-04 12:38:24 -04:00
Dmitry Chagin	b5c0b9555d	linux(4): Regen for ioprio syscalls MFC after: 1 month	2023-08-04 16:03:57 +03:00
Dmitry Chagin	1c83154e49	linux(4): Modify ioprio syscalls to match Linux MFC after: 1 month	2023-08-04 16:03:55 +03:00
Ed Maste	9051987e40	amd64: Bump MAXCPU to 1024 (from 256) Hardware with more than 256 CPU cores is currently available and will become increasingly common over FreeBSD 14's lifetime. Increase MAXCPU in the amd64 GENERIC kernel configuration to 1024. Earlier commits increased some related limits. These prerequisite commits include at least: - d7ed40243769 Increase MAX_APIC_ID safeguard to 0x800 - `d1639e43c5` cpuset: increase userland maximum size to 1024 Global and allocated arrays sized by MAXCPU result in excessive bloat on systems with lower core counts. In addition, some code used u_char (8 bits) to hold a CPU index, which is not valid if MAXCPU is greater than 256. A number of recent commits addressed these sorts of issues, including at least: - `133935d26f` pf: atomically increment state ids - `74ac712f72` vmm: Dynamically allocate a couple of per-CPU state save areas - `78cfa762eb` callout: Move per-CPU callout state into the dpcpu region - `42f722e721` amd64: store pcids pmap data in pcpu zone - `9801e7c275` smp_topo: dynamically allocate group array - `9fb6718d1b` smp: Dynamically allocate the stoppcbs array - `2bb16c6352` x86: retire use of intr_bind There are some additional allocations still to be converted and more scalability work is required to make effective use of very high core count systems, but this change allows us to boot on these systems and provides a Kernel Binary Interface (KBI) for the FreeBSD 14 release that supports these configurations. Special thanks to AMD for providing hardware to test these changes. PR: 269572 Reviewed by: des Relnotes: Yes Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D36838	2023-08-03 17:41:26 -04:00
Gordon Bergling	29eab3e4e0	linux(4): Fix two typos in source code comments - s/decriptors/descriptors/ MFC after: 3 days	2023-08-02 11:55:30 +02:00
Mark Johnston	5ad29bc8d4	amd64: Fix TLB invalidation routines in !SMP kernels amd64 is special in that its implementation of zpcpu_offset_cpu() is not the identity transformation, even in !SMP kernels. Because the pm_pcidp array of amd64's struct pmap is allocated from a pcpu UMA zone, this means that accessing pm_pcidp directly, as is done in !SMP implementations of pmap_invalidate_*, does not work. Specifically, I see occasional inexplicable crashes in userspace when PCIDs are enabled. Apply a minimal patch to fix the problem. While it would also make sense to provide separate implementations of zpcpu_* for !SMP kernels, fixing it this way makes the SMP and !SMP implementations of pmap_invalidate_* more similar. Reviewed by: alc, kib MFC after: 1 week Sponsored by: Klara, Inc. Sponsored by: Juniper Networks, Inc. Differential Revision: https://reviews.freebsd.org/D41230	2023-07-30 11:12:35 -04:00
Alan Cox	3d7c37425e	amd64 pmap: Catch up with pctrie changes Recent changes to the pctrie code make it necessary to initialize the kernel pmap's rangeset for PKU.	2023-07-28 15:13:13 -05:00
Dmitry Chagin	4281dab8bc	linux(4): Add elf_hwcap2 to x86 On x86 Linux via AT_HWCAP2 the user controlled (by tunables) processor capabilities are exposed. Reviewed by: Differential Revision: https://reviews.freebsd.org/D41165 MFC after: 2 weeks	2023-07-28 11:56:59 +03:00
Mark Johnston	640e5cb304	kmsan: Add a comment explaining why KMSAN doesn't shadow above KERNBASE Sponsored by: The FreeBSD Foundation	2023-07-27 16:01:58 -04:00
Mark Johnston	789df254cc	amd64: Use a larger boot stack With sanitizers enabled, it becomes possible to overflow the stack when only a single page is used. Follow arm64's example and use the default kernel stack size instead. This is a bit wasteful, but without a guard page, overflow merely corrupts adjacent .bss entries and is thus difficult to debug. Note, with a GENERIC kernel we already consume over half of the available boot stack space, see the review for an example. Reviewed by: kib Reported by: Jenkins MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D41166	2023-07-24 18:49:36 -04:00
Dmitry Chagin	d9c2dc6bf1	linux(4): Regen for xattr syscalls MFC after: 1 month	2023-07-22 14:03:32 +03:00
Dmitry Chagin	41f2c69ee3	linux(4): Modify xattr syscalls to match Linux MFC after: 1 month	2023-07-22 14:03:31 +03:00
Kristof Provost	208fcb55e3	Fix MINIMAL build on amd64 amd64/include/counter.h uses KASSERT, but failed to include the kassert.h header.	2023-07-14 09:18:43 +02:00
Doug Moore	3e04ae433f	vm_radix_init: use initializer Several vm_radix tries are not initialized with vm_radix_init. That works, for now, since static initialization zeroes the root field anyway, but if initialization changes, these tries will fail. Add missing initializer calls. Reviewed by: alc, kib, markj Differential Revision: https://reviews.freebsd.org/D40971	2023-07-14 01:49:55 -05:00
Yufeng Zhou	294c52d969	amd64 pmap: Fix compilation when superpage reservations are disabled The function pmap_pde_ept_executable() should not be conditionally compiled based on VM_NRESERVLEVEL. It is required indirectly by pmap_enter(..., psind=1) even when reservation-based allocation is disabled at compile time. Reviewed by: alc MFC after: 1 week	2023-07-12 12:07:42 -05:00
Gleb Smirnoff	0d1ff2b04d	vmm: don't leak locks exiting vmmdev_ioctl() At least an error from vcpu_lock_all() at line 553 would leak memseg lock. There might be other cases as well. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D40981	2023-07-12 09:16:40 -07:00
Gleb Smirnoff	30f0328a32	vmm: don't return random error from vcpu_lock_all() if vcpu is empty When vcpu array is empty, function would return random value from stack. What I observed was -1. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D40980	2023-07-12 09:16:40 -07:00
John Baldwin	2596008a0b	amd64 pcpu.h: Add missing 'do' from do-while loop around __PCPU_SET. Reported by: mjg Diagnosed by: jrtc27	2023-07-08 12:59:26 -07:00
John Baldwin	2329393c61	amd64: Use __seg_gs to implement per-CPU data accesses. This makes use of the alternate address space support in both GCC and clang to access per-CPU data as accesses relative to GS:. The original motivation for this is that it quiets verbose warnings from GCC 12. However, this version is also much easier to read and allows the compiler to generate better code (e.g. the compiler can use a GS: memory operand directly in other instructions such as IMUL and CMP rather than always MOVing to a temporary register). The one caveat is that the current approach is very inefficient at -O0 since the compiler expects to load the 0 base offset from a global variable instead of assuming it is 0 (even with the const). Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40647	2023-07-07 13:06:55 -07:00
Mitchell Horne	a89262079e	Consistently provide ffs/fls using builtins Use of compiler builtin ffs/ctz functions will result in optimized instruction sequences when possible, and fall back to calling a function provided by the compiler run-time library. We have slowly shifted our platforms to take advantage of these builtins in `60645781d6` (arm64), `1c76d3a9fb` (arm), `9e319462a0` (powerpc, partial). Some platforms still rely on the libkern implementations of these functions provided by libkern, namely riscv, powerpc (ffs, flsll), and i386 (ffsll and flsll). These routines are slow, as they perform a linear search for the bit in question. Even on platforms lacking dedicated bit-search instructions, such as riscv, the compiler library will provide better-optimized routines, e.g. by using binary search. Consolidate all definitions of these functions (whether currently using builtins or not) to libkern.h. This should result in equivalent or better performing routines in all cases. One wart in all of this is the existing HAVE_INLINE_F** macros, which we use in a few places to conditionally avoid the slow libkern routines. These aren't easily removed in one commit. For now, provide these defines unconditionally, but marked for removal after subsequent cleanup. Removal of the now unused libkern routines will follow in the next commit. Reviewed by: dougm, imp (previous version) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D40698	2023-07-06 14:46:41 -03:00
Mark O'Donovan	b0d3d44dfe	qlnxe: add driver to amd64 NOTES Signed-off-by: Mark O'Donovan <shiftee@posteo.net> Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/779	2023-07-01 11:06:59 -06:00
Alan Cox	0d2f98c2f0	amd64 pmap: Tidy up pmap_promote_pde() calls Since pmap_ps_enabled() is true by default, check it inside of pmap_promote_pde() instead of at every call site. Modify pmap_promote_pde() to return true if the promotion succeeded and false otherwise. Use this return value in a couple places. Reviewed by: kib, markj Differential Revision: https://reviews.freebsd.org/D40744	2023-06-24 13:09:04 -05:00
Alan Cox	34eeabff5a	amd64/arm64 pmap: Stop requiring the accessed bit for superpage promotion Stop requiring all of the PTEs to have the accessed bit set for superpage promotion to occur. Given that change, add support for promotion to pmap_enter_quick(), which does not set the accessed bit in the PTE that it creates. Since the final mapping within a superpage-aligned and sized region of a memory-mapped file is typically created by a call to pmap_enter_quick(), we now achieve promotions in circumstances where they did not occur before, for example, the X server's read-only mapping of libLLVM-15.so. See also https://www.usenix.org/system/files/atc20-zhu-weixi_0.pdf Reviewed by: kib, markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D40478	2023-06-12 13:40:57 -05:00
Warner Losh	9121945d70	Regenerate sysent stuff after $FreeBSD$ removal Sponsored by: Netflix	2023-06-09 07:28:27 -06:00
Dmitry Chagin	cbbac56091	linux(4): Preserve fpu xsave state across signal delivery on amd64 PR: 270247 Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40444 MFC after: 2 weeks	2023-06-09 01:33:26 +03:00
Dmitry Chagin	920184ed6e	linux(4): In preparation for xsave, refactor fxsave code on amd64 Because the fxsave area is OS independent, reimplement the handmade fxsave code by copying the whole area. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D40443 MFC after: 2 weeks	2023-06-09 01:32:46 +03:00
Dmitry Chagin	84617f6fcc	linux(4) rt_sendsig: Remove the use of caddr_t Replace caddr_t by more appropriate char *. MFC after: 2 weeks	2023-06-06 23:01:39 +03:00
Colin Percival	9d6ae1e3c2	Revert "Revert "tslog: Annotate some early boot functions"" Now that <sys/tslog.h> is wrapped in #ifdef _KERNEL, it's safe to have tslog annotations in files which might be built from userland (i.e. in subr_boot.c, which is built as part of the boot loader). This reverts commit `59588a546f`.	2023-06-04 22:49:38 -07:00
Xin LI	4d779448ad	gve: Fix build on i386 and enable LINT builds. Reviewed-by: imp Differential Revision: https://reviews.freebsd.org/D40419	2023-06-04 16:35:00 -07:00
Colin Percival	59588a546f	Revert "tslog: Annotate some early boot functions" The change to subr_boot.c broke the libsa build because the TSLOG macros have their own definitions for the boot loader -- I didn't realize that the loader code used subr_boot.c. I'm currently testing a fix and I'll revert this revert once I'm satisfied that everything works, but I don't want to leave the tree broken for too long. This reverts commit `469cfa3c30`.	2023-06-04 11:39:45 -07:00
Colin Percival	45cc8519f5	tslog: Annotate parts of SYSINIT cpu Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM, SYSINIT cpu takes roughly 2770 us: * 2280 us in vm_ksubmap_init * 535 us in kmem_malloc * 450 us in pmap_zero_page * 1720 us in pmap_growkernel * 1620 us in pmap_zero_page * 80 us in bufinit * 480 us in cpu_setregs * 430 us in cpu_setregs calling load_cr0 Much of this is hypervisor overhead: load_cr0 is slow because it traps to the hypervisor, and 99% of the time in pmap_zero_page is spent when we first touch the page, presumably due to the host Linux kernel faulting in backing pages one by one. Sponsored by: https://www.patreon.com/cperciva Differential Revision: https://reviews.freebsd.org/D40327	2023-06-04 10:16:35 -07:00
Colin Percival	2404380aac	tslog: Optionally instrument pmap_zero_page Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM, pmap_zero_page is responsible for 4.6 ms of the 25.0 ms of boot time. This is not in fact time spent zeroing pages though; almost all of that time is spent in a first-touch penalty, presumably due to the host Linux kernel faulting in backing pages one by one. There's probably a way to improve that by teaching Firecracker to fault in all the VM's pages from the start rather than having them faulted in one at a time, but that's outside of FreeBSD's control. This commit adds a TSLOG_PAGEZERO option which enables TSLOG on the amd64 pmap_zero_page function; it's a separate option (turned off by default even if TSLOG is enabled) since zeroing pages happens enough that it can easily fill the TSLOG buffer and prevent other timing information from being recorded. Sponsored by: https://www.patreon.com/cperciva Differential Revision: https://reviews.freebsd.org/D40326	2023-06-04 10:16:31 -07:00
Colin Percival	469cfa3c30	tslog: Annotate some early boot functions Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM, hammer_time takes roughly 2740 us: * 55 us in xen_pvh_parse_preload_data * 20 us in boot_parse_cmdline_delim * 20 us in boot_env_to_howto * 15 us in identify_hypervisor * 1320 us in link_elf_reloc * 1310 us in relocate_file1 handling ef->rela * 25 us in init_param1 * 30 us in dpcpu_init * 355 us in initializecpu * 255 us in initializecpu calling load_cr4 * 425 us in getmemsize * 280 us in pmap_bootstrap * 205 us in create_pagetables * 10 us in init_param2 * 25 us in pci_early_quirks * 60 us in cninit * 90 us in kdb_init * 105 us in msgbufinit * 20 us in fpuinit * 205 us elsewhere in hammer_time Some of these are unavoidable (e.g. identify_hypervisor uses CPUID and load_cr4 loads the CR4 register, both of which trap to the hypervisor) but others may deserve attention. Sponsored by: https://www.patreon.com/cperciva Differential Revision: https://reviews.freebsd.org/D40325	2023-06-04 10:16:22 -07:00
Mark Johnston	18282c4772	sysarch: Add includes required for ktrcapfail() calls to be compiled Reported by: jfree MFC after: 1 week	2023-06-01 17:18:23 -04:00
Mateusz Guzik	6217c2473d	amd64: zero-pad register dumps on panic de gustibus and so on Sponsored by: Rubicon Communications, LLC ("Netgate")	2023-05-30 13:15:56 +00:00
Dmitry Chagin	eb98f77910	linux(4): Regen for linux_execve MFC after: 2 month	2023-05-29 12:18:30 +03:00
Dmitry Chagin	8340b03425	linux(4): Add a dedicated linux_exec_copyin_args() Because Linux allows to exec binaries with 0 argc. Reviewed by: brooks Differential Revision: https://reviews.freebsd.org/D40148 MFC after: 2 month	2023-05-29 12:18:16 +03:00
Dmitry Chagin	d706d02edb	sysentvec: Retire sv_imgact_try as unneeded anymore The sysentvec sv_imgact_try was used by kern_exec() to allow non-native ABI to fixup shell path according to ABI root directory. Since the non-native ABI can now specify its root directory directly to namei() via pwd_altroot() call this facility is not needed anymore. Differential Revision: https://reviews.freebsd.org/D40092 MFC after: 2 month	2023-05-29 11:18:11 +03:00
Dmitry Chagin	57578deac7	Brandinfo: Retire emul_path as unneeded anymore The Brandinfo emul_path was used by the Elf image activator to fixup interpreter file name according to ABI root directory. Since the non-native ABI can now specify its root directory directly to namei() via pwd_altroot() call this facility is not needed anymore. Differential Revision: https://reviews.freebsd.org/D40091 MFC after: 2 month	2023-05-29 11:17:28 +03:00
Dmitry Chagin	fd745e1db6	linux(4): Use pwd_altroot() to tell namei() about ABI root path PR: 72920 Differential Revision: https://reviews.freebsd.org/D40090 MFC after: 2 month	2023-05-29 11:16:46 +03:00
Dmitry Chagin	78c2e58fa5	linux(4): Fix stack unwinding across signal frame on x86_64 Get rid of using register numbers which is undefined in libunwind on x86_64. Differential Revision: https://reviews.freebsd.org/D40156 MFC after: 1 month	2023-05-28 17:07:28 +03:00
Dmitry Chagin	037b60fb0f	linux(4): Preserve %rcx (return address) like Linux does Perhaps this does not make much sense, as destroying %rcx is declared by the x86_64 Linux syscall ABI. However: a) if we get a signal while we are in the kernel, we should restore tf_rcx when preparing machine context for signal handlers. b) the Linux world is strange, someone can depend on %rcx value after syscall, something like go. Differential Revision: https://reviews.freebsd.org/D40155 MFC after: 1 month	2023-05-28 17:06:47 +03:00
Dmitry Chagin	185bd9fa30	linux(4): Simplify %r10 restoring on amd64 Restore %r10 at system call entry to avoid doing this multiple times. Differential Revision: https://reviews.freebsd.org/D40154 MFC after: 1 month	2023-05-28 17:06:23 +03:00
Dmitry Chagin	a463dd8108	linux(4): Add a comment explaining registers at syscall entry point on amd64 Differential Revision: https://reviews.freebsd.org/D40153 MFC after: 1 month	2023-05-28 17:06:05 +03:00
Dmitry Chagin	a99b890ecd	linux(4): Drop a weird comment from linux_set_syscall_retval on amd64 I agree, it would be great to avoid PCB_FULL_IRET, however we should follow Linux system call ABI. Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D40152 MFC after: 1 month	2023-05-28 17:05:44 +03:00
Mark Johnston	9fb6718d1b	smp: Dynamically allocate the stoppcbs array This avoids bloating the kernel image when MAXCPU is large. A follow-up patch for kgdb and other kernel debuggers is needed since the stoppcbs symbol is now a pointer. Bump __FreeBSD_version so that debuggers can use osreldate to figure out how to handle stoppcbs. PR: 269572 MFC after: never Reviewed by: mjg, emaste Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D39806	2023-05-25 18:09:55 -04:00
Mark Johnston	e17eca3276	vmm: Avoid embedding cpuset_t ioctl ABIs Commit `0bda8d3e9f` ("vmm: permit some IPIs to be handled by userspace") embedded cpuset_t into the vmm(4) ioctl ABI. This was a mistake since we otherwise have some leeway to change the cpuset_t for the whole system, but we want to keep the vmm ioctl ABI stable. Rework IPI reporting to avoid this problem. Along the way, make VM_RUN a bit more efficient: - Split vmexit metadata out of the main VM_RUN structure. This data is only written by the kernel. - Have userspace pass a cpuset_t pointer and cpusetsize in the VM_RUN structure, as is done for cpuset syscalls. - Have the destination CPU mask for VM_EXITCODE_IPIs live outside the vmexit info structure, and make VM_RUN copy it out separately. Zero out any extra bytes in the CPU mask, like cpuset syscalls do. - Modify the vmexit handler prototype to take a full VM_RUN structure. PR: 271330 Reviewed by: corvink, jhb (previous versions) Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D40113	2023-05-23 21:15:59 -04:00
Dmitry Chagin	1d76741520	linux(4): Implement ptrace_pokeusr for x86_64 Differential Revision: https://reviews.freebsd.org/D40097 MFC after: 1 week	2023-05-18 20:02:35 +03:00
Dmitry Chagin	3d0addcd35	linux(4): Make ptrace_pokeusr machine dependent Differential Revision: https://reviews.freebsd.org/D40096 MFC after: 1 week	2023-05-18 20:01:12 +03:00
Dmitry Chagin	dd2a6cd701	linux(4): Make ptrace_peekusr machine dependent And partially implement it for x86_64. Differential Revision: https://reviews.freebsd.org/D40095 MFC after: 1 week	2023-05-18 20:00:12 +03:00
Piotr Pawel Stefaniak	411942a70e	GENERIC: remove a stray space character	2023-05-13 21:31:49 +02:00
Warner Losh	4d846d260e	spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch up to that fact and revert to their recommended match of BSD-2-Clause. Discussed with: pfg MFC After: 3 days Sponsored by: Netflix	2023-05-12 10:44:03 -06:00
Bjoern A. Zeeb	721b44ba5f	amd64: pmap.h put a guard around a pcpu.h function pmap_get_pcid() calls zpcpu_get() which is defined in pcpu.h. It is unclear why we do not include that header but like right above the change add another guard around pmap_get_pcid(). This allows some LinuxKPI headers to compile again. Suggested by: markj MFC after: 10 days	2023-05-12 11:14:54 +00:00
Warner Losh	062a7b918f	twe: Remove driver Sponsored by: Netflix	2023-05-10 22:24:12 -06:00
Konstantin Belousov	bf864c3ed5	amd64 MINIMAL: SysV IPC syscalls are loadable Reviewed by: emaste, imp Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39990	2023-05-09 18:30:07 +03:00
Konstantin Belousov	0c1c5e36eb	amd64 MINIMAL: remove UFS from compiled-in list Reviewed by: emaste, imp Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39990	2023-05-09 18:30:07 +03:00
Konstantin Belousov	bba6249ae9	amd64 MINIMAL config: remove statements about UFS module All UFS options work for ufs.ko. Reviewed by: emaste, imp Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39990	2023-05-09 18:30:07 +03:00
Vitaliy Gusev	c543e09f1f	bhyve: save/restore pir_desc Failing to preserve pir_desc can result in pending interrupts being lost on resume leading to a hung VM. Reviewed by: corvink, jhb MFC after: 1 week Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D35447	2023-05-09 10:31:27 +02:00
Bojan Novković	fefac54359	bhyve: fix vCPU single-stepping on VMX This patch fixes virtual machine single stepping on VMX hosts. Currently, when using bhyve's gdb stub, each attempt at single-stepping a vCPU lands in a timer interrupt. The current single-stepping mechanism uses the Monitor Trap Flag feature to cause VMEXIT after a single instruction is executed. Unfortunately, the SDM states that MTF causes VMEXITs for the next instruction that gets executed, which is often not what the person using the debugger expects. [1] This patch adds a new VM capability that masks interrupts on a vCPU by blocking interrupt injection and modifies the gdb stub to use the newly added capability while single-stepping a vCPU. [1] Intel SDM 26.5.2 Vol. 3C Reviewed by: corvink, jbh MFC after: 1 week Differential Revision: https://reviews.freebsd.org/D39949	2023-05-09 10:04:55 +02:00
Mitchell Horne	aba91805aa	hwpmc: use kstack_contains() This existing helper function is preferable to the hand-rolled calculation of the kstack bounds. Make some small style improvements while here. Notably, rename every instance of "r", the return address, to "ra". Tidy the includes in the affected files. Reviewed by: jkoshy MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D39909	2023-05-06 14:49:19 -03:00
Konstantin Belousov	38843fe0f2	amd64: add MINIMALUP config This is the MINIMAL config with SMP/NUMA options turned off. Useful to ensure that UP configuration still builds, until it is removed finally. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2023-05-06 14:24:07 +03:00
Konstantin Belousov	3a8c69c1ff	amd64 MINIMAL config: remove sentence about acpi On amd64 ACPI is required to boot, it cannot work as a module, and we do not build the ACPI module for long time. Sponsored by: The FreeBSD Foundation MFC after: 1 week	2023-05-06 14:24:07 +03:00
Konstantin Belousov	7c8e66ed8d	amd64: convert UP code to dynamically allocated pmap->pm_pcid Reported by: peterj Sponsored by: The FreeBSD Foundation	2023-05-06 14:24:07 +03:00
Corvin Köhne	b10e100d16	vmm: don't free unallocated memory If vmx or svm is disabled in BIOS or the device isn't supported by vmm, modinit won't allocate these state save areas. As kmem_free panics when passing a NULL pointer to it, loading the vmm kernel module causes a panic too. PR: 271251 Reviewed by: markj Fixes: `74ac712f72` ("vmm: Dynamically allocate a couple of per-CPU state save areas") MFC after: 1 week Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D39974	2023-05-05 15:34:00 +02:00
Igor Ostapenko	0167b5a793	sys/amd64/conf/FIRECRACKER: typo (compatiblity) https://bugs.freebsd.org/269753 PR: 269753 Reported by: Igor Ostapenko Approved by: doc, src (delphij, imp, zlei) Differential revision: https://reviews.freebsd.org/D38741	2023-05-05 01:23:08 +01:00
John Baldwin	4961faaacc	pmap_{un}map_io_transient: Use bool instead of boolean_t. Reviewed by: imp, kib Differential Revision: https://reviews.freebsd.org/D39920	2023-05-04 12:29:48 -07:00
John Baldwin	407f675718	imgact_elf: Change header_supported to return bool instead of boolean_t. Reviewed by: imp, kib, emaste Differential Revision: https://reviews.freebsd.org/D39919	2023-05-04 12:29:29 -07:00
Konstantin Belousov	3582acbad3	amd64 mp_machdep.c: remove useless comment Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39945	2023-05-04 18:39:22 +03:00
Konstantin Belousov	af1c6d3f30	amd64: do not leak pcpu pages Do not preallocate pcpu area backing pages on early startup, only allocate enough of KVA for pcpu[MAXCPU] and the page for BSP. Other pages are allocated after we know the number of cpus and their assignments to the domains. PCPUs are not accessed until they are initialized, which happens on AP startup. Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39945	2023-05-04 18:39:22 +03:00
Konstantin Belousov	e704f88f3d	amd64: initialize APs kpmap_store in init_secondary() The APs pcpu area is zeroed in init_secondary() by pcpu_init(), so the early initialization in pmap_bootstrap() is nop. Fixes: 42f722e721cd010ae5759a4b0d3b7b93c2b9cad2 Reviewed by: markj Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39945	2023-05-04 18:39:22 +03:00
Konstantin Belousov	42f722e721	amd64: store pcids pmap data in pcpu zone This change eliminates the struct pmap_pcid array embedded into struct pmap and sized by MAXCPU, which would bloat with MAXCPU increase. Also it removes false sharing of cache lines, since the array elements are mostly locally accessed by corresponding CPUs. Suggested by: mjg Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39890	2023-05-02 14:32:47 +03:00
Konstantin Belousov	9c8cbf3819	amd64 pmap_pcid_alloc(): pass a pointer to struct pmap_pcid instead of cpuid Cpuid is used to index the pmap->pm_pcids array only. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39890	2023-05-02 14:32:40 +03:00
Konstantin Belousov	9e0143694a	amd64: add pmap_get_pcid() helper Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39890	2023-05-02 14:32:35 +03:00
Konstantin Belousov	86b61ccb34	amd64 pmap: add pmap_pinit_pcids() helper to initialize pm_pcids array for a new user pmap Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39890	2023-05-02 14:32:29 +03:00
Konstantin Belousov	32bb28d8ad	amd64: move definition of the struct pmap_pcids into _pmap.h and rename the structure to pmap_pcid. Reviewed by: markj Tested by: pho Sponsored by: The FreeBSD Foundation Differential revision: https://reviews.freebsd.org/D39890	2023-05-02 14:32:20 +03:00
Dmitry Chagin	80d8a4a003	linux(4): Make struct stat64 to match Linux actual one	2023-04-28 11:55:04 +03:00
Dmitry Chagin	cd0fca82bb	linux(4): Regen for mknod syscall changes	2023-04-28 11:55:04 +03:00
Dmitry Chagin	ca3333dd4a	linux(4): Use Linux dev_t type for mknod syscalls dev argument As of version 2.6.0 of the Linux kernel, dev_t is a 32-bit unsigned integer on all platforms. Prior to the 2.6 kernel, the dev_t type was an unsigned short. However, since the first commit of the Linuxulator, the mknod syscall has taken an int dev argument. Also, there is some confusion here: while the kernel declares the dev_t type as 32 bits wide, the user-space dev_t type can be 64 bits wide, e.g., in the Glibc library. To avoid confusion and to help porting of the Linuxulator to other platforms use explicit l_dev_t for dev argument of mknod syscalls.	2023-04-28 11:55:02 +03:00
Dmitry Chagin	19973638be	linux(4): Move dev_t type declaration under /compat/linux As of version 2.6.0 of the Linux kernel, dev_t is a 32-bit unsigned integer on all platforms. Move it into the MI linux.h under /compat/linux.	2023-04-28 11:55:02 +03:00
Dmitry Chagin	e0bfe0d62c	linux(4): Make struct newstat to match actual Linux one In the struct stat the st_dev, st_rdev are unsigned long.	2023-04-28 11:55:01 +03:00
Dmitry Chagin	023e688496	linux(4): Regen for struct l_old_stat changes	2023-04-28 11:55:01 +03:00
Dmitry Chagin	2370c7321f	linux(4): Update syscalls.master to reflect struct l_old_stat	2023-04-28 11:54:59 +03:00
Dmitry Chagin	391fd1e1a1	linux(4): Mark old fstat syscall as unimplemented It looks like the old fstat system call has never been implemented.	2023-04-28 11:54:59 +03:00
Dmitry Chagin	a408fc097f	linux(4): Rename obsolete old struct l_stat to struct l_old_stat	2023-04-28 11:54:59 +03:00
Mark Johnston	74ac712f72	vmm: Dynamically allocate a couple of per-CPU state save areas This avoids bloating the BSS when MAXCPU is large. No functional change intended. PR: 269572 Reviewed by: corvink, rew Tested by: rew MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D39805	2023-04-26 10:08:42 -04:00
Vitaliy Gusev	0912408a28	vmm: fix HLT loop while vcpu has requested virtual interrupts This fixes the detection of pending interrupts when pirval is 0 and the pending bit is set. More information on how this situation occurs can be found here: `c5b5f2d808/sys/amd64/vmm/intel/vmx.c (L4016-L4031)` Reviewed by: corvink, markj Fixes: `02cc877968` ("Recognize a pending virtual interrupt while emulating the halt instruction.") MFC after: 1 week Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D39620	2023-04-26 10:38:46 +02:00
Mateusz Guzik	95e4f5ef7c	x86: whack pmspcv from GENERIC The driver is enormous and rarely used. text data bss dec hex filename 23076646 1870505 `4415872` 29363023 0x1c00b4f kernel.before 20017433 1870305 4416000 26303738 0x1915cfa kernel.after People using the driver will need to add pmspcv_load="YES" to their loader.conf. Reviewed by: jhb Relnotes: yes Sponsored by: Rubicon Communications, LLC ("Netgate") Differential Revision: https://reviews.freebsd.org/D39816	2023-04-25 18:09:44 +00:00
Mark Johnston	47cf1b37f4	vmm: Expose some more AVX512 CPUID bits to guests This is required to announce support for some accelerated AES operations. AVX512BW indicates support for the AVX512-FP16 extension and AVX512VL indicates support for the use of AVX512 instructions with vector lengths smaller than 512 bits. VAES and VPCLMULQDQ extensions indicate that VEX-prefixed AES-NI and pclmulqdq instructions are supported. All of these bits are needed for OpenSSL to use VAES to accelerate AES-GCM transforms. Reviewed by: corvink, kib, jhb MFC after: 2 weeks Sponsored by: Stormshield Sponsored by: Klara, Inc. Differential Revision: https://reviews.freebsd.org/D39781	2023-04-25 13:35:14 -04:00
Dmitry Chagin	56c5230afd	linux(4): Fix LINUX_AT_COUNT comments Differential Revision: https://reviews.freebsd.org/D39645 MFC after: 1 month	2023-04-22 22:16:43 +03:00
Dmitry Chagin	7d8c983983	linux(4): Deduplicate linux_copyout_auxargs() Export default MINSIGSTKSZ value for the x86 until we do not preserve AVX registers in the signal context. Differential Revision: https://reviews.freebsd.org/D39644 MFC after: 1 month	2023-04-22 22:16:02 +03:00
Warner Losh	559b94a122	syscall.master: Fix comments Have more accurate comments. While #if, #else, etc are copied to the header files, lines that don't start with # are not. And #include files are only output to sysinc (which winds up at the front of init_sysent.c which seems a bit odd). This is all radically undocumented, and likely has drifted somewhat from 4.4BSD and what other systems do (they've drifted too, fwiw). Sponsored by: Netflix	2023-04-20 16:18:02 -06:00
Dmitry Chagin	de4da6cd04	x86: Move i386 timerreg.h to x86 Reviewed by: emaste, jhb Differential Revision: https://reviews.freebsd.org/D39656 MFC after: 1 month	2023-04-20 19:42:59 +03:00
Dmitry Chagin	d1f4c44aa8	x86: Move i386 ppireg.h to x86 Differential Revision: https://reviews.freebsd.org/D39655 MFC after: 1 month	2023-04-20 19:42:59 +03:00
Konstantin Belousov	617a11eab6	x86: initialize use_xsave once The explanation from https://reviews.freebsd.org/D39637 by stevek: The "use_xsave" variable is a global that is only supposed to be initialized early, before scheduling gets started. However, with the way the ifuncs for "fpusave" and "fpurestore" are implemented, the value could be changed at runtime when scheduling is active if "use_xsave" was set to 0 by the tunable. This leaves a window of opportunity where "use_xsave" gets re-initialized to 1 and a context switch could occur with a thread that was not set up to be able to use xsave functionality. This can lead to a "privileged instruction fault". The fix is to protect "use_xsave" from being initialized more than once. Reported and reviewed by: stevek Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39660	2023-04-19 02:22:28 +03:00
Konstantin Belousov	1e0e335b0f	amd64: fix PKRU and swapout interaction When vm_map_remove() is called from vm_swapout_map_deactivate_pages() due to swapout, PKRU attributes for the removed range must be kept intact. Provide a variant of pmap_remove(), pmap_map_delete(), to allow pmap to distinguish between real removes of the UVA mappings and any other internal removes, e.g. swapout. For non-amd64, pmap_map_delete() is stubbed by define to pmap_remove(). Reported by: andrew Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39556	2023-04-15 02:53:59 +03:00
Julien Grall	ab7ce14b1d	xen/intr: introduce dev/xen/bus/intr-internal.h Move the xenisrc structure which needs to be shared between the core Xen interrupt code and architecture-dependent code into a separate header. A similar situation exists for the NR_EVENT_CHANNELS constant. Turn xi_intsrc into a type definition named xi_arch to reflect the new purpose of being an architectural variable for the interrupt source. This was originally implemented by Julien Grall, but has been heavily modified. The core side was renamed "intr-internal.h" and is #include'd by "arch-intr.h" instead of the other way around. This allows the architecture to add function definitions which use struct xenisrc. The original version only moved xi_intsrc into xen_arch_isrc_t. Moving xi_vector was done by the submitter. The submitter had also moved xi_activehi and xi_edgetrigger into xen_arch_isrc_t. Those disappeared with the removal of PVHv1 support. Copyright note. The current xenisrc structure was introduced at `76acc41fb7` by Justin T. Gibbs. Traces remain, but the strength of Copyright claims from before 2013 seem pretty weak. Reviewed by: royger Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>, 2021-03-17 19:09:01 Original implementation: Julien Grall <julien@xen.org>, 2015-10-20 09:14:56 Differential Revision: https://reviews.freebsd.org/D30648 [royger] - Adjust some line lengths - Fix comment about NR_EVENT_CHANNELS after movement. - Use #include instead of symlinks.	2023-04-14 15:58:53 +02:00
Elliott Mitchell	af610cabf1	xen/intr: adjust xen_intr_handle_upcall() to match driver filter xen_intr_handle_upcall() has two interfaces. It needs to be called by the x86 assembly code invoked by the APIC. Second, it needs to be called as a driver_filter_t for the XenPCI code and for architectures besides x86. Unfortunately the driver_filter_t interface was implemented as a wrapper around the x86-APIC interface. Now create a simple wrapper for the x86-APIC code, which calls an architecture-independent xen_intr_handle_upcall(). When called via intr_event_handle(), driver_filter_t functions expect preemption to be disabled. This removes the need for critical_enter()/critical_exit() when called this way. The lapic_eoi() call is only needed on x86 in some cases when invoked directly as an APIC vector handler. Additionally driver_filter_t functions have no need to handle interrupt counters. The intrcnt_add() calling function was reworked to match the current situation. intrcnt_add() is now only called via one path. The increment/decrement of curthread->td_intr_nesting_level had previously been left out. Appears this was mostly harmless, but this was noticed during implementation and has been added. CONFIG_X86 is a leftover from use with Linux. While the barrier isn't needed for FreeBSD on x86, it will be needed for FreeBSD on other architectures. Copyright note. xen_intr_intrcnt_add() was introduced at `76acc41fb7` by Justin T. Gibbs. xen_intrcnt_init() was introduced at `fd036deac1` by John Baldwin. sys/x86/xen/xen_arch_intr.c was originally created by Julien Grall in 2015 for the purpose of holding the x86 interrupt interface. Later it was found xen_intr_handle_upcall() was better earlier, and the x86 interrupt interface better later. As such the filename and header list belong to Julien Grall, but what those were created for is later. Reviewed by: royger Differential Revision: https://reviews.freebsd.org/D30006	2023-04-14 15:58:52 +02:00
Elliott Mitchell	ecdcad6516	xen: remove CONFIG_XEN_COMPAT, purge Xen 3.0 compatibility This overlaps the purpose of __XEN_INTERFACE_VERSION__. Remove Xen 3.0.2 compatibility. __XEN_INTERFACE_VERSION__ has compatibility to Xen 3.2.8 enabled. As Xen 3.3 was released almost 15 years ago, it seems unlikely anyone hasn't updated. Reviewed by: royger	2023-04-14 15:58:48 +02:00
Elliott Mitchell	b2c50bb934	xen/efi: make Xen PV EFI clock optional The present implementation is only for x86. Other architectures need adjustments for querying presence of EFI. Xen's EFI support is also quite troublesome on non-x86. This is being slowly remedied, but until in better shape the EFI clock functionality should be disabled. Reviewed by: royger Differential Revision: https://reviews.freebsd.org/D31065	2023-04-14 15:58:47 +02:00
Henri Hennebert	71883128e5	rtsx: Add plug-and-play info Add MODULE_PNP_INFO() to the driver to make it autoload if not linked statically into the kernel. Remove the device from amd64/i386 GENERIC. Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D35074	2023-04-13 11:12:50 -03:00
Dmitry Chagin	50111714f5	linux(4): Regen for close_range syscall MFC after: 2 weeks	2023-04-04 23:23:37 +03:00
Dmitry Chagin	1c27dce1f8	linux(4): Modify close_range syscall to match Linux MFC after: 2 weeks	2023-04-04 23:23:24 +03:00
Alexander V. Chernikov	3091d980f5	netlink: add NETLINK to the DEFAULTS for each architecture NETLINK is going to replace rtsock and a number of other ioctl/sysctl interfaces. In-base utilities such as route(8), netstat(8) and soon ifconfig(8) are being converted to use netlink sockets as a transport between kernel and userland. In the current configuration, it is still possible to have the kernel without NETLINK (`nooptions NETLINK`) and use the aforementioned utilities by building the world with the `WITHOUT_NETLINK` src.conf knob. However, this approach does not cover the cases when a person unintentionally builds a custom kernel without netlink and tries to use the standard userland. This change adds `option NETLINK` to the default options for each architecture, fixing the custom kernel issue. For arm, this change uses `std.armv6` and `std.armv7` (netlink already in) instead of DEFAULTS. Reviewed By: imp Differential Revision: https://reviews.freebsd.org/D39339	2023-04-02 15:27:21 +00:00
Konstantin Belousov	cd137909c3	amd64 wakeup: recalculate mitigations after APICs are woken APICs are needed to broadcast IPIs for MSR writes. PR: 270489 Reviewed by: dchagin, emaste, jhb Tested by: dchagin, manu Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39302	2023-03-29 21:45:20 +03:00
Elliott Mitchell	9f3be3a6ec	xen: switch to using core atomics for synchronization Now that the atomic macros are always genuinely atomic on x86, they can be used for synchronization with Xen. A single core VM isn't too unusual, but actual single core hardware is uncommon. Replace an open-coding of evtchn_clear_port() with the inline. Substantially inspired by work done by Julien Grall <julien@xen.org>, 2014-01-13 17:40:58. Reviewed by: royger MFC after: 1 week	2023-03-29 09:51:42 +02:00
John Baldwin	0f735657aa	bhyve: Remove vmctx member from struct vm_snapshot_meta. This is a userland-only pointer that isn't relevant to the kernel and doesn't belong in the ioctl structure shared between userland and the kernel. For the kernel, the old structure for the ioctl is still supported under COMPAT_FREEBSD13. This changes vm_snapshot_req() in libvmmapi to accept an explicit vmctx argument. It also changes vm_snapshot_guest2host_addr to take an explicit vmctx argument. As part of this change, move the declaration for this function and its wrapper macro from vmm_snapshot.h to snapshot.h as it is a userland-only API. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38125	2023-03-24 11:49:06 -07:00
John Baldwin	7d9ef309bd	libvmmapi: Add a struct vcpu and use it in most APIs. This replaces the 'struct vm, int vcpuid' tuple passed to most API calls and is similar to the changes recently made in vmm(4) in the kernel. struct vcpu is an opaque type managed by libvmmapi. For now it stores a pointer to the VM context and an integer id. As an immediate effect this removes the divergence between the kernel and userland for the instruction emulation code introduced by the recent vmm(4) changes. Since this is a major change to the vmmapi API, bump VMMAPI_VERSION to 0x200 (2.0) and the shared library major version. While here (and since the major version is bumped), remove unused vcpu argument from vm_setup_pptdev_msi*(). Add new functions vm_suspend_all_cpus() and vm_resume_all_cpus() for use by the debug server. The underlying ioctl (which uses a vcpuid of -1) remains unchanged, but the userlevel API now uses separate functions for global CPU suspend/resume. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D38124	2023-03-24 11:49:06 -07:00
Konstantin Belousov	2b4b3789f8	acpi_wakeup.c: apply the reviewer's editorial corrections to the comment text. Fixes: `02904a06c7` MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39146	2023-03-18 17:47:19 +02:00
Konstantin Belousov	02904a06c7	amd64: properly recalculate mitigations knobs after resume Revision r333125 AKA `986c4ca387` forced clear cpu_stdext_feature3 on suspend, since at that time microcode update was not reloaded early on resume. Then, revision `050f5a8405` started re-reading cpu_stdext_feature3 again. Since modern CPUs do not require mitigations from the Skylake era, this went unnoticed for some time. Keep zeroing cpu_stdext_feature3 on suspend, but re-read it in more controlled way on resume after microcode is reloaded, and recalculate active workarounds based on actual microcode capabilities. Reported and tested by: romain Reviewed by: emaste, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D39146	2023-03-18 17:40:05 +02:00
Konstantin Belousov	ff6d60946a	amd64 acpi_wakeup.c: fix typo Sponsored by: The FreeBSD Foundation MFC after: 3 days	2023-03-17 15:10:34 +02:00
Vitaliy Gusev	94a3876d7e	vmm: fix missing ipi statistic ipi counters are missing in bhyvectl's output because vm_maxcpu is 0 when initializing them. That's because vmm_stat_register is executed before vmm_init. Instead of directly fixing it, there's a better solution in illumos which is cherry picked: `65a3bc8373` It replaces the matrix statistic by two counters per vcpu. One for counting the ipis to the vcpu and one counting the ipis received by the vcpu. This has several advantages: - A matrix statistic becomes huge when using many vcpus. - A matrix statistic easily reaches the MAX_VMM_STAT_ELEMS limit. - Two counters are enough in most cases. DTrace can be used for more advanced debugging purposes. - A matrix statistic wastes memory. The matrix size is determined by vm_maxcpu regardless of the number of vcpus assigned to the vm. Reviewed by: corvink, markj Fixes: `ee98f99d7a` ("vmm: Convert VM_MAXCPU into a loader tunable hw.vmm.maxcpu.") MFC after: 1 week Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D39038	2023-03-17 13:50:08 +01:00
Dmitry Chagin	9e7f03e9c6	linux(4): Drop unnecessary struct l_ifconf declaration from amd64/linux It's needed only for the amd64/linux32 Linuxulator. Differential Revision: https://reviews.freebsd.org/D38793	2023-03-04 12:11:38 +03:00
Dmitry Chagin	cabbfb60d0	linux(4): Reduce code duplication between MD files Move struct ifnet definitions under compat/linux. Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D38791	2023-03-04 12:11:38 +03:00
John-Mark Gurney	2fee875629	abstract out the vm detection via smbios.. This makes the detection of VMs common between platforms that have SMBios. Reviewed by: imp, kib Differential Revision: https://reviews.freebsd.org/D38800	2023-03-02 16:54:21 -08:00
Vitaliy Gusev	8104fc31a2	bhyve: fix restore of kernel structs vmx_snapshot() and svm_snapshot() do not save any data and error occurs at resume: Restoring kernel structs... vm_restore_kern_struct: Kernel struct size was 0 for: vmx Failed to restore kernel structs. Reviewed by: corvink, markj Fixes: `39ec056e6d` ("vmm: Rework snapshotting of CPU-specific per-vCPU data.") MFC after: 2 weeks Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D38476	2023-02-28 13:37:53 +01:00
Vitaliy Gusev	281b496f22	vmm: fix restore of TSC offset After suspend/resume Ubuntu 20.04 and 22.04 installer can hang if tsc-early clocksource has a big skew. Reviewed by: corvink, jhb Fixes: `a7db532e3a` ("vmm: Simplify saving of absolute TSC values in snapshots.") MFC after: 2 weeks Sponsored by: vStack Differential Revision: https://reviews.freebsd.org/D38474	2023-02-28 13:37:44 +01:00
Mike Karels	dd6f6030cc	amd64 kernel config: clean up whitespace Most options in kernel config files use "options<space><tab>OPTION". This allows the option to be commented out without shifting columns. A few options had two tabs, and some had spaces. Make them consistent.	2023-02-24 08:36:28 -06:00
Mateusz Guzik	6b9acd1bfb	Exclude MMCCAM kernels from make universe They don't provide any value and are quite arbitrary. Note arm64 GENERIC-MMCCAM was already excluded, just not the NODEBUG variant. The option is already build-tested with arm64 LINT kernel. Reviewed by: manu Differential Revision: https://reviews.freebsd.org/D38458	2023-02-16 07:29:53 +00:00
Dmitry Chagin	c8a79231a5	linux(4): Rename linux_timer.h to linux_time.h To avoid confusing people, rename linux_timer.h to linux_time.h, as linux_timer.c is the implementation of timer syscalls only, while linux_time.c contains implementation of all stuff declared in linux_time.h. MFC after: 2 weeks	2023-02-14 17:46:33 +03:00
Dmitry Chagin	2456a45929	linux(4): Cleanup includes under amd64/linux Cleanup unneeded includes, sort the rest according to style(9). No functional changes. MFC after: 2 weeks	2023-02-14 17:46:32 +03:00
Dmitry Chagin	31e938c531	linux(4): Cleanup vm includes from linux_util.h Include vm headers directly where they are needed. linux_util.h is included in most source files of the Linuxulator, so avoid collecting rarely used includes here. MFC after: 2 weeks	2023-02-14 17:46:30 +03:00
Dmitry Chagin	06c07e1203	Complete removal of opt_compat.h Since the Linux emulation layer build options were removed, there is no reason to keep opt_compat.h. Reviewed by: emaste Differential Revision: https://reviews.freebsd.org/D38548 MFC after: 2 weeks	2023-02-13 19:07:38 +03:00
Dmitry Chagin	10d16789a3	linux(4): Get rid of the opt_compat.h include. Since `e013e369` the COMPAT_LINUX and COMPAT_LINUX32 build options are removed, so the include of opt_compat.h is no longer needed. MFC after: 2 weeks	2023-02-12 20:24:32 +03:00
Mark Johnston	b265a2e0d7	vmm: Fix AP startup compatibility for old bhyve executables These changes unbreak AP startup when using a 13.1-RELEASE bhyve executable with a newer kernel: - Correct the destination mask for the VM_EXITCODE_IPI message generated by an INIT or STARTUP IPI in vlapic_icrlo_write_handler(). - Only initialize vlapics on active vCPUs. 13.1-RELEASE bhyve activates AP vCPUs only after the BSP starts them with an IPI, and vmm now allocates vcpu structures lazily, so the STARTUP handling in vm_handle_ipi() could trigger a page fault. - Fix an off-by-one setting the vcpuid in a VM_EXITCODE_SPINUP_AP message. Fixes: `7c326ab5bb` ("vmm: don't lock a mtx in the icr_low write handler") Reviewed by: jhb, corvink MFC after: 2 weeks Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D38446	2023-02-09 16:14:33 -05:00
Mark Johnston	ba34de1b3b	vmm: Remove an unneeded initialization of "retu" vm_handle_ipi() unconditionally initializes "retu". No functional change intended. Reviewed by: jhb, corvink MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D38446	2023-02-09 16:14:33 -05:00
Mark Johnston	f3bbd0e818	vmm: Collapse identical case statements in vlapic_icrlo_write_handler() No functional change intended. Reviewed by: jhb, corvink MFC after: 1 week Sponsored by: The FreeBSD Foundation Differential Revision: https://reviews.freebsd.org/D38446	2023-02-09 16:14:33 -05:00
Dag-Erling Smørgrav	43d4680b39	MINIMAL: Update and clean up. * Add GEOM_LABEL, required to boot a default UEFI install. * Add enough of virtio to boot in bhyve. * Reduce diff between amd64 and i386. * Reduce diff to GENERIC. MFC after: 1 week Reviewed by: imp Differential Revision: https://reviews.freebsd.org/D38468	2023-02-09 18:24:45 +01:00
Konstantin Belousov	ee84487120	amd64 ia32 vdso: always define some __vdso_ symbols ... regardless of the kernel config options. It is reported that llvm16 ld.lld warns about undefined symbols referenced by the VERSION script. Reviewed by: emaste, val_packett.cool Discussed with: jrtc27 Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D38392	2023-02-09 04:36:40 +02:00
Mateusz Guzik	b2c68dc6d9	amd64: ansify Reported by: clang 15 Sponsored by: Rubicon Communications, LLC ("Netgate")	2023-02-07 22:52:06 +00:00
Mateusz Guzik	819ed47204	amd64 pmap: patch up a comment in pmap_init_pv_table Requested by: jhb	2023-02-06 22:33:28 +00:00
Yuri	e4d3f1e40a	hv_hid: Hyper-V HID driver Hyper-V HID driver using hidbus/hms. Reviewed by: wulf MFC after: 1 week PR: 221074 Differential revision: https://reviews.freebsd.org/D38140	2023-02-05 18:32:08 +03:00
Elliott Mitchell	d27d543c78	vmm: purge EOL release compatibility Remove FreeBSD 11 support Reviewed by: imp Pull Request: https://github.com/freebsd/freebsd-src/pull/603 Differential Revision: https://reviews.freebsd.org/D35560	2023-02-04 09:13:10 -07:00
Dmitry Chagin	ce20c00e85	linux(4): Remove stale comment that no longer applies. MFC after: 1 week	2023-02-02 20:21:37 +03:00
Dmitry Chagin	6ad07a4b2b	linux(4): Microoptimize rt_sendsig() on amd64. Drop proc lock earlier, before copying user stuff. Pointed out by: kib Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D38326 MFC after: 1 week	2023-02-02 20:21:37 +03:00
Dmitry Chagin	a95cb95e12	linux(4): Preserve fpu fxsave state across signal delivery on amd64. PR: 240768 Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D38302 MFC after: 1 week	2023-02-02 20:21:37 +03:00
Dmitry Chagin	95b8603427	linux(4): Deduplicate linux_trans_osrel(). MFC after: 1 week	2023-02-02 17:58:07 +03:00
Dmitry Chagin	6039e966ff	linux(4): Deduplicate linux_copyout_strings(). It is still present in the 32-bit Linuxulator on amd64. MFC after: 1 week	2023-02-02 17:58:07 +03:00
Dmitry Chagin	9e550625f8	linux(4): Deduplicate linux_fixup_elf(). Use native routines to fixup initial process stack. On Arm64 linux_elf_fixup() is a no-op, as the stack fixup (room for argc) is done in linux_copyout_strings(). MFC after: 1 week	2023-02-02 17:58:07 +03:00
Dmitry Chagin	7446514533	linux(4): Microoptimize linux_elf.h for future use. In order to reduce code duplication move coredump support definitions into the appropriate header and hide private definitions. MFC after: 1 week	2023-02-02 17:58:06 +03:00
Konstantin Belousov	2555f175b3	Move kstack_contains() and GET_STACK_USAGE() to MD machine/stack.h Reviewed by: jhb Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D38320	2023-02-02 00:59:26 +02:00
Dmitry Chagin	575e48f1c4	linux(4): Deduplicate MI futex structures. MFC after: 1 week	2023-02-01 21:57:04 +03:00
Dmitry Chagin	5c32146723	amd64: Eliminate write only cpu_fxsr. Reviewed by: kib Differential Revision: https://reviews.freebsd.org/D38289 MFC after: 1 week	2023-02-01 18:17:06 +03:00
Corvin Köhne	892feec221	vmm: avoid spurious rendezvous A vcpu only checks if a rendezvous is in progress or not to decide if it should handle a rendezvous. This could lead to spurious rendezvous where a vcpu tries to handle a rendezvous it isn't part of. This situation is properly handled by vm_handle_rendezvous but it could potentially degrade the performance. Avoid that by an early check if the vcpu is part of the rendezvous or not. At the moment, rendezvous are only used to spin up application processors and to send ioapic interrupts. Spinning up application processors is done in the guest boot phase by sending INIT SIPI sequences to single vcpus. This is known to cause spurious rendezvous and only occurs in the boot phase. Sending ioapic interrupts is rare because modern guests will use msi and the rendezvous is always sent to all vcpus. Reviewed by: jhb MFC after: 1 week Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D37390	2023-02-01 12:36:36 +01:00
Eric Joyner	5354596764	vtd: Increase DRHD_MAX_UNITS Observed on a couple Ice Lake-SP platforms (Intel Coyote Pass, Dell R750), there are more than 8 DRHD sections enumerated in the DMAR ACPI section. Since the previous limit was 8, this resulted in some of these not being parsed by vtd when the iommu is initialized; in this case when PCI devices are being passthru'd to a bhyve VM. This omission causes a kernel panic later in initialization when devices could not be found in a valid DRHD scope because the DRHD containing the device's scope was not added to vtd. Signed-off-by: Eric Joyner <erj@FreeBSD.org> PR: 268486 Sponsored by: Intel Corporation Reviewed by: rew@, corvink@ MFC after: 1 day Differential Revision: https://reviews.freebsd.org/D38285	2023-01-31 13:57:42 -08:00
Konstantin Belousov	153643a5bc	amd64: do not enable PKRU if user disabled saving PKRU register in xsave mask This is done by reverting CR4_PKE bit, because we perform %CR4 initialization in initializecpu(), and the function is called before xsave_mask is read. To not redo the whole early initialization sequence for the corner case, this should be good enough. Reported by: jhb Reviewed by: jhb, markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D38219	2023-01-27 19:44:49 +02:00
Andrew Gallatin	9cb6ba29cb	vm: centralize VM_BATCHQUEUE_SIZE definition Remove the platform-specific definitions of VM_BATCHQUEUE_SIZE for amd64 and powerpc64, and instead treat all 64-bit platforms identically. This has the effect of increasing the arm64 and riscv VM_BATCHQUEUE_SIZE to match that of other platforms. Reviewed by: jhb, markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D37707	2023-01-21 14:30:00 -05:00
Robert Wing	27029bc08f	vmm: fix use after free in ppt_detach() The vmm module destroys the host_domain before unloading the ppt module causing a use after free. This can happen when kldunload'ing vmm. Reviewed by: markj, jhb Differential Revision: https://reviews.freebsd.org/D38072	2023-01-20 11:25:27 +00:00
Robert Wing	c668e8173a	vmm: take exclusive mem_segs_lock in vm_cleanup() The consumers of vm_cleanup() are vm_reinit() and vm_destroy(). The vm_reinit() call path is, here vmmdev_ioctl() takes mem_segs_lock: vmmdev_ioctl() vm_reinit() vm_cleanup(destroy=false) The call path for vm_destroy() is (mem_segs_lock not taken): sysctl_vmm_destroy() vmmdev_destroy() vm_destroy() vm_cleanup(destroy=true) Fix this by taking mem_segs_lock in vm_cleanup() when destroy == true. Reviewed by: corvink, markj, jhb Fixes: `67b69e76e8` ("vmm: Use an sx lock to protect the memory map.") Differential Revision: https://reviews.freebsd.org/D38071	2023-01-20 11:10:53 +00:00
Robert Wing	ccf32a68f8	vmm: take exclusive mem_segs_lock when (un)assigning ppt dev PR: 268744 Reported by: mmatalka@gmail.com Reviewed by: corvink, markj, jhb Fixes: `67b69e76e8` ("vmm: Use an sx lock to protect the memory map.") Differential Revision: https://reviews.freebsd.org/D37962	2023-01-20 10:03:59 +00:00
Gordon Bergling	05187f2ffc	amd64: Fix a common typo in source code comments - s/comparision/comparison/ MFC after: 3 days	2023-01-19 14:27:18 +01:00
Alexander V. Chernikov	692e19cf51	netlink: add netlink to GENERIC@amd64 Netlink is a communication protocol defined in RFC 3549. It is an async, TLV-based protocol, providing 1-1 and 1-many communications between kernel and userland. Netlink is currently used in Linux kernel to modify, read and subscribe for nearly all networking states. Interface state, addresses, routes, firewall, rules, fibs, etc, are controlled via Netlink. Netlink support was added in D36002. It has got a number of improvements and first customers since then: * net/bird2 got netlink support, enabling route multipath in FreeBSD * netlink-based devd notifications are being worked on ( D37574 ). * linux(4) fully supports and depends on Netlink Enabling Netlink in GENERIC targets two goals. The first one is to provide stability for the third-party userland applications, so they can rely on the fact that netlink always exists since 14.0 and potentially 13.2. Loadable module makes life of the app developers harder. For example, `net/bird2` can be built with either netlink or rtsock support, but not both. The second goal is to enable gradual conversion of the base userland tools to use netlink(4) interfaces. Converting tools like netstat (D36529), route, ifconfig one-by-one simplifies testing and addressing the feedback. Otherwise, switching all base to use netlink at once may be too big of a leap. This change targets amd64, the other architectures will follow soon. Differential Revision: https://reviews.freebsd.org/D37783	2023-01-13 10:22:40 +00:00
Konstantin Belousov	ad97b9bbfc	amd64 pmap.h: make it easier to use the header for other consumers Guard pmap_invlpg() definition with checks that only provide it when both sys/pcpu.h and machine/cpufunc.h were already included. Requested by: Elliott Mitchell Sponsored by: The FreeBSD Foundation MFC after: 1 week	2023-01-06 01:30:29 +02:00
Konstantin Belousov	a2c08eba43	amd64: be more precise when enabling the AlderLake small core PCID workaround In particular, do not enable the workaround if INVPCID is not supported by the core. Reported by: "Chen, Alvin W" <Weike.Chen@Dell.com> Reviewed by: markj Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37940	2023-01-06 01:30:29 +02:00
Konstantin Belousov	231d75568f	Move INVLPG to pmap_quick_enter_page() from pmap_quick_remove_page(). If processor prefetches neighboring TLB entries to the one being accessed (as some have been reported to do), then the spin lock does not prevent the situation described in the "AMD64 Architecture Programmer's Manual Volume 2: System Programming" rev. 3.23, "7.3.1 Special Coherency Considerations". Reported and reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37770	2023-01-01 00:09:46 +02:00
Konstantin Belousov	cde70e312c	amd64: for small cores, use (big hammer) INVPCID_CTXGLOB instead of INVLPG A hypothetical CPU bug makes invalidation of global PTEs using INVLPG in pcid mode unreliable, it seems. The workaround is applied for all CPUs with small cores, since we do not know the scope of the issue, and the right fix. Reviewed by: alc (previous version) Discussed with: emaste, markj Tested by: karels PR: 261169, 266145 Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37770	2023-01-01 00:09:45 +02:00
Konstantin Belousov	45ac7755a7	amd64: identify small cores Reviewed by: alc Sponsored by: The FreeBSD Foundation MFC after: 1 week Differential revision: https://reviews.freebsd.org/D37770	2023-01-01 00:09:45 +02:00
Andrew Gallatin	1cac76c93f	vm: reduce lock contention when processing vm batchqueues Rather than waiting until the batchqueue is full to acquire the lock & process the queue, we now start trying to acquire the lock using trylocks when the batchqueue is 1/2 full. This removes almost all contention on the vm pagequeue mutex for our busy sendfile() based web workload. It also greatly reduces the amount of time a network driver ithread remains blocked on a mutex, and eliminates some packet drops under heavy load. So that the system does not lose the benefit of processing large batchqueues, I've doubled the size of the batchqueues. This way, when there is no contention, we process the same batch size as before. This has been run for several months on a busy Netflix server, as well as on my personal desktop. Reviewed by: markj Sponsored by: Netflix Differential Revision: https://reviews.freebsd.org/D37305	2022-12-14 14:34:07 -05:00
Alan Cox	f0878da03b	pmap: standardize promotion conditions between amd64 and arm64 On amd64, don't abort promotion due to a missing accessed bit in a mapping before possibly write protecting that mapping. Previously, in some cases, we might not repromote after madvise(MADV_FREE) because there was no write fault to trigger the repromotion. Conversely, on arm64, don't pointlessly, yet harmlessly, write protect physical pages that aren't part of the physical superpage. Don't count aborted promotions due to explicit promotion prohibition (arm64) or hardware errata (amd64) as ordinary promotion failures. Reviewed by: kib, markj MFC after: 2 weeks Differential Revision: https://reviews.freebsd.org/D36916	2022-12-12 11:32:50 -06:00
John Baldwin	af3b48e101	vmm: Free vCPUs when destroying them. Reported by: andrew Reviewed by: corvink, andrew, markj Differential Revision: https://reviews.freebsd.org/D37649	2022-12-09 10:27:05 -08:00
John Baldwin	d212d6ebb4	vmm: Avoid infinite loop in vcpu_lock_all error case. Reported by: Coverity (CIDs 1501060,1501071) Reviewed by: corvink, markj, emaste Differential Revision: https://reviews.freebsd.org/D37648	2022-12-09 10:26:49 -08:00
John Baldwin	91980db1be	vmm: Don't lock a vCPU for VM_PPTDEV_MSI[X]. These are manipulating state in a ppt(4) device none of which is vCPU-specific. Mark the vcpu fields in the relevant ioctl structures as unused, but don't remove them for now. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37639	2022-12-09 10:26:23 -08:00
John Baldwin	62be9ffd82	vmm: VM_GET/SET_KERNEMU_DEV should run with the vCPU locked. Reviewed by: corvink, kib, markj Differential Revision: https://reviews.freebsd.org/D37638	2022-12-09 10:25:30 -08:00
John Baldwin	1f6db5d6b5	vmm: Remove stale comment for vm_rendezvous. Support for rendezvous outside of a vcpu context (vcpuid of -1) was removed in commit `949f0f47a4`, and the vm, vcpuid argument pair was replaced by a single struct vcpu pointer in commit `d8be3d523d`. Reported by: andrew	2022-11-30 13:06:46 -08:00
Bjoern A. Zeeb	4a8e4d1546	net80211: fix IEEE80211_DEBUG_REFCNT builds Remove the KPI/KBI changes from ieee80211_node.h and always use the macros to pass in __func__ and __LINE__ to the functions. The actual implementations are prefixed by "_" rather than suffixed by "_debug" as they no longer are "debug"-specific. Some of the select functions were not actually using the passed in func, line options; however they are calling other functions which use them. Directly call the internal implementation in those cases passing the arguments on. Use a file-local __debrefcnt_used define to mark the arguments __unused in cases when we compile without IEEE80211_DEBUG_REFCNT and hope the toolchain is intelligent enough to not pass them at all in those cases. Also _ieee80211_free_node() now has a conflict so make the previous _ieee80211_free_node() the new __ieee80211_free_node(). Add IEEE80211_DEBUG_REFCNT to the NOTES file on amd64 to keep exercising the option. Sponsored by: The FreeBSD Foundation X-MFC: never Discussed on: freebsd-wireless Reviewed by: adrian Differential Revision: https://reviews.freebsd.org/D37529	2022-11-29 21:20:37 +00:00
Corvin Köhne	7c326ab5bb	vmm: don't lock a mtx in the icr_low write handler x2apic accesses are handled by a wrmsr exit. This handler is called in a critical section. So, we can't lock a mtx in the icr_low handler. Reported by: kp, pho Tested by: kp, pho Approved by: manu (mentor) Fixes: `c0f35dbf19` vmm: Use a cpuset_t for vCPUs waiting for STARTUP IPIs. MFC after: 1 week MFC with: `c0f35dbf19` Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D37452	2022-11-23 09:00:04 +01:00
Corvin Köhne	fde8ce8892	vmm: remove unnecessary rendezvous assertion When a vcpu sees that a rendezvous is in progress, it exits and tries to handle the rendezvous. The vcpu doesn't check if it's part of the rendezvous or not. If the vcpu isn't part of the rendezvous, the rendezvous could be done before it reaches the assertion. This will cause a panic. The assertion isn't needed at all because vm_handle_rendezvous properly handles a spurious rendezvous. So, we can just remove it. PR: 267779 Reviewed by: jhb, markj Tested by: bz Approved by: manu (mentor) MFC after: 1 week Sponsored by: Beckhoff Automation GmbH & Co. KG Differential Revision: https://reviews.freebsd.org/D37417	2022-11-21 08:19:36 +01:00
Dmitry Chagin	2ee1a18d51	vmm: Fix build w/o KDTRACE_HOOKS. Reviewed by: imp Differential revision: https://reviews.freebsd.org/D37446	2022-11-20 18:00:55 +03:00
Cy Schubert	d487cba33d	vmm: Fix non-INVARIANTS build Reported by: O. Hartmann <freebsd@walstatt-de.de> Reviewed by: jhb Fixes: `58eefc67a1` Differential Revision: https://reviews.freebsd.org/D37444	2022-11-18 13:20:13 -08:00
Mark Johnston	ca6b48f080	vmm: Restore the correct vm_inject_*() prototypes Fixes: `80cb5d845b` ("vmm: Pass vcpu instead of vm and vcpuid...") Reviewed by: jhb Differential Revision: https://reviews.freebsd.org/D37443	2022-11-18 14:11:48 -05:00
John Baldwin	49fd5115a9	vmm: Trim some pointless #ifdef KTR. Reported by: markj Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37272	2022-11-18 10:25:39 -08:00
John Baldwin	ee98f99d7a	vmm: Convert VM_MAXCPU into a loader tunable hw.vmm.maxcpu. The default is now the number of physical CPUs in the system rather than 16. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37175	2022-11-18 10:25:39 -08:00
John Baldwin	98568a005a	vmm: Allocate vCPUs on first use of a vCPU. Convert the vcpu[] array in struct vm to an array of pointers and allocate vCPUs on first use. This avoids always allocating VM_MAXCPU vCPUs for each VM, but instead only allocates the vCPUs in use. A new per-VM sx lock is added to serialize attempts to allocate vCPUs on first use. However, a given vCPU is never freed while the VM is active, so the pointer is read via an unlocked read first to avoid the need for the lock in the common case once the vCPU has been created. Some ioctls need to lock all vCPUs. To prevent races with ioctls that want to allocate a new vCPU, these ioctls also lock the sx lock that protects vCPU creation. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37174	2022-11-18 10:25:38 -08:00
John Baldwin	c0f35dbf19	vmm: Use a cpuset_t for vCPUs waiting for STARTUP IPIs. Retire the boot_state member of struct vlapic and instead use a cpuset in the VM to track vCPUs waiting for STARTUP IPIs. INIT IPIs add vCPUs to this set, and STARTUP IPIs remove vCPUs from the set. STARTUP IPIs are only reported to userland for vCPUs that were removed from the set. In particular, this permits a subsequent change to allocate vCPUs on demand when the vCPU may not be allocated until after a STARTUP IPI is reported to userland. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37173	2022-11-18 10:25:38 -08:00
John Baldwin	223de44c93	vmm devmem_mmap_single: Bump object reference under memsegs lock. Reported by: markj Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37273	2022-11-18 10:25:38 -08:00
John Baldwin	67b69e76e8	vmm: Use an sx lock to protect the memory map. Previously bhyve obtained a "read lock" on the memory map for ioctls needing to read the map by locking the last vCPU. This is now replaced by a new per-VM sx lock. Modifying the map requires exclusively locking the sx lock as well as locking all existing vCPUs. Reading the map requires either locking one vCPU or the sx lock. This permits safely modifying or querying the memory map while some vCPUs do not exist which will be true in a future commit. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37172	2022-11-18 10:25:38 -08:00
John Baldwin	08ebb36076	vmm: Destroy mutexes. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37171	2022-11-18 10:25:38 -08:00
John Baldwin	d5118d0fc4	vmm stat: Add a special nelems constant for arrays sized by vCPU count. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37170	2022-11-18 10:25:38 -08:00
John Baldwin	58eefc67a1	vmm vmx: Allocate vpids on demand as each vCPU is initialized. Compared to the previous version this does mean that if the system as a whole runs out of dedicated vPIDs you might end up with some vCPUs within a single VM using dedicated vPIDs and others using shared vPIDs, but this should not break anything. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37169	2022-11-18 10:25:38 -08:00
John Baldwin	3f0f4b1598	vmm: Lookup vcpu pointers in vmmdev_ioctl. Centralize mapping vCPU IDs to struct vcpu objects in vmmdev_ioctl and pass vcpu pointers to the routines in vmm.c. For operations that want to perform an action on all vCPUs or on a single vCPU, pass pointers to both the VM and the vCPU using a NULL vCPU pointer to request global actions. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37168	2022-11-18 10:25:38 -08:00
John Baldwin	0cbc39d53d	vmm ppt: Remove unused vcpu arg from MSI setup handlers. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37167	2022-11-18 10:25:37 -08:00
John Baldwin	e42c24d56b	vmm: Remove unused vcpuid argument from vioapic_process_eoi. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37166	2022-11-18 10:25:37 -08:00
John Baldwin	d8be3d523d	vmm: Use struct vcpu in the rendezvous code. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37165	2022-11-18 10:25:37 -08:00
John Baldwin	949f0f47a4	vmm: Remove support for vm_rendezvous with a cpuid of -1. This is not currently used. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37164	2022-11-18 10:25:37 -08:00
John Baldwin	9388bc1e3a	vmm: Remove vcpuid from I/O port handlers. No I/O ports are vCPU-specific (unlike memory which does have vCPU-specific ranges such as the local APIC). Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37163	2022-11-18 10:25:37 -08:00
John Baldwin	80cb5d845b	vmm: Pass vcpu instead of vm and vcpuid to APIs used from CPU backends. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37162	2022-11-18 10:25:37 -08:00
John Baldwin	d3956e4673	vmm: Use struct vcpu in the instruction emulation code. This passes struct vcpu down in place of struct vm and an integer vcpu index through the in-kernel instruction emulation code. To minimize userland disruption, helper macros are used for the vCPU arguments passed into and through the shared instruction emulation code. A few other APIs used by the instruction emulation code have also been updated to accept struct vcpu in the kernel including vm_get/set_register and vm_inject_fault. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37161	2022-11-18 10:25:37 -08:00
John Baldwin	28b561ad9d	vmm: Add vm_gpa_hold_global wrapper function. This handles the case that guest pages are being held not on behalf of a virtual CPU but globally. Previously this was handled by passing a vcpuid of -1 to vm_gpa_hold, but that will not work in the future when vm_gpa_hold is changed to accept a struct vcpu pointer. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37160	2022-11-18 10:25:36 -08:00
John Baldwin	0f435e6476	vmm: Add _KERNEL guards for io headers shared with userspace. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37159	2022-11-18 10:25:36 -08:00
John Baldwin	2b4fe856f4	bhyve: Remove unused vm and vcpu arguments from vm_copy routines. The arguments identifying the VM and vCPU are only needed for vm_copy_setup. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37158	2022-11-18 10:25:36 -08:00
John Baldwin	3dc3d32ad6	vmm: Use struct vcpu with the vmm_stat API. The function callbacks still use struct vm and a vCPU index. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37157	2022-11-18 10:25:36 -08:00
John Baldwin	950af9ffc6	vmm: Expose struct vcpu as an opaque type. Pass a pointer to the current struct vcpu to the vcpu_init callback and save this pointer in the CPU-specific vcpu structures. Add routines to fetch a struct vcpu by index from a VM and to query the VM and vcpuid from a struct vcpu. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37156	2022-11-18 10:25:36 -08:00
John Baldwin	d030f941e6	vmm: Use VLAPIC_CTR* in more places. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37155	2022-11-18 10:25:36 -08:00
John Baldwin	57e0119ef3	vmm vmx: Add VMX_CTR* wrapper macros. These macros are similar to VCPU_CTR* but accept a single vmx_vcpu pointer as the first argument instead of separate vm and vcpuid. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37154	2022-11-18 10:25:36 -08:00
John Baldwin	fca494dad0	vmm svm: Add SVM_CTR* wrapper macros. These macros are similar to VCPU_CTR* but accept a single svm_vcpu pointer as the first argument instead of separate vm and vcpuid. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37153	2022-11-18 10:25:36 -08:00
John Baldwin	869c8d1946	vmm: Remove the per-vm cookie argument from vmmops taking a vcpu. This requires storing a reference to the per-vm cookie in the CPU-specific vCPU structure. Take advantage of this new field to remove no-longer-needed function arguments in the CPU-specific backends. In particular, stop passing the per-vm cookie to functions that either don't use it or only use it for KTR traces. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37152	2022-11-18 10:25:35 -08:00
John Baldwin	1aa5150479	vmm: Refactor storage of CPU-dependent per-vCPU data. Rather than storing static arrays of per-vCPU data in the CPU-specific per-VM structure, adopt a more dynamic model similar to that used to manage CPU-specific per-VM data. That is, add new vmmops methods to init and cleanup a single vCPU. The init method returns a pointer that is stored in 'struct vcpu' as a cookie pointer. This cookie pointer is now passed to other vmmops callbacks in place of the integer index. The index is now only used in KTR traces and when calling back into the CPU-independent layer. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37151	2022-11-18 10:25:35 -08:00
John Baldwin	73abae4493	vmm vmx: Add a global bool to indicate if the host has the TSC_AUX MSR. A future commit will remove direct access to vCPU structures from struct vmx, so add a dedicated boolean for this rather than checking the capabilities for vCPU 0. Reviewed by: corvink, markj Differential Revision: https://reviews.freebsd.org/D37269	2022-11-18 10:25:35 -08:00

9249 commits