Commit graph

9172 commits

Author SHA1 Message Date
Mark Johnston 6adf554abd vmm: Fix handling of errors from subyte()
subyte() returns -1 upon an error, not an errno value.

MFC after:	1 week
Fixes:	e17eca3276 ("vmm: Avoid embedding cpuset_t ioctl ABIs")
2023-12-25 21:04:01 -05:00
Mark Johnston 7b68fb5ab2 thread: Add a return value to cpu_set_upcall()
Some implementations copy data to userspace, an operation which can in
principle fail.  In preparation for adding a __result_use_check
annotation to copyin() and related functions, let implementations of
cpu_set_upcall() return an error, and check for errors when copying data
to user memory.

Reviewed by:	kib, jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D43100
2023-12-25 21:04:00 -05:00
Konstantin Belousov 671a00491d vm_iommu_map()/unmap(): stop transiently wiring already wired pages
Namely, switch from vm_fault_quick_hold() to pmap_extract() KPI to
translate gpa to hpa. Assert that the looked up hpa belongs to the wired
page, as it should be for the VM which is configured for pass-throu
(this is theoretically a restriction that could be removed on newer
DMARs).

Noted by:	alc
Reviewed by:	alc, jhb, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D43140
2023-12-22 19:34:27 +02:00
Konstantin Belousov 3abc72f871 vmm_iommu_modify(): split vm_iommu_map()/unmap() into separate functions
Reviewed by:	alc, jhb, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D43140
2023-12-22 19:34:27 +02:00
Konstantin Belousov 7c8f163184 vmm.h: remove dup declaration
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D43139
2023-12-21 01:51:47 +02:00
Elliott Mitchell 40e1d9d45f xen: add SPDX license tags to Xen headers
These are in fact GPLv2 when distributed with the Linux kernel, but the
license also allows MIT if distributed separately.  Add the markers to
avoid interference by automated tools.

Differential Revision: https://reviews.freebsd.org/D32796
Reviewed by: royger
2023-12-15 14:59:25 +01:00
Bojan Novković 181afaaaee vmm: implement VM_CAP_MASK_HWINTR on AMD CPUs
This patch implements the interrupt blocking VM capability on AMD
CPUs.  Implementing this capability allows the GDB stub to single-step
a virtual machine without landing inside interrupt handlers.

Reviewed by:	jhb, corvink
Sponsored by:   Google, Inc. (GSoC 2022)
Differential Revision:	https://reviews.freebsd.org/D42299
2023-12-07 15:11:04 -08:00
Bojan Novković e3b4fe645e vmm: implement single-stepping for AMD CPUs
This patch implements single-stepping for AMD CPUs using the RFLAGS.TF
single-stepping mechanism.  The GDB stub requests single-stepping
using the VM_CAP_RFLAGS_TF capability.  Setting this capability will
set the RFLAGS.TF bit on the selected vCPU, activate DB exception
intercepts, and activate POPF/PUSH instruction intercepts.  The
resulting DB exception is then caught by the IDT_DB vmexit handler and
bounced to userland where it is processed by the GDB stub.  This patch
also makes sure that the value of the TF bit is correctly updated and
that it is not erroneously propagated into memory.  Stepping over PUSHF
will cause the vm_handle_db function to correct the pushed RFLAGS
value and stepping over POPF will update the shadowed TF bit copy.

Reviewed by:	jhb
Sponsored by:	Google, Inc. (GSoC 2022)
Differential Revision:	https://reviews.freebsd.org/D42296
2023-12-07 15:11:04 -08:00
Bojan Novković 231eee17d2 vmm: enable software breakpoints for AMD CPUs
This patch adds support for software breakpoint vmexits on AMD SVM.
It implements the VM_CAP_BPT_EXIT used to enable software breakpoints.
When enabled, breakpoint vmexits are passed to userspace where they
are handled by the GDB stub.

Reviewed by:	jhb
Sponsored by:	Google, Inc. (GSoC 2022)
Differential Revision:	https://reviews.freebsd.org/D42295
2023-12-07 15:10:56 -08:00
Bojan Novković 78c1d174a1 vmm: refactor event reflection in AMD SVM
This patch refactors AMD SVM event reflection to allow events to be
propagated to userland, rather than always reflected into the guest.

This is necessary to implement some capabilities that request VMEXITs
when a specific exception occurs (e.g. VM_CAP_BPT_EXIT).

Reviewed by:	jhb
Sponsored by:	Google, Inc. (GSoC 2022)
Differential Revision:	https://reviews.freebsd.org/D42405
2023-12-07 15:10:53 -08:00
John Baldwin f54a3890b1 x86: Support multiple PCI MCFG regions
In particular, this enables support for PCI config access for domains
(segments) other than 0.

Reported by:	cperciva
Tested by:	cperciva (m7i.metal-48xl AWS instance)
Reviewed by:	imp
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D42828
2023-11-29 10:32:39 -08:00
John Baldwin 9893a4fd31 x86: Refactor pcie_cfgregopen
Split out some bits of pcie_cfgregopen that only need to be executed
once into helper functions in preparation for supporting multiple MCFG
entries.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D42829
2023-11-29 10:32:16 -08:00
John Baldwin 1587a9db92 pci_cfgreg: Add a PCI domain argument to the low-level register API
This commit changes the API of pci_cfgreg(read|write) to add a domain
argument (referred to as a segment in ACPI parlance) (note that this
is not the same as a NUMA domain, but something PCI-specific).  This
does not yet enable access to domains other than 0, but updates the
API to support domains.

Places that use hard-coded bus/slot/function addresses have been
updated to hardcode a domain of 0.  A few places that have the PCI
domain (segment) available such as the acpi_pcib_acpi.c Host-PCI
bridge driver pass the PCI domain.

The hpt27xx(4) and hptnr(4) drivers fail to attach to a device not on
domain 0 since they provide APIs to their binary blobs that only
permit bus/slot/function addressing.

The x86 non-ACPI PCI bus drivers all hardcode a domain of 0 as they do
not support multiple domains.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D42827
2023-11-29 10:31:47 -08:00
Elliott Mitchell b3195ed2a3 xen: correct spacing in hypercall.h headers
A precursor to merging them.  The spacing differs quite a bit between
the i386 and amd64 hypercall headers, despite very similar content.
Consistently use tabs instead of spaces.

Reviewed by: royger
2023-11-28 12:23:18 +01:00
Warner Losh fdafd315ad sys: Automated cleanup of cdefs and other formatting
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by:		Netflix
2023-11-26 22:24:00 -07:00
Warner Losh 5b31cc94b1 sccs: Manual changes
For the uncommon items: Go through the tree and remove sccs tags that
didn't fit any nice pattern. If in the neighborhood, other SCM tags were
removed when they were detritis of long-ago CVS somehow in the early
mists of the project. Some adjacent copyrights stringswere removed (they
duplicated the copyright notices in the file). This also removed
non-standard formations of omission of SCCS tags (usually by adding an
extra #if 0 somewhere.

After this commit, a number of strings tagged with the 'what' @(#)
prefix remain, but they are primarily copyright notices.

Sponsored by:		Netflix
2023-11-26 22:23:58 -07:00
Warner Losh 29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
Brooks Davis 54d487c4d0 makesyscalls: don't make syscall.mk by default
We only want to produce syscall.mk for the main syscall table so default
to not producing it (send it to /dev/null) and add a syscalls.conf to
sys/kern to trigger the creation of sys/sys/syscall.mk.  This eliminates
the need for entries in other syscalls.conf files and is a cleaner
pattern going forward.

Reviewed by:	kevans, imp
Differential Revision:	https://reviews.freebsd.org/D42663
2023-11-18 00:48:14 +00:00
Warner Losh 49025a1109 _bus.h: Use standard licnese text
All of these used the 'immediately at beginning' variation of the
BSD-2-Clause license. This wasn't intentional, just what I copied from
from a random file in the tree back in 2005. It was not an intentional
decision.

The different arch bus.h files are a mix of BSD-2-Clause and
BSD-4-Clause that have various copyright holders (Charles M. Hannum,
Christopher G. Demetriou, The NetBSD Foundation and KATO Takenori), and
some of the content of these files were likely copied from there.
However, apart from the uncopyrightable interface lines, there are very
few comments. It's unclear if these comments are 'original material'
here to copyright, but to the extent that there is, license it under the
standard BSD-2-Clause copyright that's the norm for the project today.
In any event, the standard BSD-2-Clause is also closer to those
originals.

In addition, FreeBSD uses different type definitions than the original
NetBSD code in part. The comments that were copied have been copied a
lot, but appear in NetBSD's bus.h files in NetBSD 1.3.

While I'm here, assign the copyright, to the extent any exists from me,
to the FreeBSD Foundation. I just cut and pasted these into _bus.h from
the different machine files and those files have a rich history of
modification from the original imports from NetBSD over more than 25
years so it's tricky to say who, exactly, wrote each bit. Given the size
of the files, this seems like the best compromise.  Also add an
acknowledgement to the NetBSD 1.3 bus.h files and their authors (there
were no additional FreeBSD authors listed in the various
sys/*/include/bus.h files). Finally, use the SPDX identifier instead of
multiple copies of the text.

Differential Revision:	https://reviews.freebsd.org/D42532
Sponsored by:		Netflix
2023-11-13 12:25:30 -07:00
Mark Johnston 2b08492382 amd64: Remove PMAP_INLINE
With clang it expands to "inline"; clang in practice may inline
externally visible functions even without the hint.  So just remove the
hints and let the compiler decide.

No functional change intended.  pmap.o is identical before and after
this patch.

Reviewed by:	alc
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D42446
2023-11-02 14:30:10 -04:00
Zhenlei Huang 9e7f349ff1 amd64 pmap: Prefer consistent naming for loader tunable
The sysctl knob 'vm.pmap.allow_2m_x_ept' is loader tunable and have
public document entry in security(7) but is fetched from kernel
environment 'hw.allow_2m_x_ept'. That is inconsistent and obscure.

As there is public security advisory FreeBSD-SA-19:25.mcepsc [1],
people may refer to it and use 'hw.allow_2m_x_ept', let's keep old
name for compatibility.

[1] https://www.freebsd.org/security/advisories/FreeBSD-SA-19:25.mcepsc.asc

Reviewed by:	kib
Fixes:		c08973d09c Workaround for Intel SKL002/SKL012S errata
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D42311
2023-10-21 09:31:58 +08:00
Zhenlei Huang f3ff0918ff vmx: Prefer consistent naming for loader tunables
The following loader tunables do have corresponding sysctl MIBs but
with different names. That may be historical reason. Let's prefer
consistent naming for them so that it will be easier to read and
maintain.

 1. hw.vmm.l1d_flush -> hw.vmm.vmx.l1d_flush
 2. hw.vmm.l1d_flush_sw -> hw.vmm.vmx.l1d_flush_sw
 3. hw.vmm.vmx.use_apic_pir -> hw.vmm.vmx.cap.posted_interrupts
 4. hw.vmm.vmx.use_apic_vid -> hw.vmm.vmx.cap.virtual_interrupt_delivery
 5. hw.vmm.vmx.use_tpr_shadowing -> hw.vmm.vmx.cap.tpr_shadowing

Old names are kept for compatibility.

Meanwhile, add sysctl flag CTLFLAG_TUN to them so that `sysctl -T` will
report them correctly.

Reviewed by:	corvink, jhb, kib, #bhyve
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D42251
2023-10-20 01:18:25 +08:00
Zhenlei Huang afbb8041a0 amd64: Fix two typos of loader tunables
To match the sysctl MIBs and document entries in security(7).

Fixes:	2dec2b4a34 amd64: flush L1 data cache on syscall return with an error
Fixes:	17edf152e5 Control for Special Register Buffer Data Sampling mitigation

Reviewed by:	kib
MFC after:	1 day
Differential Revision:	https://reviews.freebsd.org/D42249
2023-10-19 23:23:33 +08:00
Mark Johnston 8fd0ec53de uiomove: Add some assertions
Make sure that we don't try to copy with a negative resid.

Make sure that we don't walk off the end of the iovec array.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D42098
2023-10-17 09:12:19 -04:00
Mark Johnston a37e484d04 amd64: Zero-fill AP PCPU pages
At least KMSAN relies on zero-initialization of AP PCPU regions, see
commit 4b136ef259.

Prior to commit af1c6d3f30 these were allocated with allocpages() in
the amd64 pmap, which always returns zero-initialized memory.

Reviewed by:	kib
Fixes:		af1c6d3f30 ("amd64: do not leak pcpu pages")
MFC after:	3 days
Sponsored by:	Klara, Inc.
Sponsored by:	Juniper Networks, Inc.
Differential Revision:	https://reviews.freebsd.org/D42241
2023-10-17 09:12:08 -04:00
Mina Galić 74e4a8d208 pmap: add pmap_kextract(9) man page
Add a man page for pmap_kextract(9), with alias to vtophys(9). This man
page is based on pmap_extract(9).

Add it as cross reference in pmap(9), and add comments above the
function implementations.

Co-authored-by:	Graham Perrin <grahamperrin@gmail.com>
Co-authored-by:	mhorne
Sponsored by:	The FreeBSD Foundation
Pull Request:	https://github.com/freebsd/freebsd-src/pull/827
2023-10-13 15:27:24 -03:00
John Baldwin cc1cb9ea0c x86: Rename {stop,start}_emulating to fpu_{enable,disable}
While here, centralize the macros in <x86/fpu.h>.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D42135
2023-10-11 14:32:06 -07:00
John Baldwin e839ebfc0d amd64: Remove a stale comment from cpu_setregs
Reviewed by:	kib, markj, emaste
Differential Revision:	https://reviews.freebsd.org/D42134
2023-10-11 14:22:17 -07:00
Kristof Provost 84d12f887c Add a COMPAT_FREEBSD14 kernel option
Use it wherever COMPAT_FREEBSD13 is currently specified.

Reviewed by:	brooks, zlei
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42100
2023-10-10 11:48:22 +02:00
Dmitry Chagin 5bdd74cc05 linux(4): Drop the outdated comments about sixth register on i386 int0x80
This is well documented in the Linux syscall(2).

MFC after:		1 week
2023-10-10 12:33:22 +03:00
Dmitry Chagin 4fe779900b linux(4): Deduplicate SystemV IPC defines from amd64/linux
MFC after:		1 week
2023-10-04 21:18:45 +03:00
Dmitry Chagin 199e397e9b linux(4): Deorbit linux_nosys
Differential Revision:	https://reviews.freebsd.org/D41901
MFC after:		1 week
2023-10-03 10:38:03 +03:00
Dmitry Chagin 99abee8b7b linux(4): Regen for linux_nosys change
MFC after:		1 week
2023-10-03 10:38:03 +03:00
Dmitry Chagin 8e523be5a5 linux(4): Deorbit linux_nosys from syscalls.master
Differential Revision:	https://reviews.freebsd.org/D41902
MFC after:		1 week
2023-10-03 10:38:02 +03:00
Konstantin Belousov 7acc4240ce linuxolator: fix nosys() to not send SIGSYS
Reviewed by:	dchagin, markj
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D41976
2023-10-03 01:30:53 +03:00
Konstantin Belousov b82b4ae752 sysentvec: add SV_SIGSYS flag
to allow ABIs to indicate that SIGSYS is needed.  Mark all native
FreeBSD ABIs with the flag.

This implicitly marks Linux' ABIs as not delivering SIGSYS on invalid
syscall.

Reviewed by:	dchagin, markj
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D41976
2023-10-03 01:30:52 +03:00
Konstantin Belousov 39024a8914 syscalls: fix missing SIGSYS for several ENOSYS errors
In particular, when the syscall number is too large, or when syscall is
dynamic.  For that, add nosys_sysent structure to pass fake sysent to
syscall top code.

Reviewed by:	dchagin, markj
Discussed with:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D41976
2023-10-03 01:30:52 +03:00
Konstantin Belousov 6b3bb233cd amd64 cpu_fetch_syscall_args_fallback(): fix whitespace
Reviewed by:	dchagin, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D41976
2023-10-03 01:30:52 +03:00
Olivier Certner ebaea1bcd2 x86: AMD Zen2: Zenbleed chicken bit mitigation
Applies only to bare-metal Zen2 processors.  The system currently
automatically applies it to all of them.

Tunable/sysctl 'machdep.mitigations.zenbleed.enable' can be used to
forcibly enable or disable the mitigation at boot or run-time.  Possible
values are:

    0: Mitigation disabled
    1: Mitigation enabled
    2: Run the automatic determination.

Currently, value 2 is the default and has identical effect as value 1.
This might change in the future if we choose to take into account
microcode revisions in the automatic determination process.

The tunable/sysctl value is simply ignored on non-applicable CPU models,
which is useful to apply the same configuration on a set of machines
that do not all have Zen2 processors.  Trying to set it to any integer
value not listed above is silently equivalent to setting it to value 2
(automatic determination).

The current mitigation state can be queried through sysctl
'machdep.mitigations.zenbleed.state', which returns "Not applicable",
"Mitigation enabled" or "Mitigation disabled".  Note that this state is
not guaranteed to be accurate in case of intervening modifications of
the corresponding chicken bit directly via cpuctl(4) (this includes the
cpucontrol(8) utility).  Resetting the desired policy through
'machdep.mitigations.zenbleed.enable' (possibly to its current value)
will reset the hardware state and ensure that the reported state is
again coherent with it.

Reviewed by:	kib
Sponsored by:   The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41817
2023-10-02 15:29:18 -04:00
Dmitry Chagin 28035f675b linux(4): Regen
MFC after:		1 week
2023-09-25 12:26:34 +03:00
Dmitry Chagin 0a16d3d14d linux(4): Update syscalls.master to 6.5
MFC after:		1 week
2023-09-25 12:24:58 +03:00
Mark Johnston f7c733e4fe amd64: Convert a cheap DIAGNOSTIC check to a KASSERT
MFC after:	1 week
2023-09-17 06:27:31 -04:00
Bojan Novković aa3bcaad51 amd64: Add a leaf PTP when pmap_enter(psind=1) creates a wired mapping
This patch reverts the changes made in D19670 and fixes the original
issue by allocating and prepopulating a leaf page table page for wired
userspace 2M pages.

The original issue is an edge case that creates an unmapped, wired
region in userspace. Subsequent faults on this region can trigger wired
superpage creation, which leads to a panic in pmap_demote_pde_locked()
as the pmap does not create a leaf page table page for the wired
superpage. D19670 fixed this by disallowing preemptive creation of
wired superpage mappings, but that fix is currently interfering with an
ongoing effort of speeding up vm_map_wire for large, contiguous entries
(e.g. bhyve wiring guest memory).

Reviewed by:	alc, markj
Sponsored by:	Google, Inc. (GSoC 2023)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D41132
2023-09-17 06:27:22 -04:00
Olivier Certner 125bbadf60 x86: Add defines for workaround bits in AMD's MSR "Decode Configuration"
They are a bit more informative than raw hexadecimal values.

While here, sort existing defines of bits for AMD MSRs to match the address
order.

Reviewed by:	kib, emaste
Sponsored by:   The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41816
2023-09-14 16:24:48 +01:00
Dmitry Chagin ba90a31d08 linux(4): Cleanup includes under amd64/linux32
No functional changes.

MFC after:		1 week
2023-09-11 21:29:40 +03:00
Dmitry Chagin 68df2376e0 linux(4): Cleanup includes under amd64/linux
No functional changes.

MFC after:		1 week
2023-09-11 21:29:34 +03:00
Dmitry Chagin 2a1cf1b6b5 linux(4): Deduplicate mmap2
To help porting the Linux emulation layer to a new platforms start using
Linux names for conditional builds instead of architecture-specific ifdefs.

MFC after:		1 week
2023-09-05 21:16:39 +03:00
Dmitry Chagin 553b1a4e4e linux(4): Deduplicate mprotect, madvise
MFC after:		1 week
2023-09-05 21:15:52 +03:00
John Baldwin 4a9cd9fc22 amd64 db_trace: Reject unaligned frame pointers
Switch to using db_addr_t to hold frame pointer values until they are
verified to be suitably aligned.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D41532
2023-09-01 15:55:37 -07:00
John Baldwin d1e4c63d9e efirt_machdep.c: Trim some unused includes
Reviewed by:	imp, kib, markj
Differential Revision:	https://reviews.freebsd.org/D41596
2023-08-28 16:22:03 -07:00