1433 commits

Author SHA1 Message Date
Konstantin Belousov 24e38af60a DMAR: add knob to disable RMRR entries installation into domains
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-12-26 03:28:22 +02:00
Konstantin Belousov 7153d5e4bc dmar(9): style, fix indent
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-12-26 03:28:22 +02:00
Konstantin Belousov 6afa2333d2 iommu: remove leftover sys/cdefs.h includes
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-12-26 03:28:22 +02:00
Elliott Mitchell 4c9e6ad320 xen: add atomic #defines to accommodate differing xen_ulong_t sizes
Alas, ARM declared xen_ulong_t to be 64-bits long, unlike i386 where
it matches the word size.  As a result, compatibility wrappers are
needed for Xen atomic operations.

Reviewed by: royger
2023-12-15 14:59:26 +01:00
Mitchell Horne 3933ff56f9 busdma: tidy bus_dma_run_filter() functions
After removing filter functionality, the naming doesn't clearly
represent what the function does, so try to address this. Include some
code clarity and style improvements.

Create a common version in subr_busdma_bounce.c, used by most
implementations. powerpc still needs its own version of the function,
due to its dmat->iommu == NULL check.

No functional change intended.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D42896
2023-12-06 19:11:39 -04:00
Mitchell Horne 1228b93b41 busdma: remove parent tag tracking
Without filter functions, we do not need to keep track of tag ancestry.
All inheritance of the parent tag's parameters occurs when creating the
new child tag.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D42895
2023-12-06 19:11:39 -04:00
Mitchell Horne 900907f439 busdma: kill filter functionality internally
Address filter functions are unused, unsupported, and now rejected.
Simplify some busdma code by removing filter functionality completely.

Note that the chains of parent tags become useless, and will be cleaned
up in the next commit.

No functional change intended.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D42894
2023-12-06 19:11:39 -04:00
Mitchell Horne 7cb028deff busdma: Prevent the use of filters with bus_dma_tag_create()
A deprecation notice was added to the bus_dma(9) man page by scottl@ in
September 2020 discouraging the use of filter functions. I've performed
an attentive check of all callers in the tree and everything that exists
today passes NULL for both filtfunc and filtarg. Thus, we should start
returning EINVAL if these arguments are non-NULL to prevent new usages
from popping up. Update the man page to be more clear about this.

The deprecation notice has been present since at least 13.0-RELEASE, so this
is the appropriate step for the lifetime of 15, without actually
breaking the driver API. Stable branches will emit a warning instead.

This change enables the removal of a fair amount of unused complexity
across the various busdma implementations.

Reviewed by:	jhb
MFC after:	never
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D42852
2023-12-06 19:10:25 -04:00
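The argument check described in 7cb028deff can be pictured with a minimal C sketch. The names `ex_filter_t`, `ex_check_filter` and `ex_check_filter_demo` are hypothetical stand-ins and not the kernel's code; only the behaviour (EINVAL when filtfunc or filtarg is non-NULL) is taken from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a busdma filter callback type. */
typedef int ex_filter_t(void *arg, unsigned long paddr);

/*
 * Hypothetical helper: accept a tag-creation request only when no
 * filter function and no filter argument are supplied.
 */
int
ex_check_filter(ex_filter_t *filtfunc, void *filtarg)
{
	if (filtfunc != NULL || filtarg != NULL)
		return (EINVAL);
	return (0);
}

/* Small self-check showing the accepted and the rejected case. */
void
ex_check_filter_demo(void)
{
	int dummy = 0;

	assert(ex_check_filter(NULL, NULL) == 0);
	assert(ex_check_filter(NULL, &dummy) == EINVAL);
}
```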
John Baldwin f54a3890b1 x86: Support multiple PCI MCFG regions
In particular, this enables support for PCI config access for domains
(segments) other than 0.

Reported by:	cperciva
Tested by:	cperciva (m7i.metal-48xl AWS instance)
Reviewed by:	imp
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D42828
2023-11-29 10:32:39 -08:00
John Baldwin 1587a9db92 pci_cfgreg: Add a PCI domain argument to the low-level register API
This commit changes the API of pci_cfgreg(read|write) to add a domain
argument, referred to as a segment in ACPI parlance.  Note that this is
a PCI-specific notion and not the same as a NUMA domain.  This does not
yet enable access to domains other than 0, but updates the API to
support domains.

Places that use hard-coded bus/slot/function addresses have been
updated to hardcode a domain of 0.  A few places that have the PCI
domain (segment) available such as the acpi_pcib_acpi.c Host-PCI
bridge driver pass the PCI domain.

The hpt27xx(4) and hptnr(4) drivers fail to attach to a device not on
domain 0 since they provide APIs to their binary blobs that only
permit bus/slot/function addressing.

The x86 non-ACPI PCI bus drivers all hardcode a domain of 0 as they do
not support multiple domains.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D42827
2023-11-29 10:31:47 -08:00
Elliott Mitchell c7368ccb68 xen: remove xen_domain_type enum/variable
The vm_guest variable readily covers all uses of xen_domain_type, so
merge them together.  Since support for PV domains has been removed,
hard-code xen_pv_domain() to return false.

Reviewed by: royger
2023-11-28 13:40:19 +01:00
Elliott Mitchell c5c26f15f8 xen/x86: move x86-only variable out of common
Commit 27c36a12f1 is an x86-only feature.  As such xen_evtchn_needs_ack
should only exist on x86.

Differential Revision: https://reviews.freebsd.org/D29913
Reviewed by: royger
[royger]: adjust comment.
2023-11-28 13:30:40 +01:00
Elliott Mitchell 54a0b7203c xen/apic: remove passing trapframe as argument
While otherwise a handy potential approach, getting the trapframe via the
argument isn't documented and isn't supposed to be used.  While
ipi_bitmap_handler() and ipi_swi_handler() need to be passed the
trapframe as their arguments, the Xen functions can retrieve it from
curthread->td_intr_frame, which is the proper way.

Reviewed by: royger
2023-11-28 13:22:30 +01:00
Warner Losh fdafd315ad sys: Automated cleanup of cdefs and other formatting
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by:		Netflix
2023-11-26 22:24:00 -07:00
Warner Losh 29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree using automated scripting, with
two minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
John Baldwin a03a335a80 x86 nexus: Use bus_generic_rman_*_resource
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D42740
2023-11-24 09:28:10 -08:00
John Baldwin b887b665eb nexus: Use resource_validate_map_request
Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D42724
2023-11-23 09:06:37 -08:00
John Baldwin ecf2106c07 arm64/amd64/riscv nexus: Use bus_generic_rl_*
Reviewed by:	mhorne, imp
Differential Revision:	https://reviews.freebsd.org/D42716
2023-11-22 09:06:33 -08:00
Warner Losh 20f8814cd3 busdma: On systems that use subr_busdma_bounce, measure deferred time
Measure the total deferred time (from the time we decide to defer until
we try again) for busdma_load requests. On systems that don't ever
defer, there is no performance change. Add new sysctl
hw.busdma.zoneX.total_deferred_time to report this (in
microseconds).

Normally, deferrals don't happen in modern hardware... Except there's a
lot of buggy hardware that can't cope with memory > 4GB or that can't
cross a 4GB boundary (or even more restrictive values), necessitating
bouncing. This will measure the effect on the I/Os of this deferral.

Sponsored by:		Netflix
Reviewed by:		gallatin, mav
Differential Revision:	https://reviews.freebsd.org/D42550
2023-11-13 07:23:53 -07:00
Zhenlei Huang 12cce5994b x86: Prefer consistent naming for loader tunables
The following loader tunables have corresponding sysctl MIBs, but with
inconsistent naming, probably for historical reasons. Prefer consistent
naming for them so that they are easier to maintain.

 1. hw.dmar.timeout -> hw.iommu.dmar.timeout
 2. hw.lapic_eoi_suppression -> hw.apic.eoi_suppression
 3. hw.lapic_tsc_deadline -> hw.apic.timer_tsc_deadline
 4. hw.x2apic_enable -> hw.apic.x2apic_mode

Those tunables are for field debugging, so there is no need to keep the
old names for compatibility.

Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D42248
2023-10-21 09:31:58 +08:00
John Baldwin bfccb4a429 x86: Cosmetic cleanups to struct msi_intsrc
- Sort members by size.

- Change msi_msix from a u_int to a bool.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D42305
2023-10-20 14:53:05 -07:00
John Baldwin 2d49248921 x86 msi: Enable/disable IDT vectors for MSI groups all at once
Unlike MSI-X, when a device uses multiple MSI interrupts, the entire
group of interrupts are enabled/disabled at once in the relevant PCI
config register.  Currently, the interrupt code enables the IDT vector
for each MSI interrupt when a handler is first registered.  If the PCI
device triggers an MSI interrupt which doesn't yet have a handler,
this can trigger a panic when the Xrsvd ISR executes rather than
treating it as a stray device interrupt.

To fix, enable all the IDT vectors for an MSI group when the first
interrupt handler is configured, and don't disable the IDT vectors
until the last interrupt handler for the group is torn down.

When migrating an MSI group between CPUs, enable/disable the entire
group of IDT vectors if at least one interrupt handler is configured
for the group.

Reported by:	jhay
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D42232
2023-10-20 14:52:38 -07:00
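The enable-on-first, disable-on-last rule from 2d49248921 can be sketched as a handler count per MSI group. This is a simplified model under stated assumptions: `struct ex_msi_group` and the `ex_*` functions are hypothetical and the "IDT state" is just an array of flags, not the kernel's interrupt code.

```c
#include <assert.h>
#include <stdbool.h>

#define EX_MSI_MAX	32	/* MSI supports at most 32 vectors per group */

/* Hypothetical model of one MSI group; not the kernel's struct msi_intsrc. */
struct ex_msi_group {
	int	count;				/* vectors in the group */
	int	handlers;			/* handlers currently registered */
	bool	idt_enabled[EX_MSI_MAX];	/* modelled IDT vector state */
};

/* Enable or disable every modelled IDT vector of the group at once. */
static void
ex_set_group(struct ex_msi_group *g, bool on)
{
	for (int i = 0; i < g->count; i++)
		g->idt_enabled[i] = on;
}

/* The first handler registered for the group enables all of its vectors. */
void
ex_add_handler(struct ex_msi_group *g)
{
	if (g->handlers++ == 0)
		ex_set_group(g, true);
}

/* Only tearing down the last handler disables the group's vectors. */
void
ex_remove_handler(struct ex_msi_group *g)
{
	assert(g->handlers > 0);
	if (--g->handlers == 0)
		ex_set_group(g, false);
}
```

With this shape, an interrupt arriving on a vector that has no handler yet still finds its IDT entry enabled, so it can be treated as a stray interrupt.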
John Baldwin cc1cb9ea0c x86: Rename {stop,start}_emulating to fpu_{enable,disable}
While here, centralize the macros in <x86/fpu.h>.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D42135
2023-10-11 14:32:06 -07:00
Ed Maste 792655abd6 x86: make EARLY_AP_STARTUP mandatory
When early AP startup was introduced in 2016 it was put behind a kernel
option EARLY_AP_STARTUP as a transition aid, so that it could be turned
off if necessary.  For x86 the non-EARLY_AP_STARTUP case is no longer
functional, so disallow it.

Other archs are still incompatible with EARLY_AP_STARTUP, so the option
cannot yet be removed entirely.

Reported by:	wollman
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41351
2023-10-09 16:08:22 -04:00
Zhenlei Huang 149b9c234b x86: Add sysctl flag CTLFLAG_TUN to loader tunables
The following sysctl variables are actually loader tunables. Add sysctl
flag CTLFLAG_TUN to them so that `sysctl -T` will report them correctly.

 1. machdep.idle
 2. machdep.idle_apl31

No functional change intended.

Reviewed by:	kib, imp
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D42113
2023-10-09 18:30:21 +08:00
Olivier Certner ebaea1bcd2 x86: AMD Zen2: Zenbleed chicken bit mitigation
Applies only to bare-metal Zen2 processors.  The system currently
automatically applies it to all of them.

Tunable/sysctl 'machdep.mitigations.zenbleed.enable' can be used to
forcibly enable or disable the mitigation at boot or run-time.  Possible
values are:

    0: Mitigation disabled
    1: Mitigation enabled
    2: Run the automatic determination.

Currently, value 2 is the default and has the same effect as value 1.
This might change in the future if we choose to take into account
microcode revisions in the automatic determination process.

The tunable/sysctl value is simply ignored on non-applicable CPU models,
which is useful to apply the same configuration on a set of machines
that do not all have Zen2 processors.  Trying to set it to any integer
value not listed above is silently equivalent to setting it to value 2
(automatic determination).

The current mitigation state can be queried through sysctl
'machdep.mitigations.zenbleed.state', which returns "Not applicable",
"Mitigation enabled" or "Mitigation disabled".  Note that this state is
not guaranteed to be accurate in case of intervening modifications of
the corresponding chicken bit directly via cpuctl(4) (this includes the
cpucontrol(8) utility).  Resetting the desired policy through
'machdep.mitigations.zenbleed.enable' (possibly to its current value)
will reset the hardware state and ensure that the reported state is
again coherent with it.

Reviewed by:	kib
Sponsored by:   The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41817
2023-10-02 15:29:18 -04:00
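The policy rules spelled out in ebaea1bcd2 can be summarised in a short C sketch. The function and constant names are hypothetical and the `applicable` flag stands in for the kernel's own CPU detection; only the mapping of values (0, 1, 2, anything else treated as 2, ignored on non-applicable CPUs) comes from the commit message.

```c
#include <assert.h>
#include <stdbool.h>

enum {
	EX_ZENBLEED_OFF = 0,	/* mitigation disabled */
	EX_ZENBLEED_ON = 1,	/* mitigation enabled */
	EX_ZENBLEED_AUTO = 2	/* automatic determination (default) */
};

/* Any integer not listed above is treated as automatic determination. */
int
ex_zenbleed_normalize(int value)
{
	if (value == EX_ZENBLEED_OFF || value == EX_ZENBLEED_ON)
		return (value);
	return (EX_ZENBLEED_AUTO);
}

/*
 * Decide whether the mitigation is applied.  'applicable' is a
 * hypothetical input meaning "bare-metal Zen2 processor".
 */
bool
ex_zenbleed_mitigate(int value, bool applicable)
{
	if (!applicable)
		return (false);	/* setting is ignored on other CPUs */
	if (ex_zenbleed_normalize(value) == EX_ZENBLEED_OFF)
		return (false);
	/* Automatic determination currently behaves like "enabled". */
	return (true);
}

/* Small self-check of the documented mapping. */
void
ex_zenbleed_demo(void)
{
	assert(ex_zenbleed_normalize(7) == EX_ZENBLEED_AUTO);
	assert(ex_zenbleed_mitigate(EX_ZENBLEED_AUTO, true));
	assert(!ex_zenbleed_mitigate(EX_ZENBLEED_ON, false));
}
```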
John Hay d33a4ae8ba x86: Properly align interrupt vectors for MSI
MSI (not MSI-X) interrupt vectors must be allocated in groups that are
powers of 2, and the block of IDT vectors must be aligned to the size
of the request.

The code in native_apic_alloc_vectors() does an alignment check in the loop:

    if ((vector & (align - 1)) != 0)
        continue;
    first = vector;

But it adds APIC_IO_INTS to the value it returns:

    return (first + APIC_IO_INTS);

The problem is that APIC_IO_INTS is not a multiple of 32; it is 48.

As a result, a request for 32 vectors (the max supported by MSI) was
not always aligned.  To fix, check the alignment of
'vector + APIC_IO_INTS' in the loop.

PR:		274074
Reviewed by:	jhb
2023-09-28 14:08:08 -07:00
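The misalignment described in d33a4ae8ba can be reproduced with a small standalone C model. It assumes every table slot is free and uses hypothetical names (`ex_first_aligned`, `EX_NUM_IO_INTS`); only the value 48 for APIC_IO_INTS and the two alignment checks are taken from the commit message.

```c
#include <assert.h>

#define EX_APIC_IO_INTS	48	/* value quoted in the commit message */
#define EX_NUM_IO_INTS	191	/* hypothetical table size for this sketch */

/*
 * Return the first IDT vector of an aligned block, assuming all slots
 * are free.  With check_idt == 0 the table index is tested (the old
 * behaviour); otherwise the resulting IDT vector is tested (the fix).
 */
int
ex_first_aligned(unsigned int align, int check_idt)
{
	unsigned int vector, tested;

	for (vector = 0; vector < EX_NUM_IO_INTS; vector++) {
		tested = check_idt ? vector + EX_APIC_IO_INTS : vector;
		if ((tested & (align - 1)) != 0)
			continue;
		return ((int)(vector + EX_APIC_IO_INTS));
	}
	return (-1);
}

/* Self-check: the old test yields 48 for a 32-vector request, the fix 64. */
void
ex_first_aligned_demo(void)
{
	assert(ex_first_aligned(32, 0) == 48);	/* 48 % 32 != 0 */
	assert(ex_first_aligned(32, 1) == 64);	/* 64 % 32 == 0 */
}
```

Requests of 16 vectors or fewer happen to work with either check because 48 is a multiple of 16, which is why only some allocations were affected.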
Olivier Certner 125bbadf60 x86: Add defines for workaround bits in AMD's MSR "Decode Configuration"
They are a bit more informative than raw hexadecimal values.

While here, sort existing defines of bits for AMD MSRs to match the address
order.

Reviewed by:	kib, emaste
Sponsored by:   The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41816
2023-09-14 16:24:48 +01:00
Ed Maste 0b029e9e85 x86: Introduce APIC ID limit by default on AMD hardware
Lack of an AMD IOMMU driver means we cannot successfully route
interrupts to APIC IDs 255 and over.  Do not add the corresponding CPUs
to the per-domain lists of CPUs to which interrupts can be assigned.

This change should be reverted (or, at least the APIC ID limit) once we
have an AMD IOMMU / interrupt remapping driver.

See also commits fa5f94140a ("msi: handle error from BUS_REMAP_INTR in
msi_assign_cpu") and 4258eb5a0d ("x86: handle domains with no CPUs
usable for intr delivery.").

Reviewed by:	markj, jhb
Tested by:	cperciva (earlier version)
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41618
2023-08-29 13:25:30 -04:00
Ed Maste 4258eb5a0d x86: handle domains with no CPUs usable for intr delivery
We can end up with a domain having no CPUs capable of receiving I/O
interrupts.  This can occur, for example, when all APIC IDs in a given
domain are 256 or greater, and we have no IOMMU.

In this case disable per-domain interrupt support, effectively reverting
to the behaviour before commit a48de40bcc ("Only use CPUs in the
domain the device is attached to for default").  This has a performance
impact but at least allows the system to be functional.  It is a stop-
gap until we can rely on the presence of an IOMMU on all x86 platforms.

Thanks to AMD for providing the high-thread-count machine I used for
testing this change, and to cperciva for testing on other hardware.

Reviewed by:	jhb
Tested by:	cperciva, emaste
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41501
2023-08-21 15:52:10 -04:00
Dmitry Chagin 0541653520 linux(4): Remove sys/cdefs.h inclusion under x86/linux due to 685dc743
2023-08-18 15:58:32 +03:00
Ed Maste fa5f94140a msi: handle error from BUS_REMAP_INTR in msi_assign_cpu
Previously errors from BUS_REMAP_INTR were silently ignored, and we
ended up with non-functional interrupts.

Now we allocate and enable new vectors, but postpone assignment of new
APIC IDs and vectors where we can, until after BUS_REMAP_INTR is
successful.  We then disable and free the old vectors.

If BUS_REMAP_INTR fails we restore the old configuration, and disable
and free the new, unused vectors.

Thanks to AMD for providing hardware (with APIC IDs above 255) for
testing.

Reviewed by:	jhb
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41455
2023-08-17 20:03:48 -04:00
Elliott Mitchell 5ad59b9153 intr: merge interrupt table uses of MAXCOMLEN into INTRNAME_LEN
The repeated uses of `MAXCOMLEN + 1` seem a bit hazardous.  If there was
a future need to change the size, the repeats will be troublesome.
Merge everything into `#define INTRNAME_LEN` (matches the name used by
INTRNG).

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D38455
2023-08-17 18:10:02 -04:00
Elliott Mitchell d8099e33c7 intr: move MAX_STRAY_LOG to interrupt.h
The two interrupt controllers which implement squelching of reports
after a maximum use the same limit.  Move the limit to interrupt.h, the
better to encourage other interrupt controllers to implement the same.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35527
2023-08-17 18:10:02 -04:00
Warner Losh 031beb4e23 sys: Remove $FreeBSD$: one-line sh pattern
Remove /^\s*#[#!]?\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:58 -06:00
Warner Losh 685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh 71625ec9ad sys: Remove $FreeBSD$: one-line .c comment pattern
Remove /^/[*/]\s*\$FreeBSD\$.*\n/
2023-08-16 11:54:24 -06:00
Warner Losh 2ff63af9b8 sys: Remove $FreeBSD$: one-line .h pattern
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:18 -06:00
Warner Losh 95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Ed Maste fe5d8f7a64 x86: include CPU ID in "Invalid CPU ID" panic
Sponsored by:	The FreeBSD Foundation
2023-08-15 09:38:29 -04:00
Konstantin Belousov 93626d5437 tc_fill_vdso_timehands32(): fix
On 64bit, there is a 4-byte hole in struct vdso_timekeep32 after
tk_current, if the structure is not packed.  This is due to the MD
th_x86_pvc_last_systime being 64bit.

Change amd64 VDSO_TIMEHANDS_MD32 to not use uint64_t, replace it with
pair of uint32_t, as it is done for all other members.

PR:	273085
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-08-13 01:34:08 +03:00
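The padding hole described in 93626d5437 can be illustrated with two simplified layouts. These structs are hypothetical reductions, not the real struct vdso_timekeep32, and the 4-byte hole assumes an ABI such as amd64 where uint64_t is 8-byte aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout with a 64-bit MD member after a 32-bit field. */
struct ex_tk_u64 {
	uint32_t	tk_current;
	uint64_t	md_last_systime;
};

/* Same data expressed as a pair of 32-bit words, as the fix does. */
struct ex_tk_pair {
	uint32_t	tk_current;
	uint32_t	md_last_systime[2];
};

/* Bytes of padding between tk_current and the MD member. */
size_t
ex_hole_u64(void)
{
	return (offsetof(struct ex_tk_u64, md_last_systime) -
	    sizeof(uint32_t));
}

size_t
ex_hole_pair(void)
{
	return (offsetof(struct ex_tk_pair, md_last_systime) -
	    sizeof(uint32_t));
}

/* Self-check: the pair layout never has a hole after tk_current. */
void
ex_hole_demo(void)
{
	assert(ex_hole_pair() == 0);
	assert(ex_hole_u64() == 0 || ex_hole_u64() == 4);
}
```

Because the 32-bit variant has to match what a 32-bit consumer sees, the pair-of-uint32_t form keeps the layout identical regardless of the compiling ABI.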
Ed Maste cbf845052f msi: report error for attempt to use APIC ID > 255
The MSI/MSI-X address includes 8 bits to encode the Destination ID.
Previously IDs over 255 overlapped with the fixed portion of the
address, resulting in an invalid value (and a nonfunctional interrupt).

Instead, print an error message and return EINVAL.  The interrupt will
still not work, but the user will have a clue as to why.

PR:		273022
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41395
2023-08-09 13:52:43 -04:00
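Why an APIC ID above 255 corrupts the address, as described in cbf845052f, can be seen from a small C sketch of the x86 MSI address layout: a fixed 0xFEE prefix in bits 31:20 and the 8-bit Destination ID in bits 19:12. The macro and function names are hypothetical and other address bits are left at zero for simplicity.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define EX_MSI_ADDR_BASE	0xfee00000u	/* fixed portion, bits 31:20 */
#define EX_MSI_DEST_SHIFT	12		/* Destination ID at bits 19:12 */
#define EX_MSI_DEST_MAX		0xffu		/* 8-bit Destination ID field */

/* Unchecked composition: IDs over 255 spill into the fixed bits. */
uint32_t
ex_msi_addr_unchecked(unsigned int apic_id)
{
	return (EX_MSI_ADDR_BASE | ((uint32_t)apic_id << EX_MSI_DEST_SHIFT));
}

/* Checked composition: reject IDs that do not fit in the field. */
int
ex_msi_addr(unsigned int apic_id, uint32_t *addr)
{
	if (apic_id > EX_MSI_DEST_MAX)
		return (EINVAL);
	*addr = ex_msi_addr_unchecked(apic_id);
	return (0);
}

/* Self-check: ID 256 sets bit 20 and so alters the fixed prefix. */
void
ex_msi_addr_demo(void)
{
	uint32_t addr = 0;

	assert(ex_msi_addr(1, &addr) == 0 && addr == 0xfee01000u);
	assert(ex_msi_addr(256, &addr) == EINVAL);
	assert(ex_msi_addr_unchecked(256) == 0xfef00000u);
}
```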
Elliott Mitchell eee6537665 x86: remove intr_bind
`intr_bind(u_int vector, u_char cpu);` looked suspicious since
everywhere else "cpu" is a u_int and >256 processors isn't unreasonable
now.  `intr_bind()` is not used anywhere in FreeBSD (now, after commit
bf42f3738087).  Time to remove.

Relnotes:	Yes
Reviewed by:	mjg
Differential Revision: https://reviews.freebsd.org/D36901
2023-08-03 17:01:56 -04:00
Elliott Mitchell 2bb16c6352 x86: retire use of intr_bind
`intr_bind(u_int vector, u_char cpu);` looked suspicious since
everywhere else "cpu" is a u_int and >256 processors isn't unreasonable
now.

Reviewed by:	mjg
Differential Revision: https://reviews.freebsd.org/D36901
2023-08-03 17:01:18 -04:00
Dmitry Chagin 4281dab8bc linux(4): Add elf_hwcap2 to x86
On x86, Linux exposes the user-controlled (by tunables) processor
capabilities via AT_HWCAP2.

Reviewed by:
Differential Revision:	https://reviews.freebsd.org/D41165
MFC after:		2 weeks
2023-07-28 11:56:59 +03:00
Elliott Mitchell 20fc5bf7df xen: move vcpu_info to common, leave hook for setup
vcpu_info is crucial for the Xen event channel core.  Since both the
data and setup steps are identical between architectures, move them to
the common file.  Since there is no cross-architecture method to call
a function on every processor during bring-up, simply leave the setup
function.

The number of vcpu_info structures available on the shared information
page varies by architecture.  Instead of hard-coding the count use
nitems().  Add a warning message for this being used.

Switch to XEN_VCPUID() and use Xen's typedefs.

panic() on failure since >32 processors is no longer unusual.

royger: Specify 64-byte alignment for vcpu_info to try to defend
against vcpu_info crossing a page boundary.  Add detection for this
limit.

Reviewed by: royger
2023-07-21 10:59:12 +02:00
Mark Johnston e60316d1ea x86: Add defines for a couple of thermal and PM bits
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2023-06-19 13:32:22 -04:00
Johannes Totz e74dd9577f hwpstate_amd: calculate power if P-state info comes from MSR
Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D40140
2023-06-12 12:52:24 -04:00
Dmitry Chagin cbbac56091 linux(4): Preserve fpu xsave state across signal delivery on amd64
PR:			270247
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D40444
MFC after:		2 weeks
2023-06-09 01:33:26 +03:00
Dmitry Chagin 920184ed6e linux(4): In preparation for xsave, refactor fxsave code on amd64
Because the fxsave area is OS-independent, reimplement the hand-written
fxsave code by copying the whole area.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D40443
MFC after:		2 weeks
2023-06-09 01:32:46 +03:00