They are a bit more informative than raw hexadecimal values.
While here, sort existing defines of bits for AMD MSRs to match the address
order.
Reviewed by: kib, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41816
Lack of an AMD IOMMU driver means we cannot successfully route
interrupts to APIC IDs 255 and over. Do not add the corresponding CPUs
to the per-domain lists of CPUs to which interrupts can be assigned.
This change should be reverted (or, at least the APIC ID limit) once we
have an AMD IOMMU / interrupt remapping driver.
See also commits fa5f94140a ("msi: handle error from BUS_REMAP_INTR in
msi_assign_cpu") and 4258eb5a0d ("x86: handle domains with no CPUs
usable for intr delivery.").
Reviewed by: markj, jhb
Tested by: cperciva (earlier version)
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41618
We can end up with a domain having no CPUs capable of receiving I/O
interrupts. This can occur, for example, when all APIC IDs in a given
domain are 256 or greater, and we have no IOMMU.
In this case disable per-domain interrupt support, effectively reverting
to the behaviour before commit a48de40bcc ("Only use CPUs in the
domain the device is attached to for default"). This has a performance
impact but at least allows the system to be functional. It is a stop-
gap until we can rely on the presence of an IOMMU on all x86 platforms.
Thanks to AMD for providing the high-thread-count machine I used for
testing this change, and to cperciva for testing on other hardware.
Reviewed by: jhb
Tested by: cperciva, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41501
Previously errors from BUS_REMAP_INTR were silently ignored, and we
ended up with non-functional interrupts.
Now we allocate and enable new vectors, but postpone assignment of new
APIC IDs and vectors where we can, until after BUS_REMAP_INTR is
successful. We then disable and free the old vectors.
If BUS_REMAP_INTR fails we restore the old configuration, and disable
and free the new, unused vectors.
Thanks to AMD for providing hardware (with APIC IDs above 255) for
testing.
Reviewed by: jhb
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41455
The repeated uses of `MAXCOMLEN + 1` seem a bit hazardous. If there was
a future need to change the size, the repeats will be troublesome.
Merge everything into `#define INTRNAME_LEN` (matches the name used by
INTRNG).
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D38455
The two interrupt controllers which implement squelching of reports
after a maximum use the same limit. Move the limit to interrupt.h, the
better to encourage other interrupt controllers to implement the same.
Reviewed by: markj
MFC after: 2 weks
Differential Revision: https://reviews.freebsd.org/D35527
On 64bit, there is a 4-byte hole in struct vdso_timekeep32 after
tk_current, if the structure is not packed. This is due to the MD
th_x86_pvc_last_systime being 64bit.
Change amd64 VDSO_TIMEHANDS_MD32 to not use uint64_t, replace it with
pair of uint32_t, as it is done for all other members.
PR: 273085
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
The MSI/MSI-X address includes 8 bits to encode the Destination ID.
Previously IDs over 255 overlapped with the fixed portion of the
address, resulting in an invalid value (and a nonfunctional interrupt).
Instead, print an error message and return EINVAL. The interrupt will
still not work, but the user will have a clue as to why.
PR: 273022
Reviewed by: jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D41395
`intr_bind(u_int vector, u_char cpu);` looked suspicious since
everywhere else "cpu" is a u_int and >256 processors isn't unreasonable
now. `intr_bind()` is not used anywhere in FreeBSD (now, after commit
bf42f3738087). Time to remove.
Relnotes: Yes
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D36901
On x86 Linux via AT_HWCAP2 the user controlled (by tunables) processor
capabilities are exposed.
Reviewed by:
Differential Revision: https://reviews.freebsd.org/D41165
MFC after: 2 weeks
vcpu_info is crucial for the Xen event channel core. Since both the
data and setup steps are identical between architectures, move them to
the common file. Since there is no cross-architecture method to call
a function on every processor during bring-up, simply leave the setup
function.
The number of vcpu_info structures available on the shared information
page varies by architecture. Instead of hard-coding the count use
nitems(). Add a warning message for this being used.
Switch to XEN_VCPUID() and use Xen's typedefs.
panic() on failure since >32 processors is no longer unusual.
royger: Specify 64-byte alignment for vcpu_info to try to defend
against vcpu_info crossing a page boundary. Add detection for this
limit.
Reviewed by: royger
Due to fxsave area is os independent reimplement fxsave handmade code
using copying of a whole area.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D40443
MFC after: 2 weeks
Now that <sys/tslog.h> is wrapped in #ifdef _KERNEL, it's safe to have
tslog annotations in files which might be built from userland (i.e. in
subr_boot.c, which is built as part of the boot loader).
This reverts commit 59588a546f.
The change to subr_boot.c broke the libsa build because the TSLOG
macros have their own definitions for the boot loader -- I didn't
realize that the loader code used subr_boot.c.
I'm currently testing a fix and I'll revert this revert once I'm
satisfied that everything works, but I don't want to leave the
tree broken for too long.
This reverts commit 469cfa3c30.
Booting an amd64 kernel on Firecracker with 1 CPU and 128 MB of RAM,
hammer_time takes roughly 2740 us:
* 55 us in xen_pvh_parse_preload_data
* 20 us in boot_parse_cmdline_delim
* 20 us in boot_env_to_howto
* 15 us in identify_hypervisor
* 1320 us in link_elf_reloc
* 1310 us in relocate_file1 handling ef->rela
* 25 us in init_param1
* 30 us in dpcpu_init
* 355 us in initializecpu
* 255 us in initializecpu calling load_cr4
* 425 us in getmemsize
* 280 us in pmap_bootstrap
* 205 us in create_pagetables
* 10 us in init_param2
* 25 us in pci_early_quirks
* 60 us in cninit
* 90 us in kdb_init
* 105 us in msgbufinit
* 20 us in fpuinit
* 205 us elsewhere in hammer_time
Some of these are unavoidable (e.g. identify_hypervisor uses CPUID and
load_cr4 loads the CR4 register, both of which trap to the hypervisor)
but others may deserve attention.
Sponsored by: https://www.patreon.com/cperciva
Differential Revision: https://reviews.freebsd.org/D40325
This avoids bloating the kernel image when MAXCPU is large.
A follow-up patch for kgdb and other kernel debuggers is needed since
the stoppcbs symbol is now a pointer. Bump __FreeBSD_version so that
debuggers can use osreldate to figure out how to handle stoppcbs.
PR: 269572
MFC after: never
Reviewed by: mjg, emaste
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D39806
The SPDX folks have obsoleted the BSD-2-Clause-NetBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
Early versions of the VT-d spec mentioned 6-level paging support as a
possible value for the SAGAW capability, but later versions removed it
and SAGAW=0x10 is currently listed as a reserved value.
The 6-level (agaw=64) entry in sagaw_bits is furthermore problematic
with clang15 because the attempted comparison against 1ULL << 64 in
dmar_maxaddr2mgaw() causes the compiler to elide the last iteration
of the initial loop, which bypasses the subsequent logic to find the
greatest HW-supported address width. This results in 5-level paging
always being selected regardless of whether the hardware supports it,
which can result address translation failure due to invalid context-
entry programming.
Reviewed by: kib
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39896
Otherwise POSTREAD syncs may re-invalidate the shadow of the data buffer
when copying from bounce pages, resulting in false-positive KMSAN
reports.
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
A signed one-bit wide bit-field can take only the values 0 and -1. Clang
16 introduced a warning that "implicit truncation from 'int' to a
one-bit wide bit-field changes value from 1 to -1". Fix the warnings by
using C99 bool.
Reported by: Clang 16
Reviewed by: emaste, jhb
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D39705
The xen_domain_type and HYPERVISOR_shared_info variables are shared by
all Xen architectures, so they should be in common rather than
reimplemented by each architecture.
hvm_start_flags is used by xen_initial_domain() and so needs to be in
common.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D28982
The event channel source code or equivalent is needed on all
architectures. Since much of this is viable to share, get this moved out
of x86-land. Each interrupt interface then needs a distinct back-end
implementation.
Reviewed by: royger
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Original implementation: Julien Grall <julien@xen.org>, 2014-01-13 17:41:04
Differential Revision: https://reviews.freebsd.org/D30236
Simply moving the interrupt allocation and release functions into files
which belong to the architecture. Since x86 interrupt handling is quite
distinct from other architectures, this is a crucial necessary step.
Identifying the border between x86 and architecture-independent is
actually quite tricky. Similarly, getting the prototypes for the
border right is also quite tricky.
Inspired by the work of Julien Grall <julien@xen.org>,
2015-10-20 09:14:56, but heavily adjusted.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30936
The x86 PIC interface is very much x86-specific and not used by other
architectures. Since most of xen_intr.c can be shared with other
architectures, the PIC interface needs to be broken off.
Introduce wrappers for calls into the architecture-dependent interrupt
layer. All architectures need roughly the same functionality, but the
interface is slightly different between architectures. Due to the
wrappers being so thin, all of them are implemented as inline in
arch-intr.h.
The original implementation was done by Julien Grall in 2015, but this
has required major updating.
Removal of PVHv1 meant substantial portions disappeared. The original
implementation took care of moving interrupt allocation to
xen_arch_intr.c, but this has required massive rework and was broken
off.
In the original implementation the wrappers were normal functions. Some
had empty stubs in xen_intr.c and were removed.
Reviewed by: royger
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>
Original implementation: Julien Grall <julien@xen.org>, 2015-10-20 09:14:56
Differential Revision: https://reviews.freebsd.org/D30909
This value doesn't need to be set in xen_intr_alloc_isrc(). What is
needed is simply to ensure the allocated xenisrc won't appear as free,
even if xi_type is written non-atomically. Since the type is no longer
used to indicate free or not, the calling function should take care of
all non-architecture initialization.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D31188
Scanning the list of interrupts to find an unused entry is rather
inefficient. Instead overlay a free list structure and use a list
instead.
This also has the useful effect of removing the last use of evtchn_type
values outside of xen_intr.c.
Reviewed by: royger
[royger]
- Make avail_list static.
Move the xenisrc structure which needs to be shared between the core Xen
interrupt code and architecture-dependent code into a separate header. A
similar situation exists for the NR_EVENT_CHANNELS constant.
Turn xi_intsrc into a type definition named xi_arch to reflect the new
purpose of being an architectural variable for the interrupt source.
This was originally implemented by Julien Grall, but has been heavily
modified. The core side was renamed "intr-internal.h" and is #include'd
by "arch-intr.h" instead of the other way around. This allows the
architecture to add function definitions which use struct xenisrc.
The original version only moved xi_intsrc into xen_arch_isrc_t. Moving
xi_vector was done by the submitter.
The submitter had also moved xi_activehi and xi_edgetrigger into
xen_arch_isrc_t. Those disappeared with the removal of PVHv1 support.
Copyright note. The current xenisrc structure was introduced at
76acc41fb7 by Justin T. Gibbs. Traces remain, but the strength of
Copyright claims from before 2013 seem pretty weak.
Reviewed by: royger
Submitted by: Elliott Mitchell <ehem+freebsd@m5p.com>, 2021-03-17 19:09:01
Original implementation: Julien Grall <julien@xen.org>, 2015-10-20 09:14:56
Differential Revision: https://reviews.freebsd.org/D30648
[royger]
- Adjust some line lengths
- Fix comment about NR_EVENT_CHANNELS after movement.
- Use #include instead of symlinks.
xen_intr_handle_upcall() has two interfaces. It needs to be called by
the x86 assembly code invoked by the APIC. Second, it needs to be called
as a driver_filter_t for the XenPCI code and for architectures besides
x86.
Unfortunately the driver_filter_t interface was implemented as a wrapper
around the x86-APIC interface. Now create a simple wrapper for the
x86-APIC code, which calls an architecture-independent
xen_intr_handle_upcall().
When called via intr_event_handle(), driver_filter_t functions expect
preemption to be disabled. This removes the need for
critical_enter()/critical_exit() when called this way.
The lapic_eoi() call is only needed on x86 in some cases when invoked
directly as an APIC vector handler.
Additionally driver_filter_t functions have no need to handle interrupt
counters. The intrcnt_add() calling function was reworked to match the
current situation. intrcnt_add() is now only called via one path.
The increment/decrement of curthread->td_intr_nesting_level had
previously been left out. Appears this was mostly harmless, but this
was noticed during implementation and has been added.
CONFIG_X86 is a leftover from use with Linux. While the barrier isn't
needed for FreeBSD on x86, it will be needed for FreeBSD on other
architectures.
Copyright note. xen_intr_intrcnt_add() was introduced at 76acc41fb7
by Justin T. Gibbs. xen_intrcnt_init() was introduced at fd036deac1
by John Baldwin.
sys/x86/xen/xen_arch_intr.c was originally created by Julien Grall in
2015 for the purpose of holding the x86 interrupt interface. Later it
was found xen_intr_handle_upcall() was better earlier, and the x86
interrupt interface better later. As such the filename and header list
belong to Julien Grall, but what those were created for is later.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30006
Keeping released xenisrcs in a known state simplifies allocation, but
forces the allocation function to maintain that state. This turns into
a problem when trying to allow for interchangeable allocation functions.
Fix this issue by ensuring xenisrcs are always *fully* initialized
during binding.
Reviewed by: royger
There are actually several distinct locking domains in xen_intr.c, but
all were sharing the same lock. Both xen_intr_port_to_isrc[] and the
x86 interrupt structures needed protection. Split these two apart as a
precursor to splitting the architecture portions off the file.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30726
Locking for allocation was being done in xen_intr_bind_isrc(), but the
unlock was inside xen_intr_alloc_isrc(). While the lock acquisition at
the end of xen_intr_alloc_isrc() was to modify xen_intr_port_to_isrc[],
NOT allocation. Fix this garbled (though working) locking scheme.
Now locking for allocation is strictly in xen_intr_alloc_isrc(), while
locking to modify xen_intr_port_to_isrc[] is in xen_intr_bind_isrc().
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30726
The call structure around xen_intr_alloc_isrc() was rather awful.
Notably finding a structure for reuse is part of allocation, but this
was done outside xen_intr_alloc_isrc(). Move this into
xen_intr_alloc_isrc() so the function handles all allocation steps.
Reviewed by: royger
Differential Revision: https://reviews.freebsd.org/D30726
As "CPUs", IRQs (vector) and virtual IRQs are always positive integers,
adjust the Xen code to use unsigned integers. Several format strings
need adjustment to match. Additionally single-bit bitfields are
boolean.
No functional change expected.
Reviewed by: royger