Commit graph

2520 commits

Author SHA1 Message Date
Mark Johnston d03e1ffbea arm64: Remove some redundant calculations from pmap_bootstrap{,_san1}()
No functional change intended.

MFC after:	1 week
Sponsored by:	Klara, Inc.
Sponsored by:	Juniper Networks, Inc.
2024-06-20 17:53:54 -04:00
Mark Johnston ddf0ed09bd sdt: Implement SDT probes using hot-patching
The idea here is to avoid a memory access and conditional branch per
probe site.  Instead, the probe is represented by an "unreachable"
unconditional function call.  asm goto is used to store the address of
the probe site (represented by a no-op sled) and the address of the
function call into a tracepoint record.  Each SDT probe carries a list
of tracepoints.

When the probe is enabled, the no-op sled corresponding to each
tracepoint is overwritten with a jmp to the corresponding label.  The
implementation uses smp_rendezvous() to park all other CPUs while the
instruction is being overwritten, as this can't be done atomically in
general.  The compiler moves argument marshalling code and the
sdt_probe() function call out-of-line, i.e., to the end of the function.

Per gallatin@ in D43504, this approach has less overhead when probes are
disabled.  To make the implementation a bit simpler, I removed support
for probes with 7 arguments; nothing makes use of this except a
regression test case.  It could be re-added later if need be.

The approach taken in this patch enables some more improvements:
1. We can now automatically fill out the "function" field of SDT probe
   names.  The SDT macros let the programmer specify the function and
   module names, but this is really a bug and shouldn't have been
   allowed.  The intent was to be able to have the same probe in
   multiple functions and to let the user restrict which probes actually
   get enabled by specifying a function name or glob.
2. We can avoid branching on SDT_PROBES_ENABLED() by adding the ability
   to include blocks of code in the out-of-line path.  For example:

	if (SDT_PROBES_ENABLED()) {
		int reason = CLD_EXITED;

		if (WCOREDUMP(signo))
			reason = CLD_DUMPED;
		else if (WIFSIGNALED(signo))
			reason = CLD_KILLED;
		SDT_PROBE1(proc, , , exit, reason);
	}

could be written

	SDT_PROBE1_EXT(proc, , , exit, reason,
		int reason;

		reason = CLD_EXITED;
		if (WCOREDUMP(signo))
			reason = CLD_DUMPED;
		else if (WIFSIGNALED(signo))
			reason = CLD_KILLED;
	);

In the future I would like to use this mechanism more generally, e.g.,
to remove branches and marshalling code used by hwpmc, and generally to
make it easier to add new tracepoint consumers without having to add
more conditional branches to hot code paths.

Reviewed by:	Domagoj Stolfa, avg
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D44483
2024-06-19 16:57:41 -04:00
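
The SDT_PROBE1_EXT example in the message above can be approximated in portable C to show what the extra block argument does. This is only a sketch of the macro's shape: it uses an ordinary unlikely branch rather than asm goto and hot-patching, and every name ending in _sketch (as well as the REASON_* constants) is hypothetical and not part of the kernel's SDT interface.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for CLD_EXITED, CLD_KILLED and CLD_DUMPED. */
enum { REASON_EXITED = 1, REASON_KILLED = 2, REASON_DUMPED = 3 };

static bool probes_enabled_sketch;	/* stands in for "probe is enabled" */
static int last_probe_arg = -1;		/* records what the probe received */

static void
probe_fire_sketch(int arg0)
{
	last_probe_arg = arg0;
}

/*
 * The variadic part is a block of statements that runs only when the
 * probe is enabled, so the marshalling code stays off the common path.
 * The real mechanism replaces this branch with a patched no-op sled.
 */
#define	PROBE1_EXT_SKETCH(arg0, ...) do {				\
	if (__builtin_expect(probes_enabled_sketch, 0)) {		\
		__VA_ARGS__						\
		probe_fire_sketch(arg0);				\
	}								\
} while (0)

static void
report_exit_sketch(bool dumped, bool signaled)
{
	PROBE1_EXT_SKETCH(reason,
		int reason;

		reason = REASON_EXITED;
		if (dumped)
			reason = REASON_DUMPED;
		else if (signaled)
			reason = REASON_KILLED;
	);
}
```

With the probe disabled, report_exit_sketch() never computes the reason. With it enabled, the block runs and the probe receives the computed value.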
Mark Johnston 5823a09f79 cavium/thunder: Use device_set_desc()
No functional change intended.

MFC after:	1 week
2024-06-16 16:37:25 -04:00
Mark Johnston 4441dd4094 vm_phys: Fix a typo
Fixes:	b16b4c22d2 ("vm_page: Implement lazy page initialization")
Reported by:	Steffen Nurpmeso <steffen@sdaoden.eu>
2024-06-16 13:33:00 -04:00
Bojan Novković 5d4545a227 arm64 pmap: Release PTP reference on leaf ptpage allocation failure
808f5ac fixed an edge case involving mlock() and superpage creation
by creating and inserting a leaf pagetable page for mlock'd superpages.
However, the code does not properly release the reference to the
pagetable page in the error handling path.
This commit fixes the issue by adding calls to 'pmap_abort_ptp'
in the error handling path.

Reported by: alc
Approved by: markj (mentor)
Fixes: 808f5ac
Differential Revision:	https://reviews.freebsd.org/D45578
2024-06-16 18:19:26 +02:00
John F. Carr 97ab935d56 rk_pinctrl: fix error check
The parse_bias method returns a signed int, with a value of -1 when
the device tree reports nothing of the bias configuration. Convert the
local 'bias' from unsigned to signed to properly check this condition.

PR:		229721
Reviewed by:	mhorne
MFC after:	3 days
2024-06-14 13:42:27 -03:00
Mark Johnston d730cdea2a arm64/vmm: Avoid unnecessary indirection in vmmops_modinit()
Most of vmm.h is machine-independent.  Simplify merging amd64 and arm64
vmm code by removing this machine-dependent routine from arm64's vmm.h.
No functional change intended.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D45557
2024-06-13 21:19:00 -04:00
Mark Johnston a03354b002 arm64/vmm: Implement vm_disable_vcpu_creation()
No functional change intended.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D45556
2024-06-13 21:19:00 -04:00
Mark Johnston b16b4c22d2 vm_page: Implement lazy page initialization
FreeBSD's boot times have decreased to the point where vm_page array
initialization represents a significant fraction of the total boot time.
For example, when booting FreeBSD in Firecracker (a VMM designed to
support lightweight VMs) with 128MB and 1GB of RAM, vm_page
initialization consumes 9% (3ms) and 37% (21.5ms) of the kernel boot
time, respectively.  This is generally relevant in cloud environments,
where one wants to be able to spin up VMs as quickly as possible.

This patch implements lazy initialization of (most) page structures,
following a suggestion from cperciva@.  The idea is to introduce a new
free pool, VM_FREEPOOL_LAZYINIT, into which all vm_page structures are
initially placed.  For this to work, we need only initialize the first
free page of each chunk placed into the buddy allocator.  Then, early
page allocations draw from the lazy init pool and initialize vm_page
chunks (up to 16MB, 4096 pages) on demand.  Once APs are started, an
idle-priority thread drains the lazy init pool in the background to
avoid introducing extra latency in the allocator.  With this scheme,
almost all of the initialization work is moved out of the critical path.

A couple of vm_phys operations require the pool to be drained before
they can run: vm_phys_find_range() and vm_phys_unfree_page().  However,
these are rare operations.  I believe that
vm_phys_find_freelist_contig() does not require any special treatment,
as it only ever accesses the first page in a power-of-2-sized free page
chunk, which is always initialized.

For now the new pool is only used on amd64 and arm64, since that's where
I can easily test and those platforms would get the most benefit.

Reviewed by:	alc, kib
Differential Revision:	https://reviews.freebsd.org/D40403
2024-06-13 21:19:00 -04:00
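
The lazy initialization scheme described above can be modelled with a toy page array. This is a rough sketch under simplifying assumptions, not the vm_phys implementation: there is no buddy allocator, chunks are 16 pages rather than up to 4096, and all names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define	NPAGES_SKETCH	64	/* toy memory size, in pages */
#define	CHUNK_SKETCH	16	/* pages initialized per on-demand step */

struct page_sketch {
	bool initialized;
};

static struct page_sketch pages[NPAGES_SKETCH];
static size_t ready_end;	/* pages [0, ready_end) are fully initialized */
static size_t alloc_next;	/* next page to hand out */
static size_t init_count;	/* number of page structures initialized */

static void
page_init_sketch(size_t i)
{
	assert(i < NPAGES_SKETCH);
	if (!pages[i].initialized) {
		pages[i].initialized = true;
		init_count++;
	}
}

static void
chunk_init_sketch(size_t start)
{
	for (size_t i = start; i < start + CHUNK_SKETCH; i++)
		page_init_sketch(i);
}

/* Boot: initialize only the first page of each chunk. */
static void
boot_sketch(void)
{
	memset(pages, 0, sizeof(pages));
	ready_end = alloc_next = init_count = 0;
	for (size_t i = 0; i < NPAGES_SKETCH; i += CHUNK_SKETCH)
		page_init_sketch(i);
}

/* Early allocation: initialize the next chunk on demand. */
static int
alloc_page_sketch(void)
{
	if (alloc_next == NPAGES_SKETCH)
		return (-1);
	if (alloc_next >= ready_end) {
		chunk_init_sketch(ready_end);
		ready_end += CHUNK_SKETCH;
	}
	return ((int)alloc_next++);
}

/* Background drain: finish whatever is still uninitialized. */
static void
drain_sketch(void)
{
	while (ready_end < NPAGES_SKETCH) {
		chunk_init_sketch(ready_end);
		ready_end += CHUNK_SKETCH;
	}
}
```

In this model boot touches 4 of the 64 page structures, the first allocation touches 15 more, and the drain completes the rest off the critical path.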
Andrew Turner a30149b2a9 arm64: Create a version of vfp_save_state for cpu_switch
This will be used when we add SVE support to reduce the registers
needed to be saved on context switch.

Reviewed by:	imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D43305
2024-06-12 14:09:14 +01:00
Andrew Turner 4eec584d79 arm64: Clear td_frame when returning to userspace
When returning from an exception to userspace clear the saved td_frame.
On the next exception this should point to the frame, however this is
not guaranteed.

To ensure the trap frame pointer is either valid or NULL clear it
before returning to userspace in the EL0 synchronous exception handler.

Reviewed by:	kib, markj
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D44807
2024-06-12 14:08:13 +01:00
Andrew Turner abf239cf09 arm64/vmm: Add braces to fix the gcc build
Reviewed by:	markj, emaste
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45548
2024-06-11 13:12:43 +00:00
Andrew Turner 86bafddd61 arm64: Fix indentation to be consistent
Adjust the mair_el1 macro indentation to be consistent with the
surrounding macros.

Reviewed by:	emaste
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45524
2024-06-10 15:16:10 +00:00
Alan Cox 5ee5c40402 arm64 pmap: Defer bti lookup
Defer the bti lookup until after page table page allocation is complete.
We sometimes release the pmap lock and sleep during page table page
allocation.  Consequently, the result of a bti lookup from before
page table page allocation could be stale when we finally create the
mapping based on it.

Modify pmap_bti_same() to update the prototype PTE at the same time as
checking the address range.  This eliminates the need for calling
pmap_pte_bti() in addition to pmap_bti_same().  pmap_bti_same() was
already doing most of the work of pmap_pte_bti().

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D45502
2024-06-08 02:26:55 -05:00
Doug Moore 2c10bacdf4 rangeset: add next() iteration
Add a method rangeset_next to find the first range that starts at or
after a given value. Use it to rewrite pmap_pkru_same and
pmap_bti_same to avoid walking a page at a time over pages in no
range.

Reviewed by:	andrew, kib
Differential Revision:	https://reviews.freebsd.org/D45511
2024-06-06 13:42:31 -05:00
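
The lookup described above ("first range that starts at or after a given value") can be illustrated over a sorted array. This is only a sketch: the array representation and all names here are assumptions for illustration, not the kernel's rangeset data structure or the signature of rangeset_next.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical range type covering [start, end). */
struct range_sketch {
	uint64_t start;
	uint64_t end;
};

/* Sample data: sorted by start, non-overlapping. */
static const struct range_sketch sample_ranges[] = {
	{ 0x1000, 0x3000 },
	{ 0x8000, 0x9000 },
	{ 0x20000, 0x21000 },
};

/*
 * Binary search for the first range whose start is >= va.
 * Returns NULL when no such range exists.
 */
static const struct range_sketch *
range_next_sketch(const struct range_sketch *r, size_t n, uint64_t va)
{
	size_t lo = 0, hi = n;

	assert(r != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (r[mid].start < va)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo < n ? &r[lo] : NULL);
}
```

A caller scanning an address range can jump straight to the next range's start instead of probing every page in the gap, which is the saving the commit describes.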
Andrew Turner bed65d85c6 linux64: Fix the build on arm64 with bti checking
When we enable checking for BTI on arm64 we need to include an ELF
note in all object files linked into a module.

Because using objcopy to convert a binary into an ELF object file doesn't
add the note, switch to using .incbin from an assembly file. This allows
us to add the needed note without affecting the included object.

Reviewed by:	imp, kib, emaste
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45468
2024-06-05 09:23:40 +00:00
Andrew Turner c2e0d56f5e arm64: Support BTI checking in most of the kernel
LLD has the -zbti-report=error argument to check if the BTI note is
present when linking. To allow for this to be used when linking the
kernel and modules:
 - Add the BTI note to the remaining assembly files
 - Mark ptrauth.c as protected by BTI
 - Disable -zbti-report for vmm hypervisor switching code as it's not
   used there.

The linux64 module doesn't build with the flag as it includes vdso code
that doesn't include the note.

Reviewed by:	imp, kib, emaste
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45466
2024-06-05 09:23:40 +00:00
Alan Cox 41dfea24ee arm64 pmap: Enable L3C promotions by pmap_enter_quick()
More precisely, implement L3C (64KB/2MB, depending on base page size)
promotion in pmap_enter_quick()'s helper function,
pmap_enter_quick_locked().  At the same time, use the recently
introduced flag VM_PROT_NO_PROMOTE from pmap_enter_object() to
pmap_enter_quick_locked() to avoid L3C promotion attempts that will
fail.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D45445
2024-06-04 23:25:51 -05:00
Mitchell Horne 5df74441b3 devmap: eliminate unused arguments
The optional 'table' pointer is a legacy part of the interface, which
has been replaced by devmap_register_table()/devmap_add_entry(). The few
in-tree callers have already adapted to this, so it can be removed.

The 'l1pt' argument is already entirely unused within the function.

Reviewed by:	andrew, markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45319
2024-06-04 20:17:47 -03:00
Mitchell Horne 191e6a6049 physmem: zero entire array
As a convenience to callers, who might allocate the array on the stack.
An empty/zero-valued range indicates the end of the physmap entries.

Remove the now-redundant calls to bzero() at the call site.

Reviewed by:	andrew
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45318
2024-06-04 20:17:13 -03:00
Mark Johnston 75cb949228 arm64/vmm: Add breakpoint and single-stepping support
This will be used to implement parts of bhyve's gdb stub.

Three VM capabilities are added, similar to amd64 without monitor mode.
Two cause breakpoint and single-step exceptions to be raised to EL2 and
then down to bhyve.  One lets the gdb stub mask hardware interrupts
while single-stepping, since otherwise the guest will handle a timer
interrupt before executing the target instruction and thus fail
to make progress.

Reviewed by:	bnovkov, andrew
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D44739
2024-06-04 14:58:08 -04:00
Alan Cox f1d73aacdc pmap: Skip some superpage promotion attempts that will fail
Implement a simple heuristic to skip pointless promotion attempts by
pmap_enter_quick_locked() and moea64_enter().  Specifically, when
vm_fault() calls pmap_enter_quick() to map neighboring pages at the end
of a copy-on-write fault, there is no point in attempting promotion in
pmap_enter_quick_locked() and moea64_enter().  Promotion will fail
because the base pages have differing protection.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D45431
MFC after:	1 week
2024-06-04 00:38:05 -05:00
Doug Moore b0056b31e9 libkern: add ilog2 macro
The kernel source contains several definitions of an ilog2 function;
some are slower than necessary, and one of them is incorrect.
Eliminate them all and define an ilog2 macro in libkern to replace
them, in a way that is fast, correct for all argument types, and, in a
GENERIC kernel, includes a check for an invalid zero parameter.

Folks at Microsoft have verified that having a correct ilog2
definition for their MANA driver doesn't break it.

Reviewed by:	alc, markj, mhorne (older version), jhibbits (older version)
Differential Revision:	https://reviews.freebsd.org/D45170
Differential Revision:	https://reviews.freebsd.org/D45235
2024-06-03 11:37:55 -05:00
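
For readers unfamiliar with the operation, an integer log2 can be sketched with a count-leading-zeros builtin. This is not the libkern macro: it is a single function for one argument type, the name is hypothetical, and it assumes the GCC/Clang builtin __builtin_clzll is available.

```c
#include <assert.h>
#include <limits.h>

/*
 * floor(log2(x)) for a nonzero 64-bit value: the index of the highest
 * set bit.  Zero is rejected because log2(0) is undefined and the
 * builtin's result is undefined for a zero argument.
 */
static inline int
ilog2_sketch(unsigned long long x)
{
	assert(x != 0);
	return ((int)(sizeof(x) * CHAR_BIT) - 1 - __builtin_clzll(x));
}
```

For example, ilog2_sketch(4096) and ilog2_sketch(4097) both yield 12.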
Alan Cox 3dc2a88489 arm64 pmap: Convert panic()s to KASSERT()s
There is no reason for the ATTR_SW_NO_PROMOTE checks in
pmap_update_{entry,strided}() to be panic()s instead of KASSERT()s.

Requested by:	markj
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D45424
2024-05-31 16:54:27 -05:00
Mitchell Horne deab57178f Adjust comments referencing vm_mem_init()
I cannot find a time where the function was not named this.

Reviewed by:	kib, markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45383
2024-05-27 18:37:40 -03:00
Bojan Novković 0a44b8a56d vm: Simplify startup page dumping conditional
This commit introduces the MINIDUMP_STARTUP_PAGE_TRACKING symbol and
uses it to simplify several instances of a complex preprocessor conditional
for adding pages allocated when bootstrapping the kernel to minidumps.

Reviewed by:	markj, mhorne
Approved by:	markj (mentor)
Differential Revision: https://reviews.freebsd.org/D45085
2024-05-25 19:24:55 +02:00
Bojan Novković da76d349b6 uma: Deduplicate uma_small_alloc
This commit refactors the UMA small alloc code and
removes most UMA machine-dependent code.
The existing machine-dependent uma_small_alloc code is almost identical
across all architectures, except for powerpc where using the direct
map addresses involved extra steps in some cases.

The MI/MD split was replaced by a default uma_small_alloc
implementation that can be overridden by architecture-specific code by
defining the UMA_MD_SMALL_ALLOC symbol. Furthermore, UMA_USE_DMAP was
introduced to replace most UMA_MD_SMALL_ALLOC uses.

Reviewed by: markj, kib
Approved by: markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D45084
2024-05-25 19:24:46 +02:00
Mitchell Horne 1d3c23676d arm64, riscv: remove unused declaration
It is inherited from arm, where the global exists and is used. No
functional change.

Reviewed by:	markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45323
2024-05-24 10:55:24 -03:00
Mitchell Horne b5e17840de arm64, riscv: remove unused struct pv_addr
No functional change.

Reviewed by:	markj
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D45322
2024-05-24 10:55:24 -03:00
Warner Losh bedbaee805 syscalls: Regen for Linux emulator additions
2024-05-23 13:40:47 -06:00
Ricardo Branco 97add684f5 linux: Support POSIX message queues
Reviewed by: imp, kib
Pull Request: https://github.com/freebsd/freebsd-src/pull/1248
2024-05-23 13:40:46 -06:00
Ricardo Branco 427db2c45e linux: Fix linux_mq_notify_args & linux_timer_create_args
Reviewed by: imp, kib
Pull Request: https://github.com/freebsd/freebsd-src/pull/1248
2024-05-23 13:40:46 -06:00
Alan Cox 9fc5e3fb39 arm64: set ATTR_CONTIGUOUS on the DMAP's L2 blocks
On systems configured with 16KB pages, this change creates 1GB page
mappings in the direct map where possible.  Previously, the largest page
size that was used to implement the direct map was 32MB.  Similarly, on
systems configured with 4KB pages, this change creates 32MB page
mappings, instead of 2MB, in the direct map where 1GB is too large.

Implement demotion on L2C (32MB/1GB) page mappings within the DMAP.

Update sysctl vm.pmap.kernel_maps to report on L2C page mappings.

Reviewed by:	markj
Tested by:	gallatin, Eliot Solomon <ehs3@rice.edu>
Differential Revision:	https://reviews.freebsd.org/D45224
2024-05-22 22:09:43 -05:00
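
The mapping sizes quoted above (64KB/2MB for L3C, 32MB/1GB for L2C) follow from the base page size and the number of entries that the contiguous hint covers. The sketch below reproduces the arithmetic. The function names are hypothetical, and the per-granule entry counts (16 for 4KB pages; 128 at L3 and 32 at L2 for 16KB pages) are stated here as assumptions taken from the Arm architecture rather than from the commit.

```c
#include <assert.h>
#include <stdint.h>

#define	PTE_SIZE_SKETCH	8	/* bytes per page table entry */

/* An L2 block maps one full L3 table: (page_size / 8) pages. */
static uint64_t
l2_block_size_sketch(uint64_t page_size)
{
	assert(page_size == 4096 || page_size == 16384);
	return (page_size * (page_size / PTE_SIZE_SKETCH));
}

/* Number of adjacent L3 entries covered by one contiguous run. */
static unsigned
l3c_entries_sketch(uint64_t page_size)
{
	return (page_size == 4096 ? 16 : 128);
}

/* Number of adjacent L2 entries covered by one contiguous run. */
static unsigned
l2c_entries_sketch(uint64_t page_size)
{
	return (page_size == 4096 ? 16 : 32);
}

static uint64_t
l3c_size_sketch(uint64_t page_size)
{
	return (page_size * l3c_entries_sketch(page_size));
}

static uint64_t
l2c_size_sketch(uint64_t page_size)
{
	return (l2_block_size_sketch(page_size) *
	    l2c_entries_sketch(page_size));
}
```

With 4KB pages this gives 64KB (L3C), 2MB (L2 block) and 32MB (L2C). With 16KB pages it gives 2MB, 32MB and 1GB, matching the figures in the commit messages.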
Dmitry Salychev 971b77da46 arm64: Restore the newline at the end of NOTES
This fixes the LINT kernel build after ac4ddc467b.

MFC after:	3 days
2024-05-22 15:20:40 +02:00
Dmitry Salychev ac4ddc467b arm64: Fixed IOMMU compilation errors
These are missing changes after 1228b93b41 when ref_count was
removed from bus_dma_tag_common and 1e3f42b6ba, when the address
arguments were switched to pointers.

Reviewed by:		jhb
MFC after:		3 days
Differential Revision:	https://reviews.freebsd.org/D45289
2024-05-22 11:08:00 +02:00
Andrew Turner 4f012d7a7a arm64/rockchip: Fix the build with GCC
We were missing brackets in GPIO_FLAGS_PINCTRL. Without them GCC
complains a use is ambiguous. Fix by adding the needed brackets.

Reviewed by:	manu, brooks, imp, jhb, emaste
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45264
2024-05-22 08:19:19 +00:00
Andrew Turner 73c2004473 arm64: Use the pointer auth register defines
When building with gcc it complains the pointer authentication
registers aren't valid with the architecture level we are targeting.
Fix this by using the alternative spelling of these registers accesses
through MRS_REG_ALT_NAME.

Reviewed by:	jhb
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45263
2024-05-22 08:19:06 +00:00
Andrew Turner 57d714a23f arm64: Add the pointer auth registers to armreg.h
Add the pointer authentication registers to armreg.h. These will be
used to support pointer authentication in a kernel built with GCC.

Reviewed by:	jhb
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45262
2024-05-22 08:18:54 +00:00
Andrew Turner 29c1cf9860 arm64: Use the UL macro in TCR_EL1 defines
While clang can handle numbers with a UL suffix in assembly files,
gcc/gas cannot. Switch to using the UL macro for the TCR_EL1 defines, as
some are used in locore.S.

Reviewed by:	brooks, jhb
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45261
2024-05-22 08:18:39 +00:00
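
A common way to share such constants between C and assembly is a macro that appends the suffix only when compiling C. The sketch below shows that pattern. The macro and field names are hypothetical, the actual definition of UL in the tree may differ, and the value assumes a 64-bit unsigned long as on arm64.

```c
#include <assert.h>

/*
 * When preprocessing assembly (__ASSEMBLER__ defined), emit the bare
 * number, since the assembler may not accept a "UL" suffix.  In C,
 * paste the suffix on so the constant has type unsigned long.
 */
#ifdef __ASSEMBLER__
#define	UL_SKETCH(x)	x
#else
#define	UL_SKETCH(x)	x##UL
#endif

/* Hypothetical register field, usable from both C and assembly. */
#define	FIELD_SKETCH_SHIFT	32
#define	FIELD_SKETCH_MASK	(UL_SKETCH(0x7) << FIELD_SKETCH_SHIFT)

#ifndef __ASSEMBLER__
_Static_assert(sizeof(UL_SKETCH(1)) == sizeof(unsigned long),
    "suffix must produce an unsigned long constant");
#endif
```

Without the suffix in C, 0x7 would be an int and shifting it by 32 would be undefined, which is why the defines cannot simply drop the suffix everywhere.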
Alan Cox 4f77144279 arm64 pmap: eliminate a redundant variable
Moreover, if we attempt an L2 promotion on the kernel pmap from
pmap_enter_quick_locked(), this change eliminates the recomputation of
the L2 entry's address.

MFC after:	1 week
2024-05-19 14:33:19 -05:00
Andrew Turner 457fa0f69c arm64: Support break and watch points in VHE
When booting the kernel with VHE it will be running at EL2. The current
config register values only enable them at EL1 when tracing the
kernel.

Set the HMC flag to also trap from EL2.

Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45121
2024-05-17 16:07:16 +00:00
Zachary Leaf 4f8ba1c9dd arm64: add CONTEXTIDR_EL1 reg
CONTEXTIDR_EL1 is used in debug and trace features to identify the
current process or context.

Reviewed by:	andrew
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45173
2024-05-17 15:46:27 +01:00
Zachary Leaf 10b3eac88d arm64: add PMBSR_MSS_{BSC,FSC} status code field
Bits [5:0] of PMBSR_MSS encode either Buffer Status Code (BSC) or Fault
Status Code (FSC) depending on PMBSR_EC value.

Add PMBSR_MSS_{BSC,FSC} to cover this field.

Reviewed by:	andrew
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45172
2024-05-17 15:46:00 +01:00
Zachary Leaf f7bdaa103e arm64: make SPE regs use ALT_NAME macro
When a register is not defined in Armv8.0, i.e., it was added in a later
extension (like SPE, added in v8.2), the alternative name format of:
    S<op0>_<op1>_C<crn>_C<crm>_<op2>
should be used; otherwise, calls to {READ,WRITE}_SPECIALREG() will
fail.

Use the MRS_REG_ALT_NAME() macro for SPE, changing hex to decimal as
required by the macro.

Reviewed by:	andrew
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45171
2024-05-17 15:45:44 +01:00
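
The alternative name format above can be illustrated by building the string from the five encoding fields. This is a sketch with a hypothetical helper name, and the field values in the note below are arbitrary. It also shows why hex values need converting: the name uses decimal, so a CRm of 0xa appears as C10.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Format S<op0>_<op1>_C<crn>_C<crm>_<op2> into a static buffer.
 * Not reentrant; intended only as an illustration.
 */
static const char *
sysreg_alt_name_sketch(unsigned op0, unsigned op1, unsigned crn,
    unsigned crm, unsigned op2)
{
	static char buf[32];
	int n;

	n = snprintf(buf, sizeof(buf), "S%u_%u_C%u_C%u_%u",
	    op0, op1, crn, crm, op2);
	assert(n > 0 && (size_t)n < sizeof(buf));
	return (buf);
}
```

Calling sysreg_alt_name_sketch(3, 0, 9, 0xa, 0) produces the string "S3_0_C9_C10_0".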
Andrew Turner 4660d96587 arm64/vmm: Fix the build with GCC
- Fix the spelling of handle_el2_el1_irq64
- Add .section before .rodata as the GCC build needs it

Sponsored by:	Arm Ltd
2024-05-17 13:19:45 +00:00
Andrew Turner cd36810110 arm64: Use the _REG macros to read ID registers
To build with old toolchains use the *_REG macros to access the ID
registers. These become a name in the form S?_?_C?_C?_? where the '?'
values encode the op and CR values needed to access the register.

For consistency use these macros for all ID registers, even if most
toolchains understand them.

Reviewed by:	Zachary Leaf <zachary.leaf@arm.com>
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45177
2024-05-17 09:38:38 +00:00
Andrew Turner d6d860c7ff arm64: Add MRS_REG_ALT_NAME ID register macros
These can be used even when the compiler is too old for the register
to be included.

Reviewed by:	Zachary Leaf <zachary.leaf@arm.com>
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D45176
2024-05-17 09:38:17 +00:00
Doug Moore b5a1f0406b arm64_pmap: narrow scope of bti_same test
The pmap_bti_same test in pmap_enter_l3c only happens in the
!ADDR_IS_KERNEL case; in the other case, a KASSERT fails. So move the
test into that case to save a bit of time when ADDR_IS_KERNEL.

Reviewed by:	andrew
Differential Revision:	https://reviews.freebsd.org/D45160
2024-05-13 23:22:52 -05:00
Alan Cox 94b09d388b arm64: map kernel using large pages when page size is 16K
When the page size is 16K, use ATTR_CONTIGUOUS to map the kernel code
and data sections using 2M pages.  Previously, they were mapped using
16K pages.

Reviewed by:	markj
Tested by:	markj
Differential Revision:	https://reviews.freebsd.org/D45162
2024-05-12 18:22:38 -05:00
Doug Moore c1ebd76c3f arm64: add page-to-pte convenience macros
Define macros to perform pte to vm_page and vm_page to pte conversions
without composing two macros, and use the convenience macros wherever
possible.

Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D44699
2024-05-11 01:04:48 -05:00