Commit graph

763 commits

Author SHA1 Message Date
Mark Johnston 31218f3209 riscv: Add support for enabling SV48 mode
This increases the size of the user map from 256GB to 128TB.  The kernel
map is left unchanged for now.

For now SV48 mode is left disabled by default, but can be enabled with a
tunable.  Note that extant hardware does not implement SV48, but QEMU
does.

- In pmap_bootstrap(), allocate a L0 page and attempt to enable SV48
  mode.  If the write to SATP doesn't take, the kernel continues to run
  in SV39 mode.
- Define VM_MAX_USER_ADDRESS to refer to the SV48 limit.  In SV39 mode,
  the region [VM_MAX_USER_ADDRESS_SV39, VM_MAX_USER_ADDRESS_SV48] is not
  mappable.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34280
2022-03-01 09:39:44 -05:00
Mark Johnston 6ce716f7c3 riscv: Add support for dynamically allocating L1 page table pages
This is required in SV48 mode.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34279
2022-03-01 09:39:44 -05:00
Mark Johnston 1321117200 riscv: Handle four-level page tables in various pmap traversal routines
Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34278
2022-03-01 09:39:44 -05:00
Mark Johnston ceed61483c riscv: Maintain the allpmaps list only in SV39 mode
When four-level page tables are used, there is no need to distribute
updates to the top-level page to all pmaps.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34277
2022-03-01 09:39:44 -05:00
Mark Johnston 5cf3a8216e riscv: Add pmap helper functions required by four-level page tables
No functional change intended.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34276
2022-03-01 09:39:44 -05:00
Mark Johnston 4337979236 riscv: Try to improve the comments for locore's page table setup
No functional change intended.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34275
2022-03-01 09:39:44 -05:00
Mark Johnston ecaf115434 riscv: Conditionally modify the ELF64 sysentvec for SV48
A sysinit determines whether the pmap has enabled SV48 mode and modifies
the corresponding fields which describe the user memory map.

Reviewed by:	kib, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34274
2022-03-01 09:39:43 -05:00
Mark Johnston 35d0f443cf riscv: Define a SV48 memory map
No functional change intended.

Reviewed by:	kib, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34273
2022-03-01 09:39:43 -05:00
Mark Johnston 59f192c507 riscv: Add various pmap definitions needed to support SV48 mode
No functional change intended.

Reviewed by:	jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34272
2022-03-01 09:39:43 -05:00
Mark Johnston 2e956c30ca riscv: Use generic CSR macros for writing SATP
Instead of having the one-off load_satp(), just use csr_write().  No
functional change intended.

Reviewed by:	alc, jhb
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34271
2022-03-01 09:39:43 -05:00
Mark Johnston 82f4e0d0f0 riscv: Rename struct pmap's pm_l1 field to pm_top
In SV48 mode, the top-level page will be an L0 page rather than an L1
page.  Rename the field accordingly.  No functional change intended.

Reviewed by:	alc, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34270
2022-03-01 09:39:43 -05:00
Mark Johnston d5c0a7b6d3 riscv: Fix another race in pmap_pinit()
Commit c862d5f2a7 ("riscv: Fix a race in pmap_pinit()") did not really
fix the race.  Alan writes,

 Suppose that N entries in the L1 tables are in use, and we are in the
 middle of the memcpy().  Specifically, we have read the zero-filled
 (N+1)st entry from the kernel L1 table.  Then, we are preempted.  Now,
 another core/thread does pmap_growkernel(), which fills the (N+1)st
 entry.  Finally, we return to the original core/thread, and overwrite
 the valid entry with the zero that we earlier read.

Try to fix the race properly, by copying kernel L1 entries while holding
the allpmaps lock.  To avoid doing unnecessary work while holding this
global lock, copy only the entries that we expect to be valid.

Fixes:		c862d5f2a7 ("riscv: Fix a race in pmap_pinit()")
Reported by:	alc, jrtc27
Reviewed by:	alc
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34267
2022-02-22 09:26:33 -05:00
Emmanuel Vadot 092a42a6a3 riscv: conf: Remove options EXT_RESOURCES
It is now unused in kernel code.

Reviewed by:	mhorne
MFC after:      1 month
Differential Revision:	https://reviews.freebsd.org/D33838
2022-02-21 17:29:13 +01:00
Navdeep Parhar e9e7bc8250 cxgbe(4): Changes to the fatal error handler.
* New error_flags that can be used from the error ithread and elsewhere
  without a synch_op.
* Stop the adapter immediately in t4_fatal_err but defer most of the
  rest of the handling to a task.  The task is allowed to sleep, unlike
  the ithread.  Remove async_event_task as it is no longer needed.
* Dump the devlog, CIMLA, and PCIE_FW exactly once on any fatal error
  involving the firmware or the CIM block.  While here, dump some
  additional info (see dump_cim_regs) for these errors.
* If both reset_on_fatal_err and panic_on_fatal_err are set then attempt
  a reset first and do not panic the system if it is successful.

MFC after:	1 week
Sponsored by:	Chelsio Communications
2022-02-18 09:16:14 -08:00
Warner Losh 0987dc5be5 riscv: Add static asssert for context size
Add a static assert for the siginfo_t, mcontext_t and ucontext_t
sizes. These are de-factor ABI options and cannot change size ever.

Differential Revision:	https://reviews.freebsd.org/D34214
2022-02-10 14:32:21 -07:00
Mark Johnston c862d5f2a7 riscv: Fix a race in pmap_pinit()
All pmaps share the top half of the address space.  With 3-level page
tables, the top-level kernel map entries are not static: they might
change if the kernel map is extended (via pmap_growkernel()) or a 1GB
mapping in the direct map is demoted (not implemented yet).  Thus the
riscv pmap maintains the allpmaps list to synchronize updates to
top-level entries.

When a pmap is created, it is inserted into this list after copying
top-level entries from the kernel pmap.  The copying is done without
holding the allpmaps lock, and it is possible for pmap_pinit() to race
with kernel map updates.  In particular, if a thread is modifying L1
entries, and a concurrent pmap_pinit() copies the old version of the
entries, it might not receive the update.

Fix the problem by copying the kernel map entries after inserting the
pmap into the list.  This ensures that the nascent pmap always receives
updates, though pmap_distribute_l1() may race with the page copy.

Reviewed by:	mhorne, jhb
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34158
2022-02-08 13:31:55 -05:00
Mitchell Horne 4e1bc961bb arm64, riscv: handle RB_KDB
This allows entering the debugger at the earliest possible time, if
the '-d' argument is passed to the kernel.

Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D34120
2022-02-01 13:59:54 -04:00
Mitchell Horne e6ee2b6506 riscv: add ALT_BREAK_TO_DEBUGGER to GENERIC
It allows quickly entering ddb(4) over a serial line.

Reviewed by:	jhb
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D34119
2022-02-01 13:59:54 -04:00
Andrew Turner 548a2ec49b Add PT_GETREGSET
This adds the PT_GETREGSET and PT_SETREGSET ptrace types. These can be
used to access all the registers from a specified core dump note type.
The NT_PRSTATUS and NT_FPREGSET notes are initially supported. Other
machine-dependant types are expected to be added in the future.

The ptrace addr points to a struct iovec pointing at memory to hold the
registers along with its length. On success the length in the iovec is
updated to tell userspace the actual length the kernel wrote or, if the
base address is NULL, the length the kernel would have written.

Because the data field is an int the arguments are backwards when
compared to the Linux PTRACE_GETREGSET call.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19831
2022-01-27 11:40:34 +00:00
Mitchell Horne eb81812fb7 riscv: fix unused var in page_fault_handler()
clang warns that p is set-but-not-used, so let's use it.
2022-01-19 17:21:25 -04:00
Mark Johnston 706f4a81a8 exec: Introduce the PROC_PS_STRINGS() macro
Rather than fetching the ps_strings address directly from a process'
sysentvec, use this macro.  With stack address randomization the
ps_strings address is no longer fixed.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 16:11:54 -05:00
Mark Johnston 3fc21fdd5f sysent: Add a sv_psstringssz field to struct sysentvec
The size of the ps_strings structure varies between ABIs, so this is
useful for computing the address of the ps_strings structure relative to
the top of the stack when stack address randomization is enabled.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33704
2022-01-17 11:42:07 -05:00
Brooks Davis 0910a41ef3 Revert "syscallarg_t: Add a type for system call arguments"
Missed issues in truss on at least armv7 and powerpcspe need to be
resolved before recommit.

This reverts commit 3889fb8af0.
This reverts commit 1544e0f5d1.
2022-01-12 23:29:20 +00:00
Brooks Davis 1544e0f5d1 syscallarg_t: Add a type for system call arguments
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).

Obtained from:	CheriBSD

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D33780
2022-01-12 22:51:25 +00:00
Mitchell Horne d72e944812 riscv: gdb(4) support
Add the MD portion required for the gdb stub.

Reviewed by:	jhb (earlier version)
Discussed with:	jrtc27
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33734
2022-01-10 13:40:12 -04:00
John Baldwin 7def1e10b3 bus_dma: Deduplicate locking helper functions.
- Move busdma_lock_mutex to subr_bus_dma.c.

- Move _busdma_lock_dflt to subr_bus_dma.c.  This function was named a
  couple of different things previously.  It is not a public API but
  an internal helper used in place of a NULL pointer.  The prototype
  is in <sys/bus_dma.h> as not all backends include
  <sys/bus_dma_internal.h>.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D33694
2022-01-05 13:50:40 -08:00
John Baldwin 85b4607324 Deduplicate bus_dma bounce code.
Move mostly duplicated code in various MD bus_dma backends to support
bounce pages into sys/kern/subr_busdma_bounce.c.  This file is
currently #include'd into the backends rather than compiled standalone
since it requires access to internal members of opaque bus_dma
structures such as bus_dmamap_t and bus_dma_tag_t.

Reviewed by:	kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D33684
2022-01-05 13:50:40 -08:00
Doug Moore f1e7a532d1 busdma: _bus_dmamap_addseg repaired
A recent change introduced a one-off error into a test allowing
coalescing chunks into segments.  This fixes that error.

broke a check in _bus_dmamap_addseg on many architectures. This change makes it clear that it is not a particular range that is being boundary-checked, but the proposed union of the two adjacent ranges.
Reported by:	se
Reviewed by:	se
Fixes:	c606ab59e7 vm_extern: use standard address checkers everywhere
Differential Revision:	https://reviews.freebsd.org/D33715
2022-01-02 12:37:05 -06:00
Doug Moore b496126886 riscv-busdma: Balance parens.
Reported by:	jenkins
Fixes:	c606ab59e7 vm_extern: use standard address checkers everywhere
2021-12-31 02:01:58 -06:00
Doug Moore c606ab59e7 vm_extern: use standard address checkers everywhere
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D33685
2021-12-30 22:09:08 -06:00
John Baldwin 254e4e5b77 Simplify swi for bus_dma.
When a DMA request using bounce pages completes, a swi is triggered to
schedule pending DMA requests using the just-freed bounce pages.  For
a long time this bus_dma swi has been tied to a "virtual memory" swi
(swi_vm).  However, all of the swi_vm implementations are the same and
consist of checking a flag (busdma_swi_pending) which is always true
and if set calling busdma_swi.  I suspect this dates back to the
pre-SMPng days and that the intention was for swi_vm to serve as a
mux.  However, in the current scheme there's no need for the mux.

Instead, remove swi_vm and vm_ih.  Each bus_dma implementation that
uses bounce pages is responsible for creating its own swi (busdma_ih)
which it now schedules directly.  This swi invokes busdma_swi directly
removing the need for busdma_swi_pending.

One consequence is that the swi now works on RISC-V which had previously
failed to invoke busdma_swi from swi_vm.

Reviewed by:	imp, kib
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D33447
2021-12-28 13:51:25 -08:00
Alan Cox e161dfa918 Fix pmap_is_prefaultable() on arm64 and riscv
The current implementations never correctly return TRUE. In all cases,
when they currently return TRUE, they should have returned FALSE.  And,
in some cases, when they currently return FALSE, they should have
returned TRUE.  Except for its effects on performance, specifically,
additional page faults and pointless calls to pmap_enter_quick() that
abort, this error is harmless.  That is why it has gone unnoticed.

Add a comment to the amd64, arm64, and riscv implementations
describing how their return values are computed.

Reviewed by:	kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33659
2021-12-27 19:17:14 -06:00
Jessica Clarke 434cb1c4a6 riscv: Fix PLIC -Wunused-but-set-variable warnings 2021-12-10 04:51:32 +00:00
Alexander Motin 8493918868 busdma: Remove outdated comments about Giant.
MFC after:	2 weeks
2021-12-09 22:18:53 -05:00
John Baldwin 1a62e9bc00 Add <machine/tls.h> header to hold MD constants and helpers for TLS.
The header exports the following:

- Definition of struct tcb.
- Helpers to get/set the tcb for the current thread.
- TLS_TCB_SIZE (size of TCB)
- TLS_TCB_ALIGN (alignment of TCB)
- TLS_VARIANT_I or TLS_VARIANT_II
- TLS_DTV_OFFSET (bias of pointers in dtv[])
- TLS_TP_OFFSET (bias of "thread pointer" relative to TCB)

Note that TLS_TP_OFFSET does not account for if the unbiased thread
pointer points to the start of the TCB (arm and x86) or the end of the
TCB (MIPS, PowerPC, and RISC-V).

Note also that for amd64, the struct tcb does not include the unused
tcb_spare field included in the current structure in libthr.  libthr
does not use this field, and the existing calls in libc and rtld that
allocate a TCB for amd64 assume it is the size of 3 Elf_Addr's (and
thus do not allocate room for tcb_spare).

A <sys/_tls_variant_i.h> header is used by architectures using
Variant I TLS which uses a common struct tcb.

Reviewed by:	kib (older version of x86/tls.h), jrtc27
Sponsored by:	The University of Cambridge, Google Inc.
Differential Revision:	https://reviews.freebsd.org/D33351
2021-12-09 13:17:13 -08:00
Brooks Davis 547566526f Make struct syscall_args machine independent
After a round of cleanups in late 2020, all definitions are
functionally identical.

This removes a rotted __aligned(8) on arm. It was added in
b7112ead32 and was intended to align the
args member so that 64-bit types (off_t, etc) could be safely read on
armeb compiled with clang. With the removal of armev, this is no
longer needed (armv7 requires that 32-bit aligned reads of 64-bit
values be supported and we enable such support on armv6).  As further
evidence this is unnecessary, cleanups to struct syscall_args have
resulted in args being 32-bit aligned on 32-bit systems.  The sole
effect is to bloat the struct by 4 bytes.

Reviewed by:	kib, jhb, imp
Differential Revision:	https://reviews.freebsd.org/D33308
2021-12-08 18:45:33 +00:00
Mitchell Horne 0d2224733e Implement GET_STACK_USAGE on remaining archs
This definition enables callers to estimate remaining space on the
kstack, and take action on it. Notably, it enables optimizations in the
GEOM and netgraph subsystems to directly dispatch work items when there
is sufficient stack space, rather than queuing them for a worker thread.

Implement it for riscv, arm, and mips. Remove the #ifdefs, so it will
not go unimplemented elsewhere.

PR:		259157
Reviewed by:	mav, kib, markj (previous version)
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32580
2021-11-30 11:15:56 -04:00
Mark Johnston ecbbe83144 netinet: Deduplicate most in_cksum() implementations
in_cksum() and related routines are implemented separately for each
platform, but only i386 and arm have optimized versions.  Other
platforms' copies of in_cksum.c are identical except for style
differences and support for big-endian CPUs.

Deduplicate the implementations for the rest of the platforms.  This
will make it easier to implement in_cksum() for unmapped mbufs.  On arm
and i386, define HAVE_MD_IN_CKSUM to mean that the MI implementation is
not to be compiled.

No functional change intended.

Reviewed by:	kp, glebius
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33095
2021-11-24 13:31:16 -05:00
Warner Losh d2bf8c544a riscv: Make machine/regs.h self-contained
Make sys/reg.h self-contained by making riscv's machine/reg.h
self-contained.

Sponsored by:		Netflix
2021-11-23 21:21:17 -07:00
Mitchell Horne 10fe6f80a6 minidump: Use the provided dump bitset
When constructing the set of dumpable pages, use the bitset provided by
the state argument, rather than assuming vm_page_dump invariably. For
normal kernel minidumps this will be a pointer to vm_page_dump, but when
dumping the live system it will not.

To do this, the functions in vm_dumpset.h are extended to accept the
desired bitset as an argument. Note that this provided bitset is assumed
to be derived from vm_page_dump, and therefore has the same size.

Reviewed by:	kib, markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31992
2021-11-19 15:05:52 -04:00
Mitchell Horne 1d2d1418b4 minidump: Use provided msgbuf pointer
Don't assume we are dumping the global message buffer, but use the one
provided by the state argument. While here, drop superfluous
cast to char *.

Reviewed by:	markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31991
2021-11-19 15:05:52 -04:00
Mitchell Horne 681bd71047 minidump: reduce the amount direct accesses to page tables
During a live dump, we may race with updates to the kernel page tables.
This is generally okay; we accept that the state of the system while
dumping may be somewhat inconsistent with its state when the dump was
invoked. However, when walking the kernel page tables, it is important
that we load each PDE/PTE only once while operating on it. Otherwise, it
is possible to have the relevant PTE change underneath us. For example,
after checking the valid bit, but before reading the physical address.

Convert the loads to atomics, and add some validation around the
physical addresses, to ensure that we do not try to dump a non-existent
or non-canonical physical address.

Similarly, don't read kernel_vm_end more than once, on the off chance
that pmap_growkernel() is called between the two page table walks.

Reviewed by:	kib, markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31990
2021-11-19 15:05:52 -04:00
Mitchell Horne 1adebe3cd6 minidump: Parameterize minidumpsys()
The minidump code is written assuming that certain global state will not
change, and rightly so, since it executes from a kernel debugger
context. In order to support taking minidumps of a live system, we
should allow copies of relevant global state that is likely to change to
be passed as parameters to the minidumpsys() function.

This patch does the work of parameterizing this function, by adding a
struct minidumpstate argument. For now, this struct allows for copies of
the kernel message buffer, and the bitset that tracks which pages should
be dumped (vm_page_dump). Follow-up changes will actually make use of
these arguments.

Notably, dump_avail[] does not need a snapshot, since it is not expected
to change after system initialization.

The existing minidumpsys() definitions are renamed, and a thin MI
wrapper is added to kern_dump.c, which handles the construction of
the state struct. Thus, calling minidumpsys() remains as simple as
before.

Reviewed by:	kib, markj, jhb
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D31989
2021-11-19 15:05:52 -04:00
Kristof Provost 4e85b64890 Add a COMPAT_FREEBSD13 kernel option
Use it wherever COMPAT_FREEBSD11 is currently specified.

Reviewed by:	jhb (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33005
2021-11-17 03:08:40 +01:00
Kristof Provost 23e1961e78 riscv: add COMPAT_FREEBSD12 option
Turn on compat option for older FreeBSD versions (i.e. 12). We do not
enable the compat options for 11 or older because riscv was never
supported in those versions.

Reviewed by:	jrtc27 (previous version)
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D33015
2021-11-17 03:08:14 +01:00
Warner Losh 7e3c9ec906 tcp: better congestion control defaults
Define CC_NEWRENO in all the appropriate DEFAULTS and std.* config
files. It's the default congestion control algorithm.  Add code to cc.c
so that CC_DEFAULT is "newreno" if it's not overriden in the config
file.

Sponsored by: Netflix
Fixes: b8d60729de ("tcp: Congestion control cleanup.")
Revired by: manu, hselasky, jhb, glebius, tuexen
Differential Revision:	https://reviews.freebsd.org/D32964
2021-11-12 12:16:11 -07:00
Randall Stewart b8d60729de tcp: Congestion control cleanup.
NOTE: HEADS UP read the note below if your kernel config is not including GENERIC!!

This patch does a bit of cleanup on TCP congestion control modules. There were some rather
interesting surprises that one could get i.e. where you use a socket option to change
from one CC (say cc_cubic) to another CC (say cc_vegas) and you could in theory get
a memory failure and end up on cc_newreno. This is not what one would expect. The
new code fixes this by requiring a cc_data_sz() function so we can malloc with M_WAITOK
and pass in to the init function preallocated memory. The CC init is expected in this
case *not* to fail but if it does and a module does break the
"no fail with memory given" contract we do fall back to the CC that was in place at the time.

This also fixes up a set of common newreno utilities that can be shared amongst other
CC modules instead of the other CC modules reaching into newreno and executing
what they think is a "common and understood" function. Lets put these functions in
cc.c and that way we have a common place that is easily findable by future developers or
bug fixers. This also allows newreno to evolve and grow support for its features i.e. ABE
and HYSTART++ without having to dance through hoops for other CC modules, instead
both newreno and the other modules just call into the common functions if they desire
that behavior or roll there own if that makes more sense.

Note: This commit changes the kernel configuration!! If you are not using GENERIC in
some form you must add a CC module option (one of CC_NEWRENO, CC_VEGAS, CC_CUBIC,
CC_CDG, CC_CHD, CC_DCTCP, CC_HTCP, CC_HD). You can have more than one defined
as well if you desire. Note that if you create a kernel configuration that does not
define a congestion control module and includes INET or INET6 the kernel compile will
break. Also you need to define a default, generic adds 'options CC_DEFAULT=\"newreno\"
but you can specify any string that represents the name of the CC module (same names
that show up in the CC module list under net.inet.tcp.cc). If you fail to add the
options CC_DEFAULT in your kernel configuration the kernel build will also break.

Reviewed by: Michael Tuexen
Sponsored by: Netflix Inc.
RELNOTES:YES
Differential Revision: https://reviews.freebsd.org/D32693
2021-11-11 06:28:18 -05:00
Kyle Evans 6a8ea6d174 sched: split sched_ap_entry() out of sched_throw()
sched_throw() can no longer take a NULL thread, APs enter through
sched_ap_entry() instead.  This completely removes branching in the
common case and cleans up both paths.  No functional change intended.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D32829
2021-11-05 15:45:51 -05:00
Kyle Evans 589aed00e3 sched: separate out schedinit_ap()
schedinit_ap() sets up an AP for a later call to sched_throw(NULL).

Currently, ULE sets up some pcpu bits and fixes the idlethread lock with
a call to sched_throw(NULL); this results in a window where curthread is
setup in platforms' init_secondary(), but it has the wrong td_lock.
Typical platform AP startup procedure looks something like:

- Setup curthread
- ... other stuff, including cpu_initclocks_ap()
- Signal smp_started
- sched_throw(NULL) to enter the scheduler

cpu_initclocks_ap() may have callouts to process (e.g., nvme) and
attempt to sched_add() for this AP, but this attempt fails because
of the noted violated assumption leading to locking heartburn in
sched_setpreempt().

Interrupts are still disabled until cpu_throw() so we're not really at
risk of being preempted -- just let the scheduler in on it a little
earlier as part of setting up curthread.

Reviewed by:	alfredo, kib, markj
Triage help from:	andrew, markj
Smoke-tested by:	alfredo (ppc), kevans (arm64, x86), mhorne (arm)
Differential Revision:	https://reviews.freebsd.org/D32797
2021-11-03 15:54:59 -05:00
Philip Paeps 91feb4f420 riscv: add iicbus and iicoc to GENERIC
The iicoc driver supports the OpenCores I2C IP.  This is included in at
least the SiFive "Unleashed" and "Unmatched" cores and probably others.

Suggested by:	jrtc27
2021-11-01 13:19:55 +08:00
Mark Johnston 84c3922243 Convert consumers to vm_page_alloc_noobj_contig()
Remove now-unneeded page zeroing.  No functional change intended.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32006
2021-10-19 21:22:56 -04:00
Mark Johnston a4667e09e6 Convert vm_page_alloc() callers to use vm_page_alloc_noobj().
Remove page zeroing code from consumers and stop specifying
VM_ALLOC_NOOBJ.  In a few places, also convert an allocation loop to
simply use VM_ALLOC_WAITOK.

Similarly, convert vm_page_alloc_domain() callers.

Note that callers are now responsible for assigning the pindex.

Reviewed by:	alc, hselasky, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31986
2021-10-19 21:22:56 -04:00
Jessica Clarke 682c00a6ce riscv: Implement pmap_mapdev_attr
This is needed for LinuxKPI's _ioremap_attr. This reuses the generic
implementation introduced for aarch64, and itself requires implementing
pmap_kenter, which is trivial to do given riscv currently treats all
mapping attributes the same due to the Svpbmt extension not yet being
ratified and in hardware.

Reviewed by:	markj, mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D32445
2021-10-17 15:31:35 +01:00
Konstantin Belousov 4cc167a352 Restore PPS_SYNC in NOTES
This partially reverts e81e77c5a0, leaving the option both in
GENERICs on amd64/arm64/arm, and in global NOTES file.  Apparently
this better matches existing practice, where we do not try to hard
to make LINT and GENERIC complimentary.

Requested and reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-12 23:10:35 +03:00
Konstantin Belousov e81e77c5a0 Enable PPS_SYNC on amd64, arm64 and armv7
Remove the option from NOTES/LINT, and add to NOTES for powerpc and
riscv.

PR:	259036
Requested by:	John Hay <john@sanren.ac.za>
Discussed with:	ian, imp
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2021-10-10 22:34:40 +03:00
Mitchell Horne 17f790f49f arm, arm64, riscv: adjust top-level nexus comment
These platforms don't manage resources for DMA request lines or I/O
ports, this is specific to x86. Remove the references from the comments.

Reviewed by:	imp, jhb
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32358
2021-10-08 14:16:32 -03:00
Konstantin Belousov aba66031f2 riscv: move signal delivery code to exec_machdep.c
Reviewed by:	emaste, imp
Discussed with:	jrtc27
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D32310
2021-10-08 03:20:42 +03:00
Mitchell Horne 8babb5582e riscv: fix VM_MAXUSER_ADDRESS checks in asm routines
There are two issues with the checks against VM_MAXUSER_ADDRESS. First,
the comparison should consider the values as unsigned, otherwise
addresses with the high bit set will fail to branch. Second, the value
of VM_MAXUSER_ADDRESS is, by convention, one larger than the maximum
mappable user address and invalid itself. Thus, use the bgeu instruction
for these comparisons.

Add a regression test case for copyin(9).

PR:		257193
Reported by:	Robert Morris <rtm@lcs.mit.edu>
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D31209
2021-10-07 18:12:30 -03:00
Mitchell Horne 4a9f2f8b07 riscv: handle page faults in the unmappable region
When handling a kernel page fault, check explicitly that stval resides
in either the user or kernel address spaces, and make the page fault
fatal if not. Otherwise, a properly crafted address may appear to
pmap_fault() as a valid and present page in the kernel map, causing the
page fault to be retried continuously. This is mainly due to the fact
that the upper bits of virtual addresses are not validated by most of
the pmap code.

Faults of this nature should only occur due to some kind of bug in the
kernel, but it is best to handle them gracefully when they do.

Handle user page faults in the same way, sending a SIGSEGV immediately
when a malformed address is encountered.

Add an assertion to pmap_l1(), which should help catch other bugs of
this kind that make it this far.

Reviewed by:	jrtc27, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31208
2021-10-07 18:12:17 -03:00
Jessica Clarke 2404f03fca riscv: Add vt and kbdmux to GENERIC for video console support
No in-tree drivers are supported for RISC-V (given it supports UEFI we
could enable the EFI framebuffer, but U-Boot has very limited hardware
support and EDK2 remains a work in progress), but drm-kmod exists with
drivers for video cards that can be used with the HiFive Unmatched.

Reviewed by:	imp, jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D32001
2021-10-03 19:34:53 +01:00
Jessica Clarke 1be2e16df1 riscv: Add a stub pmap_change_attr implementation
pmap_change_attr is required by drm-kmod so we need the function to
exist. Since the Svpbmt extension is on the horizon we will likely end
up with a real implementation of it, so this stub implementation does
all the necessary page table walking to validate the input, ensuring
that no new errors are returned once it's implemented fully (other than
due to out of memory conditions when demoting L2 entries) and providing
a skeleton for that future implementation.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31996
2021-10-03 19:33:47 +01:00
John Baldwin 0177102173 arm64, riscv: Fix TRAF_PC() to return the PC, not the return address.
Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D31969
2021-10-01 11:53:12 -07:00
Mitchell Horne ab4ed843a3 minidump: De-duplicate the progress bar
The implementation of the progress bar is simple, but duplicated for
most minidump implementations. Extract the common bits to kern_dump.c.
Ensure that the bar is reset with each subsequent dump; this was only
done on some platforms previously.

Reviewed by:	markj
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31885
2021-09-29 16:42:21 -03:00
Mitchell Horne 31991a5a45 minidump: De-duplicate is_dumpable()
The function is identical in each minidump implementation, so move it to
vm_phys.c. The only slight exception is powerpc where the function was
public, for use in moea64_scan_pmap().

Reviewed by:	kib, markj, imp (earlier version)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31884
2021-09-29 16:41:52 -03:00
Thomas Skibo f5d78bea1f sifive_spi: Add missing case for SPIBUS_MODE_NONE
Otherwise sckmode is left uninitialised, not zero. This mode is used for
the on-board flash on the HiFive Unmatched board. Whilst here, catch
unknown modes and return an error rather than silently continuing.

Reviewed by:	#riscv, jrtc27
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31562
2021-08-30 23:38:02 +01:00
Andrew Turner b792434150 Create sys/reg.h for the common code previously in machine/reg.h
Move the common kernel function signatures from machine/reg.h to a new
sys/reg.h. This is in preperation for adding PT_GETREGSET to ptrace(2).

Reviewed by:	imp, markj
Sponsored by:	DARPA, AFRL (original work)
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D19830
2021-08-30 12:50:53 +01:00
Mateusz Guzik c69cc8d101 riscv: retire bzero
Unused since ba96f37758 ("Use __builtin for various mem* and b* (e.g. bzero)
routines.")

Reviewed by:	mhorne
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2021-08-23 18:38:05 +00:00
Jessica Clarke 98138bbde0 riscv: Fix pmap_alloc_l2 when it should allocate a new L1 entry
The current code checks the RWX bits are 0 but does not check the V bit
is non-zero, meaning not-yet-allocated L1 entries that are still zero
are regarded as being allocated. This is likely due to copying the arm64
code that checks ATTR_DESC_MASK is L1_TABLE, which emcompasses both the
type and the validity in a single field, and erroneously translating
that to a check of just PTE_RWX being 0 to indicate non-leaf, forgetting
about the V bit. This then results in the following panic:

    panic: Fatal page fault at 0xffffffc0005cf292: 0x00000000000050
    cpuid = 1
    time = 1628379581
    KDB: stack backtrace:
    db_trace_self() at db_trace_self
    db_trace_self_wrapper() at db_trace_self_wrapper+0x38
    kdb_backtrace() at kdb_backtrace+0x2c
    vpanic() at vpanic+0x148
    panic() at panic+0x2a
    page_fault_handler() at page_fault_handler+0x1ba
    do_trap_supervisor() at do_trap_supervisor+0x7a
    cpu_exception_handler_supervisor() at
    cpu_exception_handler_supervisor+0x70
    --- exception 13, tval = 0x50
    pmap_enter_l2() at pmap_enter_l2+0xb2
    pmap_enter_object() at pmap_enter_object+0x15e
    vm_map_pmap_enter() at vm_map_pmap_enter+0x228
    vm_map_insert() at vm_map_insert+0x4ec
    vm_map_find() at vm_map_find+0x474
    vm_map_find_min() at vm_map_find_min+0x52
    vm_mmap_object() at vm_mmap_object+0x1ba
    vn_mmap() at vn_mmap+0xf8
    kern_mmap() at kern_mmap+0x4c4
    sys_mmap() at sys_mmap+0x38
    do_trap_user() at do_trap_user+0x208
    cpu_exception_handler_user() at cpu_exception_handler_user+0x72
    --- exception 8, tval = 0x1dd

Instead, we should just check the V bit, as on amd64, and assert that
any valid L1 entries are not leaves, since an L1 leaf would render the
entire range allocated and thus we should not have attempted to map that
VA in the first place.

Reported by:	David Gilbert <dgilbert@daveg.ca>
MFC after:	1 week
Reviewed by:	markj, mhorne
Differential Revision:	https://reviews.freebsd.org/D31460
2021-08-09 20:28:37 +01:00
Ed Maste 9feff969a0 Remove "All Rights Reserved" from FreeBSD Foundation sys/ copyrights
These ones were unambiguous cases where the Foundation was the only
listed copyright holder (in the associated license block).

Sponsored by:	The FreeBSD Foundation
2021-08-08 10:42:24 -04:00
Jessica Clarke c5e5202a3d riscv: Sync NOTES with GENERIC changes
USB is already in sys/conf/NOTES, but NVMe is not, nor of course are the
new SiFive device drivers.

MFC after:	1 week
2021-08-07 23:20:38 +01:00
Jessica Clarke 0a4cb54506 riscv: Add hwreset to NOTES to fix LINT build
Fixes:		8e7e0690ec ("sifive_prci: Add reset support for the FU540 and FU740")
MFC after:	1 week
2021-08-07 23:15:20 +01:00
Jessica Clarke 6e162bd2f2 riscv: Add NVMe, USB and HID support to GENERIC
The SiFive FU740 has both NVMe and USB so we need both to ensure we can
mount root, and HID is a dependency of USB.

Reviewed by:	kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31036
2021-08-07 19:27:33 +01:00
Jessica Clarke 896e217a0e fu740_pci_dw: Add SiFive FU740 PCIe controller driver
Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31033
2021-08-07 19:27:31 +01:00
Jessica Clarke b47e5c5dbe sifive_gpio: Add SiFive GPIO controller driver
This is present on both the FU540 and FU740, but only needed for the
FU740 in order to assert reset and power enable signals for its PCIe
controller.

Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31031
2021-08-07 19:27:31 +01:00
Jessica Clarke 90a089cf2a fu540_spi: Rename to sifive_spi
The FU740 also uses the same SPI controller.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31026
2021-08-07 19:27:30 +01:00
Jessica Clarke 8e7e0690ec sifive_prci: Add reset support for the FU540 and FU740
This is needed for FU740 PCIe support. Whilst we don't need the FU540's
resets they are also defined for completeness.

Reviewed by:	manu
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31024
2021-08-07 19:27:29 +01:00
Jessica Clarke dcbea9a2f4 sifive_prci: Delay attachment until after clk_fixed
This avoids noisy output from early attempts to attach before clk_fixed
has attached to the parent clocks.

Reviewed by:	kp, mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31023
2021-08-07 19:27:29 +01:00
Jessica Clarke 589d8a78a5 sifive_prci: Add support for the FU740 PRCI
Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31022
2021-08-07 19:27:28 +01:00
Jessica Clarke 12b115ec57 fu540_prci: Rename to sifive_prci and use ocd_data for FU540 specificity
The FU740 has a very similar controller and will reuse most of the
driver. This also drops the dependency on the device-tree include for
the binding indices; the header doesn't namespace its contents (and nor
does the FU740 one) so using both would require seperate translation
units which would be unnecessarily complicated just to avoid defining
local copies of the small number of constants.

Whilst here, add the missing l to gemgxlclk's name and drop the prci_
prefix from tlclk's name as we don't prefix any of the others and it's
entirely unnecessary.

Reviewed by:	kp, mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31021
2021-08-07 19:27:27 +01:00
Konstantin Belousov 041b7317f7 Add pmap_vm_page_alloc_check()
which is the place to put MD asserts about allocated pages.

On amd64, verify that allocated page does not belong to the kernel
(text, data) or early allocated pages.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D31121
2021-07-31 16:53:42 +03:00
Jessica Clarke 4a23504908 riscv: Fix pmap_kextract racing with concurrent superpage promotion/demotion
This repeats amd64's cfcbf8c6fd (r180498) and i386's cf3508519c
(r202894) but for riscv; pmap_kextract must be lock-free and so it can
race with superpage promotion and demotion, thus the L2 entry must only
be loaded once to avoid using inconsistent state.

PR:	250866
Reviewed by:	markj, mhorne
Tested by:	David Gilbert <dgilbert@daveg.ca>
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31253
2021-07-22 20:02:14 +01:00
Jessica Clarke 8c439847f0 riscv: Include spibus and spigen in GENERIC
We already attempt to enable the SiFive SPI controller, but since spibus
isn't enabled it isn't actually built.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31027
2021-07-21 06:46:09 +01:00
Jessica Clarke ade2ea3c45 riscv: Fix pindex level confusion
The pindex values are assigned from the L3 leaves upwards, meaning there
are NUL2E L3 tables and then NUL1E L2 tables (with a futher NUL0E L1
tables in future when we implement Sv48 support). Therefore anything
below NUL2E is an L3 table's page and anything above or equal to NUL2E
is an L2 table's page (with the threshold of NUL2E + NUL1E marking the
start of the L1 tables' pages in Sv48). Thus all the comparisons and
arithmetic operations must use NUL2E to handle the L3/L2 allocation (and
thus L2/L1 entry) transition point, not NUL1E as all but pmap_alloc_l2
were doing.

To make matters confusing, the NUL1E and NUL2E definitions in the RISC-V
pmap are based on a 4-level page hierarchy but we currently use the
3-level Sv39 format (as that's the only required one, and hardware
support for the 4-level Sv48 is not widespread). This means that, in
effect, the above bug cancels out with the bloated NULxE definitions
such that things "work" (but are still technically wrong, and thus would
break when adding Sv48 support), with one exception. pmap_enter_l2 is
currently the only function to use the correct constant, but since
_pmap_alloc_l3 uses the incorrect constant, it will do complete nonsense
when it needs to allocate a new L2 table (which is rather rare). In this
instance, _pmap_alloc_l3, whilst it would correctly determine the pindex
was for an L2 table, would only subtract NUL1E when computing l1index
and thus go way out of bounds (by 511*512*512 bytes, or 127.75 GiB) of
its own L1 table and, thanks to pmap_distribute_l1, of every other
pmap's L1 table in the whole system. This has likely never been hit as
it would presumably instantly fault and panic.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31087
2021-07-21 02:51:26 +01:00
Jessica Clarke a1f9cdb1ab sifive_uart: Fix input character dropping in ddb and at a mountroot prompt
These use the raw console interface and poll. Unfortunately, the SiFive
UART puts the FIFO empty bit inside the FIFO data register, which means
that the act of checking whether a character is available also dequeues
any character from the FIFO, requiring the user to press each key twice.
However, since we configure the watermark to be 0 and, when the UART has
been grabbed for the console, we have interrupts off, we can abuse the
interrupt pending register to act as a substitute for the FIFO empty
bit.

This perhaps suggests that the console interface should move from having
rxready and getc to having getc_nonblock and getc (or make getc take a
bool), as all the places that call rxready do so to avoid blocking on
getc when there is no character available.

Reviewed by:	kp, philip
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31025
2021-07-21 02:51:25 +01:00
Jessica Clarke d9e85f2c6f riscv: Implement missing nexus methods
This is required for the SiFive FU740's PCIe controller. Copied from
arm64 with the only difference being changing pmap_mapdev_attr to
pmap_mapdev as riscv only has the latter.

Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D31032
2021-07-21 02:51:25 +01:00
David Chisnall cf98bc28d3 Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

This reapplies 3a522ba1bc with a fix for
the static assertion failure on i386.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-16 18:06:44 +01:00
Mark Johnston b092c58c00 Assert that valid PTEs are not overwritten when installing a new PTP
amd64 and 32-bit ARM already had assertions to this effect.  Add them to
other pmaps.

Reviewed by:	alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31171
2021-07-15 12:17:33 -04:00
David Chisnall d2b558281a Revert "Pass the syscall number to capsicum permission-denied signals"
This broke the i386 build.

This reverts commit 3a522ba1bc.
2021-07-10 20:26:01 +01:00
David Chisnall 3a522ba1bc Pass the syscall number to capsicum permission-denied signals
The syscall number is stored in the same register as the syscall return
on amd64 (and possibly other architectures) and so it is impossible to
recover in the signal handler after the call has returned.  This small
tweak delivers it in the `si_value` field of the signal, which is
sufficient to catch capability violations and emulate them with a call
to a more-privileged process in the signal handler.

Approved by:	markj (mentor)

Reviewed by:	kib, bcr (manpages)

Differential Revision: https://reviews.freebsd.org/D29185
2021-07-10 17:19:52 +01:00
Konstantin Belousov 28a66fc3da Do not call FreeBSD-ABI specific code for all ABIs
Use sysentvec hooks to only call umtx_thread_exit/umtx_exec, which handle
robust mutexes, for native FreeBSD ABI.  Similarly, there is no sense
in calling sigfastblock_clear() for non-native ABIs.

Requested by:	dchagin
Reviewed by:	dchagin, markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D30987
2021-07-07 14:12:07 +03:00
Jessica Clarke 348c41d181 riscv: Implement non-stub __vdso_gettc and __vdso_gettimekeep
PR:	256905
Reviewed by:	arichardson, mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30963
2021-07-05 16:16:53 +01:00
Edward Tomasz Napierala 435754a59e Add infrastructure required for Linux coredump support
This adds `sv_elf_core_osabi`, `sv_elf_core_abi_vendor`,
and `sv_elf_core_prepare_notes` fields to `struct sysentvec`,
and modifies imgact_elf.c to make use of them instead
of hardcoding FreeBSD-specific values.  It also updates all
of the ABI definitions to preserve current behaviour.

This makes it possible to implement non-native ELF coredump
support without unnecessary code duplication.  It will be used
for Linux coredumps.

Reviewed By:	kib
Sponsored By:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D30921
2021-06-29 08:49:12 +01:00
Jessica Clarke 5720b8de48 riscv: Add an hw.ncpu tunable to limit the number of cores
Based on a similar change to arm64 in 01a8235ea6.

Reviewed by:	mhorne
MRC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D30655
2021-06-15 01:22:13 +01:00
Alex Richardson 9bb8a4091c Reduce code duplication in machine/_types.h
Many of these typedefs are the same across all architectures or can
be set based on an architecture-independent compiler-provided macro
(e.g. __SIZEOF_SIZE_T__). These macros have been available since GCC 4.6
and Clang sometime before 3.0 (godbolt.org does not have any older clang
versions installed).

I originally considered using the compiler-provided `__FOO_TYPE__` directly.
However, in order to do so we have to check that those match the previous
typedef exactly (not just that they have the same size) since any change
would be an ABI break. For example, changing `long` to `long long` results
in different C++ name mangling. Additionally, Clang and GCC disagree on
the underlying type for some of (u)int*_fast_t types, so this change
only moves the definitions that are identical across all architectures
and does not touch those types.

This de-deduplication will allow us to have a smaller diff downstream in
CheriBSD: we only have to only change the (u)intptr_t definition in
sys/_types.h in CheriBSD instead of having to change machine/_types.h for
all CHERI-enabled architectures (currently RISC-V, AArch64 and MIPS).

Reviewed By: imp, kib
Differential Revision: https://reviews.freebsd.org/D29895
2021-06-14 16:30:16 +01:00
Mark Johnston 317113bb12 riscv: Rename pmap_fault_fixup() to pmap_fault()
This is consistent with other platforms, specifically arm and arm64.  No
functional change intended.

Reviewed by:	jrtc27
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30645
2021-06-06 16:44:46 -04:00
Mark Johnston c05748e028 riscv: Handle hardware-managed dirty bit updates in pmap_promote_l2()
pmap_promote_l2() failed to handle implementations which set the
accessed and dirty flags.  In particular, when comparing the attributes
of a run of 512 PTEs, we must handle the possibility that the hardware
will set PTE_D on a clean, writable mapping.

Following the example of amd64 and arm64, change riscv's
pmap_promote_l2() to downgrade clean, writable mappings to read-only, so
that updates are synchronized by the pmap lock.

Fixes:		f6893f09d
Reported by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Tested by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	jrtc27, alc, Nathaniel Filardo
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30644
2021-06-06 16:44:46 -04:00
Mitchell Horne 6f4bb8ecc2 arm64, riscv: remove reference to fsu_intr_fault
This variable no longer exists.

MFC after:	3 days
2021-05-25 12:26:52 -03:00
Ceri Davies c1a148873d sys/*/conf/*, docs: fix links to handbook
While here, fix all links to older en_US.ISO8859-1 documentation
in the src/ tree.

PR:             255026
Reported by:    Michael Büker <freebsd@michael-bueker.de>
Reviewed by:    dbaio
Approved by:    blackend (mentor), re (gjb)
MFC after:      10 days
Differential Revision: https://reviews.freebsd.org/D30265
2021-05-20 09:27:10 +01:00
Brandon Bergren 6e1abda231 riscv: Remove old qemu compatibility code
During early qemu development, the /soc node was marked as compatible
with "riscv-virtio-soc" instead of "simple-bus".

This was changed in qemu 53f54508dae6 in Sep 2018, and predates the
baseline required qemu version (5.0) for riscv by a wide margin.

The generic simplebus code handles attachment in all cases nowadays.

Sponsored by:	Tag1 Consulting, Inc.
Reviewed by:	jrtc27, mhorne
Differential Revision:	https://reviews.freebsd.org/D30011
2021-04-27 16:22:04 -05:00
John Baldwin 6a3a6fe34b riscv: Assert that SUM is not set in SSTATUS for exceptions.
Reviewed by:	mhorne
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29764
2021-04-21 13:57:20 -07:00
John Baldwin 753bcca440 riscv: Clear SUM in SSTATUS for supervisor mode exceptions.
Previously, a page fault taken during copyin/out and related functions
would run the entire fault handler while permitting direct access to
user addresses.  This could also leak across context switches (e.g. if
the page fault handler was preempted by an interrupt or slept for disk
I/O).

To fix, clear SUM in assembly after saving the original version of
SSTATUS in the supervisor mode trapframe.

Reviewed by:	mhorne, jrtc27
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D29763
2021-04-21 13:57:04 -07:00
Mitchell Horne 9d81dd5404 ddb: replace watchpoint set/clear functions
Use the new kdb variants. Print more specific error messages.

Reviewed by:	jhb, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29156
2021-03-29 12:05:44 -03:00
Mitchell Horne 763107f26c Introduce kdb-level watchpoint functions
This basically mirrors what already exists in ddb, but provides a
slightly improved interface. It allows the caller to specify the
watchpoint access type, and returns more specific error codes to
differentiate failure cases.

This will be used to support hardware watchpoints in gdb(4).

Stubs are provided for architectures lacking hardware watchpoint logic
(mips, powerpc, riscv), while other architectures are added individually
in follow-up commits.

Reviewed by:	jhb, kib, markj
MFC after:	3 weeks
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29155
2021-03-29 12:05:43 -03:00
Mitchell Horne 720dc6bcb5 Consolidate machine/endian.h definitions
This change serves two purposes.

First, we take advantage of the compiler provided endian definitions to
eliminate some long-standing duplication between the different versions
of this header. __BYTE_ORDER__ has been defined since GCC 4.6, so there
is no need to rely on platform defaults or e.g. __MIPSEB__ to determine
endianness. A new common sub-header is added, but there should be no
changes to the visibility of these definitions.

Second, this eliminates the hand-rolled __bswapNN() routines, again in
favor of the compiler builtins. This was done already for x86 in
e6ff6154d2. The benefit here is that we no longer have to maintain our
own implementations on each arch, and can instead rely on the compiler
to emit appropriate instructions or libcalls, as available. This should
result in equivalent or better code generation. Notably 32-bit arm will
start using the `rev` instruction for these routines, which is available
on armv6+.

PR:		236920
Reviewed by:	arichardson, imp
Tested by:	bdragon (BE powerpc)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D29012
2021-03-26 19:00:22 -03:00
Jason A. Harmening d22883d715 Remove PCPU_INC
e4b8deb222 removed the last in-tree uses of PCPU_INC().  Its
potential benefit is also practically nonexistent.  Non-x86
platforms already implement it as PCPU_ADD(..., 1), and according
to [0] there are no recent x86 processors for which the 'inc'
instruction provides a performance benefit over the equivalent
memory-operand form of the 'add' instruction.  The only remaining
benefit of 'inc' is smaller instruction size, which in this case
is inconsequential given the limited number of per-CPU data consumers.

[0]: https://www.agner.org/optimize/instruction_tables.pdf

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D29308
2021-03-20 19:23:59 -07:00
Mitchell Horne 0d3b3beeb2 riscv: fix errors in some atomic type aliases
This appears to be a copy-and-paste error that has simply been
overlooked. The tree contains only two calls to any of the affected
variants, but recent additions to the test suite started exercising the
call to atomic_clear_rel_int() in ng_leave_write(), reliably causing
panics.

Apparently, the issue was inherited from the arm64 atomic header. That
instance was addressed in c90baf6817, but the fix did not make its way
to RISC-V.

Note that the particular test case ng_macfilter_test:main still appears
to fail on this platform, but this change reduces the panic to a
timeout.

PR:		253237
Reported by:	Jenkins, arichardson
Reviewed by:	kp, arichardson
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D29064
2021-03-04 16:59:58 -04:00
John Baldwin 67932460c7 Add a VA_IS_CLEANMAP() macro.
This macro returns true if a provided virtual address is contained
in the kernel's clean submap.

In CHERI kernels, the buffer cache and transient I/O map are allocated
as separate regions.  Abstracting this check reduces the diff relative
to FreeBSD.  It is perhaps slightly more readable as well.

Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D28710
2021-02-17 16:32:11 -08:00
Danjel Qyteza 9bae4ce661 riscv: add SBI system reset extension
The System Reset extension provides functions to shutdown or reboot the
system via SBI firmware. This newly defined extension supersedes the
functionality of the legacy shutdown extension.

Update the SBI code to use the new System Reset extension when
available, and fall back to the legacy one.

Reviewed By:	kp, jhb
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D28226
2021-01-27 19:19:54 -04:00
Mitchell Horne a6405133b7 riscv: style(9) nits in sbi.c
Wrap a few lines at 80 columns, which were overlooked in recent commits.
2021-01-27 19:17:26 -04:00
Mark Johnston 3e3eb5f45f arm64, riscv: Set VM_KMEM_SIZE_SCALE to 1
This setting limits the amount of memory that can be allocated to UMA.
On systems with a direct map and ample KVA, however, there is no reason
for VM_KMEM_SIZE_SCALE to be larger than 1.  This appears to have been
inherited from the 32-bit ARM platform definitions.

Also remove VM_KMEM_SIZE_MIN, which is not needed when
VM_KMEM_SIZE_SCALE is defined to be 1.[*]

Reviewed by:	alc, kp, kib
Reported by:	alc [*]
Submitted by:	Klara, Inc.
Sponsored by:	Ampere Computing
Differential Revision:	https://reviews.freebsd.org/D28225
2021-01-19 20:34:36 -05:00
Emmanuel Vadot fa67846c6f riscv: Fix build by using the correct device-tree include path 2021-01-16 11:31:39 +01:00
Andrew Turner 6eebda3bba Split out the NODEBUG options to a common file
This is the superset of the nooptions found in the -DEBUG kernels.

Reviewed by:	emaste, manu
Sponsored by:	Innovate UK
Differential Revision:	https://reviews.freebsd.org/D28152
2021-01-14 16:57:53 +00:00
Mitchell Horne 818390ce0c arm64: fix early devmap assertion
The purpose of this KASSERT is to ensure that we do not run out of space
in the early devmap. However, the devmap grew beyond its initial size of
2MB in r336519, and this assertion did not grow with it.

A devmap mapping of a 1080p framebuffer requires 1920x1080 bytes, or
1.977 MB, so it is just barely able to fit without triggering the
assertion, provided no other devices are mapped before it. With the
addition of `options GDB` in GENERIC by bbfa199cbc, the uart is now
mapped for the purposes of a debug port, before mapping the framebuffer.
The presence of both these conditions pushes the selected virtual
address just below the threshold, triggering the assertion.

To fix this, use the correct size of the devmap, defined by
PMAP_MAPDEV_EARLY_SIZE. Since this code is shared with RISC-V, define
it for that platform as well (although it is a different size).

PR:		25241
Reported by:	gbe
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
2021-01-13 17:27:44 -04:00
mhorne 0628f68357 riscv pmap: add some pv list assertions
Ensure that we don't end up with a superpage in the vm_page_t's pv list.

This may help with debugging the panic reported in PR 250866, in which
l3 in pmap_remove_write() was found to be NULL. Adding a KASSERT to this
function will help narrow down the cause of this panic the next time it
occurs.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D28109
2021-01-12 11:12:02 -04:00
Thomas Skibo facdd1cd20 cgem: add 64-bit support
Add 64-bit address support to Cadence CGEM Ethernet driver for use in
other SoCs such as the Zynq UltraScale+ and SiFive HighFive Unleashed.

Reviewed by:	philip, 0mp (manpages)
Differential Revision: https://reviews.freebsd.org/D24304
2021-01-10 16:51:52 -04:00
Mitchell Horne cbc9be948a sifive_uart: quiet GCC -Werror=parentheses
Add an additional set of braces to clarify intention. The '&' operator
has a higher precedence than '|', but the reader may not always remember
this. No functional change.
2021-01-08 17:32:18 -04:00
John Baldwin 1dce7d9e7e Skip the vm.pmap.kernel_maps sysctl by default.
This sysctl node can generate very verbose output, so don't trigger it
for sysctl -a or sysctl vm.pmap.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D27504
2020-12-18 20:41:23 +00:00
Mitchell Horne 25de8fb6ca riscv: report additional known SBI implementations
These implementation IDs are defined in the SBI spec, so we should print
their name if detected.

Submitted by:	Danjel Qyteza <danq1222@gmail.com>
Reviewed by:	jhb, kp
Differential Revision:	https://reviews.freebsd.org/D27660
2020-12-18 20:10:30 +00:00
Alexander V. Chernikov d5fe384b4d Enable ROUTE_MPATH support in GENERIC kernels.
Ability to load-balance traffic over multiple path is a must-have thing for routers.
It may be used by the servers to balance outgoing traffic over multiple default gateways.

The previous implementation, RADIX_MPATH stayed in the shadow for too long.
It was not well maintained, which lead us to a vicious circle - people were using
 non-contiguous mask or firewalls to achieve similar goals. As a result, some routing
 daemons implementation still don't have multipath support enabled for FreeBSD.

Turning on ROUTE_MPATH by default would fix it. It will allow to reduce networking
 feature gap to other operating systems. Linux and OpenBSD enabled similar support
 at least 5 years ago.

ROUTE_MPATH does not consume memory unless actually used. It enables around ~1k LOC.

It does not bring any behaviour changes for userland.
Additionally, feature is (temporarily) turned off by the net.route.multipath sysctl
 defaulting to 0.

Differential Revision:	https://reviews.freebsd.org/D27428
2020-12-14 22:23:08 +00:00
Mitchell Horne f7cc0eae7e riscv: small counter(9) improvements
Prefer atomics to critical section. This reduces the cost of the
increment operation and removes the possibility of it being interrupted
by counter_u64_zero().

Use CPU_FOREACH() macro to skip absent CPUs.

Replace hand-rolled address calculation with zpcpu_get().

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27536
2020-12-11 20:01:45 +00:00
Mitchell Horne 5a28499f2f riscv: handle debug.debugger_on_trap for fatal page faults
Allows recovery or diagnosis of a fatal page fault before panicking the
system.

Reviewed by:	jhb, kp
Differential Revision:	https://reviews.freebsd.org/D27534
2020-12-10 22:20:20 +00:00
John Baldwin 9b9e7f4c51 Stack unwinding robustness fixes for RISC-V.
- Push the kstack_contains check down into unwind_frame() so that it
  is honored by DDB and DTrace.

- Check that the trapframe for an exception frame is contained in the
  traced thread's kernel stack for DDB traces.

Reviewed by:	markj
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27357
2020-12-08 17:57:18 +00:00
John Baldwin 5941edfcdc Add a kstack_contains() helper function.
This is useful for stack unwinders which need to avoid out-of-bounds
reads of a kernel stack which can trigger kernel faults.

Reviewed by:	kib, markj
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D27356
2020-12-01 17:04:46 +00:00
Alex Richardson acf8792067 Add .cfi_{start,end}proc for RISC-V assembly functions
This allows GDB to print more useful backtraces when setting a breakpoint
on an assembly function.

Reviewed By:	jhb
Differential Revision: https://reviews.freebsd.org/D27177
2020-11-26 17:37:22 +00:00
Mitchell Horne 08241f9192 riscv: always initialize the static kernel environment
Ensure we initialize the static environment when not booting via
loader(8), and provide a static buffer if this is the case. This fixes
two issues.

First, performing the initialization ensures that kenv variables set in
the kernel's config file are honored. Previously, any new or overridden
values were ignored.

Second, providing the static buffer allows variables to be set in the
device tree's bootargs property of the chosen node. This can be set by
u-boot or by QEMU's '-append' flag. Attempting to this prior to this
change resulted in an early panic, since the static environment had no
buffer backing it.

Submitted by:	syrinx (earlier version)
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D25034
2020-11-20 15:21:10 +00:00
Mitchell Horne caaddb88e8 riscv: set kernel_pmap hart mask more precisely
In pmap_bootstrap(), we fill kernel_pmap->pm_active since it is
invariably active on all harts. However, this marks it as active even
for harts that don't exist in the system, which can cause issue when the
mask is passed to the SBI firmware via sbi_remote_sfence_vma().
Specifically, the SBI spec allows SBI_ERR_INVALID_PARAM to be returned
when an invalid hart is set in the mask.

The latest version of OpenSBI does not have this issue, but v0.6 does,
and this is triggering a recently added KASSERT in CI. Switch to only
setting bits in pm_active for harts that enter the system.

Reported by:	Jenkins
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D27080
2020-11-05 00:52:52 +00:00
Alan Cox 9b4e77cb97 Tidy up the #includes. Recent changes, such as the introduction of
VM_ALLOC_WAITOK and vm_page_unwire_noq(), have eliminated the need for
many of the #includes.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D27052
2020-11-02 19:20:06 +00:00
Edward Tomasz Napierala b1497fb649 Optimize set_syscall_retval for riscv by predicting the return
value to be zero.

Reviewed by:	mhorne, kp
MFC after:	2 weeks
Sponsored by:	EPSRC
Differential Revision:	https://reviews.freebsd.org/D26990
2020-10-29 15:36:20 +00:00
Kristof Provost eb81dfb3af riscv: Minor cleanup in startup code
- remove setting of register value which is not used until the next value is
   set
 - Use the L2_SHIFT constant when setting up L2 superpages

Submitted by:	Antonin Houska <ah AT melesmeles DOT cz>
2020-10-27 12:44:49 +00:00
Mitchell Horne 89f3492919 riscv: make use of SBI legacy replacement extensions
Version 0.2 of the SBI specification [1] marked the existing SBI
functions as "legacy" in order to move to a newer calling convention. It
also introduced a set of replacement extensions for some of the legacy
functionality. In particular, the TIME, IPI, and RFENCE extensions
implement and extend the semantics of their legacy counterparts, while
conforming to the newer version of the spec.

Update our SBI code to use the new replacement extensions when
available, and fall back to the legacy ones. These will eventually be
dropped, when support for version 0.2 is ubiquitous.

[1] https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc

Submitted by:	Danjel Q. <danq1222@gmail.com>
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D26953
2020-10-26 19:13:22 +00:00
Mitchell Horne 6b35ff5fcb riscv: remove sbi_clear_ipi()
S-mode software has write access to the SIP.SSIP bit, so instead of
making a second round-trip through the SBI we can clear it ourselves.
The SBI spec has deprecated this function for this exactly this reason.

Submitted by:	Danjel Q. <danq1222@gmail.com
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D26952
2020-10-26 19:06:30 +00:00
Mitchell Horne cb8e067818 riscv: improve exception code naming
The existing names were inherited from arm64, but we should prefer
RISC-V terminology. Change the prefix to SCAUSE, and further change the
names to better match the RISC-V spec and be more consistent with one
another. Also, remove two codes that are not defined for S-mode (machine
and hypervisor ecall).

While here, apply style(9) to some condition checks.

Reviewed by:	kp
Discussed with: jrtc27
Differential Revision:	https://reviews.freebsd.org/D26918
2020-10-24 20:57:13 +00:00
Mitchell Horne 02a37049b4 riscv: zero reserved PTE bits for L2 PTEs
As was done for L3 PTEs in r362853, mask out the reserved bits when
extracting the physical address from an L2 PTE. Future versions of the
spec or custom implementations may make use of these reserved bits, in
which case the resulting physical address could be incorrect.

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	kp, mhorne
Differential Revision:	https://reviews.freebsd.org/D26607
2020-10-17 17:31:06 +00:00
Mitchell Horne ce4900bc8a Simplify preload_dump() condition
Hiding this feature behind RB_VERBOSE is gratuitous. The tunable is enough
to limit its use to only those who explicitly request it.

Suggested by:	kevans
2020-10-15 20:21:15 +00:00
Konstantin Belousov 6f3b523c9a Avoid dump_avail[] redefinition.
Move dump_avail[] extern declaration and inlines into a new header
vm/vm_dumpset.h.  This fixes default gcc build for mips.

Reviewed by:	alc, scottph
Tested by:	kevans (previous version)
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D26741
2020-10-14 22:51:40 +00:00
Conrad Meyer f8e8a06d23 random(4) FenestrasX: Push root seed version to arc4random(3)
Push the root seed version to userspace through the VDSO page, if
the RANDOM_FENESTRASX algorithm is enabled.  Otherwise, there is no
functional change.  The mechanism can be disabled with
debug.fxrng_vdso_enable=0.

arc4random(3) obtains a pointer to the root seed version published by
the kernel in the shared page at allocation time.  Like arc4random(9),
it maintains its own per-process copy of the seed version corresponding
to the root seed version at the time it last rekeyed.  On read requests,
the process seed version is compared with the version published in the
shared page; if they do not match, arc4random(3) reseeds from the
kernel before providing generated output.

This change does not implement the FenestrasX concept of PCPU userspace
generators seeded from a per-process base generator.  That change is
left for future discussion/work.

Reviewed by:	kib (previous version)
Approved by:	csprng (me -- only touching FXRNG here)
Differential Revision:	https://reviews.freebsd.org/D22839
2020-10-10 21:52:00 +00:00
Mitchell Horne eff4c46e28 RISC-V LINT kernel config
Create the RISC-V NOTES and LINT files. As of r366559, LINT configs are
no longer generated but checked in to the tree.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D26502
2020-10-09 14:45:41 +00:00
Mitchell Horne 22e6a67086 Add a routine to dump boot metadata
The boot metadata (also referred to as modinfo, or preload metadata)
provides information about the size and location of the kernel,
pre-loaded modules, and other metadata (e.g. the EFI framebuffer) to be
consumed during by the kernel during early boot. It is encoded as a
series of type-length-value entries and is usually constructed by
loader(8) and passed to the kernel. It is also faked on some
architectures when booted by other means.

Although much of the module information is available via kldstat(8),
there is no easy way to debug the metadata in its entirety. Add some
routines to parse this data and allow it to be printed to the console
during early boot or output via a sysctl.

Since the output can be lengthly, printing to the console is gated
behind the debug.dump_modinfo_at_boot kenv variable as well as the
BOOTVERBOSE flag. The sysctl to print the metadata is named
debug.dump_modinfo.

Reviewed by:	tsoome
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26687
2020-10-08 18:02:05 +00:00
Edward Tomasz Napierala 5319fa1b3e Remove yet another useless assignment, adding a KASSERT just in case.
Reviewed by:	kp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26698
2020-10-08 11:04:32 +00:00
Mitchell Horne 8481aab1ac Print symbol index for unsupported relocation types
It is unlikely, but possible, that an unrecognized or unsupported
relocation type is encountered while trying to load a kernel module. If
this occurs we should offer the symbol index as a hint to the user.

While here, fix some small style issues.

Reviewed by:	markj, kib (amd64 part, in D26701)
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
2020-10-07 18:48:10 +00:00
Edward Tomasz Napierala 29c4e4b1af Don't use critical section when calling intr_irq_handler() - that function
enters critical section by itself anyway.

Reviewed by:	kp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26694
2020-10-07 12:11:11 +00:00
Jessica Clarke 2152743f11 riscv: Remove outdated condition in page_fault_handler
Since r366355 and r366284 we panic on access faults rather than treating
them like page faults so this condition is never true.

Reviewed by:	jhb (mentor), markj, mhorne
Approved by:	jhb (mentor), markj, mhorne
Differential Revision:	https://reviews.freebsd.org/D26686
2020-10-06 13:03:31 +00:00
Jessica Clarke 105708ca1c riscv: Handle supervisor instruction page faults
We should never take instruction page faults when in the kernel, but by
using the standard page fault code we should get a more-informative
message about faulting on a NOFAULT page rather than branching to the
default case here and printing an "Unknown kernel exception ..."
message.

Reviewed by:	jhb (mentor), markj
Approved by:	jhb (mentor), markj
Differential Revision:	https://reviews.freebsd.org/D26685
2020-10-06 13:02:20 +00:00
Jessica Clarke da8944d96d riscv: De-Arm a few names
These names were inherited from the arm64 port and should be changed to
the RISC-V terminology.

Reviewed by:	jhb (mentor), kp, markj
Approved by:	jhb (mentor), kp, markj
Differential Revision:	https://reviews.freebsd.org/D26671
2020-10-06 12:56:29 +00:00
Edward Tomasz Napierala f157761902 Drop useless assignment, and add a KASSERT to make sure it really was useless.
Reviewed by:	nick, jhb
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26649
2020-10-05 18:41:35 +00:00
Edward Tomasz Napierala f726515758 Optimize riscv's cpu_fetch_syscall_args(), making it possible
for the compiler to inline the memcpy.

Reviewed by:	arichardson, mhorne
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26528
2020-10-03 13:01:07 +00:00
Kristof Provost 75f022774f riscv: handle access faults in user mode
Access faults in user mode are treated like TLB misses, which leads to an
endless loop of faults. It's less serious than the same fault in kernel mode,
because we can just terminate the process, but that's not ideal.

Treat user mode access faults as a bus error.

Suggested by:	jrtc27
Reviewed by:	br, jhb
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D26621
2020-10-02 07:30:11 +00:00
Kristof Provost 57712c0b76 riscv: Add memmmap so we can mmap /dev/mem
Reviewed by:	mhorne
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D26622
2020-10-01 15:04:55 +00:00
Kristof Provost 0d3aa0fb64 riscv: Panic on PMP errors
Load/store/fetch access exceptions always indicate a violation of a PMP
rule. We can't treat those as page faults, because updating the page
table and trying again will only result in exactly the same access
exception recurring. This leaves us in an endless exception loop.

We cannot recover from these exceptions, so panic instead.

Reviewed by:	jhb
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D26544
2020-09-30 08:23:43 +00:00
Jessica Clarke 7de649170f riscv: Define __PCI_REROUTE_INTERRUPT
Every other architecture defines this and this is required for
interrupts to work when using QEMU's PCI VirtIO devices (which all
report an interrupt line of 0) for two reasons.

Firstly, interrupt line 0 is wrong; they use one of 0x20-0x23 with the
lines being cycled across devices like normal. Moreover, RISC-V uses
INTRNG, whose IRQs are virtual as indices into its irq_map, so even if
we have the right interrupt line we still need to try and route the
interrupt in order to ultimately call into intr_map_irq and get back a
unique index into the map for the given line, otherwise we will use
whatever happens to be in irq_map[line] (which for QEMU where the line
is initialised to 0 results in using the first allocated interrupt,
namely the RTC on IRQ 11 at time of commit).

Note that pci_assign_interrupt will still do the wrong thing for INTRNG
when using a tunable, as it will bypass INTRNG entirely and use the
tunable's value as the index into irq_map, when it should instead
(indirectly) call intr_map_irq to allocate a new entry for the given
IRQ and treat the tunable as stating the physical line in use, which is
what one would expect. This, however, is a problem shared by all INTRNG
architectures, and not exclusive to RISC-V.

Reviewed by:	kib
Approved by:	kib
Differential Revision:	https://reviews.freebsd.org/D26564
2020-09-30 02:21:38 +00:00
Edward Tomasz Napierala 1e2521ffae Get rid of sa->narg. It serves no purpose; use sa->callp->sy_narg instead.
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26458
2020-09-27 18:47:06 +00:00
Mark Johnston 78257765f2 Add a vmparam.h constant indicating pmap support for large pages.
Enable SHM_LARGEPAGE support on arm64.

Reviewed by:	alc, kib
Sponsored by:	Juniper Networks, Inc., Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26467
2020-09-23 19:34:21 +00:00
Nick O'Brien e1c8f8f87d riscv: Trap cleanup - use nitems()
No functional changes, just cleanup.

Reviewed by:	kp
Approved by:	kp (mentor)
Sponsored by:	Axiado
2020-09-23 18:54:14 +00:00
Mitchell Horne 3994f5bc18 RISC-V: build SiFive drivers and DTB in GENERIC
In the spirit of the GENERIC config, we should include the drivers required to
run on most supported platforms.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D26501
2020-09-22 13:00:02 +00:00
D Scott Phillips 00e6614750 Sparsify the vm_page_dump bitmap
On Ampere Altra systems, the sparse population of RAM within the
physical address space causes the vm_page_dump bitmap to be much
larger than necessary, increasing the size from ~8 Mib to > 2 Gib
(and overflowing `int` for the size).

Changing the page dump bitmap also changes the minidump file
format, so changes are also necessary in libkvm.

Reviewed by:	jhb
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26131
2020-09-21 22:21:59 +00:00
D Scott Phillips ab041f713a Move vm_page_dump bitset array definition to MI code
These definitions were repeated by all architectures, with small
variations. Consolidate the common definitons in machine
independent code and use bitset(9) macros for manipulation. Many
opportunities for deduplication remain in the machine dependent
minidump logic. The only intended functional change is increasing
the bit index type to vm_pindex_t, allowing the indexing of pages
with address of 8 TiB and greater.

Reviewed by:	kib, markj
Approved by:	scottl (implicit)
MFC after:	1 week
Sponsored by:	Ampere Computing, Inc.
Differential Revision:	https://reviews.freebsd.org/D26129
2020-09-21 22:20:37 +00:00
Michal Meloun 3182062142 Add missing assignment forgotten in r365899
Noticed by:	mav
MFC after:	1 month
MFC with:	r365899
2020-09-20 15:11:52 +00:00
Michal Meloun 95a85c125d Add NetBSD compatible bus_space_peek_N() and bus_space_poke_N() functions.
One problem with the bus_space_read_N() and bus_space_write_N() family of
functions is that they provide no protection against exceptions which can
occur when no physical hardware or device responds to the read or write
cycles. In such a situation, the system typically would panic due to a
kernel-mode bus error. The bus_space_peek_N() and bus_space_poke_N() family
of functions provide a mechanism to handle these exceptions gracefully
without the risk of crashing the system.

Typical example is access to PCI(e) configuration space in bus enumeration
function on badly implemented PCI(e) root complexes (RK3399 or Neoverse
N1 N1SDP and/or access to PCI(e) register when device is in deep sleep state.

This commit adds a real implementation for arm64 only. The remaining
architectures have bus_space_peek()/bus_space_poke() emulated by using
bus_space_read()/bus_space_write() (without exception handling).

MFC after:	1 month
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D25371
2020-09-19 11:06:41 +00:00
Edward Tomasz Napierala 70890254b3 Get rid of sv_errtbl and SV_ABI_ERRNO().
Reviewed by:	kib
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26388
2020-09-17 11:39:33 +00:00
Scott Long 74c781ed91 Refine the busdma template interface. Provide tools for filling in fields
that can be extended, but also ensure compile-time type checking.  Refactor
common code out of arch-specific implementations.  Move the mpr and mps
drivers to this new API.  The template type remains visible to the consumer
so that it can be allocated on the stack, but should be considered opaque.
2020-09-14 05:58:12 +00:00
John Baldwin 39585a4c10 Disable WITNESS for spin locks by default.
This matches all other architectures and removes substantial overhead.

Reported by:	arichardson (indirectly)
Reviewed by:	imp, arichardson
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26403
2020-09-11 00:06:16 +00:00
Mitchell Horne 7c7b8f577e RISC-V: fix some mismatched format specifiers
RISC-V is currently built with -Wno-format, which is how these went
undetected. Address them now before re-enabling those warnings.

Differential Revision:	https://reviews.freebsd.org/D26319
2020-09-08 13:21:13 +00:00
Brooks Davis 46b974a9db Round TF_SIZE up to the stack alignment (16-bytes).
The kernel adjusts the stack by TF_SIZE and the RISC-V ABI requires
that it remain 16-byte aligned.

Reported by:	CHERI, jrtc27
Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D26328
2020-09-04 21:55:22 +00:00
Mark Johnston 847ab36bf2 Include the psind in data returned by mincore(2).
Currently we use a single bit to indicate whether the virtual page is
part of a superpage.  To support a forthcoming implementation of
non-transparent 1GB superpages, it is useful to provide more detailed
information about large page sizes.

The change converts MINCORE_SUPER into a mask for MINCORE_PSIND(psind)
values, indicating a mapping of size psind, where psind is an index into
the pagesizes array returned by getpagesizes(3), which in turn comes
from the hw.pagesizes sysctl.  MINCORE_PSIND(1) is equal to the old
value of MINCORE_SUPER.

For now, two bits are used to record the page size, permitting values
of MAXPAGESIZES up to 4.

Reviewed by:	alc, kib
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D26238
2020-09-02 18:16:43 +00:00
Mark Johnston 2d838cd867 Add the MEM_EXTRACT_PADDR ioctl to /dev/mem.
This allows privileged userspace processes to find information about the
physical page backing a given mapping.  It is useful in applications
such as DPDK which perform some of their own memory management.

Reviewed by:	kib, jhb (previous version)
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D26237
2020-09-02 18:12:47 +00:00
Kristof Provost 3bebdc0564 riscv: very large dma mappings can cause integer overflow
Fix the return type for _bus_dmamap_addseg().
Based on the same fix done for arm64 in r348571.

Sponsored by:	Axiado
2020-09-02 11:33:31 +00:00
Mateusz Guzik e91d4ae878 riscv: clean up empty lines in .c and .h files 2020-09-01 21:21:03 +00:00
Nick O'Brien bdc3ee3546 riscv: Use global mimpid in identify_cpu()
sbi_init() sets mimpid, we can use that value.

Reviewed by: philip (mentor), kp (mentor)
Approved by: philip (mentor), kp (mentor)
Sponsored by: Axiado
Differential Revision: https://reviews.freebsd.org/D26092
2020-08-18 16:51:04 +00:00
Mitchell Horne 9ead45af7b RISC-V: copy kernelname from the environment
This is allows kern.bootfile to report the correct value.
2020-08-15 16:15:34 +00:00
Mitchell Horne 958a094323 Enable interrupts while handling traps
I observed hangs post-r362977 in QEMU with -smp 2, in which one thread
would acquire write access to an rm_lock (sysctllock) and get stuck
waiting in smp_rendezvous_cpus while the other CPU was servicing a trap.
The other thread was waiting for read access to the same lock, thus
causing deadlock.

It's clear that this is just one symptom of a larger problem. The
general expectation of MI kernel code is that interrupts are enabled.
Violating this assumption will at best create some additional latency,
but otherwise might cause locking or other unforeseen issues. All other
architectures do so for some subset of trap values, but this somehow got
missed in the RISC-V port. Enable interrupts now during kernel page
faults and for all user trap types.

The code in exception.S already knows to disable interrupts while
handling the return from exception, so there are no changes required
there.

Reviewed by:	jhb, markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D26017
2020-08-13 14:21:05 +00:00
Mitchell Horne 99c9fdd09a Small fixes in locore.S
- Properly set up the frame pointer
 - Hang if we return from mi_startup
 - Whitespace

Clearing the frame pointer marks the end of the backtrace. This fixes
"bt 0" in ddb, which previously would unwind one frame too far.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D26016
2020-08-13 14:17:36 +00:00
John Baldwin 40db51b42f Check that the frame pointer is within the current stack.
This same check is used on other architectures.  Previously this would
permit a stack frame to unwind into any arbitrary kernel address
(including unmapped addresses).

Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25996
2020-08-12 20:33:29 +00:00
John Baldwin 367de39efa Use uintptr_t instead of uint64_t for pointers in stack frames.
Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25995
2020-08-12 20:29:49 +00:00
John Baldwin 90699f2a76 Correct padding length for RISC-V PCPU data.
There was an additional 7 bytes of compiler-inserted padding at the
end of the structure visible via 'ptype /o' in gdb.

Reviewed by:	mhorne
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25867
2020-08-12 18:45:36 +00:00
Mateusz Guzik 9ce4656a1f riscv: fix uintfptr_t
Fixes compilation after r363932
2020-08-05 22:09:40 +00:00
Andrew Turner c085d2ea97 Add DDB_CTF to the arm64 and riscv kernel configs
This allows DTrace fbt probes to find arguments.

Sponsored by:	Innovate UK
2020-08-05 11:54:51 +00:00
Kristof Provost 7393b267c6 libc: Provide sub fp(s|g)etmask() implementations for RISC-V
RISC-V doesn't support floating-point exceptions.

RISC-V Instruction Set Manual: Volume I: User-Level ISA, 11.2 Floating-Point
Control and Status Register: "As allowed by the standard, we do not support
traps on floating-point exceptions in the base ISA, but instead require
explicit checks of the flags in software. We considered adding branches
controlled directly by the contents of the floating-point accrued exception
flags, but ultimately chose to omit these instructions to keep the ISA simple."

We still need these functions, because some applications (notably Perl) call
them, but we cannot provide a meaningful implementation.

Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D25740
2020-08-03 12:48:51 +00:00
John Baldwin 9886554a02 Trim some extraneous parentheses.
Reported by:	kib (do_trap_user)
Sponsored by:	DARPA
2020-07-27 16:37:18 +00:00
John Baldwin c294dddb03 Set si_trapno to the exception code from scause.
Reviewed by:	kib
Obtained from:	CheriBSD
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25770
2020-07-27 16:28:44 +00:00
Jessica Clarke 825240034e riscv: Include syscon_power device driver in GENERIC kernel config
QEMU's RISC-V virt machine provides syscon-power and syscon-reset
devices as the means by which to shutdown and reboot. We also need to
ensure that we have attached the syscon_generic device before attaching
any syscon_power devices, and so we introduce a new riscv_syscon device
akin to aw_syscon added in r327936. Currently the SiFive test finisher
is used as the specific implementation of such a syscon device.

Reviewed by:	br, brooks (mentor), jhb (mentor)
Approved by:	br, brooks (mentor), jhb (mentor)
Obtained from:	CheriBSD
Differential Revision:	https://reviews.freebsd.org/D25725
2020-07-26 18:21:02 +00:00
Jessica Clarke d63a631e72 Add Goldfish RTC device driver for RISC-V
This device was originally used as part of the goldfish virtual hardware
platform used for emulating Android on QEMU, but is now also used as the
RTC for the RISC-V virt machine in QEMU. It provides a simple 64-bit
nanosecond timer exposed via a pair of memory-mapped 32-bit registers,
although only with 1s granularity.

Reviewed by:	brooks (mentor), jhb (mentor), kp
Approved by:	brooks (mentor), jhb (mentor), kp
Obtained from:	CheriBSD
Differential Revision:	https://reviews.freebsd.org/D25717
2020-07-26 18:15:16 +00:00
Alex Richardson b798ef6490 Include TMPFS in all the GENERIC kernel configs
Being able to use tmpfs without kernel modules is very useful when building
small MFS_ROOT kernels without a real file system.
Including TMPFS also matches arm/GENERIC and the MIPS std.MALTA configs.

Compiling TMPFS only adds 4 .c files so this should not make much of a
difference to NO_MODULES build times (as we do for our minimal RISC-V
images).

Reviewed By: br (earlier version for riscv), brooks, emaste
Differential Revision: https://reviews.freebsd.org/D25317
2020-07-24 08:40:04 +00:00
John Baldwin e7aaabe15e Pass the right size to memcpy() when copying the array of FP registers.
The size of the containing structure was passed instead of the size of
the array.  This happened to be harmless as the extra word copied is
one we copy in the next line anyway.

Reported by:	CHERI (bounds check violation)
Reviewed by:	brooks, imp
Obtained from:	CheriBSD
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25791
2020-07-23 21:33:10 +00:00
John Baldwin a1119d08b9 Add missing space after switch.
Reviewed by:	br, emaste
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D25778
2020-07-22 22:51:14 +00:00
Mitchell Horne dc42509049 INTRNG: only shuffle for !EARLY_AP_STARTUP
During device attachment, all interrupt sources will bind to the BSP,
as it is the only processor online. This means interrupts must be
redistributed ("shuffled") later, during SI_SUB_SMP.

For the EARLY_AP_STARTUP case, this is no longer true. SI_SUB_SMP will
execute much earlier, meaning APs will be online and available before
devices begin attachment, and there will therefore be nothing to
shuffle.

All PIC-conforming interrupt controllers will handle this early
distribution properly, except for RISC-V's PLIC. Make the necessary
tweak to the PLIC driver.

While here, convert irq_assign_cpu from a boolean_t to a bool.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D25693
2020-07-21 22:47:02 +00:00
Mitchell Horne cf5cd89c3a riscv: look for bootargs in FDT
The FDT may contain a short /chosen/bootargs string which we should pass
to boot_parse_cmdline. Notably, this allows the use of qemu's -append
option to pass things like -s to boot to single user mode.

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	mhorne
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D25544
2020-07-19 23:34:52 +00:00
Mark Johnston e64080e79c Switch from SCTP to SCTP_SUPPORT in GENERIC configs.
This removes SCTP from in-tree kernel configuration files.  Now, SCTP
can be enabled by simply loading the module, as discussed on
freebsd-net@.

Reviewed by:	tuexen
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D25611
2020-07-16 15:09:04 +00:00
Kristof Provost 38d715f789 riscv plic: Do not complete interrupts until the interrupt handler has run
We cannot complete the interrupt (i.e. write to the claims/complete register
until the interrupt handler has actually run. We don't run the interrupt
handler immediately from intr_isrc_dispatch(), we only schedule it for later
execution.

If we immediately complete it (i.e. before the interrupt handler proper has
run) the interrupt may be triggered again if the interrupt source remains set.
From RISC-V Instruction Set Manual: Volume II: Priviliged Architecture, 7.4
Interrupt Gateways:

"If a level-sensitive interrupt source deasserts the interrupt after the PLIC
core accepts the request and before the interrupt is serviced, the interrupt
request remains present in the IP bit of the PLIC core and will be serviced by
a handler, which will then have to determine that the interrupt device no
longer requires service."

In other words, we may receive interrupts twice.

Avoid that by postponing the completion until after the interrupt handler has
run.

If the interrupt is handled by a filter rather than by scheduling an interrupt
thread we must also complete the interrupt, so set up a post_filter handler
(which is the same as the post_ithread handler).

Reviewed by:	mhorne
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D25531
2020-07-06 21:29:50 +00:00
Mitchell Horne 2192efc03b RISC-V boot1.efi and loader.efi support
This implementation doesn't have any major deviations from the other EFI
ports. I've copied the boilerplate from arm and arm64.

I've tested this with the following boot flows:
OpenSBI (M-mode) -> u-boot (S-mode) -> loader.efi -> FreeBSD
OpenSBI (M-mode) -> u-boot (S-mode) -> boot1.efi -> loader.efi -> FreeBSD

Due to the way that u-boot handles secondary CPUs, OpenSBI >= v0.7 is required,
as the HSM extension is needed to bring them up explicitly. Because of this,
using BBL as the SBI implementation will not be possible. Additionally, there
are a few recent u-boot changes that are required as well, all of which will be
present in the upcoming v2020.07 release.

Looks good:	emaste
Differential Revision:	https://reviews.freebsd.org/D25135
2020-07-06 18:19:42 +00:00
Kristof Provost b865714d95 riscv pmap: zero reserved pte bits in ppn
The top 10 bits of a pte are reserved by specification[1] and are not part of
the PPN.

[1] 'Volume II: RISC-V Privileged Architectures V20190608-Priv-MSU-Ratified',
'4.4.1 Addressing and Memory Protection', page 72: "The PTE format for Sv39 is
shown in Figure 4.18. ... Bits 63–54 are reserved for future use and must be
zeroed by software for forward compatibility."

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	kp, mhorne
Differential Revision:	https://reviews.freebsd.org/D25523
2020-07-01 19:15:43 +00:00
Kristof Provost 6f11e59d72 riscv locore.S: load constant prior to loop
A very minor micro-optimization; t0 is not clobbered between the loop top and
bottom and there appear to be no other branches to this label.

Submitted by:	Nathaniel Filardo <nwf20@cl.cam.ac.uk>
Reviewed by:	mhorne
Differential Revision:	https://reviews.freebsd.org/D25524
2020-07-01 19:12:47 +00:00
Kristof Provost d53a2816c7 riscv: Log missing registers in dump_regs()
If we panic we dump the registers for debugging. This is very useful, but it
missed several registers (ra, sp, gp and tp).

Log these as well. Especially the return address value is extremely useful.

Sponsored by:	Axiado
2020-07-01 19:11:02 +00:00
Mitchell Horne 133b1f1461 Only invalidate the early DTB mapping if it exists
This temporary mapping will become optional. Booting via loader(8)
means that the DTB will have already been copied into the kernel's
staging area, and is therefore covered by the early KVA mappings.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D24911
2020-06-24 15:21:12 +00:00
Mitchell Horne f7d2df2a8a Handle load from loader(8)
In locore, we must detect and handle different arguments passed by
loader(8) compared to what we recieve when booting directly via SBI
firmware. Currently we receive the hart ID in a0 and a pointer to the
device tree blob in a1. loader(8) provides only a pointer to its
metadata in a0.

The solution to this is to add an additional entry point, _alt_start.
This will be placed first in the .text section, so SBI firmware will
enter here, and jump to the common pagetable setup shortly after. Since
loader(8) understands our ELF kernel, it will enter at the ELF's entry
address, which points to _start. This approach leads to very little
guesswork as to which way we booted.

Fix-up initriscv() to parse the loader's metadata, continuing to use
fake_preload_metadata() in the SBI direct boot case.

Reviewed by:	markj, jrtc27 (asm portion)
Differential Revision:	https://reviews.freebsd.org/D24912
2020-06-24 15:20:00 +00:00
Jessica Clarke e28d8a5b26 riscv: Use SBI shutdown call to implement RB_POWEROFF
Currently we only call sbi_shutdown in cpu_reset, which means we reach
"Please press any key to reboot." even when RB_POWEROFF is set, and only
once the user presses a key do we then shutdown. Instead, register a
shutdown_final event handler and make an SBI shutdown call if
RB_POWEROFF is set.

Reviewed by:	br, jhb (mentor), kp
Approved by:	br, jhb (mentor), kp
Differential Revision:	https://reviews.freebsd.org/D25183
2020-06-08 17:57:21 +00:00
Alex Richardson c98013c0b1 RISC-V: Check that the DTB doesn't overlap with kernel
This can happen with very large kernels (e.g. ones embedding a root
filesystem). The DTB written by OpenSBI/BBL is quite small so this is
unlikely to hit important data, but if it does this can result in very
confusing and hard-to-debug crashes. Add a KASSERT() and a verbose print
to catch this problem with debug kernels.

While this will not print any output by default if it fails (that would
depend on EARLY_PRINTF), at least the kernel now halts reliably instead
of randomly crashing.

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25153
2020-06-08 08:52:02 +00:00
Alex Richardson f7910a3df9 sys/riscv: Remove debug printfs
They are only visible with EARLY_PRINTF so don't show up by default.

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25152
2020-06-08 08:51:57 +00:00
Alex Richardson c714d79726 RISC-V: handle DTB aligned to less than 2MB
By default OpenSBI and BBL will pass the DTB at a 2MB-aligned address.
However, by default there are no 2MB aligned regions between the SBI and
the kernel, so we have to choose a 2MB aligned region after the kernel.
OpenSBI defaults to placing the DTB 32MB after the start of the kernel but
this is not sufficient for a kernel with a large MFS embedded.
We could increase this offset to a larger number (e.g. 64/128/256) but that
imposes restrictions on the minimum RAM size.
Another solution would be to place the DTB between OpenSBI and the kernel
at 1MB alignment, but current locore.S code assumes 2MB alignment.

With this change I can now boot on QEMU with an OpenSBI configured to
store the DTB at an offset of 1MB.

See also https://github.com/riscv/opensbi/issues/169

Reviewed By:	mhorne
Differential Revision: https://reviews.freebsd.org/D25151
2020-06-08 08:51:52 +00:00
Mitchell Horne cd9207569f Remove remnant of arm's ELF trampoline
The trampoline code used for loading gzipped a.out kernels on arm was
removed in r350436. A portion of this code allowed for DDB to find the
symbol tables when booting without loader(8), and some of this was
untouched in the removal. Remove it now.

Differential Revision:	https://reviews.freebsd.org/D24950
2020-05-31 14:43:04 +00:00
Mitchell Horne dde3b16bbc Add macros simplifying the fake preload setup
This is in preparation for booting via loader(8). Lift these macros from arm64
so we don't need to worry about the size when inserting new elements. This
could have been done in r359673, but I didn't think I would be returning to
this function so soon.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D24910
2020-05-28 14:56:11 +00:00
Ruslan Bukin d75038a0af Fix entering KDB with dtrace-enabled kernel.
Reviewed by:	markj, jhb
Differential Revision:	https://reviews.freebsd.org/D24018
2020-05-26 16:44:05 +00:00
Conrad Meyer 852c303b61 copystr(9): Move to deprecate (attempt #2)
This reapplies logical r360944 and r360946 (reverting r360955), with fixed
copystr() stand-in replacement macro.  Eventually the goal is to convert
consumers and kill the macro, but for a first step it helps if the macro is
correct.

Prior commit message:

Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy (with correction from brooks@ -- thanks).

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:		jhb (earlier version)
Discussed with:		brooks (thanks!)
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-25 16:40:48 +00:00
Jessica Clarke 0721214a60 riscv: Fix pmap_protect for superpages
When protecting a superpage, we would previously fall through to the
non-superpage case and read the contents of the superpage as PTEs,
potentially modifying them and trying to look up underlying VM pages that
don't exist if they happen to look like PTEs we would care about. This led
to nginx causing an unexpected page fault in pmap_protect that panic'ed the
kernel. Instead, if we see a superpage, we are done for this range and
should continue to the next.

Reviewed by:	markj, jhb (mentor)
Approved by:	markj, jhb (mentor)
Differential Revision:	https://reviews.freebsd.org/D24827
2020-05-13 17:20:51 +00:00
Conrad Meyer 051fc58cb3 Revert r360944 and r360946 until reported issues can be resolved
Reported by:	cy
2020-05-12 04:34:26 +00:00
Conrad Meyer 580744621f copystr(9): Move to deprecate [2/2]
Unlike the other copy*() functions, it does not serve to copy from one
address space to another or protect against potential faults.  It's just
an older incarnation of the now-more-common strlcpy().

Add a coccinelle script to tools/ which can be used to mechanically
convert existing instances where replacement with strlcpy is trivial.
In the two cases which matched, fuse_vfsops.c and union_vfsops.c, the
code was further refactored manually to simplify.

Replace the declaration of copystr() in systm.h with a small macro
wrapper around strlcpy.

Remove N redundant MI implementations of copystr.  For MIPS, this
entailed inlining the assembler copystr into the only consumer,
copyinstr, and making the latter a leaf function.

Reviewed by:	jhb
Differential Revision:	https://reviews.freebsd.org/D24672
2020-05-11 22:57:21 +00:00
Mitchell Horne 4d7e9134bb Use the HSM SBI extension to halt CPUs
Differential Revision:	https://reviews.freebsd.org/D24498
2020-05-01 21:59:47 +00:00
Mitchell Horne c74959537c Use the HSM SBI extension to start APs
The addition of the HSM SBI extension to OpenSBI introduces a new
breaking change: secondary harts will remain parked in the firmware,
until they are brought up explicitly via sbi_hsm_hart_start(). Add
the call to do this, sending the secondary harts to mpentry.

If the HSM extension is not present, secondary harts are assumed to be
released by the firmware, as is the case for OpenSBI =< v0.6 and BBL.

In the case that the HSM call fails we exclude the CPU, notify the
user, and allow the system to proceed with booting.

Reviewed by:	markj (older version)
Differential Revision:	https://reviews.freebsd.org/D24497
2020-05-01 21:58:19 +00:00
Mitchell Horne bfe918fa0e Add support for HSM SBI extension
The Hardware State Management (HSM) extension provides a set of SBI
calls that allow the supervisor software to start and stop hart
execution.

The HSM extension has been implemented in OpenSBI and is present in
the v0.7 release.

[1] https://github.com/riscv/riscv-sbi-doc/blob/master/riscv-sbi.adoc

Reviewed by:	br
Differential Revision:	https://reviews.freebsd.org/D24496
2020-05-01 21:55:51 +00:00
Mitchell Horne df62bf00a5 Make mpentry independent of _start
APs enter the kernel at the same point as the BSP, the _start routine.
They then jump to mpentry, but not before storing the kernel's physical
load address in the s9 register. Extract this calculation into its own
routine, so that APs can be instructed to enter directly from mpentry.

Differential Revision:	https://reviews.freebsd.org/D24495
2020-05-01 21:52:29 +00:00
John Baldwin 02343a67c2 Retire the GENERICSF kernel config.
Now that hw.machine_arch handles soft-float vs hard-float there is no
longer a reason for this config.

Submitted by:	mhorne (kern.mk hunk)
Reviewed by:	imp (earlier version), kp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24544
2020-04-27 21:51:22 +00:00
John Baldwin 61bbe53c2d Improve MACHINE_ARCH handling for hard vs soft-float on RISC-V.
For userland, MACHINE_ARCH reflects the current ABI via preprocessor
directives.  For the kernel, the hw.machine_arch sysctl uses the ELF
header flags of the current process to select the correct MACHINE_ARCH
value.

Reviewed by:	imp, kp
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D24543
2020-04-27 17:55:40 +00:00
Mitchell Horne 0d26dae5c3 RISC-V: provide the correct value for kernstart
pmap_bootstrap() expects the kernel's physical load address, but we have
been providing the start of physical memory. This had the nice effect of
protecting the memory used by the SBI runtime firmware, but now that we
have alternate means of achieving that, we should provide the correct
value. This will free up any memory between the SBI firmware and the
kernel for allocation.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D24156
2020-04-19 00:34:49 +00:00
Mitchell Horne f2566be5ce RISC-V: exclude reserved memory regions
The device tree may contain a "reserved-memory" node, whose purpose is
to communicate sections of physical memory that should not be used for
general allocations. Add the logic to parse and exclude these regions.

The particular motivation for this is protection of the SBI runtime
firmware. Currently, there is no mechanism through which the SBI
can communicate the details of its reserved memory region(s) to
a supervisor payload. There has been some discussion recently on how
this can be achieved [1], and it seems that the path going forward
will be to add an entry to the reserved-memory node.

This hasn't caused any issues for us yet, since we exclude all physical
memory below the kernel's load address from being allocated, and on all
currently supported platforms this covers the SBI firmware region. This
will change in another commit, so as a safety measure, ensure that the
lowest 2MB of memory is excluded if this region has not been reported.

[1] https://github.com/riscv/riscv-sbi-doc/pull/37

Reviewed by:	markj, nick (older version)
Differential Revision:	https://reviews.freebsd.org/D24155
2020-04-19 00:33:05 +00:00
Mitchell Horne 820a3f438d RISC-V: use physmem to manage physical memory
Replace our hand-rolled functions with the generic ones provided by
kern/subr_physmem.c. This greatly simplifies the initialization of
physical memory regions and kernel globals.

Tested by:	nick
Differential Revision:	https://reviews.freebsd.org/D24154
2020-04-19 00:18:16 +00:00
Jessica Clarke be4ed3d2cf riscv: Add semicolon missing from r359672
Somehow this got lost between build-testing and submitting to Phabricator.
2020-04-06 23:54:50 +00:00
Mitchell Horne 24891abdb2 RISC-V: copy the DTB to early KVA
The location of the device-tree blob is passed to the kernel by the
previous booting stage (i.e. BBL or OpenSBI). Currently, we leave it
untouched and mark the 1MB of memory holding it as unavailable.

Instead, do what is done by other fake_preload_metadata() routines and
copy to the DTB to KVA space. This is more in line with what loader(8)
will provide us in the future, and it allows us to reclaim the hole in
physical memory.

Reviewed by:	markj, kp (earlier version)
Differential Revision:	https://reviews.freebsd.org/D24152
2020-04-06 22:48:43 +00:00
Jessica Clarke 44c27d70a5 riscv: Make sure local hart's icache is synced in pmap_sync_icache
The only way to flush the local hart's icache is with a FENCE.I (or an
equivalent SBI call); a normal FENCE is insufficient and, for the
single-hart case, unnecessary.

Reviewed by:	jhb (mentor), markj
Approved by:	jhb (mentor), markj
Differential Revision:	https://reviews.freebsd.org/D24317
2020-04-06 22:31:30 +00:00
Jessica Clarke 1dc32a6d77 riscv: Fix pmap_fault_fixup for L3 pages
Summary:
The parentheses being in the wrong place means that, for L3 pages,
oldpte has all bits except PTE_V cleared, and so all the subsequent
checks against oldpte will fail, causing us to bail out and not retry
the faulting instruction after an SFENCE.VMA. This causes a WITNESS +
INVARIANTS kernel to fault on the "Chisel P3" (BOOM-based) DARPA SSITH
GFE SoC in pmap_init when writing to pv_table and, being a nofault
entry, subsequently panic with:

  panic: vm_fault_lookup: fault on nofault entry, addr: 0xffffffc004e00000

Reviewed by:	markj
Approved by:	markj
Differential Revision:	https://reviews.freebsd.org/D24315
2020-04-06 22:29:15 +00:00
Nick O'Brien 96d3575edc riscv/sifive: add FE310 Always-on driver
This driver supports SiFive's FE310 Always-on (AON) peripheral's
Real-time clock (RTC) and Watchdog timer (WDT). AON has other
functionality that this driver could support such as the power
management unit (PMU) but that functionality hasn't been implemented.

Reviewed by:	philip (mentor), kp (mentor)
Approved by:	philip (mentor)
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D24170
2020-04-02 00:33:15 +00:00
John Baldwin 59838c1a19 Retire procfs-based process debugging.
Modern debuggers and process tracers use ptrace() rather than procfs
for debugging.  ptrace() has a supserset of functionality available
via procfs and new debugging features are only added to ptrace().
While the two debugging services share some fields in struct proc,
they each use dedicated fields and separate code.  This results in
extra complexity to support a feature that hasn't been enabled in the
default install for several years.

PR:		244939 (exp-run)
Reviewed by:	kib, mjg (earlier version)
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D23837
2020-04-01 19:22:09 +00:00
Mark Johnston 8db2e8fd16 Remove the secondary_stacks array in arm64 and riscv kernels.
Instead, dynamically allocate a page for the boot stack of each AP when
starting them up, like we do on x86.  This shrinks the bss by
MAXCPU*KSTACK_PAGES pages, which corresponds to 4MB on arm64 and 256KB
on riscv.

Duplicate the logic used on x86 to free the bootstacks, by using a
sysinit to wait for each AP to switch to a thread before freeing its
stack.

While here, mark some static MD variables as such.

Reviewed by:	kib
MFC after:	1 month
Sponsored by:	Juniper Networks, Klara Inc.
Differential Revision:	https://reviews.freebsd.org/D24158
2020-03-24 18:43:23 +00:00
Mitchell Horne 9634349d9c Fix ordering of machine includes
Remove machine/asm.h since it is unused.
2020-03-22 17:59:36 +00:00
Brandon Bergren 3069380898 [PowerPC][Book-E] Fix missing load base in elf_cpu_parse_dynamic().
When I implemented MD DYNAMIC parsing, I was originally passing a
linker_file_t so that the MD code could relocate pointers.

However, it turns out this isn't even filled in until later, so it was
always 0.

Just pass the load base (ef->address) directly, as that's really the only
thing we were interested in in the first place.

This fixes a crash on RB800 where it was trying to write to an unmapped
address when updating the GOT.

Reviewed by:	jhibbits
Sponsored by:	Tag1 Consulting, Inc.
Differential Revision:	https://reviews.freebsd.org/D24105
2020-03-18 02:58:18 +00:00
Philip Paeps 8375d1173f fuspi: silence build warning, plug resource leak
This silences an "unused label" warning as well as fixes the attach fail
path that wasn't releasing resources.

Submitted by:   Nicholas O'Brien <nickisobrien_gmail.com>
Sponsored by:	Axiado
Differential Revision: https://reviews.freebsd.org/D24004
2020-03-09 04:09:36 +00:00
Brooks Davis dc30b290e1 riscv: Add a GENERIC-NODEBUG (copied from amd64)
Sponsored by:	DARPA
2020-02-27 20:26:37 +00:00
Warner Losh 6b72948d73 Better check for floating point type.
Use __riscv_flen instead of __riscv_float_abi_soft. While the latter works for
userland (and one could argue it's more correct), it fails for the kernel. We
compile the kernel with -mabi=lp64 (eg soft float abi) to avoid floating point
instructions in the kernel. We also compile the kernel -march=rv64imafdc for
hard float kernels (eg those with options FPE), but with -march=rv64imac for
softfloat kernels (eg those with FPE). Since we do this, in the kernel (as in
userland) __riscv_flen will be defined for 'riscv64' and not for 'riscv64sf'.

This also removes the -DMACHINE_ARCH hack now that it's no longer needed.

Longer term, we should return the ABI from the sysctl hw.machine_arch like on
amd64 for i386 binaries.

Suggested by: mhorne@
Differential Revision: https://reviews.freebsd.org/D23813
2020-02-27 15:34:30 +00:00
Pawel Biernacki 7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Warner Losh 990a56e866 Add a soft-float riscv kernel config
GENERICSF is just like GENERIC, only creates a soft-float kernel. Omit it from the
universe build for now.

Reviewed by: philip
Differential Revision: https://reviews.freebsd.org/D23812
2020-02-24 16:42:44 +00:00
Warner Losh 65289c9963 Only compile clear_fpu state code when we're building with options FPE.
Soft float kernels build without floating point, and will fail to build if we
try to include floating point code.

Obtained from: kp@
2020-02-24 16:41:29 +00:00
Kristof Provost 6ebb17dfa7 riscv: Set MACHINE_ARCH correctly
MACHINE_ARCH sets the hw.machine_arch sysctl in the kernel. In userspace
it sets MACHINE_ARCH in bmake, which bsd.cpu.mk uses to configure the
target ABI for ports.

For riscv64sf builds (i.e. soft-float) that needs to be riscv64sf, but
the sysctl didn't reflect that. It is static.

Set the define from the riscv makefile so that we correctly reflect our
actual build (i.e. riscv64 or riscv64sf), depending on what TARGET_ARCH
we were built with.

That still doesn't satisfy userspace builds (e.g. bmake), so check if
we're building with a software-floating point toolchain there. That
check doesn't work in the kernel, because it never uses floating point.

Reviewed by:	philip (previous version), mhorne
Sponsored by:	Axiado
Differential Revision:	https://reviews.freebsd.org/D23741
2020-02-22 13:23:27 +00:00
Mitchell Horne 9ce0b407c2 Implement vm.pmap.kernel_maps for RISC-V
This is taken from the arm64 version, with the following simplifications:

- Our current pmap implementation uses a 3-level paging scheme
- The "mode" field has been omitted since RISC-V PTEs don't encode
  typical mode attributes

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D23594
2020-02-12 14:06:02 +00:00
Mitchell Horne ccfb8acd70 RISC-V: un-ifdef vm.kvm_size and vm.kvm_free
Fix formatting and add CTLFLAG_MPSAFE.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D23522
2020-02-12 13:58:37 +00:00
Mateusz Guzik 3acb6572fc Store offset into zpcpu allocations in the per-cpu area.
This shorten zpcpu_get and allows more optimizations.

Reviewed by:	jeff
Differential Revision:	https://reviews.freebsd.org/D23570
2020-02-12 11:11:22 +00:00
John Baldwin 2243365e29 Use the context created in makectx() for stack traces.
Always use the kdb_thr_ctx() for db_trace_thread() as on other
architectures.  Initialize pcb_ra to be the sepc from the saved
trapframe rather than the saved ra to avoid skipping a frame.

Reviewed by:	mhorne, br
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23513
2020-02-06 18:04:45 +00:00
John Baldwin ecafbba5a6 Fix DDB to unwind across exception frames.
While here, add an extra line of information for exceptions and
interrupts and compress the per-frame line down to one line to match
other architectures.

Reviewed by:	mhorne, br
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23508
2020-02-06 18:02:38 +00:00
John Baldwin 4a9b01b262 Fix EXCP_MASK to include all relevant bits from scause.
While cause codes higher than 16 are reserved, the exception code
field of the register is defined to be all bits but the upper-most
bit.

Reviewed by:	mhorne
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23510
2020-02-05 20:34:22 +00:00
John Baldwin ac2b208d08 Use csr_read() to read sstatus instead of inline assembly.
While here, remove a local variable to avoid the CSR read in non-debug
kernels.

Reviewed by:	mhorne
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23511
2020-02-05 20:32:37 +00:00
John Baldwin b68892fe61 Remove stale workaround for the htif console.
In practice this discarded all characters entered at the DDB prompt.

Reviewed by:	br
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23509
2020-02-05 20:11:08 +00:00
John Baldwin 37bd6bb849 Read the breakpoint instruction to determine its length in BKPT_SKIP.
This fixes continuing from debug.kdb.enter=1 after enabling the use of
compressed instructions since the compiler can emit the two byte
c.ebreak instead of the 4 byte ebreak.

Reviewed by:	br
MFC after:	1 week
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23507
2020-02-05 20:06:35 +00:00
Mark Johnston c3d326fd44 Define MAXCPU consistently between the kernel and KLDs.
This reverts r177661.  The change is no longer very useful since
out-of-tree KLDs will be built to target SMP kernels anyway.  Moveover
it breaks the KBI in !SMP builds since cpuset_t's layout depends on the
value of MAXCPU, and several kernel interfaces, notably
smp_rendezvous_cpus(), take a cpuset_t as a parameter.

PR:		243711
Reviewed by:	jhb, kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23512
2020-02-05 19:08:21 +00:00
Mitchell Horne 9e483eadd4 prci: register tlclk as a fixed clock
The PRCI exports tlclk as a constant fixed divisor clock, defined as 1/2
of the coreclk frequency. In older FU540 device trees (such as the one
provided by SiFive), tlclk is represented as its own entity, and is
automatically registered as a fixed-divisor-clock. Unfortunately the
upstream FU540 device tree (that we have in our tree) represents tlclk
as an output of the PRCI block, and we must register it manually. At
worst, users of the old device tree will end up with an unreferenced
duplicate of tlclk.

This fixes device attachment for the SiFive UART on newer device trees,
since it references tlclk via the PRCI.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D23406
2020-02-01 17:13:52 +00:00
Mitchell Horne adec0ce785 prci: fix up compat
Add two additional compat strings that can be used to identify the PRCI.

With newer device trees the PRCI has two parents, hfclk and rtcclk, so
allow the driver to attach when more than one parent is found.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D23405
2020-02-01 17:12:15 +00:00
Mitchell Horne f6029f2bc9 prci: register the DDR and GEMGX PLLs
The PRCI module exports three PLLs. Currently only the coreclk/corepll
is registered, so add the logic to register the DDR (memory) and GEMGX
(ethernet) clocks as well. These clocks are unused at the moment.

Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D23404
2020-02-01 17:09:56 +00:00
John Baldwin fb97e58e5c Add stricter checks on user changes to SSTATUS.
Rather than trying to blacklist which bits userland can't write to via
sigreturn() or setcontext(), only permit changes to whitelisted bits.

- Permit arbitrary writes to bits in the user-writable USTATUS
  register that shadows SSTATUS.

- Ignore changes in write-only bits maintained by the CPU.

- Ignore the user-supplied value of the FS field used to track
  floating point state and instead set it to a value matching the
  actions taken by set_fpcontext().

Discussed with:	mhorne
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23338
2020-01-31 19:00:48 +00:00
John Baldwin cca117a17b Fix 64-bit value of SSTATUS_SD to use an unsigned long.
While here, fix MSTATUS_SD to match SSTATUS_SD.

Reviewed by:	mhorne
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23434
2020-01-31 17:49:15 +00:00
Mark Johnston 1c29da0279 Reimplement stack capture of running threads on i386 and amd64.
After r355784 the td_oncpu field is no longer synchronized by the thread
lock, so the stack capture interrupt cannot be delievered precisely.
Fix this using a loop which drops the thread lock and restarts if the
wrong thread was sampled from the stack capture interrupt handler.

Change the implementation to use a regular interrupt instead of an NMI.
Now that we drop the thread lock, there is no advantage to the latter.

Simplify the KPIs.  Remove stack_save_td_running() and add a return
value to stack_save_td().  On platforms that do not support stack
capture of running threads, stack_save_td() returns EOPNOTSUPP.  If the
target thread is running in user mode, stack_save_td() returns EBUSY.

Reviewed by:	kib
Reported by:	mjg, pho
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23355
2020-01-31 15:43:33 +00:00
John Baldwin 5a02cd314d Trim duplicate CSR swaps from user exceptions.
The stack pointer is swapped with the sscratch CSR just before the
jump to cpu_exception_handler_user where the first instruction swaps
it again.  The two swaps together are a no-op, but the csr swap
instructions can be expensive (e.g. on Bluespec RISC-V cores csr swap
instructions force a full pipeline stall).

Reported by:	jrtc27
Reviewed by:	br
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23394
2020-01-30 22:19:48 +00:00
John Baldwin 0317861660 Remove unused fields from struct pcb.
cpu_switch/throw() and savectx() do not save or restore any values in
these fields which mostly held non-callee-save registers.

makectx() copied these fields from kdb_frame, but they weren't used
except for PC_REGS using pcb_sepc.  Change PC_REGS to use
kdb_frame->tf_sepc directly instead.

Reviewed by:	br
MFC after:	2 weeks
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D23395
2020-01-30 19:15:27 +00:00
Mitchell Horne 741ba007c1 Fix definition of SSTATUS_SD
The SD bit is defined as the MSB of the sstatus register, meaning its
position will vary depending on the CSR's length. Previously, there were
two (unused) defines for this, for the 32 and 64-bit cases, but their
definitions were swapped.

Consolidate these into one define: SSTATUS_SD, and make the definition
dependent on the value of __riscv_xlen.

Reviewed by:	br
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D23402
2020-01-29 15:50:48 +00:00
Ruslan Bukin 7106b618d2 Include the PCI stack to the riscv GENERIC kernel.
It will be used by an upcoming PCI root complex driver.

Sponsored by:	DARPA, AFRL
2020-01-24 17:10:21 +00:00