Ahead of including inline in __always_inline, move __always_inline to
where inline goes.
Reviewed by: kib, olce
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45709
Ahead of including inline in __always_inline, move __always_inline to
where inline goes.
Reviewed by: kib, olce
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45708
Separate the function into an asserting part and an assigning part.
Consistently use __func__ in the assertions. Write the assigning code in
a declarative style.
The functional change is that we no longer validate flags in
non-INVARIANTS kernels. The assertion that checks flags has been there
for 17 years, so any callers passing invalid flags should have been
caught and fixed by now.
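The split described above can be sketched roughly as follows. This is
only an illustration: the flag names, struct demo_obj, and the
DEMO_INVARIANTS switch are hypothetical stand-ins, and assert() stands
in for the kernel's assertion macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flags, for illustration only. */
#define DEMO_F_SLEEPABLE 0x01
#define DEMO_F_RECURSE   0x02
#define DEMO_F_VALID     (DEMO_F_SLEEPABLE | DEMO_F_RECURSE)

struct demo_obj {
	int flags;
	bool sleepable;
	bool recurse;
};

/* Asserting part: checks nothing unless DEMO_INVARIANTS is defined. */
static void
demo_init_assert(int flags)
{
#ifdef DEMO_INVARIANTS
	assert((flags & ~DEMO_F_VALID) == 0);
#else
	(void)flags;
#endif
}

/* Assigning part: one declarative initializer, no validation logic. */
static void
demo_init(struct demo_obj *obj, int flags)
{
	demo_init_assert(flags);
	*obj = (struct demo_obj){
		.flags = flags,
		.sleepable = (flags & DEMO_F_SLEEPABLE) != 0,
		.recurse = (flags & DEMO_F_RECURSE) != 0,
	};
}
```

Built without DEMO_INVARIANTS, demo_init() accepts any flags value,
which mirrors the functional change noted above.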
SDT calls dtrace_probe() directly, and this can be used to pass up to
five probe arguments directly. To pass the sixth argument (SDT
currently doesn't support more than this), we use a hack: just add
additional parameters to the call and cast dtrace_probe accordingly.
This happens to work on amd64, but doesn't work in general.
Modify SDT to call dtrace_probe() after storing arguments beyond the
first five in thread-local storage. Implement sdt_getargval() to fetch
extra argument values this way. An alternative would be to use invop
handlers instead and make sdt_probe_func point to a breakpoint
instruction, so that one can extract arguments using the breakpoint
exception trapframe, but this makes the providers more expensive when
enabled and doesn't seem justified. This approach works well unless we
want to add more than one or two more parameters to SDT probes, which
seems unlikely at present.
In particular, this fixes fetching the last argument of most ip and tcp
probes on arm64.
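A minimal userland sketch of the thread-local approach, assuming C11
_Thread_local in place of per-thread kernel storage. All demo_* names
are hypothetical and are not the SDT or DTrace interfaces.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_DIRECT_ARGS 5	/* arguments passed directly */
#define DEMO_EXTRA_ARGS  1	/* arguments passed via thread-local storage */
#define DEMO_TOTAL_ARGS  (DEMO_DIRECT_ARGS + DEMO_EXTRA_ARGS)

/* Extra arguments are parked here before the probe function is called. */
static _Thread_local uintptr_t demo_extra_args[DEMO_EXTRA_ARGS];

/* Records what the consumer saw, so the sketch can be checked. */
static uintptr_t demo_last_seen[DEMO_TOTAL_ARGS];

/* Fetch argument argno: direct ones from the call, the rest from TLS. */
static uintptr_t
demo_getargval(const uintptr_t *direct, int argno)
{
	assert(argno >= 0 && argno < DEMO_TOTAL_ARGS);
	if (argno < DEMO_DIRECT_ARGS)
		return (direct[argno]);
	return (demo_extra_args[argno - DEMO_DIRECT_ARGS]);
}

/* The consumer only ever receives five arguments directly. */
static void
demo_probe(uintptr_t a0, uintptr_t a1, uintptr_t a2, uintptr_t a3,
    uintptr_t a4)
{
	const uintptr_t direct[DEMO_DIRECT_ARGS] = { a0, a1, a2, a3, a4 };

	for (int i = 0; i < DEMO_TOTAL_ARGS; i++)
		demo_last_seen[i] = demo_getargval(direct, i);
}

/* Six-argument probe site: stash the sixth, then make a normal call. */
static void
demo_probe6(uintptr_t a0, uintptr_t a1, uintptr_t a2, uintptr_t a3,
    uintptr_t a4, uintptr_t a5)
{
	demo_extra_args[0] = a5;
	demo_probe(a0, a1, a2, a3, a4);
}
```

No function pointer is cast to a different arity here, so the sketch
does not depend on any particular calling convention.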
Reported by: rwatson
Reviewed by: Domagoj Stolfa
MFC after: 1 month
Sponsored by: Innovate UK
Differential Revision: https://reviews.freebsd.org/D45648
The idea here is to avoid a memory access and conditional branch per
probe site. Instead, the probe is represented by an "unreachable"
unconditional function call. asm goto is used to store the address of
the probe site (represented by a no-op sled) and the address of the
function call into a tracepoint record. Each SDT probe carries a list
of tracepoints.
When the probe is enabled, the no-op sled corresponding to each
tracepoint is overwritten with a jmp to the corresponding label. The
implementation uses smp_rendezvous() to park all other CPUs while the
instruction is being overwritten, as this can't be done atomically in
general. The compiler moves argument marshalling code and the
sdt_probe() function call out-of-line, i.e., to the end of the function.
Per gallatin@ in D43504, this approach has less overhead when probes are
disabled. To make the implementation a bit simpler, I removed support
for probes with 7 arguments; nothing makes use of this except a
regression test case. It could be re-added later if need be.
The approach taken in this patch enables some more improvements:
1. We can now automatically fill out the "function" field of SDT probe
   names. The SDT macros let the programmer specify the function and
   module names, but this is really a bug and shouldn't have been
   allowed. The intent was to be able to have the same probe in
   multiple functions and to let the user restrict which probes actually
   get enabled by specifying a function name or glob.
2. We can avoid branching on SDT_PROBES_ENABLED() by adding the ability
   to include blocks of code in the out-of-line path. For example:
	if (SDT_PROBES_ENABLED()) {
		int reason = CLD_EXITED;
		if (WCOREDUMP(signo))
			reason = CLD_DUMPED;
		else if (WIFSIGNALED(signo))
			reason = CLD_KILLED;
		SDT_PROBE1(proc, , , exit, reason);
	}
   could be written
	SDT_PROBE1_EXT(proc, , , exit, reason,
		int reason;
		reason = CLD_EXITED;
		if (WCOREDUMP(signo))
			reason = CLD_DUMPED;
		else if (WIFSIGNALED(signo))
			reason = CLD_KILLED;
	);
In the future I would like to use this mechanism more generally, e.g.,
to remove branches and marshalling code used by hwpmc, and generally to
make it easier to add new tracepoint consumers without having to add
more conditional branches to hot code paths.
Reviewed by: Domagoj Stolfa, avg
MFC after: 2 months
Differential Revision: https://reviews.freebsd.org/D44483
This is derived from swills@'s fork of the Juniper virtfs, with many
changes by me including bug fixes, style improvements, clearer layering
and more consistent logging. The filesystem is renamed to p9fs to better
reflect its function and to prevent possible future confusion with
virtio-fs.
Several updates and fixes from Juniper have been integrated into this
version by Val Packett and these contributions along with the original
Juniper authors are credited below.
To use this with bhyve, add 'virtio_p9fs_load=YES' to loader.conf. The
bhyve virtio-9p device allows access from the guest to files on the host
by mapping a 'sharename' to a host path. It is possible to use p9fs as a
root filesystem by adding this to /boot/loader.conf:
	vfs.root.mountfrom="p9fs:sharename"
For non-root filesystems, add something like this to /etc/fstab:
	sharename /mnt p9fs rw 0 0
In both examples, substitute the share name used on the bhyve command
line.
The 9P filesystem protocol relies on stateful file opens which map
protocol-level FIDs to host file descriptors. The FreeBSD vnode
interface doesn't really support this and we use heuristics to guess the
right FID to use for file operations. This can be confused by privilege
lowering and does not guarantee that the FID created for a given file
open is always used for file operations, even if the calling process is
using the file descriptor from the original open call. Improving this
would involve changes to the vnode interface, which is out of scope for
this import.
Differential Revision: https://reviews.freebsd.org/D41844
Reviewed by: kib, emaste, dch
MFC after: 3 months
Co-authored-by: Val Packett <val@packett.cool>
Co-authored-by: Ka Ho Ng <kahon@juniper.net>
Co-authored-by: joyu <joyul@juniper.net>
Co-authored-by: Kumara Babu Narayanaswamy <bkumara@juniper.net>
This is a scheme to avoid taking the bufobj lock and doing a second
lookup in the case where in getblk we do an unlocked lookup and find no
buf. Was there really no buf, or were we in the middle of a reassignbuf
race? By tracking any use of reassignbuf with a flag, we can know if
there can't have been a race because there has been no reassignbuf.
Because this scheme is spoiled on the first use of reassignbuf, it is
mostly only beneficial for cases where a certain vnode is never expected
to use dirty bufs at all.
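The flag scheme can be sketched as below. This is a single-threaded
illustration under simplifying assumptions: struct demo_bufobj, its
fields, and the demo_* functions are hypothetical, the "tree" is
reduced to one key, and the memory-ordering details of a real
implementation are not modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

struct demo_bufobj {
	atomic_bool reassigned;	/* set on first reassign, never cleared */
	int locked_lookups;	/* counts trips through the locked path */
	int present_key;	/* the one key "in the tree", or -1 */
};

/* Any reassign spoils the optimization for this object from then on. */
static void
demo_reassign(struct demo_bufobj *bo)
{
	atomic_store(&bo->reassigned, true);
}

static bool
demo_lookup_unlocked(struct demo_bufobj *bo, int key)
{
	return (bo->present_key == key);
}

/* Stands in for taking the object lock and repeating the lookup. */
static bool
demo_lookup_locked(struct demo_bufobj *bo, int key)
{
	bo->locked_lookups++;
	return (bo->present_key == key);
}

static bool
demo_find(struct demo_bufobj *bo, int key)
{
	if (demo_lookup_unlocked(bo, key))
		return (true);
	/* No reassign ever happened: the miss cannot be a race. */
	if (!atomic_load(&bo->reassigned))
		return (false);
	/* A reassign may have raced with us: confirm under the lock. */
	return (demo_lookup_locked(bo, key));
}
```

Until demo_reassign() is called, a miss never reaches the locked path.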
Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45571
Have PCTRIE_RECLAIM_CALLBACK typecast one function pointer type to
another, to relieve the writer of the callback function from having
to cast its first argument from void * to the member type.
Reviewed by: rlibby
Differential Revision: https://reviews.freebsd.org/D45586
Replace the lookup-remove loop in rangeset_remove_all with a call
to SWAP_PCTRIE_RECLAIM_CALLBACK, to eliminate repeated trie searches.
Reviewed by: rlibby
Differential Revision: https://reviews.freebsd.org/D45584
PCTRIE_RECLAIM frees all the interior nodes in a pctrie, but is little
used because most trie-destroyers want to free the leaves of the tree
too. Add PCTRIE_RECLAIM_CALLBACK, which takes two extra arguments, a
callback function and an auxiliary argument; the callback is invoked on
every non-NULL leaf in the tree as the tree is destroyed.
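The idea of destroying a tree while handing each leaf to a callback can
be sketched as follows. This is not the pctrie code: struct demo_node,
the fixed fanout, and the recursive walk are assumptions made for
brevity.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_FANOUT 4

struct demo_node {
	int leaf_level;			/* nonzero: children are leaves */
	void *child[DEMO_FANOUT];	/* interior nodes or leaves */
};

typedef void demo_leaf_cb(void *leaf, void *arg);

/*
 * Free every interior node below and including 'node', calling cb on
 * each non-NULL leaf along the way.
 */
static void
demo_reclaim_callback(struct demo_node *node, demo_leaf_cb *cb, void *arg)
{
	assert(cb != NULL);
	if (node == NULL)
		return;
	for (int i = 0; i < DEMO_FANOUT; i++) {
		if (node->child[i] == NULL)
			continue;
		if (node->leaf_level)
			cb(node->child[i], arg);
		else
			demo_reclaim_callback(node->child[i], cb, arg);
	}
	free(node);
}

/* Example callback: count the leaf via the auxiliary argument, free it. */
static void
demo_count_and_free(void *leaf, void *arg)
{
	(*(int *)arg)++;
	free(leaf);
}
```

The caller makes a single pass over the tree instead of looking up and
removing each leaf separately.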
Reviewed by: rlibby, kib (previous version)
Differential Revision: https://reviews.freebsd.org/D45565
Add a method rangeset_next to find the first range that starts at or
after a given value. Use it to rewrite pmap_pkru_same and
pmap_bti_same to avoid walking a page at a time over pages in no
range.
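A "first range starting at or after a value" query can be sketched
with a sorted array and a binary search. The real rangeset is backed by
a pctrie; struct demo_range and demo_range_next() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_range {
	unsigned long start;	/* inclusive */
	unsigned long end;	/* exclusive */
};

/*
 * Return the first range whose start is >= va, or NULL if there is
 * none.  'r' must be sorted by start and non-overlapping.
 */
static const struct demo_range *
demo_range_next(const struct demo_range *r, size_t n, unsigned long va)
{
	size_t lo = 0, hi = n;

	assert(r != NULL || n == 0);
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (r[mid].start < va)
			lo = mid + 1;
		else
			hi = mid;
	}
	return (lo < n ? &r[lo] : NULL);
}
```

A caller comparing attributes over an address interval can jump from
one range straight to the next rather than probing every page in the
gap between them.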
Reviewed by: andrew, kib
Differential Revision: https://reviews.freebsd.org/D45511
Use the new pctrie combined insert/lookup facility to reduce work and
time under the bufobj interlock when associating a buf with a vnode.
We now do one lookup in the dirty tree and one combined lookup/insert in
the clean tree instead of one lookup in dirty, two in clean, and then an
insert in clean. We also avoid touching the possibly unrelated buf at
the tail of the queue.
Also correct an issue where the actual order of the tail queue depended
on the insertion order due to sign issues.
Reviewed by: kib (previous version), dougm, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45395
In several places in code, we do a pctrie lookup followed by a pctrie
insert. Provide a few flavors of combined lookup/insert. This may save
a portion of the work from walking a large pctrie twice.
The general idea is that while we walk the trie during insert, we also
do the same kind of tracking work that we do during pctrie_lookup_ge or
pctrie_lookup_le, and we pass out a pctrie node from where such a lookup
may continue.
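The combined walk can be illustrated on a plain binary search tree
rather than a pctrie. In this sketch the insert tracks, during its
single descent, the node holding the greatest key below the new one.
All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct demo_tnode {
	unsigned long key;
	struct demo_tnode *left, *right;
};

/*
 * Insert n into the tree at *rootp.  On success return 0 and store in
 * *pred_out the node with the greatest key less than n->key (NULL if
 * none), found during the same descent.  Return -1 and leave *pred_out
 * untouched if the key is already present.
 */
static int
demo_insert_lookup_lt(struct demo_tnode **rootp, struct demo_tnode *n,
    struct demo_tnode **pred_out)
{
	struct demo_tnode **link = rootp, *pred = NULL;

	assert(n != NULL && pred_out != NULL);
	while (*link != NULL) {
		if (n->key == (*link)->key)
			return (-1);
		if (n->key < (*link)->key) {
			link = &(*link)->left;
		} else {
			/* Last node where we turn right is the predecessor. */
			pred = *link;
			link = &(*link)->right;
		}
	}
	n->left = n->right = NULL;
	*link = n;
	*pred_out = pred;
	return (0);
}
```

A separate lookup followed by an insert would walk the same path
twice; here the neighbour comes out of the insert itself.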
Reviewed by: dougm (previous version), kib (previous version), markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45394
LLD has the -zbti-report=error argument to check if the BTI note is
present when linking. To allow for this to be used when linking the
kernel and modules:
- Add the BTI note to the remaining assembly files
- Mark ptrauth.c as protected by BTI
- Disable -zbti-report for vmm hypervisor switching code as it's not
used there.
The linux64 module doesn't build with the flag as it includes vdso code
that doesn't include the note.
Reviewed by: imp, kib, emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45466
The optional 'table' pointer is a legacy part of the interface, which
has been replaced by devmap_register_table()/devmap_add_entry(). The few
in-tree callers have already adapted to this, so it can be removed.
The 'l1pt' argument is already entirely unused within the function.
Reviewed by: andrew, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45319
As a convenience to callers, who might allocate the array on the stack.
An empty/zero-valued range indicates the end of the physmap entries.
Remove the now-redundant calls to bzero() at the call site.
Reviewed by: andrew
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45318
This function follows both m_nextpkt and m_next linkage freeing all mbufs.
Note that existing m_freem() follows only m_next.
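The difference between the two walks can be sketched with a simplified
stand-in for struct mbuf. Only the two linkage fields are modelled, the
demo_* functions are hypothetical, and the return counts exist purely
so the sketch can be checked.

```c
#include <stdlib.h>

struct demo_mbuf {
	struct demo_mbuf *m_next;	/* next buffer in this packet */
	struct demo_mbuf *m_nextpkt;	/* first buffer of the next packet */
};

/* Free one packet: follow m_next only.  Returns the number freed. */
static int
demo_freem(struct demo_mbuf *m)
{
	int n = 0;

	while (m != NULL) {
		struct demo_mbuf *next = m->m_next;

		free(m);
		m = next;
		n++;
	}
	return (n);
}

/* Free a list of packets: follow m_nextpkt, then m_next within each. */
static int
demo_freemp(struct demo_mbuf *m)
{
	int n = 0;

	while (m != NULL) {
		/* Save the link before the packet head is freed. */
		struct demo_mbuf *nextpkt = m->m_nextpkt;

		n += demo_freem(m);
		m = nextpkt;
	}
	return (n);
}
```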
Reviewed by: khng
Differential Revision: https://reviews.freebsd.org/D45477
Do some light cleanup to make the output format more consistent for
readability.
Reviewed by: kib
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45442
In three instances where fls(x)-1 is used, the compiler does not know
that x is nonzero and so adds needless zero checks. Using ilog(x)
instead saves, in each instance, about 4 instructions, including a
conditional, and 16 or so bytes, on an amd64 build.
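Where the zero check comes from can be sketched in userland with the
GCC/Clang __builtin_clz() builtin, whose result is undefined for zero.
demo_fls() and demo_ilog2() are hypothetical stand-ins, not the kernel
functions.

```c
#include <assert.h>
#include <limits.h>

#define DEMO_UINT_BITS ((int)(sizeof(unsigned int) * CHAR_BIT))

/* fls-style: defined for zero, so it must test for it. */
static int
demo_fls(unsigned int x)
{
	return (x == 0 ? 0 : DEMO_UINT_BITS - __builtin_clz(x));
}

/* ilog2-style: the caller promises x != 0, so no test is needed. */
static int
demo_ilog2(unsigned int x)
{
	assert(x != 0);
	return (DEMO_UINT_BITS - 1 - __builtin_clz(x));
}
```

For every nonzero x the two agree, demo_fls(x) - 1 == demo_ilog2(x),
so the substitution is safe wherever x is known to be nonzero.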
Reviewed by: alc
Differential Revision: https://reviews.freebsd.org/D45330
I cannot find a time where the function was not named this.
Reviewed by: kib, markj
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45383
This commit refactors the UMA small alloc code and
removes most UMA machine-dependent code.
The existing machine-dependent uma_small_alloc code is almost identical
across all architectures, except for powerpc where using the direct
map addresses involved extra steps in some cases.
The MI/MD split was replaced by a default uma_small_alloc
implementation that can be overridden by architecture-specific code by
defining the UMA_MD_SMALL_ALLOC symbol. Furthermore, UMA_USE_DMAP was
introduced to replace most UMA_MD_SMALL_ALLOC uses.
Reviewed by: markj, kib
Approved by: markj (mentor)
Differential Revision: https://reviews.freebsd.org/D45084
We previously defaulted to using sc(4) with a special case to prefer
vt(4) when booted via UEFI. As vt(4) is now always the default we can
simplify this.
Reviewed by: imp, kevans
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45356
In some uses of OSD, the value is initialized only once, at a point
after which subsequent reads of the value are safe without taking the
lock.
Reviewed by: markj
Obtained from: Juniper Networks, Inc.
Differential Revision: https://reviews.freebsd.org/D44631
The arguments are left completely unchanged by these functions. This
allows passing constant pointers for verifying ownership, but not
modifying the contents.
Reviewed by: imp,jhb
Pull Request: https://github.com/freebsd/freebsd-src/pull/1224
The flsl() function makes use of hardware functionality to compute the
value faster than this loop. The only deviation from flsl() is at 0.
Reviewed by: imp,jhb
Pull Request: https://github.com/freebsd/freebsd-src/pull/1224
rman_reserve_resource_bound() has never been used, although there are
several uses of RF_ALIGNMENT. Remove the unused part and leave the
portion that is actually used in place.
This partially reverts commit 13fb665772.
Reviewed by: imp,jhb
Pull Request: https://github.com/freebsd/freebsd-src/pull/1224
Rather than hard-code the function name, use __func__ instead. Apply
some style and adjust indentation as appropriate. Remove the no longer
required braces.
Reviewed by: imp,jhb
Pull Request: https://github.com/freebsd/freebsd-src/pull/1224
Using a variadic macro allows passing everything properly to printf().
Using the do { } while(0) construct ensures the macro acts like any
other single statement. This shows just how long some of this has
existed.
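Both idioms can be sketched with a hypothetical debug macro.
DEMO_DPRINTF, demo_debug, and demo_calls are illustrative names only.

```c
#include <stdio.h>

static int demo_debug = 1;	/* hypothetical debug knob */
static int demo_calls;		/* counts messages actually printed */

/*
 * Variadic: the format and all arguments are forwarded to printf().
 * do { } while (0): the expansion is a single statement, so it can be
 * followed by ';' and used as the body of an if/else.
 */
#define DEMO_DPRINTF(...) do {				\
	if (demo_debug) {				\
		printf(__VA_ARGS__);			\
		demo_calls++;				\
	}						\
} while (0)

static void
demo_report(int v)
{
	/* Without the do/while wrapper this if/else would not parse. */
	if (v < 0)
		DEMO_DPRINTF("negative value: %d\n", v);
	else
		DEMO_DPRINTF("value ok: %d\n", v);
}
```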
Reviewed by: imp,jhb
Pull Request: https://github.com/freebsd/freebsd-src/pull/1224
If we asked not to wait on a lock, and then we failed to get a buf lock
because we would have had to wait, then just return the error. This
avoids taking the bufobj lock and a second trip to lockmgr.
Reviewed by: mckusick, kib, markj
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D45245