Commit graph

58 commits

Author SHA1 Message Date
Bojan Novković da76d349b6 uma: Deduplicate uma_small_alloc
This commit refactors the UMA small alloc code and
removes most UMA machine-dependent code.
The existing machine-dependent uma_small_alloc code is almost identical
across all architectures, except for powerpc where using the direct
map addresses involved extra steps in some cases.

The MI/MD split was replaced by a default uma_small_alloc
implementation that can be overridden by architecture-specific code by
defining the UMA_MD_SMALL_ALLOC symbol. Furthermore, UMA_USE_DMAP was
introduced to replace most UMA_MD_SMALL_ALLOC uses.

Reviewed by: markj, kib
Approved by: markj (mentor)
Differential Revision:	https://reviews.freebsd.org/D45084
2024-05-25 19:24:46 +02:00
Doug Moore 10db91ecec vm_radix: add a missing paren
429c871ddd left parens unbalanced in a
powerpc case that my testing missed.  Restore balance.

Reported by:	jenkins
2023-09-12 04:19:51 -05:00
Doug Moore 429c871ddd radix_trie: have vm_radix use pctrie code
Implement everything currently in vm_radix.c with calls to functions
in subr_pctrie.c, asccessed via the interface provided by the
DEFINE_PCTRIE_SMR macro.

Add back some #includes removed in the first attempt, and avoid the
use of a discontinued type in a bit of conditionally compiled code.

Reviewed by:	alc, markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D41344
2023-09-12 02:42:38 -05:00
Doug Moore 6cec93da46 Revert "radix_trie: have vm_radix use pctrie code"
This reverts commit a494d30465.
2023-09-11 03:35:36 -05:00
Doug Moore a494d30465 radix_trie: have vm_radix use pctrie code
Implement everything currently in vm_radix.c with calls to functions
in subr_pctrie.c, asccessed via the interface provided by the
DEFINE_PCTRIE_SMR macro.

Reviewed by:	alc, markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D41344
2023-09-11 01:53:40 -05:00
Warner Losh 685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Doug Moore ac0572e660 radix_tree: compute slot from keybarr
The computation of keybarr(), the function that determines when a
search has failed at a non-leaf node, can be done in a way that
computes the 'slot' value when keybarr() fails, which is exactly when
slot() would next be invoked. Computing things this way saves space in
search loops.

This reduces the amd64 coding of the search loop in vm_radix_lookup
from 40 bytes to 28 bytes.

Reviewed by:	alc
Tested by:	pho (as part of a larger change)
Differential Revision:	https://reviews.freebsd.org/D41235
2023-07-30 15:12:06 -05:00
Doug Moore 38f5cb1bfb radix_tree: redefine the clev field
The clev field in the node struct is almost always multiplied by
WIDTH; occasionally, it is incremented and then multiplied by
WIDTH. Instructions can be saved by storing it always multiplied by
WIDTH.

For the computation of slot(), this just eliminates a
multiplication. For trimkey(), where the caller always adds one to
clev before passing it as an argument, this change has the caller, not
the caller, do that. Trimkey() handles it not by adding WIDTH to the
input parameter, but by shifting COUNT, and not 1. That produces the
same result, and it relieves keybarr of the need to test to avoid
shifting by more than 63 bits, since level is always <= 63.

This takes 3 instrutions and 14 bytes out of the basic lookup loop on
amd64.

Reviewed by:	kib
Tested by:	pho (as part of a larger change)
Differential Revision:	https://reviews.freebsd.org/D41226
2023-07-30 01:20:07 -05:00
Doug Moore 2d2bcba7ba Every path in a radix trie ends with a leaf or a NULL. By replacing
NULL (non-leaf) pointers with NULL leaves, there is a NULL test
removed from every iteration of an index-based search loop.

This speeds up radix trie searches by few percent. If there are any
radix tries that are not initialized with the init() function, but
instead depend on zeroing everything being proper initialization, this
will break those tries.

Reviewed by:	alc, kib
Tested by:	pho (as part of a larger change)
Differential Revision:	https://reviews.freebsd.org/D41171
2023-07-28 11:39:52 -05:00
Doug Moore 6f251ef228 radix_trie: simplify ge, le lookups
Replace the implementations of lookup_le and lookup_ge with ones
that do not use a stack or climb back up the tree, and instead
exploit the popmap field to quickly identify the place to resume
searching if the straightforward indexed search fails.

The code size of the original functions shrinks by a combined 160
bytes on amd64, and the cumulative cycle count per invocation of
the two functions together is reduced 20% in a buildworld test.

Reviewed by:	alc, markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D40936
2023-07-19 09:43:31 -05:00
Doug Moore 16e01c05c0 radix_trie: avoid code duplication in insert
Two cases in the insert routine are written differently, when
they're really doing the same thing. Writing that case only once
saves 208 bytes in the compiled vm_radix_insert code and reduces
instructions executed by about 2%.
Reviewed by:	alc
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D40807
2023-07-09 15:06:02 -05:00
Doug Moore 8df38859d0 radix_trie: replace node count with popmap
Replace the 'count' field in a trie node with a bitmap that
identifies non-NULL children. Drop the 'last' field, and use the
last bit set in the bitmap instead.  In lookup_le, lookup_ge,
remove, and reclaim_all, use the bitmap to find the
previous/next/only/every non-null child in constant time by
examining the bitmask instead of looping across array elements
and null-checking them one-by-one.

A buildworld test suggests that this reduces the cycle count on
those functions that eliminate some null-checks by 4.9%, 1.5%,
0.0% and 13.3%.
Reviewed by:	alc
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D40775
2023-07-07 11:09:36 -05:00
Doug Moore da72505f9c radix_trie: pass fewer params to node_get
Let node_get calculate it's own owner value. Don't pass the count
parameter, since it's always 2. Save 16 bytes in insert(). Move,
without modifying, slot and trimkey to handle use-before-declaration
problem.
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D40723
2023-06-27 12:21:11 -05:00
Doug Moore 9cfed089ac radix_trie: clean up overlong lines
This is purely a cosmetic change. vm_radix.c has lines that reach past
column 80 and this change cleans that up. The associated changes to
subr_pctrie.c are just to keep mirroring vm_radix.c.
Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D40764
2023-06-27 12:01:33 -05:00
Doug Moore 72c3a43b16 radix_trie: skip compare in lookup_le, lookup_ge
In _lookup_ge, where a loop "looks for an available edge or val within
the current bisection node" (to quote the code comment), the value of
index has already been modified to guarantee that it is the least
value than can be found in the non-NULL child node being
examined. Therefore, if the non-NULL child is a leaf, there's no need
to compare 'index' to anything, and the value can just be returned.

The same is true for _lookup_le with 'most' replacing 'least'.
Reviewed by:	alc
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D40746
2023-06-27 00:42:41 -05:00
Doug Moore a42d8fe001 radix_trie: simplify trimkey functions
Replacing a branch and two shifts with a single masking operation saves 64 bytes the pair of functions lookup_le and lookup_ge on amd64.  Refresh the associated comments.
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D40722
2023-06-25 12:49:15 -05:00
Doug Moore e8efee297c radix_trie: avoid reloading radix node
In the vm_radix:remove loop that searches for the last child, load
that child once, without loading it again after the search is over.
Change KASSERTS from index check to NULL node check.
Reviewed by:	alc
Differential Revision:	https://reviews.freebsd.org/D40721
2023-06-23 18:47:23 -05:00
Doug Moore 1efa7dbc07 vm_radix: drop unused function; use bool.
Replace boolean_t with bool in vm_radix.c. Drop the unused function
vm_radix_is_singleton, which is unused and has no corresponding
function in subr_pctrie.c.
Reviewed by:	alc
Differential Revision:	<https://reviews.freebsd.org/D40586>
2023-06-20 23:52:27 -05:00
Doug Moore 05963ea4d1 radix_trie: eliminate iteration in keydiff
Use flsll(), instead of a loop, to find where two keys differ, and
then arithmetic to transform that to a trie level.
Approved by:	alc, markj
Differential Revision:	https://reviews.freebsd.org/D40585
2023-06-20 11:30:29 -05:00
Warner Losh 4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Mateusz Guzik c3aa3bf97c vm: clean up empty lines in .c and .h files 2020-09-01 21:20:45 +00:00
Kyle Evans c79cee7136 kernel: provide panicky version of __unreachable
__builtin_unreachable doesn't raise any compile-time warnings/errors on its
own, so problems with its usage can't be easily detected. While it would be
nice for this situation to change and compilers to at least add a warning
for trivial cases where local state means the instruction can't be reached,
this isn't the case at the moment and likely will not happen.

This commit adds an __assert_unreachable, whose intent is incredibly clear:
it asserts that this instruction is unreachable. On INVARIANTS builds, it's
a panic(), and on non-INVARIANTS it expands to  __unreachable().

Existing users of __unreachable() are converted to __assert_unreachable,
to improve debuggability if this assumption is violated.

Reviewed by:	mjg
Differential Revision:	https://reviews.freebsd.org/D23793
2020-05-13 18:07:37 +00:00
Mark Johnston 3fba886874 Move SMR pointer type definition and access macros to smr_types.h.
The intent is to provide a header that can be included by other headers
without introducing too much pollution.  smr.h depends on various
headers and will likely grow over time, but is less likely to be
required by system headers.

Rename SMR_TYPE_DECLARE() to SMR_POINTER():
- One might use SMR to protect more than just pointers; it
  could be used for resizeable arrays, for example, so TYPE seems too
  generic.
- It is useful to be able to define anonymous SMR-protected pointer
  types and the _DECLARE suffix makes that look wrong.

Reviewed by:	jeff, mjg, rlibby
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D23988
2020-03-07 00:55:46 +00:00
Kyle Evans cef81f8f01 vm_radix: prefer __builtin_unreachable() to an unreachable panic()
This provides the needed hint to GCC and offers an annotation for readers to
observe that it's in-fact impossible to hit this point. We'll get hit with a
a -Wswitch error if the enum applicable to the switch above were to get
expanded without the new value(s) being handled.
2020-02-22 16:20:04 +00:00
Jeff Roberson 4b3dac72b3 Silence a gcc warning about no return from a function that handles every
possible enum in a switch statement.  I verified that this emits nothing
as expected on clang.  radix relies on constant propagation to eliminate
any branching from these access routines.

Reported by:	lwhsu/tinderbox
2020-02-19 22:34:22 +00:00
Jeff Roberson 1ddda2eb24 Use SMR to provide a safe unlocked lookup for vm_radix.
The tree is kept correct for readers with store barriers and careful
ordering.  The existing object lock serializes writers.  Consumers
will be introduced in later commits.

Reviewed by:	markj, kib
Differential Revision:	https://reviews.freebsd.org/D23446
2020-02-19 19:58:31 +00:00
Mateusz Guzik a3d799fbb5 vm: stop passing M_ZERO when allocating radix nodes
Allocation explicitely initialized the 3 leading fields. The rest is an
array which is supposed to be NULL-ed prior to deallocation.

Delegate zeroing to the infrequently called object initializator.

This gets rid of one of the most common memset consumers.

Reviewed by:	markj
Differential Revision:	https://reviews.freebsd.org/D15989
2018-06-24 13:08:05 +00:00
Gleb Smirnoff ae941b1b4e Fix boot_pages calculation for machines that don't have UMA_MD_SMALL_ALLOC.
o Call uma_startup1() after initializing kmem, vmem and domains.
o Include 8 eight VM startup pages into uma_startup_count() calculation.
o Account for vmem_startup() and vm_map_startup() preallocating pages.
o Account for extra two allocations done by kmem_init() and vmem_create().
o Hardcode the place of execution of vm_radix_reserve_kva(). Using SYSINIT
  allowed several other SYSINITs to sneak in before it, thus bumping
  requirement for amount of boot pages.
2018-02-06 22:06:59 +00:00
Pedro F. Giffuni fe267a5590 sys: general adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 2-Clause license, however the tool I
was using misidentified many licenses so this was mostly a manual - error
prone - task.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

No functional change intended.
2017-11-27 15:23:17 +00:00
Jeff Roberson 8d6fbbb867 Replace manyinstances of VM_WAIT with blocking page allocation flags
similar to the kernel memory allocator.

This simplifies NUMA allocation because the domain will be known at wait
time and races between failure and sleeping are eliminated.  This also
reduces boilerplate code and simplifies callers.

A wait primitive is supplied for uma zones for similar reasons.  This
eliminates some non-specific VM_WAIT calls in favor of more explicit
sleeps that may be satisfied without new pages.

Reviewed by:	alc, kib, markj
Tested by:	pho
Sponsored by:	Netflix, Dell/EMC Isilon
2017-11-08 02:39:37 +00:00
Konstantin Belousov cd1241fbd0 Add pctrie_init() and vm_radix_init() to initialize generic pctrie and
vm_radix trie.

Existing vm_radix_init() function is renamed to vm_radix_zinit().
Inlines moved out of the _ headers.

Reviewed by:	alc, markj (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D11661
2017-07-19 20:52:47 +00:00
Alan Cox e94965d82e Previously, vm_radix_remove() would panic if the radix trie didn't
contain a vm_page_t at the specified index.  However, with this
change, vm_radix_remove() no longer panics.  Instead, it returns NULL
if there is no vm_page_t at the specified index.  Otherwise, it
returns the vm_page_t.  The motivation for this change is that it
simplifies the use of radix tries in the amd64, arm64, and i386 pmap
implementations.  Instead of performing a lookup before every remove,
the pmap can simply perform the remove.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D8708
2016-12-08 04:29:29 +00:00
Alan Cox 8804a2b030 Eliminate a stale comment; vm_radix_prealloc() was replaced in r254141.
MFC after:	3 days
2016-12-02 16:29:30 +00:00
Alan Cox 563a19d546 During vm_page_cache()'s call to vm_radix_insert(), if vm_page_alloc() was
called to allocate a new page of radix trie nodes, there could be a call to
vm_radix_remove() on the same trie (of PG_CACHED pages) as the in-progress
vm_radix_insert().  With the removal of PG_CACHED pages, we can simplify
vm_radix_insert() and vm_radix_remove() by removing the flags on the root of
the trie that were used to detect this case and the code for restarting
vm_radix_insert() when it happened.

Reviewed by:	kib, markj
Tested by:	pho
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D8664
2016-12-01 17:26:37 +00:00
Pedro F. Giffuni b66bb393f2 Cleanup redundant parenthesis from existing howmany()/roundup() macro uses. 2016-04-22 16:57:42 +00:00
Hans Petter Selasky af3b2549c4 Pull in r267961 and r267973 again. Fix for issues reported will follow. 2014-06-28 03:56:17 +00:00
Glen Barber 37a107a407 Revert r267961, r267973:
These changes prevent sysctl(8) from returning proper output,
such as:

 1) no output from sysctl(8)
 2) erroneously returning ENOMEM with tools like truss(1)
    or uname(1)
 truss: can not get etype: Cannot allocate memory
2014-06-27 22:05:21 +00:00
Hans Petter Selasky 3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag to automatically check if
there is an environment variable which shall initialize the SYSCTL
during early boot. This works for all SYSCTL types both statically and
dynamically created ones, except for the SYSCTL NODE type and SYSCTLs
which belong to VNETs. A new flag, CTLFLAG_NOFETCH, has been added to
be used in the case a tunable sysctl has a custom initialisation
function allowing the sysctl to still be marked as a tunable. The
kernel SYSCTL API is mostly the same, with a few exceptions for some
special operations like iterating childrens of a static/extern SYSCTL
node. This operation should probably be made into a factored out
common macro, hence some device drivers use this. The reason for
changing the SYSCTL API was the need for a SYSCTL parent OID pointer
and not only the SYSCTL parent OID list pointer in order to quickly
generate the sysctl path. The motivation behind this patch is to avoid
parameter loading cludges inside the OFED driver subsystem. Instead of
adding special code to the OFED driver subsystem to post-load tunables
into dynamically created sysctls, we generalize this in the kernel.

Other changes:
- Corrected a possibly incorrect sysctl name from "hw.cbb.intr_mask"
to "hw.pcic.intr_mask".
- Removed redundant TUNABLE statements throughout the kernel.
- Some minor code rewrites in connection to removing not needed
TUNABLE statements.
- Added a missing SYSCTL_DECL().
- Wrapped two very long lines.
- Avoid malloc()/free() inside sysctl string handling, in case it is
called to initialize a sysctl from a tunable, hence malloc()/free() is
not ready when sysctls from the sysctl dataset are registered.
- Bumped FreeBSD version to indicate SYSCTL API change.

MFC after:	2 weeks
Sponsored by:	Mellanox Technologies
2014-06-27 16:33:43 +00:00
Bryan Drewery 44f1c91610 Rename global cnt to vm_cnt to avoid shadowing.
To reduce the diff struct pcu.cnt field was not renamed, so
PCPU_OP(cnt.field) is still used. pc_cnt and pcpu are also used in
kvm(3) and vmstat(8). The goal was to not affect externally used KPI.

Bump __FreeBSD_version_ in case some out-of-tree module/code relies on the
the global cnt variable.

Exp-run revealed no ports using it directly.

No objection from:	arch@
Sponsored by:	EMC / Isilon Storage Division
2014-03-22 10:26:09 +00:00
Alan Cox 703b304f33 Eliminate a redundant parameter to vm_radix_replace().
Improve the wording of the comment describing vm_radix_replace().

Reviewed by:	attilio
MFC after:	6 weeks
Sponsored by:	EMC / Isilon Storage Division
2013-12-08 20:07:02 +00:00
Alan Cox 776cad90ff Addendum to r254141: The call to vm_radix_insert() in vm_page_cache() can
reclaim the last preexisting cached page in the object, resulting in a call
to vdrop().  Detect this scenario so that the vnode's hold count is
correctly maintained.  Otherwise, we panic.

Reported by:	scottl
Tested by:	pho
Discussed with:	attilio, jeff, kib
2013-08-23 17:27:12 +00:00
Attilio Rao e946b94934 On all the architectures, avoid to preallocate the physical memory
for nodes used in vm_radix.
On architectures supporting direct mapping, also avoid to pre-allocate
the KVA for such nodes.

In order to do so make the operations derived from vm_radix_insert()
to fail and handle all the deriving failure of those.

vm_radix-wise introduce a new function called vm_radix_replace(),
which can replace a leaf node, already present, with a new one,
and take into account the possibility, during vm_radix_insert()
allocation, that the operations on the radix trie can recurse.
This means that if operations in vm_radix_insert() recursed
vm_radix_insert() will start from scratch again.

Sponsored by:	EMC / Isilon storage division
Reviewed by:	alc (older version)
Reviewed by:	jeff
Tested by:	pho, scottl
2013-08-09 11:28:55 +00:00
Alan Cox 9f2e600890 To reduce the amount of arithmetic performed in the various radix tree
functions, reverse the numbering scheme for the levels.  The highest
numbered level in the tree now appears near the root instead of the leaves.

Sponsored by:	EMC / Isilon Storage Division
2013-05-11 18:01:41 +00:00
Alan Cox bb0e1de4ab Remove a redundant call to panic() from vm_radix_keydiff(). The assertion
before the loop accomplishes the same thing.

Sponsored by:	EMC / Isilon Storage Division
2013-05-07 18:45:34 +00:00
Alan Cox 2d4b9a6438 Optimize vm_radix_lookup_ge() and vm_radix_lookup_le(). Specifically,
change the way that these functions ascend the tree when the search for a
matching leaf fails at an interior node.  Rather than returning to the root
of the tree and repeating the lookup with an updated key, maintain a stack
of interior nodes that were visited during the descent and use that stack
to resume the lookup at the closest ancestor that might have a matching
descendant.

Sponsored by:	EMC / Isilon Storage Division
Reviewed by:	attilio
Tested by:	pho
2013-05-04 22:50:15 +00:00
Alan Cox 82af926a57 Eliminate an unneeded call to vm_radix_trimkey() from vm_radix_lookup_le().
This call is clearing bits from the key that will be set again by the next
line.

Sponsored by:	EMC / Isilon Storage Division
2013-04-28 08:29:00 +00:00
Alan Cox 40076ebc5c Avoid some lookup restarts in vm_radix_lookup_{ge,le}().
Sponsored by:	EMC / Isilon Storage Division
2013-04-27 16:44:59 +00:00
Alan Cox 384875a3a6 Simplify vm_radix_{add,dec}lev().
Sponsored by:	EMC / Isilon Storage Division
2013-04-22 01:26:13 +00:00
Alan Cox 880659fe81 When calculating the number of reserved nodes, discount the pages that will
be used to store the nodes.

Sponsored by:	EMC / Isilon Storage Division
2013-04-18 05:34:33 +00:00
Alan Cox a08f2cf69e Although we perform path compression to reduce the height of the trie and
the number of interior nodes, we have previously created a level zero
interior node at the root of every non-empty trie, even when that node is
not strictly necessary, i.e., it has only one child.  This change is the
second (and final) step in eliminating those unnecessary level zero interior
nodes.  Specifically, it updates the deletion and insertion functions so
that they do not require a level zero interior node at the root of the trie.
For a "buildworld" workload, this change results in a 16.8% reduction in the
number of interior nodes allocated and a similar reduction in the average
execution time for lookup functions.  For example, the average execution
time for a call to vm_radix_lookup_ge() is reduced by 22.9%.

Reviewed by:	attilio, jeff (an earlier version)
Sponsored by:	EMC / Isilon Storage Division
2013-04-15 06:12:00 +00:00