Commit graph

4683 commits

Author SHA1 Message Date
Mark Johnston fff19e0ed2 vm_object: Remove redundant OBJ_SWAP checks
With the removal of OBJT_DEFAULT, OBJ_ANON implies OBJ_SWAP.

Note, this means that vm_object_split() is more expensive than it used
to be, as it holds busy locks until the end of the range is reached,
even if the object has no swap blocks allocated.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35789
2022-07-17 07:09:48 -04:00
Mark Johnston 0cb2610ee2 vm: Remove handling for OBJT_DEFAULT objects
Now that OBJT_DEFAULT objects can't be instantiated, we can simplify
checks of the form object->type == OBJT_DEFAULT || (object->flags &
OBJ_SWAP) != 0.  No functional change intended.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35788
2022-07-17 07:09:48 -04:00
Mark Johnston fffc1c594a vm_object: Release object swap charge in the swap pager destructor
With the removal of OBJT_DEFAULT, we can simply handle this in
swap_pager_dealloc().  No functional change intended.

Suggested by:	alc
Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35787
2022-07-17 07:09:48 -04:00
Mark Johnston cb6757c0a6 swap_pager: Remove handling for objects with OBJ_SWAP clear
With the removal of OBJT_DEFAULT, we can assume that pager operations
provide an object with OBJ_SWAP set.  Also, we do not need to convert
objects from type OBJT_DEFAULT.  Thus, remove checks for OBJ_SWAP and
remove code which modifies the object type.  In some places, replace the
check for OBJ_SWAP with a check for whether any swap blocks are
assigned.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35786
2022-07-17 07:09:48 -04:00
Mark Johnston 5d32157d4e vm_object: Modify vm_object_allocate_anon() to return OBJT_SWAP objects
With this change, OBJT_DEFAULT objects are no longer allocated.
Instead, anonymous objects are always of type OBJT_SWAP and always have
OBJ_SWAP set.

Modify the page fault handler to check the swap block radix tree in
places where it checked for objects of type OBJT_DEFAULT.  In
particular, there's no need to invoke getpages for an OBJT_SWAP object
with no swap blocks assigned.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35785
2022-07-17 07:09:48 -04:00
Mark Johnston eee9aab9cb vm_mmap: Remove obsolete code and comments from vm_mmap()
In preparation for removing OBJT_DEFAULT, eliminate some stale/unhelpful
comments from vm_mmap(), and remove an unused case.  In particular, the
remaining callers of vm_mmap() in the tree do not specify OBJT_DEFAULT.

It's much more common to use vm_map_find() to map an object into user
memory, so rather than adjusting vm_mmap() to handle OBJT_SWAP objects,
let's further discourage its use and simply remove OBJT_DEFAULT
handling.

Reviewed by:	dougm, alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35778
2022-07-13 09:39:26 -04:00
Mark Johnston 31508912d8 uma: Apply a missed piece of review feedback from D35738
Fixes:	93cd28ea82 ("uma: Use a taskqueue to execute uma_timeout()")
2022-07-13 09:30:00 -04:00
Mark Johnston 70b2996120 vm_map: Simplify a call to vm_object_allocate_anon()
vm_object_allocate_anon() automatically sets "charge" to 0 if no cred
reference is provided, so the caller doesn't need any conditional logic.

No functional change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35781
2022-07-12 09:10:15 -04:00
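The simplification in 70b2996120 can be illustrated with a small userspace model. This is only a sketch: `struct cred`, `struct anon_object`, `allocate_anon_model()` and both caller functions are hypothetical stand-ins, not the kernel's vm_object_allocate_anon() or its real signature. The only behavior assumed is the one stated in the commit message, namely that the charge becomes 0 when no cred reference is provided.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the credential and the anonymous object. */
struct cred {
	int uid;
};

struct anon_object {
	struct cred *cred;
	size_t charge;
};

/* Model of the allocator behavior described above: no cred, no charge. */
static struct anon_object
allocate_anon_model(struct cred *cred, size_t charge)
{
	struct anon_object obj;

	obj.cred = cred;
	obj.charge = (cred != NULL) ? charge : 0;
	return (obj);
}

/* Caller with conditional logic that duplicates the allocator's check. */
static struct anon_object
caller_before(struct cred *cred, size_t size)
{
	return (allocate_anon_model(cred, cred != NULL ? size : 0));
}

/* Simplified caller: pass the size and let the allocator decide. */
static struct anon_object
caller_after(struct cred *cred, size_t size)
{
	return (allocate_anon_model(cred, size));
}
```

In this model both callers produce the same charge with and without a cred, which is the sense in which no functional change is intended.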
Mark Johnston e1979b45b6 vm_object: Assert that overcommit charge is released in the object dtor
Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35780
2022-07-12 09:10:15 -04:00
Mark Johnston 93cd28ea82 uma: Use a taskqueue to execute uma_timeout()
uma_timeout() has several responsibilities; it visits every UMA zone and
as of recently will drain underutilized caches, so is rather expensive
(>1ms in some cases).  Currently it is executed by softclock threads
and so will preempt most other CPU activity.  None of this work requires
a high scheduling priority, though, so defer it to a taskqueue so as to
avoid stalling higher-priority work.

Reviewed by:	rlibby, alc, mav, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35738
2022-07-11 15:58:43 -04:00
Mark Johnston b57be759d0 vm_fault: Fix some nits in vm_fault_copy_entry()
- Correct the description (vm_fault_copy_entry() does not create a
  shadow object).
- Move some initialization and assertions out of the scope of the object
  locks, when doing so makes sense.
- Merge a pair of conditional blocks.
- Use __unused when appropriate.

No functional change intended.

Reviewed by:	alc
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-07-11 15:58:42 -04:00
Mark Johnston e123264e4d vm: Fix racy checks for swap objects
Commit 4b8365d752 introduced the ability to dynamically register
VM object types, for use by tmpfs, which creates swap-backed objects.
As a part of this, checks for such objects changed from

  object->type == OBJT_DEFAULT || object->type == OBJT_SWAP

to

  object->type == OBJT_DEFAULT || (object->flags & OBJ_SWAP) != 0

In particular, objects of type OBJT_DEFAULT do not have OBJ_SWAP set;
the swap pager sets this flag when converting from OBJT_DEFAULT to
OBJT_SWAP.

A few of these checks are done without the object lock held.  It turns
out that this can result in false negatives since the swap pager
converts objects like so:

  object->type = OBJT_SWAP;
  object->flags |= OBJ_SWAP;

Fix the problem by adding explicit tests for OBJT_SWAP objects in
unlocked checks.

PR:		258932
Fixes:		4b8365d752 ("Add OBJT_SWAP_TMPFS pager")
Reported by:	bdrewery
Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35470
2022-06-20 12:48:14 -04:00
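The window described in e123264e4d can be modelled without threads by looking at the state an unlocked reader may observe between the two stores. The following is a minimal sketch: the enum, the flag value, `struct object_snapshot` and the function names are all hypothetical and do not match the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins; the values are not the kernel's. */
enum objtype { OBJT_DEFAULT, OBJT_SWAP, OBJT_VNODE };
#define OBJ_SWAP 0x0200u

struct object_snapshot {
	enum objtype type;
	unsigned int flags;
};

/* The check as quoted in the commit message; safe under the object lock. */
static bool
swap_check_locked_form(const struct object_snapshot *o)
{
	return (o->type == OBJT_DEFAULT || (o->flags & OBJ_SWAP) != 0);
}

/* Unlocked form with an explicit OBJT_SWAP test, as the fix describes. */
static bool
swap_check_unlocked_form(const struct object_snapshot *o)
{
	return (o->type == OBJT_DEFAULT || o->type == OBJT_SWAP ||
	    (o->flags & OBJ_SWAP) != 0);
}

/* State visible after "type = OBJT_SWAP" but before "flags |= OBJ_SWAP". */
static struct object_snapshot
mid_conversion_snapshot(void)
{
	struct object_snapshot o = { OBJT_SWAP, 0 };

	return (o);
}
```

For the mid-conversion snapshot the first predicate returns false (the false negative) while the second returns true.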
Mark Johnston 540da48d83 vm_kern: Update KMSAN shadow maps when allocating kmem memory
This addresses a couple of false positive reports for memory returned by
malloc_large().

Sponsored by:	The FreeBSD Foundation
2022-06-20 12:48:13 -04:00
Mark Johnston a932a5a649 uma: Mark zeroed slabs as initialized for KMSAN
Otherwise zone initializers can produce false positives, e.g., when
lock_init() attempts to detect double initialization.

Sponsored by:	The FreeBSD Foundation
2022-06-20 12:48:13 -04:00
Mark Johnston 1f88394b7f vm_fault: Avoid unnecessary object relocking in vm_fault_copy_entry()
Suggested by:	alc
Reviewed by:	alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35485
2022-06-14 18:19:07 -04:00
Mark Johnston d0443e2b98 vm_fault: Fix a racy copy of page valid bits
We do not hold the object lock or a page busy lock when copying src_m's
validity state.  Prior to commit 45d72c7d7f we marked dst_m as fully
valid.

Use the source object's read lock to ensure that valid bits are not
concurrently cleared.

Reviewed by:	alc, kib
Fixes:		45d72c7d7f ("vm_fault_copy_entry: accept invalid source pages.")
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35471
2022-06-14 18:18:09 -04:00
Mark Johnston 630f633f2a vm_object: Use the vm_object_(set|clear)_flag() helpers
... rather than setting and clearing flags inline.  No functional change
intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35469
2022-06-14 12:00:59 -04:00
Gordon Bergling 860740ae0f vm: Fix a common typo in a source code comment
- s/independant/independent/

MFC after:	3 days
2022-06-05 09:52:32 +02:00
Gordon Bergling f77a88c855 vm_page: Fix a typo in a source code comment
- s/consistancy/consistency/

MFC after:	3 days
2022-06-04 12:52:22 +02:00
Doug Moore fa8a6585c7 vm_phys: avoid waste in multipage allocation
In vm_phys_alloc_contig, for an allocation bigger than the size of any
buddy queue free block, avoid examining any maximum-size free block
more than twice, by only starting to consider a sequence of adjacent
max-blocks starting at a max-block that does not follow another
max-block.  If that first max-block follows adjacent blocks of smaller
size, and if together they provide enough memory to reduce by one the
number of max-blocks required for this allocation, use them as part of
this allocation.

Reviewed by:	markj
Tested by:	pho
Discussed with:	alc
Differential Revision:	https://reviews.freebsd.org/D34815
2022-04-26 02:56:23 -05:00
John Baldwin 52526922ac vm_phys_init: Quiet unused but set warnings about npages.
npages is used in two optional cases:

- to conditionally create a separate DMA32 free list

- to index vm_page_array for VM_PHYSSEG_SPARSE

Add in more #ifdef's around npages statements.

Reviewed by:	alc, markj
Differential Revision:	https://reviews.freebsd.org/D34887
2022-04-18 12:06:14 -07:00
Mark Johnston f82177b8cf vm: Initialize the transient buffer mapping arena with M_WAITOK
The wait flag is passed to UMA when allocating boundary tags for the
initial span, and UMA expects either M_WAITOK or M_NOWAIT to be present.

Reported by:	cperciva
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-04-14 15:46:14 -04:00
Mark Johnston 6fb7c42d59 vm: Move the "vm_wait in early boot" assertion to the proper place
The assertion was added in commit 1771e987ca.  After that, vm_wait()
and friends were refactored such that the actual sleep happens
elsewhere.  Now the assertion condition is not checked when
vm_wait_doms() is called directly, and it is checked even if we are not
going to sleep (because vm_page_count_min_set(wdoms) is false).

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34909
2022-04-14 15:45:54 -04:00
John Baldwin b8ebd99aa5 vm: Use __diagused for variables only used in KASSERT().
2022-04-13 16:08:20 -07:00
John Baldwin 40cbcb996c vm_fault_dontneed: Inline value of variable used once in an assertion.
2022-04-13 16:08:19 -07:00
Enji Cooper 567378cc07 Fix OID format for vm.swap_reserved and vm.swap_total
The correct OID format for CTLTYPE_U64 is `QU` (`uquad_t`), not `A`
(text expressed via `char *`).

This issue was noticed while doing a sysctl tree walk using a
sysctl(9) consumer that relies on the OID format to intuit what the
type should be for a given sysctl.

MFC after:	1 month
Sponsored by:	DellEMC Isilon
Differential Revision: https://reviews.freebsd.org/D34877
2022-04-10 18:17:09 -07:00
John Baldwin 2e7838ae84 vm_phys_early_alloc: mem_index is only used under #ifdef NUMA.
Possibly mem_index should just reuse biggestone since this loop is
already reusing biggestsize.
2022-04-08 17:25:13 -07:00
John Baldwin a7e1a58554 uma_zfree_smr: uz_flags is only used if NUMA is defined.
2022-04-08 17:25:13 -07:00
Gordon Bergling f167c46e79 memguard(9): Fix two typos in source code comments
- s/comparsion/comparison/

MFC after:	3 days
2022-04-02 13:51:27 +02:00
Peter Jeremy 9a89977bf6 kern: Fix typo in kassert message.
- s/unepxected/unexpected/
MFC after:	3 days
2022-04-02 21:36:17 +11:00
Doug Moore 557dc337e6 vm_phys: check small blocks to finish allocation
In vm_phys_alloc_queues_contig, in the case that a sequence of
max-order blocks are sought to fulfill an allocation, a sequence is
ruled out if it does not have enough max-order blocks to satisfy the
allocation. However, there may be smaller blocks of free memory that
follow the last max-order block in the sequence, and they may be big
enough to complete the allocation request, so check for that
possibility before giving up on that block sequence.

Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D34724
2022-03-31 16:19:55 -05:00
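The idea in 557dc337e6 can be sketched with a simplified model in which a candidate sequence is an array of sizes of physically adjacent free blocks. This is not the vm_phys implementation: `MAX_BLOCK_PAGES`, `run_satisfies()` and the array representation are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_BLOCK_PAGES 1024	/* assumed size of a max-order block, in pages */

/*
 * blocks[] lists the sizes, in pages, of physically adjacent free blocks,
 * beginning at the first max-order block of a candidate sequence.
 */
static bool
run_satisfies(const size_t *blocks, size_t nblocks, size_t npages,
    bool count_trailing_small)
{
	size_t i, have;

	have = 0;
	/* Count the leading run of max-order blocks. */
	for (i = 0; i < nblocks && blocks[i] == MAX_BLOCK_PAGES; i++)
		have += blocks[i];
	/* Optionally add the smaller free blocks that follow the run. */
	if (count_trailing_small) {
		for (; i < nblocks && blocks[i] < MAX_BLOCK_PAGES &&
		    have < npages; i++)
			have += blocks[i];
	}
	return (have >= npages);
}
```

With blocks of 1024, 1024, 512 and 256 pages and a request for 2500 pages, counting only max-order blocks rejects the sequence, while also counting the trailing smaller blocks accepts it.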
Doug Moore 342056fa1c vm_phys: alloc pages without duplicating searches.
In the search for contiguous pages, as each page segment is examined,
check to see if the free list set for the next page segment differs
from the set for the current segment, and avoid a pointless search if
they do not differ.

Discussed with:	alc
Reviewed by:	markj
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D33947
2022-03-31 01:40:46 -05:00
Mark Johnston d53927b0ba uma: Don't allow a limit to be set in a warm zone
The limit accounting in UMA does not tolerate this.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-03-30 15:42:18 -04:00
Mark Johnston 54361f9020 uma: Use the correct type for a return value
zone_alloc_bucket() returns a pointer, not a bool.

MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2022-03-30 15:42:05 -04:00
Brooks Davis b1ad6a9000 syscallarg_t: Add a type for system call arguments
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).

Obtained from:	CheriBSD

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D33780
2022-03-28 19:43:03 +01:00
Eric van Gyzen 490b09f240 uma_zalloc_domain: call uma_zalloc_debug in multi-domain path
It was only called in the non-NUMA and single-domain paths.
Some of its assertions were duplicated in uma_zalloc_domain,
but some things were missed, especially memguard.

Reviewed by:	markj, rstone
MFC after:	1 week
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D34472
2022-03-25 20:10:38 -05:00
Eric van Gyzen a8cbb835bf uma_zalloc: assert M_NOWAIT ^ M_WAITOK
The uma_zalloc functions expect exactly one of [M_NOWAIT, M_WAITOK].
If neither or both are passed, print an error and a stack dump.
Only do this ten times, to prevent livelock.  In the future, after
this exposes enough bad callers, this will be changed to a KASSERT().

Reviewed by:	rstone, markj
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D34452
2022-03-25 20:10:37 -05:00
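The check described in a8cbb835bf amounts to an exclusive-or on two flag bits plus a cap on the number of reports. The sketch below is a userspace approximation: the flag values, `MAX_WARNINGS`, `warnings_printed` and both function names are assumptions, and the stack dump mentioned in the commit message is replaced by a single line on stderr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Illustrative flag values; assumed here, not taken from the commit. */
#define M_NOWAIT 0x0001
#define M_WAITOK 0x0002

#define MAX_WARNINGS 10

static int warnings_printed;

/* True when exactly one of M_NOWAIT and M_WAITOK is set. */
static bool
wait_flags_valid(int flags)
{
	return (((flags & M_NOWAIT) != 0) ^ ((flags & M_WAITOK) != 0));
}

/* Returns true if flags are valid; warns at most MAX_WARNINGS times. */
static bool
check_wait_flags(int flags)
{
	if (wait_flags_valid(flags))
		return (true);
	if (warnings_printed < MAX_WARNINGS) {
		warnings_printed++;
		fprintf(stderr, "invalid wait flags 0x%x\n",
		    (unsigned int)flags);
	}
	return (false);
}
```

Capping the reports keeps a hot allocation path with a bad caller from flooding the console, which matches the stated goal of preventing livelock.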
Eric van Gyzen cfbb5f8ce0 vm_ksubmap_init: pass M_WAITOK to vmem_init -> uma_zalloc_arg
uma_zalloc_arg expects exactly one of the two WAIT flags.  A future
commit will assert this.

Reviewed by:	rstone
MFC after:	1 month
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D34450
2022-03-25 20:10:37 -05:00
Mateusz Guzik bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd)
2022-03-24 10:20:51 +00:00
Mark Johnston 389a3fa693 uma: Add UMA_ZONE_UNMANAGED
Allow a zone to opt out of cache size management.  In particular,
uma_reclaim() and uma_reclaim_domain() will not reclaim any memory from
the zone, nor will uma_timeout() purge cached items if the zone is idle.
This effectively means that the zone consumer has control over when
items are reclaimed from the cache.  In particular, uma_zone_reclaim()
will still reclaim cached items from an unmanaged zone.

Reviewed by:	hselasky, kib
MFC after:	3 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34142
2022-02-15 09:25:34 -05:00
John Baldwin becaf6433b Use vmspace->vm_stacktop in place of sv_usrstack in more places.
Reviewed by:	markj
Obtained from:	CheriBSD
Differential Revision:	https://reviews.freebsd.org/D34174
2022-02-14 10:57:30 -08:00
Konstantin Belousov b51927b7b0 Revert "vm_pageout_scans: correct detection of active object"
This reverts commit 3de96d664a.

The problem is that it is possible to reach a state with ref_count ==
1 for a mapped non-anonymous object. For instance, an anonymous POSIX
shmfd or Linux shmfs object could be mapped, and then the corresponding
file descriptor closed, dropping the object reference owned by the
shmfd/shmfs file.  The check in the inactive scan then assumes that the
object and page are not mapped and frees the page, even though they are.

PR:	261707
Discussed with:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	now
2022-02-10 16:55:10 +02:00
Robert Wing c9e023541a pbuf_ctor(): lock the buffer with LK_NOWAIT
This LOR happens when reading from a file backed MD device:

lock order reversal:
 1st 0xfffffe00431eaac0 pbufwait (pbufwait, lockmgr) @ /cobra/src/sys/vm/vm_pager.c:471
 2nd 0xfffff80003f17930 ufs (ufs, lockmgr) @ /cobra/src/sys/dev/md/md.c:977
lock order pbufwait -> ufs attempted at:
    #0 0xffffffff80c78ead at witness_checkorder+0xbdd
    #1 0xffffffff80bd6a52 at lockmgr_lock_flags+0x182
    #2 0xffffffff80f52d5c at ffs_lock+0x6c
    #3 0xffffffff80d0f3f4 at _vn_lock+0x54
    #4 0xffffffff80708629 at mdstart_vnode+0x499
    #5 0xffffffff807060ec at md_kthread+0x20c
    #6 0xffffffff80bbfcd0 at fork_exit+0x80
    #7 0xffffffff810b809e at fork_trampoline+0xe

This LOR was previously blessed by witness before commit 531f8cfea0
("Use dedicated lock name for pbufs").

Instead of blessing ufs and pbufwait, use LK_NOWAIT to prevent recording
the lock order. LK_NOWAIT will be a nop here as the lock is dropped in
pbuf_dtor(). This takes the same approach as 5875b94c74 ("buf_alloc():
lock the buffer with LK_NOWAIT").

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D34183
2022-02-07 10:05:20 -09:00
Konstantin Belousov 0b8643eaf6 vmmeter(): Fix detection of the named swap objects
Noted and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33549
2022-02-02 11:39:58 +02:00
Konstantin Belousov 4cf9f5d807 vm_object: restore handling of shadow_count for all types of objects
instead of only OBJ_ANON objects that are backing, as it is now.
This is required for e.g. vm_meter is_object_active() detection, and
should be useful in some more cases.

Use refcount KPI for all objects, regardless of owning the object lock,
and the fact that currently OBJ_ANON cannot change for the live object.

Noted and reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33549
2022-02-02 11:39:51 +02:00
Konstantin Belousov d950c5898a vm/vm_extern.h, vm/vm_page.h: use sys/kassert.h
instead of fatty sys/systm.h.

Suggested by:	jhb
Reviewed by:	alc, imp, jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
Konstantin Belousov f4cdb9d7c3 vm/vm_pager.h: use sys/systm.h header
it is needed for __read_mostly attribute definition, which right now
comes from vm/vm_page.h including sys/systm.h

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
Konstantin Belousov 531f8cfea0 Use dedicated lock name for pbufs
Also remove a pointer to array variable, use array address directly.

Reviewed by:	markj, mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34072
2022-01-31 04:46:14 +02:00
John Baldwin 29d481ae6a Make <vm/vm_extern.h> more self-contained.
Add a nested include of <sys/systm.h> for recently added assertions.
Without this, existing code (such as in drm-kmod) needs to be patched
to add the newly required header.

While here, rewrite the assertions using KASSERT().

Reviewed by:	dougm, alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D34070
2022-01-28 13:14:03 -08:00
Konstantin Belousov 3de96d664a vm_pageout_scans: correct detection of active object
For non-anonymous swap objects, there is always a reference from the
owner to the object to keep it from recycling.  Account for it when
deciding whether to query pmap for hardware active references for the
page.

As a result, we avoid unneeded calls to pmap_ts_referenced(), which for
a non-mapped page means avoiding an unnecessary lock and unlock of the pv list.

Reviewed by:	markj
Discussed with:	alc
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33924
2022-01-22 19:34:32 +02:00