With the removal of OBJT_DEFAULT, OBJ_ANON implies OBJ_SWAP.
Note, this means that vm_object_split() is more expensive than it used
to be, as it holds busy locks until the end of the range is reached,
even if the object has no swap blocks allocated.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35789
Now that OBJT_DEFAULT objects can't be instantiated, we can simplify
checks of the form object->type == OBJT_DEFAULT || (object->flags &
OBJ_SWAP) != 0. No functional change intended.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35788
With the removal of OBJT_DEFAULT, we can simply handle this in
swap_pager_dealloc(). No functional change intended.
Suggested by: alc
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35787
With the removal of OBJT_DEFAULT, we can assume that pager operations
provide an object with OBJ_SWAP set. Also, we do not need to convert
objects from type OBJT_DEFAULT. Thus, remove checks for OBJ_SWAP and
remove code which modifies the object type. In some places, replace the
check for OBJ_SWAP with a check for whether any swap blocks are
assigned.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35786
With this change, OBJT_DEFAULT objects are no longer allocated.
Instead, anonymous objects are always of type OBJT_SWAP and always have
OBJ_SWAP set.
Modify the page fault handler to check the swap block radix tree in
places where it checked for objects of type OBJT_DEFAULT. In
particular, there's no need to invoke getpages for an OBJT_SWAP object
with no swap blocks assigned.
Reviewed by: alc, kib
Tested by: pho
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35785
In preparation for removing OBJT_DEFAULT, eliminate some stale/unhelpful
comments from vm_mmap(), and remove an unused case. In particular, the
remaining callers of vm_mmap() in the tree do not specify OBJT_DEFAULT.
It's much more common to use vm_map_find() to map an object into user
memory, so rather than adjusting vm_mmap() to handle OBJT_SWAP objects,
let's further discourage its use and simply remove OBJT_DEFAULT
handling.
Reviewed by: dougm, alc, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35778
vm_object_allocate_anon() automatically sets "charge" to 0 if no cred
reference is provided, so the caller doesn't need any conditional logic.
No functional change intended.
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35781
uma_timeout() has several responsibilities; it visits every UMA zone and
as of recently will drain underutilized caches, so is rather expensive
(>1ms in some cases). Currently it is executed by softclock threads
and so will preempt most other CPU activity. None of this work requires
a high scheduling priority, though, so defer it to a taskqueue so as to
avoid stalling higher-priority work.
Reviewed by: rlibby, alc, mav, kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35738
- Correct the description (vm_fault_copy_entry() does not create a
shadow object).
- Move some initialization and assertions out of the scope of the object
locks, when doing so makes sense.
- Merge a pair of conditional blocks.
- Use __unused when appropriate.
No functional change intended.
Reviewed by: alc
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Commit 4b8365d752 introduced the ability to dynamically register
VM object types, for use by tmpfs, which creates swap-backed objects.
As a part of this, checks for such objects changed from
object->type == OBJT_DEFAULT || object->type == OBJT_SWAP
to
object->type == OBJT_DEFAULT || (object->flags & OBJ_SWAP) != 0
In particular, objects of type OBJT_DEFAULT do not have OBJ_SWAP set;
the swap pager sets this flag when converting from OBJT_DEFAULT to
OBJT_SWAP.
A few of these checks are done without the object lock held. It turns
out that this can result in false negatives since the swap pager
converts objects like so:
object->type = OBJT_SWAP;
object->flags |= OBJ_SWAP;
Fix the problem by adding explicit tests for OBJT_SWAP objects in
unlocked checks.
PR: 258932
Fixes: 4b8365d752 ("Add OBJT_SWAP_TMPFS pager")
Reported by: bdrewery
Reviewed by: kib
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35470
Otherwise zone initializers can produce false positives, e.g., when
lock_init() attempts to detect double initialization.
Sponsored by: The FreeBSD Foundation
We do not hold the object lock or a page busy lock when copying src_m's
validity state. Prior to commit 45d72c7d7f we marked dst_m as fully
valid.
Use the source object's read lock to ensure that valid bits are not
concurrently cleared.
Reviewed by: alc, kib
Fixes: 45d72c7d7f ("vm_fault_copy_entry: accept invalid source pages.")
MFC after: 2 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35471
... rather than setting and clearing flags inline. No functional change
intended.
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D35469
In vm_phys_alloc_contig, for an allocation bigger than the size of any
buddy queue free block, avoid examining any maximum-size free block
more than twice, by only starting to consider a sequence of adjacent
max-blocks starting at a max-block that does not follow another
max-block. If that first max-block follows adjacent blocks of smaller
size, and if together they provide enough memory to reduce by one the
number of max-blocks required for this allocation, use them as part of
this allocation.
Reviewed by: markj
Tested by: pho
Discussed with: alc
Differential Revision: https://reviews.freebsd.org/D34815
npages is used in two optional cases:
- to conditionally create a separate DMA32 free list
- to index vm_page_array for VM_PHYSSEG_SPARSE
Add in more #ifdef's around npages statements.
Reviewed by: alc, markj
Differential Revision: https://reviews.freebsd.org/D34887
The wait flag is passed to UMA when allocating boundary tags for the
initial span, and UMA expects either M_WAITOK or M_NOWAIT to be present.
Reported by: cperciva
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
The assertion was added in commit 1771e987ca. After that, vm_wait()
and friends were refactored such that the actual sleep happens
elsewhere. Now the assertion condition is not checked when
vm_wait_doms() is called directly, and it is checked even if we are not
going to sleep (because vm_page_count_min_set(wdoms) is false).
Reviewed by: alc, kib
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34909
The correct OID format for CTLTYPE_U64 is `QU` (`uquad_t`), not `A`
(text expressed via `char *`).
This issue was noticed while doing an sysctl tree walk using a
sysctl(9) consumer that relies on the OID format to intuit what the
type should be for a given sysctl.
MFC after: 1 month
Sponsored by: DellEMC Isilon
Differential Revision: https://reviews.freebsd.org/D34877
In vm_phys_alloc_queues_contig, in the case that a sequence of
max-order blocks are sought to fulfill an allocation, a sequence is
ruled out if it does not have enough max-order blocks to satisfy the
allocation. However, there may be smaller blocks of free memory that
follow the last max-order block in the sequence, and they may be big
enough to complete the allocation request, so check for that
possibility before giving up on that block sequence.
Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D34724
In the search for contiguous pages, as each page segment is examined,
check to see if the free list set for the next page segment differs
from the set for the current segment, and avoid a pointless search if
they do not differ.
Discussed with: alc
Reviewed by: markj
Tested by: pho
Differential Revision: https://reviews.freebsd.org/D33947
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessiary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).
Obtained from: CheriBSD
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D33780
It was only called in the non-NUMA and single-domain paths.
Some of its assertions were duplicated in uma_zalloc_domain,
but some things were missed, especially memguard.
Reviewed by: markj, rstone
MFC after: 1 week
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D34472
The uma_zalloc functions expect exactly one of [M_NOWAIT, M_WAITOK].
If neither or both are passed, print an error and a stack dump.
Only do this ten times, to prevent livelock. In the future, after
this exposes enough bad callers, this will be changed to a KASSERT().
Reviewed by: rstone, markj
MFC after: 1 month
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D34452
uma_zalloc_arg expects exactly one of the two WAIT flags. A future
commit will assert this.
Reviewed by: rstone
MFC after: 1 month
Sponsored by: Dell EMC Isilon
Differential Revision: https://reviews.freebsd.org/D34450
Allow a zone to opt out of cache size management. In particular,
uma_reclaim() and uma_reclaim_domain() will not reclaim any memory from
the zone, nor will uma_timeout() purge cached items if the zone is idle.
This effectively means that the zone consumer has control over when
items are reclaimed from the cache. In particular, uma_zone_reclaim()
will still reclaim cached items from an unmanaged zone.
Reviewed by: hselasky, kib
MFC after: 3 weeks
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D34142
This reverts commit 3de96d664a.
Problem is that it is possible to reach the state with ref_count ==
1 for the mapped non-anonymous object. For instance, anonymous posix
shmfd or linux shmfs object could be mapped, and then corresponding
file descriptor closed, dropping the object reference owned by the
shmfd/shmfs file. Then the check in inactive scan assumes that the
object and page are not mapped and frees the page, while they are not.
PR: 261707
Discussed with: markj
Sponsored by: The FreeBSD Foundation
MFC after: now
This LOR happens when reading from a file backed MD device:
lock order reversal:
1st 0xfffffe00431eaac0 pbufwait (pbufwait, lockmgr) @ /cobra/src/sys/vm/vm_pager.c:471
2nd 0xfffff80003f17930 ufs (ufs, lockmgr) @ /cobra/src/sys/dev/md/md.c:977
lock order pbufwait -> ufs attempted at:
#0 0xffffffff80c78ead at witness_checkorder+0xbdd
#1 0xffffffff80bd6a52 at lockmgr_lock_flags+0x182
#2 0xffffffff80f52d5c at ffs_lock+0x6c
#3 0xffffffff80d0f3f4 at _vn_lock+0x54
#4 0xffffffff80708629 at mdstart_vnode+0x499
#5 0xffffffff807060ec at md_kthread+0x20c
#6 0xffffffff80bbfcd0 at fork_exit+0x80
#7 0xffffffff810b809e at fork_trampoline+0xe
This LOR was previously blessed by witness before commit 531f8cfea0
("Use dedicated lock name for pbufs").
Instead of blessing ufs and pbufwait, use LK_NOWAIT to prevent recording
the lock order. LK_NOWAIT will be a nop here as the lock is dropped in
pbuf_dtor(). The takes the same approach as 5875b94c74 ("buf_alloc():
lock the buffer with LK_NOWAIT").
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D34183
instead of only OBJ_ANON objects that are backing, as it is now.
This is required for e.g. vm_meter is_object_active() detection, and
should be useful in some more cases.
Use refcount KPI for all objects, regardless of owning the object lock,
and the fact that currently OBJ_ANON cannot change for the live object.
Noted and reviewed by: markj
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33549
it is needed for __read_mostly attribute definition, which right now
comes from vm/vm_page.h including sys/systm.h
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D34089
Also remove a pointer to array variable, use array address directly.
Reviewed by: markj, mckusick
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D34072
Add a nested include of <sys/systm.h> for recently added assertions.
Without this, existing code (such as in drm-kmod) needs to be patched
to add the newly required header.
While here, rewrite the assertions using KASSERT().
Reviewed by: dougm, alc, imp, kib
Differential Revision: https://reviews.freebsd.org/D34070
For non-anonymous swap objects, there is always a reference from the
owner to the object to keep it from recycling. Account for it when
deciding should we query pmap for hardware active references for the
page.
As result, we avoid unneeded calls to pmap_ts_referenced(), which for
non-mapped page means avoiding unneccessary lock and unlock of the pv list.
Reviewed by: markj
Discussed with: alc
Tested by: pho
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D33924