4629 commits

Author SHA1 Message Date
Mark Johnston a04ce833f9 uma: Avoid polling for an invalid SMR sequence number
Buckets in an SMR-enabled zone can legitimately be tagged with
SMR_SEQ_INVALID.  This effectively means that the zone destructor (if
any) was invoked on all items in the bucket, and the contained memory is
safe to reuse.  If the first bucket in the full bucket list was tagged
this way, UMA would unnecessarily poll per-CPU state before attempting
to fetch a full bucket from the list.

MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
2022-01-14 15:38:02 -05:00
Mark Johnston 4a864f624a vm_pageout: Print a more accurate message to the console before an OOM kill
Previously we'd always print "out of swap space."  This can be
misleading, as there are other reasons an OOM kill can be triggered.  In
particular, it's entirely possible to trigger an OOM kill on a system
with plenty of free swap space.

Reviewed by:	kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33810
2022-01-14 15:04:21 -05:00
Brooks Davis 0910a41ef3 Revert "syscallarg_t: Add a type for system call arguments"
Missed issues in truss on at least armv7 and powerpcspe need to be
resolved before recommit.

This reverts commit 3889fb8af0.
This reverts commit 1544e0f5d1.
2022-01-12 23:29:20 +00:00
Brooks Davis 1544e0f5d1 syscallarg_t: Add a type for system call arguments
This more clearly differentiates system call arguments from integer
registers and return values. On current architectures it has no effect,
but on architectures where pointers are not integers (CHERI) and may
not even share registers (CHERI-MIPS) it is necessary to differentiate
between system call arguments (syscallarg_t) and integer register values
(register_t).

Obtained from:	CheriBSD

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D33780
2022-01-12 22:51:25 +00:00
Doug Moore 84e2ae64c5 vm_reserv: use enhanced bitstring for popmaps
vm_reserv.c uses its own bitstring implementation for popmaps. Using
the bitstring_t type from a standard header eliminates the code
duplication, allows some bit-at-a-time operations to be replaced with
more efficient bitstring range operations, and, in
vm_reserv_test_contig, allows bit_ffc_area_at to more efficiently
search for a big-enough set of consecutive zero-bits.

Make bitstring changes to improve the vm_reserv code.  Define a bit_ntest
method to test whether a range of bits is all set, or all clear.
Define bit_ff_at and bit_ff_area_at to implement the ffs and ffc
versions with a parameter to choose between set- and clear- bits.
Improve the area_at implementation.  Modify the bit_nset and
bit_nclear implementations to allow code optimization in the cases
when start or end are multiples of _BITSTR_BITS.

Add a few new cases to bitstring_test.

Discussed with:	alc
Reviewed by:	markj
Tested by:	pho (earlier version)
Differential Revision:	https://reviews.freebsd.org/D33312
2022-01-12 11:03:53 -06:00
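The range test described in the vm_reserv popmap commit above can be illustrated with a small userland sketch. It checks whether every bit in an inclusive range is set, or every bit is clear, one word at a time. The names `word_t`, `WORD_BITS` and `range_all` are hypothetical and this is not the bitstring.h implementation; it only shows the masking idea.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical word type; the real bitstring_t layout may differ. */
typedef unsigned long word_t;
#define WORD_BITS (sizeof(word_t) * CHAR_BIT)

/*
 * Return true if every bit in the inclusive range [start, stop] equals
 * 'match' (nonzero for "all set", zero for "all clear").
 */
static bool
range_all(const word_t *bits, size_t start, size_t stop, int match)
{
	assert(start <= stop);
	for (size_t w = start / WORD_BITS; w <= stop / WORD_BITS; w++) {
		word_t mask = ~(word_t)0;

		/* Trim the bits below 'start' in the first word. */
		if (w == start / WORD_BITS)
			mask &= ~(word_t)0 << (start % WORD_BITS);
		/* Trim the bits above 'stop' in the last word. */
		if (w == stop / WORD_BITS)
			mask &= ~(word_t)0 >> (WORD_BITS - 1 - stop % WORD_BITS);
		if ((bits[w] & mask) != (match ? mask : 0))
			return (false);
	}
	return (true);
}
```

Interior words are compared whole, which is where a range operation saves work over testing one bit at a time.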
Mark Johnston c4a25e0713 vm_pageout: Group sysctl variables together with sysctl definitions
Fix some style bugs while here.  No functional change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33811
2022-01-11 09:27:45 -05:00
Mark Johnston 43b3b8e52d swap_pager: uma_zcreate() doesn't fail
Remove always-false checks for UMA zone creation failure.  No functional
change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33809
2022-01-11 09:27:45 -05:00
Doug Moore ae13829ddc vm_addr_ok: add power2 invariant check
With INVARIANTS defined, have vm_addr_align_ok and vm_addr_bound_ok
panic when passed an alignment/boundary parameter that is not a power
of two.

Reviewed by:	alc
Suggested by:	kib, se
Differential Revision:	https://reviews.freebsd.org/D33725
2022-01-10 01:17:25 -06:00
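A minimal userland sketch of the invariant check described in the vm_addr_ok commit above, assuming `assert()` as a stand-in for the kernel's KASSERT under INVARIANTS. The helper names `is_power_of_2` and `align_ok_checked` are hypothetical, and this sketch treats zero as invalid, which may not match the kernel's handling.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if x has exactly one bit set. Hypothetical helper name. */
static bool
is_power_of_2(uint64_t x)
{
	return (x != 0 && (x & (x - 1)) == 0);
}

/*
 * Alignment check that verifies its parameter first. The mask trick
 * below only gives a meaningful answer for power-of-two alignments, so
 * a debug build aborts on anything else instead of returning nonsense.
 */
static bool
align_ok_checked(uint64_t pa, uint64_t alignment)
{
	assert(is_power_of_2(alignment));	/* stand-in for KASSERT */
	return ((pa & (alignment - 1)) == 0);
}
```

For example, `align_ok_checked(0x2000, 0x1000)` is true, while passing an alignment of 12 would trip the assertion rather than silently misreport.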
Konstantin Belousov c25a30e255 Dump page tracking no longer needed on mips
Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D33763
2022-01-06 06:00:39 +02:00
Konstantin Belousov f54882a862 Remove special kstack allocation code for mips.
The arch required two-page alignment due to a single TLB entry caching
two consecutive mappings.

Reviewed by:	imp
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D33763
2022-01-06 04:43:56 +02:00
Doug Moore f76916c095 vm_reserv: #include vm_extern.h explicitly, for arm.
Fixes:	c606ab59e7 vm_extern: use standard address checkers everywhere
2021-12-31 00:40:25 -06:00
Doug Moore e6930b1c5f vm_phys: convert error back to warning
Move an assignment back to where it was before, to turn the
defined-but-not-used error back into a set-but-not-used warning.

Fixes:	01e115ab83 vm_phys: #include vm_extern
2021-12-31 00:23:46 -06:00
Doug Moore 01e115ab83 vm_phys: #include vm_extern
Arm64 and powerpc don't include vm_extern.h indirectly in vm_phys.c, which
means that for the sake of those architectures, it must be included explicitly.

Also, fix a set-but-not-used warning that Jenkins also found.

Reported by:	Jenkins
Fixes:	c606ab59e7 vm_extern: use standard address checkers everywhere
2021-12-30 23:31:18 -06:00
Doug Moore c606ab59e7 vm_extern: use standard address checkers everywhere
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D33685
2021-12-30 22:09:08 -06:00
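The alignment and boundary checks that the vm_extern commit above consolidates can be sketched roughly as follows. The function names and the explicit zero-boundary branch are assumptions made for illustration; the definitions in vm_extern.h may be written differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if pa is a multiple of alignment (assumed to be a power of two). */
static bool
addr_align_ok(uint64_t pa, uint64_t alignment)
{
	return ((pa & (alignment - 1)) == 0);
}

/*
 * True if the range [pa, pa + size) does not cross a multiple of
 * boundary (assumed to be a power of two). In this sketch a boundary of
 * zero means "no restriction".
 */
static bool
addr_bound_ok(uint64_t pa, uint64_t size, uint64_t boundary)
{
	assert(size > 0);
	if (boundary == 0)
		return (true);
	/*
	 * The first and last byte lie in the same boundary-sized window
	 * exactly when they agree in every bit above the boundary's low bits.
	 */
	return (((pa ^ (pa + size - 1)) & ~(boundary - 1)) == 0);
}
```

With a boundary of 0x2000, the range starting at 0x1000 of size 0x1000 passes, while the same size starting at 0x1800 fails because it straddles 0x2000.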
Gleb Smirnoff 841e0a8757 uma: with KTR trace allocs/frees from SMR zones
2021-12-29 23:08:33 -08:00
Gleb Smirnoff 28782f73df uma: with KTR report item being freed in uma_zfree_arg()
2021-12-29 23:08:15 -08:00
Doug Moore 8119cdd38b vm_phys: hide vm_phys_set_pool
It is only called in the file that defines it, so make it static and
remove the declaration from the header.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D33688
2021-12-29 11:17:33 -06:00
John Baldwin d90e41a154 sys/vm: Use C99 fixed-width integer types.
No functional change.

Reviewed by:	imp, kib, emaste
Differential Revision:	https://reviews.freebsd.org/D33641
2021-12-28 09:43:21 -08:00
Doug Moore 49fd2d51f0 vm_reserv: fix zero-boundary error
Handle specially the boundary==0 case of vm_reserv_reclaim_contig,
by turning off boundary adjustment in that case.

Reviewed by:	alc
Tested by:	pho, madpilot
2021-12-26 11:40:27 -06:00
Doug Moore 4bae154fe8 vm_page: Move a comment
fb38b29b56 (page_alloc_br) vm_page: Remove extra test, dup code from page alloc
should have moved a comment block when it moved the function call that followed it.

Move the comment block now.
2021-12-24 16:10:30 -06:00
Doug Moore 0d5fac2872 vm: alloc pages from reserv before breaking it
Function vm_reserv_reclaim_contig breaks a reservation with enough
free space to satisfy an allocation request and returns the free space
to the buddy allocator. Change the function to allocate the requested
memory from the reservation before breaking it, and return that memory
to the caller. That avoids a second call to the buddy allocator and
guarantees successful allocation after breaking the reservation, where
that success is not currently guaranteed.

Reviewed by:	alc, kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D33644
2021-12-24 12:59:16 -06:00
Doug Moore 184c63db3c Fix clerical error in page alloc
Fix a very recent change that introduced a page accounting error in
case of a reservation being broken.
Reviewed by:	alc
Fixes:	fb38b29b56 (page_alloc_br) vm_page: Remove extra test, dup code from page alloc
Differential Revision:	https://reviews.freebsd.org/D33645
2021-12-24 02:47:21 -06:00
Doug Moore fb38b29b56 vm_page: Remove extra test, dup code from page alloc
Extract code common to functions vm_page_alloc_contig_domain and
vm_page_alloc_noobj_contig_domain into a new function.  Do so in a way
that eliminates a bound-to-fail reservation test after a reservation
is broken by a call from vm_page_alloc_contig_domain.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D33551
2021-12-23 22:45:47 -06:00
Stephen J. Kiernan 18048b6e3c Eliminate key press requirement for "show vmopag" command output.
Summary:
One was required to press a key to continue after every 18 lines of
output. This requirement had been in the "show vmopag" command since it
was introduced, which was many years before paging was added to DDB.
With paging, this explicit key check is no longer necessary.

Obtained from:	Juniper Networks, Inc.
MFC after:	1 week

Test Plan:
Run "show vmopag" from db> prompt and see that it does not need additional
keypresses other than the ones needed for the pager.

Subscribers: imp, #contributor_reviews_base

Differential Revision: https://reviews.freebsd.org/D33550
2021-12-19 19:40:52 -05:00
Rick Macklem cd37afd8b6 vm_object: Make is_object_active() global
Commit 867c27c23a modified the NFS client so that
it does IO_APPEND writes directly to the NFS server,
bypassing the buffer cache.  However, this could result
in stale data in client pages when the file is mmap(2)'d.
As such, the NFS client needs to call is_object_active()
to check if the file is mmap(2)'d.

This patch renames is_object_active() to vm_object_is_active(),
moves it to sys/vm/vm_object.c and makes it global, so that
the NFS client can call it in a future commit.

Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33520
2021-12-19 16:11:44 -08:00
Doug Moore f7aa44763d Correct type size format error in KASSERT.
Reported by:	jenkins
Fixes:	6f1c890827 vm: Don't break vm reserv that can't meet align reqs
2021-12-16 13:48:58 -06:00
Doug Moore 6f1c890827 vm: Don't break vm reserv that can't meet align reqs
Function vm_reserv_test_contig has incorrectly used its alignment
and boundary parameters to find a well-positioned range of empty pages
in a reservation.  Consequently, a reservation could be broken
mistakenly when it was unable to provide a satisfactory set of pages.

Rename the function, correct the errors, and add assertions to detect
the error in case it appears again.

Reviewed by:	alc, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33344
2021-12-16 12:20:56 -06:00
Mark Johnston 88642d978a vm_fault: Fix vm_fault_populate()'s handling of VM_FAULT_WIRE
vm_map_wire() works by calling vm_fault(VM_FAULT_WIRE) on each page in
the range.  (For largepage mappings, it calls vm_fault() once per large
page.)

A pager's populate method may return more than one page to be mapped.
If VM_FAULT_WIRE is also specified, we'd wire each page in the run, not
just the fault page.  Consider an object with two pages mapped in a
vm_map_entry, and suppose vm_map_wire() is called on the entry.  Then,
the first vm_fault() would allocate and wire both pages, and the second
would encounter a valid page upon lookup and wire it again in the
regular fault handler.  So the second page is wired twice and will be
leaked when the object is destroyed.

Fix the problem by modifying vm_fault_populate() to wire only the fault
page.  Also modify the error handler for pmap_enter(psind=1) to not test
fs->wired, since it must be false.

PR:		260347
Reviewed by:	alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33416
2021-12-14 15:10:46 -05:00
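The double-wiring scenario in the vm_fault_populate() commit above can be reproduced with a toy model. Everything here is hypothetical (`struct toy_page`, `toy_fault_wire`, `toy_wire_range`); it models only the counting, not the real fault handler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NPAGES 2

/* Toy page: tracks only residency and how many times it was wired. */
struct toy_page {
	bool valid;
	int wire_count;
};

/*
 * Model one wiring fault on page 'idx'. If the page is not yet valid, a
 * populate-style pager brings in the whole run at once. 'wire_whole_run'
 * selects the old behaviour (wire every populated page) or the fixed
 * behaviour (wire only the faulting page).
 */
static void
toy_fault_wire(struct toy_page *pages, size_t idx, bool wire_whole_run)
{
	if (pages[idx].valid) {
		/* Regular path: the page is already resident, just wire it. */
		pages[idx].wire_count++;
		return;
	}
	for (size_t i = 0; i < NPAGES; i++) {
		pages[i].valid = true;
		if (wire_whole_run || i == idx)
			pages[i].wire_count++;
	}
}

/* Wire a two-page range, one fault per page; return page 1's wire count. */
static int
toy_wire_range(bool wire_whole_run)
{
	struct toy_page pages[NPAGES] = { { false, 0 }, { false, 0 } };

	for (size_t i = 0; i < NPAGES; i++)
		toy_fault_wire(pages, i, wire_whole_run);
	return (pages[1].wire_count);
}
```

In this model `toy_wire_range(true)` leaves the second page wired twice, which corresponds to the leak, while `toy_wire_range(false)` leaves it wired once.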
Konstantin Belousov 5346570276 swapoff: add one more variant of the syscall
Requested and reviewed by:	brooks
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33343
2021-12-09 02:48:46 +02:00
Doug Moore 9f32cb5b1c Set uninitialized popmap bits in vm_reserv_init
In vm_reserv_init, set all the marker popmap bits, and not just the
bits of the first popmap entry.

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33258
2021-12-05 17:17:25 -06:00
Gleb Smirnoff 2cb67bd798 uma: remove unused *item argument from cache_free()
Reviewed by:		markj
Differential revision:	https://reviews.freebsd.org/D33272
2021-12-05 10:44:47 -08:00
Mark Johnston 39a7396f5d vm_page: Tighten the object lock assertion in vm_page_invalid()
A page must not become invalid while vm_fault_soft_fast() is attempting
to map unbusied pages for reading.

Note that all callers hold the object write lock already, and
vm_page_set_invalid() asserts the object write lock.

Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33250
2021-12-05 10:51:11 -05:00
Konstantin Belousov e8dc2ba29c swapoff(2): add a SWAPOFF_FORCE flag
The flag requests skipping the heuristic which tries to avoid leaving
the system with more allocated memory than is available from RAM and
remaining swap.

Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov a4e4132fa3 swapoff(2): replace special device name argument with a structure
For compatibility, add a placeholder pointer to the start of the
added struct swapoff_new_args, and use it to distinguish old vs. new
style of syscall invocation.

Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
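One way the placeholder-pointer compatibility scheme from the swapoff(2) commit above could work is sketched below. This is an assumption-laden illustration: the struct layout, the field names, and the rule that a non-NULL first slot marks an old-style call are all hypothetical, and the actual struct swapoff_new_args may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical argument layout; field names are illustrative only. */
struct toy_swapoff_args {
	const char *name_old;	/* placeholder in the old sole argument's slot */
	const char *name;	/* device name for new-style callers */
	unsigned int flags;	/* only meaningful for new-style callers */
};

/* Decoded request, independent of the calling style. */
struct toy_swapoff_req {
	const char *name;
	unsigned int flags;
};

static struct toy_swapoff_req
toy_swapoff_decode(const struct toy_swapoff_args *uap)
{
	struct toy_swapoff_req req;

	if (uap->name_old != NULL) {
		/*
		 * Assumed old style: the first slot carries the device name
		 * and the remaining slots are not trusted.
		 */
		req.name = uap->name_old;
		req.flags = 0;
	} else {
		/* Assumed new style: placeholder is NULL, use the new fields. */
		req.name = uap->name;
		req.flags = uap->flags;
	}
	return (req);
}
```

The point of the design is that an old binary, which passes only a device name, keeps working without a new syscall number, since its single argument lands in the placeholder slot.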
Konstantin Belousov 6df359449f swap_pager.c: Remove MPSAFE and ARGSUSED annotations
Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov 0190c38b9d swapoff_one(): only check free pages count when manually turning swap off
When swap is turned off due to system shutdown or reboot, ignore the
check.  The problem is that the check is not accurate by any means: the
free page count can legitimately be low while the system is still able
to page in everything from the swap.  Then, we turn swap off if swapping
on a real file or some non-standard geom provider, and typically panic
when the system appears to actually need an unavailable page.

For syscall, it is better to be safe than sorry.

Reported and tested by:	peterj
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33147
2021-11-29 18:38:02 +02:00
Mateusz Guzik 7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Konstantin Belousov b19740f4ce swap_pager: lock vnode in swapdev_strategy()
VOP_STRATEGY() requires a locked vnode.  Note that we lock the swap vnode
while pages are busy, but this would only cause a real LoR if pages belong
to the swap vnode, which must not be the case for correct use.

Reported and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:50 +02:00
Konstantin Belousov 6ddf41faa6 swapon: extend the region where the swap vnode is locked
to cover VOP_GETATTR() call in sys_swapon().  Move locking from inside
swapongeom() and swaponvp() into sys_swapon().

Reported by and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:44 +02:00
Konstantin Belousov a6d04f34a4 swap pager: lock vnode around VOP_CLOSE()
Reported and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:39 +02:00
Mark Johnston d47d3a94bb vm_fault: Factor out per-object operations into vm_fault_object()
No functional change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33018
2021-11-24 14:02:56 -05:00
Mark Johnston f1b642c255 vm_fault: Introduce a fault_status enum for internal return types
Rather than overloading the meanings of the Mach statuses, introduce a
new set for use internally in the fault code.  This makes the control
flow easier to follow and provides some extra error checking when a
fault status variable is used in a switch statement.

vm_fault_lookup() and vm_fault_relookup() continue to use Mach statuses
for now, as there isn't much benefit to converting them and they
effectively pass through a status from vm_map_lookup().

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33017
2021-11-24 14:02:55 -05:00
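The extra error checking mentioned in the fault_status commit above comes from switching over an enum without a default case. A minimal sketch, with hypothetical enumerator and function names that are not taken from vm_fault.c:

```c
#include <assert.h>

/* Hypothetical internal status set; the names are illustrative. */
enum toy_fault_status {
	TOY_FAULT_SUCCESS,
	TOY_FAULT_RESTART,
	TOY_FAULT_OUT_OF_BOUNDS,
	TOY_FAULT_FAILURE,
};

/* Hypothetical external result codes handed back to the caller. */
enum { TOY_OK = 0, TOY_ERR = 1, TOY_RETRY = -1 };

static int
toy_fault_result(enum toy_fault_status st)
{
	/*
	 * No default case on purpose: with -Wswitch (enabled by -Wall in
	 * GCC and Clang), adding an enumerator without handling it here
	 * produces a compiler warning.
	 */
	switch (st) {
	case TOY_FAULT_SUCCESS:
		return (TOY_OK);
	case TOY_FAULT_RESTART:
		return (TOY_RETRY);
	case TOY_FAULT_OUT_OF_BOUNDS:
	case TOY_FAULT_FAILURE:
		return (TOY_ERR);
	}
	return (TOY_ERR);	/* not reached for valid enumerators */
}
```

A plain int status shared with external codes gives the compiler no such list to check against, which is the benefit of a dedicated internal enum.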
Mark Johnston 45c09a74d6 vm_fault: Move nera into faultstate
This makes it easier to factor out pieces of vm_fault().  No functional
change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33016
2021-11-24 14:02:55 -05:00
Mitchell Horne 10fe6f80a6 minidump: Use the provided dump bitset
When constructing the set of dumpable pages, use the bitset provided by
the state argument, rather than assuming vm_page_dump invariably. For
normal kernel minidumps this will be a pointer to vm_page_dump, but when
dumping the live system it will not.

To do this, the functions in vm_dumpset.h are extended to accept the
desired bitset as an argument. Note that this provided bitset is assumed
to be derived from vm_page_dump, and therefore has the same size.

Reviewed by:	kib, markj, jhb
MFC after:	2 weeks
Sponsored by:	Juniper Networks, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31992
2021-11-19 15:05:52 -04:00
Brooks Davis 01ce7fca44 ommap: fix signed len and pos arguments
4.3 BSD's mmap took an int len and long pos.  Reject negative lengths
and in freebsd32 sign-extend pos correctly rather than mis-handling
negative positions as large positive ones.

Reviewed by:	kib
2021-11-15 18:34:28 +00:00
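The sign handling described in the ommap commit above can be illustrated in userland. The helper names `toy_widen_pos` and `toy_len_ok` are hypothetical; the sketch only shows why a 32-bit position must be sign-extended rather than zero-extended when widened to 64 bits, and assumes a two's complement platform for the narrowing cast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Widen a raw 32-bit position that the old ABI declared as a signed
 * long. Casting through int32_t sign-extends, so a negative position
 * stays negative instead of becoming a large positive offset.
 */
static int64_t
toy_widen_pos(uint32_t raw)
{
	return ((int64_t)(int32_t)raw);
}

/* The old interface took an int length; negative values are rejected. */
static bool
toy_len_ok(int len)
{
	return (len >= 0);
}
```

For the raw value 0xfffff000, sign extension yields -4096, whereas zero extension would yield 4294963200 and be treated as a valid large offset.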
Mark Johnston d28af1abf0 vm: Add a mode to vm_object_page_remove() which skips invalid pages
This will be used to break a deadlock in ZFS between the per-mountpoint
teardown lock and page busy locks.  In particular, when purging data
from the page cache during dataset rollback, we want to avoid blocking
on the busy state of invalid pages since the busying thread may be
blocked on the teardown lock in zfs_getpages().

Add a helper, vn_pages_remove_valid(), for use by filesystems.  Bump
__FreeBSD_version so that the OpenZFS port can make use of the new
helper.

PR:		258208
Reviewed by:	avg, kib, sef
Tested by:	pho (part of a larger patch)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32931
2021-11-15 13:01:30 -05:00
Mark Johnston a2665158d0 vm_page: Remove vm_page_sbusy() and vm_page_xbusy()
They are unused today and cannot be safely used in the face of unlocked
lookup, in which pages may be busied without the object lock held.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32948
2021-11-15 13:01:30 -05:00
Mark Johnston 87b646630c vm_page: Consolidate page busy sleep mechanisms
- Modify vm_page_busy_sleep() and vm_page_busy_sleep_unlocked() to take
  a VM_ALLOC_* flag indicating whether to sleep on shared-busy, and fix
  up callers.
- Modify vm_page_busy_sleep() to return a status indicating whether the
  object lock was dropped, and fix up callers.
- Convert callers of vm_page_sleep_if_busy() to use vm_page_busy_sleep()
  instead.
- Remove vm_page_sleep_if_(x)busy().

No functional change intended.

Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D32947
2021-11-15 13:01:30 -05:00
Mark Johnston b0acc3f11b vm_pager: Optimize an assertion
Obtained from:	jeff (object_concurrency patches)
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D32946
2021-11-15 13:01:30 -05:00
Mark Johnston e4bdb6857a vm_page: Handle VM_ALLOC_NORECLAIM in the contiguous page allocator
We added _NORECLAIM to request that kmem_alloc_contig_pages() not spend
time scanning physical memory for candidates to reclaim.  In some
situations the scanning can induce large amounts of undesirable latency,
and it's less important that the request be satisfied than it is that we
not spend many milliseconds scanning.

The problem extends to vm_reserv_reclaim_contig(), which unlike
vm_reserv_reclaim() may have to scan the entire list of partially
populated reservations.  Use VM_ALLOC_NORECLAIM to request that this
scan not be executed.[1]

As a side effect, this fixes a regression in 02fb0585e7 ("vm_page:
Drop handling of VM_ALLOC_NOOBJ in vm_page_alloc_contig_domain()")
where VM_ALLOC_CONTIG was not included in VPAC_FLAGS or VPANC_FLAGS even
though it is not masked by kmem_alloc_contig_pages().[2]

Reported by:	gallatin [1], glebius [2]
Reviewed by:	alc, glebius, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D32899
2021-11-11 14:26:41 -05:00