Commit graph

529 commits

Author SHA1 Message Date
Doug Moore 28af3eb62b Revert "swap_pager: small improvement to find_least"
This reverts commit dd0e5c02ab.
2024-06-16 10:37:16 -05:00
Doug Moore 2a21cfe60f pctrie: avoid typecast
Have PCTRIE_RECLAIM_CALLBACK typecast one function pointer type to
another, to relieve the writer of the call back function from having
to cast its first argument from void* to member type.

Reviewed by:	rlibby
Differential Revision:	https://reviews.freebsd.org/D45586
2024-06-14 02:19:03 -05:00
Doug Moore d2acf0a447 swap_pager: pctrie_reclaim_cb in meta_free_all
Replace the lookup-remove loop in swp_pager_meta_free_all with a call
to SWAP_PCTRIE_RECLAIM_CALLBACK, to eliminate repeated trie searches.

Reviewed by:	rlibby
Differential Revision:	https://reviews.freebsd.org/D45583
2024-06-13 13:52:25 -05:00
Doug Moore a880104a21 swap_pager: add new page range struct
Define a page_range struct to pair up the two values passed to
freerange functions. Have swp_pager_freeswapspace also take a
page_range argument rather than a pair of arguments.

In swp_pager_meta_free_all, drop a needless test and use a new
helper function to do the cleanup for each swap block.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D45562
2024-06-11 22:54:39 -05:00
Doug Moore dd0e5c02ab swap_pager: small improvement to find_least
Drop an unneeded test, a branch and a needless computation to save a
few instructions.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D45558
2024-06-11 11:36:23 -05:00
Konstantin Belousov 6ada4e8a0a swap-like pagers: assert that writemapping decrease does not pass zero
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D45119
2024-05-13 21:33:29 +03:00
Mark Johnston 4696650782 swap_pager: Unbusy readahead pages after an I/O error
The swap pager itself allocates readahead pages, so should take care to
unbusy them after a read error, just as it does in the non-error case.

PR:		277538
Reviewed by:	olce, dougm, alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D44646
2024-04-08 09:02:48 -04:00
Warner Losh 29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
Mark Johnston e61568aeee swap_pager: Fix a race in swap_pager_swapoff_object()
When we disable swapping to a device, we scan the full VM object list
looking for objects with swap trie nodes that reference the device in
question.  The pages corresponding to those nodes are paged in.

While paging in, we drop the VM object lock.  Moreover, we do not hold a
reference for the object; swap_pager_swapoff_object() merely bumps the
paging-in-progress counter.  vm_object_terminate() waits for this
counter to drain before proceeding and freeing pages.

However, swap_pager_swapoff_object() decrements the counter before
re-acquiring the VM object lock, which means that vm_object_terminate()
can race to acquire the lock and free the pages.  Then,
swap_pager_swapoff_object() ends up unbusying a freed page.  Fix the
problem by acquiring the lock before waking up sleepers.

PR:		273610
Reported by:	Graham Perrin <grahamperrin@gmail.com>
Reviewed by:	kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D42029
2023-10-02 07:49:52 -04:00
Warner Losh 685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Dimitry Andric f74be55e30 vm: fix a number of functions to match the expected prototypes
Noticed while attempting to make boolean_t unsigned: some vm-related
function declarations and defintions were using boolean_t where they
should have used int, and vice versa.

MFC after:	1 week
Reviewed by:	jhb
Differential Revision: https://reviews.freebsd.org/D39753
2023-04-25 19:58:18 +02:00
Konstantin Belousov 645510e62e Provide consistent prototype for swp_pager_meta_free()
This should fix 32bit build breakage.

Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2022-12-09 17:23:09 +02:00
Konstantin Belousov baa1ccceef Make swap_pager_freespace() global
also make it return the count of the swap pages freed, which are not
simultaneously resident in the object.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37097
2022-12-09 14:15:37 +02:00
Konstantin Belousov 5bd45b2ba3 swap_pager_find_least(): assert that the function is called on the right object type
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D37024
2022-10-19 20:24:07 +03:00
Konstantin Belousov 26eed2aa06 swap_pager: style, wrap long lines
Reviewed by:	brooks, imp (previous version)
Discussed with:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36540
2022-09-16 23:23:13 +03:00
Konstantin Belousov ccdaa1ab1c vm_overcommit: put into __read_mostly section
Suggested by:	mjg
Reviewed by:	brooks, imp (previous version)
Discussed with:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36540
2022-09-16 23:23:04 +03:00
Konstantin Belousov a6cc4c6e98 vm: make vm.overcommit available externally
Reviewed by:	brooks, imp (previous version)
Discussed with:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D36540
2022-09-16 23:22:49 +03:00
Alan Cox 54291f7d65 swap_pager: Reduce the scope of the object lock in putpages
We don't need to hold the object lock while allocating swap space, so
don't.

Reviewed by:	dougm, kib, markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D35839
2022-07-18 22:35:49 -05:00
Mark Johnston fffc1c594a vm_object: Release object swap charge in the swap pager destructor
With the removal of OBJT_DEFAULT, we can simply handle this in
swap_pager_dealloc().  No functional change intended.

Suggested by:	alc
Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35787
2022-07-17 07:09:48 -04:00
Mark Johnston cb6757c0a6 swap_pager: Removing handling for objects with OBJ_SWAP clear
With the removal of OBJT_DEFAULT, we can assume that pager operations
provide an object with OBJ_SWAP set.  Also, we do not need to convert
objects from type OBJT_DEFAULT.  Thus, remove checks for OBJ_SWAP and
remove code which modifies the object type.  In some places, replace the
check for OBJ_SWAP with a check for whether any swap blocks are
assigned.

Reviewed by:	alc, kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D35786
2022-07-17 07:09:48 -04:00
John Baldwin b8ebd99aa5 vm: Use __diagused for variables only used in KASSERT(). 2022-04-13 16:08:20 -07:00
Enji Cooper 567378cc07 Fix OID format for vm.swap_reserved and vm.swap_total
The correct OID format for CTLTYPE_U64 is `QU` (`uquad_t`), not `A`
(text expressed via `char *`).

This issue was noticed while doing an sysctl tree walk using a
sysctl(9) consumer that relies on the OID format to intuit what the
type should be for a given sysctl.

MFC after:	1 month
Sponsored by:	DellEMC Isilon
Differential Revision: https://reviews.freebsd.org/D34877
2022-04-10 18:17:09 -07:00
Mateusz Guzik bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd) 2022-03-24 10:20:51 +00:00
Mark Johnston 43b3b8e52d swap_pager: uma_zcreate() doesn't fail
Remove always-false checks for UMA zone creation failure.  No functional
change intended.

Reviewed by:	alc, kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D33809
2022-01-11 09:27:45 -05:00
Konstantin Belousov 5346570276 swapoff: add one more variant of the syscall
Requested and reviewed by:	brooks
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33343
2021-12-09 02:48:46 +02:00
Konstantin Belousov e8dc2ba29c swapoff(2): add a SWAPOFF_FORCE flag
The flag requests skipping the heuristic which tries to avoid leaving
system with more allocated memory than available from RAM and remanining
swap.

Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov a4e4132fa3 swapoff(2): replace special device name argument with a structure
For compatibility, add a placeholder pointer to the start of the
added struct swapoff_new_args, and use it to distinguish old vs. new
style of syscall invocation.

Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov 6df359449f swap_pager.c: Remove MPSAFE and ARGSUSED annotations
Reviewed by:	markj
Discussed with:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33165
2021-12-05 00:20:58 +02:00
Konstantin Belousov 0190c38b9d swapoff_one(): only check free pages count manually turning swap off
When swap is turned off due to system shutdown or reboot, ignore the
check.  Problem is that the check is not accurate by any means, free
page count can legitimately be low while system still able to page in
everything from the swap.  Then, we turn swap off if swapping on
real file or some non-standard geom provider, and typically panic
when system appears to actually need to unavailable page.

For syscall, it is better to be safe than sorry.

Reported and tested by:	peterj
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33147
2021-11-29 18:38:02 +02:00
Mateusz Guzik 7e1d3eefd4 vfs: remove the unused thread argument from NDINIT*
See b4a58fbf64 ("vfs: remove cn_thread")

Bump __FreeBSD_version to 1400043.
2021-11-25 22:50:42 +00:00
Konstantin Belousov b19740f4ce swap_pager: lock vnode in swapdev_strategy()
VOP_STRATEGY() requires locked vnode.  Note that we lock the swap vnode
while pages are busy, but this would only cause real LoR if pages belong
to the swap vnode, which must not be the case for correct use.

Reported and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:50 +02:00
Konstantin Belousov 6ddf41faa6 swapon: extend the region where the swap vnode is locked
to cover VOP_GETATTR() call in sys_swapon().  Move locking from inside
swapongeom() and swaponvp() into sys_swapon().

Reported by and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:44 +02:00
Konstantin Belousov a6d04f34a4 swap pager: lock vnode around VOP_CLOSE()
Reported and tested by:	peterj
Reviewed by:	markj
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33119
2021-11-25 21:34:39 +02:00
Gleb Smirnoff 183f8e1e57 Externalize nsw_cluster_max and initialize it early.
GEOM_ELI needs to know the value, cause it will soon have special
memory handling for IO operations associated with swap.

Move initialization to swap_pager_init(), which is executed at
SI_SUB_VM, unlike swap_pager_swap_init(), which would be executed
only when a swap is configured. GEOM_ELI might need the value at
SI_SUB_DRIVERS, when disks are tasted by GEOM.

Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D24400
2021-09-28 11:23:52 -07:00
Gleb Smirnoff c6213beff4 Add flag BIO_SWAP to mark IOs that are associated with swap.
Submitted by:		jtl
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D24400
2021-09-28 11:23:51 -07:00
Mark Johnston 686aa9287c swap_pager: Handle large swap_pager_reserve() requests
This interface is used solely by md(4) when the MD_RESERVE flag is
specified, as in `mdconfig -a -t swap -s 1G -o reserve`.  It
pre-allocates swap blocks for the entire object.

The number of blocks to be reserved is specified as a vm_size_t, but
swp_pager_getswapspace() can allocate at most INT_MAX blocks.  vm_size_t
also seems like the incorrect type to use here it refers only to the
size of the VM object, not the size of a mapping.  So:
- change the type of "size" in swap_pager_reserve() to vm_pindex_t, and
- clamp the requested number of blocks for a single
  swp_pager_getswapspace() call to INT_MAX.

Reported by:	syzkaller
Reviewed by:	dougm, alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D31875
2021-09-07 14:04:50 -04:00
Konstantin Belousov 28bc23ab92 tmpfs: dynamically register tmpfs pager
Remove OBJT_SWAP_TMPFS. Move tmpfs-specific swap pager bits into
tmpfs_subr.c.

There is no longer any code to directly support tmpfs in sys/vm, most
tmpfs knowledge is shared by non-anon swap object type implementation.
The tmpfs-specific methods are provided by registered tmpfs pager, which
inherits from the swap pager.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30168
2021-05-13 20:13:34 +03:00
Konstantin Belousov 00a3fe968b vm_object_kvme_type(): reimplement by embedding kvme_type into pagerops
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30168
2021-05-13 20:10:35 +03:00
Mark Johnston 06d1fd9f42 swap_pager: Zero swap info before exporting to userspace
Otherwise padding bytes are leaked.

Reported by:	KMSAN
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
2021-05-12 12:52:05 -04:00
Konstantin Belousov d474440ab3 Constify vm_pager-related virtual tables.
Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov 4b8365d752 Add OBJT_SWAP_TMPFS pager
This is OBJT_SWAP pager, specialized for tmpfs.  Right now, both swap pager
and generic vm code have to explicitly handle swap objects which are tmpfs
vnode v_object, in the special ways.  Replace (almost) all such places with
proper methods.

Since VM still needs a notion of the 'swap object', regardless of its
use, add yet another type-classification flag OBJ_SWAP. Set it in
vm_object_allocate() where other type-class flags are set.

This change almost completely eliminates the knowledge of tmpfs from VM,
and opens a way to make OBJT_SWAP_TMPFS loadable from tmpfs.ko.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov a7c198a24b Implement vm_object_vnode() using vm_pager_getvp()
Allow vp_heldp argument to be NULL, in which case the returned vnode
is not held for tmpfs swap objects.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov 1390a5cbeb Add pgo_freespace method
Makes the code in vm_object collapse/page_remove cleaner

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov 192112b74f Add pgo_getvp method
This eliminates the staircase of conditions in vm_map_entry_set_vnode_text().

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov c23c555bc1 Add pgo_mightbedirty method
Used to implement vm_object_mightbedirty()

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov 180bcaa46c vm_pager: add pgo_set_writeable_dirty method
specialized for swap and vnode pagers, and used to implement
vm_object_set_writeable_dirty().

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:03 +03:00
Konstantin Belousov a0850dd057 swappagerops: slightly more style-compliant formatting
Remove excess spaces from comments.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D30070
2021-05-07 17:08:02 +03:00
Konstantin Belousov ecfbddf0cd sysctl vm.objects: report backing object and swap use
For anonymous objects, provide a handle kvo_me naming the object,
and report the handle of the backing object.  This allows userspace
to deconstruct the shadow chain.  Right now the handle is the address
of the object in KVA, but this is not guaranteed.

For the same anonymous objects, report the swap space used for actually
swapped out pages, in kvo_swapped field.  I do not believe that it is
useful to report full 64bit counter there, so only uint32_t value is
returned, clamped to the max.

For kinfo_vmentry, report anonymous object handle backing the entry,
so that the shadow chain for the specific mapping can be deconstructed.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D29771
2021-04-19 21:32:01 +03:00
Konstantin Belousov cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Mateusz Guzik c3aa3bf97c vm: clean up empty lines in .c and .h files 2020-09-01 21:20:45 +00:00