Commit graph

143 commits

Author SHA1 Message Date
John Baldwin 9e0164087c vm: Change the return types of kernacc and useracc to bool
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D45156
2024-05-10 13:43:56 -07:00
Bojan Novković 7a79d06697 vm: improve kstack_object pindex calculation to avoid pindex holes
This commit replaces the linear transformation of kernel virtual
addresses to kstack_object pindex values with a non-linear
scheme that circumvents physical memory fragmentation caused by
kernel stack guard pages. The new mapping scheme is used to
effectively "skip" guard pages and assign pindices for
non-guard pages in a contiguous fashion.

The new allocation scheme requires that all default-sized kstack KVAs
come from a separate, specially aligned region of the KVA space.
For this to work, this commited introduces a dedicated per-domain
kstack KVA arena used to allocate kernel stacks of default size.
The behaviour on 32-bit platforms remains unchanged due to a
significatly smaller KVA space.

Aside from fullfilling the requirements imposed by the new scheme, a
separate kstack KVA arena facilitates superpage promotion in the rest
of kernel and causes most kstacks to have guard pages at both ends.

Reviewed by:  alc, kib, markj
Tested by:    markj
Approved by:  markj (mentor)
Differential Revision: https://reviews.freebsd.org/D38852
2024-04-10 17:37:20 +02:00
Andrew Turner 839999e7ef vm: Add kva_alloc_aligned
Add a function like kva_alloc that allows us to specify the alignment
of the virtual address space returned.

Reviewed by:	alc, kib, markj
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D42788
2023-11-30 10:50:03 +00:00
Warner Losh 29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
Warner Losh 2ff63af9b8 sys: Remove $FreeBSD$: one-line .h pattern
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:18 -06:00
John Baldwin f49fd63a6a kmem_malloc/free: Use void * instead of vm_offset_t for kernel pointers.
Reviewed by:	kib, markj
Sponsored by:	DARPA
Differential Revision:	https://reviews.freebsd.org/D36549
2022-09-22 15:09:19 -07:00
Konstantin Belousov d950c5898a vm/vm_extern.h, vm/vm_page.h: use sys/kassert.h
instead of fatty sys/systm.h.

Suggested by:	jhb
Reviewed by:	alc, imp, jhb (previous version)
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D34089
2022-02-01 05:55:35 +02:00
John Baldwin 29d481ae6a Make <vm/vm_extern.h> more self-contained.
Add a nested include of <sys/systm.h> for recently added assertions.
Without this, existing code (such as in drm-kmod) needs to be patched
to add the newly required header.

While here, rewrite the assertions using KASSERT().

Reviewed by:	dougm, alc, imp, kib
Differential Revision:	https://reviews.freebsd.org/D34070
2022-01-28 13:14:03 -08:00
Doug Moore ae13829ddc vm_addr_ok: add power2 invariant check
With INVARIANTS defined, have vm_addr_align_ok and vm_addr_bound_ok
panic when passed an alignment/boundary parameter that is not a power
of two.

Reviewed by:	alc
Suggested by:	kib, se
Differential Revision:	https://reviews.freebsd.org/D33725
2022-01-10 01:17:25 -06:00
Doug Moore c606ab59e7 vm_extern: use standard address checkers everywhere
Define simple functions for alignment and boundary checks and use them
everywhere instead of having slightly different implementations
scattered about. Define them in vm_extern.h and use them where
possible where vm_extern.h is included.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D33685
2021-12-30 22:09:08 -06:00
Ka Ho Ng de2e152959 Add vnode_pager_purge_range(9) KPI
This KPI is created in addition to the existing vnode_pager_setsize(9)
KPI. The KPI is intended for file systems that are able to turn a range
of file into sparse range, also known as hole-punching.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D27194
2021-08-05 22:52:26 +08:00
Mark Johnston 7dd979dfef Remove the VM map zone.
Today, the zone is only used to allocate a trio of kernel maps: the
kernel map itself, and the exec and pipe submaps.  Maps for user
processes are dynamically allocated but are embedded in the vmspace
structure, which is allocated from its own zone.  Make the
aforementioned kernel maps statically allocated and get rid of the zone.

While here, remove a stale comment above vmspace_alloc() and change the
names of locks initialized in vm_map_init() to match vmspace_zinit().

Reported by:	alc
Reviewed by:	alc, kib
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D26052
2020-08-17 13:02:01 +00:00
Mark Johnston f13fa9df05 Use a single VM object for kernel stacks.
Previously we allocated a separate VM object for each kernel stack.
However, fully constructed kernel stacks are cached by UMA, so there is
no harm in using a single global object for all stacks.  This reduces
memory consumption and makes it easier to define a memory allocation
policy for kernel stack pages, with the aim of reducing physical memory
fragmentation.

Add a global kstack_object, and use the stack KVA address to index into
the object like we do with kernel_object.

Reviewed by:	kib
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D24473
2020-04-26 20:08:57 +00:00
Konstantin Belousov df08823d07 Improve MD page fault handlers.
Centralize calculation of signal and ucode delivered on unhandled page
fault in new function vm_fault_trap().  MD trap_pfault() now almost
always uses the signal numbers and error codes calculated in
consistent MI way.

This introduces the protection fault compatibility sysctls to all
non-x86 architectures which did not have that bug, but apparently they
were already much more wrong in selecting delivered signals on
protection violations.

Change the delivered signal for accesses to mapped area after the
backing object was truncated.  According to POSIX description for
mmap(2):
   The system shall always zero-fill any partial page at the end of an
   object. Further, the system shall never write out any modified
   portions of the last page of an object which are beyond its
   end. References within the address range starting at pa and
   continuing for len bytes to whole pages following the end of an
   object shall result in delivery of a SIGBUS signal.

   An implementation may generate SIGBUS signals when a reference
   would cause an error in the mapped object, such as out-of-space
   condition.
Adjust according to the description, keeping the existing
compatibility code for SIGSEGV/SIGBUS on protection failures.

For situations where kernel cannot handle page fault due to resource
limit enforcement, SIGBUS with a new error code BUS_OBJERR is
delivered.  Also, provide a new error code SEGV_PKUERR for SIGSEGV on
amd64 due to protection key access violation.

vm_fault_hold() is renamed to vm_fault().  Fixed some nits in
trap_pfault()s like mis-interpreting Mach errors as errnos.  Removed
unneeded truncations of the fault addresses reported by hardware.

PR:	211924
Reviewed by:	alc
Discussed with:	jilles, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D21566
2019-09-27 18:43:36 +00:00
Mark Johnston 9978bd996b Add malloc_domainset(9) and _domainset variants to other allocator KPIs.
Remove malloc_domain(9) and most other _domain KPIs added in r327900.
The new functions allow the caller to specify a general NUMA domain
selection policy, rather than specifically requesting an allocation from
a specific domain.  The latter policy tends to interact poorly with
M_WAITOK, resulting in situations where a caller is blocked indefinitely
because the specified domain is depleted.  Most existing consumers of
the _domain KPIs are converted to instead use a DOMAINSET_PREF() policy,
in which we fall back to other domains to satisfy the allocation
request.

This change also defines a set of DOMAINSET_FIXED() policies, which
only permit allocations from the specified domain.

Discussed with:	gallatin, jeff
Reported and tested by:	pho (previous version)
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D17418
2018-10-30 18:26:34 +00:00
Alan Cox 49bfa624ac Eliminate the arena parameter to kmem_free(). Implicitly this corrects an
error in the function hypercall_memfree(), where the wrong arena was being
passed to kmem_free().

Introduce a per-page flag, VPO_KMEM_EXEC, to mark physical pages that are
mapped in kmem with execute permissions.  Use this flag to determine which
arena the kmem virtual addresses are returned to.

Eliminate UMA_SLAB_KRWX.  The introduction of VPO_KMEM_EXEC makes it
redundant.

Update the nearby comment for UMA_SLAB_KERNEL.

Reviewed by:	kib, markj
Discussed with:	jeff
Approved by:	re (marius)
Differential Revision:	https://reviews.freebsd.org/D16845
2018-08-25 19:38:08 +00:00
Alan Cox 83a90bffd8 Eliminate kmem_malloc()'s unused arena parameter. (The arena parameter
became unused in FreeBSD 12.x as a side-effect of the NUMA-related
changes.)

Reviewed by:	kib, markj
Discussed with:	jeff, re@
Differential Revision:	https://reviews.freebsd.org/D16825
2018-08-21 16:43:46 +00:00
Alan Cox 44d0efb215 Eliminate kmem_alloc_contig()'s unused arena parameter.
Reviewed by:	hselasky, kib, markj
Discussed with:	jeff
Differential Revision:	https://reviews.freebsd.org/D16799
2018-08-20 15:57:27 +00:00
Alan Cox db7c2a4822 Eliminate the unused arena parameter from kmem_alloc_attr().
Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D16793
2018-08-18 22:07:48 +00:00
Alan Cox 067fd85894 Eliminate the arena parameter to kmem_malloc_domain(). It is redundant.
The domain and flags parameters suffice.  In fact, the related functions
kmem_alloc_{attr,contig}_domain() don't have an arena parameter.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D16713
2018-08-18 18:33:50 +00:00
Mark Johnston 483f692ea6 Have preload_delete_name() free pages backing preloaded data.
On i386 and amd64, add a vm_phys segment for physical memory used to
store the kernel binary and other preloaded data.  This makes it
possible to free such memory back to the system once it is no longer
needed, e.g., when a preloaded kernel module is unloaded.  Previously,
it would have remained unused.

Reviewed by:	kib, royger
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D16330
2018-07-19 20:00:28 +00:00
Jonathan T. Looney 0766f278d8 Make UMA and malloc(9) return non-executable memory in most cases.
Most kernel memory that is allocated after boot does not need to be
executable.  There are a few exceptions.  For example, kernel modules
do need executable memory, but they don't use UMA or malloc(9).  The
BPF JIT compiler also needs executable memory and did use malloc(9)
until r317072.

(Note that a side effect of r316767 was that the "small allocation"
path in UMA on amd64 already returned non-executable memory.  This
meant that some calls to malloc(9) or the UMA zone(9) allocator could
return executable memory, while others could return non-executable
memory.  This change makes the behavior consistent.)

This change makes malloc(9) return non-executable memory unless the new
M_EXEC flag is specified.  After this change, the UMA zone(9) allocator
will always return non-executable memory, and a KASSERT will catch
attempts to use the M_EXEC flag to allocate executable memory using
uma_zalloc() or its variants.

Allocations that do need executable memory have various choices.  They
may use the M_EXEC flag to malloc(9), or they may use a different VM
interfact to obtain executable pages.

Now that malloc(9) again allows executable allocations, this change also
reverts most of r317072.

PR:		228927
Reviewed by:	alc, kib, markj, jhb (previous version)
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D15691
2018-06-13 17:04:41 +00:00
Jeff Roberson e2068d0bcd Use per-domain locks for vm page queue free. Move paging control from
global to per-domain state.  Protect reservations with the free lock
from the domain that they belong to.  Refactor to make vm domains more
of a first class object.

Reviewed by:    markj, kib, gallatin
Tested by:      pho
Sponsored by:   Netflix, Dell/EMC Isilon
Differential Revision:  https://reviews.freebsd.org/D14000
2018-02-06 22:10:07 +00:00
Jeff Roberson 7a469c8ef3 Implement NUMA policy for kmem_*(9). This maintains compatibility with
reservations by giving each memory domain its own KVA space in vmem that
is naturally aligned on superpage boundaries.

Reviewed by:	alc, markj, kib  (some objections)
Sponsored by:	Netflix, Dell/EMC Isilon
Tested by;	pho
Differential Revision:	https://reviews.freebsd.org/D13289
2018-01-12 23:13:55 +00:00
Pedro F. Giffuni 51369649b0 sys: further adoption of SPDX licensing ID tags.
Mainly focus on files that use BSD 3-Clause license.

The Software Package Data Exchange (SPDX) group provides a specification
to make it easier for automated tools to detect and summarize well known
opensource licenses. We are gradually adopting the specification, noting
that the tags are considered only advisory and do not, in any way,
superceed or replace the license texts.

Special thanks to Wind River for providing access to "The Duke of
Highlander" tool: an older (2014) run over FreeBSD tree was useful as a
starting point.
2017-11-20 19:43:44 +00:00
Konstantin Belousov ac04195ba6 Move swapout code into vm/vm_swapout.c.
There is no NO_SWAPPING #ifdef left in the code.

Requested by:	alc
Reviewed by:	alc, markj
Sponsored by:	The FreeBSD Foundation
MFC after:	3 weeks
Differential revision:	https://reviews.freebsd.org/D12663
2017-10-20 09:10:49 +00:00
Warner Losh fbbd9655e5 Renumber copyright clause 4
Renumber cluase 4 to 3, per what everybody else did when BSD granted
them permission to remove clause 3. My insistance on keeping the same
numbering for legal reasons is too pedantic, so give up on that point.

Submitted by:	Jan Schaumann <jschauma@stevens.edu>
Pull Request:	https://github.com/freebsd/freebsd/pull/96
2017-02-28 23:42:47 +00:00
Konstantin Belousov 496ab0532d Rework r313352.
Rename kern_vm_* functions to kern_*.  Move the prototypes to
syscallsubr.h.  Also change Mach VM types to uintptr_t/size_t as
needed, to avoid headers pollution.

Requested by:	alc, jhb
Reviewed by:	alc
Sponsored by:	The FreeBSD Foundation
MFC after:	2 weeks
Differential revision:	https://reviews.freebsd.org/D9535
2017-02-13 09:04:38 +00:00
Konstantin Belousov 1c2ad3e962 Change type of the prot parameter for kern_vm_mmap() from vm_prot_t to int.
This makes the code to pass whole word of the mmap(2) syscall argument
prot to the syscall helper kern_vm_mmap(), which can validate all
bits.  The change provides temporal fix for sys/vm/mmap_test
mmap__bad_arguments, which was broken after r313352.

PR:	216976
Reported and tested by:	ngie
Sponsored by:	The FreeBSD Foundation
2017-02-11 20:27:39 +00:00
Edward Tomasz Napierala 69cdfcef2e Add kern_vm_mmap2(), kern_vm_mprotect(), kern_vm_msync(), kern_vm_munlock(),
kern_vm_munmap(), and kern_vm_madvise(), and use them in various compats
instead of their sys_*() counterparts.

Reviewed by:	ed, dchagin, kib
MFC after:	2 weeks
Sponsored by:	DARPA, AFRL
Differential Revision:	https://reviews.freebsd.org/D9378
2017-02-06 20:57:12 +00:00
John Baldwin 8a4dc40ff4 Various cleanups to the main function for AIO kernel processes:
- Pull the vmspace logic out into helper functions and reduce duplication.
  Operations on the vmspace are all isolated to vm_map.c, but it now exports
  a new 'vmspace_switch_aio' for use by AIO kernel processes.
- When an AIO kernel process wants to exit, break out of the main loop and
  perform cleanup after the loop end.  This reduces a lot of indentation and
  allows cleanup to more closely mirror setup actions before the loop starts.
- Convert a DIAGNOSTIC to KASSERT().
- Replace mycp with more typical 'p'.

Reviewed by:	kib
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D4990
2016-01-19 21:37:51 +00:00
John Baldwin 7077c42623 Add a new file operations hook for mmap operations. File type-specific
logic is now placed in the mmap hook implementation rather than requiring
it to be placed in sys/vm/vm_mmap.c.  This hook allows new file types to
support mmap() as well as potentially allowing mmap() for existing file
types that do not currently support any mapping.

The vm_mmap() function is now split up into two functions.  A new
vm_mmap_object() function handles the "back half" of vm_mmap() and accepts
a referenced VM object to map rather than a (handle, handle_type) tuple.
vm_mmap() is now reduced to converting a (handle, handle_type) tuple to a
a VM object and then calling vm_mmap_object() to handle the actual mapping.
The vm_mmap() function remains for use by other parts of the kernel
(e.g. device drivers and exec) but now only supports mapping vnodes,
character devices, and anonymous memory.

The mmap() system call invokes vm_mmap_object() directly with a NULL object
for anonymous mappings.  For mappings using a file descriptor, the
descriptors fo_mmap() hook is invoked instead.  The fo_mmap() hook is
responsible for performing type-specific checks and adjustments to
arguments as well as possibly modifying mapping parameters such as flags
or the object offset.  The fo_mmap() hook routines then call
vm_mmap_object() to handle the actual mapping.

The fo_mmap() hook is optional.  If it is not set, then fo_mmap() will
fail with ENODEV.  A fo_mmap() hook is implemented for regular files,
character devices, and shared memory objects (created via shm_open()).

While here, consistently use the VM_PROT_* constants for the vm_prot_t
type for the 'prot' variable passed to vm_mmap() and vm_mmap_object()
as well as the vm_mmap_vnode() and vm_mmap_cdev() helper routines.
Previously some places were using the mmap()-specific PROT_* constants
instead.  While this happens to work because PROT_xx == VM_PROT_xx,
using VM_PROT_* is more correct.

Differential Revision:	https://reviews.freebsd.org/D2658
Reviewed by:	alc (glanced over), kib
MFC after:	1 month
Sponsored by:	Chelsio
2015-06-04 19:41:15 +00:00
Alan Cox 66cd575b28 Handle wiring failures in vm_map_wire() with the new functions
pmap_unwire() and vm_object_unwire().

Retire vm_fault_{un,}wire(), since they are no longer used.

(See r268327 and r269134 for the motivation behind this change.)

Reviewed by:	kib
Sponsored by:	EMC / Isilon Storage Division
2014-08-02 16:10:24 +00:00
Alan Cox 0346250941 When unwiring a region of an address space, do not assume that the
underlying physical pages are mapped by the pmap.  If, for example, the
application has performed an mprotect(..., PROT_NONE) on any part of the
wired region, then those pages will no longer be mapped by the pmap.
So, using the pmap to lookup the wired pages in order to unwire them
doesn't always work, and when it doesn't work wired pages are leaked.

To avoid the leak, introduce and use a new function vm_object_unwire()
that locates the wired pages by traversing the object and its backing
objects.

At the same time, switch from using pmap_change_wiring() to the recently
introduced function pmap_unwire() for unwiring the region's mappings.
pmap_unwire() is faster, because it operates a range of virtual addresses
rather than a single virtual page at a time.  Moreover, by operating on
a range, it is superpage friendly.  It doesn't waste time performing
unnecessary demotions.

Reported by:	markj
Reviewed by:	kib
Tested by:	pho, jmg (arm)
Sponsored by:	EMC / Isilon Storage Division
2014-07-26 18:10:18 +00:00
Neel Natu 74d1d2b7cc Merge the following changes from projects/bhyve_npt_pmap:
- add fields to 'struct pmap' that are required to manage nested page tables.
- add a parameter to 'vmspace_alloc()' that can be used to override the
  default pmap initialization routine 'pmap_pinit()'.

These changes are pushed ahead of the remaining changes in 'bhyve_npt_pmap'
in anticipation of the upcoming KBI freeze for 10.0.

Reviewed by:	kib@, alc@
Approved by:	re (glebius)
2013-09-20 17:06:49 +00:00
Jeff Roberson 5df87b21d3 Replace kernel virtual address space allocation with vmem. This provides
transparent layering and better fragmentation.

 - Normalize functions that allocate memory to use kmem_*
 - Those that allocate address space are named kva_*
 - Those that operate on maps are named kmap_*
 - Implement recursive allocation handling for kmem_arena in vmem.

Reviewed by:	alc
Tested by:	pho
Sponsored by:	EMC / Isilon Storage Division
2013-08-07 06:21:20 +00:00
Attilio Rao be99683637 Revert r253939:
We cannot busy a page before doing pagefaults.
Infact, it can deadlock against vnode lock, as it tries to vget().
Other functions, right now, have an opposite lock ordering, like
vm_object_sync(), which acquires the vnode lock first and then
sleeps on the busy mechanism.

Before this patch is reinserted we need to break this ordering.

Sponsored by:	EMC / Isilon storage division
Reported by:	kib
2013-08-05 08:55:35 +00:00
Attilio Rao 3b6714cacb The page hold mechanism is fast but it has couple of fallouts:
- It does not let pages respect the LRU policy
- It bloats the active/inactive queues of few pages

Try to avoid it as much as possible with the long-term target to
completely remove it.
Use the soft-busy mechanism to protect page content accesses during
short-term operations (like uiomove_fromphys()).

After this change only vm_fault_quick_hold_pages() is still using the
hold mechanism for page content access.
There is an additional complexity there as the quick path cannot
immediately access the page object to busy the page and the slow path
cannot however busy more than one page a time (to avoid deadlocks).

Fixing such primitive can bring to complete removal of the page hold
mechanism.

Sponsored by:	EMC / Isilon storage division
Discussed with:	alc
Reviewed by:	jeff
Tested by:	pho
2013-08-04 21:07:24 +00:00
Gleb Smirnoff 995d706909 Make sys_mlock() function just a wrapper around vm_mlock() function
that does all the job.

Reviewed by:	kib, jilles
Sponsored by:	Nginx, Inc.
2013-06-08 13:13:40 +00:00
Alan Cox 5c1f2cc4c2 Eliminate vm_phys_bootstrap_alloc(). It was a failed attempt at
eliminating duplicated code in the various pmap implementations.

Micro-optimize vm_phys_free_pages().

Introduce vm_phys_free_contig().  It is fast routine for freeing an
arbitrary number of physically contiguous pages.  In particular, it
doesn't require the number of pages to be a power of two.

Use "u_long" instead of "unsigned long".

Bruce Evans (bde@) has convinced me that the "boundary" parameters
to kmem_alloc_contig(), vm_phys_alloc_contig(), and
vm_reserv_reclaim_contig() should be of type "vm_paddr_t" and not
"u_long".  Make this change.
2011-10-30 05:06:14 +00:00
Alan Cox 1933a67cf4 Use "u_long" instead of "unsigned long". 2011-10-28 22:36:15 +00:00
Konstantin Belousov 2e32165ce0 Extract the code to translate VM error into errno, into an exported
function vm_mmap_to_errno(). It is useful for the drivers that implement
mmap(2)-like functionality, to be able to return error codes consistent
with mmap(2).

Sponsored by:	The FreeBSD Foundation
No objections from:	alc
MFC after:	1 week
2011-07-10 20:49:13 +00:00
Konstantin Belousov 2801687d56 Add a facility to disable processing page faults. When activated,
uiomove generates EFAULT if any accessed address is not mapped, as
opposed to handling the fault.

Sponsored by:	The FreeBSD Foundation
Reviewed by:	alc (previous version)
2011-07-09 15:21:10 +00:00
Alan Cox 4de2261903 Move vm_object_print()'s prototype to the expected place. 2010-12-27 07:12:22 +00:00
Alan Cox 0b47b37621 Retire vm_fault_quick(). It's no longer used.
Reviewed by:	kib@
2010-12-25 23:54:50 +00:00
Alan Cox 82de724fe1 Introduce and use a new VM interface for temporarily pinning pages. This
new interface replaces the combined use of vm_fault_quick() and
pmap_extract_and_hold() throughout the kernel.

In collaboration with:	kib@
2010-12-25 21:26:56 +00:00
Alan Cox acd11c7499 Introduce vm_fault_hold() and use it to (1) eliminate a long-standing race
condition in proc_rwmem() and to (2) simplify the implementation of the
cxgb driver's vm_fault_hold_user_pages().  Specifically, in proc_rwmem()
the requested read or write could fail because the targeted page could be
reclaimed between the calls to vm_fault() and vm_page_hold().

In collaboration with:	kib@
MFC after:	6 weeks
2010-12-20 22:49:31 +00:00
Matthew D Fleming e3813573bd Rework memguard(9) to reserve significantly more KVA to detect
use-after-free over a longer time.  Also release the backing pages of
a guarded allocation at free(9) time to reduce the overhead of using
memguard(9).  Allow setting and varying the malloc type at run-time.
Add knobs to allow:

 - randomly guarding memory
 - adding un-backed KVA guard pages to detect underflow and overflow
 - a lower limit on the size of allocations that are guarded

Reviewed by:    alc
Reviewed by:    brueffer, Ulrich Spörlein <uqs spoerlein net> (man page)
Silence from:   -arch
Approved by:    zml (mentor)
MFC after:      1 month
2010-08-11 22:10:37 +00:00
Juli Mallett ca596a25f0 o) Add a VM find-space option, VMFS_TLB_ALIGNED_SPACE, which searches the
address space for an address as aligned by the new pmap_align_tlb()
   function, which is for constraints imposed by the TLB. [1]
o) Add a kmem_alloc_nofault_space() function, which acts like
   kmem_alloc_nofault() but allows the caller to specify which find-space
   option to use. [1]
o) Use kmem_alloc_nofault_space() with VMFS_TLB_ALIGNED_SPACE to allocate the
   kernel stack address on MIPS. [1]
o) Make pmap_align_tlb() on MIPS align addresses so that they do not start on
   an odd boundary within the TLB, so that they are suitable for insertion as
   wired entries and do not have to share a TLB entry with another mapping,
   assuming they are appropriately-sized.
o) Eliminate md_realstack now that the kstack will be appropriately-aligned on
   MIPS.
o) Increase the number of guard pages to 2 so that we retain the proper
   alignment of the kstack address.

Reviewed by:	[1] alc
X-MFC-after:	Making sure alc has not come up with a better interface.
2010-04-18 22:32:07 +00:00
Alan Cox ac45ee97c9 Initialize the virtual memory-related resource limits in a single place.
Previously, one of these limits was initialized in two places to a
different value in each place.  Moreover, because an unsigned int was used
to represent the amount of pageable physical memory, some of these limits
were incorrectly initialized on 64-bit architectures.  (Currently, this
error is masked by login.conf's default settings.)

Make vm_thread_swapin() and vm_thread_swapout() static.

Submitted by:	bde (an earlier version)
Reviewed by:	kib
2010-04-11 16:26:07 +00:00