vm_phys_enq_chunk() inserts a run of pages into the buddy queues. When
lazy initialization is enabled, only the first page of each run is
initialized; vm_phys_enq_chunk() thus initializes the page following the
just-inserted run.
This fails to account for the possibility that the page following the
run doesn't belong to the segment. Handle that in vm_phys_enq_chunk().
Reported by: KASAN
Reported by: syzbot+1097ef4cee8dfb240e31@syzkaller.appspotmail.com
Fixes: b16b4c22d2 ("vm_page: Implement lazy page initialization")
Add some sanity checks when service jails are used in jails:
- children.max > 0
- children.max - children.cur > 0
The nesting is too deep at those places to have a sane formatting, so no
line wrapping at the usual column.
If someone has a better idea how to format this: feel free to go ahead.
Clarify that the "sysvipc" svcj option inherits from the host / parent.
Add "sysvipcnew" which creates a new SysV namespace for the service
jail.
Sanity check that only one of them is used.
Make it explicit why we must set the trap vector before enabling virtual
memory.
Reviewed by: br, jhb, markj
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45474
This code path should never be hit, if it does it means we did not
bootstrap correctly. Turn it into a panic like we do on amd64 and arm64.
Reviewed by: markj, jhb
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45326
Make sure we do this BEFORE pmap_bootstrap().
Reviewed by: markj, jhb
MFC after: 3 days
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45325
The parse_bias method returns a signed int, with a value of -1 when
the device tree reports nothing of the bias configuration. Convert the
local 'bias' from unsigned to signed to properly check this condition.
PR: 229721
Reviewed by: mhorne
MFC after: 3 days
Storing local time in the RTC is a legacy of 1990s PCs; it's not
relevant on other platforms of interest to FreeBSD.
While here switch to C99 bool.
Sponsored by: The FreeBSD Foundation
Reviewed by: allanjude (earlier), imp (earlier)
Differential Revision: https://reviews.freebsd.org/D45575
Normally, memset isn't used. However for OPT_INIT_ALL=zero it is. Always
include it since we're not space constrained and latter-day loaders won't
include a copy if it's not actually used.
Reviewed by: emaste
Sponsored by: Netflix
libkern.h uses KASSERT, which fails when building in the boot
loader. This is hacked around in a number of other places, but it's
easier to just include sys/kassert.h here. Those other hacks still work,
but are no longer really needed and can be torn down over time.
Reviewed by: emaste
Sponsored by: Netflix
Change 483fe9651 embedded struct inpcb into struct udpcb and updated the
intoudpcb macro to use __containerof to locate it. This change accidentally
introduced a dependency on the identifier inp being defined in the block the
macro is expanded in. This should have been the macro argument ip. This change
makes this simple correction.
No functional change intended.
Reviewed by: kp
Sponsored by: Rubicon Communications, LLC ("Netgate")
Skipping the UTC question via -s will not create or delete
/etc/wall_cmos_clock.
Reported by: Tomoaki AOKI
Reviewed by: imp, allanjude, jrm
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45576
Have PCTRIE_RECLAIM_CALLBACK typecast one function pointer type to
another, to relieve the writer of the call back function from having
to cast its first argument from void* to member type.
Reviewed by: rlibby
Differential Revision: https://reviews.freebsd.org/D45586
vm_phys_seg_paddr_to_vm_page() expects a PA that's in bounds, but
vm_phys_find_range() purposefully returns a pointer to the end of the
last page in a segment.
Fixes: 69cbb18746 ("vm_phys: Add a vm_phys_seg_paddr_to_vm_page() helper")
Most of vmm.h is machine-independent. Simplify merging amd64 and arm64
vmm code by removing this machine-dependent routine from arm64's vmm.h.
No functional change intended.
Reviewed by: andrew
Differential Revision: https://reviews.freebsd.org/D45557
FreeBSD's boot times have decreased to the point where vm_page array
initialization represents a significant fraction of the total boot time.
For example, when booting FreeBSD in Firecracker (a VMM designed to
support lightweight VMs) with 128MB and 1GB of RAM, vm_page
initialization consumes 9% (3ms) and 37% (21.5ms) of the kernel boot
time, respectively. This is generally relevant in cloud environments,
where one wants to be able to spin up VMs as quickly as possible.
This patch implements lazy initialization of (most) page structures,
following a suggestion from cperciva@. The idea is to introduce a new
free pool, VM_FREEPOOL_LAZYINIT, into which all vm_page structures are
initially placed. For this to work, we need only initialize the first
free page of each chunk placed into the buddy allocator. Then, early
page allocations draw from the lazy init pool and initialize vm_page
chunks (up to 16MB, 4096 pages) on demand. Once APs are started, an
idle-priority thread drains the lazy init pool in the background to
avoid introducing extra latency in the allocator. With this scheme,
almost all of the initialization work is moved out of the critical path.
A couple of vm_phys operations require the pool to be drained before
they can run: vm_phys_find_range() and vm_phys_unfree_page(). However,
these are rare operations. I believe that
vm_phys_find_freelist_contig() does not require any special treatment,
as it only ever accesses the first page in a power-of-2-sized free page
chunk, which is always initialized.
For now the new pool is only used on amd64 and arm64, since that's where
I can easily test and those platforms would get the most benefit.
Reviewed by: alc, kib
Differential Revision: https://reviews.freebsd.org/D40403
A subsequent patch will make this factoring more worthwhile.
No functional change intended.
Reviewed by: dougm, alc, kib, emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D40400
This is useful for a subsequent patch which implements lazy
initialization of vm_page structures using a dedicate vm_phys free page
pool.
No functional change intended.
Reviewed by: alc, kib, emaste
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D40399
When running regression tests in paralle, this one occasionally fails
because uniq exits with status 0. I believe this is because the test is
a bit racy: it assumes that true(1) will exit before uniq writes to
standard out.
Just sleep for a bit to give the other end of the pipe to exit.
Reviewed by: des
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45534
jemalloc performs two types of virtual memory allocations: (1) large
chunks of virtual memory, where the chunk size is a multiple of a
superpage and explicitly aligned, and (2) small allocations, mostly
128KB, where no alignment is requested. Typically, it starts with a
small allocation, and over time it makes both types of allocation.
With anon_loc being updated on every allocation, we wind up with a
repeating pattern of a small allocation, a large gap, and a large,
aligned allocation. (As an aside, we wind up allocating a reservation
for these small allocations, but it will never fill because the next
large, aligned allocation updates anon_loc, leaving a gap that will
never be filled with other small allocations.)
With this change, anon_loc isn't updated on every allocation. So, the
small allocations will be clustered together, the large allocations will
be clustered together, and there will be fewer gaps between the
anonymous memory allocations. In addition, I see a small reduction in
reservations allocated (e.g., 1.6% during buildworld), fewer partially
populated reservations, and a small increase in 64KB page promotions on
arm64.
Reviewed by: kib
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39845
Replace the lookup-remove loop in rangeet_remove_all with a call
to SWAP_PCTRIE_RECLAIM_CALLBACK, to eliminate repeated trie searches.
Reviewed by: rlibby
Differential Revision: https://reviews.freebsd.org/D45584
Replace the lookup-remove loop in swp_pager_meta_free_all with a call
to SWAP_PCTRIE_RECLAIM_CALLBACK, to eliminate repeated trie searches.
Reviewed by: rlibby
Differential Revision: https://reviews.freebsd.org/D45583
PCTRIE_RECLAIM frees all the interior nodes in a pctrie, but is little
used because most trie-destroyers want to free leaves of the tree
too. Add PCTRIE_RECLAIM_CALLBACK, with two extra arguments, a callback
function and an auxiliary argument, that is invoked on every non-NULL
leaf in the tree as the tree is destroyed.
Reviewed by: rlibby, kib (previous version)
Differential Revision: https://reviews.freebsd.org/D45565
UTC is Coordinated Universal Time, not Greenwich Mean Time.
Reviewed by: imp, allanjude
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45573
For NFSv4.1/4.2, an atomic upgrade of a delegation from a
read delegation to a write delegation is allowed and can
result in significantly improved performance.
This patch adds this upgrade to the NFSv4.1/4.2 client and
enables use of read delegations.
For a test case of building a FreeBSD kernel (sources and
output objects) over a NFSv4.2 mount, these changes reduced
the elapsed time by 30% and included a reduction of 80% for
RPC counts when delegations were enabled. As such, with this
patch there are at least certain cases where enabling
delegations seems to be worth the increased complexity they
bring.
This patch should only affect the NFSv4.1/4.2 behaviour
when delegations are enabled, which is not the default.
MFC after: 1 month
Since delegations are only issued for regular files, check
v_type to see if the query is for a regular file. This is
a simple optimization for the non-VREG case.
While here, fix a couple of global variable declarations.
This patch should only affect the NFSv4.1/4.2 behaviour
when delegations are enabled, which is not the default.
MFC after: 1 month
NFSv4.1/4.2 defined new OPEN_WANT_xxx flags that a client
can use to hint to the server that delegations are or are
not wanted. This patch adds use of those delegations to
the client.
This patch should only affect the NFSv4.1/4.2 behaviour
when delegations are enabled, which is not the default.
MFC after: 1 month
The data of a TCP packet must fit into the announced window, but this is not
required for the sequence number of the FIN. A packet with the FIN bit set and
containing data that fits exactly into the announced window was blocked. Our
stack generates such packets when the receive buffer size is set to 1024. Now
pf uses only the data lenght for window comparison.
OK henning@
Obtained From: OpenBSD
Sponsored by: Rubicon Communications, LLC ("Netgate")
pf was setting max_win to 0 and discarded retransmitted SYN-ACK segments without
wscale if the original SYN contained a wscale option. with gerhard@, ok
henning@
Obtained From: OpenBSD
Sponsored by: Rubicon Communications, LLC ("Netgate")
This will be used when we add SVE support to reduce the registers
needed to be saved on context switch.
Reviewed by: imp
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D43305
When returning from an exception to userspace clear the saved td_frame.
On the next exception this should point to the frame, however this is
not guaranteed.
To ensure the trap frame pointer is either valid or NULL clear it
before returning to userspace in the EL0 synchronous exception handler.
Reviewed by: kib, markj
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D44807
Some LinuxKPI lock macros pass need a flags field passed in. This is
written to but never read from so gcc complains.
Fix this by marking the flags variables as unused to quieten the
compiler.
Reviewed by: brooks (earlier version), kib
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45303
When a variable in write only and can't be removed, e.g. for API
reasons, it is useful to document this fact similar to __diagused
and __witness_used.
Add __writeonly to tell the compiler and anyone looking at the code
that this variable is expected to only be written to, and to not
raise and error.
Reviewed by: imp, kib
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45561
Remove support for pre-armv6 from nanobsd. It was removed from FreeBSD
in 2020.
Reviewed by: imp, emaste
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D45560
In several places, a loop tests for powers of two, or iterates through
powers of two. In those places, replace the loop with an invocation
of fls or ilog2 without changing the meaning of the code.
Reviewed by: alc, markj, kib, np, erj, avg (previous version)
Differential Revision: https://reviews.freebsd.org/D45494
Define a page_range struct to pair up the two values passed to
freerange functions. Have swp_pager_freeswapspace also take a
page_range argument rather than a pair of arguments.
In swp_pager_meta_free_all, drop a needless test and use a new
helper function to do the cleanup for each swap block.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D45562
This change introduced a test failure, so revert until that can be
addressed.
This reverts commit 888796ade2.
PR: 277783
Reported by: rlibby
Sponsored by: The FreeBSD Foundation