While otherwise a handy potential approach, getting the trap frame via
the argument isn't documented and isn't supposed to be used. With all
uses removed, now remove support to end the mixed calling conventions.
Differential Revision: https://reviews.freebsd.org/D37688
Reviewed by: imp, mhorne
Pull Request: https://github.com/freebsd/freebsd-src/pull/1225
This is mostly to reduce the diff with CheriBSD which adds additional
constants to enum uio_rw, but also matches the normal style used for
uio_segflg.
Reviewed by: kib, emaste
Obtained from: CheriBSD
Differential Revision: https://reviews.freebsd.org/D45142
Introduce hw.bus.devctl_nomatch_enabled and use it to suppress NOMATCH
until devmatch runs
There's a lot of NOMATCH events generated at boot. We also run devmatch
once during early boot to load unmatched devices. To avoid redundant
work, don't start generating NOMATCH events until after devmatch runs.
Set hw.bus.devctl_nomatch_enabled=1 just before we run devmatch. The
kernel will suppress NOMATCH events until this is set to true.
This saves about 170ms from the boot on aarch64 running atop Apple
M-series processors and the VMware Fusion hypervisor.
Reviewed by: imp, cperciva
MFC after: 3 days
Sponsored by: Google Summer of Code
Pull Request: https://github.com/freebsd/freebsd-src/pull/1213
The flag variables behind these are all unsigned. As such adjust the
declarations to match reality and reduce the number of mismatches.
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1126
This matches reality and allows removal of a __DECONST().
Fixes: 4c72d075a5 ("LinuxKPI: const argument to irq_set_affinity_hint()")
Fixes: 9b33b154b5 ("Add support to cpuset for binding hardware interrupts")
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1126
All architecture implementations actually want this to be unsigned.
In INTRNG the equivalent is overtly unsigned. x86 and PowerPC merely avoid
the need to explicitly convert at several points.
Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/1126
Typically, when a DMA transaction requires bouncing, we will break up
the request into segments that are, at maximum, page-sized.
However, in the atypical case of a driver whose maximum segment size is
smaller than PAGE_SIZE, we end up inefficiently assigning each segment
its own bounce page. For example, the dwmmc driver has a maximum segment
size of 2048 (PAGE_SIZE / 2); a 4-page transfer ends up requiring 8
bounce pages in the current scheme.
We should attempt to batch segments into bounce pages more efficiently.
This is achieved by pushing all considerations of the maximum segment
size into the new _bus_dmamap_addsegs() function, which wraps
_bus_dmamap_addseg(). Thus we allocate the minimal number of bounce
pages required to complete the entire transfer, while still performing
the transfer with smaller-sized transactions.
For most drivers with a segment size >= PAGE_SIZE, this will have no
impact. For drivers like dwmmc mentioned above, this improves the memory
and performance efficiency when bouncing a large transfer.
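
The page counts in the dwmmc example can be checked with a small sketch.
The helper names and the fixed 4096-byte page size below are assumptions
made for illustration, not the busdma code itself; the sketch only counts
bounce pages for a page-aligned transfer under the two schemes.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumed page size for this sketch */

/*
 * Per-segment scheme: every segment of at most maxsegsz bytes (capped at
 * one page) is assigned a bounce page of its own.
 */
static size_t
sketch_pages_per_segment(size_t len, size_t maxsegsz)
{
	size_t segsz;

	segsz = maxsegsz < SKETCH_PAGE_SIZE ? maxsegsz : SKETCH_PAGE_SIZE;
	assert(segsz > 0);
	return ((len + segsz - 1) / segsz);
}

/*
 * Batched scheme: segments share bounce pages, so only the number of
 * pages covered by the transfer matters.
 */
static size_t
sketch_pages_batched(size_t len)
{
	return ((len + SKETCH_PAGE_SIZE - 1) / SKETCH_PAGE_SIZE);
}
```

With a 4-page transfer and a 2048-byte maximum segment size, the first
helper yields 8 pages and the second yields 4, matching the figures above.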
Co-authored-by: jhb
Reviewed by: jhb
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45048
It is functionally identical in all implementations, so move the
function to subr_busdma_bounce.c. The KASSERT present in the x86 version
is now enabled for all architectures. It should be universally
applicable.
Reviewed by: jhb
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D45047
This is somewhat similar to EXT_NET_DRV, but CTL isn't a network
driver.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D44725
Clear the list before returning so that sysctl_ctx_free() can be called
more than once on the same list without side effects. This simplifies
error handling in drivers; previously, drivers would have to be careful
to call sysctl_ctx_free() at most once to avoid a use-after-free.
While here, use TAILQ_FOREACH_SAFE in the loop which unregisters OIDs.
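
A minimal userspace sketch of the idea, assuming a hand-rolled singly
linked list in place of the real sysctl context list (all names here are
hypothetical): the next pointer is saved before each free, as a _SAFE
iterator does, and the head is cleared so a repeated call is harmless.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for a context list and its entries. */
struct sketch_entry {
	struct sketch_entry *next;
};

struct sketch_list {
	struct sketch_entry *head;
};

/* Add one entry; returns 0 on success, -1 if allocation fails. */
static int
sketch_list_add(struct sketch_list *list)
{
	struct sketch_entry *e;

	assert(list != NULL);
	e = malloc(sizeof(*e));
	if (e == NULL)
		return (-1);
	e->next = list->head;
	list->head = e;
	return (0);
}

/*
 * Free every entry and return how many were freed.  The successor is
 * read before free(), and the head is reset so that a second call finds
 * an empty list instead of dangling pointers.
 */
static int
sketch_list_free(struct sketch_list *list)
{
	struct sketch_entry *e, *tmp;
	int freed = 0;

	assert(list != NULL);
	for (e = list->head; e != NULL; e = tmp) {
		tmp = e->next;
		free(e);
		freed++;
	}
	list->head = NULL;
	return (freed);
}
```

An error path can then call the free routine unconditionally, even if an
earlier cleanup step already did so.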
Reviewed by: thj, emaste
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D45041
It turns out that the only conversion issue was in fattime2timespec, where
multiplying the number of seconds in a day by the number of days overflowed
32-bit unsigned int for dates beyond 2106-02-07 06:28:15.
Casting one of the multiplicands as time_t forces a 64-bit multiplication on
systems where time_t is 64 bits and produces no binary changes on the one
remaining system with 32-bit time_t (namely i386).
Since the code is now tested & fixed, this change removes the fixme comments.
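
The overflow and the effect of the cast can be reproduced in isolation.
This is a sketch, not the fattime2timespec code: the function names are
made up, int64_t stands in for a 64-bit time_t, and unsigned int is
assumed to be 32 bits wide.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_SECS_PER_DAY 86400u

/* Product computed in 32-bit unsigned arithmetic: wraps modulo 2^32. */
static int64_t
sketch_days_to_secs_32(uint32_t days)
{
	uint32_t secs = days * SKETCH_SECS_PER_DAY;

	return ((int64_t)secs);
}

/* Widening one multiplicand first forces a 64-bit multiplication. */
static int64_t
sketch_days_to_secs_64(uint32_t days)
{
	return ((int64_t)days * SKETCH_SECS_PER_DAY);
}
```

Day 49710 after the epoch (2106-02-07) still fits in 32 bits, so both
helpers agree; for day 49711 the 32-bit product wraps while the 64-bit one
gives 4295030400.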
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44755
On systems that have a 64-bit time_t, the test code now exercises the whole
range of fattime. To do this, this commit...
1. replaces the call to random() with two calls to arc4random() to
generate a 33-bit number of seconds in order to cover the entire range of
fattime [1970,2107]. (32 bits stops just short, in February 2106.)
On systems with 32-bit time_t, the extra bits are discarded and only the
time_t expressible range is tested.
2. casts time_t values passed to printf as longs and changes the format
string to match.
Now, the test code builds, runs, and exercises what it can (i.e., the whole
fattime range or the 32-bit time_t subset of it) on both 32-bit and 64-bit
time_t systems.
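
One way to build a 33-bit value from two 32-bit random words is sketched
below. The helper name and the choice of which bit to keep from the second
word are assumptions; the test code may combine the words differently. The
two inputs would come from arc4random() in the test program and are plain
parameters here so that the helper is deterministic.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine two 32-bit words into a value in [0, 2^33): the low 32 bits
 * come from lo, and bit 32 comes from the lowest bit of hi.
 */
static uint64_t
sketch_combine33(uint32_t hi, uint32_t lo)
{
	return (((uint64_t)(hi & 1u) << 32) | lo);
}
```

Assigning the result to a 32-bit time_t discards the extra bit, which
matches the behaviour described for 32-bit systems.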
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44754
This change...
1. replaces calls to timet2fattime/fattime2timet with calls to
timespec2fattime/fattime2timespec. The functions got renamed shortly
after they landed in the kernel but the test code wasn't updated (see
7ea93e912b).
2. adds a utc_offset stub.
With this, the test code builds and runs as a 32-bit binary (cc -Wall -O2
-m32 subr_fattime.c).
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44753
This is useful for embedded systems, where it provides feedback that the
kernel has booted, but avoids printing the probe messages. If both
mutemsgs and verbose are set, verbose cancels the mute.
Additionally, this unmutes the console on panic, so a user can see what
happened leading up to the panic.
Obtained from: Juniper Networks, Inc.
In cd85379104, kib made maxphys a load-time tunable. This made
the #define MAXPHYS in sys/param.h almost entirely obsolete, as
it could now be overridden by kern.maxphys at boot time, or by
opt_maxphys.h.
However, decades of tradition have led to several new, incorrect, uses
of MAXPHYS in other parts of the kernel, mostly by seasoned
developers. I've corrected those uses here in a mechanical fashion,
and verified that it fixes a bug in the md driver that I was
experiencing.
Since using MAXPHYS is such an easy mistake to make, it is best to
hide it from the kernel namespace. So I've moved its definition to
_maxphys.h, which is now included in param.h only for userspace.
That brings up the fact that lots of userspace programs use MAXPHYS
for different reasons, most of them probably wrong. Userspace consumers
that really need to know the value of maxphys should probably be
changed to use the kern.maxphys sysctl. But that's outside the scope
of this change.
Reviewed by: imp, jkim, kib, markj
Fixes: 30038a8b4e ("md: Get rid of the pbuf zone")
Sponsored by: Netflix
Differential Revision: https://reviews.freebsd.org/D44986
See commit ae77041e07 ("kthread: Set *newtdp earlier in
kthread_add1()") for details. That commit was incomplete since
g_init()'s first call to kproc_kthread_add() will cause
kproc_kthread_add() to take the `*procptr == NULL` branch, which avoids
kthread_create().
To ensure that the thread pointer is initialized before the thread
starts running, we have to start the kernel process with RFSTOPPED.
We could perhaps go further and use RFSTOPPED only when tdptr != NULL,
but it's probably better to have consistent behaviour.
Reviewed by: olce, kib
Reported by: syzbot+e91e798f3c088215ace6@syzkaller.appspotmail.com
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44927
The fact that an accept filter needs to be cleared before setting a
different one isn't properly documented. The requirement that the
socket must already be listening, although trivial, isn't documented
either. At least return a more meaningful error than EINVAL for an
existing filter. Cover this with a test case.
This was missed when read/write, etc were updated to return ssize_t.
Fixes: 2e83b28161 Fix a few syscall arguments to use size_t instead of u_int.
Reviewed by: imp, kib
Differential Revision: https://reviews.freebsd.org/D44930
The flag allows the pid argument to designate a thread from the calling
process. The flag value is carved from the high bit of the signal
number, which slightly changes the ABI of syscall.
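
Carving a flag out of the high bit of an integer argument can be sketched
as follows. The macro and function names are hypothetical and do not claim
to match the kernel's definitions; only the bit manipulation is shown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag name: the high bit of the signal number argument. */
#define SKETCH_TID_FLAG 0x80000000u

/* Fold the "pid designates a thread" flag into the signal number. */
static uint32_t
sketch_signum_encode(uint32_t signum, bool is_thread)
{
	assert((signum & SKETCH_TID_FLAG) == 0);
	return (is_thread ? (signum | SKETCH_TID_FLAG) : signum);
}

/* Test whether the flag is present in the received argument. */
static bool
sketch_signum_has_flag(uint32_t arg)
{
	return ((arg & SKETCH_TID_FLAG) != 0);
}

/* Recover the plain signal number by masking the flag off. */
static uint32_t
sketch_signum_decode(uint32_t arg)
{
	return (arg & ~SKETCH_TID_FLAG);
}
```

Because the bit previously belonged to the signal number, a caller that
passed a value with the high bit set now gets different treatment, which
is the ABI change mentioned above.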
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44867
Remove extra sys/param.h, provided by sys/systm.h.
Order the rest alphabetically.
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44867
I didn't notice this during testing because invariants-enabled kernels
implicitly include asan.h via kassert.h.
Reported by: Lexi Winter <lexi@le-Fay.org>
Fixes: 800da341bc ("thread: Simplify sanitizer integration with thread creation")
fork() may allocate a new thread in one of two ways: from UMA, or cached
in a freed proc that was just allocated from UMA. In either case, KASAN
and KMSAN need to initialize some state; in particular they need to
initialize the shadow mapping of the new thread's stack.
This is done differently between KASAN and KMSAN, which is confusing.
This patch improves things a bit:
- Add a new thread_recycle() function, which moves all kernel stack
handling out of kern_fork.c, since it doesn't really belong there.
- Then, thread_alloc_stack() has only one local caller, so just inline
it.
- Avoid redundant shadow stack initialization: thread_alloc()
initializes the KMSAN shadow stack (via kmsan_thread_alloc()) even
though vm_thread_new() already did that.
- Add kasan_thread_alloc(), for consistency with kmsan_thread_alloc().
No functional change intended.
Reviewed by: khng
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44891
In fork1(), if a thread is reused and thread_alloc_stack() is not
called, mark the reused thread's kstack pages clean in the KASAN shadow
buffer.
Sponsored by: Juniper Networks, Inc.
MFC after: 3 days
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D44875
This declares an API for libsys which currently consists of
__sys_<foo>() declarations for system call stubs and function pointer
typedefs of the form __sys_<foo>_t. The vast majority of the
implementation resides in a generated _libsys.h which ensures that all
system call stub declarations match syscalls.master.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44387
Add sys/errno.h, sys/malloc.h, sys/queue.h, and vm/uma.h as needed.
sys/sysproto.h currently includes sys/acl.h which currently includes
sys/param.h, sys/queue.h, and vm/uma.h which in turn bring in
sys/errno.h sys/malloc.h.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44465
An initial reading of the preamble of sys_procctl() gives the impression
that no test prevents a malicious user from passing a negative commands
index (in 'uap->com'), which is soon used as an index into the static
array procctl_cmds_info[].
However, a closer examination leads to the conclusion that the existing
code is technically correct. Indeed, the comparison of 'uap->com' to
the nitems() expression, which expands to a ratio of sizeof(), leads to
a conversion of 'uap->com' to an 'unsigned int' as per Usual Arithmetic
Conversions/Integer Promotions applied by '<=', because sizeof() returns
'size_t' values, and we define 'size_t' as an equivalent of 'unsigned
int' (which is not mandated by the standard, the latter allowing, e.g.,
integers of lower ranks).
With this conversion, negative values of 'uap->com' are automatically
ruled out since they are converted to very big unsigned integers which
are caught by the test. An analysis of assembly code produced by LLVM
16 on amd64 and practical tests confirm that no exploitation is possible.
However, the guard code as written is misleading to readers and might
trip up static analysis tools. Make sure that negative values are
explicitly excluded so that it is immediately clear that EINVAL will be
returned in this case.
Build tested with clang 16 and GCC 12.
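
The two shapes of the guard can be compared in a userspace sketch. The
table and function names are hypothetical, and the first helper writes the
conversion out as a cast to show what the implicit conversion does; it is
not a copy of the sys_procctl() code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Same idea as nitems(): element count as a ratio of sizeof(). */
#define SKETCH_NITEMS(x) (sizeof(x) / sizeof((x)[0]))

/* Hypothetical 8-entry table standing in for procctl_cmds_info[]. */
static const int sketch_cmds_info[8] = { 0 };

/*
 * Original shape of the guard.  The (size_t) cast spells out the
 * conversion that would otherwise be applied silently: a negative com
 * becomes a very large unsigned value and fails the bound check.
 */
static bool
sketch_com_ok_converted(int com)
{
	return (com != 0 && (size_t)com < SKETCH_NITEMS(sketch_cmds_info));
}

/* Explicit shape: negative values are rejected by a signed comparison. */
static bool
sketch_com_ok_explicit(int com)
{
	return (com > 0 && (size_t)com < SKETCH_NITEMS(sketch_cmds_info));
}
```

Both helpers accept and reject the same values; the second one simply
makes the rejection of negative indices visible to the reader.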
Approved by: markj (mentor)
MFC after: 1 week
Sponsored by: The FreeBSD Foundation
If userpath is not SHM_ANON, then copy it in early so ktrace(2) can
record it. Without this change, ktrace(2) will attempt to strcpy a
userspace string and trigger a page fault.
Reported by: syzbot+490b9c2a89f53b1b9779@syzkaller.appspotmail.com
Fixes: 0cd9cde767
Approved by: markj (mentor)
Reviewed by: markj
MFC after: 1 month
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D44702
The regressions in aio(4) and kernel RPC aren't a 5 minute problem.
This reverts commit d80a97def9.
This reverts commit d1cbb17a87.
This reverts commit fb8a8333b4.
Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX
SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension
of SOCK_STREAM. The change meets three goals: get rid of unix(4) specific
stuff in the generic socket code, provide faster and more robust unix/stream
sockets and bring unix/seqpacket much closer to specification. Highlights
follow:
- The send buffer now is truly bypassed. Previously it was always empty,
but the send(2) still needed to acquire its lock and do a variety of
tricks to be woken up at the right time while sleeping on it. Now the
only two things we care about in the send buffer are the I/O sx(9) lock
that serializes operations and value of so_snd.sb_hiwat, which we can read
without obtaining a lock. The sleep of a send(2) happens on the mutex of
the receive buffer of the peer. A bulk send/recv of data with large
socket buffers will make both syscalls just bounce between owning the
receive buffer lock and copyin(9)/copyout(9), no other locks would be
involved.
- The implementation uses new mchain structure to manipulate mbuf chains.
Note that this required converting to mchain two functions that are shared
with unix/dgram: unp_internalize() and unp_addsockcred() as well as adding
a new shared one uipc_process_kernel_mbuf(). This induces some
non-functional changes in the unix/dgram code as well. There is room for
improvement here, as right now it is a mix of mchain and manually managed
mbuf chains.
- unix/seqpacket previously marked as PR_ADDR & PR_ATOMIC and thus treated
as a datagram socket by the generic socket code, now becomes a true stream
socket with record markers.
- unix/stream loses the sendfile(2) support. This can be brought back,
but requires some work. Let's first see if there is any interest in this
feature beyond the purely academic.
Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44151
Implement m_uiotombuf() as a wrapper around mc_uiotomc(). The M_EXTPG is
left untouched. The m_uiotombuf() is left as a compat KPI. New code
should use either mc_uiotomc() or m_uiotombuf_nomap().
Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44150
Implement m_getm2(), which is widely used via m_getm() macro, as a wrapper
around mc_get(). New code is advised to use mc_get().
Reviewed by: markj, tuexen
Differential Revision: https://reviews.freebsd.org/D44149
It preserves tail points and all length/memory accounting, so that caller
doesn't need to do any extra traversals. It doesn't respect M_PKTHDR but
it may be improved if needed. It respects M_EOR, though. First consumer
will be the new unix(4) SOCK_STREAM and SOCK_SEQPACKET.
Also provide a much simpler mc_concat() that glues two chains back together.
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D44148
Back in 2015 when it turned non-blocking, it was working with PF_UNIX
and it may still work. However, the usefulness of such an application
of sendfile(2) is questionable. Disable the feature while unix/stream
is under refactoring.
Relnotes: yes
Report the delivery of signals to processes other than self while
Capsicum violation tracing with CAPFAIL_SIGNAL.
Reviewed by: markj
Approved by: markj (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D40679
Report syscalls that are not allowed in capability mode with
CAPFAIL_SYSCALL.
Reviewed by: markj
Approved by: markj (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D40678
When a Capsicum violation occurs in the kernel, ktrace will now record
detailed information pertaining to the violation.
For example:
- When a namei lookup violation occurs, ktrace will record the path.
- When a signal violation occurs, ktrace will record the signal number.
- When a sendto(2) violation occurs, ktrace will record the recipient
sockaddr.
For all violations, the syscall and ABI is recorded.
kdump is also modified to display this new information to the user.
Reviewed by: oshogbo, markj
Approved by: markj (mentor)
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D40676
Since thread_single(SINGLE_ALLPROC) ignores them since 9241ebc796,
and there is not much we can do for the debugger-controlled process.
Noted by: olce
Reviewed by: markj, olce
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44638
This ensures that we invoke VOP_READ on the input file even if it's
empty, which in turn helps ensure that filesystems update the atime of
the file.
PR: 274615
Reviewed by: olce, rmacklem, kib
MFC after: 1 month
Differential Revision: https://reviews.freebsd.org/D43524
a0993376ec (from D43179) subtly changed stats_v1_blob_clone() to stop
returning EOVERFLOW in the case where the user buffer is not large
enough to receive the entire statsblob. As a result, consumers that
are implemented to retry on receiving EOVERFLOW instead give up after
receiving an empty statsblob header.
Fix by latching any errors recorded prior to copyout.
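
The general shape of such a bug and of the fix can be sketched in
userspace. Everything below is hypothetical: sketch_copyout() is a
stand-in that always succeeds, and the two clone helpers only illustrate
how an earlier error can be lost or preserved. This is not the
stats_v1_blob_clone() code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for copyout(9); always succeeds here. */
static int
sketch_copyout(const void *src, void *dst, size_t len)
{
	memcpy(dst, src, len);
	return (0);
}

/* Lossy shape: the copy's return value overwrites an earlier EOVERFLOW. */
static int
sketch_clone_overwriting(const char *src, size_t srclen, char *dst,
    size_t dstlen)
{
	size_t n = srclen;
	int error = 0;

	if (dstlen < srclen) {
		error = EOVERFLOW;
		n = dstlen;
	}
	error = sketch_copyout(src, dst, n);
	return (error);
}

/* Latching shape: an error recorded before the copy is kept. */
static int
sketch_clone_latching(const char *src, size_t srclen, char *dst,
    size_t dstlen)
{
	size_t n = srclen;
	int copy_error, error = 0;

	if (dstlen < srclen) {
		error = EOVERFLOW;
		n = dstlen;
	}
	copy_error = sketch_copyout(src, dst, n);
	if (error == 0)
		error = copy_error;
	return (error);
}
```

With a destination buffer that is too small, the first helper reports
success and the second reports EOVERFLOW, so a caller that retries with a
larger buffer keeps working.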
Reviewed by: markj
Obtained from: Netflix, Inc.
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44585
Fixes: a0993376ec ("stats: Check for errors from copyout()")
I have a kernel module which fails to load because of an unrecognized
relocation type. link_elf_load_file() fails before the module's ctors
are invoked and it calls linker_file_unload(), which causes the module's
dtors to be executed, resulting in a kernel panic.
Add a flag to the linker file to ensure that dtors are not invoked if
unloading due to an error prior to ctors being invoked.
At the moment I only implemented this for link_elf_obj.c since
link_elf.c doesn't invoke dtors, but I refactored link_elf.c to make
them more similar.
Fixes: 9e575fadf4 ("link_elf_obj: Invoke fini callbacks")
Reviewed by: zlei, kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D44559
In order for pmap_kenter{,_device}() to create superpage mappings,
either 64 KB or 2 MB, pmap_mapdev{,_attr}() must request appropriately
aligned virtual addresses.
Reviewed by: markj
Tested by: gallatin
Differential Revision: https://reviews.freebsd.org/D42737
A debugger has the power to cause unbounded delay in single-threading,
which then blocks the threaded taskqueue. The reproducer is
`truss -f timeout 2 sleep 10`.
Reported by: mjg
Reviewed by: markj
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44523
The kernel CTF loading routines print various warnings when attempting
to load CTF data from an ELF file. After the changes in c21bc6f3c2
those warnings are unnecessarily printed for each kernel module
that was compiled without CTF data.
The kernel linker already uses the bootverbose flag to conditionally
print CTF loading errors. This patch alters kern_ctf.c
routines to do the same.
Reported by: Alexander@leidinger.net
Approved by: markj (mentor)
Fixes: c21bc6f3c2 ("ddb: Add CTF-based pretty printing")
Before a protocol specific control block started to embed inpcb in self
(see 0aa120d52f, e68b379244, 483fe96511) this pointer used to point
at it.
Retain kf_sock_inpcb field in the struct kinfo_file in <sys/user.h>. The
exp-run detected a minimal use of the field in ports:
* sysutils/lsof - patched upstream
* net-mgmt/netdata - patch accepted upstream
* emulators/qemu-user-static - upstream master branch seems to no longer
use the field
We can keep the field around for some time, but eventually it may be
reused for something else.
PR: 277659 (exp-run)
Reviewed by: tuexen
Differential Revision: https://reviews.freebsd.org/D44491
This allows for shutdown_final EVENTHANDLERs to know that a core dump
successfully occurred. Embedded systems may want to record this fact
or act on it.
Obtained from: Juniper Networks, Inc.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44542
HPTS inserts a softclock for system call return that optimizes
performance. However, when no HPTS threads need the help (i.e., when they
have fewer than 100 or so connections), there should be little work done,
i.e., check the counter and return instead of running through all the
threads getting locks, etc. Optimize HPTS so that little work is done
until we have an HPTS thread that is over the connection threshold.
Reported by: eduardo
Reviewed by: gallatin, glebius, tuexen
Tested by: gallatin
Differential Revision: https://reviews.freebsd.org/D44420
The only possible return value has been zero since cee9542d51.
No functional change intended.
Reviewed by: dfr
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D44507
A nonzero `userrefs` of a linker file indicates that the file, either
loaded from kldload(2) or preloaded, can be unloaded via kldunload(2).
As for the kernel file, it can be unloaded by the loader but should not
be after initialization.
This change fixes regression from d9ce8a41ea which incidentally
increases `userrefs` of the kernel file.
Reviewed by: dfr, dab, jhb
Fixes: d9ce8a41ea kern_linker: Handle module-loading failures in preloaded .ko files
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42530
Despite the name, linker_file_unload() will drop a reference and return
success when the module file has dependants, i.e. it has more than one
reference. When a user requests to unload such a module, the kernel
should reject the request unambiguously and immediately.
PR: 274986
Reviewed by: dfr, dab, jhb
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D42527
Uninitialized td_frame mostly does not matter since all registers are
overwritten on exec to activate init(8). Except PSL_T bit from the
%rflags which might leak into fresh init as garbage, causing spurious
SIGTRAPs delivered to init until first syscall is executed.
Reviewed by: emaste, jhb, jhibbits
Sponsored by: The FreeBSD Foundation
MFC after: 1 week
Differential revision: https://reviews.freebsd.org/D44498
!DDB builds don't include the db_ctf_lookup_typename() symbol, so this
is a stop-gap to fix linking of the MINIMAL kernel config.
Reported by: bapt
Fixes: c21bc6f3c2 ("ddb: Add CTF-based pretty printing")
Add basic CTF support and a CTF-powered pretty-printer to ddb.
The db_ctf.* files expose a basic interface for fetching type
data for ELF symbols, interacting with the CTF string table,
and translating type identifiers to type data.
The db_pprint.c file uses those interfaces to implement
a pretty-printer for all kernel ELF symbols.
The pretty-printer works with symbol names and arbitrary addresses:
pprint struct thread 0xffffffff8194ad90
Pretty-printing currently only works after the root filesystem
gets mounted because the CTF info is not available during
early boot.
Differential Revision: https://reviews.freebsd.org/D37899
Approved by: markj (mentor)
The (optional) third argument of fcntl is sometimes a pointer so change
the type to intptr_t. Update the libc-internal definition (actually used
by libthr) to take a fixed intptr_t argument rather than pretending it's
a variadic function. (That worked because all supported architectures
pass variadic arguments as though the function was declared with those
types. In CheriBSD that changes because variadic arguments are passed
via a bounded array.)
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44381
sigfastblock is declared to take a void * argument in the manpage and in
headers, so declare it that way and use SAL annotations to say it
interacts with a 32-bit word.
Reviewed by: kib
Differential Revision: https://reviews.freebsd.org/D44379
livedump_start_vnode(9) is introduced such that the live minidump on the
system could take a vnode. This interface could be used to extend support
for the existing framework in downstream.
Bump __FreeBSD_version for introducing livedump_start_vnode(9).
Sponsored by: Juniper Networks, Inc.
Reviewed by: khng
Differential Revision: https://reviews.freebsd.org/D43471
These KPIs were added in dd0e6c383a and through 15 years had zero use.
They are slightly reminiscent of what IfAPI does for struct ifnet. But
IfAPI does that for the sake of a large collection of NIC drivers not
being aware of struct ifnet. For sockets it is unclear what could be a
large collection of externally written kernel modules that need to use
sockets extensively and not be aware of their internals at the same time. This
isolation of a structure knowledge requires a lot of work, and just
throwing in a few KPIs isn't helpful.
Reviewed by: kib, olce, markj
Differential Revision: https://reviews.freebsd.org/D44311
vn_generic_copy_file_range() tries to maintain holes
in file ranges being copied, using SEEK_DATA/SEEK_HOLE
where possible.
Unfortunately SEEK_DATA/SEEK_HOLE operations can take
a long time under certain circumstances.
Although it is not currently possible to know if a file has
unallocated data regions, the case where va_bytes >= va_size
is a strong hint that there are no unallocated data regions.
This hint does not work well for file systems doing compression,
but since it is only a hint, it is still useful.
For the case of va_bytes >= va_size, avoid doing SEEK_DATA/SEEK_HOLE.
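
The hint reduces to a single comparison, sketched here with a hypothetical
helper name and plain integers in place of the vattr fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Return true when the allocated byte count covers the file size, which
 * hints that the file has no holes and hole seeking can be skipped.
 */
static bool
sketch_skip_hole_seek(uint64_t va_bytes, uint64_t va_size)
{
	return (va_bytes >= va_size);
}
```

A sparse 1 MB file with a single allocated 4 KB block fails the test and
still takes the SEEK_DATA/SEEK_HOLE path.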
Reviewed by: kib
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D44509
The public bus_release_resource() API still accepts both forms, but
the internal kobj method no longer passes the arguments.
Implementations which need the rid or type now use rman_get_rid() or
rman_get_type() to fetch the value from the allocated resource.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44131
The public bus_activate/deactivate_resource() API still accepts both
forms, but the internal kobj methods no longer pass the arguments.
Implementations which need the rid or type now use rman_get_rid() or
rman_get_type() to fetch the value from the allocated resource.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44130
The public bus_map/unmap_resource() API still accepts both forms, but
the internal kobj methods no longer pass the argument.
Implementations which need the type now use rman_get_type() to fetch
the value from the allocated resource.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44129
The public bus_adjust_resource() API still accepts both forms, but the
internal kobj method no longer passes the argument. Implementations
which need the type now use rman_get_type() to fetch the value from
the allocated resource.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44128
Remove the 'type' and 'rid' arguments from the wrapper bus API
functions (e.g. bus_release_resource) that accept a struct resource.
The "new" versions extract the 'type' and/or 'rid' from the passed in
resource object via rman_get_type and rman_get_rid.
This commit adds the new API as functions with a _new suffix. Wrapper
macros choose between the old and new functions based on the number of
arguments provided to the macro. This commit does not change the ABI
but can be safely MFCd to older branches so long as older kernels use
rman_set_type when allocating resources.
Future commits will push the removal of these extraneous arguments
through the bus implementation.
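
Choosing a function by argument count is a common preprocessor technique
and can be sketched as follows. The macro and function names are
hypothetical and the functions return their own arity so the selection is
observable; this is not the definition used in the bus headers.

```c
#include <assert.h>

/* Hypothetical old-style function taking (dev, type, rid, res). */
static int
sketch_release_old(int dev, int type, int rid, int res)
{
	(void)dev; (void)type; (void)rid; (void)res;
	return (4);
}

/* Hypothetical new-style function taking (dev, res). */
static int
sketch_release_new(int dev, int res)
{
	(void)dev; (void)res;
	return (2);
}

/*
 * SKETCH_PICK returns its fifth argument.  The caller's arguments shift
 * the candidate names to the right, so four arguments select the old
 * function and two select the new one.  Other counts select
 * SKETCH_INVALID, which is undeclared and fails to compile.
 */
#define SKETCH_PICK(_1, _2, _3, _4, NAME, ...) NAME
#define sketch_release(...)						\
	SKETCH_PICK(__VA_ARGS__, sketch_release_old, SKETCH_INVALID,	\
	    sketch_release_new, SKETCH_INVALID)(__VA_ARGS__)
```

Existing four-argument callers and converted two-argument callers can then
coexist in the same tree while the conversion proceeds.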
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44124
Use rman_set_type to set the type of allocated resources everywhere
rman_set_rid is currently called.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44123
This permits associating a resource type (e.g. SYS_RES_MEMORY) with a
struct resource.
I considered adding a new field to struct rman to store the type and
only providing rman_get_type as an accessor. However, changing
'struct rman' is an ABI breakage. I might revisit this in main, but
the current approach is MFC'able.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D44122
Doing a deep copy of the keys early allows users of the
tls_enable structure to assume kernel memory.
This enables the socket options to be set by kernel threads.
Reviewed By: #transport, tuexen, jhb, rrs
Sponsored by: NetApp, Inc.
X-NetApp-PR: #79
Differential Revision: https://reviews.freebsd.org/D44250
With recent fixes to the ACPI and pcib drivers to translate mapping
requests of child resources into mappings of sub-ranges of parent
resources these assertions should now be true.
This reverts commit ed88eef140.
Reviewed by: imp
Differential Revision: https://reviews.freebsd.org/D43691
When calling VOP_CREATE(), uipc_bindat() reuses the componentname
object from the preceding lookup operation, which is likely to specify
LK_SHARED. Furthermore, the VOP_CREATE() interface technically only
requires the newly-created vnode to be returned with a shared lock.
However, the socket layer requires the new vnode to be locked exclusive
and asserts to that effect.
In most cases, this is not a practical concern because most if not
all base-layer filesystems (certainly FFS, ZFS, and msdosfs at least)
always return the vnode locked exclusive regardless of the lock flags.
However, it is an issue for unionfs which uses cn_lkflags to determine
how the new unionfs wrapper vnode should be locked. While it would
be easy enough to work around this issue within unionfs itself, it
seems better for the socket layer to be explicit about its locking
requirements when issuing VOP_CREATE().
Reviewed by: kib, olce
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D44047
It's a bit strange to require the caller to pass contrived lock flags
if the corresponding vnode is NULL, simply to appease the assertion
that exactly one of LK_SHARED or LK_EXCLUSIVE must be set. On the
other hand, we still want to catch cases in which completely bogus
or corrupt flags are passed even if the corresponding vnode is NULL.
Therefore, specifically allow empty flags for lkflags1/lkflags2 iff
the respective vp1/vp2 param is NULL.
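
The relaxed check can be written as a predicate. The flag values and the
function name below are placeholders rather than the kernel's, and the
sketch assumes that any bit other than the two lock types counts as bogus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Placeholder flag values for this sketch, not the kernel's. */
#define SKETCH_LK_SHARED	0x1
#define SKETCH_LK_EXCLUSIVE	0x2

/*
 * Accept exactly one of the two lock types, or no flags at all when the
 * corresponding vnode pointer is NULL.
 */
static bool
sketch_lkflags_ok(const void *vp, int lkflags)
{
	int type = lkflags & (SKETCH_LK_SHARED | SKETCH_LK_EXCLUSIVE);

	/* Any other bit is treated as corrupt in this sketch. */
	if ((lkflags & ~(SKETCH_LK_SHARED | SKETCH_LK_EXCLUSIVE)) != 0)
		return (false);
	if (type == 0)
		return (vp == NULL);
	return (type == SKETCH_LK_SHARED || type == SKETCH_LK_EXCLUSIVE);
}
```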
Reviewed by: kib, olce
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D44046