With recent change 110113bc08, a vnet tunable can be initialized when
there is a corresponding kernel environment variable unless it is marked
with the flag CTLFLAG_NOFETCH.
The initialization may happen during early boot (linker preload), when
vnet0 has not yet been created. The handler carp_allow_sysctl() for the
tunable net.inet.carp.allow requires a vnet, so invoking it during early
boot causes a kernel panic.
The tunable is initialized by the vnet sysinit routine ipcarp_sysinit(),
so just mark it with the flag CTLFLAG_NOFETCH.
No functional change intended.
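The failure mode can be modelled in userland. This is only a sketch: the
flag values, struct demo_oid and the demo_* functions are hypothetical
stand-ins, not the real sysctl(9) definitions.
```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical flag bits; the real CTLFLAG_* values are in <sys/sysctl.h>. */
#define	DEMO_FLAG_TUN		0x1	/* may be initialized from the environment */
#define	DEMO_FLAG_NOFETCH	0x2	/* ...but never at registration time */

struct demo_oid {
	const char *name;
	int flags;
	int *ctx;	/* stands in for the current vnet */
	int (*handler)(struct demo_oid *, const char *);
	int value;
};

/* A handler that, like carp_allow_sysctl(), needs a valid context. */
int
demo_handler(struct demo_oid *oid, const char *val)
{
	if (oid->ctx == NULL)
		return (-1);	/* the kernel would panic here */
	oid->value = atoi(val);
	return (0);
}

/*
 * Registration-time fetch.  Returns 1 if the handler ran successfully,
 * 0 if the fetch was skipped, -1 if the handler ran without a context.
 */
int
demo_register(struct demo_oid *oid, const char *envval)
{
	if ((oid->flags & DEMO_FLAG_TUN) == 0 ||
	    (oid->flags & DEMO_FLAG_NOFETCH) != 0 || envval == NULL)
		return (0);
	return (oid->handler(oid, envval) == 0 ? 1 : -1);
}
```
With the NOFETCH bit set, registration never calls the handler, so the
value is left for a later initialization step that has a context.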
Fixes: 110113bc08 sysctl(9): Enable vnet sysctl variables to be loader tunable
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41525
Module preload happens before vnet0 is created. At that point the vnet
list is empty, so invoking vnet_data_copy() during preload is a no-op.
With recent change 110113bc08, for dynamic module loads (i.e. via kldload)
the linker does vnet propagation right after registering sysctls, which
happens after module load, so the earlier propagation (during module load)
is redundant.
No functional change intended.
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D39852
Complete phase two of 3da1cf1e88.
In 3da1cf1e88, the meaning of the flag CTLFLAG_TUN is extended to
automatically check if there is a kernel environment variable which
shall initialize the SYSCTL during early boot. It works for all SYSCTL
types both statically and dynamically created ones, except for the
SYSCTLs which belong to VNETs.
This change extends the meaning further so that it also works for
the SYSCTLs which belong to VNETs. A typical usage is
```
VNET_DEFINE_STATIC(int, foo) = 0;
SYSCTL_INT(_net, OID_AUTO, foo, CTLFLAG_RWTUN | CTLFLAG_VNET,
&VNET_NAME(foo), 0, "Description of the foo loader tunable");
```
Note that the implementation has a limitation. It behaves the same way
as that of non-vnet loader tunables. That is, after the kernel or modules
have been initialized, any changes (e.g. via kenv) to the kernel environment
variable will not affect the corresponding vnet variable of subsequently
created VNETs. To overcome this, we can use TUNABLE_XXX_FETCH to fetch
the kernel environment variable into those vnet variables during vnet
construction.
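The limitation and the workaround can be illustrated with a userland
model. This is a sketch under assumptions: demo_kenv_foo stands in for one
kernel environment variable, and all demo_* names are hypothetical rather
than kernel APIs.
```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for one kernel environment variable; NULL means "unset". */
static const char *demo_kenv_foo;

/* Value latched when the (hypothetical) module is initialized. */
static int demo_foo_default;

void
demo_kenv_set(const char *val)
{
	demo_kenv_foo = val;
}

/* Like a TUNABLE_XXX_FETCH: overwrite *val only if the variable is set. */
void
demo_fetch(int *val)
{
	if (demo_kenv_foo != NULL)
		*val = atoi(demo_kenv_foo);
}

/* The environment is consulted once, at module initialization. */
void
demo_module_init(void)
{
	demo_foo_default = 0;
	demo_fetch(&demo_foo_default);
}

/* A new instance only inherits the latched default. */
int
demo_vnet_create(void)
{
	return (demo_foo_default);
}

/* A new instance re-reads the environment while it is constructed. */
int
demo_vnet_create_refetch(void)
{
	int v = demo_foo_default;

	demo_fetch(&v);
	return (v);
}
```
If the variable changes after demo_module_init(), only
demo_vnet_create_refetch() observes the new value.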
This change will fix the following SYSCTLs that belong to VNETs and
have the CTLFLAG_TUN flag:
```
net.add_addr_allfibs
net.bpf.optimize_writers
net.inet.tcp.fastopen.ccache_buckets
net.link.bridge.inherit_mac
net.link.bridge.ipfw_arp
net.link.bridge.log_stp
net.link.bridge.pfil_bridge
net.link.bridge.pfil_local_phys
net.link.bridge.pfil_member
net.link.bridge.pfil_onlyip
net.link.lagg.default_use_flowid
net.link.lagg.default_use_numa
net.link.lagg.default_flowid_shift
net.link.lagg.lacp.debug
net.link.lagg.lacp.default_strict_mode
```
Although the following vnet SYSCTLs have the CTLFLAG_TUN flag, their
values are re-fetched via TUNABLE_XXX_FETCH and thus are not affected
by this change.
```
net.inet.ip.reass_hashsize
net.inet.tcp.hostcache.cachelimit
net.inet.tcp.hostcache.hashsize
net.inet.tcp.hostcache.bucketlimit
net.inet.tcp.syncache.bucketlimit
net.inet.tcp.syncache.cachelimit
net.inet.tcp.syncache.hashsize
net.key.spdcache.maxentries
net.key.spdcache.threshold
```
In memoriam: hselasky
Discussed with: hselasky, glebius
Fixes: 3da1cf1e88 Extend the meaning of the CTLFLAG_TUN flag ...
MFC after: 2 weeks
Relnotes: yes
Differential Revision: https://reviews.freebsd.org/D39638
All of the kern_* prototypes belong in this header. While here, sort
the prototypes by function name.
Reviewed by: dchagin
Fixes: 6453d4240f vfs: Export exattr methods to reuse by Linuxulator
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D41766
Similar to dcfddc8dc0, replace the
simpler, inlined version with the full version.
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D41690
In particular, the kernel RPC layer used by the NFS client never
invokes pru_rcvd since it always reads data from the socket upcall
via MSG_SOCALLBCK which avoids calling pru_rcvd. As a result, on an
NFS client connection managed by t4_tom, RX credits were never
returned to the TOE connection to open the TCP window resulting in
connection hangs.
To fix, expand the set of conditions in do_rx_data where RX credits
are returned to match those in t4_rcvd_locked by calling the function
directly.
Reviewed by: np
Sponsored by: Chelsio Communications
Differential Revision: https://reviews.freebsd.org/D41688
All notifications are now queued via sctp_ulp_notify(). Do
the locking of the inp read lock there and validate this in all
functions being used.
This is one step in avoiding race conditions when closing the
read end of an SCTP socket.
MFC after: 3 days
This makes consistent use of the parameters and ensures that
all SCTP AUTH related notifications are using sctp_ulp_notify().
No functional change intended.
MFC after: 3 days
When building the frequencies table we convert the value in the DTS to
megahertz and lose precision. While that is not a problem for most DTS
files, it is when the expected frequency value is strict down to the hertz.
So either we don't truncate the value and end up with some ugly and long
values in the sysctls, or we just find the closest frequency.
Do the latter.
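A closest-match lookup of this kind could look as follows. This is a
minimal sketch, not the driver code: demo_closest_freq() and the table
layout (entries in MHz) are assumptions for illustration.
```c
#include <stddef.h>
#include <stdint.h>

/*
 * Return the index of the table entry (in MHz) closest to freq_hz,
 * or -1 if the table is empty.
 */
int
demo_closest_freq(const uint32_t *mhz, size_t n, uint64_t freq_hz)
{
	uint64_t best = UINT64_MAX;
	int idx = -1;

	for (size_t i = 0; i < n; i++) {
		uint64_t hz = (uint64_t)mhz[i] * 1000000;
		uint64_t diff = hz > freq_hz ? hz - freq_hz : freq_hz - hz;

		if (diff < best) {
			best = diff;
			idx = (int)i;
		}
	}
	return (idx);
}
```
A request for 815999999 Hz truncates to 815 MHz and would miss an exact
comparison against an 816 MHz entry, while the closest-match lookup still
selects that entry.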
Reviewed by: mmel
Differential Revision: https://reviews.freebsd.org/D41762
Sponsored by: Beckhoff Automation GmbH & Co. KG
So we can use it in non-linuxkpi sources.
Reviewed by: emaste, mmel
Differential Revision: https://reviews.freebsd.org/D41767
Sponsored by: Beckhoff Automation GmbH & Co. KG
The M1 uses FDT, and has bge to start with. Add a SOC_* option for
the first SoC we'll be supporting.
IOMMU is added commented out because it does have it, but IOMMU is not
well-tested on aarch64. An initial version of the DART driver will be
upstreamed that just puts the DARTs that support bypass mode into bypass
mode -- we'll be missing some functionality, but we at least still end
up with some USB ports.
Reviewed by: karels, manu
Input from: jrtc27 (IOMMU)
Differential Revision: https://reviews.freebsd.org/D39823
Parse IP removal in ASCONF chunks, find the affected state(s) and mark
them as shutting down. This will cause them to time out according to
PFTM_TCP_CLOSING timeouts, rather than waiting for the established
session timeout.
MFC after: 3 weeks
Sponsored by: Orange Business Services
When we create a new state for an existing SCTP association inherit the
v_tag values from the original connection.
MFC after: 3 weeks
Sponsored by: Orange Business Services
Only create new states for INIT chunks, or when we're creating a
secondary state for a multihomed association.
Store and verify verification tag.
MFC after: 3 weeks
Sponsored by: Orange Business Services
SCTP may announce additional IP addresses it'll use in the INIT/INIT_ACK
chunks, or in ASCONF chunks at any time during the connection. Parse these
parameters, evaluate the ruleset for the new connection and if allowed
create the corresponding states.
MFC after: 3 weeks
Sponsored by: Orange Business Services
Differential Revision: https://reviews.freebsd.org/D41637
Each Rx descriptor points to a packet buffer of size 2K, which means
that MTUs greater than 2K see multi-descriptor packets. The TCP-hood of
such packets was being incorrectly determined by looking for a flag on
the last descriptor instead of the first descriptor.
Also bump the version number.
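The descriptor arithmetic and the first-versus-last distinction can be
sketched in userland. The descriptor layout and flag names below are
hypothetical and do not match the hardware format.
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define	DEMO_RX_BUF_SIZE	2048u
#define	DEMO_DESC_TCP		0x1	/* hypothetical: first descriptor only */
#define	DEMO_DESC_EOP		0x2	/* hypothetical: last descriptor only */

struct demo_rx_desc {
	uint16_t len;
	uint16_t flags;
};

/* Number of 2K buffers a packet of pkt_len bytes occupies. */
size_t
demo_desc_count(size_t pkt_len)
{
	return ((pkt_len + DEMO_RX_BUF_SIZE - 1) / DEMO_RX_BUF_SIZE);
}

/* Classify the packet from its first descriptor. */
bool
demo_pkt_is_tcp(const struct demo_rx_desc *d, size_t n)
{
	return (n > 0 && (d[0].flags & DEMO_DESC_TCP) != 0);
}

/* The pre-fix behaviour: look at the last descriptor instead. */
bool
demo_pkt_is_tcp_buggy(const struct demo_rx_desc *d, size_t n)
{
	return (n > 0 && (d[n - 1].flags & DEMO_DESC_TCP) != 0);
}
```
For single-descriptor packets both variants agree, which is why the
problem only shows up with MTUs above 2K.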
Reviewed by: markj
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41754
When removing them from sysinit_list, append them to sysinit_done_list;
print this list from 'show sysinit' along with the list of future
sysinits.
Reviewed by: jhb, gallatin (previous version)
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41749
Constructing an SLIST of SYSINITs by inserting them one by one at the
head of the list resulted in them being sorted in anti-stable order:
When two SYSINITs tied for (subsystem, order), they were executed in
the reverse order to the order in which they appeared in the linker
set.
Note that while this changes struct sysinit, it doesn't affect ABI
since SLIST_ENTRY and STAILQ_ENTRY are compatible (in both cases a
single pointer to the next element).
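The ordering effect can be reproduced with a hand-rolled list. This is a
sketch, not the init_main.c code: struct demo_sysinit and the demo_*
helpers are hypothetical, and a simple stable insertion sort stands in for
whatever sort the kernel uses.
```c
#include <stdbool.h>
#include <stddef.h>

struct demo_sysinit {
	int subsystem;
	int order;
	const char *name;
	struct demo_sysinit *next;
};

struct demo_list {
	struct demo_sysinit *head, *tail;
};

/* SLIST-style insertion: reverses the input order. */
void
demo_insert_head(struct demo_list *l, struct demo_sysinit *s)
{
	s->next = l->head;
	l->head = s;
	if (l->tail == NULL)
		l->tail = s;
}

/* STAILQ-style insertion: preserves the input order. */
void
demo_insert_tail(struct demo_list *l, struct demo_sysinit *s)
{
	s->next = NULL;
	if (l->tail == NULL)
		l->head = s;
	else
		l->tail->next = s;
	l->tail = s;
}

static bool
demo_before(const struct demo_sysinit *a, const struct demo_sysinit *b)
{
	return (a->subsystem < b->subsystem ||
	    (a->subsystem == b->subsystem && a->order < b->order));
}

/* Stable insertion sort: ties keep the order they had in the list. */
void
demo_sort(struct demo_list *l)
{
	struct demo_list out = { NULL, NULL };
	struct demo_sysinit *in, *next;

	for (in = l->head; in != NULL; in = next) {
		struct demo_sysinit **pp = &out.head;

		next = in->next;
		while (*pp != NULL && !demo_before(in, *pp))
			pp = &(*pp)->next;
		in->next = *pp;
		*pp = in;
		if (in->next == NULL)
			out.tail = in;
	}
	*l = out;
}
```
Even with a stable sort, entries that tie on (subsystem, order) come out
reversed when the list was built by head insertion, and in linker-set
order when it was built by tail insertion.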
Fixes: 9a7add6d01 "init_main: Switch from sysinit array to SLIST"
Reported by: gallatin
Reviewed by: jhb, gallatin, emaste
Tested by: gallatin
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D41748
FEAT_E0PD adds two fields to the tcr_el1 special register that, when
set, cause userspace accesses to either the top or bottom half of the
address space to fault without a page walk.
This can be used to stop userspace probing the kernel address space,
as the CPU will raise an exception in the same amount of time whether
or not the probed address is in the TLB.
Reviewed by: kevans
Sponsored by: Arm Ltd
Differential Revision: https://reviews.freebsd.org/D41760
Commit 3f686532c9 tried to fix an issue with not properly starting
at the first page in the sg list to prevent a panic. This worked but
with the side effect of incrementing "s" during the final iteration
causing it to be NULL since the list had ended.
In non-DEBUG kernels this causes a panic with drm-5.15, since
"s" is NULL when we later pass it to sg_mark_end().
This change decouples the iteration sg from the return value so that
it is never incremented past the final page in the chain.
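The pattern can be shown on a plain linked list. This is a simplified
sketch: struct demo_sg and both functions are hypothetical and only mimic
the shape of the scatterlist loop.
```c
#include <stddef.h>

struct demo_sg {
	int page;
	struct demo_sg *next;
};

/* Pre-fix shape: the cursor doubles as the return value and ends up NULL. */
struct demo_sg *
demo_fill_buggy(struct demo_sg *first, const int *pages, size_t n)
{
	struct demo_sg *s = first;

	for (size_t i = 0; i < n && s != NULL; i++) {
		s->page = pages[i];
		s = s->next;
	}
	return (s);
}

/* Fixed shape: remember the last entry written, separately from the cursor. */
struct demo_sg *
demo_fill(struct demo_sg *first, const int *pages, size_t n)
{
	struct demo_sg *last = NULL;
	struct demo_sg *s = first;

	for (size_t i = 0; i < n && s != NULL; i++, s = s->next) {
		s->page = pages[i];
		last = s;
	}
	return (last);
}
```
When the chain has exactly as many entries as pages, the first variant
returns NULL while the second returns the final entry, which is what a
later "mark end" step needs.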
MFC after: 3 days
Reviewed by: manu
Differential Revision: https://reviews.freebsd.org/D41574
When using printm(), one should always pass a scratch pointer to it.
This is achieved by calling printm with memref
BEGIN { printm(fixed_len, memref(ptr, var_len)); }
which will return a pointer to the DTrace scratch space of size
sizeof(uintptr_t) * 2. However, one can easily call printm() as follows
BEGIN { printm(10, (void *)NULL); }
and panic the kernel as a result. This commit does two things:
(1) adds a new macro DTRACE_INSCRATCHPTR(mstate, ptr, howmany) which
checks if a certain pointer is in the DTrace scratch space;
(2) uses DTRACE_INSCRATCHPTR() to implement a check on printm()'s DIFO
return value in order to avoid the panic and sets CPU_DTRACE_BADADDR
if the address is not in the scratch space.
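A bounds check of this kind can be written as below. This is only a
sketch of the idea: struct demo_scratch and demo_inscratch() are
hypothetical and are not the DTRACE_INSCRATCHPTR() macro itself.
```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct demo_scratch {
	uintptr_t base;
	size_t size;
};

/* True if [ptr, ptr + len) lies entirely inside the scratch region. */
bool
demo_inscratch(const struct demo_scratch *sc, uintptr_t ptr, size_t len)
{
	uintptr_t off;

	if (ptr < sc->base)
		return (false);
	off = ptr - sc->base;
	/* Written so that neither comparison can overflow. */
	return (off <= sc->size && len <= sc->size - off);
}
```
A NULL pointer fails the first comparison for any region that does not
start at address zero, so the caller can flag a bad address instead of
dereferencing it.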
Reviewed by: markj
MFC after: 2 weeks
Differential Revision: https://reviews.freebsd.org/D41722
VOP_COPY_FILE_RANGE(9) is now called when source and target vnodes
reside on the same filesystem type (not just on the same mountpoint).
The check that vnodes are on the same mountpoint must be done in the
filesystem code. There are currently only three users: fusefs(5) already
has this check, ZFS can handle multiple mountpoints, and a check has been
added to the NFS client.
ZFS block cloning is now possible between all snapshots and datasets
of the same ZFS pool.
MFC after: 1 week
Reviewed by: rmacklem
Differential Revision: https://reviews.freebsd.org/D41721
Commit 868aabb470 introduced per-flow priority. There's a defect in the
logic for untagged traffic: it does not check for M_VLANTAG set in the mbuf
packet header or for an MTAG_8021Q/MTAG_8021Q_PCP_OUT tag set by the
firewall, which can result in the desired priority missing from outbound
packets.
For mbuf packets with M_VLANTAG in the header, some interfaces happen to
work due to a bug in the drivers mentioned in D39499. As modern interfaces
have VLAN hardware offloading, the defect is barely noticeable unless the
per-flow priority feature is widely tested.
As a side effect of this defect, the soft padding to work around buggy
bridges is bypassed. That may result in a regression if soft padding is
requested.
PR: 273431
Discussed with: kib
Fixes: 868aabb470 Add IP(V6)_VLAN_PCP to set 802.1 priority per-flow
MFC after: 1 week
Differential Revision: https://reviews.freebsd.org/D39536
- Suppressed a harmless warning message, "arcmsr_dr_handle: target=f,
lun=0, GONE!!!," which could appear a few seconds after UEFI system
boot due to the boot volume UEFI initialization.
- Corrected various typing errors.
- Refactored arcmsr_initialize() to improve code readability.
- Added support for device IDs 1883 and 1886 controllers.
- Introduced support for controllers requiring host memory for the
RAID 5 and 6 XOR engines.
Many thanks to Areca for continuing to support FreeBSD.
MFC after: 3 days
32-bit compatibility code is conventionally stored in
sys/compat/freebsd32. Move freebsd32_timerfd_gettime() and
freebsd32_timerfd_settime() from sys/kern/sys_timerfd.c to
sys/compat/freebsd32/freebsd32_misc.c.
MFC After: 3 days
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D41640
Do not pollute userspace with <sys/proc.h>, instead declare struct thread
when _KERNEL is defined.
Include <sys/time.h> instead of <sys/timespec.h>. This causes intentional
namespace pollution that mimics Linux. g/musl libcs include <time.h> in
their <sys/timerfd.h>, exposing clock gettime, settime functions and
CLOCK_ macro constants. Ports like Chromium expect this namespace
pollution and fail without it.
MFC After: 3 days
Reviewed by: imp, markj
Differential Revision: https://reviews.freebsd.org/D41641
Define a locking regime for the members of struct timerfd and document
it so future code can follow the standard. The lock legend can be found
in a comment above struct timerfd.
Additionally,
* Add assertions based on locking regime.
* Fill kn_data with the expiration count when EVFILT_READ is triggered.
* Report st_ctim for stat(2).
* Check if file has f_type == DTYPE_TIMERFD before assigning timerfd
pointer to f_data.
MFC After: 3 days
Reviewed by: imp, kib, markj
Differential Revision: https://reviews.freebsd.org/D41600
netlink(4) calls back into the driver during detach and it attempts to
start an internal synchronized op recursively, causing an interruptible
hang. Fix it by failing the ioctl if the VI has been marked as DOOMED
by cxgbe_detach.
Here's the stack for the hang for reference.
#6 begin_synchronized_op
#7 cxgbe_media_status
#8 ifmedia_ioctl
#9 cxgbe_ioctl
#10 if_ioctl
#11 get_operstate_ether
#12 get_operstate
#13 dump_iface
#14 rtnl_handle_ifevent
#15 rtnl_handle_ifnet_event
#16 rt_ifmsg
#17 if_unroute
#18 if_down
#19 if_detach_internal
#20 if_detach
#21 ether_ifdetach
#22 cxgbe_vi_detach
#23 cxgbe_detach
#24 DEVICE_DETACH
MFC after: 3 days
Sponsored by: Chelsio Communications
Update ieee80211_request_smps() to the new number of arguments in
LinuxKPI (which was already prepared) and update the one call in the
older iwlwifi driver version.
This will allow iwlwifi as-is now and rtw88 to compile in case someone
else wants to work on the latter in parallel to predominant efforts on
the former.
Sponsored by: The FreeBSD Foundation
MFC after: 20 days
This is a combined version of updates of the rtw88 driver based
on wireless-testing
(wt-2023-05-11) 711dca0ca3d77414f8f346e564e9c8640147f40d (after v6.4-rc1),
(wt-2023-06-09) 7bd20e011626ccc3ad53e57873452b1716fcfaaa (after v6.4-rc5),
(wt-2023-07-24) 62e409149b62a285e89018e49b2e115757fb9022 (after v6.5-rc3),
(wt-2023-08-06) 2a220a15be657a24868368892e3e2caba2115283 (after v6.5-rc4).
This update follows other currently disconnected LinuxKPI based wireless
drivers to lift them all to the same version in case someone else wants to
work on this driver in parallel to predominant iwlwifi efforts.
MFC after: 20 days
As announced on freebsd-wireless [1] disconnect rtw88 from the build.
Add a note to the man page about the current state but leave the man
page in place for now as this is supposed to be temporary.
[1] https://lists.freebsd.org/archives/freebsd-wireless/2023-September/001377.html
MFC after: 20 days
X-MFC: will see about 14/13
Remove firmware from src/ in favor of the ports/packages and fwget(8).
This will allow us to shrink the size of src (and installed modules).
Update the rtw88 man page to reflect the change.
MFC after: 20 days
X-MFC: will see about 14/13
The lr register is cleared at the beginning of _dl_start and _start,
so there is no need to initialize it.
GNU libc's _start takes an rtld_fini pointer in x0, which is set by ld.so
for __libc_start_main; the kernel does not register any atexit pointers.
While here, fix whitespace.
MFC after: 1 week
To help porting the Linux emulation layer to new platforms, start using
Linux names for conditional builds instead of architecture-specific ifdefs.
MFC after: 1 week
Similar to d95fbf4e1a, always save gp in
the trapframe even though it is only restored when returning to user
mode. This is mostly a debugging aid so that dump_regs() doesn't
print out random stack garbage as the value of gp for kernel faults
(e.g. sysctl debug.kdb.trap=1) as well as keeping kgdb's trapframe
unwinder from reporting bogus values of $gp for lower frames.
Reviewed by: mhorne, jrtc27, markj
Sponsored by: DARPA
Differential Revision: https://reviews.freebsd.org/D41699
Queue an initial fetch of data during attach and after every read
rather than synchronously fetching data and polling for completion.
If data has not been returned from a previous fetch during read,
just return EAGAIN rather than blocking.
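The resulting read path can be modelled as a small state machine. This is
a sketch under assumptions: struct demo_dev and the demo_* functions are
hypothetical, and demo_fetch_done() stands in for whatever completion
callback the driver receives.
```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct demo_dev {
	bool ready;	/* a queued fetch has completed */
	uint32_t data;
};

/* Queue a fetch; completion is reported later via demo_fetch_done(). */
void
demo_queue_fetch(struct demo_dev *d)
{
	d->ready = false;
}

/* Completion callback for a previously queued fetch. */
void
demo_fetch_done(struct demo_dev *d, uint32_t v)
{
	d->data = v;
	d->ready = true;
}

/* Attach queues the initial fetch instead of waiting for data. */
void
demo_attach(struct demo_dev *d)
{
	demo_queue_fetch(d);
}

/* Never blocks: hand out completed data and queue the next fetch. */
int
demo_read(struct demo_dev *d, uint32_t *out)
{
	if (!d->ready)
		return (EAGAIN);
	*out = d->data;
	demo_queue_fetch(d);
	return (0);
}
```
A caller that reads twice in a row without an intervening completion gets
EAGAIN on the second call rather than blocking.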
Co-authored-by: John Baldwin <jhb@FreeBSD.org>
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D41656
timerfd was introduced in FreeBSD 14, and the Linux ABI timerfd was also
moved to the native FreeBSD timerfd. It does not work correctly, though,
because the Linux TFD_CLOEXEC and TFD_NONBLOCK flags are not converted to
the FreeBSD TFD_CLOEXEC and TFD_NONBLOCK flags.
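The kind of translation needed looks roughly like this. It is a sketch
only: the DEMO_* constants are illustrative values chosen to differ
between the two ABIs, and demo_linux_to_native_tfd_flags() is a
hypothetical helper, not the Linuxulator function.
```c
#include <errno.h>

/* Illustrative values only; the real ones come from the respective headers. */
#define	DEMO_LINUX_TFD_CLOEXEC	02000000
#define	DEMO_LINUX_TFD_NONBLOCK	00004000
#define	DEMO_TFD_CLOEXEC	0x00100000
#define	DEMO_TFD_NONBLOCK	0x00000004

/* Translate Linux flag bits to native ones, rejecting unknown bits. */
int
demo_linux_to_native_tfd_flags(int lflags, int *flags)
{
	if ((lflags & ~(DEMO_LINUX_TFD_CLOEXEC | DEMO_LINUX_TFD_NONBLOCK)) != 0)
		return (EINVAL);
	*flags = 0;
	if ((lflags & DEMO_LINUX_TFD_CLOEXEC) != 0)
		*flags |= DEMO_TFD_CLOEXEC;
	if ((lflags & DEMO_LINUX_TFD_NONBLOCK) != 0)
		*flags |= DEMO_TFD_NONBLOCK;
	return (0);
}
```
Passing the Linux bits through unchanged would set unrelated native bits,
which is why an explicit mapping is needed.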
Reviewed by: dchagin, jfree
Differential revision: https://reviews.freebsd.org/D41708
MFC after: 1 week