Commit graph

286304 commits

Author SHA1 Message Date
Konstantin Belousov 525bc87f54 kern_kthread: fork1() does not handle locked Giant
fork1() does not behave if called under Giant.  For instance, it might
need to call thread_suspend_check() which explicitly verifies that Giant
is not locked.  On the other hand, the kthread KPI is often called from
SYSINIT() which is still Giant-locked.

Correct this by dropping Giant in kthread_add() and kproc_create().

Reported by:	pho
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D41694
2023-09-03 08:21:53 +03:00
Konstantin Belousov ea70866bb1 kern_kthread.c: some style
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D41694
2023-09-03 08:21:44 +03:00
Justin Hibbits ac77837ca7 dtsec(4): Restore IFCAP_JUMBO_MTU lost in IfAPI conversion
Also add IFCAP_VLAN_MTU, since it's supported.

Fixes:		0083fc5c7 ("Mechanically convert dtsec(4) to IfAPI")
MFC after:	1 week
2023-09-02 16:59:09 -04:00
Zhenlei Huang 224aec05e7 tcp: Initialize the maximum number of entries in a client cookie cache bucket
This vnet loader tunable is defined with SYSCTL_PROC, so it is not
initialized by the kernel on vnet creation and always has the default
value TCP_FASTOPEN_CCACHE_BUCKET_LIMIT_DEFAULT.

Fix by fetching the value from the corresponding kernel environment during
vnet construction.

PR:		273509
Reviewed by:	#transport, tuexen
Fixes:	c560df6f12 This is an implementation of the client side of TCP Fast Open (TFO) [RFC7413]
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D41691
2023-09-03 04:34:07 +08:00
Warner Losh ea82362219 tests: Skip all tests that require mdconfig when /dev/mdctl missing
When run in a jail, /dev/mdctl is missing. So skip any tests that use
mdconfig or mdmfs with md in this case: they can't possibly work. This
is in line with other tests that test for presence of required features
and skip if they aren't present. I did this instead of checking for
jails so they can still run in jails that allow creation of md devices.

Sponsored by:		Netflix
2023-09-02 13:16:22 -06:00
Mateusz Guzik 32988c1499 vfs cache: fix a hang when bumping vnode limit too high
Overflow in cache_changesize would make the value flip to 0 and stay
there as 0 << 1 does not do anything.

Note callers limit the outcome to something below u_int.

Also note the entire vnode handling thing, both in the vfs layer as a whole
and in this file, can't decide whether to use long, u_long or u_int.
2023-09-02 14:45:27 +00:00
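The wrap-to-zero behaviour described in the vfs cache entry above (32988c1499) can be reproduced in isolation. The following is a minimal sketch, not the actual cache_changesize() code: the helper names `double_once` and `roundup_pow2_guarded` are hypothetical, and the clamp shown is only one possible guard, not necessarily the one the commit uses.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical helper: doubles an unsigned size with a left shift.
 * Unsigned arithmetic wraps, so shifting out the top bit yields 0,
 * and 0 << 1 is still 0. A loop of the form
 * "while (size < want) size <<= 1;" therefore never terminates
 * once size has wrapped.
 */
static unsigned int
double_once(unsigned int size)
{
	return (size << 1);
}

/*
 * Hypothetical guarded variant: round "want" up to a power of two,
 * but stop before the shift that would wrap to 0.
 */
static unsigned int
roundup_pow2_guarded(unsigned int want)
{
	unsigned int size = 1;

	while (size < want) {
		if (size > UINT_MAX / 2)	/* next shift would wrap */
			return (size);		/* clamp at the top power of two */
		size <<= 1;
	}
	return (size);
}
```

Calling `double_once()` on the largest power of two that fits in an unsigned int shows the flip to 0, while `roundup_pow2_guarded()` returns that same power of two instead of looping forever.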
Martin Matuska 2ad756a6bb zfs: merge openzfs/zfs@95f71c019
Notable upstream pull request merges:
  #15018 Increase limit of redaction list by using spill block
  #15161 Make zoned/jailed zfsprops(7) make more sense
  #15216 Relax error reporting in zpool import and zpool split
  #15218 Selectable block allocators
  #15227 ZIL: Tune some assertions
  #15228 ZIL: Revert zl_lock scope reduction
  #15233 ZIL: Change ZIOs issue order

Obtained from:	OpenZFS
OpenZFS commit:	95f71c019d
2023-09-02 12:33:26 +02:00
Mateusz Guzik f4296cfb40 timerfd: convert timerfd_list_lock from sx to mtx
There was no good reason to use the former. This should prevent some
head-scratching by an interested and qualified reader.
2023-09-02 09:55:50 +00:00
Kyle Evans 07bc20e474 localedef: correct definition of right-parenthesis in default charmap
It turns out that right parentheses do exist and are different than
left parentheses, so let's switch to that.

Sponsored by:	Klara, Inc.
2023-09-02 00:58:35 -05:00
ednadolski-ix 95f71c019d
Selectable block allocators
ZFS historically has had several space allocators that were
dynamically selectable.  While these have been retained in 
OpenZFS, only a single allocator has been statically compiled 
in. This patch compiles all allocators for OpenZFS and provides 
a module parameter to allow for manual selection between them.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Edmund Nadolski <edmund.nadolski@ixsystems.com>
Closes #15218
2023-09-01 18:00:30 -07:00
Umer Saleem 71472bf375
Relax error reporting in zpool import and zpool split
For zpool import and zpool split, zpool_enable_datasets is called
to mount and share all datasets in a pool. If there is an error
while mounting or sharing any dataset in the pool, the status of
import or split is reported as failure. However, the changes do
show up in zpool list.

This commit updates the error reporting in zpool import and zpool
split path. More descriptive messages are shown to user in case
there is an error during mount or share. Errors in mount or share
do not affect the overall status of zpool import and zpool split.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Closes #15216
2023-09-01 17:25:11 -07:00
Andrea Righi bcb1159c09
Linux 6.5 compat: safe cleanup in spl_proc_fini()
If we fail to create a proc entry in spl_proc_init() we may end up
calling unregister_sysctl_table() twice: once in the failure path of
spl_proc_init() and again during spl_proc_fini().

Avoid the double call to unregister_sysctl_table() and while at it
refactor the code a bit to reduce code duplication.

This was accidentally introduced when the spl code was
updated for Linux 6.5 compatibility.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Ameer Hamza <ahamza@ixsystems.com>
Signed-off-by: Andrea Righi <andrea.righi@canonical.com>
Closes #15234 
Closes #15235
2023-09-01 17:21:40 -07:00
Alexander Motin 9da6b60417
ZIL: Change ZIOs issue order.
In zil_lwb_write_issue(), after issuing lwb_root_zio/lwb_write_zio,
we have no right to access lwb->lwb_child_zio. If it was not there,
the first two ZIOs may have already completed and freed the lwb.
Issuing the ZIOs in the opposite order, from children to parent, should
keep the lwb valid till the end, since the lwb can be freed only after
the lwb_root_zio completion callback.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15233
2023-09-01 17:14:50 -07:00
Alexander Motin b1b99e10a6
ZIL: Revert zl_lock scope reduction.
While I have no reports of it, I suspect possible use-after-free
scenario when zil_commit_waiter() tries to dereference zcw_lwb
for lwb already freed by zil_sync(), while zcw_done is not set.
Extension of zl_lock scope as it was originally should block
zil_sync() from freeing the lwb, closing this race.

This reverts #14959 and couple chunks of #14841.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15228
2023-09-01 17:13:52 -07:00
Alexander Motin bbcf18c293
ZIL: Tune some assertions.
In zil_free_lwb() we should first assert lwb_state or the rest of
assertions can be misleading if it is false.

Add lwb_state assertions in zil_lwb_add_block() to make sure we are
not trying to add elements to lwb_vdev_tree after it was processed.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15227
2023-09-01 17:13:22 -07:00
John Baldwin 4a9cd9fc22 amd64 db_trace: Reject unaligned frame pointers
Switch to using db_addr_t to hold frame pointer values until they are
verified to be suitably aligned.

Reviewed by:	kib, markj
Differential Revision:	https://reviews.freebsd.org/D41532
2023-09-01 15:55:37 -07:00
Dag-Erling Smørgrav c9f5889d05 libc: Further nit in fopen(3) man page.
Sponsored by:	Klara, Inc.
Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D41687
2023-09-01 22:53:35 +00:00
John Baldwin 7063f94283 pci_iov: Refuse to create VFs which require ARI if ARI is not available
If a parent downstream port doesn't support ARI, the code would try to
create VFs anyway but then all PCI config space access to those VFs
would fail.

Tested by:	np
Sponsored by:	Chelsio Communications
2023-09-01 14:18:38 -07:00
Trond Endrestøl b7000cadfb scandir.3: Fix several typos
PR:		273480
Reviewed by:	markj
MFC after:	1 week
2023-09-01 16:57:03 -04:00
Dag-Erling Smørgrav 5a57401e71 libc: Fix fmemopen(3) prototype in fopen(3) man page.
While here, also update a mention of ANSI C.

Sponsored by:	Klara, Inc.
Reviewed by:	kevans, markj
Differential Revision:	https://reviews.freebsd.org/D41686
2023-09-01 20:56:26 +00:00
Mateusz Guzik b2a48c3cf8 pf: retire pf_krule_to_rule and pf_kpool_to_pool
Discussed with:	kp
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-09-01 18:18:02 +00:00
Brooks Davis a89e98ec33 src.conf.5: regen 2023-09-01 18:23:33 +01:00
Brooks Davis a8ae129e6e LIBC_MALLOC: description typo fix
Fixes:		09e32b2fdd
Reported by:	jrtc27
2023-09-01 18:23:33 +01:00
Brooks Davis 48d057378d UPDATING: typo fix
Fixes:		2befa269b8
Reported by:	jrtc27
2023-09-01 18:23:23 +01:00
Brooks Davis 3fe97711e3 src.conf.5: regen 2023-09-01 17:54:24 +01:00
Brooks Davis 2befa269b8 Add INIT_ALL build option
This option replaces WITH_INIT_ALL_PATTERN and WITH_INIT_ALL_ZERO with
INIT_ALL=pattern and INIT_ALL=zero respectively.  As these are
relatively rarely used options no backwards compatibility is
implemented.

Reviewed by:	emaste
Relnotes:	yes
Differential Revision:	https://reviews.freebsd.org/D41675
2023-09-01 17:54:24 +01:00
Brooks Davis 09e32b2fdd libc: add LIBC_MALLOC option
This will enable alternative mallocs to be included in the tree and
selected by setting LIBC_MALLOC.  As there is only one today (jemalloc)
this option does nothing, but we expect to add other implementations
in the future.  This will also reduce diffs to CheriBSD.

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D41660
2023-09-01 17:54:23 +01:00
Brooks Davis bd016ad227 Teach make showconfig about group options
Output OPT_ variables in addition to MK_ variables.

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D41682
2023-09-01 17:54:23 +01:00
Brooks Davis 897ae85f7d makeman: add minimal support for group options
Ignore OPT_* values in showconfig output in existing code paths and add
a new path to include descriptions for each. For now, hardcode the
description contents rather than attempting to generate them.  This runs
the risk of docs getting out of date, but limits the amount of new shell
code added today while a lua rewrite is nearly ready to land.

This change requires a followup commit to enable OPT_* values in
"make showconfig" in order to actually find group options.

Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D41681
2023-09-01 17:54:23 +01:00
Brooks Davis ce5fa47cf0 share/mk: support for "single" group options
Support group options where 1 of n values will be selected (or a default
value will be used).  After processing, an OPT_FOO will be set to one
value from __FOO_OPTIONS for each FOO in __SINGLE_OPTIONS.  If the user
sets FOO that value will be used, otherwise __FOO_DEFAULT will be used.

Options that don't work on a particular system can be remapped to an
alternative using BROKEN_SINGLE_OPTIONS which can be set to a list of
3-tuples of the form:
	OPTION broken_value replacement_value

This is somewhat inspired by OPTIONS_SINGLE from ports, but the
structure is quite different with a per-option variable in the style of
MK_FOO={yes,no}.

Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41659
2023-09-01 17:52:28 +01:00
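The selection rule described in the share/mk entry above (ce5fa47cf0) is implemented in make, but its logic can be modelled in a few lines of C. This is a sketch under assumptions: `struct broken_tuple` and `resolve_single_option` are hypothetical names, validation against __FOO_OPTIONS is omitted, and applying the remap after the default is chosen is a guess, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One 3-tuple in the style of: OPTION broken_value replacement_value */
struct broken_tuple {
	const char *option;
	const char *broken_value;
	const char *replacement_value;
};

/*
 * Hypothetical model: use the user's value if one is set, otherwise
 * the default, then remap the result through the first matching
 * broken tuple (assumed ordering, see the lead-in).
 */
static const char *
resolve_single_option(const char *option, const char *user_value,
    const char *default_value, const struct broken_tuple *broken,
    size_t nbroken)
{
	const char *value;
	size_t i;

	value = (user_value != NULL) ? user_value : default_value;
	for (i = 0; i < nbroken; i++) {
		if (strcmp(broken[i].option, option) == 0 &&
		    strcmp(broken[i].broken_value, value) == 0)
			return (broken[i].replacement_value);
	}
	return (value);
}
```

With a single tuple `{ "FOO", "b", "c" }`, an unset FOO with default "a" resolves to "a", while a user setting of "b" resolves to "c".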
Zachary Leaf 565c887a77 armv8_crypto: fix recursive fpu_kern_enter call
Now armv8_crypto is using FPU_KERN_NOCTX, this results in a kernel panic
in armv8_crypto.c:armv8_crypto_cipher_setup:

    panic: recursive fpu_kern_enter while in PCB_FP_NOSAVE state

This is because in armv8_crypto.c:armv8_crypto_cipher_process,
directly after calling fpu_kern_enter() a call is made to
armv8_crypto_cipher_setup(), resulting in nested calls to
fpu_kern_enter() without the required fpu_kern_leave() in between.

Move fpu_kern_enter() in armv8_crypto_cipher_process() after the
call to armv8_crypto_cipher_setup() to resolve this.

Reviewed by:	markj, andrew
Fixes: 6485286f53 ("armv8_crypto: Switch to using FPU_KERN_NOCTX")
Sponsored by: Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41671
2023-09-01 10:56:58 +01:00
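The ordering problem in the armv8_crypto entry above (565c887a77) can be illustrated with a toy model. The `toy_*` functions below are hypothetical stand-ins, not the kernel's fpu_kern_enter()/fpu_kern_leave(); the sketch assumes, as the commit message implies, that the setup routine brackets its own work with an enter/leave pair, and it counts a nested enter instead of panicking.

```c
#include <assert.h>
#include <stdbool.h>

static bool toy_fpu_held;	/* true between toy enter and toy leave */
static int toy_recursions;	/* nested enters; the kernel would panic */

static void
toy_fpu_enter(void)
{
	if (toy_fpu_held)
		toy_recursions++;
	toy_fpu_held = true;
}

static void
toy_fpu_leave(void)
{
	toy_fpu_held = false;
}

/* Assumed shape of the setup step: it enters and leaves on its own. */
static void
toy_cipher_setup(void)
{
	toy_fpu_enter();
	/* key schedule work would go here */
	toy_fpu_leave();
}

/* Old order: enter first, then call setup, which enters again. */
static int
toy_process_before_fix(void)
{
	toy_fpu_held = false;
	toy_recursions = 0;
	toy_fpu_enter();
	toy_cipher_setup();
	toy_fpu_leave();
	return (toy_recursions);
}

/* New order: setup completes before the outer enter. */
static int
toy_process_after_fix(void)
{
	toy_fpu_held = false;
	toy_recursions = 0;
	toy_cipher_setup();
	toy_fpu_enter();
	/* cipher work would go here */
	toy_fpu_leave();
	return (toy_recursions);
}
```

In this model the old ordering records one nested enter and the reordered version records none.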
Andrew Turner 5429e19421 gicv3: Add logging for when its_device_alloc fails
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41566
2023-09-01 10:56:42 +01:00
Andrew Turner 8b143276ae gicv3: Support indirect ITS tables
The GICv3 ITS device supports two options for device tables. Currently
we support a single table to hold all device IDs, however when the
device ID space grows large this can be too large for the GITS_BASER
register to describe.

To handle this case, and to reduce the memory needed when this space
is sparse, support the second option, the indirect table. The indirect
table is a 2 level table where the first level contains the physical
address of the second with a valid bit. The second level is an ITS
page sized table where each entry is the original entry size.

As we don't need to allocate a second level table for device IDs that
don't exist this can reduce the allocation size.

Reviewed by:	gallatin
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41555
2023-09-01 10:56:17 +01:00
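The two-level lookup described in the gicv3 entry above (8b143276ae) comes down to splitting a device ID into a first-level and a second-level index. This is a minimal sketch, not the driver code: the page size and entry size below are illustrative assumptions (the real values are read from the hardware), and `its_split_devid` and `its_l1_entries` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real ones come from the ITS. */
#define	SKETCH_ITS_PAGE_SIZE	65536u	/* assumed ITS page size, bytes */
#define	SKETCH_ITS_ENTRY_SIZE	8u	/* assumed table entry size, bytes */
#define	SKETCH_L2_ENTRIES	(SKETCH_ITS_PAGE_SIZE / SKETCH_ITS_ENTRY_SIZE)

struct its_index {
	uint32_t l1;	/* slot in the first-level table */
	uint32_t l2;	/* slot in the ITS-page-sized second-level table */
};

/* Split a device ID into first- and second-level indices. */
static struct its_index
its_split_devid(uint32_t devid)
{
	struct its_index idx;

	idx.l1 = devid / SKETCH_L2_ENTRIES;
	idx.l2 = devid % SKETCH_L2_ENTRIES;
	return (idx);
}

/* First-level entries needed to cover 2^devid_bits IDs (devid_bits < 64). */
static uint64_t
its_l1_entries(unsigned int devid_bits)
{
	uint64_t ids = (uint64_t)1 << devid_bits;

	return ((ids + SKETCH_L2_ENTRIES - 1) / SKETCH_L2_ENTRIES);
}
```

Only the second-level pages whose first-level slot is actually used need to be allocated, which is where the memory saving for sparse device ID spaces comes from.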
Andrew Turner 7ad28b73ec arm: Add a userspace physical timer check
We currently use the same Arm generic timer in both userspace and the
kernel. As we always enable userspace access to the virtual timer we
can tell userspace to use it.

Reviewed by:	imp
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D41565
2023-09-01 10:49:18 +01:00
Dmitry Chagin edd28b857e jail(8): Fix mandoc warnings
Reviewed by:		gbe
Differential Revision:	https://reviews.freebsd.org/D41680
MFC after:		1 week
2023-09-01 11:13:46 +03:00
Dmitry Chagin 1d41989933 extattr(9): Remove a reference to a non-existent VFS_EXTATTRCTL(9)
Reviewed by:		gbe
Differential Revision:	https://reviews.freebsd.org/D41678
MFC after:		1 week
2023-09-01 11:13:22 +03:00
Dmitry Chagin 315d7bbbb4 extattr(9): Mention system namespace restrictions in a jail
Reported by:		netchild
Reviewed by:		gbe
Differential revision:	https://reviews.freebsd.org/D41676
MFC after:		1 week
2023-09-01 11:12:51 +03:00
Shawn Webb cb48780db4 jail: Add the ability to access system-level filesystem extended attributes
Prior to this commit privileged accounts in a jail could not access the
filesystem extended attributes in the system namespace. To control access to
the system namespace on a per-jail basis add a new configuration parameter
allow.extattr which is off by default.

Reported by:		zirias
Tested by:		zirias
Obtained from:		HardenedBSD
Reviewed by:		kevans, jamie
Differential revision:	https://reviews.freebsd.org/D41643
MFC after:		1 week
Relnotes:		yes
2023-09-01 11:11:33 +03:00
Dmitry Chagin 1bfc4574f7 linux(4): Return ENOTSUP from xattr syscalls instead of EPERM
FreeBSD does not permit manipulating extended attributes in the system
namespace by unprivileged accounts, even if the account has appropriate
privileges to access the filesystem object.
In Linux the system namespace is used to preserve POSIX ACLs. Some GNU
coreutils binaries use POSIX ACLs, e.g., install and ls, and fail if we
unexpectedly return an EPERM error from xattr system calls.

On the other hand, in Linux read and write access to the system
namespace depends on the policy implemented for each filesystem, so we
mimic a filesystem that prohibits this for unprivileged accounts.

Reported by:		zirias
Tested by:		zirias
MFC after:		1 week
2023-09-01 11:11:02 +03:00
Dmitry Chagin dfcc0237c3 linux(4): Merge removexattr for future error recode
Tested by:		zirias
MFC after:		1 week
2023-09-01 11:10:44 +03:00
Dmitry Chagin 4d59b79055 linux(4): Return ENODATA from getxattr syscalls instead of EPERM
On Linux ENODATA means the named attribute does not exist, or the
process has no access to this attribute.

Reported by:		zirias
Tested by:		zirias
MFC after:		1 week
2023-09-01 11:10:12 +03:00
Dmitry Chagin 6b46ec6612 linux(4): Merge getxattr for future error recode
Tested by:		zirias
MFC after:		1 week
2023-09-01 11:09:49 +03:00
Kyle Evans 03d104888c arm64: initialize pcb in the TBI/PAC/etc. fault case
After 2c10be9e06, we may jump to the bad_far label without `pcb` being
set, resulting in a follow-up fault as we may dereference it immediately
after the jump if td_intr_nesting_level == 0.  In this branch, it should
be safe to dereference `td` as we're not handling the special case
mentioned below of accessing it during promotion/demotion.

This seems to fix a null ptr deref I hit during my most recent pkgbase
build attempt on the Windows DevKit, though that was admittedly
encountered while we were on the way to a panic from an apparent
use-after-free in ZFS bits.

Reviewed by:	andrew, markj
Fixes:	2c10be9e06 ("arm64: Handle translation faults for thread [..]")
Differential Revision:	https://reviews.freebsd.org/D41677
2023-08-31 21:10:38 -05:00
Dimitry Andric 010c003e5f
dmu_buf_will_clone: change assertion to fix 32-bit compiler warning
Building module/zfs/dbuf.c for 32-bit targets can result in a warning:

In file included from
/usr/src/sys/contrib/openzfs/include/sys/zfs_context.h:97,
                 from /usr/src/sys/contrib/openzfs/module/zfs/dbuf.c:32:
/usr/src/sys/contrib/openzfs/module/zfs/dbuf.c: In function
'dmu_buf_will_clone':
/usr/src/sys/contrib/openzfs/lib/libspl/include/assert.h:116:33: error:
cast from pointer to integer of different size
[-Werror=pointer-to-int-cast]
  116 |         const uint64_t __left = (uint64_t)(LEFT);       \
      |                                 ^
/usr/src/sys/contrib/openzfs/lib/libspl/include/assert.h:148:25: note:
in expansion of macro 'VERIFY0'
  148 | #define ASSERT0         VERIFY0
      |                         ^~~~~~~
/usr/src/sys/contrib/openzfs/module/zfs/dbuf.c:2704:9: note: in
expansion of macro 'ASSERT0'
 2704 |         ASSERT0(dbuf_find_dirty_eq(db, tx->tx_txg));
      |         ^~~~~~~

This is because dbuf_find_dirty_eq() returns a pointer, which if
pointers are 32-bit results in a warning about the cast to uint64_t.

Instead, use the ASSERT3P() macro, with == and NULL as second and third
arguments, which should work regardless of the target's bitness.

Reviewed-by: Kay Pedersen <mail@mkwg.de>
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Brian Atkinson <batkinson@lanl.gov>
Signed-off-by: Dimitry Andric <dimitry@andric.com>
Closes #15224
2023-08-31 18:17:12 -07:00
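The warning discussed in the dmu_buf_will_clone entry above (010c003e5f) arises from casting a pointer directly to uint64_t on a target with 32-bit pointers. The sketch below shows the general idea behind a pointer-flavoured comparison: convert through uintptr_t, which is pointer-sized by definition. `ptr_equals` is a hypothetical helper, not the OpenZFS ASSERT3P() macro.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a pointer comparison assertion: both
 * sides are converted to uintptr_t, so no cast changes width and
 * -Wpointer-to-int-cast has nothing to report on 32-bit targets.
 */
static int
ptr_equals(const void *left, const void *right)
{
	const uintptr_t l = (uintptr_t)left;
	const uintptr_t r = (uintptr_t)right;

	return (l == r);
}
```

A check such as `ptr_equals(p, NULL)` behaves the same regardless of the target's pointer width.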
Brooks Davis d889833334 src.conf.5: regen 2023-09-01 01:02:20 +01:00
Brooks Davis 89aed8837f makeman: clarify scope of ignored option values
The values of WITH_ and WITHOUT_ options are ignored, but group options
are not.

Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41683
2023-09-01 00:58:39 +01:00
Glen Barber 82c57e2a75 release: remove arm/armv6 RPI-B configuration file
The arm/armv6 RPI-B images are deprecated in 15 and 14.
An MFC to stable/14 will follow.

MFC after:	3 days
Sponsored by:	GoFundMe https://www.gofundme.com/f/gjbbsd
Sponsored by:	PayPal https://paypal.me/gjbbsd
2023-08-31 19:24:38 -04:00
Jamie Gritton db08e8ba0e Re-remove $FreeBSD$ inadvertently put back into jail.8 2023-08-31 15:35:00 -07:00
Dag-Erling Smørgrav 4cd9d804ae libipf: fix parser error message.
MFC after:	1 week
Reviewed by:	cy
Differential Revision:	https://reviews.freebsd.org/D41652
2023-08-31 22:15:54 +02:00
Mina Galić 09ec5e67a7 libc: fix history for strverscmp(3) and versionsort(3)
strverscmp(3) and versionsort(3) were first released in 13.2

PR:		273401
Reviewed by:	kib
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D41617
2023-08-31 14:52:31 +03:00