Commit graph

287746 commits

Author SHA1 Message Date
Kristof Provost 0fe663b2a8 pf: always create multihomed states as floating
When we create a new state for multihomed sctp connections (i.e.
based on INIT/INIT_ACK or ASCONF parameters) we cannot know what
interfaces we'll be seeing that traffic on. Make those states floating,
irrespective of state policy.

MFC after:	1 week
Sponsored by:	Orange Business Services
2023-11-17 23:33:43 +01:00
Kirk McKusick 772430dd67 Ensure I/O buffers in libufs(3) are 128-byte aligned.
Various disk controllers require their buffers to be aligned to a
cache-line size (128 bytes). For buffers allocated in structures,
ensure that they are 128-byte aligned. Use aligned_malloc to allocate
memory to ensure that the returned memory is 128-byte aligned.

While we are here, we replace the dynamically allocated inode buffer
with a buffer allocated in the uufsd structure just as the superblock
and cylinder group buffers do.

This can be removed if/when the kernel is fixed. Because this problem
has existed on one I/O subsystem or another since the 1990's, we
are probably stuck with dealing with it forever.

The problem most recently showed up in Azure, see:
    https://reviews.freebsd.org/D41728
    https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=267654
Before these fixes were applied, it was confirmed that the changes
in this commit also fixed the issue in Azure.

Reviewed-by: Warner Losh, kib
Tested-by:   Souradeep Chakrabarti of Microsoft (earlier version)
PR:          267654
Differential Revision: https://reviews.freebsd.org/D41724
2023-11-17 14:11:24 -08:00
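As a rough illustration of the alignment requirement described in the commit above, the sketch below uses C11 `aligned_alloc` for heap buffers and `_Alignas` for a buffer embedded in a structure. The names `BUF_ALIGN`, `struct disk_ctx`, `round_up_to_align` and `alloc_io_buffer` are hypothetical and are not taken from libufs(3); only the 128-byte figure comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical constant: the cache-line alignment named in the commit message. */
#define BUF_ALIGN 128

/* Hypothetical structure: an embedded buffer forced to 128-byte alignment. */
struct disk_ctx {
	int fd;
	_Alignas(BUF_ALIGN) unsigned char sblock_buf[4096];
};

/* Round size up to a multiple of BUF_ALIGN, as C11 aligned_alloc() expects. */
static size_t
round_up_to_align(size_t size)
{
	return ((size + BUF_ALIGN - 1) / BUF_ALIGN * BUF_ALIGN);
}

/* Allocate a heap buffer whose address is a multiple of BUF_ALIGN. */
static void *
alloc_io_buffer(size_t size)
{
	void *p = aligned_alloc(BUF_ALIGN, round_up_to_align(size));

	/* A non-NULL result must satisfy the requested alignment. */
	assert(p == NULL || ((uintptr_t)p % BUF_ALIGN) == 0);
	return (p);
}
```

A caller would release the buffer with free(), exactly as for malloc(). Embedding the buffer in the structure trades a larger structure for one less allocation to track.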
Brooks Davis cd67bc0ae4 freebsd: remove __FBSDID macro use
With FreeBSD's switch to git the $FreeBSD$ string is no longer expanded
and they have mostly been removed upstream.  Stop using __FBSDID and
remove the no-longer needed sys/cdefs.h includes.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Signed-off-by: Brooks Davis <brooks.davis@sri.com>
Closes #15527
2023-11-17 14:02:09 -08:00
Alexander Motin 5a3bffab10 ZIO: Optimize zio_flush()
 - Generalize vdev_nowritecache handling by traversing the VDEV tree
   and skipping child ZIOs where it is not supported.
 - Remove the intermediate zio_null() in case of several VDEV children.
 - Remove children handling from zio_ioctl().  There are no other
   use cases for this code besides DKIOCFLUSHWRITECACHED, and if there
   were, I doubt they would apply so straightforwardly to all VDEV
   children.

Compared to the removed previous optimization, this should improve
cases of redundant ZILs/SLOGs.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: George Wilson <george.wilson@delphix.com>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15515
2023-11-17 14:00:59 -08:00
Alexander Motin 22c8c33a58 Use abd_zero_off() where applicable
In several places abd_zero() cleared an ABD that was filled on the
very next line.

Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by:	Alexander Motin <mav@FreeBSD.org>
Sponsored by:	iXsystems, Inc.
Closes #15514
2023-11-17 13:28:32 -08:00
Rob N 92dc4ad83d Consider dnode_t allocations in dbuf cache size accounting
Entries in the dbuf cache contribute only the size of the dbuf data to
the cache size. Attached "user" data is not counted. This can lead to
the data currently "owned" by the cache consuming more memory than
accounting appears to show. In some cases (e.g. a metadnode data block
with all child dnode_t slots allocated), the actual size can be as much
as 3x what the cache believes it to be.

This is arguably correct behaviour, as the cache is only tracking the
size of the dbuf data, not even the overhead of the dbuf_t. On the other
hand, in the above case of dnodes, evicting cached metadnode dbufs is
the only current way to reclaim the dnode objects, and can lead to the
situation where the dbuf cache appears to be comfortably within its
target memory window and yet is holding enormous amounts of slab memory
that cannot be reclaimed.

This commit adds a facility for a dbuf user to artificially inflate the
apparent size of the dbuf for caching purposes. This at least allows for
cache tuning to be adjusted to match something closer to the real memory
overhead.

metadnode dbufs carry a >1KiB allocation per dnode in their user data.
This informs the dbuf cache machinery of that fact, allowing it to make
better decisions when evicting dbufs.

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #15511
2023-11-17 13:25:53 -08:00
Paul Dagnelie 6c6fae6fae Fix memory leak in zfs_setprocinit code
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Reviewed-by: Rich Ercolani <rincebrain@gmail.com>
Signed-off-by: Paul Dagnelie <pcd@delphix.com>
Closes #15508
2023-11-17 13:21:04 -08:00
Mike Karels 415c1c748d khelp: suppress useless warning message on shutdown
If a module (e.g. the ertt hhook for TCP) can't clean up at
shutdown, there is nothing to be done about it.  In the ertt case,
cleanup is just shutting down a UMA zone, which doesn't need to be
done.  Suppress EBUSY warnings on shutdown.

PR:		271677
Reviewed by:	tuexen, imp
Differential Revision:	https://reviews.freebsd.org/D42650
2023-11-17 12:51:18 -06:00
Gordon Bergling 115459be31 SEE ALSO section improvements for tuning(7), tunefs(8) and fsck_ffs(8)
cross-reference ffs(7) in fsck_ffs(8)
cross-reference ffs(7) and tuning(7) in tunefs(8)
cross-reference ffs(7) in tuning(7)

PR:	263433
Reviewed by:	bcr
MFC after:	5 days
Differential Revision:	https://reviews.freebsd.org/D42631
2023-11-17 19:24:22 +01:00
Alexander Motin 2ac9cecac6 Fix typo in previous d282baddb0, breaking DTrace. 2023-11-17 12:42:33 -05:00
Gleb Smirnoff 43f7e21668 ng_ksocket: fix accept(2)
- Provide listen upcall and set it on NGM_KSOCKET_LISTEN
- Mask EWOULDBLOCK on NGM_KSOCKET_ACCEPT

Reviewed by:		afedorov
Differential Revision:	https://reviews.freebsd.org/D42637
PR:			272319
PR:			275106
Fixes:			779f106aa1
2023-11-17 09:24:30 -08:00
Gleb Smirnoff efad7cbfdc ng_ksocket: fix upcall clearing on node shutdown
Note: imho, the proper solution would be to guarantee that upcalls
won't ever be called after soclose(), but this isn't the case, yet.
This change at least makes the node work the way it always worked.

Reviewed by:		afedorov
Differential Revision:	https://reviews.freebsd.org/D42636
PR:			272319
PR:			275106
Fixes:			779f106aa1
2023-11-17 09:23:58 -08:00
Brad Davis 5bcd2d5a43 Fix a comment typo. 2023-11-17 10:08:24 -07:00
Igor Ostapenko fe3bb40b9e pf: fix dummynet + ipdivert use case
Dummynet re-injects an mbuf with MTAG_IPFW_RULE added, and the same mtag
is used by divert(4) as parameters for packet diversion.

If, according to the pf rule set, a packet should go through dummynet
first and through ipdivert afterwards, then the mentioned mtag must be
removed after dummynet so that ipdivert does not treat it as its own
input parameters.

ipfw consumes this mtag at the very beginning, which results in the
same behavior as clearing the tag after dummynet.

And after fabf705f4b pf passes parameters to ipdivert using its
personal MTAG_PF_DIVERT mtag.

PR:		274850
Reviewed by:	kp
Differential Revision:	https://reviews.freebsd.org/D42609
2023-11-17 17:06:16 +01:00
Ka Ho Ng b1538e8fc4 dirdeps: Fix libpcap Makefile.depend.options
This prevents libpcap's Makefile.depend from flip-flopping when OFED is
enabled.

Sponsored by:	Juniper Networks, Inc.
MFC after:	7 days
Reviewed by:	sjg
Differential Revision:	https://reviews.freebsd.org/D42649
2023-11-17 11:34:57 -05:00
Mark Johnston b08a9b86f5 ktls tests: Relax error checking for shutdown(2) a bit
In my test suite runs I occasionally see shutdown(2) fail with
ECONNRESET rather than ENOTCONN.  soshutdown(2) will return ENOTCONN if
the socket has been disconnected (synchronized by the socket lock), and
tcp_usr_shutdown() will return ECONNRESET if the inpcb has been dropped
(synchronized by the inpcb lock).  I think it's possible to pass the
first check in soshutdown() but fail the second check in
tcp_usr_shutdown(), so modify the KTLS tests to permit this.

Reviewed by:	jhb
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D42277
2023-11-17 09:31:21 -05:00
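The relaxed check described in the commit above can be pictured roughly as follows. This is only a sketch: `shutdown_errno_expected` and `check_shutdown_failed` are hypothetical names, not functions from the KTLS test suite, and the only facts taken from the commit message are the two errno values.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: a failed shutdown(2) on an already-closed TCP
 * connection may report either error, depending on which check fails first.
 */
static bool
shutdown_errno_expected(int error)
{
	return (error == ENOTCONN || error == ECONNRESET);
}

/* Hypothetical test helper: rv and error are shutdown(2)'s result and errno. */
static void
check_shutdown_failed(int rv, int error)
{
	assert(rv == -1);
	assert(shutdown_errno_expected(error));
}
```

A test would call `check_shutdown_failed(shutdown(s, SHUT_WR), errno)` after the peer has gone away, instead of comparing errno against ENOTCONN alone.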
Bjoern A. Zeeb 1965dd85c3 mii: add Vitesse/Microsemi VSC8514
The VSC8514 Quad-Port 10/100/1000BASE-T PHY seems to match the handling
for the VSC8504 (for the little we support of what we could) and while
it works with our generic ukphy add it as vscphy for proper display of
names in the system message buffer and the like (or in case we decide
to implement some extra features).

Tested on:	Ten64
MFC after:	3 days
2023-11-17 12:38:07 +00:00
Bjoern A. Zeeb 43324ec770 mii: resort VSC8641 entry in miidevs
VSC8641 is a ciphy not a vscphy.
Sort it with the other entries of ciphy to avoid confusion.

MFC after:	3 days
2023-11-17 12:38:07 +00:00
Kristof Provost 498934c5ff libpfctl: handle pfctl_do_ioctl() failures better
Ensure that we free nvlists and other allocations if pfctl_do_ioctl()
fails.

MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-11-17 13:21:14 +01:00
Kristof Provost 33d55d0d0f libpfctl: handle allocation failure
While it's unlikely for userspace to fail to allocate memory it is still
possible. Handle malloc() returning NULL.

Reported by:	Bill Meeks <bill@themeeks.net>
MFC after:	1 week
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-11-17 13:21:14 +01:00
Bjoern A. Zeeb 6c46ebb05d dpaa2: fdt improve detection for dpmac/phys
'pcs-handles' are not mandatory in the device tree here so do not
enforce them.  This allows us to find dpmac entries needed for phys
on the WHLE-LS1 as well.

MFC after:	3 days
Reviewed by:	jceel, dsl
Differential Revision: https://reviews.freebsd.org/D42644
2023-11-17 12:20:11 +00:00
Bjoern A. Zeeb 964b3408fa dpaa2: defer link_state updates until we are up
dpaa2_ni_media_change() was called in early setup stages, before we
were fully set up.  That led to internal driver state being all synced
and fine but hardware state was lost/never set up correctly.

Introduce dpaa2_ni_media_change_locked() so we can avoid recursive
locking and call "dpaa2_ni_media_change()" instead of mii_mediachg()
as the latter does not set up our state there either.

In order for this all to work, call if_setdrvflagbits() just before
rather than after the above.

Also remove an unnecessary direct call to dpaa2_ni_miibus_statchg()
which mii_mediachg() will trigger anyway.

This all fixes a problem [1] that one had to lose the link (either
unplugging/replugging the cable or using ifconfig media none;
ifconfig media auto) to re-trigger all the updates and get the
full state programmed when hardware expected.

MFC after:	3 days
GH-Issue:	https://github.com/mcusim/freebsd-src/issues/21 [1]
Reviewed by:	dsl, dch
Differential Revision: https://reviews.freebsd.org/D42643
2023-11-17 12:20:03 +00:00
Bjoern A. Zeeb 0480dccd3f dpaa2: make software VLANs usable on dpni
dpni announces IFCAP_VLAN_MTU but internally does not increase the
maximum frame length.  Creating a vlan interface on top of a dpni
interface will result in full-sized frames not passing.
Extend the maximum frame length by ETHER_VLAN_ENCAP_LEN to allow at
least for one layer of (software) vlans for now.

MFC after:	3 days
GH-Issue:	https://github.com/mcusim/freebsd-src/issues/22
Reviewed by:	dsl
Differential Revision: https://reviews.freebsd.org/D42645
2023-11-17 12:17:54 +00:00
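The arithmetic behind the commit above can be sketched as follows. The `TOY_` constants are local stand-ins carrying the usual Ethernet values (14-byte header, 4-byte CRC, 4-byte 802.1Q tag), and `toy_max_frame_len` is a hypothetical helper; whether the dpni maximum frame length counts the CRC is an assumption made here for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Local stand-ins for the usual Ethernet constants (values in bytes). */
#define TOY_ETHER_HDR_LEN        14
#define TOY_ETHER_CRC_LEN         4
#define TOY_ETHER_VLAN_ENCAP_LEN  4

/*
 * Hypothetical helper: largest on-wire frame for a given MTU and number
 * of stacked VLAN tags (assumes the limit includes header and CRC).
 */
static size_t
toy_max_frame_len(size_t mtu, unsigned int vlan_layers)
{
	assert(mtu > 0);
	return (TOY_ETHER_HDR_LEN + mtu + TOY_ETHER_CRC_LEN +
	    (size_t)vlan_layers * TOY_ETHER_VLAN_ENCAP_LEN);
}
```

With a 1500-byte MTU this gives 1518 bytes untagged and 1522 bytes with one tag, which is why a limit sized for untagged traffic rejects full-sized tagged frames.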
Rich Ercolani 03e9caaec0 Add a tunable to disable BRT support.
Copy the disable parameter that FreeBSD implemented, and extend it to
work on Linux as well, until we're sure this is stable.

Reviewed-by: Alexander Motin <mav@FreeBSD.org>
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rich Ercolani <rincebrain@gmail.com>
Closes #15529
2023-11-16 11:35:22 -08:00
Umer Saleem 5796e3a742 Packaging: Auto-generate changelog during configure (#15528)
Auto-generate the changelog based on @VERSION@ during configure,
so that it does not need to be updated with new releases / version
updates.

Signed-off-by: Umer Saleem <usaleem@ixsystems.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
2023-11-16 08:58:47 -08:00
Richard Scheffenegger 49a6fbe387 [tcp] add PRR 6937bis heuristic and retire prr_conservative sysctl
Improve Proportional Rate Reduction (RFC6937) by using a
heuristic, which automatically chooses between
conservative CRB and more aggressive SSRB modes.
Only when snd_una advances (a partial ACK), SSRB may be
used. Also, that ACK must not have any indication of
ongoing loss - using the addition of new holes into the
scoreboard as proxy for such an event.

MFC after: 4 weeks
Reviewed By: #transport, kbowling, rrs
Sponsored By: NetApp, Inc.
Differential Revision: https://reviews.freebsd.org/D28822
2023-11-15 23:10:29 +01:00
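The selection rule stated in the commit above can be written down as a small predicate. This is a sketch of the heuristic as the message words it, not the FreeBSD TCP stack code: `enum prr_mode` and `prr_choose_mode` are hypothetical names, and the two boolean inputs are assumed to be computed elsewhere while processing the ACK.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the two reduction bounds named in the commit. */
enum prr_mode { PRR_MODE_CRB, PRR_MODE_SSRB };

/*
 * Pick the more aggressive SSRB only for an ACK that advances snd_una
 * and adds no new holes to the SACK scoreboard; otherwise stay with
 * the conservative CRB.
 */
static enum prr_mode
prr_choose_mode(bool snd_una_advanced, bool new_holes_added)
{
	if (snd_una_advanced && !new_holes_added)
		return (PRR_MODE_SSRB);
	return (PRR_MODE_CRB);
}
```

Because the choice is made per ACK, no administrator setting is involved, which matches the retirement of the prr_conservative sysctl.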
Martin Matuska cf0ad6fd64 zfs: update zfs_config.h and zfs_gitrev.h missed in 47bb16f8f0 2023-11-17 10:00:16 +01:00
Martin Matuska 47bb16f8f0 zfs: merge openzfs/zfs@03e9caaec
Notable upstream pull request merges:
 #15516 da51bd17e Fix snap_obj_array memory leak in check_filesystem()
 #15519 35da34516 L2ARC: Restrict write size to 1/4 of the device
 #15529 03e9caaec Add a tunable to disable BRT support

Obtained from:	OpenZFS
OpenZFS commit:	03e9caaec0
2023-11-17 09:39:42 +01:00
Gleb Smirnoff 70e30addaf tcp: remove extraneous network epoch entry
accept(2) on IPv6 TCP doesn't need epoch.  Some leaf functions may
need it, but they will enter accordingly, see sa6_recoverscope().

Reviewed by:		rscheff, tuexen (implicitly, see deleted XXXMT)
Differential Revision:	https://reviews.freebsd.org/D42634
2023-11-16 18:30:35 -08:00
Konstantin Belousov 22bac49b09 vn_lock_pair(): reasonably handle vp1 == vp2 case
Lock the vnode in the most exclusive lock mode requested, once.
All callers already ensure that vp1 != vp2 or are careful enough to only
unlock once otherwise.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D42642
2023-11-17 03:51:41 +02:00
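The vp1 == vp2 rule from the commit above can be modelled with a toy lock counter. Everything here is hypothetical (`struct toy_vnode`, `toy_lock`, `toy_lock_pair`, the mode enum); the real vn_lock_pair() also has to order and retry lock acquisitions to avoid deadlock, which this sketch leaves out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock modes; a larger value means a more exclusive mode. */
enum toy_lk_mode { TOY_LK_SHARED = 1, TOY_LK_EXCLUSIVE = 2 };

/* Hypothetical vnode stand-in that only records how it was locked. */
struct toy_vnode {
	int lock_count;
	enum toy_lk_mode held;
};

static void
toy_lock(struct toy_vnode *vp, enum toy_lk_mode mode)
{
	vp->lock_count++;
	vp->held = mode;
}

/* Lock both vnodes; if they are the same object, lock it exactly once. */
static void
toy_lock_pair(struct toy_vnode *vp1, enum toy_lk_mode m1,
    struct toy_vnode *vp2, enum toy_lk_mode m2)
{
	assert(vp1 != NULL && vp2 != NULL);
	if (vp1 == vp2) {
		/* Use the most exclusive of the two requested modes. */
		toy_lock(vp1, m1 > m2 ? m1 : m2);
		return;
	}
	toy_lock(vp1, m1);
	toy_lock(vp2, m2);
}
```

Callers that pass the same vnode twice must then unlock only once, as the commit message notes.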
Konstantin Belousov e256f71389 kernel: add missed FEATUREs compat_freebsd 8-14
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2023-11-17 00:04:55 +02:00
Konstantin Belousov 0aa93010c5 arm64: do not register elf32 brand if hardware cannot exec aarch32
Reviewed by:	imp, jrtc27
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D42639
2023-11-17 00:04:40 +02:00
Konstantin Belousov 4c6cded216 fuse_vnop_copy_file_range(): add safety
v_mount for unlocked vnode could be NULL, check for it.  Explain why it
is safe to access fs-specific data for mp if it is read as non-NULL.

Reviewed by:	asomers, jah
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D42625
2023-11-16 22:10:31 +02:00
Konstantin Belousov 318c56714a fuse_vnop_copy_file_range(): use vn_lock_pair()
Reviewed by:	asomers, jah
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D42625
2023-11-16 22:10:30 +02:00
Jonathan T. Looney 884eeff20c genoffset.sh: fix build break on MacOS
Switch from using the shell's builtin echo command to using the
builtin printf command to print the asserts.

Reported by:	jrtc27
Suggested by:	imp
Fixes:	accfb4cc93
Sponsored by:	Netflix
2023-11-16 17:54:28 +00:00
Gleb Smirnoff 070d9e3540 socket tests: add socket_accept
Start with most basic functionality on a TCP socket.
2023-11-16 08:23:48 -08:00
Jonathan T. Looney accfb4cc93 genoffset.sh: stop using a temporary file
Instead, use a here document for the input. This allows us to run the
while loop in the main script so we can build the list of asserts in
a shell variable. We then print out the list of asserts at the end of
the loop.

Reviewed by:	imp
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D42407
2023-11-16 15:02:32 +00:00
Martin Matuska a592812327 mlx5_core: fix deadlock when using RXTLS
If removing a node of type FS_TYPE_FLOW_DEST we lock the flow group too
late. This can lead to a deadlock with fs_add_dst_fg().

PR:		274715
MFC after:	1 week
Reviewed by:	kib
Tested by:	mm
Differential Revision: https://reviews.freebsd.org/D42368
2023-11-16 12:17:41 +01:00
Thomas Eberhardt a6ed8c9593 Fix /root permissions after 'make installworld'
According to /etc/mtree/BSD.root.dist /root should have
0750 permissions, but the build target 'make installworld'
changes these to 0755.

This is caused by the installation of the configuration
files of sh(1) and csh(1).

Correct this by specifying the correct default /root permissions.

PR:	273342
Reviewed by:	jilles
Approved by:	jilles
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D42395
2023-11-16 10:59:38 +01:00
Gordon Bergling 54611b7cc6 Document library types in the intro(3) manual page
Add a paragraph about library types to the intro(3)
manual page. Document library types, locations
and versioning.

Reviewed by:	emaste, jilles, mhorne, pauamma_gundo.com
Obtained from:	OpenBSD (partial)
MFC after:	3 weeks
Differential Revision:	https://reviews.freebsd.org/D36594
2023-11-16 10:48:09 +01:00
Yan-Hao Wang 55141f2c89 Add tests for gunion(8)
Reviewed by:	mckusick (earlier version)
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41645
2023-11-16 16:15:33 +08:00
Alexander Motin 3aebcb9ecb iostat: Restore lost spaces after tout
MFC after:	2 weeks
2023-11-15 23:45:22 -05:00
Alexander Motin d282baddb0 Add interface NVME to devstat
This allows listing only NVMe devices in systat, iostat, vmstat, etc.
Previously those were counted as OTHER.
2023-11-15 23:03:40 -05:00
Alexander Motin 7b21c447fb vmstat: Make disks reporting some more reasonable
MFC after:	1 month
2023-11-15 22:56:51 -05:00
John Baldwin fd9ae9ac04 pkg: Allocate a suitably-sized string for the local ABI
Previously the local ABI string was written to an on-stack buffer and
the pointer to that buffer was saved in a global before the function
returned.  This had two issues: c[ABI].val pointed to a
no-longer-valid on-stack buffer after config_init returned, and the
string could potentially be truncated.  Fix both of those by changing
pkg_get_myabi to return a pointer to a string allocated by asprintf.

Note that the allocated string is left in the global config array
until it is implicitly freed on process exit.

Reported by:	GCC 13 -Wdangling-pointer
Reviewed by:	emaste
Differential Revision:	https://reviews.freebsd.org/D42623
2023-11-15 16:53:53 -08:00
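The two problems named in the commit above (a pointer to a dead stack buffer, and possible truncation) both go away once the string lives in heap memory sized to fit. The sketch below shows that idea with a portable two-pass snprintf() in place of asprintf(); `make_abi_string` and its "os:version:arch" layout are hypothetical and do not reproduce pkg_get_myabi().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: format "os:version:arch" into a heap buffer that
 * is exactly large enough, so the result is neither truncated nor tied
 * to the caller's stack frame. Returns NULL on failure; the caller owns
 * the string and may free() it.
 */
static char *
make_abi_string(const char *os, int version, const char *arch)
{
	int len;
	char *s;

	assert(os != NULL && arch != NULL);
	/* First pass: ask snprintf() how many bytes the text needs. */
	len = snprintf(NULL, 0, "%s:%d:%s", os, version, arch);
	if (len < 0)
		return (NULL);
	s = malloc((size_t)len + 1);
	if (s == NULL)
		return (NULL);
	/* Second pass: format into the exactly sized buffer. */
	snprintf(s, (size_t)len + 1, "%s:%d:%s", os, version, arch);
	return (s);
}
```

Storing the returned pointer in a long-lived table is safe because the memory outlives the function call; leaving it allocated until process exit, as the commit describes, is a deliberate choice rather than a leak that needs fixing.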
Andrew Gallatin 5972ffde91 ig4(4): Add an EMAG device type
Sponsored by: Ampere Computing LLC, Netflix
Submitted by: allanjude
Differential Revision: https://reviews.freebsd.org/D28746
Reviewed by: imp
2023-11-15 19:53:21 -05:00
Brooks Davis 500bf0592c libc: remove unused stub vdso timecounter implementations
All supported architectures have shared page support so remove this
unused stub.

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D42619
2023-11-15 23:43:56 +00:00
Brooks Davis c704518681 libc: centralize a few numeric symbols
fabs, __infinity, and __nan are universally implemented so declare them
in gen/Symbol.map.

We would also include __flt_rounds, but it's under FBSD_1.3 on arm so
until that's gone we're stuck with it.  Likewise, everyone but i386
implements fp[gs]etmask.

Reviewed by:	imp, kib, emaste
Differential Revision:	https://reviews.freebsd.org/D42618
2023-11-15 23:42:37 +00:00
Brooks Davis 5d79b5445e libc: centralize makecontext symbols
Declare makecontext() and __makecontext() symbols centrally as they are
always implemented.

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D42617
2023-11-15 23:42:18 +00:00
Brooks Davis 1c656143be libc: centralize {_,sig,}{set,long}jmp symbols
These symbols are universally exposed and documented so declare them
centrally.  Double- and triple-underscore versions exist on some
platforms, but leave those alone for now.

Reviewed by:	imp, kib
Differential Revision:	https://reviews.freebsd.org/D42616
2023-11-15 23:41:35 +00:00