Commit graph

286915 commits

Author SHA1 Message Date
Jose Luis Duran 20012a3a1a ping tests: Test IHL/quoted data/inner packet paths
Commit 46d7b45a26 introduced these code
paths.  Test and document them.

- Add inner packet too short test
- Add inner IHL too short test
- Add quoted data too short test
- Add IHL too short test
- Add max inner packet IHL without payload test

Reviewed by:	markj
MFC after:	1 week
Pull Request:	https://github.com/freebsd/freebsd-src/pull/863
Differential Revision:	https://reviews.freebsd.org/D38528
2023-10-11 13:48:27 -04:00
Jose Luis Duran 4d348e83b7 ping: Avoid reporting NaNs
Avoid calculating the square root of negative zero, which can easily
happen on certain architectures when calculating the population standard
deviation with a sample size of one, e.g., 0.01 - (0.1 * 0.1) =
-0.000000.

Avoid returning a NaN by capping the minimum possible variance value to
zero (positive).

In the future, maybe skip reporting statistics at all for a single
sample.

Reported by:	Jenkins
Reviewed by:	asomers
MFC after:	1 week
Pull Request:	https://github.com/freebsd/freebsd-src/pull/863
Differential Revision:	https://reviews.freebsd.org/D42114
2023-10-11 13:48:27 -04:00
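To illustrate the clamp described in 4d348e83b7, here is a minimal Python sketch. It is not the ping(8) source, and the function name `clamped_stddev` is hypothetical. It assumes the standard deviation is derived from the mean of squares minus the squared mean, as in the commit's example, and caps the variance at positive zero before taking the square root.

```python
import math


def clamped_stddev(mean_of_squares, mean):
    """Population standard deviation from running means, never NaN."""
    # Floating-point rounding can push this difference slightly below zero.
    variance = mean_of_squares - mean * mean
    # Cap at positive zero so sqrt() never sees a negative (or -0.0) input.
    if not variance > 0.0:
        variance = 0.0
    return math.sqrt(variance)


# The example from the commit message: a single 0.1 sample.
raw = 0.01 - (0.1 * 0.1)
print(raw < 0)                      # the unclamped variance is negative
print(clamped_stddev(0.01, 0.1))    # the clamped result is a plain zero
```

The comparison is written as `not variance > 0.0` so that negative zero, small negative values, and NaN inputs all collapse to positive zero.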
Alfonso S. Siciliano 04b465777a
bsdinstall auto: replace dialog with bsddialog
bsdinstall/scripts/auto: Replace dialog(1) with bsddialog(1).
2023-10-11 18:33:25 +02:00
Poul-Henning Kamp f17b69fd0f Move (LENOVO, TBT3LAN) from if_ure to if_cdce where it works much better 2023-10-11 15:54:55 +00:00
Alfonso S. Siciliano 160ccec84c
bsdinstall: restore --calendar
Restore --calendar to select a date because bsddialog(1) >= 0.4
provides a calendar dialog.
2023-10-11 15:48:53 +02:00
Mateusz Guzik 281a9715b5 vfs: add max_vnlru_free to the vfs.vnode.vnlru tree
While here rename the var internally.
2023-10-11 13:07:13 +00:00
Baptiste Daroussin 742f7ec59e dialog: correctly mark the libraries
Mark the libraries as such in order for make delete-old to not
remove them when the DIALOG option is off
2023-10-11 13:36:16 +02:00
Mateusz Guzik 054f45e026 vfs: further speed up continuous free vnode recycle
The primary bottleneck *was* vnode_list mtx, which got artificially
worsened due to the following work done with the lock held:
1. the global heavily modified numvnodes counter was being read,
   inducing massive cache line ping pong
2. should the value fit limits (which it normally did) there would be an
   avoidable write to vn_alloc_cyclecount, which is being read outside
   of the lock, once more inducing traffic

But if vn_alloc_cyclecount is 0, which it normally is even when facing
vnode shortage, there is no need to check numvnodes nor set it to 0 again.

Another problem was numvnodes adjustment (which made the locked read
much worse). While it fundamentally does not scale as it is not
distributed in any fashion, it was avoidably slow. When bumping over the
vnode limit, it would be modified with atomics 3 times: inc + dec to
backpedal in vn_alloc, then final inc in vn_alloc_hard.

One can let some slop persist over calls to vnlru_free instead.

In principle each thread in the system could get here and bump it, so a
limit is put in place to keep things sane.

Bench setup same as in prior commits: zfs, 20 separate directory trees
each with 1 million files in total and 20 find(1) processes stating them
in parallel (one per each tree).

Total run time (in seconds) goes down as follows:
vnode limit	8388608	400000
before		~20	~35
after		~8	~15

With this in place the primary bottleneck is now ZFS.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-10-11 10:37:52 +00:00
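The run-time table in 054f45e026 can be turned into speedup factors with a few lines of arithmetic. This is only a worked calculation on the approximate figures quoted in the commit message; the dictionary layout and the `speedup` helper are illustrative names, not part of the commit.

```python
# Approximate run times in seconds, copied from the table in the commit message.
RUN_TIMES = {
    8388608: {"before": 20, "after": 8},
    400000: {"before": 35, "after": 15},
}


def speedup(before, after):
    """Ratio of the old run time to the new run time."""
    return before / after


for limit, times in RUN_TIMES.items():
    factor = speedup(times["before"], times["after"])
    print(f"vnode limit {limit}: {factor:.2f}x faster")
```

Since the inputs are marked as approximate in the source, the resulting factors should be read as rough estimates.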
Olivier Cochard 75ae7e436e syslogd: Prevent running tests in parallel
They all use the same listening port.

Approved by:	markj
Sponsored by:	Netflix
Differential Revision:	https://reviews.freebsd.org/D41989
2023-10-11 12:24:42 +02:00
Alfonso S. Siciliano 8df9efe877
bsdinstall: Fix command line argument list parsing
bsddialog(1) uses getopt_long(3) to parse command line argument list.
Add '--' to avoid errors caused by arguments (menu items) that begin
with '-'.
The change is compatible with dialog(1) and Xdialog(1).
2023-10-11 10:17:04 +02:00
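The effect of the '--' separator from 8df9efe877 can be reproduced with Python's `getopt` module, used here as a stand-in for getopt_long(3). This is a sketch under that assumption; the option list and the `parse` helper are hypothetical and far smaller than what a real dialog utility accepts.

```python
import getopt

# Hypothetical option set; a real dialog utility accepts many more options.
LONG_OPTIONS = ["title="]


def parse(argv):
    """Split argv into (options, operands) the way getopt-style parsers do."""
    return getopt.getopt(argv, "", LONG_OPTIONS)


# Without "--", the menu item "-a" is mistaken for an option and rejected.
try:
    parse(["--title", "Pick one", "-a", "first item"])
except getopt.GetoptError as err:
    print("error:", err)

# With "--", everything after it is treated as an operand.
options, operands = parse(["--title", "Pick one", "--", "-a", "first item"])
print(options, operands)
```

Placing '--' before the menu items costs nothing when no item starts with '-', which is consistent with the commit's note that the change is compatible with other dialog implementations.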
Alfonso S. Siciliano 7cff9672de
spkrtest.8: Add module info
Add the module and driver info as usual.

Approved by:		bcr, wosch
Differential Revision:	https://reviews.freebsd.org/D37710
2023-10-11 09:08:13 +02:00
Mateusz Guzik a4f753e812 vfs: don't recycle transiently excess vnodes
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-10-11 06:39:48 +00:00
Ed Maste a572dfa1bf ktrace.2: correct kern.ktrace.genio_size sysctl name
The man page had `kern.ktrace.geniosize` but the sysctl node contains an
underscore.

PR:		274274
Reported by:	Ivan Rozhuk
Sponsored by:	The FreeBSD Foundation
2023-10-10 21:23:02 -04:00
Dag-Erling Smørgrav 7fd2c91a29 ping: Simplify protocol selection.
* Interrupt the option loop as soon as we have an indication of which
  protocol is intended.
* If we end up having to perform a DNS lookup, loop over the entire
  result looking for either IPv4 or IPv6 addresses.

Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Reviewed by:	rscheff, kevans, allanjude
Differential Revision:	https://reviews.freebsd.org/D42137
2023-10-11 00:47:59 +02:00
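The second bullet of 7fd2c91a29 (walk the whole lookup result and take the first usable IPv4 or IPv6 address) can be sketched as follows. This is an illustration in Python rather than the ping(8) code; `pick_address` and its parameters are hypothetical names, and the input is assumed to have the 5-tuple shape that `socket.getaddrinfo` returns.

```python
import socket


def pick_address(results, want_v4=True, want_v6=True):
    """Return the first (family, sockaddr) whose family is acceptable."""
    for family, _socktype, _proto, _canonname, sockaddr in results:
        if family == socket.AF_INET and want_v4:
            return family, sockaddr
        if family == socket.AF_INET6 and want_v6:
            return family, sockaddr
    # Nothing usable in the entire result.
    return None


# Synthetic lookup result, shaped like socket.getaddrinfo() output.
lookup = [
    (socket.AF_UNSPEC, 0, 0, "", None),
    (socket.AF_INET6, 0, 0, "", ("::1", 0, 0, 0)),
    (socket.AF_INET, 0, 0, "", ("127.0.0.1", 0)),
]
print(pick_address(lookup))
print(pick_address(lookup, want_v6=False))
```

In real use the list would come from `socket.getaddrinfo(host, None)`, and the two flags would reflect whatever the option loop had already decided about the protocol.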
Martin Matuska 26103ccba8 zfs: enable block cloning by default
Discussed with:	markj
Reviewed by:	mav
Tested by:	mm (FreeBSD test suite + OpenZFS test suite)
MFC after:	2 months
Differential Revision:	https://reviews.freebsd.org/D41991
2023-10-11 00:43:35 +02:00
Warner Losh afc3d49b17 nvme: Close a race in destroying qpair and timeouts
We should have cleared all the pending I/O prior to calling
nvme_qpair_destroy, which should ensure that if the callout_drain causes
a call to nvme_qpair_timeout(), it won't schedule any new
timeout. However, it doesn't hurt to set timeout_pending to false in
nvme_qpair_destroy() and have nvme_qpair_timeout() exit early if it sees
it w/o scheduling a timeout. Since we don't otherwise stop the timeout
until we're about to destroy the qpair, this ensures we fail safe. The
lock/unlock also ensures the callout_drain will either remove the callout,
or wait for it to run with the early bailout.

We can likely further improve this by using callout_stop() inside the
pending lock. I'll investigate that for future refinement.

Sponsored by:		Netflix
Suggestions by:		jhb
Reviewed by:		gallatin
Differential Revision:	https://reviews.freebsd.org/D42065
2023-10-10 16:13:57 -06:00
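The fail-safe pattern described in afc3d49b17 (clear a flag under the lock on destroy, and have the timeout handler bail out early when the flag is clear) can be modelled in a few lines. This is a deliberately simplified Python model, not the nvme(4) driver: the `QPair` class, its fields, and the return values are assumptions made for illustration, and the callout machinery is reduced to a counter.

```python
import threading


class QPair:
    """Toy model of a queue pair with a periodic timeout handler."""

    def __init__(self):
        self.lock = threading.Lock()
        self.timeout_pending = True   # a timeout is currently scheduled
        self.rescheduled = 0          # how often the handler re-armed itself

    def timeout(self):
        """Timer callback: re-arm only while the queue pair is still live."""
        with self.lock:
            if not self.timeout_pending:
                return False          # early bailout, nothing is scheduled
            self.rescheduled += 1     # stands in for scheduling a new timeout
            return True

    def destroy(self):
        """Tear-down: forbid further re-arming before draining the timer."""
        with self.lock:
            self.timeout_pending = False
        # A real driver would drain the callout here.


qpair = QPair()
print(qpair.timeout())   # handler runs normally while the qpair is live
qpair.destroy()
print(qpair.timeout())   # a late handler invocation exits without re-arming
```

Taking the lock in both paths is what gives the ordering guarantee the commit relies on: a handler either runs before the flag is cleared or observes it cleared.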
Warner Losh 9cd7b62473 nvme: Eliminate RECOVERY_FAILED state
While it seemed like a good idea to have this state, we can do
everything we wanted with the state by checking ctrlr->is_failed since
that's set before we start failing the qpairs. Add some comments about
racing when we're failing the controller, though in practice I'm not
sure that kind of race could even be lost.

Sponsored by:		Netflix
Reviewed by:		chuck, gallatin, jhb
Differential Revision:	https://reviews.freebsd.org/D42051
2023-10-10 16:13:57 -06:00
Warner Losh 6b2a6e9cb0 nvme: Remove stale comment
After da8324a925, the pre/post hooks are gone. So remove a comment
about why we don't call them in this case.

Sponsored by:		Netflix
Reviewed by:		chuck, jhb
Differential Revision:	https://reviews.freebsd.org/D42050
2023-10-10 16:13:56 -06:00
Warner Losh 4026128983 nvme: Really remove NVME_2X_RESET
da8324a925 removed one of the two instances of NVME_2X_RESET. It
failed to snag the other one, and remove it from the options file.
Remove from both of those here.

Sponsored by:		Netflix
Reviewed by:		chuck, gallatin, jhb
Differential Revision:	https://reviews.freebsd.org/D42049
2023-10-10 16:13:56 -06:00
Warner Losh bc85cd303c nvme: gc nvme_ctrlr_post_failed_request and related task stuff
In 4b977e6dda we removed the call to nvme_ctrlr_post_failed_request
because we can now directly fail requests in this context since we're in
the reset task already. No need to queue it. I left it in place against
future need, but it's been two years and no panics have resulted. Since
the static analysis (code checking) and the dynamic analysis (surviving
in the field for 2 years, including at $WORK where we know we've gone
through this path when we've failed drives) both signal that it's not
really needed, go ahead and GC it. If we discover at a later date a flaw
in this analysis, we can add it back easily enough by reverting this and
4b977e6dda.

Sponsored by:		Netflix
Reviewed by:		chuck, gallatin, jhb
Differential Revision:	https://reviews.freebsd.org/D42048
2023-10-10 16:13:56 -06:00
Mateusz Guzik 90a008e94b vfs: prefix regular vnlru with a special case for free vnodes
Works around severe performance problems in certain corner cases, see
the commentary added.

Modifying vnlru logic has proven rather error prone in the past and a
release is near, thus take the easy way out and fix it without having to
dig into the current machinery.
2023-10-10 19:35:12 +00:00
John Baldwin f53355131f Trim various $FreeBSD$
Approved by:	markj (cddl/contrib changes)
Reviewed by:	imp, emaste
Differential Revision:	https://reviews.freebsd.org/D41961
2023-10-10 10:34:43 -07:00
Mateusz Guzik 23ef25d25d vfs: consult freevnodes in vnlru_kick_cond
If the count is high enough there is no point trying to produce more.
Not going there reduces traffic on the vnode_list mtx.

This further shaves total real time in a test mentioned in:
74be676d87 ("vfs: drop one vnode list lock trip during vnlru free
recycle") -- 20 instances of find each creating 1 million vnodes, while
total limit is set to 400k.

Time goes down from ~41 to ~35 seconds.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-10-10 16:19:53 +00:00
Mateusz Guzik 1bf55a739e vfs: be less eager to call uma_reclaim(UMA_RECLAIM_DRAIN)
In the face of a vnode shortage the count can very easily go a few units
above the limit before going back down.

Calling uma_reclaim results in a massive amount of work which in this case
is not warranted.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
2023-10-10 16:15:53 +00:00
Ed Maste e49c7cd677 dtrace: remove x86 non-EARLY_AP_STARTUP support
After 792655abd6 EARLY_AP_STARTUP is mandatory for x86.

Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42139
2023-10-10 11:03:27 -04:00
Baptiste Daroussin 52fe961c6c src.conf(5): regen after 38981026e7
Reported by:	manu
2023-10-10 16:17:23 +02:00
Mateusz Guzik 8733bc277a vfs: don't provoke recycling non-free vnodes without a good reason
If the total number of free vnodes is at or above target, there is no
point creating more of them.

Tested by:	pho (in a bigger patch)
2023-10-10 12:49:04 +00:00
Baptiste Daroussin b627b3e6ea RELNOTES: fix typo
Reported by:	garga
2023-10-10 13:59:32 +02:00
Andrew Turner d09a64e15d arm64: Enable kernel branch protection
Add the build flags to enable branch protection on arm64. This enables
the use of PAC and BTI in the kernel.

For PAC we already install the kernel keys when entering the kernel
from userspace so this will start using these to sign the stack.

For BTI we need to mark the kernel page tables with a new guarded page
field. As this will require all code that could be reached through a
function pointer to start with an appropriate branch target instruction,
we are enabling this before setting the field.

As the pointer authentication support shouldn't be reached via a
function pointer it is safe to not enable the use of BTI there.

Reviewed by:	markj
Sponsored by:	Arm Ltd
Differential Revision:	https://reviews.freebsd.org/D42079
2023-10-10 10:52:16 +01:00
Andrew Turner e340882d3e arm64: Add BTI landing pads to assembly functions
When we enable BTI both the first instruction in a function that could
be called indirectly, and a branch within a function need a valid
landing pad instruction.

There are three options for these instructions:
 1. A breakpoint instruction
 2. A pointer authentication PACIASP/PACIBSP
 3. A BTI instruction

Option 1 will raise a breakpoint exception so isn't usable in either
case. Option 2 could be used in some function entry cases, but needs
to be paired with an authentication instruction, and as it is normally
only used in non-leaf functions we can't use it in this case. This
leaves option 3.

There are four variants of the instruction, the C variant is used on
function entry and the J variant is for jumping within a function.
There is also a JC that works with both and one with no target that
works with neither.

Reviewed by:	markj
Sponsored by:	Arm Ltd
Sponsored by:	The FreeBSD Foundation (earlier version)
Differential Revision:	https://reviews.freebsd.org/D42078
2023-10-10 10:52:16 +01:00
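The four BTI variants listed in e340882d3e can be summarised as a small lookup table. The sketch below encodes only what the commit message states (C for function entry, J for jumps within a function, JC for both, no target for neither); the names `BTI_ACCEPTS` and `is_valid_landing_pad` are hypothetical, and further architectural details are not modelled.

```python
# Which kind of indirect branch each BTI variant accepts, per the commit text.
BTI_ACCEPTS = {
    "bti c": {"call"},            # function entry
    "bti j": {"jump"},            # jump within a function
    "bti jc": {"call", "jump"},   # works with both
    "bti": set(),                 # no target: works with neither
}


def is_valid_landing_pad(instruction, branch_kind):
    """True if the instruction is an acceptable target for this branch kind."""
    return branch_kind in BTI_ACCEPTS.get(instruction, set())


for insn in BTI_ACCEPTS:
    kinds = [k for k in ("call", "jump") if is_valid_landing_pad(insn, k)]
    print(f"{insn!r} accepts: {kinds}")
```

Anything that is not a recognised landing pad, such as an ordinary first instruction, is rejected for every branch kind in this model.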
Kristof Provost ebfd3b229a pf: move DIOCGETSTATES(V2) to COMPAT_FREEBSD14
We now have an improved version (via netlink). The old-style ioctl will
be removed in FreeBSD 16.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42101
2023-10-10 11:48:22 +02:00
Kristof Provost 84d12f887c Add a COMPAT_FREEBSD14 kernel option
Use it wherever COMPAT_FREEBSD13 is currently specified.

Reviewed by:	brooks, zlei
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42100
2023-10-10 11:48:22 +02:00
Kristof Provost a7191e5d7b pf: add a way to list creator ids
Allow userspace to retrieve a list of distinct creator ids for the
current states.

This is used by pfSense, and used to require dumping all states to
userspace. It's rather inefficient to export a (potentially extremely
large) state table to obtain a handful (typically 2) of 32-bit integers.

Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42092
2023-10-10 11:48:21 +02:00
Kristof Provost f218b851da libpfctl: introduce state iterator
Allow consumers to start processing states as the kernel supplies them,
rather than having to build a full list and only then start processing.
Especially for very large state tables this can significantly reduce
memory use.

Without this change when retrieving 1M states time -l reports:

    real 3.55
    user 1.95
    sys 1.05
        318832  maximum resident set size
           194  average shared memory size
            15  average unshared data size
           127  average unshared stack size
         79041  page reclaims
             0  page faults
             0  swaps
             0  block input operations
             0  block output operations
         15096  messages sent
        250001  messages received
             0  signals received
            22  voluntary context switches
            34  involuntary context switches

With it it reported:

    real 3.32
    user 1.88
    sys 0.86
          3220  maximum resident set size
           195  average shared memory size
            11  average unshared data size
           128  average unshared stack size
           260  page reclaims
             0  page faults
             0  swaps
             0  block input operations
             0  block output operations
         15096  messages sent
        250001  messages received
             0  signals received
            21  voluntary context switches
            31  involuntary context switches

Reviewed by:	mjg
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D42091
2023-10-10 11:48:21 +02:00
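The difference between building a full list and iterating, as described in f218b851da, can be sketched with a generator and a callback. This is a Python illustration, not the libpfctl API: `kernel_states`, `get_states`, and `iterate_states` are hypothetical names, and the state records are invented placeholders.

```python
def kernel_states(count):
    """Hypothetical stand-in for the kernel handing over states one by one."""
    for i in range(count):
        yield {"id": i, "creatorid": i % 2}


def get_states(count):
    """Old style: build the full list before the caller sees anything."""
    return list(kernel_states(count))


def iterate_states(count, callback):
    """Iterator style: hand each state to the callback as it arrives."""
    for state in kernel_states(count):
        if callback(state) != 0:
            break  # a nonzero return stops the iteration early


seen = []
iterate_states(5, lambda state: seen.append(state["id"]) or 0)
print(seen)
```

With the iterator style only one state is alive at a time, which matches the direction of the measurements in the commit message, where the maximum resident set size drops from 318832 to 3220.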
Alexander V. Chernikov 2cef62886d pf: convert state retrieval to netlink
Use netlink to export pf's state table.

The primary motivation is to improve how we deal with very large state
tables. With the previous implementation we had to build the entire
list (both in the kernel and in userspace) before we could start
processing. With netlink we start to get data in userspace while the
kernel is still generating more. This reduces peak memory consumption
(which can get to the GB range once we hit millions of states).

Netlink also makes future extension easier, in that we can easily add
fields to the state export without breaking userspace. In that regard
it's similar to an nvlist-based approach, except that it also deals
with transport to userspace and that it performs significantly better
than nvlists. Testing has failed to measure a performance difference
between the previous struct-copy based ioctl and the netlink approach.

Differential Revision:	https://reviews.freebsd.org/D38888
2023-10-10 11:48:21 +02:00
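The extensibility argument in 2cef62886d rests on netlink's type-length-value attributes: a reader can skip attribute types it does not know. The sketch below packs and parses attributes using the generic netlink attribute layout (16-bit length, 16-bit type, payload padded to 4 bytes). The attribute numbers and field names are hypothetical and do not correspond to pf's actual netlink schema.

```python
import struct

NLA_HDRLEN = 4  # 16-bit length + 16-bit type


def nla_pack(attr_type, payload):
    """Encode one attribute; the length covers header plus payload, not padding."""
    length = NLA_HDRLEN + len(payload)
    padding = (4 - length % 4) % 4
    return struct.pack("=HH", length, attr_type) + payload + b"\0" * padding


def nla_parse(buf, known):
    """Decode attributes, silently skipping any type not listed in `known`."""
    fields = {}
    offset = 0
    while offset + NLA_HDRLEN <= len(buf):
        length, attr_type = struct.unpack_from("=HH", buf, offset)
        if length < NLA_HDRLEN or offset + length > len(buf):
            break  # malformed attribute, stop parsing
        if attr_type in known:
            fields[known[attr_type]] = buf[offset + NLA_HDRLEN:offset + length]
        offset += (length + 3) & ~3  # advance to the next 4-byte boundary
    return fields


# An "old" reader knows types 1 and 2; type 99 is a field added later.
message = nla_pack(1, b"abcd") + nla_pack(99, b"xyz") + nla_pack(2, b"\x01")
print(nla_parse(message, {1: "id", 2: "flag"}))
```

Because the unknown attribute is stepped over by its declared length, adding a field on the sending side does not break a reader built against the older set of attributes.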
Dmitry Chagin 5bdd74cc05 linux(4): Drop the outdated comments about sixth register on i386 int0x80
This is well documented in the Linux syscall(2).

MFC after:		1 week
2023-10-10 12:33:22 +03:00
Dmitry Chagin 03f5bd1e46 linux(4): Drop the outdated comment, nosys is fine since 39024a89
MFC after:		1 week
2023-10-10 12:20:51 +03:00
Baptiste Daroussin bb63e82e8c bsddialog(1): document the replacement of dialog(1) 2023-10-10 09:24:25 +02:00
Baptiste Daroussin 38981026e7 dialog(1): switch off dialog(1) by default
Every direct consumer in base has switched to using bsddialog(1) by
default
2023-10-10 09:19:48 +02:00
Baptiste Daroussin ff01d71e48 bsdconfig: do not remove files if MK_DIALOG=no
bsdconfig no longer depends on anything related to dialog(1)
and libdialog(1) and has totally switched to bsddialog(1)
2023-10-10 09:17:29 +02:00
Baptiste Daroussin 6d3c0798cc bsdconfig: rework packages selection TUI
Rework the packages TUI, so that the index caching is now done with
dialog --gauge (tested with cdialog and bsddialog).
With pkg we can know in advance the number of packages making it
possible to have a real gauge.

The cache of the index is now a file that can be sourced, meaning it
is not anymore an index like file, but a post process one, simplifying
the code.

Each menu is now built calling directly pkg rquery with just the
information required to build the menu instead of parsing an indexfile

install all the awk index processing into a separate file to ease
reading and debugging
2023-10-10 09:01:51 +02:00
Ed Maste 826d144679 newvers: remove references to svnliteversion
svnliteversion was provided by the base system copy of subversion,
which was disabled in a2bc17474b ("Disable building svnlite(1) by
default.")

Reviewed by:	zlei
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42034
2023-10-09 23:46:36 -04:00
Ed Maste 75be7e3027 sysctl: emit a newline after NULL node descriptions
Previously when printing the sysctl description (via the -d flag) we
omitted the newline if the node provided no description (i.e., NULL).
This could be observed via e.g. `sysctl -d dev`.

PR:		44034
Reviewed by:	zlei
Sponsored by:	The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D42112
2023-10-09 22:48:53 -04:00
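The behaviour fixed in 75be7e3027 comes down to always terminating the line, whether or not a description exists. The sketch below is a Python illustration rather than the sysctl(8) source; `format_description` is a hypothetical helper and the "name: description" layout is an assumption about the output format.

```python
def format_description(name, description):
    """Format one `sysctl -d` style line; a missing description is empty."""
    text = description if description is not None else ""
    # The newline is emitted unconditionally, including for a None description.
    return f"{name}: {text}\n"


lines = [
    format_description("kern.ostype", "Operating system type"),
    format_description("dev", None),
    format_description("kern.hostname", "Hostname"),
]
print("".join(lines), end="")
```

Without the unconditional newline, the entry following a node with no description would be printed on the same line.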
Olivier Certner 892654fe9b setusercontext(): Apply personal settings only on matching effective UID
Commit 35305a8dc1 (r211393) added a check on whether 'uid' was equal
to getuid() before calling setlogincontext().  Doing so still allows
a setuid program to apply resource limits and priorities specified in
a user-controlled configuration file ('~/.login_conf') where
a non-setuid program could not.  Plug the hole by checking instead that
the process' effective UID is the target one (which is likely what was
meant in the initial commit).

PR:                     271750
Reviewed by:            kib, des
MFC after:              2 weeks
Sponsored by:           Kumacom SAS
Differential Revision:  https://reviews.freebsd.org/D40351
2023-10-09 21:47:10 -04:00
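The hole closed by 892654fe9b can be seen by comparing the two checks side by side. This is a minimal Python model with the user IDs passed in explicitly; the function names and the example UIDs are hypothetical, and the real check lives in setusercontext().

```python
def old_check(target_uid, real_uid, effective_uid):
    """Pre-fix behaviour: compare the target against the real UID."""
    return target_uid == real_uid


def new_check(target_uid, real_uid, effective_uid):
    """Post-fix behaviour: compare the target against the effective UID."""
    return target_uid == effective_uid


# A setuid-root program started by user 1001, setting up a context for 1001.
target, real, effective = 1001, 1001, 0
print(old_check(target, real, effective))   # personal settings would be applied
print(new_check(target, real, effective))   # personal settings are skipped

# An ordinary program run by user 1001 is treated the same by both checks.
print(old_check(1001, 1001, 1001), new_check(1001, 1001, 1001))
```

In the setuid case the old comparison lets a user-controlled '~/.login_conf' take effect while the process still holds elevated privileges, which is exactly what the effective-UID comparison prevents.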
Konstantin Belousov 6e92fc9309 vkbd: correct ref count on cloned cdevs
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D42008
2023-10-10 02:37:43 +03:00
Konstantin Belousov 27f1ec0be2 tun/tap: correct ref count on cloned cdevs
Reported and tested by:	eugen
PR:	273418
Discussed with:	jah, kevans
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D42008
2023-10-10 02:36:59 +03:00
Ed Maste 792655abd6 x86: make EARLY_AP_STARTUP mandatory
When early AP startup was introduced in 2016 it was put behind a kernel
option EARLY_AP_STARTUP as a transition aid, so that it could be turned
off if necessary.  For x86 the non-EARLY_AP_STARTUP case is no longer
functional, so disallow it.

Other archs are still incompatible with EARLY_AP_STARTUP, so the option
cannot yet be removed entirely.

Reported by:	wollman
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41351
2023-10-09 16:08:22 -04:00
Bjoern A. Zeeb ad134f8ab7 iwlwifi: re-enable "Invalid TXQ id" logging
Various reports recently hit the "Invalid TXQ id" in iwlwifi again.
Unconditionally enable logging and add a note to report to a specific
PR in the log message for now.
Along with 018d93ece1 this will hopefully help us to understand what
is going on.

Sponsored by:	The FreeBSD Foundation
PR:	274382
2023-10-09 19:50:02 +00:00
Bjoern A. Zeeb 018d93ece1 LinuxKPI: 802.11: add unconditional error reporting
Multiple reports have shown missed state transitions in net80211 without
an obvious major cause (or with a txq warning in iwlwifi).
In order to better track down potential problems add unconditional
ic_printf calls to any case in the lkpi state machine compat code which
would let us return with an error in the hope that it helps us to catch
the actual problems.
Also remove the debug conditions from ieee80211_{beacon,connection}_loss
which can also cause state transitions to have the ic_printf all the time
there too.

Sponsored by:	The FreeBSD Foundation
MFC after:	3 days
2023-10-09 19:21:51 +00:00
Jose Luis Duran 5d371834d2 Cirrus CI: Trigger on pull requests or downstream repos
Since Cirrus Labs is limiting their free usage tier [1], limit CI runs
on pull requests only.  Otherwise, we might deplete our monthly quota
within a few days.

Adapt the task amd64-llvm16 to execute on downstream repos or on pull
requests only.

Other alternatives will be further studied.

[1]: https://cirrus-ci.org/blog/2023/07/17/limiting-free-usage-of-cirrus-ci/
2023-10-09 15:13:21 -04:00