290633 commits

Author SHA1 Message Date
Gleb Smirnoff f7c4d12bcd icmp: correct the assertion that checks limit + jitter
Fixes:	4399e055ea
2024-04-08 16:54:19 -07:00
Dag-Erling Smørgrav 0729d1e8fd cp: Never follow symbolic links in destination.
Historically, BSD cp has followed symbolic links in the destination
when copying recursively, while GNU cp has not.  POSIX is somewhat
vague on the topic, but both interpretations are within bounds.  In
33ad990ce9, cp was changed to apply the same logic for symbolic
links in the destination as for symbolic links in the source: follow
if not recursing (which is moot, as this situation can only arise
while recursing) or if the `-L` option was given.  There is no support
for this in POSIX.  We can either switch back, or go all the way.

Having carefully weighed the kind of trouble you can run into by
following unexpected symlinks up against the kind of trouble you can
run into by not following symlinks you expected to follow, we choose
to go all the way.

Note that this means we need to stat the destination twice: once,
following links, to check if it is or references the same file as the
source, and a second time, not following links, to set the dne flag
and determine the destination's type.

While here, remove a needless complication in the dne logic.  We don't
need to explicitly reject overwriting a directory with a non-directory,
because it will fail anyway.

Finally, add test cases for copying a directory to a symlink and
overwriting a directory with a non-directory.

MFC after:	never
Relnotes: 	yes
Sponsored by:	Klara, Inc.
Reviewed by:	kevans
Differential Revision:	https://reviews.freebsd.org/D44578
2024-04-09 00:41:33 +02:00
Gleb Smirnoff d80a97def9 unix: new implementation of unix/stream & unix/seqpacket
Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX
SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension
of SOCK_STREAM.  The change meets three goals: get rid of unix(4) specific
stuff in the generic socket code, provide faster and more robust unix/stream
sockets and bring unix/seqpacket much closer to specification.  Highlights
follow:

- The send buffer now is truly bypassed.  Previously it was always empty,
but the send(2) still needed to acquire its lock and do a variety of
tricks to be woken up at the right time while sleeping on it.  Now the
only two things we care about in the send buffer are the I/O sx(9) lock
that serializes operations and the value of so_snd.sb_hiwat, which we can read
without obtaining a lock.  The sleep of a send(2) happens on the mutex of
the receive buffer of the peer.  A bulk send/recv of data with large
socket buffers will make both syscalls just bounce between owning the
receive buffer lock and copyin(9)/copyout(9), no other locks would be
involved.

- The implementation uses the new mchain structure to manipulate mbuf chains.
Note that this required converting to mchain two functions that are shared
with unix/dgram: unp_internalize() and unp_addsockcred(), as well as adding
a new shared one, uipc_process_kernel_mbuf().  This induces some
non-functional changes in the unix/dgram code as well.  There is room for
improvement here, as right now it is a mix of mchain and manually managed
mbuf chains.

- unix/seqpacket, previously marked as PR_ADDR & PR_ATOMIC and thus treated
as a datagram socket by the generic socket code, now becomes a true stream
socket with record markers.

- unix/stream loses the sendfile(2) support.  This can be brought back,
but requires some work.  Let's first see if there is any interest in this
feature beyond the purely academic.

Reviewed by:		markj, tuexen
Differential Revision:	https://reviews.freebsd.org/D44151
2024-04-08 13:16:51 -07:00
Gleb Smirnoff aba79b0f4a mbuf: provide mc_uiotomc(), a function to copy from uio(9) to mchain
Implement m_uiotombuf() as a wrapper around mc_uiotomc().  The M_EXTPG is
left untouched.  The m_uiotombuf() is left as a compat KPI.  New code
should use either mc_uiotomc() or m_uiotombuf_nomap().

Reviewed by:		markj, tuexen
Differential Revision:	https://reviews.freebsd.org/D44150
2024-04-08 13:16:51 -07:00
Gleb Smirnoff 71f8702f49 mbuf: provide mc_get() that allocates struct mchain of given length
Implement m_getm2(), which is widely used via m_getm() macro, as a wrapper
around mc_get().  New code is advised to use mc_get().

Reviewed by:		markj, tuexen
Differential Revision:	https://reviews.freebsd.org/D44149
2024-04-08 13:16:51 -07:00
Gleb Smirnoff fd01798fc4 mbuf: add mc_split() that works on two struct mchain
It preserves tail points and all length/memory accounting, so that caller
doesn't need to do any extra traversals.  It doesn't respect M_PKTHDR but
it may be improved if needed.  It respects M_EOR, though.  First consumer
will be the new unix(4) SOCK_STREAM and SOCK_SEQPACKET.

Also provide a much simpler mc_concat() that glues two chains back together.

Reviewed by:		markj
Differential Revision:	https://reviews.freebsd.org/D44148
2024-04-08 13:16:51 -07:00
Gleb Smirnoff ab8a51c455 mbuf: provide new type for mbuf manipulation - mbuf chain
It tracks both the first mbuf and last mbuf, making it handy to use inside
functions that are interested in both. It also tracks length of data and
memory usage. It can be allocated on stack and passed to an mbuf
allocation or another mbuf manipulation function. It can be embedded into
some kernel facility's internal structure to represent the simplest data
buffer. It uses modern queue(3) based linkage, but is also compatible with
old style m_next linkage. Transitioning older code to the new type can be
done gradually: code that doesn't understand the chain yet can be supplied
with STAILQ_FIRST(&mc.mc_q). So you can have a mix of old style and new
style code in one function as a temporary solution.

Reviewed by:		markj, tuexen
Differential Revision:	https://reviews.freebsd.org/D44147
2024-04-08 13:16:51 -07:00
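
The idea behind the chain type in the commit above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: `struct toy_buf`, `struct toy_chain` and all function names are hypothetical stand-ins, not the kernel's mbuf or mchain definitions. It shows a queue(3) STAILQ head that keeps the first and last element, running totals for data length and memory, and an "old style" walker that is handed only the first element.

```c
#include <sys/queue.h>
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a buffer; not the kernel's struct mbuf. */
struct toy_buf {
	STAILQ_ENTRY(toy_buf) tb_link;	/* doubles as the "next" pointer */
	size_t	tb_len;			/* bytes of data in the buffer */
	size_t	tb_size;		/* bytes of memory backing it */
};

/* Hypothetical stand-in for the chain; fits on the stack. */
struct toy_chain {
	STAILQ_HEAD(, toy_buf) tc_q;	/* tracks first and last element */
	size_t	tc_len;			/* total data length */
	size_t	tc_mlen;		/* total memory usage */
};

void
toy_chain_init(struct toy_chain *tc)
{
	STAILQ_INIT(&tc->tc_q);
	tc->tc_len = 0;
	tc->tc_mlen = 0;
}

/* O(1) append: the head knows the tail, so no traversal is needed. */
void
toy_chain_append(struct toy_chain *tc, struct toy_buf *tb)
{
	assert(tb->tb_len <= tb->tb_size);
	STAILQ_INSERT_TAIL(&tc->tc_q, tb, tb_link);
	tc->tc_len += tb->tb_len;
	tc->tc_mlen += tb->tb_size;
}

/* "Old style" code that only understands a first element and next links. */
size_t
toy_legacy_length(const struct toy_buf *first)
{
	const struct toy_buf *tb;
	size_t len = 0;

	for (tb = first; tb != NULL; tb = STAILQ_NEXT(tb, tb_link))
		len += tb->tb_len;
	return (len);
}
```

A caller would build the chain with `toy_chain_append()` and pass `STAILQ_FIRST(&tc.tc_q)` to `toy_legacy_length()`, mirroring the gradual transition the commit message describes.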
Gleb Smirnoff 3b7aa842e2 sendfile: mark it explicitly as a TCP only feature
Back in 2015 when it turned non-blocking, it was working with PF_UNIX
and it may still work.  However, the usefulness of such an application
of sendfile(2) is questionable.  Disable the feature while unix/stream
is under refactoring.

Relnotes:	yes
2024-04-08 13:16:51 -07:00
Gleb Smirnoff 0b49929762 tests/unix_seqpacket: remove workaround for a kernel bug that is no longer
2024-04-08 13:16:51 -07:00
Gleb Smirnoff f992782124 tests/unix_seqpacket: test send(2) to a closed or aborted peer socket
In both cases the kernel returns EPIPE and delivers SIGPIPE, unless
blocked or disabled.  The test isn't specific to SOCK_SEQPACKET, it is the
same for SOCK_STREAM.  Put the test into this file, since it has all
primitives to write this test tersely.

Reviewed by:		tuexen
Differential Revision:	https://reviews.freebsd.org/D44146
2024-04-08 13:16:50 -07:00
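
The behaviour exercised by the test in the commit above can be reproduced in a few lines. This is a minimal sketch, not the test from the tree: `send_to_closed_peer()` is a hypothetical helper, and SIGPIPE is ignored for the duration of the call so that the error can be observed as a return value instead of terminating the process.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/socket.h>
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Create a connected PF_UNIX socket pair of the given type, close one
 * end and send one byte to it.  Returns the errno reported by send(2),
 * 0 if the send unexpectedly succeeded, or -1 if the setup failed.
 */
int
send_to_closed_peer(int type)
{
	struct sigaction ign, old;
	int sv[2], error = 0;

	assert(type == SOCK_STREAM || type == SOCK_SEQPACKET);
	if (socketpair(AF_UNIX, type, 0, sv) != 0)
		return (-1);

	/* Ignore SIGPIPE so that send(2) fails instead of killing us. */
	ign.sa_handler = SIG_IGN;
	sigemptyset(&ign.sa_mask);
	ign.sa_flags = 0;
	if (sigaction(SIGPIPE, &ign, &old) != 0) {
		close(sv[0]);
		close(sv[1]);
		return (-1);
	}

	close(sv[1]);				/* the peer goes away */
	if (send(sv[0], "x", 1, 0) == -1)
		error = errno;

	(void)sigaction(SIGPIPE, &old, NULL);	/* restore disposition */
	close(sv[0]);
	return (error);
}
```

Calling `send_to_closed_peer(SOCK_STREAM)` is expected to yield EPIPE; per the commit message the same holds for SOCK_SEQPACKET on FreeBSD.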
Gleb Smirnoff eb338e2370 tests/unix_seqpacket: provide random data pumping test with MSG_EOR
Allocate a big chunk of randomly initialized memory.  Send it to the peer
in random sized chunks, throwing MSG_EOR at randomly initialized offsets.
Receive into random sized chunks setting MSG_WAITALL randomly.  Check that
the MSG_EORs are where they should be, and that MSG_WAITALL is abided by, but
overridden by MSG_EOR.  And finally memcmp() what we receive.

Reviewed by:		asomers, tuexen
Differential Revision:	https://reviews.freebsd.org/D43775
2024-04-08 13:16:50 -07:00
Dag-Erling Smørgrav 7f479dee48 sys/queue.h: Add {LIST,TAILQ}_REPLACE().
MFC after:	1 week
Obtained from:	NetBSD
Sponsored by:	Klara, Inc.
Reviewed by:	cperciva, imp
Differential Revision:	https://reviews.freebsd.org/D44679
2024-04-08 20:16:46 +02:00
Dag-Erling Smørgrav 69fd60f1ea sys/queue.h: Whitespace cleanup.
MFC after:	1 week
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D44678
2024-04-08 20:16:46 +02:00
Gleb Smirnoff e943eceb92 ng_bridge: document the limitation brought in f961caf218
2024-04-08 10:48:22 -07:00
David Marker 86a6393a7d ng_bridge: allow to automatically assign numbers to new hooks
This allows userland machinery that orchestrates a bridge (e.g. a
jail or VM manager) to avoid duplicating the number allocation logic.  See
bug 278130 for a longer description and examples.

Reviewed by:		glebius, afedorov
Differential Revision:	https://reviews.freebsd.org/D44615
PR:			278130
2024-04-08 10:48:22 -07:00
Zhenlei Huang 6fe4d8395b debugnet: Fix logging of frame length
MFC after:	1 week
2024-04-09 00:47:10 +08:00
Zhenlei Huang e7102929bf ethernet: Fix logging of frame length
Both the mbuf length and the total packet length are signed.

While here, update a stale comment to reflect the current practice.

Reviewed by:	kp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D42390
2024-04-09 00:44:33 +08:00
Fernando Apesteguía 7e68976408 echo(1): Add EXAMPLES
While here add CAVEAT section and promote the use of printf(1)

Reviewed by:		gbe@, imp@
Approved by:		manpages (gbe)
Differential Revision:	https://reviews.freebsd.org/D43493
2024-04-08 18:35:40 +02:00
Kristof Provost 60d8dbbef0 netinet: add a probe point for IP, IP6, ICMP, ICMP6, UDP and TCP stats counters
When debugging network issues one common clue is an unexpectedly
incrementing error counter. This is helpful, in that it gives us an
idea of what might be going wrong, but often these counters may be
incremented in different functions.

Add a static probe point for them so that we can use dtrace to get
further information (e.g. a stack trace).

For example:
	dtrace -n 'mib:ip:count: { printf("%d", arg0); stack(); }'

This can be disabled by setting the following kernel option:
	options 	KDTRACE_NO_MIB_SDT

Reviewed by:	gallatin, tuexen (previous version), gnn (previous version)
Sponsored by:	Rubicon Communications, LLC ("Netgate")
Differential Revision:	https://reviews.freebsd.org/D43504
2024-04-08 17:29:59 +02:00
Jake Freeland 34791f4ac7 capsicum.h: Include ktrace.h only in kernel
Fix cross build failure by including ktrace.h only when _KERNEL is
defined.

Fixes:		9bec841312
Approved by:	markj (mentor)
MFC after:	1 month
Sponsored by:	The FreeBSD Foundation
2024-04-08 09:32:58 -05:00
Rob Norris b6c7ff583f libvmmapi: add missing capability strings
Signed-off-by: Rob Norris <robn@despairlabs.com>

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D44642
2024-04-08 09:08:59 -04:00
Rob Norris b9fa1500cb bhyvectl: generate usage from options table
The usage text had fallen out of sync with the actually available
options. Rather than keep them in sync by hand, just generate usage from
the available options.

Signed-off-by: Rob Norris <robn@despairlabs.com>

Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D44641
2024-04-08 09:08:54 -04:00
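
Generating usage text from the options table, as the bhyvectl commit above describes, might look roughly like this. It is a sketch under stated assumptions rather than the bhyvectl code: the `demo_opts` table and `format_usage()` are hypothetical, and only the standard getopt_long(3) `struct option` layout is relied upon.

```c
#include <assert.h>
#include <getopt.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical options table for the sketch; not bhyvectl's real one. */
const struct option demo_opts[] = {
	{ "vm",		required_argument,	NULL, 0 },
	{ "create",	no_argument,		NULL, 0 },
	{ "destroy",	no_argument,		NULL, 0 },
	{ NULL,		0,			NULL, 0 }
};

/*
 * Format a usage message by walking the NULL-terminated options table,
 * so the text cannot drift away from the options actually accepted.
 * Returns 0 on success, -1 if the buffer is too small.
 */
int
format_usage(char *buf, size_t size, const char *prog,
    const struct option *opts)
{
	const struct option *o;
	size_t off = 0;
	int n;

	assert(buf != NULL && opts != NULL);
	n = snprintf(buf, size, "usage: %s\n", prog);
	for (o = opts;; o++) {
		/* Validate the previous snprintf() before advancing. */
		if (n < 0 || (size_t)n >= size - off)
			return (-1);
		off += (size_t)n;
		if (o->name == NULL)
			break;
		n = snprintf(buf + off, size - off, "       [--%s%s]\n",
		    o->name,
		    o->has_arg == required_argument ? "=<value>" : "");
	}
	return (0);
}
```

Adding a row to the table is then enough to have the new option show up in the usage output.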
Mark Johnston 4696650782 swap_pager: Unbusy readahead pages after an I/O error
The swap pager itself allocates readahead pages, so should take care to
unbusy them after a read error, just as it does in the non-error case.

PR:		277538
Reviewed by:	olce, dougm, alc, kib
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D44646
2024-04-08 09:02:48 -04:00
Vladimir Kondratyev 4e7aa03b70 LinuxKPI: Stub sysfs_remove_link in linux/sysfs.h
sysfs_create_link is stubbed already. Stub sysfs_remove_link too to be
feature-complete.

Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:43 +03:00
Vladimir Kondratyev 56041ee817 LinuxKPI: Add want_init_on_free to linux/mm.h
want_init_on_free returns whether heap memory zeroing on free is enabled.
FreeBSD does not zero heap memory on free().

Sponsored by:	Serenity Cyber Security
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:43 +03:00
Vladimir Kondratyev f7ea333e2b LinuxKPI: Add ACPI_ID_LEN const to linux/mod_devicetable.h
Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:43 +03:00
Vladimir Kondratyev f1206503e5 LinuxKPI: Add pci_dev_id to linux/pci.h
It returns the bus/device/function number for a given PCI device.
Also add the intermediate PCI_DEVID macro used in some drivers.

Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:43 +03:00
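
For readers unfamiliar with the bus/device/function number mentioned in the pci_dev_id commit above, the packing can be sketched as below. The helper names are hypothetical, and the layout (bus number in the upper byte, a 5-bit slot and 3-bit function in the lower byte) is an assumption modeled on the conventional PCI encoding, not copied from the LinuxKPI header.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack a 5-bit slot and 3-bit function number. */
uint8_t
toy_pci_devfn(unsigned int slot, unsigned int func)
{
	return ((uint8_t)(((slot & 0x1f) << 3) | (func & 0x07)));
}

/* Hypothetical helper: bus number in bits 15..8, devfn in bits 7..0. */
uint16_t
toy_pci_devid(uint8_t bus, uint8_t devfn)
{
	return ((uint16_t)(((uint16_t)bus << 8) | devfn));
}

/* Inverse helpers, handy when printing an ID as bus:slot.func. */
unsigned int
toy_pci_bus(uint16_t devid)
{
	return (devid >> 8);
}

unsigned int
toy_pci_slot(uint16_t devid)
{
	return ((devid >> 3) & 0x1f);
}

unsigned int
toy_pci_func(uint16_t devid)
{
	return (devid & 0x07);
}
```

For example, bus 1, slot 2, function 1 packs to 0x0111 under this assumed layout.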
Vladimir Kondratyev e8f59f4d31 LinuxKPI: Add the accelerator PCIe class
Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:43 +03:00
Vladimir Kondratyev e0eeeca8b8 LinuxKPI: Add PTR_IF macro
Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
Vladimir Kondratyev 06902a4479 LinuxKPI: Add vm_flags_(clear|set) functions
Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
Vladimir Kondratyev 61fb195e8d LinuxKPI: Improve timer_shutdown_sync
timer_shutdown_sync not only shuts down a timer but also prevents it from
being rearmed.

Sponsored by:	Serenity CyberSecurity, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
Vladimir Kondratyev 9289c1f6f1 LinuxKPI: Add get_random_u32_below function
get_random_u32_below returns a random integer in the interval [0, ceil),
with uniform distribution.

Sponsored by:	Serenity CyberSecurity, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
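
One common way to obtain a uniformly distributed integer in [0, ceil), as described in the get_random_u32_below commit above, is rejection sampling. The sketch below is an assumption about technique only: it is not the LinuxKPI implementation, and `toy_random_below()` and the xorshift generator used as a 32-bit source are hypothetical (and not cryptographically secure).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical 32-bit source: xorshift32; the state must be non-zero. */
uint32_t
toy_xorshift32(void *arg)
{
	uint32_t *state = arg;
	uint32_t x = *state;

	x ^= x << 13;
	x ^= x >> 17;
	x ^= x << 5;
	*state = x;
	return (x);
}

/*
 * Return a value in [0, ceil) without modulo bias.  Values below
 * 2^32 mod ceil are rejected, so the accepted range has a size that
 * is an exact multiple of ceil.
 */
uint32_t
toy_random_below(uint32_t ceil, uint32_t (*next32)(void *), void *arg)
{
	uint32_t threshold, r;

	assert(ceil > 0);
	/* (2^32 - ceil) mod ceil equals 2^32 mod ceil in 32-bit math. */
	threshold = (0u - ceil) % ceil;
	for (;;) {
		r = next32(arg);
		if (r >= threshold)
			return (r % ceil);
	}
}
```

A plain `r % ceil` would slightly favour small results whenever ceil does not divide 2^32; the rejection step removes that bias at the cost of an occasional retry.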
Vladimir Kondratyev 38c276a43f LinuxKPI: Add VM_ACCESS_FLAGS define to linux/mm.h
VM_ACCESS_FLAGS is the set of basic access permission flags.

Sponsored by:	Serenity CyberSecurity, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
Vladimir Kondratyev 3208d4ad2b LinuxKPI: Import vanilla linux/overflow.h
It is a dual-licensed (GPLv2 & MIT), self-contained header file.
No need to reimplement it.

Sponsored by:	Serenity CyberSecurity, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
Vladimir Kondratyev 8cfd1dd821 LinuxKPI: Move [SU](8|16|32|64)_(MAX|MIN) defines to linux/limits.h
Some source files get them from linux/limits.h directly rather than from
linux/kernel.h.
While here replace Linux constant values with sys/stdint.h provided ones.

Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
Vladimir Kondratyev 1970388766 LinuxKPI: Add strnchr function
strnchr() finds a character in a length limited string.

Sponsored by:	Serenity CyberSecurity, LLC
Reviewed by:	emaste
MFC after:	1 month
2024-04-08 09:47:42 +03:00
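
A length-limited character search such as the one named in the strnchr commit above can be sketched as follows. `toy_strnchr()` is a hypothetical name, and stopping at the terminating NUL before the limit is reached is an assumption about the semantics, not something the commit message states.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Look for character c in at most count bytes of s, stopping early at
 * the terminating NUL.  Returns a pointer to the first match or NULL.
 */
char *
toy_strnchr(const char *s, size_t count, int c)
{
	assert(s != NULL);
	while (count-- > 0) {
		if (*s == (char)c)
			return ((char *)s);
		if (*s++ == '\0')
			break;
	}
	return (NULL);
}
```

Unlike strchr(3), the search never reads past `count` bytes, which makes it usable on buffers that are not guaranteed to be NUL-terminated.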
Vladimir Kondratyev aafe4126f7 LinuxKPI: Add ms_to_ktime
Sponsored by:	Serenity CyberSecurity, LLC
Reviewed by:	emaste
MFC after:	1 week
2024-04-08 09:47:42 +03:00
Vladimir Kondratyev 8ace984e47 LinuxKPI: Set suspend type on suspend/resume cycle entry
Recent amdgpu depends on pm_suspend_target_state value to separate
S3 and S0ix support.

Sponsored by:	Serenity Cyber Security, LLC
Reviewed by:	manu (in bugzilla)
MFC after:	1 week
2024-04-08 09:47:41 +03:00
Jake Freeland 1ff4bc0f49 RELNOTES: Add entry for updates to ktrace(2)
Approved by:	markj (mentor)
2024-04-07 18:52:51 -05:00
Jake Freeland 2f39a98664 tests: Add ktrace capability violation test cases
Introduce regression tests for ktrace(2) that target capability
violations.

These test cases ensure that ktrace(2) records these violations:
- CAPFAIL_NOTCAPABLE
- CAPFAIL_INCREASE
- CAPFAIL_SYSCALL
- CAPFAIL_SIGNAL
- CAPFAIL_PROTO
- CAPFAIL_SOCKADDR
- CAPFAIL_NAMEI
- CAPFAIL_CPUSET

A portion of these test cases create processes that do NOT enter
capability mode, but raise violations. This is intended behavior.
Users may run `ktrace -t p` on non-Capsicumized programs to detect
violations that would occur if the process were in capability mode.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D40682
2024-04-07 18:52:51 -05:00
Jake Freeland aa32d7cbc9 ktrace: Record socket violations with KTR_CAPFAIL
Report restricted access to socket addresses and protocols during
Capsicum violation tracing with CAPFAIL_SOCKADDR and CAPFAIL_PROTO.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D40681
2024-04-07 18:52:51 -05:00
Jake Freeland 0cd9cde767 ktrace: Record namei violations with KTR_CAPFAIL
Report namei path lookups while Capsicum violation tracing with
CAPFAIL_NAMEI. vfs caching is also ignored when tracing to mimic
capability mode behavior.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D40680
2024-04-07 18:52:51 -05:00
Jake Freeland 6a4616a529 ktrace: Record signal violations with KTR_CAPFAIL
Report the delivery of signals to processes other than self while
Capsicum violation tracing with CAPFAIL_SIGNAL.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D40679
2024-04-07 18:52:51 -05:00
Jake Freeland 05296a0ff6 ktrace: Record syscall violations with KTR_CAPFAIL
Report syscalls that are not allowed in capability mode with
CAPFAIL_SYSCALL.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D40678
2024-04-07 18:52:51 -05:00
Jake Freeland 96c8b3e509 ktrace: Record cpuset violations with KTR_CAPFAIL
Report Capsicum violations in the cpuset namespace with CAPFAIL_CPUSET.

Reviewed by:	markj
Approved by:	markj (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D40677
2024-04-07 18:52:51 -05:00
Jake Freeland 9bec841312 ktrace: Record detailed ECAPMODE violations
When a Capsicum violation occurs in the kernel, ktrace will now record
detailed information pertaining to the violation.

For example:
- When a namei lookup violation occurs, ktrace will record the path.
- When a signal violation occurs, ktrace will record the signal number.
- When a sendto(2) violation occurs, ktrace will record the recipient
  sockaddr.

For all violations, the syscall and ABI are recorded.

kdump is also modified to display this new information to the user.

Reviewed by:	oshogbo, markj
Approved by:	markj (mentor)
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D40676
2024-04-07 18:52:51 -05:00
Rick Macklem 401c0f8aa1 exports.5: Add RFC number for NFS over TLS
This is a content change.

MFC after:	1 week
2024-04-07 16:35:55 -07:00
Ed Maste e8b7c78c1b Cirrus-CI: switch to llvm18 by default
As of commit 439352ac82 Clang/LLVM 18 is the default in-tree compiler.
Follow suit with the external toolchain package used by Cirrus-CI.

Sponsored by:	The FreeBSD Foundation
2024-04-07 17:23:25 -04:00
Michael Tuexen e8c149ab85 tcp: add some debug output
Also log, when dropping text or FIN after having received a FIN.
This is the intended behavior described in RFC 9293.
A follow-up patch will enforce this behavior for the base stack
and the RACK stack.
Reviewed by:		rscheff
MFC after:		3 days
Sponsored by:		Netflix, Inc.
Differential Revision:	https://reviews.freebsd.org/D44669
2024-04-07 22:41:24 +02:00
Dimitry Andric 4c983a2886 libcompiler_rt Makefile.inc: include bsd.compiler.mk to fix build
Apparently libgcc_s has always included libcompiler_rt's Makefile.inc
without first including bsd.compiler.mk, even though Makefile.inc used
COMPILER_TYPE already. It looks like we were just lucky that the
expression was not malformed.

PR:		276104
Reported by:	Herbert J. Skuhra <herbert@gojira.at>
MFC after:	1 month
2024-04-07 21:45:51 +02:00