unix: new implementation of unix/stream & unix/seqpacket

Provide protocol specific pr_sosend and pr_soreceive for PF_UNIX
SOCK_STREAM sockets and implement SOCK_SEQPACKET sockets as an extension
of SOCK_STREAM.  The change meets three goals: get rid of unix(4) specific
stuff in the generic socket code, provide a faster and robust unix/stream
sockets and bring unix/seqpacket much closer to specification.  Highlights
follow:

- The send buffer now is truly bypassed.  Previously it was always empty,
but the send(2) still needed to acquire its lock and do a variety of
tricks to be woken up in the right time while sleeping on it.  Now the
only two things we care about in the send buffer is the I/O sx(9) lock
that serializes operations and value of so_snd.sb_hiwat, which we can read
without obtaining a lock.  The sleep of a send(2) happens on the mutex of
the receive buffer of the peer.  A bulk send/recv of data with large
socket buffers will make both syscalls just bounce between owning the
receive buffer lock and copyin(9)/copyout(9), no other locks would be
involved.

- The implementation uses new mchain structure to manipulate mbuf chains.
Note that this required converting to mchain two functions that are shared
with unix/dgram: unp_internalize() and unp_addsockcred() as well as adding
a new shared one uipc_process_kernel_mbuf().  This induces some non-
functional changes in the unix/dgram code as well.  There is a space for
improvement here, as right now it is a mix of mchain and manually managed
mbuf chains.

- unix/seqpacket previously marked as PR_ADDR & PR_ATOMIC and thus treated
as a datagram socket by the generic socket code, now becomes a true stream
socket with record markers.

- unix/stream loses the sendfile(2) support.  This can be brought back,
but requires some work.  Let's first see if there is any interest in this
feature, except purely academical.

Reviewed by:		markj, tuexen
Differential Revision:	https://reviews.freebsd.org/D44151
This commit is contained in:
Gleb Smirnoff 2024-04-08 13:16:51 -07:00
parent aba79b0f4a
commit d80a97def9
2 changed files with 652 additions and 325 deletions

File diff suppressed because it is too large Load diff

View file

@ -130,6 +130,13 @@ struct sockbuf {
uint64_t sb_tls_seqno; /* TLS seqno */
struct ktls_session *sb_tls_info; /* TLS state */
};
/*
* PF_UNIX/SOCK_STREAM and PF_UNIX/SOCK_SEQPACKET
* A most simple stream buffer.
*/
struct {
STAILQ_HEAD(, mbuf) sb_mbq;
};
/*
* PF_UNIX/SOCK_DGRAM
*