On Feb. 28, a problem was reported on freebsd-stable@ where an
nfsd thread processing an ExchangeID operation was blocked for
a long time by another nfsd thread performing a copy_file_range.
This occurred because the copy_file_range was taking a long time,
but also because handling a clientID requires that all other nfsd
threads be blocked via an exclusive lock, as required by ExchangeID.
This patch adds two arguments to nfsv4_cleanclient() so that it
can optionally be called with a mutex held. For this patch, the
first of these arguments is "false" and, as such, there is no
change in semantics. However, this change will allow a future
commit to modify handling of the clientID so that it can be done
with a mutex held while other nfsd threads continue to process
NFS RPCs.
MFC after: 1 month
RFC8881 specifies that, when an NFSv4 Link operation occurs on
a file, delegations for that file issued to other clients must
be recalled. Discovered during a recent discussion on nfsv4@ietf.org.
Although I have not observed a problem caused by not doing
the required delegation recall, it is definitely required
by the RFC, so this patch makes the server do the recall.
Tested during a recent NFSv4 IETF Bakeathon event.
MFC after: 1 week
There is only one place in the unpatched sources where B_DIRECT is
set in the NFS client and this code is never executed. As such, this patch
removes this code that is never executed, since B_DIRECT should never
be set.
During an IETF testing event this week, I saw a crash in ncl_doio_directwrite(),
but this function is only called if B_DIRECT is set.
I cannot explain how ncl_doio_directwrite() got called, but once this patch
was applied to the sources, the crash did not recur. This is not surprising,
since this patch deleted the function.
Reviewed by: kib, markj
MFC after: 3 days
Differential Revision: https://reviews.freebsd.org/D44980
Commit 57ce37f9dc modified the NFSv4.2 Copy operation so that
it will update atime on the infd file whenever possible.
This is done by adding a Setattr of TimeAccess for the
input file.
This patch disables this change for the case of an NFSv4.2
mount with the "noatime" mount option, which avoids the
additional Setattr of TimeAccess operation.
MFC after: 1 week
If the NFS server detects that the Kerberos credentials provided
by an NFSv4.1/4.2 mount using sec=krb5[ip] have expired, the NFS
server replies with a krpc layer error of RPC_AUTHERROR.
When this happened, the client erroneously left the NFSv4.1/4.2
session slot busy, so that it could not be used by other RPCs.
If this happened for all session slots, the mount point would
hang.
This patch fixes the problem by releasing the session slot
and resetting its sequence# upon receiving an RPC_AUTHERROR
reply.
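The slot leak described above can be modelled in user space. The following is only a sketch: all toy_* names are hypothetical and are not the kernel's functions, and it assumes that "resetting the sequence#" means undoing the increment made when the slot was assigned, which the message does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_NSLOTS		64	/* slot count quoted in this message */
#define TOY_RPC_AUTHERROR	1	/* hypothetical stand-in for the krpc error */

struct toy_session {
	uint64_t busy;			/* bit i set => slot i is in use */
	uint32_t seq[TOY_NSLOTS];	/* per-slot sequence number */
};

/* Assign the lowest free slot and bump its sequence#; -1 if all are busy. */
static int
toy_slot_get(struct toy_session *s)
{
	for (int i = 0; i < TOY_NSLOTS; i++) {
		uint64_t bit = (uint64_t)1 << i;

		if ((s->busy & bit) == 0) {
			s->busy |= bit;
			s->seq[i]++;
			return (i);
		}
	}
	return (-1);
}

/* Release a slot; optionally undo the sequence# increment (assumption). */
static void
toy_slot_put(struct toy_session *s, int slot, bool undo_seq)
{
	assert(slot >= 0 && slot < TOY_NSLOTS);
	s->busy &= ~((uint64_t)1 << slot);
	if (undo_seq)
		s->seq[slot]--;
}

/* Finish an RPC. "patched" selects the behaviour with this fix applied. */
static void
toy_rpc_done(struct toy_session *s, int slot, int rpc_error, bool patched)
{
	if (rpc_error == TOY_RPC_AUTHERROR) {
		if (patched)
			toy_slot_put(s, slot, true);
		/* Unpatched: the slot is left busy forever. */
		return;
	}
	toy_slot_put(s, slot, false);
}
```

In this model, 64 consecutive auth errors without the fix leave toy_slot_get() returning -1, which corresponds to the hung mount point.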
This bug only affects NFSv4.1/4.2 mounts using sec=krb5[ip],
but has existed since NFSv4.1 client support was added to
FreeBSD.
So, why has the bug remained undetected for so long?
I cannot be sure, but I suspect that, often, the client detected
the Kerberos credential expiration before attempting the RPC.
For this case, the client would not do the RPC and, as such,
there would be no busy session slot. Also, no hang would
occur until all session slots are busied (64 for a FreeBSD
client/server), so many cases of the bug probably went undetected.
Also, use of sec=krb5[ip] mounts is not that common.
PR: 275905
Tested by: Lexi <lexi.freebsd@le-fay.org>
MFC after: 1 week
If vfs.nfs.nfs_directio_enable is set non-zero (the default is
zero) and a file on an NFS mount is read after being opened
with O_DIRECT | O_RDONLY, a call to nfsm_mbufuio() calls
copyout() without checking for an error return.
If copyout() returns EFAULT, this would not work correctly.
Only the call path
VOP_READ()->ncl_readrpc()->nfsrpc_read()->nfsrpc_readrpc()
will do this and the error return for EFAULT will
be returned back to VOP_READ().
This patch adds the error check to nfsm_mbufuio().
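The kind of check being added can be illustrated with a user-space sketch. toy_copyout() is a hypothetical stand-in for copyout(9) that fails with EFAULT on a NULL destination, and toy_copy_chunks() is not the real nfsm_mbufuio(); it only shows a copy loop that stops and propagates the first error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical copyout(9) stand-in: a NULL destination yields EFAULT. */
static int
toy_copyout(const void *src, void *dst, size_t len)
{
	if (dst == NULL)
		return (EFAULT);
	memcpy(dst, src, len);
	return (0);
}

/*
 * Copy srclen bytes in pieces of at most "chunk" bytes.
 * Returns 0 on success or the first error from toy_copyout();
 * *copied reports how many bytes were copied before stopping.
 */
static int
toy_copy_chunks(const char *src, size_t srclen, char *dst, size_t chunk,
    size_t *copied)
{
	size_t off = 0;
	int error;

	assert(chunk > 0 && copied != NULL);
	*copied = 0;
	while (off < srclen) {
		size_t n = srclen - off < chunk ? srclen - off : chunk;

		error = toy_copyout(src + off, dst == NULL ? NULL : dst + off, n);
		if (error != 0)
			return (error);	/* the check that was missing */
		off += n;
		*copied = off;
	}
	return (0);
}
```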
Reviewed by: markj
Differential Revision: https://reviews.freebsd.org/D43160
During recent testing related to the IETF NFSv4 Bakeathon, it was
discovered that Kerberized NFSv4.1/4.2 mounts to pNFS servers
(sec=krb5[ip],pnfs mount options) were broken.
The FreeBSD client was using the "service principal" for
the MDS to try and establish a rpcsec_gss credential for a DS,
which is incorrect. (A "service principal" looks like
"nfs@<fqdn-of-server>" and the <fqdn-of-server> for the DS is not
the same as the MDS for most pNFS servers.)
To fix this, the rpcsec_gss code needs to be able to do a
reverse DNS lookup of the DS's IP address. A new kgssapi upcall
to the gssd(8) daemon is added by this patch to do the reverse DNS
along with a new rpcsec_gss function to generate the "service
principal".
A separate patch to gssd(8) will be committed, so that this
patch will fix the problem. Without the gssd(8) patch, the new
upcall fails and current/incorrect behaviour remains.
This bug only affects the rare case of a Kerberized (sec=krb5[ip],pnfs)
mount using pNFS.
This patch changes the internal KAPI between the kgssapi and
nfscl modules, but since I did a version bump a few days ago,
I will not do one this time.
MFC after: 1 month
RFC7862 does not specify infile atime behaviour when an NFSv4.2 Copy
operation is performed. Since the collective opinion of a mailing
list discussion (on freebsd-hackers@) seemed to indicate that
copy_file_range(2) should update atime on the infd,
even if there is no data copied, this
patch attempts to ensure that behaviour.
For Copy, it precedes the Copy operation with a Setattr of
TimeAccess_Set (NFSv4-speak for atime) for the invp. For the case
where no data will be copied, it does a Setattr RPC to set
TimeAccess_Set for the invp.
A __FreeBSD_version bump will be done as a separate commit, since
this patch changes the internal interface between the nfscommon and
nfscl modules.
MFC after: 1 month
In a recent email list discussion related to NFSv4 mount problems
against a non-FreeBSD NFSv4 server, the reporter of the issue noted
that the server had replied 10068 (NFSERR_RETRYUNCACHEDREP). This
did not seem related to the mount problem, but I had never seen this
error before. It indicates that an RPC retry after a new TCP
connection has been established failed because the server did not
cache the reply. Since this should only happen for idempotent
operations, redoing the RPC should be safe.
This patch modifies the NFSv4.1/4.2 client to redo the RPC instead
of considering the server error fatal. It should only affect the
unusual case where TCP connections to NFSv4 servers are breaking
without the NFSv4 server rebooting.
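The retry behaviour can be sketched as a small user-space model. The error value 10068 comes from this message; the toy_* names, the fake server and the retry bound TOY_MAXTRIES are all assumptions made for illustration and do not describe the actual client code.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NFSERR_RETRYUNCACHEDREP	10068	/* value quoted in this message */
#define TOY_MAXTRIES			5	/* hypothetical retry bound */

/* Fake server: answers "uncached reply" uncached_left times, then succeeds. */
struct toy_server {
	int uncached_left;
	int calls;
};

static int
toy_rpc(struct toy_server *srv)
{
	srv->calls++;
	if (srv->uncached_left > 0) {
		srv->uncached_left--;
		return (TOY_NFSERR_RETRYUNCACHEDREP);
	}
	return (0);
}

/* Redo the RPC instead of treating the error as fatal. */
static int
toy_request(struct toy_server *srv)
{
	int error = 0;

	assert(srv != NULL);
	for (int tries = 0; tries < TOY_MAXTRIES; tries++) {
		error = toy_rpc(srv);
		if (error != TOY_NFSERR_RETRYUNCACHEDREP)
			break;
	}
	return (error);
}
```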
Reported by: J David <j.devid.lists@gmail.com>
MFC after: 2 weeks
PR#274346 reports a crash which appears to be caused by a NULL default session
being destroyed. This patch should avoid the crash.
Tested by: Joshua Kinard <freebsd@kumba.dev>
PR: 274346
MFC after: 2 weeks
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.
Discussed with: pfg
MFC After: 3 days
Sponsored by: Netflix
Although the NFS client does not currently perform Null RPCs,
this fix is needed if/when it might do so.
Found during testing of experimental code that uses Null RPCs
to maintain/monitor TCP connections for "nconnect" mounts.
MFC after: 3 months
Commit f4179ad46f added support for operation bitmaps for
NFSv4.1/4.2. This commit uses those to implement the SP4_MACH_CRED
case for the NFSv4.1/4.2 ExchangeID operation since the Linux
NFSv4.1/4.2 client is now using this for Kerberized mounts.
The Linux Kerberized NFSv4.1/4.2 mounts currently work without
support for this because Linux will fall back to SP4_NONE,
but there is no guarantee this fallback will work forever.
This commit only affects Kerberized NFSv4.1/4.2 mounts from
Linux at this time.
MFC after: 3 months
NFSv4.1/4.2 uses operation bitmaps for various operations,
such as the SP4_MACH_CRED case for ExchangeID.
This patch adds support for operation bitmaps so that
support for SP4_MACH_CRED can be added to the NFSv4.1/4.2
server in a future commit.
This commit should not change any NFSv4.1/4.2 semantics.
MFC after: 3 months
Coverity does not like code that checks a function's
return value sometimes. Add "(void)" in front of the
function when the return value does not matter to try
and make it happy.
A recent commit deleted "(void)"s in front of nfsm_fhtom().
This commit puts them back in.
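As a generic illustration of the "(void)" idiom, here is a minimal sketch. toy_append() is a hypothetical helper and has nothing to do with nfsm_fhtom() beyond returning a value that some callers ignore.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: append one byte and return the new length. */
static size_t
toy_append(unsigned char *buf, size_t *lenp, size_t cap, unsigned char b)
{
	assert(*lenp < cap);
	buf[(*lenp)++] = b;
	return (*lenp);
}

static size_t
toy_build(unsigned char *buf, size_t cap)
{
	size_t len = 0;

	/* Return value deliberately ignored, so say so with (void). */
	(void)toy_append(buf, &len, cap, 0x01);
	/* Here the return value matters and is used. */
	return (toy_append(buf, &len, cap, 0x02));
}
```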
Reported by: emaste
MFC after: 3 months
Without this patch, a Kerberized NFSv4.1/4.2 mount must provide
a Kerberos credential for the client at mount time. This credential
is typically referred to as a "machine credential". It can be
created one of two ways:
- The user (usually root) has a valid TGT at the time the mount
  is done and this becomes the machine credential.
  There are two problems with this.
  1 - The user doing the mount must have a valid TGT for a user
      principal at mount time. As such, the mount cannot be put
      in fstab(5) or similar.
  2 - When the TGT expires, the mount breaks.
- The client machine has a service principal in its default keytab
  file and this service principal (typically called a host-based
  initiator credential) is used as the machine credential.
  There are problems with this approach as well:
  1 - There is a certain amount of administrative overhead creating
      the service principal for the NFS client, creating a keytab
      entry for this principal and then copying the keytab entry
      into the client's default keytab file via some secure means.
  2 - The NFS client must have a fixed, well known, DNS name, since
      that FQDN is in the service principal name as the instance.
This patch uses a feature of NFSv4.1/4.2 called SP4_NONE, which
allows the state maintenance operations to be performed by any
authentication mechanism, to do these operations via AUTH_SYS
instead of RPCSEC_GSS (Kerberos). As such, neither of the above
mechanisms is needed.
It is hoped that this option will encourage adoption of Kerberized
NFS mounts using TLS, to provide a more secure NFS mount.
This new NFSv4.1/4.2 mount option, called "syskrb5", must be used
with "sec=krb5[ip]" to avoid the need for either of the above
Kerberos setups to be done by the client.
Note that all file access/modification operations still require
users on the NFS client to have a valid TGT recognized by the
NFSv4.1/4.2 server. As such, this option allows, at most, a
malicious client to do some sort of DoS attack.
Although not required, use of "tls" with this new option is
encouraged, since it provides on-the-wire encryption plus,
optionally, client identity verification via an X.509
certificate provided to the server during TLS handshake.
Alternately, "sec=krb5p" does provide on-the-wire
encryption of file data.
A mount_nfs(8) man page update will be done in a separate commit.
Discussed on: freebsd-current@
MFC after: 3 months
The KASAN tests show that nfsrvd_cleancache() results
in a modify after free. I think this occurs because the
nfsrv_cleanup() function gets executed after nfs_cleanup(),
which frees the nfsstatsv1_p.
This patch makes them use the same subsystem and sets
SI_ORDER_FIRST for nfs_cleanup(), so that it will be called
after nfsrv_cleanup() via VNET_SYSUNINIT().
The patch also sets nfsstatsv1_p NULL after freeing it,
so that a crash will result if it is used after being freed.
Tested by: markj
Reviewed by: markj
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D38750
Another oopsie. The vnet initialization function in
nfs_commonport.c checked for initialization of prison0 by testing
curthread->td_ucred->cr_prison == &prison0. This is bogus
and always true. Replace it with IS_DEFAULT_VNET(curvnet).
MFC after: 3 months
Commit ed03776ca7 enabled the vnet front end macros.
As such, kernels built with the VIMAGE option will malloc
data and initialize locks on a per-vnet basis, typically
via a VNET_SYSINIT().
This patch adds VNET_SYSUNINIT() macros to do the frees
of the per-vnet malloc'd data and destroys of per-vnet
locks. It also removes the mtx_lock/mtx_unlock calls
from nfsrvd_cleancache(), since they are not needed.
Discussed with: bz, jamie
MFC after: 3 months
Several commits have added front end macros for the vnet
macros to the NFS server, krpc and kgssapi. These macros
are now null, but this patch changes them to front end
the vnet macros.
With this commit, many global variables in the code become
vnet'd, so that nfsd(8), nfsuserd(8), rpc.tlsservd(8) and
gssd(8) can run in a vnet prison, once enabled.
To run the NFS server in a vnet prison still requires a
couple of patches (in D37741 and D38371) that allow mountd(8)
to export file systems from within a vnet prison. Once
these are committed to main, a small patch to kern_jail.c
allowing "allow.nfsd" without VNET_NFSD defined will allow
the NFS server to run in a vnet prison.
One area that still needs to be settled is cleanup when a
prison is removed. Without this, everything should work
except there will be a leak of malloc'd data and mutex locks
when a vnet prison is removed.
MFC after: 3 months
Commit 9d329bbc9a converted a lot of accesses to nfsstatsv1
to use nfsstatsv1_p instead. However, the accesses in
nfs_commonkrpc.c are for client side and should not be
converted. This patch puts them back in the correct
pre-commit 9d329bbc9a form.
MFC after: 3 months
Commit 7344856e3a6d added a lot of macros that will front end
vnet macros so that nfsd(8) can run in vnet prison.
The nfsstatsv1_p variable got missed. This patch wraps all
uses of nfsstatsv1_p with the NFSD_VNET() macro.
The NFSD_VNET() macro is still a null macro.
MFC after: 3 months
Commit 7344856e3a6d added a lot of macros that will front end
vnet macros so that nfsd(8) can run in vnet prison.
This patch adds some more, to allow the nfsuserd(8) daemon to
run in vnet prison, once the macros map to vnet ones.
This is the last commit for NFSD_VNET_xxx macros, but there are
still some for KRPC_VNET_xxx and KGSS_VNET_xxx to allow the
rpc.tlsservd(8) and gssd(8) daemons to run in a vnet prison.
MFC after: 3 months
Commit 7344856e3a6d added a lot of macros that will front end
vnet macros so that nfsd(8) can run in vnet prison.
This patch adds some more of them and also a lot of uses of
nfsstatsv1_p instead of nfsstatsv1. nfsstatsv1_p points to
nfsstatsv1 for prison0, but will point to a malloc'd structure
for other prisons.
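The pointer arrangement described above can be sketched in user space as follows. The toy_* names are hypothetical; toy_stats_global plays the role of nfsstatsv1, the returned pointer plays the role of nfsstatsv1_p, and the detach helper is purely illustrative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

struct toy_stats {
	unsigned long rpccnt;
};

/* Statically allocated structure used by the default instance (prison0). */
static struct toy_stats toy_stats_global;

/* Return the stats pointer for an instance; non-default ones get their own. */
static struct toy_stats *
toy_stats_attach(bool is_default)
{
	if (is_default)
		return (&toy_stats_global);
	return (calloc(1, sizeof(struct toy_stats)));
}

/* Free a malloc'd structure (never the global one) and clear the pointer. */
static void
toy_stats_detach(struct toy_stats **statspp)
{
	assert(statspp != NULL);
	if (*statspp != &toy_stats_global)
		free(*statspp);
	*statspp = NULL;
}
```

Code that always goes through the pointer works unchanged for either kind of instance, which is why the accesses are converted from the structure to the pointer.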
It also puts nfsstatsv1_p in nfscommon.ko instead of nfsd.ko.
MFC after: 3 months
Commit 7344856e3a6d added a lot of macros that will front end
vnet macros so that nfsd(8) can run in vnet prison.
This patch adds some more of them.
MFC after: 3 months
This patch defines null macros that can be used to apply
the vnet macros for global variables and SYSCTL flags.
It also applies these macros to many of the global variables
and some of the SYSCTLs. Since the macros do nothing, these
changes should not result in semantics changes, although the
changes are large in number.
The patch does change several global variables that were
arrays or structures to pointers to same. For these variables,
modified initialization and cleanup code mallocs and frees
the arrays/structures. This was done so that the vnet footprint
would be about 300 bytes when the macros are defined as vnet macros,
allowing nfsd.ko to load dynamically.
I believe the comments in D37519 have been addressed, although
it has never been reviewed, due in part to the large size of the patch.
This is the first of a series of patches that will put D37519 in main.
Once everything is in main, the macros will be defined as front
end macros to the vnet ones.
MFC after: 3 months
Differential Revision: https://reviews.freebsd.org/D37519
Commit 65127e982b removed a check for ni_startdir != NULL.
This allowed the vrele(ndp->ni_dvp) to be called with
a NULL argument.
This patch adds a new boolean argument to nfsvno_open()
that can be checked instead of ni_startdir, since mjg@ requested
that ni_startdir not be used. (Discussed in PR#268828.)
PR: 268828
Reviewed by: mjg
Differential Revision: https://reviews.freebsd.org/D38032
Commit 40ada74ee1 modified the NFSv4.1/4.2 client so
that it would issue a DestroySession to the server when
all session slots are marked bad. Once this is done,
the Sequence operation should get a NFSERR_BADSESSION
reply from the server.
Without this patch, the code was setting ND_HASSLOTID
when, in fact, there was no slot marked in use by
nfsv4_sequencelookup(). This would result in the
code freeing a slot not in use. The effect of this
was minimal, since the session was already destroyed.
This patch fixes the code so that it does not set
ND_HASSLOTID for this case.
MFC after: 2 weeks
The NFSv4.1/4.2 client does recovery when it receives a
NFSERR_BADSESSION reply from the server. If the server has
not rebooted, this is often caused by multiple clients using
the same /etc/hostid and, as such, not being recognized as
different clients by the server.
This trivial patch adds a console message to suggest that
clients' /etc/hostid files need to be checked for uniqueness.
MFC after: 2 weeks
When the NFSv4.1/4.2 client is handling a server error
of NFSERR_BADSESSION, it retries RPCs with a new session.
Without this patch, the nd_slotid was not being updated
for the new session.
This would result in a bogus console message like
"Wrong session srvslot=X slot=Y" and then it would
free the incorrect slot, often generating a
"freeing free slot!!" console message as well.
This patch fixes the problem.
Note that FreeBSD NFSv4.1/4.2 servers only
generate a NFSERR_BADSESSION error after a reboot
or after a client does a DestroySession operation.
PR: 260011
MFC after: 1 week
When a session has been marked defunct by the server
sending a NFSERR_BADSESSION reply to the NFSv4.1/4.2
client, nfsv4_sequencelookup() returns NFSERR_BADSESSION
without actually assigning a session slot.
Without this patch, newnfs_request() would erroneously
free slot 0.
This could result in the slot being reused prematurely,
but most likely just generated a "freeing free slot!!"
console message.
This patch fixes the code to not do the erroneous
freeing of the slot for this case.
PR: 260011
MFC after: 1 week
I mis-read the RFC w.r.t. handling of the sequenceid
when a CreateSession is done after the initial one
that confirms the ClientID. Fortunately this does
not affect most extant NFSv4.1/4.2 clients, since
they only acquire a single session for TCP for a
ClientID (Solaris might be an exception?).
This patch fixes the server to handle this case,
where the RFC requires the sequenceid be incremented
for each CreateSession and is required to reply to
a retried CreateSession with a cached reply.
It adds a field to nfsclient called lc_prevsess,
which caches the sessionid, which is the only field
in a CreateSession reply that will change for a
retry, to implement this reply cache.
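The sequenceid handling described above can be modelled roughly as follows. This is a sketch under assumptions: the toy_* names are hypothetical, the sessionid is reduced to a 64-bit counter, and only the prevsess field is meant to mirror the role of lc_prevsess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum toy_cs_result { TOY_CS_NEW, TOY_CS_REPLAY, TOY_CS_MISORDERED };

struct toy_client {
	uint32_t next_seq;	/* sequenceid expected for a new CreateSession */
	uint64_t prevsess;	/* sessionid of the last reply (cf. lc_prevsess) */
	uint64_t sess_counter;	/* source of toy sessionids */
	bool have_prev;		/* true once a reply has been cached */
};

static enum toy_cs_result
toy_createsession(struct toy_client *clp, uint32_t seq, uint64_t *sessidp)
{
	assert(clp != NULL && sessidp != NULL);
	if (seq == clp->next_seq) {
		/* New request: create a session and cache its sessionid. */
		clp->prevsess = ++clp->sess_counter;
		clp->have_prev = true;
		clp->next_seq++;
		*sessidp = clp->prevsess;
		return (TOY_CS_NEW);
	}
	if (clp->have_prev && (uint32_t)(seq + 1) == clp->next_seq) {
		/* Retry of the previous request: reply from the cache. */
		*sessidp = clp->prevsess;
		return (TOY_CS_REPLAY);
	}
	return (TOY_CS_MISORDERED);
}
```

Caching only the sessionid is enough in this model because, as noted above, it is the only field of the reply that would differ on a retry.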
The recent commits up to d4a11b3e3b that mark
session slots bad when "intr" and/or "soft" mounts
are used by the client need this server patch.
Without this patch, the client will do a full
recovery, including a new ClientID, losing all
byte range locks. However, prior to the recent
client commits, the client would hang when all
session slots were bad, so even without this
patch it is not a regression.
PR: 260011
MFC after: 2 weeks
To deal with broken session slots caused by the use of the
"soft" and/or "intr" mount options, nfsv4_sequencelookup()
has been modified to track the potentially broken session
slots (commit 40ada74ee1). Then, when all session slots
are potentially broken, nfsv4_sequencelookup() does a
DestroySession operation, so that the NFSv4.1/4.2 server will
reply NFSERR_BADSESSION to uses of the session.
The client will then recover by doing a CreateSession to
acquire a new session.
This patch adds the code that marks potentially bad
slots, so that the above semantics become functional.
It has been successfully tested against a FreeBSD
NFSv4.1/4.2 server, but does not work against a Linux 5.15
NFSv4.1/4.2 server. (The Linux 5.15 server creates
a new session with the same sessionid as the destroyed
one and, as such, keeps returning NFSERR_BADSESSION.
I believe this is a bug in the Linux server.)
However, this should not cause a regression and will
make "intr" mounts fairly usable against the NFSv4.1/4.2
servers where it works.
PR: 260011
MFC after: 2 weeks
This patch adds support for session slots marked bad
to nfsv4_sequencelookup(). An additional boolean
argument indicates if the check for slots marked bad
should be done.
The "cred" argument added to nfscl_reqstart() by
commit 326bcf9394 is now passed into nfsv4_setsequence()
so that it can optionally set the boolean argument
for nfsv4_sequencelookup(). When optionally enabled,
nfsv4_setsequence() will do a DestroySession when all
slots are marked bad.
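A user-space sketch of a slot lookup with an optional bad-slot check is shown below. The bs_* names and the small slot count are hypothetical and the real nfsv4_sequencelookup() is more involved; the point is only that the caller can tell "every slot is bad" apart from "no slot is free right now" and do a DestroySession in the former case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BS_NSLOTS	8	/* small slot count, chosen only for illustration */

enum bs_result { BS_GOTSLOT, BS_NOFREESLOT, BS_ALLBAD };

struct bs_session {
	uint32_t inuse;		/* bit i set => slot i is in use */
	uint32_t bad;		/* bit i set => slot i is marked bad */
};

/*
 * Assign the lowest usable slot. When checkbad is true, slots marked bad
 * are skipped and BS_ALLBAD is returned if every slot is bad; when it is
 * false, the bad mask is ignored entirely.
 */
static enum bs_result
bs_sequencelookup(struct bs_session *s, bool checkbad, int *slotp)
{
	const uint32_t all = ((uint32_t)1 << BS_NSLOTS) - 1;
	uint32_t skip;

	assert(s != NULL && slotp != NULL);
	if (checkbad && (s->bad & all) == all)
		return (BS_ALLBAD);	/* caller would do a DestroySession */
	skip = s->inuse | (checkbad ? s->bad : 0);
	for (int i = 0; i < BS_NSLOTS; i++) {
		uint32_t bit = (uint32_t)1 << i;

		if ((skip & bit) == 0) {
			s->inuse |= bit;
			*slotp = i;
			return (BS_GOTSLOT);
		}
	}
	return (BS_NOFREESLOT);
}
```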
Since the code that marks slots bad is not yet committed,
this patch should not result in a semantics change.
PR: 260011
MFC after: 2 weeks
This patch moves nfsrpc_destroysession() into nfscommon.ko
and also modifies its arguments slightly. This will allow
the function to be called from nfsv4_sequencelookup() in
a future commit.
This patch should not result in a semantics change.
PR: 260011
MFC after: 2 weeks
To deal with broken session slots caused by the use of the
"soft" and/or "intr" mount options, nfsv4_sequencelookup()
will be modified to track the potentially broken session
slots. Then, when all session slots are potentially
broken, do a DestroySession operation, so that the NFSv4
server will reply NFSERR_BADSESSION to uses of the session.
These changes will be done in future commits. However,
to do the DestroySession RPC, a "cred" argument is needed
for nfscl_reqstart(). This patch adds this argument,
which is unused at this time. If the argument is NULL,
it indicates that DestroySession should not be done
(usually because the RPC does not use sessions).
This patch should not cause any semantics change.
PR: 260011
MFC after: 2 weeks
The vnode_vtype() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code and, therefore,
use of the macro has been deleted by previous commits.
This commit deletes the now unused macro.
This commit should not result in a semantics change.
The vnode_vtype() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code, so
avoid using it to clean up the code.
This commit should not result in a semantics change.
The vfs_flags() macro was used to make the code compatible
with Mac OSX, for the Mac OSX port.
For FreeBSD, this macro just obscured the code, so
remove it to clean up the code.
This commit should not result in a semantics change.
The definition of "APPLE" was used by the Mac OSX port.
For FreeBSD, this definition is never used, so remove
the references to it to clean up the code.
This commit should not result in a semantics change.
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port. For FreeBSD, this argument
is always NULL, so remove it to clean up the code.
This commit gets rid of "stuff" for assorted functions
defined in nfs_clrpcops.c and called in nfs_clvnops.c and
nfs_clstate.c.
This commit should not result in a semantics change.