Commit graph

4783 commits

Author SHA1 Message Date
Rick Macklem 9792c7d3eb nfscl: Enable support for the Lookup+Open RPC
Commits 3ad1e1c1ce and 57014f21e7 added a Lookup+Open
RPC for NFSv4.1/4.2, which can reduce the RPC count by
10-20% for some loads.  This has now received a fair amount
of testing, so I think it is ok to enable it.

Note that the Lookup+Open RPC is only used when the
"oneopenown" mount option is specified.  As such, this
change won't affect most NFSv4.1/4.2 mounts.
2022-05-31 11:59:39 -07:00
Dmitry Chagin 31d1b816fe sysent: Get rid of bogus sys/sysent.h include.
Where appropriate hide sysent.h under proper condition.

MFC after:	2 weeks
2022-05-28 20:52:17 +03:00
Rick Macklem a7bb120f8b nfscl: Add a diagnostic printf() for a "should never happen" case
When a NFSv4.1/4.2 session to the NFS server (not a pNFS DS) is
replaced, the old session should always be marked defunct by
nfsess_defunct being set non-zero.

However, the hang reported by the PR suggests that this might
be the case.

This patch adds a printf() to indicate this has somehow happened.

PR:	260011
MFC after: 	2 weeks
2022-05-27 14:32:46 -07:00
Rick Macklem 425e5c739b nfscl: Do not handle NFSERR_BADSESSION in operation code
The NFSERR_BADSESSION reply from a NFSv4.1/4.2 server
is handled by newnfs_request().  It should not be handled
separately after newnfs_request() has returned.

These two cases were spotted during code inspection.
One of them should only redo what newnfs_request() already
did by the same "nfscl" thread.  The other might have
resulted in recovery being done twice, but the code is
only used for "pnfs" mounts, so that would be rare.
Also, since NFSERR_BADSESSION should only be replied by
a server after the server reboots, this would be extremely
rare.

MFC after: 	2 weeks
2022-05-27 14:20:31 -07:00
Alan Somers 0bef4927ea fusefs: handle evil servers that return illegal inode numbers
* If during FUSE_CREATE, FUSE_MKDIR, etc the server returns the same
  inode number for the new file as for its parent directory, reject it.
  Previously this would triggers a recurse-on-non-recursive lock panic.

* If during FUSE_LINK the server returns a different inode number for
  the new name as for the old one, reject it.  Obviously, that can't be
  a hard link.

* If during FUSE_LOOKUP the server returns the same inode number for the
  new file as for its parent directory, reject it.  Nothing good can
  come of this.

PR:		263662
Reported by:	Robert Morris <rtm@lcs.mit.edu>
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D35128
2022-05-12 14:32:26 -06:00
Alan Somers dcfa054216 fusefs: fix an undefined variable access
In an error path, a dtrace probe could access an undefined variable.

Reported by:	Coverity (CID 1471986)
MFC after:	2 weeks
Sponsored by:	Axcient
2022-05-05 08:43:51 -06:00
Rick Macklem ef4edb70c9 nfsd: Add a sanity check for Owner/OwnerGroup string length
Robert Morris reported that, if a client sends an absurdly
large Owner/OwnerGroup string, the kernel malloc() for the
large size string can block forever.

This patch adds a sanity limit for Owner/OwnerGroup string
length.  Since the RFCs do not specify any limit and FreeBSD
can handle a group name greater than 1Kbyte, the limit is
set at a generous 10Kbytes.

Reported by:	rtm@lcs.mit.edu
PR:	260546
MFC after:	2 weeks
2022-05-04 13:58:22 -07:00
Rick Macklem f32bf50d43 nfsd: Fix handling of Open/Create for the pNFS server
When the MDS of a pNFS service receives an Open/Create
and the file already exists, it must do a Setattr of
size == 0.  Without this patch, this was eroneously
done via a VOP_SETAATR() call, which would set the
length of the MDS file to 0 (which is already is,
since all data lives on the DSs).

This patch fixes the problem by doing a nfsvno_setattr()
instead of VOP_SETATTR(), which knows to do a proxied
Setattr on the DSs.

For a non-pNFS server, the change has no effect, since
nfsvno_setattr() only does a VOP_SETATTR() for that case.

This was found during a recent IETF NFSv4 testing event.

MFC after:	2 weeks
2022-05-04 13:52:33 -07:00
Rick Macklem 70910e4b55 nfscl: Acquire a refcount on "cred" for mirrored pNFS RPCs
When the NFSv4.1/4.2 client is doing a pnfs mount to
mirrored DS(s), asynchronous threads are used to do the
RPCs against the DS(s) concurrently.  If a DS is slow
to reply, it is possible for the "cred" to be free'd
before the asynchronous thread is done with it, causing
a panic/crash.

This patch fixes the problem by acquiring a refcount on
the "cred" while it is being used by the asynchronous thread
for a DS RPC.  This bug was found during a recent IETF
NFSv4 testing event.

This bug only affects "pnfs" mounts to mirrored pNFS
servers.

MFC after:	2 weeks
2022-05-03 07:22:15 -07:00
Rick Macklem 271f6d52a6 nfsd: Fix session slot freeing for NFSv4.1/4.2
Without this patch the NFSv4.1/4.2 server erroneously
always frees session slot zero for callbacks.  This only
affects 4.1/4.2 mounts if the server has delegations
enabled or is a pNFS configuration.  Even for those
cases, the effect is mainly to only use slot 0 for
callbacks, serializing all of them.  There is a slight
chance that callbacks will fail if the client performs
them in a different order than received on the TCP
connection.

If this bug affects your server, you will see console
messages like:
  newnfs_request: Bad session slot

This patch fixes the problem.  Found during a recent
IETF NFSv4 testing event.

PR:	263728
MFC after:	2 weeks
2022-05-02 12:47:43 -07:00
Rick Macklem 47d75c29f5 nfsd: Add a sanity check to SecinfoNoname for file type
Robert Morris reported that, for the case of SecinfoNoname
with the Parent option, providing a non-directory could
cause a crash.

This patch adds a sanity check for v_type == VDIR for
this case, to avoid the crash.

Reported by:	rtm@lcs.mit.edu
PR:	260300
MFC after:	2 weeks
2022-05-01 13:41:31 -07:00
Rick Macklem 5218d82c81 nfscl: Add support for a NFSv4 AppendWrite RPC
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.

Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct, followed by the Write operation.

This patch modifies the NFSv4 client to use an Appendwrite
RPC, which does a Verify to check the file's size before
doing the Write.  This avoids the need for a Getattr RPC
to preceed this RPC and reduces the RPC count by half for
IO_APPEND writes, so long as the client knows the file's
size.

The nfsd structure was moved from the stack to be malloc()'d,
since the kernel stack limit was being exceeded.

While here, fix the types of a few variables, although
there should not be any semantics change caused by these
type changes.
2022-04-30 13:49:23 -07:00
Alan Somers 2f6362484c fusefs: use the fsname mount option if set
The daemon can specify fsname=XXX in its mount options.  If so, the file
system should report f_mntfromname as XXX during statfs.  This will show
up in the output of commands like mount and df.

Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35090
2022-04-29 11:10:03 -06:00
Alan Somers 45825a12f9 fusefs: fix FUSE_CREATE with file handles and fuse protocol < 7.9
Prior to fuse protocol version 7.9, the fuse_entry_out structure had a
smaller size.  But fuse_vnop_create did not take that into account when
working with servers that use older protocols.  The bug does not matter
for servers which don't use file handles or open flags (the only fields
affected).

PR:		263625
Submitted by:	Ali Abdallah <ali.abdallah@suse.com>
MFC after:	2 weeks
2022-04-28 15:13:09 -06:00
Mateusz Guzik 11c5495554 ext2: plug a set-but-not-used var
Sponsored by:	Rubicon Communications, LLC ("Netgate")
2022-04-19 12:45:57 +00:00
Alan Somers 3a1b3c6a1e fusefs: correctly handle servers that report too much data written
During a FUSE_WRITE, the kernel requests the server to write a certain
amount of data, and the server responds with the amount that it actually
did write.  It is obviously an error for the server to write more than
it was provided, and we always treated it as such, but there were two
problems:

* If the server responded with a huge amount, greater than INT_MAX, it
  would trigger an integer overflow which would cause a panic.

* When extending the file, we wrongly set the file's size before
  validing the amount written.

PR:		263263
Reported by:	Robert Morris <rtm@lcs.mit.edu>
MFC after:	2 weeks
Sponsored by:	Axcient
Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D34955
2022-04-18 18:59:10 -06:00
Rick Macklem 32c3e0f049 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for assorted functions
local to nfs_clrpcops.c.
Future commits will do the same for other functions.
2022-04-15 18:51:01 -07:00
Alan Somers 155ac516c6 fusefs: validate servers' error values
Formerly fusefs would pass up the stack any error value returned by the
fuse server.  However, some values aren't valid for userland, but have
special meanings within the kernel.  One of these, EJUSTRETURN, could
cause a kernel page fault if the server returned it in response to
FUSE_LOOKUP.  Fix by validating all errors returned by the server.

Also, fix a data lifetime bug in the FUSE_DESTROY test.

PR:		263220
Reported by:	Robert Morris <rtm@lcs.mit.edu>
MFC after:	3 weeks
Sponsored by:	Axcient
Reviewed by:	emaste
Differential Revision: https://reviews.freebsd.org/D34931
2022-04-15 13:57:32 -06:00
Rick Macklem 068fc05745 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_nget().
Future commits will do the same for other functions.
2022-04-14 16:15:56 -07:00
John Baldwin ac9c3c32c6 unionfs: Use __diagused for a variable only used in KASSERT(). 2022-04-13 16:08:20 -07:00
Rick Macklem 4ad3423bc2 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_loadattrcache().
Future commits will do the same for other functions.
2022-04-13 07:43:13 -07:00
Rick Macklem 5580e5bd71 nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_request().
Future commits will do the same for other functions.
2022-04-10 14:05:44 -07:00
Rick Macklem 38c3cf6aed nfscl: Clean up the code by removing unused arguments
The "void *stuff" (also called fstuff and dstuff) argument
was used by the Mac OSX port.  For FreeBSD, this argument
is always NULL, so remove it to clean up the code.

This commit gets rid of "stuff" for nfscl_postop_attr().
Future commits will do the same for other functions.
2022-04-09 18:53:25 -07:00
Rick Macklem c45d934f6b nfscl: Ansify a function header 2022-04-09 15:14:05 -07:00
Rick Macklem 21de450aa1 nfscl: Add support for a NFSv4 AppendWrite RPC
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.

Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct, followed by the Write operation.

This patch modifies nfscl_wcc_data() to optionally
acquire the file's size, for use with an AppendWrite.
Although the "stuff" arguments are always NULL
(these were used for the Mac OSX port and should be
cleared out someday), make the argument to
nfscl_wcc_data() explicitly NULL for clarity.

This patch does not cause any semantics change until
the AppendWrite is added in a future commit.
2022-04-08 13:59:05 -07:00
John Baldwin 771e4a8613 smbfs: Remove unused variable. 2022-04-07 17:01:28 -07:00
John Baldwin 9fe2867ce4 smbfs_rename: Move all references to flags under #ifdef notnow. 2022-04-07 17:01:28 -07:00
Alan Somers 3227325366 fusefs: fix two bugs regarding VOP_RECLAIM of the root inode
* We never send FUSE_LOOKUP for the root inode, since its inode number
  is hard-coded to 1.  Therefore, we should not send FUSE_FORGET for it,
  lest the server see its lookup count fall below 0.

* During VOP_RECLAIM, if we are reclaiming the root inode, we must clear
  the file system's vroot pointer.  Otherwise it will be left pointing
  at a reclaimed vnode, which will cause future VOP_LOOKUP operations to
  fail.  Previously we only cleared that pointer during VFS_UMOUNT.  I
  don't know of any real-world way to trigger this bug.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D34753
2022-04-06 16:16:52 -06:00
Warner Losh 393b7606f9 Unbreak the build: Also define NFSV42_OLDNPROCS here.
If nfsproto.h is included before nfsport.h, then NFSV42_OLDNPROCS would
be undefined when it is used in struct nfsstatsov1.

Sponsored by:		Netflix
2022-04-05 11:54:20 -06:00
Rick Macklem 330aa8acde nfscl: Add support for a NFSv4 AppendWrite RPC
For IO_APPEND VOP_WRITE()s, the code first does a
Getattr RPC to acquire the file's size, before it
can do the Write RPC.

Although NFS does not have an append write operation,
an NFSv4 compound can use a Verify operation to check
that the client's notion of the file's size is
correct before doing the Write operation.

This patch prepares the NFSv4 client for such an
RPC, which will be added in a future commit.

This patch does not cause any semantics change.
2022-04-05 08:11:37 -07:00
Gordon Bergling ef1534cad8 fusefs(5): Fix a typo in a source code comment
- s/accomodate/accommodate/

MFC after:	3 days
2022-04-02 14:56:21 +02:00
Rick Macklem c1970a7eba nfscl: Fix IO_APPEND writes from kernel space
Commit 867c27c23a modified the NFS client so that
it did IO_APPEND writes directly to the NFS server
bypassing the buffer cache, via a call to
nfs_directio_write().  Unfortunately, this (very old)
function assumed that the uio iov was for user space
addresses.  As such, a IO_APPEND VOP_WRITE() that
was for system space, such as ktrace(1) does, would
write bogus data.

This patch fixes nfs_directio_write() so that it
handles kernel space uio iovs.

Reported by:	bz
Tested by:	bz
MFC after:	2 weeks
2022-03-28 15:11:52 -07:00
Gordon Bergling c1ad8a39a1 nfsclient: Fix a typos in source code comments
- s/ony/only/

Obtained from:	NetBSD
MFC after:	3 days
2022-03-27 19:27:05 +02:00
Mateusz Guzik bb92cd7bcd vfs: NDFREE(&nd, NDF_ONLY_PNBUF) -> NDFREE_PNBUF(&nd) 2022-03-24 10:20:51 +00:00
Mateusz Guzik aeabf8d4b9 nullfs: hash insertion without vnode lock upgrade
Use the hash lock to serialize instead.

This enables shared-locked ".." lookups.

Reviewed by:	markj
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D34466
2022-03-19 10:47:10 +00:00
Mark Johnston c0b98fe16f fusefs: Initialize a pad word in the mknod message
Reported by:	Jenkins (KMSAN job)
Reviewed by:	asomers
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34593
2022-03-17 18:30:21 -04:00
Rick Macklem f37dc50d9f nfscl: Do not do a Lookup+Open for pNFS mounts
A NFSv4.1/4.2 pNFS mount needs to do a
separate Open+LayoutGet RPC, so do not do
a Lookup+Open RPC for these mounts.

The Lookup+Open RPCs are still disabled,
until further testing is done, so this patch
has no effect at this time.
2022-03-17 07:48:06 -07:00
Rick Macklem 57014f21e7 nfscl: Fix NFSv4.1/4.2 Lookup+Open RPC
Use of the Lookup+Open RPC is currently disabled,
due to a problem detected during testing.  This
patch fixes this problem.  The problem was that
nfscl_postop_attr() does not parse the attributes
if nd_repstat != 0.  It also would parse the
return status for the operation, where the
Lookup+Open code had already parsed it.

The first change in the patch does not make any
semantics change, but makes the code identical
to what is done later in the function, so that
it is apparent that the semantics should be the
same in both places.

Lookup+Open remains disabled while further
testing is being done, so this patch has no
effect at this time.
2022-03-13 13:15:12 -07:00
Mateusz Guzik 0134bbe56f vfs: prefix lookup and relookup with vfs_
Reviewed by:	imp, mckusick
Differential Revision:		https://reviews.freebsd.org/D34530
2022-03-13 14:44:39 +00:00
Rick Macklem 3fc3fe9091 nfsd: Do not exempt NFSv3 Fsinfo from the TLS check
The Fsinfo RPC is exempt from the check for
Kerberized NFS being required, as recommended
by RFC2623.  However, there is no reason to
exempt Fsinfo from the requirement to use TLS.

This patch fixes the code so that the exemption
only applies to Kerberized NFS and not
NFS-over-TLS.

This only affects NFS-over-TLS for an NFSv3
mount when it is required, but the client does
not do so.

MFC after:	1 month
2022-03-09 16:52:42 -08:00
Rick Macklem 1cedb4ea1a nfscl: Fix a use after free in nfscl_cleanupkext()
ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
  owp->nfsow_owner.  If we free that particular
  owner structure, than in subsequent comparisons
  "own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
  of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
  without the NFSCLSTATE mutex held, this could race with
  nfscl_cleanupkext().
  This could happen when the exclusive lock is held
  on the client, such as when delegations are being returned
  or when recovering from NFSERR_EXPIRED.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
    nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() so that it will never free more
    than the first matching element.  Normally there should only
    be one element in each list with a matching open/lock owner
    anyhow (but there might be a bug that results in a duplicate).
    This should guarantee that the FOREACH_SAFE loops in
    nfscl_cleanupkext() are adequate.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
    and nfscl_freelockowner(), if it is not already held.
    This serializes all of these calls with the ones done in
    nfscl_cleanup_common().

Reported by:	ler
Reviewed by:	markj
Tested by:	cy
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34334
2022-02-25 07:27:03 -08:00
Alan Somers e8553be9bc fusefs: fix a cached attributes bug during directory rename
When renaming a directory into a different parent directory, invalidate
the cached attributes of the new parent.  Otherwise, stat will show the
wrong st_nlink value.

MFC after:	1 week
Reviewed by:	ngie
Differential Revision: https://reviews.freebsd.org/D34336
2022-02-24 14:07:25 -07:00
Rick Macklem 06148d2251 Revert "nfscl: Fix a use after free in nfscl_cleanupkext()"
This reverts commit dd08b84e35.

cy@ reported a problem caused by this patch.  He will be
testing an alternate patch, but I'm reverting this one.
2022-02-24 07:01:03 -08:00
Jason A. Harmening fcb164742b unionfs: rework unionfs_getwritemount()
VOP_GETWRITEMOUNT() is called on the vn_start_write() path without any
vnode locks guaranteed to be held.  It's therefore unsafe to blindly
access per-mount and per-vnode data.  Instead, follow the approach taken
by nullfs and use the vnode interlock coupled with the hold count to
ensure the mount and the vnode won't be recycled while they are being
accessed.

Reviewed by:	kib (earlier version), markj, pho
Tested by:	pho
Differential Revision: https://reviews.freebsd.org/D34282
2022-02-23 22:10:02 -06:00
Rick Macklem dd08b84e35 nfscl: Fix a use after free in nfscl_cleanupkext()
ler@, markj@ reported a use after free in nfscl_cleanupkext().
They also provided two possible causes:
- In nfscl_cleanup_common(), "own" is the owner string
  owp->nfsow_owner.  If we free that particular
  owner structure, than in subsequent comparisons
  "own" will point to freed memory.
- nfscl_cleanup_common() can free more than one owner, so the use
  of LIST_FOREACH_SAFE() in nfscl_cleanupkext() is not sufficient.

I also believe there is a 3rd:
- If nfscl_freeopenowner() or nfscl_freelockowner() is called
  without the NFSCLSTATE mutex held, this could race with
  nfscl_cleanupkext().
  This could happen when the exclusive lock is held
  on the client, such as when delegations are being returned.

This patch fixes them as follows:
1 - Copy the owner string to a local variable before the
    nfscl_cleanup_common() call.
2 - Modify nfscl_cleanup_common() to return whether or not a
    free was done.
    When a free was done, do a goto to restart the loop, instead
    of using FOREACH_SAFE, which was not safe in this case.
3 - Acquire the NFSCLSTATE mutex in nfscl_freeopenowner()
    and nfscl_freelockowner(), if it not already held.
    This serializes all of these calls with the ones done in
    nfscl_cleanup_common().

Reported by:	ler
Reviewed by:	markj
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D34334
2022-02-22 14:21:43 -08:00
Mark Johnston c7cd607a4e msdosfs: Fix mounting when the device sector size is >512B
HugeSectors * BytesPerSec should be computed before converting
HugeSectors to a DEV_BSIZE-based count.

Fixes:	ba2c98389b ("msdosfs: sanity check sector count from BPB")
Reviewed by:	kib
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D34264
2022-02-14 10:06:47 -05:00
Jason A. Harmening 974efbb3d5 unionfs: fix typo in comment
I deleted the wrong word when writing up a comment in a prior change;
the covered vnode may be recursed during any unmount, not just forced
unmount.
2022-02-10 15:17:43 -06:00
Rick Macklem 17a56f3fab nfsd: Reply NFSERR_SEQMISORDERED for bogus seqid argument
The ESXi NFSv4.1 client bogusly sends the wrong value
for the csa_sequence argument for a Create_session operation.
RFC8881 requires this value to be the same as the sequence
reply from the ExchangeID operation most recently done for
the client ID.

Without this patch, the server replies NFSERR_STALECLIENTID,
which is the correct response for an NFSv4.0 SetClientIDConfirm
but is not the correct error for NFSv4.1/4.2, which is
specified as NFSERR_SEQMISORDERED in RFC8881.
This patch fixes this.

This change does not fix the issue reported in the PR, where
the ESXi client loops, attempting ExchangeID/Create_session
repeatedly.

Reported by:	asomers
Tested by:	asomers
PR:	261291
MFC after:	1 week
2022-02-09 15:17:50 -08:00
Gordon Bergling 8ea3ceda76 fs: fix a few common typos in source code comments
- s/quadradically/quadratically/
- s/persistant/persistent/

Obtained from:	NetBSD
MFC after:	3 days
2022-02-06 13:48:31 +01:00
Alan Somers 18ed2ce77a fusefs: fix the build without INVARIANTS after 00134a0789
MFC after:	2 weeks
MFC with:	00134a0789
Reported by:	se
2022-02-04 18:44:27 -07:00