Commit graph

4721 commits

Author SHA1 Message Date
Rick Macklem 98c788737f nfsclient: Delete unused function nfscl_getcookie()
The function nfscl_getcookie(), which is essentially the
same as ncl_getcookie(), is never called, so delete it.
This is probably cruft left over from the port of the
NFSv4 code to FreeBSD several years ago.

Found while modifying the code to better use the
directory offset cookies.

MFC after:	2 weeks
2022-01-27 15:30:26 -08:00
Mateusz Guzik 2a7e4cf843 Revert b58ca5df0b ("vfs: remove the now unused insmntque1")
I was somehow convinced that insmntque calls insmntque1 with a NULL
destructor. Unfortunately this worked well enough to not immediately
blow up in simple testing.

Keep not using the destructor in previously patched filesystems though
as it avoids unnecessary casts.

Noted by:	kib
Reported by:	pho
2022-01-27 16:32:22 +00:00
Mateusz Guzik d35991d327 nullfs: ansify fs/nullfs/null_subr.c 2022-01-27 01:01:45 +01:00
Mateusz Guzik 3150cf0c13 unionfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:57:37 +01:00
Mateusz Guzik 5ccdfdabc8 tmpfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:56:12 +01:00
Mateusz Guzik 4e91a0b9fe nullfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:47 +01:00
Mateusz Guzik ade1367ba8 fdescfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:38 +01:00
Mateusz Guzik 3af3e99ce4 devfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:30 +01:00
Mark Johnston 3d8562348c fusefs: Address -Wunused-but-set-variable warnings
Reviewed by:	asomers
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D33957
2022-01-20 08:25:00 -05:00
Alan Somers 89d57b94d7 fusefs: implement VOP_DEALLOCATE
MFC after:	Never
Reviewed by:	khng
Differential Revision: https://reviews.freebsd.org/D33800
2022-01-18 21:13:02 -07:00
Jason A. Harmening 39a2dc44f8 unionfs: allow vnode lock to be held shared during VOP_OPEN
do_execve() will hold the vnode lock shared when it calls VOP_OPEN(),
but unionfs_open() requires the lock to be held exclusive to
correctly synchronize node status updates.  This requirement is
asserted in unionfs_get_node_status().

Change unionfs_open() to temporarily upgrade the lock as is already
done in unionfs_close().  Related to this, fix various cases throughout
unionfs in which vnodes are not checked for reclamation following lock
upgrades that may have temporarily dropped the lock.  Also fix another
related issue in which unionfs_lock() can incorrectly add LK_NOWAIT
during a downgrade operation, which trips a lockmgr assertion.

Reviewed by:	kib (prior version), markj, pho
Reported by:	pho
Differential Revision: https://reviews.freebsd.org/D33729
2022-01-11 18:44:03 -08:00
Rick Macklem a91a57846b nfsd: Do not accept audit/alarm ACEs for the NFSv4 server
The UFS and ZFS file systems only support Allow/Deny ACEs
in the NFSv4 ACLs.  This patch does not allow the server
to parse Audit/Alarm ACEs.  The NFSv4 client is still
allowed to pase Audit/Alarm ACEs, since non-FreeBSD NFSv4
servers may use them.

This patch should not have a significant effect, since the
UFS and ZFS file systems will not handle these ACEs anyhow.
It simply serves as an additional "safety belt" for the
NFSv4 server.

MFC after:	2 weeks
2022-01-11 09:40:07 -08:00
Rick Macklem 5da9b3b011 Revert "nfscommon: Add arguments for support of the dacl attribute"
This reverts commit 0fa074b53e.

I now see that the implementation of the "dacl" operation
requires that the NFSv4 server to "automatic inheritance"
and I do not plan on doing this.  As such, this patch is
harmless, but unneeded.
2022-01-11 08:30:50 -08:00
Rick Macklem b1f80dfac9 Revert "nfscommon: Return NFSERR_ATTRNOTSUPP for AUDIT/ALARM ACEs"
This reverts commit f10dc28ec2.

The client should still be able to getfacl
audit and alarm ACEs, for non-FreeBSD NFSv4 servers.

A patch that only disables audit/alarm for the server
side will be committed to replace this patch.
2022-01-11 08:26:42 -08:00
Alexander Motin 3455c738ac nfsd: Reduce callouts rate.
Before this callouts were scheduled twice a seconds even if nfsd was
never used.  This reduces the rate to ~1Hz and only after nfsd first
started.

MFC after:	2 weeks
2022-01-09 13:14:23 -05:00
Konstantin Belousov aaaa4fb54e msdosfs: use mntfs vnode for pm_devvp
to prevent races with devfs VCHR vnode reclamation, same as it was
done for UFS.

Reported by:	pho
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 06:21:58 +02:00
Konstantin Belousov 41e85eeab9 msdosfs: on integrity error, fire a task to remount filesystem to ro
In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 06:20:48 +02:00
Konstantin Belousov b2e4b63584 msdosfs: add msdosfs_integrity_error()
A function to remount the filesystem from rw to ro on integrity error.
The work is performed in taskqueue to allow the call to be done from
almost arbitrary context where erronous state was detected.

Tested by:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 06:20:48 +02:00
Konstantin Belousov ba2c98389b msdosfs: sanity check sector count from BPB
We use sector count to size the FAT inuse bitset.  If sector count is
corrupted, kernel might be tricked into doing unbound allocation.
Ensure that the sector count does not exceed the actual volume size.

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov 65990b68a2 msdosfs: clusterfree() is used only in error handling cases
Change its return type to void, because its result is ignored in both
call sites.  Remove oldcnp argument as well, it is NULL always.

Suggested and reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov aec97963cd msdosfs: do no allow lookup to return vdp except for dot lookups
In collaboaration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov 1319c433f4 msdosfs: handle a case when non-dot lookup returned dvp
This means that filesystem is corrupted, there is a loop.

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov 2c9a1c22c3 msdosfs: take inusemap inconsistency as an error, not invariants violation
In other words, stop silently accepting freeing free cluster in
non-debug kernels, but return the error to the caller.  Modify callers
to handle errors from usemap_free().

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Konstantin Belousov 595ed4d767 msdosfs: handle inconsistently hashed denodes
It is possible, on the corrupted msdosfs volume, to have file which
denode inode number does not match the one calculated using directory
cluster.  Instead of asserting the condition as impossible, handle it
and return error, after reclaiming the aliased vnode.

In collaboration with:	pho
Reviewed by:	markj, mckusick
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-08 05:41:44 +02:00
Rick Macklem e4df1036f6 nfscl: Always invalidate buffers for append writes
kib@ reported a problem which was resolved by
reverting commit 867c27c23a, which changed the NFS
client to use direct RPCs to the server for
IO_APPEND writes.  He also spotted that the
code only invalidated buffer cache buffers
when they were marked NMODIFIED (had been
written into).

This patch modifies the NFS VOP_WRITE() to
always invalidate the buffer cache buffers
and pages for the file when IO_APPEND is
specified.  It also includes some cleanup
suggested by kib@.

Reported by:	kib
Tested by:	kib
Reviewed by:	kib
MFC after:	10 weeks
2022-01-06 14:18:36 -08:00
Jason A. Harmening 9e891d43f5 unionfs: implement VOP_SET_TEXT/VOP_UNSET_TEXT
The implementation simply passes the text ref to the appropriate
underlying vnode.  Without this, the default [un]set_text
implementation will only manage the text ref on the unionfs vnode,
causing it to be out of sync with the underlying filesystems and
potentially allowing corruption of executable file contents.
On INVARIANTS kernels, it also readily produces a panic on process
termination because the VM object representing the executable mapping
is backed by the underlying vnode, not the unionfs vnode.

PR:	251342
Reviewed by:	kib
Differential Revision: https://reviews.freebsd.org/D33611
2022-01-02 19:52:58 -08:00
Jason A. Harmening d877dd5767 unionfs: simplify writecount management
Use atomics to track the writecount granted to the underlying FS,
and avoid holding the vnode interlock while calling the underling FS'
VOP_ADD_WRITECOUNT().  This also fixes a WITNESS warning about nesting
the same lock type.  Also add comments explaining why we need to track
the writecount on the unionfs vnode in the first place.  Finally,
simplify writecount management to only use the upper vnode and assert
that we shouldn't have an active writecount on the lower vnode through
unionfs.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D33611
2022-01-02 19:52:58 -08:00
Konstantin Belousov 04fd468da0 mountmsdosfs(): some style
Reviewed by:	markj
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D33721
2022-01-02 22:25:07 +02:00
Alan Somers 398c88c758 fusefs: implement VOP_ALLOCATE
Now posix_fallocate will be correctly forwarded to fuse file system
servers, for those that support it.

MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D33389
2021-12-31 21:05:28 -07:00
Alan Somers 1613087a81 fusefs: fix .. lookups when the parent has been reclaimed.
By default, FUSE file systems are assumed not to support lookups for "."
and "..".  They must opt-in to that.  To cope with this limitation, the
fusefs kernel module caches every fuse vnode's parent's inode number,
and uses that during VOP_LOOKUP for "..".  But if the parent's vnode has
been reclaimed that won't be possible.  Previously we paniced in this
situation.  Now, we'll return ESTALE instead.  Or, if the file system
has opted into ".." lookups, we'll just do that instead.

This commit also fixes VOP_LOOKUP to respect the cache timeout for ".."
lookups, if the FUSE file system specified a finite timeout.

PR:		259974
MFC after:	2 weeks
Reviewed by:	pfg
Differential Revision: https://reviews.freebsd.org/D33239
2021-12-31 20:38:27 -07:00
Alan Somers 5169832c96 fusefs: copy_file_range must update file timestamps
If FUSE_COPY_FILE_RANGE returns successfully, update the atime of the
source and the mtime and ctime of the destination.

MFC after:	2 weeks
Reviewers:	pfg
Differential Revision: https://reviews.freebsd.org/D33159
2021-12-31 17:43:57 -07:00
Alan Somers 13d593a5b0 Fix a race in fusefs that can corrupt a file's size.
VOPs like VOP_SETATTR can change a file's size, with the vnode
exclusively locked.  But VOPs like VOP_LOOKUP look up the file size from
the server without the vnode locked.  So a race is possible.  For
example:

1) One thread calls VOP_SETATTR to truncate a file.  It locks the vnode
   and sends FUSE_SETATTR to the server.
2) A second thread calls VOP_LOOKUP and fetches the file's attributes from
   the server.  Then it blocks trying to acquire the vnode lock.
3) FUSE_SETATTR returns and the first thread releases the vnode lock.
4) The second thread acquires the vnode lock and caches the file's
   attributes, which are now out-of-date.

Fix this race by recording a timestamp in the vnode of the last time
that its filesize was modified.  Check that timestamp during VOP_LOOKUP
and VFS_VGET.  If it's newer than the time at which FUSE_LOOKUP was
issued to the server, ignore the attributes returned by FUSE_LOOKUP.

PR:		259071
Reported by:	Agata <chogata@moosefs.pro>
Reviewed by:	pfg
MFC after:	2 weeks
Differential Revision: https://reviews.freebsd.org/D33158
2021-12-31 17:38:42 -07:00
Fedor Uporov f1d5e2c862 Improve extents verification logic
Add functionality for extents validation inside the filesystem
extents block. The main logic is implemented under
ext4_validate_extent_entries() function, which verifies extents
or extents indexes depending of extent depth value.

PR:                     259112
Reported by:            Robert Morris
Reviewed by:            pfg
MFC after:              2 weeks
Differential Revision:  https://reviews.freebsd.org/D33375
2021-12-30 09:14:45 +03:00
Fedor Uporov ced2172822 Add more accurate check for root inode
Check that root inode has links and is directory.

PR:             259105
Reported by:    Robert Morris
MFC after:      2 weeks
2021-12-30 09:14:45 +03:00
Fedor Uporov bb9f1ba4b5 Add more accurate directory entries check
Rename ext2_dirbadentry() to ext2_check_direntry(). Add directory
entry inode value check, and call ext2_check_direntry() in all cases.
The dirchk sysctl is removed.

PR:                     259024,259041
Reported by:            Robert Morris
Reviewed by:            pfg
MFC after:              2 weeks
Differential Revision:  https://reviews.freebsd.org/D33374
2021-12-30 09:14:44 +03:00
Fedor Uporov 5034b44574 Remove unnecessary e2fs_first_dblock value check
MFC after:      2 weeks
2021-12-30 09:14:44 +03:00
Rick Macklem f10dc28ec2 nfscommon: Return NFSERR_ATTRNOTSUPP for AUDIT/ALARM ACEs
FreeBSD only supports Allow/Deny ACEs in NFSv4 ACLs.
As such, it does not make sense to parse Audit/Alarm
ACEs.  Modify nfsrv_dissectace() so that it returns
NFSERR_ATTRNOTSUPP if an Audit/Alarm ACE is found in
the ACL being parsed.  The code has been #ifdef notnow'd,
since Audit/Alarm ACEs might be supported someday.

This should not have significant impact, since FreeBSD
reports to clients that only Allow/Deny ACEs are
supported and an attempt to set one would have failed
anyhow.

MFC after:	2 weeks
2021-12-27 08:03:41 -08:00
Rick Macklem 0fa074b53e nfscommon: Add arguments for support of the dacl attribute
NFSv4.1/4.2 has an alternative to the acl attribute, called
dacl, that includes support for the ACL_ENTRY_INHERITED flag,
called NFSV4ACE_INHERITED in NFSv4.

This patch adds a dacl argument to nfsrv_buildacl(),
nfsrv_dissectacl() and nfsrv_dissectace(), so that they
will handle NFSV4ACE_INHERITED when dacl == true.

Since these functions are always called with dacl == false
for this patch, semantics should not have changed.
A future patch will add support for dacl.

MFC after:	2 weeks
2021-12-26 16:43:46 -08:00
Rick Macklem 744c2dc7dd rpc: Delete AUTH_NEEDS_TLS(_MUTUAL_HOST) auth_stat values
I thought that these new auth_stat values had been agreed
upon by the IETF NFSv4 working group, but that no longer
is the case.  As such, delete them and use AUTH_TOOWEAK
instead.  Leave the code that uses these new auth_stat
values in the sources #ifdef notnow, in case they are
defined in the future.

MFC after:	1 week
2021-12-23 14:31:53 -08:00
Rick Macklem b70042adfe nfscl: Check for mmap(2)'d file before doing direct output
Commit 867c27c23a modified the NFS client so that
it does IO_APPEND writes directly to the NFS server,
bypassing the buffer cache.  However, this could result
in stale data in client pages when the file is mmap(2)'d.
As such, the NFS client needs to call vm_object_is_active()
to check if the file is mmap(2)'d and only do direct
output if the file is not mmap(2)'d.

This patch adds this check.

Although a simple patch, I have given it a long MFC,
since the related commit 867c27c23a made a significant
semantics change and, as such, has a long MFC.

MFC after:	3 months
2021-12-20 13:10:26 -08:00
Rick Macklem 150da1e3cd nfscl: Partially revert commit 867c27c23a
Commit 867c27c23a enabled the n_directio_opens code
in open/close, which sets/clears NNONCACHE, for
IO_APPEND. This code should not be enabled unless
newnfs_directio_enable is non-zero.

This patch reverts that part of commit 867c27c23a.

A future patch that fixes the case where the
file that is being written IO_APPEND is mmap()'d.

MFC after:	3 months
2021-12-16 14:30:37 -08:00
Alan Somers b214fcceac Change VOP_READDIR's cookies argument to a **uint64_t
The cookies argument is only used by the NFS server.  NFSv2 defines the
cookie as 32 bits on the wire, but NFSv3 increased it to 64 bits.  Our
VOP_READDIR, however, has always defined it as u_long, which is 32 bits
on some architectures.  Change it to 64 bits on all architectures.  This
doesn't matter for any in-tree file systems, but it matters for some
FUSE file systems that use 64-bit directory cookies.

PR:             260375
Reviewed by:    rmacklem
Differential Revision: https://reviews.freebsd.org/D33404
2021-12-15 20:54:57 -07:00
Alan Somers 32fbc5d824 nfs: don't truncate directory cookies to 32-bits in the NFS server
In NFSv2, the directory cookie was 32-bits.  NFSv3 widened it to
64-bits and SVN r22521 widened the corresponding argument in
VOP_READDIR, but FreeBSD's NFS server continued to treat the cookies as
32-bits, and 0-extended to fill the field on the wire.  Nobody ever
noticed, because every in-tree file system generates cookies that fit
comfortably within 32-bits.

Also, have better type safety for txdr_hyper.  Turn it into an inline
function that type-checks its arguments.  Prevents warnings about
shift-count-overflow.

PR:		260375
MFC after:	2 weeks
Reviewed by:	rmacklem
Differential Revision: https://reviews.freebsd.org/D33404
2021-12-15 20:54:57 -07:00
Rick Macklem e0861304a7 nfscl: Handle CB_SEQUENCE not first op correctly
The check for "not first operation" in CB_SEQUENCE
was done after the slot, etc. was updated. This patch
moves the check to the beginning of CB_SEQUENCE
processing.

While here, also fix the check for "no CB_SEQUENCE operation first"
by moving the check to the beginning of callback operation parsing,
since the check was in a couple of the other operations, but
not all of them.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260412
MFC after:	2 weeks
2021-12-15 16:36:40 -08:00
Rick Macklem 867c27c23a nfscl: Change IO_APPEND writes to direct I/O
IO_APPEND writes have always been very slow over NFS, due to
the need to acquire an up to date file size after flushing
all writes to the NFS server.

This patch switches the IO_APPEND writes to use direct I/O,
bypassing the buffer cache.  As such, flushing of writes
normally only occurs when the open(..O_APPEND..) is done.
It does imply that all writes must be done synchronously
and must be committed to stable storage on the file server
(NFSWRITE_FILESYNC).

For a simple test program that does 10,000 IO_APPEND writes
in a loop, performance improved significantly with this patch.

For a UFS exported file system, the test ran 12x faster.
This drops to 3x faster when the open(2)/close(2) are done
for each loop iteration.
For a ZFS exported file system, the test ran 40% faster.

The much smaller improvement may have been because the ZFS
file system I tested against does not have a ZIL log and
does have "sync" enabled.

Note that IO_APPEND write performance is still much slower
than when done on local file systems.

Although this is a simple patch, it does result in a
significant semantics change, so I have given it a
large MFC time.

Tested by:	otis
MFC after:	3 months
2021-12-15 08:35:48 -08:00
Rick Macklem fe04c91184 nfscl: add a filesize limit check to nfs_allocate()
As reported in PR#260343, nfs_allocate() did not check
the filesize rlimit. This patch adds that check.

PR:	260343
Reviewed by:	asomers
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33422
2021-12-13 15:32:19 -08:00
Rick Macklem c302f889e2 nfsd: Limit parsing of layout errors to maxcnt bytes
This patch decrements maxcnt by the appropriate
number of bytes during parsing and checks to see
if there is data remaining.  If not, it just returns
from nfsrv_flexlayouterr() without further processing.
This prevents the tl pointer from running off the end
of the error data pointed at by layp, if there are
flaws in the data.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260293
MFC after:	2 weeks
2021-12-13 15:21:31 -08:00
Rick Macklem 24947b701d nfscl: Fix must_commit handling for mirrored pNFS mounts
For pNFS mounts to mirrored Flexible File layout pNFS servers,
the "must_commit" component in the nfsclwritedsdorpc
structure must be checked and the "must_commit" argument passed
into nfscl_doiods() must be updated.  Technically, only writes to
the DS with a writeverf change must be redone, but since this
occurrence will be rare, the must_commit argument to nfscl_doiosd()
is set to 1, so all writes to all DSs will be redone.

This bug would affect few, since use of mirrored pNFS servers
is rare and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots.

MFC after:	2 weeks
2021-12-12 15:40:30 -08:00
Rick Macklem ead50c94cb nfscl: Fix must_commit/writeverf handling for Direct I/O
Without this patch, the KASSERT(must_commit == 0,..) can be
triggered by the writeverf in the Direct I/O write reply changing.
This is not a situation that should cause a panic(). Correct
handling is to ignore the change in "writeverf" for Direct
I/O, since it is done with NFSWRITE_FILESYNC.

This patch modifies the semantics of the "must_commit"
argument slightly, allowing an initial value of 2 to indicate
that a change in "writeverf" should be ignored.
It also fixes the KASSERT()s.

This bug would affect few, since Direct I/O is not enabled
by default and "writeverf" rarely changes. Normally "writeverf"
only changes when a NFS server reboots, however I found the
bug when testing against a Linux 5.15.1 kernel nfsd, which
replied to a NFSWRITE_FILESYNC write with a "writeverf" of all
0x0 bytes.

MFC after:	2 weeks
2021-12-11 15:00:30 -08:00
Rick Macklem ab639f2398 nfscl: Check for an error return from nfsrv_getattrbits()
There were two places where the client code did not check
for a parse error return from nfsrv_getattrbits().

This patch fixes both of these cases.

Reported by:	rtm@lcs.mit.edu
Tested by:	rtm@lcs.mit.edu
PR:	260272
MFC after:	2 weeks
2021-12-09 14:32:22 -08:00