Commit graph

403 commits

Author SHA1 Message Date
Konstantin Belousov 6d79564fe3 devfs_allocv(): style
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2024-05-12 04:13:00 +03:00
Konstantin Belousov d3efbe0132 cdevpriv(9): add iterator
Reviewed by:	christos
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D44469
2024-03-23 08:59:00 +02:00
Konstantin Belousov 5c41d888de kcmp(2): implement for devfs files
Compare not vnodes, which are different between mount points, but
actual cdev referenced by the devfs node.

Reviewed by:	brooks, markj
Sponsored by:   The FreeBSD Foundation
MFC after:      1 week
Differential revision:  https://reviews.freebsd.org/D43518
2024-01-24 07:11:26 +02:00
Gordon Bergling 7cf293536e devfs(5): Fix a typo in a source code comment
- s/interpeted/interpreted/

MFC after:	3 days
2024-01-20 17:25:45 +01:00
Warner Losh 29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixup to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
Jason A. Harmening 67864268da devfs: add integrity asserts for cdevp_list
It's possible for misuse of cdev KPIs or for bugs in devfs itself to
result in e.g. a cdev object's container being freed while still on the
global list used to populate each devfs mount; see PR 273418 for a
recent example.

Since a node may be marked inactive well before it is reaped from the
list, add a new flag solely to track list membership, and employ it in
some basic list integrity assertions to catch bad actors.

Discussed with:	kib, mjg
MFC after:	1 week
2023-09-21 11:51:12 -05:00
Doug Rabson b5c4616582 Fix MNT_IGNORE for devfs, fdescfs and nullfs
The MNT_IGNORE flag can be used to mark certain filesystem mounts so
that utilities such as df(1) and mount(8) can filter out those mounts by
default. This can be used, for instance, to reduce the noise from
running container workloads inside jails which often have at least three
and sometimes as many as ten mounts per container.

The flag is supplied by the nmount(2) system call and is recorded so
that it can be reported by statfs(2). Unfortunately several filesystems
override the default behaviour and mask out the flag, defeating its
purpose. This change preserves the MNT_IGNORE flag for those filesystems
so that it can be reported correctly.

MFC after:	1 week
2023-08-26 12:08:37 +01:00
Warner Losh 95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
Warner Losh 4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Stefan Eßer 88a795e80c sys/fs: do not report blocks allocated for synthetic file systems
The pseudo file systems (devfs, fdescfs, procfs, etc.) report total
and available blocks and inodes despite being synthetic with no
underlying storage device to which those values could be applied.

The current code of these file systems tends to report a fixed number
of total blocks but no free blocks, and in the case of procfs,
libprocfs, linsysfs also no free inodes.

This can be irritating in e.g. the "df" output, since 100% of the
resources seem to be in use, but it can also create warnings in
monitoring tools used for capacity management.

This patch makes these file systems return the same value for the
total and free parameters, leading to 0% in use being displayed by
"df". Since there is no resource that can be exhausted, this appears
to be a sensible result.

Reviewed by:	mckusick
Differential Revision:	https://reviews.freebsd.org/D39442
2023-04-25 09:59:15 +02:00
Mateusz Guzik 829f0bcb5f vfs: add the concept of vnode state transitions
To quote from a comment above vput_final:
<quote>
* XXX Some filesystems pass in an exclusively locked vnode and strongly depend
* on the lock being held all the way until VOP_INACTIVE. This in particular
* happens with UFS which adds half-constructed vnodes to the hash, where they
* can be found by other code.
</quote>

As is there is no mechanism which allows filesystems to denote that a
vnode is fully initialized, consequently problems like the above are
only found the hard way(tm).

Add rudimentary support for state transitions, which in particular allow
to assert the vnode is not legally unlocked until its fate is decided
(either construction finishes or vgone is called to abort it).

The new field lands in a 1-byte hole, thus it does not grow the struct.

Bump __FreeBSD_version to 1400077

Reviewed by:	kib (previous version)
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D37759
2022-12-26 17:35:12 +00:00
Mateusz Guzik 5b5b7e2ca2 vfs: always retain path buffer after lookup
This removes some of the complexity needed to maintain HASBUF and
allows for removing injecting SAVENAME by filesystems.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D36542
2022-09-17 09:10:38 +00:00
Mateusz Guzik ad5e1f9c2d devfs: stop taking the interlock in devfs_delete
It buys nothing now that vhold does not require it.
2022-09-14 22:51:42 +00:00
Mateusz Guzik a1c555f48b devfs: retire the unused DEVFS_DEL_VNLOCKED flag 2022-09-14 22:47:53 +00:00
Mateusz Guzik 497240def8 Retire clone_drain_lock
It is only ever xlocked in drain_dev_clone_events and the only consumer of
that routine does not need it -- eventhandler code already makes sure the
relevant callback is no longer running.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D36268
2022-08-20 09:44:05 +00:00
Gordon Bergling 8ea3ceda76 fs: fix a few common typos in source code comments
- s/quadradically/quadratically/
- s/persistant/persistent/

Obtained from:	NetBSD
MFC after:	3 days
2022-02-06 13:48:31 +01:00
Konstantin Belousov 66c5fbca77 insmntque1(): remove useless arguments
Also remove once-used functions to clean up after failed insmntque1(),
which were destructor callbacks in previous life.

Reviewed by:	markj
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D34071
2022-01-31 16:49:08 +02:00
Mateusz Guzik 2a7e4cf843 Revert b58ca5df0b ("vfs: remove the now unused insmntque1")
I was somehow convinced that insmntque calls insmntque1 with a NULL
destructor. Unfortunately this worked well enough to not immediately
blow up in simple testing.

Keep not using the destructor in previously patched filesystems though
as it avoids unnecessary casts.

Noted by:	kib
Reported by:	pho
2022-01-27 16:32:22 +00:00
Mateusz Guzik 3af3e99ce4 devfs: stop using insmntque1
It adds nothing of value over insmntque.
2022-01-27 00:54:30 +01:00
Mateusz Guzik 3ffcfa599e vfs: add vop_stdadd_writecount_nomsync
This avoids needing to inspect the mount point every time.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D33125
2021-11-26 12:06:08 +00:00
Mateusz Guzik 2b68eb8e1d vfs: remove thread argument from VOP_STAT
and fo_stat.
2021-10-11 13:22:32 +00:00
Mateusz Guzik b4a58fbf64 vfs: remove cn_thread
It is always curthread.

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D32453
2021-10-11 13:21:47 +00:00
Mark Johnston 243b324f96 devfs: Avoid comparison with an uninitialized var in devfs_fp_check()
devvn_refthread() will initialize *devp only if it succeeds, so check for
success before comparing with fp->f_data.  Other devvn_refthread()
callers are careful to do this.

Reported by:	KMSAN
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D30068
2021-05-03 13:24:30 -04:00
Konstantin Belousov d485c77f20 Remove #define _KERNEL hacks from libprocstat
Make sys/buf.h, sys/pipe.h, sys/fs/devfs/devfs*.h headers usable in
userspace, assuming that the consumer has an idea what it is for.
Unhide more material from sys/mount.h and sys/ufs/ufs/inode.h,
sys/ufs/ufs/ufsmount.h for consumption of userspace tools, with the
same caveat.

Remove unacceptable hack from usr.sbin/makefs which relied on sys/buf.h
being unusable in userspace, where it override struct buf with its own
definition.  Instead, provide struct m_buf and struct m_vnode and adapt
code to use local variants.

Reviewed by:	mckusick
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D28679
2021-02-21 11:38:21 +02:00
Mateusz Guzik 3bc17248d3 devfs: fix use count leak when using TIOCSCTTY
by matching devfs_ctty_ref

Fixes: 3b44443626 ("devfs: rework si_usecount to track opens")
2021-02-09 01:54:21 +00:00
Edward Tomasz Napierala 4ddb3cc597 devfs(4): defer freeing until we drop devmtx ("cdev")
Before r332974 the old code would sometimes cause a rare lock order
reversal against pagequeue, which looked roughly like this:

witness_checkorder()
__mtx_lock-flags()
vm_page_alloc()
uma_small_alloc()
keg_alloc_slab()
keg_fetch-slab()
zone_fetch-slab()
zone_import()
zone_alloc_bucket()
uma_zalloc_arg()
bucket_alloc()
uma_zfree_arg()
free()
devfs_metoo()
devfs_populate_loop()
devfs_populate()
devfs_rioctl()
VOP_IOCTL_APV()
VOP_IOCTL()
vn_ioctl()
fo_ioctl()
kern_ioctl()
sys_ioctl()

Since r332974 the original problem no longer exists, but it still
makes sense to move things out of the - often congested - lock.

Reviewed By:	kib, markj
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D27334
2020-12-29 13:47:36 +00:00
Mateusz Guzik 586ee69f09 fs: clean up empty lines in .c and .h files 2020-09-01 21:18:40 +00:00
Conrad Meyer 0ac9e27ba9 devfs: Abstract locking assertions
The conversion was largely mechanical: sed(1) with:

  -e 's|mtx_assert(&devmtx, MA_OWNED)|dev_lock_assert_locked()|g'
  -e 's|mtx_assert(&devmtx, MA_NOTOWNED)|dev_lock_assert_unlocked()|g'

The definitions of these abstractions in fs/devfs/devfs_int.h are the
only non-mechanical change.

No functional change.
2020-08-12 00:32:31 +00:00
Mateusz Guzik 3b44443626 devfs: rework si_usecount to track opens
This removes a lot of special casing from the VFS layer.

Reviewed by:	kib (previous version)
Tested by:	pho (previous version)
Differential Revision:	https://reviews.freebsd.org/D25612
2020-08-11 14:27:57 +00:00
Mateusz Guzik ca423b858b devfs: bool -> int
Fixes buildworld after r364069
2020-08-10 11:46:39 +00:00
Mateusz Guzik 7b19bddac8 devfs: save on spurious relocking for devfs_populate
Tested by:	pho
2020-08-10 10:36:43 +00:00
Mateusz Guzik f8935a96d1 devfs: use cheaper lockmgr entry points
Tested by:	pho
2020-08-10 10:36:10 +00:00
Mateusz Guzik f9c13ab856 devfs: use vget_prep/vget_finish
Tested by:	pho
2020-08-10 10:35:47 +00:00
Mateusz Guzik d292b1940c vfs: remove the obsolete privused argument from vaccess
This brings argument count down to 6, which is passable without the
stack on amd64.
2020-08-05 09:27:03 +00:00
Mateusz Guzik 11c345b18f devfs: fix a vnode use-after-free in devfs_ioctl
The vnode to be replaced was read with a shared lock, meaning 2 racing threads
can find the same one.

While here clean it up a little bit.
2020-07-04 06:27:28 +00:00
Thomas Munro f270658873 vfs: track sequential reads and writes separately
For software like PostgreSQL and SQLite that sometimes reads sequentially
while also writing sequentially some distance behind with interleaved
syscalls on the same fd, performance is better on UFS if we do
sequential access heuristics separately for reads and writes.

Patch originally by Andrew Gierth in 2008, updated and proposed by me with
his permission.

Reviewed by:	mjg, kib, tmunro
Approved by:	mjg (mentor)
Obtained from:	Andrew Gierth <andrew@tao11.riddles.org.uk>
Differential Revision:	https://reviews.freebsd.org/D25024
2020-06-21 08:51:24 +00:00
Pawel Biernacki 7029da5c36 Mark more nodes as CTLFLAG_MPSAFE or CTLFLAG_NEEDGIANT (17 of many)
r357614 added CTLFLAG_NEEDGIANT to make it easier to find nodes that are
still not MPSAFE (or already are but aren’t properly marked).
Use it in preparation for a general review of all nodes.

This is non-functional change that adds annotations to SYSCTL_NODE and
SYSCTL_PROC nodes using one of the soon-to-be-required flags.

Mark all obvious cases as MPSAFE.  All entries that haven't been marked
as MPSAFE before are by default marked as NEEDGIANT

Approved by:	kib (mentor, blanket)
Commented by:	kib, gallatin, melifaro
Differential Revision:	https://reviews.freebsd.org/D23718
2020-02-26 14:26:36 +00:00
Mateusz Guzik f1fa1ba3d0 Fix up various vnode-related asserts which did not dump the used vnode 2020-02-03 14:25:32 +00:00
Kyle Evans 6a5abb1ee5 Provide O_SEARCH
O_SEARCH is defined by POSIX [0] to open a directory for searching, skipping
permissions checks on the directory itself after the initial open(). This is
close to the semantics we've historically applied for O_EXEC on a directory,
which is UB according to POSIX. Conveniently, O_SEARCH on a file is also
explicitly undefined behavior according to POSIX, so O_EXEC would be a fine
choice. The spec goes on to state that O_SEARCH and O_EXEC need not be
distinct values, but they're not defined to be the same value.

This was pointed out as an incompatibility with other systems that had made
its way into libarchive, which had assumed that O_EXEC was an alias for
O_SEARCH.

This defines compatibility O_SEARCH/FSEARCH (equivalent to O_EXEC and FEXEC
respectively) and expands our UB for O_EXEC on a directory. O_EXEC on a
directory is checked in vn_open_vnode already, so for completeness we add a
NOEXECCHECK when O_SEARCH has been specified on the top-level fd and do not
re-check that when descending in namei.

[0] https://pubs.opengroup.org/onlinepubs/9699919799/

Reviewed by:	kib
Differential Revision:	https://reviews.freebsd.org/D23247
2020-02-02 16:34:57 +00:00
Mateusz Guzik 45757984f8 vfs: consistently use size_t for buflen around VOP_VPTOCNP 2020-02-01 20:34:43 +00:00
Mateusz Guzik b249ce48ea vfs: drop the mostly unused flags argument from VOP_UNLOCK
Filesystems which want to use it in limited capacity can employ the
VOP_UNLOCK_FLAGS macro.

Reviewed by:	kib (previous version)
Differential Revision:	https://reviews.freebsd.org/D21427
2020-01-03 22:29:58 +00:00
Mateusz Guzik 6fa079fc3f vfs: flatten vop vectors
This eliminates the following loop from all VOP calls:

while(vop != NULL && \
    vop->vop_spare2 == NULL && vop->vop_bypass == NULL)
        vop = vop->vop_default;

Reviewed by:	jeff
Tesetd by:	pho
Differential Revision:	https://reviews.freebsd.org/D22738
2019-12-16 00:06:22 +00:00
Mateusz Guzik abd80ddb94 vfs: introduce v_irflag and make v_type smaller
The current vnode layout is not smp-friendly by having frequently read data
avoidably sharing cachelines with very frequently modified fields. In
particular v_iflag inspected for VI_DOOMED can be found in the same line with
v_usecount. Instead make it available in the same cacheline as the v_op, v_data
and v_type which all get read all the time.

v_type is avoidably 4 bytes while the necessary data will easily fit in 1.
Shrinking it frees up 3 bytes, 2 of which get used here to introduce a new
flag field with a new value: VIRF_DOOMED.

Reviewed by:	kib, jeff
Differential Revision:	https://reviews.freebsd.org/D22715
2019-12-08 21:30:04 +00:00
Kyle Evans 1b50b999f9 tty: implement TIOCNOTTY
Generally, it's preferred that an application fork/setsid if it doesn't want
to keep its controlling TTY, but it could be that a debugger is trying to
steal it instead -- so it would hook in, drop the controlling TTY, then do
some magic to set things up again. In this case, TIOCNOTTY is quite handy
and still respected by at least OpenBSD, NetBSD, and Linux as far as I can
tell.

I've dropped the note about obsoletion, as I intend to support TIOCNOTTY as
long as it doesn't impose a major burden.

Reviewed by:	bcr (manpages), kib
Differential Revision:	https://reviews.freebsd.org/D22572
2019-11-30 20:10:50 +00:00
Mateusz Guzik a02cab334c devfs: introduce a per-dev lock to protect ->si_devsw
This allows bumping threadcount without taking the global devmtx lock.

In particular this eliminates contention on said lock while using bhyve
with multiple vms.

Reviewed by:	kib
Tested by:	markj
MFC after:	2 weeks
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22548
2019-11-30 16:46:19 +00:00
Mateusz Guzik 1fccb43c39 vfs: change si_usecount management to count used vnodes
Currently si_usecount is effectively a sum of usecounts from all associated
vnodes. This is maintained by special-casing for VCHR every time usecount is
modified. Apart from complicating the code a little bit, it has a scalability
impact since it forces a read from a cacheline shared with said count.

There are no consumers of the feature in the ports tree. In head there are only
2: revoke and devfs_close. Both can get away with a weaker requirement than the
exact usecount, namely just the count of active vnodes. Changing the meaning to
the latter means we only need to modify it on 0<->1 transitions, avoiding the
check plenty of times (and entirely in something like vrefact).

Reviewed by:	kib, jeff
Tested by:	pho
Differential Revision:	https://reviews.freebsd.org/D22202
2019-11-20 12:05:59 +00:00
Mateusz Guzik 4c9ba39aea devfs: use MNTK_NOMSYNC
Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D22009
2019-10-13 15:41:47 +00:00
Konstantin Belousov e3fdd051f9 devfs_vptocnp(): correct the component name when node is not at top.
Node' cdp.si_name is the full path as provided by make_dev(9), it
should not be returned by VOP_VPTOCNP() when only the last component
is requested.  Use the dirent entry instead.

With this note, handling of VDIR and VCHR nodes only differs in
handling of root vnode, which simplifies and unifies the logic.

Reported by:	Li, Zhichao1 <Zhichao_Li1@Dell.com>
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
2019-10-11 18:41:24 +00:00
Mateusz Guzik 559ac49d41 devfs: add root vnode caching
See r353150.

Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21646
2019-10-06 22:16:55 +00:00
Mateusz Guzik dfa8dae493 devfs: plug redundant bwillwrite avoidance
vn_write already checks for vnode type to see if bwillwrite should be called.

This effectively reverts r244643.

Reviewed by:	kib
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D21905
2019-10-05 17:44:33 +00:00