Commit Graph

2486 Commits

Author SHA1 Message Date
Warner Losh
6d83b38186 geom_io: Shift to pause_sbt to eliminate bogus min and update comment.
Update to eliminate bogus min to ensure 0 was never passed to
pause. Instead, request 1ms with an 'infinite' precision, which
defaults to whatever the underlying time counter can do. This should
ensure we run fairly quickly to start processing done events, while
still giving a small pause for the system to catch its breath. This rate
limiter still is less than ideal, and this commit doesn't change
that. It should really have no functional change: it just uses a better
interface to express the desired sleep.

Sponsored by:		Netflix
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D45316
2024-05-24 08:31:55 -06:00
Warner Losh
32f40fc983 geom: Add counts for enomem and pausing
Add counts for the number of requests that complete with ENOMEM as
kern.geom.nomem_count and the number of times we pause the g_down thread
to let the system recover as kern.geom.pause_count.

Sponsored by:		Netflix
Reviewed by:		kib
Differential Revision:	https://reviews.freebsd.org/D45309
2024-05-24 08:31:15 -06:00
Pawel Jakub Dawidek
56a8aca83a Stop treating size 0 as unknown size in vnode_create_vobject().
Whenever a file is created, the vnode_create_vobject() function will
try to determine its size by calling vn_getsize_locked(), as size 0
is ambiguous: it means either the file size is 0 or the file size
is unknown.

Introduce a special value for the size argument: VNODE_NO_SIZE.
Only when it is given will vnode_create_vobject() try to obtain
the file's size on its own.

Introduce a dedicated vnode_disk_create_vobject() for use by
g_vfs_open(), so we don't have to call vn_isdisk() in the common case
(for regular files).

Handle the case of mediasize==0 in g_vfs_open().

Reviewed by: alc, kib, markj, olce
Approved by: oshogbo (mentor), allanjude (mentor)
Differential Revision: https://reviews.freebsd.org/D45244
2024-05-23 06:08:14 +00:00
Warner Losh
1e84b85aad geom: Remove sysctl.h
These files don't need sysctl.h, so remove it.

Sponsored by:		Netflix
2024-05-22 16:24:11 -06:00
Ryan Libby
bd56aad33c buf: define and use BUF_DISOWNED
Implement an API where previously code was directly reaching into the
buf's internal lock.

Reviewed by:	mckusick, imp, kib, markj
Sponsored by:	Dell EMC Isilon
Differential Revision:	https://reviews.freebsd.org/D45249
2024-05-21 10:21:50 -07:00
Mariusz Zaborski
838d5ae6d8 geli: fix indentation
no functional changes
2024-05-19 15:37:07 +02:00
Mariusz Zaborski
4b3141f5d5 geli: allocate a UMA pool earlier
The functions g_eli_init_uma and g_eli_fini_uma are used to track
the number of devices in GELI. There is an issue where the g_eli_create
function may fail before g_eli_init_uma is called, yet
g_eli_fini_uma is still executed in the fail path. This can
incorrectly decrease the device count to zero, potentially leading to
the UMA pool being freed. Accessing the device after the pool has been
freed causes a system panic.

This commit resolves the issue by ensuring the device count is increased
earlier.

PR:		278828
Reported by:	Andre Albsmeier <mail@fbsd2.e4m.org>
Reviewed by:	asomers
MFC after:	3 days
Differential Revision:	https://reviews.freebsd.org/D45225
2024-05-19 14:53:17 +02:00
Poul-Henning Kamp
69956de36f Remove final cross-reference to GBDE
2024-05-07 07:40:20 +00:00
Poul-Henning Kamp
8d2d1d6516 Remove GBDE source files
2024-05-07 07:31:09 +00:00
Matthew Grooms
ea2d874cca geom_stripe: Cascade cantrim just like we do for gmirror
If any of the disks can support trim, cascade that up the
stack. Otherwise, trims won't pass through striped raid setups.
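
As a rough sketch of the cascade described above (userland C, with a hypothetical stripe_cantrim() helper and a plain array standing in for the per-disk state; this is not the geom_stripe code), the stripe would report trim support as soon as any component does:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: disk_cantrim[i] says whether component i
 * accepts trim requests.  The stripe advertises trim support if at
 * least one component does.
 */
bool
stripe_cantrim(const bool *disk_cantrim, size_t ndisks)
{
	for (size_t i = 0; i < ndisks; i++) {
		if (disk_cantrim[i])
			return (true);
	}
	return (false);
}
```

With three components of which only the second supports trim, stripe_cantrim() returns true, so trims can pass through the stripe.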

PR: 277673
Reviewed by: imp (minor style tweaks from bug report)
2024-05-03 09:03:31 -06:00
Ricardo Branco
78444b5ade glabel: Add support for Linux swap
Reviewed by: imp, kib
Pull Request: https://github.com/freebsd/freebsd-src/pull/1205
2024-04-28 22:39:47 -06:00
Alan Somers
3acf3feaa8 geli: add a read-only kern.geom.eli.use_uma_bytes sysctl
It reports the value of the g_eli_alloc_sz variable.  Allocations of
this size or less will use UMA.  Larger allocations will use malloc.
Since malloc is slower, it is useful for users to know this variable so
they can avoid such allocations.  For example, ZFS users can set
vfs.zfs.vdev.aggregation_limit to this value.

MFC after:	1 week
Sponsored by:	Axcient
Reviewed by:	markj, imp
Differential Revision: https://reviews.freebsd.org/D44904
2024-04-22 13:20:03 -06:00
Gordon Bergling
c0a01ee83d geom(4): Fix a typo in a source code comment
- s/cant/can't/

MFC after:	3 days
2024-04-21 09:49:44 +02:00
Mark Johnston
955f213fa2 graid3: Fix teardown in g_raid3_try_destroy()
Commit 33cb9b3c3a replaced a g_raid3_destroy_device() call with a
g_raid3_free_device() call, which was incorrect and could lead to a
panic if a RAID3 GEOM failed to start (e.g., due to missing disks).

Reported by:	graid3 tests
Fixes:		33cb9b3c3a ("graid3: Fix teardown races")
MFC after:	3 days
Sponsored by:	Klara, Inc.
2024-04-20 12:04:57 -04:00
Ricardo Branco
a8fd0a5f44 glabel: Remove support for old reiserfs
Reviewed by: imp, emaste
Pull Request: https://github.com/freebsd/freebsd-src/pull/1101
2024-04-19 16:48:28 -06:00
Eugene Grosbein
81092e92ea graid: unbreak Promise RAID1 with 4+ providers
Fix a problem in the graid implementation of Promise RAID1 created with 4+ disks.
Such an array generally works fine, but only until reboot, due to a bug
in the metadata writing code. Before the fix, the next taste erroneously created
RAID1E (kind of RAID10) instead of RAID1, hence graid used wrong offsets
for I/O operations.

The bug did not affect Promise RAID1 arrays with only 2 or 3 disks.

Reviewed by:	mav
MFC after:	3 days
2024-02-12 14:33:43 +07:00
Gordon Bergling
3fb6adb079 gjournal(8): Fix a typo in a sysctl description
- s/entires/entries/

MFC after:	5 days
2024-01-20 20:58:08 +01:00
Marius Strobl
53df7e58cc geom_redboot(4): Garbage collect disconnected driver
The last MIPS user was removed in c09981f1 2 years ago, and the last
ARM one in ff945277 as long as 5.5 years ago.
2024-01-14 22:22:21 +01:00
Marius Strobl
03e8d25b1f geom_map(4): Garbage collect disconnected driver
The last MIPS user was removed in c09981f1 2 years ago, and the last
ARM one in 58d5c511 as long as 5.5 years ago.
2024-01-14 22:22:21 +01:00
Alex
e4183f1745 geom/journal: Fix typos
Fixed a few typos.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/884
2023-12-27 23:42:03 -07:00
Mark Johnston
bbf221e3e8 geom: Report copyout() errors in g_ctl_ioctl_ctl()
Despite the name, req->serror is used in some cases to copy non-error
messages to userspace.  So, report errors when copying out so long as
they don't clobber an earlier error.
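
The "don't clobber an earlier error" rule can be sketched as follows. This is an illustration only: merge_copyout_error() is a hypothetical name, and the real function works on the request structure rather than on two plain integers.

```c
/*
 * Hypothetical helper: merge the result of a later copy-out step into
 * the error already recorded for the request.  The first error wins;
 * a copy-out failure is reported only if nothing failed before it.
 */
int
merge_copyout_error(int error, int copyout_error)
{
	if (error == 0)
		error = copyout_error;
	return (error);
}
```

If the request already failed, that error is returned unchanged; a copy-out failure surfaces only when the request was otherwise successful.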

Reviewed by:	mav, imp
MFC after:	1 week
Differential Revision:	https://reviews.freebsd.org/D43146
2023-12-25 21:04:01 -05:00
Warner Losh
fdafd315ad sys: Automated cleanup of cdefs and other formatting
Apply the following automated changes to try to eliminate
no-longer-needed sys/cdefs.h includes as well as now-empty
blank lines in a row.

Remove /^#if.*\n#endif.*\n#include\s+<sys/cdefs.h>.*\n/
Remove /\n+#include\s+<sys/cdefs.h>.*\n+#if.*\n#endif.*\n+/
Remove /\n+#if.*\n#endif.*\n+/
Remove /^#if.*\n#endif.*\n/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/types.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/param.h>/
Remove /\n+#include\s+<sys/cdefs.h>\n#include\s+<sys/capsicum.h>/

Sponsored by:		Netflix
2023-11-26 22:24:00 -07:00
Warner Losh
29363fb446 sys: Remove ancient SCCS tags.
Remove ancient SCCS tags from the tree, automated scripting, with two
minor fixups to keep things compiling. All the common forms in the tree
were removed with a perl script.

Sponsored by:		Netflix
2023-11-26 22:23:30 -07:00
Mitchell Horne
4eb861d362 shutdown: audit shutdown_post_sync event callbacks
Ensure they are all panic/debugger safe.

Most handlers for this event are for disk drivers/geom modules. There
is a mix of checks being used here (or not), so let's standardize on
checking the presence of the RB_NOSYNC flag.

This flag is set whenever:
 1. The kernel has panicked and kern.sync_on_panic=0*
 2. We reboot from within the kernel debugger (the "reset" command)
 3. Userspace requested it, e.g. by 'reboot -n'

Name the functions consistently.
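
A minimal sketch of the standardized check, assuming a handler receives the shutdown flags as an int. EX_RB_NOSYNC is a local stand-in for RB_NOSYNC from sys/reboot.h (its value here is an assumption), and post_sync_may_run() is a hypothetical name:

```c
#include <stdbool.h>

/* Stand-in for RB_NOSYNC from <sys/reboot.h>; the value is assumed. */
#define EX_RB_NOSYNC	0x004

/*
 * Hypothetical predicate for a shutdown_post_sync handler: skip any
 * work that touches the disks when the no-sync flag is set.
 */
bool
post_sync_may_run(int howto)
{
	return ((howto & EX_RB_NOSYNC) == 0);
}
```

A handler would return early whenever post_sync_may_run(howto) is false, which covers the three cases listed above.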

*This sysctl is tuned to zero by default, but its existence means that
these handlers can be executed after a panic, at the user's discretion.
IMO this use-case is implicitly understood to be risky, and we'd be
better off eliminating it altogether.

Reviewed by:    markj
Sponsored by:   The FreeBSD Foundation
MFC after:      1 week
Differential Revision:  https://reviews.freebsd.org/D42337
2023-11-23 12:07:42 -04:00
Mitchell Horne
f3dc172763 geom: sort includes for some files
This is not exhaustive, just done ahead of some upcoming changes to
these files.

Don't include sys/cdefs.h explicitly. No functional change intended.

Reviewed by:	imp, jhb
MFC after:	3 days
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D42335
2023-11-23 12:07:42 -04:00
Mark Johnston
33cb9b3c3a graid3: Fix teardown races
Port commit dc399583ba from g_mirror, which has an effectively
identical startup sequence.

This fixes a race that was occasionally causing panics during GEOM test
suite runs on riscv.

MFC after:	1 month
2023-11-02 14:35:37 -04:00
Warner Losh
5c9f0f72f4 gpart: Be less picky about GPT Tables in some cases
When we're recovering a damaged GPT, or when we're restoring a backed-up
partition table, don't enforce the 4k alignment for start/end LBAs.
The alignment is useful for 512e/4kn drives when we're creating a new partition
table or partition. However, when we're trying to fix / restore an old
partition, we shouldn't force this alignment, since in that case it's
more important to use the partition table as is than to optimize
performance by rounding (which isn't required by the standard).

MFC After:		1 week
Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D42359
2023-10-26 10:14:54 -06:00
Mark Johnston
56279238b0 geom_linux_lvm: Avoid removing from vg_list before inserting
PR:		266693
Reported by:	Robert Morris <rtm@lcs.mit.edu>
MFC after:	1 week
2023-10-17 11:19:05 -04:00
Dimitry Andric
479d224efc Fix geom build with clang 17 and KTR enabled
When building a kernel with clang 17 and KTR enabled, such as with the
LINT configurations, a -Werror warning is emitted:

    sys/geom/geom_io.c:145:31: error: use of logical '&&' with constant operand [-Werror,-Wconstant-logical-operand]
      145 |         if ((KTR_COMPILE & KTR_GEOM) && (ktr_mask & KTR_GEOM)) {
          |             ~~~~~~~~~~~~~~~~~~~~~~~~ ^
    sys/geom/geom_io.c:145:31: note: use '&' for a bitwise operation
      145 |         if ((KTR_COMPILE & KTR_GEOM) && (ktr_mask & KTR_GEOM)) {
          |                                      ^~
          |                                      &
    sys/geom/geom_io.c:145:31: note: remove constant to silence this warning

Replace the multiple uses of the expression with one macro, and in this
macro use "!= 0" to get a logical operand instead of a bitwise one.
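
The shape of such a macro might look like the sketch below. The EX_ names and values are stand-ins invented for this example, not the kernel's KTR definitions, and the mask is passed as an argument only to keep the example self-contained.

```c
/* Hypothetical stand-ins for the kernel's KTR definitions. */
#define EX_KTR_GEOM	0x00000400u
#define EX_KTR_COMPILE	(EX_KTR_GEOM)

/*
 * Comparing each masked value with 0 yields logical operands, so "&&"
 * no longer sees a bare bitwise constant on its left-hand side.
 */
#define EX_KTR_GEOM_ENABLED(mask)					\
	(((EX_KTR_COMPILE & EX_KTR_GEOM) != 0) &&			\
	    (((mask) & EX_KTR_GEOM) != 0))

int
geom_ktr_enabled(unsigned int mask)
{
	return (EX_KTR_GEOM_ENABLED(mask));
}
```

Every former use of the open-coded expression then becomes a single use of the macro.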

Reviewed by:	jhb
MFC after:	3 days
Differential Revision: https://reviews.freebsd.org/D41823
2023-09-17 14:13:09 +02:00
Zhenlei Huang
c941b82e1c geom_linux_lvm: Check the offset of physical volume header
The LVM label is stored in any of the first four sectors, and the
PV (physical volume) header is stored within the same sector following
the LVM label. The current implementation does not fully check the
offset of the PV header, so when attaching a badly formatted LVM PV the
kernel may crash due to an out-of-bounds memory read.
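
The kind of bounds check involved can be sketched as below. The helper name and its parameters are hypothetical, and the sizes are supplied by the caller rather than taken from the LVM on-disk format:

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical check: hdr_off is the offset of the PV header within
 * the sector that holds the LVM label.  The header must start after
 * the label and end inside the same sector.
 */
bool
pv_header_in_bounds(size_t hdr_off, size_t label_size, size_t hdr_size,
    size_t sector_size)
{
	if (hdr_size > sector_size)
		return (false);
	if (hdr_off < label_size)
		return (false);
	/* Written as a subtraction so hdr_off + hdr_size cannot wrap. */
	return (hdr_off <= sector_size - hdr_size);
}
```

An offset read from an untrusted disk that fails this check would be rejected before the header is dereferenced.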

PR:	266562
Reviewed by:	jhb
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36773
2023-08-22 17:20:10 +08:00
Warner Losh
78d146160d sys: Remove $FreeBSD$: one-line bare tag
Remove /^\s*\$FreeBSD\$$\n/
2023-08-16 11:55:17 -06:00
Warner Losh
031beb4e23 sys: Remove $FreeBSD$: one-line sh pattern
Remove /^\s*#[#!]?\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:58 -06:00
Warner Losh
685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh
2ff63af9b8 sys: Remove $FreeBSD$: one-line .h pattern
Remove /^\s*\*+\s*\$FreeBSD\$.*$\n/
2023-08-16 11:54:18 -06:00
Warner Losh
95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
John Baldwin
4c89c0127d g_raid concat: Fail requests to read beyond the end of the volume
Previously a debug kernel would trigger an assertion failure if an I/O
request attempted to read off the end of a concat volume, but a
non-debug kernel would use an invalid sub-disk to try to complete the
request eventually resulting in some sort of fault in the kernel.

Instead, turn the assertions into explicit checks that fail requests
beyond the end of the volume with EIO.  For requests which run over
the end of the volume, return a short request.
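
A sketch of the two cases, using a hypothetical helper that works on plain byte counts instead of bio structures. It returns -1 where the caller would fail the request with EIO:

```c
#include <stdint.h>

/*
 * Hypothetical helper: given a volume of volsize bytes and a request
 * for length bytes at offset, return the number of bytes that can be
 * served, or -1 if the request starts at or beyond the end of the
 * volume (the caller would fail it with EIO).
 */
int64_t
concat_request_length(uint64_t volsize, uint64_t offset, uint64_t length)
{
	if (offset >= volsize)
		return (-1);
	if (length > volsize - offset)
		length = volsize - offset;	/* short request */
	return ((int64_t)length);
}
```

For a 1000-byte volume, a 200-byte request at offset 900 is shortened to 100 bytes, while any request at offset 1000 or later is rejected.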

PR:		257838
Reported by:	Robert Morris <rtm@lcs.mit.edu>
Reviewed by:	emaste
MFC after:	1 week
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D41222
2023-08-04 16:41:05 -07:00
Marius Strobl
4ef1c6f75d base: Remove support for the VTOC8 partitioning scheme
The removal of the sparc64 support in February 2020 obsoleted the
VTOC8 partitioning scheme as no other FreeBSD platform makes use
of it. Moreover, the code is bitrotting as nothing defines e.g.
LOADER_VTOC8_SUPPORT any more and, thus, should go now, too. With
this change, the following commits are reverted as far as VTOC8
is concerned and insofar as parts haven't already been deleted
along with prior sparc64 removals:
094fcb157d
a7d366e958
ba8d50d08b

The alignment example d9711c28ef
added to the VTOC8 section of gpart.8 is folded into the MBR one.

This should finally conclude the deorbit of sparc64-specific bits.

        We had joy, we had fun
        we ran Unix on a Sun.
        But that source and the song
        of FreeBSD have all gone.

Credits to Michael Bueker for the original "Unix on a Sun" and Rod
McKuen for the "Seasons in the Sun" lyrics.
2023-07-26 13:16:12 +02:00
santhoshkumar-mani
d3eb9d3db3 bios: Don't keep sending BIO_FLUSH after first ENOTSUPP.
When a storage device reports that it does not support cache flush, the
GEOM disk layer by default returns ENOTSUPP in response to a BIO_FLUSH
command.

On AWS, local volumes do not advertise themselves as having write-cache
enabled.  When they are selected for L3 on all HDD nodes, the L3
subsystem may inadvertently kick these L3 devices if a BIO_FLUSH command
fails with an ENOTSUPP return code.  The fix is to make GEOM disk return
success (0) when this condition occurs and add a sysctl to make this
error handling config-driven.

Reviewed by: imp
Pull Request: https://github.com/freebsd/freebsd-src/pull/710
2023-07-01 11:14:49 -06:00
Warner Losh
b61a573019 spdx: The BSD-2-Clause-NetBSD identifier is obsolete, drop -NetBSD
The SPDX folks have obsoleted the BSD-2-Clause-NetBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:04 -06:00
Warner Losh
4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Ed Maste
00172f3416 geom: use bool for one-bit wide bit-field
A signed one-bit wide bit-field can take only the values 0 and -1.  Clang 16
introduced a warning that "implicit truncation from 'int' to a one-bit
wide bit-field changes value from 1 to -1".  Fix by using C99 bool.
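
The difference can be seen in a small userland sketch. The structs and helper names are hypothetical and not taken from GEOM:

```c
#include <stdbool.h>

/* Hypothetical structs for illustration; not taken from GEOM. */
struct flag_signed {
	signed int on:1;	/* one signed bit: values 0 and -1 */
};

struct flag_bool {
	bool on:1;		/* one bool bit: values 0 and 1 */
};

/* Store a truth value in the bool bit-field and read it back. */
int
read_back_bool(bool v)
{
	struct flag_bool f = { .on = v };

	return (f.on);
}

/* Store the only nonzero value the signed field can hold. */
int
read_back_signed(void)
{
	struct flag_signed f = { .on = -1 };

	return (f.on);
}
```

The bool field reads back as 1 when set, so comparisons such as `f.on == 1` behave as expected, whereas the signed field's only nonzero value is -1.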

Reported by:	Clang, via dim
Reviewed by:	dim
Sponsored by:	The FreeBSD Foundation
2023-04-17 15:43:00 -04:00
Alan Somers
9309a460b2 Implement GEOM::rotation_rate for gmirror
If all of the mirror's children have the same rotation rate, report
that.  But if they have mixed rotation rates, or if any child has an
unknown rotation rate, report "Unknown".
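
The rule reads naturally as a small reduction over the children. The sketch below assumes rotation rates are plain integers and uses EX_RR_UNKNOWN (value assumed) as a stand-in for the "unknown" marker; mirror_rotation_rate() is a hypothetical name:

```c
#include <stddef.h>

/* Stand-in for an "unknown rotation rate" marker; the value is assumed. */
#define EX_RR_UNKNOWN	0

/*
 * Report the common rotation rate of all children, or unknown if the
 * rates differ, if any child is unknown, or if there are no children.
 */
int
mirror_rotation_rate(const int *child_rates, size_t nchildren)
{
	if (nchildren == 0)
		return (EX_RR_UNKNOWN);
	for (size_t i = 0; i < nchildren; i++) {
		if (child_rates[i] == EX_RR_UNKNOWN ||
		    child_rates[i] != child_rates[0])
			return (EX_RR_UNKNOWN);
	}
	return (child_rates[0]);
}
```

Two 7200 rpm children yield 7200; mixing 7200 with 5400, or with an unknown child, yields the unknown marker.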

MFC after:	2 weeks
Sponsored by:	Axcient
Reviewed by:	imp
Differential Revision: https://reviews.freebsd.org/D39458
2023-04-10 10:27:10 -06:00
Mark Johnston
fd02d0bc14 graid3: Pre-allocate the timeout event structure
As in commit 2f1cfb7f63 ("gmirror: Pre-allocate the timeout event
structure"), graid3 must avoid M_WAITOK allocations in callout handlers.

Reported by:	graid3 regression tests
MFC after:	2 weeks
2023-03-30 13:38:15 -04:00
Ed Maste
87bb53cb53 gvinum: correct assertions
Pointer addresses are always >= 0.  Assert that the pointed-to value is >= 0
instead.

PR:		207855, 207856
Reviewed by:	imp
Reported by:	David Binderman
Sponsored by:	The FreeBSD Foundation
Differential Revision:	https://reviews.freebsd.org/D37677
2023-03-21 10:03:12 -04:00
Ed Maste
6c7bc93da6 g_part_ebr: always create "compat" aliases
The "canonical" EBR partition names like `ada0s4+00002081` are not
particularly meaningful.  The "compat" aliases share the same namespace
as the parent MBR, resulting in user-friendly names like `ada0s6`.
These names are consistent with the way Linux names EBR partitions.

We previously provided a sysctl kern.features.geom_part_ebr_compat
(enabled by default) to control the "compat" names.  Remove the sysctl
and always create the aliases.

Relnotes: yes
Reviewed by: cem, imp
Sponsored by: The FreeBSD Foundation
Differential Revision: https://reviews.freebsd.org/D38812
2023-03-01 13:44:01 -05:00
Konstantin Belousov
2555f175b3 Move kstack_contains() and GET_STACK_USAGE() to MD machine/stack.h
Reviewed by:	jhb
Sponsored by:	The FreeBSD Foundation
MFC after:	1 week
Differential revision:	https://reviews.freebsd.org/D38320
2023-02-02 00:59:26 +02:00
Zhenlei Huang
bd5d9037c5 GEOM: Remove redundant NULL pointer check before g_free()
Reviewed by:	melifaro, pjd, imp
Approved by:	kp (mentor)
Differential Revision:	https://reviews.freebsd.org/D37779
2022-12-28 23:34:09 +08:00
Zhenlei Huang
2e543af13a geom_part: Fix potential integer overflow when checking size of the table
`hdr_entries` and `hdr_entsz` are both uint32_t as defined in the UEFI spec.
The current spec sets no upper limit on the number of partition
entries or the size of a partition entry, so a malicious
or corrupted GPT header read from an untrusted source may contain a large
entry count or entry size.
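
To illustrate the hazard, the sketch below multiplies the two fields both ways. The function names and the EX_TABLE_MAX cap are invented for this example and are not the geom_part code:

```c
#include <stdint.h>

/* Arbitrary illustrative cap on the table size, in bytes. */
#define EX_TABLE_MAX	(4u * 1024u * 1024u)

/* The unsafe form: the 32-bit product wraps modulo 2^32. */
uint32_t
table_size_wrapping(uint32_t hdr_entries, uint32_t hdr_entsz)
{
	return (hdr_entries * hdr_entsz);
}

/*
 * Widen to 64 bits before multiplying so the product cannot wrap,
 * then return 0 if the result exceeds the cap.
 */
uint64_t
table_size_checked(uint32_t hdr_entries, uint32_t hdr_entsz)
{
	uint64_t size = (uint64_t)hdr_entries * hdr_entsz;

	return (size > EX_TABLE_MAX ? 0 : size);
}
```

With both fields set to 0x10000 the wrapping product is 0, which would slip past a naive size check, while the widened product is 2^32 and is rejected.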

PR:		266548
Reviewed by:	oshogbo, cem, imp, markj
Approved by:	kp (mentor)
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D36709
2022-12-21 09:04:30 +08:00
Alan Somers
05d0f4308c Don't panic when tasting a disk with sectorsize=0
This can sometimes happen with broken HDDs.

MFC after:	2 weeks
Sponsored by:	Axcient
Reviewed by:	markj
Differential Revision: https://reviews.freebsd.org/D37313
2022-11-09 10:21:12 -07:00
Zhenlei Huang
5be5d0d5cb geom_part: Check number of GPT entries and size of GPT entry
The current specification does not set an upper limit on the number of
partition entries or the size of a partition entry. In
799eac8c3d Andrey V. Elsukov introduced a
limit of 4k on the maximum number of GPT entries, but only for the write
routine (gpart create). When attaching disks whose number of GPT
entries exceeds the limit, or disks with a large partition entry
size, it is still possible to exhaust kernel memory.

1. Reuse the limit of the maximum number of partition entries.
2. Limit the maximum size of GPT entry to 1k.

In the current specification (2.10) the size of a GPT entry is 128 *
2^n where n >= 0, and the remaining (size - 128) bytes are reserved. 1k
should be sufficient for the foreseeable future.
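
The two limits can be sketched as a validation helper. The names are hypothetical, "4k" and "1k" are assumed to mean 4096 and 1024, and the power-of-two test follows from the 128 * 2^n rule quoted above rather than from the actual patch:

```c
#include <stdbool.h>
#include <stdint.h>

#define EX_GPT_MAX_ENTRIES	4096u	/* "4k", assumed to mean 4096 */
#define EX_GPT_MIN_ENTSZ	128u
#define EX_GPT_MAX_ENTSZ	1024u	/* "1k", assumed to mean 1024 */

/* Hypothetical check of the two GPT header fields against the limits. */
bool
gpt_table_geometry_ok(uint32_t hdr_entries, uint32_t hdr_entsz)
{
	if (hdr_entries > EX_GPT_MAX_ENTRIES)
		return (false);
	if (hdr_entsz < EX_GPT_MIN_ENTSZ || hdr_entsz > EX_GPT_MAX_ENTSZ)
		return (false);
	/* 128 * 2^n is a power of two, so exactly one bit may be set. */
	return ((hdr_entsz & (hdr_entsz - 1)) == 0);
}
```

Under these assumptions the largest accepted table is 4096 entries of 1024 bytes, or 4 MiB, which bounds the memory needed when attaching a disk.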

PR:		266548
Discussed with:	imp
Reviewed by:	markj
MFC after:	1 month
Differential Revision:	https://reviews.freebsd.org/D36717
2022-10-18 11:03:02 -04:00