Commit graph

316 commits

Author SHA1 Message Date
Warner Losh 99c14fb99f cam: Drop periph lock when completing I/O with ENOMEM status
When biofinish calls g_io_deliver with an error of ENOMEM, that kicks
off the slowdown protocol, forcing I/O to go through g_down rather than
be directly dispatch. One of the side effects is that the I/O is
resubmitted, so the start routines get called recursively, leading to a
recursive lock panic. Rather than make the periph lock recursive, drop
and reacquire the lock around such calls to biofinish.

For nda, this happens only when we can't allocate space to construct a
TRIM. For ada and da, this is only for certain ZONE operations.

Sponsored by:		Netflix
Reviewed by:		gallatin
Differential Revision:	https://reviews.freebsd.org/D45310
2024-05-24 08:32:04 -06:00
Andriy Gapon c01af41c3c ata_da: add quirk to disable NCQ TRIM for Samsung 860/870 SSDs
NCQ TRIM for Samsung 860/870 SSDs results in data corruption on systems
with some SATA controllers.

This can be easily reproduced using ZFS which uses TRIM and is able to
detect block content changes.

Linux bug report for this issue:
 https://bugzilla.kernel.org/show_bug.cgi?id=201693

Since at present we can not limit a quirk based on the contorller / SIM,
apply the quirk in all cases.

Reviewed by:	imp
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D43961
2024-02-19 12:08:12 +02:00
Warner Losh a6cef61766 ada: Another NCQ Trim instability drive
The Seagate IronWolf 110 SATA SSD drive has been reported to be unstable
with NCQ trim enabled.

PR: 264139
Sponsored by:		Netflix
2024-02-18 22:18:38 -07:00
Alexander Motin 519b24f029 CAM: Replace random sbuf_printf() with cheaper cat/putc. 2023-11-22 18:04:05 -05:00
Alexander Motin 1b44079584 CAM: Use sbuf_new_for_sysctl() in more places
There is no need to allocate buffer, worry about overflows, etc.

MFC after:	2 weeks
2023-11-22 15:33:48 -05:00
Warner Losh 2ffd30f7ee cam: Remove left-over sys/cdefs.h in sys/cam
These weren't removed when $FreeBSD$ was removed. They aren't needed and
now are a style(9) nonconformity.

Sponsored by:		Netflix
2023-11-06 12:20:23 -07:00
Warner Losh fd9a4a67d0 cam: Minor opt_cam.h cleanup
sys/cam/cam.h includes opt_cam.h, so none of the clients need to do
this. cam.h does all the right dancing to conditionally include
opt_cam.h only when it makes sense. It generally only matters when
cam_debug.h is included (it must be included before that). Many of the
stray opt_cam.h includes were after cam_debug.h which would be a problem
were it not included in cam/cam.h. The other users of CAM options that
aren't debug all already include cam/cam.h.

Also trim unneeded sys/cdefs.h files from the files touched.

Sponsored by:		Netflix
2023-11-06 10:47:15 -07:00
Zhenlei Huang d24729b2fd cam/ata: Postpone removal of two compat sysctls until 15
Prefer UNMAPPEDIO and ROTATING from flags sysctl. See
 1. aeab0812e6 (Add flags sysctl to ada)
 2. cf3ff63e55 (Convert unmappedio over to a flag)
 3. 96eb32bf0f (Convert rotating to a flag bit)

Reviewed by:	imp, ken, #cam
MFC after:	immediately (we want this in 14.0)
Differential Revision:	https://reviews.freebsd.org/D42402
2023-11-02 13:14:40 +08:00
Warner Losh 685dc743dc sys: Remove $FreeBSD$: one-line .c pattern
Remove /^[\s*]*__FBSDID\("\$FreeBSD\$"\);?\s*\n/
2023-08-16 11:54:36 -06:00
Warner Losh 95ee2897e9 sys: Remove $FreeBSD$: two-line .h pattern
Remove /^\s*\*\n \*\s+\$FreeBSD\$$\n/
2023-08-16 11:54:11 -06:00
John Baldwin b2c44f1fc1 cam: Remove non-sbuf announce/denounce proto and xport ops
Reviewed by:	mav, imp
Sponsored by:	Chelsio Communications
Differential Revision:	https://reviews.freebsd.org/D41264
2023-08-01 15:25:38 -07:00
Warner Losh 9db2db6bf6 cam/ata: Migrate to modern uintXX_t from u_intXX_t
As per https://lists.freebsd.org/archives/freebsd-scsi/2023-July/000257.html
move to the modern uintXX_t.

MFC After:	3 days
Sponsored by:	Netflix
2023-07-24 21:32:56 -06:00
Warner Losh 4d846d260e spdx: The BSD-2-Clause-FreeBSD identifier is obsolete, drop -FreeBSD
The SPDX folks have obsoleted the BSD-2-Clause-FreeBSD identifier. Catch
up to that fact and revert to their recommended match of BSD-2-Clause.

Discussed with:		pfg
MFC After:		3 days
Sponsored by:		Netflix
2023-05-12 10:44:03 -06:00
Warner Losh fd02926a68 cam: Properly mask out the status bits to get completion code
ccb_h.status has two parts: the actual status and some addition bits to
indicate additional information. It must be masked before comparing
against completion codes. Add new inline function cam_ccb_success to
simplify this to test whether or not the request succeeded. Most of the
code already does this, but a few places don't (the rest likely should
be converted to use cam_ccb_status and/or cam_ccb_success, but that's
for another day). This caused at least one bug in recognizing devices
behind a SATA port multiplexer, though some of these checks were
fine with the special knowledge of the code paths involved.

PR:			270459
Sponsored by:		Netflix
MFC After:		1 week (and maybe a EN requst)
Reviewed by:		ken, mav
Differential Revision:	https://reviews.freebsd.org/D39572
2023-04-15 16:32:41 -06:00
Alexander Motin 90bcc81bc3 Delay GEOM disk_create() until CAM periph probe completes.
Before this patch CAM periph drivers called both disk_alloc() and
disk_create() same time on periph creation.  But then prevented disks
from opening until the periph probe completion with cam_periph_hold().
As result, especially if disk misbehaves during the probe, GEOM event
thread, triggered to taste the disk, got blocked on open attempt,
potentially for a long time, unable to process other events.

This patch moves disk_create() call from periph creation to the end of
the probe. To allow disk_create() calls from non-sleepable CAM contexts
some of its duties requiring memory allocations are moved either back
to disk_alloc() or forward to g_disk_create(), so now disk_alloc() and
disk_add_alias() are the only disk methods that require sleeping.  If
disk fails during the probe disk_create() may just be skipped, going
directly to disk_destroy().  Other method calls during that time are
just ignored.  Since GEOM may now see the disks after CAM bus scan is
already completed, introduce per-periph boot hold functions. Enclosure
driver already had such mechanism, so just generalize it.

Reviewed by:	imp
MFC after:	1 month
Sponsored by:	iXsystems, Inc.
Differential Revision:	https://reviews.freebsd.org/D35784
2022-07-14 16:17:36 -04:00
Mitchell Horne 489ba22236 kerneldump: remove physical argument from d_dumper
The physical address argument is essentially ignored by every dumper
method. In addition, the dump routines don't actually pass a real
address; every call to dump_append() passes a value of zero for
physical.

Reviewed by:	markj
MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D35173
2022-05-13 10:42:48 -03:00
Warner Losh 1907e1c07c ada: Move comment
Move the comment about releasing ccb before periph to adaprobedone()
where it belongs.

Sponsored by:		Netflix
2022-05-04 16:54:38 -06:00
Warner Losh 6c8ab086fe ada: Retry commands with retries left on CAM_SEL_TIMEOUT
The AHCI and ATA SIMs will return CAM_SEL_TIMEOUT when an underlying
device has stopped responding. This is usually seen after a timeouted
out command and can be a transient event. Rather than fail the
peripheral immediately after seeing this, queue a retry. For transient
events, this allows drives to continue to provide data, though with some
added latency, just like we do when we have some other kind of retriable
error. If the error isn't transient (the drive is truly gone), then
we'll discover that eventually and fail the transaction and invalidate
the drive like we do today.

This helps us avoid a panic at the end of camperiphfree when
CAM_PERIPH_NEW_DEV_FOUND is set. However, the deferred callback should
be queued to xpt_async_td instead of being made inline there. This issue
will be solved in a different patch that does that. PR 263703.

This also helps us avoid another bug where we can drop all references to
the device (causing us to go through camperiphfree and destroy the path)
while we have an I/O pending in the ata_da state machine (usually in
state ADA_STATE_RAHEAD with ATA_SETFEATURES ATA_SF_ENAB_RCACHE
command). It's not clear why the reference that we take out to do the
reprobe isn't effective at blocking this. By retrying this condition,
though we avoid this bug (at least more often, I don't have a good
reproduction test case, I just see this panic a few times a month at
work on systems that have transient disk errors on ahci connected SATA
SSDs). PR 263704. It's too soon to know how much this helps us avoid
this bug.

Sponsored by:		Netflix
Differential Revision:	https://reviews.freebsd.org/D34977
2022-05-01 11:08:56 -06:00
Warner Losh ae1955cd67 adaasync: Harmonize with daasync
We should call cam_periph_async() always, like SCSI does. This routine
is supposed to be more of a catch-all.

cam_periph_async() only does actions for AC_LOST_DEVICE. It ignores all
other events (today), but this may not always be true. So this is a nop
change.

Drop in a 'break' so we don't fall through unnecessarily.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D35057
2022-04-26 11:01:39 -06:00
Warner Losh ccaec73d0b ada: Eliminate dead code
We never use the cgd that we get from the XPT_GDEV_TYPE call. Prior to
9a6844d55f we used it to determine if READ AHEAD or WRITE CACHING was
supported. However, all that information was moved into adasetflags so
we no longer need to this since it's cached in the softc and updated
with the IDENTIFY data changes automatically.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D35039
2022-04-25 12:55:04 -06:00
Warner Losh 9d899bbcb7 cam: Small reorg of ata xpt async code
Use a switch rather than a nested if to simplify the async event
processing code. No functional changes intended.

Sponsored by:		Netflix
Reviewed by:		mav
Differential Revision:	https://reviews.freebsd.org/D35038
2022-04-25 12:55:04 -06:00
Warner Losh b43cfe7171 ada/da: Borrow comment from nda about cleanup
Remove a XXX comment and replace it with a more accurate comment about
what happens to I/O queued to the hardware.

Sponsored by:		Netflix
2022-04-24 15:11:56 -06:00
Warner Losh 48ae3f4f64 ata/nvme: Add comment
Steal the comment from daonninvalidate about the call to disk_gone().

Sponsored by:		Netflix
2022-04-24 15:11:52 -06:00
Alexander Motin 38f8addaab CAM: Replicate e0ceec676d from da to ada and nda.
MFC after:	1 week
2022-04-23 20:15:17 -04:00
Gordon Bergling 49dace1d46 cam: Fix typos in source code comments
- s/paniced/panicked/

MFC after:	3 days
2022-04-02 10:13:35 +02:00
Warner Losh 4762aa57ad ata_xpt: Rename probe_softc to aprobe_softc
Since both scsi_xpt and ata_xpt use the same name for the softc, this
can lead to problems in gdb. Avoid the issue by renaming the ata
probe_softc to aprobe_softc as has been done for the aprobe in
0f280cbd0a. This was overlooked at the time.

Sponsored by:		Netflix
MFC After:		2 weeks
2022-01-14 17:21:09 -07:00
Andriy Gapon 75bc7150f4 add and use defintions for ATA power modes
Those can be returned by CHECK POWER MODE command (0xe5).
Note that some of the definitions duplicate definitions for Extended
Power Conditions.

MFC after:	2 weeks
Differential Revision:	https://reviews.freebsd.org/D33646
2022-01-11 15:41:38 +02:00
Alexander Motin 0e5c50bf60 cam: Relax callouts precisions.
On large systems even relatively rare callouts may fire many times
per second.  This should allow them to aggregate better, since we do
not require any precision when polling for media change, etc.

MFC after:	2 weeks
2022-01-07 12:59:16 -05:00
Andriy Gapon 15910dc0bc adaspindown: check disk power mode before sending IDLE command
If a disk is already in STANDBY mode, then setting IDLE mode can
actually spin it up.

Reviewed by:	mav
MFC after:	4 weeks
Differential Revision:	https://reviews.freebsd.org/D33588
2021-12-24 11:02:22 +02:00
Jessica Clarke f350bc1dd3 ada: Fix intra-object buffer overread of identify strings
In the ATA/ATAPI spec these are space-padded fixed-length strings with
no NUL-terminator (and byte swapped). When performing the identify we
call ata_param_fixup to swap the bytes back to be in order, strip any
leading/trailing spaces and coalesce consecutive spaces, padding with
NULs. However, if the input has no padding spaces, the fixed-up strings
are still not NUL-terminated. This causes two issues. The first is that
strlcpy will truncate the string by replacing the final byte with a NUL.
The second is that strlcpy will keep reading src until it finds a NUL in
order to calculate the return value, which is defined as the length of
src (so that callers can then compare it with the dsize input to see if
the input string was truncated), thereby reading past the end of the
buffer and into whatever adjacent fields are in the structure. In
practice there's a NUL byte somewhere in the structure, but on CHERI
with subobject bounds enabled in the compiler this overread will be
detected and trap as a bounds violation.

Note this matches ata_xpt's aprobedone, which does a bcopy to a
malloc'ed buffer and manually NUL-terminates it for the CAM path's
device's serial_num.

Found by:	CHERI
Reviewed by:	imp, scottl
Differential Revision:	https://reviews.freebsd.org/D32567
2021-10-27 18:38:37 +01:00
Alexander Motin 303477d325 cam(4): Mark all sysctls as CTLFLAG_MPSAFE.
This code does not use Giant lock for very long time.

MFC after:	2 weeks
2021-08-10 20:07:19 -04:00
Edward Tomasz Napierala b0cf8194c2 cam: revert half of 75b5caa08e
This turns debugging printf() into a KASSERT().
It's for ATA for now; SCSI will came later.

Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D31380
2021-08-08 13:24:19 +00:00
Young Xiao 431ddd9436 Fix potential NULL pointer dereference of device physical path
In ata_dev_advinfo() and nvme_dev_advinfo(), if the physical path is
being stored and there is a malloc failure (malloc(9) is called with
M_NOWAIT), we could wind up in a situation where the device's
physpath_len is set to the length the user provided, but the physpath
itself is NULL.

If another context then comes in to fetch the physical path value, we
would wind up trying to memcpy a NULL pointer into the caller's buffer.

So, set the physpath_len to 0 when we free the physpath on entry into
the store case for the physical path.  Reset the length to a non-zero
value only after we've successfully malloced a buffer to hold it.

This code mirrors scsi_xpt.c does already as well.

Signed-off-by:	Young Xiao <92siuyang@gmail.com>
Reviewed by:	imp
PR:		238014
2021-07-13 14:13:21 -06:00
Edward Tomasz Napierala 6f147a0734 cam: enable kern.cam.ada.enable_uma_ccbs by default
This makes the ada(4) driver use UMA for its CCBs.  While it's
da(4) counterpart needs some more testing, this one seems to be
safe now.

Please let me know via email if you notice any suspicious kernel
messages,

Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D30567
2021-07-07 09:40:34 +01:00
Edward Tomasz Napierala 75b5caa08e cam: turn KASSERTs into printfs for now
It looks like I've missed a couple of places where we don't clear
stack-allocated CCBs.  Don't panic when that happens, just print
a warning.

This is a temporary measure until I get those cases fixed.

Reviewed By:	markj
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision: https://reviews.freebsd.org/D30296
2021-05-16 20:19:19 +01:00
Edward Tomasz Napierala 0f206cc912 cam: add missing zeroing of a stack-allocated CCB.
This could cause a panic at boot.

Reported By:	Shawn Webb <shawn.webb AT hardenedbsd.org>
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
2021-05-16 11:38:26 +01:00
Edward Tomasz Napierala 3394d4239b cam: allocate CCBs from UMA for SCSI and ATA IO
This patch makes it possible for CAM to use small CCBs allocated
from an periph-specific UMA zone instead of the usual, huge ones.
The end result is that CCBs issued via da(4) take 544B (size of
ccb_scsiio) instead of the usual 2kB (size of 'union ccb', ~1.5kB,
rounded up by malloc(9)).  For ATA it's 272B.  We waste less
memory, we avoid zeroing the unused 1kB, and it should be easier
to allocate those CCBs in low memory conditions.  It should also
be possible to use uma_zone_reserve(9) to improve behaviour
in low memory conditions even further.

Note that this does not change the size, or the layout, of CCBs
as such.  CCBs get allocated in various different ways, in particular
on the stack, and I don't want to redo all that.  Instead, this
provides an opt-in mechanism for the periph to declare "my start()
callback is fine with receiving a CCB allocated from this UMA zone".
In other words, most of the code works exactly as it used to; the
change only happens to IOs issued by xpt_run_allockq(), which
is - conveniently - pretty much all that matters for performance.

The reason for doing it this way is that it's pretty small, localized
change, and can be implemented gradually and iteratively: take a
periph, make sure its start() callback only casts the CCBs it takes
to a particular type of CCB, for example ccb_scsiio, and that it only
casts CCBs returned by cam_periph_getccb() to that type, then add UMA
zone for that size, and declare it safe to XPT.

This is disabled by default.  Set 'kern.cam.ada.enable_uma_ccbs=1'
and 'kern.cam.da.enable_uma_ccbs=1' tunables to enable it.  Testing
is welcome; I will flip the default to enable in two weeks from now.

Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D28674
2021-05-15 12:03:49 +01:00
Edward Tomasz Napierala ec5325dbca cam: make sure to clear even more CCBs allocated on the stack
This is my second pass, this time over all of CAM except
for the SCSI target bits.  There should be no functional
changes.

Reviewed By:	imp
Sponsored by:	NetApp, Inc.
Sponsored by:	Klara, Inc.
Differential Revision:	https://reviews.freebsd.org/D29549
2021-04-11 15:24:22 +01:00
John Baldwin e07ac3f2fd cam: Don't permit crashdumps on non-pollable devices.
If a disk's SIM doesn't support polling, then it can't be used to
store crashdumps.  Leave d_dump NULL in that case so that dumpon(8)
fails gracefully rather than having dumps fail at crash time.

Reviewed by:	scottl, mav, imp
MFC after:	2 weeks
Sponsored by:	Chelsio
Differential Revision:	https://reviews.freebsd.org/D28454
2021-02-11 13:52:18 -08:00
Marius Strobl eae35125e9 ada(4): remove remainder of MD geometry translation support
This was missed in 9cf738228d and
r359718 respectively.
2020-12-25 20:20:54 +01:00
Konstantin Belousov cd85379104 Make MAXPHYS tunable. Bump MAXPHYS to 1M.
Replace MAXPHYS by runtime variable maxphys. It is initialized from
MAXPHYS by default, but can be also adjusted with the tunable kern.maxphys.

Make b_pages[] array in struct buf flexible.  Size b_pages[] for buffer
cache buffers exactly to atop(maxbcachebuf) (currently it is sized to
atop(MAXPHYS)), and b_pages[] for pbufs is sized to atop(maxphys) + 1.
The +1 for pbufs allow several pbuf consumers, among them vmapbuf(),
to use unaligned buffers still sized to maxphys, esp. when such
buffers come from userspace (*).  Overall, we save significant amount
of otherwise wasted memory in b_pages[] for buffer cache buffers,
while bumping MAXPHYS to desired high value.

Eliminate all direct uses of the MAXPHYS constant in kernel and driver
sources, except a place which initialize maxphys.  Some random (and
arguably weird) uses of MAXPHYS, e.g. in linuxolator, are converted
straight.  Some drivers, which use MAXPHYS to size embeded structures,
get private MAXPHYS-like constant; their convertion is out of scope
for this work.

Changes to cam/, dev/ahci, dev/ata, dev/mpr, dev/mpt, dev/mvs,
dev/siis, where either submitted by, or based on changes by mav.

Suggested by: mav (*)
Reviewed by:	imp, mav, imp, mckusick, scottl (intermediate versions)
Tested by:	pho
Sponsored by:	The FreeBSD Foundation
Differential revision:	https://reviews.freebsd.org/D27225
2020-11-28 12:12:51 +00:00
Alexander Motin cd500da924 Fix sbuf_finish() error code check in user-space.
MFC after:	1 week
Sponsored by:	iXsystems, Inc.
2020-10-13 23:29:06 +00:00
Mateusz Guzik 27dcd3d90b cam: clean up empty lines in .c and .h files 2020-09-01 22:13:48 +00:00
Adrian Chadd 69c41b7071 [ata_da] remove duplicate definition; it trips up ye olde gcc-6 on mips32
Checked first with: irc
2020-05-27 02:10:09 +00:00
Conrad Meyer 9982b3ee29 cam: ANSIfy 0-argument function definitions
No functional change.

Reviewed by:	imp
Differential Revision:	https://reviews.freebsd.org/D24854
2020-05-16 14:33:08 +00:00
Warner Losh 0f280cbd0a Make the ata probe* and xpt* routines aprobe* and axpt* respectively.
Often, in traiging core files, one only has a traceback of where a
panic occurred. We have probe* and xpt* routines that live in both the
scsi and ata layers with identical names. To make one or the other
stand out, prefix all the probe and xpt routines in ata with an
'a'. I've left the scsi ones alone since they were there first and are
more numerous. I also rejected using #define to do this as being too
confusing. I chose this method because the CAM name for the probe
device was already 'aprobe'.

Normally, this doesn't matter because file scope protects one from
interfering with the other. However, due to the indirect nature of
CAM's state machine, you don't know if the following traceback is
SCSI or ATA:
	xpt_done
	probedone
	xpt_done_process
	xpt_done_td
	fork_exit

nvme and mmc already have unique names.

MFC: 1 week
Differential revision: https://reviews.freebsd.org/D24825
2020-05-13 00:18:44 +00:00
Warner Losh 96eb32bf0f Convert rotating to a flag bit.
Move rotating to a flag bit. Add bit definitions for it. Create a
compat sysctl for it.
2020-04-27 23:43:12 +00:00
Warner Losh cf3ff63e55 Convert unmappedio over to a flag.
Make unmappedio a flag. Move it to the flags definition. Add compat
sysctl for it.
2020-04-27 23:43:08 +00:00
Warner Losh aeab0812e6 Add flags sysctl to ada
Report the ada device flags like we do the da devices. No booleans
have (yet) been converted, but iomapped and rotating are planned.
2020-04-27 23:43:04 +00:00
Warner Losh 9cf738228d Now that we don't have special-case geom hacking defined in md_var.h, stop
including it. sparc64 was the last straggler here, but these weren't removed at
the time.
2020-04-07 22:23:22 +00:00