zinject: "no-op" error injection

When injected, this causes the matching IO to appear to succeed, but the
actual work is never submitted to the physical device. This can be used
to simulate a write-back cache that acknowledges a write when the backing
device has failed, so the cache can never complete the operation in the
background.
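
The behaviour described above can be illustrated with a small standalone
model. This is only a sketch and not ZFS code: sim_dev_t, sim_dev_write()
and the sd_* fields are hypothetical names invented for this example, and
the real injection is wired into the zio pipeline rather than into a
device structure.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define	SIM_DEV_SIZE	16

/* Hypothetical stand-in for a physical device; not a ZFS structure. */
typedef struct sim_dev {
	char sd_data[SIM_DEV_SIZE];	/* bytes actually stored on the "disk" */
	bool sd_inject_noop;		/* true while a no-op injection is active */
} sim_dev_t;

/*
 * Write len bytes from buf at offset off. Returns 0 on apparent success.
 * With the no-op injection active, the write is dropped but still
 * reports success, so the caller cannot tell that nothing was stored.
 */
int
sim_dev_write(sim_dev_t *dev, size_t off, const void *buf, size_t len)
{
	/* Reject writes that would run past the end of the model device. */
	assert(off <= SIM_DEV_SIZE && len <= SIM_DEV_SIZE - off);

	if (dev->sd_inject_noop)
		return (0);	/* acknowledged, never submitted */

	memcpy(dev->sd_data + off, buf, len);
	return (0);
}
```

In this model, a write issued while sd_inject_noop is set returns 0 just
like a normal write, but a later read of sd_data still sees the previous
contents, which is the "lost write" a failed write-back cache would leave
behind.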

Sponsored-by: Klara, Inc.
Sponsored-by: Wasabi Technology, Inc.
Reviewed-by: Brian Behlendorf <behlendorf1@llnl.gov>
Signed-off-by: Rob Norris <rob.norris@klarasystems.com>
Closes #16085
Rob N 2024-04-16 06:52:20 +10:00 committed by GitHub
parent f22b110f60
commit 4725e543be
4 changed files with 19 additions and 6 deletions

@@ -221,6 +221,7 @@ static const struct errstr errstrtable[] = {
 { ENXIO, "nxio" },
 { ECHILD, "dtl" },
 { EILSEQ, "corrupt" },
+{ ENOSYS, "noop" },
 { 0, NULL },
 };
@@ -269,8 +270,8 @@ usage(void)
 "\t\tInject a fault into a particular device or the device's\n"
 "\t\tlabel. Label injection can either be 'nvlist', 'uber',\n "
 "\t\t'pad1', or 'pad2'.\n"
-"\t\t'errno' can be 'nxio' (the default), 'io', 'dtl', or\n"
-"\t\t'corrupt' (bit flip).\n"
+"\t\t'errno' can be 'nxio' (the default), 'io', 'dtl',\n"
+"\t\t'corrupt' (bit flip), or 'noop' (successfully do nothing).\n"
 "\t\t'frequency' is a value between 0.0001 and 100.0 that limits\n"
 "\t\tdevice error injection to a percentage of the IOs.\n"
 "\n"
@@ -889,7 +890,7 @@ main(int argc, char **argv)
 if (error < 0) {
 (void) fprintf(stderr, "invalid error type "
 "'%s': must be one of: io decompress "
-"decrypt nxio dtl corrupt\n",
+"decrypt nxio dtl corrupt noop\n",
 optarg);
 usage();
 libzfs_fini(g_zfs);

@@ -211,9 +211,11 @@ to flip a bit in the data after a read,
 .It Sy dtl
 for an ECHILD error,
 .It Sy io
-for an EIO error where reopening the device will succeed, or
+for an EIO error where reopening the device will succeed,
 .It Sy nxio
-for an ENXIO error where reopening the device will fail.
+for an ENXIO error where reopening the device will fail, or
+.It Sy noop
+to drop the IO without executing it, and return success.
 .El
 .Pp
 For EIO and ENXIO, the "failed" reads or writes still occur.

@@ -4058,6 +4058,16 @@ zio_vdev_io_start(zio_t *zio)
 zio->io_type == ZIO_TYPE_WRITE ||
 zio->io_type == ZIO_TYPE_TRIM)) {
+if (zio_handle_device_injection(vd, zio, ENOSYS) != 0) {
+/*
+ * "no-op" injections return success, but do no actual
+ * work. Just skip the remaining vdev stages.
+ */
+zio_vdev_io_bypass(zio);
+zio_interrupt(zio);
+return (NULL);
+}
 if ((zio = vdev_queue_io(zio)) == NULL)
 return (NULL);

@@ -47,7 +47,7 @@ function cleanup
 function test_device_fault
 {
-typeset -a errno=("io" "decompress" "decrypt" "nxio" "dtl" "corrupt")
+typeset -a errno=("io" "decompress" "decrypt" "nxio" "dtl" "corrupt" "noop")
 for e in ${errno[@]}; do
 log_must eval \
 "zinject -d $DISK1 -e $e -T read -f 0.001 $TESTPOOL"