Find a file
Brian Behlendorf 7c9a42921e
Detect IO errors during device removal
* Detect IO errors during device removal

While device removal cannot verify the checksums of individual
blocks during device removal, it can reasonably detect hard IO
errors from the leaf vdevs.  Failure to perform this error
checking can result in device removal completing successfully,
but moving no data which will permanently corrupt the pool.

Situation 1: faulted/degraded vdevs

In the configuration shown below, the removal of mirror-0 will
permanently corrupt the pool.  Device removal will preferentially
copy data from 'vdev1 -> vdev3' and from 'vdev2 -> vdev4'.  Which
in this case will result in nothing being copied since one vdev
in each of those groups in unavailable.  However, device removal
will complete successfully since all IO errors are ignored.

  tank                DEGRADED     0     0     0
    mirror-0          DEGRADED     0     0     0
      /var/tmp/vdev1  FAULTED      0     0     0  external fault
      /var/tmp/vdev2  ONLINE       0     0     0
    mirror-1          DEGRADED     0     0     0
      /var/tmp/vdev3  ONLINE       0     0     0
      /var/tmp/vdev4  FAULTED      0     0     0  external fault

This issue is resolved by updating the source child selection
logic to exclude unreadable leaf vdevs.  Additionally, unwritable
destination child vdevs which can never succeed are skipped to
prevent generating a large number of write IO errors.

Situation 2: individual hard IO errors

During removal if an unexpected hard IO error is encountered when
either reading or writing the child vdev the entire removal
operation is cancelled.  While it may be possible to reconstruct
the data after removal that cannot be guaranteed.  The only
strictly safe thing to do is to cancel the removal.

As a future improvement we may want to instead suspend the removal
process and allow the damaged region to be retried.  But that work
is left for another time, hard IO errors during the removal process
are expected to be exceptionally rare.

Reviewed-by: Serapheim Dimitropoulos <serapheim@delphix.com>
Reviewed-by: Tony Hutter <hutter2@llnl.gov>
Reviewed-by: Tom Caputi <tcaputi@datto.com>
Signed-off-by: Brian Behlendorf <behlendorf1@llnl.gov>
Issue #6900
Closes #8161
2018-12-04 09:37:37 -08:00
.github Advise users to retain issue/PR templates 2018-10-17 10:25:38 -07:00
cmd Fix consistency of ztest_device_removal_active 2018-11-28 20:47:09 -08:00
config Fix systemd spec file macros 2018-11-11 18:06:36 -08:00
contrib Allow spaces in pool names for cmdline argument 2018-11-11 18:23:11 -08:00
etc Minor documentation, logging, and testing typos 2018-06-07 09:38:39 -07:00
include OpenZFS 8115 - parallel zfs mount 2018-11-15 11:33:58 -08:00
lib Move strlcat, strlcpy, and strnlen 2018-11-20 10:37:49 -08:00
man Detect IO errors during device removal 2018-12-04 09:37:37 -08:00
module Detect IO errors during device removal 2018-12-04 09:37:37 -08:00
rpm Fix systemd spec file macros 2018-11-11 18:06:36 -08:00
scripts Make gitrev more reliable 2018-10-22 12:23:30 -07:00
tests Detect IO errors during device removal 2018-12-04 09:37:37 -08:00
udev Add kernel module auto-loading 2018-03-13 10:45:55 -07:00
.gitignore Ignore *.o.ur-safe build artifacts 2018-05-13 18:59:02 -07:00
.gitmodules Add zimport.sh compatibility test script 2014-02-21 12:10:31 -08:00
.travis.yml Add .travis.yml 2017-11-13 09:18:18 -08:00
AUTHORS Update build system and packaging 2018-05-29 16:00:33 -07:00
autogen.sh Cause autogen.sh to fail if autoreconf fails 2018-07-06 09:27:37 -07:00
configure.ac Add libzutil for libzfs or libzpool consumers 2018-11-05 11:22:33 -08:00
copy-builtin Allow copy-builtin to work with modified sources 2018-10-17 12:06:05 -07:00
COPYRIGHT Update build system and packaging 2018-05-29 16:00:33 -07:00
LICENSE Update build system and packaging 2018-05-29 16:00:33 -07:00
Makefile.am Linux does not HAVE_DNLC 2018-10-17 10:30:08 -07:00
META Tag 0.8.0-rc2 2018-11-12 11:57:15 -08:00
NEWS Add NEWS file 2018-09-18 12:03:47 -07:00
NOTICE Update build system and packaging 2018-05-29 16:00:33 -07:00
README.md Explicitly state supported Linux versions 2018-05-30 20:11:19 -07:00
TEST Update build system and packaging 2018-05-29 16:00:33 -07:00
zfs.release.in Move zfs.release generation to configure step 2012-07-12 12:22:51 -07:00

img

ZFS on Linux is an advanced file system and volume manager which was originally developed for Solaris and is now maintained by the OpenZFS community.

codecov coverity

Official Resources

Installation

Full documentation for installing ZoL on your favorite Linux distribution can be found at our site.

Contribute & Develop

We have a separate document with contribution guidelines.

Release

ZFS on Linux is released under a CDDL license.
For more details see the NOTICE, LICENSE and COPYRIGHT files; UCRL-CODE-235197

Supported Kernels

  • The META file contains the officially recognized supported kernel versions.