linux/drivers/md
Song Liu 70d466f760 md/r5cache: gracefully handle journal device errors for writeback mode
For the raid456 with writeback cache, when journal device failed during
normal operation, it is still possible to persist all data, as all
pending data is still in stripe cache. However, it is necessary to handle
journal failure gracefully.

During journal failures, the following logic handles the graceful shutdown
of journal:
1. raid5_error() marks the device as Faulty and schedules async work
   log->disable_writeback_work;
2. In disable_writeback_work (r5c_disable_writeback_async), the mddev is
   suspended, set to write through, and then resumed. mddev_suspend()
   flushes all cached stripes;
3. All cached stripes need to be flushed carefully to the RAID array.

This patch fixes issues within the process above:
1. In r5c_update_on_rdev_error() schedule disable_writeback_work for
   journal failures;
2. In r5c_disable_writeback_async(), wait for MD_SB_CHANGE_PENDING,
   since raid5_error() updates superblock.
3. In handle_stripe(), allow stripes with data in journal (s.injournal > 0)
   to make progress during log_failed;
4. In delay_towrite(), if log failed only process data in the cache (skip
   new writes in dev->towrite);
5. In __get_priority_stripe(), process loprio_list during journal device
   failures.
6. In raid5_remove_disk(), wait for all cached stripes are flushed before
   calling log_exit().

Signed-off-by: Song Liu <songliubraving@fb.com>
Signed-off-by: Shaohua Li <shli@fb.com>
2017-05-11 22:11:11 -07:00
..
bcache drivers/md/bcache/util.h: remove duplicate inclusion of blkdev.h 2017-03-09 17:01:10 -08:00
persistent-data dm block manager: remove an unused argument from dm_block_manager_create() 2017-04-27 17:08:41 -04:00
bitmap.c md: fix several trivial typos in comments 2017-03-23 22:54:57 -07:00
bitmap.h md: move bitmap_destroy to the beginning of __md_stop 2017-03-16 16:55:58 -07:00
dm-bio-prison-v1.c dm bio prison v2: new interface for the bio prison 2017-03-07 11:30:16 -05:00
dm-bio-prison-v1.h dm bio prison v2: new interface for the bio prison 2017-03-07 11:30:16 -05:00
dm-bio-prison-v2.c dm bio prison v2: new interface for the bio prison 2017-03-07 11:30:16 -05:00
dm-bio-prison-v2.h dm bio prison v2: new interface for the bio prison 2017-03-07 11:30:16 -05:00
dm-bio-record.h
dm-bufio.c dm bufio: check new buffer allocation watermark every 30 seconds 2017-05-01 15:21:42 -04:00
dm-bufio.h dm bufio: add sector start offset to dm-bufio interface 2017-03-07 13:28:33 -05:00
dm-builtin.c dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-cache-background-tracker.c dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-background-tracker.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-block-types.h linux: drop __bitwise__ everywhere 2016-12-16 00:13:41 +02:00
dm-cache-metadata.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-cache-metadata.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-policy-internal.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-policy-smq.c dm cache policy smq: make the cleaner policy write-back more aggressively 2017-03-31 11:41:05 -04:00
dm-cache-policy.c
dm-cache-policy.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-target.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-core.h - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-crypt.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-delay.c dm: mark targets that pass integrity data 2017-04-24 12:04:32 -04:00
dm-era-target.c dm block manager: remove an unused argument from dm_block_manager_create() 2017-04-27 17:08:41 -04:00
dm-exception-store.c - Revert a dm-multipath change that caused a regression for unprivledged 2015-11-04 21:19:53 -08:00
dm-exception-store.h dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-flakey.c dm flakey: introduce "error_writes" feature 2016-12-13 15:01:31 -05:00
dm-integrity.c dm integrity: use previously calculated log2 of sectors_per_block 2017-04-27 12:16:32 -04:00
dm-io.c dm: support REQ_OP_WRITE_ZEROES 2017-04-08 11:25:38 -06:00
dm-ioctl.c dm: introduce enum dm_queue_mode to cleanup related code 2017-04-27 17:08:44 -04:00
dm-kcopyd.c dm kcopyd: switch to use REQ_OP_WRITE_ZEROES 2017-04-08 11:25:38 -06:00
dm-linear.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-log-userspace-base.c dm: drop NULL test before kmem_cache_destroy() and mempool_destroy() 2015-10-31 19:06:00 -04:00
dm-log-userspace-transfer.c dm log userspace transfer: match wait_for_completion_timeout return type 2015-04-15 12:10:20 -04:00
dm-log-userspace-transfer.h
dm-log-writes.c Merge branch 'for-4.9/block' of git://git.kernel.dk/linux-block 2016-10-07 14:42:05 -07:00
dm-log.c block,fs: use REQ_* flags directly 2016-11-01 09:43:26 -06:00
dm-mpath.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-queue-length.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-raid.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-raid1.c block: remove the discard_zeroes_data flag 2017-04-08 11:25:38 -06:00
dm-region-hash.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-round-robin.c dm round robin: revert "use percpu 'repeat_count' and 'current_path'" 2017-02-17 00:54:09 -05:00
dm-rq.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-rq.h dm: always defer request allocation to the owner of the request_queue 2017-01-27 15:08:35 -07:00
dm-service-time.c dm path selector: remove 'repeat_count' return from .select_path hook 2016-02-22 22:34:42 -05:00
dm-snap-persistent.c block,fs: use REQ_* flags directly 2016-11-01 09:43:26 -06:00
dm-snap-transient.c dm snapshot: fix hung bios when copy error occurs 2016-01-08 20:03:05 -05:00
dm-snap.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm-stats.c dm stats: fix a leaked s->histogram_boundaries array 2017-02-16 14:17:07 -05:00
dm-stats.h dm stats: support precise timestamps 2015-06-17 12:40:40 -04:00
dm-stripe.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-switch.c dm switch: simplify conditional in alloc_region_table() 2015-10-31 19:06:06 -04:00
dm-sysfs.c dm: move request-based code out to dm-rq.[hc] 2016-06-10 15:15:44 -04:00
dm-table.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-target.c dm: introduce a new DM_MAPIO_KILL return value 2017-05-01 18:19:03 -04:00
dm-thin-metadata.c dm block manager: remove an unused argument from dm_block_manager_create() 2017-04-27 17:08:41 -04:00
dm-thin-metadata.h dm thin: fix a race condition between discarding and provisioning a block 2016-07-20 12:43:35 -04:00
dm-thin.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-uevent.c
dm-uevent.h
dm-verity-fec.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm-verity-fec.h dm verity fec: limit error correction recursion 2017-03-16 09:37:31 -04:00
dm-verity-target.c dm verity: switch to using asynchronous hash crypto API 2017-04-24 15:37:04 -04:00
dm-verity.h dm verity: switch to using asynchronous hash crypto API 2017-04-24 15:37:04 -04:00
dm-zero.c block: rename bio bi_rw to bi_opf 2016-08-07 14:41:02 -06:00
dm.c - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
dm.h dm: introduce enum dm_queue_mode to cleanup related code 2017-04-27 17:08:44 -04:00
faulty.c md: fast clone bio in bio_clone_mddev() 2017-02-15 11:24:54 -08:00
Kconfig - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
linear.c Merge branch 'md-next' into md-linus 2017-05-01 14:09:21 -07:00
linear.h md linear: fix a race between linear_add() and linear_congested() 2017-02-13 09:17:50 -08:00
Makefile - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
md-cluster.c md-cluster: Fix a memleak in an error handling path 2017-04-14 08:08:29 -07:00
md-cluster.h md-cluster: add the support for resize 2017-03-16 16:55:50 -07:00
md.c md: don't return -EAGAIN in md_allow_write for external metadata arrays 2017-05-08 10:32:59 -07:00
md.h md: don't return -EAGAIN in md_allow_write for external metadata arrays 2017-05-08 10:32:59 -07:00
multipath.c md: support REQ_OP_WRITE_ZEROES 2017-04-08 11:25:38 -06:00
multipath.h
raid0.c md/md0: optimize raid0 discard handling 2017-05-08 21:18:03 -07:00
raid0.h block: kill merge_bvec_fn() completely 2015-08-13 12:31:57 -06:00
raid1.c md/raid1/10: avoid unnecessary locking 2017-05-11 15:32:17 -07:00
raid1.h md/raid1: Use a new variable to count flighting sync requests 2017-04-27 14:01:16 -07:00
raid5-cache.c md/r5cache: gracefully handle journal device errors for writeback mode 2017-05-11 22:11:11 -07:00
raid5-log.h md/r5cache: gracefully handle journal device errors for writeback mode 2017-05-11 22:11:11 -07:00
raid5-ppl.c raid5-ppl: use a single mempool for ppl_io_unit and header_page 2017-04-11 14:56:46 -07:00
raid5.c md/r5cache: gracefully handle journal device errors for writeback mode 2017-05-11 22:11:11 -07:00
raid5.h - A major update for DM cache that reduces the latency for deciding 2017-05-03 10:31:20 -07:00
raid10.c md/raid1/10: avoid unnecessary locking 2017-05-11 15:32:17 -07:00
raid10.h md/raid10: simplify the splitting of requests. 2017-04-11 10:13:02 -07:00