linux/drivers/md
Coly Li fadd94e05c bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set
In patch "bcache: fix cached_dev->count usage for bch_cache_set_error()",
cached_dev_get() is called when creating dc->writeback_thread, and
cached_dev_put() is called when exiting dc->writeback_thread. This
modification works well unless people detach the bcache device manually by
    'echo 1 > /sys/block/bcache<N>/bcache/detach'
Because this sysfs interface only calls bch_cached_dev_detach() which wakes
up dc->writeback_thread but does not stop it. The reason is, before patch
"bcache: fix cached_dev->count usage for bch_cache_set_error()", inside
bch_writeback_thread(), if cache is not dirty after writeback,
cached_dev_put() will be called here. And in cached_dev_make_request() when
a new write request makes cache from clean to dirty, cached_dev_get() will
be called there. Since we don't operate dc->count in these locations,
refcount d->count cannot be dropped after cache becomes clean, and
cached_dev_detach_finish() won't be called to detach bcache device.

This patch fixes the issue by checking whether BCACHE_DEV_DETACHING is
set inside bch_writeback_thread(). If this bit is set and cache is clean
(no existing writeback_keys), break the while-loop, call cached_dev_put()
and quit the writeback thread.

Please note if cache is still dirty, even BCACHE_DEV_DETACHING is set the
writeback thread should continue to perform writeback, this is the original
design of manually detach.

It is safe to do the following check without locking, let me explain why,
+	if (!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags) &&
+	    (!atomic_read(&dc->has_dirty) || !dc->writeback_running)) {

If the kenrel thread does not sleep and continue to run due to conditions
are not updated in time on the running CPU core, it just consumes more CPU
cycles and has no hurt. This should-sleep-but-run is safe here. We just
focus on the should-run-but-sleep condition, which means the writeback
thread goes to sleep in mistake while it should continue to run.
1, First of all, no matter the writeback thread is hung or not,
   kthread_stop() from cached_dev_detach_finish() will wake up it and
   terminate by making kthread_should_stop() return true. And in normal
   run time, bit on index BCACHE_DEV_DETACHING is always cleared, the
   condition
	!test_bit(BCACHE_DEV_DETACHING, &dc->disk.flags)
   is always true and can be ignored as constant value.
2, If one of the following conditions is true, the writeback thread should
   go to sleep,
   "!atomic_read(&dc->has_dirty)" or "!dc->writeback_running)"
   each of them independently controls the writeback thread should sleep or
   not, let's analyse them one by one.
2.1 condition "!atomic_read(&dc->has_dirty)"
   If dc->has_dirty is set from 0 to 1 on another CPU core, bcache will
   call bch_writeback_queue() immediately or call bch_writeback_add() which
   indirectly calls bch_writeback_queue() too. In bch_writeback_queue(),
   wake_up_process(dc->writeback_thread) is called. It sets writeback
   thread's task state to TASK_RUNNING and following an implicit memory
   barrier, then tries to wake up the writeback thread.
   In writeback thread, its task state is set to TASK_INTERRUPTIBLE before
   doing the condition check. If other CPU core sets the TASK_RUNNING state
   after writeback thread setting TASK_INTERRUPTIBLE, the writeback thread
   will be scheduled to run very soon because its state is not
   TASK_INTERRUPTIBLE. If other CPU core sets the TASK_RUNNING state before
   writeback thread setting TASK_INTERRUPTIBLE, the implict memory barrier
   of wake_up_process() will make sure modification of dc->has_dirty on
   other CPU core is updated and observed on the CPU core of writeback
   thread. Therefore the condition check will correctly be false, and
   continue writeback code without sleeping.
2.2 condition "!dc->writeback_running)"
   dc->writeback_running can be changed via sysfs file, every time it is
   modified, a following bch_writeback_queue() is alwasy called. So the
   change is always observed on the CPU core of writeback thread. If
   dc->writeback_running is changed from 0 to 1 on other CPU core, this
   condition check will observe the modification and allow writeback
   thread to continue to run without sleeping.
Now we can see, even without a locking protection, multiple conditions
check is safe here, no deadlock or process hang up will happen.

I compose a separte patch because that patch "bcache: fix cached_dev->count
usage for bch_cache_set_error()" already gets a "Reviewed-by:" from Hannes
Reinecke. Also this fix is not trivial and good for a separate patch.

Signed-off-by: Coly Li <colyli@suse.de>
Reviewed-by: Michael Lyle <mlyle@lyle.org>
Cc: Hannes Reinecke <hare@suse.com>
Cc: Huijun Tang <tang.junhui@zte.com.cn>
Signed-off-by: Jens Axboe <axboe@kernel.dk>
2018-03-18 20:15:20 -06:00
..
bcache bcache: quit dc->writeback_thread when BCACHE_DEV_DETACHING is set 2018-03-18 20:15:20 -06:00
persistent-data dm btree: fix serious bug in btree_split_beneath() 2018-01-17 09:07:55 -05:00
dm-bio-prison-v1.c dm bio prison: use rb_entry() rather than container_of() 2017-06-19 11:03:50 -04:00
dm-bio-prison-v1.h block: switch bios to blk_status_t 2017-06-09 09:27:32 -06:00
dm-bio-prison-v2.c dm bio prison: use rb_entry() rather than container_of() 2017-06-19 11:03:50 -04:00
dm-bio-prison-v2.h dm bio prison v2: new interface for the bio prison 2017-03-07 11:30:16 -05:00
dm-bio-record.h block: replace bi_bdev with a gendisk pointer and partitions index 2017-08-23 12:49:55 -06:00
dm-bufio.c dm bufio: eliminate unnecessary labels in dm_bufio_client_create() 2018-01-17 09:16:04 -05:00
dm-bufio.h dm integrity: optimize writing dm-bufio buffers that are partially changed 2017-08-28 11:47:17 -04:00
dm-builtin.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dm-cache-background-tracker.c dm cache background tracker: limit amount of background work that may be issued at once 2017-11-10 15:45:03 -05:00
dm-cache-background-tracker.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-block-types.h linux: drop __bitwise__ everywhere 2016-12-16 00:13:41 +02:00
dm-cache-metadata.c dm cache: convert dm_cache_metadata.ref_count from atomic_t to refcount_t 2017-10-24 15:09:51 -04:00
dm-cache-metadata.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-policy-internal.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-policy-smq.c dm cache policy smq: allocate cache blocks in order 2017-11-10 15:45:05 -05:00
dm-cache-policy.c
dm-cache-policy.h dm cache: significant rework to leverage dm-bio-prison-v2 2017-03-07 13:28:31 -05:00
dm-cache-target.c dm: fix various targets to dm_register_target after module __init resources created 2017-12-04 10:23:10 -05:00
dm-core.h dm: various cleanups to md->queue initialization code 2018-01-29 13:44:55 -05:00
dm-crypt.c - DM core fixes to ensure that bio submission follows a depth-first tree 2018-01-31 11:05:47 -08:00
dm-delay.c dm: backfill missing calls to mutex_destroy() 2018-01-17 09:16:15 -05:00
dm-era-target.c dm: do not set 'discards_supported' in targets that do not need it 2017-11-16 16:33:54 -05:00
dm-exception-store.c
dm-exception-store.h
dm-flakey.c dm flakey: check for null arg_name in parse_features() 2018-01-17 09:16:13 -05:00
dm-integrity.c dm integrity: don't store cipher request on the stack 2018-01-17 09:08:57 -05:00
dm-io.c dm io: remove BIOSET_NEED_RESCUER flag from bios bioset 2017-12-13 12:15:56 -05:00
dm-ioctl.c vfs: do bulk POLL* -> EPOLL* replacement 2018-02-11 14:34:03 -08:00
dm-kcopyd.c dm: backfill missing calls to mutex_destroy() 2018-01-17 09:16:15 -05:00
dm-linear.c - Some request-based DM core and DM multipath fixes and cleanups 2017-09-14 13:43:16 -07:00
dm-log-userspace-base.c
dm-log-userspace-transfer.c
dm-log-userspace-transfer.h
dm-log-writes.c dm log writes: fix max length used for kstrndup 2018-01-17 09:16:16 -05:00
dm-log.c block,fs: use REQ_* flags directly 2016-11-01 09:43:26 -06:00
dm-mpath.c - DM core fixes to ensure that bio submission follows a depth-first tree 2018-01-31 11:05:47 -08:00
dm-mpath.h
dm-path-selector.c
dm-path-selector.h
dm-queue-length.c dm mpath selector: more evenly distribute ties 2018-01-29 13:44:58 -05:00
dm-raid.c - DM core fixes to ensure that bio submission follows a depth-first tree 2018-01-31 11:05:47 -08:00
dm-raid1.c md: Convert timers to use timer_setup() 2017-11-14 20:11:57 -07:00
dm-region-hash.c
dm-round-robin.c dm round robin: revert "use percpu 'repeat_count' and 'current_path'" 2017-02-17 00:54:09 -05:00
dm-rq.c for-linus-20180204 2018-02-04 11:16:35 -08:00
dm-rq.h dm rq: do not update rq partially in each ending bio 2017-08-28 10:23:28 -04:00
dm-service-time.c dm mpath selector: more evenly distribute ties 2018-01-29 13:44:58 -05:00
dm-snap-persistent.c dm: make flush bios explicitly sync 2017-05-31 10:50:23 -04:00
dm-snap-transient.c
dm-snap.c dm snapshot: use mutex instead of rw_semaphore 2018-01-17 09:16:14 -05:00
dm-stats.c dm: backfill missing calls to mutex_destroy() 2018-01-17 09:16:15 -05:00
dm-stats.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
dm-stripe.c - Some request-based DM core and DM multipath fixes and cleanups 2017-09-14 13:43:16 -07:00
dm-switch.c locking/atomics: COCCINELLE/treewide: Convert trivial ACCESS_ONCE() patterns to READ_ONCE()/WRITE_ONCE() 2017-10-25 11:01:08 +02:00
dm-sysfs.c
dm-table.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
dm-target.c dm: don't return errnos from ->map 2017-06-09 09:27:32 -06:00
dm-thin-metadata.c dm thin metadata: THIN_MAX_CONCURRENT_LOCKS should be 6 2018-01-17 09:07:54 -05:00
dm-thin-metadata.h
dm-thin.c dm thin: fix trailing semicolon in __remap_and_issue_shared_cell 2018-01-29 13:44:57 -05:00
dm-uevent.c
dm-uevent.h
dm-unstripe.c dm unstripe: fix target length versus number of stripes size check 2018-01-29 13:44:58 -05:00
dm-verity-fec.c dm verity fec: fix GFP flags used with mempool_alloc() 2017-07-26 15:55:44 -04:00
dm-verity-fec.h dm verity fec: limit error correction recursion 2017-03-16 09:37:31 -04:00
dm-verity-target.c Merge branch 'linus' of git://git.kernel.org/pub/scm/linux/kernel/git/herbert/crypto-2.6 2017-11-14 10:52:09 -08:00
dm-verity.h dm: move dm-verity to generic async completion 2017-11-03 22:11:20 +08:00
dm-zero.c dm: don't return errnos from ->map 2017-06-09 09:27:32 -06:00
dm-zoned-metadata.c dm: backfill missing calls to mutex_destroy() 2018-01-17 09:16:15 -05:00
dm-zoned-reclaim.c dm zoned: use GFP_NOIO in I/O path 2017-07-26 15:55:43 -04:00
dm-zoned-target.c dm: backfill missing calls to mutex_destroy() 2018-01-17 09:16:15 -05:00
dm-zoned.h dm zoned: drive-managed zoned block device target 2017-06-19 11:05:20 -04:00
dm.c block: Add 'lock' as third argument to blk_alloc_queue_node() 2018-02-28 12:23:35 -07:00
dm.h dm: move dm_table_destroy() to same header as dm_table_create() 2018-01-17 09:16:06 -05:00
Kconfig dm: add unstriped target 2018-01-17 09:16:00 -05:00
Makefile dm: add unstriped target 2018-01-17 09:16:00 -05:00
md-bitmap.c Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-bitmap.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-cluster.c md-cluster: update document for raid10 2017-11-01 21:32:25 -07:00
md-cluster.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
md-faulty.c md: rename some drivers/md/ files to have an "md-" prefix 2017-10-16 19:06:36 -07:00
md-linear.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
md-linear.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md-multipath.c md: remove redundant variable q 2017-11-01 21:32:24 -07:00
md-multipath.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00
md.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
md.h raid5-ppl: PPL support for disks with write-back cache enabled 2018-01-15 14:29:42 -08:00
raid0.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
raid0.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid1-10.c License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid1.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
raid1.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid5-cache.c raid5-ppl: PPL support for disks with write-back cache enabled 2018-01-15 14:29:42 -08:00
raid5-log.h raid5-ppl: PPL support for disks with write-back cache enabled 2018-01-15 14:29:42 -08:00
raid5-ppl.c raid5-ppl: PPL support for disks with write-back cache enabled 2018-01-15 14:29:42 -08:00
raid5.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
raid5.h License cleanup: add SPDX GPL-2.0 license identifier to files with no license 2017-11-02 11:10:55 +01:00
raid10.c block: Use blk_queue_flag_*() in drivers instead of queue_flag_*() 2018-03-08 14:13:48 -07:00
raid10.h Merge branch 'for-next' of git://git.kernel.org/pub/scm/linux/kernel/git/shli/md 2017-11-14 16:07:26 -08:00