linux/block
Tejun Heo 0b462c89e3 blkcg: don't call into policy draining if root_blkg is already gone
While a queue is being destroyed, all the blkgs are destroyed and its
->root_blkg pointer is set to NULL.  If someone else starts to drain
while the queue is in this state, the following oops happens.

  NULL pointer dereference at 0000000000000028
  IP: [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  PGD e4a1067 PUD b773067 PMD 0
  Oops: 0000 [#1] PREEMPT SMP DEBUG_PAGEALLOC
  Modules linked in: cfq_iosched(-) [last unloaded: cfq_iosched]
  CPU: 1 PID: 537 Comm: bash Not tainted 3.16.0-rc3-work+ #2
  Hardware name: Bochs Bochs, BIOS Bochs 01/01/2011
  task: ffff88000e222250 ti: ffff88000efd4000 task.ti: ffff88000efd4000
  RIP: 0010:[<ffffffff8144e944>]  [<ffffffff8144e944>] blk_throtl_drain+0x84/0x230
  RSP: 0018:ffff88000efd7bf0  EFLAGS: 00010046
  RAX: 0000000000000000 RBX: ffff880015091450 RCX: 0000000000000001
  RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
  RBP: ffff88000efd7c10 R08: 0000000000000000 R09: 0000000000000001
  R10: ffff88000e222250 R11: 0000000000000000 R12: ffff880015091450
  R13: ffff880015092e00 R14: ffff880015091d70 R15: ffff88001508fc28
  FS:  00007f1332650740(0000) GS:ffff88001fa80000(0000) knlGS:0000000000000000
  CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
  CR2: 0000000000000028 CR3: 0000000009446000 CR4: 00000000000006e0
  Stack:
   ffffffff8144e8f6 ffff880015091450 0000000000000000 ffff880015091d80
   ffff88000efd7c28 ffffffff8144ae2f ffff880015091450 ffff88000efd7c58
   ffffffff81427641 ffff880015091450 ffffffff82401f00 ffff880015091450
  Call Trace:
   [<ffffffff8144ae2f>] blkcg_drain_queue+0x1f/0x60
   [<ffffffff81427641>] __blk_drain_queue+0x71/0x180
   [<ffffffff81429b3e>] blk_queue_bypass_start+0x6e/0xb0
   [<ffffffff814498b8>] blkcg_deactivate_policy+0x38/0x120
   [<ffffffff8144ec44>] blk_throtl_exit+0x34/0x50
   [<ffffffff8144aea5>] blkcg_exit_queue+0x35/0x40
   [<ffffffff8142d476>] blk_release_queue+0x26/0xd0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff81427505>] blk_put_queue+0x15/0x20
   [<ffffffff817d07bb>] scsi_device_dev_release_usercontext+0x16b/0x1c0
   [<ffffffff810bc339>] execute_in_process_context+0x89/0xa0
   [<ffffffff817d064c>] scsi_device_dev_release+0x1c/0x20
   [<ffffffff817930e2>] device_release+0x32/0xa0
   [<ffffffff81454968>] kobject_cleanup+0x38/0x70
   [<ffffffff81454848>] kobject_put+0x28/0x60
   [<ffffffff817934d7>] put_device+0x17/0x20
   [<ffffffff817d11b9>] __scsi_remove_device+0xa9/0xe0
   [<ffffffff817d121b>] scsi_remove_device+0x2b/0x40
   [<ffffffff817d1257>] sdev_store_delete+0x27/0x30
   [<ffffffff81792ca8>] dev_attr_store+0x18/0x30
   [<ffffffff8126f75e>] sysfs_kf_write+0x3e/0x50
   [<ffffffff8126ea87>] kernfs_fop_write+0xe7/0x170
   [<ffffffff811f5e9f>] vfs_write+0xaf/0x1d0
   [<ffffffff811f69bd>] SyS_write+0x4d/0xc0
   [<ffffffff81d24692>] system_call_fastpath+0x16/0x1b

776687bce4 ("block, blk-mq: draining can't be skipped even if
bypass_depth was non-zero") made it easier to trigger this bug by
making blk_queue_bypass_start() drain even when it loses the first
bypass test to blk_cleanup_queue(); however, the bug has always been
there even before the commit as blk_queue_bypass_start() could race
against queue destruction, win the initial bypass test but perform the
actual draining after blk_cleanup_queue() already destroyed all blkgs.

Fix it by skippping calling into policy draining if all the blkgs are
already gone.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Shirish Pargaonkar <spargaonkar@suse.com>
Reported-by: Sasha Levin <sasha.levin@oracle.com>
Reported-by: Jet Chen <jet.chen@intel.com>
Cc: stable@vger.kernel.org
Tested-by: Shirish Pargaonkar <spargaonkar@suse.com>
Signed-off-by: Jens Axboe <axboe@fb.com>
2014-07-12 17:55:10 +02:00
..
partitions block: Use macros from compiler.h instead of __attribute__((...)) 2014-02-18 12:20:01 -08:00
bio-integrity.c block: move bio.c and bio-integrity.c from fs/ to block/ 2014-05-19 08:34:46 -06:00
bio.c block: add support for limiting gaps in SG lists 2014-06-24 16:22:24 -06:00
blk-cgroup.c blkcg: don't call into policy draining if root_blkg is already gone 2014-07-12 17:55:10 +02:00
blk-cgroup.h Revert "block: add __init to blkcg_policy_register" 2014-06-22 16:34:11 -06:00
blk-core.c Merge branch 'for-linus' of git://git.kernel.dk/linux-block 2014-06-19 17:56:43 -10:00
blk-exec.c block: blk-exec.c: Cleaning up local variable address returnd 2014-06-08 19:51:31 -06:00
blk-flush.c block: remove elv_abort_queue and blk_abort_flushes 2014-06-11 15:31:21 -06:00
blk-integrity.c bio-integrity: Convert to bvec_iter 2013-11-23 22:33:50 -08:00
blk-ioc.c block: Substitute rcu_access_pointer() for rcu_dereference_raw() 2014-02-18 12:21:26 -08:00
blk-iopoll.c Merge branch 'locking-core-for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip into next 2014-06-03 12:57:53 -07:00
blk-lib.c block/blk-lib.c: make __blkdev_issue_zeroout static 2014-05-26 17:39:09 -06:00
blk-map.c block: remove struct request buffer member 2014-04-15 14:03:02 -06:00
blk-merge.c block: add support for limiting gaps in SG lists 2014-06-24 16:22:24 -06:00
blk-mq-cpu.c blk-mq: add file comments and update copyright notices 2014-05-28 10:15:41 -06:00
blk-mq-cpumap.c blk-mq: add file comments and update copyright notices 2014-05-28 10:15:41 -06:00
blk-mq-sysfs.c blk-mq: blk_mq_unregister_hctx() can be static 2014-05-30 10:31:13 -06:00
blk-mq-tag.c blk-mq: bitmap tag: fix races in bt_get() function 2014-06-17 22:13:08 -07:00
blk-mq-tag.h blk-mq: bitmap tag: fix races on shared ::wake_index fields 2014-06-17 22:12:35 -07:00
blk-mq.c blk-mq: blk_mq_start_hw_queue() should use blk_mq_run_hw_queue() 2014-06-25 08:22:34 -06:00
blk-mq.h blk-mq: fix schedule from atomic context 2014-06-03 21:04:39 -06:00
blk-settings.c block: ensure that bio_add_page() always accepts a page for an empty bio 2014-06-10 12:53:56 -06:00
blk-softirq.c block: fix regression with block enabled tagging 2014-04-09 21:54:06 -06:00
blk-sysfs.c block: only allocate/free mq_usage_counter in blk-mq 2014-05-27 09:37:08 -06:00
blk-tag.c block: don't assume last put of shared tags is for the host 2014-07-08 12:25:28 +02:00
blk-throttle.c Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-06-09 15:03:33 -07:00
blk-timeout.c block: ensure that the timer is always added 2014-05-30 15:41:39 -06:00
blk.h block: remove elv_abort_queue and blk_abort_flushes 2014-06-11 15:31:21 -06:00
bounce.c mm: convert some level-less printks to pr_* 2014-06-06 16:08:18 -07:00
bsg-lib.c bsg: Remove unused function bsg_goose_queue() 2012-12-06 14:33:02 +01:00
bsg.c block: add blk_rq_set_block_pc() 2014-06-06 07:57:37 -06:00
cfq-iosched.c Merge branch 'for-3.16' of git://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup 2014-06-09 15:03:33 -07:00
cmdline-parser.c block: remove unrelated header files and export symbol 2014-01-21 20:18:26 -08:00
compat_ioctl.c kernel-wide: fix missing validations on __get/__put/__copy_to/__copy_from_user() 2013-09-11 15:58:18 -07:00
deadline-iosched.c block: Stop abusing csd.list for fifo_time 2014-02-24 14:46:32 -08:00
elevator.c Revert "block: add __init to elv_register" 2014-06-22 16:34:11 -06:00
genhd.c block: Convert kmalloc_node(...GFP_ZERO...) to kzalloc_node(...) 2013-09-11 13:22:03 -06:00
ioctl.c block: replace IS_ERR and PTR_ERR with PTR_ERR_OR_ZERO 2013-11-08 09:05:31 -07:00
ioprio.c block: move ioprio.c from fs/ to block/ 2014-05-19 11:02:18 -06:00
Kconfig block: change config option name for cmdline partition parsing 2013-09-30 14:31:02 -07:00
Kconfig.iosched blkcg: make CONFIG_BLK_CGROUP bool 2012-03-06 21:27:21 +01:00
Makefile block: move mm/bounce.c to block/ 2014-05-19 20:01:52 -06:00
noop-iosched.c elevator: Fix a race in elevator switching 2013-07-03 13:25:24 +02:00
partition-generic.c Revert "loop: cleanup partitions when detaching loop device" 2013-04-08 10:12:11 +02:00
scsi_ioctl.c block: add blk_rq_set_block_pc() 2014-06-06 07:57:37 -06:00