linux/fs/btrfs
Filipe Manana ae9d8f1711 Btrfs: fix race between caching kthread and returning inode to inode cache
While the inode cache caching kthread is calling btrfs_unpin_free_ino(),
we could have a concurrent call to btrfs_return_ino() that adds a new
entry to the root's free space cache of pinned inodes. This concurrent
call does not acquire the fs_info->commit_root_sem before adding a new
entry if the caching state is BTRFS_CACHE_FINISHED, which is a problem
because the caching kthread calls btrfs_unpin_free_ino() after setting
the caching state to BTRFS_CACHE_FINISHED and therefore races with
the task calling btrfs_return_ino(), which is adding a new entry, while
the former (caching kthread) is navigating the cache's rbtree, removing
and freeing nodes from the cache's rbtree without acquiring the spinlock
that protects the rbtree.

This race resulted in memory corruption due to double free of struct
btrfs_free_space objects because both tasks can end up doing freeing the
same objects. Note that adding a new entry can result in merging it with
other entries in the cache, in which case those entries are freed.
This is particularly important as btrfs_free_space structures are also
used for the block group free space caches.

This memory corruption can be detected by a debugging kernel, which
reports it with the following trace:

[132408.501148] slab error in verify_redzone_free(): cache `btrfs_free_space': double free detected
[132408.505075] CPU: 15 PID: 12248 Comm: btrfs-ino-cache Tainted: G        W       4.1.0-rc5-btrfs-next-10+ #1
[132408.505075] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.8.1-0-g4adadbd-20150316_085822-nilsson.home.kraxel.org 04/01/2014
[132408.505075]  ffff880023e7d320 ffff880163d73cd8 ffffffff8145eec7 ffffffff81095dce
[132408.505075]  ffff880009735d40 ffff880163d73ce8 ffffffff81154e1e ffff880163d73d68
[132408.505075]  ffffffff81155733 ffffffffa054a95a ffff8801b6099f00 ffffffffa0505b5f
[132408.505075] Call Trace:
[132408.505075]  [<ffffffff8145eec7>] dump_stack+0x4f/0x7b
[132408.505075]  [<ffffffff81095dce>] ? console_unlock+0x356/0x3a2
[132408.505075]  [<ffffffff81154e1e>] __slab_error.isra.28+0x25/0x36
[132408.505075]  [<ffffffff81155733>] __cache_free+0xe2/0x4b6
[132408.505075]  [<ffffffffa054a95a>] ? __btrfs_add_free_space+0x2f0/0x343 [btrfs]
[132408.505075]  [<ffffffffa0505b5f>] ? btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
[132408.505075]  [<ffffffff810f3b30>] ? time_hardirqs_off+0x15/0x28
[132408.505075]  [<ffffffff81084d42>] ? trace_hardirqs_off+0xd/0xf
[132408.505075]  [<ffffffff811563a1>] ? kfree+0xb6/0x14e
[132408.505075]  [<ffffffff811563d0>] kfree+0xe5/0x14e
[132408.505075]  [<ffffffffa0505b5f>] btrfs_unpin_free_ino+0x8e/0x99 [btrfs]
[132408.505075]  [<ffffffffa0505e08>] caching_kthread+0x29e/0x2d9 [btrfs]
[132408.505075]  [<ffffffffa0505b6a>] ? btrfs_unpin_free_ino+0x99/0x99 [btrfs]
[132408.505075]  [<ffffffff8106698f>] kthread+0xef/0xf7
[132408.505075]  [<ffffffff810f3b08>] ? time_hardirqs_on+0x15/0x28
[132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
[132408.505075]  [<ffffffff814653d2>] ret_from_fork+0x42/0x70
[132408.505075]  [<ffffffff810668a0>] ? __kthread_parkme+0xad/0xad
[132408.505075] ffff880023e7d320: redzone 1:0x9f911029d74e35b, redzone 2:0x9f911029d74e35b.
[132409.501654] slab: double free detected in cache 'btrfs_free_space', objp ffff880023e7d320
[132409.503355] ------------[ cut here ]------------
[132409.504241] kernel BUG at mm/slab.c:2571!

Therefore fix this by having btrfs_unpin_free_ino() acquire the lock
that protects the rbtree while doing the searches and removing entries.

Fixes: 1c70d8fb4d ("Btrfs: fix inode caching vs tree log")
Cc: stable@vger.kernel.org
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Chris Mason <clm@fb.com>
2015-06-30 14:36:46 -07:00
..
tests btrfs: qgroup: Switch self test to extent-oriented qgroup mechanism. 2015-06-10 09:26:05 -07:00
acl.c btrfs: remove useless ACL check 2014-06-09 17:20:42 -07:00
async-thread.c btrfs: Fix lockdep warning of wr_ctx->wr_lock in scrub_free_wr_ctx() 2015-06-10 07:04:52 -07:00
async-thread.h btrfs: Fix lockdep warning of wr_ctx->wr_lock in scrub_free_wr_ctx() 2015-06-10 07:04:52 -07:00
backref.c btrfs: backref: Add special time_seq == (u64)-1 case for 2015-06-10 09:25:45 -07:00
backref.h btrfs: cleanup, remove inode_item_info helper 2015-01-14 19:23:47 +01:00
btrfs_inode.h Btrfs: fix metadata inconsistencies after directory fsync 2015-03-26 17:56:23 -07:00
check-integrity.c Merge branch 'cleanups-post-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.1 2015-03-25 10:52:48 -07:00
check-integrity.h block: submit_bio_wait() conversions 2013-11-24 16:33:41 -07:00
compression.c Merge branch 'cleanups-post-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.1 2015-03-25 10:52:48 -07:00
compression.h btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
ctree.c Btrfs: fix up read_tree_block to return proper error 2015-06-03 04:03:08 -07:00
ctree.h Btrfs: fix race between balance and unused block group deletion 2015-06-30 14:36:46 -07:00
delayed-inode.c Btrfs: fill ->last_trans for delayed inode in btrfs_fill_inode. 2015-04-26 06:27:03 -07:00
delayed-inode.h Btrfs: introduce the delayed inode ref deletion for the single link inode 2014-01-28 13:20:09 -08:00
delayed-ref.c btrfs: delayed-ref: double free in btrfs_add_delayed_tree_ref() 2015-06-24 12:28:03 -07:00
delayed-ref.h btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots. 2015-06-10 09:26:23 -07:00
dev-replace.c Btrfs: sysfs: add support to show replacing target in the sysfs 2015-06-19 14:03:54 +02:00
dev-replace.h
dir-item.c Btrfs: make xattr replace operations atomic 2014-11-20 17:20:07 -08:00
disk-io.c Btrfs: fix race between balance and unused block group deletion 2015-06-30 14:36:46 -07:00
disk-io.h Btrfs: disk-io: replace root args iff only fs_info used 2015-02-16 18:48:43 +01:00
export.c VFS: normal filesystems (and lustre): d_inode() annotations 2015-04-15 15:06:57 -04:00
export.h
extent-tree.c Btrfs: fix race between balance and unused block group deletion 2015-06-30 14:36:46 -07:00
extent-tree.h btrfs: qgroup: Add new qgroup calculation function 2015-06-10 09:25:49 -07:00
extent_io.c Btrfs: set UNWRITTEN for prealloc'ed extents in fiemap 2015-06-03 04:03:03 -07:00
extent_io.h btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
extent_map.c Btrfs: do not move em to modified list when unpinning 2014-11-21 11:59:54 -08:00
extent_map.h Btrfs: fix NULL pointer crash when running balance and scrub concurrently 2014-06-19 14:20:55 -07:00
file-item.c Merge branch 'cleanups-post-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.1 2015-03-25 10:52:48 -07:00
file.c Btrfs: avoid syncing log in the fast fsync path when not necessary 2015-06-10 07:02:43 -07:00
free-space-cache.c Btrfs: fix mutex unlock without prior lock on space cache truncation 2015-06-02 19:34:34 -07:00
free-space-cache.h Btrfs: allow block group cache writeout outside critical section in commit 2015-04-10 14:07:22 -07:00
hash.c btrfs: LLVMLinux: Remove VLAIS 2014-10-14 10:51:22 +02:00
hash.h Btrfs: fix btrfs boot when compiled as built-in 2014-01-28 13:20:31 -08:00
inode-item.c Btrfs: fix fsync log replay for inodes with a mix of regular refs and extrefs 2015-01-21 18:02:05 -08:00
inode-map.c Btrfs: fix race between caching kthread and returning inode to inode cache 2015-06-30 14:36:46 -07:00
inode-map.h
inode.c Btrfs: fix hang during inode eviction due to concurrent readahead 2015-06-03 04:03:09 -07:00
ioctl.c btrfs: Handle unaligned length in extent_same 2015-06-10 07:02:50 -07:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
locking.c btrfs: fix lockups from btrfs_clear_path_blocking 2014-11-19 10:34:35 -08:00
locking.h btrfs: fix lockups from btrfs_clear_path_blocking 2014-11-19 10:34:35 -08:00
lzo.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
Makefile Btrfs: add sanity tests for new qgroup accounting code 2014-06-09 17:20:49 -07:00
math.h btrfs: cleanup 64bit/32bit divs, compile time constants 2015-03-03 17:23:57 +01:00
ordered-data.c Btrfs: don't attach unnecessary extents to transaction on fsync 2015-06-10 07:02:44 -07:00
ordered-data.h Btrfs: avoid syncing log in the fast fsync path when not necessary 2015-06-10 07:02:43 -07:00
orphan.c btrfs: kill the key type accessor helpers 2014-09-17 13:37:12 -07:00
print-tree.c btrfs: remove parameter blocksize from read_tree_block 2014-10-02 17:14:50 +02:00
print-tree.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
props.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00
props.h Btrfs: add support for inode properties 2014-01-28 13:20:24 -08:00
qgroup.c btrfs: qgroup: allow user to clear the limitation on qgroup 2015-06-30 13:20:00 -07:00
qgroup.h btrfs: qgroup: Cleanup the old ref_node-oriented mechanism. 2015-06-10 09:26:11 -07:00
raid56.c Merge branch 'cleanups-post-3.19' of git://git.kernel.org/pub/scm/linux/kernel/git/kdave/linux into for-linus-4.1 2015-03-25 10:52:48 -07:00
raid56.h Btrfs: Make raid_map array be inlined in btrfs_bio structure 2015-01-21 18:06:47 -08:00
rcu-string.h
reada.c Btrfs: add ref_count and free function for btrfs_bio 2015-01-21 18:06:48 -08:00
relocation.c Btrfs: fix up read_tree_block to return proper error 2015-06-03 04:03:08 -07:00
root-tree.c Btrfs: use bitfield instead of integer data type for the some variants in btrfs_root 2014-06-09 17:20:40 -07:00
scrub.c btrfs: add error handling for scrub_workers_get() 2015-06-30 13:20:03 -07:00
send.c Btrfs: use received_uuid of parent during send 2015-06-12 13:20:38 -07:00
send.h btrfs: make static code static & remove dead code 2013-05-06 15:55:23 -04:00
struct-funcs.c
super.c Btrfs: show subvol= and subvolid= in /proc/mounts 2015-06-03 04:03:02 -07:00
sysfs.c Btrfs: Check if kobject is initialized before put 2015-06-22 14:43:31 +02:00
sysfs.h Btrfs: sysfs: btrfs_sysfs_remove_fsid() make it non static 2015-05-27 12:27:22 +02:00
transaction.c btrfs: qgroup: Make snapshot accounting work with new extent-oriented 2015-06-10 09:26:29 -07:00
transaction.h btrfs: qgroup: Add the ability to skip given qgroup for old/new_roots. 2015-06-10 09:26:23 -07:00
tree-defrag.c btrfs: let tree defrag work in SSD mode 2015-06-02 19:34:33 -07:00
tree-log.c Btrfs: remove csum_bytes_left 2015-06-03 04:03:06 -07:00
tree-log.h Btrfs: fix metadata inconsistencies after directory fsync 2015-03-26 17:56:23 -07:00
ulist.c btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c Btrfs: make btrfs_search_forward return with nodes unlocked 2014-09-17 13:38:02 -07:00
volumes.c Btrfs: fix race between balance and unused block group deletion 2015-06-30 14:36:46 -07:00
volumes.h Btrfs: sysfs: add pointer to access fs_info from fs_devices 2015-05-27 12:27:21 +02:00
xattr.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2015-04-26 17:22:07 -07:00
xattr.h btrfs: use generic posix ACL infrastructure 2014-01-25 23:58:18 -05:00
zlib.c btrfs: constify structs with op functions or static definitions 2015-02-16 18:48:44 +01:00