linux/fs
Dave Chinner 96355d5a1f xfs: Don't allow logging of XFS_ISTALE inodes
In tracking down a problem in this patchset, I discovered we are
reclaiming dirty stale inodes. This wasn't discovered until inodes
were always attached to the cluster buffer and then the rcu callback
that freed inodes was assert failing because the inode still had an
active pointer to the cluster buffer after it had been reclaimed.

Debugging the issue indicated that this was a pre-existing issue
resulting from the way the inodes are handled in xfs_inactive_ifree.
When we free a cluster buffer from xfs_ifree_cluster, all the inodes
in cache are marked XFS_ISTALE. Those that are clean have nothing
else done to them and so eventually get cleaned up by background
reclaim. i.e. it is assumed we'll never dirty/relog an inode marked
XFS_ISTALE.

On journal commit dirty stale inodes as are handled by both
buffer and inode log items to run though xfs_istale_done() and
removed from the AIL (buffer log item commit) or the log item will
simply unpin it because the buffer log item will clean it. What happens
to any specific inode is entirely dependent on which log item wins
the commit race, but the result is the same - stale inodes are
clean, not attached to the cluster buffer, and not in the AIL. Hence
inode reclaim can just free these inodes without further care.

However, if the stale inode is relogged, it gets dirtied again and
relogged into the CIL. Most of the time this isn't an issue, because
relogging simply changes the inode's location in the current
checkpoint. Problems arise, however, when the CIL checkpoints
between two transactions in the xfs_inactive_ifree() deferops
processing. This results in the XFS_ISTALE inode being redirtied
and inserted into the CIL without any of the other stale cluster
buffer infrastructure being in place.

Hence on journal commit, it simply gets unpinned, so it remains
dirty in memory. Everything in inode writeback avoids XFS_ISTALE
inodes so it can't be written back, and it is not tracked in the AIL
so there's not even a trigger to attempt to clean the inode. Hence
the inode just sits dirty in memory until inode reclaim comes along,
sees that it is XFS_ISTALE, and goes to reclaim it. This reclaiming
of a dirty inode caused use after free, list corruptions and other
nasty issues later in this patchset.

Hence this patch addresses a violation of the "never log XFS_ISTALE
inodes" caused by the deferops processing rolling a transaction
and relogging a stale inode in xfs_inactive_free. It also adds a
bunch of asserts to catch this problem in debug kernels so that
we don't reintroduce this problem in future.

Reproducer for this issue was generic/558 on a v4 filesystem.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com>
Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com>
2020-07-06 10:46:58 -07:00
..
9p 9p: read only once on O_NONBLOCK 2020-03-27 09:29:56 +00:00
adfs docs: filesystems: fix renamed references 2020-04-20 15:45:22 -06:00
affs docs: filesystems: fix renamed references 2020-04-20 15:45:22 -06:00
afs afs: Fix storage of cell names 2020-06-27 22:04:24 -07:00
autofs
befs
bfs docs: filesystems: fix renamed references 2020-04-20 15:45:22 -06:00
btrfs for-5.8-rc2-tag 2020-06-23 09:20:11 -07:00
cachefiles A fair amount of stuff this time around, dominated by yet another massive 2020-06-01 15:45:27 -07:00
ceph ceph: skip checking caps when session reconnecting and releasing reqs 2020-06-01 13:22:53 +02:00
cifs cifs: prevent truncation from long to int in wait_for_free_credits 2020-07-01 20:01:26 -05:00
coda docs: filesystems: convert coda.txt to ReST 2020-05-05 09:22:21 -06:00
configfs A fair amount of stuff this time around, dominated by yet another massive 2020-06-01 15:45:27 -07:00
cramfs docs: filesystems: fix renamed references 2020-04-20 15:45:22 -06:00
crypto fscrypt updates for 5.8 2020-06-01 12:10:17 -07:00
debugfs Merge 5.7-rc3 into driver-core-next 2020-04-27 09:34:55 +02:00
devpts
dlm dlm for 5.8 2020-06-05 16:43:16 -07:00
ecryptfs A fair amount of stuff this time around, dominated by yet another massive 2020-06-01 15:45:27 -07:00
efivarfs efivarfs: Don't return -EINTR when rate-limiting reads 2020-06-15 14:38:56 +02:00
efs
erofs erofs: fix partially uninitialized misuse in z_erofs_onlinepage_fixup 2020-06-24 09:47:44 +08:00
exfat exfat: flush dirty metadata in fsync 2020-06-29 17:11:18 +09:00
exportfs
ext2 mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
ext4 This is the second round of ext4 commits for 5.8 merge window. It 2020-06-15 09:32:10 -07:00
f2fs f2fs-for-5.8-rc1 2020-06-09 11:28:59 -07:00
fat fat: improve the readahead for FAT entries 2020-06-04 19:06:25 -07:00
freevxfs
fscache Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
fuse fuse update for 5.8 2020-06-09 15:48:24 -07:00
gfs2 gfs2: The freeze glock should never be frozen 2020-07-03 12:05:35 +02:00
hfs for-5.8/block-2020-06-01 2020-06-02 15:29:19 -07:00
hfsplus block: remove the error_sector argument to blkdev_issue_flush 2020-05-22 08:45:46 -06:00
hostfs hostfs: Use kasprintf() instead of fixed buffer formatting 2020-03-29 23:23:00 +02:00
hpfs hpfs: fix warning due to superfluous semicolon 2020-06-06 10:08:17 -07:00
hugetlbfs mmap locking API: convert mmap_sem API comments 2020-06-09 09:39:14 -07:00
iomap New code for 5.8: 2020-06-13 12:44:30 -07:00
isofs for-5.8/block-2020-06-01 2020-06-02 15:29:19 -07:00
jbd2 This is the second round of ext4 commits for 5.8 merge window. It 2020-06-15 09:32:10 -07:00
jffs2 jffs2: Replace zero-length array with flexible-array 2020-06-15 23:08:31 -05:00
jfs Replace zero-length array in JFS 2020-06-02 20:11:35 -07:00
kernfs mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
lockd
minix
nfs NFSv4 fix CLOSE not waiting for direct IO compeletion 2020-06-26 08:43:14 -04:00
nfs_common
nfsd nfsd: fix nfsdfs inode reference count leak 2020-06-29 14:48:28 -04:00
nilfs2 nilfs2: fix null pointer dereference at nilfs_segctor_do_construct() 2020-06-10 19:14:17 -07:00
nls treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
notify treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
ntfs Merge branch 'akpm' (patches from Andrew) 2020-06-02 12:21:36 -07:00
ocfs2 ocfs2: fix value of OCFS2_INVALID_SLOT 2020-06-26 00:27:37 -07:00
omfs fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
openpromfs
orangefs orangefs: a conversion and a cleanup... 2020-06-05 16:44:36 -07:00
overlayfs overlayfs update for 5.8 2020-06-09 15:40:50 -07:00
proc Merge branch 'fixes' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-07-03 23:20:14 -07:00
pstore Merge branch 'uaccess.__copy_from_user' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-01 16:18:46 -07:00
qnx4
qnx6 fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
quota sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
ramfs
reiserfs \n 2020-06-04 13:53:10 -07:00
romfs treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
squashfs Squashfs: Replace zero-length array with flexible-array 2020-06-15 23:08:32 -05:00
sysfs RDMA 5.8 merge window pull request 2020-06-05 14:05:57 -07:00
sysv docs: filesystems: fix renamed references 2020-04-20 15:45:22 -06:00
tracefs
ubifs mm: remove the pgprot argument to __vmalloc 2020-06-02 10:59:11 -07:00
udf for-5.8/block-2020-06-01 2020-06-02 15:29:19 -07:00
ufs
unicode .gitignore: add SPDX License Identifier 2020-03-25 11:50:48 +01:00
vboxsf vboxsf: don't use the source name in the bdi name 2020-05-07 08:45:47 -06:00
verity fs-verity: remove unnecessary extern keywords 2020-05-12 16:44:00 -07:00
xfs xfs: Don't allow logging of XFS_ISTALE inodes 2020-07-06 10:46:58 -07:00
zonefs zonefs changes for 5.8 2020-06-04 13:50:13 -07:00
aio.c aio: Replace zero-length array with flexible-array 2020-06-15 23:08:25 -05:00
anon_inodes.c
attr.c
bad_inode.c fs: move the fiemap definitions out of fs.h 2020-06-03 23:16:55 -04:00
binfmt_aout.c exec: Rename flush_old_exec begin_new_exec 2020-05-07 16:55:47 -05:00
binfmt_elf.c Merge branch 'uaccess.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:02:54 -07:00
binfmt_elf_fdpic.c Merge branch 'uaccess.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:02:54 -07:00
binfmt_em86.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_flat.c Merge branch 'uaccess.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:02:54 -07:00
binfmt_misc.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
binfmt_script.c Merge branch 'akpm' (patches from Andrew) 2020-06-04 19:18:29 -07:00
block_dev.c block: make function 'kill_bdev' static 2020-06-18 09:24:35 -06:00
buffer.c fs/buffer.c: use attach/detach_page_private 2020-06-02 10:59:07 -07:00
char_dev.c vfs: allow unprivileged whiteout creation 2020-05-14 16:44:23 +02:00
compat.c
compat_binfmt_elf.c Split the old READ_IMPLIES_EXEC workaround from executable PT_GNU_STACK 2020-06-05 13:45:21 -07:00
coredump.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
d_path.c
dax.c dax,iomap: Add helper dax_iomap_zero() to zero a range 2020-04-02 19:15:03 -07:00
dcache.c Merge git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next 2020-06-03 16:27:18 -07:00
dcookies.c
direct-io.c for-5.8-part2-tag 2020-06-14 09:47:25 -07:00
drop_caches.c sysctl: pass kernel pointers to ->proc_handler 2020-04-27 02:07:40 -04:00
eventfd.c eventfd: convert to f_op->read_iter() 2020-05-06 22:33:43 -04:00
eventpoll.c epoll: call final ep_events_available() check under the lock 2020-05-14 10:00:35 -07:00
exec.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
fcntl.c
fhandle.c
file.c fix multiplication overflow in copy_fdtable() 2020-05-19 18:29:36 -04:00
file_table.c Revert "fs: Do not check if there is a fsnotify watcher on pseudo inodes" 2020-06-29 09:40:55 -07:00
filesystems.c fs/filesystems.c: downgrade user-reachable WARN_ONCE() to pr_warn_once() 2020-04-10 15:36:22 -07:00
fs-writeback.c A lot of bug fixes and cleanups for ext4, including: 2020-06-05 16:19:28 -07:00
fs_context.c vfs: don't parse "silent" option 2020-05-14 16:44:25 +02:00
fs_parser.c fs_parse: remove pr_notice() about each validation 2020-04-02 09:35:26 -07:00
fs_pin.c
fs_struct.c
fs_types.c
fsopen.c
inode.c AFS Changes 2020-06-05 16:26:36 -07:00
internal.h A lot of bug fixes and cleanups for ext4, including: 2020-06-05 16:19:28 -07:00
io-wq.c io_uring: cancel all task's requests on exit 2020-06-15 08:51:34 -06:00
io-wq.h io_uring: cancel by ->task not pid 2020-06-15 08:51:38 -06:00
io_uring.c io_uring: fix regression with always ignoring signals in io_cqring_wait() 2020-07-04 13:44:45 -06:00
ioctl.c fs: remove the access_ok() check in ioctl_fiemap 2020-06-03 23:16:55 -04:00
Kconfig treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
Kconfig.binfmt treewide: replace '---help---' in Kconfig files with 'help' 2020-06-14 01:57:21 +09:00
libfs.c block: remove the error_sector argument to blkdev_issue_flush 2020-05-22 08:45:46 -06:00
locks.c Highlights: 2020-06-11 10:33:13 -07:00
Makefile
mbcache.c
mount.h proc/mounts: add cursor 2020-05-14 16:44:24 +02:00
mpage.c fs: convert mpage_readpages to mpage_readahead 2020-06-02 10:59:07 -07:00
namei.c vfs: clean up posix_acl_permission() logic aroudn MAY_NOT_BLOCK 2020-06-08 11:04:19 -07:00
namespace.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:09:11 -07:00
no-block.c
nsfs.c nsproxy: attach to namespaces via pidfds 2020-05-13 11:41:22 +02:00
open.c Merge branch 'akpm' (patches from Andrew) 2020-06-02 12:21:36 -07:00
pipe.c Notifications over pipes + Keyring notifications 2020-06-13 09:56:21 -07:00
pnode.c propagate_one(): mnt_set_mountpoint() needs mount_lock 2020-04-27 10:37:14 -04:00
pnode.h
posix_acl.c vfs: clean up posix_acl_permission() logic aroudn MAY_NOT_BLOCK 2020-06-08 11:04:19 -07:00
proc_namespace.c Merge branch 'proc-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2020-06-04 13:54:34 -07:00
read_write.c powerpc: Add back __ARCH_WANT_SYS_LLSEEK macro 2020-04-03 00:09:59 +11:00
readdir.c readdir.c: get rid of the last __put_user(), drop now-useless access_ok() 2020-05-01 20:29:54 -04:00
select.c pselect6() and friends: take handling the combined 6th/7th args into helper 2020-05-29 19:10:42 -04:00
seq_file.c fs/seq_file.c: seq_read: Update pr_info_ratelimited 2020-06-04 19:06:25 -07:00
signalfd.c
splice.c Notifications over pipes + Keyring notifications 2020-06-13 09:56:21 -07:00
stack.c
stat.c New code for 5.8: 2020-06-02 19:45:12 -07:00
statfs.c
super.c Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2020-06-10 16:09:11 -07:00
sync.c overlayfs update for 5.8 2020-06-09 15:40:50 -07:00
timerfd.c
userfaultfd.c mmap locking API: convert mmap_sem comments 2020-06-09 09:39:14 -07:00
utimes.c utimensat: AT_EMPTY_PATH support 2020-05-14 16:44:24 +02:00
xattr.c xattr: fix uninitialized out-param 2020-04-09 15:33:09 -04:00