linux/fs
Dave Chinner 292378edcb xfs: remote attribute blocks aren't really userdata
When adding a new remote attribute, we write the attribute to the
new extent before the allocation transaction is committed. This
means we cannot reuse busy extents as that violates crash
consistency semantics. Hence we currently treat remote attribute
extent allocation like userdata because it has the same overwrite
ordering constraints as userdata.

Unfortunately, this also allows the allocator to incorrectly apply
extent size hints to the remote attribute extent allocation. This
results in interesting failures, such as transaction block
reservation overruns and in-memory inode attribute fork corruption.

To fix this, we need to separate the busy extent reuse configuration
from the userdata configuration. This changes the definition of
XFS_BMAPI_METADATA slightly - it now means that allocation is
metadata and reuse of busy extents is acceptible due to the metadata
ordering semantics of the journal. If this flag is not set, it
means the allocation is that has unordered data writeback, and hence
busy extent reuse is not allowed. It no longer implies the
allocation is for user data, just that the data write will not be
strictly ordered. This matches the semantics for both user data
and remote attribute block allocation.

As such, This patch changes the "userdata" field to a "datatype"
field, and adds a "no busy reuse" flag to the field.
When we detect an unordered data extent allocation, we immediately set
the no reuse flag. We then set the "user data" flags based on the
inode fork we are allocating the extent to. Hence we only set
userdata flags on data fork allocations now and consider attribute
fork remote extents to be an unordered metadata extent.

The result is that remote attribute extents now have the expected
allocation semantics, and the data fork allocation behaviour is
completely unchanged.

It should be noted that there may be other ways to fix this (e.g.
use ordered metadata buffers for the remote attribute extent data
write) but they are more invasive and difficult to validate both
from a design and implementation POV. Hence this patch takes the
simple, obvious route to fixing the problem...

Reported-and-tested-by: Ross Zwisler <ross.zwisler@linux.intel.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2016-09-26 08:21:28 +10:00
..
9p Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-07 10:01:14 -04:00
adfs Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-07 10:01:14 -04:00
affs get rid of 'parent' argument of ->d_compare() 2016-07-31 16:37:25 -04:00
afs rxrpc: Limit the listening backlog 2016-06-10 18:14:47 -07:00
autofs4 Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-06 09:49:02 -04:00
befs fs/befs/io.c:befs_bread(): remove unneeded initialization to NULL 2016-05-23 17:04:14 -07:00
bfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
btrfs Merge branch 'for-linus-4.8' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-08-10 11:16:03 -07:00
cachefiles cachefiles: Fix race between inactivating and culling a cache object 2016-08-03 13:33:26 -04:00
ceph ceph: initialize pathbase in the !dentry case in encode_caps_cb() 2016-08-09 17:26:56 +02:00
cifs get rid of 'parent' argument of ->d_compare() 2016-07-31 16:37:25 -04:00
coda drop redundant ->owner initializations 2016-05-29 19:08:00 -04:00
configfs configfs: don't set buffer_needs_fill to zero if show() returns error 2016-07-10 21:02:18 +09:00
cramfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
crypto block, fs, mm, drivers: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
debugfs Merge branch 'd_real' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into work.misc 2016-06-30 23:34:49 -04:00
devpts userns: Remove the now unnecessary FS_USERNS_DEV_MOUNT flag 2016-06-23 15:47:31 -05:00
dlm dlm: Use kmemdup instead of kmalloc and memcpy 2016-06-23 11:55:58 -05:00
ecryptfs ecryptfs: don't allow mmap when the lower fs doesn't support it 2016-07-08 10:35:28 -05:00
efivarfs get rid of 'parent' argument of ->d_compare() 2016-07-31 16:37:25 -04:00
efs fs/efs/super.c: fix return value 2016-05-20 17:58:30 -07:00
exofs block, fs, mm, drivers: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
exportfs introduce a parallel variant of ->iterate() 2016-05-02 19:49:29 -04:00
ext2 Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-06 09:49:02 -04:00
ext4 Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-07-28 12:59:05 -07:00
f2fs Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-06 09:49:02 -04:00
fat Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-07 10:01:14 -04:00
freevxfs freevxfs: update Kconfig information 2016-06-13 10:20:39 +02:00
fscache Merge branch 'd_real' of git://git.kernel.org/pub/scm/linux/kernel/git/mszeredi/vfs into work.misc 2016-06-30 23:34:49 -04:00
fuse Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-06 09:49:02 -04:00
gfs2 fs: return EPERM on immutable inode 2016-08-07 10:03:31 -04:00
hfs Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-07 10:01:14 -04:00
hfsplus Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-07 10:01:14 -04:00
hostfs hostfs: Freeing an ERR_PTR in hostfs_fill_sb_common() 2016-08-04 00:18:10 +02:00
hpfs get rid of 'parent' argument of ->d_compare() 2016-07-31 16:37:25 -04:00
hugetlbfs mm, fs: remove remaining PAGE_CACHE_* and page_cache_{get,release} usage 2016-04-04 10:41:08 -07:00
isofs get rid of 'parent' argument of ->d_compare() 2016-07-31 16:37:25 -04:00
jbd2 The major change this cycle is deleting ext4's copy of the file system 2016-07-26 18:35:55 -07:00
jffs2 vfs: make the string hashes salt the hash 2016-06-10 20:21:46 -07:00
jfs get rid of 'parent' argument of ->d_compare() 2016-07-31 16:37:25 -04:00
kernfs Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-07-29 15:54:19 -07:00
lockd Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-07-28 12:59:05 -07:00
logfs Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-06 09:49:02 -04:00
minix simple local filesystems: switch to ->iterate_shared() 2016-05-02 19:49:32 -04:00
ncpfs get rid of 'parent' argument of ->d_compare() 2016-07-31 16:37:25 -04:00
nfs NFS client bugfixes for Linux 4.8 2016-08-12 12:32:24 -07:00
nfs_common
nfsd nfsd: don't return an unhashed lock stateid after taking mutex 2016-08-12 16:10:25 -04:00
nilfs2 nilfs2: move ioctl interface and disk layout to uapi separately 2016-08-02 19:35:21 -04:00
nls
notify fsnotify: avoid spurious EMFILE errors from inotify_init() 2016-05-19 19:12:14 -07:00
ntfs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-07-28 12:59:05 -07:00
ocfs2 Merge branch 'work.const-qstr' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-06 09:49:02 -04:00
omfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
openpromfs more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
orangefs orangefs: Account for jiffies wraparound. 2016-08-02 15:39:13 -04:00
overlayfs ovl: simplify empty checking 2016-07-29 12:05:25 +02:00
proc proc, meminfo: use correct helpers for calculating LRU sizes in meminfo 2016-08-11 16:58:13 -07:00
pstore ramoops: use persistent_ram_free() instead of kfree() for freeing prz 2016-08-05 11:21:46 -07:00
qnx4 more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
qnx6 more trivial ->iterate_shared conversions 2016-05-09 11:41:14 -04:00
quota Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-07-29 15:54:19 -07:00
ramfs tmpfs/ramfs: fix VM_MAYSHARE mappings for NOMMU 2016-05-20 17:58:30 -07:00
reiserfs reiserfs: fix "new_insert_key may be used uninitialized ..." 2016-08-02 19:35:22 -04:00
romfs romfs, squashfs: switch to ->iterate_shared() 2016-05-09 11:41:15 -04:00
squashfs fs: have ll_rw_block users pass in op and flags separately 2016-06-07 13:41:38 -06:00
sysfs kernfs: The cgroup filesystem also benefits from SB_I_NOEXEC 2016-06-23 15:41:56 -05:00
sysv vfs: make the string hashes salt the hash 2016-06-10 20:21:46 -07:00
tracefs tracefs: ->d_parent is never NULL or negative... 2016-05-29 16:22:07 -04:00
ubifs ubifs: switch_gc_head: Remove redondant sync of wbuf 2016-07-29 23:32:37 +02:00
udf Merge branch 'for-4.8/drivers' of git://git.kernel.dk/linux-block 2016-07-26 15:37:51 -07:00
ufs Merge branch 'work.misc' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-07-28 12:59:05 -07:00
xfs xfs: remote attribute blocks aren't really userdata 2016-09-26 08:21:28 +10:00
aio.c aio: make aio_setup_ring killable 2016-05-23 17:04:14 -07:00
anon_inodes.c
attr.c vfs: Don't modify inodes with a uid or gid unknown to the vfs 2016-07-05 15:06:46 -05:00
bad_inode.c switch ->setxattr() to passing dentry and inode separately 2016-05-27 20:09:16 -04:00
binfmt_aout.c fs: fix binfmt_aout.c build error 2016-05-28 16:34:59 -07:00
binfmt_elf.c binfmt_elf: fix calculations for bss padding 2016-08-02 19:35:14 -04:00
binfmt_elf_fdpic.c elf_fdpic_transfer_args_to_stack(): make it generic 2016-07-25 16:51:49 +10:00
binfmt_em86.c fs/binfmt_em86.c: fix incompatible pointer type 2016-08-02 19:35:15 -04:00
binfmt_flat.c binfmt_flat: allow compressed flat binary format to work on MMU systems 2016-07-28 13:29:12 +10:00
binfmt_misc.c binfmt_misc for-linus on 20160727 2016-08-07 10:13:14 -04:00
binfmt_script.c
block_dev.c block/mm: make bdev_ops->rw_page() take a bool for read/write 2016-08-07 14:41:02 -06:00
buffer.c xfs: update for 4.8-rc1 2016-07-27 09:53:35 -07:00
char_dev.c chardev: add missing line break in pr_warn 2016-07-14 16:21:53 +09:00
compat.c Fix a number of bugs, most notably a potential stale data exposure 2016-05-24 12:55:26 -07:00
compat_binfmt_elf.c
compat_ioctl.c [media] cec: add compat32 ioctl support 2016-06-28 10:00:13 -03:00
coredump.c coredump: fix dumping through pipes 2016-06-07 22:07:09 -04:00
dax.c libnvdimm for 4.8 2016-07-28 17:38:16 -07:00
dcache.c Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-07 10:01:14 -04:00
dcookies.c
direct-io.c direct-io: use bio set/get op accessors 2016-06-07 13:41:38 -06:00
drop_caches.c
eventfd.c eventfd: document lockless access in eventfd_poll 2016-03-22 15:36:02 -07:00
eventpoll.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
exec.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/gerg/m68knommu 2016-08-04 18:04:44 -04:00
fcntl.c
fhandle.c fs/coredump: prevent fsuid=0 dumps into user-controlled directories 2016-03-22 15:36:02 -07:00
file.c give readdir(2)/getdents(2)/etc. uniform exclusion with lseek() 2016-05-02 19:49:28 -04:00
file_table.c
filesystems.c
fs-writeback.c mm, writeback: flush plugged IO in wakeup_flusher_threads() 2016-08-09 19:58:06 -06:00
fs_pin.c
fs_struct.c
inode.c Merge branch 'for-linus-2' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-08-07 10:01:14 -04:00
internal.h binfmt_misc for-linus on 20160727 2016-08-07 10:13:14 -04:00
ioctl.c vfs: ioctl: prevent double-fetch in dedupe ioctl 2016-07-28 15:23:12 -07:00
iomap.c iomap: don't set FIEMAP_EXTENT_MERGED for extent based filesystems 2016-08-29 11:33:58 +10:00
Kconfig Highlights: 2016-08-04 19:59:06 -04:00
Kconfig.binfmt m68k: enable binfmt_flat on systems with an MMU 2016-07-28 13:29:13 +10:00
libfs.c lockless next_positive() 2016-06-20 17:11:29 -04:00
locks.c locks: use file_inode() 2016-07-01 10:24:18 -04:00
Makefile fs: introduce iomap infrastructure 2016-06-21 09:23:11 +10:00
mbcache.c mbcache: add reusable flag to cache entries 2016-02-22 22:44:04 -05:00
mount.h
mpage.c block/mm: make bdev_ops->rw_page() take a bool for read/write 2016-08-07 14:41:02 -06:00
namei.c fs: return EPERM on immutable inode 2016-08-07 10:03:31 -04:00
namespace.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-07-29 15:54:19 -07:00
no-block.c
nsfs.c
open.c binfmt_misc for-linus on 20160727 2016-08-07 10:13:14 -04:00
pipe.c mm: memcontrol: only mark charged pages with PageKmemcg 2016-08-09 10:14:10 -07:00
pnode.c propogate_mnt: Handle the first propogated copy being a slave 2016-05-05 09:54:45 -05:00
pnode.h
posix_acl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-07-29 15:54:19 -07:00
proc_namespace.c vfs: show_vfsstat: do not ignore errors from show_devname method 2016-03-16 13:09:08 -04:00
read_write.c x86/syscalls: Add compat_sys_preadv64v2/compat_sys_pwritev64v2 2016-07-15 10:30:26 +02:00
readdir.c restore killability of old mutex_lock_killable(&inode->i_mutex) users 2016-05-26 00:13:25 -04:00
select.c fs: poll/select/recvmmsg: use timespec64 for timeout events 2016-05-19 19:12:14 -07:00
seq_file.c Make file credentials available to the seqfile interfaces 2016-04-14 12:56:09 -07:00
signalfd.c
splice.c Merge branch 'ovl-fixes' into for-linus 2016-05-11 00:00:29 -04:00
stack.c
stat.c
statfs.c
super.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/ebiederm/user-namespace 2016-07-29 15:54:19 -07:00
sync.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
timerfd.c timerfd: Reject ALARM timerfds without CAP_WAKE_ALARM 2016-06-09 23:42:38 +02:00
userfaultfd.c mm: introduce fault_env 2016-07-26 16:19:19 -07:00
utimes.c fs: return EPERM on immutable inode 2016-08-07 10:03:31 -04:00
xattr.c vfs: Don't modify inodes with a uid or gid unknown to the vfs 2016-07-05 15:06:46 -05:00