linux/fs/btrfs
Josef Bacik 79d3d1d12e btrfs: don't allow large NOWAIT direct reads
Dylan and Jens reported a problem where they had an io_uring test that
was returning short reads, and bisected it to ee5b46a353 ("btrfs:
increase direct io read size limit to 256 sectors").

The root cause is their test was doing larger reads via io_uring with
NOWAIT and async.  This was triggering a page fault during the direct
read, however the first page was able to work just fine and thus we
submitted a 4k read for a larger iocb.

Btrfs allows for partial IO's in this case specifically because we don't
allow page faults, and thus we'll attempt to do any io that we can,
submit what we could, come back and fault in the rest of the range and
try to do the remaining IO.

However for !is_sync_kiocb() we'll call ->ki_complete() as soon as the
partial dio is done, which is incorrect.  In the sync case we can exit
the iomap code, submit more io's, and return with the amount of IO we
were able to complete successfully.

We were always doing short reads in this case, but for NOWAIT we were
getting saved by the fact that we were limiting direct reads to
sectorsize, and if we were larger than that we would return EAGAIN.

Fix the regression by simply returning EAGAIN in the NOWAIT case with
larger reads, that way io_uring can retry and get the larger IO and have
the fault logic handle everything properly.

This still leaves the AIO short read case, but that existed before this
change.  The way to properly fix this would be to handle partial iocb
completions, but that's a lot of work, for now deal with the regression
in the most straightforward way possible.

Reported-by: Dylan Yudaken <dylany@fb.com>
Fixes: ee5b46a353 ("btrfs: increase direct io read size limit to 256 sectors")
Reviewed-by: Filipe Manana <fdmanana@suse.com>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
2022-08-22 18:08:07 +02:00
..
tests btrfs: add optimized btrfs_ino() version for 64 bits systems 2022-07-25 17:45:41 +02:00
acl.c btrfs: reserve correct number of items for inode creation 2022-05-16 17:03:08 +02:00
async-thread.c btrfs: simplify WQ_HIGHPRI handling in struct btrfs_workqueue 2022-05-16 17:03:15 +02:00
async-thread.h btrfs: remove unused typedefs get_extent_t and btrfs_work_func_t 2022-07-25 17:45:36 +02:00
backref.c btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino 2022-07-25 17:45:36 +02:00
backref.h btrfs: sink iterator parameter to btrfs_ioctl_logical_to_ino 2022-07-25 17:45:36 +02:00
block-group.c btrfs: reset RO counter on block group if we fail to relocate 2022-07-27 21:23:16 +02:00
block-group.h btrfs: zoned: prevent allocation from previous data relocation BG 2022-06-21 14:43:48 +02:00
block-rsv.c btrfs: use enum for btrfs_block_rsv::type 2022-07-25 17:45:40 +02:00
block-rsv.h btrfs: use enum for btrfs_block_rsv::type 2022-07-25 17:45:40 +02:00
btrfs_inode.h btrfs: add optimized btrfs_ino() version for 64 bits systems 2022-07-25 17:45:41 +02:00
check-integrity.c btrfs: check-integrity: simplify bio allocation in btrfsic_read_block 2022-05-16 17:03:12 +02:00
check-integrity.h btrfs: check-integrity: split submit_bio from btrfsic checking 2022-05-16 17:03:12 +02:00
compression.c btrfs: don't call btrfs_page_set_checked in finish_compressed_bio_read 2022-07-25 19:56:16 +02:00
compression.h btrfs: fix repair of compressed extents 2022-07-25 19:56:16 +02:00
ctree.c btrfs: fix lockdep splat with reloc root extent buffers 2022-08-17 16:19:12 +02:00
ctree.h btrfs: fix lockdep splat with reloc root extent buffers 2022-08-17 16:19:12 +02:00
delalloc-space.c btrfs: convert count_max_extents() to use fs_info->max_extent_size 2022-07-25 17:45:41 +02:00
delalloc-space.h
delayed-inode.c btrfs: batch up release of reserved metadata for delayed items used for deletion 2022-07-25 17:45:37 +02:00
delayed-inode.h btrfs: reduce amount of reserved metadata for delayed item insertion 2022-07-25 17:44:36 +02:00
delayed-ref.c btrfs: switch btrfs_block_rsv::full to bool 2022-07-25 17:45:40 +02:00
delayed-ref.h btrfs: remove btrfs_delayed_extent_op::is_data 2022-05-16 17:17:31 +02:00
dev-replace.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
dev-replace.h
dir-item.c btrfs: use btrfs_for_each_slot in btrfs_search_dir_index_item 2022-05-16 17:03:07 +02:00
discard.c btrfs: fix typos in comments 2021-06-22 14:11:57 +02:00
discard.h
disk-io.c btrfs: move lockdep class helpers to locking.c 2022-08-17 16:19:10 +02:00
disk-io.h btrfs: move lockdep class helpers to locking.c 2022-08-17 16:19:10 +02:00
export.c
export.h
extent-io-tree.h btrfs: Convert from invalidatepage to invalidate_folio 2022-03-15 08:23:29 -04:00
extent-tree.c btrfs: fix lockdep splat with reloc root extent buffers 2022-08-17 16:19:12 +02:00
extent_io.c btrfs: don't merge pages into bio if their page offset is not contiguous 2022-08-22 18:06:58 +02:00
extent_io.h btrfs: fix repair of compressed extents 2022-07-25 19:56:16 +02:00
extent_map.c btrfs: assert we have a write lock when removing and replacing extent maps 2022-03-14 13:13:50 +01:00
extent_map.h btrfs: defrag: don't use merged extent map for their generation check 2022-02-23 17:43:13 +01:00
file-item.c btrfs: handle csum lookup errors properly on reads 2022-03-14 13:13:51 +01:00
file.c btrfs: update generation of hole file extent item when merging holes 2022-08-22 18:06:42 +02:00
free-space-cache.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
free-space-cache.h btrfs: change name and type of private member of btrfs_free_space_ctl 2022-01-03 15:09:50 +01:00
free-space-tree.c btrfs: use rbtree with leftmost node cached for tracking lowest block group 2022-05-16 17:03:13 +02:00
free-space-tree.h
inode-item.c btrfs: make should_throttle loop local in btrfs_truncate_inode_items 2022-01-07 14:18:25 +01:00
inode-item.h btrfs: add inode to truncate control 2022-01-07 14:18:24 +01:00
inode.c btrfs: don't allow large NOWAIT direct reads 2022-08-22 18:08:07 +02:00
ioctl.c btrfs: use fs_info->max_extent_size in get_extent_max_capacity() 2022-07-25 17:45:41 +02:00
Kconfig btrfs: use generic Kconfig option for 256kB page size limit 2022-01-20 08:52:55 +02:00
locking.c btrfs: fix lockdep splat with reloc root extent buffers 2022-08-17 16:19:12 +02:00
locking.h btrfs: fix lockdep splat with reloc root extent buffers 2022-08-17 16:19:12 +02:00
lzo.c btrfs: replace kmap() with kmap_local_page() in lzo.c 2022-07-25 17:45:33 +02:00
Makefile Kbuild: add -Wno-shift-negative-value where -Wextra is used 2022-03-13 17:30:31 +09:00
misc.h btrfs: use correct header for div_u64 in misc.h 2021-09-07 14:29:50 +02:00
ordered-data.c btrfs: remove the finish_func argument to btrfs_mark_ordered_io_finished 2022-07-25 17:45:37 +02:00
ordered-data.h btrfs: remove the finish_func argument to btrfs_mark_ordered_io_finished 2022-07-25 17:45:37 +02:00
orphan.c
print-tree.c btrfs: unify the error handling pattern for read_tree_block() 2022-03-14 13:13:53 +01:00
print-tree.h
props.c btrfs: move common inode creation code into btrfs_create_new_inode() 2022-05-16 17:03:08 +02:00
props.h btrfs: move common inode creation code into btrfs_create_new_inode() 2022-05-16 17:03:08 +02:00
qgroup.c btrfs: avoid blocking on space revervation when doing nowait dio writes 2022-05-16 17:03:10 +02:00
qgroup.h btrfs: avoid blocking on space revervation when doing nowait dio writes 2022-05-16 17:03:10 +02:00
raid56.c btrfs: raid56: transfer the bio counter reference to the raid submission helpers 2022-07-25 17:45:40 +02:00
raid56.h btrfs: do not return errors from raid56_parity_recover 2022-07-25 17:45:39 +02:00
rcu-string.h
ref-verify.c btrfs: stop accessing ->extent_root directly 2022-01-03 15:09:49 +01:00
ref-verify.h
reflink.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
reflink.h
relocation.c btrfs: fix lockdep splat with reloc root extent buffers 2022-08-17 16:19:12 +02:00
root-tree.c btrfs: avoid blocking on space revervation when doing nowait dio writes 2022-05-16 17:03:10 +02:00
scrub.c btrfs: do not return errors from raid56_parity_recover 2022-07-25 17:45:39 +02:00
send.c btrfs: send: always use the rbtree based inode ref management infrastructure 2022-07-25 17:45:42 +02:00
send.h btrfs: send: add new command FILEATTR for file attributes 2022-07-25 17:45:38 +02:00
space-info.c btrfs: zoned: activate metadata block group on flush_space 2022-07-25 17:45:42 +02:00
space-info.h btrfs: zoned: introduce space_info->active_total_bytes 2022-07-25 17:45:42 +02:00
struct-funcs.c btrfs: remove redundant check in up check_setget_bounds 2022-07-25 17:45:33 +02:00
subpage.c btrfs: remove extent writepage address space operation 2022-07-25 17:45:37 +02:00
subpage.h btrfs: make nodesize >= PAGE_SIZE case to reuse the non-subpage routine 2022-05-16 17:03:11 +02:00
super.c btrfs: use mask for all RAID1* profiles in btrfs_calc_avail_data_space 2022-07-25 17:45:38 +02:00
sysfs.c btrfs: sysfs: remove BIG_METADATA feature files 2022-07-25 17:45:39 +02:00
sysfs.h
transaction.c btrfs: clean up chained assignments 2022-07-25 17:45:39 +02:00
transaction.h btrfs: pass btrfs_fs_info for deleting snapshots and cleaner 2022-03-14 13:13:52 +01:00
tree-checker.c btrfs: tree-checker: check for overlapping extent items 2022-08-17 16:20:25 +02:00
tree-checker.h btrfs: tree-checker: check extent buffer owner against owner rootid 2022-05-16 17:03:09 +02:00
tree-defrag.c btrfs: remove unnecessary extent root check in btrfs_defrag_leaves 2022-01-03 15:09:48 +01:00
tree-log.c btrfs: fix warning during log replay when bumping inode link count 2022-08-17 16:19:50 +02:00
tree-log.h btrfs: tree-log: make the return value for log syncing consistent 2022-07-25 17:45:34 +02:00
tree-mod-log.c
tree-mod-log.h
ulist.c
ulist.h
uuid-tree.c btrfs: drop the _nr from the item helpers 2022-01-03 15:09:43 +01:00
verity.c btrfs: drop the _nr from the item helpers 2022-01-03 15:09:43 +01:00
volumes.c btrfs: fix possible memory leak in btrfs_get_dev_args_from_path() 2022-08-22 18:06:33 +02:00
volumes.h btrfs: do not return errors from btrfs_map_bio 2022-07-25 17:45:39 +02:00
xattr.c btrfs: check if root is readonly while setting security xattr 2022-08-22 18:06:30 +02:00
xattr.h
zlib.c btrfs: zlib: replace kmap() with kmap_local_page() in zlib_decompress_bio() 2022-07-25 17:45:41 +02:00
zoned.c btrfs: zoned: wait until zone is finished when allocation didn't progress 2022-07-25 17:45:42 +02:00
zoned.h btrfs: zoned: activate metadata block group on flush_space 2022-07-25 17:45:42 +02:00
zstd.c btrfs: zstd: replace kmap() with kmap_local_page() 2022-07-25 17:45:40 +02:00