The first two changes that involve files outside of fs/ext4:

- submit_bh() can never return an error, so change it to return void,
   and remove the unused checks from its callers
 
 - fix I_DIRTY_TIME handling so it will be set even if the inode
   already has I_DIRTY_INODE
 
 Performance:
 
 - Always enable i_version counter (as btrfs and xfs already do).
   Remove some uneeded i_version bumps to avoid unnecessary nfs cache
   invalidations.
 
 - Wake up journal waters in FIFO order, to avoid some journal users
   from not getting a journal handle for an unfairly long time.
 
 - In ext4_write_begin() allocate any necessary buffer heads before
   starting the journal handle.
 
 - Don't try to prefetch the block allocation bitmaps for a read-only
   file system.
 
 Bug Fixes:
 
 - Fix a number of fast commit bugs, including resources leaks and out
   of bound references in various error handling paths and/or if the fast
   commit log is corrupted.
 
 - Avoid stopping the online resize early when expanding a file system
   which is less than 16TiB to a size greater than 16TiB.
 
 - Fix apparent metadata corruption caused by a race with a metadata
   buffer head getting migrated while it was trying to be read.
 
 - Mark the lazy initialization thread freezable to prevent suspend
   failures.
 
 - Other miscellaneous bug fixes.
 
 Cleanups:
 
 - Break up the incredibly long ext4_full_super() function by
   refactoring to move code into more understandable, smaller
   functions.
 
 - Remove the deprecated (and ignored) noacl and nouser_attr mount
   option.
 
 - Factor out some common code in fast commit handling.
 
 - Other miscellaneous cleanups.
 -----BEGIN PGP SIGNATURE-----
 
 iQEzBAABCAAdFiEEK2m5VNv+CHkogTfJ8vlZVpUNgaMFAmM8/2gACgkQ8vlZVpUN
 gaPohAf9GDMUq3QIYoWLlJ+ygJhL0xQGPfC6sypMjHaUO5GSo+1+sAMU3JBftxUS
 LrgTtmzSKzwp9PyOHNs+mswUzhLZivKVCLMmOznQUZS228GSVKProhN1LPL4UP2Q
 Ks8i1M5XTWS+mtJ5J5Mw6jRHxcjfT6ynyJKPnIWKTwXyeru1WSJ2PWqtWQD4EZkE
 lImECy0jX/zlK02s0jDYbNIbXIvI/TTYi7wT8o1ouLCAXMDv5gJRc5TXCVtX8i59
 /Pl9rGG/+IWTnYT/aQ668S2g0Cz6Wyv2EkmiPUW0Y8NoLaaouBYZoC2hDujiv+l1
 ucEI14TEQ+DojJTdChrtwKqgZfqDOw==
 =xoLC
 -----END PGP SIGNATURE-----

Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4

Pull ext4 updates from Ted Ts'o:
 "The first two changes involve files outside of fs/ext4:

   - submit_bh() can never return an error, so change it to return void,
     and remove the unused checks from its callers

   - fix I_DIRTY_TIME handling so it will be set even if the inode
     already has I_DIRTY_INODE

  Performance:

   - Always enable i_version counter (as btrfs and xfs already do).
     Remove some uneeded i_version bumps to avoid unnecessary nfs cache
     invalidations

   - Wake up journal waiters in FIFO order, to avoid some journal users
     from not getting a journal handle for an unfairly long time

   - In ext4_write_begin() allocate any necessary buffer heads before
     starting the journal handle

   - Don't try to prefetch the block allocation bitmaps for a read-only
     file system

  Bug Fixes:

   - Fix a number of fast commit bugs, including resources leaks and out
     of bound references in various error handling paths and/or if the
     fast commit log is corrupted

   - Avoid stopping the online resize early when expanding a file system
     which is less than 16TiB to a size greater than 16TiB

   - Fix apparent metadata corruption caused by a race with a metadata
     buffer head getting migrated while it was trying to be read

   - Mark the lazy initialization thread freezable to prevent suspend
     failures

   - Other miscellaneous bug fixes

  Cleanups:

   - Break up the incredibly long ext4_full_super() function by
     refactoring to move code into more understandable, smaller
     functions

   - Remove the deprecated (and ignored) noacl and nouser_attr mount
     option

   - Factor out some common code in fast commit handling

   - Other miscellaneous cleanups"

* tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4: (53 commits)
  ext4: fix potential out of bound read in ext4_fc_replay_scan()
  ext4: factor out ext4_fc_get_tl()
  ext4: introduce EXT4_FC_TAG_BASE_LEN helper
  ext4: factor out ext4_free_ext_path()
  ext4: remove unnecessary drop path references in mext_check_coverage()
  ext4: update 'state->fc_regions_size' after successful memory allocation
  ext4: fix potential memory leak in ext4_fc_record_regions()
  ext4: fix potential memory leak in ext4_fc_record_modified_inode()
  ext4: remove redundant checking in ext4_ioctl_checkpoint
  jbd2: add miss release buffer head in fc_do_one_pass()
  ext4: move DIOREAD_NOLOCK setting to ext4_set_def_opts()
  ext4: remove useless local variable 'blocksize'
  ext4: unify the ext4 super block loading operation
  ext4: factor out ext4_journal_data_mode_check()
  ext4: factor out ext4_load_and_init_journal()
  ext4: factor out ext4_group_desc_init() and ext4_group_desc_free()
  ext4: factor out ext4_geometry_check()
  ext4: factor out ext4_check_feature_compatibility()
  ext4: factor out ext4_init_metadata_csum()
  ext4: factor out ext4_encoding_init()
  ...
This commit is contained in:
Linus Torvalds 2022-10-06 17:45:53 -07:00
commit bc32a6330f
27 changed files with 993 additions and 806 deletions

View file

@ -274,6 +274,9 @@ or bottom half).
This is specifically for the inode itself being marked dirty,
not its data. If the update needs to be persisted by fdatasync(),
then I_DIRTY_DATASYNC will be set in the flags argument.
I_DIRTY_TIME will be set in the flags in case lazytime is enabled
and struct inode has times updated since the last ->dirty_inode
call.
``write_inode``
this method is called when the VFS needs to write an inode to

View file

@ -52,8 +52,8 @@
#include "internal.h"
static int fsync_buffers_list(spinlock_t *lock, struct list_head *list);
static int submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
struct writeback_control *wbc);
static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
struct writeback_control *wbc);
#define BH_ENTRY(list) list_entry((list), struct buffer_head, b_assoc_buffers)
@ -2673,8 +2673,8 @@ static void end_bio_bh_io_sync(struct bio *bio)
bio_put(bio);
}
static int submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
struct writeback_control *wbc)
static void submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
struct writeback_control *wbc)
{
const enum req_op op = opf & REQ_OP_MASK;
struct bio *bio;
@ -2717,12 +2717,11 @@ static int submit_bh_wbc(blk_opf_t opf, struct buffer_head *bh,
}
submit_bio(bio);
return 0;
}
int submit_bh(blk_opf_t opf, struct buffer_head *bh)
void submit_bh(blk_opf_t opf, struct buffer_head *bh)
{
return submit_bh_wbc(opf, bh, NULL);
submit_bh_wbc(opf, bh, NULL);
}
EXPORT_SYMBOL(submit_bh);
@ -2801,8 +2800,6 @@ EXPORT_SYMBOL(write_dirty_buffer);
*/
int __sync_dirty_buffer(struct buffer_head *bh, blk_opf_t op_flags)
{
int ret = 0;
WARN_ON(atomic_read(&bh->b_count) < 1);
lock_buffer(bh);
if (test_clear_buffer_dirty(bh)) {
@ -2817,14 +2814,14 @@ int __sync_dirty_buffer(struct buffer_head *bh, blk_opf_t op_flags)
get_bh(bh);
bh->b_end_io = end_buffer_write_sync;
ret = submit_bh(REQ_OP_WRITE | op_flags, bh);
submit_bh(REQ_OP_WRITE | op_flags, bh);
wait_on_buffer(bh);
if (!ret && !buffer_uptodate(bh))
ret = -EIO;
if (!buffer_uptodate(bh))
return -EIO;
} else {
unlock_buffer(bh);
}
return ret;
return 0;
}
EXPORT_SYMBOL(__sync_dirty_buffer);

View file

@ -3592,9 +3592,6 @@ extern bool empty_inline_dir(struct inode *dir, int *has_inline_data);
extern struct buffer_head *ext4_get_first_inline_block(struct inode *inode,
struct ext4_dir_entry_2 **parent_de,
int *retval);
extern int ext4_inline_data_fiemap(struct inode *inode,
struct fiemap_extent_info *fieinfo,
int *has_inline, __u64 start, __u64 len);
extern void *ext4_read_inline_link(struct inode *inode);
struct iomap;
@ -3713,7 +3710,7 @@ extern int ext4_ext_insert_extent(handle_t *, struct inode *,
extern struct ext4_ext_path *ext4_find_extent(struct inode *, ext4_lblk_t,
struct ext4_ext_path **,
int flags);
extern void ext4_ext_drop_refs(struct ext4_ext_path *);
extern void ext4_free_ext_path(struct ext4_ext_path *);
extern int ext4_ext_check_inode(struct inode *inode);
extern ext4_lblk_t ext4_ext_next_allocated_block(struct ext4_ext_path *path);
extern int ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,

View file

@ -106,6 +106,25 @@ static int ext4_ext_trunc_restart_fn(struct inode *inode, int *dropped)
return 0;
}
static void ext4_ext_drop_refs(struct ext4_ext_path *path)
{
int depth, i;
if (!path)
return;
depth = path->p_depth;
for (i = 0; i <= depth; i++, path++) {
brelse(path->p_bh);
path->p_bh = NULL;
}
}
void ext4_free_ext_path(struct ext4_ext_path *path)
{
ext4_ext_drop_refs(path);
kfree(path);
}
/*
* Make sure 'handle' has at least 'check_cred' credits. If not, restart
* transaction with 'restart_cred' credits. The function drops i_data_sem
@ -636,8 +655,7 @@ int ext4_ext_precache(struct inode *inode)
ext4_set_inode_state(inode, EXT4_STATE_EXT_PRECACHED);
out:
up_read(&ei->i_data_sem);
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
return ret;
}
@ -724,19 +742,6 @@ static void ext4_ext_show_move(struct inode *inode, struct ext4_ext_path *path,
#define ext4_ext_show_move(inode, path, newblock, level)
#endif
void ext4_ext_drop_refs(struct ext4_ext_path *path)
{
int depth, i;
if (!path)
return;
depth = path->p_depth;
for (i = 0; i <= depth; i++, path++) {
brelse(path->p_bh);
path->p_bh = NULL;
}
}
/*
* ext4_ext_binsearch_idx:
* binary search for the closest index of the given block
@ -955,8 +960,7 @@ ext4_find_extent(struct inode *inode, ext4_lblk_t block,
return path;
err:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
if (orig_path)
*orig_path = NULL;
return ERR_PTR(ret);
@ -2174,8 +2178,7 @@ int ext4_ext_insert_extent(handle_t *handle, struct inode *inode,
err = ext4_ext_dirty(handle, inode, path + path->p_depth);
cleanup:
ext4_ext_drop_refs(npath);
kfree(npath);
ext4_free_ext_path(npath);
return err;
}
@ -3061,8 +3064,7 @@ int ext4_ext_remove_space(struct inode *inode, ext4_lblk_t start,
}
}
out:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
path = NULL;
if (err == -EAGAIN)
goto again;
@ -4375,8 +4377,7 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
allocated = map->m_len;
ext4_ext_show_leaf(inode, path);
out:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
trace_ext4_ext_map_blocks_exit(inode, flags, map,
err ? err : allocated);
@ -5245,8 +5246,7 @@ ext4_ext_shift_extents(struct inode *inode, handle_t *handle,
break;
}
out:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
return ret;
}
@ -5538,15 +5538,13 @@ static int ext4_insert_range(struct file *file, loff_t offset, loff_t len)
EXT4_GET_BLOCKS_METADATA_NOFAIL);
}
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
if (ret < 0) {
up_write(&EXT4_I(inode)->i_data_sem);
goto out_stop;
}
} else {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
}
ret = ext4_es_remove_extent(inode, offset_lblk,
@ -5766,10 +5764,8 @@ ext4_swap_extents(handle_t *handle, struct inode *inode1,
count -= len;
repeat:
ext4_ext_drop_refs(path1);
kfree(path1);
ext4_ext_drop_refs(path2);
kfree(path2);
ext4_free_ext_path(path1);
ext4_free_ext_path(path2);
path1 = path2 = NULL;
}
return replaced_count;
@ -5848,8 +5844,7 @@ int ext4_clu_mapped(struct inode *inode, ext4_lblk_t lclu)
}
out:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
return err ? err : mapped;
}
@ -5916,8 +5911,7 @@ int ext4_ext_replay_update_ex(struct inode *inode, ext4_lblk_t start,
ret = ext4_ext_dirty(NULL, inode, &path[path->p_depth]);
up_write(&EXT4_I(inode)->i_data_sem);
out:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
ext4_mark_inode_dirty(NULL, inode);
return ret;
}
@ -5935,8 +5929,7 @@ void ext4_ext_replay_shrink_inode(struct inode *inode, ext4_lblk_t end)
return;
ex = path[path->p_depth].p_ext;
if (!ex) {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
ext4_mark_inode_dirty(NULL, inode);
return;
}
@ -5949,8 +5942,7 @@ void ext4_ext_replay_shrink_inode(struct inode *inode, ext4_lblk_t end)
ext4_ext_dirty(NULL, inode, &path[path->p_depth]);
up_write(&EXT4_I(inode)->i_data_sem);
ext4_mark_inode_dirty(NULL, inode);
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
}
}
@ -5989,13 +5981,11 @@ int ext4_ext_replay_set_iblocks(struct inode *inode)
return PTR_ERR(path);
ex = path[path->p_depth].p_ext;
if (!ex) {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
goto out;
}
end = le32_to_cpu(ex->ee_block) + ext4_ext_get_actual_len(ex);
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
/* Count the number of data blocks */
cur = 0;
@ -6025,30 +6015,26 @@ int ext4_ext_replay_set_iblocks(struct inode *inode)
if (IS_ERR(path))
goto out;
numblks += path->p_depth;
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
while (cur < end) {
path = ext4_find_extent(inode, cur, NULL, 0);
if (IS_ERR(path))
break;
ex = path[path->p_depth].p_ext;
if (!ex) {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
return 0;
}
cur = max(cur + 1, le32_to_cpu(ex->ee_block) +
ext4_ext_get_actual_len(ex));
ret = skip_hole(inode, &cur);
if (ret < 0) {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
break;
}
path2 = ext4_find_extent(inode, cur, NULL, 0);
if (IS_ERR(path2)) {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
break;
}
for (i = 0; i <= max(path->p_depth, path2->p_depth); i++) {
@ -6062,10 +6048,8 @@ int ext4_ext_replay_set_iblocks(struct inode *inode)
if (cmp1 != cmp2 && cmp2 != 0)
numblks++;
}
ext4_ext_drop_refs(path);
ext4_ext_drop_refs(path2);
kfree(path);
kfree(path2);
ext4_free_ext_path(path);
ext4_free_ext_path(path2);
}
out:
@ -6092,13 +6076,11 @@ int ext4_ext_clear_bb(struct inode *inode)
return PTR_ERR(path);
ex = path[path->p_depth].p_ext;
if (!ex) {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
return 0;
}
end = le32_to_cpu(ex->ee_block) + ext4_ext_get_actual_len(ex);
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
cur = 0;
while (cur < end) {
@ -6117,8 +6099,7 @@ int ext4_ext_clear_bb(struct inode *inode)
ext4_fc_record_regions(inode->i_sb, inode->i_ino,
0, path[j].p_block, 1, 1);
}
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
}
ext4_mb_mark_bb(inode->i_sb, map.m_pblk, map.m_len, 0);
ext4_fc_record_regions(inode->i_sb, inode->i_ino,

View file

@ -667,8 +667,7 @@ static void ext4_es_insert_extent_ext_check(struct inode *inode,
}
}
out:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
}
static void ext4_es_insert_extent_ind_check(struct inode *inode,

View file

@ -229,6 +229,12 @@ __releases(&EXT4_SB(inode->i_sb)->s_fc_lock)
finish_wait(wq, &wait.wq_entry);
}
static bool ext4_fc_disabled(struct super_block *sb)
{
return (!test_opt2(sb, JOURNAL_FAST_COMMIT) ||
(EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY));
}
/*
* Inform Ext4's fast about start of an inode update
*
@ -240,8 +246,7 @@ void ext4_fc_start_update(struct inode *inode)
{
struct ext4_inode_info *ei = EXT4_I(inode);
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(inode->i_sb))
return;
restart:
@ -265,8 +270,7 @@ void ext4_fc_stop_update(struct inode *inode)
{
struct ext4_inode_info *ei = EXT4_I(inode);
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(inode->i_sb))
return;
if (atomic_dec_and_test(&ei->i_fc_updates))
@ -283,8 +287,7 @@ void ext4_fc_del(struct inode *inode)
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
struct ext4_fc_dentry_update *fc_dentry;
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(EXT4_SB(inode->i_sb)->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(inode->i_sb))
return;
restart:
@ -337,8 +340,7 @@ void ext4_fc_mark_ineligible(struct super_block *sb, int reason, handle_t *handl
struct ext4_sb_info *sbi = EXT4_SB(sb);
tid_t tid;
if (!test_opt2(sb, JOURNAL_FAST_COMMIT) ||
(EXT4_SB(sb)->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(sb))
return;
ext4_set_mount_flag(sb, EXT4_MF_FC_INELIGIBLE);
@ -493,10 +495,8 @@ void __ext4_fc_track_unlink(handle_t *handle,
void ext4_fc_track_unlink(handle_t *handle, struct dentry *dentry)
{
struct inode *inode = d_inode(dentry);
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(sbi->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(inode->i_sb))
return;
if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
@ -522,10 +522,8 @@ void __ext4_fc_track_link(handle_t *handle,
void ext4_fc_track_link(handle_t *handle, struct dentry *dentry)
{
struct inode *inode = d_inode(dentry);
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(sbi->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(inode->i_sb))
return;
if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
@ -551,10 +549,8 @@ void __ext4_fc_track_create(handle_t *handle, struct inode *inode,
void ext4_fc_track_create(handle_t *handle, struct dentry *dentry)
{
struct inode *inode = d_inode(dentry);
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(sbi->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(inode->i_sb))
return;
if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
@ -576,22 +572,20 @@ static int __track_inode(struct inode *inode, void *arg, bool update)
void ext4_fc_track_inode(handle_t *handle, struct inode *inode)
{
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
int ret;
if (S_ISDIR(inode->i_mode))
return;
if (ext4_fc_disabled(inode->i_sb))
return;
if (ext4_should_journal_data(inode)) {
ext4_fc_mark_ineligible(inode->i_sb,
EXT4_FC_REASON_INODE_JOURNAL_DATA, handle);
return;
}
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(sbi->s_mount_state & EXT4_FC_REPLAY))
return;
if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
return;
@ -634,15 +628,13 @@ static int __track_range(struct inode *inode, void *arg, bool update)
void ext4_fc_track_range(handle_t *handle, struct inode *inode, ext4_lblk_t start,
ext4_lblk_t end)
{
struct ext4_sb_info *sbi = EXT4_SB(inode->i_sb);
struct __track_range_args args;
int ret;
if (S_ISDIR(inode->i_mode))
return;
if (!test_opt2(inode->i_sb, JOURNAL_FAST_COMMIT) ||
(sbi->s_mount_state & EXT4_FC_REPLAY))
if (ext4_fc_disabled(inode->i_sb))
return;
if (ext4_test_mount_flag(inode->i_sb, EXT4_MF_FC_INELIGIBLE))
@ -710,10 +702,10 @@ static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc)
* After allocating len, we should have space at least for a 0 byte
* padding.
*/
if (len + sizeof(struct ext4_fc_tl) > bsize)
if (len + EXT4_FC_TAG_BASE_LEN > bsize)
return NULL;
if (bsize - off - 1 > len + sizeof(struct ext4_fc_tl)) {
if (bsize - off - 1 > len + EXT4_FC_TAG_BASE_LEN) {
/*
* Only allocate from current buffer if we have enough space for
* this request AND we have space to add a zero byte padding.
@ -730,10 +722,10 @@ static u8 *ext4_fc_reserve_space(struct super_block *sb, int len, u32 *crc)
/* Need to add PAD tag */
tl = (struct ext4_fc_tl *)(sbi->s_fc_bh->b_data + off);
tl->fc_tag = cpu_to_le16(EXT4_FC_TAG_PAD);
pad_len = bsize - off - 1 - sizeof(struct ext4_fc_tl);
pad_len = bsize - off - 1 - EXT4_FC_TAG_BASE_LEN;
tl->fc_len = cpu_to_le16(pad_len);
if (crc)
*crc = ext4_chksum(sbi, *crc, tl, sizeof(*tl));
*crc = ext4_chksum(sbi, *crc, tl, EXT4_FC_TAG_BASE_LEN);
if (pad_len > 0)
ext4_fc_memzero(sb, tl + 1, pad_len, crc);
ext4_fc_submit_bh(sb, false);
@ -775,7 +767,7 @@ static int ext4_fc_write_tail(struct super_block *sb, u32 crc)
* ext4_fc_reserve_space takes care of allocating an extra block if
* there's no enough space on this block for accommodating this tail.
*/
dst = ext4_fc_reserve_space(sb, sizeof(tl) + sizeof(tail), &crc);
dst = ext4_fc_reserve_space(sb, EXT4_FC_TAG_BASE_LEN + sizeof(tail), &crc);
if (!dst)
return -ENOSPC;
@ -785,8 +777,8 @@ static int ext4_fc_write_tail(struct super_block *sb, u32 crc)
tl.fc_len = cpu_to_le16(bsize - off - 1 + sizeof(struct ext4_fc_tail));
sbi->s_fc_bytes = round_up(sbi->s_fc_bytes, bsize);
ext4_fc_memcpy(sb, dst, &tl, sizeof(tl), &crc);
dst += sizeof(tl);
ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, &crc);
dst += EXT4_FC_TAG_BASE_LEN;
tail.fc_tid = cpu_to_le32(sbi->s_journal->j_running_transaction->t_tid);
ext4_fc_memcpy(sb, dst, &tail.fc_tid, sizeof(tail.fc_tid), &crc);
dst += sizeof(tail.fc_tid);
@ -808,15 +800,15 @@ static bool ext4_fc_add_tlv(struct super_block *sb, u16 tag, u16 len, u8 *val,
struct ext4_fc_tl tl;
u8 *dst;
dst = ext4_fc_reserve_space(sb, sizeof(tl) + len, crc);
dst = ext4_fc_reserve_space(sb, EXT4_FC_TAG_BASE_LEN + len, crc);
if (!dst)
return false;
tl.fc_tag = cpu_to_le16(tag);
tl.fc_len = cpu_to_le16(len);
ext4_fc_memcpy(sb, dst, &tl, sizeof(tl), crc);
ext4_fc_memcpy(sb, dst + sizeof(tl), val, len, crc);
ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc);
ext4_fc_memcpy(sb, dst + EXT4_FC_TAG_BASE_LEN, val, len, crc);
return true;
}
@ -828,8 +820,8 @@ static bool ext4_fc_add_dentry_tlv(struct super_block *sb, u32 *crc,
struct ext4_fc_dentry_info fcd;
struct ext4_fc_tl tl;
int dlen = fc_dentry->fcd_name.len;
u8 *dst = ext4_fc_reserve_space(sb, sizeof(tl) + sizeof(fcd) + dlen,
crc);
u8 *dst = ext4_fc_reserve_space(sb,
EXT4_FC_TAG_BASE_LEN + sizeof(fcd) + dlen, crc);
if (!dst)
return false;
@ -838,8 +830,8 @@ static bool ext4_fc_add_dentry_tlv(struct super_block *sb, u32 *crc,
fcd.fc_ino = cpu_to_le32(fc_dentry->fcd_ino);
tl.fc_tag = cpu_to_le16(fc_dentry->fcd_op);
tl.fc_len = cpu_to_le16(sizeof(fcd) + dlen);
ext4_fc_memcpy(sb, dst, &tl, sizeof(tl), crc);
dst += sizeof(tl);
ext4_fc_memcpy(sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc);
dst += EXT4_FC_TAG_BASE_LEN;
ext4_fc_memcpy(sb, dst, &fcd, sizeof(fcd), crc);
dst += sizeof(fcd);
ext4_fc_memcpy(sb, dst, fc_dentry->fcd_name.name, dlen, crc);
@ -874,22 +866,25 @@ static int ext4_fc_write_inode(struct inode *inode, u32 *crc)
tl.fc_tag = cpu_to_le16(EXT4_FC_TAG_INODE);
tl.fc_len = cpu_to_le16(inode_len + sizeof(fc_inode.fc_ino));
ret = -ECANCELED;
dst = ext4_fc_reserve_space(inode->i_sb,
sizeof(tl) + inode_len + sizeof(fc_inode.fc_ino), crc);
EXT4_FC_TAG_BASE_LEN + inode_len + sizeof(fc_inode.fc_ino), crc);
if (!dst)
return -ECANCELED;
goto err;
if (!ext4_fc_memcpy(inode->i_sb, dst, &tl, sizeof(tl), crc))
return -ECANCELED;
dst += sizeof(tl);
if (!ext4_fc_memcpy(inode->i_sb, dst, &tl, EXT4_FC_TAG_BASE_LEN, crc))
goto err;
dst += EXT4_FC_TAG_BASE_LEN;
if (!ext4_fc_memcpy(inode->i_sb, dst, &fc_inode, sizeof(fc_inode), crc))
return -ECANCELED;
goto err;
dst += sizeof(fc_inode);
if (!ext4_fc_memcpy(inode->i_sb, dst, (u8 *)ext4_raw_inode(&iloc),
inode_len, crc))
return -ECANCELED;
return 0;
goto err;
ret = 0;
err:
brelse(iloc.bh);
return ret;
}
/*
@ -1343,7 +1338,7 @@ struct dentry_info_args {
};
static inline void tl_to_darg(struct dentry_info_args *darg,
struct ext4_fc_tl *tl, u8 *val)
struct ext4_fc_tl *tl, u8 *val)
{
struct ext4_fc_dentry_info fcd;
@ -1352,8 +1347,14 @@ static inline void tl_to_darg(struct dentry_info_args *darg,
darg->parent_ino = le32_to_cpu(fcd.fc_parent_ino);
darg->ino = le32_to_cpu(fcd.fc_ino);
darg->dname = val + offsetof(struct ext4_fc_dentry_info, fc_dname);
darg->dname_len = le16_to_cpu(tl->fc_len) -
sizeof(struct ext4_fc_dentry_info);
darg->dname_len = tl->fc_len - sizeof(struct ext4_fc_dentry_info);
}
static inline void ext4_fc_get_tl(struct ext4_fc_tl *tl, u8 *val)
{
memcpy(tl, val, EXT4_FC_TAG_BASE_LEN);
tl->fc_len = le16_to_cpu(tl->fc_len);
tl->fc_tag = le16_to_cpu(tl->fc_tag);
}
/* Unlink replay function */
@ -1491,13 +1492,15 @@ static int ext4_fc_record_modified_inode(struct super_block *sb, int ino)
if (state->fc_modified_inodes[i] == ino)
return 0;
if (state->fc_modified_inodes_used == state->fc_modified_inodes_size) {
state->fc_modified_inodes = krealloc(
state->fc_modified_inodes,
int *fc_modified_inodes;
fc_modified_inodes = krealloc(state->fc_modified_inodes,
sizeof(int) * (state->fc_modified_inodes_size +
EXT4_FC_REPLAY_REALLOC_INCREMENT),
GFP_KERNEL);
if (!state->fc_modified_inodes)
if (!fc_modified_inodes)
return -ENOMEM;
state->fc_modified_inodes = fc_modified_inodes;
state->fc_modified_inodes_size +=
EXT4_FC_REPLAY_REALLOC_INCREMENT;
}
@ -1516,7 +1519,7 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl,
struct ext4_inode *raw_fc_inode;
struct inode *inode = NULL;
struct ext4_iloc iloc;
int inode_len, ino, ret, tag = le16_to_cpu(tl->fc_tag);
int inode_len, ino, ret, tag = tl->fc_tag;
struct ext4_extent_header *eh;
memcpy(&fc_inode, val, sizeof(fc_inode));
@ -1541,7 +1544,7 @@ static int ext4_fc_replay_inode(struct super_block *sb, struct ext4_fc_tl *tl,
if (ret)
goto out;
inode_len = le16_to_cpu(tl->fc_len) - sizeof(struct ext4_fc_inode);
inode_len = tl->fc_len - sizeof(struct ext4_fc_inode);
raw_inode = ext4_raw_inode(&iloc);
memcpy(raw_inode, raw_fc_inode, offsetof(struct ext4_inode, i_block));
@ -1682,15 +1685,18 @@ int ext4_fc_record_regions(struct super_block *sb, int ino,
if (replay && state->fc_regions_used != state->fc_regions_valid)
state->fc_regions_used = state->fc_regions_valid;
if (state->fc_regions_used == state->fc_regions_size) {
struct ext4_fc_alloc_region *fc_regions;
fc_regions = krealloc(state->fc_regions,
sizeof(struct ext4_fc_alloc_region) *
(state->fc_regions_size +
EXT4_FC_REPLAY_REALLOC_INCREMENT),
GFP_KERNEL);
if (!fc_regions)
return -ENOMEM;
state->fc_regions_size +=
EXT4_FC_REPLAY_REALLOC_INCREMENT;
state->fc_regions = krealloc(
state->fc_regions,
state->fc_regions_size *
sizeof(struct ext4_fc_alloc_region),
GFP_KERNEL);
if (!state->fc_regions)
return -ENOMEM;
state->fc_regions = fc_regions;
}
region = &state->fc_regions[state->fc_regions_used++];
region->ino = ino;
@ -1770,8 +1776,7 @@ static int ext4_fc_replay_add_range(struct super_block *sb,
ret = ext4_ext_insert_extent(
NULL, inode, &path, &newex, 0);
up_write((&EXT4_I(inode)->i_data_sem));
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
if (ret)
goto out;
goto next;
@ -1926,8 +1931,7 @@ static void ext4_fc_set_bitmaps_and_counters(struct super_block *sb)
for (j = 0; j < path->p_depth; j++)
ext4_mb_mark_bb(inode->i_sb,
path[j].p_block, 1, 1);
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
}
cur += ret;
ext4_mb_mark_bb(inode->i_sb, map.m_pblk,
@ -1972,6 +1976,34 @@ void ext4_fc_replay_cleanup(struct super_block *sb)
kfree(sbi->s_fc_replay_state.fc_modified_inodes);
}
static inline bool ext4_fc_tag_len_isvalid(struct ext4_fc_tl *tl,
u8 *val, u8 *end)
{
if (val + tl->fc_len > end)
return false;
/* Here only check ADD_RANGE/TAIL/HEAD which will read data when do
* journal rescan before do CRC check. Other tags length check will
* rely on CRC check.
*/
switch (tl->fc_tag) {
case EXT4_FC_TAG_ADD_RANGE:
return (sizeof(struct ext4_fc_add_range) == tl->fc_len);
case EXT4_FC_TAG_TAIL:
return (sizeof(struct ext4_fc_tail) <= tl->fc_len);
case EXT4_FC_TAG_HEAD:
return (sizeof(struct ext4_fc_head) == tl->fc_len);
case EXT4_FC_TAG_DEL_RANGE:
case EXT4_FC_TAG_LINK:
case EXT4_FC_TAG_UNLINK:
case EXT4_FC_TAG_CREAT:
case EXT4_FC_TAG_INODE:
case EXT4_FC_TAG_PAD:
default:
return true;
}
}
/*
* Recovery Scan phase handler
*
@ -2028,12 +2060,18 @@ static int ext4_fc_replay_scan(journal_t *journal,
}
state->fc_replay_expected_off++;
for (cur = start; cur < end; cur = cur + sizeof(tl) + le16_to_cpu(tl.fc_len)) {
memcpy(&tl, cur, sizeof(tl));
val = cur + sizeof(tl);
for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN;
cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) {
ext4_fc_get_tl(&tl, cur);
val = cur + EXT4_FC_TAG_BASE_LEN;
if (!ext4_fc_tag_len_isvalid(&tl, val, end)) {
ret = state->fc_replay_num_tags ?
JBD2_FC_REPLAY_STOP : -ECANCELED;
goto out_err;
}
ext4_debug("Scan phase, tag:%s, blk %lld\n",
tag2str(le16_to_cpu(tl.fc_tag)), bh->b_blocknr);
switch (le16_to_cpu(tl.fc_tag)) {
tag2str(tl.fc_tag), bh->b_blocknr);
switch (tl.fc_tag) {
case EXT4_FC_TAG_ADD_RANGE:
memcpy(&ext, val, sizeof(ext));
ex = (struct ext4_extent *)&ext.fc_ex;
@ -2053,13 +2091,13 @@ static int ext4_fc_replay_scan(journal_t *journal,
case EXT4_FC_TAG_PAD:
state->fc_cur_tag++;
state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur,
sizeof(tl) + le16_to_cpu(tl.fc_len));
EXT4_FC_TAG_BASE_LEN + tl.fc_len);
break;
case EXT4_FC_TAG_TAIL:
state->fc_cur_tag++;
memcpy(&tail, val, sizeof(tail));
state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur,
sizeof(tl) +
EXT4_FC_TAG_BASE_LEN +
offsetof(struct ext4_fc_tail,
fc_crc));
if (le32_to_cpu(tail.fc_tid) == expected_tid &&
@ -2086,7 +2124,7 @@ static int ext4_fc_replay_scan(journal_t *journal,
}
state->fc_cur_tag++;
state->fc_crc = ext4_chksum(sbi, state->fc_crc, cur,
sizeof(tl) + le16_to_cpu(tl.fc_len));
EXT4_FC_TAG_BASE_LEN + tl.fc_len);
break;
default:
ret = state->fc_replay_num_tags ?
@ -2141,19 +2179,20 @@ static int ext4_fc_replay(journal_t *journal, struct buffer_head *bh,
start = (u8 *)bh->b_data;
end = (__u8 *)bh->b_data + journal->j_blocksize - 1;
for (cur = start; cur < end; cur = cur + sizeof(tl) + le16_to_cpu(tl.fc_len)) {
memcpy(&tl, cur, sizeof(tl));
val = cur + sizeof(tl);
for (cur = start; cur < end - EXT4_FC_TAG_BASE_LEN;
cur = cur + EXT4_FC_TAG_BASE_LEN + tl.fc_len) {
ext4_fc_get_tl(&tl, cur);
val = cur + EXT4_FC_TAG_BASE_LEN;
if (state->fc_replay_num_tags == 0) {
ret = JBD2_FC_REPLAY_STOP;
ext4_fc_set_bitmaps_and_counters(sb);
break;
}
ext4_debug("Replay phase, tag:%s\n",
tag2str(le16_to_cpu(tl.fc_tag)));
ext4_debug("Replay phase, tag:%s\n", tag2str(tl.fc_tag));
state->fc_replay_num_tags--;
switch (le16_to_cpu(tl.fc_tag)) {
switch (tl.fc_tag) {
case EXT4_FC_TAG_LINK:
ret = ext4_fc_replay_link(sb, &tl, val);
break;
@ -2174,19 +2213,18 @@ static int ext4_fc_replay(journal_t *journal, struct buffer_head *bh,
break;
case EXT4_FC_TAG_PAD:
trace_ext4_fc_replay(sb, EXT4_FC_TAG_PAD, 0,
le16_to_cpu(tl.fc_len), 0);
tl.fc_len, 0);
break;
case EXT4_FC_TAG_TAIL:
trace_ext4_fc_replay(sb, EXT4_FC_TAG_TAIL, 0,
le16_to_cpu(tl.fc_len), 0);
trace_ext4_fc_replay(sb, EXT4_FC_TAG_TAIL,
0, tl.fc_len, 0);
memcpy(&tail, val, sizeof(tail));
WARN_ON(le32_to_cpu(tail.fc_tid) != expected_tid);
break;
case EXT4_FC_TAG_HEAD:
break;
default:
trace_ext4_fc_replay(sb, le16_to_cpu(tl.fc_tag), 0,
le16_to_cpu(tl.fc_len), 0);
trace_ext4_fc_replay(sb, tl.fc_tag, 0, tl.fc_len, 0);
ret = -ECANCELED;
break;
}

View file

@ -70,6 +70,9 @@ struct ext4_fc_tail {
__le32 fc_crc;
};
/* Tag base length */
#define EXT4_FC_TAG_BASE_LEN (sizeof(struct ext4_fc_tl))
/*
* Fast commit status codes
*/

View file

@ -543,6 +543,12 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
ret = -EAGAIN;
goto out;
}
/*
* Make sure inline data cannot be created anymore since we are going
* to allocate blocks for DIO. We know the inode does not have any
* inline data now because ext4_dio_supported() checked for that.
*/
ext4_clear_inode_state(inode, EXT4_STATE_MAY_INLINE_DATA);
offset = iocb->ki_pos;
count = ret;

View file

@ -1188,6 +1188,13 @@ static int ext4_write_begin(struct file *file, struct address_space *mapping,
page = grab_cache_page_write_begin(mapping, index);
if (!page)
return -ENOMEM;
/*
* The same as page allocation, we prealloc buffer heads before
* starting the handle.
*/
if (!page_has_buffers(page))
create_empty_buffers(page, inode->i_sb->s_blocksize, 0);
unlock_page(page);
retry_journal:
@ -5342,6 +5349,7 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
int error, rc = 0;
int orphan = 0;
const unsigned int ia_valid = attr->ia_valid;
bool inc_ivers = true;
if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
return -EIO;
@ -5425,8 +5433,8 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
return -EINVAL;
}
if (IS_I_VERSION(inode) && attr->ia_size != inode->i_size)
inode_inc_iversion(inode);
if (attr->ia_size == inode->i_size)
inc_ivers = false;
if (shrink) {
if (ext4_should_order_data(inode)) {
@ -5528,6 +5536,8 @@ int ext4_setattr(struct user_namespace *mnt_userns, struct dentry *dentry,
}
if (!error) {
if (inc_ivers)
inode_inc_iversion(inode);
setattr_copy(mnt_userns, inode, attr);
mark_inode_dirty(inode);
}
@ -5768,9 +5778,6 @@ int ext4_mark_iloc_dirty(handle_t *handle,
}
ext4_fc_track_inode(handle, inode);
if (IS_I_VERSION(inode))
inode_inc_iversion(inode);
/* the do_update_inode consumes one bh->b_count */
get_bh(iloc->bh);

View file

@ -452,6 +452,7 @@ static long swap_inode_boot_loader(struct super_block *sb,
swap_inode_data(inode, inode_bl);
inode->i_ctime = inode_bl->i_ctime = current_time(inode);
inode_inc_iversion(inode);
inode->i_generation = prandom_u32();
inode_bl->i_generation = prandom_u32();
@ -665,6 +666,7 @@ static int ext4_ioctl_setflags(struct inode *inode,
ext4_set_inode_flags(inode, false);
inode->i_ctime = current_time(inode);
inode_inc_iversion(inode);
err = ext4_mark_iloc_dirty(handle, inode, &iloc);
flags_err:
@ -775,6 +777,7 @@ static int ext4_ioctl_setproject(struct inode *inode, __u32 projid)
EXT4_I(inode)->i_projid = kprojid;
inode->i_ctime = current_time(inode);
inode_inc_iversion(inode);
out_dirty:
rc = ext4_mark_iloc_dirty(handle, inode, &iloc);
if (!err)
@ -1060,9 +1063,6 @@ static int ext4_ioctl_checkpoint(struct file *filp, unsigned long arg)
if (!EXT4_SB(sb)->s_journal)
return -ENODEV;
if (flags & ~EXT4_IOC_CHECKPOINT_FLAG_VALID)
return -EINVAL;
if ((flags & JBD2_JOURNAL_FLUSH_DISCARD) &&
!bdev_max_discard_sectors(EXT4_SB(sb)->s_journal->j_dev))
return -EOPNOTSUPP;
@ -1257,6 +1257,7 @@ static long __ext4_ioctl(struct file *filp, unsigned int cmd, unsigned long arg)
err = ext4_reserve_inode_write(handle, inode, &iloc);
if (err == 0) {
inode->i_ctime = current_time(inode);
inode_inc_iversion(inode);
inode->i_generation = generation;
err = ext4_mark_iloc_dirty(handle, inode, &iloc);
}

View file

@ -56,8 +56,7 @@ static int finish_range(handle_t *handle, struct inode *inode,
retval = ext4_ext_insert_extent(handle, inode, &path, &newext, 0);
err_out:
up_write((&EXT4_I(inode)->i_data_sem));
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
lb->first_pblock = 0;
return retval;
}

View file

@ -32,8 +32,7 @@ get_ext_path(struct inode *inode, ext4_lblk_t lblock,
if (IS_ERR(path))
return PTR_ERR(path);
if (path[ext_depth(inode)].p_ext == NULL) {
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
*ppath = NULL;
return -ENODATA;
}
@ -103,12 +102,10 @@ mext_check_coverage(struct inode *inode, ext4_lblk_t from, ext4_lblk_t count,
if (unwritten != ext4_ext_is_unwritten(ext))
goto out;
from += ext4_ext_get_actual_len(ext);
ext4_ext_drop_refs(path);
}
ret = 1;
out:
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
return ret;
}
@ -472,19 +469,17 @@ mext_check_arguments(struct inode *orig_inode,
if (IS_IMMUTABLE(donor_inode) || IS_APPEND(donor_inode))
return -EPERM;
/* Ext4 move extent does not support swapfile */
/* Ext4 move extent does not support swap files */
if (IS_SWAPFILE(orig_inode) || IS_SWAPFILE(donor_inode)) {
ext4_debug("ext4 move extent: The argument files should "
"not be swapfile [ino:orig %lu, donor %lu]\n",
ext4_debug("ext4 move extent: The argument files should not be swap files [ino:orig %lu, donor %lu]\n",
orig_inode->i_ino, donor_inode->i_ino);
return -EBUSY;
return -ETXTBSY;
}
if (ext4_is_quota_file(orig_inode) && ext4_is_quota_file(donor_inode)) {
ext4_debug("ext4 move extent: The argument files should "
"not be quota files [ino:orig %lu, donor %lu]\n",
ext4_debug("ext4 move extent: The argument files should not be quota files [ino:orig %lu, donor %lu]\n",
orig_inode->i_ino, donor_inode->i_ino);
return -EBUSY;
return -EOPNOTSUPP;
}
/* Ext4 move extent supports only extent based file */
@ -631,11 +626,11 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
if (ret)
goto out;
ex = path[path->p_depth].p_ext;
next_blk = ext4_ext_next_allocated_block(path);
cur_blk = le32_to_cpu(ex->ee_block);
cur_len = ext4_ext_get_actual_len(ex);
/* Check hole before the start pos */
if (cur_blk + cur_len - 1 < o_start) {
next_blk = ext4_ext_next_allocated_block(path);
if (next_blk == EXT_MAX_BLOCKS) {
ret = -ENODATA;
goto out;
@ -663,7 +658,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
donor_page_index = d_start >> (PAGE_SHIFT -
donor_inode->i_blkbits);
offset_in_page = o_start % blocks_per_page;
if (cur_len > blocks_per_page- offset_in_page)
if (cur_len > blocks_per_page - offset_in_page)
cur_len = blocks_per_page - offset_in_page;
/*
* Up semaphore to avoid following problems:
@ -694,8 +689,7 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp, __u64 orig_blk,
ext4_discard_preallocations(donor_inode, 0);
}
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
ext4_double_up_write_data_sem(orig_inode, donor_inode);
unlock_two_nondirectories(orig_inode, donor_inode);

View file

@ -85,15 +85,20 @@ static struct buffer_head *ext4_append(handle_t *handle,
return bh;
inode->i_size += inode->i_sb->s_blocksize;
EXT4_I(inode)->i_disksize = inode->i_size;
err = ext4_mark_inode_dirty(handle, inode);
if (err)
goto out;
BUFFER_TRACE(bh, "get_write_access");
err = ext4_journal_get_write_access(handle, inode->i_sb, bh,
EXT4_JTR_NONE);
if (err) {
brelse(bh);
ext4_std_error(inode->i_sb, err);
return ERR_PTR(err);
}
if (err)
goto out;
return bh;
out:
brelse(bh);
ext4_std_error(inode->i_sb, err);
return ERR_PTR(err);
}
static int ext4_dx_csum_verify(struct inode *inode,
@ -126,7 +131,7 @@ static struct buffer_head *__ext4_read_dirblock(struct inode *inode,
struct ext4_dir_entry *dirent;
int is_dx_block = 0;
if (block >= inode->i_size) {
if (block >= inode->i_size >> inode->i_blkbits) {
ext4_error_inode(inode, func, line, block,
"Attempting to read directory block (%u) that is past i_size (%llu)",
block, inode->i_size);

View file

@ -2122,7 +2122,7 @@ int ext4_resize_fs(struct super_block *sb, ext4_fsblk_t n_blocks_count)
goto out;
}
if (ext4_blocks_count(es) == n_blocks_count)
if (ext4_blocks_count(es) == n_blocks_count && n_blocks_count_retry == 0)
goto out;
err = ext4_alloc_flex_bg_array(sb, n_group + 1);

File diff suppressed because it is too large Load diff

View file

@ -298,16 +298,14 @@ static int ext4_get_verity_descriptor_location(struct inode *inode,
last_extent = path[path->p_depth].p_ext;
if (!last_extent) {
EXT4_ERROR_INODE(inode, "verity file has no extents");
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
return -EFSCORRUPTED;
}
end_lblk = le32_to_cpu(last_extent->ee_block) +
ext4_ext_get_actual_len(last_extent);
desc_size_pos = (u64)end_lblk << inode->i_blkbits;
ext4_ext_drop_refs(path);
kfree(path);
ext4_free_ext_path(path);
if (desc_size_pos < sizeof(desc_size_disk))
goto bad;

View file

@ -2412,6 +2412,7 @@ ext4_xattr_set_handle(handle_t *handle, struct inode *inode, int name_index,
if (!error) {
ext4_xattr_update_super_block(handle, inode->i_sb);
inode->i_ctime = current_time(inode);
inode_inc_iversion(inode);
if (!value)
no_expand = 0;
error = ext4_mark_iloc_dirty(handle, inode, &is.iloc);

View file

@ -1718,9 +1718,14 @@ static int writeback_single_inode(struct inode *inode,
*/
if (!(inode->i_state & I_DIRTY_ALL))
inode_cgwb_move_to_attached(inode, wb);
else if (!(inode->i_state & I_SYNC_QUEUED) &&
(inode->i_state & I_DIRTY))
redirty_tail_locked(inode, wb);
else if (!(inode->i_state & I_SYNC_QUEUED)) {
if ((inode->i_state & I_DIRTY))
redirty_tail_locked(inode, wb);
else if (inode->i_state & I_DIRTY_TIME) {
inode->dirtied_when = jiffies;
inode_io_list_move_locked(inode, wb, &wb->b_dirty_time);
}
}
spin_unlock(&wb->list_lock);
inode_sync_complete(inode);
@ -2369,6 +2374,20 @@ void __mark_inode_dirty(struct inode *inode, int flags)
trace_writeback_mark_inode_dirty(inode, flags);
if (flags & I_DIRTY_INODE) {
/*
* Inode timestamp update will piggback on this dirtying.
* We tell ->dirty_inode callback that timestamps need to
* be updated by setting I_DIRTY_TIME in flags.
*/
if (inode->i_state & I_DIRTY_TIME) {
spin_lock(&inode->i_lock);
if (inode->i_state & I_DIRTY_TIME) {
inode->i_state &= ~I_DIRTY_TIME;
flags |= I_DIRTY_TIME;
}
spin_unlock(&inode->i_lock);
}
/*
* Notify the filesystem about the inode being dirtied, so that
* (if needed) it can update on-disk fields and journal the
@ -2378,7 +2397,8 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
trace_writeback_dirty_inode_start(inode, flags);
if (sb->s_op->dirty_inode)
sb->s_op->dirty_inode(inode, flags & I_DIRTY_INODE);
sb->s_op->dirty_inode(inode,
flags & (I_DIRTY_INODE | I_DIRTY_TIME));
trace_writeback_dirty_inode(inode, flags);
/* I_DIRTY_INODE supersedes I_DIRTY_TIME. */
@ -2399,21 +2419,15 @@ void __mark_inode_dirty(struct inode *inode, int flags)
*/
smp_mb();
if (((inode->i_state & flags) == flags) ||
(dirtytime && (inode->i_state & I_DIRTY_INODE)))
if ((inode->i_state & flags) == flags)
return;
spin_lock(&inode->i_lock);
if (dirtytime && (inode->i_state & I_DIRTY_INODE))
goto out_unlock_inode;
if ((inode->i_state & flags) != flags) {
const int was_dirty = inode->i_state & I_DIRTY;
inode_attach_wb(inode, NULL);
/* I_DIRTY_INODE supersedes I_DIRTY_TIME. */
if (flags & I_DIRTY_INODE)
inode->i_state &= ~I_DIRTY_TIME;
inode->i_state |= flags;
/*
@ -2486,7 +2500,6 @@ void __mark_inode_dirty(struct inode *inode, int flags)
out_unlock:
if (wb)
spin_unlock(&wb->list_lock);
out_unlock_inode:
spin_unlock(&inode->i_lock);
}
EXPORT_SYMBOL(__mark_inode_dirty);

View file

@ -122,8 +122,8 @@ static int journal_submit_commit_record(journal_t *journal,
{
struct commit_header *tmp;
struct buffer_head *bh;
int ret;
struct timespec64 now;
blk_opf_t write_flags = REQ_OP_WRITE | REQ_SYNC;
*cbh = NULL;
@ -155,13 +155,11 @@ static int journal_submit_commit_record(journal_t *journal,
if (journal->j_flags & JBD2_BARRIER &&
!jbd2_has_feature_async_commit(journal))
ret = submit_bh(REQ_OP_WRITE | REQ_SYNC | REQ_PREFLUSH |
REQ_FUA, bh);
else
ret = submit_bh(REQ_OP_WRITE | REQ_SYNC, bh);
write_flags |= REQ_PREFLUSH | REQ_FUA;
submit_bh(write_flags, bh);
*cbh = bh;
return ret;
return 0;
}
/*
@ -570,7 +568,7 @@ void jbd2_journal_commit_transaction(journal_t *journal)
journal->j_running_transaction = NULL;
start_time = ktime_get();
commit_transaction->t_log_start = journal->j_head;
wake_up(&journal->j_wait_transaction_locked);
wake_up_all(&journal->j_wait_transaction_locked);
write_unlock(&journal->j_state_lock);
jbd2_debug(3, "JBD2: commit phase 2a\n");

View file

@ -923,10 +923,16 @@ int jbd2_fc_wait_bufs(journal_t *journal, int num_blks)
for (i = j_fc_off - 1; i >= j_fc_off - num_blks; i--) {
bh = journal->j_fc_wbuf[i];
wait_on_buffer(bh);
/*
* Update j_fc_off so jbd2_fc_release_bufs can release remain
* buffer head.
*/
if (unlikely(!buffer_uptodate(bh))) {
journal->j_fc_off = i + 1;
return -EIO;
}
put_bh(bh);
journal->j_fc_wbuf[i] = NULL;
if (unlikely(!buffer_uptodate(bh)))
return -EIO;
}
return 0;
@ -1606,7 +1612,7 @@ static int jbd2_write_superblock(journal_t *journal, blk_opf_t write_flags)
{
struct buffer_head *bh = journal->j_sb_buffer;
journal_superblock_t *sb = journal->j_superblock;
int ret;
int ret = 0;
/* Buffer got discarded which means block device got invalidated */
if (!buffer_mapped(bh)) {
@ -1636,7 +1642,7 @@ static int jbd2_write_superblock(journal_t *journal, blk_opf_t write_flags)
sb->s_checksum = jbd2_superblock_csum(journal, sb);
get_bh(bh);
bh->b_end_io = end_buffer_write_sync;
ret = submit_bh(REQ_OP_WRITE | write_flags, bh);
submit_bh(REQ_OP_WRITE | write_flags, bh);
wait_on_buffer(bh);
if (buffer_write_io_error(bh)) {
clear_buffer_write_io_error(bh);
@ -1644,9 +1650,8 @@ static int jbd2_write_superblock(journal_t *journal, blk_opf_t write_flags)
ret = -EIO;
}
if (ret) {
printk(KERN_ERR "JBD2: Error %d detected when updating "
"journal superblock for %s.\n", ret,
journal->j_devname);
printk(KERN_ERR "JBD2: I/O error when updating journal superblock for %s.\n",
journal->j_devname);
if (!is_journal_aborted(journal))
jbd2_journal_abort(journal, ret);
}

View file

@ -256,6 +256,7 @@ static int fc_do_one_pass(journal_t *journal,
err = journal->j_fc_replay_callback(journal, bh, pass,
next_fc_block - journal->j_fc_first,
expected_commit_id);
brelse(bh);
next_fc_block++;
if (err < 0 || err == JBD2_FC_REPLAY_STOP)
break;

View file

@ -168,7 +168,7 @@ static void wait_transaction_locked(journal_t *journal)
int need_to_start;
tid_t tid = journal->j_running_transaction->t_tid;
prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
TASK_UNINTERRUPTIBLE);
need_to_start = !tid_geq(journal->j_commit_request, tid);
read_unlock(&journal->j_state_lock);
@ -194,7 +194,7 @@ static void wait_transaction_switching(journal_t *journal)
read_unlock(&journal->j_state_lock);
return;
}
prepare_to_wait(&journal->j_wait_transaction_locked, &wait,
prepare_to_wait_exclusive(&journal->j_wait_transaction_locked, &wait,
TASK_UNINTERRUPTIBLE);
read_unlock(&journal->j_state_lock);
/*
@ -920,7 +920,7 @@ void jbd2_journal_unlock_updates (journal_t *journal)
write_lock(&journal->j_state_lock);
--journal->j_barrier_count;
write_unlock(&journal->j_state_lock);
wake_up(&journal->j_wait_transaction_locked);
wake_up_all(&journal->j_wait_transaction_locked);
}
static void warn_dirty_buffer(struct buffer_head *bh)

View file

@ -90,8 +90,14 @@ int mb_cache_entry_create(struct mb_cache *cache, gfp_t mask, u32 key,
return -ENOMEM;
INIT_LIST_HEAD(&entry->e_list);
/* Initial hash reference */
atomic_set(&entry->e_refcnt, 1);
/*
* We create entry with two references. One reference is kept by the
* hash table, the other reference is used to protect us from
* mb_cache_entry_delete_or_get() until the entry is fully setup. This
* avoids nesting of cache->c_list_lock into hash table bit locks which
* is problematic for RT.
*/
atomic_set(&entry->e_refcnt, 2);
entry->e_key = key;
entry->e_value = value;
entry->e_reusable = reusable;
@ -106,15 +112,12 @@ int mb_cache_entry_create(struct mb_cache *cache, gfp_t mask, u32 key,
}
}
hlist_bl_add_head(&entry->e_hash_list, head);
/*
* Add entry to LRU list before it can be found by
* mb_cache_entry_delete() to avoid races
*/
hlist_bl_unlock(head);
spin_lock(&cache->c_list_lock);
list_add_tail(&entry->e_list, &cache->c_list);
cache->c_entry_count++;
spin_unlock(&cache->c_list_lock);
hlist_bl_unlock(head);
mb_cache_entry_put(cache, entry);
return 0;
}

View file

@ -527,12 +527,12 @@ static inline int __ntfs_grab_cache_pages(struct address_space *mapping,
goto out;
}
static inline int ntfs_submit_bh_for_read(struct buffer_head *bh)
static inline void ntfs_submit_bh_for_read(struct buffer_head *bh)
{
lock_buffer(bh);
get_bh(bh);
bh->b_end_io = end_buffer_read_sync;
return submit_bh(REQ_OP_READ, bh);
submit_bh(REQ_OP_READ, bh);
}
/**

View file

@ -653,7 +653,7 @@ xfs_fs_destroy_inode(
static void
xfs_fs_dirty_inode(
struct inode *inode,
int flag)
int flags)
{
struct xfs_inode *ip = XFS_I(inode);
struct xfs_mount *mp = ip->i_mount;
@ -661,7 +661,13 @@ xfs_fs_dirty_inode(
if (!(inode->i_sb->s_flags & SB_LAZYTIME))
return;
if (flag != I_DIRTY_SYNC || !(inode->i_state & I_DIRTY_TIME))
/*
* Only do the timestamp update if the inode is dirty (I_DIRTY_SYNC)
* and has dirty timestamp (I_DIRTY_TIME). I_DIRTY_TIME can be passed
* in flags possibly together with I_DIRTY_SYNC.
*/
if ((flags & ~I_DIRTY_TIME) != I_DIRTY_SYNC || !(flags & I_DIRTY_TIME))
return;
if (xfs_trans_alloc(mp, &M_RES(mp)->tr_fsyncts, 0, 0, 0, &tp))

View file

@ -240,7 +240,7 @@ void ll_rw_block(blk_opf_t, int, struct buffer_head * bh[]);
int sync_dirty_buffer(struct buffer_head *bh);
int __sync_dirty_buffer(struct buffer_head *bh, blk_opf_t op_flags);
void write_dirty_buffer(struct buffer_head *bh, blk_opf_t op_flags);
int submit_bh(blk_opf_t, struct buffer_head *);
void submit_bh(blk_opf_t, struct buffer_head *);
void write_boundary_block(struct block_device *bdev,
sector_t bblock, unsigned blocksize);
int bh_uptodate_or_lock(struct buffer_head *bh);

View file

@ -2372,13 +2372,14 @@ static inline void kiocb_clone(struct kiocb *kiocb, struct kiocb *kiocb_src,
* don't have to write inode on fdatasync() when only
* e.g. the timestamps have changed.
* I_DIRTY_PAGES Inode has dirty pages. Inode itself may be clean.
* I_DIRTY_TIME The inode itself only has dirty timestamps, and the
* I_DIRTY_TIME The inode itself has dirty timestamps, and the
* lazytime mount option is enabled. We keep track of this
* separately from I_DIRTY_SYNC in order to implement
* lazytime. This gets cleared if I_DIRTY_INODE
* (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. I.e.
* either I_DIRTY_TIME *or* I_DIRTY_INODE can be set in
* i_state, but not both. I_DIRTY_PAGES may still be set.
* (I_DIRTY_SYNC and/or I_DIRTY_DATASYNC) gets set. But
* I_DIRTY_TIME can still be set if I_DIRTY_SYNC is already
* in place because writeback might already be in progress
* and we don't want to lose the time update
* I_NEW Serves as both a mutex and completion notification.
* New inodes set I_NEW. If two processes both create
* the same inode, one of them will release its inode and