Commit graph

67 commits

Author SHA1 Message Date
Dave Chinner 4225441a1e Merge branch 'xfs-generic-sb-counters' into for-next
Conflicts:
	fs/xfs/xfs_super.c
2015-02-24 10:27:28 +11:00
Dave Chinner 3790a8cd8a xfs: xfs_alloc_fix_minleft can underflow near ENOSPC
Test generic/224 is failing with a corruption being detected on one
of Michael's test boxes.  Debug that Michael added is indicating
that the minleft trimming is resulting in an underflow:

.....
 before fixup:              rlen          1  args->len          0
 after xfs_alloc_fix_len  : rlen          1  args->len          1
 before goto out_nominleft: rlen          1  args->len          0
 before fixup:              rlen          1  args->len          0
 after xfs_alloc_fix_len  : rlen          1  args->len          1
 after fixup:               rlen          1  args->len          1
 before fixup:              rlen          1  args->len          0
 after xfs_alloc_fix_len  : rlen          1  args->len          1
 after fixup:               rlen 4294967295  args->len 4294967295
 XFS: Assertion failed: fs_is_ok, file: fs/xfs/libxfs/xfs_alloc.c, line: 1424

The "goto out_nominleft:" indicates that we are getting close to
ENOSPC in the AG, and a couple of allocations later we underflow
and the corruption check fires in xfs_alloc_ag_vextent_size().

The issue is that the extent length fixups comaprisons are done
with variables of xfs_extlen_t types. These are unsigned so an
underflow looks like a really big value and hence is not detected
as being smaller than the minimum length allowed for the extent.
Hence the corruption check fires as it is noticing that the returned
length is longer than the original extent length passed in.

This can be easily fixed by ensuring we do the underflow test on
signed values, the same way xfs_alloc_fix_len() prevents underflow.
So we realise in future that these casts prevent underflows from
going undetected, add comments to the code indicating this.

Reported-by: Michael L. Semon <mlsemon35@gmail.com>
Tested-by: Michael L. Semon <mlsemon35@gmail.com>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-24 10:16:04 +11:00
Wang Sheng-Hui dd5e71274a xfs: remove old and redundant comment in xfs_mount_validate_sb
The error messages document the reason for the checks better than the comment
and the comments about volume mounts date back to Irix and so aren't relevant
any more. So just remove the old and redundant comment.

Signed-off-by: Wang Sheng-Hui <shhuiw@foxmail.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-24 10:15:04 +11:00
Eric Sandeen 5fb5aeeeb6 xfs: pass mp to XFS_WANT_CORRUPTED_RETURN
Today, if we hit an XFS_WANT_CORRUPTED_RETURN we don't print any
information about which filesystem hit it.  Passing in the mp allows
us to print the filesystem (device) name, which is a pretty critical
piece of information.

Tested by running fsfuzzer 'til I hit some.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 22:39:13 +11:00
Eric Sandeen c29aad4115 xfs: pass mp to XFS_WANT_CORRUPTED_GOTO
Today, if we hit an XFS_WANT_CORRUPTED_GOTO we don't print any
information about which filesystem hit it.  Passing in the mp allows
us to print the filesystem (device) name, which is a pretty critical
piece of information.

Tested by running fsfuzzer 'til I hit some.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 22:39:08 +11:00
Dave Chinner 964aa8d9e4 xfs: remove xfs_mod_incore_sb API
Now that there are no users of the bitfield based incore superblock
modification API, just remove the whole damn lot of it, including
all the bitfield definitions. This finally removes a lot of cruft
that has been around for a long time.

Credit goes to Christoph Hellwig for providing a great patch
connecting all the dots to enale us to do this. This patch is
derived from that work.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 21:24:37 +11:00
Dave Chinner bab98bbe6e xfs: introduce xfs_mod_frextents
Add a new helper to modify the incore counter of free realtime
extents. This matches the helpers used for inode and data block
counters, and removes a significant users of the xfs_mod_incore_sb()
interface.

Based on a patch originally from Christoph Hellwig.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 21:22:54 +11:00
Dave Chinner 5681ca4006 xfs: Remove icsb infrastructure
Now that the in-core superblock infrastructure has been replaced with
generic per-cpu counters, we don't need it anymore. Nuke it from
orbit so we are sure that it won't haunt us again...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 21:22:31 +11:00
Dave Chinner 0d485ada40 xfs: use generic percpu counters for free block counter
XFS has hand-rolled per-cpu counters for the superblock since before
there was any generic implementation. The free block counter is
special in that it is used for ENOSPC detection outside transaction
contexts for for delayed allocation. This means that the counter
needs to be accurate at zero. The current per-cpu counter code jumps
through lots of hoops to ensure we never run past zero, but we don't
need to make all those jumps with the generic counter
implementation.

The generic counter implementation allows us to pass a "batch"
threshold at which the addition/subtraction to the counter value
will be folded back into global value under lock. We can use this
feature to reduce the batch size as we approach 0 in a very similar
manner to the existing counters and their rebalance algorithm. If we
use a batch size of 1 as we approach 0, then every addition and
subtraction will be done against the global value and hence allow
accurate detection of zero threshold crossing.

Hence we can replace the handrolled, accurate-at-zero counters with
generic percpu counters.

Note: this removes just enough of the icsb infrastructure to compile
without warnings. The rest will go in subsequent commits.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 21:22:03 +11:00
Dave Chinner e88b64ea1f xfs: use generic percpu counters for free inode counter
XFS has hand-rolled per-cpu counters for the superblock since before
there was any generic implementation. The free inode counter is not
used for any limit enforcement - the per-AG free inode counters are
used during allocation to determine if there are inode available for
allocation.

Hence we don't need any of the complexity of the hand-rolled
counters and we can simply replace them with generic per-cpu
counters similar to the inode counter.

This version introduces a xfs_mod_ifree() helper function from
Christoph Hellwig.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 21:19:53 +11:00
Dave Chinner 501ab32387 xfs: use generic percpu counters for inode counter
XFS has hand-rolled per-cpu counters for the superblock since before
there was any generic implementation. There are some warts around
the  use of them for the inode counter as the hand rolled counter is
designed to be accurate at zero, but has no specific accurracy at
any other value. This design causes problems for the maximum inode
count threshold enforcement, as there is no trigger that balances
the counters as they get close tothe maximum threshold.

Instead of designing new triggers for balancing, just replace the
handrolled per-cpu counter with a generic counter.  This enables us
to update the counter through the normal superblock modification
funtions, but rather than do that we add a xfs_mod_icount() helper
function (from Christoph Hellwig) and keep the percpu counter
outside the superblock in the struct xfs_mount.

This means we still need to initialise the per-cpu counter
specifically when we read the superblock, and vice versa when we
log/write it, but it does mean that we don't need to change any
other code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-02-23 21:19:28 +11:00
Dave Chinner 438c3c8d2b Merge branch 'xfs-buf-type-fixes' into for-next 2015-01-22 09:51:30 +11:00
Dave Chinner fe22d552b8 xfs: set buf types when converting extent formats
Conversion from local to extent format does not set the buffer type
correctly on the new extent buffer when a symlink data is moved out
of line.

Fix the symlink code and leave a comment in the generic bmap code
reminding us that the format-specific data copy needs to set the
destination buffer type appropriately.

cc: <stable@vger.kernel.org> # 3.10 to current
Tested-by: Jan Kara <jack@suse.cz>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:30:06 +11:00
Dave Chinner 465e2def7c Merge branch 'xfs-sb-logging-rework' into for-next
Conflicts:
	fs/xfs/xfs_mount.c
2015-01-22 09:20:53 +11:00
Dave Chinner 074e427ba7 xfs: sanitise sb_bad_features2 handling
We currently have to ensure that every time we update sb_features2
that we update sb_bad_features2. Now that we log and format the
superblock in it's entirety we actually don't have to care because
we can simply update the sb_bad_features2 when we format it into the
buffer. This removes the need for anything but the mount and
superblock formatting code to care about sb_bad_features2, and
hence removes the possibility that we forget to update bad_features2
when necessary in the future.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:10:33 +11:00
Dave Chinner 61e63ecb57 xfs: consolidate superblock logging functions
We now have several superblock loggin functions that are identical
except for the transaction reservation and whether it shoul dbe a
synchronous transaction or not. Consolidate these all into a single
function, a single reserveration and a sync flag and call it
xfs_sync_sb().

Also, xfs_mod_sb() is not really a modification function - it's the
operation of logging the superblock buffer. hence change the name of
it to reflect this.

Note that we have to change the mp->m_update_flags that are passed
around at mount time to a boolean simply to indicate a superblock
update is needed.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:10:31 +11:00
Dave Chinner 4d11a40239 xfs: remove bitfield based superblock updates
When we log changes to the superblock, we first have to write them
to the on-disk buffer, and then log that. Right now we have a
complex bitfield based arrangement to only write the modified field
to the buffer before we log it.

This used to be necessary as a performance optimisation because we
logged the superblock buffer in every extent or inode allocation or
freeing, and so performance was extremely important. We haven't done
this for years, however, ever since the lazy superblock counters
pulled the superblock logging out of the transaction commit
fast path.

Hence we have a bunch of complexity that is not necessary that makes
writing the in-core superblock to disk much more complex than it
needs to be. We only need to log the superblock now during
management operations (e.g. during mount, unmount or quota control
operations) so it is not a performance critical path anymore.

As such, remove the complex field based logging mechanism and
replace it with a simple conversion function similar to what we use
for all other on-disk structures.

This means we always log the entirity of the superblock, but again
because we rarely modify the superblock this is not an issue for log
bandwidth or CPU time. Indeed, if we do log the superblock
frequently, delayed logging will minimise the impact of this
overhead.

[Fixed gquota/pquota inode sharing regression noticed by bfoster.]

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-22 09:10:26 +11:00
Dave Chinner 6bcf0939ff Merge branch 'xfs-misc-fixes-for-3.20-2' into for-next 2015-01-09 11:06:17 +11:00
Dave Chinner aa5d95c1b5 xfs: move xfs_bmap_finish prototype
This function is used libxfs code, but is implemented separately in
userspace. Move the function prototype to xfs_bmap.h so that the
prototype is shared even if the implementations aren't.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:47:14 +11:00
Dave Chinner 9799b438ce xfs: move struct xfs_bmalloca to libxfs
It no long is used for stack splits, so strip the kernel workqueue
bits from it and push it back into libxfs/xfs_bmap.h so that
it can be shared with the userspace code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:46:49 +11:00
Dave Chinner 5ebdc213ac xfs: move xfs_types.h to libxfs
The types used by the core XFS code are common between kernel and
userspace. xfs_types.h is duplicated in both kernel and userspace,
so move it to libxfs along with all the other shared code.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:46:31 +11:00
Dave Chinner 2155355fda xfs: move xfs_fs.h to libxfs
Ioctl API definitions are shared with userspace, so move the header
file that defines them all to libxfs along with all the other code
shared with userspace.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2015-01-09 10:45:13 +11:00
Jan Kara 1a43ec03dd xfs: Keep sb_bad_features2 consistent with sb_features2
Currently when we modify sb_features2, we store the same value also in
sb_bad_features2. However in most places we forget to mark field
sb_bad_features2 for logging and thus it can happen that a change to it
is lost. This results in an inconsistent sb_features2 and
sb_bad_features2 fields e.g. after xfstests test xfs/187.

Fix the problem by changing XFS_SB_FEATURES2 to actually mean both
sb_features2 and sb_bad_features2 fields since this is always what we
want to log. This isn't ideal because the fact that XFS_SB_FEATURES2
means two fields could cause some problem in future however the code is
hopefully less error prone that it is now.

Signed-off-by: Jan Kara <jack@suse.cz>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-24 09:48:35 +11:00
Dave Chinner 6044e4386c Merge branch 'xfs-misc-fixes-for-3.19-2' into for-next
Conflicts:
	fs/xfs/xfs_iops.c
2014-12-04 09:46:17 +11:00
Dave Chinner 32296f865e xfs: fix set-but-unused warnings
The kernel compile doesn't turn on these checks by default, so it's
only when I do a kernel-user sync that I find that there are lots of
compiler warnings waiting to be fixed. Fix up these set-but-unused
warnings.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04 09:43:17 +11:00
Dave Chinner 9a2cc41cda xfs: move type conversion functions to xfs_dir.h
These are currently considered private to libxfs, but they are
widely used by the userspace code to decode, walk and check
directory structures. Hence they really form part of the external
API and as such need to bemoved to xfs_dir2.h.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04 09:43:17 +11:00
Dave Chinner 1b767ee386 xfs: move ftype conversion functions to libxfs
These functions are needed in userspace for repair and mkfs to
do the right thing. Move them to libxfs so they can be easily
shared.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04 09:43:17 +11:00
Dave Chinner 4db431f57b xfs: cleanup xfs_bmse_merge returns
Signed-off-by: Dave Chinner <dchinner@redhat.com>

xfs_bmse_merge() has a jump label for return that just returns the
error value. Convert all the code to just return the error directly
and use XFS_WANT_CORRUPTED_RETURN. This also allows the final call
to xfs_bmbt_update() to return directly.

Noticed while reviewing coccinelle return cleanup patches and
wondering why the same return pattern as in xfs_bmse_shift_one()
wasn't picked up by the checker pattern...

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04 09:42:40 +11:00
Dave Chinner b11bd671ba xfs: cleanup xfs_bmse_shift_one goto mess
xfs_bmse_shift_one() jumps around determining whether to shift or
merge, making the code flow difficult to follow. Clean it up and
use direct error returns (including XFS_WANT_CORRUPTED_RETURN) to
make the code flow better and be easier to read.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04 09:42:24 +11:00
Dave Chinner 7a1df15616 xfs: fix premature enospc on inode allocation
After growing a filesystem, XFS can fail to allocate inodes even
though there is a large amount of space available in the filesystem
for inodes. The issue is caused by a nearly full allocation group
having enough free space in it to be considered for inode
allocation, but not enough contiguous free space to actually
allocation inodes.  This situation results in successful selection
of the AG for allocation, then failure of the allocation resulting
in ENOSPC being reported to the caller.

It is caused by two possible issues. Firstly, we only consider the
lognest free extent and whether it would fit an inode chunk. If the
extent is not correctly aligned, then we can't allocate an inode
chunk in it regardless of the fact that it is large enough. This
tends to be a permanent error until space in the AG is freed.

The second issue is that we don't actually lock the AGI or AGF when
we are doing these checks, and so by the time we get to actually
allocating the inode chunk the space we thought we had in the AG may
have been allocated. This tends to be a spurious error as it
requires a race to trigger. Hence this case is ignored in this patch
as the reported problem is for permanent errors.

The first issue could be addressed by simply taking into account the
alignment when checking the longest extent. This, however, would
prevent allocation in AGs that have aligned, exact sized extents
free. However, this case should be fairly rare compared to the
number of allocations that occur near ENOSPC that would trigger this
condition.

Hence, when selecting the inode AG, take into account the inode
cluster alignment when checking the lognest free extent in the AG.
If we can't find any AGs with a contiguous free space large
enough to be aligned, drop the alignment addition and just try for
an AG that has enough contiguous free space available for an inode
chunk. This won't prevent issues from occurring, but should avoid
situations where other AGs have lots of free space but the selected
AG can't allocate due to alignment constraints.

Reported-by: Arkadiusz Miskiewicz <arekm@maven.pl>
Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Brian Foster <bfoster@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-04 09:42:21 +11:00
Dave Chinner c14fc01340 Merge branch 'xfs-coccinelle-cleanups' into for-next 2014-12-01 09:03:02 +11:00
kbuild test robot d254aaec5d xfs: fix simple_return.cocci warning in xfs_bmse_shift_one
fs/xfs/libxfs/xfs_bmap.c:5591:1-6: WARNING: end returns can be simpified

 Simplify a trivial if-return sequence.  Possibly combine with a
 preceding function call.
Generated by: scripts/coccinelle/misc/simple_return.cocci

CC: Brian Foster <bfoster@redhat.com>
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
2014-12-01 08:42:52 +11:00
kbuild test robot b72091f2fb libxfs: fix simple_return.cocci warnings
fs/xfs/libxfs/xfs_ialloc.c:1141:1-6: WARNING: end returns can be simpified

 Simplify a trivial if-return sequence.  Possibly combine with a
 preceding function call.
Generated by: scripts/coccinelle/misc/simple_return.cocci

Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-12-01 08:24:58 +11:00
Christoph Hellwig 508b6b3b73 xfs: merge xfs_inum.h into xfs_format.h
Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:27:10 +11:00
Christoph Hellwig bb58e6188a xfs: move most of xfs_sb.h to xfs_format.h
More on-disk format consolidation.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:27:09 +11:00
Christoph Hellwig 4fb6e8ade2 xfs: merge xfs_ag.h into xfs_format.h
More on-disk format consolidation.  A few declarations that weren't on-disk
format related move into better suitable spots.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:25:04 +11:00
Christoph Hellwig 5beda58bf2 xfs: move acl structures to xfs_format.h
Move the on-disk ACL format to xfs_format.h, so that repair can
use the common defintion.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:24:37 +11:00
Christoph Hellwig 6d3ebaae7c xfs: merge xfs_dinode.h into xfs_format.h
More consolidatation for the on-disk format defintions.  Note that the
XFS_IS_REALTIME_INODE moves to xfs_linux.h instead as it is not related
to the on disk format, but depends on a CONFIG_ option.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-11-28 14:24:06 +11:00
Dave Chinner 6889e783cd Merge branch 'xfs-misc-fixes-for-3.18-3' into for-next 2014-10-13 10:22:45 +11:00
Eric Sandeen 04dd1a0d4b xfs: fix crc field handling in xfs_sb_to/from_disk
I discovered this in userspace, but the same change applies
to the kernel.

If we xfs_mdrestore an image from a non-crc filesystem, lo
and behold the restored image has gained a CRC:

# db/xfs_metadump.sh -o /dev/sdc1 - | xfs_mdrestore - test.img
# xfs_db -c "sb 0" -c "p crc" /dev/sdc1
crc = 0 (correct)
# xfs_db -c "sb 0" -c "p crc" test.img
crc = 0xb6f8d6a0 (correct)

This is because xfs_sb_from_disk doesn't fill in sb_crc,
but xfs_sb_to_disk(XFS_SB_ALL_BITS) does write the in-memory
CRC to disk - so we get uninitialized memory on disk.

Fix this by always initializing sb_crc to 0 when we read
the superblock, and masking out the CRC bit from ALL_BITS
when we write it.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-02 09:24:11 +10:00
Eric Sandeen 6ee49a20c1 xfs: don't send null bp to xfs_trans_brelse()
In this case, if bp is NULL, error is set, and we send a
NULL bp to xfs_trans_brelse, which will try to dereference it.

Test whether we actually have a buffer before we try to
free it.

Coverity spotted this.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-10-02 09:23:49 +10:00
Dave Chinner bd438f825f Merge branch 'xfs-sparse-fixes' into for-next 2014-09-29 10:52:44 +10:00
Dave Chinner e68ed77521 xfs: fix use of agi_newino in finobt lookup
Sparse warns that we are passing the big-endian valueo f agi_newino
to the initial btree lookup function when trying to find a new
inode. This is wrong - we need to pass the host order value, not the
disk order value. This will adversely affect the next inode
allocated, but given that the free inode btree is usually much
smaller than the allocated inode btree it is much less likely to be
a performance issue if we start the search in the wrong place.

Signed-off-by: Dave Chinner <dchinner@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-09-29 10:43:15 +10:00
Dave Chinner 33044dc408 Merge branch 'xfs-misc-fixes-for-3.18-2' into for-next 2014-09-23 22:55:51 +10:00
Eric Sandeen fb04013156 xfs: don't ASSERT on corrupt ftype
xfs_dir3_data_get_ftype() gets the file type off disk, but ASSERTs
if it's invalid:

     ASSERT(type < XFS_DIR3_FT_MAX);

We shouldn't ASSERT on bad values read from disk.  V3 dirs are
CRC-protected, but V2 dirs + ftype are not.

Signed-off-by: Eric Sandeen <sandeen@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-09-23 16:05:32 +10:00
Dave Chinner f6d31f4b04 Merge branch 'xfs-shift-extents-rework' into for-next 2014-09-23 15:51:14 +10:00
Brian Foster f71721d061 xfs: writeback and inval. file range to be shifted by collapse
The collapse range operation currently writes the entire file before
starting the collapse to avoid changes in the in-core extent list due to
writeback causing the extent count to change. Now that collapse range is
fsb based rather than extent index based it can sustain changes in the
extent list during the shift sequence without disruption.

Modify xfs_collapse_file_space() to writeback and invalidate pages
associated with the range of the file to be shifted.
xfs_free_file_space() currently has similar behavior, but the space free
need only affect the region of the file that is freed and this could
change in the future.

Also update the comments to reflect the current implementation. We
retain the eofblocks trim permanently as a best option for dealing with
delalloc extents. We don't shift delalloc extents because this scenario
only occurs with post-eof preallocation (since data must be flushed such
that the cache can be invalidated and data can be shifted). That means
said space must also be initialized before being shifted into the
accessible region of the file only to be immediately truncated off as
the last part of the collapse. In other words, the eofblocks trim will
happen anyways, we just run it first to ensure the file remains in a
consistent state throughout the collapse.

Finally, detect and fail explicitly in the event of a delalloc extent
during the extent shift. The implementation does not support delalloc
extents and the caller is expected to prevent this scenario in advance
as is done by collapse.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-09-23 15:39:05 +10:00
Brian Foster a979bdfea1 xfs: refactor single extent shift into xfs_bmse_shift_one() helper
xfs_bmap_shift_extents() has a variety of conditions and error checks
that make the logic difficult to follow and indent heavy. Refactor the
loop body of this function into a new xfs_bmse_shift_one() helper. This
simplifies the error checks, eliminates index decrement on merge hack by
pushing the index increment down into the helper, and makes the code
more readable by reducing multiple levels of indentation.

This is a code refactor only. The behavior of extent shift and collapse
range is not modified.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-09-23 15:39:04 +10:00
Brian Foster ddb19e3180 xfs: refactor shift-by-merge into xfs_bmse_merge() helper
The extent shift mechanism in xfs_bmap_shift_extents() is complicated
and handles several different, non-deterministic scenarios. These
include extent shifts, extent merges and potential btree updates in
either of the former scenarios.

Refactor the code to be more linear and readable. The loop logic in
xfs_bmap_shift_extents() and some initial error checking is adjusted
slightly. The associated btree lookup and update/delete operations are
condensed into single blocks of code. This reduces the number of
btree-specific blocks and facilitates the separation of the merge
operation into a new xfs_bmse_merge() and xfs_bmse_can_merge() helpers.

This is a code refactor only. The behavior of extent shift and collapse
range is not modified.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-09-23 15:38:09 +10:00
Brian Foster 2c845f5a5f xfs: track collapse via file offset rather than extent index
The collapse range implementation uses a transaction per extent shift.
The progress of the overall operation is tracked via the current extent
index of the in-core extent list. This is racy because the ilock must be
dropped and reacquired for each transaction according to locking and log
reservation rules. Therefore, writeback to prior regions of the file is
possible and can change the extent count. This changes the extent to
which the current index refers and causes the collapse to fail mid
operation. To avoid this problem, the entire file is currently written
back before the collapse operation starts.

To eliminate the need to flush the entire file, use the file offset
(fsb) to track the progress of the overall extent shift operation rather
than the extent index. Modify xfs_bmap_shift_extents() to
unconditionally convert the start_fsb parameter to an extent index and
return the file offset of the extent where the shift left off, if
further extents exist. The bulk of ths function can remain based on
extent index as ilock is held by the caller. xfs_collapse_file_space()
now uses the fsb output as the starting point for the subsequent shift.

Signed-off-by: Brian Foster <bfoster@redhat.com>
Reviewed-by: Dave Chinner <dchinner@redhat.com>
Signed-off-by: Dave Chinner <david@fromorbit.com>
2014-09-23 15:37:09 +10:00