bch2_count_inode_sectors() uses for_each_btree_key() internally, which
handles lock restarts - the lockrestart_do() in check_i_sectors() is
redundant, and buggy here since the count that
bch2_count_inode_sectors() returns was interpreted as an error by
lockrestart_do().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
- Convert to for_each_btree_key2(), for_each_btree_key_commit(),
for_each_btree_key_reverse()
- No more bare bch2_btree_iter_peek(); we're now fault-injection lock
restarts, so we always need a lockrestart_do() or equivalent.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds a new macro, like for_each_btree_key2(), but for iterating in
reverse order.
Also, change for_each_btree_key2() to properly check the return value of
bch2_btree_iter_advance().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
In CONFIG_BCACHEFS_DEBUG mode, we'll now randomly issue transaction
restarts - with a decaying probability based on the number of restarts
we've already had, to ensure that transactions eventually make forward
progress. This should help shake out some bugs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Now that we have error codes, with subtypes, we can switch to our own
error code for transaction restarts - and even better, a distinct error
code for each transaction restart reason: clearer code and better
debugging.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
In bch2_bucket_alloc_trans(), we're iterating over buckets - but not
directly with an iterator, since we're iterating over the freespace
btree.
This means that we need to clear iter->path->preserve, otherwise we'll
end up retaining a btree_path for every alloc key we touched - which is
not what we want here.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Instead of overloading standard error codes (EINTR/EAGAIN), and defining
short lists of error codes in multiple places that potentially end up
overlapping & conflicting, we're now going to have one master list of
error codes.
Error codes are defined with an x-macro: thus we also have
bch2_err_str() now.
Also, error codes have a class field. Now, instead of checking for
errors with ==, code should use bch2_err_matches(), which returns true
if the error is equal to or a sub-error of the error class.
This means we can define unique errors for every source location where
an error is generated, which will help improve our error messages.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We can rebuild alloc info if these btree roots are missing - no need to
bail out and say the filesystem is unrecoverable
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Like bch2_copygc_wait_amount, should_invalidate_buckets() needs to try
to ensure that there are always more buckets free than the largest
reserve.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
With the upcoming patches to add assertions for incorrect nested
transaction restart handling, this code is now bogus. Switch it to
for_each_btree_key_norestart() so that transaction restarts are only
handled in one place.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This will help us improve nested transactions - we need to add
assertions that whenever an inner transaction handles a restart, it
still returns -EINTR to the outer transaction.
This also adds nested_lockrestart_do() and nested_commit_do() which use
the new counters to correctly return -EINTR when the transaction was
restarted.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This adds a new helper, bch2_trans_run(), that runs a function with a
btree_transaction context but without handling transaction restarts.
We're adding checks for nested transaction restart handling: when an
inner transaction handles a transaction restart it will still have to
return it to the outer transaction, or else assertions will be popped in
the outer transaction.
But some places don't need restart handling at the outer scope, so this
helper does what they need.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This converts bch2_gc_stripes_done() and bch2_gc_reflink_done() to the
new for_each_btree_key_commit() macro.
The new for_each_btree_key2() and for_each_btree_key_commit() macros
handles transaction retries, allowing us to avoid nested transactions -
which we want to avoid since they're tricky to do completely correctly
and upcoming assertions are going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
The new for_each_btree_key2() macro handles transaction retries,
allowing us to avoid nested transactions - which we want to avoid since
they're tricky to do completely correctly and upcoming assertions are
going to be checking for that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We have an obvious wake up race if we do the wakeup _before_ updating
the counters the thing doing the waiting is reading.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We now record the length of time btree locks are held and expose this in debugfs.
Enabled via CONFIG_BCACHEFS_LOCK_TIME_STATS.
Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
Printbufs indentation feature doesn't yet work with '\n' and '\t'. So we've
replaced all instances of '\n' with prt_newline.
Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We need the caller name and a place to store our results, btree_trans provides this.
Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
We try to ensure we never hold btree locks for too long - bcachefs tries
to be soft realtime. This adds a check when restarting a transaction,
where a transaction restart is cheap - if we've been holding locks for
too long, drop and retake them.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This introduces two new macros for iterating through the btree, with
transaction restart handling
- for_each_btree_key2()
- for_each_btree_key_commit()
Every iteration is now in an implicit transaction, and - as with
lockrestart_do() and commit_do() - returning -EINTR will cause the
transaction to be restarted, at the same key.
This patch converts a bunch of code that was open coding this to these
new macros, saving a substantial amount of code.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
When we find an extent past an inode's i_size, we need to do the
deletion in the inode's snapshot (which will emit a whiteout if
necessary); and we also need to note that we now have an a key at that
position and snapshot, so that we don't go into an infinite loop.
Also, switch to walking inodes in reverse older, oldest snapshot to
newest, so that we emit the fewest whiteouts possible.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Fsck now checks for keys in different snapshot IDs that are now
redundant due to other snapshots being deleted - it needs to for its own
algorithms to not get confused.
When it detects this it should re-run the post snapshot deletion cleanup
- this patch does that.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
- Bunch of refactoring, and move some code out of
bch2_snapshots_start() and into bch2_snapshots_check(), for constency
with the rest of fsck
- Interior snapshot nodes no longer point to a subvolume; this is so we
don't end up with dangling subvol references when deleting or require
scanning the full snapshots btree.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
This makes the snapshots_seen data structure fsck private and improves
it; we now also track the equivalence class for each snapshot id we've
seen, which means we can detect when snapshot deletion hasn't finished
or run correctly (which will otherwise confuse fsck).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
fsck doesn't want to run while we're cleaning up deleted snapshots - if
that work needs to be done, we want it to have finished before fsck
runs, otherwise fsck will get confused when it finds multiple keys in
the same snapshot ID equivalence class (i.e. the mechanism that
snapshot deletion uses for cleaning up redundant keys).
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
We should never see an inode marked as unlinked that's a subvolume root
(or a directory) in fsck, but even if we do it's not correct for fsck to
delete the subvolume: subvolumes are owned by dirents, and if we find a
dangling subvolume (not marked as unlinked) we want fsck to reattach it.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
snapshots_seen is becoming private to fsck, and snapshot_id_list is
actually what the data update path needs.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Snapshots being deleted won't in general have a corresponding subvolume:
this fixes a spurious fsck error where we'd complain about a snapshot
pointing to a missing subvolume - but the subvolume had been deleted,
and the snapshot was pending deletion as well.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
Better/more descriptive naming, and prep for adding
nested_lockrestart_do() and nested_commit_do().
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
There's no need to print fsck errors for errors that are expected, and
the user has already opted to repair.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
These messages log the updates we're doing in bch2_check_fix_ptrs(),
which is useful when debugging but not usually needed.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
There's no point reading an extent in order to move it if the write is
going to fail because we're shutting down. This patch changes the move
path so that moving_io now owns a ref on c->writes - as a bonus,
rebalance and copygc will now notice that we're shutting down and exit
quicker.
Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>
- add bch2_moving_ctxt_(init|exit)
- split out __bch2_evacutae_bucket() which takes an existing
moving_ctxt, this will be used for improving copygc performance by
pipelining across multiple buckets
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>
move_ratelimit() now has a bool that specifies whether we want to
wait for copygc to finish.
When copygc is running, we're probably low on free buckets instead
of consuming the remaining buckets, we want to wait for copygc to
finish.
This should help with performance, and run away bucket fragmentation.
Signed-off-by: Daniel Hill <daniel@gluo.nz>
Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>