linux

mirror of https://github.com/torvalds/linux synced 2024-10-03 09:48:02 +00:00

Author	SHA1	Message	Date
Kent Overstreet	7903e3d2d7	bcachefs: Fix check_i_sectors() bch2_count_inode_sectors() uses for_each_btree_key() internally, which handles lock restarts - the lockrestart_do() in check_i_sectors() is redundant, and buggy here since the count that bch2_count_inode_sectors() returns was interpreted as an error by lockrestart_do(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:37 -04:00
Kent Overstreet	1ed0a5d280	bcachefs: Convert fsck errors to errcode.h Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:37 -04:00
Kent Overstreet	549d173c1b	bcachefs: EINTR -> BCH_ERR_transaction_restart Now that we have error codes, with subtypes, we can switch to our own error code for transaction restarts - and even better, a distinct error code for each transaction restart reason: clearer code and better debugging. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:37 -04:00
Kent Overstreet	d4bf5eecd7	bcachefs: Use bch2_err_str() in error messages Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:36 -04:00
Kent Overstreet	615f867c14	bcachefs: Improved errcodes Instead of overloading standard error codes (EINTR/EAGAIN), and defining short lists of error codes in multiple places that potentially end up overlapping & conflicting, we're now going to have one master list of error codes. Error codes are defined with an x-macro: thus we also have bch2_err_str() now. Also, error codes have a class field. Now, instead of checking for errors with ==, code should use bch2_err_matches(), which returns true if the error is equal to or a sub-error of the error class. This means we can define unique errors for every source location where an error is generated, which will help improve our error messages. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:36 -04:00
Kent Overstreet	eace11a730	bcachefs: Convert more fsck code to for_each_btree_key2() The new for_each_btree_key2() macro handles transaction retries, allowing us to avoid nested transactions - which we want to avoid since they're tricky to do completely correctly and upcoming assertions are going to be checking for that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:36 -04:00
Kent Overstreet	a1783320d4	bcachefs: for_each_btree_key2() This introduces two new macros for iterating through the btree, with transaction restart handling - for_each_btree_key2() - for_each_btree_key_commit() Every iteration is now in an implicit transaction, and - as with lockrestart_do() and commit_do() - returning -EINTR will cause the transaction to be restarted, at the same key. This patch converts a bunch of code that was open coding this to these new macros, saving a substantial amount of code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	0d06b4eca6	bcachefs: Fix repair for extent past end of inode When we find an extent past an inode's i_size, we need to do the deletion in the inode's snapshot (which will emit a whiteout if necessary); and we also need to note that we now have an a key at that position and snapshot, so that we don't go into an infinite loop. Also, switch to walking inodes in reverse older, oldest snapshot to newest, so that we emit the fewest whiteouts possible. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	c7a09cb1b1	bcachefs: When fsck finds redundant snapshot keys, trigger snapshots cleanup Fsck now checks for keys in different snapshot IDs that are now redundant due to other snapshots being deleted - it needs to for its own algorithms to not get confused. When it detects this it should re-run the post snapshot deletion cleanup - this patch does that. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	35f1a5034d	bcachefs: Improve fsck for subvols/snapshots - Bunch of refactoring, and move some code out of bch2_snapshots_start() and into bch2_snapshots_check(), for constency with the rest of fsck - Interior snapshot nodes no longer point to a subvolume; this is so we don't end up with dangling subvol references when deleting or require scanning the full snapshots btree. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	49124d8a7f	bcachefs: Improve snapshots_seen This makes the snapshots_seen data structure fsck private and improves it; we now also track the equivalence class for each snapshot id we've seen, which means we can detect when snapshot deletion hasn't finished or run correctly (which will otherwise confuse fsck). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	4ab35c34d5	bcachefs: Fix subvol/snapshot deleting in recovery fsck doesn't want to run while we're cleaning up deleted snapshots - if that work needs to be done, we want it to have finished before fsck runs, otherwise fsck will get confused when it finds multiple keys in the same snapshot ID equivalence class (i.e. the mechanism that snapshot deletion uses for cleaning up redundant keys). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	e4085b70f2	bcachefs: fsck_inode_rm() shouldn't delete subvols We should never see an inode marked as unlinked that's a subvolume root (or a directory) in fsck, but even if we do it's not correct for fsck to delete the subvolume: subvolumes are owned by dirents, and if we find a dangling subvolume (not marked as unlinked) we want fsck to reattach it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	e68914ca84	bcachefs: Rename __bch2_trans_do() -> commit_do() Better/more descriptive naming, and prep for adding nested_lockrestart_do() and nested_commit_do(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:35 -04:00
Kent Overstreet	d8f31407c8	bcachefs: Fix hash_check_key() hash_check_key() was incorrectly handling transaction restarts - switch it to for_each_btree_key_norestart(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:32 -04:00
Kent Overstreet	0095aa94bc	bcachefs: Improve some fsck error messages We have string names for d_type; use it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	41fc862224	bcachefs: In fsck, pass BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE when deleting dirents A user reported an error where we hit an assertion due to deleting a key in an internal snapshot node, when deleting a dirent that points to a nonexisting inode. We try to avoid doing updates to keys for internal snapshot nodes, but upon inspection of the places where we remove dirents in fsck it appears BTREE_UPDATE_INTERNAL_SNAPSHOT_NODE is correct for all of them: either the target dirent doesn't exist, or it's a directory with multiple dirents pointing to it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	e492e7b6f6	bcachefs: Improve error logging in fsck.c This adds error logging to a bunch of functions in fsck.c - in fsck, reduntant error messages is probably better than not enough. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	e296b1f9ca	bcachefs: Fix inode_backpointer_exists() If the dirent an inode points to doesn't exist, we shouldn't be returning an error - just 0/false. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:31 -04:00
Kent Overstreet	292dea86df	bcachefs: fsck: Work around transaction restarts In check_extents() and check_dirents(), we're working towards only handling transaction restarts in one place, at the top level - but we're not there yet. check_i_sectors() and check_subdir_count() handle transaction restarts locally, which means the iterator for the dirent/extent is left unlocked (should_be_locked == 0), leading to asserts popping when we go to do updates. This patch hacks around this for now, until we can delete the offending code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:30 -04:00
Kent Overstreet	91d961badf	bcachefs: darrays Inspired by CCAN darray - simple, stupid resizable (dynamic) arrays. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:28 -04:00
Kent Overstreet	fa8e94faee	bcachefs: Heap allocate printbufs This patch changes printbufs dynamically allocate and reallocate a buffer as needed. Stack usage has become a bit of a problem, and a major cause of that has been static size string buffers on the stack. The most involved part of this refactoring is that printbufs must now be exited with printbuf_exit(). Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:25 -04:00
Kent Overstreet	9e34316156	bcachefs: Small fsck fix The check_dirents pass handles transaction restarts at the toplevel - check_subdir_count() was incorrectly handling transaction restarts without returning -EINTR, meaning that the iterator pointing to the dirent being checked was left invalid. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:23 -04:00
Kent Overstreet	f0f41a6d74	bcachefs: Add error messages for memory allocation failures This adds some missing diagnostics from rare but annoying to debug runtime allocation failure paths. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:20 -04:00
Kent Overstreet	3e52c22255	bcachefs: Add journal_seq to inode & alloc keys Add fields to inode & alloc keys that record the journal sequence number when they were most recently modified. For alloc keys, this is needed to know what journal sequence number we have to flush before the bucket can be reused. Currently this is tracked in memory, but we'll be getting rid of the in memory bucket array. For inodes, this is needed for fsync when the inode has been evicted from the vfs cache. Currently we use a bloom filter per outstanding journal buf - but that mechanism has been broken since we added the ability to not issue a flush/fua for every journal write. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	c27314b448	bcachefs: Fix __remove_dirent() __lookup_inode() doesn't work for what __remove_dirent() wants - it just wants the first inode at a given inode number, they all have the same hash info. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	47f80bbf38	bcachefs: Fix check_inodes() We were starting at the wrong btree position, and thus not actually checking any inodes - oops. Also, make check_key_has_snapshot() a mustfix fsck error, since later fsck code assumes that all keys have valid snapshot IDs. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:16 -04:00
Kent Overstreet	285b181ad4	bcachefs: Improve transaction restart handling in fsck code The fsck code has been handling transaction restarts locally, to avoid calling fsck_err() multiple times (and asking the user/logging the error multiple times) on transaction restart. However, with our improving assertions about iterator validity, this isn't working anymore - the code wasn't entirely correct, in ways that are fine for now but are going to matter once we start wanting online fsck. This code converts much of the fsck code to handle transaction restarts in a more rigorously correct way - moving restart handling up to the top level of check_dirent, check_xattr and others - at the cost of logging errors multiple times on transaction restart. Fixing the issues with logging errors multiple times is probably going to require memoizing calls to fsck_err() - we'll leave that for future improvements. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	2027875bd8	bcachefs: Add BCH_SUBVOLUME_UNLINKED Snapshot deletion needs to become a multi step process, where we unlink, then tear down the page cache, then delete the subvolume - the deleting flag is equivalent to an inode with i_nlink = 0. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	6b3d8b8992	bcachefs: Don't run triggers in fix_reflink_p_key() It seems some users have reflink pointers which span many indirect extents, from a short window in time when merging of reflink pointers was allowed. Now, we're seeing transaction path overflows in fix_reflink_p(), the code path to clear out the reflink_p fields now used for front/back pad - but, we don't actually need to be running triggers in that path, which is an easy partial fix. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:15 -04:00
Kent Overstreet	b0d1b70af8	bcachefs: Must check for errors from bch2_trans_cond_resched() But we don't need to call it from outside the btree iterator code anymore, since it's called by bch2_trans_begin() and bch2_btree_path_traverse(). Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	4db650277d	bcachefs: Subvol dirents are now only visible in parent subvol This changes the on disk format for dirents that point to subvols so that they also record the subvolid of the parent subvol, so that we can filter them out in other subvolumes. This also updates the dirent code to do that filtering, and in particular tweaks the rename code - we need to ensure that there's only ever one dirent (counting multiplicities in different snapshots) that point to a subvolume. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	6e0c886d3c	bcachefs: Fix check_path() for snapshots check_path() wasn't checking the snapshot ID when checking for directory structure loops - so, snapshots would cause us to detect a loop that wasn't actually a loop. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	6d76aefea1	bcachefs: Fix for leaking of reflinked extents When a reflink pointer points to only part of an indirect extent, and then that indirect extent is fragmented (e.g. by copygc), if the reflink pointer only points to one of the fragments we leak a reference. Fix this by storing front/back pad values in reflink pointers - when inserting reflink pointesr, we initialize them to cover the full range of the indirect extents we reference. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	bfe88863cf	bcachefs: New on disk format to fix reflink_p pointers We had a bug where reflink_p pointers weren't being initialized to 0, and when we started using the second word, things broke badly. This patch revs the on disk format version and adds cleanup code to zero out the second word of reflink_p pointers before we start using it. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	9a796fdb06	bcachefs: bch2_trans_exit() no longer returns errors Now that peek_node()/next_node() are converted to return errors directly, we don't need bch2_trans_exit() to return errors - it's cleaner this way and wasn't used much anymore. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	488f97764a	bcachefs: Fix check_path() across subvolumes Checking of directory structure across subvolumes was broken - we need to look up the snapshot ID of the parent subvolume when crossing subvol boundaries. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:14 -04:00
Kent Overstreet	97996ddfdb	bcachefs: bch2_subvolume_get() Factor out a little helper. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:13 -04:00
Kent Overstreet	ea0531f84e	bcachefs: Fix check_inode_update_hardlinks() We were incorrectly using bch2_inode_write(), which gets the snapshot ID from the iterator, with a BTREE_ITER_ALL_SNAPSHOTS iterator - fortunately caught by an assertion in the update path. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:13 -04:00
Kent Overstreet	42d237320e	bcachefs: Snapshot creation, deletion This is the final patch in the patch series implementing snapshots. This patch implements two new ioctls that work like creation and deletion of directories, but fancier. - BCH_IOCTL_SUBVOLUME_CREATE, for creating new subvolumes and snaphots - BCH_IOCTL_SUBVOLUME_DESTROY, for deleting subvolumes and snapshots Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:13 -04:00
Kent Overstreet	18443cb9f0	bcachefs: Update data move path for snapshots The data move path operates on existing extents, and not within a subvolume as the regular IO paths do. It needs to change because it may cause existing extents to be split, and when splitting an existing extent in an ancestor snapshot we need to make sure the new split has the same visibility in child snapshots as the existing extent. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	ef1669ffc6	bcachefs: Update fsck for snapshots This updates the fsck algorithms to handle snapshots - meaning there will be multiple versions of the same key (extents, inodes, dirents, xattrs) in different snapshots, and we have to carefully consider which keys are visible in which snapshot. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	6fed42bb77	bcachefs: Plumb through subvolume id To implement snapshots, we need every filesystem btree operation (every btree operation without a subvolume) to start by looking up the subvolume and getting the current snapshot ID, with bch2_subvolume_get_snapshot() - then, that snapshot ID is used for doing btree lookups in BTREE_ITER_FILTER_SNAPSHOTS mode. This patch adds those bch2_subvolume_get_snapshot() calls, and also switches to passing around a subvol_inum instead of just an inode number. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	81ed9ce367	bcachefs: Per subvolume lost+found On existing filesystems, we have a single global lost+found. Introducing subvolumes means we need to introduce per subvolume lost+found directories, because inodes are added to lost+found by their inode number, and inode numbers are now only unique within a subvolume. This patch adds support to fsck for per subvolume lost+found. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	b9e1adf579	bcachefs: Add support for dirents that point to subvolumes Dirents currently always point to inodes. Subvolumes add a new type of dirent, with d_type DT_SUBVOL, that instead points to an entry in the subvolumes btree, and the subvolume has a pointer to the root inode. This patch adds bch2_dirent_read_target() to get the inode (and potentially subvolume) a dirent points to, and changes existing code to use that instead of reading from d_inum directly. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	14b393ee76	bcachefs: Subvolumes, snapshots This patch adds subvolume.c - support for the subvolumes and snapshots btrees and related data types and on disk data structures. The next patches will start hooking up this new code to existing code. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:12 -04:00
Kent Overstreet	67e0dd8f0d	bcachefs: btree_path This splits btree_iter into two components: btree_iter is now the externally visible componont, and it points to a btree_path which is now reference counted. This means we no longer have to clone iterators up front if they might be mutated - btree_path can be shared by multiple iterators, and cloned if an iterator would mutate a shared btree_path. This will help us use iterators more efficiently, as well as slimming down the main long lived state in btree_trans, and significantly cleans up the logic for iterator lifetimes. Signed-off-by: Kent Overstreet <kent.overstreet@linux.dev>	2023-10-22 17:09:11 -04:00
Kent Overstreet	1a488e7306	bcachefs: Kill BTREE_INSERT_NOUNLOCK With the recent transaction restart changes, it's no longer needed - all transaction commits have BTREE_INSERT_NOUNLOCK semantics. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:10 -04:00
Kent Overstreet	3820054426	bcachefs: Don't squash return code in check_dirents() We were squashing BCH_FSCK_ERRORS_NOT_FIXED. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:09 -04:00
Kent Overstreet	914f2786b8	bcachefs: Improvements to fsck check_dirents() The fsck code handles transaction restarts in a very ad hoc way, and not always correctly. This patch makes some improvements to check_dirents(), but more work needs to be done to figure out how this kind of code should be structured. Signed-off-by: Kent Overstreet <kent.overstreet@gmail.com>	2023-10-22 17:09:08 -04:00

1 2 3

129 commits