linux/fs/btrfs
Robbie Ko 801bec365e Btrfs: send, fix failure to move directories with the same name around
When doing an incremental send we can end up not moving directories that
have the same name. This happens when the same parent directory has
different child directories with the same name in the parent and send
snapshots.

For example, consider the following scenario:

  Parent snapshot:

  .                   (ino 256)
  |---- d/            (ino 257)
  |     |--- p1/      (ino 258)
  |
  |---- p1/           (ino 259)

  Send snapshot:

  .                    (ino 256)
  |--- d/              (ino 257)
       |--- p1/        (ino 259)
             |--- p1/  (ino 258)

The directory named "d" (inode 257) has in both snapshots an entry with
the name "p1" but it refers to different inodes in both snapshots (inode
258 in the parent snapshot and inode 259 in the send snapshot). When
attempting to move inode 258, the operation is delayed because its new
parent, inode 259, was not yet moved/renamed (as the stream is currently
processing inode 258). Then when processing inode 259, we also end up
delaying its move/rename operation so that it happens after inode 258 is
moved/renamed. This decision to delay the move/rename rename operation
of inode 259 is due to the fact that the new parent inode (257) still
has inode 258 as its child, which has the same name has inode 259. So
we end up with inode 258 move/rename operation waiting for inode's 259
move/rename operation, which in turn it waiting for inode's 258
move/rename. This results in ending the send stream without issuing
move/rename operations for inodes 258 and 259 and generating the
following warnings in syslog/dmesg:

[148402.979747] ------------[ cut here ]------------
[148402.980588] WARNING: CPU: 14 PID: 4117 at fs/btrfs/send.c:6177 btrfs_ioctl_send+0xe03/0xe51 [btrfs]
[148402.981928] Modules linked in: btrfs crc32c_generic xor raid6_pq acpi_cpufreq tpm_tis ppdev tpm parport_pc psmouse parport sg pcspkr i2c_piix4 i2c_core evdev processor serio_raw button loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy [last unloaded: btrfs]
[148402.986999] CPU: 14 PID: 4117 Comm: btrfs Tainted: G        W       4.6.0-rc7-btrfs-next-31+ #1
[148402.988136] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[148402.988136]  0000000000000000 ffff88022139fca8 ffffffff8126b42c 0000000000000000
[148402.988136]  0000000000000000 ffff88022139fce8 ffffffff81052b14 000018212139fac8
[148402.988136]  ffff88022b0db400 0000000000000000 0000000000000001 0000000000000000
[148402.988136] Call Trace:
[148402.988136]  [<ffffffff8126b42c>] dump_stack+0x67/0x90
[148402.988136]  [<ffffffff81052b14>] __warn+0xc2/0xdd
[148402.988136]  [<ffffffff81052beb>] warn_slowpath_null+0x1d/0x1f
[148402.988136]  [<ffffffffa04bc831>] btrfs_ioctl_send+0xe03/0xe51 [btrfs]
[148402.988136]  [<ffffffffa048b358>] btrfs_ioctl+0x14f/0x1f81 [btrfs]
[148402.988136]  [<ffffffff8108e456>] ? arch_local_irq_save+0x9/0xc
[148402.988136]  [<ffffffff8108eb51>] ? __lock_is_held+0x3c/0x57
[148402.988136]  [<ffffffff8118da05>] vfs_ioctl+0x18/0x34
[148402.988136]  [<ffffffff8118e00c>] do_vfs_ioctl+0x550/0x5be
[148402.988136]  [<ffffffff81196f0c>] ? __fget+0x6b/0x77
[148402.988136]  [<ffffffff81196fa1>] ? __fget_light+0x62/0x71
[148402.988136]  [<ffffffff8118e0d1>] SyS_ioctl+0x57/0x79
[148402.988136]  [<ffffffff8149e025>] entry_SYSCALL_64_fastpath+0x18/0xa8
[148402.988136]  [<ffffffff8108e89d>] ? trace_hardirqs_off_caller+0x3f/0xaa
[148403.011373] ---[ end trace a4539270c8056f8b ]---
[148403.012296] ------------[ cut here ]------------
[148403.013071] WARNING: CPU: 14 PID: 4117 at fs/btrfs/send.c:6194 btrfs_ioctl_send+0xe19/0xe51 [btrfs]
[148403.014447] Modules linked in: btrfs crc32c_generic xor raid6_pq acpi_cpufreq tpm_tis ppdev tpm parport_pc psmouse parport sg pcspkr i2c_piix4 i2c_core evdev processor serio_raw button loop autofs4 ext4 crc16 jbd2 mbcache sr_mod cdrom sd_mod ata_generic virtio_scsi ata_piix libata virtio_pci virtio_ring virtio e1000 scsi_mod floppy [last unloaded: btrfs]
[148403.019708] CPU: 14 PID: 4117 Comm: btrfs Tainted: G        W       4.6.0-rc7-btrfs-next-31+ #1
[148403.020104] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS by qemu-project.org 04/01/2014
[148403.020104]  0000000000000000 ffff88022139fca8 ffffffff8126b42c 0000000000000000
[148403.020104]  0000000000000000 ffff88022139fce8 ffffffff81052b14 000018322139fac8
[148403.020104]  ffff88022b0db400 0000000000000000 0000000000000001 0000000000000000
[148403.020104] Call Trace:
[148403.020104]  [<ffffffff8126b42c>] dump_stack+0x67/0x90
[148403.020104]  [<ffffffff81052b14>] __warn+0xc2/0xdd
[148403.020104]  [<ffffffff81052beb>] warn_slowpath_null+0x1d/0x1f
[148403.020104]  [<ffffffffa04bc847>] btrfs_ioctl_send+0xe19/0xe51 [btrfs]
[148403.020104]  [<ffffffffa048b358>] btrfs_ioctl+0x14f/0x1f81 [btrfs]
[148403.020104]  [<ffffffff8108e456>] ? arch_local_irq_save+0x9/0xc
[148403.020104]  [<ffffffff8108eb51>] ? __lock_is_held+0x3c/0x57
[148403.020104]  [<ffffffff8118da05>] vfs_ioctl+0x18/0x34
[148403.020104]  [<ffffffff8118e00c>] do_vfs_ioctl+0x550/0x5be
[148403.020104]  [<ffffffff81196f0c>] ? __fget+0x6b/0x77
[148403.020104]  [<ffffffff81196fa1>] ? __fget_light+0x62/0x71
[148403.020104]  [<ffffffff8118e0d1>] SyS_ioctl+0x57/0x79
[148403.020104]  [<ffffffff8149e025>] entry_SYSCALL_64_fastpath+0x18/0xa8
[148403.020104]  [<ffffffff8108e89d>] ? trace_hardirqs_off_caller+0x3f/0xaa
[148403.038981] ---[ end trace a4539270c8056f8c ]---

There's another issue caused by similar (but more complex) changes in the
directory hierarchy that makes move/rename operations fail, described with
the following example:

  Parent snapshot:

  .
  |---- a/                                                   (ino 262)
  |     |---- c/                                             (ino 268)
  |
  |---- d/                                                   (ino 263)
        |---- ance/                                          (ino 267)
                |---- e/                                     (ino 264)
                |---- f/                                     (ino 265)
                |---- ance/                                  (ino 266)

  Send snapshot:

  .
  |---- a/                                                   (ino 262)
  |---- c/                                                   (ino 268)
  |     |---- ance/                                          (ino 267)
  |
  |---- d/                                                   (ino 263)
  |     |---- ance/                                          (ino 266)
  |
  |---- f/                                                   (ino 265)
        |---- e/                                             (ino 264)

When the inode 265 is processed, the path for inode 267 is computed, which
at that time corresponds to "d/ance", and it's stored in the names cache.
Later on when processing inode 266, we end up orphanizing (renaming to a
name matching the pattern o<ino>-<gen>-<seq>) inode 267 because it has
the same name as inode 266 and it's currently a child of the new parent
directory (inode 263) for inode 266. After the orphanization and while we
are still processing inode 266, a rename operation for inode 266 is
generated. However the source path for that rename operation is incorrect
because it ends up using the old, pre-orphanization, name of inode 267.
The no longer valid name for inode 267 was previously cached when
processing inode 265 and it remains usable and considered valid until
the inode currently being processed has a number greater than 267.
This resulted in the receiving side failing with the following error:

  ERROR: rename d/ance/ance -> d/ance failed: No such file or directory

So fix these issues by detecting such circular dependencies for rename
operations and by clearing the cached name of an inode once the inode
is orphanized.

A test case for fstests will follow soon.

Signed-off-by: Robbie Ko <robbieko@synology.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
[Rewrote change log to be more detailed and organized, and improved
 comments]

Signed-off-by: Filipe Manana <fdmanana@suse.com>
2016-08-01 07:23:10 +01:00
..
tests Btrfs: fix error return code in btrfs_init_test_fs() 2016-06-23 10:44:39 -07:00
acl.c posix_acl: Inode acl caching fixes 2016-03-31 00:30:15 -04:00
async-thread.c btrfs: async-thread: Fix a use-after-free error for trace 2016-01-25 16:50:26 -08:00
async-thread.h btrfs: async_thread: Fix workqueue 'max_active' value when initializing 2015-08-31 11:46:40 -07:00
backref.c Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
backref.h btrfs: cleanup, remove inode_item_info helper 2015-01-14 19:23:47 +01:00
btrfs_inode.h Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
check-integrity.c btrfs: Use correct format specifier 2016-06-17 18:32:40 +02:00
check-integrity.h
compression.c btrfs: make find_workspace warn if there are no workspaces 2016-05-10 09:46:16 +02:00
compression.h btrfs: move btrfs_compression_type to compression.h 2016-03-11 17:12:46 +01:00
ctree.c Btrfs: fix error handling in map_private_extent_buffer 2016-06-23 10:44:40 -07:00
ctree.h Btrfs: add tracepoints for flush events 2016-07-07 18:45:53 +02:00
delayed-inode.c Btrfs: change delayed reservation fallback behavior 2016-07-07 18:45:53 +02:00
delayed-inode.h Btrfs: fix ->iterate_shared() by upgrading i_rwsem for delayed nodes 2016-06-25 06:20:10 -07:00
delayed-ref.c btrfs: drop null testing before destroy functions 2016-02-18 11:46:03 +01:00
delayed-ref.h btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
dev-replace.c Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
dev-replace.h btrfs: refactor btrfs_dev_replace_start for reuse 2016-04-28 10:59:13 +02:00
dir-item.c Btrfs: make xattr replace operations atomic 2014-11-20 17:20:07 -08:00
disk-io.c Btrfs: Force stripesize to the value of sectorsize 2016-06-23 10:44:42 -07:00
disk-io.h Btrfs: self-tests: Support non-4k page size 2016-06-02 19:23:14 +02:00
export.c BTRFS: support NFSv2 export 2015-10-06 06:55:23 -07:00
export.h
extent-tree.c Btrfs: avoid deadlocks during reservations in btrfs_truncate_block 2016-07-20 16:58:04 -07:00
extent_io.c Btrfs: fix error handling in map_private_extent_buffer 2016-06-23 10:44:40 -07:00
extent_io.h Btrfs: self-tests: Support non-4k page size 2016-06-02 19:23:14 +02:00
extent_map.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
extent_map.h btrfs: cleanup, stop casting for extent_map->lookup everywhere 2016-01-15 19:22:28 +01:00
file-item.c btrfs: sink gfp parameter to set_extent_bits 2016-04-29 11:01:47 +02:00
file.c Btrfs: add missing check for writeback errors on fsync 2016-08-01 07:21:13 +01:00
free-space-cache.c Btrfs: self-tests: Support non-4k page size 2016-06-02 19:23:14 +02:00
free-space-cache.h btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
free-space-tree.c Revert "btrfs: synchronize incompat feature bits with sysfs files" 2016-01-29 08:19:37 -08:00
free-space-tree.h Btrfs: implement the free space B-tree 2015-12-17 12:16:47 -08:00
hash.c btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
hash.h btrfs: advertise which crc32c implementation is being used at module load 2016-06-06 14:08:28 +02:00
inode-item.c btrfs: rename btrfs_std_error to btrfs_handle_fs_error 2016-04-28 10:36:54 +02:00
inode-map.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
inode-map.h Btrfs: Initialize btrfs_root->highest_objectid when loading tree root and subvolume roots 2016-01-15 19:25:02 +01:00
inode.c Btrfs: fix callers of btrfs_block_rsv_migrate 2016-07-07 18:45:53 +02:00
ioctl.c Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs 2016-05-27 17:14:05 -07:00
Kconfig rcu: Make SRCU optional by using CONFIG_SRCU 2015-01-06 11:04:29 -08:00
locking.c btrfs: cleanup, remove stray return statements 2016-01-07 14:30:52 +01:00
locking.h btrfs: fix lockups from btrfs_clear_path_blocking 2014-11-19 10:34:35 -08:00
lzo.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00
Makefile Btrfs: add free space tree sanity tests 2015-12-17 12:16:47 -08:00
math.h btrfs: cleanup 64bit/32bit divs, compile time constants 2015-03-03 17:23:57 +01:00
ordered-data.c btrfs: fix disk_i_size update bug when fallocate() fails 2016-06-23 10:44:41 -07:00
ordered-data.h Btrfs: fix race setting block group readonly during device replace 2016-05-30 12:58:21 +01:00
orphan.c
print-tree.c btrfs: teach print_leaf about temporary item subtypes 2016-02-11 16:15:43 +01:00
print-tree.h
props.c btrfs: move btrfs_compression_type to compression.h 2016-03-11 17:12:46 +01:00
props.h
qgroup.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
qgroup.h btrfs: qgroup: Check if qgroup reserved space leaked 2015-10-21 18:41:10 -07:00
raid56.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
raid56.h Btrfs: add RAID 5/6 BTRFS_RBIO_REBUILD_MISSING operation 2015-08-09 07:34:26 -07:00
rcu-string.h
reada.c Btrfs: fix race between readahead and device replace/removal 2016-05-30 12:58:18 +01:00
relocation.c Btrfs: use FLUSH_LIMIT for relocation in reserve_metadata_bytes 2016-07-07 18:45:53 +02:00
root-tree.c Merge branch 'cleanups-4.7' into for-chris-4.7-20160525 2016-05-25 22:51:03 +02:00
scrub.c Btrfs: fix race setting block group back to RW mode during device replace 2016-05-30 12:58:24 +01:00
send.c Btrfs: send, fix failure to move directories with the same name around 2016-08-01 07:23:10 +01:00
send.h Btrfs: use linux/sizes.h to represent constants 2016-01-07 14:38:02 +01:00
struct-funcs.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
super.c btrfs: avoid blocking open_ctree from cleaner_kthread 2016-06-17 18:32:40 +02:00
sysfs.c btrfs: sysfs: protect reading label by lock 2016-05-06 15:22:49 +02:00
sysfs.h btrfs: sysfs: introduce helper for syncing bits with sysfs files 2016-01-21 18:50:40 +01:00
transaction.c Btrfs: track transid for delayed ref flushing 2016-06-22 17:54:18 -07:00
transaction.h btrfs: account for non-CoW'd blocks in btrfs_abort_transaction 2016-06-17 18:32:40 +02:00
tree-defrag.c Btrfs: fix locking bugs when defragging leaves 2015-12-18 02:51:32 +00:00
tree-log.c Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-06-18 05:57:59 -10:00
tree-log.h Btrfs: fix unreplayable log after snapshot delete + parent dir fsync 2016-03-01 08:23:25 -08:00
ulist.c btrfs: fix string and comment grammatical issues and typos 2016-05-25 22:35:14 +02:00
ulist.h btrfs: ulist: Add ulist_del() function. 2015-06-10 09:26:17 -07:00
uuid-tree.c
volumes.c Merge branch 'for-linus-4.7' of git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs 2016-06-25 08:42:31 -07:00
volumes.h Merge branch 'foreign/jeffm/uapi' into for-chris-4.7-20160516 2016-05-16 15:46:29 +02:00
xattr.c switch xattr_handler->set() to passing dentry and inode separately 2016-05-27 15:39:43 -04:00
xattr.h btrfs: Switch to generic xattr handlers 2016-05-17 19:17:09 -04:00
zlib.c mm, fs: get rid of PAGE_CACHE_* and page_cache_{get,release} macros 2016-04-04 10:41:08 -07:00