btrfs: do unsigned integer division in the extent buffer binary search loop

In the search loop of the binary search function, we are doing a division
by 2 of the sum of the high and low slots. Because the slots are integers,
the generated assembly code for it is the following on x86_64:

   0x00000000000141f1 <+145>:	mov    %eax,%ebx
   0x00000000000141f3 <+147>:	shr    $0x1f,%ebx
   0x00000000000141f6 <+150>:	add    %eax,%ebx
   0x00000000000141f8 <+152>:	sar    %ebx

It's a few more instructions than a simple right shift, because signed
integer division needs to round towards zero. However we know that slots
can never be negative (btrfs_header_nritems() returns an u32), so we
can instead use unsigned types for the low and high slots and therefore
use unsigned integer division, which results in a single instruction on
x86_64:

   0x00000000000141f0 <+144>:	shr    %ebx

So use unsigned types for the slots and therefore unsigned division.

This is part of a small patchset comprised of the following two patches:

  btrfs: eliminate extra call when doing binary search on extent buffer
  btrfs: do unsigned integer division in the extent buffer binary search loop

The following fs_mark test was run on a non-debug kernel (Debian's default
kernel config) before and after applying the patchset:

  $ cat test.sh
  #!/bin/bash

  DEV=/dev/sdi
  MNT=/mnt/sdi
  MOUNT_OPTIONS="-o ssd"
  MKFS_OPTIONS="-O no-holes -R free-space-tree"
  FILES=100000
  THREADS=$(nproc --all)
  FILE_SIZE=0

  umount $DEV &> /dev/null
  mkfs.btrfs -f $MKFS_OPTIONS $DEV
  mount $MOUNT_OPTIONS $DEV $MNT

  OPTS="-S 0 -L 6 -n $FILES -s $FILE_SIZE -t $THREADS -k"
  for ((i = 1; i <= $THREADS; i++)); do
      OPTS="$OPTS -d $MNT/d$i"
  done

  fs_mark $OPTS

  umount $MNT

Results before applying patchset:

  FSUse%        Count         Size    Files/sec     App Overhead
       2      1200000            0     174472.0         11549868
       4      2400000            0     253503.0         11694618
       4      3600000            0     257833.1         11611508
       6      4800000            0     247089.5         11665983
       6      6000000            0     211296.1         12121244
      10      7200000            0     187330.6         12548565

Results after applying patchset:

  FSUse%        Count         Size    Files/sec     App Overhead
       2      1200000            0     207556.0         11393252
       4      2400000            0     266751.1         11347909
       4      3600000            0     274397.5         11270058
       6      4800000            0     259608.4         11442250
       6      6000000            0     238895.8         11635921
       8      7200000            0     211942.2         11873825

Reviewed-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: David Sterba <dsterba@suse.com>
Signed-off-by: David Sterba <dsterba@suse.com>
This commit is contained in:
Filipe Manana 2023-02-08 17:46:49 +00:00 committed by David Sterba
parent 7b00dfffeb
commit a724f313f8
2 changed files with 12 additions and 7 deletions

View file

@ -853,8 +853,8 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans,
/*
* Search for a key in the given extent_buffer.
*
* The lower boundary for the search is specified by the slot number @low. Use a
* value of 0 to search over the whole extent buffer.
* The lower boundary for the search is specified by the slot number @first_slot.
* Use a value of 0 to search over the whole extent buffer.
*
* The slot in the extent buffer is returned via @slot. If the key exists in the
* extent buffer, then @slot will point to the slot where the key is, otherwise
@ -863,18 +863,23 @@ int btrfs_realloc_node(struct btrfs_trans_handle *trans,
* Slot may point to the total number of items (i.e. one position beyond the last
* key) if the key is bigger than the last key in the extent buffer.
*/
int btrfs_generic_bin_search(struct extent_buffer *eb, int low,
int btrfs_generic_bin_search(struct extent_buffer *eb, int first_slot,
const struct btrfs_key *key, int *slot)
{
unsigned long p;
int item_size;
int high = btrfs_header_nritems(eb);
/*
* Use unsigned types for the low and high slots, so that we get a more
* efficient division in the search loop below.
*/
u32 low = first_slot;
u32 high = btrfs_header_nritems(eb);
int ret;
const int key_size = sizeof(struct btrfs_disk_key);
if (low > high) {
if (unlikely(low > high)) {
btrfs_err(eb->fs_info,
"%s: low (%d) > high (%d) eb %llu owner %llu level %d",
"%s: low (%u) > high (%u) eb %llu owner %llu level %d",
__func__, low, high, eb->start,
btrfs_header_owner(eb), btrfs_header_level(eb));
return -EINVAL;

View file

@ -508,7 +508,7 @@ int btrfs_trim_fs(struct btrfs_fs_info *fs_info, struct fstrim_range *range);
int __init btrfs_ctree_init(void);
void __cold btrfs_ctree_exit(void);
int btrfs_generic_bin_search(struct extent_buffer *eb, int low,
int btrfs_generic_bin_search(struct extent_buffer *eb, int first_slot,
const struct btrfs_key *key, int *slot);
/*