git/oidset.c
René Scharfe 8b2f8cbcb1 oidset: use khash
Reimplement oidset using khash.h in order to reduce its memory footprint
and make it faster.

Performance of a command that mainly checks for duplicate objects using
an oidset, with master and Clang 6.0.1:

  $ cmd="./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'"

  $ /usr/bin/time $cmd >/dev/null
  0.22user 0.03system 0:00.25elapsed 99%CPU (0avgtext+0avgdata 48484maxresident)k
  0inputs+0outputs (0major+11204minor)pagefaults 0swaps

  $ hyperfine "$cmd"
  Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'

    Time (mean ± σ):     250.0 ms ±   6.0 ms    [User: 225.9 ms, System: 23.6 ms]

    Range (min … max):   242.0 ms … 261.1 ms

And with this patch:

  $ /usr/bin/time $cmd >/dev/null
  0.14user 0.00system 0:00.15elapsed 100%CPU (0avgtext+0avgdata 41396maxresident)k
  0inputs+0outputs (0major+8318minor)pagefaults 0swaps

  $ hyperfine "$cmd"
  Benchmark #1: ./git-cat-file --batch-all-objects --unordered --buffer --batch-check='%(objectname)'

    Time (mean ± σ):     151.9 ms ±   4.9 ms    [User: 130.5 ms, System: 21.2 ms]

    Range (min … max):   148.2 ms … 170.4 ms

Initial-patch-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-04 11:12:13 -07:00

31 lines
629 B
C

#include "cache.h"
#include "oidset.h"
int oidset_contains(const struct oidset *set, const struct object_id *oid)
{
khiter_t pos = kh_get_oid(&set->set, *oid);
return pos != kh_end(&set->set);
}
int oidset_insert(struct oidset *set, const struct object_id *oid)
{
int added;
kh_put_oid(&set->set, *oid, &added);
return !added;
}
int oidset_remove(struct oidset *set, const struct object_id *oid)
{
khiter_t pos = kh_get_oid(&set->set, *oid);
if (pos == kh_end(&set->set))
return 0;
kh_del_oid(&set->set, pos);
return 1;
}
void oidset_clear(struct oidset *set)
{
kh_release_oid(&set->set);
oidset_init(set, 0);
}