development/git - HydraGit

mirror of https://github.com/git/git synced 2024-09-12 21:04:12 +00:00

Author	SHA1	Message	Date
Taylor Blau	97b89c8150	p5326: don't set core.multiPackIndex unnecessarily When this performance test was originally written, `core.multiPackIndex` was not the default and thus had to be enabled. But now that we have `18e449f86b` (midx: enable core.multiPackIndex by default, 2020-09-25), we no longer need this. Drop the unnecessary setup (even though it's not hurting anything, it is unnecessary at best and confusing at worst). Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 16:34:18 -07:00
Taylor Blau	2082224f17	p5326: create missing 'perf-tag' tag Some of the tests in test_full_bitmap rely on having a tag named perf-tag in place. We could create it in test_full_bitmap(), but we want to have it in place before the repack starts. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-14 16:34:18 -07:00
Taylor Blau	2d59597333	p5326: perf tests for MIDX bitmaps These new performance tests demonstrate effectively the same behavior as p5310, but use a multi-pack bitmap instead of a single-pack one. Notably, p5326 does not create a MIDX bitmap with multiple packs. This is so we can measure a direct comparison between it and p5310. Any difference between the two is measuring just the overhead of using MIDX bitmaps. Here are the results of p5310 and p5326 together, measured at the same time and on the same machine (using a Xenon W-2255 CPU): Test HEAD ------------------------------------------------------------------------ 5310.2: repack to disk 96.78(93.39+11.33) 5310.3: simulated clone 9.98(9.79+0.19) 5310.4: simulated fetch 1.75(4.26+0.19) 5310.5: pack to file (bitmap) 28.20(27.87+8.70) 5310.6: rev-list (commits) 0.41(0.36+0.05) 5310.7: rev-list (objects) 1.61(1.54+0.07) 5310.8: rev-list count with blob:none 0.25(0.21+0.04) 5310.9: rev-list count with blob:limit=1k 2.65(2.54+0.10) 5310.10: rev-list count with tree:0 0.23(0.19+0.04) 5310.11: simulated partial clone 4.34(4.21+0.12) 5310.13: clone (partial bitmap) 11.05(12.21+0.48) 5310.14: pack to file (partial bitmap) 31.25(34.22+3.70) 5310.15: rev-list with tree filter (partial bitmap) 0.26(0.22+0.04) versus the same tests (this time using a multi-pack index): Test HEAD ------------------------------------------------------------------------ 5326.2: setup multi-pack index 78.99(75.29+11.58) 5326.3: simulated clone 11.78(11.56+0.22) 5326.4: simulated fetch 1.70(4.49+0.13) 5326.5: pack to file (bitmap) 28.02(27.72+8.76) 5326.6: rev-list (commits) 0.42(0.36+0.06) 5326.7: rev-list (objects) 1.65(1.58+0.06) 5326.8: rev-list count with blob:none 0.26(0.21+0.05) 5326.9: rev-list count with blob:limit=1k 2.97(2.86+0.10) 5326.10: rev-list count with tree:0 0.25(0.20+0.04) 5326.11: simulated partial clone 5.65(5.49+0.16) 5326.13: clone (partial bitmap) 12.22(13.43+0.38) 5326.14: pack to file (partial bitmap) 30.05(31.57+7.25) 5326.15: rev-list with tree filter (partial bitmap) 0.24(0.20+0.04) There is slight overhead in "simulated clone", "simulated partial clone", and "clone (partial bitmap)". Unsurprisingly, that overhead is due to using the MIDX's reverse index to map between bit positions and MIDX positions. This can be reproduced by running "git repack -adb" along with "git multi-pack-index write --bitmap" in a large-ish repository. Then run: $ perf record -o pack.perf git -c core.multiPackIndex=false \ pack-objects --all --stdout >/dev/null </dev/null $ perf record -o midx.perf git -c core.multiPackIndex=true \ pack-objects --all --stdout >/dev/null </dev/null and compare the two with "perf diff -c delta -o 1 pack.perf midx.perf". The most notable results are below (the next largest positive delta is +0.14%): # Event 'cycles' # # Baseline Delta Shared Object Symbol # ........ ....... .................. .......................... # +5.86% git [.] nth_midxed_offset +5.24% git [.] nth_midxed_pack_int_id 3.45% +0.97% git [.] offset_to_pack_pos 3.30% +0.57% git [.] pack_pos_to_offset +0.30% git [.] pack_pos_to_midx Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-01 13:56:43 -07:00
Taylor Blau	9387fbd646	p5310: extract full and partial bitmap tests A new p5326 introduced by the next patch will want these same tests, interjecting its own setup in between. Move them out so that both perf tests can reuse them. Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-01 13:56:43 -07:00
Junio C Hamano	506d2a354a	Merge branch 'ds/commit-and-checkout-with-sparse-index' "git checkout" and "git commit" learn to work without unnecessarily expanding sparse indexes. * ds/commit-and-checkout-with-sparse-index: unpack-trees: resolve sparse-directory/file conflicts t1092: document bad 'git checkout' behavior checkout: stop expanding sparse indexes sparse-index: recompute cache-tree commit: integrate with sparse-index p2000: compress repo names p2000: add 'git checkout -' test and decrease depth	2021-08-04 13:28:53 -07:00
Junio C Hamano	e163f73b7b	Merge branch 'ps/perf-with-separate-output-directory' Test update. * ps/perf-with-separate-output-directory: perf: fix when running with TEST_OUTPUT_DIRECTORY	2021-08-02 14:06:41 -07:00
Derrick Stolee	11042ab914	p2000: compress repo names By using shorter names for the test repos, we will get a slightly more compressed performance summary without comprimising clarity. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-14 15:05:53 -07:00
Derrick Stolee	0d53d19946	p2000: add 'git checkout -' test and decrease depth As we increase our list of commands to test in p2000-sparse-operations.sh, we will want to have a slightly smaller test repository. Reduce the size by a factor of four by reducing the depth of the step that creates a big index around a moderately-sized repository. Also add a step to run 'git checkout -' on repeat. This requires having a previous location in the reflog, so add that to the initialization steps. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-14 15:05:53 -07:00
Junio C Hamano	4da281e84d	Merge branch 'ab/pickaxe-pcre2' Rewrite the backend for "diff -G/-S" to use pcre2 engine when available. * ab/pickaxe-pcre2: (22 commits) xdiff-interface: replace discard_hunk_line() with a flag xdiff users: use designated initializers for out_line pickaxe -G: don't special-case create/delete pickaxe -G: terminate early on matching lines xdiff-interface: allow early return from xdiff_emit_line_fn xdiff-interface: prepare for allowing early return pickaxe -S: slightly optimize contains() pickaxe: rename variables in has_changes() for brevity pickaxe -S: support content with NULs under --pickaxe-regex pickaxe: assert that we must have a needle under -G or -S pickaxe: refactor function selection in diffcore-pickaxe() perf: add performance test for pickaxe pickaxe/style: consolidate declarations and assignments diff.h: move pickaxe fields together again pickaxe: die when --find-object and --pickaxe-all are combined pickaxe: die when -G and --pickaxe-regex are combined pickaxe tests: add missing test for --no-pickaxe-regex being an error pickaxe tests: test for -G, -S and --find-object incompatibility pickaxe tests: add test for "log -S" not being a regex pickaxe tests: add test for diffgrep_consume() internals ...	2021-07-13 16:52:50 -07:00
Patrick Steinhardt	3663e5904d	perf: fix when running with TEST_OUTPUT_DIRECTORY When the TEST_OUTPUT_DIRECTORY is defined, then all test data will be written in that directory instead of the default directory located in "t/". While this works as expected for our normal tests, performance tests fail to locate and aggregate performance data because they don't know to handle TEST_OUTPUT_DIRECTORY correctly and always look at the default location. Fix the issue by adding a `--results-dir` parameter to "aggregate.perl" which identifies the directory where results are and by making the "run" script awake of the TEST_OUTPUT_DIRECTORY variable. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-07-02 15:47:30 -07:00
Ævar Arnfjörð Bjarmason	d90d441c33	perf: add performance test for pickaxe Add a test for the -G and -S pickaxe options and related options. This test supports being run with GIT_TEST_LONG=1 to adjust the limit on the number of commits from 1k to 10k. The 1k limit seems to hit a good spot on git.git Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-05-11 12:47:31 +09:00
Junio C Hamano	a0f521b56c	Merge branch 'rs/repack-without-loosening-promised-objects' "git repack -A -d" in a partial clone unnecessarily loosened objects in promisor pack. * rs/repack-without-loosening-promised-objects: repack: avoid loosening promisor objects in partial clones	2021-05-10 16:59:47 +09:00
Junio C Hamano	8e97852919	Merge branch 'ds/sparse-index-protections' Builds on top of the sparse-index infrastructure to mark operations that are not ready to mark with the sparse index, causing them to fall back on fully-populated index that they always have worked with. * ds/sparse-index-protections: (47 commits) name-hash: use expand_to_path() sparse-index: expand_to_path() name-hash: don't add directories to name_hash revision: ensure full index resolve-undo: ensure full index read-cache: ensure full index pathspec: ensure full index merge-recursive: ensure full index entry: ensure full index dir: ensure full index update-index: ensure full index stash: ensure full index rm: ensure full index merge-index: ensure full index ls-files: ensure full index grep: ensure full index fsck: ensure full index difftool: ensure full index commit: ensure full index checkout: ensure full index ...	2021-04-30 13:50:26 +09:00
Rafael Silva	a643157d5a	repack: avoid loosening promisor objects in partial clones When `git repack -A -d` is run in a partial clone, `pack-objects` is invoked twice: once to repack all promisor objects, and once to repack all non-promisor objects. The latter `pack-objects` invocation is with --exclude-promisor-objects and --unpack-unreachable, which loosens all objects unused during this invocation. Unfortunately, this includes promisor objects. Because the -d argument to `git repack` subsequently deletes all loose objects also in packs, these just-loosened promisor objects will be immediately deleted. However, this extra disk churn is unnecessary in the first place. For example, in a newly-cloned partial repo that filters all blob objects (e.g. `--filter=blob:none`), `repack` ends up unpacking all trees and commits into the filesystem because every object, in this particular case, is a promisor object. Depending on the repo size, this increases the disk usage considerably: In my copy of the linux.git, the object directory peaked 26GB of more disk usage. In order to avoid this extra disk churn, pass the names of the promisor packfiles as --keep-pack arguments to the second invocation of `pack-objects`. This informs `pack-objects` that the promisor objects are already in a safe packfile and, therefore, do not need to be loosened. For testing, we need to validate whether any object was loosened. However, the "evidence" (loosened objects) is deleted during the process which prevents us from inspecting the object directory. Instead, let's teach `pack-objects` to count loosened objects and emit via trace2 thus allowing inspecting the debug events after the process is finished. This new event is used on the added regression test. Lastly, add a new perf test to evaluate the performance impact made by this changes (tested on git.git): Test HEAD^ HEAD ---------------------------------------------------------- 5600.3: gc 134.38(41.93+90.95) 7.80(6.72+1.35) -94.2% For a bigger repository, such as linux.git, the improvement is even bigger: Test HEAD^ HEAD ------------------------------------------------------------------- 5600.3: gc 6833.00(918.07+3162.74) 268.79(227.02+39.18) -96.1% These improvements are particular big because every object in the newly-cloned partial repository is a promisor object. Reported-by: SZEDER Gábor <szeder.dev@gmail.com> Helped-by: Jeff King <peff@peff.net> Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Rafael Silva <rafaeloliveira.cs@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-28 13:36:13 +09:00
Jeff King	c1fa951d7e	revision: avoid parsing with --exclude-promisor-objects When --exclude-promisor-objects is given, before traversing any objects we iterate over all of the objects in any promisor packs, marking them as UNINTERESTING and SEEN. We turn the oid we get from iterating the pack into an object with parse_object(), but this has two problems: - it's slow; we are zlib inflating (and reconstructing from deltas) every byte of every object in the packfile - it leaves the tree buffers attached to their structs, which means our heap usage will grow to store every uncompressed tree simultaneously. This can be gigabytes. We can obviously fix the second by freeing the tree buffers after we've parsed them. But we can observe that the function doesn't look at the object contents at all! The only reason we call parse_object() is that we need a "struct object" on which to set the flags. There are two options here: - we can look up just the object type via oid_object_info(), and then call the appropriate lookup_foo() function - we can call lookup_unknown_object(), which gives us an OBJ_NONE struct (which will get auto-converted later by object_as_type() via calls to lookup_commit(), etc). The first one is closer to the current code, but we do pay the price to look up the type for each object. The latter should be more efficient in CPU, though it wastes a little bit of memory (the "unknown" object structs are a union of all object types, so some of the structs are bigger than they need to be). It also runs the risk of triggering a latent bug in code that calls lookup_object() directly but isn't ready to handle OBJ_NONE (such code would already be buggy, but we use lookup_unknown_object() infrequently enough that it might be hiding). I went with the second option here. I don't think the risk is high (and we'd want to find and fix any such bugs anyway), and it should be more efficient overall. The new tests in p5600 show off the improvement (this is on git.git): Test HEAD^ HEAD ------------------------------------------------------------------------------- 5600.5: count commits 0.37(0.37+0.00) 0.38(0.38+0.00) +2.7% 5600.6: count non-promisor commits 11.74(11.37+0.37) 0.04(0.03+0.00) -99.7% The improvement is particularly big in this script because _every_ object in the newly-cloned partial repo is a promisor object. So after marking them all, there's nothing left to traverse. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 13:22:37 -07:00
Jeff King	fcc07e980b	is_promisor_object(): free tree buffer after parsing To get the list of all promisor objects, we not only include all objects in promisor packs, but also parse each of those objects to see which objects they reference. After parsing a tree object, the tree->buffer field will remain populated until we explicitly free it. So in a partial clone of blob:none, for example, we are essentially reading every tree in the repository (since they're all in the initial promisor pack), and keeping all of their uncompressed contents in memory at once. This patch frees the tree buffers after we've finished marking all of their reachable objects. We shouldn't need to do this for any other object type. While we are using some extra memory to store the structs, no other object type stores the whole contents in its parsed form (we do sometimes hold on to commit buffers, but less so these days due to commit graphs, plus most commands which care about promisor objects turn off the save_commit_buffer global). Even for a moderate-sized repository like git.git, this patch drops the peak heap (as measured by massif) for git-fsck from ~1.7GB to ~138MB. Fsck is a good candidate for measuring here because it doesn't interact with the promisor code except to call is_promisor_object(), so we can isolate just this problem. The added perf test shows only a tiny improvement on my machine for git.git, since 1.7GB isn't enough to cause any real memory pressure: Test HEAD^ HEAD -------------------------------------------------------------------------------- 5600.4: fsck 21.26(20.90+0.35) 20.84(20.79+0.04) -2.0% With linux.git the absolute change is a bit bigger, though still a small percentage: Test HEAD^ HEAD ----------------------------------------------------------------------------- 5600.4: fsck 262.26(259.13+3.12) 254.92(254.62+0.29) -2.8% I didn't have the patience to run it under massif with linux.git, but it's probably on the order of about 14GB improvement, since that's the sum of the sizes of all of the uncompressed trees (but still isn't enough to create memory pressure on this particular machine, which has 64GB of RAM). Smaller machines would probably see a bigger effect on runtime (and sadly our perf suite does not measure peak heap). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-13 13:16:39 -07:00
Junio C Hamano	58840e62a4	Merge branch 'ps/pack-bitmap-optim' Optimize "rev-list --use-bitmap-index --objects" corner case that uses negative tags as the stopping points. * ps/pack-bitmap-optim: pack-bitmap: avoid traversal of objects referenced by uninteresting tag	2021-04-07 16:54:09 -07:00
Derrick Stolee	c9e40ae8ec	p2000: add sparse-index repos p2000-sparse-operations.sh compares different Git commands in repositories with many files at HEAD but using sparse-checkout to focus on a small portion of those files. Add extra copies of the repository that use the sparse-index format so we can track how that affects the performance of different commands. At this point in time, the sparse-index is 100% overhead from the CPU front, and this is measurable in these tests: Test --------------------------------------------------------------- 2000.2: git status (full-index-v3) 0.59(0.51+0.12) 2000.3: git status (full-index-v4) 0.59(0.52+0.11) 2000.4: git status (sparse-index-v3) 1.40(1.32+0.12) 2000.5: git status (sparse-index-v4) 1.41(1.36+0.08) 2000.6: git add -A (full-index-v3) 2.32(1.97+0.19) 2000.7: git add -A (full-index-v4) 2.17(1.92+0.14) 2000.8: git add -A (sparse-index-v3) 2.31(2.21+0.15) 2000.9: git add -A (sparse-index-v4) 2.30(2.20+0.13) 2000.10: git add . (full-index-v3) 2.39(2.02+0.20) 2000.11: git add . (full-index-v4) 2.20(1.94+0.16) 2000.12: git add . (sparse-index-v3) 2.36(2.27+0.12) 2000.13: git add . (sparse-index-v4) 2.33(2.21+0.16) 2000.14: git commit -a -m A (full-index-v3) 2.47(2.12+0.20) 2000.15: git commit -a -m A (full-index-v4) 2.26(2.00+0.17) 2000.16: git commit -a -m A (sparse-index-v3) 3.01(2.92+0.16) 2000.17: git commit -a -m A (sparse-index-v4) 3.01(2.94+0.15) Note that there is very little difference between the v3 and v4 index formats when the sparse-index is enabled. This is primarily due to the fact that the relative file sizes are the same, and the command time is mostly taken up by parsing tree objects to expand the sparse index into a full one. With the current file layout, the index file sizes are given by this table: \| full index \| sparse index \| +-------------+--------------+ v3 \| 108 MiB \| 1.6 MiB \| v4 \| 80 MiB \| 1.2 MiB \| Future updates will improve the performance of Git commands when the index is sparse. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:49 -07:00
Derrick Stolee	0b5fcb08b5	t/perf: add performance test for sparse operations Create a test script that takes the default performance test (the Git codebase) and multiplies it by 256 using four layers of duplicated trees of width four. This results in nearly one million blob entries in the index. Then, we can clone this repository with sparse-checkout patterns that demonstrate four copies of the initial repository. Each clone will use a different index format or mode so peformance can be tested across the different options. Note that the initial repo is stripped of submodules before doing the copies. This preserves the expected data shape of the sparse index, because directories containing submodules are not collapsed to a sparse directory entry. Run a few Git commands on these clones, especially those that use the index (status, add, commit). Here are the results on my Linux machine: Test -------------------------------------------------------------- 2000.2: git status (full-index-v3) 0.37(0.30+0.09) 2000.3: git status (full-index-v4) 0.39(0.32+0.10) 2000.4: git add -A (full-index-v3) 1.42(1.06+0.20) 2000.5: git add -A (full-index-v4) 1.26(0.98+0.16) 2000.6: git add . (full-index-v3) 1.40(1.04+0.18) 2000.7: git add . (full-index-v4) 1.26(0.98+0.17) 2000.8: git commit -a -m A (full-index-v3) 1.42(1.11+0.16) 2000.9: git commit -a -m A (full-index-v4) 1.33(1.08+0.16) It is perhaps noteworthy that there is an improvement when using index version 4. This is because the v3 index uses 108 MiB while the v4 index uses 80 MiB. Since the repeated portions of the directories are very short (f3/f1/f2, for example) this ratio is less pronounced than in similarly-sized real repositories. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-30 12:57:44 -07:00
Junio C Hamano	858119f6d7	Merge branch 'nk/diff-index-fsmonitor' "git diff-index" codepath has been taught to trust fsmonitor status to reduce number of lstat() calls. * nk/diff-index-fsmonitor: fsmonitor: add perf test for git diff HEAD fsmonitor: add assertion that fsmonitor is valid to check_removed fsmonitor: skip lstat deletion check during git diff-index	2021-03-24 14:36:27 -07:00
Junio C Hamano	2744383cbd	Merge branch 'tb/geometric-repack' "git repack" so far has been only capable of repacking everything under the sun into a single pack (or split by size). A cleverer strategy to reduce the cost of repacking a repository has been introduced. * tb/geometric-repack: builtin/pack-objects.c: ignore missing links with --stdin-packs builtin/repack.c: reword comment around pack-objects flags builtin/repack.c: be more conservative with unsigned overflows builtin/repack.c: assign pack split later t7703: test --geometric repack with loose objects builtin/repack.c: do not repack single packs with --geometric builtin/repack.c: add '--geometric' option packfile: add kept-pack cache for find_kept_pack_entry() builtin/pack-objects.c: rewrite honor-pack-keep logic p5303: measure time to repack with keep p5303: add missing &&-chains builtin/pack-objects.c: add '--stdin-packs' option revision: learn '--no-kept-objects' packfile: introduce 'find_kept_pack_entry()'	2021-03-24 14:36:27 -07:00
Junio C Hamano	e8d5a423ca	Merge branch 'jk/perf-in-worktrees' Perf test update to work better in secondary worktrees. * jk/perf-in-worktrees: t/perf: avoid copying worktree files from test repo t/perf: handle worktrees as test repos	2021-03-22 14:00:23 -07:00
Patrick Steinhardt	540cdc11ad	pack-bitmap: avoid traversal of objects referenced by uninteresting tag When preparing the bitmap walk, we first establish the set of of have and want objects by iterating over the set of pending objects: if an object is marked as uninteresting, it's declared as an object we already have, otherwise as an object we want. These two sets are then used to compute which transitively referenced objects we need to obtain. One special case here are tag objects: when a tag is requested, we resolve it to its first not-tag object and add both resolved objects as well as the tag itself into either the have or want set. Given that the uninteresting-property always propagates to referenced objects, it is clear that if the tag is uninteresting, so are its children and vice versa. But we fail to propagate the flag, which effectively means that referenced objects will always be interesting except for the case where they have already been marked as uninteresting explicitly. This mislabeling does not impact correctness: we now have it in our "wants" set, and given that we later do an `AND NOT` of the bitmaps of "wants" and "haves" sets it is clear that the result must be the same. But we now start to needlessly traverse the tag's referenced objects in case it is uninteresting, even though we know that each referenced object will be uninteresting anyway. In the worst case, this can lead to a complete graph walk just to establish that we do not care for any object. Fix the issue by propagating the `UNINTERESTING` flag to pointees of tag objects and add a benchmark with negative revisions to p5310. This shows some nice performance benefits, tested with linux.git: Test HEAD~ HEAD --------------------------------------------------------------------------------------------------------------- 5310.3: repack to disk 193.18(181.46+16.42) 194.61(183.41+15.83) +0.7% 5310.4: simulated clone 25.93(24.88+1.05) 25.81(24.73+1.08) -0.5% 5310.5: simulated fetch 2.64(5.30+0.69) 2.59(5.16+0.65) -1.9% 5310.6: pack to file (bitmap) 58.75(57.56+6.30) 58.29(57.61+5.73) -0.8% 5310.7: rev-list (commits) 1.45(1.18+0.26) 1.46(1.22+0.24) +0.7% 5310.8: rev-list (objects) 15.35(14.22+1.13) 15.30(14.23+1.07) -0.3% 5310.9: rev-list with tag negated via --not --all (objects) 22.49(20.93+1.56) 0.11(0.09+0.01) -99.5% 5310.10: rev-list with negative tag (objects) 0.61(0.44+0.16) 0.51(0.35+0.16) -16.4% 5310.11: rev-list count with blob:none 12.15(11.19+0.96) 12.18(11.19+0.99) +0.2% 5310.12: rev-list count with blob:limit=1k 17.77(15.71+2.06) 17.75(15.63+2.12) -0.1% 5310.13: rev-list count with tree:0 1.69(1.31+0.38) 1.68(1.28+0.39) -0.6% 5310.14: simulated partial clone 20.14(19.15+0.98) 19.98(18.93+1.05) -0.8% 5310.16: clone (partial bitmap) 12.78(13.89+1.07) 12.72(13.99+1.01) -0.5% 5310.17: pack to file (partial bitmap) 42.07(45.44+2.72) 41.44(44.66+2.80) -1.5% 5310.18: rev-list with tree filter (partial bitmap) 0.44(0.29+0.15) 0.46(0.32+0.14) +4.5% While most benchmarks are probably in the range of noise, the newly added 5310.9 and 5310.10 benchmarks consistenly perform better. Signed-off-by: Patrick Steinhardt <ps@pks.im>. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-22 12:10:56 -07:00
Nipunn Koorapati	7e5aa13d2c	fsmonitor: add perf test for git diff HEAD Update the xargs call so that if your large repo contains symlinks, test-tool chmtime failure does not end the script. On Linux Test this tree upstream/master --------------------------------------------------------------------------------------------------------- 7519.4: status (fsmonitor=fsmonitor-watchman) 0.52(0.43+0.10) 0.53(0.49+0.05) +1.9% 7519.5: status -uno (fsmonitor=fsmonitor-watchman) 0.21(0.15+0.07) 0.22(0.13+0.09) +4.8% 7519.6: status -uall (fsmonitor=fsmonitor-watchman) 1.65(0.93+0.71) 1.69(1.03+0.65) +2.4% 7519.7: status (dirty) (fsmonitor=fsmonitor-watchman) 11.99(11.34+1.58) 11.95(11.02+1.79) -0.3% 7519.8: diff (fsmonitor=fsmonitor-watchman) 0.25(0.17+0.26) 0.25(0.18+0.26) +0.0% 7519.9: diff HEAD (fsmonitor=fsmonitor-watchman) 0.39(0.25+0.34) 0.89(0.35+0.74) +128.2% 7519.10: diff -- 0_files (fsmonitor=fsmonitor-watchman) 0.16(0.13+0.04) 0.16(0.12+0.05) +0.0% 7519.11: diff -- 10_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.12: diff -- 100_files (fsmonitor=fsmonitor-watchman) 0.16(0.12+0.05) 0.16(0.12+0.05) +0.0% 7519.13: diff -- 1000_files (fsmonitor=fsmonitor-watchman) 0.16(0.11+0.06) 0.16(0.12+0.05) +0.0% 7519.14: diff -- 10000_files (fsmonitor=fsmonitor-watchman) 0.18(0.13+0.06) 0.17(0.10+0.08) -5.6% 7519.15: add (fsmonitor=fsmonitor-watchman) 2.25(1.53+0.68) 2.25(1.47+0.74) +0.0% 7519.18: status (fsmonitor=disabled) 0.88(0.73+1.03) 0.89(0.67+1.08) +1.1% 7519.19: status -uno (fsmonitor=disabled) 0.45(0.43+0.89) 0.45(0.34+0.98) +0.0% 7519.20: status -uall (fsmonitor=disabled) 1.88(1.16+1.58) 1.88(1.22+1.51) +0.0% 7519.21: status (dirty) (fsmonitor=disabled) 7.53(7.05+2.11) 7.53(6.98+2.04) +0.0% 7519.22: diff (fsmonitor=disabled) 0.42(0.37+0.92) 0.42(0.38+0.91) +0.0% 7519.23: diff HEAD (fsmonitor=disabled) 0.44(0.41+0.90) 0.44(0.40+0.91) +0.0% 7519.24: diff -- 0_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.09+0.05) +0.0% 7519.25: diff -- 10_files (fsmonitor=disabled) 0.13(0.10+0.04) 0.13(0.10+0.04) +0.0% 7519.26: diff -- 100_files (fsmonitor=disabled) 0.13(0.09+0.05) 0.13(0.10+0.04) +0.0% 7519.27: diff -- 1000_files (fsmonitor=disabled) 0.13(0.09+0.06) 0.13(0.09+0.05) +0.0% 7519.28: diff -- 10000_files (fsmonitor=disabled) 0.14(0.11+0.05) 0.14(0.10+0.05) +0.0% 7519.29: add (fsmonitor=disabled) 2.43(1.61+1.64) 2.43(1.69+1.57) +0.0% On linux (2.29.2 vs w/ this patch): nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff 2>&1 \| grep lstat 0.04 0.000063 3 20 6 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c git diff HEAD 2>&1 \| grep lstat 94.98 5.242262 10 523783 13 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff 2>&1 \| grep lstat 0.38 0.000032 5 7 3 lstat nipunn@nipunn-dbx:~/src/server3$ strace -f -c ../git/bin-wrappers/git diff HEAD 2>&1 \| grep lstat 99.44 0.741892 9 81634 10 lstat On mac (2.29.2 vs w/ this patch): nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff 2>&1 \| grep "^lstat64 " lstat64 8 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c git diff HEAD 2>&1 \| grep "^lstat64 " lstat64 120242 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff 2>&1 \| grep "^lstat64 " lstat64 4 nipunn-mbp:server nipunn$ sudo dtruss -L -f -c ../git/bin-wrappers/git diff HEAD 2>&1 \| grep "^lstat64 " lstat64 4497 There are still a bunch of lstats - on directories, but not every file. Progress! Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-18 13:31:14 -07:00
Junio C Hamano	700696bcfc	Merge branch 'jh/fsmonitor-prework' Preliminary changes to fsmonitor integration. * jh/fsmonitor-prework: fsmonitor: refactor initialization of fsmonitor_last_update token fsmonitor: allow all entries for a folder to be invalidated fsmonitor: log FSMN token when reading and writing the index fsmonitor: log invocation of FSMonitor hook to trace2 read-cache: log the number of scanned files to trace2 read-cache: log the number of lstat calls to trace2 preload-index: log the number of lstat calls to trace2 p7519: add trace logging during perf test p7519: move watchman cleanup earlier in the test p7519: fix watchman watch-list test on Windows p7519: do not rely on "xargs -d" in test	2021-03-01 14:02:56 -08:00
Jeff King	36e834abc1	t/perf: avoid copying worktree files from test repo When running the perf suite, we copy files from an existing $GIT_DIR to a scratch repository to give us a realistic setup on which to operate. Since the perf scripts themselves may modify the scratch repository, we want to make sure we've scrubbed any references back to the original. One existing example is that we avoid copying the file "commondir" at the top-level of the repository. In a worktree git-dir (e.g., .git/worktrees/foo), that file contains the path to the parent repository; copying it could mean ref updates in the scratch repository affect the original. But there are other files we should cover, too: - "gitdir" in a worktree git-dir contains the path to the actual .git file in the working tree. We _shouldn't_ end up looking at it at all, since the lack of a "commondir" file means Git won't consider this to be a worktree git-dir. But it's best to err on the safe side. - in a parent repository that contains worktrees, the "$GIT_DIR/worktrees" directory will contain the git dirs for the individual worktrees. Which will themselves contain commondir and gitdir files that may reference the original repository. We should likewise remove them. Note that this does mean that the perf suite's scratch repositories will never have any worktrees. That's OK; we don't have any perf tests that are influenced by their presence. If we add any, they'd probably want to create the worktrees themselves anyway. This patch adds both paths to the set of omissions in test_perf_copy_repo_contents(). Note that we won't get confused here by matching arbitrary names like refs/heads/commondir. This list is always matching top-level entries in $GIT_DIR (we rely on "cp -R" to do the actual recursion). Suggested-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 14:21:04 -08:00
Jeff King	85b87a5396	t/perf: handle worktrees as test repos The perf suite gets confused when test_perf_default_repo is pointed at a worktree (which includes when it is run from within a worktree at all, since the default is to use the current repository). Here's an example: $ git worktree add ~/foo Preparing worktree (new branch 'foo') HEAD is now at `328c109303` The eighth batch $ cd ~/foo $ make [...build output...] $ cd t/perf $ ./p0000-perf-lib-sanity.sh -v -i [...] perf 1 - test_perf_default_repo works: running: foo=$(git rev-parse HEAD) && test_export foo fatal: ambiguous argument 'HEAD': unknown revision or path not in the working tree. Use '--' to separate paths from revisions, like this: 'git <command> [<revision>...] -- [<file>...]' The problem is that we didn't copy all of the necessary files from the source repository (in this case we got HEAD, but we have no refs!). We discover the git-dir with "rev-parse --git-dir", but this points to the worktree's partial repository in .../.git/worktrees/foo. That partial repository has a "commondir" file which points to the main repository, where the actual refs are stored, but we don't copy it. This is the correct thing to do, though! If we did copy it, then our scratch test repo would be pointing back to the original main repo, and any ref updates we made in the tests would impact that original repo. Instead, we need to either: 1. Make a scratch copy of the original main repo (in addition to the worktree repo), and point the scratch worktree repo's commondir at it. This preserves the original relationship, but it's doubtful any script really cares (if they are testing worktree performance, they'd probably make their own worktrees). And it's trickier to get right. 2. Collapse the main and worktree repos into a single scratch repo. This can be done by copying everything from both, preferring any files from the worktree repo. This patch does the second one. With this applied, the example above results in p0000 running successfully. Reported-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-26 14:21:04 -08:00
Jeff King	fbf20aeeef	p5303: measure time to repack with keep Add two new tests to measure repack performance. Both tests split the repository into synthetic "pushes", and then leave the remaining objects in a big base pack. The first new test marks an empty pack as "kept" and then passes --honor-pack-keep to avoid including objects in it. That doesn't change the resulting pack, but it does let us compare to the normal repack case to see how much overhead we add to check whether objects are kept or not. The other test is of --stdin-packs, which gives us a sense of how that number scales based on the number of packs we provide as input. In each of those tests, the empty pack isn't considered, but the residual pack (objects that were left over and not included in one of the synthetic push packs) is marked as kept. (Note that in the single-pack case of the --stdin-packs test, there is nothing do since there are no non-excluded packs). Here are some timings on a recent clone of the kernel: 5303.5: repack (1) 57.26(54.59+10.84) 5303.6: repack with kept (1) 57.33(54.80+10.51) in the 50-pack case, things start to slow down: 5303.11: repack (50) 71.54(88.57+4.84) 5303.12: repack with kept (50) 85.12(102.05+4.94) and by the time we hit 1,000 packs, things are substantially worse, even though the resulting pack produced is the same: 5303.17: repack (1000) 216.87(490.79+14.57) 5303.18: repack with kept (1000) 665.63(938.87+15.76) That's because the code paths around handling .keep files are known to scale badly; they look in every single pack file to find each object. Our solution to that was to notice that most repos don't have keep files, and to make that case a fast path. But as soon as you add a single .keep, that part of pack-objects slows down again (even if we have fewer objects total to look at). Likewise, the scaling is pretty extreme on --stdin-packs (but each subsequent test is also being asked to do more work): 5303.7: repack with --stdin-packs (1) 0.01(0.01+0.00) 5303.13: repack with --stdin-packs (50) 3.53(12.07+0.24) 5303.19: repack with --stdin-packs (1000) 195.83(371.82+8.10) Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Jeff King	60bb5f2f5d	p5303: add missing &&-chains These are in a helper function, so the usual chain-lint doesn't notice them. This function is still not perfect, as it has some git invocations on the left-hand-side of the pipe, but it's primary purpose is timing, not finding bugs or correctness issues. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-22 23:30:52 -08:00
Jeff Hostetler	4f2009dce2	p7519: add trace logging during perf test Add optional trace logging to allow us to better compare performance of various fsmonitor providers and compare results with non-fsmonitor runs. Currently, this includes Trace2 logging, but may be extended to include other trace targets, such as GIT_TRACE_FSMONITOR if desired. Using this logging helped me explain an odd behavior on MacOS where the kernel was dropping events and causing the hook to Watchman to timeout. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	a7556c3bde	p7519: move watchman cleanup earlier in the test Shutdown Watchman after the Watchman-based tests and before the block of "no fsmonitor" tests. This helps ensure that Watchman cannot affect the test results for the other. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	0917763d67	p7519: fix watchman watch-list test on Windows Only use the final portion of the test trash directory file name when verifying that Watchman was started. On Windows and under the SDK, $GIT_WORKTREE is a cygwin-style path with forward slashes and a "/c/" drive name. However `watchman watch-list` reports a proper Windows-style pathname with drive letters and backslashes. This causes the grep to fail. Since we don't really care about the full pathname (and we really don't want to bother with normalizaing them), just see if the test-name portion of the path is found. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Jeff Hostetler	eb10e637cf	p7519: do not rely on "xargs -d" in test Convert the test to use a more portable method to update the mtime on a large number of files under version control. The Mac version of xargs does not support the "-d" option. Likewise, the "-0" and "--null" options are not portable. Furthermore, use `test-tool chmtime` rather than `touch` to update the mtime to ensure that it is actually updated (especially on file systems with only whole second resolution). Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-02-16 17:14:34 -08:00
Junio C Hamano	938ecaa42f	Merge branch 'jk/pretty-lazy-load-commit' Some pretty-format specifiers do not need the data in commit object (e.g. "%H"), but we were over-eager to load and parse it, which has been made even lazier. * jk/pretty-lazy-load-commit: pretty: lazy-load commit data when expanding user-format	2021-02-10 14:48:33 -08:00
Junio C Hamano	6a7bf0ddb2	Merge branch 'jk/p5303-sed-portability-fix' into maint A perf script was made more portable. * jk/p5303-sed-portability-fix: p5303: avoid sed GNU-ism	2021-02-08 14:05:55 -08:00
Junio C Hamano	4cc0e8794d	Merge branch 'jk/p5303-sed-portability-fix' A perf script was made more portable. * jk/p5303-sed-portability-fix: p5303: avoid sed GNU-ism	2021-02-05 16:40:45 -08:00
Junio C Hamano	e93f5c6878	Merge branch 'nk/perf-fsmonitor-cleanup' into maint Test fix. * nk/perf-fsmonitor-cleanup: p7519: allow running without watchman prereq	2021-02-05 16:31:22 -08:00
Jeff King	f08b6c553d	p5303: avoid sed GNU-ism Using "1~5" isn't portable. Nobody seems to have noticed, since perhaps people don't tend to run the perf suite on more exotic platforms. Still, it's better to set a good example. We can use: perl -ne 'print if $. % 5 == 1' instead. But we can further observe that perl does a good job of the other parts of this pipeline, and fold the whole thing together. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-29 15:13:54 -08:00
Jeff King	018b9deba5	pretty: lazy-load commit data when expanding user-format When we expand a user-format, we try to avoid work that isn't necessary for the output. For instance, we don't bother parsing the commit header until we know we need the author, subject, etc. But we do always load the commit object's contents from disk, even if the format doesn't require it (e.g., just "%H"). Traditionally this didn't matter much, because we'd have loaded it as part of the traversal anyway, and we'd typically have those bytes attached to the commit struct (or these days, cached in a commit-slab). But when we have a commit-graph, we might easily get to the point of pretty-printing a commit without ever having looked at the actual object contents. We should push off that load (and reencoding) until we're certain that it's needed. I think the results of p4205 show the advantage pretty clearly (we serve parent and tree oids out of the commit struct itself, so they benefit as well): # using git.git as the test repo Test HEAD^ HEAD ---------------------------------------------------------------------- 4205.1: log with %H 0.40(0.39+0.01) 0.03(0.02+0.01) -92.5% 4205.2: log with %h 0.45(0.44+0.01) 0.09(0.09+0.00) -80.0% 4205.3: log with %T 0.40(0.39+0.00) 0.04(0.04+0.00) -90.0% 4205.4: log with %t 0.46(0.46+0.00) 0.09(0.08+0.01) -80.4% 4205.5: log with %P 0.39(0.39+0.00) 0.03(0.03+0.00) -92.3% 4205.6: log with %p 0.46(0.46+0.00) 0.10(0.09+0.00) -78.3% 4205.7: log with %h-%h-%h 0.52(0.51+0.01) 0.15(0.14+0.00) -71.2% 4205.8: log with %an-%ae-%s 0.42(0.41+0.00) 0.42(0.41+0.01) +0.0% # using linux.git as the test repo Test HEAD^ HEAD ---------------------------------------------------------------------- 4205.1: log with %H 7.12(6.97+0.14) 0.76(0.65+0.11) -89.3% 4205.2: log with %h 7.35(7.19+0.16) 1.30(1.19+0.11) -82.3% 4205.3: log with %T 7.58(7.42+0.15) 1.02(0.94+0.08) -86.5% 4205.4: log with %t 8.05(7.89+0.15) 1.55(1.41+0.13) -80.7% 4205.5: log with %P 7.12(7.01+0.10) 0.76(0.69+0.07) -89.3% 4205.6: log with %p 7.38(7.27+0.10) 1.32(1.20+0.12) -82.1% 4205.7: log with %h-%h-%h 7.81(7.67+0.13) 1.79(1.67+0.12) -77.1% 4205.8: log with %an-%ae-%s 7.90(7.74+0.15) 7.81(7.66+0.15) -1.1% I added the final test to show where we don't improve (the 1% there is just lucky noise), but also as a regression test to make sure we're not doing anything stupid like loading the commit multiple times when there are several placeholders that need it. Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-28 14:07:35 -08:00
Junio C Hamano	7bfa022993	Merge branch 'nk/perf-fsmonitor-cleanup' Test fix. * nk/perf-fsmonitor-cleanup: p7519: allow running without watchman prereq	2021-01-15 15:20:29 -08:00
Junio C Hamano	d3aff11c3e	Merge branch 'es/perf-export-fix' Tweak unneeded recursion from a test framework helper function. * es/perf-export-fix: t/perf: avoid unnecessary test_export() recursion	2021-01-06 23:33:44 -08:00
Taylor Blau	cc2d43be2b	p7519: allow running without watchman prereq p7519 measures the performance of the fsmonitor code. To do this, it uses the installed copy of Watchman. If Watchman isn't installed, a noop integration script is installed in its place. When in the latter mode, it is expected that the script should not write a "last update token": in fact, it doesn't write anything at all since the script is blank. Commit `33226af42b` (t/perf/fsmonitor: improve error message if typoing hook name, 2020-10-26) made sure that running 'git update-index --fsmonitor' did not write anything to stderr, but this is not the case when using the empty Watchman script, since Git will complain that: $ which watchman watchman not found $ cat .git/hooks/fsmonitor-empty $ git -c core.fsmonitor=.git/hooks/fsmonitor-empty update-index --fsmonitor warning: Empty last update token. Prior to `33226af42b`, the output wasn't checked at all, which allowed this noop mode to work. But, `33226af42b` breaks p7519 when running it without a 'watchman(1)' on your system. Handle this by only checking that the stderr is empty only when running with a real watchman executable. Otherwise, assert that the error message is the expected one when running in the noop mode. Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 13:48:25 -08:00
Eric Sunshine	5bc12c11cc	t/perf: avoid unnecessary test_export() recursion test_export() has been self-recursive since its inception even though a simple for-loop would have served just as well to append its arguments to the `test_export_` variable separated by the pipe character "\|". Recently `test_export_` was changed instead to a space-separated list of tokens to be exported, an operation which can be accomplished via a single simple assignment, with no need for looping or recursion. Therefore, simplify the implementation. While at it, take advantage of the fact that variable names to be exported are shell identifiers, thus won't be composed of special characters or whitespace, thus simple a `$*` can be used rather than magical `"$@"`. Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-22 13:45:36 -08:00
Junio C Hamano	d4187bd4d5	Merge branch 'es/perf-export-fix' Dev-support fix for BSD. * es/perf-export-fix: t/perf: fix test_export() failure with BSD `sed`	2020-12-18 15:15:18 -08:00
Eric Sunshine	f4698738f9	t/perf: fix test_export() failure with BSD `sed` test_perf() runs each test in its own subshell which makes it difficult to persist variables between tests. test_export() addresses this shortcoming by grabbing the values of specified variables after a test runs but before the subshell exits, and writes those values to a file which is loaded into the environment of subsequent tests. To grab the values to be persisted, test_export() pipes the output of the shell's builtin `set` command through `sed` which plucks them out using a regular expression along the lines of `s/^(var1\|var2)/.../p`. Unfortunately, though, this use of alternation is not portable. For instance, BSD-lineage `sed` (including macOS `sed`) does not support it in the default "basic regular expression" mode (BRE). It may be possible to enable "extended regular expression" mode (ERE) in some cases with `sed -E`, however, `-E` is neither portable nor part of POSIX. Fortunately, alternation is unnecessary in this case and can easily be avoided, so replace it with a series of simple expressions such as `s/^var1/.../p;s/^var2/.../p`. While at it, tighten the expressions so they match the variable names exactly rather than matching prefixes (i.e. use `s/^var1=/.../p`). If the requirements of test_export() become more complex in the future, then an alternative would be to replace `sed` with `perl` which supports alternation on all platforms, however, the simple elimination of alternation via multiple `sed` expressions suffices for the present. Reported-by: Sangeeta <sangunb09@gmail.com> Diagnosed-by: Philippe Blain <levraiphilippeblain@gmail.com> Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-16 11:00:29 -08:00
Junio C Hamano	8e2def76f7	Merge branch 'nk/perf-fsmonitor-cleanup' Test clean-up. * nk/perf-fsmonitor-cleanup: perf/fsmonitor: use test_must_be_empty helper	2020-12-08 15:11:20 -08:00
Junio C Hamano	1bc550effe	Merge branch 'ps/update-ref-multi-transaction' "git update-ref --stdin" learns to take multiple transactions in a single session. * ps/update-ref-multi-transaction: update-ref: disallow "start" for ongoing transactions p1400: use `git-update-ref --stdin` to test multiple transactions update-ref: allow creation of multiple transactions t1400: avoid touching refs on filesystem	2020-12-08 15:11:17 -08:00
Nipunn Koorapati	36fa907d7a	perf/fsmonitor: use test_must_be_empty helper Simplify test and make error messages more clear here. Per feedback from Junio in `33226af42b` (t/perf/fsmonitor: improve error message if typoing hook name, 2020-10-26) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Acked-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-30 13:51:43 -08:00
Patrick Steinhardt	21020430a4	p1400: use `git-update-ref --stdin` to test multiple transactions In commit `0a0fbbe3ff` (refs: remove lookup cache for reference-transaction hook, 2020-08-25), a new benchmark was added to p1400 which has the intention to exercise creation of multiple transactions in a single process. As git-update-ref wasn't yet able to create multiple transactions with a single run we instead used git-push. As its non-atomic version creates a transaction per reference update, this was the best approximation we could make at that point in time. Now that `git-update-ref --stdin` supports creation of multiple transactions, let's convert the benchmark to use that instead. It has less overhead and it's also a lot clearer what the actual intention is. Signed-off-by: Patrick Steinhardt <ps@pks.im> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-11-16 13:44:01 -08:00
Nipunn Koorapati	1c6833c800	t/perf/fsmonitor: add benchmark for dirty status This benchmark covers the git status time for a heavily dirty directory - benchmarking fsmonitor's refresh When running to compare our perl vs rs-git-fsmonitor - we see that the perl script incurs significant overhead - further motivation to provide a faster implementation within git. 7519.7: status (dirty) (fsmonitor=query-watchman) 10.05(7.78+1.56) 7519.20: status (dirty) (fsmonitor=rs-git-fsmonitor) 6.72(4.37+1.64) 7519.33: status (dirty) (fsmonitor=disabled) 5.62(4.24+2.03) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:34 -07:00
Nipunn Koorapati	a948864ae7	t/perf/fsmonitor: perf comparison of multiple fsmonitor integrations Allows for simple perf comparison of different integrations. I ran it to compare our perl script w/ rs-git-fsmonitor and found 20-30ms of overhead on every command. Output looks like this (extra newlines added for readability) Test this tree --------------------------------------------------------------------------- 7519.4: status (fsmonitor=query-watchman) 0.42(0.37+0.05) 7519.5: status -uno (fsmonitor=query-watchman) 0.19(0.12+0.07) 7519.6: status -uall (fsmonitor=query-watchman) 1.36(0.73+0.62) 7519.7: diff (fsmonitor=query-watchman) 0.14(0.09+0.05) 7519.8: diff -- 0_files (fsmonitor=query-watchman) 0.14(0.11+0.03) 7519.9: diff -- 10_files (fsmonitor=query-watchman) 0.14(0.10+0.04) 7519.10: diff -- 100_files (fsmonitor=query-watchman) 0.14(0.09+0.05) 7519.11: diff -- 1000_files (fsmonitor=query-watchman) 0.14(0.08+0.06) 7519.12: diff -- 10000_files (fsmonitor=query-watchman) 0.14(0.09+0.05) 7519.13: add (fsmonitor=query-watchman) 2.04(1.32+0.66) 7519.16: status (fsmonitor=rs-git-fsmonitor) 0.39(0.32+0.08) 7519.17: status -uno (fsmonitor=rs-git-fsmonitor) 0.17(0.11+0.06) 7519.18: status -uall (fsmonitor=rs-git-fsmonitor) 1.33(0.71+0.61) 7519.19: diff (fsmonitor=rs-git-fsmonitor) 0.11(0.07+0.04) 7519.20: diff -- 0_files (fsmonitor=rs-git-fsmonitor) 0.11(0.09+0.03) 7519.21: diff -- 10_files (fsmonitor=rs-git-fsmonitor) 0.11(0.09+0.03) 7519.22: diff -- 100_files (fsmonitor=rs-git-fsmonitor) 0.11(0.07+0.04) 7519.23: diff -- 1000_files (fsmonitor=rs-git-fsmonitor) 0.11(0.06+0.06) 7519.24: diff -- 10000_files (fsmonitor=rs-git-fsmonitor) 0.11(0.06+0.06) 7519.25: add (fsmonitor=rs-git-fsmonitor) 2.03(1.28+0.69) 7519.28: status (fsmonitor=disabled) 0.77(0.59+0.99) 7519.29: status -uno (fsmonitor=disabled) 0.42(0.33+0.85) 7519.30: status -uall (fsmonitor=disabled) 1.59(1.02+1.34) 7519.31: diff (fsmonitor=disabled) 0.35(0.30+0.81) 7519.32: diff -- 0_files (fsmonitor=disabled) 0.11(0.08+0.04) 7519.33: diff -- 10_files (fsmonitor=disabled) 0.11(0.07+0.04) 7519.34: diff -- 100_files (fsmonitor=disabled) 0.11(0.08+0.03) 7519.35: diff -- 1000_files (fsmonitor=disabled) 0.11(0.10+0.02) 7519.36: diff -- 10000_files (fsmonitor=disabled) 0.12(0.07+0.06) 7519.37: add (fsmonitor=disabled) 2.24(1.48+1.44) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:34 -07:00
Nipunn Koorapati	6cba4234a5	t/perf/fsmonitor: initialize test with git reset Previously, the git add of the previous suiterun would pollute the numbers in the second run Before: Test this tree ----------------------------------------------------------------------------- 7519.4: status (fsmonitor=fsmonitor-watchman) 0.40(0.36+0.04) 7519.5: status -uno (fsmonitor=fsmonitor-watchman) 0.19(0.12+0.07) 7519.6: status -uall (fsmonitor=fsmonitor-watchman) 1.36(0.74+0.61) 7519.7: diff (fsmonitor=fsmonitor-watchman) 0.14(0.10+0.04) 7519.8: diff -- 0_files (fsmonitor=fsmonitor-watchman) 0.14(0.10+0.04) 7519.9: diff -- 10_files (fsmonitor=fsmonitor-watchman) 0.14(0.09+0.05) 7519.10: diff -- 100_files (fsmonitor=fsmonitor-watchman) 0.14(0.10+0.04) 7519.11: diff -- 1000_files (fsmonitor=fsmonitor-watchman) 0.14(0.08+0.06) 7519.12: diff -- 10000_files (fsmonitor=fsmonitor-watchman) 0.14(0.10+0.04) 7519.13: add (fsmonitor=fsmonitor-watchman) 2.03(1.28+0.69) 7519.16: status (fsmonitor=disabled) 0.64(0.49+0.90) 7519.17: status -uno (fsmonitor=disabled) 1.15(0.92+1.00) 7519.18: status -uall (fsmonitor=disabled) 2.32(1.46+1.55) 7519.19: diff (fsmonitor=disabled) 1.44(1.12+1.76) 7519.20: diff -- 0_files (fsmonitor=disabled) 0.11(0.07+0.05) 7519.21: diff -- 10_files (fsmonitor=disabled) 0.11(0.06+0.05) 7519.22: diff -- 100_files (fsmonitor=disabled) 0.11(0.08+0.03) 7519.23: diff -- 1000_files (fsmonitor=disabled) 0.11(0.08+0.04) 7519.24: diff -- 10000_files (fsmonitor=disabled) 0.12(0.06+0.07) 7519.25: add (fsmonitor=disabled) 2.25(1.47+1.47) After: Test this tree ----------------------------------------------------------------------------- 7519.4: status (fsmonitor=fsmonitor-watchman) 0.41(0.33+0.09) 7519.5: status -uno (fsmonitor=fsmonitor-watchman) 0.20(0.14+0.07) 7519.6: status -uall (fsmonitor=fsmonitor-watchman) 1.37(0.78+0.58) 7519.7: diff (fsmonitor=fsmonitor-watchman) 0.14(0.10+0.04) 7519.8: diff -- 0_files (fsmonitor=fsmonitor-watchman) 0.14(0.08+0.06) 7519.9: diff -- 10_files (fsmonitor=fsmonitor-watchman) 0.14(0.09+0.05) 7519.10: diff -- 100_files (fsmonitor=fsmonitor-watchman) 0.14(0.10+0.05) 7519.11: diff -- 1000_files (fsmonitor=fsmonitor-watchman) 0.14(0.11+0.04) 7519.12: diff -- 10000_files (fsmonitor=fsmonitor-watchman) 0.14(0.09+0.05) 7519.13: add (fsmonitor=fsmonitor-watchman) 2.04(1.27+0.71) 7519.16: status (fsmonitor=disabled) 0.78(0.59+0.99) 7519.17: status -uno (fsmonitor=disabled) 0.43(0.32+0.88) 7519.18: status -uall (fsmonitor=disabled) 1.58(0.96+1.38) 7519.19: diff (fsmonitor=disabled) 0.36(0.31+0.79) 7519.20: diff -- 0_files (fsmonitor=disabled) 0.11(0.08+0.03) 7519.21: diff -- 10_files (fsmonitor=disabled) 0.11(0.07+0.04) 7519.22: diff -- 100_files (fsmonitor=disabled) 0.11(0.08+0.04) 7519.23: diff -- 1000_files (fsmonitor=disabled) 0.11(0.07+0.05) 7519.24: diff -- 10000_files (fsmonitor=disabled) 0.12(0.08+0.05) 7519.25: add (fsmonitor=disabled) 2.25(1.48+1.47) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:34 -07:00
Nipunn Koorapati	a05b71ab91	t/perf/fsmonitor: factor setup for fsmonitor into function This prepares for it being called multiple times when testing different hooks Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:34 -07:00
Nipunn Koorapati	78ff8b3236	t/perf/fsmonitor: silence initial git commit It is extremely verbose, printing >10K non-useful lines Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:34 -07:00
Nipunn Koorapati	dd79c16746	t/perf/fsmonitor: shorten DESC to basename The full name is lengthy and makes it hard to read Before: 7519.3: status (fsmonitor=/home/nipunn/src/server/.git/hooks/rs-git-fsmonitor) 0.02(0.01+0.00) After 7519.3: status (fsmonitor=rs-git-fsmonitor) 0.03(0.02+0.00) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:34 -07:00
Nipunn Koorapati	3d53ebcd10	t/perf/fsmonitor: factor description out for readability There was much duplication here. Prepares for making changes to the description. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:34 -07:00
Nipunn Koorapati	33226af42b	t/perf/fsmonitor: improve error message if typoing hook name Previously - it would silently run the perf suite w/o using fsmonitor - fsmonitor errors are not hard failures. Now it errors loudly. GIT_PERF_7519_FSMONITOR="$HOME/rs-git-fsmonitorr" ./p7519-fsmonitor.sh -i -v fatal: cannot run /home/nipunn/rs-git-fsmonitorr: No such file or directory not ok 2 - setup for fsmonitor Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:33 -07:00
Nipunn Koorapati	0288b9322d	t/perf/fsmonitor: move watchman setup to one-time-repo-setup It is only required to be set up once. This prepares for testing multiple hooks in one invocation. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:33 -07:00
Nipunn Koorapati	bb7cc7e754	t/perf/fsmonitor: separate one time repo initialization In preparation for testing multiple fsmonitor hooks Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-26 16:39:33 -07:00
Nipunn Koorapati	2bfa953e5d	p7519-fsmonitor: add a git add benchmark Test v2.29.0-rc1 this tree ----------------------------------------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.48(0.79+0.67) 1.48(0.79+0.67) +0.0% 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.11+0.05) 0.17(0.13+0.04) +6.3% 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.77+0.58) 1.37(0.72+0.63) +0.7% 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.84(0.21+0.63) 0.14(0.11+0.03) -83.3% 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.09+0.04) +8.3% 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.07+0.06) +8.3% 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.12(0.08+0.05) +0.0% 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.09+0.04) +8.3% 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.08+0.06) 0.13(0.07+0.06) -7.1% 7519.11: add (fsmonitor=.git/hooks/fsmonitor-watchman) 2.75(1.41+1.27) 2.03(1.26+0.70) -26.2% 7519.13: status (fsmonitor=) 1.38(1.03+1.04) 1.37(1.04+1.04) -0.7% 7519.14: status -uno (fsmonitor=) 1.11(0.83+0.98) 1.10(0.89+0.90) -0.9% 7519.15: status -uall (fsmonitor=) 2.30(1.57+1.42) 2.31(1.49+1.50) +0.4% 7519.16: diff (fsmonitor=) 1.43(1.13+1.76) 1.46(1.19+1.72) +2.1% 7519.17: diff -- 0_files (fsmonitor=) 0.10(0.08+0.04) 0.11(0.08+0.04) +10.0% 7519.18: diff -- 10_files (fsmonitor=) 0.10(0.07+0.05) 0.11(0.08+0.04) +10.0% 7519.19: diff -- 100_files (fsmonitor=) 0.10(0.07+0.04) 0.11(0.07+0.05) +10.0% 7519.20: diff -- 1000_files (fsmonitor=) 0.10(0.08+0.03) 0.11(0.08+0.04) +10.0% 7519.21: diff -- 10000_files (fsmonitor=) 0.11(0.08+0.05) 0.12(0.07+0.06) +9.1% 7519.22: add (fsmonitor=) 2.26(1.46+1.49) 2.27(1.42+1.55) +0.4% Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 12:52:23 -07:00
Nipunn Koorapati	471b115745	p7519-fsmonitor: refactor to avoid code duplication Much of the benchmark code is redundant. This is easier to understand and edit. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 12:52:23 -07:00
Nipunn Koorapati	ed5a24573d	perf lint: add make test-lint to perf tests Perf tests have not been linted for some time. They've grown some seq instead of test_seq. This runs the existing lints on the perf tests as well. Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 12:52:23 -07:00
Nipunn Koorapati	89afd5f5ad	t/perf: add fsmonitor perf test for git diff Results for the git-diff fsmonitor optimization in patch in the parent-rev (using a 400k file repo to test) As you can see here - git diff with fsmonitor running is significantly better with this patch series (80% faster on my workload)! GIT_PERF_LARGE_REPO=~/src/server ./run v2.29.0-rc1 . -- p7519-fsmonitor.sh Test v2.29.0-rc1 this tree ----------------------------------------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 1.46(0.82+0.64) 1.47(0.83+0.62) +0.7% 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.16(0.12+0.04) 0.17(0.12+0.05) +6.3% 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 1.37(0.76+0.60) +0.7% 7519.5: diff (fsmonitor=.git/hooks/fsmonitor-watchman) 0.85(0.22+0.63) 0.14(0.10+0.05) -83.5% 7519.6: diff -- 0_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.05) 0.13(0.11+0.02) +8.3% 7519.7: diff -- 10_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.08+0.04) 0.13(0.09+0.04) +8.3% 7519.8: diff -- 100_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.07+0.05) 0.13(0.07+0.06) +8.3% 7519.9: diff -- 1000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.12(0.09+0.04) 0.13(0.08+0.05) +8.3% 7519.10: diff -- 10000_files (fsmonitor=.git/hooks/fsmonitor-watchman) 0.14(0.09+0.05) 0.13(0.10+0.03) -7.1% 7519.12: status (fsmonitor=) 1.67(0.93+1.49) 1.67(0.99+1.42) +0.0% 7519.13: status -uno (fsmonitor=) 0.37(0.30+0.82) 0.37(0.33+0.79) +0.0% 7519.14: status -uall (fsmonitor=) 1.58(0.97+1.35) 1.57(0.86+1.45) -0.6% 7519.15: diff (fsmonitor=) 0.34(0.28+0.83) 0.34(0.27+0.83) +0.0% 7519.16: diff -- 0_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.08+0.02) +0.0% 7519.17: diff -- 10_files (fsmonitor=) 0.09(0.07+0.03) 0.09(0.06+0.05) +0.0% 7519.18: diff -- 100_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.06+0.04) +0.0% 7519.19: diff -- 1000_files (fsmonitor=) 0.09(0.06+0.04) 0.09(0.05+0.05) +0.0% 7519.20: diff -- 10000_files (fsmonitor=) 0.10(0.08+0.04) 0.10(0.06+0.05) +0.0% I also added a benchmark for a tiny git diff workload w/ a pathspec. I see an approximately .02 second overhead added w/ and w/o fsmonitor From looking at these results, I suspected that refresh_fsmonitor is already happening during git diff - independent of this patch series' optimization. Confirmed that suspicion by breaking on refresh_fsmonitor. (gdb) bt [simplified] 0 refresh_fsmonitor at fsmonitor.c:176 1 ie_match_stat at read-cache.c:375 2 match_stat_with_submodule at diff-lib.c:237 4 builtin_diff_files at builtin/diff.c:260 5 cmd_diff at builtin/diff.c:541 6 run_builtin at git.c:450 7 handle_builtin at git.c:700 8 run_argv at git.c:767 9 cmd_main at git.c:898 10 main at common-main.c:52 Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 12:52:22 -07:00
Nipunn Koorapati	5851462e8d	t/perf/p7519-fsmonitor.sh: warm cache on first git status The first git status would be inflated due to warming of filesystem cache. This makes the results comparable. Before Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 2.52(1.59+1.56) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.18(0.12+0.06) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.36(0.73+0.62) 7519.7: status (fsmonitor=) 0.69(0.52+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.28+0.81) 7519.9: status -uall (fsmonitor=) 1.53(0.93+1.32) After Test this tree -------------------------------------------------------------------------------- 7519.2: status (fsmonitor=.git/hooks/fsmonitor-watchman) 0.39(0.33+0.06) 7519.3: status -uno (fsmonitor=.git/hooks/fsmonitor-watchman) 0.17(0.13+0.05) 7519.4: status -uall (fsmonitor=.git/hooks/fsmonitor-watchman) 1.34(0.77+0.56) 7519.7: status (fsmonitor=) 0.70(0.53+0.90) 7519.8: status -uno (fsmonitor=) 0.37(0.32+0.78) 7519.9: status -uall (fsmonitor=) 1.55(1.01+1.25) Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 12:52:22 -07:00
Nipunn Koorapati	dc69d47d21	t/perf/README: elaborate on output format Signed-off-by: Nipunn Koorapati <nipunn@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-10-20 12:52:22 -07:00
Junio C Hamano	221b755f3a	Merge branch 'jk/dont-count-existing-objects-twice' There is a logic to estimate how many objects are in the repository, which is mean to run once per process invocation, but it ran every time the estimated value was requested. * jk/dont-count-existing-objects-twice: packfile: actually set approximate_object_count_valid	2020-09-22 12:36:32 -07:00
Jeff King	67bb65de5d	packfile: actually set approximate_object_count_valid The approximate_object_count() function tries to compute the count only once per process. But ever since it was introduced in `8e3f52d778` (find_unique_abbrev: move logic out of get_short_sha1(), 2016-10-03), we failed to actually set the "valid" flag, meaning we'd compute it fresh on every call. This turns out not to be _too_ bad, because we're only iterating through the packed_git list, and not making any system calls. But since it may get called for every abbreviated hash we output, even this can add up if you have many packs. Here are before-and-after timings for a new perf test which just asks rev-list to abbreviate each commit hash (the test repo is linux.git, with commit-graphs): Test origin HEAD ---------------------------------------------------------------------------- 5303.3: rev-list (1) 28.91(28.46+0.44) 29.03(28.65+0.38) +0.4% 5303.4: abbrev-commit (1) 1.18(1.06+0.11) 1.17(1.02+0.14) -0.8% 5303.7: rev-list (50) 28.95(28.56+0.38) 29.50(29.17+0.32) +1.9% 5303.8: abbrev-commit (50) 3.67(3.56+0.10) 3.57(3.42+0.15) -2.7% 5303.11: rev-list (1000) 30.34(29.89+0.43) 30.82(30.35+0.46) +1.6% 5303.12: abbrev-commit (1000) 86.82(86.52+0.29) 77.82(77.59+0.22) -10.4% 5303.15: load 10,000 packs 0.08(0.02+0.05) 0.08(0.02+0.06) +0.0% It doesn't help at all when we have 1 pack (5303.4), but we get a 10% speedup when there are 1000 packs (5303.12). That's a modest speedup for a case that's already slow and we'd hope to avoid in general (note how slow it is even after, because we have to look in each of those packs for abbreviations). But it's a one-line change that clearly matches the original intent, so it seems worth doing. The included perf test may also be useful for keeping an eye on any regressions in the overall abbreviation code. Reported-by: Rasmus Villemoes <rv@rasmusvillemoes.dk> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-17 11:36:14 -07:00
Junio C Hamano	6ddd76fd6c	Merge branch 'ps/ref-transaction-hook' Code simplification by removing ineffective optimization. * ps/ref-transaction-hook: refs: remove lookup cache for reference-transaction hook	2020-08-31 15:49:53 -07:00
Patrick Steinhardt	0a0fbbe3ff	refs: remove lookup cache for reference-transaction hook When adding the reference-transaction hook, there were concerns about the performance impact it may have on setups which do not make use of the new hook at all. After all, it gets executed every time a reftx is prepared, committed or aborted, which linearly scales with the number of reference-transactions created per session. And as there are code paths like `git push` which create a new transaction for each reference to be updated, this may translate to calling `find_hook()` quite a lot. To address this concern, a cache was added with the intention to not repeatedly do negative hook lookups. Turns out this cache caused a regression, which was fixed via `e5256c82e5` (refs: fix interleaving hook calls with reference-transaction hook, 2020-08-07). In the process of discussing the fix, we realized that the cache doesn't really help even in the negative-lookup case. While performance tests added to benchmark this did show a slight improvement in the 1% range, this really doesn't warrent having a cache. Furthermore, it's quite flaky, too. E.g. running it twice in succession produces the following results: Test master pks-reftx-hook-remove-cache -------------------------------------------------------------------------- 1400.2: update-ref 2.79(2.16+0.74) 2.73(2.12+0.71) -2.2% 1400.3: update-ref --stdin 0.22(0.08+0.14) 0.21(0.08+0.12) -4.5% Test master pks-reftx-hook-remove-cache -------------------------------------------------------------------------- 1400.2: update-ref 2.70(2.09+0.72) 2.74(2.13+0.71) +1.5% 1400.3: update-ref --stdin 0.21(0.10+0.10) 0.21(0.08+0.13) +0.0% One case notably absent from those benchmarks is a single executable searching for the hook hundreds of times, which is exactly the case for which the negative cache was added. p1400.2 will spawn a new update-ref for each transaction and p1400.3 only has a single reference-transaction for all reference updates. So this commit adds a third benchmark, which performs an non-atomic push of a thousand references. This will create a new reference transaction per reference. But even for this case, the negative cache doesn't consistently improve performance: Test master pks-reftx-hook-remove-cache -------------------------------------------------------------------------- 1400.4: nonatomic push 6.63(6.50+0.13) 6.81(6.67+0.14) +2.7% 1400.4: nonatomic push 6.35(6.21+0.14) 6.39(6.23+0.16) +0.6% 1400.4: nonatomic push 6.43(6.31+0.13) 6.42(6.28+0.15) -0.2% So let's just remove the cache altogether to simplify the code. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-25 15:34:42 -07:00
Jeff King	218389b9f3	p5302: count up to online-cpus for thread tests When PERF_EXTRA is enabled, p5302 checks the performance of index-pack with various numbers of threads. This can be useful for deciding what the default should be (which is currently capped at 3 threads based on the results of this script). However, we only go up to 8 threads, and modern machines may have more. Let's get the number of CPUs from test-tool, and test various numbers of threads between one and that maximum. Note that the current tests aren't all identical, as we have to set GIT_FORCE_THREADS for the --threads=1 test (which measures the overhead of starting a single worker thread versus the "0" case of using the main thread). To keep the loop simple, we'll keep the "0" case out of it, and set GIT_FORCE_THREADS=1 for all of the other cases (it's a noop for all but the "1" case, since numbers higher than 1 would always need threads). Note also that we could skip running "test-tool" if PERF_EXTRA isn't set. However, there's some small value in knowing the number of threads, so that we can mark each test as skipped in the output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 12:02:36 -07:00
Jeff King	47274251a4	p5302: disable thread-count parameter tests by default The primary function of the perf suite is to detect regressions (or improvements) between versions of Git. The only numbers we show a direct comparison for are timings between the same test run on two different versions. However, it can sometimes be used to collect other information. For instance, p5302 runs the same index-pack operation with different thread counts. The output doesn't directly compare these, but anybody interested in working on index-pack can manually compare the results. For a normal regression run of the full perf-suite, though, this incurs a significant cost to generate numbers nobody will actually look at; about 25% of the total time of the test suite is spent in p5302. And the low-thread-count runs are the most expensive part of it, since they're (unsurprisingly) not using as many threads. Let's skip these tests by default, but make it possible for people working on index-pack to still run them by setting an environment variable. Rather than make this specific to p5302, let's introduce a generic mechanism. This makes it possible to run the full suite with every possible test if somebody really wants to burn some CPU. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-08-21 12:02:36 -07:00
Patrick Steinhardt	6754159767	refs: implement reference transaction hook The low-level reference transactions used to update references are currently completely opaque to the user. While certainly desirable in most usecases, there are some which might want to hook into the transaction to observe all queued reference updates as well as observing the abortion or commit of a prepared transaction. One such usecase would be to have a set of replicas of a given Git repository, where we perform Git operations on all of the repositories at once and expect the outcome to be the same in all of them. While there exist hooks already for a certain subset of Git commands that could be used to implement a voting mechanism for this, many others currently don't have any mechanism for this. The above scenario is the motivation for the new "reference-transaction" hook that reaches directly into Git's reference transaction mechanism. The hook receives as parameter the current state the transaction was moved to ("prepared", "committed" or "aborted") and gets via its standard input all queued reference updates. While the exit code gets ignored in the "committed" and "aborted" states, a non-zero exit code in the "prepared" state will cause the transaction to be aborted prematurely. Given the usecase described above, a voting mechanism can now be implemented via this hook: as soon as it gets called, it will take all of stdin and use it to cast a vote to a central service. When all replicas of the repository agree, the hook will exit with zero, otherwise it will abort the transaction by returning non-zero. The most important upside is that this will catch _all_ commands writing references at once, allowing to implement strong consistency for reference updates via a single mechanism. In order to test the impact on the case where we don't have any "reference-transaction" hook installed in the repository, this commit introduce two new performance tests for git-update-refs(1). Run against an empty repository, it produces the following results: Test origin/master HEAD -------------------------------------------------------------------- 1400.2: update-ref 2.70(2.10+0.71) 2.71(2.10+0.73) +0.4% 1400.3: update-ref --stdin 0.21(0.09+0.11) 0.21(0.07+0.14) +0.0% The performance test p1400.2 creates, updates and deletes a branch a thousand times, thus averaging runtime of git-update-refs over 3000 invocations. p1400.3 instead calls `git-update-refs --stdin` three times and queues a thousand creations, updates and deletes respectively. As expected, p1400.3 consistently shows no noticeable impact, as for each batch of updates there's a single call to access(3P) for the negative hook lookup. On the other hand, for p1400.2, one can see an impact caused by this patchset. But doing five runs of the performance tests where each one was run with GIT_PERF_REPEAT_COUNT=10, the overhead ranged from -1.5% to +1.1%. These inconsistent performance numbers can be explained by the overhead of spawning 3000 processes. This shows that the overhead of assembling the hook path and executing access(3P) once to check if it's there is mostly outweighed by the operating system's overhead. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-19 10:46:13 -07:00
Jeff King	9639474b6d	pack-bitmap: pass object filter to fill-in traversal Sometimes a bitmap traversal still has to walk some commits manually, because those commits aren't included in the bitmap packfile (e.g., due to a push or commit since the last full repack). If we're given an object filter, we don't pass it down to this traversal. It's not necessary for correctness because the bitmap code has its own filters to post-process the bitmap result (which it must, to filter out the objects that _are_ mentioned in the bitmapped packfile). And with blob filters, there was no performance reason to pass along those filters, either. The fill-in traversal could omit them from the result, but it wouldn't save us any time to do so, since we'd still have to walk each tree entry to see if it's a blob or not. But now that we support tree filters, there's opportunity for savings. A tree:depth=0 filter means we can avoid accessing trees entirely, since we know we won't them (or any of the subtrees or blobs they point to). The new test in p5310 shows this off (the "partial bitmap" state is one where HEAD~100 and its ancestors are all in a bitmapped pack, but HEAD~100..HEAD are not). Here are the results (run against linux.git): Test HEAD^ HEAD ------------------------------------------------------------------------------------------------- [...] 5310.16: rev-list with tree filter (partial bitmap) 0.19(0.17+0.02) 0.03(0.02+0.01) -84.2% The absolute number of savings isn't _huge_, but keep in mind that we only omitted 100 first-parent links (in the version of linux.git here, that's 894 actual commits). In a more pathological case, we might have a much larger proportion of non-bitmapped commits. I didn't bother creating such a case in the perf script because the setup is expensive, and this is plenty to show the savings as a percentage. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 21:57:58 -07:00
Taylor Blau	b0a8d4820b	pack-bitmap.c: support 'tree:0' filtering In the previous patch, we made it easy to define other filters that exclude all objects of a certain type. Use that in order to implement bitmap-level filtering for the '--filter=tree:<n>' filter when 'n' is equal to 0. The general case is not helped by bitmaps, since for values of 'n > 0', the object filtering machinery requires a full-blown tree traversal in order to determine the depth of a given tree. Caching this is non-obvious, too, since the same tree object can have a different depth depending on the context (e.g., a tree was moved up in the directory hierarchy between two commits). But, the 'n = 0' case can be helped, and this patch does so. Running p5310.11 in this tree and on master with the kernel, we can see that this case is helped substantially: Test master this tree -------------------------------------------------------------------------------- 5310.11: rev-list count with tree:0 10.68(10.39+0.27) 0.06(0.04+0.01) -99.4% Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-05-04 21:57:58 -07:00
Junio C Hamano	6ae3c79788	Merge branch 'jk/fast-import-use-hashmap' The custom hash function used by "git fast-import" has been replaced with the one from hashmap.c, which gave us a nice performance boost. * jk/fast-import-use-hashmap: fast-import: replace custom hash with hashmap.c	2020-04-28 15:49:58 -07:00
Jeff King	d8410a816b	fast-import: replace custom hash with hashmap.c We use a custom hash in fast-import to store the set of objects we've imported so far. It has a fixed set of 2^16 buckets and chains any collisions with a linked list. As the number of objects grows larger than that, the load factor increases and we degrade to O(n) lookups and O(n^2) insertions. We can scale better by using our hashmap.c implementation, which will resize the bucket count as we grow. This does incur an extra memory cost of 8 bytes per object, as hashmap stores the integer hash value for each entry in its hashmap_entry struct (which we really don't care about here, because we're just reusing the embedded object hash). But I think the numbers below justify this (and our per-object memory cost is already much higher). I also looked at using khash, but it seemed to perform slightly worse than hashmap at all sizes, and worse even than the existing code for small sizes. It's also awkward to use here, because we want to look up a "struct object_entry" from a "struct object_id", and it doesn't handle mismatched keys as well. Making a mapping of object_id to object_entry would be more natural, but that would require pulling the embedded oid out of the object_entry or incurring an extra 32 bytes per object. In a synthetic test creating as many cheap, tiny objects as possible perl -e ' my $bits = shift; my $nr = 2**$bits; for (my $i = 0; $i < $nr; $i++) { print "blob\n"; print "data 4\n"; print pack("N", $i); } ' $bits \| git fast-import I got these results: nr_objects master khash hashmap 2^20 0m4.317s 0m5.109s 0m3.890s 2^21 0m10.204s 0m9.702s 0m7.933s 2^22 0m27.159s 0m17.911s 0m16.751s 2^23 1m19.038s 0m35.080s 0m31.963s 2^24 4m18.766s 1m10.233s 1m6.793s which points to hashmap as the winner. We didn't have any perf tests for fast-export or fast-import, so I added one as a more real-world case. It uses an export without blobs since that's significantly cheaper than a full one, but still is an interesting case people might use (e.g., for rewriting history). It will emphasize this change in some ways (as a percentage we spend more time making objects and less shuffling blob bytes around) and less in others (the total object count is lower). Here are the results for linux.git: Test HEAD^ HEAD ---------------------------------------------------------------------------- 9300.1: export (no-blobs) 67.64(66.96+0.67) 67.81(67.06+0.75) +0.3% 9300.2: import (no-blobs) 284.04(283.34+0.69) 198.09(196.01+0.92) -30.3% It only has ~5.2M commits and trees, so this is a larger effect than I expected (the 2^23 case above only improved by 50s or so, but here we gained almost 90s). This is probably due to actually performing more object lookups in a real import with trees and commits, as opposed to just dumping a bunch of blobs into a pack. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-04-06 13:41:24 -07:00
Jeff King	14d277879c	p5310: stop timing non-bitmap pack-to-disk Commit `645c432d61` (pack-objects: use reachability bitmap index when generating non-stdout pack, 2016-09-10) added two timing tests for packing to an on-disk file, both with and without bitmaps. However, the non-bitmap one isn't interesting to have as part of p5310's regression suite. It _could_ be used as a baseline to show off the improvement in the bitmap case, but: - the point of the t/perf suite is to find performance regressions, and it won't help with that. We don't compare the numbers between two tests (which the perf suite has no idea are even related), and any change in its numbers would have nothing to do with bitmaps. - it did show off the improvement in the commit message of `645c432d61`, but it wasn't even necessary there. The bitmap case already shows an improvement (because before the patch, it behaved the same as the non-bitmap case), and the perf suite is even able to show the difference between the before and after measurements. On top of that, it's one of the most expensive tests in the suite, clocking in around 60s for linux.git on my machine (as compared to 16s for the bitmapped version). And by default when using "./run", we'd run it three times! So let's just drop it. It's not useful and is adding minutes to perf runs. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-03-27 15:11:21 -07:00
Jeff King	3ab3185f99	pack-objects: support filters with bitmaps Just as rev-list recently learned to combine filters and bitmaps, let's do the same for pack-objects. The infrastructure is all there; we just need to pass along our filter options, and the pack-bitmap code will decide to use bitmaps or not. This unsurprisingly makes things faster for partial clones of large repositories (here we're cloning linux.git): Test HEAD^ HEAD ------------------------------------------------------------------------------ 5310.11: simulated partial clone 38.94(37.28+5.87) 11.06(11.27+4.07) -71.6% Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-14 10:46:22 -08:00
Jeff King	84243da129	pack-bitmap: implement BLOB_LIMIT filtering Just as the previous commit implemented BLOB_NONE, we can support BLOB_LIMIT filters by looking at the sizes of any blobs in the result and unsetting their bits as appropriate. This is slightly more expensive than BLOB_NONE, but still produces a noticeable speedup (these results are on git.git): Test HEAD~2 HEAD ------------------------------------------------------------------------------------ 5310.9: rev-list count with blob:none 1.80(1.77+0.02) 0.22(0.20+0.02) -87.8% 5310.10: rev-list count with blob:limit=1k 1.99(1.96+0.03) 0.29(0.25+0.03) -85.4% The implementation is similar to the BLOB_NONE one, with the exception that we have to go object-by-object while walking the blob-type bitmap (since we can't mask out the matches, but must look up the size individually for each blob). The trick with using ctz64() is taken from show_objects_for_type(), which likewise needs to find individual bits (but wants to quickly skip over big chunks without blobs). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-14 10:46:22 -08:00
Jeff King	4f3bd5606a	pack-bitmap: implement BLOB_NONE filtering We can easily support BLOB_NONE filters with bitmaps. Since we know the types of all of the objects, we just need to clear the result bits of any blobs. Note two subtleties in the implementation (which I also called out in comments): - we have to include any blobs that were specifically asked for (and not reached through graph traversal) to match the non-bitmap version - we have to handle in-pack and "ext_index" objects separately. Arguably prepare_bitmap_walk() could be adding these ext_index objects to the type bitmaps. But it doesn't for now, so let's match the rest of the bitmap code here (it probably wouldn't be an efficiency improvement to do so since the cost of extending those bitmaps is about the same as our loop here, but it might make the code a bit simpler). Here are perf results for the new test on git.git: Test HEAD^ HEAD -------------------------------------------------------------------------------- 5310.9: rev-list count with blob:none 1.67(1.62+0.05) 0.22(0.21+0.02) -86.8% Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-14 10:46:22 -08:00
Jeff King	4eb707ebd6	rev-list: allow commit-only bitmap traversals Ever since we added reachability bitmap support, we've been able to use it with rev-list to get the full list of objects, like: git rev-list --objects --use-bitmap-index --all But you can't do so without --objects, since we weren't ready to just show the commits. However, the internals of the bitmap code are mostly ready for this: they avoid opening up trees when walking to fill in the bitmaps. We just need to actually pass in the rev_info to traverse_bitmap_commit_list() so it knows which types to bother triggering our callback for. For completeness, the perf test now covers both the existing --objects case, as well as the new commits-only behavior (the objects one got way faster when we introduced bitmaps, but obviously isn't improved now). Here are numbers for linux.git: Test HEAD^ HEAD ------------------------------------------------------------------------ 5310.7: rev-list (commits) 8.29(8.10+0.19) 1.76(1.72+0.04) -78.8% 5310.8: rev-list (objects) 8.06(7.94+0.12) 8.14(7.94+0.13) +1.0% That run was cheating a little, as I didn't have any commit-graph in the repository, and we'd built it by default these days when running git-gc. Here are numbers with a commit-graph: Test HEAD^ HEAD ------------------------------------------------------------------------ 5310.7: rev-list (commits) 0.70(0.58+0.12) 0.51(0.46+0.04) -27.1% 5310.8: rev-list (objects) 6.20(6.09+0.10) 6.27(6.16+0.11) +1.1% Still an improvement, but a lot less impressive. We could have the perf script remove any commit-graph to show the out-sized effect, but it probably makes sense to leave it in what would be a more typical setup. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-02-14 10:46:22 -08:00
Junio C Hamano	6d831b8a3e	Merge branch 'cs/store-packfiles-in-hashmap' In a repository with many packfiles, the cost of the procedure that avoids registering the same packfile twice was unnecessarily high by using an inefficient search algorithm, which has been corrected. * cs/store-packfiles-in-hashmap: packfile.c: speed up loading lots of packfiles	2019-12-16 13:08:32 -08:00
Junio C Hamano	55c37d12d3	Merge branch 'jk/perf-wo-git-dot-pm' Test cleanup. * jk/perf-wo-git-dot-pm: t/perf: don't depend on Git.pm	2019-12-10 13:11:44 -08:00
Junio C Hamano	7cb0d37f6d	Merge branch 'tg/perf-remove-stale-result' PerfTest fix to avoid stale result mixed up with the latest round of test results. * tg/perf-remove-stale-result: perf-lib: use a single filename for all measurement types	2019-12-06 15:09:23 -08:00
Colin Stolley	ec48540fe8	packfile.c: speed up loading lots of packfiles When loading packfiles on start-up, we traverse the internal packfile list once per file to avoid reloading packfiles that have already been loaded. This check runs in quadratic time, so for poorly maintained repos with a large number of packfiles, it can be pretty slow. Add a hashmap containing the packfile names as we load them so that the average runtime cost of checking for already-loaded packs becomes constant. Add a perf test to p5303 to show speed-up. The existing p5303 test runtimes are dominated by other factors and do not show an appreciable speed-up. The new test in p5303 clearly exposes a speed-up in bad cases. In this test we create 10,000 packfiles and measure the start-up time of git rev-parse, which does little else besides load in the packs. Here are the numbers for the new p5303 test: Test HEAD^ HEAD --------------------------------------------------------------------- 5303.12: load 10,000 packs 1.03(0.92+0.10) 0.12(0.02+0.09) -88.3% Signed-off-by: Colin Stolley <cstolley@runbox.com> Helped-by: Jeff King <peff@peff.net> [jc: squashed the change to call hashmap in install_packed_git() by peff] Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-12-03 07:59:45 -08:00
Junio C Hamano	8faff3899e	Merge branch 'jk/optim-in-pack-idx-conversion' Code clean-up. * jk/optim-in-pack-idx-conversion: pack-objects: avoid pointless oe_map_new_pack() calls	2019-12-01 09:04:38 -08:00
Jeff King	528d9e6d01	t/perf: don't depend on Git.pm The perf suite's aggregate.perl depends on Git.pm, which is a mild annoyance if you've built git with NO_PERL. It turns out that the only thing we use it for is a single call of the command_oneline() helper. We can just replace this with backticks or similar. Annoyingly, perl has no backtick equivalent that avoids a shell eval, which means our $arg would require quoting. This probably doesn't matter for our purposes, but it's better to be safe and model good style. So we'll just provide a short helper around open(), which takes its arguments as a list. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-27 10:53:36 +09:00
Jeff King	b8dcc45387	perf-lib: use a single filename for all measurement types The perf tests write files recording the results of tests. These results are later aggregated by 'aggregate.perl'. If the tests are run multiple times, those results are overwritten by the new results. This works just fine as long as there are only perf tests measuring the times, whose results are stored in "$base".times files. However `22bec79d1a` ("t/perf: add infrastructure for measuring sizes", 2018-08-17) introduced a new type of test for measuring the size of something. The results of this are written to "$base".size files. "$base" is essentially made up of the basename of the script plus the test number. So if test numbers shift because a new test was introduced earlier in the script we might end up with both a ".times" and a ".size" file for the same test. In the aggregation script the ".times" file is preferred over the ".size" file, so some size tests might end with performance numbers from a previous run of the test. This is mainly relevant when writing perf tests that check both performance and sizes, and can get quite confusing during developement. We could fix this by doing a more thorough job of cleaning out old ".times" and ".size" files before running each test. However, an even easier solution is to just use the same filename for both types of measurement, meaning we'll always overwrite the previous result. We don't even need to change the file format to distinguish the two; aggregate.perl already decides which is which based on a regex of the content (this may become ambiguous if we add new types in the future, but we could easily add a header field to the file at that point). Based on an initial patch from Thomas Gummerer, who discovered the problem and did all of the analysis (which I stole for the commit message above): https://public-inbox.org/git/20191119185047.8550-1-t.gummerer@gmail.com/ Helped-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-27 10:48:25 +09:00
Jeff King	f66e0401ab	pack-objects: avoid pointless oe_map_new_pack() calls This patch fixes an extreme slowdown in pack-objects when you have more than 1023 packs. See below for numbers. Since `43fa44fa3b` (pack-objects: move in_pack out of struct object_entry, 2018-04-14), we use a complicated system to save some per-object memory. Each object_entry structs gets a 10-bit field to store the index of the pack it's in. We map those indices into pointers using packing_data->in_pack_by_idx, which we initialize at the start of the program. If we have 2^10 or more packs, then we instead create an array of pack pointers, one per object. This is packing_data->in_pack. So far so good. But there's one other tricky case: if a new pack arrives after we've initialized in_pack_by_idx, it won't have an index yet. We solve that by calling oe_map_new_pack(), which just switches on the fly to the less-optimal in_pack mechanism, allocating the array and back-filling it for already-seen objects. But that logic kicks in even when we've switched to it already (whether because we really did see a new pack, or because we had too many packs in the first place). The result doesn't produce a wrong outcome, but it's very slow. What happens is this: - imagine you have a repo with 500k objects and 2000 packs that you want to repack. - before looking at any objects, we call prepare_in_pack_by_idx(). It starts allocating an index for each pack. On the 1024th pack, it sees there are too many, so it bails, leaving in_pack_by_idx as NULL. - while actually adding objects to the packing list, we call oe_set_in_pack(), which checks whether the pack already has an index. If it's one of the packs after the first 1023, then it doesn't have one, and we'll call oe_map_new_pack(). But there's no useful work for that function to do. We're already using in_pack, so it just uselessly walks over the complete list of objects, trying to backfill in_pack. And we end up doing this for almost 1000 packs (each of which may be triggered by more than one object). And each time it triggers, we may iterate over up to 500k objects. So in the absolute worst case, this is quadratic in the number of objects. The solution is simple: we don't need to bother checking whether the pack has an index if we've already converted to using in_pack, since by definition we're not going to use it. So we can just push the "does the pack have a valid index" check down into that half of the conditional, where we know we're going to use it. The current test in p5303 sadly doesn't notice this problem, since it maxes out at 1000 packs. If we add a new test to it at 2000 packs, it does show the improvement: Test HEAD^ HEAD ---------------------------------------------------------------------- 5303.12: repack (2000) 26.72(39.68+0.67) 15.70(28.70+0.66) -41.2% However, these many-pack test cases are rather expensive to run, so adding larger and larger numbers isn't appealing. Instead, we can show it off more easily by using GIT_TEST_FULL_IN_PACK_ARRAY, which forces us into the absolute worst case: no pack has an index, so we'll trigger oe_map_new_pack() pointlessly for every single object, making it truly quadratic. Here are the numbers (on git.git) with the included change to p5303: Test HEAD^ HEAD ---------------------------------------------------------------------- 5303.3: rev-list (1) 2.05(1.98+0.06) 2.06(1.99+0.06) +0.5% 5303.4: repack (1) 33.45(33.46+0.19) 2.75(2.73+0.22) -91.8% 5303.6: rev-list (50) 2.07(2.01+0.06) 2.06(2.01+0.05) -0.5% 5303.7: repack (50) 34.21(35.18+0.16) 3.49(4.50+0.12) -89.8% 5303.9: rev-list (1000) 2.87(2.78+0.08) 2.88(2.80+0.07) +0.3% 5303.10: repack (1000) 41.26(51.30+0.47) 10.75(20.75+0.44) -73.9% Again, those improvements aren't realistic for the 1-pack case (because in the real world, the full-array solution doesn't kick in), but it's more useful to be testing the more-complicated code path. While we're looking at this issue, we'll tweak one more thing: in oe_map_new_pack(), we call REALLOC_ARRAY(pack->in_pack). But we'd never expect to get here unless we're back-filling it for the first time, in which case it would be NULL. So let's switch that to ALLOC_ARRAY() for clarity, and add a BUG() to document the expectation. Unfortunately this code isn't well-covered in the test suite because it's inherently racy (it only kicks in if somebody else adds a new pack while we're in the middle of repacking). Signed-off-by: Jeff King <peff@peff.net> Reviewed-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-12 13:36:36 +09:00
Elijah Newren	96c0caf5e3	Fix spelling errors in messages shown to users Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-10 16:00:54 +09:00
Jeff King	362f8b280c	t/perf: rename duplicate-numbered test script There are two perf scripts numbered p5600, but with otherwise different names ("clone-reference" versus "partial-clone"). We store timing results in files named after the whole script, so internally we don't get confused between the two. But "aggregate.perl" just prints the test number for each result, giving multiple entries for "5600.3". It also makes it impossible to skip one test but not the other with GIT_SKIP_TESTS. Let's renumber the one that appeared later (by date -- the source of the problem is that the two were developed on independent branches). For the non-perf test suite, our test-lint rule would have complained about this when the two were merged, but t/perf never learned that trick. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-08-12 09:05:13 -07:00
Jeff King	39b44ba771	check_everything_connected: assume alternate ref tips are valid When we receive a remote ref update to sha1 "X", we want to check that we have all of the objects needed by "X". We can assume that our repository is not currently corrupted, and therefore if we have a ref pointing at "Y", we have all of its objects. So we can stop our traversal from "X" as soon as we hit "Y". If we make the same non-corruption assumption about any repositories we use to store alternates, then we can also use their ref tips to shorten the traversal. This is especially useful when cloning with "--reference", as we otherwise do not have any local refs to check against, and have to traverse the whole history, even though the other side may have sent us few or no objects. Here are results for the included perf test (which shows off more or less the maximal savings, getting one new commit and sharing the whole history): Test HEAD^ HEAD -------------------------------------------------------------------- [on git.git] 5600.3: clone --reference 2.94(2.86+0.08) 0.09(0.08+0.01) -96.9% [on linux.git] 5600.3: clone --reference 45.74(45.34+0.41) 0.36(0.30+0.08) -99.2% Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-07-01 10:11:09 -07:00
Junio C Hamano	82dca958dd	Merge branch 'ab/perf-installed-fix' Performance test framework has been broken and measured the version of Git that happens to be on $PATH, not the specified one to measure, for a while, which has been corrected. * ab/perf-installed-fix: perf-lib.sh: forbid the use of GIT_TEST_INSTALLED perf tests: add "bindir" prefix to git tree test results perf-lib.sh: remove GIT_TEST_INSTALLED from perf-lib.sh perf-lib.sh: make "./run <revisions>" use the correct gits perf aggregate: remove GIT_TEST_INSTALLED from --codespeed perf README: correct docs for `3c8f12c96c` regression	2019-05-19 16:45:28 +09:00
Junio C Hamano	6cfa633565	Merge branch 'jk/perf-aggregate-wo-libjson' The script to aggregate perf result unconditionally depended on libjson-perl even though it did not have to, which has been corrected. * jk/perf-aggregate-wo-libjson: t/perf: depend on perl JSON only when using --codespeed	2019-05-13 23:50:34 +09:00
Junio C Hamano	e7a1b38f9c	Merge branch 'jk/p5302-avoid-collision-check-cost' Fix index-pack perf test so that the repeated invocations always run in an empty repository, which emulates the initial clone situation better. * jk/p5302-avoid-collision-check-cost: p5302: create the repo in each index-pack test	2019-05-13 23:50:32 +09:00
Junio C Hamano	2bfb182bc5	Merge branch 'ew/repack-with-bitmaps-by-default' The connectivity bitmaps are created by default in bare repositories now; also the pathname hash-cache is created by default to avoid making crappy deltas when repacking. * ew/repack-with-bitmaps-by-default: pack-objects: default to writing bitmap hash-cache t5310: correctly remove bitmaps for jgit test repack: enable bitmaps by default on bare repos	2019-05-13 23:50:32 +09:00
Junio C Hamano	5b51f0d38d	Merge branch 'js/partial-clone-connectivity-check' During an initial "git clone --depth=..." partial clone, it is pointless to spend cycles for a large portion of the connectivity check that enumerates and skips promisor objects (which by definition is all objects fetched from the other side). This has been optimized out. * js/partial-clone-connectivity-check: t/perf: add perf script for partial clones clone: do faster object check for partial clones	2019-05-13 23:50:32 +09:00
Ævar Arnfjörð Bjarmason	82b7eb231d	perf-lib.sh: forbid the use of GIT_TEST_INSTALLED As noted in preceding commits setting GIT_TEST_INSTALLED has never been supported or documented, and as noted in an earlier t/perf/README change to the extent that it's been documented nobody's notices that the example hasn't worked since `3c8f12c96c` ("test-lib: reorder and include GIT-BUILD-OPTIONS a lot earlier", 2012-06-24). We could directly support GIT_TEST_INSTALLED for invocations without the "run" script, such as: GIT_TEST_INSTALLED=../../ ./p0000-perf-lib-sanity.sh GIT_TEST_INSTALLED=/home/avar/g/git ./p0000-perf-lib-sanity.sh But while not having this "error" will "work", it won't write the the resulting "test-results/*" files to the right place, and thus a subsequent call to aggregate.perl won't work as expected. Let's just tell the user that they need to use the "run" script, which'll correctly deal with this and set the right PERF_RESULTS_PREFIX. If someone's in desperate need of bypassing "run" for whatever reason they can trivially do so by setting "PERF_SET_GIT_TEST_INSTALLED", but not we won't have people who expect GIT_TEST_INSTALLED to just work wondering why their aggregation doesn't work, even though they're running the right "git". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>	2019-05-08 11:00:28 +09:00
Ævar Arnfjörð Bjarmason	fab80ee79d	perf tests: add "bindir" prefix to git tree test results Change the output file names in test-results/ to be "test-results/bindir_<munged dir>" rather than just "test-results/<munged dir>". This is for consistency with the "build_" directories we have for built revisions, i.e. "test-results/build_<SHA-1>". There's no user-visible functional changes here, it just makes it easier to see at a glance what "test-results" files are of what "type" as they're all explicitly grouped together now, and to grep this code to find both the run_dirs_helper() implementation and its corresponding aggregate.perl code. Note that we already guarantee that the rest of the PERF_RESULTS_PREFIX is an absolute path, and since it'll start with e.g. "/" which we munge to "_" we'll up with a readable string like "bindir_home_avar[...]". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>	2019-05-08 11:00:28 +09:00
Ævar Arnfjörð Bjarmason	df0f502195	perf-lib.sh: remove GIT_TEST_INSTALLED from perf-lib.sh Follow-up my preceding change which fixed the immediate "./run <revisions>" regression in `0baf78e7bc` ("perf-lib.sh: rely on test-lib.sh for --tee handling", 2019-03-15) and entirely get rid of GIT_TEST_INSTALLED from perf-lib.sh (and aggregate.perl). As noted in that change the dance we're doing with GIT_TEST_INSTALLED perf-lib.sh isn't necessary, but there I was doing the most minimal set of changes to quickly fix a regression. But it's much simpler to never deal with the "GIT_TEST_INSTALLED" we were setting in perf-lib.sh at all. Instead the run_dirs_helper() sets the previously inferred $PERF_RESULTS_PREFIX directly. Setting this at the callsite that's already best positioned to exhaustively know about all the different cases we need to handle where PERF_RESULTS_PREFIX isn't what we want already (the empty string) makes the most sense. In one-off cases like: ./run ./p0000-perf-lib-sanity.sh ./p0000-perf-lib-sanity.sh We'll just do the right thing because PERF_RESULTS_PREFIX will be empty, and test-lib.sh takes care of finding where our git is. Any refactoring of this code needs to change both the shell code and the Perl code in aggregate.perl, because when running e.g.: ./run ../../ -- <test> The "../../" path to a relative bindir needs to be munged to a filename containing the results, and critically aggregate.perl does not get passed the path to those aggregations, just "../..". Let's fix cases where aggregate.perl would print e.g. ".." in its report output for this, and "git" for "/home/avar/g/git", i.e. it would always pick the last element. Now'll always print the full path instead. This also makes the code sturdier, e.g. you can feed "../.." to "./run" and then an absolute path to the aggregate.perl script, as long as the absolute path and "../.." resolved to the same directory printing the aggregation will work. Also simplify the "[_*]" on the RHS of "tr -c", we're trimming everything to "_", so we don't need that. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>	2019-05-08 11:00:28 +09:00

1 2 3 4 5 ...

326 commits