development/git - HydraGit

mirror of https://github.com/git/git synced 2024-08-27 03:29:21 +00:00

Author	SHA1	Message	Date
Junio C Hamano	0fea6b73f1	Merge branch 'tb/multi-pack-verbatim-reuse' Streaming spans of packfile data used to be done only from a single, primary, pack in a repository with multiple packfiles. It has been extended to allow reuse from other packfiles, too. * tb/multi-pack-verbatim-reuse: (26 commits) t/perf: add performance tests for multi-pack reuse pack-bitmap: enable reuse from all bitmapped packs pack-objects: allow setting `pack.allowPackReuse` to "single" t/test-lib-functions.sh: implement `test_trace2_data` helper pack-objects: add tracing for various packfile metrics pack-bitmap: prepare to mark objects from multiple packs for reuse pack-revindex: implement `midx_pair_to_pack_pos()` pack-revindex: factor out `midx_key_to_pack_pos()` helper midx: implement `midx_preferred_pack()` git-compat-util.h: implement checked size_t to uint32_t conversion pack-objects: include number of packs reused in output pack-objects: prepare `write_reused_pack_verbatim()` for multi-pack reuse pack-objects: prepare `write_reused_pack()` for multi-pack reuse pack-objects: pass `bitmapped_pack`'s to pack-reuse functions pack-objects: keep track of `pack_start` for each reuse pack pack-objects: parameterize pack-reuse routines over a single pack pack-bitmap: return multiple packs via `reuse_partial_packfile_from_bitmap()` pack-bitmap: simplify `reuse_partial_packfile_from_bitmap()` signature ewah: implement `bitmap_is_empty()` pack-bitmap: pass `bitmapped_pack` struct to pack-reuse functions ...	2024-01-12 16:09:56 -08:00
Junio C Hamano	ec5ab1482d	Merge branch 'js/update-urls-in-doc-and-comment' Stale URLs have been updated to their current counterparts (or archive.org) and HTTP links are replaced with working HTTPS links. * js/update-urls-in-doc-and-comment: doc: refer to internet archive doc: update links for andre-simon.de doc: switch links to https doc: update links to current pages	2023-12-18 14:10:12 -08:00
Taylor Blau	ba47d88795	t/perf: add performance tests for multi-pack reuse To ensure that we don't regress either the size or runtime performance of multi-pack reuse, add a performance test to measure both of these. The test partitions the objects in GIT_TEST_PERF_LARGE_REPO into 1, 10, and 100 packs, and then tries to perform a "clone" at each stage with both single- and multi-pack reuse enabled. Note that the `repack_into_n_chunks()` function in this new test script differs from the existing `repack_into_n()`. The former partitions the repository into N equal-sized chunks, while the latter produces N packs of five commits each (plus their objects), and then another pack with the remainder. On git.git, I can produce the following results on my machine: Test this tree -------------------------------------------------------------------------------- 5332.3: clone for 1-pack scenario (single-pack reuse) 1.57(2.99+0.15) 5332.4: clone size for 1-pack scenario (single-pack reuse) 231.8M 5332.5: clone for 1-pack scenario (multi-pack reuse) 1.79(2.96+0.21) 5332.6: clone size for 1-pack scenario (multi-pack reuse) 231.7M 5332.9: clone for 10-pack scenario (single-pack reuse) 3.89(16.75+0.35) 5332.10: clone size for 10-pack scenario (single-pack reuse) 209.9M 5332.11: clone for 10-pack scenario (multi-pack reuse) 1.56(2.99+0.17) 5332.12: clone size for 10-pack scenario (multi-pack reuse) 224.4M 5332.15: clone for 100-pack scenario (single-pack reuse) 8.24(54.31+0.59) 5332.16: clone size for 100-pack scenario (single-pack reuse) 278.3M 5332.17: clone for 100-pack scenario (multi-pack reuse) 2.13(2.44+0.33) 5332.18: clone size for 100-pack scenario (multi-pack reuse) 357.9M Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-12-14 14:38:09 -08:00
Junio C Hamano	98d0a1f93e	Merge branch 'vd/for-each-ref-unsorted-optimization' "git for-each-ref --no-sort" still sorted the refs alphabetically which paid non-trivial cost. It has been redefined to show output in an unspecified order, to allow certain optimizations to take advantage of. * vd/for-each-ref-unsorted-optimization: t/perf: add perf tests for for-each-ref ref-filter.c: use peeled tag for '*' format fields for-each-ref: clean up documentation of --format ref-filter.c: filter & format refs in the same callback ref-filter.c: refactor to create common helper functions ref-filter.c: rename 'ref_filter_handler()' to 'filter_one()' ref-filter.h: add functions for filter/format & format-only ref-filter.h: move contains caches into filter ref-filter.h: add max_count and omit_empty to ref_format ref-filter.c: really don't sort when using --no-sort	2023-12-09 16:37:50 -08:00
Josh Soref	d05b08cd52	doc: switch links to https These sites offer https versions of their content. Using the https versions provides some protection for users. Signed-off-by: Josh Soref <jsoref@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-11-26 10:07:05 +09:00
Victoria Dye	294bfc2441	t/perf: add perf tests for for-each-ref Add performance tests for 'for-each-ref'. The tests exercise different combinations of filters/formats/options, as well as the overall performance of 'git for-each-ref \| git cat-file --batch-check' to demonstrate the performance difference vs. 'git for-each-ref' with "%(fieldname)" format specifiers. All tests are run against a repository with 40k loose refs - 10k commits, each having a unique: - branch - custom ref (refs/custom/special_) - annotated tag pointing at the commit - annotated tag pointing at the other annotated tag (i.e., a nested tag) After those tests are finished, the refs are packed with 'pack-refs --all' and the same tests are rerun. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-11-16 14:03:01 +09:00
Patrick Steinhardt	13420028e5	global: convert trivial usages of `test <expr> -a/-o <expr>` Our coding guidelines say to not use `test` with `-a` and `-o` because it can easily lead to bugs. Convert trivial cases where we still use these to instead instead concatenate multiple invocations of `test` via `&&` and `\|\|`, respectively. While not all of the converted instances can cause ambiguity, it is worth getting rid of all of them regardless: - It becomes easier to reason about the code as we do not have to argue why one use of `-a`/`-o` is okay while another one isn't. - We don't encourage people to use these expressions. Signed-off-by: Patrick Steinhardt <ps@pks.im> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-11-11 09:21:00 +09:00
Shuqi Liang	f9815878c1	check-attr: integrate with sparse-index Set the requires-full-index to false for "check-attr". Add a test to ensure that the index is not expanded whether the files are outside or inside the sparse-checkout cone when the sparse index is enabled. The `p2000` tests demonstrate a ~63% execution time reduction for 'git check-attr' using a sparse index. Test before after ----------------------------------------------------------------------- 2000.106: git check-attr -a f2/f4/a (full-v3) 0.05 0.05 +0.0% 2000.107: git check-attr -a f2/f4/a (full-v4) 0.05 0.05 +0.0% 2000.108: git check-attr -a f2/f4/a (sparse-v3) 0.04 0.02 -50.0% 2000.109: git check-attr -a f2/f4/a (sparse-v4) 0.04 0.01 -75.0% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-08-11 09:44:52 -07:00
Junio C Hamano	a813d9e239	Merge branch 'sl/worktree-sparse' "git worktree" learned to work better with sparse index feature. * sl/worktree-sparse: worktree: integrate with sparse-index	2023-06-23 11:21:16 -07:00
Shuqi Liang	8fac776f44	worktree: integrate with sparse-index The index is read in 'worktree.c' at two points: 1.The 'validate_no_submodules' function, which checks if there are any submodules present in the worktree. 2.The 'check_clean_worktree' function, which verifies if a worktree is 'clean', i.e., there are no untracked or modified but uncommitted files. This is done by running the 'git status' command, and an error message is thrown if the worktree is not clean. Given that 'git status' is already sparse-aware, the function is also sparse-aware. Hence we can just set the requires-full-index to false for "git worktree". Add tests that verify that 'git worktree' behaves correctly when the sparse index is enabled and test to ensure the index is not expanded. The `p2000` tests demonstrate a ~20% execution time reduction for 'git worktree' using a sparse index: (Note:the p2000 test results didn't reflect the huge speedup because of the index reading time is minuscule comparing to the filesystem operations.) Test before after ----------------------------------------------------------------------- 2000.102: git worktree add....(full-v3) 3.15 2.82 -10.5% 2000.103: git worktree add....(full-v4) 3.14 2.84 -9.6% 2000.104: git worktree add....(sparse-v3) 2.59 2.14 -16.4% 2000.105: git worktree add....(sparse-v4) 2.10 1.57 -25.2% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Acked-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-06-12 12:13:58 -07:00
Shuqi Liang	48c5fbfb89	diff-tree: integrate with sparse index The index is read in 'cmd_diff_tree' at two points: 1. The first index read was added in `fd66bcc31f` (diff-tree: read the index so attribute checks work in bare repositories, 2017-12-06) to deal with reading '.gitattributes' content. `77efbb366a` (attr: be careful about sparse directories, 2021-09-08) established that, in a sparse index, we do _not_ try to load a '.gitattributes' file from within a sparse directory. 2. The second index access point is involved in rename detection, specifically when reading from stdin.This was initially added in `f0c6b2a2fd` ([PATCH] Optimize diff-tree -[CM]--stdin, 2005-05-27), where 'setup' was set to 'DIFF_SETUP_USE_SIZE_CACHE \|DIFF_SETUP_USE_CACHE'. That assignment was later modified to drop the'DIFF_SETUP_USE_CACHE' in `ff7fe37b05` (diff.c: move read_index() code back to the caller, 2018-08-13).However, 'DIFF_SETUP_USE_SIZE_CACHE' seems to be unused as of `6e0b8ed6d3` (diff.c: do not use a separate "size cache"., 2007-05-07) and nothing about 'detect_rename' otherwise indicates index usage. Hence we can just set the requires-full-index to false for "diff-tree". Add tests that verify that 'git diff-tree' behaves correctly when the sparse index is enabled and test to ensure the index is not expanded. The `p2000` tests demonstrate a ~98% execution time reduction for 'git diff-tree' using a sparse index: Test before after ----------------------------------------------------------------------- 2000.94: git diff-tree HEAD (full-v3) 0.05 0.04 -20.0% 2000.95: git diff-tree HEAD (full-v4) 0.06 0.05 -16.7% 2000.96: git diff-tree HEAD (sparse-v3) 0.59 0.01 -98.3% 2000.97: git diff-tree HEAD (sparse-v4) 0.61 0.01 -98.4% 2000.98: git diff-tree HEAD -- f2/f4/a (full-v3) 0.05 0.05 +0.0% 2000.99: git diff-tree HEAD -- f2/f4/a (full-v4) 0.05 0.04 -20.0% 2000.100: git diff-tree HEAD -- f2/f4/a (sparse-v3) 0.58 0.01 -98.3% 2000.101: git diff-tree HEAD -- f2/f4/a (sparse-v4) 0.55 0.01 -98.2% Helped-by: Victoria Dye <vdye@github.com> Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-18 10:40:33 -07:00
Junio C Hamano	5ca11547bb	Merge branch 'sl/diff-files-sparse' Teach "diff-files" not to expand sparse-index unless needed. * sl/diff-files-sparse: diff-files: integrate with sparse index t1092: add tests for `git diff-files`	2023-05-15 13:59:06 -07:00
Shuqi Liang	8c30be9176	diff-files: integrate with sparse index Remove full index requirement for `git diff-files`. Refactor the ensure_expanded and ensure_not_expanded functions by introducing a common helper function, ensure_index_state. Add test to ensure the index is no expanded in `git diff-files`. The `p2000` tests demonstrate a ~96% execution time reduction for 'git diff-files' and a ~97% execution time reduction for 'git diff-files' for a file using a sparse index: Test before after ----------------------------------------------------------------------- 2000.94: git diff-files (full-v3) 0.09 0.08 -11.1% 2000.95: git diff-files (full-v4) 0.09 0.09 +0.0% 2000.96: git diff-files (sparse-v3) 0.52 0.02 -96.2% 2000.97: git diff-files (sparse-v4) 0.51 0.02 -96.1% 2000.98: git diff-files -- f2/f4/a (full-v3) 0.06 0.07 +16.7% 2000.99: git diff-files -- f2/f4/a (full-v4) 0.08 0.08 +0.0% 2000.100: git diff-files -- f2/f4/a (sparse-v3) 0.46 0.01 -97.8% 2000.101: git diff-files -- f2/f4/a (sparse-v4) 0.51 0.02 -96.1% Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-05-09 14:26:36 -07:00
Junio C Hamano	849c8b3dbf	Merge branch 'tb/pack-revindex-on-disk' The on-disk reverse index that allows mapping from the pack offset to the object name for the object stored at the offset has been enabled by default. * tb/pack-revindex-on-disk: t: invert `GIT_TEST_WRITE_REV_INDEX` config: enable `pack.writeReverseIndex` by default pack-revindex: introduce `pack.readReverseIndex` pack-revindex: introduce GIT_TEST_REV_INDEX_DIE_ON_DISK pack-revindex: make `load_pack_revindex` take a repository t5325: mark as leak-free pack-write.c: plug a leak in stage_tmp_packfiles()	2023-04-27 16:00:59 -07:00
Junio C Hamano	7ac228c994	Merge branch 'rn/sparse-describe' "git describe --dirty" learns to work better with sparse-index. * rn/sparse-describe: describe: enable sparse index for describe	2023-04-21 15:35:04 -07:00
Junio C Hamano	d47ee0a565	Merge branch 'sl/sparse-write-tree' "git write-tree" learns to work better with sparse-index. * sl/sparse-write-tree: write-tree: integrate with sparse index	2023-04-17 18:05:11 -07:00
Taylor Blau	a8dd7e05b1	config: enable `pack.writeReverseIndex` by default Back in `e37d0b8730` (builtin/index-pack.c: write reverse indexes, 2021-01-25), Git learned how to read and write a pack's reverse index from a file instead of in-memory. A pack's reverse index is a mapping from pack position (that is, the order that objects appear together in a ".pack") to their position in lexical order (that is, the order that objects are listed in an ".idx" file). Reverse indexes are consulted often during pack-objects, as well as during auxiliary operations that require mapping between pack offsets, pack order, and index index. They are useful in GitHub's infrastructure, where we have seen a dramatic increase in performance when writing ".rev" files[1]. In particular: - an ~80% reduction in the time it takes to serve fetches on a popular repository, Homebrew/homebrew-core. - a ~60% reduction in the peak memory usage to serve fetches on that same repository. - a collective savings of ~35% in CPU time across all pack-objects invocations serving fetches across all repositories in a single datacenter. Reverse indexes are also beneficial to end-users as well as forges. For example, the time it takes to generate a pack containing the objects for the 10 most recent commits in linux.git (representing a typical push) is significantly faster when on-disk reverse indexes are available: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~10 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 543.0 ms ± 20.3 ms [User: 616.2 ms, System: 58.8 ms] Range (min … max): 521.0 ms … 577.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 245.0 ms ± 11.4 ms [User: 335.6 ms, System: 31.3 ms] Range (min … max): 226.0 ms … 259.6 ms 13 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran 2.22 ± 0.13 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' The same is true of writing a pack containing the objects for the 30 most-recent commits: $ { git rev-parse HEAD && printf '^' && git rev-parse HEAD~30 } >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} pack-objects --delta-base-offset --revs --stdout <in >/dev/null' Benchmark 1: git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 866.5 ms ± 16.2 ms [User: 1414.5 ms, System: 97.0 ms] Range (min … max): 839.3 ms … 886.9 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null Time (mean ± σ): 581.6 ms ± 10.2 ms [User: 1181.7 ms, System: 62.6 ms] Range (min … max): 567.5 ms … 599.3 ms 10 runs Summary 'git.compile -c pack.readReverseIndex=true pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ran 1.49 ± 0.04 times faster than 'git.compile -c pack.readReverseIndex=false pack-objects --delta-base-offset --revs --stdout <in >/dev/null' ...and savings on trivial operations like computing the on-disk size of a single (packed) object are even more dramatic: $ git rev-parse HEAD >in $ hyperfine -L v false,true 'git.compile -c pack.readReverseIndex={v} cat-file --batch-check="%(objectsize:disk)" <in' Benchmark 1: git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 305.8 ms ± 11.4 ms [User: 264.2 ms, System: 41.4 ms] Range (min … max): 290.3 ms … 331.1 ms 10 runs Benchmark 2: git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in Time (mean ± σ): 4.0 ms ± 0.3 ms [User: 1.7 ms, System: 2.3 ms] Range (min … max): 1.6 ms … 4.6 ms 1155 runs Summary 'git.compile -c pack.readReverseIndex=true cat-file --batch-check="%(objectsize:disk)" <in' ran 76.96 ± 6.25 times faster than 'git.compile -c pack.readReverseIndex=false cat-file --batch-check="%(objectsize:disk)" <in' In the more than two years since `e37d0b8730` was merged, Git's implementation of on-disk reverse indexes has been thoroughly tested, both from users enabling `pack.writeReverseIndexes`, and from GitHub's deployment of the feature. The latter has been running without incident for more than two years. This patch changes Git's behavior to write on-disk reverse indexes by default when indexing a pack, which should make the above operations faster for everybody's Git installation after a repack. (The previous commit explains some potential drawbacks of using on-disk reverse indexes in certain limited circumstances, that essentially boil down to a trade-off between time to generate, and time to access. For those limited cases, the `pack.readReverseIndex` escape hatch can be used). [1]: https://github.blog/2021-04-29-scaling-monorepo-maintenance/#reverse-indexes Signed-off-by: Taylor Blau <me@ttaylorr.com> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-13 07:55:46 -07:00
Junio C Hamano	7727da99df	Merge branch 'ds/ahead-behind' "git for-each-ref" learns '%(ahead-behind:<base>)' that computes the distances from a single reference point in the history with bunch of commits in bulk. * ds/ahead-behind: commit-reach: add tips_reachable_from_bases() for-each-ref: add ahead-behind format atom commit-reach: implement ahead_behind() logic commit-graph: introduce `ensure_generations_valid()` commit-graph: return generation from memory commit-graph: simplify compute_generation_numbers() commit-graph: refactor compute_topological_levels() for-each-ref: explicitly test no matches for-each-ref: add --stdin option	2023-04-06 13:38:21 -07:00
Shuqi Liang	1a65b41b38	write-tree: integrate with sparse index Update 'git write-tree' to allow using the sparse-index in memory without expanding to a full one. The recursive algorithm for update_one() was already updated in `2de37c5` (cache-tree: integrate with sparse directory entries, 2021-03-03) to handle sparse directory entries in the index. Hence we can just set the requires-full-index to false for "write-tree". The `p2000` tests demonstrate a ~96% execution time reduction for 'git write-tree' using a sparse index: Test before after ----------------------------------------------------------------- 2000.78: git write-tree (full-v3) 0.34 0.33 -2.9% 2000.79: git write-tree (full-v4) 0.32 0.30 -6.3% 2000.80: git write-tree (sparse-v3) 0.47 0.02 -95.8% 2000.81: git write-tree (sparse-v4) 0.45 0.02 -95.6% Signed-off-by: Shuqi Liang <cheskaqiqi@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-04 12:50:54 -07:00
Raghul Nanth A	748b8d669a	describe: enable sparse index for describe git describe compares the index with the working tree when (and only when) it is run with the "--dirty" flag. This is done by the run_diff_index() function. The function has been made aware of the sparse-index in the series that led to `8d2c3732` (Merge branch 'ld/sparse-diff-blame', 2021-12-21). Hence we can just set the requires-full-index to false for "describe". Performance metrics Test HEAD~1 HEAD ------------------------------------------------------------------------------------------------- 2000.2: git describe --dirty (full-v3) 0.08(0.09+0.01) 0.08(0.06+0.03) +0.0% 2000.3: git describe --dirty (full-v4) 0.09(0.07+0.03) 0.08(0.05+0.04) -11.1% 2000.4: git describe --dirty (sparse-v3) 0.88(0.82+0.06) 0.02(0.01+0.05) -97.7% 2000.5: git describe --dirty (sparse-v4) 0.68(0.60+0.08) 0.02(0.02+0.04) -97.1% 2000.6: echo >>new && git describe --dirty (full-v3) 0.08(0.04+0.05) 0.08(0.05+0.04) +0.0% 2000.7: echo >>new && git describe --dirty (full-v4) 0.08(0.07+0.03) 0.08(0.05+0.04) +0.0% 2000.8: echo >>new && git describe --dirty (sparse-v3) 0.75(0.69+0.07) 0.02(0.03+0.03) -97.3% 2000.9: echo >>new && git describe --dirty (sparse-v4) 0.81(0.73+0.09) 0.02(0.01+0.05) -97.5% Signed-off-by: Raghul Nanth A <nanth.raghul@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-03 11:30:23 -07:00
Junio C Hamano	290a973bb9	Merge branch 'ds/p2000-fix-grep-sparse' Fix perf test. * ds/p2000-fix-grep-sparse: p2000: remove stray '--sparse' flag from test	2023-03-31 17:50:23 -07:00
Derrick Stolee	d52fcf493b	p2000: remove stray '--sparse' flag from test This argument was added in `7cae7627c4` (builtin/grep.c: integrate with sparse index, 2022-09-22), but it was a carry-over from an earlier version where the --sparse flag was added to the 'git grep' builtin. This argument does not exist, so currently the p2000-sparse-operations.sh performance test script fails when reaching this step. With this fix, the script works with these numbers for my copy of the Git source code repository: Test HEAD ------------------------------------------------------------ 2000.30: git grep --cached ... (full-v3) 0.34(1.20+0.14) 2000.31: git grep --cached ... (full-v4) 0.31(1.15+0.13) 2000.32: git grep --cached ... (sparse-v3) 0.26(1.13+0.12) 2000.33: git grep --cached ... (sparse-v4) 0.27(1.13+0.12) Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 13:25:52 -07:00
Derrick Stolee	cbfe360b14	commit-reach: add tips_reachable_from_bases() Both 'git for-each-ref --merged=<X>' and 'git branch --merged=<X>' use the ref-filter machinery to select references or branches (respectively) that are reachable from a set of commits presented by one or more --merged arguments. This happens within reach_filter(), which uses the revision-walk machinery to walk history in a standard way. However, the commit-reach.c file is full of custom searches that are more efficient, especially for reachability queries that can terminate early when reachability is discovered. Add a new tips_reachable_from_bases() method to commit-reach.c and call it from within reach_filter() in ref-filter.c. This affects both 'git branch' and 'git for-each-ref' as tested in p1500-graph-walks.sh. For the Linux kernel repository, we take an already-fast algorithm and make it even faster: Test HEAD~1 HEAD ------------------------------------------------------------------- 1500.5: contains: git for-each-ref --merged 0.13 0.02 -84.6% 1500.6: contains: git branch --merged 0.14 0.02 -85.7% 1500.7: contains: git tag --merged 0.15 0.03 -80.0% (Note that we remove the iterative 'git rev-list' test from p1500 because it no longer makes sense as a comparison to 'git for-each-ref' and would just waste time running it for these comparisons.) The algorithm is implemented in commit-reach.c in the method tips_reachable_from_base(). This method takes a string_list of tips and assigns the 'util' for each item with the value 1 if the base commit can reach those tips. Like other reachability queries in commit-reach.c, the fastest way to search for "can A reach B?" is to do a depth-first search up to the generation number of B, preferring to explore first parents before later parents. While we must walk all reachable commits up to that generation number when the answer is "no", the depth-first search can answer "yes" much faster than other approaches in most cases. This search becomes trickier when there are multiple targets for the depth-first search. The commits with lower generation number are more likely to be within the history of the start commit, but we don't want to waste time searching commits of low generation number if the commit target with lowest generation number has already been found. The trick here is to take the input commits and sort them by generation number in ascending order. Track the index within this order as min_generation_index. When we find a commit, if its index in the list is equal to min_generation_index, then we can increase the generation number boundary of our search to the next-lowest value in the list. With this mechanism, the number of commits to search is minimized with respect to the depth-first search heuristic. We will walk all commits up to the minimum generation number of a commit that is _not_ reachable from the start, but we will walk only the necessary portion of the depth-first search for the reachable commits of lower generation. Add extra tests for this behavior in t6600-test-reach.sh as the interesting data shape of that repository can sometimes demonstrate corner case bugs. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Derrick Stolee	49abcd21da	for-each-ref: add ahead-behind format atom The previous change implemented the ahead_behind() method, including an algorithm to compute the ahead/behind values for a number of commit tips relative to a number of commit bases. Now, integrate that algorithm as part of 'git for-each-ref' hidden behind a new format atom, ahead-behind. This naturally extends to 'git branch' and 'git tag' builtins, as well. This format allows specifying multiple bases, if so desired, and all matching references are compared against all of those bases. For this reason, failing to read a reference provided from these atoms results in an error. In order to translate the ahead_behind() method information to the format output code in ref-filter.c, we must populate arrays of ahead_behind_count structs. In struct ref_array, we store the full array that will be passed to ahead_behind(). In struct ref_array_item, we store an array of pointers that point to the relvant items within the full array. In this way, we can pull all relevant ahead/behind values directly when formatting output for a specific item. It also ensures the lifetime of the ahead_behind_count structs matches the time that the array is being used. Add specific tests of the ahead/behind counts in t6600-test-reach.sh, as it has an interesting repository shape. In particular, its merging strategy and its use of different commit-graphs would demonstrate over- counting if the ahead_behind() method did not already account for that possibility. Also add tests for the specific for-each-ref, branch, and tag builtins. In the case of 'git tag', there are intersting cases that happen when some of the selected tips are not commits. This requires careful logic around commits_nr in the second loop of filter_ahead_behind(). Also, the test in t7004 is carefully located to avoid being dependent on the GPG prereq. It also avoids using the test_commit helper, as that will add ticks to the time and disrupt the expected timestamps in later tag tests. Also add performance tests in a new p1300-graph-walks.sh script. This will be useful for more uses in the future, but for now compare the ahead-behind counting algorithm in 'git for-each-ref' to the naive implementation by running 'git rev-list --count' processes for each input. For the Git source code repository, the improvement is already obvious: Test this tree --------------------------------------------------------------- 1500.2: ahead-behind counts: git for-each-ref 0.07(0.07+0.00) 1500.3: ahead-behind counts: git branch 0.07(0.06+0.00) 1500.4: ahead-behind counts: git tag 0.07(0.06+0.00) 1500.5: ahead-behind counts: git rev-list 1.32(1.04+0.27) But the standard performance benchmark is the Linux kernel repository, which demosntrates a significant improvement: Test this tree --------------------------------------------------------------- 1500.2: ahead-behind counts: git for-each-ref 0.27(0.24+0.02) 1500.3: ahead-behind counts: git branch 0.27(0.24+0.03) 1500.4: ahead-behind counts: git tag 0.28(0.27+0.01) 1500.5: ahead-behind counts: git rev-list 4.57(4.03+0.54) The 'git rev-list' test exists in this change as a demonstration, but it will be removed in the next change to avoid wasting time on this comparison. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-20 12:17:33 -07:00
Carlo Marcelo Arenas Belón	acabd2048e	grep: correctly identify utf-8 characters with \{b,w} in -P When UTF is enabled for a PCRE match, the corresponding flags are added to the pcre2_compile() call, but PCRE2_UCP wasn't included. This prevents extending the meaning of the character classes to include those new valid characters and therefore result in failed matches for expressions that rely on that extention, for ex: $ git grep -P '\bÆvar' Add PCRE2_UCP so that \w will include Æ and therefore \b could correctly match the beginning of that word. This has an impact on performance that has been estimated to be between 20% to 40% and that is shown through the added performance test. Signed-off-by: Carlo Marcelo Arenas Belón <carenas@gmail.com> Acked-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-01-18 15:24:52 -08:00
Victoria Dye	dc5d40f5bc	read-tree: use 'skip_cache_tree_update' option When running 'read-tree' with a single tree and no prefix, 'prime_cache_tree()' is called after the tree is unpacked. In that situation, skip a redundant call to 'cache_tree_update()' in 'unpack_trees()' by enabling the 'skip_cache_tree_update' unpack option. Removing the redundant cache tree update provides a substantial performance improvement to 'git read-tree <tree-ish>', as shown by a test added to 'p0006-read-tree-checkout.sh': Test before after ---------------------------------------------------------------------- read-tree br_ballast_plus_1 3.94(1.80+1.57) 3.00(1.14+1.28) -23.9% Note that the 'read-tree' in 't1022-read-tree-partial-clone.sh' is updated to read two trees, rather than one. The test was first introduced in `d3da223f22` (cache-tree: prefetch in partial clone read-tree, 2021-07-23) to exercise the 'cache_tree_update()' code path, as used in 'git merge'. Since this patch drops the call to 'cache_tree_update()' in single-tree 'git read-tree', change the test to use the two-tree variant so that 'cache_tree_update()' is called as intended. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-10 21:49:34 -05:00
Victoria Dye	0e47bca0f7	reset: use 'skip_cache_tree_update' option Enable the 'skip_cache_tree_update' option in the variants that call 'prime_cache_tree()' after 'unpack_trees()' (specifically, 'git reset --mixed' and 'git reset --hard'). This avoids redundantly rebuilding the cache tree in both 'cache_tree_update()' at the end of 'unpack_trees()' and in 'prime_cache_tree()', resulting in a small (but consistent) performance improvement. From the newly-added 'p7102-reset.sh' test: Test before after -------------------------------------------------------------------- 7102.1: reset --hard (...) 2.11(0.40+1.54) 1.97(0.38+1.47) -6.6% Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-10 21:49:34 -05:00
Victoria Dye	94fcf0e852	cache-tree: add perf test comparing update and prime Add a performance test comparing the execution times of 'prime_cache_tree()' and 'cache_tree_update(_, WRITE_TREE_SILENT \| WRITE_TREE_REPAIR)'. The goal of comparing these two is to identify which is the faster method for rebuilding an invalid cache tree, ultimately to remove one when both are (reundantly) called in immediate succession. Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call each tested function multiple times to ensure the reported times (to 0.01s resolution) convey the differences between them. The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to capture a baseline for the overhead associated with running the tool), 'cache_tree_update()', and 'prime_cache_tree()' on four scenarios: - A completely valid cache tree - A cache tree with 2 invalid paths - A cache tree with 50 invalid paths - A completely empty cache tree Example results: Test this tree ----------------------------------------------------------- 0090.2: no-op, clean 1.27(0.48+0.52) 0090.3: prime_cache_tree, clean 2.02(0.83+0.85) 0090.4: cache_tree_update, clean 1.30(0.49+0.54) 0090.5: no-op, invalidate 2 1.29(0.48+0.54) 0090.6: prime_cache_tree, invalidate 2 1.98(0.81+0.83) 0090.7: cache_tree_update, invalidate 2 2.12(0.94+0.86) 0090.8: no-op, invalidate 50 1.32(0.50+0.55) 0090.9: prime_cache_tree, invalidate 50 2.10(0.86+0.89) 0090.10: cache_tree_update, invalidate 50 2.35(1.14+0.90) 0090.11: no-op, empty 1.33(0.50+0.54) 0090.12: prime_cache_tree, empty 2.04(0.84+0.87) 0090.13: cache_tree_update, empty 2.51(1.27+0.92) These timings show that, while 'cache_tree_update()' is faster when the cache tree is completely valid, it is equal to or slower than 'prime_cache_tree()' when there are any invalid paths. Since the redundant calls are mostly in scenarios where the cache tree will be at least partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will likely perform better than 'cache_tree_update()' in typical cases. Helped-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com>	2022-11-10 21:49:33 -05:00
Junio C Hamano	67bf4a83e9	Merge branch 'sy/sparse-grep' "git grep" learned to expand the sparse-index more lazily and on demand in a sparse checkout. * sy/sparse-grep: builtin/grep.c: integrate with sparse index	2022-10-10 10:08:40 -07:00
Shaoxuan Yuan	7cae7627c4	builtin/grep.c: integrate with sparse index Turn on sparse index and remove ensure_full_index(). Before this patch, `git-grep` utilizes the ensure_full_index() method to expand the index and search all the entries. Because this method requires walking all the trees and constructing the index, it is the slow part within the whole command. To achieve better performance, this patch uses grep_tree() to search the sparse directory entries and get rid of the ensure_full_index() method. Why grep_tree() is a better choice over ensure_full_index()? 1) grep_tree() is as correct as ensure_full_index(). grep_tree() looks into every sparse-directory entry (represented by a tree) recursively when looping over the index, and the result of doing so matches the result of expanding the index. 2) grep_tree() utilizes pathspecs to limit the scope of searching. ensure_full_index() always expands the index, which means it will always walk all the trees and blobs in the repo without caring if the user only wants a subset of the content, i.e. using a pathspec. On the other hand, grep_tree() will only search the contents that match the pathspec, and thus possibly walking fewer trees. 3) grep_tree() does not construct and copy back a new index, while ensure_full_index() does. This also saves some time. ---------------- Performance test - Summary: p2000 tests demonstrate a ~71% execution time reduction for `git grep --cached bogus -- "f2/f1/f1/"` using tree-walking logic. However, notice that this result varies depending on the pathspec given. See below "Command used for testing" for more details. Test HEAD~ HEAD ------------------------------------------------------- 2000.78: git grep ... (full-v3) 0.35 0.39 (≈) 2000.79: git grep ... (full-v4) 0.36 0.30 (≈) 2000.80: git grep ... (sparse-v3) 0.88 0.23 (-73.8%) 2000.81: git grep ... (sparse-v4) 0.83 0.26 (-68.6%) - Command used for testing: git grep --cached bogus -- "f2/f1/f1/" The reason for specifying a pathspec is that, if we don't specify a pathspec, then grep_tree() will walk all the trees and blobs to find the pattern, and the time consumed doing so is not too different from using the original ensure_full_index() method, which also spends most of the time walking trees. However, when a pathspec is specified, this latest logic will only walk the area of trees enclosed by the pathspec, and the time consumed is reasonably a lot less. Generally speaking, because the performance gain is acheived by walking less trees, which are specified by the pathspec, the HEAD time v.s. HEAD~ time in sparse-v[3\|4], should be proportional to "pathspec enclosed area" v.s. "all area", respectively. Namely, the wider the <pathspec> is encompassing, the less the performance difference between HEAD~ and HEAD, and vice versa. That is, if we don't specify a pathspec, the performance difference [1] is indistinguishable: both methods walk all the trees and take generally same amount of time (even with the index construction time included for ensure_full_index()). [1] Performance test result without pathspec (hence walking all trees): Command used: git grep --cached bogus Test HEAD~ HEAD --------------------------------------------------- 2000.78: git grep ... (full-v3) 6.17 5.19 (≈) 2000.79: git grep ... (full-v4) 6.19 5.46 (≈) 2000.80: git grep ... (sparse-v3) 6.57 6.44 (≈) 2000.81: git grep ... (sparse-v4) 6.65 6.28 (≈) -------------------------- NEEDSWORK about submodules There are a few NEEDSWORKs that belong to improvements beyond this topic. See the NEEDSWORK in builtin/grep.c::grep_submodule() for more context. The other two NEEDSWORKs in t1092 are also relative. Suggested-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Helped-by: Victoria Dye <vdye@github.com> Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-23 09:41:27 -07:00
Đoàn Trần Công Danh	81580fa06d	t: convert egrep usage to "grep -E" Despite POSIX states that: > The old egrep and fgrep commands are likely to be supported for many > years to come as implementation extensions, allowing historical > applications to operate unmodified. GNU grep 3.8 started to warn[1]: > The egrep and fgrep commands, which have been deprecated since > release 2.5.3 (2007), now warn that they are obsolescent and should > be replaced by grep -E and grep -F. Prepare for their removal in the future. [1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-21 11:00:18 -07:00
Junio C Hamano	42bf77c7d0	Merge branch 'vd/scalar-to-main' Hoist the remainder of "scalar" out of contrib/ to the main part of the codebase. * vd/scalar-to-main: Documentation/technical: include Scalar technical doc t/perf: add 'GIT_PERF_USE_SCALAR' run option t/perf: add Scalar performance tests scalar-clone: add test coverage scalar: add to 'git help -a' command list scalar: implement the `help` subcommand git help: special-case `scalar` scalar: include in standard Git build & installation scalar: fix command documentation section header	2022-09-19 14:35:25 -07:00
Junio C Hamano	3fe0121479	Merge branch 'ac/bitmap-lookup-table' The pack bitmap file gained a bitmap-lookup table to speed up locating the necessary bitmap for a given commit. * ac/bitmap-lookup-table: pack-bitmap-write: drop unused pack_idx_entry parameters bitmap-lookup-table: add performance tests for lookup table pack-bitmap: prepare to read lookup table extension pack-bitmap-write: learn pack.writeBitmapLookupTable and add tests pack-bitmap-write.c: write lookup table extension bitmap: move `get commit positions` code to `bitmap_writer_finish` Documentation/technical: describe bitmap lookup table extension	2022-09-05 18:33:39 -07:00
Victoria Dye	ba1b117eec	t/perf: add 'GIT_PERF_USE_SCALAR' run option Add a 'GIT_PERF_USE_SCALAR' environment variable (and corresponding perf config 'useScalar') to register a repository created with any of: * test_perf_fresh_repo * test_perf_default_repo * test_perf_large_repo as a Scalar enlistment. This is intended to allow a developer to test the impact of Scalar on already-defined performance scenarios. Suggested-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 10:02:56 -07:00
Victoria Dye	e2809233d1	t/perf: add Scalar performance tests Create 'p9210-scalar.sh' for testing Scalar performance and comparing performance of Git operations in Scalar registrations and standard repositories. Example results: Test this tree ------------------------------------------------------------------------ 9210.2: scalar clone 14.82(18.00+3.63) 9210.3: git clone 26.15(36.67+6.90) 9210.4: git status (scalar) 0.04(0.01+0.01) 9210.5: git status (non-scalar) 0.10(0.02+0.11) 9210.6: test_commit --append --no-tag A (scalar) 0.08(0.02+0.03) 9210.7: test_commit --append --no-tag A (non-scalar) 0.13(0.03+0.11) Helped-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-02 10:02:56 -07:00
Junio C Hamano	3658170b92	Merge branch 'es/fix-chained-tests' Fix broken "&&-" chains and failures in early iterations of a loop. * es/fix-chained-tests: t5329: notice a failure within a loop t: detect and signal failure within loop t1092: fix buggy sparse "blame" test t2407: fix broken &&-chains in compound statement	2022-08-29 14:55:15 -07:00
Junio C Hamano	25402204fe	Merge branch 'vd/fix-perf-tests' Rather trivial perf-test code fixes. * vd/fix-perf-tests: p0006: fix 'read-tree' argument ordering p0004: fix prereq declaration	2022-08-29 14:55:13 -07:00
Abhradeep Chakraborty	761416ef91	bitmap-lookup-table: add performance tests for lookup table Add performance tests to verify the performance of lookup table. `p5310-pack-bitmaps.sh` contain tests with and without lookup table. `p5312-pack-bitmaps-revs.sh` contain same tests with and without lookup table but with `pack.writeReverseIndex` enabled. Lookup table makes Git run faster in most of the cases. Below is the result of `t/perf/p5310-pack-bitmaps.sh`.`perf/p5326-multi-pack-bitmaps.sh` gives similar result. The repository used in the test is linux kernel. Test this tree ----------------------------------------------------------------------- 5310.4: enable lookup table: false 0.01(0.00+0.00) 5310.5: repack to disk 320.89(230.20+23.45) 5310.6: simulated clone 14.04(5.78+1.79) 5310.7: simulated fetch 1.95(3.05+0.20) 5310.8: pack to file (bitmap) 44.73(20.55+7.45) 5310.9: rev-list (commits) 0.78(0.46+0.10) 5310.10: rev-list (objects) 4.07(3.97+0.08) 5310.11: rev-list with tag negated via --not 0.06(0.02+0.03) --all (objects) 5310.12: rev-list with negative tag (objects) 0.21(0.15+0.05) 5310.13: rev-list count with blob:none 0.24(0.17+0.06) 5310.14: rev-list count with blob:limit=1k 7.07(5.92+0.48) 5310.15: rev-list count with tree:0 0.25(0.17+0.07) 5310.16: simulated partial clone 5.67(3.28+0.64) 5310.18: clone (partial bitmap) 16.05(8.34+1.86) 5310.19: pack to file (partial bitmap) 59.76(27.22+7.43) 5310.20: rev-list with tree filter (partial bitmap) 0.90(0.18+0.16) 5310.24: enable lookup table: true 0.01(0.00+0.00) 5310.25: repack to disk 319.73(229.30+23.01) 5310.26: simulated clone 13.69(5.72+1.78) 5310.27: simulated fetch 1.84(3.02+0.16) 5310.28: pack to file (bitmap) 45.63(20.67+7.50) 5310.29: rev-list (commits) 0.56(0.39+0.8) 5310.30: rev-list (objects) 3.77(3.74+0.08) 5310.31: rev-list with tag negated via --not 0.05(0.02+0.03) --all (objects) 5310.32: rev-list with negative tag (objects) 0.21(0.15+0.05) 5310.33: rev-list count with blob:none 0.23(0.17+0.05) 5310.34: rev-list count with blob:limit=1k 6.65(5.72+0.40) 5310.35: rev-list count with tree:0 0.23(0.16+0.06) 5310.36: simulated partial clone 5.57(3.26+0.59) 5310.38: clone (partial bitmap) 15.89(8.39+1.84) 5310.39: pack to file (partial bitmap) 58.32(27.55+7.47) 5310.40: rev-list with tree filter (partial bitmap) 0.73(0.18+0.15) Test 4-15 are tested without using lookup table. Same tests are repeated in 16-30 (using lookup table). Mentored-by: Taylor Blau <me@ttaylorr.com> Co-Mentored-by: Kaartic Sivaraam <kaartic.sivaraam@gmail.com> Signed-off-by: Abhradeep Chakraborty <chakrabortyabhradeep79@gmail.com> Reviewed-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-26 10:14:02 -07:00
Eric Sunshine	0e66bc1b21	t: detect and signal failure within loop Failures within `for` and `while` loops can go unnoticed if not detected and signaled manually since the loop itself does not abort when a contained command fails, nor will a failure necessarily be detected when the loop finishes since the loop returns the exit code of the last command it ran on the final iteration, which may not be the command which failed. Therefore, detect and signal failures manually within loops using the idiom `\|\| return 1` (or `\|\| exit 1` within subshells). Signed-off-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-22 12:53:02 -07:00
Victoria Dye	77b9e85c0f	p0006: fix 'read-tree' argument ordering In the 'p0006' test "read-tree br_base br_ballast", move the '-n' flag used in 'git read-tree' ahead of its positional arguments. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 14:35:30 -07:00
Victoria Dye	59c72303dd	p0004: fix prereq declaration Fix multi-threaded 'p0004' test's use of the 'REPO_BIG_ENOUGH_FOR_MULTI' prerequisite. Unlike normal 't/' tests, 't/perf/' tests need to have their prerequisites declared with the '--prereq' flag. Signed-off-by: Victoria Dye <vdye@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 14:35:28 -07:00
Shaoxuan Yuan	ede241c715	rm: integrate with sparse-index Enable the sparse index within the `git-rm` command. The `p2000` tests demonstrate a ~92% execution time reduction for 'git rm' using a sparse index. Test HEAD~1 HEAD -------------------------------------------------------------------------- 2000.74: git rm ... (full-v3) 0.41(0.37+0.05) 0.43(0.36+0.07) +4.9% 2000.75: git rm ... (full-v4) 0.38(0.34+0.05) 0.39(0.35+0.05) +2.6% 2000.76: git rm ... (sparse-v3) 0.57(0.56+0.01) 0.05(0.05+0.00) -91.2% 2000.77: git rm ... (sparse-v4) 0.57(0.55+0.02) 0.03(0.03+0.00) -94.7% ---- Also, normalize a behavioral difference of `git-rm` under sparse-index. See related discussion [1]. `git-rm` a sparse-directory entry within a sparse-index enabled repo behaves differently from a sparse directory within a sparse-checkout enabled repo. For example, in a sparse-index repo, where 'folder1' is a sparse-directory entry, `git rm -r --sparse folder1` provides this: rm 'folder1/' Whereas in a sparse-checkout repo without sparse-index, doing so provides this: rm 'folder1/0/0/0' rm 'folder1/0/1' rm 'folder1/a' Because `git rm` a sparse-directory entry does not need to expand the index, therefore we should accept the current behavior, which is faster than "expand the sparse-directory entry to match the sparse-checkout situation". Modify a previous test so such difference is not considered as an error. [1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398 Helped-by: Victoria Dye <vdye@github.com> Helped-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-08 13:23:26 -07:00
Junio C Hamano	4e0d160bbc	Merge branch 'rs/mergesort' Make our mergesort implementation type-safe. * rs/mergesort: mergesort: remove llist_mergesort() packfile: use DEFINE_LIST_SORT fetch-pack: use DEFINE_LIST_SORT commit: use DEFINE_LIST_SORT blame: use DEFINE_LIST_SORT test-mergesort: use DEFINE_LIST_SORT test-mergesort: use DEFINE_LIST_SORT_DEBUG mergesort: add macros for typed sort of linked lists mergesort: tighten merge loop mergesort: unify ranks loops	2022-08-03 13:36:09 -07:00
René Scharfe	b378c2ff1e	test-mergesort: use DEFINE_LIST_SORT Build a typed sort function for the mergesort performance test tool using DEFINE_LIST_SORT instead of calling llist_mergesort(). This gets rid of the next pointer accessor functions and improves the performance at the cost of a slightly higher object text size. Before: 0071.12: llist_mergesort() unsorted 0.24(0.22+0.01) 0071.14: llist_mergesort() sorted 0.12(0.10+0.01) 0071.16: llist_mergesort() reversed 0.12(0.10+0.01) __TEXT __DATA __OBJC others dec hex 6407 276 0 24701 31384 7a98 t/helper/test-mergesort.o With this patch: 0071.12: DEFINE_LIST_SORT unsorted 0.22(0.21+0.01) 0071.14: DEFINE_LIST_SORT sorted 0.11(0.10+0.01) 0071.16: DEFINE_LIST_SORT reversed 0.11(0.10+0.01) __TEXT __DATA __OBJC others dec hex 6615 276 0 25832 32723 7fd3 t/helper/test-mergesort.o Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-17 15:20:38 -07:00
Junio C Hamano	2fec2d2895	Merge branch 'jk/perf-lib-test-titles' Show test titles to the performance test output again. * jk/perf-lib-test-titles: perf-lib: fix missing test titles in output	2022-06-17 10:33:31 -07:00
Jeff King	55d9d4bbd0	perf-lib: fix missing test titles in output Commit `5dccd9155f` (t/perf: add iteration setup mechanism to perf-lib, 2022-04-04) modified the parameter parsing of test_wrapper() such that the test title was no longer in $1, and is instead in $test_title_. We correctly pass the new variable to the code which outputs the title to the log, but missed the spot in test_wrapper() where the title is written to the ".descr" file which is used to produce the final output table. As a result, all of the titles are missing from that table (or worse, using whatever was left in $1): $ ./p0000-perf-lib-sanity.sh [...] Test this tree ------------------------------ 0000.1: 0.01(0.01+0.00) 0000.2: 0.01(0.00+0.01) 0000.4: 0.00(0.00+0.00) 0000.5: true 0.00(0.00+0.00) 0000.7: 0.00(0.00+0.00) 0000.8: 0.00(0.00+0.00) After this patch, we get the pre-5dccd9155f output: Test this tree -------------------------------------------------------------------------- 0000.1: test_perf_default_repo works 0.00(0.00+0.00) 0000.2: test_checkout_worktree works 0.01(0.00+0.01) 0000.4: export a weird var 0.00(0.00+0.00) 0000.5: éḿíẗ ńöń-ÁŚĆÍÍ ćḧáŕáćẗéŕś 0.00(0.00+0.00) 0000.7: important variables available in subshells 0.00(0.00+0.00) 0000.8: test-lib-functions correctly loaded in subshells 0.00(0.00+0.00) Signed-off-by: Jeff King <peff@peff.net> Acked-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-06-16 11:57:35 -07:00
Junio C Hamano	9e496fffc8	Merge branch 'jh/builtin-fsmonitor-part3' More fsmonitor--daemon. * jh/builtin-fsmonitor-part3: (30 commits) t7527: improve implicit shutdown testing in fsmonitor--daemon fsmonitor--daemon: allow --super-prefix argument t7527: test Unicode NFC/NFD handling on MacOS t/lib-unicode-nfc-nfd: helper prereqs for testing unicode nfc/nfd t/helper/hexdump: add helper to print hexdump of stdin fsmonitor: on macOS also emit NFC spelling for NFD pathname t7527: test FSMonitor on case insensitive+preserving file system fsmonitor: never set CE_FSMONITOR_VALID on submodules t/perf/p7527: add perf test for builtin FSMonitor t7527: FSMonitor tests for directory moves fsmonitor: optimize processing of directory events fsm-listen-darwin: shutdown daemon if worktree root is moved/renamed fsm-health-win32: force shutdown daemon if worktree root moves fsm-health-win32: add polling framework to monitor daemon health fsmonitor--daemon: stub in health thread fsmonitor--daemon: rename listener thread related variables fsmonitor--daemon: prepare for adding health thread fsmonitor--daemon: cd out of worktree root fsm-listen-darwin: ignore FSEvents caused by xattr changes on macOS unpack-trees: initialize fsmonitor_has_run_once in o->result ...	2022-06-10 15:04:15 -07:00
Junio C Hamano	c276c21da6	Merge branch 'ds/sparse-sparse-checkout' "sparse-checkout" learns to work well with the sparse-index feature. * ds/sparse-sparse-checkout: sparse-checkout: integrate with sparse index p2000: add test for 'git sparse-checkout [add\|set]' sparse-index: complete partial expansion sparse-index: partially expand directories sparse-checkout: --no-sparse-index needs a full index cache-tree: implement cache_tree_find_path() sparse-index: introduce partially-sparse indexes sparse-index: create expand_index() t1092: stress test 'git sparse-checkout set' t1092: refactor 'sparse-index contents' test	2022-06-03 14:30:35 -07:00
Junio C Hamano	83937e9592	Merge branch 'ns/batch-fsync' Introduce a filesystem-dependent mechanism to optimize the way the bits for many loose object files are ensured to hit the disk platter. * ns/batch-fsync: core.fsyncmethod: performance tests for batch mode t/perf: add iteration setup mechanism to perf-lib core.fsyncmethod: tests for batch mode test-lib-functions: add parsing helpers for ls-files and ls-tree core.fsync: use batch mode and sync loose objects by default on Windows unpack-objects: use the bulk-checkin infrastructure update-index: use the bulk-checkin infrastructure builtin/add: add ODB transaction around add_files_to_cache cache-tree: use ODB transaction around writing a tree core.fsyncmethod: batched disk flushes for loose-objects bulk-checkin: rebrand plug/unplug APIs as 'odb transactions' bulk-checkin: rename 'state' variable and separate 'plugged' boolean	2022-06-03 14:30:34 -07:00
Jeff Hostetler	7667f9d2ae	t/perf/p7527: add perf test for builtin FSMonitor Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-26 15:59:27 -07:00

1 2 3 4 5 ...

370 commits