development/git - HydraGit

mirror of https://github.com/git/git synced 2024-09-12 21:04:12 +00:00

Author	SHA1	Message	Date
Denton Liu	ad6dad0996	*.[ch]: manually align parameter lists In previous patches, extern was mechanically removed from function declarations without care to formatting, causing parameter lists to be misaligned. Manually format changed sections such that the parameter lists should be realigned. Viewing this patch with 'git diff -w' should produce no output. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:10 +09:00
Denton Liu	b199d7147a	.[ch]: remove extern from function declarations using sed There has been a push to remove extern from function declarations. Finish the job by removing all instances of "extern" for function declarations in headers using sed. This was done by running the following on my system with sed 4.2.2: $ git ls-files \.{c,h} \| grep -v ^compat/ \| xargs sed -i'' -e 's/^$\s$extern $[^(]([^]$/\1\2/' Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Then, leftover instances of extern were found by running $ git grep -w -C3 extern \.{c,h} and manually checking the output. No other instances were found. Note that the regex used specifically excludes function variables which _should_ be left as extern. Not the most elegant way to do it but it gets the job done. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:08 +09:00
Denton Liu	554544276a	.[ch]: remove extern from function declarations using spatch There has been a push to remove extern from function declarations. Remove some instances of "extern" for function declarations which are caught by Coccinelle. Note that Coccinelle has some difficulty with processing functions with `__attribute__` or varargs so some `extern` declarations are left behind to be dealt with in a future patch. This was the Coccinelle patch used: @@ type T; identifier f; @@ - extern T f(...); and it was run with: $ git ls-files \.{c,h} \| grep -v ^compat/ \| xargs spatch --sp-file contrib/coccinelle/noextern.cocci --in-place Files under `compat/` are intentionally excluded as some are directly copied from external sources and we should avoid churning them as much as possible. Signed-off-by: Denton Liu <liu.denton@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-05-05 15:20:06 +09:00
Junio C Hamano	5795a75f9b	Merge branch 'bp/post-index-change-hook' A new hook "post-index-change" is called when the on-disk index file changes, which can help e.g. a virtualized working tree implementation. * bp/post-index-change-hook: read-cache: add post-index-change hook	2019-04-25 16:41:11 +09:00
Junio C Hamano	e36adf7122	Merge branch 'ps/stash-in-c' "git stash" rewritten in C. * ps/stash-in-c: (28 commits) tests: add a special setup where stash.useBuiltin is off stash: optionally use the scripted version again stash: add back the original, scripted `git stash` stash: convert `stash--helper.c` into `stash.c` stash: replace all `write-tree` child processes with API calls stash: optimize `get_untracked_files()` and `check_changes()` stash: convert save to builtin stash: make push -q quiet stash: convert push to builtin stash: convert create to builtin stash: convert store to builtin stash: convert show to builtin stash: convert list to builtin stash: convert pop to builtin stash: convert branch to builtin stash: convert drop and clear to builtin stash: convert apply to builtin stash: mention options in `show` synopsis stash: add tests for `git stash show` config stash: rename test cases to be more descriptive ...	2019-04-22 11:14:43 +09:00
Nguyễn Thái Ngọc Duy	0daf7ff6c0	sha1-name.c: remove the_repo from get_oid_mb() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	65e5046400	sha1-name.c: remove the_repo from other get_oid_* Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	e270f42c4d	sha1-name.c: remove the_repo from maybe_die_on_misspelt_object_name Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	ec580eaaa3	sha1-name.c: add repo_get_oid() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	2b1790f5ab	sha1-name.c: remove the_repo from get_oid_1() There is a cyclic dependency between one of these functions so they cannot be converted one by one, so all related functions are converted at once. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:53 +09:00
Nguyễn Thái Ngọc Duy	4e99f2dbea	sha1-name.c: add repo_for_each_abbrev() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:52 +09:00
Nguyễn Thái Ngọc Duy	8bb95572b0	sha1-name.c: add repo_find_unique_abbrev_r() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-16 18:56:52 +09:00
Nguyễn Thái Ngọc Duy	8f56e9d4ba	refs.c: remove the_repo from substitute_branch_name() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-08 17:26:32 +09:00
Elijah Newren	5ec1e72823	Use 'unsigned short' for mode, like diff_filespec does struct diff_filespec defines mode to be an 'unsigned short'. Several other places in the API which we'd like to interact with using a diff_filespec used a plain unsigned (or unsigned int). This caused problems when taking addresses, so switch to unsigned short. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-04-08 16:02:07 +09:00
Junio C Hamano	6b5688b760	Merge branch 'ma/clear-repository-format' The setup code has been cleaned up to avoid leaks around the repository_format structure. * ma/clear-repository-format: setup: fix memory leaks with `struct repository_format` setup: free old value before setting `work_tree`	2019-03-20 15:16:07 +09:00
Junio C Hamano	32038fef00	Merge branch 'jh/trace2' A more structured way to obtain execution trace has been added. * jh/trace2: trace2: add for_each macros to clang-format trace2: t/helper/test-trace2, t0210.sh, t0211.sh, t0212.sh trace2:data: add subverb for rebase trace2:data: add subverb to reset command trace2:data: add subverb to checkout command trace2:data: pack-objects: add trace2 regions trace2:data: add trace2 instrumentation to index read/write trace2:data: add trace2 hook classification trace2:data: add trace2 transport child classification trace2:data: add trace2 sub-process classification trace2:data: add editor/pager child classification trace2:data: add trace2 regions to wt-status trace2: collect Windows-specific process information trace2: create new combined trace facility trace2: Documentation/technical/api-trace2.txt	2019-03-07 09:59:56 +09:00
Junio C Hamano	4e021dc28e	Merge branch 'wh/author-committer-ident-config' Four new configuration variables {author,committer}.{name,email} have been introduced to override user.{name,email} in more specific cases. * wh/author-committer-ident-config: config: allow giving separate author and committer idents	2019-03-07 09:59:53 +09:00
Junio C Hamano	7d0c1f4556	Merge branch 'tg/checkout-no-overlay' "git checkout --no-overlay" can be used to trigger a new mode of checking out paths out of the tree-ish, that allows paths that match the pathspec that are in the current index and working tree and are not in the tree-ish. * tg/checkout-no-overlay: revert "checkout: introduce checkout.overlayMode config" checkout: introduce checkout.overlayMode config checkout: introduce --{,no-}overlay option checkout: factor out mark_cache_entry_for_checkout function checkout: clarify comment read-cache: add invalidate parameter to remove_marked_cache_entries entry: support CE_WT_REMOVE flag in checkout_entry entry: factor out unlink_entry function move worktree tests to t24*	2019-03-07 09:59:51 +09:00
Thomas Gummerer	0640897dc5	ident: don't require calling prepare_fallback_ident first In `fd5a58477c` ("ident: add the ability to provide a "fallback identity"", 2019-02-25) I made it a requirement to call prepare_fallback_ident as the first function in the ident API. However in stash we didn't actually end up following that. This leads to a BUG if user.email and user.name are set. It was not caught in the test suite because we only rely on environment variables for setting the user name and email instead of the config. Instead of making it a bug to call other functions in the ident API first, just return silently if the identity of a user was already set up. Reported-by: Denton Liu <liu.denton@gmail.com> Helped-by: Jeff King <peff@peff.net> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-07 09:41:40 +09:00
Martin Ågren	e8805af1c3	setup: fix memory leaks with `struct repository_format` After we set up a `struct repository_format`, it owns various pieces of allocated memory. We then either use those members, because we decide we want to use the "candidate" repository format, or we discard the candidate / scratch space. In the first case, we transfer ownership of the memory to a few global variables. In the latter case, we just silently drop the struct and end up leaking memory. Introduce an initialization macro `REPOSITORY_FORMAT_INIT` and a function `clear_repository_format()`, to be used on each side of `read_repository_format()`. To have a clear and simple memory ownership, let all users of `struct repository_format` duplicate the strings that they take from it, rather than stealing the pointers. Call `clear_...()` at the start of `read_...()` instead of just zeroing the struct, since we sometimes enter the function multiple times. Thus, it is important to initialize the struct before calling `read_...()`, so document that. It's also important because we might not even call `read_...()` before we call `clear_...()`, see, e.g., builtin/init-db.c. Teach `read_...()` to clear the struct on error, so that it is reset to a safe state, and document this. (In `setup_git_directory_gently()`, we look at `repo_fmt.hash_algo` even if `repo_fmt.version` is -1, which we weren't actually supposed to do per the API. After this commit, that's ok.) We inherit the existing code's combining "error" and "no version found". Both are signalled through `version == -1` and now both cause us to clear any partial configuration we have picked up. For "extensions.*", that's fine, since they require a positive version number. For "core.bare" and "core.worktree", we're already verifying that we have a non-negative version number before using them. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-01 08:52:00 +09:00
Johannes Schindelin	fd5a58477c	ident: add the ability to provide a "fallback identity" In `3bc2111fc2` (stash: tolerate missing user identity, 2018-11-18), `git stash` learned to provide a fallback identity for the case that no proper name/email was given (and `git stash` does not really care about a correct identity anyway, but it does want to create a commit object). In preparation for the same functionality in the upcoming built-in version of `git stash`, let's offer the same functionality as an API function. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> [tg: add docs; make it a bug to call the function before other functions in the ident API] Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-01 08:03:46 +09:00
Paul-Sebastian Ungureanu	f5116f43f6	sha1-name.c: add `get_oidf()` which acts like `get_oid()` Compared to `get_oid()`, `get_oidf()` has as parameters a pointer to `object_id`, a printf format string and additional arguments. This will help simplify the code in subsequent commits. Original-idea-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Paul-Sebastian Ungureanu <ungureanupaulsebastian@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-03-01 08:03:46 +09:00
Jeff Hostetler	ee4512ed48	trace2: create new combined trace facility Create a new unified tracing facility for git. The eventual intent is to replace the current trace_printf* and trace_performance* routines with a unified set of git_trace2* routines. In addition to the usual printf-style API, trace2 provides higer-level event verbs with fixed-fields allowing structured data to be written. This makes post-processing and analysis easier for external tools. Trace2 defines 3 output targets. These are set using the environment variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be set to "1" or to an absolute pathname (just like the current GIT_TRACE). * GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command summary data. * GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE. It extends the output with columns for the command process, thread, repo, absolute and relative elapsed times. It reports events for child process start/stop, thread start/stop, and per-thread function nesting. * GIT_TR2_EVENT is a new structured format. It writes event data as a series of JSON records. Calls to trace2 functions log to any of the 3 output targets enabled without the need to call different trace_printf* or trace_performance* routines. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-22 15:27:59 -08:00
Ben Peart	1956ecd0ab	read-cache: add post-index-change hook Add a post-index-change hook that is invoked after the index is written in do_write_locked_index(). This hook is meant primarily for notification, and cannot affect the outcome of git commands that trigger the index write. The hook is passed a flag to indicate whether the working directory was updated or not and a flag indicating if a skip-worktree bit could have changed. These flags enable the hook to optimize its response to the index change notification. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-15 11:00:33 -08:00
Junio C Hamano	cba595ab1a	Merge branch 'jk/loose-object-cache-oid' Code clean-up. * jk/loose-object-cache-oid: prefer "hash mismatch" to "sha1 mismatch" sha1-file: avoid "sha1 file" for generic use in messages sha1-file: prefer "loose object file" to "sha1 file" in messages sha1-file: drop has_sha1_file() convert has_sha1_file() callers to has_object_file() sha1-file: convert pass-through functions to object_id sha1-file: modernize loose header/stream functions sha1-file: modernize loose object file functions http: use struct object_id instead of bare sha1 update comment references to sha1_object_info() sha1-file: fix outdated sha1 comment references	2019-02-06 22:05:27 -08:00
Junio C Hamano	ecbe1beb8e	Merge branch 'lt/date-human' A new date format "--date=human" that morphs its output depending on how far the time is from the current time has been introduced. "--date=auto" can be used to use this new format when the output is going to the pager or to the terminal and otherwise the default format. * lt/date-human: Add `human` date format tests. Add `human` format to test-tool Add 'human' date format documentation Replace the proposed 'auto' mode with 'auto:' Add 'human' date format	2019-02-06 22:05:24 -08:00
Junio C Hamano	b2fc9d2fb0	Merge branch 'jk/unused-parameter-cleanup' Code cleanup. * jk/unused-parameter-cleanup: convert: drop path parameter from actual conversion functions convert: drop len parameter from conversion checks config: drop unused parameter from maybe_remove_section() show_date_relative(): drop unused "tz" parameter column: drop unused "opts" parameter in item_length() create_bundle(): drop unused "header" parameter apply: drop unused "def" parameter from find_name_gnu() match-trees: drop unused path parameter from score functions	2019-02-06 22:05:23 -08:00
Junio C Hamano	7589e63648	Merge branch 'nd/the-index-final' The assumption to work on the single "in-core index" instance has been reduced from the library-ish part of the codebase. * nd/the-index-final: cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch read-cache.c: remove the_* from index_has_changes() merge-recursive.c: remove implicit dependency on the_repository merge-recursive.c: remove implicit dependency on the_index sha1-name.c: remove implicit dependency on the_index read-cache.c: replace update_index_if_able with repo_& read-cache.c: kill read_index() checkout: avoid the_index when possible repository.c: replace hold_locked_index() with repo_hold_locked_index() notes-utils.c: remove the_repository references grep: use grep_opt->repo instead of explict repo argument	2019-02-06 22:05:23 -08:00
Junio C Hamano	09a9c1f427	Merge branch 'tt/bisect-in-c' More code in "git bisect" has been rewritten in C. * tt/bisect-in-c: bisect--helper: `bisect_start` shell function partially in C bisect--helper: `get_terms` & `bisect_terms` shell function in C bisect--helper: `bisect_next_check` shell function in C bisect--helper: `check_and_set_terms` shell function in C wrapper: move is_empty_file() and rename it as is_empty_or_missing_file() bisect--helper: `bisect_write` shell function in C bisect--helper: `bisect_reset` shell function in C	2019-02-06 22:05:22 -08:00
Junio C Hamano	cfd9167c15	Merge branch 'dt/cat-file-batch-ambiguous' "git cat-file --batch" reported a dangling symbolic link by mistake, when it wanted to report that a given name is ambiguous. * dt/cat-file-batch-ambiguous: t1512: test ambiguous cat-file --batch and --batch-output Do not print 'dangling' for cat-file in case of ambiguity	2019-02-06 22:05:21 -08:00
Junio C Hamano	1c418243a5	Merge branch 'jk/add-ignore-errors-bit-assignment-fix' "git add --ignore-errors" did not work as advertised and instead worked as an unintended synonym for "git add --renormalize", which has been fixed. * jk/add-ignore-errors-bit-assignment-fix: add: use separate ADD_CACHE_RENORMALIZE flag	2019-02-05 14:26:13 -08:00
William Hubbs	39ab4d0951	config: allow giving separate author and committer idents The author.email, author.name, committer.email and committer.name settings are analogous to the GIT_AUTHOR_* and GIT_COMMITTER_* environment variables, but for the git config system. This allows them to be set separately for each repository. Git supports setting different authorship and committer information with environment variables. However, environment variables are set in the shell, so if different authorship and committer information is needed for different repositories an external tool is required. This adds support to git config for author.email, author.name, committer.email and committer.name settings so this information can be set per repository. Also, it generalizes the fmt_ident function so it can handle author vs committer identification. Signed-off-by: William Hubbs <williamh@gentoo.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-02-04 12:18:13 -08:00
Junio C Hamano	371820d5f1	Merge branch 'bc/tree-walk-oid' The code to walk tree objects has been taught that we may be working with object names that are not computed with SHA-1. * bc/tree-walk-oid: cache: make oidcpy always copy GIT_MAX_RAWSZ bytes tree-walk: store object_id in a separate member match-trees: use hashcpy to splice trees match-trees: compute buffer offset correctly when splicing tree-walk: copy object ID before use	2019-01-29 12:47:56 -08:00
Junio C Hamano	33e4ae9c50	Merge branch 'bc/sha-256' Add sha-256 hash and plug it through the code to allow building Git with the "NewHash". * bc/sha-256: hash: add an SHA-256 implementation using OpenSSL sha256: add an SHA-256 implementation using libgcrypt Add a base implementation of SHA-256 support commit-graph: convert to using the_hash_algo t/helper: add a test helper to compute hash speed sha1-file: add a constant for hash block size t: make the sha1 test-tool helper generic t: add basic tests for our SHA-1 implementation cache: make hashcmp and hasheq work with larger hashes hex: introduce functions to print arbitrary hashes sha1-file: provide functions to look up hash algorithms sha1-file: rename algorithm to "sha1"	2019-01-29 12:47:55 -08:00
Stephen P. Smith	b841d4ff43	Add `human` format to test-tool Add the human format support to the test tool so that GIT_TEST_DATE_NOW can be used to specify the current time. The get_time() helper function was created and and checks the GIT_TEST_DATE_NOW environment variable. If GIT_TEST_DATE_NOW is set, then that date is used instead of the date returned by by gettimeofday(). All calls to gettimeofday() were replaced by calls to get_time(). Renamed occurances of TEST_DATE_NOW to GIT_TEST_DATE_NOW since the variable is now used in the get binary and not just in the test-tool. Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-29 09:40:08 -08:00
Jeff King	3d42034a18	show_date_relative(): drop unused "tz" parameter The timestamp we receive is in epoch time, so there's no need for a timezone parameter to interpret it. The matching show_date() uses "tz" to show dates in author local time, but relative dates show only the absolute time difference. The author's location is irrelevant, barring relativistic effects from using Git close to the speed of light. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-24 12:35:44 -08:00
Nguyễn Thái Ngọc Duy	f8adbec9fe	cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch By default, index compat macros are off from now on, because they could hide the_index dependency. Only those in builtin can use it. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-24 11:55:06 -08:00
David Turner	d1dd94b308	Do not print 'dangling' for cat-file in case of ambiguity The return values -1 and -2 from get_oid could mean two different things, depending on whether they were from an enum returned by get_tree_entry_follow_symlinks, or from a different code path. This caused 'dangling' to be printed from a git cat-file in the case of an ambiguous (-2) result. Unify the results of get_oid* and get_tree_entry_follow_symlinks to be one common type, with unambiguous values. Signed-off-by: David Turner <novalis@novalis.org> Reported-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 15:22:02 -08:00
Linus Torvalds	acdd37769d	Add 'human' date format This adds --date=human, which skips the timezone if it matches the current time-zone, and doesn't print the whole date if that matches (ie skip printing year for dates that are "this year", but also skip the whole date itself if it's in the last few days and we can just say what weekday it was). For really recent dates (same day), use the relative date stamp, while for old dates (year doesn't match), don't bother with time and timezone. Also add 'auto' date mode, which defaults to human if we're using the pager. So you can do git config --add log.date auto and your "git log" commands will show the human-legible format unless you're scripting things. Note that this time format still shows the timezone for recent enough events (but not so recent that they show up as relative dates). You can combine it with the "-local" suffix to never show timezones for an even more simplified view. Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org> Signed-off-by: Stephen P. Smith <ischis2@cox.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-18 10:31:23 -08:00
Jeff King	9e5da3d055	add: use separate ADD_CACHE_RENORMALIZE flag Commit `9472935d81` (add: introduce "--renormalize", 2017-11-16) taught git-add to pass HASH_RENORMALIZE to add_to_index(), which then passes the flag along to index_path(). However, the flags taken by add_to_index() and the ones taken by index_path() are distinct namespaces. We cannot take HASH_* flags in add_to_index(), because they overlap with the ADD_CACHE_* flags we already take (in this case, HASH_RENORMALIZE conflicts with ADD_CACHE_IGNORE_ERRORS). We can solve this by adding a new ADD_CACHE_RENORMALIZE flag, and using it to set HASH_RENORMALIZE within add_to_index(). In order to make it clear that these two flags come from distinct sets, let's also change the name "newflags" in the function to "hash_flags". Reported-by: Dmitriy Smirnov <dmitriy.smirnov@jetbrains.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-17 13:40:21 -08:00
brian m. carlson	974e4a85e3	cache: make oidcpy always copy GIT_MAX_RAWSZ bytes There are some situations in which we want to store an object ID into struct object_id without the_hash_algo necessarily being set correctly. One such case is when cloning a repository, where we must read refs from the remote side without having a repository from which to read the preferred algorithm. In this cases, we may have the_hash_algo set to SHA-1, which is the default, but read refs into struct object_id that are SHA-256. When copying these values, we will want to copy them completely, not just the first 20 bytes. Consequently, make sure that oidcpy copies the maximum number of bytes at all times, regardless of the setting of the_hash_algo. Since oidcpy and hashcpy are no longer functionally identical, remove the Cocinelle object_id transformations that convert from one into the other. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-15 09:57:41 -08:00
Nguyễn Thái Ngọc Duy	150fe065f7	read-cache.c: remove the_* from index_has_changes() Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:05 -08:00
Nguyễn Thái Ngọc Duy	3a7a698e93	sha1-name.c: remove implicit dependency on the_index This kills the_index dependency in get_oid_with_context() but for get_oid() and friends, they still assume the_repository (which also means the_index). Unfortunately the widespread use of get_oid() will make it hard to make the conversion now. We probably will add repo_get_oid() at some point and limit the use of get_oid() in builtin/ instead of forcing all get_oid() call sites to carry struct repository. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy	1b0d968b34	read-cache.c: replace update_index_if_able with repo_& Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy	e1ff0a32e4	read-cache.c: kill read_index() read_index() shares the same problem as hold_locked_index(): it assumes $GIT_DIR/index. Move all call sites to repo_read_index() instead. read_index_preload() and read_index_unmerged() are also killed as a consequence. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Nguyễn Thái Ngọc Duy	3a95f31d1c	repository.c: replace hold_locked_index() with repo_hold_locked_index() hold_locked_index() assumes the index path at $GIT_DIR/index. This is not good for places that take an arbitrary index_state instead of the_index, which is basically everywhere except builtin/. Replace it with repo_hold_locked_index(). hold_locked_index() remains as a wrapper around repo_hold_locked_index() to reduce changes in builtin/ Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-14 12:13:04 -08:00
Jeff King	00a7760e81	sha1-file: modernize loose header/stream functions As with the open/map/close functions for loose objects that were recently converted, the functions for parsing the loose object stream use the name "sha1" and a bare "unsigned char *". Let's fix that so that unpack_sha1_header() becomes unpack_loose_header(), etc. These conversions are less clear-cut than the file access functions. You could argue that the they are parsing Git's canonical object format (i.e., "type size\0contents", over which we compute the hash), which is not strictly tied to loose storage. But in practice these functions are used only for loose objects, and using the term "loose_header" (instead of "object_header") distinguishes it from the object header found in packfiles (which contains the same information in a different format). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Jeff King	c93206b412	update comment references to sha1_object_info() Commit `abef9020e3` (sha1_file: convert sha1_object_info* to object_id, 2018-03-12) renamed the function to oid_object_info(), but missed some comments which mention it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-08 09:40:19 -08:00
Thomas Gummerer	6fdc205722	read-cache: add invalidate parameter to remove_marked_cache_entries When marking cache entries for removal, and later removing them all at once using 'remove_marked_cache_entries()', cache entries currently have to be invalidated manually in the cache tree and in the untracked cache. Add an invalidate flag to the function. With the flag set, the function will take care of invalidating the path in the cache tree and in the untracked cache. Note that the current callsites already do the invalidation properly in other places, so we're just passing 0 from there to keep the status quo. This will be useful in a subsequent commit. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 15:28:05 -08:00
Thomas Gummerer	b702dd12d5	entry: factor out unlink_entry function Factor out the 'unlink_entry()' function from unpack-trees.c to entry.c. It will be used in other places as well in subsequent steps. As it's no longer a static function, also move the documentation to the header file to make it more discoverable. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 15:28:05 -08:00
Pranit Bauva	e3b1e3bdc0	wrapper: move is_empty_file() and rename it as is_empty_or_missing_file() is_empty_file() can help to refactor a lot of code. This will be very helpful in porting "git bisect" to C. Suggested-by: Torsten Bögershausen <tboegi@web.de> Mentored-by: Lars Schneider <larsxschneider@gmail.com> Mentored-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Pranit Bauva <pranit.bauva@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-01-02 10:23:02 -08:00
brian m. carlson	13eeedb5d1	Add a base implementation of SHA-256 support SHA-1 is weak and we need to transition to a new hash function. For some time, we have referred to this new function as NewHash. Recently, we decided to pick SHA-256 as NewHash. The reasons behind the choice of SHA-256 are outlined in the thread starting at [1] and in the commit history for the hash function transition document. Add a basic implementation of SHA-256 based off libtomcrypt, which is in the public domain. Optimize it and restructure it to meet our coding standards. Pull in the update and final functions from the SHA-1 block implementation, as we know these function correctly with all compilers. This implementation is slower than SHA-1, but more performant implementations will be introduced in future commits. Wire up SHA-256 in the list of hash algorithms, and add a test that the algorithm works correctly. Note that with this patch, it is still not possible to switch to using SHA-256 in Git. Additional patches are needed to prepare the code to handle a larger hash algorithm and further test fixes are needed. [1] https://public-inbox.org/git/20180609224913.GC38834@genre.crustytoothpaste.net/ Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:53 +09:00
brian m. carlson	a2ce0a7526	sha1-file: add a constant for hash block size There is one place we need the hash algorithm block size: the HMAC code for push certs. Expose this constant in struct git_hash_algo and expose values for SHA-1 and for the largest value of any hash. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:52 +09:00
brian m. carlson	0dab7129ab	cache: make hashcmp and hasheq work with larger hashes In `183a638b7d` ("hashcmp: assert constant hash size", 2018-08-23), we modified hashcmp to assert that the hash size was always 20 to help it optimize and inline calls to memcmp. In a future series, we replaced many calls to hashcmp and oidcmp with calls to hasheq and oideq to improve inlining further. However, we want to support hash algorithms other than SHA-1, namely SHA-256. When doing so, we must handle the case where these values are 32 bytes long as well as 20. Adjust hashcmp to handle two cases: 20-byte matches, and maximum-size matches. Therefore, when we include SHA-256, we'll automatically handle it properly, while at the same time teaching the compiler that there are only two possible options to consider. This will allow the compiler to write the most efficient possible code. Copy similar code into hasheq and perform an identical transformation. At least with GCC 8.2.0, making hasheq defer to hashcmp when there are two branches prevents the compiler from inlining the comparison, while the code in this patch is inlined properly. Add a comment to avoid an accidental performance regression from well-intentioned refactoring. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:52 +09:00
brian m. carlson	47edb64997	hex: introduce functions to print arbitrary hashes Currently, we have functions that turn an arbitrary SHA-1 value or an object ID into hex format, either using a static buffer or with a user-provided buffer. Add variants of these functions that can handle an arbitrary hash algorithm, specified by constant. Update the documentation as well. While we're at it, remove the "extern" declaration from this family of functions, since it's not needed and our style now recommends against it. We use the variant taking the algorithm structure pointer as the internal variant, since taking an algorithm pointer is the easiest way to handle all of the variants in use. Note that we maintain these functions because there are hashes which must change based on the hash algorithm in use but are not object IDs (such as pack checksums). Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 16:54:52 +09:00
Nguyễn Thái Ngọc Duy	0f086e6dca	checkout: print something when checking out paths One of the problems with "git checkout" is that it does so many different things and could confuse people specially when we fail to handle ambiguation correctly. One way to help with that is tell the user what sort of operation is actually carried out. When switching branches, we always print something unless --quiet, either - "HEAD is now at ..." - "Reset branch ..." - "Already on ..." - "Switched to and reset ..." - "Switched to a new branch ..." - "Switched to branch ..." Checking out paths however is silent. Print something so that if we got the user intention wrong, they won't waste too much time to find that out. For the remaining cases of checkout we now print either - "Checked out ... paths out of the index" - "Checked out ... paths out of <abbrev hash>" Since the purpose of printing this is to help disambiguate. Only do it when "--" is missing. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-11-14 15:10:35 +09:00
Junio C Hamano	11aa560de9	Merge branch 'bp/refresh-index-using-preload' The helper function to refresh the cached stat information in the in-core index has learned to perform the lstat() part of the operation in parallel on multi-core platforms. * bp/refresh-index-using-preload: refresh_index: remove unnecessary calls to preload_index() speed up refresh_index() by utilizing preload_index()	2018-11-13 22:37:26 +09:00
Junio C Hamano	abb4824d13	Merge branch 'ao/submodule-wo-gitmodules-checked-out' The submodule support has been updated to read from the blob at HEAD:.gitmodules when the .gitmodules file is missing from the working tree. * ao/submodule-wo-gitmodules-checked-out: t/helper: add test-submodule-nested-repo-config submodule: support reading .gitmodules when it's not in the working tree submodule: add a helper to check if it is safe to write to .gitmodules t7506: clean up .gitmodules properly before setting up new scenario submodule: use the 'submodule--helper config' command submodule--helper: add a new 'config' subcommand t7411: be nicer to future tests and really clean things up t7411: merge tests 5 and 6 submodule: factor out a config_set_in_gitmodules_file_gently function submodule: add a print_config_from_gitmodules() helper	2018-11-13 22:37:22 +09:00
Junio C Hamano	6c268fdda9	Merge branch 'js/mingw-perl5lib' Windows fix. * js/mingw-perl5lib: mingw: unset PERL5LIB by default config: move Windows-specific config settings into compat/mingw.c config: allow for platform-specific core.* config settings config: rename `dummy` parameter to `cb` in git_default_config()	2018-11-13 22:37:20 +09:00
Junio C Hamano	8c758f9a67	Merge branch 'nd/per-worktree-config' A fourth class of configuration files (in addition to the traditional "system wide", "per user in the $HOME directory" and "per repository in the $GIT_DIR/config") has been introduced so that different worktrees that share the same repository (hence the same $GIT_DIR/config file) can use different customization. * nd/per-worktree-config: worktree: add per-worktree config files t1300: extract and use test_cmp_config()	2018-11-13 22:37:18 +09:00
Junio C Hamano	b49ef560ed	Merge branch 'ag/rebase-i-in-c' Rewrite of the remaining "rebase -i" machinery in C. * ag/rebase-i-in-c: rebase -i: move rebase--helper modes to rebase--interactive rebase -i: remove git-rebase--interactive.sh rebase--interactive2: rewrite the submodes of interactive rebase in C rebase -i: implement the main part of interactive rebase as a builtin rebase -i: rewrite init_basic_state() in C rebase -i: rewrite write_basic_state() in C rebase -i: rewrite the rest of init_revisions_and_shortrevisions() in C rebase -i: implement the logic to initialize $revisions in C rebase -i: remove unused modes and functions rebase -i: rewrite complete_action() in C t3404: todo list with commented-out commands only aborts sequencer: change the way skip_unnecessary_picks() returns its result sequencer: refactor append_todo_help() to write its message to a buffer rebase -i: rewrite checkout_onto() in C rebase -i: rewrite setup_reflog_action() in C sequencer: add a new function to silence a command, except if it fails rebase -i: rewrite the edit-todo functionality in C editor: add a function to launch the sequence editor rebase -i: rewrite append_todo_help() in C sequencer: make three functions and an enum from sequencer.c public	2018-11-02 11:04:53 +09:00
Johannes Schindelin	bdfbb0ea93	config: move Windows-specific config settings into compat/mingw.c Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-31 12:46:27 +09:00
Ben Peart	99ce720c33	speed up refresh_index() by utilizing preload_index() Speed up refresh_index() by utilizing preload_index() to do most of the work spread across multiple threads. This works because most cache entries will get marked CE_UPTODATE so that refresh_cache_ent() can bail out early when called from within refresh_index(). On a Windows repo with ~200K files, this drops refresh times from 6.64 seconds to 2.87 seconds for a savings of 57%. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-30 11:28:39 +09:00
Junio C Hamano	7a43ab6fb2	Merge branch 'sg/split-index-racefix' The codepath to support the experimental split-index mode had remaining "racily clean" issues fixed. * sg/split-index-racefix: split-index: BUG() when cache entry refers to non-existing shared entry split-index: smudge and add racily clean cache entries to split index split-index: don't compare cached data of entries already marked for split index split-index: count the number of deleted entries t1700-split-index: date back files to avoid racy situations split-index: add tests to demonstrate the racy split index problem t1700-split-index: document why FSMONITOR is disabled in this test script	2018-10-26 14:22:10 +09:00
Nguyễn Thái Ngọc Duy	58b284a2e9	worktree: add per-worktree config files A new repo extension is added, worktreeConfig. When it is present: - Repository config reading by default includes $GIT_DIR/config _and_ $GIT_DIR/config.worktree. "config" file remains shared in multiple worktree setup. - The special treatment for core.bare and core.worktree, to stay effective only in main worktree, is gone. These config settings are supposed to be in config.worktree. This extension is most useful in multiple worktree setup because you now have an option to store per-worktree config (which is either .git/config.worktree for main worktree, or .git/worktrees/xx/config.worktree for linked ones). This extension can be used in single worktree mode, even though it's pretty much useless (but this can happen after you remove all linked worktrees and move back to single worktree). "git config" reads from both "config" and "config.worktree" by default (i.e. without either --user, --file...) when this extension is present. Default writes still go to "config", not "config.worktree". A new option --worktree is added for that (). Since a new repo extension is introduced, existing git binaries should refuse to access to the repo (both from main and linked worktrees). So they will not misread the config file (i.e. skip the config.worktree part). They may still accidentally write to the config file anyway if they use with "git config --file <path>". This design places a bet on the assumption that the majority of config variables are shared so it is the default mode. A safer move would be default writes go to per-worktree file, so that accidental changes are isolated. () "git config --worktree" points back to "config" file when this extension is not present and there is only one worktree so that it works in any both single and multiple worktree setups. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-22 13:17:04 +09:00
Junio C Hamano	4d87b38e6c	Merge branch 'nd/status-refresh-progress' "git status" learns to show progress bar when refreshing the index takes a long time. * nd/status-refresh-progress: status: show progress bar if refreshing the index takes too long	2018-10-19 13:34:03 +09:00
Junio C Hamano	11877b9ebe	Merge branch 'nd/the-index' Various codepaths in the core-ish part learn to work on an arbitrary in-core index structure, not necessarily the default instance "the_index". * nd/the-index: (23 commits) revision.c: reduce implicit dependency the_repository revision.c: remove implicit dependency on the_index ws.c: remove implicit dependency on the_index tree-diff.c: remove implicit dependency on the_index submodule.c: remove implicit dependency on the_index line-range.c: remove implicit dependency on the_index userdiff.c: remove implicit dependency on the_index rerere.c: remove implicit dependency on the_index sha1-file.c: remove implicit dependency on the_index patch-ids.c: remove implicit dependency on the_index merge.c: remove implicit dependency on the_index merge-blobs.c: remove implicit dependency on the_index ll-merge.c: remove implicit dependency on the_index diff-lib.c: remove implicit dependency on the_index read-cache.c: remove implicit dependency on the_index diff.c: remove implicit dependency on the_index grep.c: remove implicit dependency on the_index diff.c: remove the_index dependency in textconv() functions blame.c: rename "repo" argument to "r" combine-diff.c: remove implicit dependency on the_index ...	2018-10-19 13:34:02 +09:00
SZEDER Gábor	5581a019ba	split-index: smudge and add racily clean cache entries to split index Ever since the split index feature was introduced [1], refreshing a split index is prone to a variant of the classic racy git problem. Consider the following sequence of commands updating the split index when the shared index contains a racily clean cache entry, i.e. an entry whose cached stat data matches with the corresponding file in the worktree and the cached mtime matches that of the index: echo "cached content" >file git update-index --split-index --add file echo "dirty worktree" >file # size stays the same! # ... wait ... git update-index --add other-file Normally, when a non-split index is updated, then do_write_index() (the function responsible for writing all kinds of indexes, "regular", split, and shared) recognizes racily clean cache entries, and writes them with smudged stat data, i.e. with file size set to 0. When subsequent git commands read the index, they will notice that the smudged stat data doesn't match with the file in the worktree, and then go on to check the file's content and notice its dirtiness. In the above example, however, in the second 'git update-index' prepare_to_write_split_index() decides which cache entries stored only in the shared index should be replaced in the new split index. Alas, this function never looks out for racily clean cache entries, and since the file's stat data in the worktree hasn't changed since the shared index was written, it won't be replaced in the new split index. Consequently, do_write_index() doesn't even get this racily clean cache entry, and can't smudge its stat data. Subsequent git commands will then see that the index has more recent mtime than the file and that the (not smudged) cached stat data still matches with the file in the worktree, and, ultimately, will erroneously consider the file clean. Modify prepare_to_write_split_index() to recognize racily clean cache entries, and mark them to be added to the split index. Note that there are two places where it should check raciness: first those cache entries that are only stored in the shared index, and then those that have been copied by unpack_trees() from the shared index while it constructed a new index. This way do_write_index() will get these racily clean cache entries as well, and will then write them with smudged stat data to the new split index. This change makes all tests in 't1701-racy-split-index.sh' pass, so flip the two 'test_expect_failure' tests to success. Also add the '#' (as in nr. of trial) to those tests' description that were omitted when the tests expected failure. Note that after this change if the index is split when it contains a racily clean cache entry, then a smudged cache entry will be written both to the new shared and to the new split indexes. This doesn't affect regular git commands: as far as they are concerned this is just an entry in the split index replacing an outdated entry in the shared index. It did affect a few tests in 't1700-split-index.sh', though, because they actually check which entries are stored in the split index; a previous patch in this series has already made the necessary adjustments in 't1700'. And racily clean cache entries and index splitting are rare enough to not worry about the resulting duplicated smudged cache entries, and the additional complexity required to prevent them is not worth it. Several tests failed occasionally when the test suite was run with 'GIT_TEST_SPLIT_INDEX=yes'. Here are those that I managed to trace back to this racy split index problem, starting with those failing more frequently, with a link to a failing Travis CI build job for each. The highlighted line [2] shows when the racy file was written, which is not always in the failing test but in a preceeding setup test. t3903-stash.sh: https://travis-ci.org/git/git/jobs/385542084#L5858 t4024-diff-optimize-common.sh: https://travis-ci.org/git/git/jobs/386531969#L3174 t4015-diff-whitespace.sh: https://travis-ci.org/git/git/jobs/360797600#L8215 t2200-add-update.sh: https://travis-ci.org/git/git/jobs/382543426#L3051 t0090-cache-tree.sh: https://travis-ci.org/git/git/jobs/416583010#L3679 There might be others, e.g. perhaps 't1000-read-tree-m-3way.sh' and others using 'lib-read-tree-m-3way.sh', but I couldn't confirm yet. [1] In the branch leading to the merge commit v2.1.0-rc0~45 (Merge branch 'nd/split-index', 2014-07-16). [2] Note that those highlighted lines are in the 'after failure' fold, and your browser might unhelpfully fold it up before you could take a good look. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-12 07:23:29 +09:00
Antonio Ospite	b5c259f226	submodule: add a helper to check if it is safe to write to .gitmodules Introduce a helper function named is_writing_gitmodules_ok() to verify that the .gitmodules file is safe to write. The function name follows the scheme of is_staging_gitmodules_ok(). The two symbolic constants GITMODULES_INDEX and GITMODULES_HEAD are used to get help from the C preprocessor in preventing typos, especially for future users. This is in preparation for a future change which teaches git how to read .gitmodules from the index or from the current branch if the file is not available in the working tree. The rationale behind the check is that writing to .gitmodules requires the file to be present in the working tree, unless a brand new .gitmodules is being created (in which case the .gitmodules file would not exist at all: neither in the working tree nor in the index or in the current branch). Expose the functionality also via a "submodule-helper config --check-writeable" command, as git scripts may want to perform the check before modifying submodules configuration. Signed-off-by: Antonio Ospite <ao2@ao2.it> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-10-09 12:40:21 +09:00
Nguyễn Thái Ngọc Duy	26d024ecf0	ws.c: remove implicit dependency on the_index Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:51:18 -07:00
Nguyễn Thái Ngọc Duy	58bf2a4cc7	sha1-file.c: remove implicit dependency on the_index Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:11 -07:00
Nguyễn Thái Ngọc Duy	7e196c3a28	merge.c: remove implicit dependency on the_index Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:10 -07:00
Nguyễn Thái Ngọc Duy	92a1bf5a58	read-cache.c: remove 'const' from index_has_changes() This function calls do_diff_cache() which eventually needs to set this "istate" to unpack_options->src_index [1]. This is an unfortunate fact that unpack_trees() _will_ update [2] src_index so we can't really pass a const index_state there. Just remove 'const'. [1] Right now diff_cache() in diff-lib.c assigns the_index to src_index. But the plan is to get rid of the_index, so it should be 'istate' from here that gets assigned to src_index. [2] Some transient bits in the source index are touched. Optional extensions can also be removed. But other than that the source tree should still be valid. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-21 09:48:10 -07:00
Junio C Hamano	769af0fd9e	Merge branch 'jk/cocci' spatch transformation to replace boolean uses of !hashcmp() to newly introduced oideq() is added, and applied, to regain performance lost due to support of multiple hash algorithms. * jk/cocci: show_dirstat: simplify same-content check read-cache: use oideq() in ce_compare functions convert hashmap comparison functions to oideq() convert "hashcmp() != 0" to "!hasheq()" convert "oidcmp() != 0" to "!oideq()" convert "hashcmp() == 0" to hasheq() convert "oidcmp() == 0" to oideq() introduce hasheq() and oideq() coccinelle: use <...> for function exclusion	2018-09-17 13:53:57 -07:00
Junio C Hamano	c2407322b6	Merge branch 'nd/clone-case-smashing-warning' Running "git clone" against a project that contain two files with pathnames that differ only in cases on a case insensitive filesystem would result in one of the files lost because the underlying filesystem is incapable of holding both at the same time. An attempt is made to detect such a case and warn. * nd/clone-case-smashing-warning: clone: report duplicate entries on case-insensitive filesystems	2018-09-17 13:53:47 -07:00
Nguyễn Thái Ngọc Duy	ae9af12287	status: show progress bar if refreshing the index takes too long Refreshing the index is usually very fast, but it can still take a long time sometimes. Cold cache is one. Or copying a repo to a new place (). It's good to show something to let the user know "git status" is not hanging, it's just busy doing something. () In this case, all stat info in the index becomes invalid and git falls back to rehashing all file content to see if there's any difference between updating stat info in the index. This is quite expensive. Even with a repo as small as git.git, it takes 3 seconds. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-09-17 09:38:50 -07:00
Jeff King	e3ff0683e2	convert "hashcmp() == 0" to hasheq() This is the partner patch to the previous one, but covering the "hash" variants instead of "oid". Note that our coccinelle rule is slightly more complex to avoid triggering the call in hasheq(). I didn't bother to add a new rule to convert: - hasheq(E1->hash, E2->hash) + oideq(E1, E2) Since these are new functions, there won't be any such existing callers. And since most of the code is already using oideq, we're not likely to introduce new ones. We might still see "!hashcmp(E1->hash, E2->hash)" from topics in flight. But because our new rule comes after the existing ones, that should first get converted to "!oidcmp(E1, E2)" and then to "oideq(E1, E2)". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	4a7e27e957	convert "oidcmp() == 0" to oideq() Using the more restrictive oideq() should, in the long run, give the compiler more opportunities to optimize these callsites. For now, this conversion should be a complete noop with respect to the generated code. The result is also perhaps a little more readable, as it avoids the "zero is equal" idiom. Since it's so prevalent in C, I think seasoned programmers tend not to even notice it anymore, but it can sometimes make for awkward double negations (e.g., we can drop a few !!oidcmp() instances here). This patch was generated almost entirely by the included coccinelle patch. This mechanical conversion should be completely safe, because we check explicitly for cases where oidcmp() is compared to 0, which is what oideq() is doing under the hood. Note that we don't have to catch "!oidcmp()" separately; coccinelle's standard isomorphisms make sure the two are treated equivalently. I say "almost" because I did hand-edit the coccinelle output to fix up a few style violations (it mostly keeps the original formatting, but sometimes unwraps long lines). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	14438c4497	introduce hasheq() and oideq() The main comparison functions we provide for comparing object ids are hashcmp() and oidcmp(). These are more flexible than a strict equality check, since they also express ordering. That makes them useful for sorting and binary searching. However, it also makes them potentially slower than a strict equality check. Consider this C code, which is traditionally what our hashcmp has looked like: #include <string.h> int hashcmp(const unsigned char a, const unsigned char b) { return memcmp(a, b, 20); } Compiling with "gcc -O2 -S -fverbose-asm", the generated assembly shows that we actually call memcmp(). But if we change this to a strict equality check: return !memcmp(a, b, 20); we get a faster inline version: movq (%rdi), %rax # MEM[(void )a_4(D)], MEM[(void )a_4(D)] movq 8(%rdi), %rdx # MEM[(void )a_4(D)], tmp101 xorq (%rsi), %rax # MEM[(void )b_5(D)], tmp94 xorq 8(%rsi), %rdx # MEM[(void )b_5(D)], tmp93 orq %rax, %rdx # tmp94, tmp93 jne .L2 #, movl 16(%rsi), %eax # MEM[(void )b_5(D)], tmp104 cmpl %eax, 16(%rdi) # tmp104, MEM[(void *)a_4(D)] je .L5 #, Obviously our hashcmp() doesn't include the "!". But because it's an inline function, optimizing compilers are able to see "!hashcmp(a,b)" in calling code and take advantage of this case. So there has been no value thus far in introducing a more restricted interface for doing strict equality checks. But as Git learns about more values for the_hash_algo, our hashcmp() will grow more complicated and may even delegate at runtime to functions optimized specifically for that hash size. That breaks the inline connection we have, and the compiler will have to assume that the caller really cares about the sign and magnitude of the memcmp() result, even though the vast majority don't. We can solve that by introducing a hasheq() function (and matching oideq() wrapper), which callers can use to make it clear that they only care about equality. For now, the implementation will literally be "!hashcmp()", but it frees us up later to introduce code optimized specifically for the equality check. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-29 11:32:49 -07:00
Jeff King	183a638b7d	hashcmp: assert constant hash size Prior to `509f6f62a4` (cache: update object ID functions for the_hash_algo, 2018-07-16), hashcmp() called memcmp() with a constant size of 20 bytes. Some compilers were able to turn that into a few quad-word comparisons, which is faster than actually calling memcmp(). In `509f6f62a4`, we started using the_hash_algo->rawsz instead. Even though this will always be 20, the compiler doesn't know that while inlining hashcmp() and ends up just generating a call to memcmp(). Eventually we'll have to deal with multiple hash sizes, but for the upcoming v2.19, we can restore some of the original performance by asserting on the size. That gives the compiler enough information to know that the memcmp will always be called with a length of 20, and it performs the same optimization. Here are numbers for p0001.2 run against linux.git on a few versions. This is using -O2 with gcc 8.2.0. Test v2.18.0 v2.19.0-rc0 HEAD ------------------------------------------------------------------------------ 0001.2: 34.24(33.81+0.43) 34.83(34.42+0.40) +1.7% 33.90(33.47+0.42) -1.0% You can see that v2.19 is a little slower than v2.18. This commit ended up slightly faster than v2.18, but there's a fair bit of run-to-run noise (the generated code in the two cases is basically the same). This patch does seem to be consistently 1-2% faster than v2.19. I tried changing hashcpy(), which was also touched by `509f6f62a4`, in the same way, but couldn't measure any speedup. Which makes sense, at least for this workload. A traversal of the whole commit graph requires looking up every entry of every tree via lookup_object(). That's many multiples of the numbers of objects in the repository (most of the lookups just return "yes, we already saw that object"). [jn: verified using "make object.s" that the memcmp call goes away.] Reported-by: Derrick Stolee <stolee@gmail.com> Signed-off-by: Jeff King <peff@peff.net Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-23 06:20:58 -07:00
Junio C Hamano	5ade034464	Merge branch 'en/incl-forward-decl' Code hygiene improvement for the header files. * en/incl-forward-decl: Remove forward declaration of an enum compat/precompose_utf8.h: use more common include guard style urlmatch.h: fix include guard Move definition of enum branch_track from cache.h to branch.h alloc: make allocate_alloc_state and clear_alloc_state more consistent Add missing includes and forward declarations	2018-08-20 12:41:32 -07:00
Junio C Hamano	0c54cdaf65	Merge branch 'jk/for-each-object-iteration' The API to iterate over all objects learned to optionally list objects in the order they appear in packfiles, which helps locality of access if the caller accesses these objects while as objects are enumerated. * jk/for-each-object-iteration: for_each__object: move declarations to object-store.h cat-file: use a single strbuf for all output cat-file: split batch "buf" into two variables cat-file: use oidset check-and-insert cat-file: support "unordered" output for --batch-all-objects cat-file: rename batch_{loose,packed}_object callbacks t1006: test cat-file --batch-all-objects with duplicates for_each_packed_object: support iterating in pack-order for_each__object: give more comprehensive docstrings for_each__object: take flag arguments as enum for_each__object: store flag definitions in a single location	2018-08-20 11:33:52 -07:00
Duy Nguyen	b878579ae7	clone: report duplicate entries on case-insensitive filesystems Paths that only differ in case work fine in a case-sensitive filesystems, but if those repos are cloned in a case-insensitive one, you'll get problems. The first thing to notice is "git status" will never be clean with no indication what exactly is "dirty". This patch helps the situation a bit by pointing out the problem at clone time. Even though this patch talks about case sensitivity, the patch makes no assumption about folding rules by the filesystem. It simply observes that if an entry has been already checked out at clone time when we're about to write a new path, some folding rules are behind this. In the case that we can't rely on filesystem (via inode number) to do this check, fall back to fspathcmp() which is not perfect but should not give false positives. This patch is tested with vim-colorschemes and Sublime-Gitignore repositories on a JFS partition with case insensitive support on Linux. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-17 12:10:37 -07:00
Junio C Hamano	30cf1911e2	Merge branch 'js/vscode' Add a script (in contrib/) to help users of VSCode work better with our codebase. * js/vscode: vscode: let cSpell work on commit messages, too vscode: add a dictionary for cSpell vscode: use 8-space tabs, no trailing ws, etc for Git's source code vscode: wrap commit messages at column 72 by default vscode: only overwrite C/C++ settings mingw: define WIN32 explicitly cache.h: extract enum declaration from inside a struct declaration vscode: hard-code a couple defines contrib: add a script to initialize VS Code configuration	2018-08-15 15:08:26 -07:00
Junio C Hamano	1689c22c1c	Merge branch 'jk/core-use-replace-refs' A new configuration variable core.usereplacerefs has been added, primarily to help server installations that want to ignore the replace mechanism altogether. * jk/core-use-replace-refs: add core.usereplacerefs config option check_replace_refs: rename to read_replace_refs check_replace_refs: fix outdated comment	2018-08-15 15:08:23 -07:00
Elijah Newren	e730b81df6	Move definition of enum branch_track from cache.h to branch.h 'branch_track' feels more closely related to branching, and it is needed later in branch.h; rather than #include'ing cache.h in branch.h for this small enum, just move the enum and the external declaration for git_branch_track to branch.h. Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-15 11:52:09 -07:00
Jeff King	0889aae1cd	for_each__object: move declarations to object-store.h The for_each_loose_object() and for_each_packed_object() functions are meant to be part of a unified interface: they use the same set of for_each_object_flags, and it's not inconceivable that we might one day add a single for_each_object() wrapper around them. Let's put them together in a single file, so we can avoid awkwardness like saying "the flags for this function are over in cache.h". Moving the loose functions to packfile.h is silly. Moving the packed functions to cache.h works, but makes the "cache.h is a kitchen sink" problem worse. The best place is the recently-created object-store.h, since these are quite obviously related to object storage. The for_each__in_objdir() functions do not use the same flags, but they are logically part of the same interface as for_each_loose_object(), and share callback signatures. So we'll move those, as well, as they also make sense in object-store.h. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-14 12:29:57 -07:00
Jeff King	736eb88fdc	for_each_packed_object: support iterating in pack-order We currently iterate over objects within a pack in .idx order, which uses the object hashes. That means that it is effectively random with respect to the location of the object within the pack. If you're going to access the actual object data, there are two reasons to move linearly through the pack itself: 1. It improves the locality of access in the packfile. In the cold-cache case, this may mean fewer disk seeks, or better usage of disk cache. 2. We store related deltas together in the packfile. Which means that the delta base cache can operate much more efficiently if we visit all of those related deltas in sequence, as the earlier items are likely to still be in the cache. Whereas if we visit the objects in random order, our cache entries are much more likely to have been evicted by unrelated deltas in the meantime. So in general, if you're going to access the object contents pack order is generally going to end up more efficient. But if you're simply generating a list of object names, or if you're going to end up sorting the result anyway, you're better off just using the .idx order, as finding the pack order means generating the in-memory pack-revindex. According to the numbers in `8b8dfd5132` (pack-revindex: radix-sort the revindex, 2013-07-11), that takes about 200ms for linux.git, and 20ms for git.git (those numbers are a few years old but are still a good ballpark). That makes it a good optimization for some cases (we can save tens of seconds in git.git by having good locality of delta access, for a 20ms cost), but a bad one for others (e.g., right now "cat-file --batch-all-objects --batch-check="%(objectname)" is 170ms in git.git, so adding 20ms to that is noticeable). Hence this patch makes it an optional flag. You can't actually do any interesting timings yet, as it's not plumbed through to any user-facing tools like cat-file. That will come in a later patch. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:28 -07:00
Jeff King	8b36155190	for_each_*_object: give more comprehensive docstrings We already mention the local/alternate behavior of these functions, but we can help clarify a few other behaviors: - there's no need to mention LOCAL_ONLY specifically, since we already reference the flags by type (and as we add more flags, we don't want to have to mention each) - clarify that reachability doesn't matter here; this is all accessible objects - what ordering/uniqueness guarantees we give - how pack-specific flags are handled for the loose case Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:26 -07:00
Jeff King	a7ff6f5a0f	for_each_*_object: take flag arguments as enum It's not wrong to pass our flags in an "unsigned", as we know it will be at least as large as the enum. However, using the enum in the declaration makes it more obvious where to find the list of flags. While we're here, let's also drop the "extern" noise-words from the declarations, per our modern coding style. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:48:25 -07:00
Jeff King	202e7f1e16	for_each_*_object: store flag definitions in a single location These flags were split between cache.h and packfile.h, because some of the flags apply only to packs. However, they share a single numeric namespace, since both are respected for the packed variant. Let's make sure they're defined together so that nobody accidentally adds a new flag in one location that duplicates the other. While we're here, let's also put them in an enum (which helps debugger visibility) and use "(1<<n)" rather than counting powers of 2 manually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-13 13:47:50 -07:00
Alban Gruin	2aed01811d	editor: add a function to launch the sequence editor As part of the rewrite of interactive rebase, the sequencer will need to open the sequence editor to allow the user to edit the todo list. Instead of duplicating the existing launch_editor() function, this refactors it to a new function, launch_specified_editor(), which takes the editor as a parameter, in addition to the path, the buffer and the environment variables. launch_sequence_editor() is then added to launch the sequence editor. Signed-off-by: Alban Gruin <alban.gruin@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-08-10 11:56:22 -07:00
Junio C Hamano	78a72ad4f8	Merge branch 'jt/commit-graph-per-object-store' The singleton commit-graph in-core instance is made per in-core repository instance. * jt/commit-graph-per-object-store: commit-graph: add repo arg to graph readers commit-graph: store graph in struct object_store commit-graph: add free_commit_graph commit-graph: add missing forward declaration object-store: add missing include commit-graph: refactor preparing commit graph	2018-08-02 15:30:47 -07:00
Junio C Hamano	c18ac30e9e	Merge branch 'en/dirty-merge-fixes' The recursive merge strategy did not properly ensure there was no change between HEAD and the index before performing its operation, which has been corrected. * en/dirty-merge-fixes: merge: fix misleading pre-merge check documentation merge-recursive: enforce rule that index matches head before merging t6044: add more testcases with staged changes before a merge is invoked merge-recursive: fix assumption that head tree being merged is HEAD merge-recursive: make sure when we say we abort that we actually abort t6044: add a testcase for index matching head, when head doesn't match HEAD t6044: verify that merges expected to abort actually abort index_has_changes(): avoid assuming operating on the_index read-cache.c: move index_has_changes() from merge.c	2018-08-02 15:30:45 -07:00
Junio C Hamano	ae533c4a92	Merge branch 'jm/cache-entry-from-mem-pool' For a large tree, the index needs to hold many cache entries allocated on heap. These cache entries are now allocated out of a dedicated memory pool to amortize malloc(3) overhead. * jm/cache-entry-from-mem-pool: block alloc: add validations around cache_entry lifecyle block alloc: allocate cache entries from mem_pool mem-pool: fill out functionality mem-pool: add life cycle management functions mem-pool: only search head block for available space block alloc: add lifecycle APIs for cache_entry structs read-cache: teach make_cache_entry to take object_id read-cache: teach refresh_cache_entry to take istate	2018-08-02 15:30:43 -07:00
Junio C Hamano	37aac3e408	Merge branch 'bc/object-id' Conversion from uchar[40] to struct object_id continues. * bc/object-id: pretty: switch hard-coded constants to the_hash_algo sha1-file: convert constants to uses of the_hash_algo log-tree: switch GIT_SHA1_HEXSZ to the_hash_algo->hexsz diff: switch GIT_SHA1_HEXSZ to use the_hash_algo builtin/merge-recursive: make hash independent builtin/merge: switch to use the_hash_algo builtin/fmt-merge-msg: make hash independent builtin/update-index: simplify parsing of cacheinfo builtin/update-index: convert to using the_hash_algo refs/files-backend: use the_hash_algo for writing refs sha1-name: use the_hash_algo when parsing object names strbuf: allocate space with GIT_MAX_HEXSZ commit: express tree entry constants in terms of the_hash_algo hex: switch to using the_hash_algo tree-walk: replace hard-coded constants with the_hash_algo cache: update object ID functions for the_hash_algo	2018-08-02 15:30:39 -07:00
Johannes Schindelin	58930fdb19	cache.h: extract enum declaration from inside a struct declaration While it is technically possible, it is confusing. Not only the user, but also VS Code's intellisense. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-30 13:14:38 -07:00
Jeff King	6ebd1cafe2	check_replace_refs: rename to read_replace_refs This was added as a NEEDSWORK in `c3c36d7de2` (replace-object: check_replace_refs is safe in multi repo environment, 2018-04-11), waiting for a calmer period. Since doing so now doesn't conflict with anything in 'pu', it seems as good a time as any. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-18 15:45:14 -07:00
Jeff King	72470aa38a	check_replace_refs: fix outdated comment Commit `afc711b8e1` (rename read_replace_refs to check_replace_refs, 2014-02-18) added a comment mentioning that check_replace_refs is set in two ways: - from user intent via --no-replace-objects, etc - after seeing there are no replace refs to respect Since `c3c36d7de2` (replace-object: check_replace_refs is safe in multi repo environment, 2018-04-11) the second is no longer true. Let's drop that part of the comment. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-18 15:45:06 -07:00
Junio C Hamano	00624d608c	Merge branch 'sb/object-store-grafts' The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h	2018-07-18 12:20:28 -07:00
Jonathan Tan	dade47c06c	commit-graph: add repo arg to graph readers Add a struct repository argument to the functions in commit-graph.h that read the commit graph. (This commit does not affect functions that write commit graphs.) Because the commit graph functions can now read the commit graph of any repository, the global variable core_commit_graph has been removed. Instead, the config option core.commitGraph is now read on the first time in a repository that a commit is attempted to be parsed using its commit graph. This commit includes a test that exercises the functionality on an arbitrary repository that is not the_repository. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-17 15:47:48 -07:00
brian m. carlson	509f6f62a4	cache: update object ID functions for the_hash_algo Most of our code has been converted to use struct object_id for object IDs. However, there are some places that still have not, and there are a variety of places that compare equivalently sized hashes that are not object IDs. All of these hashes are artifacts of the internal hash algorithm in use, and when we switch to NewHash for object storage, all of these uses will also switch. Update the hashcpy, hashclr, and hashcmp functions to use the_hash_algo, since they are used in a variety of places to copy and manipulate buffers that need to move data into or out of struct object_id. This has the effect of making the corresponding oid* functions use the_hash_algo as well. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-16 14:27:38 -07:00
Elijah Newren	e1f8694f33	merge-recursive: fix assumption that head tree being merged is HEAD `git merge-recursive` does a three-way merge between user-specified trees base, head, and remote. Since the user is allowed to specify head, we can not necesarily assume that head == HEAD. Modify index_has_changes() to take an extra argument specifying the tree to compare against. If NULL, it will compare to HEAD. We then use this from merge-recursive to make sure we compare to the user-specified head. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-11 09:38:36 -07:00
Elijah Newren	1b9fbefbe0	index_has_changes(): avoid assuming operating on the_index Modify index_has_changes() to take a struct istate* instead of just operating on the_index. This is only a partial conversion, though, because we call do_diff_cache() which implicitly assumes work is to be done on the_index. Ongoing work is being done elsewhere to do the remainder of the conversion, and thus is not duplicated here. Instead, a simple check is put in place until that work is complete. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 13:13:18 -07:00
Jameson Miller	8616a2d0cb	block alloc: add validations around cache_entry lifecyle Add an option (controlled by an environment variable) perform extra validations on mem_pool allocated cache entries. When set: 1) Invalidate cache_entry memory when discarding cache_entry. 2) When discarding index_state struct, verify that all cache_entries were allocated from expected mem_pool. 3) When discarding mem_pools, invalidate mem_pool memory. This should provide extra checks that mem_pools and their allocated cache_entries are being used as expected. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 10:58:27 -07:00
Jameson Miller	8e72d67529	block alloc: allocate cache entries from mem_pool When reading large indexes from disk, a portion of the time is dominated in malloc() calls. This can be mitigated by allocating a large block of memory and manage it ourselves via memory pools. This change moves the cache entry allocation to be on top of memory pools. Design: The index_state struct will gain a notion of an associated memory_pool from which cache_entries will be allocated from. When reading in the index from disk, we have information on the number of entries and their size, which can guide us in deciding how large our initial memory allocation should be. When an index is discarded, the associated memory_pool will be discarded as well - so the lifetime of a cache_entry is tied to the lifetime of the index_state that it was allocated for. In the case of a Split Index, the following rules are followed. 1st, some terminology is defined: Terminology: - 'the_index': represents the logical view of the index - 'split_index': represents the "base" cache entries. Read from the split index file. 'the_index' can reference a single split_index, as well as cache_entries from the split_index. `the_index` will be discarded before the `split_index` is. This means that when we are allocating cache_entries in the presence of a split index, we need to allocate the entries from the `split_index`'s memory pool. This allows us to follow the pattern that `the_index` can reference cache_entries from the `split_index`, and that the cache_entries will not be freed while they are still being referenced. Managing transient cache_entry structs: Cache entries are usually allocated for an index, but this is not always the case. Cache entries are sometimes allocated because this is the type that the existing checkout_entry function works with. Because of this, the existing code needs to handle cache entries associated with an index / memory pool, and those that only exist transiently. Several strategies were contemplated around how to handle this: Chosen approach: An extra field was added to the cache_entry type to track whether the cache_entry was allocated from a memory pool or not. This is currently an int field, as there are no more available bits in the existing ce_flags bit field. If / when more bits are needed, this new field can be turned into a proper bit field. Alternatives: 1) Do not include any information about how the cache_entry was allocated. Calling code would be responsible for tracking whether the cache_entry needed to be freed or not. Pro: No extra memory overhead to track this state Con: Extra complexity in callers to handle this correctly. The extra complexity and burden to not regress this behavior in the future was more than we wanted. 2) cache_entry would gain knowledge about which mem_pool allocated it Pro: Could (potentially) do extra logic to know when a mem_pool no longer had references to any cache_entry Con: cache_entry would grow heavier by a pointer, instead of int We didn't see a tangible benefit to this approach 3) Do not add any extra information to a cache_entry, but when freeing a cache entry, check if the memory exists in a region managed by existing mem_pools. Pro: No extra memory overhead to track state Con: Extra computation is performed when freeing cache entries We decided tracking and iterating over known memory pool regions was less desirable than adding an extra field to track this stae. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 10:58:27 -07:00
Jameson Miller	a849735bfb	block alloc: add lifecycle APIs for cache_entry structs It has been observed that the time spent loading an index with a large number of entries is partly dominated by malloc() calls. This change is in preparation for using memory pools to reduce the number of malloc() calls made to allocate cahce entries when loading an index. Add an API to allocate and discard cache entries, abstracting the details of managing the memory backing the cache entries. This commit does actually change how memory is managed - this will be done in a later commit in the series. This change makes the distinction between cache entries that are associated with an index and cache entries that are not associated with an index. A main use of cache entries is with an index, and we can optimize the memory management around this. We still have other cases where a cache entry is not persisted with an index, and so we need to handle the "transient" use case as well. To keep the congnitive overhead of managing the cache entries, there will only be a single discard function. This means there must be enough information kept with the cache entry so that we know how to discard them. A summary of the main functions in the API is: make_cache_entry: create cache entry for use in an index. Uses specified parameters to populate cache_entry fields. make_empty_cache_entry: Create an empty cache entry for use in an index. Returns cache entry with empty fields. make_transient_cache_entry: create cache entry that is not used in an index. Uses specified parameters to populate cache_entry fields. make_empty_transient_cache_entry: create cache entry that is not used in an index. Returns cache entry with empty fields. discard_cache_entry: A single function that knows how to discard a cache entry regardless of how it was allocated. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 10:58:27 -07:00
Jameson Miller	825ed4d9a0	read-cache: teach make_cache_entry to take object_id Teach make_cache_entry function to take object_id instead of a SHA-1. Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 10:58:15 -07:00
Jameson Miller	768d796506	read-cache: teach refresh_cache_entry to take istate Refactor refresh_cache_entry() to work on a specific index, instead of implicitly using the_index. This is in preparation for making the make_cache_entry function apply to a specific index. Signed-off-by: Jameson Miller <jamill@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-07-03 10:58:15 -07:00
Junio C Hamano	b16b60f71b	Merge branch 'sb/object-store-grafts' into sb/object-store-lookup * sb/object-store-grafts: commit: allow lookup_commit_graft to handle arbitrary repositories commit: allow prepare_commit_graft to handle arbitrary repositories shallow: migrate shallow information into the object parser path.c: migrate global git_path_* to take a repository argument cache: convert get_graft_file to handle arbitrary repositories commit: convert read_graft_file to handle arbitrary repositories commit: convert register_commit_graft to handle arbitrary repositories commit: convert commit_graft_pos() to handle arbitrary repositories shallow: add repository argument to is_repository_shallow shallow: add repository argument to check_shallow_file_for_update shallow: add repository argument to register_shallow shallow: add repository argument to set_alternate_shallow_file commit: add repository argument to lookup_commit_graft commit: add repository argument to prepare_commit_graft commit: add repository argument to read_graft_file commit: add repository argument to register_commit_graft commit: add repository argument to commit_graft_pos object: move grafts to object parser object-store: move object access functions to object-store.h	2018-06-29 10:43:28 -07:00
Junio C Hamano	110240588d	Merge branch 'sb/object-store-alloc' The conversion to pass "the_repository" and then "a_repository" throughout the object access API continues. * sb/object-store-alloc: alloc: allow arbitrary repositories for alloc functions object: allow create_object to handle arbitrary repositories object: allow grow_object_hash to handle arbitrary repositories alloc: add repository argument to alloc_commit_index alloc: add repository argument to alloc_report alloc: add repository argument to alloc_object_node alloc: add repository argument to alloc_tag_node alloc: add repository argument to alloc_commit_node alloc: add repository argument to alloc_tree_node alloc: add repository argument to alloc_blob_node object: add repository argument to grow_object_hash object: add repository argument to create_object repository: introduce parsed objects field	2018-06-25 13:22:38 -07:00
Junio C Hamano	2289880f78	Merge branch 'nd/command-list' The list of commands with their various attributes were spread across a few places in the build procedure, but it now is getting a bit more consolidated to allow more automation. * nd/command-list: completion: allow to customize the completable command list completion: add and use --list-cmds=alias completion: add and use --list-cmds=nohelpers Move declaration for alias.c to alias.h completion: reduce completable command list completion: let git provide the completable command list command-list.txt: documentation and guide line help: use command-list.txt for the source of guides help: add "-a --verbose" to list all commands with synopsis git: support --list-cmds=list-<category> completion: implement and use --list-cmds=main,others git --list-cmds: collect command list in a string_list git.c: convert --list-* to --list-cmds=* Remove common-cmds.h help: use command-list.h for common command list generate-cmds.sh: export all commands to command-list.h generate-cmds.sh: factor out synopsis extract code	2018-06-01 15:06:37 +09:00
Junio C Hamano	42c8ce1c49	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (42 commits) merge-one-file: compute empty blob object ID add--interactive: compute the empty tree value Update shell scripts to compute empty tree object ID sha1_file: only expose empty object constants through git_hash_algo dir: use the_hash_algo for empty blob object ID sequencer: use the_hash_algo for empty tree object ID cache-tree: use is_empty_tree_oid sha1_file: convert cached object code to struct object_id builtin/reset: convert use of EMPTY_TREE_SHA1_BIN builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX wt-status: convert two uses of EMPTY_TREE_SHA1_HEX submodule: convert several uses of EMPTY_TREE_SHA1_HEX sequencer: convert one use of EMPTY_TREE_SHA1_HEX merge: convert empty tree constant to the_hash_algo builtin/merge: switch tree functions to use object_id builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo sha1-file: add functions for hex empty tree and blob OIDs builtin/receive-pack: avoid hard-coded constants for push certs diff: specify abbreviation size in terms of the_hash_algo upload-pack: replace use of several hard-coded constants ...	2018-05-30 14:04:10 +09:00
Junio C Hamano	7913f53b56	Sync with Git 2.17.1 * maint: (25 commits) Git 2.17.1 Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 fsck: complain when .gitmodules is a symlink index-pack: check .gitmodules files with --strict unpack-objects: call fsck_finish() after fscking objects fsck: call fsck_finish() after fscking objects fsck: check .gitmodules content fsck: handle promisor objects in .gitmodules check fsck: detect gitmodules files fsck: actually fsck blob data fsck: simplify ".git" check index-pack: make fsck error message more specific verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant ...	2018-05-29 17:10:05 +09:00
Junio C Hamano	ad635e82d6	Merge branch 'nd/pack-objects-pack-struct' "git pack-objects" needs to allocate tons of "struct object_entry" while doing its work, and shrinking its size helps the performance quite a bit. * nd/pack-objects-pack-struct: ci: exercise the whole test suite with uncommon code in pack-objects pack-objects: reorder members to shrink struct object_entry pack-objects: shrink delta_size field in struct object_entry pack-objects: shrink size field in struct object_entry pack-objects: clarify the use of object_entry::size pack-objects: don't check size when the object is bad pack-objects: shrink z_delta_size field in struct object_entry pack-objects: refer to delta objects by index instead of pointer pack-objects: move in_pack out of struct object_entry pack-objects: move in_pack_pos out of struct object_entry pack-objects: use bitfield for object_entry::depth pack-objects: use bitfield for object_entry::dfs_state pack-objects: turn type and in_pack_type to bitfields pack-objects: a bit of document about struct object_entry read-cache.c: make $GIT_TEST_SPLIT_INDEX boolean	2018-05-23 14:38:19 +09:00
Junio C Hamano	fcb6df3254	Merge branch 'sb/oid-object-info' The codepath around object-info API has been taught to take the repository object (which in turn tells the API which object store the objects are to be located). * sb/oid-object-info: cache.h: allow oid_object_info to handle arbitrary repositories packfile: add repository argument to cache_or_unpack_entry packfile: add repository argument to unpack_entry packfile: add repository argument to read_object packfile: add repository argument to packed_object_info packfile: add repository argument to packed_to_object_type packfile: add repository argument to retry_bad_packed_offset cache.h: add repository argument to oid_object_info cache.h: add repository argument to oid_object_info_extended	2018-05-23 14:38:16 +09:00
Junio C Hamano	b577198526	Merge branch 'nd/pack-format-doc' Doc update. * nd/pack-format-doc: pack-format.txt: more details on pack file format	2018-05-23 14:38:11 +09:00
Junio C Hamano	68f95b26e4	Sync with Git 2.16.4 * maint-2.16: Git 2.16.4 Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:25:26 +09:00
Junio C Hamano	023020401d	Sync with Git 2.15.2 * maint-2.15: Git 2.15.2 Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:18:06 +09:00
Junio C Hamano	9e0f06d55d	Sync with Git 2.14.4 * maint-2.14: Git 2.14.4 Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:15:14 +09:00
Junio C Hamano	7b01c71b64	Sync with Git 2.13.7 * maint-2.13: Git 2.13.7 verify_path: disallow symlinks in .gitmodules update-index: stat updated files earlier verify_dotfile: mention case-insensitivity in comment verify_path: drop clever fallthrough skip_prefix: add case-insensitive variant is_{hfs,ntfs}_dotgitmodules: add tests is_ntfs_dotgit: match other .git files is_hfs_dotgit: match other .git files is_ntfs_dotgit: use a size_t for traversing string submodule-config: verify submodule names as paths	2018-05-22 14:10:49 +09:00
Jeff King	10ecfa7649	verify_path: disallow symlinks in .gitmodules There are a few reasons it's not a good idea to make .gitmodules a symlink, including: 1. It won't be portable to systems without symlinks. 2. It may behave inconsistently, since Git may look at this file in the index or a tree without bothering to resolve any symbolic links. We don't do this _yet_, but the config infrastructure is there and it's planned for the future. With some clever code, we could make (2) work. And some people may not care about (1) if they only work on one platform. But there are a few security reasons to simply disallow it: a. A symlinked .gitmodules file may circumvent any fsck checks of the content. b. Git may read and write from the on-disk file without sanity checking the symlink target. So for example, if you link ".gitmodules" to "../oops" and run "git submodule add", we'll write to the file "oops" outside the repository. Again, both of those are problems that _could_ be solved with sufficient code, but given the complications in (1) and (2), we're better off just outlawing it explicitly. Note the slightly tricky call to verify_path() in update-index's update_one(). There we may not have a mode if we're not updating from the filesystem (e.g., we might just be removing the file). Passing "0" as the mode there works fine; since it's not a symlink, we'll just skip the extra checks. Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Johannes Schindelin	e7cb0b4455	is_ntfs_dotgit: match other .git files When we started to catch NTFS short names that clash with .git, we only looked for GIT~1. This is sufficient because we only ever clone into an empty directory, so .git is guaranteed to be the first subdirectory or file in that directory. However, even with a fresh clone, .gitmodules is not necessarily the first file to be written that would want the NTFS short name GITMOD~1: a malicious repository can add .gitmodul0000 and friends, which sorts before `.gitmodules` and is therefore checked out first. For that reason, we have to test not only for ~1 short names, but for others, too. It's hard to just adapt the existing checks in is_ntfs_dotgit(): since Windows 2000 (i.e., in all Windows versions still supported by Git), NTFS short names are only generated in the <prefix>~<number> form up to number 4. After that, a different prefix is used, calculated from the long file name using an undocumented, but stable algorithm. For example, the short name of .gitmodules would be GITMOD~1, but if it is taken, and all of ~2, ~3 and ~4 are taken, too, the short name GI7EBA~1 will be used. From there, collisions are handled by incrementing the number, shortening the prefix as needed (until ~9999999 is reached, in which case NTFS will not allow the file to be created). We'd also want to handle .gitignore and .gitattributes, which suffer from a similar problem, using the fall-back short names GI250A~1 and GI7D29~1, respectively. To accommodate for that, we could reimplement the hashing algorithm, but it is just safer and simpler to provide the known prefixes. This algorithm has been reverse-engineered and described at https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but still available via https://web.archive.org/. These can be recomputed by running the following Perl script: -- snip -- use warnings; use strict; sub compute_short_name_hash ($) { my $checksum = 0; foreach (split('', $_[0])) { $checksum = ($checksum * 0x25 + ord($_)) & 0xffff; } $checksum = ($checksum * 314159269) & 0xffffffff; $checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000); $checksum -= (($checksum * 1152921497) >> 60) * 1000000007; return scalar reverse sprintf("%x", $checksum & 0xffff); } print compute_short_name_hash($ARGV[0]); -- snap -- E.g., running that with the argument ".gitignore" will result in "250a" (which then becomes "gi250a" in the code). Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Jeff King <peff@peff.net>	2018-05-21 23:50:11 -04:00
Nguyễn Thái Ngọc Duy	65b5f9483e	Move declaration for alias.c to alias.h Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-21 13:23:14 +09:00
Stefan Beller	0437a2e365	cache: convert get_graft_file to handle arbitrary repositories This conversion was done without the #define trick used in the earlier series refactoring to have better repository access, because this function is easy to review, as all lines are converted and it has only one caller. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-18 08:13:10 +09:00
Stefan Beller	cbd53a2193	object-store: move object access functions to object-store.h This should make these functions easier to find and cache.h less overwhelming to read. In particular, this moves: - read_object_file - oid_object_info - write_object_file As a result, most of the codebase needs to #include object-store.h. In this patch the #include is only added to files that would fail to compile otherwise. It would be better to #include wherever identifiers from the header are used. That can happen later when we have better tooling for it. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:42:03 +09:00
Stefan Beller	14ba97f81c	alloc: allow arbitrary repositories for alloc functions We have to convert all of the alloc functions at once, because alloc_report uses a funky macro for reporting. It is better for the sake of mechanical conversion to convert multiple functions at once rather than changing the structure of the reporting function. We record all memory allocation in alloc.c, and free them in clear_alloc_state, which is called for all repositories except the_repository. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-16 11:16:50 +09:00
Nguyễn Thái Ngọc Duy	011b648646	pack-format.txt: more details on pack file format The current document mentions OBJ_* constants without their actual values. A git developer would know these are from cache.h but that's not very friendly to a person who wants to read this file to implement a pack file parser. Similarly, the deltified representation is not documented at all (the "document" is basically patch-delta.c). Translate that C code to English with a bit more about what ofs-delta and ref-delta mean. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-13 10:20:03 +09:00
Stefan Beller	dd5d9deb01	alloc: add repository argument to alloc_commit_index This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	17bfe87369	alloc: add repository argument to alloc_report This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	13e3fdcb76	alloc: add repository argument to alloc_object_node This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	a0bd9086bb	alloc: add repository argument to alloc_tag_node This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	8ba0e5ec57	alloc: add repository argument to alloc_commit_node This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	cf7203bdc6	alloc: add repository argument to alloc_tree_node This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Stefan Beller	f0de1d62ae	alloc: add repository argument to alloc_blob_node This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-09 12:12:36 +09:00
Junio C Hamano	174774cd51	Merge branch 'sb/object-store-replace' The effort to pass the repository in-core structure throughout the API continues. This round deals with the code that implements the refs/replace/ mechanism. * sb/object-store-replace: replace-object: allow lookup_replace_object to handle arbitrary repositories replace-object: allow do_lookup_replace_object to handle arbitrary repositories replace-object: allow prepare_replace_object to handle arbitrary repositories refs: allow for_each_replace_ref to handle arbitrary repositories refs: store the main ref store inside the repository struct replace-object: add repository argument to lookup_replace_object replace-object: add repository argument to do_lookup_replace_object replace-object: add repository argument to prepare_replace_object refs: add repository argument to for_each_replace_ref refs: add repository argument to get_main_ref_store replace-object: check_replace_refs is safe in multi repo environment replace-object: eliminate replace objects prepared flag object-store: move lookup_replace_object to replace-object.h replace-object: move replace_map to object store replace_object: use oidmap	2018-05-08 15:59:21 +09:00
Junio C Hamano	b10edb2df5	Merge branch 'ds/commit-graph' Precompute and store information necessary for ancestry traversal in a separate file to optimize graph walking. * ds/commit-graph: commit-graph: implement "--append" option commit-graph: build graph from starting commits commit-graph: read only from specific pack-indexes commit: integrate commit graph with commit parsing commit-graph: close under reachability commit-graph: add core.commitGraph setting commit-graph: implement git commit-graph read commit-graph: implement git-commit-graph write commit-graph: implement write_commit_graph() commit-graph: create git-commit-graph builtin graph: add commit graph design document commit-graph: add format document csum-file: refactor finalize_hashfile() method csum-file: rename hashclose() to finalize_hashfile()	2018-05-08 15:59:20 +09:00
Junio C Hamano	92034a9cd5	Merge branch 'dj/runtime-prefix' A build-time option has been added to allow Git to be told to refer to its associated files relative to the main binary, in the same way that has been possible on Windows for quite some time, for Linux, BSDs and Darwin. * dj/runtime-prefix: Makefile: quote $INSTLIBDIR when passing it to sed Makefile: remove unused @@PERLLIBDIR@@ substitution variable mingw/msvc: use the new-style RUNTIME_PREFIX helper exec_cmd: provide a new-style RUNTIME_PREFIX helper for Windows exec_cmd: RUNTIME_PREFIX on some POSIX systems Makefile: add Perl runtime prefix support Makefile: generate Perl header from template file	2018-05-08 15:59:17 +09:00
brian m. carlson	e1ccd7e2b1	sha1_file: only expose empty object constants through git_hash_algo There really isn't any case in which we want to expose the constants for empty trees and blobs outside of using the hash algorithm abstraction. Make these constants static and stop exposing the defines in cache.h. Remove the constants which are no longer in use. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:53 +09:00
brian m. carlson	d8a92ced62	sha1-file: add functions for hex empty tree and blob OIDs Oftentimes, we'll want to refer to an empty tree or empty blob by its hex name without having to call oid_to_hex or explicitly refer to the_hash_algo. Add helper functions that format these values into static buffers and return them for easy use. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:51 +09:00
brian m. carlson	75691ea345	Update struct index_state to use struct object_id Adjust struct index_state to use struct object_id instead of unsigned char [20]. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:50 +09:00
brian m. carlson	6862ebbfcb	sha1-file: convert freshen functions to object_id Convert the various functions for freshening objects and has_loose_object_nonlocal to use struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:49 +09:00
brian m. carlson	c51c39418b	packfile: remove unused member from struct pack_entry The sha1 member in struct pack_entry is unused except for one instance in which we store a value in it. Since nobody ever reads this value, don't bother to compute it and remove the member from struct pack_entry. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:49 +09:00
brian m. carlson	6f13fd0ec6	Remove unused member in struct object_context The tree member of struct object_context is unused except in one place where we write to it. Since there are no users of this member, remove it. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:49 +09:00
brian m. carlson	69d124255e	cache: add a function to read an object ID from a buffer In various places throughout the codebase, we need to read data into a struct object_id from a pack or other unsigned char buffer. Add an inline function that does this based on the current hash algorithm in use, and use it in several places. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-05-02 13:59:48 +09:00
Stefan Beller	9d98354f48	cache.h: allow oid_object_info to handle arbitrary repositories This involves also adapting oid_object_info_extended and a some internal functions that are used to implement these. It all has to happen in one patch, because of a single recursive chain of calls visits all these functions. oid_object_info_extended is also used in partial clones, which allow fetching missing objects. As this series will not add the repository struct to the transport code and fetch_object(), add a TODO note and omit fetching if a user tries to use a partial clone in a repository other than the_repository. Among the functions modified to handle arbitrary repositories, unpack_entry() is one of them. Note that it still references the globals "delta_base_cache" and "delta_base_cached", but those are safe to be referenced (the former is indexed partly by "struct packed_git *", which is repo-specific, and the latter is only used to limit the size of the former as an optimization). Helped-by: Brandon Williams <bmwill@google.com> Helped-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:28 +09:00
Stefan Beller	0df8e96566	cache.h: add repository argument to oid_object_info Add a repository argument to allow the callers of oid_object_info to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Stefan Beller	7ecd869060	cache.h: add repository argument to oid_object_info_extended Add a repository argument to allow oid_object_info_extended callers to be more specific about which repository to act on. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. Signed-off-by: Stefan Beller <sbeller@google.com> Reviewed-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-26 10:54:27 +09:00
Junio C Hamano	ff6eb825f0	Merge branch 'jk/relative-directory-fix' Some codepaths, including the refs API, get and keep relative paths, that go out of sync when the process does chdir(2). The chdir-notify API is introduced to let these codepaths adjust these cached paths to the new current directory. * jk/relative-directory-fix: refs: use chdir_notify to update cached relative paths set_work_tree: use chdir_notify add chdir-notify API trace.c: export trace_setup_key set_git_dir: die when setenv() fails	2018-04-25 13:28:52 +09:00
Nguyễn Thái Ngọc Duy	fd9b1baef8	pack-objects: turn type and in_pack_type to bitfields An extra field type_valid is added to carry the equivalent of OBJ_BAD in the original "type" field. in_pack_type always contains a valid type so we only need 3 bits for it. A note about accepting OBJ_NONE as "valid" type. The function read_object_list_from_stdin() can pass this value [1] and it eventually calls create_object_entry() where current code skip setting "type" field if the incoming type is zero. This does not have any bad side effects because "type" field should be memset()'d anyway. But since we also need to set type_valid now, skipping oe_set_type() leaves type_valid zero/false, which will make oe_type() return OBJ_BAD, not OBJ_NONE anymore. Apparently we do care about OBJ_NONE in prepare_pack(). This switch from OBJ_NONE to OBJ_BAD may trigger fatal: unable to get type of object ... Accepting OBJ_NONE [2] does sound wrong, but this is how it is has been for a very long time and I haven't time to dig in further. [1] See `5c49c11686` (pack-objects: better check_object() performances - 2007-04-16) [2] `21666f1aae` (convert object type handling from a string to a number - 2007-02-26) Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-16 12:38:58 +09:00
Stefan Beller	47f351e9b3	object-store: move lookup_replace_object to replace-object.h lookup_replace_object is a low-level function that most users of the object store do not need to use directly. Move it to replace-object.h to avoid a dependency loop in an upcoming change to its inline definition that will make use of repository.h. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-12 11:38:56 +09:00
Dan Jacques	226c0ddd0d	exec_cmd: RUNTIME_PREFIX on some POSIX systems Enable Git to resolve its own binary location using a variety of OS-specific and generic methods, including: - procfs via "/proc/self/exe" (Linux) - _NSGetExecutablePath (Darwin) - KERN_PROC_PATHNAME sysctl on BSDs. - argv0, if absolute (all, including Windows). This is used to enable RUNTIME_PREFIX support for non-Windows systems, notably Linux and Darwin. When configured with RUNTIME_PREFIX, Git will do a best-effort resolution of its executable path and automatically use this as its "exec_path" for relative helper and data lookups, unless explicitly overridden. Small incidental formatting cleanup of "exec_cmd.c". Signed-off-by: Dan Jacques <dnj@google.com> Thanks-to: Robbie Iannucci <iannucci@google.com> Thanks-to: Junio C Hamano <gitster@pobox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-11 18:10:28 +09:00
Junio C Hamano	cf0b1793ea	Merge branch 'sb/object-store' Refactoring the internal global data structure to make it possible to open multiple repositories, work with and then close them. Rerolled by Duy on top of a separate preliminary clean-up topic. The resulting structure of the topics looked very sensible. * sb/object-store: (27 commits) sha1_file: allow sha1_loose_object_info to handle arbitrary repositories sha1_file: allow map_sha1_file to handle arbitrary repositories sha1_file: allow map_sha1_file_1 to handle arbitrary repositories sha1_file: allow open_sha1_file to handle arbitrary repositories sha1_file: allow stat_sha1_file to handle arbitrary repositories sha1_file: allow sha1_file_name to handle arbitrary repositories sha1_file: add repository argument to sha1_loose_object_info sha1_file: add repository argument to map_sha1_file sha1_file: add repository argument to map_sha1_file_1 sha1_file: add repository argument to open_sha1_file sha1_file: add repository argument to stat_sha1_file sha1_file: add repository argument to sha1_file_name sha1_file: allow prepare_alt_odb to handle arbitrary repositories sha1_file: allow link_alt_odb_entries to handle arbitrary repositories sha1_file: add repository argument to prepare_alt_odb sha1_file: add repository argument to link_alt_odb_entries sha1_file: add repository argument to read_info_alternates sha1_file: add repository argument to link_alt_odb_entry sha1_file: add raw_object_store argument to alt_odb_usable pack: move approximate object count to object store ...	2018-04-11 13:09:55 +09:00
Derrick Stolee	1b70dfd594	commit-graph: add core.commitGraph setting The commit graph feature is controlled by the new core.commitGraph config setting. This defaults to 0, so the feature is opt-in. The intention of core.commitGraph is that a user can always stop checking for or parsing commit graph files if core.commitGraph=0. Signed-off-by: Derrick Stolee <dstolee@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-11 10:43:01 +09:00
Junio C Hamano	0873c393c7	Merge branch 'nd/remove-ignore-env-field' Code clean-up for the "repository" abstraction. * nd/remove-ignore-env-field: repository.h: add comment and clarify repo_set_gitdir repository: delete ignore_env member sha1_file.c: move delayed getenv(altdb) back to setup_git_env() repository.c: delete dead functions repository.c: move env-related setup code back to environment.c repository: initialize the_repository in main()	2018-04-10 16:28:20 +09:00
Junio C Hamano	a5bbc29994	Merge branch 'bc/object-id' Conversion from uchar[20] to struct object_id continues. * bc/object-id: (36 commits) convert: convert to struct object_id sha1_file: introduce a constant for max header length Convert lookup_replace_object to struct object_id sha1_file: convert read_sha1_file to struct object_id sha1_file: convert read_object_with_reference to object_id tree-walk: convert tree entry functions to object_id streaming: convert istream internals to struct object_id tree-walk: convert get_tree_entry_follow_symlinks internals to object_id builtin/notes: convert static functions to object_id builtin/fmt-merge-msg: convert remaining code to object_id sha1_file: convert sha1_object_info* to object_id Convert remaining callers of sha1_object_info_extended to object_id packfile: convert unpack_entry to struct object_id sha1_file: convert retry_bad_packed_offset to struct object_id sha1_file: convert assert_sha1_type to object_id builtin/mktree: convert to struct object_id streaming: convert open_istream to use struct object_id sha1_file: convert check_sha1_signature to struct object_id sha1_file: convert read_loose_object to use struct object_id builtin/index-pack: convert struct ref_delta_entry to object_id ...	2018-04-10 08:25:45 +09:00
Junio C Hamano	5d806b74d5	Merge branch 'ti/fetch-everything-local-optim' A "git fetch" from a repository with insane number of refs into a repository that is already up-to-date still wasted too many cycles making many lstat(2) calls to see if these objects at the tips exist as loose objects locally. These lstat(2) calls are optimized away by enumerating all loose objects beforehand. It is unknown if the new strategy negatively affects existing use cases, fetching into a repository with many loose objects from a repository with small number of refs. * ti/fetch-everything-local-optim: fetch-pack.c: use oidset to check existence of loose object	2018-04-10 08:25:43 +09:00
Jeff King	48988c4d0c	set_git_dir: die when setenv() fails The set_git_dir() function returns an error if setenv() fails, but there are zero callers who pay attention to this return value. If this ever were to happen, it could cause confusing results, as sub-processes would see a potentially stale GIT_DIR (e.g., if it is relative and we chdir()-ed to the root of the working tree). We _could_ try to fix each caller, but there's really nothing useful to do after this failure except die. Let's just lump setenv() failure into the same category as malloc failure: things that should never happen and cause us to abort catastrophically. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-30 12:49:57 -07:00
Stefan Beller	e35454fa62	sha1_file: add repository argument to map_sha1_file Add a repository argument to allow map_sha1_file callers to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. While at it, move the declaration to object-store.h, where it should be easier to find. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:55 -07:00
Stefan Beller	cf78ae4f3d	sha1_file: add repository argument to sha1_file_name Add a repository argument to allow sha1_file_name callers to be more specific about which repository to handle. This is a small mechanical change; it doesn't change the implementation to handle repositories other than the_repository yet. As with the previous commits, use a macro to catch callers passing a repository other than the_repository at compile time. While at it, move the declaration to object-store.h, where it should be easier to find. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:55 -07:00
Stefan Beller	a80d72db2a	object-store: move packed_git and packed_git_mru to object store In a process with multiple repositories open, packfile accessors should be associated to a single repository and not shared globally. Move packed_git and packed_git_mru into the_repository and adjust callers to reflect this. [nd: while at there, wrap access to these two fields in get_packed_git() and get_packed_git_mru(). This allows us to lazily initialize these fields without caller doing that explicitly] Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-26 10:05:46 -07:00
Stefan Beller	0d4a132144	object-store: migrate alternates struct and functions from cache.h Migrate the struct alternate_object_database and all its related functions to the object store as these functions are easier found in that header. The migration is just a verbatim copy, no need to include the object store header at any C file, because cache.h includes repository.h which in turn includes the object-store.h Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-23 11:06:01 -07:00
Junio C Hamano	b0e0fc267b	Merge branch 'tg/split-index-fixes' into maint The split-index mode had a few corner case bugs fixed. * tg/split-index-fixes: travis: run tests with GIT_TEST_SPLIT_INDEX split-index: don't write cache tree with null oid entries read-cache: fix reading the shared index for other repos	2018-03-22 14:24:10 -07:00
Junio C Hamano	beb2cdf504	Merge branch 'ma/skip-writing-unchanged-index' Internal API clean-up to allow write_locked_index() optionally skip writing the in-core index when it is not modified. * ma/skip-writing-unchanged-index: write_locked_index(): add flag to avoid writing unchanged index	2018-03-21 11:30:10 -07:00
Takuto Ikuta	024aa4696c	fetch-pack.c: use oidset to check existence of loose object When fetching from a repository with large number of refs, because to check existence of each refs in local repository to packed and loose objects, 'git fetch' ends up doing a lot of lstat(2) to non-existing loose form, which makes it slow. Instead of making as many lstat(2) calls as the refs the remote side advertised to see if these objects exist in the loose form, first enumerate all the existing loose objects in hashmap beforehand and use it to check existence of them if the number of refs is larger than the number of loose objects. With this patch, the number of lstat(2) calls in `git fetch` is reduced from 411412 to 13794 for chromium repository, it has more than 480000 remote refs. I took time stat of `git fetch` when fetch-pack happens for chromium repository 3 times on linux with SSD. * with this patch 8.105s 8.309s 7.640s avg: 8.018s * master 12.287s 11.175s 12.227s avg: 11.896s On my MacBook Air which has slower lstat(2). * with this patch 14.501s * master 1m16.027s `git fetch` on slow disk will be improved largely. Signed-off-by: Takuto Ikuta <tikuta@chromium.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 11:17:26 -07:00
brian m. carlson	b383a13cc0	Convert lookup_replace_object to struct object_id Convert both the argument and the return value to be pointers to struct object_id. Update the callers and their internals to deal with the new type. Remove several temporaries which are no longer needed. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
brian m. carlson	b4f5aca40e	sha1_file: convert read_sha1_file to struct object_id Convert read_sha1_file to take a pointer to struct object_id and rename it read_object_file. Do the same for read_sha1_file_extended. Convert one use in grep.c to use the new function without any other code change, since the pointer being passed is a void pointer that is already initialized with a pointer to struct object_id. Update the declaration and definitions of the modified functions, and apply the following semantic patch to convert the remaining callers: @@ expression E1, E2, E3; @@ - read_sha1_file(E1.hash, E2, E3) + read_object_file(&E1, E2, E3) @@ expression E1, E2, E3; @@ - read_sha1_file(E1->hash, E2, E3) + read_object_file(E1, E2, E3) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1.hash, E2, E3, E4) + read_object_file_extended(&E1, E2, E3, E4) @@ expression E1, E2, E3, E4; @@ - read_sha1_file_extended(E1->hash, E2, E3, E4) + read_object_file_extended(E1, E2, E3, E4) Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
brian m. carlson	02f0547eaa	sha1_file: convert read_object_with_reference to object_id Convert read_object_with_reference to take pointers to struct object_id. Update the internals of the function accordingly. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:50 -07:00
brian m. carlson	abef9020e3	sha1_file: convert sha1_object_info* to object_id Convert sha1_object_info and sha1_object_info_extended to take pointers to struct object_id and rename them to use "oid" instead of "sha1" in their names. Update the declaration and definition and apply the following semantic patch, plus the standard object_id transforms: @@ expression E1, E2; @@ - sha1_object_info(E1.hash, E2) + oid_object_info(&E1, E2) @@ expression E1, E2; @@ - sha1_object_info(E1->hash, E2) + oid_object_info(E1, E2) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1.hash, E2, E3) + oid_object_info_extended(&E1, E2, E3) @@ expression E1, E2, E3; @@ - sha1_object_info_extended(E1->hash, E2, E3) + oid_object_info_extended(E1, E2, E3) Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
brian m. carlson	e816caa07b	sha1_file: convert assert_sha1_type to object_id Convert this function to take a pointer to struct object_id and rename it to assert_oid_type. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
brian m. carlson	17e65451e3	sha1_file: convert check_sha1_signature to struct object_id Convert this function to take a pointer to struct object_id and rename it check_object_signature. Introduce temporaries to convert the return values of lookup_replace_object and lookup_replace_object_extended into struct object_id. The temporaries are needed because in order to convert lookup_replace_object, open_istream needs to be converted, and open_istream needs check_sha1_signature to be converted, causing a loop of dependencies. The temporaries will be removed in a future patch. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:49 -07:00
brian m. carlson	d61d87bd15	sha1_file: convert read_loose_object to use struct object_id Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:48 -07:00
brian m. carlson	aab9583f7b	Convert find_unique_abbrev* to struct object_id Convert find_unique_abbrev and find_unique_abbrev_r to each take a pointer to struct object_id. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-14 09:23:48 -07:00
Junio C Hamano	169c9c0169	Merge branch 'bw/c-plus-plus' Avoid using identifiers that clash with C++ keywords. Even though it is not a goal to compile Git with C++ compilers, changes like this help use of code analysis tools that targets C++ on our codebase. * bw/c-plus-plus: (37 commits) replace: rename 'new' variables trailer: rename 'template' variables tempfile: rename 'template' variables wrapper: rename 'template' variables environment: rename 'namespace' variables diff: rename 'template' variables environment: rename 'template' variables init-db: rename 'template' variables unpack-trees: rename 'new' variables trailer: rename 'new' variables submodule: rename 'new' variables split-index: rename 'new' variables remote: rename 'new' variables ref-filter: rename 'new' variables read-cache: rename 'new' variables line-log: rename 'new' variables imap-send: rename 'new' variables http: rename 'new' variables entry: rename 'new' variables diffcore-delta: rename 'new' variables ...	2018-03-06 14:54:07 -08:00
Nguyễn Thái Ngọc Duy	357a03ebe9	repository.c: move env-related setup code back to environment.c It does not make sense that generic repository code contains handling of environment variables, which are specific for the main repository only. Refactor repo_set_gitdir() function to take $GIT_DIR and optionally _all_ other customizable paths. These optional paths can be NULL and will be calculated according to the default directory layout. Note that some dead functions are left behind to reduce diff noise. They will be deleted in the next patch. Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-05 11:14:03 -08:00
Martin Ågren	610008146e	write_locked_index(): add flag to avoid writing unchanged index We have several callers like if (active_cache_changed && write_locked_index(...)) handle_error(); rollback_lock_file(...); where the final rollback is needed because "!active_cache_changed" shortcuts the if-expression. There are also a few variants of this, including some if-else constructs that make it more clear when the explicit rollback is really needed. Teach `write_locked_index()` to take a new flag SKIP_IF_UNCHANGED and simplify the callers. Leave the most complicated of the callers (in builtin/update-index.c) unchanged. Rewriting it to use this new flag would end up duplicating logic. We could have made the new flag behave the other way round ("FORCE_WRITE"), but that could break existing users behind their backs. Let's take the more conservative approach. We can still migrate existing callers to use our new flag. Later we might even be able to flip the default, possibly without entirely ignoring the risk to in-flight or out-of-tree topics. Suggested-by: Jeff King <peff@peff.net> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-03-01 13:28:01 -08:00
Brandon Williams	a63b5fca9b	environment: rename 'template' variables Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-22 10:08:05 -08:00
Junio C Hamano	0fd90daba8	Merge branch 'bc/hash-algo' More abstraction of hash function from the codepath. * bc/hash-algo: hash: update obsolete reference to SHA1_HEADER bulk-checkin: abstract SHA-1 usage csum-file: abstract uses of SHA-1 csum-file: rename sha1file to hashfile read-cache: abstract away uses of SHA-1 pack-write: switch various SHA-1 values to abstract forms pack-check: convert various uses of SHA-1 to abstract forms fast-import: switch various uses of SHA-1 to the_hash_algo sha1_file: switch uses of SHA-1 to the_hash_algo builtin/unpack-objects: switch uses of SHA-1 to the_hash_algo builtin/index-pack: improve hash function abstraction hash: create union for hash context allocation hash: move SHA-1 macros to hash.h	2018-02-15 14:55:47 -08:00
Junio C Hamano	8be8342b4c	Merge branch 'po/object-id' Conversion from uchar[20] to struct object_id continues. * po/object-id: sha1_file: rename hash_sha1_file_literally sha1_file: convert write_loose_object to object_id sha1_file: convert force_object_loose to object_id sha1_file: convert write_sha1_file to object_id notes: convert write_notes_tree to object_id notes: convert combine_notes_* to object_id commit: convert commit_tree* to object_id match-trees: convert splice_tree to object_id cache: clear whole hash buffer with oidclr sha1_file: convert hash_sha1_file to object_id dir: convert struct sha1_stat to use object_id sha1_file: convert pretend_sha1_file to object_id	2018-02-15 14:55:43 -08:00
Brandon Williams	6ca32f4714	object_info: change member name from 'typename' to 'type_name' Rename C++ keyword in order to bring the codebase closer to being able to be compiled with a C++ compiler. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-14 13:10:05 -08:00
Junio C Hamano	e75c862125	Merge branch 'tg/split-index-fixes' The split-index mode had a few corner case bugs fixed. * tg/split-index-fixes: travis: run tests with GIT_TEST_SPLIT_INDEX split-index: don't write cache tree with null oid entries read-cache: fix reading the shared index for other repos	2018-02-13 13:39:13 -08:00
Junio C Hamano	9238941618	Merge branch 'cc/sha1-file-name' Code clean-up. * cc/sha1-file-name: sha1_file: improve sha1_file_name() perfs sha1_file: remove static strbuf from sha1_file_name()	2018-02-13 13:39:10 -08:00
Junio C Hamano	867622398f	Merge branch 'gs/retire-mru' Retire mru API as it does not give enough abstraction over underlying list API to be worth it. * gs/retire-mru: mru: Replace mru.[ch] with list.h implementation	2018-02-13 13:39:06 -08:00
Junio C Hamano	6bed209a20	Merge branch 'jh/partial-clone' The machinery to clone & fetch, which in turn involves packing and unpacking objects, have been told how to omit certain objects using the filtering mechanism introduced by the jh/object-filtering topic, and also mark the resulting pack as a promisor pack to tolerate missing objects, taking advantage of the mechanism introduced by the jh/fsck-promisors topic. * jh/partial-clone: t5616: test bulk prefetch after partial fetch fetch: inherit filter-spec from partial clone t5616: end-to-end tests for partial clone fetch-pack: restore save_commit_buffer after use unpack-trees: batch fetching of missing blobs clone: partial clone partial-clone: define partial clone settings in config fetch: support filters fetch: refactor calculation of remote list fetch-pack: test support excluding large blobs fetch-pack: add --no-filter fetch-pack, index-pack, transport: partial clone upload-pack: add object filtering for partial clone	2018-02-13 13:39:04 -08:00
Junio C Hamano	f3d618d2bf	Merge branch 'jh/fsck-promisors' In preparation for implementing narrow/partial clone, the machinery for checking object connectivity used by gc and fsck has been taught that a missing object is OK when it is referenced by a packfile specially marked as coming from trusted repository that promises to make them available on-demand and lazily. * jh/fsck-promisors: gc: do not repack promisor packfiles rev-list: support termination at promisor objects sha1_file: support lazily fetching missing objects introduce fetch-object: fetch one promisor object index-pack: refactor writing of .keep files fsck: support promisor objects as CLI argument fsck: support referenced promisor objects fsck: support refs pointing to promisor objects fsck: introduce partialclone extension extension.partialclone: introduce partial clone extension	2018-02-13 13:39:03 -08:00
brian m. carlson	164e716330	hash: move SHA-1 macros to hash.h Most of the other code dealing with SHA-1 and other hashes is located in hash.h, which is in turn loaded by cache.h. Move the SHA-1 macros to hash.h as well, so we can use them in additional hash-related items in the future. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-02 11:28:40 -08:00
Patryk Obara	1752cbbc44	sha1_file: rename hash_sha1_file_literally This function was already converted to use struct object_id earlier. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Patryk Obara	4bdb70a4f7	sha1_file: convert force_object_loose to object_id Convert the definition and declaration of force_object_loose to struct object_id and adjust usage of this function. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Patryk Obara	a09c985eae	sha1_file: convert write_sha1_file to object_id Convert the definition and declaration of write_sha1_file to struct object_id and adjust usage of this function. This commit also converts static function write_sha1_file_prepare, as it is closely related. Rename these functions to write_object_file and write_object_file_prepare respectively. Replace sha1_to_hex, hashcpy and hashclr with their oid equivalents wherever possible. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Patryk Obara	97a41a0c01	cache: clear whole hash buffer with oidclr As long as GIT_SHA1_RAWSZ is equal to GIT_MAX_RAWSZ there's no problem, but when new hashing algorithm will be in place this memset will clear only 20-byte prefix of hash buffer. Alternatively, hashclr implementation could be adjusted, but this function is almost removed from codebase already. Separate implementation of oidclr prevents potential buffer overrun in case someone incorrectly used hashclr on object_id in future. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Patryk Obara	f070faccc1	sha1_file: convert hash_sha1_file to object_id Convert the declaration and definition of hash_sha1_file to use struct object_id and adjust all function calls. Rename this function to hash_object_file. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Patryk Obara	4b33e60201	dir: convert struct sha1_stat to use object_id Convert the declaration of struct sha1_stat. Adjust all usages of this struct and replace hash{clr,cmp,cpy} with oid{clr,cmp,cpy} wherever possible. Rename it to struct oid_stat. Rename static function load_sha1_stat to load_oid_stat. Remove macro EMPTY_BLOB_SHA1_BIN, as it's no longer used. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:36 -08:00
Patryk Obara	829e5c3b92	sha1_file: convert pretend_sha1_file to object_id Convert the declaration and definition of pretend_sha1_file to use struct object_id and adjust all usages of this function. Rename it to pretend_object_file. Signed-off-by: Patryk Obara <patryk.obara@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-30 10:42:35 -08:00
Gargi Sharma	ec2dd32c70	mru: Replace mru.[ch] with list.h implementation Replace the custom calls to mru.[ch] with calls to list.h. This patch is the final step in removing the mru API completely and inlining the logic. This patch leads to significant code reduction and the mru API hence, is not a useful abstraction anymore. Signed-off-by: Gargi Sharma <gs051095@gmail.com> Reviewed-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-24 09:52:16 -08:00
Thomas Gummerer	4bddd98311	split-index: don't write cache tree with null oid entries In `a96d3cc3f6` ("cache-tree: reject entries with null sha1", 2017-04-21) we made sure that broken cache entries do not get propagated to new trees. Part of that was making sure not to re-use an existing cache tree that includes a null oid. It did so by dropping the cache tree in 'do_write_index()' if one of the entries contains a null oid. In split index mode however, there are two invocations to 'do_write_index()', one for the shared index and one for the split index. The cache tree is only written once, to the split index. As we only loop through the elements that are effectively being written by the current invocation, that may not include the entry with a null oid in the split index (when it is already written to the shared index), where we write the cache tree. Therefore in split index mode we may still end up writing the cache tree, even though there is an entry with a null oid in the index. Fix this by checking for null oids in prepare_to_write_split_index, where we loop the entries of the shared index as well as the entries for the split index. This fixes t7009 with GIT_TEST_SPLIT_INDEX. Also add a new test that's more specifically showing the problem. Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-19 10:36:39 -08:00
Thomas Gummerer	a125a22334	read-cache: fix reading the shared index for other repos read_index_from() takes a path argument for the location of the index file. For reading the shared index in split index mode however it just ignores that path argument, and reads it from the gitdir of the current repository. This works as long as an index in the_repository is read. Once that changes, such as when we read the index of a submodule, or of a different working tree than the current one, the gitdir of the_repository will no longer contain the appropriate shared index, and git will fail to read it. For example t3007-ls-files-recurse-submodules.sh was broken with GIT_TEST_SPLIT_INDEX set in `188dce131f` ("ls-files: use repository object", 2017-06-22), and t7814-grep-recurse-submodules.sh was also broken in a similar manner, probably by introducing struct repository there, although I didn't track down the exact commit for that. `be489d02d2` ("revision.c: --indexed-objects add objects from all worktrees", 2017-08-23) breaks with split index mode in a similar manner, not erroring out when it can't read the index, but instead carrying on with pruning, without taking the index of the worktree into account. Fix this by passing an additional gitdir parameter to read_index_from, to indicate where it should look for and read the shared index from. read_cache_from() defaults to using the gitdir of the_repository. As it is mostly a convenience macro, having to pass get_git_dir() for every call seems overkill, and if necessary users can have more control by using read_index_from(). Helped-by: Brandon Williams <bmwill@google.com> Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-19 10:36:34 -08:00
Christian Couder	ea6577303f	sha1_file: remove static strbuf from sha1_file_name() Using a static buffer in sha1_file_name() is error prone and the performance improvements it gives are not needed in many of the callers. So let's get rid of this static buffer and, if necessary or helpful, let's use one in the caller. Suggested-by: Jeff Hostetler <git@jeffhostetler.com> Helped-by: Kevin Daudt <me@ikke.info> Signed-off-by: Christian Couder <chriscool@tuxfamily.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-17 12:21:32 -08:00
Junio C Hamano	b6825b5c8e	Merge branch 'ew/empty-merge-with-dirty-index-maint' into ew/empty-merge-with-dirty-index * ew/empty-merge-with-dirty-index-maint: merge-recursive: avoid incorporating uncommitted changes in a merge move index_has_changes() from builtin/am.c to merge.c for reuse t6044: recursive can silently incorporate dirty changes in a merge	2017-12-22 12:48:38 -08:00
Elijah Newren	b101793c43	move index_has_changes() from builtin/am.c to merge.c for reuse index_has_changes() is a function we want to reuse outside of just am, making it also available for merge-recursive and merge-ort. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-22 12:20:29 -08:00
Junio C Hamano	0c69a132cb	Merge branch 'ls/editor-waiting-message' Git shows a message to tell the user that it is waiting for the user to finish editing when spawning an editor, in case the editor opens to a hidden window or somewhere obscure and the user gets lost. * ls/editor-waiting-message: launch_editor(): indicate that Git waits for user input refactor "dumb" terminal determination	2017-12-19 11:33:59 -08:00
Junio C Hamano	8d7fefaac4	Merge branch 'ar/unconfuse-three-dots' Ancient part of codebase still shows dots after an abbreviated object name just to show that it is not a full object name, but these ellipses are confusing to people who newly discovered Git who are used to seeing abbreviated object names and find them confusing with the range syntax. * ar/unconfuse-three-dots: t2020: test variations that matter t4013: test new output from diff --abbrev --raw diff: diff_aligned_abbrev: remove ellipsis after abbreviated SHA-1 value t4013: prepare for upcoming "diff --raw --abbrev" output format change checkout: describe_detached_head: remove ellipsis after committish print_sha1_ellipsis: introduce helper Documentation: user-manual: limit usage of ellipsis Documentation: revisions: fix typo: "three dot" ---> "three-dot" (in line with "two-dot").	2017-12-19 11:33:58 -08:00
Junio C Hamano	721cc4314c	Merge branch 'bc/hash-algo' An infrastructure to define what hash function is used in Git is introduced, and an effort to plumb that throughout various codepaths has been started. * bc/hash-algo: repository: fix a sparse 'using integer as NULL pointer' warning Switch empty tree and blob lookups to use hash abstraction Integrate hash algorithm support with repo setup Add structure representing hash algorithm setup: expose enumerated repo info	2017-12-13 13:28:54 -08:00
Jeff Hostetler	1e1e39b308	partial-clone: define partial clone settings in config Create get and set routines for "partial clone" config settings. These will be used in a future commit by clone and fetch to remember the promisor remote and the default filter-spec. Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:58:51 -08:00
Jonathan Tan	8b4c0103a9	sha1_file: support lazily fetching missing objects Teach sha1_file to fetch objects from the remote configured in extensions.partialclone whenever an object is requested but missing. The fetching of objects can be suppressed through a global variable. This is used by fsck and index-pack. However, by default, such fetching is not suppressed. This is meant as a temporary measure to ensure that all Git commands work in such a situation. Future patches will update some commands to either tolerate missing objects (without fetching them) or be more efficient in fetching them. In order to determine the code changes in sha1_file.c necessary, I investigated the following: (1) functions in sha1_file.c that take in a hash, without the user regarding how the object is stored (loose or packed) (2) functions in packfile.c (because I need to check callers that know about the loose/packed distinction and operate on both differently, and ensure that they can handle the concept of objects that are neither loose nor packed) (1) is handled by the modification to sha1_object_info_extended(). For (2), I looked at for_each_packed_object and others. For for_each_packed_object, the callers either already work or are fixed in this patch: - reachable - only to find recent objects - builtin/fsck - already knows about missing objects - builtin/cat-file - warning message added in this commit Callers of the other functions do not need to be changed: - parse_pack_index - http - indirectly from http_get_info_packs - find_pack_entry_one - this searches a single pack that is provided as an argument; the caller already knows (through other means) that the sought object is in a specific pack - find_sha1_pack - fast-import - appears to be an optimization to not store a file if it is already in a pack - http-walker - to search through a struct alt_base - http-push - to search through remote packs - has_sha1_pack - builtin/fsck - already knows about promisor objects - builtin/count-objects - informational purposes only (check if loose object is also packed) - builtin/prune-packed - check if object to be pruned is packed (if not, don't prune it) - revision - used to exclude packed objects if requested by user - diff - just for optimization Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-08 09:52:42 -08:00
Junio C Hamano	4c6dad0059	Merge branch 'bw/protocol-v1' A new mechanism to upgrade the wire protocol in place is proposed and demonstrated that it works with the older versions of Git without harming them. * bw/protocol-v1: Documentation: document Extra Parameters ssh: introduce a 'simple' ssh variant i5700: add interop test for protocol transition http: tell server that the client understands v1 connect: tell server that the client understands v1 connect: teach client to recognize v1 server response upload-pack, receive-pack: introduce protocol version 1 daemon: recognize hidden request arguments protocol: introduce protocol extension mechanisms pkt-line: add packet_write function connect: in ref advertisement, shallows are last	2017-12-06 09:23:44 -08:00
Jonathan Tan	498f1f61f1	fsck: introduce partialclone extension Currently, Git does not support repos with very large numbers of objects or repos that wish to minimize manipulation of certain blobs (for example, because they are very large) very well, even if the user operates mostly on part of the repo, because Git is designed on the assumption that every referenced object is available somewhere in the repo storage. In such an arrangement, the full set of objects is usually available in remote storage, ready to be lazily downloaded. Teach fsck about the new state of affairs. In this commit, teach fsck that missing promisor objects referenced from the reflog are not an error case; in future commits, fsck will be taught about other cases. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-05 09:46:05 -08:00
Jonathan Tan	75b97fec17	extension.partialclone: introduce partial clone extension Introduce new repository extension option: `extensions.partialclone` See the update to Documentation/technical/repository-version.txt in this patch for more information. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-05 09:46:05 -08:00
Lars Schneider	a64f213d3f	refactor "dumb" terminal determination Move the code to detect "dumb" terminals into a single location. This avoids duplicating the terminal detection code yet again in a subsequent commit. Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-04 09:38:30 -08:00
Ann T Ropea	a2cd709de3	print_sha1_ellipsis: introduce helper Introduce a helper print_sha1_ellipsis() that pays attention to the GIT_PRINT_SHA1_ELLIPSIS environment variable, and prepare the tests to unconditionally set it for the test pieces that will be broken once the code stops showing the extra dots by default. The removal of these dots is merely a plan at this step and has not happened yet but soon will. Document GIT_PRINT_SHA1_ELLIPSIS. Signed-off-by: Ann T Ropea <bedhanger@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-12-04 08:25:35 -08:00
Junio C Hamano	af6e0fe3a5	Merge branch 'tb/add-renormalize' "git add --renormalize ." is a new and safer way to record the fact that you are correcting the end-of-line convention and other "convert_to_git()" glitches in the in-repository data. * tb/add-renormalize: add: introduce "--renormalize"	2017-11-27 11:06:37 +09:00
Junio C Hamano	c9fdbca92c	Merge branch 'av/fsmonitor' Various fixes to bp/fsmonitor topic. * av/fsmonitor: fsmonitor: simplify determining the git worktree under Windows fsmonitor: store fsmonitor bitmap before splitting index fsmonitor: read from getcwd(), not the PWD environment variable fsmonitor: delay updating state until after split index is merged fsmonitor: document GIT_TRACE_FSMONITOR fsmonitor: don't bother pretty-printing JSON from watchman fsmonitor: set the PWD to the top of the working tree	2017-11-21 14:07:51 +09:00
Junio C Hamano	e05336bdda	Merge branch 'bp/fsmonitor' We learned to talk to watchman to speed up "git status" and other operations that need to see which paths have been modified. * bp/fsmonitor: fsmonitor: preserve utf8 filenames in fsmonitor-watchman log fsmonitor: read entirety of watchman output fsmonitor: MINGW support for watchman integration fsmonitor: add a performance test fsmonitor: add a sample integration script for Watchman fsmonitor: add test cases for fsmonitor extension split-index: disable the fsmonitor extension when running the split index test fsmonitor: add a test tool to dump the index extension update-index: add fsmonitor support to update-index ls-files: Add support in ls-files to display the fsmonitor valid bit fsmonitor: add documentation for the fsmonitor extension. fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. update-index: add a new --force-write-index option preload-index: add override to enable testing preload-index bswap: add 64 bit endianness helper get_be64	2017-11-21 14:07:50 +09:00
Torsten Bögershausen	9472935d81	add: introduce "--renormalize" Make it safer to normalize the line endings in a repository. Files that had been commited with CRLF will be commited with LF. The old way to normalize a repo was like this: # Make sure that there are not untracked files $ echo "* text=auto" >.gitattributes $ git read-tree --empty $ git add . $ git commit -m "Introduce end-of-line normalization" The user must make sure that there are no untracked files, otherwise they would have been added and tracked from now on. The new "add --renormalize" does not add untracked files: $ echo "* text=auto" >.gitattributes $ git add --renormalize . $ git commit -m "Introduce end-of-line normalization" Note that "git add --renormalize <pathspec>" is the short form for "git add -u --renormalize <pathspec>". While at it, document that the same renormalization may be needed, whenever a clean filter is added or changed. Helped-By: Junio C Hamano <gitster@pobox.com> Signed-off-by: Torsten Bögershausen <tboegi@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-17 10:31:05 +09:00
Junio C Hamano	e539a83455	Merge branch 'bp/read-index-from-skip-verification' Drop (perhaps overly cautious) sanity check before using the index read from the filesystem at runtime. * bp/read-index-from-skip-verification: read_index_from(): speed index loading by skipping verification of the entry order	2017-11-15 12:14:37 +09:00
brian m. carlson	eb0ccfd7f5	Switch empty tree and blob lookups to use hash abstraction Switch the uses of empty_tree_oid and empty_blob_oid to use the current_hash abstraction that represents the current hash algorithm in use. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-13 13:20:44 +09:00
brian m. carlson	78a6766802	Integrate hash algorithm support with repo setup In future versions of Git, we plan to support an additional hash algorithm. Integrate the enumeration of hash algorithms with repository setup, and store a pointer to the enumerated data in struct repository. Of course, we currently only support SHA-1, so hard-code this value in read_repository_format. In the future, we'll enumerate this value from the configuration. Add a constant, the_hash_algo, which points to the hash_algo structure pointer in the repository global. Note that this is the hash which is used to serialize data to disk, not the hash which is used to display items to the user. The transition plan anticipates that these may be different. We can add an additional element in the future (say, ui_hash_algo) to provide for this case. Include repository.h in cache.h since we now need to have access to these struct and variable definitions. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-13 13:20:44 +09:00
Junio C Hamano	bde1370010	Merge branch 'rs/hex-to-bytes-cleanup' Code cleanup. * rs/hex-to-bytes-cleanup: sha1_file: use hex_to_bytes() http-push: use hex_to_bytes() notes: move hex_to_bytes() to hex.c and export it	2017-11-09 14:31:27 +09:00
Ben Peart	00ec50e56d	read_index_from(): speed index loading by skipping verification of the entry order There is code in post_read_index_from() to catch out of order entries when reading an index file. This order verification is ~13% of the cost of every call to read_index_from(). Update check_ce_order() so that it skips this verification unless the "verify_ce_order" global variable is set. Teach fsck to force this verification. The effect can be seen using t/perf/p0002-read-cache.sh: Test HEAD HEAD~1 -------------------------------------------------------------------------------------- 0002.1: read_cache/discard_cache 1000 times 0.41(0.04+0.04) 0.50(0.00+0.10) +22.0% Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-08 10:39:41 +09:00
Junio C Hamano	0b646bcac9	Merge branch 'ma/lockfile-fixes' An earlier update made it possible to use an on-stack in-core lockfile structure (as opposed to having to deliberately leak an on-heap one). Many codepaths have been updated to take advantage of this new facility. * ma/lockfile-fixes: read_cache: roll back lock in `update_index_if_able()` read-cache: leave lock in right state in `write_locked_index()` read-cache: drop explicit `CLOSE_LOCK`-flag cache.h: document `write_locked_index()` apply: remove `newfd` from `struct apply_state` apply: move lockfile into `apply_state` cache-tree: simplify locking logic checkout-index: simplify locking logic tempfile: fix documentation on `delete_tempfile()` lockfile: fix documentation on `close_lock_file_gently()` treewide: prefer lockfiles on the stack sha1_file: do not leak `lock_file`	2017-11-06 13:11:21 +09:00
Alex Vandiver	ba1b9caca6	fsmonitor: delay updating state until after split index is merged If the fsmonitor extension is used in conjunction with the split index extension, the set of entries in the index when it is first loaded is only a subset of the real index. This leads to only the non-"base" index being marked as CE_FSMONITOR_VALID. Delay the expansion of the ewah bitmap until after tweak_split_index has been called to merge in the base index as well. The new fsmonitor_dirty is kept from being leaked by dint of being cleaned up in post_read_index_from, which is guaranteed to be called after do_read_index in read_index_from. Signed-off-by: Alex Vandiver <alexmv@dropbox.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 13:28:20 +09:00
René Scharfe	0ec218656a	notes: move hex_to_bytes() to hex.c and export it Make the function for converting pairs of hexadecimal digits to binary available to other call sites. Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-11-01 10:35:35 +09:00
Brandon Williams	19113a26b6	http: tell server that the client understands v1 Tell a server that protocol v1 can be used by sending the http header 'Git-Protocol' with 'version=1' indicating this. Also teach the apache http server to pass through the 'Git-Protocol' header as an environment variable 'GIT_PROTOCOL'. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-17 10:51:29 +09:00
Brandon Williams	373d70efb2	protocol: introduce protocol extension mechanisms Create protocol.{c,h} and provide functions which future servers and clients can use to determine which protocol to use or is being used. Also introduce the 'GIT_PROTOCOL' environment variable which will be used to communicate a colon separated list of keys with optional values to a server. Unknown keys and values must be tolerated. This mechanism is used to communicate which version of the wire protocol a client would like to use with a server. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-17 10:51:29 +09:00
Martin Ågren	b74c90fb41	read_cache: roll back lock in `update_index_if_able()` `update_index_if_able()` used to always commit the lock or roll it back. Commit `03b866477` (read-cache: new API write_locked_index instead of write_index/write_cache, 2014-06-13) stopped rolling it back in case a write was not even attempted. This change in behavior is not motivated in the commit message and appears to be accidental: the `else`-path was removed, although that changed the behavior in case the `if` shortcuts. Reintroduce the rollback and document this behavior. While at it, move the documentation on this function from the function definition to the function declaration in cache.h. If `write_locked_index(..., COMMIT_LOCK)` fails, it will roll back the lock for us (see the previous commit). Noticed-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-07 10:20:56 +09:00
Martin Ågren	df60cf5789	read-cache: leave lock in right state in `write_locked_index()` If the original version of `write_locked_index()` returned with an error, it didn't roll back the lockfile unless the error occured at the very end, during closing/committing. See commit `03b866477` (read-cache: new API write_locked_index instead of write_index/write_cache, 2014-06-13). In commit `9f41c7a6b` (read-cache: close index.lock in do_write_index, 2017-04-26), we learned to close the lock slightly earlier in the callstack. That was mostly a side-effect of lockfiles being implemented using temporary files, but didn't cause any real harm. Recently, commit `076aa2cbd` (tempfile: auto-allocate tempfiles on heap, 2017-09-05) introduced a subtle bug. If the temporary file is deleted (i.e., the lockfile is rolled back), the tempfile-pointer in the `struct lock_file` will be left dangling. Thus, an attempt to reuse the lockfile, or even just to roll it back, will induce undefined behavior -- most likely a crash. Besides not crashing, we clearly want to make things consistent. The guarantees which the lockfile-machinery itself provides is A) if we ask to commit and it fails, roll back, and B) if we ask to close and it fails, do _not_ roll back. Let's do the same for consistency. Do not delete the temporary file in `do_write_index()`. One of its callers, `write_locked_index()` will thereby avoid rolling back the lock. The other caller, `write_shared_index()`, will delete its temporary file anyway. Both of these callers will avoid undefined behavior (crashing). Teach `write_locked_index(..., COMMIT_LOCK)` to roll back the lock before returning. If we have already succeeded and committed, it will be a noop. Simplify the existing callers where we now have a superfluous call to `rollback_lockfile()`. That should keep future readers from wondering why the callers are inconsistent. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-07 10:20:56 +09:00
Martin Ågren	812d6b0075	read-cache: drop explicit `CLOSE_LOCK`-flag `write_locked_index()` takes two flags: `COMMIT_LOCK` and `CLOSE_LOCK`. At most one is allowed. But it is also possible to use no flag, i.e., `0`. But when `write_locked_index()` calls `do_write_index()`, the temporary file, a.k.a. the lockfile, will be closed. So passing `0` is effectively the same as `CLOSE_LOCK`, which seems like a bug. We might feel tempted to restructure the code in order to close the file later, or conditionally. It also feels a bit unfortunate that we simply "happen" to close the lock by way of an implementation detail of lockfiles. But note that we need to close the temporary file before `stat`-ing it, at least on Windows. See `9f41c7a6b` (read-cache: close index.lock in do_write_index, 2017-04-26). Drop `CLOSE_LOCK` and make it explicit that `write_locked_index()` always closes the lock. Whether it is also committed is governed by the remaining flag, `COMMIT_LOCK`. This means we neither have nor suggest that we have a mode to write the index and leave the file open. Whatever extra contents we might eventually want to write, we should probably write it from within `write_locked_index()` itself anyway. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-07 10:20:56 +09:00
Martin Ågren	8dc3834610	cache.h: document `write_locked_index()` The next patches will tweak the behavior of this function. Document it in order to establish a basis for those patches. Signed-off-by: Martin Ågren <martin.agren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-06 10:07:18 +09:00
Junio C Hamano	d4e93836a6	Merge branch 'jk/no-optional-locks' Some commands (most notably "git status") makes an opportunistic update when performing a read-only operation to help optimize later operations in the same repository. The new "--no-optional-locks" option can be passed to Git to disable them. * jk/no-optional-locks: git: add --no-optional-locks option	2017-10-03 15:42:49 +09:00
Ben Peart	883e248b8a	fsmonitor: teach git to optionally utilize a file system monitor to speed up detecting new or changed files. When the index is read from disk, the fsmonitor index extension is used to flag the last known potentially dirty index entries. The registered core.fsmonitor command is called with the time the index was last updated and returns the list of files changed since that time. This list is used to flag any additional dirty cache entries and untracked cache directories. We can then use this valid state to speed up preload_index(), ie_match_stat(), and refresh_cache_ent() as they do not need to lstat() files to detect potential changes for those entries marked CE_FSMONITOR_VALID. In addition, if the untracked cache is turned on valid_cached_dir() can skip checking directories for new or changed files as fsmonitor will invalidate the cache only for those directories that have been identified as having potential changes. To keep the CE_FSMONITOR_VALID state accurate during git operations; when git updates a cache entry to match the current state on disk, it will now set the CE_FSMONITOR_VALID bit. Inversely, anytime git changes a cache entry, the CE_FSMONITOR_VALID bit is cleared and the corresponding untracked cache directory is marked invalid. Signed-off-by: Ben Peart <benpeart@microsoft.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-01 17:23:01 +09:00
Junio C Hamano	14a8168e2f	Merge branch 'rj/no-sign-compare' Many codepaths have been updated to squelch -Wsign-compare warnings. * rj/no-sign-compare: ALLOC_GROW: avoid -Wsign-compare warnings cache.h: hex2chr() - avoid -Wsign-compare warnings commit-slab.h: avoid -Wsign-compare warnings git-compat-util.h: xsize_t() - avoid -Wsign-compare warnings	2017-09-29 11:23:42 +09:00
Jeff King	27344d6a6c	git: add --no-optional-locks option Some tools like IDEs or fancy editors may periodically run commands like "git status" in the background to keep track of the state of the repository. Some of these commands may refresh the index and write out the result in an opportunistic way: if they can get the index lock, then they update the on-disk index with any updates they find. And if not, then their in-core refresh is lost and just has to be recomputed by the next caller. But taking the index lock may conflict with other operations in the repository. Especially ones that the user is doing themselves, which _aren't_ opportunistic. In other words, "git status" knows how to back off when somebody else is holding the lock, but other commands don't know that status would be happy to drop the lock if somebody else wanted it. There are a couple possible solutions: 1. Have some kind of "pseudo-lock" that allows other commands to tell status that they want the lock. This is likely to be complicated and error-prone to implement (and maybe even impossible with just dotlocks to work from, as it requires some inter-process communication). 2. Avoid background runs of commands like "git status" that want to do opportunistic updates, preferring instead plumbing like diff-files, etc. This is awkward for a couple of reasons. One is that "status --porcelain" reports a lot more about the repository state than is available from individual plumbing commands. And two is that we actually _do_ want to see the refreshed index. We just don't want to take a lock or write out the result. Whereas commands like diff-files expect us to refresh the index separately and write it to disk so that they can depend on the result. But that write is exactly what we're trying to avoid. 3. Ask "status" not to lock or write the index. This is easy to implement. The big downside is that any work done in refreshing the index for such a call is lost when the process exits. So a background process may end up re-hashing a changed file multiple times until the user runs a command that does an index refresh themselves. This patch implements the option 3. The idea (and the test) is largely stolen from a Git for Windows patch by Johannes Schindelin, `67e5ce7f63` (status: offer not to lock the index and update it, 2016-08-12). The twist here is that instead of making this an option to "git status", it becomes a "git" option and matching environment variable. The reason there is two-fold: 1. An environment variable is carried through to sub-processes. And whether an invocation is a background process or not should apply to the whole process tree. So you could do "git --no-optional-locks foo", and if "foo" is a script or alias that calls "status", you'll still get the effect. 2. There may be other programs that want the same treatment. I've punted here on finding more callers to convert, since "status" is the obvious one to call as a repeated background job. But "git diff"'s opportunistic refresh of the index may be a good candidate. The test is taken from `67e5ce7f63`, and it's worth repeating Johannes's explanation: Note that the regression test added in this commit does not really verify that no index.lock file was written; that test is not possible in a portable way. Instead, we verify that .git/index is rewritten only when `git status` is run without `--no-optional-locks`. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-27 16:11:01 +09:00
Ramsay Jones	356a293f39	cache.h: hex2chr() - avoid -Wsign-compare warnings Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-22 13:00:38 +09:00
Jonathan Nieder	607bd8315c	pack: make packed_git_mru global a value instead of a pointer The MRU cache that keeps track of recently used packs is represented using two global variables: struct mru packed_git_mru_storage; struct mru *packed_git_mru = &packed_git_mru_storage; Callers never assign to the packed_git_mru pointer, though, so we can simplify by eliminating it and using &packed_git_mru_storage (renamed to &packed_git_mru) directly. This variable is only used by the packfile subsystem, making this a relatively uninvasive change (and any new unadapted callers would trigger a compile error). Noticed while moving these globals to the object_store struct. Signed-off-by: Stefan Beller <sbeller@google.com> Signed-off-by: Jonathan Nieder <jrnieder@gmail.com> Acked-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-09-14 15:05:48 +09:00
Junio C Hamano	f04f860dfa	Merge branch 'sb/sha1-file-cleanup' into maint Code clean-up. * sb/sha1-file-cleanup: sha1_file: make read_info_alternates static	2017-09-10 17:03:04 +09:00
Junio C Hamano	822a4d4178	Merge branch 'jk/hashcmp-memcmp' into maint Code clean-up. * jk/hashcmp-memcmp: hashcmp: use memcmp instead of open-coded loop	2017-09-10 17:02:59 +09:00
Junio C Hamano	eabdcd4ab4	Merge branch 'jt/packmigrate' Code movement to make it easier to hack later. * jt/packmigrate: (23 commits) pack: move for_each_packed_object() pack: move has_pack_index() pack: move has_sha1_pack() pack: move find_pack_entry() and make it global pack: move find_sha1_pack() pack: move find_pack_entry_one(), is_pack_valid() pack: move check_pack_index_ptr(), nth_packed_object_offset() pack: move nth_packed_object_{sha1,oid} pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() pack: move unpack_object_header() pack: move get_size_from_delta() pack: move unpack_object_header_buffer() pack: move {,re}prepare_packed_git and approximate_object_count pack: move install_packed_git() pack: move add_packed_git() pack: move unuse_pack() pack: move use_pack() pack: move pack-closing functions pack: move release_pack_memory() pack: move open_pack_index(), parse_pack_index() ...	2017-08-26 22:55:09 -07:00
Junio C Hamano	6b8aa3294e	Merge branch 'po/object-id' * po/object-id: sha1_file: convert index_stream to struct object_id sha1_file: convert hash_sha1_file_literally to struct object_id sha1_file: convert index_fd to struct object_id sha1_file: convert index_path to struct object_id read-cache: convert to struct object_id builtin/hash-object: convert to struct object_id	2017-08-26 22:55:07 -07:00
Junio C Hamano	b6c4058f97	Merge branch 'sb/diff-color-move' "git diff" has been taught to optionally paint new lines that are the same as deleted lines elsewhere differently from genuinely new lines. * sb/diff-color-move: (25 commits) diff: document the new --color-moved setting diff.c: add dimming to moved line detection diff.c: color moved lines differently, plain mode diff.c: color moved lines differently diff.c: buffer all output if asked to diff.c: emit_diff_symbol learns about DIFF_SYMBOL_SUMMARY diff.c: emit_diff_symbol learns about DIFF_SYMBOL_STAT_SEP diff.c: convert word diffing to use emit_diff_symbol diff.c: convert show_stats to use emit_diff_symbol diff.c: convert emit_binary_diff_body to use emit_diff_symbol submodule.c: migrate diff output to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_REWRITE_DIFF diff.c: emit_diff_symbol learns about DIFF_SYMBOL_BINARY_FILES diff.c: emit_diff_symbol learns DIFF_SYMBOL_HEADER diff.c: emit_diff_symbol learns DIFF_SYMBOL_FILEPAIR_{PLUS, MINUS} diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_INCOMPLETE diff.c: emit_diff_symbol learns DIFF_SYMBOL_WORDS[_PORCELAIN] diff.c: migrate emit_line_checked to use emit_diff_symbol diff.c: emit_diff_symbol learns DIFF_SYMBOL_NO_LF_EOF diff.c: emit_diff_symbol learns DIFF_SYMBOL_CONTEXT_FRAGINFO ...	2017-08-26 22:55:03 -07:00
Jonathan Tan	7709f468fd	pack: move for_each_packed_object() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	f9a8672a81	pack: move has_pack_index() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	150e3001d0	pack: move has_sha1_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	d6fe0036fd	pack: move find_sha1_pack() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	a2551953b9	pack: move find_pack_entry_one(), is_pack_valid() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	9e0f45f5a6	pack: move check_pack_index_ptr(), nth_packed_object_offset() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	d5a1676182	pack: move nth_packed_object_{sha1,oid} Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	f1d8130be0	pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry() Both sha1_file.c and packfile.c now need read_object(), so a copy of read_object() was created in packfile.c. This patch makes both mark_bad_packed_object() and has_packed_and_bad() global. Unlike most of the other patches in this series, these 2 functions need to remain global. Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	3588dd6e99	pack: move unpack_object_header() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	7b3aa75df7	pack: move get_size_from_delta() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	32b42e152f	pack: move unpack_object_header_buffer() Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00
Jonathan Tan	0abe14f6a5	pack: move {,re}prepare_packed_git and approximate_object_count Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-08-23 15:12:07 -07:00

... 3 4 5 6 7 ...

2100 commits