development/git - HydraGit

mirror of https://github.com/git/git synced 2024-10-30 14:03:28 +00:00

Author	SHA1	Message	Date
Junio C Hamano	0807e57807	Merge branch 'en/header-split-cache-h' Header clean-up. * en/header-split-cache-h: (24 commits) protocol.h: move definition of DEFAULT_GIT_PORT from cache.h mailmap, quote: move declarations of global vars to correct unit treewide: reduce includes of cache.h in other headers treewide: remove double forward declaration of read_in_full cache.h: remove unnecessary includes treewide: remove cache.h inclusion due to pager.h changes pager.h: move declarations for pager.c functions from cache.h treewide: remove cache.h inclusion due to editor.h changes editor: move editor-related functions and declarations into common file treewide: remove cache.h inclusion due to object.h changes object.h: move some inline functions and defines from cache.h treewide: remove cache.h inclusion due to object-file.h changes object-file.h: move declarations for object-file.c functions from cache.h treewide: remove cache.h inclusion due to git-zlib changes git-zlib: move declarations for git-zlib functions from cache.h treewide: remove cache.h inclusion due to object-name.h changes object-name.h: move declarations for object-name.c functions from cache.h treewide: remove unnecessary cache.h inclusion treewide: be explicit about dependence on mem-pool.h treewide: be explicit about dependence on oid-array.h ...	2023-04-25 13:56:20 -07:00
Elijah Newren	87bed17907	object-file.h: move declarations for object-file.c functions from cache.h Signed-off-by: Elijah Newren <newren@gmail.com> Acked-by: Calvin Wan <calvinwan@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-04-11 08:52:10 -07:00
Junio C Hamano	6047b28eb7	Merge branch 'en/header-split-cleanup' Split key function and data structure definitions out of cache.h to new header files and adjust the users. * en/header-split-cleanup: csum-file.h: remove unnecessary inclusion of cache.h write-or-die.h: move declarations for write-or-die.c functions from cache.h treewide: remove cache.h inclusion due to setup.h changes setup.h: move declarations for setup.c functions from cache.h treewide: remove cache.h inclusion due to environment.h changes environment.h: move declarations for environment.c functions from cache.h treewide: remove unnecessary includes of cache.h wrapper.h: move declarations for wrapper.c functions from cache.h path.h: move function declarations for path.c functions from cache.h cache.h: remove expand_user_path() abspath.h: move absolute path functions from cache.h environment: move comment_line_char from cache.h treewide: remove unnecessary cache.h inclusion from several sources treewide: remove unnecessary inclusion of gettext.h treewide: be explicit about dependence on gettext.h treewide: remove unnecessary cache.h inclusion from a few headers	2023-04-06 13:38:31 -07:00
Junio C Hamano	72871b198f	Merge branch 'ab/remove-implicit-use-of-the-repository' Code clean-up around the use of the_repository. * ab/remove-implicit-use-of-the-repository: libs: use "struct repository " argument, not "the_repository" post-cocci: adjust comments for recent repo_ migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"	2023-04-06 13:38:30 -07:00
Junio C Hamano	e7dca80692	Merge branch 'ab/remove-implicit-use-of-the-repository' into en/header-split-cache-h * ab/remove-implicit-use-of-the-repository: libs: use "struct repository " argument, not "the_repository" post-cocci: adjust comments for recent repo_ migration cocci: apply the "revision.h" part of "the_repository.pending" cocci: apply the "rerere.h" part of "the_repository.pending" cocci: apply the "refs.h" part of "the_repository.pending" cocci: apply the "promisor-remote.h" part of "the_repository.pending" cocci: apply the "packfile.h" part of "the_repository.pending" cocci: apply the "pretty.h" part of "the_repository.pending" cocci: apply the "object-store.h" part of "the_repository.pending" cocci: apply the "diff.h" part of "the_repository.pending" cocci: apply the "commit.h" part of "the_repository.pending" cocci: apply the "commit-reach.h" part of "the_repository.pending" cocci: apply the "cache.h" part of "the_repository.pending" cocci: add missing "the_repository" macros to "pending" cocci: sort "the_repository" rules by header cocci: fix incorrect & verbose "the_repository" rules cocci: remove dead rule from "the_repository.pending.cocci"	2023-04-04 08:25:52 -07:00
Ævar Arnfjörð Bjarmason	12cb1c10a6	cocci: apply the "refs.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "refs.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:46 -07:00
Ævar Arnfjörð Bjarmason	bc726bd075	cocci: apply the "object-store.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "object-store.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:45 -07:00
Ævar Arnfjörð Bjarmason	ecb5091fd4	cocci: apply the "commit.h" part of "the_repository.pending" Apply the part of "the_repository.pending.cocci" pertaining to "commit.h". Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-28 07:36:45 -07:00
Jeff King	d051f1718e	fast-export: drop unused parameter from anonymize_commit_message() As the comment above the function indicates, we do not bother actually storing commit messages in our anonymization map. But we still take the message as a parameter, and just ignore it. Let's stop doing that, which will make -Wunused-parameter happier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-22 15:37:09 -07:00
Jeff King	65c756fff0	fast-export: drop data parameter from anonymous generators The anonymization code has a specific generator callback for each type of data (e.g., one for paths, one for oids, and so on). These all take a "data" parameter, but none of them use it for anything. Which is not surprising, as the point is to generate a new name independent of any input, and each function keeps its own static counter. We added the extra pointer in `d5bf91fde4` (fast-export: add a "data" callback parameter to anonymize_str(), 2020-06-23) to handle --anonymize-map parsing, but that turned out to be awkward itself, and was recently dropped. So let's get rid of this "data" parameter that nobody is using, both from the generators and from anonymize_str() which plumbed it through. This simplifies the code, and makes -Wunused-parameter happier. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-22 15:37:09 -07:00
Jeff King	aa548459a0	fast-export: de-obfuscate --anonymize-map handling When we handle an --anonymize-map option, we parse the orig/anon pair, and then feed the "orig" string to anonymize_str(), along with a generator function that duplicates the "anon" string to be cached in the map. This works, because anonymize_str() says "ah, there is no mapping yet for orig; I'll add one from the generator". But there are some downsides: 1. It's a bit too clever, as it's not obvious what the code is trying to do or why it works. 2. It requires allowing generator functions to take an extra void pointer, which is not something any of the normal callers of anonymize_str() want. 3. It does the wrong thing if the same token is provided twice. When there are conflicting options, like: git fast-export --anonymize \ --anonymize-map=foo:one \ --anonymize-map=foo:two we usually let the second one override the first. But by using anonymize_str(), which has first-one-wins logic, we do the opposite. So instead of relying on anonymize_str(), let's directly add the entry ourselves. We can tweak the tests to show that we handle overridden options correctly now. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-22 15:37:09 -07:00
Jeff King	dcc4e134aa	fast-export: factor out anonymized_entry creation When anonymizing output, there's only one spot where we generate new entries to add to our hashmap: when anonymize_str() doesn't find an entry, we use the generate() callback to make one and add it. Let's pull that into its own function in preparation for another caller. Note that we'll add one extra feature. In anonymize_str(), we know that we won't find an existing entry in the hashmap (since it will only try to add after failing to find one). But other callers won't have the same behavior, so we should catch this case and free the now-dangling entry. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-22 15:37:09 -07:00
Jeff King	d6484e9fab	fast-export: simplify initialization of anonymized hashmaps We take pains to avoid doing a lookup on a hashmap which has not been initialized with hashmap_init(). That was necessary back when this code was written. But hashmap_get() became safer in `b7879b0ba6` (hashmap: allow re-use after hashmap_free(), 2020-11-02). Since then it's OK to call functions on a zero-initialized table; it will just correctly return NULL, since there is no match. This simplifies the code a little, and also lets us keep the initialization line closer to when we add an entry (which is when the hashmap really does need to be totally initialized). That will help later refactoring. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-22 15:37:08 -07:00
Jeff King	76e50f7fbc	fast-export: drop const when storing anonymized values We store anonymized values as pointers to "const char ", since they are conceptually const to callers who use them. But they are actually allocated strings whose memory is owned by the struct. The ownership mismatch hasn't been a big deal since we never free() them (they are held until the program ends), but let's switch them to "char " in preparation for changing that. Since most code only accesses them via anonymize_str(), it can continue to narrow them to "const char *" in its return value. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-22 15:37:08 -07:00
Elijah Newren	f394e093df	treewide: be explicit about dependence on gettext.h Dozens of files made use of gettext functions, without explicitly including gettext.h. This made it more difficult to find which files could remove a dependence on cache.h. Make C files explicitly include gettext.h if they are using it. However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an include of gettext.h, it was left out to avoid conflicting with an in-flight topic. Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-03-21 10:56:51 -07:00
Elijah Newren	41771fa435	cache.h: remove dependence on hex.h; make other files include it explicitly Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2023-02-23 17:25:29 -08:00
Jeff King	61bdc7c5d8	diff: mark unused parameters in callbacks The diff code provides a format_callback interface, but not every callback needs each parameter (e.g., the "opt" and "data" parameters are frequently left unused). Likewise for the output_prefix callback, the low-level change/add_remove interfaces, the callbacks used by xdi_diff(), etc. Mark unused arguments in the callback implementations to quiet -Wunused-parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-12-13 22:16:23 +09:00
Junio C Hamano	dd407f1c7c	Merge branch 'ab/unused-annotation' Undoes 'jk/unused-annotation' topic and redoes it to work around Coccinelle rules misfiring false positives in unrelated codepaths. * ab/unused-annotation: git-compat-util.h: use "deprecated" for UNUSED variables git-compat-util.h: use "UNUSED", not "UNUSED(var)"	2022-09-14 12:56:39 -07:00
Junio C Hamano	a6b42ec0c6	Merge branch 'jk/unused-annotation' Annotate function parameters that are not used (but cannot be removed for structural reasons), to prepare us to later compile with -Wunused warning turned on. * jk/unused-annotation: is_path_owned_by_current_uid(): mark "report" parameter as unused run-command: mark unused async callback parameters mark unused read_tree_recursive() callback parameters hashmap: mark unused callback parameters config: mark unused callback parameters streaming: mark unused virtual method parameters transport: mark bundle transport_options as unused refs: mark unused virtual method parameters refs: mark unused reflog callback parameters refs: mark unused each_ref_fn parameters git-compat-util: add UNUSED macro	2022-09-14 12:56:39 -07:00
Ævar Arnfjörð Bjarmason	5cf88fd8b0	git-compat-util.h: use "UNUSED", not "UNUSED(var)" As reported in [1] the "UNUSED(var)" macro introduced in `2174b8c75d` (Merge branch 'jk/unused-annotation' into next, 2022-08-24) breaks coccinelle's parsing of our sources in files where it occurs. Let's instead partially go with the approach suggested in [2] of making this not take an argument. As noted in [1] "coccinelle" will ignore such tokens in argument lists that it doesn't know about, and it's less of a surprise to syntax highlighters. This undoes the "help us notice when a parameter marked as unused is actually use" part of `9b24034754` (git-compat-util: add UNUSED macro, 2022-08-19), a subsequent commit will further tweak the macro to implement a replacement for that functionality. 1. https://lore.kernel.org/git/220825.86ilmg4mil.gmgdl@evledraar.gmail.com/ 2. https://lore.kernel.org/git/220819.868rnk54ju.gmgdl@evledraar.gmail.com/ Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-09-01 10:49:48 -07:00
Jeff King	02c3c59e62	hashmap: mark unused callback parameters Hashmap comparison functions must conform to a particular callback interface, but many don't use all of their parameters. Especially the void cmp_data pointer, but some do not use keydata either (because they can easily form a full struct to pass when doing lookups). Let's mark these to make -Wunused-parameter happy. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 12:18:55 -07:00
SZEDER Gábor	99d86d60e5	parse-options: PARSE_OPT_KEEP_UNKNOWN only applies to --options The description of 'PARSE_OPT_KEEP_UNKNOWN' starts with "Keep unknown arguments instead of erroring out". This is a bit misleading, as this flag only applies to unknown --options, while non-option arguments are kept even without this flag. Update the description to clarify this, and rename the flag to PARSE_OPTIONS_KEEP_UNKNOWN_OPT to make this obvious just by looking at the flag name. Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-08-19 11:13:14 -07:00
Junio C Hamano	2da81d1efb	Merge branch 'ab/plug-leak-in-revisions' Plug the memory leaks from the trickiest API of all, the revision walker. * ab/plug-leak-in-revisions: (27 commits) revisions API: add a TODO for diff_free(&revs->diffopt) revisions API: have release_revisions() release "topo_walk_info" revisions API: have release_revisions() release "date_mode" revisions API: call diff_free(&revs->pruning) in revisions_release() revisions API: release "reflog_info" in release revisions() revisions API: clear "boundary_commits" in release_revisions() revisions API: have release_revisions() release "prune_data" revisions API: have release_revisions() release "grep_filter" revisions API: have release_revisions() release "filter" revisions API: have release_revisions() release "cmdline" revisions API: have release_revisions() release "mailmap" revisions API: have release_revisions() release "commits" revisions API users: use release_revisions() for "prune_data" users revisions API users: use release_revisions() with UNLEAK() revisions API users: use release_revisions() in builtin/log.c revisions API users: use release_revisions() in http-push.c revisions API users: add "goto cleanup" for release_revisions() stash: always have the owner of "stash_info" free it revisions API users: use release_revisions() needing REV_INFO_INIT revision.[ch]: document and move code declared around "init" ...	2022-06-07 14:10:56 -07:00
Junio C Hamano	2cc712324d	Merge branch 'rs/fast-export-pathspec-fix' "git fast-export -- <pathspec>" lost the pathspec when showing the second and subsequent commits, which has been corrected. * rs/fast-export-pathspec-fix: 2.36 fast-export regression fix	2022-05-04 09:51:28 -07:00
René Scharfe	d1c25272f5	2.36 fast-export regression fix `e900d494dc` (diff: add an API for deferred freeing, 2021-02-11) added a way to allow reusing diffopts: the no_free bit. `244c27242f` (diff.[ch]: have diff_free() call clear_pathspec(opts.pathspec), 2022-02-16) made that mechanism mandatory. git fast-export doesn't set no_free, so path limiting stopped working after the first commit. Set the flag and add a basic test to make sure only changes to the specified files are exported. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-30 11:50:33 -07:00
Ævar Arnfjörð Bjarmason	2108fe4a19	revisions API users: add straightforward release_revisions() Add a release_revisions() to various users of "struct rev_list" in those straightforward cases where we only need to add the release_revisions() call to the end of a block, and don't need to e.g. refactor anything to use a "goto cleanup" pattern. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-04-13 23:56:08 -07:00
Junio C Hamano	430883a70c	Merge branch 'ab/object-file-api-updates' Object-file API shuffling. * ab/object-file-api-updates: object-file API: pass an enum to read_object_with_reference() object-file.c: add a literal version of write_object_file_prepare() object-file API: have hash_object_file() take "enum object_type" object API: rename hash_object_file_literally() to write_() object-file API: split up and simplify check_object_signature() object API users + docs: check <0, not !0 with check_object_signature() object API docs: move check_object_signature() docs to cache.h object API: correct "buf" v.s. "map" mismatch in .c and *.h object-file API: have write_object_file() take "enum object_type" object-file API: add a format_object_header() function object-file API: return "void", not "int" from hash_object_file() object-file.c: split up declaration of unrelated variables	2022-03-16 17:53:08 -07:00
Ævar Arnfjörð Bjarmason	44439c1c58	object-file API: have hash_object_file() take "enum object_type" Change the hash_object_file() function to take an "enum object_type". Since a preceding commit all of its callers are passing either "{commit,tree,blob,tag}_type", or the result of a call to type_name(), the parse_object() caller that would pass NULL is now using stream_object_signature(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:32 -08:00
Ævar Arnfjörð Bjarmason	0f156dbb04	object-file API: split up and simplify check_object_signature() Split up the check_object_signature() function into that non-streaming version (it accepts an already filled "buf"), and a new stream_object_signature() which will retrieve the object from storage, and hash it on-the-fly. All of the callers of check_object_signature() were effectively calling two different functions, if we go by cyclomatic complexity. I.e. they'd either take the early "if (map)" branch and return early, or not. This has been the case since the "if (map)" condition was added in `090ea12671` (parse_object: avoid putting whole blob in core, 2012-03-07). We can then further simplify the resulting check_object_signature() function since only one caller wanted to pass a non-NULL "buf" and a non-NULL "real_oidp". That "read_loose_object()" codepath used by "git fsck" can instead use hash_object_file() followed by oideq(). Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-25 17:16:31 -08:00
Jean-Noël Avila	9164d97a63	i18n: fix some misformated placeholders in command synopsis * add '<>' around arguments where missing * convert plurals into '...' forms This applies the style guide for documentation. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Phillip Wood <phillip.wood123@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-02-04 13:58:28 -08:00
Junio C Hamano	c17de5a505	Merge branch 'ja/i18n-similar-messages' Similar message templates have been consolidated so that translators need to work on fewer number of messages. * ja/i18n-similar-messages: i18n: turn even more messages into "cannot be used together" ones i18n: ref-filter: factorize "%(foo) atom used without %(bar) atom" i18n: factorize "--foo outside a repository" i18n: refactor "unrecognized %(foo) argument" strings i18n: factorize "no directory given for --foo" i18n: factorize "--foo requires --bar" and the like i18n: tag.c factorize i18n strings i18n: standardize "cannot open" and "cannot read" i18n: turn "options are incompatible" into "cannot be used together" i18n: refactor "%s, %s and %s are mutually exclusive" i18n: refactor "foo and bar are mutually exclusive"	2022-01-10 11:52:56 -08:00
Jean-Noël Avila	6fa00ee843	i18n: factorize "--foo requires --bar" and the like They are all replaced by "the option '%s' requires '%s'", which is a new string but replaces 17 previous unique strings. Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-05 13:31:00 -08:00
Jean-Noël Avila	12909b6b8a	i18n: turn "options are incompatible" into "cannot be used together" Signed-off-by: Jean-Noël Avila <jn.avila@free.fr> Reviewed-by: Johannes Sixt <j6t@kdbg.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-01-05 13:29:23 -08:00
William Sprent	726a228dfb	fast-export: fix surprising behavior with --first-parent The revision traversal machinery typically processes and returns all children before any parent. fast-export needs to operate in the reverse fashion, handling parents before any of their children in order to build up the history starting from the root commit(s). This would be a clear case where we could just use the revision traversal machinery's "reverse" option to achieve this desired affect. However, this wasn't what the code did. It added its own array for queuing. The obvious hand-rolled solution would be to just push all the commits into the array and then traverse afterwards, but it didn't quite do that either. It instead attempted to process anything it could as soon as it could, and once it could, check whether it could process anything that had been queued. As far as I can tell, this was an effort to save a little memory in the case of multiple root commits since it could process some commits before queueing all of them. This involved some helper functions named has_unshown_parent() and handle_tail(). For typical invocations of fast-export, this alternative essentially amounted to a hand-rolled method of reversing the commits -- it was a bunch of work to duplicate the revision traversal machinery's "reverse" option. This hand-rolled reversing mechanism is actually somewhat difficult to reason about. It takes some time to figure out how it ensures in normal cases that it will actually process all traversed commits (rather than just dropping some and not printing anything for them). And it turns out there are some cases where the code does drop commits without handling them, and not even printing an error or warning for the user. Due to the has_unshown_parent() checks, some commits could be left in the array at the end of the "while...get_revision()" loop which would be unprocessed. This could be triggered for example with git fast-export main -- --first-parent or non-sensical traversal rules such as git fast-export main -- --grep=Merge --invert-grep While most traversals that don't include all parents should likely trigger errors in fast-export (or at least require being used in combination with --reference-excluded-parents), the --first-parent traversal is at least reasonable and it'd be nice if it didn't just drop commits. It'd also be nice for future readers of the code to have a simpler "reverse traversal" mechanism. Use the "reverse" option of the revision traversal machinery to achieve both. Even for the non-sensical traversal flags like the --grep one above, this would be an improvement. For example, in that case, the code previously would have silently truncated history to only those commits that do not have an ancestor containing "Merge" in their commit message. After this code change, that case would include all commits without "Merge" in their commit message -- but any commit that previously had a "Merge"-mentioning parent would lose that parent (likely resulting in many new root commits). While the new behavior is still odd, it is at least understandable given that --reference-excluded-parents is not the default. Helped-by: Elijah Newren <newren@gmail.com> Signed-off-by: William Sprent <williams@unity3d.com> Reviewed-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-21 12:51:09 -08:00
Junio C Hamano	061a21d36d	Merge branch 'ab/fsck-unexpected-type' "git fsck" has been taught to report mismatch between expected and actual types of an object better. * ab/fsck-unexpected-type: fsck: report invalid object type-path combinations fsck: don't hard die on invalid object types object-file.c: stop dying in parse_loose_header() object-file.c: return ULHR_TOO_LONG on "header too long" object-file.c: use "enum" return type for unpack_loose_header() object-file.c: simplify unpack_loose_short_header() object-file.c: make parse_loose_header_extended() public object-file.c: return -1, not "status" from unpack_loose_header() object-file.c: don't set "typep" when returning non-zero cat-file tests: test for current --allow-unknown-type behavior cat-file tests: add corrupt loose object test cat-file tests: test for missing/bogus object with -t, -s and -p cat-file tests: move bogus_* variable declarations earlier fsck tests: test for garbage appended to a loose object fsck tests: test current hash/type mismatch behavior fsck tests: refactor one test to use a sub-repo fsck tests: add test for fsck-ing an unknown type	2021-10-25 16:06:56 -07:00
Ævar Arnfjörð Bjarmason	96e41f58fe	fsck: report invalid object type-path combinations Improve the error that's emitted in cases where we find a loose object we parse, but which isn't at the location we expect it to be. Before this change we'd prefix the error with a not-a-OID derived from the path at which the object was found, due to an emergent behavior in how we'd end up with an "OID" in these codepaths. Now we'll instead say what object we hashed, and what path it was found at. Before this patch series e.g.: $ git hash-object --stdin -w -t blob </dev/null `e69de29bb2` $ mv objects/e6/ objects/e7 Would emit ("[...]" used to abbreviate the OIDs): git fsck error: hash mismatch for ./objects/e7/9d[...] (expected e79d[...]) error: e79d[...]: object corrupt or missing: ./objects/e7/9d[...] Now we'll instead emit: error: e69d[...]: hash-path mismatch, found at: ./objects/e7/9d[...] Furthermore, we'll do the right thing when the object type and its location are bad. I.e. this case: $ git hash-object --stdin -w -t garbage --literally </dev/null 8315a83d2acc4c174aed59430f9a9c4ed926440f $ mv objects/83 objects/84 As noted in an earlier commits we'd simply die early in those cases, until preceding commits fixed the hard die on invalid object type: $ git fsck fatal: invalid object type Now we'll instead emit sensible error messages: $ git fsck error: 8315[...]: hash-path mismatch, found at: ./objects/84/15[...] error: 8315[...]: object is of unknown type 'garbage': ./objects/84/15[...] In both fsck.c and object-file.c we're using null_oid as a sentinel value for checking whether we got far enough to be certain that the issue was indeed this OID mismatch. We need to add the "object corrupt or missing" special-case to deal with cases where read_loose_object() will return an error before completing check_object_signature(), e.g. if we have an error in unpack_loose_rest() because we find garbage after the valid gzip content: $ git hash-object --stdin -w -t blob </dev/null `e69de29bb2` $ chmod 755 objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 $ echo garbage >>objects/e6/9de29bb2d1d6434b8b29ae775ad8c2e48c5391 $ git fsck error: garbage at end of loose object 'e69d[...]' error: unable to unpack contents of ./objects/e6/9d[...] error: e69d[...]: object corrupt or missing: ./objects/e6/9d[...] There is currently some weird messaging in the edge case when the two are combined, i.e. because we're not explicitly passing along an error state about this specific scenario from check_stream_oid() via read_loose_object() we'll end up printing the null OID if an object is of an unknown type and it can't be unpacked by zlib, e.g.: $ git hash-object --stdin -w -t garbage --literally </dev/null 8315a83d2acc4c174aed59430f9a9c4ed926440f $ chmod 755 objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f $ echo garbage >>objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f $ /usr/bin/git fsck fatal: invalid object type $ ~/g/git/git fsck error: garbage at end of loose object '8315a83d2acc4c174aed59430f9a9c4ed926440f' error: unable to unpack contents of ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f error: 8315a83d2acc4c174aed59430f9a9c4ed926440f: object corrupt or missing: ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f error: 0000000000000000000000000000000000000000: object is of unknown type 'garbage': ./objects/83/15a83d2acc4c174aed59430f9a9c4ed926440f [...] I think it's OK to leave that for future improvements, which would involve enum-ifying more error state as we've done with "enum unpack_loose_header_result" in preceding commits. In these increasingly more obscure cases the worst that can happen is that we'll get slightly nonsensical or inapplicable error messages. There's other such potential edge cases, all of which might produce some confusing messaging, but still be handled correctly as far as passing along errors goes. E.g. if check_object_signature() returns and oideq(real_oid, null_oid()) is true, which could happen if it returns -1 due to the read_istream() call having failed. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-01 15:06:01 -07:00
Junio C Hamano	febba8038d	Merge branch 'tk/fast-export-anonymized-tag-fix' The output from "git fast-export", when its anonymization feature is in use, showed an annotated tag incorrectly. * tk/fast-export-anonymized-tag-fix: fast-export: fix anonymized tag using original length	2021-09-10 11:46:27 -07:00
Tal Kelrich	2f040a9671	fast-export: fix anonymized tag using original length Commit `7f40759496` (fast-export: tighten anonymize_mem() interface to handle only strings, 2020-06-23) changed the interface used in anonymizing strings, but failed to update the size of annotated tag messages to match the new anonymized string. As a result, exporting tags having messages longer than 13 characters would create output that couldn't be parsed by fast-import, as the data length indicated was larger than the data output. Reset the message size when anonymizing, and add a tag with a "long" message to the test. Signed-off-by: Tal Kelrich <hasturkun@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-08-31 12:11:57 -07:00
brian m. carlson	14228447c9	hash: provide per-algorithm null OIDs Up until recently, object IDs did not have an algorithm member, only a hash. Consequently, it was possible to share one null (all-zeros) object ID among all hash algorithms. Now that we're going to be handling objects from multiple hash algorithms, it's important to make sure that all object IDs have a correct algorithm field. Introduce a per-algorithm null OID, and add it to struct hash_algo. Introduce a wrapper function as well, and use it everywhere we used to use the null_oid constant. Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-04-27 16:31:39 +09:00
ZheNing Hu	e73fe3dd02	builtin/*: update usage format According to the guidelines in parse-options.h, we should not end in a full stop or start with a capital letter. Fix old error and usage messages to match this expectation. Signed-off-by: ZheNing Hu <adlternative@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-01-06 15:10:49 -08:00
Ævar Arnfjörð Bjarmason	56f56ac50b	style: do not "break" in switch() after "return" Remove this unreachable code. It was found by SunCC, it's found by a non-fatal warning emitted by SunCC. It's one of the things it's more vehement about than GCC & Clang. It complains about a lot of other similarly unreachable code, e.g. a BUG(...) without a "return", and a "return 0" after a long if/else, both of whom have "return" statements. Those are also genuine redundancies to a compiler, but arguably make the code a bit easier to read & less fragile to maintain. These return/break cases are just unnecessary however, and as seen here the surrounding code just did a plain "return" without a "break" already. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-12-15 16:32:50 -08:00
Junio C Hamano	58138d3f26	Merge branch 'js/default-branch-name-part-2' Update the tests to drop word 'master' from them. * js/default-branch-name-part-2: t9902: avoid using the branch name `master` tests: avoid variations of the `master` branch name t3200: avoid variations of the `master` branch name fast-export: avoid using unnecessary language in a code comment t/test-terminal: avoid non-inclusive language	2020-10-05 14:01:50 -07:00
Junio C Hamano	86cca370e1	Merge branch 'jk/drop-unaligned-loads' Compilation fix around type punning. * jk/drop-unaligned-loads: Revert "fast-export: use local array to store anonymized oid" bswap.h: drop unaligned loads	2020-10-04 12:49:06 -07:00
Jeff King	176380fd11	Revert "fast-export: use local array to store anonymized oid" This reverts commit `f39ad38410`. That commit was trying to silence a type-punning warning on older versions of gcc. However, its analysis was all wrong. I didn't notice that we _were_ in fact type-punning because there are two versions of put_be32(): one that uses casts and unaligned loads, and another that uses bitshifts. I looked at the latter, but on my platform we were defaulting to the former. However, as of the previous commit, we'll always use the bitshift version. So we can drop this hackery to avoid the warning, making the code slightly cleaner. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-24 12:30:11 -07:00
Johannes Schindelin	5a0c32bd4b	fast-export: avoid using unnecessary language in a code comment In an ongoing effort to avoid non-inclusive language, let's avoid using the branch name "master" in a code comment. Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-21 15:19:27 -07:00
Jonathan Tan	f24c30e0b6	wt-status: tolerate dangling marks When a user checks out the upstream branch of HEAD, the upstream branch not being a local branch, and then runs "git status", like this: git clone $URL client cd client git checkout @{u} git status no status is printed, but instead an error message: fatal: HEAD does not point to a branch (This error message when running "git branch" persists even after checking out other things - it only stops after checking out a branch.) This is because "git status" reads the reflog when determining the "HEAD detached" message, and thus attempts to DWIM "@{u}", but that doesn't work because HEAD no longer points to a branch. Therefore, when calculating the status of a worktree, tolerate dangling marks. This is done by adding an additional parameter to dwim_ref() and repo_dwim_ref(). Signed-off-by: Jonathan Tan <jonathantanmy@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-09-02 14:39:25 -07:00
Jeff King	f39ad38410	fast-export: use local array to store anonymized oid Some older versions of gcc complain about this line: builtin/fast-export.c:412:2: error: dereferencing type-punned pointer will break strict-aliasing rules [-Werror=strict-aliasing] put_be32(oid.hash + hashsz - 4, counter++); ^ This seems to be a false positive, as there's no type-punning at all here. oid.hash is an array of unsigned char; when we pass it to a function it decays to a pointer to unsigned char. We do take a void pointer in put_be32(), but it's immediately aliased with another pointer to unsigned char (and clearly the compiler is looking inside the inlined put_be32(), since the warning doesn't happen with -O0). This happens on gcc 4.8 and 4.9, but not later versions (I tested gcc 6, 7, 8, and 9). We can work around it by using a local array instead of an object_id struct. This is a little more intimate with the details of object_id, but for whatever reason doesn't seem to trigger the compiler warning. We can revert this patch once we decide that those gcc versions are too old to care about for a warning like this (gcc 4.8 is the default compiler for Ubuntu Trusty, which is out-of-support but not fully end-of-life'd until April 2022). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Jeff King	8a49495583	fast-export: anonymize "master" refname Running "fast-export --anonymize" will leave "refs/heads/master" untouched in the output, for two reasons: - it helped to have some known reference point between the original and anonymized repository - since it's historically the default branch name, it doesn't leak any information Now that we can ask fast-export to retain particular tokens, we have a much better tool for the first one (because it works for any ref, not just master). For the second, the notion of "default branch name" is likely to become configurable soon, at which point the name _does_ leak information. Let's drop this special case in preparation. Note that we have to adjust the test a bit, since it relied on using the name "master" in the anonymized repos. We could just use --anonymize-map=master to keep the same output, but then we wouldn't know if it works because of our hard-coded master or because of the explicit map. So let's flip the test a bit, and confirm that we anonymize "master", but keep "other" in the output. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Jeff King	65b5d9fae7	fast-export: allow seeding the anonymized mapping After you anonymize a repository, it can be hard to find which commits correspond between the original and the result, and thus hard to reproduce commands that triggered bugs in the original. Let's make it possible to seed the anonymization map. This lets users either: - mark names to be retained as-is, if they don't consider them secret (in which case their original commands would just work) - map names to new values, which lets them adapt the reproduction recipe to the new names without revealing the originals The implementation is fairly straight-forward. We already store each anonymized token in a hashmap (so that the same token appearing twice is converted to the same result). We can just introduce a new "seed" hashmap which is consulted first. This does make a few more promises to the user about how we'll anonymize things (e.g., token-splitting pathnames). But it's unlikely that we'd want to change those rules, even if the actual anonymization of a single token changes. And it makes things much easier for the user, who can unblind only a directory name without having to specify each path within it. One alternative to this approach would be to anonymize as we see fit, and then dump the whole refname and pathname mappings to a file. This does work, but it's a bit awkward to use (you have to manually dig the items you care about out of the mapping). Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-25 14:19:23 -07:00
Jeff King	d5bf91fde4	fast-export: add a "data" callback parameter to anonymize_str() The anonymize_str() function takes a generator callback, but there's no way to pass extra context to it. Let's add the usual "void *data" parameter to the generator interface and pass it along. This is mildly annoying for existing callers, all of which pass NULL, but is necessary to avoid extra globals in some cases we'll add in a subsequent patch. While we're touching each of these callbacks, we can further observe that none of them use the existing orig/len parameters at all. This makes sense, since the point is for their output to have no discernable basis in the original (my original version had some notion that we might use a one-way function to obfuscate the names, but it was never implemented). So let's drop those extra parameters. If a caller really wants to do something with them, it can pass a struct through the new data parameter. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-06-23 19:56:26 -07:00

1 2 3 4 5

207 commits