git/builtin
Jeff King 65b5d9fae7 fast-export: allow seeding the anonymized mapping
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.

Let's make it possible to seed the anonymization map. This lets users
either:

  - mark names to be retained as-is, if they don't consider them secret
    (in which case their original commands would just work)

  - map names to new values, which lets them adapt the reproduction
    recipe to the new names without revealing the originals

The implementation is fairly straight-forward. We already store each
anonymized token in a hashmap (so that the same token appearing twice is
converted to the same result). We can just introduce a new "seed"
hashmap which is consulted first.

This does make a few more promises to the user about how we'll anonymize
things (e.g., token-splitting pathnames). But it's unlikely that we'd
want to change those rules, even if the actual anonymization of a single
token changes. And it makes things much easier for the user, who can
unblind only a directory name without having to specify each path within
it.

One alternative to this approach would be to anonymize as we see fit,
and then dump the whole refname and pathname mappings to a file. This
does work, but it's a bit awkward to use (you have to manually dig the
items you care about out of the mapping).

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 14:19:23 -07:00
..
add.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
am.c auto-gc: pass --quiet down from am, commit, merge and rebase 2020-05-07 12:24:35 -07:00
annotate.c
apply.c apply.c: make init_apply_state() take a struct repository 2018-08-13 14:14:44 -07:00
archive.c pack-protocol.txt: accept error packets in any context 2019-01-02 13:05:30 -08:00
bisect--helper.c Merge branch 'cb/bisect-helper-parser-fix' 2020-06-08 18:06:32 -07:00
blame.c Merge branch 'dl/opt-callback-cleanup' 2020-05-05 14:54:27 -07:00
branch.c Merge branch 'jk/for-each-ref-multi-key-sort-fix' 2020-05-08 14:25:04 -07:00
bundle.c bundle-verify: add --quiet 2019-11-11 11:46:29 +09:00
cat-file.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
check-attr.c cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch 2019-01-24 11:55:06 -08:00
check-ignore.c check-ignore: fix documentation and implementation to match 2020-02-18 15:28:58 -08:00
check-mailmap.c
check-ref-format.c
checkout-index.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
checkout.c Merge branch 'bc/filter-process' 2020-06-08 18:06:30 -07:00
clean.c Merge branch 'dl/opt-callback-cleanup' 2020-05-05 14:54:27 -07:00
clone.c Merge branch 'js/reflog-anonymize-for-clone-and-fetch' 2020-06-17 21:54:01 -07:00
column.c builtin: consistently pass cmd_* prefix to parse_options 2019-05-13 14:22:53 +09:00
commit-graph.c commit-graph: drop COMMIT_GRAPH_WRITE_CHECK_OIDS flag 2020-05-18 12:51:11 -07:00
commit-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
commit.c Merge branch 'jc/auto-gc-quiet' 2020-05-13 12:19:19 -07:00
config.c config: add '--show-scope' to print the scope of a config value 2020-02-10 10:49:12 -08:00
count-objects.c rename "alternate_object_database" to "object_directory" 2018-11-13 14:22:02 +09:00
credential.c
describe.c Merge branch 'jc/describe-misnamed-annotated-tag' 2020-03-26 17:11:21 -07:00
diff-files.c cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch 2019-01-24 11:55:06 -08:00
diff-index.c cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch 2019-01-24 11:55:06 -08:00
diff-tree.c diff-tree.c: load notes machinery when required 2020-04-20 18:22:54 -07:00
diff.c oid_array: rename source file from sha1-array 2020-03-30 10:59:08 -07:00
difftool.c hashmap: remove type arg from hashmap_{get,put,remove}_entry 2019-10-07 10:20:12 +09:00
env--helper.c env--helper: mark a file-local symbol as static 2019-07-11 14:31:04 -07:00
fast-export.c fast-export: allow seeding the anonymized mapping 2020-06-25 14:19:23 -07:00
fetch-pack.c stateless-connect: send response end packet 2020-05-24 16:26:00 -07:00
fetch.c Merge branch 'js/reflog-anonymize-for-clone-and-fetch' 2020-06-17 21:54:01 -07:00
fmt-merge-msg.c Lib-ify fmt-merge-msg 2020-03-24 15:04:43 -07:00
for-each-ref.c Merge branch 'jk/for-each-ref-multi-key-sort-fix' 2020-05-08 14:25:04 -07:00
fsck.c Merge branch 'ds/multi-pack-verify' 2020-05-24 19:39:39 -07:00
gc.c commit-graph.h: store an odb in 'struct write_commit_graph_context' 2020-02-04 11:36:37 -08:00
get-tar-commit-id.c builtin/get-tar-commit-id: make hash size independent 2019-04-01 11:57:39 +09:00
grep.c Merge branch 'dl/opt-callback-cleanup' 2020-05-05 14:54:27 -07:00
hash-object.c builtin: consistently pass cmd_* prefix to parse_options 2019-05-13 14:22:53 +09:00
help.c Merge branch 'es/bugreport' 2020-05-01 13:39:59 -07:00
index-pack.c promisor-remote: accept 0 as oid_nr in function 2020-04-02 12:42:32 -07:00
init-db.c Merge branch 'bc/sha-256-part-1-of-4' 2020-03-26 17:11:20 -07:00
interpret-trailers.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
log.c Merge branch 'dl/opt-callback-cleanup' 2020-05-05 14:54:27 -07:00
ls-files.c Merge branch 'dl/opt-callback-cleanup' 2020-05-05 14:54:27 -07:00
ls-remote.c parse_opt_ref_sorting: always use with NONEG flag 2019-03-21 12:03:35 +09:00
ls-tree.c Merge branch 'nd/attr-pathspec-in-tree-walk' 2019-01-14 15:29:28 -08:00
mailinfo.c
mailsplit.c
merge-base.c rebase: --fork-point regression fix 2020-02-11 09:59:39 -08:00
merge-file.c assert NOARG/NONEG behavior of parse-options callbacks 2018-11-06 12:56:29 +09:00
merge-index.c cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch 2019-01-24 11:55:06 -08:00
merge-ours.c cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch 2019-01-24 11:55:06 -08:00
merge-recursive.c Ensure index matches head before invoking merge machinery, round N 2019-08-19 10:08:03 -07:00
merge-tree.c Merge branch 'jk/tree-walk-overflow' 2019-08-22 12:34:10 -07:00
merge.c Merge branch 'an/merge-single-strategy-optim' 2020-06-02 13:35:01 -07:00
mktag.c sha1-file: allow check_object_signature() to handle any repo 2020-01-31 10:45:39 -08:00
mktree.c mktree: drop unused length parameter 2019-05-13 14:22:54 +09:00
multi-pack-index.c multi-pack-index: add [--[no-]progress] option. 2019-10-23 12:05:06 +09:00
mv.c cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch 2019-01-24 11:55:06 -08:00
name-rev.c name-rev: sort tip names before applying 2020-02-05 10:36:33 -08:00
notes.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
pack-objects.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
pack-redundant.c object-store: rename and expand packed_git's sha1 member 2019-04-01 11:57:38 +09:00
pack-refs.c Honor core.precomposeUnicode in more places 2019-04-26 10:54:03 +09:00
patch-id.c patch-id: use oid_to_hex() to print multiple object IDs 2019-12-09 12:26:40 -08:00
prune-packed.c Lib-ify prune-packed 2020-03-24 15:04:44 -07:00
prune.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
pull.c Merge branch 'dl/opt-callback-cleanup' 2020-05-05 14:54:27 -07:00
push.c Merge branch 'dl/push-recurse-submodules-fix' 2020-05-05 14:54:28 -07:00
range-diff.c range-diff: clear other_arg at end of function 2019-12-06 12:36:53 -08:00
read-tree.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
rebase.c Merge branch 'jc/auto-gc-quiet' 2020-05-13 12:19:19 -07:00
receive-pack.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
reflog.c parse_config_key(): return subsection len as size_t 2020-04-10 14:44:29 -07:00
remote-ext.c
remote-fd.c
remote.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
repack.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
replace.c sha1-file: pass git_hash_algo to hash_object_file() 2020-01-31 10:45:39 -08:00
rerere.c Merge branch 'nd/the-index' into md/list-objects-filter-by-depth 2019-01-15 15:38:29 -08:00
reset.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
rev-list.c rev-list --count: comment on the use of count_right++ 2020-02-18 13:21:46 -08:00
rev-parse.c Merge branch 'tb/shallow-cleanup' 2020-05-13 12:19:18 -07:00
revert.c Merge branch 'ra/cherry-pick-revert-skip' 2019-07-19 11:30:21 -07:00
rm.c rm: support the --pathspec-from-file option 2020-02-19 10:56:49 -08:00
send-pack.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
shortlog.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
show-branch.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
show-index.c builtin/show-index: replace sha1_to_hex 2019-08-19 15:04:59 -07:00
show-ref.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
sparse-checkout.c sparse-checkout: avoid staging deletions of all files 2020-06-05 08:05:50 -07:00
stash.c Merge branch 'en/fill-directory-exponential' 2020-04-29 16:15:31 -07:00
stripspace.c stripspace: allow -s/-c outside git repository 2018-12-26 15:41:47 -08:00
submodule--helper.c submodule: port subcommand 'set-url' from shell to C 2020-05-08 09:17:55 -07:00
symbolic-ref.c
tag.c Merge branch 'jk/for-each-ref-multi-key-sort-fix' 2020-05-08 14:25:04 -07:00
unpack-file.c
unpack-objects.c sha1-file: pass git_hash_algo to hash_object_file() 2020-01-31 10:45:39 -08:00
update-index.c Use OPT_CALLBACK and OPT_CALLBACK_F 2020-04-28 10:47:10 -07:00
update-ref.c update-ref: implement interactive transaction handling 2020-04-02 11:09:49 -07:00
update-server-info.c
upload-archive.c archive: initialize archivers earlier 2018-10-26 10:17:59 +09:00
upload-pack.c builtin: consistently pass cmd_* prefix to parse_options 2019-05-13 14:22:53 +09:00
var.c
verify-commit.c Merge branch 'jk/no-system-includes-in-dot-c' 2019-07-31 14:38:56 -07:00
verify-pack.c
verify-tag.c verify-tag: drop signal.h include 2019-06-19 08:19:21 -07:00
worktree.c Merge branch 'es/worktree-duplicate-paths' 2020-06-22 15:55:03 -07:00
write-tree.c cmd_{read,write}_tree: rename "unused" variable that is used 2019-05-13 14:22:53 +09:00