Commit graph

59883 commits

Author SHA1 Message Date
Junio C Hamano bd42bbe1a4 Git 2.28-rc0
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-09 14:00:45 -07:00
Junio C Hamano d341042f71 Merge branch 'mt/entry-fstat-fallback-fix' into master
"git checkout" failed to catch an error from fstat() after updating
a path in the working tree.

* mt/entry-fstat-fallback-fix:
  entry: check for fstat() errors after checkout
2020-07-09 14:00:45 -07:00
Junio C Hamano 3ed0f1e3a1 Merge branch 'ma/rebase-doc-typofix' into master
Typofix.

* ma/rebase-doc-typofix:
  git-rebase.txt: fix description list separator
2020-07-09 14:00:45 -07:00
Junio C Hamano 9850823f06 Merge branch 'jn/eject-fetch-write-commit-graph-out-of-experimental' into master
"fetch.writeCommitGraph" was enabled when "feature.experimental" is
asked for, but it was found to be a bit too risky even for bold
folks in its current shape.  The configuration has been ejected, at
least for now, from the "experimental" feature set.

* jn/eject-fetch-write-commit-graph-out-of-experimental:
  experimental: default to fetch.writeCommitGraph=false
2020-07-09 14:00:44 -07:00
Junio C Hamano 24ecfdf206 Merge branch 'tb/fix-persistent-shallow' into master
When "fetch.writeCommitGraph" configuration is set in a shallow
repository and a fetch moves the shallow boundary, we wrote out
broken commit-graph files that do not match the reality, which has
been corrected.

* tb/fix-persistent-shallow:
  commit.c: don't persist substituted parents when unshallowing
2020-07-09 14:00:44 -07:00
Junio C Hamano 46be023084 Merge branch 'ct/diff-with-merge-base-clarification' into master
Recent update to "git diff" meant as a code clean-up introduced a
bug in its error handling code, which has been corrected.

* ct/diff-with-merge-base-clarification:
  diff: check for merge bases before assigning sym->base
2020-07-09 14:00:43 -07:00
Junio C Hamano 20d451c4da Merge branch 'rs/line-log-until' into master
"git log -Lx,y:path --before=date" lost track of where the range
should be because it didn't take the changes made by the youngest
commits that are omitted from the output into account.

* rs/line-log-until:
  revision: disable min_age optimization with line-log
2020-07-09 14:00:42 -07:00
Junio C Hamano b7ebe8f047 Merge branch 'ra/send-email-in-reply-to-from-command-line-wins' into master
"git send-email --in-reply-to=<msg>" did not use the In-Reply-To:
header with the value given from the command line, and let it be
overridden by the value on In-Reply-To: header in the messages
being sent out (if exists).

* ra/send-email-in-reply-to-from-command-line-wins:
  send-email: restore --in-reply-to superseding behavior
2020-07-09 14:00:42 -07:00
Junio C Hamano b2b7a5410d Merge branch 'vs/completion-with-set-u' into master
The command line completion support (in contrib/) used to be
prepared to work with "set -u" but recent changes got a bit more
sloppy.  This has been corrected.

* vs/completion-with-set-u:
  completion: nounset mode fixes
2020-07-09 14:00:41 -07:00
Junio C Hamano 8251695fe7 Merge branch 'cc/cat-file-usage-update' into master
Doc/usage update.

* cc/cat-file-usage-update:
  cat-file: add missing [=<format>] to usage/synopsis
2020-07-09 14:00:41 -07:00
Martin Ågren 81de0c01cf git-rebase.txt: fix description list separator
We don't give a "::" for the list separator, but just a single ":". This
ends up rendering literally, "--apply: Use applying strategies ...". As
a follow-on error, the list continuation, "+", also ends up rendering
literally (because we don't have a list).

This was introduced in 52eb738d6b ("rebase: add an --am option",
2020-02-15) and survived the rename in 10cdb9f38a ("rebase: rename the
two primary rebase backends", 2020-02-15).

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-09 11:35:57 -07:00
Matheus Tavares 35e6e212fd entry: check for fstat() errors after checkout
In 11179eb311 ("entry.c: check if file exists after checkout",
2017-10-05) we started checking the result of the lstat() call done
after writing a file, to avoid writing garbage to the corresponding
cache entry. However, the code skips calling lstat() if it's possible
to use fstat() when it still has the file descriptor open. And when
calling fstat() we don't do the same error checking. To fix that, let
the callers of fstat_output() know when fstat() fails. In this case,
write_entry() will try to use lstat() and properly report an error if
that fails as well.

Signed-off-by: Matheus Tavares <matheus.bernardino@usp.br>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-09 09:45:06 -07:00
Jonathan Nieder b5651a2092 experimental: default to fetch.writeCommitGraph=false
The fetch.writeCommitGraph feature makes fetches write out a commit
graph file for the newly downloaded pack on fetch.  This improves the
performance of various commands that would perform a revision walk and
eventually ought to be the default for everyone.  To prepare for that
future, it's enabled by default for users that set
feature.experimental=true to experience such future defaults.

Alas, for --unshallow fetches from a shallow clone it runs into a
snag: by the time Git has fetched the new objects and is writing a
commit graph, it has performed a revision walk and r->parsed_objects
contains information about the shallow boundary from *before* the
fetch.  The commit graph writing code is careful to avoid writing a
commit graph file in shallow repositories, but the new state is not
shallow, and the result is that from that point on, commands like "git
log" make use of a newly written commit graph file representing a
fictional history with the old shallow boundary.

We could fix this by making the commit graph writing code more careful
to avoid writing a commit graph that could have used any grafts or
shallow state, but it is possible that there are other pieces of
mutated state that fetch's commit graph writing code may be relying
on.  So disable it in the feature.experimental configuration.

Google developers have been running in this configuration (by setting
fetch.writeCommitGraph=false in the system config) to work around this
bug since it was discovered in April.  Once the fix lands, we'll
enable fetch.writeCommitGraph=true again to give it some early testing
before rolling out to a wider audience.

In other words:

- this patch only affects behavior with feature.experimental=true

- it makes feature.experimental match the configuration Google has
  been using for the last few months, meaning it would leave users in
  a better tested state than without it

- this should improve testing for other features guarded by
  feature.experimental, by making feature.experimental safer to use

Reported-by: Jay Conrod <jayconrod@google.com>
Helped-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-08 16:37:43 -07:00
Taylor Blau ce16364e89 commit.c: don't persist substituted parents when unshallowing
Since 37b9dcabfc (shallow.c: use '{commit,rollback}_shallow_file',
2020-04-22), Git knows how to reset stat-validity checks for the
$GIT_DIR/shallow file, allowing it to change between a shallow and
non-shallow state in the same process (e.g., in the case of 'git fetch
--unshallow').

However, when $GIT_DIR/shallow changes, Git does not alter or remove any
grafts (nor substituted parents) in memory.

This comes up in a "git fetch --unshallow" with fetch.writeCommitGraph
set to true. Ordinarily in a shallow repository (and before 37b9dcabfc,
even in this case), commit_graph_compatible() would return false,
indicating that the repository should not be used to write a
commit-graphs (since commit-graph files cannot represent a shallow
history). But since 37b9dcabfc, in an --unshallow operation that check
succeeds.

Thus even though the repository isn't shallow any longer (that is, we
have all of the objects), the in-core representation of those objects
still has munged parents at the shallow boundaries.  When the
commit-graph write proceeds, we use the incorrect parentage, producing
wrong results.

There are two ways for a user to work around this: either (1) set
'fetch.writeCommitGraph' to 'false', or (2) drop the commit-graph after
unshallowing.

One way to fix this would be to reset the parsed object pool entirely
(flushing the cache and thus preventing subsequent reads from modifying
their parents) after unshallowing. That would produce a problem when
callers have a now-stale reference to the old pool, and so this patch
implements a different approach. Instead, attach a new bit to the pool,
'substituted_parent', which indicates if the repository *ever* stored a
commit which had its parents modified (i.e., the shallow boundary
prior to unshallowing).

This bit needs to be sticky because all reads subsequent to modifying a
commit's parents are unreliable when unshallowing. Modify the check in
'commit_graph_compatible' to take this bit into account, and correctly
avoid generating commit-graphs in this case, thus solving the bug.

Helped-by: Derrick Stolee <dstolee@microsoft.com>
Helped-by: Jonathan Nieder <jrnieder@gmail.com>
Reported-by: Jay Conrod <jayconrod@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-08 16:13:46 -07:00
Jeff King 5f46e610cb diff: check for merge bases before assigning sym->base
In symdiff_prepare(), we iterate over the set of parsed objects to pick
out any symmetric differences, including the left, right, and base
elements. We assign the results into pointers in a "struct symdiff", and
then complain if we didn't find a base, like so:

    sym->left = rev->pending.objects[lpos].name;
    sym->right = rev->pending.objects[rpos].name;
    sym->base = rev->pending.objects[basepos].name;
    if (basecount == 0)
            die(_("%s...%s: no merge base"), sym->left, sym->right);

But the least lines are backwards. If basecount is 0, then basepos will
be -1, and we will access memory outside of the pending array. This
isn't usually that big a deal, since we don't do anything besides a
single pointer-sized read before exiting anyway, but it does violate the
C standard, and of course memory-checking tools like ASan complain.

Let's put the basecount check first. Note that we haveto split it from
the other assignments, since the die() relies on sym->left and
sym->right having been assigned (this isn't strictly necessary, but is
easier to read than dereferencing the pending array again).

Reported-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-08 13:57:18 -07:00
Junio C Hamano 4a0fcf9f76 The seventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-06 22:13:31 -07:00
Junio C Hamano 43f46d6da5 Merge branch 'es/worktree-code-cleanup'
Code cleanup.

* es/worktree-code-cleanup:
  worktree: avoid dead-code in conditional
2020-07-06 22:09:19 -07:00
Junio C Hamano efafdca421 Merge branch 'dl/test-must-fail-fixes-5'
The effort to avoid using test_must_fail on non-git command continues.

* dl/test-must-fail-fixes-5:
  lib-submodule-update: pass 'test_must_fail' as an argument
  lib-submodule-update: prepend "git" to $command
  lib-submodule-update: consolidate --recurse-submodules
  lib-submodule-update: add space after function name
2020-07-06 22:09:18 -07:00
Junio C Hamano 0a23331aa6 Merge branch 'jk/fast-export-anonym-alt'
"git fast-export --anonymize" learned to take customized mapping to
allow its users to tweak its output more usable for debugging.

* jk/fast-export-anonym-alt:
  fast-export: use local array to store anonymized oid
  fast-export: anonymize "master" refname
  fast-export: allow seeding the anonymized mapping
  fast-export: add a "data" callback parameter to anonymize_str()
  fast-export: move global "idents" anonymize hashmap into function
  fast-export: use a flex array to store anonymized entries
  fast-export: stop storing lengths in anonymized hashmaps
  fast-export: tighten anonymize_mem() interface to handle only strings
  fast-export: store anonymized oids as hex strings
  fast-export: use xmemdupz() for anonymizing oids
  t9351: derive anonymized tree checks from original repo
2020-07-06 22:09:17 -07:00
Junio C Hamano 0ac0947b14 Merge branch 'js/diff-files-i-t-a-fix-for-difftool'
"git difftool" has trouble dealing with paths added to the index
with the intent-to-add bit.

* js/diff-files-i-t-a-fix-for-difftool:
  difftool -d: ensure that intent-to-add files are handled correctly
  diff-files --raw: show correct post-image of intent-to-add files
2020-07-06 22:09:17 -07:00
Junio C Hamano 11cbda2add Merge branch 'js/default-branch-name'
The name of the primary branch in existing repositories, and the
default name used for the first branch in newly created
repositories, is made configurable, so that we can eventually wean
ourselves off of the hardcoded 'master'.

* js/default-branch-name:
  contrib: subtree: adjust test to change in fmt-merge-msg
  testsvn: respect `init.defaultBranch`
  remote: use the configured default branch name when appropriate
  clone: use configured default branch name when appropriate
  init: allow setting the default for the initial branch name via the config
  init: allow specifying the initial branch name for the new repository
  docs: add missing diamond brackets
  submodule: fall back to remote's HEAD for missing remote.<name>.branch
  send-pack/transport-helper: avoid mentioning a particular branch
  fmt-merge-msg: stop treating `master` specially
2020-07-06 22:09:17 -07:00
Junio C Hamano 480e78595e Merge branch 'rs/pack-bits-in-object-better'
By renumbering object flag bits, "struct object" managed to lose
bloated inter-field padding.

* rs/pack-bits-in-object-better:
  revision: reallocate TOPO_WALK object flags
2020-07-06 22:09:17 -07:00
Junio C Hamano 67d99b82de Merge branch 'bc/http-push-flagsfix'
The code to push changes over "dumb" HTTP had a bad interaction
with the commit reachability code due to incorrect allocation of
object flag bits, which has been corrected.

* bc/http-push-flagsfix:
  http-push: ensure unforced pushes fail when data would be lost
2020-07-06 22:09:17 -07:00
Junio C Hamano 8a78e4d615 Merge branch 'js/pu-to-seen'
The documentation and some tests have been adjusted for the recent
renaming of "pu" branch to "seen".

* js/pu-to-seen:
  tests: reference `seen` wherever `pu` was referenced
  docs: adjust the technical overview for the rename `pu` -> `seen`
  docs: adjust for the recent rename of `pu` to `seen`
2020-07-06 22:09:16 -07:00
Junio C Hamano 0258ed1e08 Merge branch 'cb/is-descendant-of'
Code clean-up.

* cb/is-descendant-of:
  commit-reach: avoid is_descendant_of() shim
2020-07-06 22:09:16 -07:00
Junio C Hamano 5c61d10b16 Merge branch 'mk/pb-pretty-email-without-domain-part-fix'
Docfix.

* mk/pb-pretty-email-without-domain-part-fix:
  doc: fix author vs. committer copy/paste error
2020-07-06 22:09:15 -07:00
Junio C Hamano 65ffaca0e4 Merge branch 'jl/complete-git-prune'
Add "git prune" to the completion (in contrib/), which could be
typed by end-users from the command line.

* jl/complete-git-prune:
  bash-completion: add git-prune into bash completion
2020-07-06 22:09:15 -07:00
Junio C Hamano 645f63111b Merge branch 'es/get-worktrees-unsort'
API cleanup for get_worktrees()

* es/get-worktrees-unsort:
  worktree: drop get_worktrees() unused 'flags' argument
  worktree: drop get_worktrees() special-purpose sorting option
2020-07-06 22:09:15 -07:00
Junio C Hamano e7e113a1df Merge branch 'bc/sha-256-cvs-svn-updates'
CVS/SVN interface have been prepared for SHA-256 transition

* bc/sha-256-cvs-svn-updates:
  git-cvsexportcommit: port to SHA-256
  git-cvsimport: port to SHA-256
  git-cvsserver: port to SHA-256
  git-svn: set the OID length based on hash algorithm
  perl: make SVN code hash independent
  perl: make Git::IndexInfo work with SHA-256
  perl: create and switch variables for hash constants
  t/lib-git-svn: make hash size independent
  t9101: make hash independent
  t9104: make hash size independent
  t9100: make test work with SHA-256
  t9108: make test hash independent
  t9168: make test hash independent
  t9109: make test hash independent
2020-07-06 22:09:14 -07:00
Junio C Hamano d80bea479d Merge branch 'ak/commit-graph-to-slab'
A few fields in "struct commit" that do not have to always be
present have been moved to commit slabs.

* ak/commit-graph-to-slab:
  commit-graph: minimize commit_graph_data_slab access
  commit: move members graph_pos, generation to a slab
  commit-graph: introduce commit_graph_data_slab
  object: drop parsed_object_pool->commit_count
2020-07-06 22:09:14 -07:00
Junio C Hamano 0cc4dcacb3 Merge branch 'en/sparse-status'
"git status" learned to report the status of sparse checkout.

* en/sparse-status:
  git-prompt: include sparsity state as well
  git-prompt: document how in-progress operations affect the prompt
  wt-status: show sparse checkout status as well
2020-07-06 22:09:13 -07:00
Junio C Hamano 33a22c1a88 Merge branch 'ps/ref-transaction-hook'
A new hook.

* ps/ref-transaction-hook:
  refs: implement reference transaction hook
2020-07-06 22:09:13 -07:00
Junio C Hamano 12210859da Merge branch 'bc/sha-256-part-2'
SHA-256 migration work continues.

* bc/sha-256-part-2: (44 commits)
  remote-testgit: adapt for object-format
  bundle: detect hash algorithm when reading refs
  t5300: pass --object-format to git index-pack
  t5704: send object-format capability with SHA-256
  t5703: use object-format serve option
  t5702: offer an object-format capability in the test
  t/helper: initialize the repository for test-sha1-array
  remote-curl: avoid truncating refs with ls-remote
  t1050: pass algorithm to index-pack when outside repo
  builtin/index-pack: add option to specify hash algorithm
  remote-curl: detect algorithm for dumb HTTP by size
  builtin/ls-remote: initialize repository based on fetch
  t5500: make hash independent
  serve: advertise object-format capability for protocol v2
  connect: parse v2 refs with correct hash algorithm
  connect: pass full packet reader when parsing v2 refs
  Documentation/technical: document object-format for protocol v2
  t1302: expect repo format version 1 for SHA-256
  builtin/show-index: provide options to determine hash algo
  t5302: modernize test formatting
  ...
2020-07-06 22:09:13 -07:00
René Scharfe 01faa91cb7 revision: disable min_age optimization with line-log
If one of the options --before, --min-age or --until is given,
limit_list() filters out younger commits early on.  Line-log needs all
those commits to trace the movement of line ranges, though.  Skip this
optimization if both are used together.

Reported-by: Мария Долгополова <dolgopolovamariia@gmail.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-06 18:38:03 -07:00
Johannes Schindelin 3080c50980 difftool -d: ensure that intent-to-add files are handled correctly
In https://github.com/git-for-windows/git/issues/2677, a `git difftool
-d` problem was reported. The underlying cause was a bug in `git
diff-files --raw` that we just fixed: it reported intent-to-add files
with the empty _tree_ as the post-image OID, when we need to show
an all-zero (or, "null") OID instead, to indicate to the caller that
they have to look at the worktree file.

The symptom of that problem shown by `git difftool` was this:

	error: unable to read sha1 file of <path> (<empty-tree-OID>)
	error: could not write '<filename>'

Make sure that the reported `difftool` problem stays fixed.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 16:15:45 -07:00
Johannes Schindelin 85953a3187 diff-files --raw: show correct post-image of intent-to-add files
The documented behavior of `git diff-files --raw` is to display

	[...] 0{40} if creation, unmerged or "look at work tree".

on the right hand (i.e. postimage) side. This happens for files that
have unstaged modifications, and for files that are unmodified but
stat-dirty.

For intent-to-add files, we used to show the empty blob's hash instead.
In c26022ea8f (diff: convert diff_addremove to struct object_id,
2017-05-30), we made that worse by inadvertently changing that to the
hash of the empty tree.

Let's make the behavior consistent with files that have unstaged
modifications (which applies to intent-to-add files, too) by showing
all-zero values also for intent-to-add files.

Accordingly, this patch adjusts the expectations set by the regression
test introduced in feea6946a5 (diff-files: treat "i-t-a" files as
"not-in-index", 2020-06-20).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 16:15:43 -07:00
Rafael Aquini f9f60d7066 send-email: restore --in-reply-to superseding behavior
git send-email --in-reply-to= fails to override In-Reply-To email headers,
if they're present in the output of format-patch, even when explicitly
told to do so by the option --no-thread, which breaks the contract of the
command line switch option, per its man page.

"
   --in-reply-to=<identifier>
       Make the first mail (or all the mails with --no-thread) appear as
       a reply to the given Message-Id, which avoids breaking threads to
       provide a new patch series.
"

This patch fixes the aformentioned issue, by bringing --in-reply-to's old
overriding behavior back.

The test was donated by Carlo Marcelo Arenas Belón.

Signed-off-by: Rafael Aquini <aquini@redhat.com>
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 16:12:21 -07:00
Christian Couder 0172f7834a cat-file: add missing [=<format>] to usage/synopsis
When displaying cat-file usage, the fact that a <format> can
be specified is only visible when lookling at the --batch and
--batch-check options which are shown like this:

    --batch[=<format>]    show info and content of objects fed from the standard input
    --batch-check[=<format>]
                          show info about objects fed from the standard input

It seems more coherent and improves discovery to also show it
on the usage line.

In the documentation the DESCRIPTION tells us that "The output
format can be overridden using the optional <format> argument",
but we can't see the <format> argument in the SYNOPSIS above
the description which is confusing.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 15:54:05 -07:00
Ville Skyttä c2dbcd206d completion: nounset mode fixes
Accessing unset variables results an errors when the shell is in
nounset/-u mode. This fixes the cases I've come across while using git
completion in a shell running in that mode for a while. It's hard to
tell if this is the complete set, but at least it improves things.

Signed-off-by: Ville Skyttä <ville.skytta@iki.fi>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-07-01 14:55:30 -07:00
Đoàn Trần Công Danh 508fd8e8ba contrib: subtree: adjust test to change in fmt-merge-msg
We're starting to stop treating `master' specially in fmt-merge-msg.
Adjust the test to reflect that change.

Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-30 08:41:15 -07:00
Junio C Hamano a08a83db2b The sixth batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-29 14:17:27 -07:00
Junio C Hamano 298d704e70 Merge branch 'sk/diff-files-show-i-t-a-as-new'
"git diff-files" has been taught to say paths that are marked as
intent-to-add are new files, not modified from an empty blob.

* sk/diff-files-show-i-t-a-as-new:
  diff-files: treat "i-t-a" files as "not-in-index"
2020-06-29 14:17:27 -07:00
Junio C Hamano fa2c57d562 Merge branch 'rs/commit-reach-leakfix'
Leakfix.

* rs/commit-reach-leakfix:
  commit-reach: plug minor memory leak after using is_descendant_of()
2020-06-29 14:17:27 -07:00
Junio C Hamano b381c98891 Merge branch 'rs/pull-leakfix'
Leakfix.

* rs/pull-leakfix:
  pull: plug minor memory leak after using is_descendant_of()
2020-06-29 14:17:26 -07:00
Junio C Hamano 610486749a Merge branch 'rs/retire-strbuf-write-fd'
A misdesigned strbuf_write_fd() function has been retired.

* rs/retire-strbuf-write-fd:
  strbuf: remove unreferenced strbuf_write_fd method.
  bugreport.c: replace strbuf_write_fd with write_in_full
2020-06-29 14:17:26 -07:00
Junio C Hamano 1ea1f93fd9 Merge branch 'dl/diff-usage-comment-update'
An in-code comment in "git diff" has been updated.

* dl/diff-usage-comment-update:
  builtin/diff: fix botched update of usage comment
  builtin/diff: update usage comment
2020-06-29 14:17:25 -07:00
Junio C Hamano 1033b98291 Merge branch 'xl/upgrade-repo-format'
Allow runtime upgrade of the repository format version, which needs
to be done carefully.

There is a rather unpleasant backward compatibility worry with the
last step of this series, but it is the right thing to do in the
longer term.

* xl/upgrade-repo-format:
  check_repository_format_gently(): refuse extensions for old repositories
  sparse-checkout: upgrade repository to version 1 when enabling extension
  fetch: allow adding a filter after initial clone
  repository: add a helper function to perform repository format upgrade
2020-06-29 14:17:24 -07:00
Jeff King f39ad38410 fast-export: use local array to store anonymized oid
Some older versions of gcc complain about this line:

  builtin/fast-export.c:412:2: error: dereferencing type-punned pointer
       will break strict-aliasing rules [-Werror=strict-aliasing]
    put_be32(oid.hash + hashsz - 4, counter++);
    ^

This seems to be a false positive, as there's no type-punning at all
here. oid.hash is an array of unsigned char; when we pass it to a
function it decays to a pointer to unsigned char. We do take a void
pointer in put_be32(), but it's immediately aliased with another pointer
to unsigned char (and clearly the compiler is looking inside the inlined
put_be32(), since the warning doesn't happen with -O0).

This happens on gcc 4.8 and 4.9, but not later versions (I tested gcc 6,
7, 8, and 9).

We can work around it by using a local array instead of an object_id
struct. This is a little more intimate with the details of object_id,
but for whatever reason doesn't seem to trigger the compiler warning.
We can revert this patch once we decide that those gcc versions are too
old to care about for a warning like this (gcc 4.8 is the default
compiler for Ubuntu Trusty, which is out-of-support but not fully
end-of-life'd until April 2022).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 14:19:23 -07:00
Jeff King 8a49495583 fast-export: anonymize "master" refname
Running "fast-export --anonymize" will leave "refs/heads/master"
untouched in the output, for two reasons:

  - it helped to have some known reference point between the original
    and anonymized repository

  - since it's historically the default branch name, it doesn't leak any
    information

Now that we can ask fast-export to retain particular tokens, we have a
much better tool for the first one (because it works for any ref, not
just master).

For the second, the notion of "default branch name" is likely to become
configurable soon, at which point the name _does_ leak information.
Let's drop this special case in preparation.

Note that we have to adjust the test a bit, since it relied on using the
name "master" in the anonymized repos. We could just use
--anonymize-map=master to keep the same output, but then we wouldn't
know if it works because of our hard-coded master or because of the
explicit map.

So let's flip the test a bit, and confirm that we anonymize "master",
but keep "other" in the output.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 14:19:23 -07:00
Jeff King 65b5d9fae7 fast-export: allow seeding the anonymized mapping
After you anonymize a repository, it can be hard to find which commits
correspond between the original and the result, and thus hard to
reproduce commands that triggered bugs in the original.

Let's make it possible to seed the anonymization map. This lets users
either:

  - mark names to be retained as-is, if they don't consider them secret
    (in which case their original commands would just work)

  - map names to new values, which lets them adapt the reproduction
    recipe to the new names without revealing the originals

The implementation is fairly straight-forward. We already store each
anonymized token in a hashmap (so that the same token appearing twice is
converted to the same result). We can just introduce a new "seed"
hashmap which is consulted first.

This does make a few more promises to the user about how we'll anonymize
things (e.g., token-splitting pathnames). But it's unlikely that we'd
want to change those rules, even if the actual anonymization of a single
token changes. And it makes things much easier for the user, who can
unblind only a directory name without having to specify each path within
it.

One alternative to this approach would be to anonymize as we see fit,
and then dump the whole refname and pathname mappings to a file. This
does work, but it's a bit awkward to use (you have to manually dig the
items you care about out of the mapping).

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-06-25 14:19:23 -07:00