In t7003 we conditionally check whether the reflog for branches pruned
by git-filter-branch(1) get deleted based on whether or not we use the
"files" backend. Same as with the preceding commit, this condition was
added because in its initial iteration the "reftable" backend did not
delete reflogs when their corresponding ref was deleted. Since then, the
backend has been aligned to behave the same as the "files" backend
though, which makes this check unnecessary.
Remove it.
Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Despite POSIX states that:
> The old egrep and fgrep commands are likely to be supported for many
> years to come as implementation extensions, allowing historical
> applications to operate unmodified.
GNU grep 3.8 started to warn[1]:
> The egrep and fgrep commands, which have been deprecated since
> release 2.5.3 (2007), now warn that they are obsolescent and should
> be replaced by grep -E and grep -F.
Prepare for their removal in the future.
[1]: https://lists.gnu.org/archive/html/info-gnu/2022-09/msg00001.html
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After it has rewritten all of the commits, filter-branch will then
rewrite each of the input refs based on the resulting map of old/new
commits. But we don't have any explicit test coverage of this code.
Let's make sure we are covering each of those cases:
- deleting a ref when all of its commits were pruned
- rewriting a ref based on the mapping (this happens throughout the
script, but let's make sure we generate the correct messages)
- rewriting a ref whose tip was excluded, in which case we rewrite to
the nearest ancestor. Note in this case that we still insist that no
"warning" line is present (even though it looks like we'd trigger
the "... was rewritten into multiple commits" one). See the next
commit for more details.
Note these all pass currently, but the latter two will fail when run
with GIT_TEST_DEFAULT_HASH=sha256.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Carefully excluding t7064, which sees independent development elsewhere
at the time of writing, we use `main` as the default branch name in
t7[0-4]*. This trick was performed via
$ (cd t &&
sed -i -e 's/master/main/g' -e 's/MASTER/MAIN/g' \
-e 's/Master/Main/g' -- t7[0-4]*.sh &&
git checkout HEAD -- t7064\*)
This allows us to define `GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main`
for those tests.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In addition to the manual adjustment to let the `linux-gcc` CI job run
the test suite with `master` and then with `main`, this patch makes sure
that GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME is set in all test scripts
that currently rely on the initial branch name being `master by default.
To determine which test scripts to mark up, the first step was to
force-set the default branch name to `master` in
- all test scripts that contain the keyword `master`,
- t4211, which expects `t/t4211/history.export` with a hard-coded ref to
initialize the default branch,
- t5560 because it sources `t/t556x_common` which uses `master`,
- t8002 and t8012 because both source `t/annotate-tests.sh` which also
uses `master`)
This trick was performed by this command:
$ sed -i '/^ *\. \.\/\(test-lib\|lib-\(bash\|cvs\|git-svn\)\|gitweb-lib\)\.sh$/i\
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
' $(git grep -l master t/t[0-9]*.sh) \
t/t4211*.sh t/t5560*.sh t/t8002*.sh t/t8012*.sh
After that, careful, manual inspection revealed that some of the test
scripts containing the needle `master` do not actually rely on a
specific default branch name: either they mention `master` only in a
comment, or they initialize that branch specificially, or they do not
actually refer to the current default branch. Therefore, the
aforementioned modification was undone in those test scripts thusly:
$ git checkout HEAD -- \
t/t0027-auto-crlf.sh t/t0060-path-utils.sh \
t/t1011-read-tree-sparse-checkout.sh \
t/t1305-config-include.sh t/t1309-early-config.sh \
t/t1402-check-ref-format.sh t/t1450-fsck.sh \
t/t2024-checkout-dwim.sh \
t/t2106-update-index-assume-unchanged.sh \
t/t3040-subprojects-basic.sh t/t3301-notes.sh \
t/t3308-notes-merge.sh t/t3423-rebase-reword.sh \
t/t3436-rebase-more-options.sh \
t/t4015-diff-whitespace.sh t/t4257-am-interactive.sh \
t/t5323-pack-redundant.sh t/t5401-update-hooks.sh \
t/t5511-refspec.sh t/t5526-fetch-submodules.sh \
t/t5529-push-errors.sh t/t5530-upload-pack-error.sh \
t/t5548-push-porcelain.sh \
t/t5552-skipping-fetch-negotiator.sh \
t/t5572-pull-submodule.sh t/t5608-clone-2gb.sh \
t/t5614-clone-submodules-shallow.sh \
t/t7508-status.sh t/t7606-merge-custom.sh \
t/t9302-fast-import-unpack-limit.sh
We excluded one set of test scripts in these commands, though: the range
of `git p4` tests. The reason? `git p4` stores the (foreign) remote
branch in the branch called `p4/master`, which is obviously not the
default branch. Manual analysis revealed that only five of these tests
actually require a specific default branch name to pass; They were
modified thusly:
$ sed -i '/^ *\. \.\/lib-git-p4\.sh$/i\
GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master\
export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME\
' t/t980[0167]*.sh t/t9811*.sh
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using a specific invalid hard-coded object ID, look one
up from the translation table.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git filter-branch" when used with the "--state-branch" option
still attempted to rewrite the commits whose filtered result is
known from the previous attempt (which is recorded on the state
branch); the command has been corrected not to waste cycles doing
so.
* mb/filter-branch-optim:
filter-branch: skip commits present on --state-branch
The commits in state:filter.map have already been processed, so don't
filter them again. This makes incremental git filter-branch much faster.
Also add tests for --state-branch option.
Signed-off-by: Michael Barabanov <michael.barabanov@gmail.com>
Acked-by: Ian Campbell <ijc@hellion.org.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid using pipes downstream of Git commands since the exit codes
of commands upstream of pipes get swallowed, thus potentially
hiding failure of those commands. Instead, capture Git command
output to a file and apply the downstream command(s) to that file.
Signed-off-by: Pratik Karki <predatoramigo@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git filter-branch -- --all" prints error messages when processing refs that
point at objects that are not committish. Such refs can be created by
"git replace" with trees or blobs. And also "git tag" with trees or blobs can
create such refs.
Filter these problematic refs out early, before they are seen by the logic to
see which refs have been modified and which have been left intact (which is
where the unwanted error messages come from), and warn that these refs are left
unwritten while doing so.
Signed-off-by: Yuki Kokubun <orga.chem.job@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously, the git_commit_non_empty_tree function would always pass any
commit with no parents to git-commit-tree, regardless of whether the
tree was nonempty. The new commit would then be recorded in the
filter-branch revision map, and subsequent commits which leave the tree
untouched would be correctly filtered.
With this change, parentless commits with an empty tree are correctly
pruned, and an empty file is recorded in the revision map, signifying
that it was rewritten to "no commits." This works naturally with the
parent mapping for subsequent commits.
Signed-off-by: Devin J. Pohly <djpohly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Sanity check before changing the logic in git_commit_non_empty_tree.
Signed-off-by: Devin J. Pohly <djpohly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
New test to expose a bug in filter-branch whereby the root commit is
never pruned, even though its tree is empty and --prune-empty is given.
The setup isn't exactly pretty, but I couldn't think of a simpler way to
create a parallel commit graph sans the first commit.
Signed-off-by: Devin J. Pohly <djpohly@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A recent optimization to filter-branch in v2.7.0 introduced a
regression when --prune-empty filter is used, which has been
corrected.
* jk/filter-branch-no-index:
filter-branch: resolve $commit^{tree} in no-index case
Commit 348d4f2 (filter-branch: skip index read/write when
possible, 2015-11-06) taught filter-branch to optimize out
the final "git write-tree" when we know we haven't touched
the tree with any of our filters. It does by simply putting
the literal text "$commit^{tree}" into the "$tree" variable,
avoiding a useless rev-parse call.
However, when we pass this to git_commit_non_empty_tree(),
it gets confused; it resolves "$commit^{tree}" itself, and
compares our string to the 40-hex sha1, which obviously
doesn't match. As a result, "--prune-empty" (or any custom
filter using git_commit_non_empty_tree) will fail to drop
an empty commit (when filter-branch is used without a tree
or index filter).
Let's resolve $tree to the 40-hex ourselves, so that
git_commit_non_empty_tree can work. Unfortunately, this is a
bit slower due to the extra process overhead:
$ cd t/perf && ./run 348d4f2 HEAD p7000-filter-branch.sh
[...]
Test 348d4f2 HEAD
--------------------------------------------------------------
7000.2: noop filter 3.76(0.24+0.26) 4.54(0.28+0.24) +20.7%
We could try to make git_commit_non_empty_tree more clever.
However, the value of $tree here is technically
user-visible. The user can provide arbitrary shell code at
this stage, which could itself have a similar assumption to
what is in git_commit_non_empty_tree. So the conservative
choice to fix this regression is to take the 20% hit and
give the pre-348d4f2 behavior. We still end up much faster
than before the optimization:
$ cd t/perf && ./run 348d4f2^ HEAD p7000-filter-branch.sh
[...]
Test 348d4f2^ HEAD
--------------------------------------------------------------
7000.2: noop filter 9.51(4.32+0.40) 4.51(0.28+0.23) -52.6%
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The Git CodingGuidelines prefer the $(...) construct for command
substitution instead of using the backquotes `...`.
The backquoted form is the traditional method for command
substitution, and is supported by POSIX. However, all but the
simplest uses become complicated quickly. In particular, embedded
command substitutions and/or the use of double quotes require
careful escaping with the backslash character.
The patch was generated by:
for _f in $(find . -name "*.sh")
do
perl -i -pe 'BEGIN{undef $/;} s/`(.+?)`/\$(\1)/smg' "${_f}"
done
and then carefully proof-read.
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
'git filter-branch' fails complaining about an ambiguous argument, if
a tree-filter renames a path and the new pathname happens to match an
existing object name.
After the tree-filter has been applied, 'git filter-branch' looks for
changed paths by running:
git diff-index -r --name-only --ignore-submodules $commit
which then, because of the lack of disambiguating double-dash, can't
decide whether to treat '$commit' as revision or path and errors out.
Add that disambiguating double-dash after 'git diff-index's revision
argument to make sure that '$commit' is interpreted as a revision.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Jeff King <peff@peff.net>
df062010 (filter-branch: avoid passing commit message through sed)
introduced a regression when filtering commits with multi-line headers,
if the header contains a blank line. An example of this is a gpg-signed
commit:
$ git cat-file commit signed-commit
tree 3d4038e029712da9fc59a72afbfcc90418451630
parent 110eac945dc1713b27bdf49e74e5805db66971f0
author A U Thor <author@example.com> 1112912413 -0700
committer C O Mitter <committer@example.com> 1112912413 -0700
gpgsig -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iEYEABECAAYFAlYXADwACgkQE7b1Hs3eQw23CACgldB/InRyDgQwyiFyMMm3zFpj
pUsAnA+f3aMUsd9mNroloSmlOgL6jIMO
=0Hgm
-----END PGP SIGNATURE-----
Adding gpg
As a consequence, "filter-branch --msg-filter cat" (which should leave the
commit message unchanged) spills the signature (after the internal blank
line) into the original commit message.
The reason is that although the signature is indented, making the line a
whitespace only line, the "read" call is splitting the line based on
the shell's IFS, which defaults to <space><tab><newline>. The leading
space is consumed and $header_line is empty, causing the "skip header
lines" loop to exit.
The rest of the commit object is then re-used as the rewritten commit
message, causing the new message to include the signature of the
original commit.
Set IFS to an empty string for the "read" call, thus disabling the word
splitting, which causes $header_line to be set to the non-empty value ' '.
This allows the loop to fully consume the header lines before
emitting the original, intact commit message.
[jc: this is literally based on MJG's suggestion]
Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: James McCoy <vega.james@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
On some systems (like OS X), if sed encounters input without
a trailing newline, it will silently add it. As a result,
"git filter-branch" on such systems may silently rewrite
commit messages that omit a trailing newline. Even though
this is not something we generate ourselves with "git
commit", it's better for filter-branch to preserve the
original data as closely as possible.
We're using sed here only to strip the header fields from
the commit object. We can accomplish the same thing with a
shell loop. Since shell "read" calls are slow (usually one
syscall per byte), we use "cat" once we've skipped past the
header. Depending on the size of your commit messages, this
is probably faster (you pay the cost to fork, but then read
the data in saner-sized chunks). This idea is shamelessly
stolen from Junio.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When multiple parents of a merge commit get mapped to the same
commit, filter-branch used to pass all instances of the parent
commit to the parent and commit filters and to "git commit-tree" or
"git_commit_non_empty_tree".
This can often happen when extracting a small project from a large
repository; merges can join history with no commits on any branch
which affect the paths being retained. Once the intermediate
commits have been filtered out, all the immediate parents of the
merge commit can end up being mapped to the same commit - either the
original merge-base or an ancestor of it.
"git commit-tree" would display an error but write the commit with
the normalized parents in any case. "git_commit_non_empty_tree"
would fail to notice that the commit being made was in fact a
non-merge commit and would retain it even if a further pass with
"--prune-empty" would discard the commit as empty.
Ensure that duplicate parents are pruned before the parent filter to
make "--prune-empty" idempotent, removing all empty non-merge
commits in a singe pass.
Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When used with "-d temporary-directory" option, "git filter-branch"
failed to come back to the original working tree to perform the
final clean-up procedure.
* jk/filter-branch-come-back-to-original:
filter-branch: return to original dir after filtering
The first thing filter-branch does is to create a temporary
directory, either ".git-rewrite" in the current directory
(which may be the working tree or the repository if bare),
or in a directory specified by "-d". We then chdir to
$tempdir/t as our temporary working directory in which to run
tree filters.
After finishing the filter, we then attempt to go back to
the original directory with "cd ../..". This works in the
.git-rewrite case, but if "-d" is used, we end up in a
random directory. The only thing we do after this chdir is
to run git-read-tree, but that means that:
1. The working directory is not updated to reflect the
filtered history.
2. We dump random files into "$tempdir/.." (e.g., if you
use "-d /tmp/foo", we dump junk into /tmp).
Fix it by recording the full path to the original directory
and returning there explicitly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This saves us some code, but it also reduces the number of
processes we start for each filtered commit. Since we can
parse both author and committer in the same sed invocation,
we save one process. And since the new interface avoids tr,
we save 4 processes.
It also avoids using "tr", which has had some odd
portability problems reported with from Solaris's xpg6
version.
We also tweak one of the tests in t7003 to double-check that
we are properly exporting the variables (because test-lib.sh
exports GIT_AUTHOR_NAME, it will be automatically exported
in subprograms. We override this to make sure that
filter-branch handles it properly itself).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Running filter-branch on a history that has a commit with timestamp
at epoch used to fail, but it should have been fixed. Add test to
make sure it won't break again.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
t7003-filter-branch.sh had a make_commit() function that was identical
to test_commit() in test-lib.sh except that it used tr to create a
lowercase file name from the uppercase branch name instead of
appending ".t".
Not only is this unneeded code duplication, it also was something
simply waiting to fail on case-insensitive file systems. So replace
all uses of make_commit with test_commit.
While we're editing the setup, chain it together with && so that
failures early in the sequence don't get lost and add a commit graph.
Signed-off-by: Brian Gernhardt <brian@gernhardtsoftware.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We can be clever and know by ourselves when we need the behavior
implied by "--remap-to-ancestor". No need to encumber users by having
them exposed to it as a tunable. (Option kept for backward compatibility,
but it's now a no-op.)
Signed-off-by: Csaba Henk <csaba@gluster.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The sed utilities on IRIX and Solaris do not interpret the sequence '\t'
to mean a tab character; they read a literal character 't'. So, use a
literal tab instead.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The test would not fail if the filtering failed to do anything, since
in
test -z "$(git diff HEAD directorymoved:newsubdir)"'
'directorymoved:newsubdir' is not valid, so git-diff fails without
printing anything on stdout. But then the exit status of git-diff is
lost, whereas test -z "" succeeds.
Use 'git diff --exit-code' instead, which does the right thing and has
the added bonus of showing the differences if there are any.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add tests to make sure that:
1) a submodule can be removed and its content replaced with regular files
('rewrite submodule with another content'). This test passes only with
the previous patch applied.
2) it is possible to replace submodule revision by direct index
manipulation ('replace submodule revision'). Although it would be
better to run such a filter in --index-filter, this test shows that
this functionality is not broken by the previous patch. This succeeds
both with and without the previous patch.
Signed-off-by: Michal Sojka <sojkam1@fel.cvut.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since a0e4639 (filter-branch: fix ref rewriting with
--subdirectory-filter, 2008-08-12) git-filter-branch has done
nearest-ancestor rewriting when using a --subdirectory-filter.
However, that rewriting strategy is also a useful building block in
other tasks. For example, if you want to split out a subset of files
from your history, you would typically call
git filter-branch -- <refs> -- <files>
But this fails for all refs that do not point directly to a commit
that affects <files>, because their referenced commit will not be
rewritten and the ref remains untouched.
The code was already there for the --subdirectory-filter case, so just
introduce an option that enables it independently.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The improved error handling catches a bug in filter-branch when using
-d pointing to a path outside any git repository:
$ git filter-branch -d /tmp/foo master
fatal: Not a git repository (or any of the parent directories): .git
This error message comes from git for-each-ref in line 224. GIT_DIR is
set correctly by git-sh-setup (to the foo.git repository), but not
exported (yet).
Signed-off-by: Lars Noschinski <lars@public.noschinski.de>
Acked-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
9273b56 (filter-branch: Fix fatal error on bare repositories, 2009-02-03)
fixed a missing check of return status from an underlying command in
git-filter-branch, but there still are places that do not check errors.
For example, the command does not pay attention to the exit status of the
command given by --commit-filter. It should abort in such a case.
This attempts to fix all the remaining places that fails to checks errors.
In two places, I've had to break apart pipelines in order to check the
error code for the first stage of the pipeline, as discussed here:
http://kerneltrap.org/mailarchive/git/2009/1/28/4835614
Feedback on this patch was provided by Johannes Sixt, Johannes Schindelin
and Junio C Hamano. Thomas Rast helped with pipeline error handling.
Signed-off-by: Eric Kidd <git@randomhacks.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* js/filter-branch-submodule:
filter-branch: do not consider diverging submodules a 'dirty worktree'
filter-branch: Fix fatal error on bare repositories
When git filter-branch is run on a bare repository, it prints out a fatal
error message:
$ git filter-branch branch
Rewrite 476c4839280c219c2317376b661d9d95c1727fc3 (9/9)
WARNING: Ref 'refs/heads/branch' is unchanged
fatal: This operation must be run in a work tree
Note that this fatal error message doesn't prevent git filter-branch from
exiting successfully. (Why doesn't git filter-branch actually exit with an
error when a shell command fails? I'm not sure why it was designed this
way.)
This error message is caused by the following section of code at the end of
git-filter-branch.sh:
if [ "$(is_bare_repository)" = false ]; then
unset GIT_DIR GIT_WORK_TREE GIT_INDEX_FILE
test -z "$ORIG_GIT_DIR" || {
GIT_DIR="$ORIG_GIT_DIR" && export GIT_DIR
}
... elided ...
git read-tree -u -m HEAD
fi
The problem is the call to $(is_bare_repository), which is made before
GIT_DIR and GIT_WORK_TREE are restored. This call always returns "false",
even when we're running in a bare repository. But this means that we will
attempt to call 'git read-tree' even in a bare repository, which will fail
and print an error.
This patch modifies git-filter-branch.sh to restore the original
environment variables before trying to call is_bare_repository.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
git_commit_non_empty_tree is added to the functions that can be run from
commit filters. Its effect is to commit only commits actually touching the
tree and that are not merge points either.
The option --prune-empty is added. It defaults the commit-filter to
'git_commit_non_empty_tree "$@"', and can be used with any other
combination of filters, except --commit-hook that must used
'git_commit_non_empty_tree "$@"' where one puts 'git commit-tree "$@"'
usually to achieve the same result.
Signed-off-by: Pierre Habouzit <madcoder@debian.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* tr/filter-branch:
revision --simplify-merges: make it a no-op without pathspec
revision --simplify-merges: do not leave commits unprocessed
revision --simplify-merges: use decoration instead of commit->util field
Documentation: rev-list-options: move --simplify-merges documentation
filter-branch: use --simplify-merges
filter-branch: fix ref rewriting with --subdirectory-filter
filter-branch: Extend test to show rewriting bug
Topo-sort before --simplify-merges
revision traversal: show full history with merge simplification
revision.c: whitespace fix
The tag rewriting code used a 'sed' expression to substitute the new tag
name into the corresponding field of the annotated tag object. But this is
problematic if the tag name contains special characters. In particular,
if the tag name contained a slash, then the 'sed' expression had a syntax
error. We now protect against this by using 'printf' to assemble the
tag header.
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The previous ancestor discovery code failed on any refs that are
(pre-rewrite) ancestors of commits marked for rewriting. This means
that in a situation
A -- B(topic) -- C(master)
where B is dropped by --subdirectory-filter pruning, the 'topic' was
not moved up to A as intended, but left unrewritten because we asked
about 'git rev-list ^master topic', which does not return anything.
Instead, we use the straightforward
git rev-list -1 $ref -- $filter_subdir
to find the right ancestor. To justify this, note that the nearest
ancestor is unique: We use the output of
git rev-list --parents -- $filter_subdir
to rewrite commits in the first pass, before any ref rewriting. If B
is a non-merge commit, the only candidate is its parent. If it is a
merge, there are two cases:
- All sides of the merge bring the same subdirectory contents. Then
rev-list already pruned away the merge in favour for just one of its
parents, so there is only one candidate.
- Some merge sides, or the merge outcome, differ. Then the merge is
not pruned and can be rewritten directly.
So it is always safe to use rev-list -1.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This extends the --subdirectory-filter test in t7003 to demonstrate a
rewriting bug: when rewriting two refs A and B such that B is an
ancestor of A, it fails to rewrite B.
The underlying issue is that the rev-list invocation at
git-filter-branch.sh:332 more or less boils down to
git rev-list B --boundary ^A
which outputs nothing because B is an ancestor of A.
Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Commit 46eb449c restricted git-filter-branch to non-bare repositories
unnecessarily; git-filter-branch can work on bare repositories just
fine.
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Petr Baudis <pasky@suse.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This reverts commit cfabd6eee1. I had
implemented it without understanding what --full-history does. Consider
this history:
C--M--N
/ / /
A--B /
\ /
D-/
where B and C modify a path, X, in the same way so that the result is
identical, and D does not modify it at all. With the path limiter X and
without --full-history this is simplified to
A--B
i.e. only one of the paths via B or C is chosen. I had assumed that
--full-history would keep both paths like this
C--M
/ /
A--B
removing the path via D; but in fact it keeps the entire history.
Currently, git does not have the capability to simplify to this
intermediary case. However, the other extreme to keep the entire history
is not wanted either in usual cases. I think we can expect that histories
like the above are rare, and in the usual cases we want a simplified
history. So let's remove --full-history again.
(Concerning t7003, subsequent tests depend on what the test case sets up,
so we can't just back out the entire test case.)
Signed-off-by: Johannes Sixt <johannes.sixt@telecom.at>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As a general principle, we should not use "git diff" to validate the
results of what git command that is being tested has done. We would not
know if we are testing the command in question, or locating a bug in the
cute hack of "git diff --no-index".
Rather use test_cmp for that purpose.
Signed-off-by: Junio C Hamano <gitster@pobox.com>