Commit graph

641 commits

Author SHA1 Message Date
Jakub Narebski d8faea9d18 diff: strip extra "/" when stripping prefix
There are two ways a user might want to use "diff --relative":

  1. For a file in a directory, like "subdir/file", the user
     can use "--relative=subdir/" to strip the directory.

  2. To strip part of a filename, like "foo-10", they can
     use "--relative=foo-".

We currently handle both of those situations. However, if the user passes
"--relative=subdir" (without the trailing slash), we produce inconsistent
results. For the unified diff format, we collapse the double-slash of
"a//file" correctly into "a/file". But for other formats (raw, stat,
name-status), we end up with "/file".

We can do what the user means here and strip the extra "/" (and only a
slash).  We are not hurting any existing users of (2) above with this
behavior change because the existing output for this case was nonsensical.

Patch by Jakub, tests and commit message by Jeff King.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-11 09:46:47 -07:00
Junio C Hamano bb89e84f95 Merge branch 'sv/maint-diff-q-clear-fix' into maint
* sv/maint-diff-q-clear-fix:
  Fix DIFF_QUEUE_CLEAR refactoring
2010-08-03 15:17:34 -07:00
Junio C Hamano ee38d823f7 Fix DIFF_QUEUE_CLEAR refactoring
It introduced a macro to reduce repeated assignments to three fields,
but an unrelated and incorrect change snuck in by mistake, which broke
commands like "git diff-files -p --submodule".

Noticed by Sven Verdoolaege.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-08-02 08:30:02 -07:00
Bo Yang e13f38a33e diff.c: fix a graph output bug
When --graph is in effect, the line-prefix typically has colored graph
line segments and ends with reset.  The color sequence "set" given to
this function is for showing the metainfo part of the patch text and
(1) it should not be applied to the graph lines, and (2) it will be
reset at the end of line_prefix so it won't be in effect anyway.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-07-08 18:09:14 -07:00
Junio C Hamano a76b2084fb Merge branch 'jl/status-ignore-submodules'
* jl/status-ignore-submodules:
  Add the option "--ignore-submodules" to "git status"
  git submodule: ignore dirty submodules for summary and status

Conflicts:
	builtin/commit.c
	t/t7508-status.sh
	wt-status.c
	wt-status.h
2010-06-30 11:55:39 -07:00
Junio C Hamano e1165dd144 Merge branch 'jl/maint-diff-ignore-submodules'
* jl/maint-diff-ignore-submodules:
  t4027,4041: Use test -s to test for an empty file
  Add optional parameters to the diff option "--ignore-submodules"
  git diff: rename test that had a conflicting name
2010-06-30 11:55:37 -07:00
Junio C Hamano 4af574dbdc Merge branch 'ab/blame-textconv'
* ab/blame-textconv:
  t/t8006: test textconv support for blame
  textconv: support for blame
  textconv: make the API public

Conflicts:
	diff.h
2010-06-27 12:07:44 -07:00
Jens Lehmann 46a958b3da Add the option "--ignore-submodules" to "git status"
In some use cases it is not desirable that "git status" considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is not wanted to have submodules show up as changed when they
just contain changes to their work tree (this was the behavior before
1.7.0). An example for that are scripts which just want to check for
submodule commits while ignoring any changes to the work tree. Also users
having large submodules known not to change might want to use this option,
as the - sometimes substantial - time it takes to scan the submodule work
tree(s) is saved when using the "dirty" parameter.

And if you want to ignore any changes to submodules, you can now do that
by using this option without parameters or with "all" (when the config
option status.submodulesummary is set, using "all" will also suppress the
output of the submodule summary).

A new function handle_ignore_submodules_arg() is introduced to parse this
option new to "git status" in a single location, as "git diff" already
knew it.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-25 11:30:25 -07:00
Junio C Hamano 262657dce6 Merge branch 'maint'
* maint:
  Update draft release notes to 1.7.1.1
  tests: remove unnecessary '^' from 'expr' regular expression

Conflicts:
	diff.c
2010-06-22 09:35:36 -07:00
Junio C Hamano 3c656899cd Merge branch 'cc/maint-diff-CC-binary' into maint
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-22 09:27:07 -07:00
Junio C Hamano cb2af93ac1 Merge branch 'bw/diff-metainfo-color' into maint
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-21 05:40:10 -07:00
Junio C Hamano 60335534a6 Merge branch 'rs/diff-no-minimal' into maint
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-21 05:38:50 -07:00
Junio C Hamano 5977744d04 Merge branch 'cc/maint-diff-CC-binary'
* cc/maint-diff-CC-binary:
  diff: fix "git show -C -C" output when renaming a binary file

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano 98ad90fbab Merge branch 'by/diff-graph'
* by/diff-graph:
  Make --color-words work well with --graph
  graph.c: register a callback for graph output
  Emit a whole line in one go
  diff.c: Output the text graph padding before each diff line
  Output the graph columns at the end of the commit message
  Add a prefix output callback to diff output

Conflicts:
	diff.c
2010-06-18 11:16:57 -07:00
Junio C Hamano 18fd805583 Merge branch 'jh/diff-index-line-abbrev'
* jh/diff-index-line-abbrev:
  diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s

Conflicts:
	diff.c
2010-06-18 11:16:56 -07:00
Junio C Hamano 2621ac50cc Merge branch 'ec/diff-noprefix-config'
* ec/diff-noprefix-config:
  diff: add configuration option for disabling diff prefixes.
2010-06-18 11:16:55 -07:00
Junio C Hamano 448598b508 Merge branch 'bw/diff-metainfo-color'
* bw/diff-metainfo-color:
  diff: fix coloring of extended diff headers
2010-06-13 11:21:25 -07:00
Junio C Hamano 39b5977b13 Merge branch 'rs/diff-no-minimal'
* rs/diff-no-minimal:
  git diff too slow for a file
2010-06-13 11:20:46 -07:00
Jens Lehmann dd44d419d3 Add optional parameters to the diff option "--ignore-submodules"
In some use cases it is not desirable that the diff family considers
submodules that only contain untracked content as dirty. This may happen
e.g. when the submodule is not under the developers control and not all
build generated files have been added to .gitignore by the upstream
developers. Using the "untracked" parameter for the "--ignore-submodules"
option disables checking for untracked content and lets git diff report
them as changed only when they have new commits or modified content.

Sometimes it is not wanted to have submodules show up as changed when they
just contain changes to their work tree. An example for that are scripts
which just want to check for submodule commits while ignoring any changes
to the work tree. Also users having large submodules known not to change
might want to use this option, as the - sometimes substantial - time it
takes to scan the submodule work tree(s) is saved.

Signed-off-by: Jens Lehmann <Jens.Lehmann@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:33:17 -07:00
Axel Bonnet a788d7d58b textconv: make the API public
The textconv functionality allows one to convert a file into text before
running diff. But this functionality can be useful to other features
such as blame.

Signed-off-by: Axel Bonnet <axel.bonnet@ensimag.imag.fr>
Signed-off-by: Clément Poulain <clement.poulain@ensimag.imag.fr>
Signed-off-by: Diane Gasselin <diane.gasselin@ensimag.imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-11 13:17:57 -07:00
Christian Couder 296c6bb21a diff: fix "git show -C -C" output when renaming a binary file
A bug was introduced in 3e97c7c6af
(No diff -b/-w output for all-whitespace changes, Nov 19 2009)
that made the lines:

  diff --git a/bar b/sub/bar
  similarity index 100%
  rename from bar
  rename to sub/bar

disappear from "git show -C -C" output when file bar is a binary
file.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-06-06 15:14:27 -07:00
Bo Yang 4297c0aeb5 Make --color-words work well with --graph
'--color-words' algorithm can be described as:

  1. collect a the minus/plus lines of a diff hunk, divided into
     minus-lines and plus-lines;

  2. break both minus-lines and plus-lines into words and
     place them into two mmfile_t with one word for each line;

  3. use xdiff to run diff on the two mmfile_t to get the words level diff;

And for the common parts of the both file, we output the plus side text.
diff_words->current_plus is used to trace the current position of the plus file
which printed. diff_words->last_minus is used to trace the last minus word
printed.

For '--graph' to work with '--color-words', we need to output the graph prefix
on each line of color words output. Generally, there are two conditions on
which we should output the prefix.

  1. diff_words->last_minus == 0 &&
     diff_words->current_plus == diff_words->plus.text.ptr

     that is: the plus text must start as a new line, and if there is no minus
     word printed, a graph prefix must be printed.

  2. diff_words->current_plus > diff_words->plus.text.ptr &&
     *(diff_words->current_plus - 1) == '\n'

     that is: a graph prefix must be printed following a '\n'

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:20 -07:00
Bo Yang 2efcc97764 Emit a whole line in one go
Since the graph prefix will be printed when calling
emit_line, so the functions should be used to emit a
complete line out once a time. No one should call
emit_line to just output some strings instead of a
complete line.
Use a strbuf to compose the whole line, and then
call emit_line to output it once.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:02:04 -07:00
Bo Yang 7be5761073 diff.c: Output the text graph padding before each diff line
Change output from diff with -p/--dirstat/--binary/--numstat/--stat/
--shortstat/--check/--summary options to align with graph paddings.

Thanks Jeff King <peff@peff.net> for reporting the '--summary' bug and
his initial patch.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Bo Yang a3c158d4a5 Add a prefix output callback to diff output
The callback can be used to add some prefix string to each line of
diff output.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 18:00:21 -07:00
Johan Herland 3e5a188f1d diff.c: Ensure "index $from..$to" line contains unambiguous SHA1s
In the metainfo section of git diffs there's an "index" line providing
abbreviated (unless --full-index is used) blob SHA1s from the
pre-/post-images used to generate the diff. These provide hints that
can be used to reconstruct a 3-way merge when applying the patch
(see the --3way option to 'git am' for more details).

In order for this to work, however, the blob SHA1s must not be
abbreviated into ambiguity.

This patch eliminates the possible ambiguity by using find_unique_abbrev()
to produce the abbreviated SHA1s (instead of blind abbreviation by way of
"%.*s").

A testcase verifying the fix is also included.

Signed-off-by: Johan Herland <johan@herland.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-31 17:44:01 -07:00
Junio C Hamano 82c531b3b6 Merge branch 'by/log-follow'
* by/log-follow:
  tests: rename duplicate t4205
  Make git log --follow find copies among unmodified files.
  Make diffcore_std only can run once before a diff_flush
  Add a macro DIFF_QUEUE_CLEAR.
2010-05-21 04:02:23 -07:00
Junio C Hamano 1bdd46cd3a Merge branch 'tr/word-diff'
* tr/word-diff:
  diff: add --word-diff option that generalizes --color-words

Conflicts:
	diff.c
2010-05-21 04:02:17 -07:00
Bert Wesarg 374664478f diff: fix coloring of extended diff headers
Coloring the extended headers where done as a whole not per line. less with
option -R (which is the default from git) does not support this coloring
mode because of performance reasons. The -r option would be an alternative
but has problems with lines that are longer than the screen. Therefore
stick to the idiom to color each line separately. The problem is, that the
result of ill_metainfo() will also be used as an parameter to an external
diff driver, so we need to disable coloring in this case.

Because coloring is now done inside fill_metainfo() we can simply add this
string to the diff header and therefore keep the last newline in the
extended header. This results also into the fact that the external diff
driver now gets this last newline too. Which is a change in behavior
but a good one.

Signed-off-by: Bert Wesarg <bert.wesarg@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-19 21:06:40 -07:00
Will Palmer 1c9eecff97 diff-options: make --patch a synonym for -p
Here we simply make --patch a synonym for -p, whose mnemonic was "patch"
all along.

Signed-off-by: Will Palmer <wmpalmer@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:50:03 -07:00
Eli Collins f89504ddb9 diff: add configuration option for disabling diff prefixes.
With new configuration "diff.noprefix", "git diff" does not show a source or destination prefix ala "git diff --no-prefix".

Signed-off-by: Eli Collins <eli@cloudera.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-18 21:31:51 -07:00
Junio C Hamano dd75d07899 Merge branch 'jk/cached-textconv'
* jk/cached-textconv:
  diff: avoid useless filespec population
  diff: cache textconv output
  textconv: refactor calls to run_textconv
  introduce notes-cache interface
  make commit_tree a library function
2010-05-08 22:33:08 -07:00
Bo Yang 1da6175d43 Make diffcore_std only can run once before a diff_flush
When file renames/copies detection is turned on, the
second diffcore_std will degrade a 'C' pair to a 'R' pair.

And this may happen when we run 'git log --follow' with
hard copies finding. That is, the try_to_follow_renames()
will run diffcore_std to find the copies, and then
'git log' will issue another diffcore_std, which will reduce
'src->rename_used' and recognize this copy as a rename.
This is not what we want.

So, I think we really don't need to run diffcore_std more
than one time.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:28 -07:00
Bo Yang 9ca5df9061 Add a macro DIFF_QUEUE_CLEAR.
Refactor the diff_queue_struct code, this macro help
to reset the structure.

Signed-off-by: Bo Yang <struggleyb.nku@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-07 09:34:27 -07:00
Junio C Hamano 6b6f5d4664 Merge branch 'maint-1.7.0' into maint
* maint-1.7.0:
  remove ecb parameter from xdi_diff_outf()
2010-05-04 15:20:47 -07:00
René Scharfe dfea79004c remove ecb parameter from xdi_diff_outf()
xdi_diff_outf() overrides the structure members of its last parameter,
ignoring any value that callers pass in.  It's no surprise then that all
callers pass a pointer to an uninitialized structure.  They also don't
read it after the call, so the parameter is neither used for input nor
for output.   Turn it into a local variable of xdi_diff_outf().

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-04 15:19:14 -07:00
René Scharfe 582aa00bdf git diff too slow for a file
Ever since the xdiff library had been introduced to git, all its callers
have used the flag XDF_NEED_MINIMAL.  It makes sure that the smallest
possible diff is produced, but that takes quite some time if there are
lots of differences that can be expressed in multiple ways.

This flag makes a difference for only 0.1% of the non-merge commits in
the git repo of Linux, both in terms of diff size and execution time.
The patches there are mostly nice and small.

SungHyun Nam however reported a case in a different repo where a diff
took more than 20 times longer to generate with XDF_NEED_MINIMAL than
without.  Rebasing became really slow.

This patch removes this flag from all callers.  The default of xdiff is
saner because it has minimal to no impact in the normal case of small
diffs and doesn't incur that much of a speed penalty for large ones.

A follow-up patch may introduce a command line option to set the flag if
the user needs it, similar to GNU diff's -d/--minimal.

Signed-off-by: Rene Scharfe <rene.scharfe@lsrfire.ath.cx>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-05-02 07:59:50 -07:00
Junio C Hamano 4fd8145c0c Merge branch 'jk/maint-diffstat-overflow' into maint
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-22 22:29:13 -07:00
Junio C Hamano bc32d342c2 Merge branch 'jk/maint-diffstat-overflow'
* jk/maint-diffstat-overflow:
  diff: use large integers for diffstat calculations
2010-04-18 21:31:50 -07:00
Jeff King 0974c117ff diff: use large integers for diffstat calculations
The diffstat "added" and "changed" fields generally store
line counts; however, for binary files, they store file
sizes. Since we store and print these values as ints, a
diffstat on a file larger than 2G can show a negative size.
Instead, let's use uintmax_t, which should be at least 64
bits on modern platforms.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-17 11:30:21 -07:00
Thomas Rast 882749a04f diff: add --word-diff option that generalizes --color-words
This teaches the --color-words engine a more general interface that
supports two new modes:

* --word-diff=plain, inspired by the 'wdiff' utility (most similar to
  'wdiff -n <old> <new>'): uses delimiters [-removed-] and {+added+}

* --word-diff=porcelain, which generates an ad-hoc machine readable
  format:
  - each diff unit is prefixed by [-+ ] and terminated by newline as
    in unified diff
  - newlines in the input are output as a line consisting only of a
    tilde '~'

Both of these formats still support color if it is enabled, using it
to highlight the differences.  --color-words becomes a synonym for
--word-diff=color, which is the color-only format.  Also adds some
compatibility/convenience options.

Thanks to Junio C Hamano and Miles Bader for good ideas.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-14 10:56:53 -07:00
Junio C Hamano daaf2e8892 Merge branch 'jc/conflict-marker-size' into maint
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-09 22:38:34 -07:00
Junio C Hamano 7ec1eb93f7 Merge early parts of jk/cached-textconv 2010-04-08 23:31:51 -07:00
Junio C Hamano aed6ca52e7 diff.c: work around pointer constness warnings
The textconv leak fix introduced two invocations of free() to release
memory pointed by "const char *", which get annoying compiler warning.
2010-04-08 23:30:49 -07:00
Junio C Hamano 3f3f8d9d09 Merge branch 'jc/conflict-marker-size'
* jc/conflict-marker-size:
  diff --check: honor conflict-marker-size attribute
2010-04-06 14:50:46 -07:00
Jeff King b337398266 diff: avoid useless filespec population
builtin_diff calls fill_mmfile fairly early, which in turn
calls diff_populate_filespec, which actually retrieves the
file's blob contents into a buffer. Long ago, this was
sensible as we would need to look at the blobs eventually.

These days, however, we may not ever want those blobs if we
end up using a textconv cache, and for large binary files
(exactly the sort for which you might have a textconv
cache), just retrieving the objects can be costly.

This patch just pushes the fill_mmfile call a bit later, so
we can avoid populating the filespec in some cases.  There
is one thing to note that looks like a bug but isn't. We
push the fill_mmfile down into the first branch of a
conditional. It seems like we would need it on the other
branch, too, but we don't; fill_textconv does it for us (in
fact, before this, we were just writing over the results of
the fill_mmfile on that branch).

Here's a timing sample on a commit with 45 changed jpgs and
avis. The result is fully textconv cached, but we still
wasted a lot of time just pulling the blobs from storage.
The total size of the blobs (source and dest) is about
180M.

  [before]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

  [after]
  $ time git show >/dev/null
  real    0m0.009s
  user    0m0.004s
  sys     0m0.004s

And that's on a warm cache. On a cold cache, the "after"
case is not much worse, but the "before" case has to do an
extra 180M of I/O.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:11:20 -07:00
Jeff King d9bae1a178 diff: cache textconv output
Running a textconv filter can take a long time. It's
particularly bad for a large file which needs to be spooled
to disk, but even for small files, the fork+exec overhead
can add up for something like "git log -p".

This patch uses the notes-cache mechanism to keep a fast
cache of textconv output. Caches are stored in
refs/notes/textconv/$x, where $x is the userdiff driver
defined in gitattributes.

Caching is enabled only if diff.$x.cachetextconv is true.

In my test repo, on a commit with 45 jpg and avi files
changed and a textconv to show their exif tags:

  [before]
  $ time git show >/dev/null
  real    0m13.724s
  user    0m12.057s
  sys     0m1.624s

  [after, first run]
  $ git config diff.mfo.cachetextconv true
  $ time git show >/dev/null
  real    0m14.252s
  user    0m12.197s
  sys     0m1.800s

  [after, subsequent runs]
  $ time git show >/dev/null
  real    0m0.352s
  user    0m0.148s
  sys     0m0.200s

So for a slight (3.8%) cost on the first run, we achieve an
almost 40x speed up on subsequent runs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:05:31 -07:00
Jeff King 840383b2c2 textconv: refactor calls to run_textconv
This patch adds a fill_textconv wrapper, which centralizes
some minor logic like error checking and handling the case
of no-textconv.

In addition to dropping the number of lines, this will make
it easier in future patches to handle multiple types of
textconv.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-02 00:01:57 -07:00
Jeff King b76c056b95 fix textconv leak in emit_rewrite_diff
We correctly free() for the normal diff case, but leak for
rewrite diffs.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-04-01 23:49:29 -07:00
Junio C Hamano 890a13a452 Sync with 1.7.0.4
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-03-31 15:14:27 -07:00