Commit graph

1036 commits

Author SHA1 Message Date
Junio C Hamano abecddea25 Merge branch 'jc/diff-ws-error-highlight'
A hotfix to a new feature in 2.5.0-rc.

* jc/diff-ws-error-highlight:
  diff: parse ws-error-highlight option more strictly
2015-07-15 12:30:14 -07:00
René Scharfe 3f4f17b51b diff: parse ws-error-highlight option more strictly
Check if a matched token is followed by a delimiter before advancing the
pointer arg.  This avoids accepting composite words like "allnew" or
"defaultcontext" and misparsing them as "new" or "context".

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-12 09:55:23 -07:00
David Turner 076c98372e log: add "log.follow" configuration variable
People who work on projects with mostly linear history with frequent
whole file renames may want to always use "git log --follow" when
inspecting the life of the content that live in a single path.

Teach the command to behave as if "--follow" was given from the
command line when log.follow configuration variable is set *and*
there is one (and only one) path on the command line.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-09 10:24:23 -07:00
Junio C Hamano 6998d890c7 Merge branch 'jk/color-diff-plain-is-context' into maint
"color.diff.plain" was a misnomer; give it 'color.diff.context' as
a more logical synonym.

* jk/color-diff-plain-is-context:
  diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT
  diff: accept color.diff.context as a synonym for "plain"
2015-06-25 11:02:11 -07:00
Junio C Hamano db65170ee5 Merge branch 'jk/color-diff-plain-is-context'
"color.diff.plain" was a misnomer; give it 'color.diff.context' as
a more logical synonym.

* jk/color-diff-plain-is-context:
  diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT
  diff: accept color.diff.context as a synonym for "plain"
2015-06-11 09:29:53 -07:00
Junio C Hamano 709cd912d4 Merge branch 'jc/diff-ws-error-highlight'
Allow whitespace breakages in deleted and context lines to be also
painted in the output.

* jc/diff-ws-error-highlight:
  diff.c: --ws-error-highlight=<kind> option
  diff.c: add emit_del_line() and emit_context_line()
  t4015: separate common setup and per-test expectation
  t4015: modernise style
2015-06-11 09:29:51 -07:00
Jeff King 8dbf3eb685 diff.h: rename DIFF_PLAIN color slot to DIFF_CONTEXT
The latter is a much more descriptive name (and we support
"color.diff.context" now). This also updates the name of any
local variables which were used to store the color.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-27 13:54:42 -07:00
Jeff King 74b15bfbf6 diff: accept color.diff.context as a synonym for "plain"
The term "plain" is a bit ambiguous; let's allow the more
specific "context", but keep "plain" around for
compatibility.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-27 13:54:37 -07:00
Junio C Hamano b8767f791c diff.c: --ws-error-highlight=<kind> option
Traditionally, we only cared about whitespace breakages introduced
in new lines.  Some people want to paint whitespace breakages on old
lines, too.  When they see a whitespace breakage on a new line, they
can spot the same kind of whitespace breakage on the corresponding
old line and want to say "Ah, those breakages are there but they
were inherited from the original, so let's not touch them for now."

Introduce `--ws-error-highlight=<kind>` option, that lets them pass
a comma separated list of `old`, `new`, and `context` to specify
what lines to highlight whitespace errors on.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-26 23:00:01 -07:00
Junio C Hamano 0e383e185a diff.c: add emit_del_line() and emit_context_line()
Traditionally, we only had emit_add_line() helper, which knows how
to find and paint whitespace breakages on the given line, because we
only care about whitespace breakages introduced in new lines.  The
context lines and old (i.e. deleted) lines are emitted with a
simpler emit_line_0() that paints the entire line in plain or old
colors.

Identify callers of emit_line_0() that show deleted lines and
context lines, have them call new helpers, emit_del_line() and
emit_context_line(), so that we can later tweak what is done to
these two classes of lines.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-26 22:13:02 -07:00
Junio C Hamano a393c6bfd9 Merge branch 'rs/deflate-init-cleanup' into maint
Code simplification.

* rs/deflate-init-cleanup:
  zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}
2015-03-23 11:23:38 -07:00
Junio C Hamano 6902c4da58 Merge branch 'rs/deflate-init-cleanup'
Code simplification.

* rs/deflate-init-cleanup:
  zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}
2015-03-17 16:01:26 -07:00
Junio C Hamano a4b4f9b8e3 Merge branch 'mk/diff-shortstat-dirstat-fix' into maint
"git diff --shortstat --dirstat=changes" showed a dirstat based on
lines that was never asked by the end user in addition to the
dirstat that the user asked for.

* mk/diff-shortstat-dirstat-fix:
  diff --shortstat --dirstat: remove duplicate output
2015-03-13 22:56:04 -07:00
Junio C Hamano b6488fe191 Merge branch 'mk/diff-shortstat-dirstat-fix'
"git diff --shortstat --dirstat=changes" showed a dirstat based on
lines that was never asked by the end user in addition to the
dirstat that the user asked for.

* mk/diff-shortstat-dirstat-fix:
  diff --shortstat --dirstat: remove duplicate output
2015-03-06 15:02:29 -08:00
René Scharfe 9a6f1287fb zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw}
Clear the git_zstream variable at the start of git_deflate_init() etc.
so that callers don't have to do that.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05 15:46:03 -08:00
Mårten Kongstad ab27389aff diff --shortstat --dirstat: remove duplicate output
When --shortstat is used in conjunction with --dirstat=changes, git diff will
output the dirstat information twice: first as calculated by the 'lines'
algorithm, then as calculated by the 'changes' algorithm:

    $ git diff --dirstat=changes,10 --shortstat v2.2.0..v2.2.1
     23 files changed, 453 insertions(+), 54 deletions(-)
      33.5% Documentation/RelNotes/
      26.2% t/
      46.6% Documentation/RelNotes/
      16.6% t/

The same duplication happens for --shortstat together with --dirstat=files, but
not for --shortstat together with --dirstat=lines.

Limit output to only include one dirstat part, calculated as specified
by the --dirstat parameter. Also, add test for this.

Signed-off-by: Mårten Kongstad <marten.kongstad@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-02 11:31:27 -08:00
Junio C Hamano b946576839 Merge branch 'jn/parse-config-slot'
Code cleanup.

* jn/parse-config-slot:
  color_parse: do not mention variable name in error message
  pass config slots as pointers instead of offsets
2014-10-20 12:23:48 -07:00
Jeff King f6c5a2968c color_parse: do not mention variable name in error message
Originally the color-parsing function was used only for
config variables. It made sense to pass the variable name so
that the die() message could be something like:

  $ git -c color.branch.plain=bogus branch
  fatal: bad color value 'bogus' for variable 'color.branch.plain'

These days we call it in other contexts, and the resulting
error messages are a little confusing:

  $ git log --pretty='%C(bogus)'
  fatal: bad color value 'bogus' for variable '--pretty format'

  $ git config --get-color foo.bar bogus
  fatal: bad color value 'bogus' for variable 'command line'

This patch teaches color_parse to complain only about the
value, and then return an error code. Config callers can
then propagate that up to the config parser, which mentions
the variable name. Other callers can provide a custom
message. After this patch these three cases now look like:

  $ git -c color.branch.plain=bogus branch
  error: invalid color value: bogus
  fatal: unable to parse 'color.branch.plain' from command-line config

  $ git log --pretty='%C(bogus)'
  error: invalid color value: bogus
  fatal: unable to parse --pretty format

  $ git config --get-color foo.bar bogus
  error: invalid color value: bogus
  fatal: unable to parse default color value

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-14 11:01:21 -07:00
Junio C Hamano bedd3b4b7b Merge branch 'nd/large-blobs'
Teach a few codepaths to punt (instead of dying) when large blobs
that would not fit in core are involved in the operation.

* nd/large-blobs:
  diff: shortcut for diff'ing two binary SHA-1 objects
  diff --stat: mark any file larger than core.bigfilethreshold binary
  diff.c: allow to pass more flags to diff_populate_filespec
  sha1_file.c: do not die failing to malloc in unpack_compressed_entry
  wrapper.c: introduce gentle xmallocz that does not die()
2014-09-11 10:33:33 -07:00
René Scharfe d318027932 run-command: introduce CHILD_PROCESS_INIT
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20 09:53:37 -07:00
Nguyễn Thái Ngọc Duy 1aaf69e669 diff: shortcut for diff'ing two binary SHA-1 objects
If we are given two SHA-1 and asked to determine if they are different
(but not _what_ differences), we know right away by comparing SHA-1.

A side effect of this patch is, because large files are marked binary,
diff-tree will not need to unpack them. 'diff-index --cached' will not
either. But 'diff-files' still does.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18 10:16:55 -07:00
Nguyễn Thái Ngọc Duy 6bf3b81348 diff --stat: mark any file larger than core.bigfilethreshold binary
Too large files may lead to failure to allocate memory. If it happens
here, it could impact quite a few commands that involve
diff. Moreover, too large files are inefficient to compare anyway (and
most likely non-text), so mark them binary and skip looking at their
content.

Noticed-by: Dale R. Worley <worley@alum.mit.edu>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18 10:16:45 -07:00
Nguyễn Thái Ngọc Duy 8e5dd3d654 diff.c: allow to pass more flags to diff_populate_filespec
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-18 10:16:35 -07:00
Junio C Hamano cfececfe1f Merge branch 'bg/xcalloc-nmemb-then-size' into maint
* bg/xcalloc-nmemb-then-size:
  transport-helper.c: rearrange xcalloc arguments
  remote.c: rearrange xcalloc arguments
  reflog-walk.c: rearrange xcalloc arguments
  pack-revindex.c: rearrange xcalloc arguments
  notes.c: rearrange xcalloc arguments
  imap-send.c: rearrange xcalloc arguments
  http-push.c: rearrange xcalloc arguments
  diff.c: rearrange xcalloc arguments
  config.c: rearrange xcalloc arguments
  commit.c: rearrange xcalloc arguments
  builtin/remote.c: rearrange xcalloc arguments
  builtin/ls-remote.c: rearrange xcalloc arguments
2014-07-22 10:25:17 -07:00
René Scharfe cedc61a998 strbuf: use strbuf_addstr() for adding C strings
Avoid code duplication and let strbuf_addstr() call strlen() for us.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-17 13:33:52 -07:00
Junio C Hamano cb4575fb18 Merge branch 'jk/diff-follow-must-take-one-pathspec' into maint
"git format-patch" did not enforce the rule that the "--follow"
option from the log/diff family of commands must be used with
exactly one pathspec.

* jk/diff-follow-must-take-one-pathspec:
  move "--follow needs one pathspec" rule to diff_setup_done
2014-06-25 11:47:23 -07:00
Jeff King 0539cc0038 stat_opt: check extra strlen call
As in earlier commits, the diff option parser uses
starts_with to find that an argument starts with "--stat-",
and then adds strlen("stat-") to find the rest of the
option.

However, in this case the starts_with and the strlen are
separated across functions, making it easy to call the
latter without the former. Let's use skip_prefix instead of
raw pointer arithmetic to catch such a case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:45:19 -07:00
Jeff King 95b567c7c3 use skip_prefix to avoid repeating strings
It's a common idiom to match a prefix and then skip past it
with strlen, like:

  if (starts_with(foo, "bar"))
	  foo += strlen("bar");

This avoids magic numbers, but means we have to repeat the
string (and there is no compiler check that we didn't make a
typo in one of the strings).

We can use skip_prefix to handle this case without repeating
ourselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:44:45 -07:00
Jeff King ae021d8791 use skip_prefix to avoid magic numbers
It's a common idiom to match a prefix and then skip past it
with a magic number, like:

  if (starts_with(foo, "bar"))
	  foo += 3;

This is easy to get wrong, since you have to count the
prefix string yourself, and there's no compiler check if the
string changes.  We can use skip_prefix to avoid the magic
numbers here.

Note that some of these conversions could be much shorter.
For example:

  if (starts_with(arg, "--foo=")) {
	  bar = arg + 6;
	  continue;
  }

could become:

  if (skip_prefix(arg, "--foo=", &bar))
	  continue;

However, I have left it as:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = v;
	  continue;
  }

to visually match nearby cases which need to actually
process the string. Like:

  if (skip_prefix(arg, "--foo=", &v)) {
	  bar = atoi(v);
	  continue;
  }

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-20 10:44:45 -07:00
Jeff King 9e1a5ebe52 parse_diff_color_slot: drop ofs parameter
This function originally took a whole config variable name
("var") and an offset ("ofs"). It checked "var+ofs" against
each color slot, but reported errors using the whole "var".

However, since 8b8e862 (ignore unknown color configuration,
2009-12-12), it returns -1 rather than printing its own
error, and therefore only cares about var+ofs. We can drop
the ofs parameter and teach its sole caller to derive the
pointer itself.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-18 14:56:17 -07:00
Junio C Hamano a634a6d209 Merge branch 'bg/xcalloc-nmemb-then-size'
Like calloc(3), xcalloc() takes nmemb and then size.

* bg/xcalloc-nmemb-then-size:
  transport-helper.c: rearrange xcalloc arguments
  remote.c: rearrange xcalloc arguments
  reflog-walk.c: rearrange xcalloc arguments
  pack-revindex.c: rearrange xcalloc arguments
  notes.c: rearrange xcalloc arguments
  imap-send.c: rearrange xcalloc arguments
  http-push.c: rearrange xcalloc arguments
  diff.c: rearrange xcalloc arguments
  config.c: rearrange xcalloc arguments
  commit.c: rearrange xcalloc arguments
  builtin/remote.c: rearrange xcalloc arguments
  builtin/ls-remote.c: rearrange xcalloc arguments
2014-06-16 12:17:50 -07:00
Junio C Hamano b0e2c999af Merge branch 'jk/diff-follow-must-take-one-pathspec'
* jk/diff-follow-must-take-one-pathspec:
  move "--follow needs one pathspec" rule to diff_setup_done
2014-06-16 10:07:09 -07:00
Junio C Hamano 6779e43b0d Merge branch 'jk/external-diff-use-argv-array'
Code clean-up (and a bugfix which has been merged for 2.0).

* jk/external-diff-use-argv-array:
  run_external_diff: refactor cmdline setup logic
  run_external_diff: hoist common bits out of conditional
  run_external_diff: drop fflush(NULL)
  run_external_diff: clean up error handling
  run_external_diff: use an argv_array for the environment
2014-06-03 12:06:43 -07:00
Junio C Hamano 8eaf517835 Merge branch 'ks/tree-diff-nway'
Instead of running N pair-wise diff-trees when inspecting a
N-parent merge, find the set of paths that were touched by walking
N+1 trees in parallel.  These set of paths can then be turned into
N pair-wise diff-tree results to be processed through rename
detections and such.  And N=2 case nicely degenerates to the usual
2-way diff-tree, which is very nice.

* ks/tree-diff-nway:
  mingw: activate alloca
  combine-diff: speed it up, by using multiparent diff tree-walker directly
  tree-diff: rework diff_tree() to generate diffs for multiparent cases as well
  Portable alloca for Git
  tree-diff: reuse base str(buf) memory on sub-tree recursion
  tree-diff: no need to call "full" diff_tree_sha1 from show_path()
  tree-diff: rework diff_tree interface to be sha1 based
  tree-diff: diff_tree() should now be static
  tree-diff: remove special-case diff-emitting code for empty-tree cases
  tree-diff: simplify tree_entry_pathcmp
  tree-diff: show_path prototype is not needed anymore
  tree-diff: rename compare_tree_entry -> tree_entry_pathcmp
  tree-diff: move all action-taking code out of compare_tree_entry()
  tree-diff: don't assume compare_tree_entry() returns -1,0,1
  tree-diff: consolidate code for emitting diffs and recursion in one place
  tree-diff: show_tree() is not needed
  tree-diff: no need to pass match to skip_uninteresting()
  tree-diff: no need to manually verify that there is no mode change for a path
  combine-diff: move changed-paths scanning logic into its own function
  combine-diff: move show_log_first logic/action out of paths scanning
2014-06-03 12:06:40 -07:00
Brian Gesiak 1a4927c5c5 diff.c: rearrange xcalloc arguments
xcalloc() takes two arguments: the number of elements and their size.
diffstat_add() passes the arguments in reverse order, passing the
size of a diffstat_file*, followed by the number of diffstat_file* to
be allocated.

Rearrange them so they are in the correct order.

Signed-off-by: Brian Gesiak <modocache@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27 14:02:03 -07:00
Jeff King dd63f169d9 move "--follow needs one pathspec" rule to diff_setup_done
Because of the way "--follow" is implemented, we must have
exactly one pathspec. "git log" enforces this restriction,
but other users of the revision traversal code do not. For
example, "git format-patch --follow" will segfault during
try_to_follow_renames, as we have no pathspecs at all.

We can push this check down into diff_setup_done, which is
probably a better place anyway. It is the diff code that
introduces this restriction, so other parts of the code
should not need to care themselves.

Reported-by: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-20 11:09:03 -07:00
Junio C Hamano 5f11a7aad0 Merge branch 'jk/external-diff-use-argv-array' (early part)
Crash fix for codepath that miscounted the necessary size for an
array when spawning an external diff program.

* 'jk/external-diff-use-argv-array' (early part):
  run_external_diff: use an argv_array for the command line
2014-04-28 15:47:35 -07:00
Jeff King f3efe78782 run_external_diff: refactor cmdline setup logic
The current logic makes it hard to see what gets put onto
the command line in which cases. Pulling out a helper
function lets us see that we have two sets of file data, and
the second set either uses the original name, or the "other"
renamed/copy name.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21 10:32:19 -07:00
Jeff King 0d4217d92e run_external_diff: hoist common bits out of conditional
Whether we have diff_filespecs to give to the diff command
or not, we always are going to run the program and pass it
the pathname. Let's pull that duplicated part out of the
conditional to make it more obvious.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21 10:32:07 -07:00
Jeff King 5b88caa417 run_external_diff: drop fflush(NULL)
This fflush was added in d5535ec (Use run_command() to spawn
external diff programs instead of fork/exec., 2007-10-19),
because flushing buffers before forking is a good habit.

But later, 7d0b18a (Add output flushing before fork(),
2008-08-04) added it to the generic run-command interface,
meaning that our flush here is redundant.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21 10:31:51 -07:00
Jeff King 89294d143d run_external_diff: clean up error handling
When the external diff reports an error, we try to clean up
and die. However, we can make this process a bit simpler:

  1. We do not need to bother freeing memory, since we are
     about to exit.  Nor do we need to clean up our
     tempfiles, since the atexit() handler will do it for
     us. So we can die as soon as we see the error.

  3. We can just call die() rather than fprintf/exit. This
     does technically change our exit code, but the exit
     code of "1" is not meaningful here. In fact, it is
     probably wrong, since "1" from diff usually means
     "completed successfully, but there were differences".

And while we're there, we can mark the error message for
translation, and drop the full stop at the end to make it
more like our other messages.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21 10:31:36 -07:00
Jeff King ae049c955c run_external_diff: use an argv_array for the environment
We currently use static buffers and a static array for
formatting the environment passed to the external diff.
There's nothing wrong in the code, but it is much easier to
verify that it is correct if we use a dynamic argv_array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21 10:30:33 -07:00
Jeff King 82fbf269b9 run_external_diff: use an argv_array for the command line
We currently generate the command-line for the external
command using a fixed-length array of size 10. But if there
is a rename, we actually need 11 elements (10 items, plus a
NULL), and end up writing a random NULL onto the stack.

Rather than bump the limit, let's just use an argv_array, which
makes this sort of error impossible.

Noticed-by: Max L <infthi.inbox@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-21 10:29:50 -07:00
Jiang Xin d1d96a82bb i18n: remove obsolete comments for translators in diffstat generation
Since we do not translate diffstat any more, remove the obsolete comments.

Signed-off-by: Jiang Xin <worldhello.net@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-17 11:09:56 -07:00
Junio C Hamano d59c12d7ad Merge branch 'jl/nor-or-nand-and'
Eradicate mistaken use of "nor" (that is, essentially "nor" used
not in "neither A nor B" ;-)) from in-code comments, command output
strings, and documentations.

* jl/nor-or-nand-and:
  code and test: fix misuses of "nor"
  comments: fix misuses of "nor"
  contrib: fix misuses of "nor"
  Documentation: fix misuses of "nor"
2014-04-08 12:00:28 -07:00
Kirill Smelkov 7195fbfaf5 combine-diff: speed it up, by using multiparent diff tree-walker directly
As was recently shown in "combine-diff: optimize
combine_diff_path sets intersection", combine-diff runs very slowly. In
that commit we optimized paths sets intersection, but that accounted
only for ~ 25% of the slowness, and as my tracing showed, for linux.git
v3.10..v3.11, for merges a lot of time is spent computing
diff(commit,commit^2) just to only then intersect that huge diff to
almost small set of files from diff(commit,commit^1).

In previous commit, we described the problem in more details, and
reworked the diff tree-walker to be general one - i.e. to work in
multiple parent case too. Now is the time to take advantage of it for
finding paths for combine diff.

The implementation is straightforward - if we know, we can get generated
diff paths directly, and at present that means no diff filtering or
rename/copy detection was requested(*), we can call multiparent tree-walker
directly and get ready paths.

(*) because e.g. at present, all diffcore transformations work on
    diff_filepair queues, but in the future, that limitation can be
    lifted, if filters would operate directly on combine_diff_paths.

Timings for `git log --raw --no-abbrev --no-renames` without `-c` ("git log")
and with `-c` ("git log -c") and with `-c --merges` ("git log -c --merges")
before and after the patch are as follows:

                linux.git v3.10..v3.11

            log     log -c     log -c --merges

    before  1.9s    16.4s      15.2s
    after   1.9s     2.4s       1.1s

The result stayed the same.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 14:41:49 -07:00
Kirill Smelkov 72441af7c4 tree-diff: rework diff_tree() to generate diffs for multiparent cases as well
Previously diff_tree(), which is now named ll_diff_tree_sha1(), was
generating diff_filepair(s) for two trees t1 and t2, and that was
usually used for a commit as t1=HEAD~, and t2=HEAD - i.e. to see changes
a commit introduces.

In Git, however, we have fundamentally built flexibility in that a
commit can have many parents - 1 for a plain commit, 2 for a simple merge,
but also more than 2 for merging several heads at once.

For merges there is a so called combine-diff, which shows diff, a merge
introduces by itself, omitting changes done by any parent. That works
through first finding paths, that are different to all parents, and then
showing generalized diff, with separate columns for +/- for each parent.
The code lives in combine-diff.c .

There is an impedance mismatch, however, in that a commit could
generally have any number of parents, and that while diffing trees, we
divide cases for 2-tree diffs and more-than-2-tree diffs. I mean there
is no special casing for multiple parents commits in e.g.
revision-walker .

That impedance mismatch *hurts* *performance* *badly* for generating
combined diffs - in "combine-diff: optimize combine_diff_path
sets intersection" I've already removed some slowness from it, but from
the timings provided there, it could be seen, that combined diffs still
cost more than an order of magnitude more cpu time, compared to diff for
usual commits, and that would only be an optimistic estimate, if we take
into account that for e.g. linux.git there is only one merge for several
dozens of plain commits.

That slowness comes from the fact that currently, while generating
combined diff, a lot of time is spent computing diff(commit,commit^2)
just to only then intersect that huge diff to almost small set of files
from diff(commit,commit^1).

That's because at present, to compute combine-diff, for first finding
paths, that "every parent touches", we use the following combine-diff
property/definition:

D(A,P1...Pn) = D(A,P1) ^ ... ^ D(A,Pn)      (w.r.t. paths)

where

D(A,P1...Pn) is combined diff between commit A, and parents Pi

and

D(A,Pi) is usual two-tree diff Pi..A

So if any of that D(A,Pi) is huge, tracting 1 n-parent combine-diff as n
1-parent diffs and intersecting results will be slow.

And usually, for linux.git and other topic-based workflows, that
D(A,P2) is huge, because, if merge-base of A and P2, is several dozens
of merges (from A, via first parent) below, that D(A,P2) will be diffing
sum of merges from several subsystems to 1 subsystem.

The solution is to avoid computing n 1-parent diffs, and to find
changed-to-all-parents paths via scanning A's and all Pi's trees
simultaneously, at each step comparing their entries, and based on that
comparison, populate paths result, and deduce we could *skip*
*recursing* into subdirectories, if at least for 1 parent, sha1 of that
dir tree is the same as in A. That would save us from doing significant
amount of needless work.

Such approach is very similar to what diff_tree() does, only there we
deal with scanning only 2 trees simultaneously, and for n+1 tree, the
logic is a bit more complex:

D(T,P1...Pn) calculation scheme
-------------------------------

D(T,P1...Pn) = D(T,P1) ^ ... ^ D(T,Pn)	(regarding resulting paths set)

    D(T,Pj)		- diff between T..Pj
    D(T,P1...Pn)	- combined diff from T to parents P1,...,Pn

We start from all trees, which are sorted, and compare their entries in
lock-step:

     T     P1       Pn
     -     -        -
    |t|   |p1|     |pn|
    |-|   |--| ... |--|      imin = argmin(p1...pn)
    | |   |  |     |  |
    |-|   |--|     |--|
    |.|   |. |     |. |
     .     .        .
     .     .        .

at any time there could be 3 cases:

    1)  t < p[imin];
    2)  t > p[imin];
    3)  t = p[imin].

Schematic deduction of what every case means, and what to do, follows:

1)  t < p[imin]  ->  ∀j t ∉ Pj  ->  "+t" ∈ D(T,Pj)  ->  D += "+t";  t↓

2)  t > p[imin]

    2.1) ∃j: pj > p[imin]  ->  "-p[imin]" ∉ D(T,Pj)  ->  D += ø;  ∀ pi=p[imin]  pi↓
    2.2) ∀i  pi = p[imin]  ->  pi ∉ T  ->  "-pi" ∈ D(T,Pi)  ->  D += "-p[imin]";  ∀i pi↓

3)  t = p[imin]

    3.1) ∃j: pj > p[imin]  ->  "+t" ∈ D(T,Pj)  ->  only pi=p[imin] remains to investigate
    3.2) pi = p[imin]  ->  investigate δ(t,pi)
     |
     |
     v

    3.1+3.2) looking at δ(t,pi) ∀i: pi=p[imin] - if all != ø  ->

                      ⎧δ(t,pi)  - if pi=p[imin]
             ->  D += ⎨
                      ⎩"+t"     - if pi>p[imin]

    in any case t↓  ∀ pi=p[imin]  pi↓

~

For comparison, here is how diff_tree() works:

D(A,B) calculation scheme
-------------------------

    A     B
    -     -
   |a|   |b|    a < b   ->  a ∉ B   ->   D(A,B) +=  +a    a↓
   |-|   |-|    a > b   ->  b ∉ A   ->   D(A,B) +=  -b    b↓
   | |   | |    a = b   ->  investigate δ(a,b)            a↓ b↓
   |-|   |-|
   |.|   |.|
    .     .
    .     .

~~~~~~~~

This patch generalizes diff tree-walker to work with arbitrary number of
parents as described above - i.e. now there is a resulting tree t, and
some parents trees tp[i] i=[0..nparent). The generalization builds on
the fact that usual diff

D(A,B)

is by definition the same as combined diff

D(A,[B]),

so if we could rework the code for common case and make it be not slower
for nparent=1 case, usual diff(t1,t2) generation will not be slower, and
multiparent diff tree-walker would greatly benefit generating
combine-diff.

What we do is as follows:

1) diff tree-walker ll_diff_tree_sha1() is internally reworked to be
   a paths generator (new name diff_tree_paths()), with each generated path
   being `struct combine_diff_path` with info for path, new sha1,mode and for
   every parent which sha1,mode it was in it.

2) From that info, we can still generate usual diff queue with
   struct diff_filepairs, via "exporting" generated
   combine_diff_path, if we know we run for nparent=1 case.
   (see emit_diff() which is now named emit_diff_first_parent_only())

3) In order for diff_can_quit_early(), which checks

       DIFF_OPT_TST(opt, HAS_CHANGES))

   to work, that exporting have to be happening not in bulk, but
   incrementally, one diff path at a time.

   For such consumers, there is a new callback in diff_options
   introduced:

       ->pathchange(opt, struct combine_diff_path *)

   which, if set to !NULL, is called for every generated path.

   (see new compat ll_diff_tree_sha1() wrapper around new paths
    generator for setup)

4) The paths generation itself, is reworked from previous
   ll_diff_tree_sha1() code according to "D(A,P1...Pn) calculation
   scheme" provided above:

   On the start we allocate [nparent] arrays in place what was
   earlier just for one parent tree.

   then we just generalize loops, and comparison according to the
   algorithm.

Some notes(*):

1) alloca(), for small arrays, is used for "runs not slower for
   nparent=1 case than before" goal - if we change it to xmalloc()/free()
   the timings get ~1% worse. For alloca() we use just-introduced
   xalloca/xalloca_free compatibility wrappers, so it should not be a
   portability problem.

2) For every parent tree, we need to keep a tag, whether entry from that
   parent equals to entry from minimal parent. For performance reasons I'm
   keeping that tag in entry's mode field in unused bit - see S_IFXMIN_NEQ.
   Not doing so, we'd need to alloca another [nparent] array, which hurts
   performance.

3) For emitted paths, memory could be reused, if we know the path was
   processed via callback and will not be needed later. We use efficient
   hand-made realloc-style path_appendnew(), that saves us from ~1-1.5%
   of potential additional slowdown.

4) goto(s) are used in several places, as the code executes a little bit
   faster with lowered register pressure.

Also

- we should now check for FIND_COPIES_HARDER not only when two entries
  names are the same, and their hashes are equal, but also for a case,
  when a path was removed from some of all parents having it.

  The reason is, if we don't, that path won't be emitted at all (see
  "a > xi" case), and we'll just skip it, and FIND_COPIES_HARDER wants
  all paths - with diff or without - to be emitted, to be later analyzed
  for being copies sources.

  The new check is only necessary for nparent >1, as for nparent=1 case
  xmin_eqtotal always =1 =nparent, and a path is always added to diff as
  removal.

~~~~~~~~

Timings for

    # without -c, i.e. testing only nparent=1 case
    `git log --raw --no-abbrev --no-renames`

before and after the patch are as follows:

                navy.git        linux.git v3.10..v3.11

    before      0.611s          1.889s
    after       0.619s          1.907s
    slowdown    1.3%            0.9%

This timings show we did no harm to usual diff(tree1,tree2) generation.
From the table we can see that we actually did ~1% slowdown, but I think
I've "earned" that 1% in the previous patch ("tree-diff: reuse base
str(buf) memory on sub-tree recursion", HEAD~~) so for nparent=1 case,
net timings stays approximately the same.

The output also stayed the same.

(*) If we revert 1)-4) to more usual techniques, for nparent=1 case,
    we'll get ~2-2.5% of additional slowdown, which I've tried to avoid, as
   "do no harm for nparent=1 case" rule.

For linux.git, combined diff will run an order of magnitude faster and
appropriate timings will be provided in the next commit, as we'll be
taking advantage of the new diff tree-walker for combined-diff
generation there.

P.S. and combined diff is not some exotic/for-play-only stuff - for
example for a program I write to represent Git archives as readonly
filesystem, there is initial scan with

    `git log --reverse --raw --no-abbrev --no-renames -c`

to extract log of what was created/changed when, as a result building a
map

    {}  sha1    ->  in which commit (and date) a content was added

that `-c` means also show combined diff for merges, and without them, if
a merge is non-trivial (merges changes from two parents with both having
separate changes to a file), or an evil one, the map will not be full,
i.e. some valid sha1 would be absent from it.

That case was my initial motivation for combined diffs speedup.

Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 14:40:46 -07:00
Justin Lebar 01689909eb comments: fix misuses of "nor"
Signed-off-by: Justin Lebar <jlebar@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-31 15:29:27 -07:00
Junio C Hamano a5aca6e883 Merge branch 'tr/diff-submodule-no-reuse-worktree' into maint
"git diff --external-diff" incorrectly fed the submodule directory
in the working tree to the external diff driver when it knew it is
the same as one of the versions being compared.

* tr/diff-submodule-no-reuse-worktree:
  diff: do not reuse_worktree_file for submodules
2014-03-18 14:03:41 -07:00
Junio C Hamano 34120a5fb5 Merge branch 'nd/diff-quiet-stat-dirty' into maint
"git diff --quiet -- pathspec1 pathspec2" sometimes did not return
correct status value.

* nd/diff-quiet-stat-dirty:
  diff: do not quit early on stat-dirty files
  diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later
2014-03-18 13:59:56 -07:00
Junio C Hamano 6f75e48323 Merge branch 'rm/strchrnul-not-strlen'
* rm/strchrnul-not-strlen:
  use strchrnul() in place of strchr() and strlen()
2014-03-18 13:51:18 -07:00
Junio C Hamano fe9122a352 Merge branch 'dd/use-alloc-grow'
Replace open-coded reallocation with ALLOC_GROW() macro.

* dd/use-alloc-grow:
  sha1_file.c: use ALLOC_GROW() in pretend_sha1_file()
  read-cache.c: use ALLOC_GROW() in add_index_entry()
  builtin/mktree.c: use ALLOC_GROW() in append_to_tree()
  attr.c: use ALLOC_GROW() in handle_attr_line()
  dir.c: use ALLOC_GROW() in create_simplify()
  reflog-walk.c: use ALLOC_GROW()
  replace_object.c: use ALLOC_GROW() in register_replace_object()
  patch-ids.c: use ALLOC_GROW() in add_commit()
  diffcore-rename.c: use ALLOC_GROW()
  diff.c: use ALLOC_GROW()
  commit.c: use ALLOC_GROW() in register_commit_graft()
  cache-tree.c: use ALLOC_GROW() in find_subtree()
  bundle.c: use ALLOC_GROW() in add_to_ref_list()
  builtin/pack-objects.c: use ALLOC_GROW() in check_pbase_path()
2014-03-18 13:50:21 -07:00
Junio C Hamano 481e6aaacc Merge branch 'tr/diff-submodule-no-reuse-worktree'
"git diff --external-diff" incorrectly fed the submodule directory
in the working tree to the external diff driver when it knew it is
the same as one of the versions being compared.

* tr/diff-submodule-no-reuse-worktree:
  diff: do not reuse_worktree_file for submodules
2014-03-14 14:25:20 -07:00
Rohit Mani 2c5495f7b6 use strchrnul() in place of strchr() and strlen()
Avoid scanning strings twice, once with strchr() and then with
strlen(), by using strchrnul().

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Mani <rohit.mani@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-10 08:35:30 -07:00
Junio C Hamano 2687ffdeb7 Merge branch 'jc/hold-diff-remove-q-synonym-for-no-deletion'
Remove a confusing and deprecated "-q" option from "git diff-files";
"git diff-files --diff-filter=d" can be used instead.
2014-03-07 15:17:41 -08:00
Dmitry S. Dolzhenko 4c960a432c diff.c: use ALLOC_GROW()
Use ALLOC_GROW() instead of open-coding it in diffstat_add() and
diff_q().

Signed-off-by: Dmitry S. Dolzhenko <dmitrys.dolzhenko@yandex.ru>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-03 14:48:39 -08:00
Junio C Hamano 1e745453fe Merge branch 'nd/diff-quiet-stat-dirty'
"git diff --quiet -- pathspec1 pathspec2" sometimes did not return
correct status value.

* nd/diff-quiet-stat-dirty:
  diff: do not quit early on stat-dirty files
  diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later
2014-02-27 14:01:21 -08:00
Nguyễn Thái Ngọc Duy f34b205f6c diff: do not quit early on stat-dirty files
When QUICK is set (i.e. with --quiet) we try to do as little work as
possible, stopping after seeing the first change. stat-dirty is
considered a "change" but it may turn out not, if no actual content is
changed. The actual content test is performed too late in the process
and the shortcut may be taken prematurely, leading to incorrect return
code.

Assume we do "git diff --quiet". If we have a stat-dirty file "a" and
a really dirty file "b". We break the loop in run_diff_files() and
stop after "a" because we have got a "change". Later in
diffcore_skip_stat_unmatch() we find out "a" is actually not
changed. But there's nothing else in the diff queue, we incorrectly
declare "no change", ignoring the fact that "b" is changed.

This also happens to "git diff --quiet HEAD" when it hits
diff_can_quit_early() in oneway_diff().

This patch does the content test earlier in order to keep going if "a"
is unchanged. The test result is cached so that when
diffcore_skip_stat_unmatch() is done in the end, we spend no cycles on
re-testing "a".

Reported-by: IWAMOTO Toshihiro <iwamoto@valinux.co.jp>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:50:14 -08:00
Nguyễn Thái Ngọc Duy fceb907225 diff.c: move diffcore_skip_stat_unmatch core logic out for reuse later
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-24 14:50:03 -08:00
Thomas Rast aba4727281 diff: do not reuse_worktree_file for submodules
The GIT_EXTERNAL_DIFF calling code attempts to reuse existing worktree
files for the worktree side of diffs, for performance reasons.
However, that code also tries to do the same with submodules.  This
results in calls to $GIT_EXTERNAL_DIFF where the old-file is a file of
the form "Submodule commit $sha1", but the new-file is a directory in
the worktree.

Fix it by never reusing a worktree "file" in the submodule case.

Reported-by: Grégory Pakosz <gregory.pakosz@gmail.com>
Signed-off-by: Thomas Rast <tr@thomasrast.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-02-18 12:06:08 -08:00
Junio C Hamano e049109ef1 Merge branch 'jk/diff-filespec-cleanup'
* jk/diff-filespec-cleanup:
  diff_filespec: use only 2 bits for is_binary flag
  diff_filespec: reorder is_binary field
  diff_filespec: drop xfrm_flags field
  diff_filespec: drop funcname_pattern_ident field
  diff_filespec: reorder dirty_submodule macro definitions
2014-01-27 10:45:03 -08:00
Jeff King 428d52a5a5 diff_filespec: drop xfrm_flags field
The only mention of this field in the code is by some
debugging code which prints it out (and it will always be
zero, since we never touch it otherwise). It was obsoleted
very early on by 25d5ea4 ([PATCH] Redo rename/copy detection
logic., 2005-05-24).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-01-17 10:50:11 -08:00
Junio C Hamano 2da5cbd651 Merge branch 'sb/diff-orderfile-config'
Allow "git diff -O<file>" to be configured with a new configuration
variable.

* sb/diff-orderfile-config:
  diff: add diff.orderfile configuration variable
  diff: let "git diff -O" read orderfile from any file and fail properly
  t4056: add new tests for "git diff -O"
2014-01-10 10:32:42 -08:00
Junio C Hamano 6904f9aa5b Merge branch 'zk/difftool-counts'
Show the total number of paths and the number of paths shown so far
when "git difftool" prompts to launch an external diff tool, which
would give users some sense of progress.

* zk/difftool-counts:
  diff.c: fix some recent whitespace style violations
  difftool: display the number of files in the diff queue in the prompt
2013-12-27 14:58:13 -08:00
Samuel Bronson 6d8940b562 diff: add diff.orderfile configuration variable
diff.orderfile acts as a default for the -O command line option.

[sb: split up aw's original patch; rework tests and docs, treat option
as pathname]

Signed-off-by: Anders Waldenborg <anders@0x63.nu>
Signed-off-by: Samuel Bronson <naesten@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-18 16:39:00 -08:00
Junio C Hamano ad70448576 Merge branch 'cc/starts-n-ends-with'
Remove a few duplicate implementations of prefix/suffix comparison
functions, and rename them to starts_with and ends_with.

* cc/starts-n-ends-with:
  replace {pre,suf}fixcmp() with {starts,ends}_with()
  strbuf: introduce starts_with() and ends_with()
  builtin/remote: remove postfixcmp() and use suffixcmp() instead
  environment: normalize use of prefixcmp() by removing " != 0"
2013-12-17 12:02:44 -08:00
Jeff King 0ea7d5b6f8 diff.c: fix some recent whitespace style violations
These were introduced by ee7fb0b.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-16 13:04:47 -08:00
Zoltan Klinger ee7fb0b1d4 difftool: display the number of files in the diff queue in the prompt
When --prompt option is set, git-difftool displays a prompt for each
modified file to be viewed in an external diff program.  At that
point, it could be useful to display a counter and the total number
of files in the diff queue.

Below is the current difftool prompt for the first of 5 modified files:

    Viewing: 'diff.c'
    Launch 'vimdiff' [Y/n]:

Consider the modified prompt:

    Viewing (1/5): 'diff.c'
    Launch 'vimdiff' [Y/n]:

The current GIT_EXTERNAL_DIFF mechanism does not tell the number of
paths in the diff queue nor the current counter.  To make this
"counter/total" info available for GIT_EXTERNAL_DIFF programs
without breaking existing ones by doing the following:

 - Keep track of the number of paths shown so far in diff_options;

 - Export two new environment variables from run_external_diff() to
   show the total number of paths (from diff_queue_struct) and the
   current value of the counter (from diff_options); and

 - Update git-difftool--helper to use these two environment variables.

Signed-off-by: Zoltan Klinger <zoltan.klinger@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-06 14:00:27 -08:00
Christian Couder 5955654823 replace {pre,suf}fixcmp() with {starts,ends}_with()
Leaving only the function definitions and declarations so that any
new topic in flight can still make use of the old functions, replace
existing uses of the prefixcmp() and suffixcmp() with new API
functions.

The change can be recreated by mechanically applying this:

    $ git grep -l -e prefixcmp -e suffixcmp -- \*.c |
      grep -v strbuf\\.c |
      xargs perl -pi -e '
        s|!prefixcmp\(|starts_with\(|g;
        s|prefixcmp\(|!starts_with\(|g;
        s|!suffixcmp\(|ends_with\(|g;
        s|suffixcmp\(|!ends_with\(|g;
      '

on the result of preparatory changes in this series.

Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-05 14:13:21 -08:00
Nicolas Vigier b0d12fc9b2 Use the word 'stuck' instead of 'sticked'
The past participle of 'stick' is 'stuck'.

Signed-off-by: Nicolas Vigier <boklm@mars-attacks.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-10-31 15:47:38 -07:00
Junio C Hamano 4197361e39 Merge branch 'mg/more-textconv'
Make "git grep" and "git show" pay attention to --textconv when
dealing with blob objects.

* mg/more-textconv:
  grep: honor --textconv for the case rev:path
  grep: allow to use textconv filters
  t7008: demonstrate behavior of grep with textconv
  cat-file: do not die on --textconv without textconv filters
  show: honor --textconv for blobs
  diff_opt: track whether flags have been set explicitly
  t4030: demonstrate behavior of show with textconv
2013-10-23 13:21:31 -07:00
Junio C Hamano 01a2a03c56 Merge branch 'jc/diff-filter-negation'
Teach "git diff --diff-filter" to express "I do not want to see
these classes of changes" more directly by listing only the
unwanted ones in lowercase (e.g. "--diff-filter=d" will show
everything but deletion) and deprecate "diff-files -q" which did
the same thing as "--diff-filter=d".

* jc/diff-filter-negation:
  diff: deprecate -q option to diff-files
  diff: allow lowercase letter to specify what change class to exclude
  diff: reject unknown change class given to --diff-filter
  diff: preparse --diff-filter string argument
  diff: factor out match_filter()
  diff: pass the whole diff_options to diffcore_apply_filter()
2013-09-09 14:28:35 -07:00
Stefan Beller 3b0c18af5c diff: fix a possible null pointer dereference
The condition in the ternary operator was wrong, hence the wrong char
pointer could be used as the parameter for show_submodule_summary.
one->path may be null, but we definitely need a non null path given
to the function.

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Acked-By: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-09 12:07:36 -07:00
Stefan Beller c189c4f2c4 diff: remove ternary operator evaluating always to true
The line being changed is deep inside the function builtin_diff.
The variable name_b, which is used to evaluate the ternary expression
must evaluate to true at that position, hence the replacement with
just name_b.

The name_b variable only occurs a few times in that lengthy function:
As a parameter to the function itself:
	static void builtin_diff(const char *name_a,
				 const char *name_b,
				...
The next occurrences are at:
	/* Never use a non-valid filename anywhere if at all possible */
	name_a = DIFF_FILE_VALID(one) ? name_a : name_b;
	name_b = DIFF_FILE_VALID(two) ? name_b : name_a;

	a_one = quote_two(a_prefix, name_a + (*name_a == '/'));
	b_two = quote_two(b_prefix, name_b + (*name_b == '/'));

In the last line of this block 'name_b' is dereferenced and compared
to '/'. This would crash if name_b was NULL. Hence in the following code
we can assume name_b being non-null.

The next occurrence is just as a function argument, which doesn't change
the memory, which name_b points to, so the assumption name_b being not
null still holds:
	emit_rewrite_diff(name_a, name_b, one, two,
				textconv_one, textconv_two, o);

The next occurrence would be the line of this patch. As name_b still must
be not null, we can remove the ternary operator.

Inside the emit_rewrite_diff function there is a also a line
	ecbdata.ws_rule = whitespace_rule(name_b ? name_b : name_a);
which was also simplified as there is also a dereference before the
ternary operator.

Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-08-09 12:05:16 -07:00
Junio C Hamano 0def7126fd Merge branch 'ob/typofixes'
* ob/typofixes:
  typofix: in-code comments
  typofix: documentation
  typofix: release notes
2013-07-24 19:23:01 -07:00
Junio C Hamano 0c544a22f9 Merge branch 'sb/misc-fixes'
Assorted code cleanups and a minor fix.

* sb/misc-fixes:
  diff.c: Do not initialize a variable, which gets reassigned anyway.
  commit: Fix a memory leak in determine_author_info
  daemon.c:handle: Remove unneeded check for null pointer.
2013-07-24 19:20:59 -07:00
Ondřej Bílka 749f763dbb typofix: in-code comments
Signed-off-by: Ondřej Bílka <neleai@seznam.cz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-22 16:06:49 -07:00
Junio C Hamano d3aeb31dc4 Merge branch 'nd/const-struct-cache-entry'
* nd/const-struct-cache-entry:
  Convert "struct cache_entry *" to "const ..." wherever possible
2013-07-22 11:24:01 -07:00
Junio C Hamano e2ecd252b5 Merge branch 'mm/diff-no-patch-synonym-to-s'
"git show -s" was less discoverable than it should be.

* mm/diff-no-patch-synonym-to-s:
  Documentation/git-log.txt: capitalize section names
  Documentation: move description of -s, --no-patch to diff-options.txt
  Documentation/git-show.txt: include common diff options, like git-log.txt
  diff: allow --patch & cie to override -s/--no-patch
  diff: allow --no-patch as synonym for -s
  t4000-diff-format.sh: modernize style
2013-07-22 11:23:27 -07:00
Junio C Hamano c48f6816f0 diff: remove "diff-files -q" in a version of Git in a distant future
This was inherited from "show-diff -q" that was invented to tell
comparison between the index and the working tree to ignore only
removals in 2005.

These days, it is spelled as "--diff-filter=d".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-19 15:22:29 -07:00
Junio C Hamano 95a7c546b0 diff: deprecate -q option to diff-files
This reimplements the ancient "-q" option to "git diff-files" that
was inherited from "show-diff -q" in terms of "--diff-filter=d".  We
will be deprecating the "-q" option, so let's issue a warning when
we do so.

Incidentally this also tentatively fixes "git diff --no-index" to
honor "-q" and hide deletions; the use will get the same warning.

We should remove the support for "-q" in a future version but it is
not that urgent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-19 15:20:47 -07:00
Matthieu Moy 71482d389d diff: allow --patch & cie to override -s/--no-patch
All options that trigger a patch output now override --no-patch.

The case of --binary deserves extra attention: the name may suggest that
it turns a normal patch into a binary patch, but it actually already
enables patch output when normally disabled (e.g. "git log --binary"
displays a patch), hence it makes sense for "git show --no-patch
--binary" to display the binary patch.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 17:50:56 -07:00
Matthieu Moy d09cd15d19 diff: allow --no-patch as synonym for -s
This follows the usual convention of having a --no-foo option to negate
--foo.

Signed-off-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 17:50:56 -07:00
Junio C Hamano 7f2ea5f0f2 diff: allow lowercase letter to specify what change class to exclude
In order to express "we do not care about deletions", we had to say
"--diff-filter=ACMRTXUB", giving all the possible change class
except for the one we do not want, "D".

This is cumbersome.  As all the change classes are in uppercase,
allow their lowercase counterpart to selectively exclude the class
from the output.  When such a negated change class is in the input,
start the filter option with the full bits set.

This would allow us to express the old "show-diff -q" with
"git diff-files --diff-filter=d".

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 17:17:39 -07:00
Junio C Hamano bf142ec434 diff: reject unknown change class given to --diff-filter
We used to accept "git diff --diff-filter=Q" (note that there is no
such change class 'Q') silently and showed no output (because there
is no such change class 'Q').

Error out when such an input is given.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 16:24:14 -07:00
Junio C Hamano 1ecc1cbd3a diff: preparse --diff-filter string argument
Instead of running strchr() on the list of status characters over
and over again, parse the --diff-filter option into bitfields and
use the bits to see if the change to the filepair matches the status
requested.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 16:23:34 -07:00
Junio C Hamano 08578fa13e diff: factor out match_filter()
diffcore_apply_filter() checks if a filepair matches the filter
given with the "--diff-filter" option for each input filepairs with
a fairly complex expression in two places.

Create a helper function and call it.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 15:09:34 -07:00
Junio C Hamano 949226fe77 diff: pass the whole diff_options to diffcore_apply_filter()
The --diff-filter=<arg> option given by the user is kept as a
string, and passed to the underlying diffcore_apply_filter()
function as a string for each resulting path we run number of
strchr() to see if each class of change among ACDMRTXUB is meant to
be given.

Change the function signature to pass the whole diff_options, so
that we can pre-parse this string in the next patch.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-17 14:19:24 -07:00
Stefan Beller d3c9cf32ca diff.c: Do not initialize a variable, which gets reassigned anyway.
Signed-off-by: Stefan Beller <stefanbeller@googlemail.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-15 09:45:21 -07:00
Junio C Hamano 77f3c3f174 Merge branch 'jc/maint-diff-core-safecrlf'
"git diff" refused to even show difference when core.safecrlf is
set to true (i.e. error out) and there are offending lines in the
working tree files.

* jc/maint-diff-core-safecrlf:
  diff: demote core.safecrlf=true to core.safecrlf=warn
2013-07-11 13:05:45 -07:00
Nguyễn Thái Ngọc Duy 9c5e6c802c Convert "struct cache_entry *" to "const ..." wherever possible
I attempted to make index_state->cache[] a "const struct cache_entry **"
to find out how existing entries in index are modified and where. The
question I have is what do we do if we really need to keep track of on-disk
changes in the index. The result is

 - diff-lib.c: setting CE_UPTODATE

 - name-hash.c: setting CE_HASHED

 - preload-index.c, read-cache.c, unpack-trees.c and
   builtin/update-index: obvious

 - entry.c: write_entry() may refresh the checked out entry via
   fill_stat_cache_info(). This causes "non-const struct cache_entry
   *" in builtin/apply.c, builtin/checkout-index.c and
   builtin/checkout.c

 - builtin/ls-files.c: --with-tree changes stagemask and may set
   CE_UPDATE

Of these, write_entry() and its call sites are probably most
interesting because it modifies on-disk info. But this is stat info
and can be retrieved via refresh, at least for porcelain
commands. Other just uses ce_flags for local purposes.

So, keeping track of "dirty" entries is just a matter of setting a
flag in index modification functions exposed by read-cache.c. Except
unpack-trees, the rest of the code base does not do anything funny
behind read-cache's back.

The actual patch is less valueable than the summary above. But if
anyone wants to re-identify the above sites. Applying this patch, then
this:

    diff --git a/cache.h b/cache.h
    index 430d021..1692891 100644
    --- a/cache.h
    +++ b/cache.h
    @@ -267,7 +267,7 @@ static inline unsigned int canon_mode(unsigned int mode)
     #define cache_entry_size(len) (offsetof(struct cache_entry,name) + (len) + 1)

     struct index_state {
    -	struct cache_entry **cache;
    +	const struct cache_entry **cache;
     	unsigned int version;
     	unsigned int cache_nr, cache_alloc, cache_changed;
     	struct string_list *resolve_undo;

will help quickly identify them without bogus warnings.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-07-09 09:12:48 -07:00
Junio C Hamano 5430bb283b diff: demote core.safecrlf=true to core.safecrlf=warn
Otherwise the user will not be able to start to guess where in the
contents in the working tree the offending unsafe CR lies.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-25 13:55:03 -07:00
Antoine Pelisse 36617af7ed diff: add --ignore-blank-lines option
The goal of the patch is to introduce the GNU diff
-B/--ignore-blank-lines as closely as possible. The short option is not
available because it's already used for "break-rewrites".

When this option is used, git-diff will not create hunks that simply
add or remove empty lines, but will still show empty lines
addition/suppression if they are close enough to "valuable" changes.

There are two differences between this option and GNU diff -B option:
- GNU diff doesn't have "--inter-hunk-context", so this must be handled
- The following sequence looks like a bug (context is displayed twice):

    $ seq 5 >file1
    $ cat <<EOF >file2
    change
    1
    2

    3
    4
    5
    change
    EOF
    $ diff -u -B file1 file2
    --- file1	2013-06-08 22:13:04.471517834 +0200
    +++ file2	2013-06-08 22:13:23.275517855 +0200
    @@ -1,5 +1,7 @@
    +change
     1
     2
    +
     3
     4
     5
    @@ -3,3 +5,4 @@
     3
     4
     5
    +change

So here is a more thorough description of the option:
- real changes are interesting
- blank lines that are close enough (less than context size) to
interesting changes are considered interesting (recursive definition)
- "context" lines are used around each hunk of interesting changes
- If two hunks are separated by less than "inter-hunk-context", they
will be merged into one.

The implementation does the "interesting changes selection" in a single
pass.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-06-19 15:17:45 -07:00
Junio C Hamano 6c374008b1 diff_opt: track whether flags have been set explicitly
The diff_opt infrastructure sets flags based on defaults and command
line options.  It is impossible to tell whether a flag has been set
as a default or on explicit request.  Update the structure so that
this detection is possible:

 * Add an extra "opt->touched_flags" that keeps track of all the
   fields that have been touched by DIFF_OPT_SET and DIFF_OPT_CLR.

 * You may continue setting the default values to the flags, like
   commands in the "log" family do in cmd_log_init_defaults(), but
   after you finished setting the defaults, you clear the
   touched_flags field;

 * And then you let the usual callchain call diff_opt_parse(),
   allowing the opt->flags be set or unset, while keeping track of
   which bits the user touched;

 * There is an optional callback "opt->set_default" that is called
   at the very beginning to let you inspect touched_flags and update
   opt->flags appropriately, before the remainder of the diffcore
   machinery is set up, taking the opt->flags value into account.

Signed-off-by: Michael J Gruber <git@drmicha.warpmail.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-05-10 10:24:17 -07:00
Junio C Hamano e4d15959d4 Merge branch 'jk/diff-algo-finishing-touches' into maint
"git diff --diff-algorithm=algo" was understood by the command line
parser, but "git diff --diff-algorithm algo" was not.

* jk/diff-algo-finishing-touches:
  diff: allow unstuck arguments with --diff-algorithm
  git-merge(1): document diff-algorithm option to merge-recursive
2013-04-24 16:19:42 -07:00
Junio C Hamano f678d9b592 Merge branch 'jk/diff-graph-submodule-summary'
Make "git diff --graph" work better with submodule log output.

* jk/diff-graph-submodule-summary:
  submodule: print graph output next to submodule log
2013-04-15 12:41:01 -07:00
Junio C Hamano 825ccfc23c Merge branch 'jk/diff-algo-finishing-touches'
"git diff --diff-algorithm algo" is also understood as "git diff
--diff-algorithm=algo".

* jk/diff-algo-finishing-touches:
  diff: allow unstuck arguments with --diff-algorithm
  git-merge(1): document diff-algorithm option to merge-recursive
2013-04-15 12:40:58 -07:00
Stefano Lattarini 41ccfdd9c9 Correct common spelling mistakes in comments and tests
Most of these were found using Lucas De Marchi's codespell tool.

Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Acked-by: Matthieu Moy <Matthieu.Moy@imag.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-12 13:38:40 -07:00
John Keeping 0f33a0677d submodule: print graph output next to submodule log
When running "git log -p --submodule=log", the submodule log is not
indented by the graph output, although all other lines are.  Fix this by
prepending the current line prefix to each line of the submodule log.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-05 11:28:10 -07:00
John Keeping 0895c6d4c0 diff: allow unstuck arguments with --diff-algorithm
The argument to --diff-algorithm is mandatory, so there is no reason to
require the argument to be stuck to the option with '='.  Change this
for consistency with other Git commands.

Note that this does not change the handling of diff-algorithm in
merge-recursive.c since the primary interface to that is via the -X
option to 'git merge' where the unstuck form does not make sense.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-05 11:01:08 -07:00
Junio C Hamano b76a9e1648 Merge branch 'ap/maint-diff-rename-avoid-overlap' into maint
* ap/maint-diff-rename-avoid-overlap:
  tests: make sure rename pretty print works
  diff: prevent pprint_rename from underrunning input
  diff: Fix rename pretty-print when suffix and prefix overlap
2013-04-01 09:19:47 -07:00
Junio C Hamano caf217a3b8 Merge branch 'ap/maint-diff-rename-avoid-overlap'
The logic used by "git diff -M --stat" to shorten the names of
files before and after a rename did not work correctly when the
common prefix and suffix between the two filenames overlapped.

* ap/maint-diff-rename-avoid-overlap:
  tests: make sure rename pretty print works
  diff: prevent pprint_rename from underrunning input
  diff: Fix rename pretty-print when suffix and prefix overlap
2013-03-25 14:00:37 -07:00
Max Nanasy c9fc4415e2 diff.c: diff.renamelimit => diff.renameLimit in message
In the warning message printed when rename or unmodified copy
detection was skipped due to too many files, change "diff.renamelimit"
to "diff.renameLimit", in order to make it consistent with git
documentation, which consistently uses "diff.renameLimit".

Signed-off-by: Max Nanasy <max.nanasy@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-03-21 14:06:49 -07:00
Thomas Rast dd281f09b7 diff: prevent pprint_rename from underrunning input
The logic described in d020e27 (diff: Fix rename pretty-print when
suffix and prefix overlap, 2013-02-23) is wrong: The proof in the
comment is valid only if both strings are the same length.  *One* of
old/new can reach a-1 (b-1, resp.) if 'a' is a suffix of 'b' (or vice
versa).

Since the intent was to let the loop run down to the '/' at the end of
the common prefix, fix it by making that distinction explicit: if
there is no prefix, allow no underrun.

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-26 13:01:34 -08:00
Antoine Pelisse d020e27fda diff: Fix rename pretty-print when suffix and prefix overlap
When considering a rename for two files that have a suffix and a prefix
that can overlap, a confusing line is shown. As an example, renaming
"a/b/b/c" to "a/b/c" shows "a/b/{ => }/b/c".

Currently, what we do is calculate the common prefix ("a/b/"), and the
common suffix ("/b/c"), but the same "/b/" is actually counted both in
prefix and suffix. Then when calculating the size of the non-common part,
we end-up with a negative value which is reset to 0, thus the "{ => }".

Do not allow the common suffix to overlap the common prefix and stop
when reaching a "/" that would be in both.

Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-23 23:52:39 -08:00
Junio C Hamano abea4dc76a Merge branch 'mp/diff-algo-config'
Add diff.algorithm configuration so that the user does not type
"diff --histogram".

* mp/diff-algo-config:
  diff: Introduce --diff-algorithm command line option
  config: Introduce diff.algorithm variable
  git-completion.bash: Autocomplete --minimal and --histogram for git-diff
2013-02-17 15:25:52 -08:00
Junio C Hamano a1d68bea89 Merge branch 'jk/diff-graph-cleanup'
Refactors a lot of repetitive code sequence from the graph drawing
code and adds it to the combined diff output.

* jk/diff-graph-cleanup:
  combine-diff.c: teach combined diffs about line prefix
  diff.c: use diff_line_prefix() where applicable
  diff: add diff_line_prefix function
  diff.c: make constant string arguments const
  diff: write prefix to the correct file
  graph: output padding for merge subsequent parents
2013-02-14 10:29:59 -08:00
John Keeping 30997bb8f1 diff.c: use diff_line_prefix() where applicable
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
John Keeping f192223447 diff: add diff_line_prefix function
This is a helper function to call the diff output_prefix function and
return its value as a C string, allowing us to greatly simplify
everywhere that needs to get the output prefix.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
John Keeping 32b367e444 diff.c: make constant string arguments const
Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
John Keeping 3bf25c23cd diff: write prefix to the correct file
Write the prefix for an output line to the same file as the actual
content.

Signed-off-by: John Keeping <john@keeping.me.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-12 11:42:07 -08:00
Michal Privoznik 07924d4d50 diff: Introduce --diff-algorithm command line option
Since command line options have higher priority than config file
variables and taking previous commit into account, we need a way
how to specify myers algorithm on command line. However,
inventing `--myers` is not the right answer. We need far more
general option, and that is `--diff-algorithm`.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 09:41:18 -08:00
Michal Privoznik 07ab4dec80 config: Introduce diff.algorithm variable
Some users or projects prefer different algorithms over others, e.g.
patience over myers or similar. However, specifying appropriate
argument every time diff is to be used is impractical. Moreover,
creating an alias doesn't play nicely with other tools based on diff
(git-show for instance). Hence, a configuration variable which is able
to set specific algorithm is needed. For now, these four values are
accepted: 'myers' (which has the same effect as not setting the config
variable at all), 'minimal', 'patience' and 'histogram'.

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-01-16 09:37:45 -08:00
Junio C Hamano 90d0b8a9f0 Merge branch 'jc/blame-no-follow'
Teaches "--no-follow" option to "git blame" to disable its
whole-file rename detection.

* jc/blame-no-follow:
  blame: pay attention to --no-follow
  diff: accept --no-follow option
2013-01-14 08:15:51 -08:00
Junio C Hamano a4eab8f38e Merge branch 'lt/diff-stat-show-0-lines'
"git diff --stat" miscounted the total number of changed lines when
binary files were involved and hidden beyond --stat-count.  It also
miscounted the total number of changed files when there were
unmerged paths.

* lt/diff-stat-show-0-lines:
  t4049: refocus tests
  diff --shortstat: do not count "unmerged" entries
  diff --stat: do not count "unmerged" entries
  diff --stat: move the "total count" logic to the last loop
  diff --stat: use "file" temporary variable to refer to data->files[i]
  diff --stat: status of unmodified pair in diff-q is not zero
  test: add failing tests for "diff --stat" to t4049
2012-11-29 12:53:54 -08:00
Junio C Hamano 20c8cde456 diff --shortstat: do not count "unmerged" entries
Fix the same issue as the previous one for "git diff --stat";
unmerged entries was doubly-counted.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 14:19:36 -08:00
Junio C Hamano 82dfc2c44e diff --stat: do not count "unmerged" entries
Even though we show a separate *UNMERGED* entry in the patch and
diffstat output (or in the --raw format, for that matter) in
addition to and separately from the diff against the specified stage
(defaulting to #2) for unmerged paths, they should not be counted in
the total number of files affected---that would lead to counting the
same path twice.

The separation done by the previous step makes this fix simple and
straightforward.  Among the filepairs in diff_queue, paths that
weren't modified, and the extra "unmerged" entries do not count as
total number of files.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
Junio C Hamano a20d3c0de1 diff --stat: move the "total count" logic to the last loop
The diffstat generation logic, with --stat-count limit, is
implemented as three loops.

 - The first counts the width necessary to show stats up to
   specified number of entries, and notes up to how many entries in
   the data we need to iterate to show the graph;

 - The second iterates that many times to draw the graph, adjusts
   the number of "total modified files", and counts the total
   added/deleted lines for the part that was shown in the graph;

 - The third iterates over the remainder and only does the part to
   count "total added/deleted lines" and to adjust "total modified
   files" without drawing anything.

Move the logic to count added/deleted lines and modified files from
the second loop to the third loop.

This incidentally fixes a bug.  The third loop was not filtering
binary changes (counted in bytes) from the total added/deleted as it
should.  The second loop implemented this correctly, so if a binary
change appeared earlier than the --stat-count cutoff, the code
counted number of added/deleted lines correctly, but if it appeared
beyond the cutoff, the number of lines would have mixed with the
byte count in the buggy third loop.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
Junio C Hamano af0ed819c5 diff --stat: use "file" temporary variable to refer to data->files[i]
The generated code shouldn't change but it is easier to read.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
Junio C Hamano 99bfd40700 diff --stat: status of unmodified pair in diff-q is not zero
It is spelled DIFF_STATUS_UNKNOWN these days, and is different from zero.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-27 13:21:15 -08:00
Junio C Hamano be95387af2 Merge branch 'rr/submodule-diff-config'
Allow "git diff --submodule=log" to set to be the default via
configuration.

* rr/submodule-diff-config:
  submodule: display summary header in bold
  diff: rename "set" variable
  diff: introduce diff.submodule configuration variable
  Documentation: move diff.wordRegex from config.txt to diff-config.txt
2012-11-25 18:44:50 -08:00
Junio C Hamano 76c39289ba Merge branch 'lt/diff-stat-show-0-lines'
We failed to mention a file without any content change but whose
permission bit was modified, or (worse yet) a new file without any
content in the "git diff --stat" output.

* lt/diff-stat-show-0-lines:
  Fix "git diff --stat" for interesting - but empty - file changes
2012-11-25 18:44:06 -08:00
Ramkumar Ramachandra 4e215131d2 submodule: display summary header in bold
Currently, 'git diff --submodule' displays output with a bold diff
header for non-submodules.  So this part is in bold:

    diff --git a/file1 b/file1
    index 30b2f6c..2638038 100644
    --- a/file1
    +++ b/file1

For submodules, the header looks like this:

    Submodule submodule1 012b072..248d0fd:

Unfortunately, it's easy to miss in the output because it's not bold.
Change this.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-18 19:18:13 -08:00
Jeff King d9c552f17a diff: rename "set" variable
Once upon a time the builtin_diff function used one color, and the color
variables were called "set" and "reset". Nowadays it is a much longer
function and we use several colors (e.g., "add", "del"). Rename "set" to
"meta" to show that it is the color for showing diff meta-info (it still
does not indicate that it is a "color", but at least it matches the
scheme of the other color variables).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-18 19:18:13 -08:00
Ramkumar Ramachandra c47ef57caa diff: introduce diff.submodule configuration variable
Introduce a diff.submodule configuration variable corresponding to the
'--submodule' command-line option of 'git diff'.

Signed-off-by: Ramkumar Ramachandra <artagnon@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-11-18 19:18:13 -08:00
Jeff King 19fb613695 Merge branch 'nd/builtin-to-libgit'
Code cleanups so that libgit.a does not depend on anything in the
builtin/ directory.

* nd/builtin-to-libgit:
  fetch-pack: move core code to libgit.a
  fetch-pack: remove global (static) configuration variable "args"
  send-pack: move core code to libgit.a
  Move setup_diff_pager to libgit.a
  Move print_commit_list to libgit.a
  Move estimate_bisect_steps to libgit.a
  Move try_merge_command and checkout_fast_forward to libgit.a
2012-11-09 12:51:06 -05:00
Jeff King 8736c9010c Merge branch 'mh/maint-parse-dirstat-fix'
Cleans up some code and avoids a potential bug.

* mh/maint-parse-dirstat-fix:
  parse_dirstat_params(): use string_list to split comma-separated string
2012-11-09 12:42:21 -05:00
Nguyễn Thái Ngọc Duy 4914c9629c Move setup_diff_pager to libgit.a
This is used by diff-no-index.c, part of libgit.a while it stays in
builtin/diff.c. Move it to diff.c so that we won't get undefined
reference if a program that uses libgit.a happens to pull it in.

While at it, move check_pager from git.c to pager.c. It makes more
sense there and pager.c is also part of libgit.a

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 03:08:30 -04:00
Michael Haggerty 02e8ca0e50 parse_dirstat_params(): use string_list to split comma-separated string
Use string_list_split_in_place() to split the comma-separated
parameters string.  This simplifies the code and also fixes a bug: the
old code made calls like

    memcmp(p, "lines", p_len)

which needn't work if p_len is different than the length of the
constant string (and could illegally access memory if p_len is larger
than the length of the constant string).

When p_len was less than the length of the constant string, the old
code would have allowed some abbreviations to be accepted (e.g., "cha"
for "changes") but this seems to have been a bug rather than a
feature, because (1) it is not documented; (2) no attempt was made to
handle ambiguous abbreviations, like "c" for "changes" vs
"cumulative".

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2012-10-29 02:52:41 -04:00
Linus Torvalds 74faaa16f0 Fix "git diff --stat" for interesting - but empty - file changes
The behavior of "git diff --stat" is rather odd for files that have
zero lines of changes: it will discount them entirely unless they were
renames.

Which means that the stat output will simply not show files that only
had "other" changes: they were created or deleted, or their mode was
changed.

Now, those changes do show up in the summary, but so do renames, so
the diffstat logic is inconsistent. Why does it show renames with zero
lines changed, but not mode changes or added files with zero lines
changed?

So change the logic to not check for "is_renamed", but for
"is_interesting" instead, where "interesting" is judged to be any
action but a pure data change (because a pure data change with zero
data changed really isn't worth showing, if we ever get one in our
diffpairs).

So if you did

   chmod +x Makefile
   git diff --stat

before, it would show empty (" 0 files changed"), with this it shows

 Makefile | 0
 1 file changed, 0 insertions(+), 0 deletions(-)

which I think is a more correct diffstat (and then with "--summary" it
shows *what* the metadata change to Makefile was - this is completely
consistent with our handling of renamed files).

Side note: the old behavior was *really* odd. With no changes at all,
"git diff --stat" output was empty. With just a chmod, it said "0
files changed". No way is our legacy behavior sane.

Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-10-17 11:50:50 -07:00
Jeff Muizelaar 6468a4e548 diff: diff.context configuration gives default to -U
Introduce a configuration variable diff.context that tells
Porcelain commands to use a non-default number of context
lines instead of 3 (the default).  With this variable, users
do not have to keep repeating "git log -U8" from the command
line; instead, it becomes sufficient to say "git config
diff.context 8" just once.

Signed-off-by: Jeff Muizelaar <jmuizelaar@mozilla.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-30 20:16:01 -07:00
Junio C Hamano aebbcf5797 diff: accept --no-follow option
Once you do

	$ alias glogone git log --follow

there is no way to say

	$ glogone --no-follow ...

Not that "log --follow" is all that useful, but it is cheap to
support the common "you can defeat an undesirable option with a
'no-' variant of it later on the command line" pattern.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-21 13:49:18 -07:00
Junio C Hamano 8ef2794ba8 Merge branch 'nd/maint-diffstat-summary' into maint
* nd/maint-diffstat-summary:
  Revert diffstat back to English
2012-09-20 15:55:31 -07:00
Junio C Hamano 06e211acc6 Merge branch 'jc/make-static'
Turn many file-scope private symbols to static to reduce the
global namespace contamination.

* jc/make-static:
  sequencer.c: mark a private file-scope symbol as static
  ident.c: mark private file-scope symbols as static
  trace.c: mark a private file-scope symbol as static
  wt-status.c: mark a private file-scope symbol as static
  read-cache.c: mark a private file-scope symbol as static
  strbuf.c: mark a private file-scope symbol as static
  sha1-array.c: mark a private file-scope symbol as static
  symlinks.c: mark private file-scope symbols as static
  notes.c: mark a private file-scope symbol as static
  rerere.c: mark private file-scope symbols as static
  graph.c: mark private file-scope symbols as static
  diff.c: mark a private file-scope symbol as static
  commit.c: mark a file-scope private symbol as static
  builtin/notes.c: mark file-scope private symbols as static
2012-09-18 14:37:46 -07:00
Junio C Hamano 9e40b6e595 Merge branch 'nd/maint-diffstat-summary'
Earlier we made the diffstat summary line that shows the number of
lines added/deleted localizable, but it was found irritating having
to see them in various languages on a list whose discussion language
is English.

The original had trivial thinko in reverting Q_(), which has been
fixed.

* nd/maint-diffstat-summary:
  Revert diffstat back to English
2012-09-17 15:57:22 -07:00
Junio C Hamano d2aea1371b diff.c: mark a private file-scope symbol as static
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-15 22:58:20 -07:00
Nguyễn Thái Ngọc Duy 218adaaaa0 Revert diffstat back to English
This reverts the i18n part of 7f81463 (Use correct grammar in diffstat
summary line - 2012-02-01) but still keeps the grammar correctness for
English. It also reverts b354f11 (Fix tests under GETTEXT_POISON on
diffstat - 2012-08-27). The result is diffstat always in English
for all commands.

This helps stop users from accidentally sending localized
format-patch'd patches.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-14 09:52:16 -07:00
Junio C Hamano 1c88a6d174 Sync with 1.7.11.6
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-09-11 11:23:54 -07:00
Junio C Hamano d9b983fc26 Merge branch 'ab/diff-write-incomplete-line' into maint-1.7.11
* ab/diff-write-incomplete-line:
  Fix '\ No newline...' annotation in rewrite diffs
2012-09-11 11:08:30 -07:00
Junio C Hamano 738c218760 Merge branch 'tr/void-diff-setup-done' into maint-1.7.11
* tr/void-diff-setup-done:
  diff_setup_done(): return void
2012-09-11 10:53:40 -07:00
Junio C Hamano e3f26752b5 Merge branch 'maint-1.7.11' into maint
* maint-1.7.11:
  Almost 1.7.11.6
  gitweb: URL-decode $my_url/$my_uri when stripping PATH_INFO
  rebase -i: use full onto sha1 in reflog
  sh-setup: protect from exported IFS
  receive-pack: do not leak output from auto-gc to standard output
  t/t5400: demonstrate breakage caused by informational message from prune
  setup: clarify error messages for file/revisions ambiguity
  send-email: improve RFC2047 quote parsing
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-09-10 15:31:06 -07:00
Junio C Hamano 03adeeaad6 Merge branch 'jk/maint-null-in-trees' into maint-1.7.11
"git diff" had a confusion between taking data from a path in the
working tree and taking data from an object that happens to have
name 0{40} recorded in a tree.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-09-10 15:24:54 -07:00
Junio C Hamano e6daf0ac22 Merge branch 'ab/diff-write-incomplete-line'
The output from "git diff -B" for a file that ends with an
incomplete line did not put "\ No newline..." on a line of its own.

* ab/diff-write-incomplete-line:
  Fix '\ No newline...' annotation in rewrite diffs
2012-08-27 11:54:46 -07:00
Junio C Hamano 3b753148b6 Merge branch 'jk/maint-null-in-trees'
We do not want a link to 0{40} object stored anywhere in our objects.

* jk/maint-null-in-trees:
  fsck: detect null sha1 in tree entries
  do not write null sha1s to on-disk index
  diff: do not use null sha1 as a sentinel value
2012-08-27 11:54:28 -07:00
Junio C Hamano 9cd33bbc52 Merge branch 'tr/void-diff-setup-done'
Remove unnecessary code.

* tr/void-diff-setup-done:
  diff_setup_done(): return void
2012-08-22 11:52:27 -07:00
Adam Butcher 35e2d03c2c Fix '\ No newline...' annotation in rewrite diffs
When a file that ends with an incomplete line is expressed as a
complete rewrite with the -B option, git diff incorrectly
appends the incomplete line indicator "\ No newline at end of
file" after such a line, rather than writing it on a line of its
own (the output codepath for normal output without -B does not
have this problem).  Add a LF after the incomplete line before
writing the "\ No newline ..." out to fix this.

Add a couple of tests to confirm that the indicator comment is
generated on its own line in both plain diff and rewrite mode.

Signed-off-by: Adam Butcher <dev.lists@jessamine.co.uk>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-05 12:37:52 -07:00
Thomas Rast 28452655af diff_setup_done(): return void
diff_setup_done() has historically returned an error code, but lost
the last nonzero return in 943d5b7 (allow diff.renamelimit to be set
regardless of -M/-C, 2006-08-09).  The callers were in a pretty
confused state: some actually checked for the return code, and some
did not.

Let it return void, and patch all callers to take this into account.
This conveniently also gets rid of a handful of different(!) error
messages that could never be triggered anyway.

Note that the function can still die().

Signed-off-by: Thomas Rast <trast@student.ethz.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-08-03 12:11:07 -07:00
Junio C Hamano 97c7934049 Merge branch 'nd/maint-i18n-diffstat'
* nd/maint-i18n-diffstat:
  i18n: leave \n out of translated diffstat
2012-07-31 09:43:07 -07:00
Junio C Hamano 70f6be7aa9 Merge branch 'jv/maint-no-ext-diff' into maint
"git diff --no-ext-diff" did not output anything for a typechange
filepair when GIT_EXTERNAL_DIFF is in effect.

* jv/maint-no-ext-diff:
  diff: test precedence of external diff drivers
  diff: correctly disable external_diff with --no-ext-diff
2012-07-30 13:04:59 -07:00
Jeff King e54501004a diff: do not use null sha1 as a sentinel value
The diff code represents paths using the diff_filespec
struct. This struct has a sha1 to represent the sha1 of the
content at that path, as well as a sha1_valid member which
indicates whether its sha1 field is actually useful. If
sha1_valid is not true, then the filespec represents a
working tree file (e.g., for the no-index case, or for when
the index is not up-to-date).

The diff_filespec is only used internally, though. At the
interfaces to the diff subsystem, callers feed the sha1
directly, and we create a diff_filespec from it. It's at
that point that we look at the sha1 and decide whether it is
valid or not; callers may pass the null sha1 as a sentinel
value to indicate that it is not.

We should not typically see the null sha1 coming from any
other source (e.g., in the index itself, or from a tree).
However, a corrupt tree might have a null sha1, which would
cause "diff --patch" to accidentally diff the working tree
version of a file instead of treating it as a blob.

This patch extends the edges of the diff interface to accept
a "sha1_valid" flag whenever we accept a sha1, and to use
that flag when creating a filespec. In some cases, this
means passing the flag through several layers, making the
code change larger than would be desirable.

One alternative would be to simply die() upon seeing
corrupted trees with null sha1s. However, this fix more
directly addresses the problem (while bogus sha1s in a tree
are probably a bad thing, it is really the sentinel
confusion sending us down the wrong code path that is what
makes it devastating). And it means that git is more capable
of examining and debugging these corrupted trees. For
example, you can still "diff --raw" such a tree to find out
when the bogus entry was introduced; you just cannot do a
"--patch" diff (just as you could not with any other
corrupted tree, as we do not have any content to diff).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2012-07-29 15:04:32 -07:00