Michael Haggerty 433860f3d0 diff: improve positioning of add/delete blocks in diffs
Some groups of added/deleted lines in diffs can be slid up or down,
because lines at the edges of the group are not unique. Picking good
shifts for such groups is not a matter of correctness but definitely has
a big effect on aesthetics. For example, consider the following two
diffs. The first is what standard Git emits:

    --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
    +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
    @@ -231,6 +231,9 @@ if (!defined $initial_reply_to && $prompting) {
     }

     if (!$smtp_server) {
    +       $smtp_server = $repo->config('sendemail.smtpserver');
    +}
    +if (!$smtp_server) {
            foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                    if (-x $_) {
                            $smtp_server = $_;

The following diff is equivalent, but is obviously preferable from an
aesthetic point of view:

    --- a/9c572b21dd090a1e5c5bb397053bf8043ffe7fb4:git-send-email.perl
    +++ b/6dcfa306f2b67b733a7eb2d7ded1bc9987809edb:git-send-email.perl
    @@ -230,6 +230,9 @@ if (!defined $initial_reply_to && $prompting) {
            $initial_reply_to =~ s/(^\s+|\s+$)//g;
     }

    +if (!$smtp_server) {
    +       $smtp_server = $repo->config('sendemail.smtpserver');
    +}
     if (!$smtp_server) {
            foreach (qw( /usr/sbin/sendmail /usr/lib/sendmail )) {
                    if (-x $_) {

This patch teaches Git to pick better positions for such "diff sliders"
using heuristics that take the positions of nearby blank lines and the
indentation of nearby lines into account.

The existing Git code basically always shifts such "sliders" as far down
in the file as possible. The only exception is when the slider can be
aligned with a group of changed lines in the other file, in which case
Git favors depicting the change as one add+delete block rather than one
add and a slightly offset delete block. This naive algorithm often
yields ugly diffs.
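
To make the notion of a "slider" concrete, here is a minimal sketch of how
the range of legal shifts for one group could be computed. It is an
illustration only, not the code in xdiff: `struct slide_range` and
`find_slide_range()` are hypothetical names, and the file is assumed to be
available as an array of NUL-terminated lines without trailing newlines.

```c
#include <assert.h>
#include <string.h>

/* How far a group of changed lines can slide without altering the result. */
struct slide_range {
	int up;   /* maximum number of lines the group can move up */
	int down; /* maximum number of lines the group can move down */
};

/*
 * Hypothetical helper (not Git's xdiff code). The group occupies the
 * half-open range [start, end) of lines[], which holds nlines lines.
 */
static struct slide_range find_slide_range(const char **lines, int nlines,
					   int start, int end)
{
	struct slide_range r = { 0, 0 };
	int s, e;

	assert(0 <= start && start < end && end <= nlines);

	/* Up: the line above the group must equal the group's last line. */
	for (s = start, e = end;
	     s > 0 && !strcmp(lines[s - 1], lines[e - 1]);
	     s--, e--)
		r.up++;

	/* Down: the group's first line must equal the line below the group. */
	for (s = start, e = end;
	     e < nlines && !strcmp(lines[s], lines[e]);
	     s++, e++)
		r.down++;

	return r;
}
```

Applied to the git-send-email.perl example above, the added group at the
position chosen by standard Git can slide up by one line and down by none,
which matches the "as far down as possible" behavior just described.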

Commit d634d61ed6 improved the situation somewhat by preferring to
position add/delete groups to make their last line a blank line, when
that is possible. This heuristic does more good than harm, but (1) it
can only help if there are blank lines in the right places, and (2)
always picks the last blank line, even if there are others that might be
better. The end result is that it makes perhaps 1/3 as many errors as
the default Git algorithm, but that still leaves a lot of ugly diffs.
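
The blank-line rule described above can be sketched roughly as follows.
This is an assumption-laden illustration of the behavior, not the code
from d634d61ed6: `is_blank()` and `blank_line_shift()` are hypothetical
names, and the group is assumed to start out at its lowest possible
position.

```c
#include <assert.h>

/* A line counts as blank if it contains only spaces and tabs. */
static int is_blank(const char *line)
{
	for (; *line; line++)
		if (*line != ' ' && *line != '\t')
			return 0;
	return 1;
}

/*
 * Hypothetical helper (not the code from d634d61ed6). The group starts at
 * its lowest possible position, ending just before lines[end], and can
 * slide up by at most max_up lines. Returns how many lines to slide it up
 * so that its last line is blank, or 0 if no such position exists.
 */
static int blank_line_shift(const char **lines, int end, int max_up)
{
	int up;

	assert(max_up >= 0 && end - 1 - max_up >= 0);

	for (up = 0; up <= max_up; up++)
		if (is_blank(lines[end - 1 - up]))
			return up; /* the lowest, i.e. last, blank line wins */
	return 0;
}
```

Because the search stops at the first blank line found when walking up
from the lowest position, it always settles on the last blank line, which
is limitation (2) above.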

This commit implements a new and much better heuristic for picking
optimal "slider" positions using the following approach: First observe
that each hypothetical positioning of a diff slider introduces two
splits: one between the context lines preceding the group and the first
added/deleted line, and the other between the last added/deleted line
and the first line of context following it. The heuristic tries to
find the positioning that creates the least bad splits.

Splits are evaluated based only on the presence and locations of nearby
blank lines, and the indentation of lines near the split. Basically, it
prefers to introduce splits adjacent to blank lines, between lines that
are indented less, and between lines with the same level of indentation.
In more detail:

1. It measures the following characteristics of a proposed splitting
   position in a `struct split_measurement`:

   * the number of blank lines above the proposed split
   * whether the line directly after the split is blank
   * the number of blank lines following that line
   * the indentation of the nearest non-blank line above the split
   * the indentation of the line directly below the split
   * the indentation of the nearest non-blank line after that line

2. It combines the measured attributes using a bunch of
   empirically-optimized weighting factors to derive a `struct
   split_score` that measures the "badness" of splitting the text at
   that position.

3. It combines the `split_score` for the top and the bottom of the
   slider at each of its possible positions, and selects the position
   that has the best `split_score`.
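
The three steps above can be sketched as follows. This is a simplified
illustration under stated assumptions, not the code added to xdiff by this
patch: the badness is collapsed into a single `int` rather than a `struct
split_score`, the three weights (30, 1 and 10) are made up rather than the
optimized ones, tabs are assumed to advance to a multiple of 8 columns, and
`get_indent()`, `measure_split()`, `score_split()` and `best_shift()` are
hypothetical names. The `end_of_file` field is also an addition for the
sketch.

```c
#include <assert.h>
#include <limits.h>

/* Indentation of a line in columns, or -1 if the line is blank. */
static int get_indent(const char *line)
{
	int indent = 0;

	for (; *line; line++) {
		if (*line == ' ')
			indent++;
		else if (*line == '\t')
			indent += 8 - indent % 8;
		else
			return indent;
	}
	return -1;
}

struct split_measurement {
	int end_of_file; /* the split is after the last line of the file */
	int pre_blank;   /* number of blank lines above the split */
	int indent;      /* indentation of the line below the split, -1 if blank */
	int post_blank;  /* number of blank lines following that line */
	int pre_indent;  /* nearest non-blank line above, -1 if none */
	int post_indent; /* nearest non-blank line after that line, -1 if none */
};

/* Step 1: measure a split that falls directly above lines[split]. */
static struct split_measurement measure_split(const char **lines, int nlines,
					      int split)
{
	struct split_measurement m = { 0, 0, -1, 0, -1, -1 };
	int i;

	if (split >= nlines)
		m.end_of_file = 1;
	else
		m.indent = get_indent(lines[split]);

	for (i = split - 1; i >= 0; i--) {
		m.pre_indent = get_indent(lines[i]);
		if (m.pre_indent != -1)
			break;
		m.pre_blank++;
	}
	for (i = split + 1; i < nlines; i++) {
		m.post_indent = get_indent(lines[i]);
		if (m.post_indent != -1)
			break;
		m.post_blank++;
	}
	return m;
}

/* Step 2: badness of a split; lower is better. Weights are illustrative. */
static int score_split(const struct split_measurement *m)
{
	int blanks = m->pre_blank + m->post_blank +
		     (!m->end_of_file && m->indent == -1);
	int indent = m->indent != -1 ? m->indent : m->post_indent;
	int badness = -30 * blanks; /* prefer splits next to blank lines */

	if (indent > 0)
		badness += indent;  /* prefer splits at shallow indentation */
	if (indent != -1 && m->pre_indent != -1 && indent != m->pre_indent)
		badness += 10;      /* prefer equal indentation across the split */
	return badness;
}

/* Step 3: try every allowed shift of [start, end); return the least bad. */
static int best_shift(const char **lines, int nlines, int start, int end,
		      int max_up, int max_down)
{
	int best = 0, best_badness = INT_MAX, shift;

	assert(start < end && max_up >= 0 && max_down >= 0);

	for (shift = -max_up; shift <= max_down; shift++) {
		struct split_measurement top =
			measure_split(lines, nlines, start + shift);
		struct split_measurement bottom =
			measure_split(lines, nlines, end + shift);
		int badness = score_split(&top) + score_split(&bottom);

		if (badness < best_badness) {
			best_badness = badness;
			best = shift;
		}
	}
	return best;
}
```

Even with these made-up weights, the sketch moves the added group in the
git-send-email.perl example up by one line, to the position shown in the
second diff: the top split then sits just below a blank line and both
splits fall between lines at indentation zero.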

I determined the initial set of weighting factors by collecting a corpus
of Git histories from 29 open-source software projects in various
programming languages. I generated many diffs from this corpus, and
determined the best positioning "by eye" for about 6600 diff sliders. I
used about half of the repositories in the corpus (corresponding to
about 2/3 of the sliders) as a training set, and optimized the weights
against this training set using a crude automated search of the
parameter space to get the best agreement with the manually-determined
values.
Then I tested the resulting heuristic against the full corpus. The
results are summarized in the following table, in column `indent-1`:

| repository            | count |      Git 2.9.0 |     compaction | compaction-fixed |       indent-1 |       indent-2 |
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| afnetworking          |   109 |    89  (81.7%) |    37  (33.9%) |      37  (33.9%) |     2   (1.8%) |     2   (1.8%) |
| alamofire             |    30 |    18  (60.0%) |    14  (46.7%) |      15  (50.0%) |     0   (0.0%) |     0   (0.0%) |
| angular               |   184 |   127  (69.0%) |    39  (21.2%) |      23  (12.5%) |     5   (2.7%) |     5   (2.7%) |
| animate               |   313 |     2   (0.6%) |     2   (0.6%) |       2   (0.6%) |     2   (0.6%) |     2   (0.6%) |
| ant                   |   380 |   356  (93.7%) |   152  (40.0%) |     148  (38.9%) |    15   (3.9%) |    15   (3.9%) | *
| bugzilla              |   306 |   263  (85.9%) |   109  (35.6%) |      99  (32.4%) |    14   (4.6%) |    15   (4.9%) | *
| corefx                |   126 |    91  (72.2%) |    22  (17.5%) |      21  (16.7%) |     6   (4.8%) |     6   (4.8%) |
| couchdb               |    78 |    44  (56.4%) |    26  (33.3%) |      28  (35.9%) |     6   (7.7%) |     6   (7.7%) | *
| cpython               |   937 |   158  (16.9%) |    50   (5.3%) |      49   (5.2%) |     5   (0.5%) |     5   (0.5%) | *
| discourse             |   160 |    95  (59.4%) |    42  (26.2%) |      36  (22.5%) |    18  (11.2%) |    13   (8.1%) |
| docker                |   307 |   194  (63.2%) |   198  (64.5%) |     253  (82.4%) |     8   (2.6%) |     8   (2.6%) | *
| electron              |   163 |   132  (81.0%) |    38  (23.3%) |      39  (23.9%) |     6   (3.7%) |     6   (3.7%) |
| git                   |   536 |   470  (87.7%) |    73  (13.6%) |      78  (14.6%) |    16   (3.0%) |    16   (3.0%) | *
| gitflow               |   127 |     0   (0.0%) |     0   (0.0%) |       0   (0.0%) |     0   (0.0%) |     0   (0.0%) |
| ionic                 |   133 |    89  (66.9%) |    29  (21.8%) |      38  (28.6%) |     1   (0.8%) |     1   (0.8%) |
| ipython               |   482 |   362  (75.1%) |   167  (34.6%) |     169  (35.1%) |    11   (2.3%) |    11   (2.3%) | *
| junit                 |   161 |   147  (91.3%) |    67  (41.6%) |      66  (41.0%) |     1   (0.6%) |     1   (0.6%) | *
| lighttable            |    15 |     5  (33.3%) |     0   (0.0%) |       2  (13.3%) |     0   (0.0%) |     0   (0.0%) |
| magit                 |    88 |    75  (85.2%) |    11  (12.5%) |       9  (10.2%) |     1   (1.1%) |     0   (0.0%) |
| neural-style          |    28 |     0   (0.0%) |     0   (0.0%) |       0   (0.0%) |     0   (0.0%) |     0   (0.0%) |
| nodejs                |   781 |   649  (83.1%) |   118  (15.1%) |     111  (14.2%) |     4   (0.5%) |     5   (0.6%) | *
| phpmyadmin            |   491 |   481  (98.0%) |    75  (15.3%) |      48   (9.8%) |     2   (0.4%) |     2   (0.4%) | *
| react-native          |   168 |   130  (77.4%) |    79  (47.0%) |      81  (48.2%) |     0   (0.0%) |     0   (0.0%) |
| rust                  |   171 |   128  (74.9%) |    30  (17.5%) |      27  (15.8%) |    16   (9.4%) |    14   (8.2%) |
| spark                 |   186 |   149  (80.1%) |    52  (28.0%) |      52  (28.0%) |     2   (1.1%) |     2   (1.1%) |
| tensorflow            |   115 |    66  (57.4%) |    48  (41.7%) |      48  (41.7%) |     5   (4.3%) |     5   (4.3%) |
| test-more             |    19 |    15  (78.9%) |     2  (10.5%) |       2  (10.5%) |     1   (5.3%) |     1   (5.3%) | *
| test-unit             |    51 |    34  (66.7%) |    14  (27.5%) |       8  (15.7%) |     2   (3.9%) |     2   (3.9%) | *
| xmonad                |    23 |    22  (95.7%) |     2   (8.7%) |       2   (8.7%) |     1   (4.3%) |     1   (4.3%) | *
| --------------------- | ----- | -------------- | -------------- | ---------------- | -------------- | -------------- |
| totals                |  6668 |  4391  (65.9%) |  1496  (22.4%) |    1491  (22.4%) |   150   (2.2%) |   144   (2.2%) |
| totals (training set) |  4552 |  3195  (70.2%) |  1053  (23.1%) |    1061  (23.3%) |    86   (1.9%) |    88   (1.9%) |
| totals (test set)     |  2116 |  1196  (56.5%) |   443  (20.9%) |     430  (20.3%) |    64   (3.0%) |    56   (2.6%) |

In this table, the numbers are the count and percentage of human-rated
sliders that the corresponding algorithm got *wrong*. The columns are

* "repository" - the name of the repository used. I used the diffs
  between successive non-merge commits on the HEAD branch of the
  corresponding repository.

* "count" - the number of sliders that were human-rated. I chose most,
  but not all, of the sliders to rate from among those for which the
  various algorithms gave different answers.

* "Git 2.9.0" - the default algorithm used by `git diff` in Git 2.9.0.

* "compaction" - the heuristic used by `git diff --compaction-heuristic`
  in Git 2.9.0.

* "compaction-fixed" - the heuristic used by `git diff
  --compaction-heuristic` after the fixes from earlier in this patch
  series. Note that the results are not dramatically different from
  those for "compaction". Both produce non-ideal diffs only about 1/3 as
  often as the default `git diff`.

* "indent-1" - the new `--indent-heuristic` algorithm, using the first
  set of weighting factors, determined as described above.

* "indent-2" - the new `--indent-heuristic` algorithm, using the final
  set of weighting factors, determined as described below.

* `*` - indicates that the repository was part of the training set used
  to determine the first set of weighting factors.

The fact that the heuristic performed nearly as well on the test set as
on the training set in column "indent-1" is a good indication that the
heuristic was not over-trained. Given that fact, I ran a second round of
optimization, using the entire corpus as the training set. The resulting
set of weights gave the results in column "indent-2". These are the
weights included in this patch.

The final result gives consistently and significantly better results
across the whole corpus than either `git diff` or `git diff
--compaction-heuristic`. It makes only about 1/30 as many errors as the
former and about 1/10 as many errors as the latter. (And a good fraction
of the remaining errors are for diffs that involve weirdly-formatted
code, sometimes apparently machine-generated.)
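
The 1/30 and 1/10 figures follow directly from the "totals" row of the
table (4391, 1496 and 144 errors out of 6668 sliders). As a small worked
check, with `error_ratio()` and `print_error_ratios()` being hypothetical
helper names:

```c
#include <assert.h>
#include <stdio.h>

/* How many times more errors the baseline made than the new heuristic. */
static double error_ratio(int baseline_errors, int heuristic_errors)
{
	assert(heuristic_errors > 0);
	return (double)baseline_errors / heuristic_errors;
}

/* Print the ratios using the "totals" row of the table. */
static void print_error_ratios(void)
{
	printf("vs. Git 2.9.0:  %.1f\n", error_ratio(4391, 144));
	printf("vs. compaction: %.1f\n", error_ratio(1496, 144));
}
```

The two ratios come out at roughly 30 and 10 respectively.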

The tools that were used to do this optimization and analysis, along
with the human-generated data values, are recorded in a separate project
[1].

This patch adds a new command-line option `--indent-heuristic`, and a
new configuration setting `diff.indentHeuristic`, that activate this
heuristic. This interface is only meant for testing purposes, and should
be finalized before including this change in any release.

[1] https://github.com/mhagger/diff-slider-tools

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 10:25:11 -07:00
block-sha1 sha1: provide another level of indirection for the SHA-1 functions 2015-11-05 10:35:11 -08:00
builtin Merge branch 'km/fetch-do-not-free-remote-name' 2016-07-06 13:38:08 -07:00
ci travis-ci: build documentation 2016-05-10 11:19:07 -07:00
compat Merge branch 'rj/compat-regex-size-max-fix' 2016-06-27 09:56:47 -07:00
contrib Merge branch 'tb/complete-status' 2016-06-27 09:56:54 -07:00
Documentation diff: improve positioning of add/delete blocks in diffs 2016-09-19 10:25:11 -07:00
ewah ewah: convert to REALLOC_ARRAY, etc 2016-02-22 14:51:09 -08:00
git-gui git-gui/po/glossary/txt-to-pot.sh: use the $( ... ) construct for command substitution 2015-12-27 15:33:13 -08:00
gitk-git Merge branch 'master' of git://ozlabs.org/~paulus/gitk 2016-03-20 18:05:10 -07:00
gitweb Merge branch 'sk/gitweb-highlight-encoding' into HEAD 2016-05-18 14:40:10 -07:00
mergetools mergetools: add support for ExamDiff 2016-04-04 09:15:14 -07:00
perl git-svn: skip mergeinfo handling with --no-follow-parent 2016-06-22 22:48:54 +00:00
po l10n: ko.po: Update Korean translation 2016-06-12 01:25:58 +09:00
ppc sha1: provide another level of indirection for the SHA-1 functions 2015-11-05 10:35:11 -08:00
refs Merge branch 'dt/pre-refs-backend' 2016-04-25 15:17:15 -07:00
t diff: improve positioning of add/delete blocks in diffs 2016-09-19 10:25:11 -07:00
templates Merge branch 'ma/update-hooks-sample-typofix' into maint 2016-03-10 11:13:50 -08:00
vcs-svn vcs-svn: use error_errno() 2016-05-09 12:29:08 -07:00
xdiff diff: improve positioning of add/delete blocks in diffs 2016-09-19 10:25:11 -07:00
.gitattributes
.gitignore test helpers: move test-* to t/helper/ subdirectory 2016-04-15 10:12:19 -07:00
.mailmap .mailmap: update to my shorter email address 2016-05-02 13:29:42 -07:00
.travis.yml Merge branch 'ls/travis-build-doc' into maint 2016-05-26 13:17:25 -07:00
abspath.c Windows: shorten code by re-using convert_slashes() 2016-04-04 18:03:02 -07:00
aclocal.m4
advice.c merge: grammofix in please-commit-before-merge message 2015-10-02 14:29:56 -07:00
advice.h pull: check if in unresolved merge state 2015-06-18 13:17:16 -07:00
alias.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
alloc.c
archive-tar.c archive-tar: convert snprintf to xsnprintf 2016-05-26 10:44:26 -07:00
archive-zip.c Merge branch 'rs/archive-zip-many' into maint 2015-09-03 19:18:01 -07:00
archive.c pathspec: rename free_pathspec() to clear_pathspec() 2016-06-02 14:09:22 -07:00
archive.h
argv-array.c argv-array: add detach function 2016-02-22 14:50:32 -08:00
argv-array.h argv-array: add detach function 2016-02-22 14:50:32 -08:00
attr.c Merge branch 'ss/exc-flag-is-a-collection-of-bits' into maint 2016-04-14 18:37:15 -07:00
attr.h
base85.c
bisect.c bisect.c: use die_errno() and warning_errno() 2016-05-09 12:29:08 -07:00
bisect.h bisect: simplify the addition of new bisect terms 2015-08-03 11:42:41 -07:00
blob.c
blob.h
branch.c worktree.c: check whether branch is rebased in another worktree 2016-04-22 14:09:38 -07:00
branch.h worktree.c: check whether branch is rebased in another worktree 2016-04-22 14:09:38 -07:00
builtin.h Merge branch 'sb/submodule-helper' 2015-10-05 12:30:19 -07:00
bulk-checkin.c use xsnprintf for generating git object headers 2015-09-25 10:18:18 -07:00
bulk-checkin.h
bundle.c bundle: don't leak an fd in case of early return 2016-04-01 10:33:18 -07:00
bundle.h
cache-tree.c struct name_entry: use struct object_id instead of unsigned char sha1[20] 2016-04-25 14:23:42 -07:00
cache-tree.h cache-tree: introduce write_index_as_tree() 2015-08-04 22:02:11 -07:00
cache.h Merge branch 'jk/send-pack-stdio' 2016-07-06 13:38:07 -07:00
check-builtins.sh check-builtins: strip executable suffix $X when enumerating builtins 2015-02-05 12:03:27 -08:00
check-racy.c check-racy.c: use error_errno() 2016-05-09 12:29:08 -07:00
check_bindir
color.c color: add color_set helper for copying raw colors 2015-10-05 11:08:05 -07:00
color.h color: add color_set helper for copying raw colors 2015-10-05 11:08:05 -07:00
column.c use xmallocz to avoid size arithmetic 2016-02-22 14:51:09 -08:00
column.h
combine-diff.c pathspec: rename free_pathspec() to clear_pathspec() 2016-06-02 14:09:22 -07:00
command-list.txt Merge branch 'nd/multiple-work-trees' 2015-07-13 14:02:02 -07:00
commit-slab.h Merge branch 'jc/commit-slab' 2015-08-03 11:01:21 -07:00
commit.c use st_add and st_mult for allocation size computation 2016-02-22 14:51:09 -08:00
commit.h pretty: allow tweaking tabwidth in --expand-tabs 2016-03-30 12:52:26 -07:00
config.c Merge branch 'pc/occurred' 2016-06-27 09:56:42 -07:00
config.mak.in
config.mak.uname mingw: make isatty() recognize MSYS2's pseudo terminals (/dev/pty*) 2016-05-26 13:12:02 -07:00
configure.ac Merge branch 'ky/imap-send-openssl-1.1.0' into maint 2016-05-06 14:53:24 -07:00
connect.c Merge branch 'cn/deprecate-ssh-git-url' 2016-03-16 13:16:40 -07:00
connect.h connect & http: support -4 and -6 switches for remote operations 2016-02-12 11:34:14 -08:00
connected.c connected.c: use error_errno() 2016-05-09 12:29:08 -07:00
connected.h
convert.c convert.c: ident + core.autocrlf didn't work 2016-04-25 12:12:03 -07:00
convert.h ls-files: add eol diagnostics 2016-01-18 19:48:43 -08:00
copy.c copy.c: use error_errno() 2016-05-09 12:29:08 -07:00
COPYING
credential-cache--daemon.c Merge branch 'nd/error-errno' 2016-05-17 14:38:28 -07:00
credential-cache.c credential-cache, send_request: close fd when done 2016-04-01 10:33:18 -07:00
credential-store.c strbuf: introduce strbuf_getline_{lf,nul}() 2016-01-15 10:12:51 -08:00
credential.c credential: let empty credential specs reset helper list 2016-02-26 10:58:14 -08:00
credential.h credential: let helpers tell us to quit 2014-12-04 10:11:12 -08:00
csum-file.c sha1fd_check: die when we cannot open the file 2015-03-19 13:35:15 -07:00
csum-file.h Merge branch 'jk/pack-bitmap' 2014-12-12 14:31:42 -08:00
ctype.c kwset: use unsigned char to store values with high-bit set 2015-03-02 12:32:24 -08:00
daemon.c daemon: enable SO_KEEPALIVE for all sockets 2016-05-25 09:42:53 -07:00
date.c date: make "local" orthogonal to date format 2015-09-03 15:45:26 -07:00
decorate.c Remove get_object_hash. 2015-11-20 08:02:05 -05:00
decorate.h
delta.h
diff-delta.c
diff-lib.c Remove get_object_hash. 2015-11-20 08:02:05 -05:00
diff-no-index.c diff-no-index.c: use error_errno() 2016-05-09 12:29:08 -07:00
diff.c diff: improve positioning of add/delete blocks in diffs 2016-09-19 10:25:11 -07:00
diff.h Merge branch 'mm/diff-renames-default' 2016-04-03 10:29:22 -07:00
diffcore-break.c diff -B -M: fix output for "copy and then rewrite" case 2014-10-23 16:17:09 -07:00
diffcore-delta.c use st_add and st_mult for allocation size computation 2016-02-22 14:51:09 -08:00
diffcore-order.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
diffcore-pickaxe.c react to errors in xdi_diff 2015-09-28 14:57:10 -07:00
diffcore-rename.c Merge branch 'sg/diff-multiple-identical-renames' into maint 2016-04-29 14:15:55 -07:00
diffcore.h
dir.c Merge branch 'nd/worktree-various-heads' 2016-05-23 14:54:29 -07:00
dir.h Merge branch 'nd/worktree-various-heads' 2016-05-23 14:54:29 -07:00
editor.c editor.c: use error_errno() 2016-05-09 12:29:08 -07:00
entry.c entry.c: use error_errno() 2016-05-09 12:29:08 -07:00
environment.c Merge branch 'js/windows-dotgit' into maint 2016-05-26 13:17:23 -07:00
exec_cmd.c Merge branch 'ak/extract-argv0-last-dir-sep' into maint 2016-03-10 11:13:47 -08:00
exec_cmd.h prepare_{git,shell}_cmd: use argv_array 2016-02-22 14:51:09 -08:00
fast-import.c Merge branch 'ew/fast-import-unpack-limit' 2016-06-20 11:01:00 -07:00
fetch-pack.c fetch-pack: isolate sigpipe in demuxer thread 2016-04-20 13:33:56 -07:00
fetch-pack.h
fmt-merge-msg.h
fsck.c Merge branch 'jc/fsck-nul-in-commit' 2016-05-17 14:38:34 -07:00
fsck.h fsck: git receive-pack: support excluding objects from fsck'ing 2015-06-23 14:27:37 -07:00
generate-cmdlist.sh generate-cmdlist: re-implement as shell script 2015-08-25 11:24:31 -07:00
gettext.c introduce "format" date-mode 2015-06-29 11:39:10 -07:00
gettext.h Merge branch 'ye/http-accept-language' 2015-03-06 15:02:25 -08:00
git-add--interactive.perl diff: improve positioning of add/delete blocks in diffs 2016-09-19 10:25:11 -07:00
git-archimport.perl
git-bisect.sh bisect: allow setting any user-specified in 'git bisect start' 2015-08-03 11:42:43 -07:00
git-compat-util.h Merge branch 'nd/error-errno' 2016-05-17 14:38:28 -07:00
git-cvsexportcommit.perl
git-cvsimport.perl Merge branch 'cn/cvsimport-perl-update' 2015-06-25 11:08:08 -07:00
git-cvsserver.perl typofix: assorted typofixes in comments, documentation and messages 2016-05-06 13:16:37 -07:00
git-difftool--helper.sh difftool/mergetool: make the form of yes/no questions consistent 2016-04-25 15:15:17 -07:00
git-difftool.perl difftool: handle unmerged files in dir-diff mode 2016-05-16 14:53:05 -07:00
git-filter-branch.sh Merge branch 'jk/filter-branch-no-index' into maint 2016-02-05 14:54:13 -08:00
git-instaweb.sh git-instaweb: use @SHELL_PATH@ instead of /bin/sh 2015-03-10 15:10:35 -07:00
git-merge-octopus.sh merge-octopus: abort if index does not match HEAD 2016-04-12 18:39:43 -07:00
git-merge-one-file.sh Merge branch 'jk/no-diff-emit-common' into maint 2016-03-10 11:13:42 -08:00
git-merge-resolve.sh
git-mergetool--lib.sh Merge branch 'nf/mergetool-prompt' 2016-05-03 14:08:17 -07:00
git-mergetool.sh Merge branch 'nf/mergetool-prompt' into HEAD 2016-05-18 14:40:07 -07:00
git-p4.py Merge branch 'ls/p4-lfs' 2016-05-10 13:40:29 -07:00
git-parse-remote.sh i18n: git-parse-remote.sh: mark strings for translation 2016-04-19 12:07:49 -07:00
git-quiltimport.sh git-quiltimport: add commandline option --series <file> 2015-09-01 11:10:07 -07:00
git-rebase--am.sh rebase: update comment about FreeBSD /bin/sh 2016-06-17 11:04:38 -07:00
git-rebase--interactive.sh Merge branch 'em/newer-freebsd-shells-are-fine-with-returns' 2016-06-27 09:56:52 -07:00
git-rebase--merge.sh Merge branch 'em/newer-freebsd-shells-are-fine-with-returns' 2016-06-27 09:56:52 -07:00
git-rebase.sh Merge branch 'jc/commit-tree-ignore-commit-gpgsign' 2016-05-13 13:18:27 -07:00
git-relink.perl
git-remote-testgit.sh transport-helper: do not request symbolic refs to remote helpers 2015-01-21 22:46:59 -08:00
git-request-pull.sh
git-send-email.perl Merge branch 'jd/send-email-to-whom' into HEAD 2016-05-18 14:40:07 -07:00
git-sh-i18n.sh
git-sh-setup.sh sane_grep: pass "-a" if grep accepts it 2016-03-10 15:35:43 -08:00
git-stash.sh always quote shell arguments to test -z/-n 2016-05-14 10:37:29 -07:00
git-submodule.sh Merge branch 'sb/submodule-recommend-shallowness' 2016-06-20 11:01:01 -07:00
git-svn.perl git-svn: fix URL canonicalization during init w/ SVN 1.7+ 2016-03-16 20:16:23 +00:00
GIT-VERSION-GEN Start the post-2.9 cycle 2016-06-20 11:06:49 -07:00
git-web--browse.sh
git.c Merge branch 'ak/git-strip-extension-from-dashed-command' into maint 2016-03-10 11:13:48 -08:00
git.rc
gpg-interface.c Merge branch 'nd/error-errno' 2016-05-17 14:38:28 -07:00
gpg-interface.h verify-commit: add option to print raw gpg status information 2015-06-22 14:20:47 -07:00
graph.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
graph.h
grep.c Merge branch 'rs/xdiff-hunk-with-func-line' into maint 2016-06-27 09:56:24 -07:00
grep.h grep: add color.grep.matchcontext and color.grep.matchselected 2014-10-28 10:33:50 -07:00
hashmap.c convert trivial cases to FLEX_ARRAY macros 2016-02-22 14:51:09 -08:00
hashmap.h
help.c convert trivial cases to FLEX_ARRAY macros 2016-02-22 14:51:09 -08:00
help.h
hex.c add reentrant variants of sha1_to_hex and find_unique_abbrev 2015-09-25 10:18:18 -07:00
http-backend.c show_head_ref(): check the result of resolve_ref_namespace() 2016-04-10 11:35:39 -07:00
http-fetch.c
http-push.c Merge branch 'bc/object-id' 2016-05-06 14:45:44 -07:00
http-walker.c http-walker: store url in a strbuf 2015-09-25 10:18:18 -07:00
http.c Merge branch 'ep/http-curl-trace' 2016-07-06 13:38:06 -07:00
http.h Merge branch 'ep/http-curl-trace' 2016-07-06 13:38:06 -07:00
ident.c Merge branch 'da/user-useconfigonly' into HEAD 2016-05-18 14:40:05 -07:00
imap-send.c Merge branch 'ep/http-curl-trace' 2016-07-06 13:38:06 -07:00
INSTALL git-imap-send: use libcurl for implementation 2014-11-10 09:17:27 -08:00
khash.h convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
kwset.c kwset: use unsigned char to store values with high-bit set 2015-03-02 12:32:24 -08:00
kwset.h kwset: use unsigned char to store values with high-bit set 2015-03-02 12:32:24 -08:00
levenshtein.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
levenshtein.h
LGPL-2.1
line-log.c Merge branch 'jc/deref-tag' 2016-06-27 09:56:50 -07:00
line-log.h line-log.c: make line_log_data_init() static 2015-01-15 11:05:47 -08:00
line-range.c
line-range.h
list-objects.c struct name_entry: use struct object_id instead of unsigned char sha1[20] 2016-04-25 14:23:42 -07:00
list-objects.h list-objects: pass full pathname to callbacks 2016-03-16 10:41:04 -07:00
ll-merge.c Merge branch 'jc/ll-merge-internal' 2016-05-17 14:38:32 -07:00
ll-merge.h
lockfile.c lockfile: improve error message when lockfile exists 2016-03-01 10:16:46 -08:00
lockfile.h lockfile: remove function "hold_lock_file_for_append" 2015-08-28 11:32:01 -07:00
log-tree.c pretty: expand tabs in indented logs to make things line up properly 2016-03-30 11:25:35 -07:00
log-tree.h Merge branch 'jn/parse-config-slot' 2014-10-20 12:23:48 -07:00
mailinfo.c strbuf: introduce strbuf_getline_{lf,nul}() 2016-01-15 10:12:51 -08:00
mailinfo.h mailinfo: remove calls to exit() and die() deep in the callchain 2015-10-21 15:59:34 -07:00
mailmap.c Merge branch 'nd/error-errno' 2016-05-17 14:38:28 -07:00
mailmap.h
Makefile Merge branch 'mm/makefile-developer-can-be-in-config-mak' 2016-06-03 14:38:02 -07:00
match-trees.c match-trees: convert several leaf functions to use struct object_id 2016-04-25 14:26:29 -07:00
merge-blobs.c Merge branch 'jk/no-diff-emit-common' into maint 2016-03-10 11:13:42 -08:00
merge-blobs.h
merge-recursive.c Merge branch 'bc/object-id' 2016-05-06 14:45:44 -07:00
merge-recursive.h merge-recursive: option to disable renames 2016-02-17 10:20:51 -08:00
merge.c Convert struct object to object_id 2015-11-20 08:02:05 -05:00
mergesort.c
mergesort.h
name-hash.c convert trivial cases to FLEX_ARRAY macros 2016-02-22 14:51:09 -08:00
notes-cache.c notes: allow treeish expressions as notes ref 2016-01-12 15:10:01 -08:00
notes-cache.h
notes-merge.c pathspec: rename free_pathspec() to clear_pathspec() 2016-06-02 14:09:22 -07:00
notes-merge.h notes: extract enum notes_merge_strategy to notes-utils.h 2015-08-17 15:36:23 -07:00
notes-utils.c notes: allow treeish expressions as notes ref 2016-01-12 15:10:01 -08:00
notes-utils.h notes: extract parse_notes_merge_strategy to notes-utils 2015-08-17 15:38:32 -07:00
notes.c use string_list initializer consistently 2016-06-13 10:37:51 -07:00
notes.h Merge branch 'jk/notes-merge-from-anywhere' 2016-02-03 14:15:59 -08:00
object.c Remove get_object_hash. 2015-11-20 08:02:05 -05:00
object.h Remove get_object_hash. 2015-11-20 08:02:05 -05:00
pack-bitmap-write.c Merge branch 'jk/path-name-safety-2.6' into jk/path-name-safety-2.7 2016-03-16 10:42:32 -07:00
pack-bitmap.c Merge branch 'jk/path-name-safety-2.6' into jk/path-name-safety-2.7 2016-03-16 10:42:32 -07:00
pack-bitmap.h pack-bitmap.c: make pack_bitmap_filename() static 2015-01-15 11:04:10 -08:00
pack-check.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
pack-objects.c
pack-objects.h
pack-revindex.c Merge branch 'jk/tighten-alloc' 2016-02-26 13:37:16 -08:00
pack-revindex.h pack-revindex: store entries directly in packed_git 2015-12-21 14:36:28 -08:00
pack-write.c
pack.h
pager.c Merge branch 'jc/am-i-v-fix' into maint 2016-03-10 11:13:41 -08:00
parse-options-cb.c Merge branch 'jk/parseopt-string-list' into jk/string-list-static-init 2016-06-13 10:37:48 -07:00
parse-options.c parse-options.c: make OPTION_COUNTUP respect "unspecified" values 2016-05-05 11:52:45 -07:00
parse-options.h parse-options: allow -h as a short option 2015-11-20 08:02:07 -05:00
patch-delta.c
patch-ids.c patch-ids: make commit_patch_id() a public helper function 2016-04-26 10:49:57 -07:00
patch-ids.h patch-ids: make commit_patch_id() a public helper function 2016-04-26 10:49:57 -07:00
path.c Merge branch 'lp/typofixes' into maint 2016-05-26 13:17:21 -07:00
pathspec.c pathspec: rename free_pathspec() to clear_pathspec() 2016-06-02 14:09:22 -07:00
pathspec.h pathspec: rename free_pathspec() to clear_pathspec() 2016-06-02 14:09:22 -07:00
pkt-line.c pkt-line: show packets in async processes as "sideband" 2015-09-01 15:11:57 -07:00
pkt-line.h
preload-index.c
pretty.c Merge branch 'et/pretty-format-c-auto' into maint 2016-06-27 09:56:23 -07:00
prio-queue.c
prio-queue.h
progress.c use xmallocz to avoid size arithmetic 2016-02-22 14:51:09 -08:00
progress.h
prompt.c prompt.c: remove git_getpass() nobody uses 2015-01-15 11:02:06 -08:00
prompt.h prompt.c: remove git_getpass() nobody uses 2015-01-15 11:02:06 -08:00
quote.c quote: implement sq_quotef() 2016-03-01 12:24:15 -08:00
quote.h quote: implement sq_quotef() 2016-03-01 12:24:15 -08:00
reachable.c reachable.c: use error_errno() 2016-05-09 12:29:08 -07:00
reachable.h pack-objects: match prune logic for discarding objects 2014-10-16 10:10:43 -07:00
read-cache.c add: add --chmod=+x / --chmod=-x options 2016-06-07 17:43:39 -07:00
README.md README.md: format CLI commands with code syntax 2016-05-31 08:54:24 -07:00
ref-filter.c ref-filter.c: mark strings for translation 2016-02-29 14:27:58 -08:00
ref-filter.h branch.c: use 'ref-filter' APIs 2015-09-25 08:54:54 -07:00
reflog-walk.c reflog: continue walking the reflog past root commits 2016-06-06 15:06:44 -07:00
reflog-walk.h convert "enum date_mode" into a struct 2015-06-29 11:39:07 -07:00
refs.c refs: move resolve_ref_unsafe into common code 2016-04-10 11:35:41 -07:00
refs.h refs.h: fix misspelt "occurred" in a comment 2016-06-10 14:53:32 -07:00
RelNotes Start preparing for 2.9.1 2016-06-27 09:59:51 -07:00
remote-curl.c http: support sending custom HTTP headers 2016-04-27 14:02:33 -07:00
remote-testsvn.c strbuf: introduce strbuf_getline_{lf,nul}() 2016-01-15 10:12:51 -08:00
remote.c Merge branch 'nd/remote-plural-ours-plus-theirs' into maint 2016-05-26 13:17:18 -07:00
remote.h remote: simplify remote_is_configured() 2016-02-16 13:33:12 -08:00
replace_object.c register_replace_ref(): rewrite to take an object_id argument 2015-05-25 12:19:35 -07:00
rerere.c Merge branch 'jc/rerere-multi' 2016-05-23 14:54:38 -07:00
rerere.h Merge branch 'jc/rerere-multi' 2016-04-25 15:17:15 -07:00
resolve-undo.c
resolve-undo.h
revision.c pathspec: rename free_pathspec() to clear_pathspec() 2016-06-02 14:09:22 -07:00
revision.h Merge branch 'lt/pretty-expand-tabs' 2016-04-13 14:12:36 -07:00
run-command.c Merge branch 'jk/push-client-deadlock-fix' into HEAD 2016-05-18 14:40:06 -07:00
run-command.h Merge branch 'jk/push-client-deadlock-fix' into HEAD 2016-05-18 14:40:06 -07:00
send-pack.c send-pack: use buffered I/O to talk to pack-objects 2016-06-08 16:02:40 -07:00
send-pack.h push: support signing pushes iff the server supports it 2015-08-19 12:58:45 -07:00
sequencer.c Merge branch 'mg/cherry-pick-multi-on-unborn' 2016-06-27 09:56:53 -07:00
sequencer.h Merge branch 'jc/conflict-hint' into cc/interpret-trailers-more 2014-11-10 09:56:39 -08:00
server-info.c server-info.c: use error_errno() 2016-05-09 12:29:08 -07:00
setup.c Merge branch 'jc/xstrfmt-null-with-prec-0' into maint 2016-05-02 14:24:14 -07:00
sh-i18n--envsubst.c
sha1-array.c
sha1-array.h
sha1-lookup.c
sha1-lookup.h
sha1_file.c Merge branch 'nd/worktree-various-heads' 2016-05-23 14:54:29 -07:00
sha1_name.c Merge branch 'bc/object-id' 2016-05-06 14:45:44 -07:00
shallow.c use st_add and st_mult for allocation size computation 2016-02-22 14:51:09 -08:00
shell.c strbuf: introduce strbuf_getline_{lf,nul}() 2016-01-15 10:12:51 -08:00
shortlog.h
show-index.c convert trivial cases to ALLOC_ARRAY 2016-02-22 14:51:09 -08:00
sideband.c convert trivial sprintf / strcpy calls to xsnprintf 2015-09-25 10:18:18 -07:00
sideband.h
sigchain.c sigchain: add command to pop all common signals 2015-12-16 12:06:08 -08:00
sigchain.h sigchain: add command to pop all common signals 2015-12-16 12:06:08 -08:00
split-index.c typofix: assorted typofixes in comments, documentation and messages 2016-05-06 13:16:37 -07:00
split-index.h
strbuf.c Merge branch 'jk/getwholeline-getdelim-empty' into maint 2016-04-14 18:57:46 -07:00
strbuf.h Merge branch 'pb/strbuf-read-file-doc' 2016-06-27 09:56:46 -07:00
streaming.c Merge branch 'sb/plug-streaming-leak' 2015-04-14 11:49:09 -07:00
streaming.h
string-list.c string_list: use string-list API in unsorted_string_list_lookup() 2016-04-25 11:48:27 -07:00
string-list.h Merge branch 'sb/string-list' 2014-12-22 12:27:30 -08:00
submodule-config.c submodule-config: keep shallow recommendation around 2016-05-27 10:40:45 -07:00
submodule-config.h submodule-config: keep shallow recommendation around 2016-05-27 10:40:45 -07:00
submodule.c use string_list initializer consistently 2016-06-13 10:37:51 -07:00
submodule.h Merge branch 'jk/submodule-c-credential' 2016-05-17 14:38:25 -07:00
symlinks.c
tag.c verify-tag: move tag verification code to tag.c 2016-04-22 14:06:46 -07:00
tag.h verify-tag: move tag verification code to tag.c 2016-04-22 14:06:46 -07:00
tar.h
tempfile.c register_tempfile(): new function to handle an existing temporary file 2015-08-10 12:57:14 -07:00
tempfile.h register_tempfile(): new function to handle an existing temporary file 2015-08-10 12:57:14 -07:00
thread-utils.c thread-utils.c: detect CPU count on older BSD-like systems 2015-03-10 15:13:28 -07:00
thread-utils.h pack-objects: set number of threads before checking and warning 2014-10-13 12:53:46 -07:00
trace.c trace: use strbuf for quote_crnl output 2015-09-25 10:18:18 -07:00
trace.h pkt-line: support tracing verbatim pack contents 2015-06-16 13:24:22 -07:00
trailer.c trailer.c: mark strings for translation 2016-02-29 14:27:58 -08:00
trailer.h interpret-trailers: add option for in-place editing 2016-01-14 12:22:17 -08:00
transport-helper.c Merge branch 'nd/error-errno' 2016-05-17 14:38:28 -07:00
transport.c Merge branch 'cn/deprecate-ssh-git-url' 2016-03-16 13:16:40 -07:00
transport.h connect & http: support -4 and -6 switches for remote operations 2016-02-12 11:34:14 -08:00
tree-diff.c Merge branch 'jk/avoid-unbounded-alloca' 2016-06-27 09:56:48 -07:00
tree-walk.c tree-walk: convert tree_entry_extract() to use struct object_id 2016-04-25 14:26:28 -07:00
tree-walk.h tree-walk: convert tree_entry_extract() to use struct object_id 2016-04-25 14:26:28 -07:00
tree.c struct name_entry: use struct object_id instead of unsigned char sha1[20] 2016-04-25 14:23:42 -07:00
tree.h Merge branch 'jk/squelch-missing-link-warning-for-unreachable' into maint 2015-06-25 11:02:10 -07:00
unicode_width.h
unimplemented.sh unimplemented.sh: use the $( ... ) construct for command substitution 2015-12-27 15:33:13 -08:00
unix-socket.c
unix-socket.h
unpack-trees.c Merge branch 'nd/error-errno' 2016-05-17 14:38:28 -07:00
unpack-trees.h
update_unicode.sh update_unicode.sh: delete the command group 2014-12-22 10:03:37 -08:00
upload-pack.c upload-pack.c: use parse-options API 2016-05-31 10:17:20 -07:00
url.c use strbuf_complete to conditionally append slash 2015-10-05 11:08:06 -07:00
url.h
urlmatch.c urlmatch.c: make match_urls() static 2015-01-15 11:05:48 -08:00
urlmatch.h urlmatch.c: make match_urls() static 2015-01-15 11:05:48 -08:00
usage.c usage.c: add warning_errno() and error_errno() 2016-05-09 12:29:08 -07:00
userdiff.c userdiff: add built-in pattern for CSS 2016-06-03 14:45:56 -07:00
userdiff.h diff: clarify textconv interface 2016-02-22 10:40:35 -08:00
utf8.c utf8: add function to align a string into given strbuf 2015-09-17 10:02:48 -07:00
utf8.h typofix: assorted typofixes in comments, documentation and messages 2016-05-06 13:16:37 -07:00
varint.c
varint.h
version.c
version.h
versioncmp.c versionsort: support reorder prerelease suffixes 2015-02-27 13:38:22 -08:00
walker.c struct name_entry: use struct object_id instead of unsigned char sha1[20] 2016-04-25 14:23:42 -07:00
walker.h
wildmatch.c typofix: assorted typofixes in comments, documentation and messages 2016-05-06 13:16:37 -07:00
wildmatch.h
worktree.c Merge branch 'nd/worktree-various-heads' 2016-05-23 14:54:29 -07:00
worktree.h branch: do not rename a branch under bisect or rebase 2016-04-22 14:09:39 -07:00
wrap-for-bin.sh wrap-for-bin.sh: regenerate bin-wrappers when switching branches 2016-05-10 13:23:34 -07:00
wrapper.c Merge branch 'nd/error-errno' 2016-05-17 14:38:28 -07:00
write_or_die.c write_or_die: remove the unused write_or_whine() function 2016-06-10 10:54:27 -07:00
ws.c
wt-status.c Use "working tree" instead of "working directory" for git status 2016-06-09 12:21:52 -07:00
wt-status.h wt-status.c: split bisect detection out of wt_status_get_state() 2016-04-22 14:09:39 -07:00
xdiff-interface.c xdiff: don't trim common tail with -W 2016-05-31 13:08:56 -07:00
xdiff-interface.h xdiff: reject files larger than ~1GB 2015-09-28 14:57:23 -07:00
zlib.c zlib: initialize git_zstream in git_deflate_init{,_gzip,_raw} 2015-03-05 15:46:03 -08:00

Git - fast, scalable, distributed revision control system

Git is a fast, scalable, distributed revision control system with an unusually rich command set that provides both high-level operations and full access to internals.

Git is an Open Source project covered by the GNU General Public License version 2 (some parts of it are under different licenses, compatible with the GPLv2). It was originally written by Linus Torvalds with the help of a group of hackers around the net.

Please read the file INSTALL for installation instructions.

Many Git online resources are accessible from http://git-scm.com/, including full documentation and Git-related tools.

See Documentation/gittutorial.txt to get started, then see Documentation/giteveryday.txt for a useful minimum set of commands, and Documentation/git-.txt for documentation of each command. If git has been correctly installed, then the tutorial can also be read with man gittutorial or git help tutorial, and the documentation of each command with man git-<commandname> or git help <commandname>.

CVS users may also want to read Documentation/gitcvs-migration.txt (man gitcvs-migration or git help cvs-migration if git is installed).

The user discussion and development of Git take place on the Git mailing list -- everyone is welcome to post bug reports, feature requests, comments and patches to git@vger.kernel.org (read Documentation/SubmittingPatches for instructions on patch submission). To subscribe to the list, send an email with just "subscribe git" in the body to majordomo@vger.kernel.org. The mailing list archives are available at http://news.gmane.org/gmane.comp.version-control.git/, http://marc.info/?l=git and other archival sites.

The maintainer frequently sends "What's cooking" reports to the mailing list; these list the current status of various development topics. The discussions that follow them give a good reference for project status, development direction and remaining tasks.

The name "git" was given by Linus Torvalds when he wrote the very first version. He described the tool as "the stupid content tracker" and the name as (depending on your mood):

  • random three-letter combination that is pronounceable, and not actually used by any common UNIX command. The fact that it is a mispronunciation of "get" may or may not be relevant.
  • stupid. contemptible and despicable. simple. Take your pick from the dictionary of slang.
  • "global information tracker": you're in a good mood, and it actually works for you. Angels sing, and a light suddenly fills the room.
  • "goddamn idiotic truckload of sh*t": when it breaks.