Commit graph

1633 commits

Author SHA1 Message Date
Junio C Hamano 7b01c71b64 Sync with Git 2.13.7
* maint-2.13:
  Git 2.13.7
  verify_path: disallow symlinks in .gitmodules
  update-index: stat updated files earlier
  verify_dotfile: mention case-insensitivity in comment
  verify_path: drop clever fallthrough
  skip_prefix: add case-insensitive variant
  is_{hfs,ntfs}_dotgitmodules: add tests
  is_ntfs_dotgit: match other .git files
  is_hfs_dotgit: match other .git files
  is_ntfs_dotgit: use a size_t for traversing string
  submodule-config: verify submodule names as paths
2018-05-22 14:10:49 +09:00
Jeff King 10ecfa7649 verify_path: disallow symlinks in .gitmodules
There are a few reasons it's not a good idea to make
.gitmodules a symlink, including:

  1. It won't be portable to systems without symlinks.

  2. It may behave inconsistently, since Git may look at
     this file in the index or a tree without bothering to
     resolve any symbolic links. We don't do this _yet_, but
     the config infrastructure is there and it's planned for
     the future.

With some clever code, we could make (2) work. And some
people may not care about (1) if they only work on one
platform. But there are a few security reasons to simply
disallow it:

  a. A symlinked .gitmodules file may circumvent any fsck
     checks of the content.

  b. Git may read and write from the on-disk file without
     sanity checking the symlink target. So for example, if
     you link ".gitmodules" to "../oops" and run "git
     submodule add", we'll write to the file "oops" outside
     the repository.

Again, both of those are problems that _could_ be solved
with sufficient code, but given the complications in (1) and
(2), we're better off just outlawing it explicitly.

Note the slightly tricky call to verify_path() in
update-index's update_one(). There we may not have a mode if
we're not updating from the filesystem (e.g., we might just
be removing the file). Passing "0" as the mode there works
fine; since it's not a symlink, we'll just skip the extra
checks.

Signed-off-by: Jeff King <peff@peff.net>
2018-05-21 23:50:11 -04:00
Johannes Schindelin e7cb0b4455 is_ntfs_dotgit: match other .git files
When we started to catch NTFS short names that clash with .git, we only
looked for GIT~1. This is sufficient because we only ever clone into an
empty directory, so .git is guaranteed to be the first subdirectory or
file in that directory.

However, even with a fresh clone, .gitmodules is *not* necessarily the
first file to be written that would want the NTFS short name GITMOD~1: a
malicious repository can add .gitmodul0000 and friends, which sorts
before `.gitmodules` and is therefore checked out *first*. For that
reason, we have to test not only for ~1 short names, but for others,
too.

It's hard to just adapt the existing checks in is_ntfs_dotgit(): since
Windows 2000 (i.e., in all Windows versions still supported by Git),
NTFS short names are only generated in the <prefix>~<number> form up to
number 4. After that, a *different* prefix is used, calculated from the
long file name using an undocumented, but stable algorithm.

For example, the short name of .gitmodules would be GITMOD~1, but if it
is taken, and all of ~2, ~3 and ~4 are taken, too, the short name
GI7EBA~1 will be used. From there, collisions are handled by
incrementing the number, shortening the prefix as needed (until ~9999999
is reached, in which case NTFS will not allow the file to be created).

We'd also want to handle .gitignore and .gitattributes, which suffer
from a similar problem, using the fall-back short names GI250A~1 and
GI7D29~1, respectively.

To accommodate for that, we could reimplement the hashing algorithm, but
it is just safer and simpler to provide the known prefixes. This
algorithm has been reverse-engineered and described at
https://usn.pw/blog/gen/2015/06/09/filenames/, which is defunct but
still available via https://web.archive.org/.

These can be recomputed by running the following Perl script:

-- snip --
use warnings;
use strict;

sub compute_short_name_hash ($) {
        my $checksum = 0;
        foreach (split('', $_[0])) {
                $checksum = ($checksum * 0x25 + ord($_)) & 0xffff;
        }

        $checksum = ($checksum * 314159269) & 0xffffffff;
        $checksum = 1 + (~$checksum & 0x7fffffff) if ($checksum & 0x80000000);
        $checksum -= (($checksum * 1152921497) >> 60) * 1000000007;

        return scalar reverse sprintf("%x", $checksum & 0xffff);
}

print compute_short_name_hash($ARGV[0]);
-- snap --

E.g., running that with the argument ".gitignore" will
result in "250a" (which then becomes "gi250a" in the code).

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
2018-05-21 23:50:11 -04:00
Ramsay Jones 356a293f39 cache.h: hex2chr() - avoid -Wsign-compare warnings
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-22 13:00:38 +09:00
Junio C Hamano f04f860dfa Merge branch 'sb/sha1-file-cleanup' into maint
Code clean-up.

* sb/sha1-file-cleanup:
  sha1_file: make read_info_alternates static
2017-09-10 17:03:04 +09:00
Junio C Hamano 822a4d4178 Merge branch 'jk/hashcmp-memcmp' into maint
Code clean-up.

* jk/hashcmp-memcmp:
  hashcmp: use memcmp instead of open-coded loop
2017-09-10 17:02:59 +09:00
Junio C Hamano df2dd28316 Merge branch 'jt/subprocess-handshake' into maint
Code cleanup.

* jt/subprocess-handshake:
  sub-process: refactor handshake to common function
  Documentation: migrate sub-process docs to header
  convert: add "status=delayed" to filter process protocol
  convert: refactor capabilities negotiation
  convert: move multiple file filter error handling to separate function
  convert: put the flags field before the flag itself for consistent style
  t0021: write "OUT <size>" only on success
  t0021: make debug log file name configurable
  t0021: keep filter log files on comparison
2017-08-23 14:33:52 -07:00
Stefan Beller 2456990dfd sha1_file: make read_info_alternates static
read_info_alternates is not used from outside, so let's make it static.

We have to declare the function before link_alt_odb_entry instead of
moving the code around, link_alt_odb_entry calls read_info_alternates,
which in turn calls link_alt_odb_entry.

Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-15 14:39:25 -07:00
Jeff King 0b006014c8 hashcmp: use memcmp instead of open-coded loop
In 1a812f3a70 (hashcmp(): inline memcmp() by hand to
optimize, 2011-04-28), it was reported that an open-coded
loop outperformed memcmp() for comparing sha1s.

Discussion[1] a few years later in 2013 showed that this
depends on your libc's version of memcmp(). In particular,
glibc 2.13 optimized their memcmp around 2011. Here are
current timings with glibc 2.24 (best-of-five, on
linux.git):

  [before this patch, open-coded]
  $ time git rev-list --objects --all
  real	0m35.357s
  user	0m35.016s
  sys	0m0.340s

  [after this patch, memcmp]
  real	0m32.930s
  user	0m32.630s
  sys	0m0.300s

Now that we've had 6 years for that version of glibc to
make its way onto people's machines, it's worth revisiting
our benchmarks and switching to memcmp().

It may be that there are other non-glibc systems where
memcmp() isn't as well optimized. But since our single data
point in favor of open-coding was on a now-ancient glibc, we
should probably assume the system memcmp is good unless
proven otherwise. We may end up with a SLOW_MEMCMP Makefile
knob, but we can hold off on that until we actually find
such a system in practice.

[1] https://public-inbox.org/git/20130318073229.GA5551@sigill.intra.peff.net/

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-09 11:03:25 -07:00
Junio C Hamano 230ce07d13 Git 2.13.5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJZgNa+AAoJELC16IaWr+bLaMIP/1tHYbkQ/iMvYE8RpV5SXZOC
 nKm8IHP4Hu+05gmp874Gw3XtF+FELC53Q2nMc3L4mJ/ZSjuJOuein9aVisapBluw
 IZ8UaxmgN1NUA8gDVkXULNMJGDaOQw+VckMrEAI3A0uYXGY2eAiHR3Q+p0txhHb9
 jfhSsnl7Rv3q6LeDOPMKpwPVT0+uxBklrli7YcIn9IssbQhAvDUpbZ0Ab/fEOH6j
 NDIsZZ8opEESsUE5WBCOVXKUYjZOpLLpU4dQXa+JBj019LRmUYxLgjGVt2BSuUh/
 K8xe6/3P1FOQF1tMY4Bjb2iIUnc0wzIQYULn9dqJthV0Ybz0qwT5bTt4IYYKs86I
 /XjJPI9cAQHNirafyUyTrWy95HGnvYSyvmNC4a2ElvD24i/GKCuRQY7O5MCT3fjB
 5jUH2VxxA5E1TvkeG4VHl0d8WZib+/4CWd0OwSXk9LJJC/C/OTUlBa2dakOpwtgS
 RNGM+8+gzzd5rv1/UL+vAiqtCYjDfU+uqsjP5fRnMyTZiCmbhRcdW9b1TRc4OMoe
 wpbSbz0L18IAsyqZ+KLhyZOCr5mxjrVCxV++efI+NhsRecmO5nbPNtRGKf7/AtAQ
 +e5hROZRSFwf8/bXoobcOvhpuvW36+0mVXxIOGIoYtXB6AdtvGFXi9TnC/rTLBZG
 zuj/z2fmgo3F0G2tnNxk
 =t0hU
 -----END PGP SIGNATURE-----

Merge tag 'v2.13.5' into maint
2017-08-04 12:40:37 -07:00
Junio C Hamano e312af164c Git 2.12.4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJZflhUAAoJELC16IaWr+bLDyAP/jWDc9ic8S1ZH8W4ijAB24vP
 YRyQ1gbnRLhpEpbHYCUp7Uw9mrJBfdwYFlqxGJPH4JZL9qYLJUe5DJMWi5uAEptg
 tYPpPMLV5hgvGICwJbOaS5NlNf2NzLjRvzziOpUnE5CcR5Bw7doCPk4Uw6AVvAvK
 0x/6KDNLdKCBl3ZIoLdp9eW2PrTfYx6AK+Wf9oEgdMSB9+23acL7R/QEmH7oh9gl
 BS0riRQVHnku5akybMnRjeba7SvdhJlIV8rPc4WpuMRz0g2lPzOKQ+okeRtdQrfi
 REdEZ920EJR65KtxUgxYLrpPpmdRBxNI0jXC3Sm2Kac85MLvjFqhaosBWhTQuoOf
 tra68Gb9WSVkKLwRhRBYOG+dx00m1UETs7cYm6pw37RiMss1pcZWNdzjNNouVEEp
 3LBXcPJSpCbEjI+U/H2CqLqCk9gMfKLJXB9hK4b9jBcB9yrON2d75tPMhOcNx+Ej
 x6vZ4Zql6r1Bhe8y7T6KMnLe6vdli8Vrd7Tj5btogcEUmVfRQVHZzV94utevv9A5
 UEXLeCjJSjcY7rYtTdSLXgESioHW8WNfG+TPiyxjujSybtxGKmkcrSGCrugT26K8
 UT5VH2mYJOuHRtWnjWEEEhjayaXLv0mHNQ5XVfNDNPEFqRBQmIhLhcIf/aOF6r+F
 4Q6qN9QceJUEiaFnHsyO
 =ZBXN
 -----END PGP SIGNATURE-----

Merge tag 'v2.12.4' into maint
2017-08-01 12:27:31 -07:00
Junio C Hamano 3def5e9a8d Git 2.11.3
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJZfleHAAoJELC16IaWr+bLxB8QANsdFCtO+/PFnda2CmadVt/J
 d4AGMSu+cD74aUp5wzMscROCggn3vMHVeDMdVJ3ihcY6nLjJRy0EC/VJ5yTpSGli
 iq2GjmoH/oTS2tq2JWbTe86VMVYAzuWlWyowwH6OymDLkBQcAOap1WfUHTmKehUi
 BV2br1x15c7hRGToFqN8yed39iVmQoDJ5ETTBgFqkVyVHDdlyc81FRt0RfiA2x3N
 nm5/gOOWvH5X4Cyu7yP2C9GSV9p1mufEtw1DNwp+MV3n3wa2P4wJeNnYYmW85hpS
 ZzuWEM9QcU3fbShHxHcwHCyy2imXUUsfm1/Y6rCH3ZVSzo1icz5ghL2rnmcxdZvS
 JMp60EKbaapUiIkI23R2Yvlh81J5frwOp739DYytlai3rZF7le9KYGQnsUrv95Ie
 CvFGr3Btiy3oEVOP7xRiGnGtThmVRP4mFsIIIgf3YsBJqRXRwxqn1D6jbkHBqu7z
 VfFnpp63BsKY59Udo1qilkxS2qQ35gAS+TNczPV9D0m3n3bZ5UXEMuonahAE5YwG
 d20wBNOd86oK4khtMWcxXx4BBx+tlA99FfQOgxvn3XWnHmTAJE3+L0uEajZpEpcU
 gkHLo0EutMY+xmX9+jwszmBS9gNL9xzFADtAoYIoAsmpaD7jBJsTjwyzstTyXLvr
 5jcZT/hyX4iZtOUlC67J
 =fCBm
 -----END PGP SIGNATURE-----

Merge tag 'v2.11.3' into maint-2.12

Git 2.11.3
2017-07-30 15:04:22 -07:00
Junio C Hamano 05bb78abc1 Git 2.10.4
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJZflbsAAoJELC16IaWr+bLUUAQAONDi4Ty/I9K79Nwv9/15HcV
 4oMfCC56Y8nLUs83GsS0aZadX15iABsOACtZUA0kPQB8PV2XYcUnM6rPFtpeovRI
 sfs0XqfK5l+cVDoMMb2sGAfQYEBIXRy8sUb1EXIuJ4MzHxfRbmm1sKd7ko3lg6hN
 JHhGsNpzIVRspuUZh+yXp0Qa8CKKnekhwEntVd5b71eahG3lJNBO7UXvDAkDyl33
 amoc5eqKdoGvjs3yYBvOV0qX8ePV53wieKwL5uBG6LdjMrjtWpLJOuMk6IYR18Sm
 ++A+WiCb14lQ/6Wfu+r7WhjaWIXHHMPV/5YMhm1OzrWKiw+DuucLVaorl3cSPA2G
 zNPoHGUGxfnKz0NLiMkpbjUfB0gYqqLKts5pcnKeTconUcLZlpYKEYNpypfgbJyr
 XvIgkjAt3KwRa8mrGvCURkelmYKzFzd+hZdxvXiJ/flk4CcssgMgYorWCMwwy86a
 uErlgWDcGh9wtV9Pwy8M7EwXcRDggBND5jqH2dpFUaQ+8Kzm11lX5BRseZIOASzL
 ++MuZGEQiETz2HkWb+DWMIDAJMej2N2DF1eq7DnsmEUZgOarf2ZP3Lsd84W43WLI
 PdLhA1zpL2YVz9EEeFT/hLSX3fC16+lkeVQhtV5pJlIiLumHOdWYBElsnX694Nv3
 JTE4X1l38kCBQ4on8eEo
 =R9MM
 -----END PGP SIGNATURE-----

Merge tag 'v2.10.4' into maint-2.11

Git 2.10.4
2017-07-30 15:01:31 -07:00
Junio C Hamano d78f06a1b7 Git 2.9.5
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJZflVaAAoJELC16IaWr+bL/5QP/1NoUGqrwB+zwJ8+oDqd+Djl
 PX8qyafoMXJr/w/fACk8r/tCSGKgK8Gx9FqZ9GIBCAZVNXkQnheRElOjiuRg4rbl
 +USiN2XM4ue/X7GqEBc7YVAmd+ifFFQ+ckm1g72A53B4Qh4/Ca4MnPYLOi7eKfC1
 85f+/zMj/5pYsmboFZzFiUPq+Khyb2e85Mm9ok+l/8zAXt4ER5cf4mhY3KSEtnfA
 6qGVUJ3fS9FzE4ud+/cx2qidsTrzZI/Hpv+3TVVXzSv5j32D3srnumLs+XnVIarV
 nJFoVUZV/XSC80YUkwbcdY6Rs2gVfhHJK6zVcs8MfHC28o+ZJDM+ceGVnUKcdpDW
 Gejsc7l0Blt0IodLoHAemBOsF3eeQBh5M5vodHdEFTiCdGRcCX3lvPxikCILW1Fv
 4FPmrjfOlWEz0ktV4eKacX+DVAa2p9P09v0B6pKFt/l5MiHKla8qdYXLjEnEHHaN
 ywIJPK0Lbgr+rjf3XcEQ96sjP+2XOcmtwTxychEcQ7Z2IwqyJA/GtdyCh1/jinap
 0M9odRHtYHRk1qUcZBLosM3C3Y0rgc2k1RZJRgdAY1kiBezctoU6FkH5Pb7LFRtH
 hr3/llk9X1ivh6fruLZ6Lu2EZ/vJVOwtUNLFqPO8fLP4cABkhDdxX13o5PS+qYMJ
 THXReDUV4vgtmzKrgJ+7
 =w1+M
 -----END PGP SIGNATURE-----

Merge tag 'v2.9.5' into maint-2.10

Git 2.9.5
2017-07-30 14:57:33 -07:00
Junio C Hamano af0178aec7 Git 2.8.6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJZflRcAAoJELC16IaWr+bLOnAQAIEtjMActDfpYb+tXftBIzzm
 Od/tBG3WZMRPyq/fWExV9nPO5xYOf6O9PlU6H7rNMDh+2n5/ypxqEXDvjzNHRMyh
 TIk1oAjG0zDiSe/fHO4v3fcCeIne0C0ZDwzYjS9+mSnybmPRLMQ1j8ykV7oBIUlB
 A081Tcb86bxG9kdxO4Sih+0zIglZ1lNA9fH7PqY5v/DqBY9TkaZIuoEjCIo7wUYu
 k+kSrNjXWz8HdYovpO/snhgtU7TFS7OtWmYEvXBg4+p6R1nGCuSWejHeWrbqx3fI
 QPXdLXIua/NqZKdd6ad4K+K91XW1OaqnK49IY58sSzHXYiDRnfnmBDzduyuagEE1
 C3BQhALMvkGZBmkNI1unZBqxsz4E7hviyxeOt1W3Z/I8mt6IGGnLWg+oVEy4b3yj
 TAx4rQJs1xmGU5maR25yBnQI/ElZWHNg+vrtGhdt5XvklASwn8egukjAjUWJodie
 hs/BiMKf+Rk7dVPY6RnK94pHWtNpkTlD9VCaLXhmFN863Zc3DwYBcbUF2D78d5G8
 zLG1pQRtWizAjF9XJ/q01JAutHUyyoYGWwa8lKJvplxQaXwe0bntzPILZN81G1Cy
 mC955bsbyIGv+88elRAeYpu7SxQJ1uGmpMYcamdLr7irDF2bUZp7n55Ogia4IKvK
 LgvwELkejo1WgDBYvqET
 =iOsd
 -----END PGP SIGNATURE-----

Merge tag 'v2.8.6' into maint-2.9

Git 2.8.6
2017-07-30 14:52:14 -07:00
Junio C Hamano 7720c33f63 Git 2.7.6
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJZflNxAAoJELC16IaWr+bLpSsQAIT1s4c/uKAXJBw8CegM4SP1
 SeB5NMnjz7VVtBsdXKPy6fVXBHCjffON/MvNXcXwGqzx3lh6SiMAVNjYknBkQcKN
 b639dD9HEEBRFf62a+QAyRYbFeg0NONVydB25s7RfR57HUNxFibaJDT5SoymO0/5
 YCdmMENuvijvCYcwyb3MSjAKCkwDDErPzyI4NZ2YZpC7IG46Uoxq8BCdHpKhXa5I
 3TNEDruBAd/UJCIQiMW1HP3OMQXzXmCTL5i4QSr/uloO1kNzkWgCZDkkFrSGFPdx
 UeTRXOM0r5QdFXZC36zZNoL5ELflgzrYFSerj6VkCAbiG4FAWL+43CCxuUcq5OkZ
 JsTYObieBMFiaowTn9hKo3ix1xDSjR2+p0bfZbOPy5jMB85oegnjV3Rp/eBoXsDm
 h4qo+5kv0h8H2wKdxcBfVg6LkpBZGsvEOveAtWZIcFIVIOyULj9UAsnTwOotwQiL
 NHO4J2fJhcvSYUj6oGB3SpabKZfcbVXRE2fzZq+3+Mt4DdzSdSmx5CEJfUmxN7sQ
 YLb8UKSr2vv03YfKRghCGxqjOcmQL5vY79O8+QSN3cCDFFAwxzNYaGeHJ+/chvh2
 NySOkUf/uA7H1xQiZmJI1mfwQvi527MEzblCPDButm6n8ty6QyWOQ+kQYzcW5jjI
 kPWdqc5pCZQ+Q+q6lQc0
 =rNay
 -----END PGP SIGNATURE-----

Merge tag 'v2.7.6' into maint-2.8

Git 2.7.6
2017-07-30 14:46:43 -07:00
Jeff King 2491f77b90 connect: factor out "looks like command line option" check
We reject hostnames that start with a dash because they may
be confused for command-line options. Let's factor out that
notion into a helper function, as we'll use it in more
places. And while it's simple now, it's not clear if some
systems might need more complex logic to handle all cases.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-28 15:51:56 -07:00
Junio C Hamano 487fe1ffcd Merge branch 'ls/filter-process-delayed' into jt/subprocess-handshake
* ls/filter-process-delayed:
  convert: add "status=delayed" to filter process protocol
  convert: refactor capabilities negotiation
  convert: move multiple file filter error handling to separate function
  convert: put the flags field before the flag itself for consistent style
  t0021: write "OUT <size>" only on success
  t0021: make debug log file name configurable
  t0021: keep filter log files on comparison
2017-07-26 12:56:19 -07:00
Junio C Hamano 00b7cf2379 Merge branch 'jt/unify-object-info'
Code clean-ups.

* jt/unify-object-info:
  sha1_file: refactor has_sha1_file_with_flags
  sha1_file: do not access pack if unneeded
  sha1_file: teach sha1_object_info_extended more flags
  sha1_file: refactor read_object
  sha1_file: move delta base cache code up
  sha1_file: rename LOOKUP_REPLACE_OBJECT
  sha1_file: rename LOOKUP_UNKNOWN_OBJECT
  sha1_file: teach packed_object_info about typename
2017-07-05 13:32:57 -07:00
Junio C Hamano 5ab148dda0 Merge branch 'rs/sha1-name-readdir-optim'
Optimize "what are the object names already taken in an alternate
object database?" query that is used to derive the length of prefix
an object name is uniquely abbreviated to.

* rs/sha1-name-readdir-optim:
  sha1_file: guard against invalid loose subdirectory numbers
  sha1_file: let for_each_file_in_obj_subdir() handle subdir names
  p4205: add perf test script for pretty log formats
  sha1_name: cache readdir(3) results in find_short_object_filename()
2017-07-05 13:32:56 -07:00
Lars Schneider 2841e8f81c convert: add "status=delayed" to filter process protocol
Some `clean` / `smudge` filters may require a significant amount of
time to process a single blob (e.g. the Git LFS smudge filter might
perform network requests). During this process the Git checkout
operation is blocked and Git needs to wait until the filter is done to
continue with the checkout.

Teach the filter process protocol, introduced in edcc8581 ("convert: add
filter.<driver>.process option", 2016-10-16), to accept the status
"delayed" as response to a filter request. Upon this response Git
continues with the checkout operation. After the checkout operation Git
calls "finish_delayed_checkout" which queries the filter for remaining
blobs. If the filter is still working on the completion, then the filter
is expected to block. If the filter has completed all remaining blobs
then an empty response is expected.

Git has a multiple code paths that checkout a blob. Support delayed
checkouts only in `clone` (in unpack-trees.c) and `checkout` operations
for now. The optimization is most effective in these code paths as all
files of the tree are processed.

Signed-off-by: Lars Schneider <larsxschneider@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-30 13:50:41 -07:00
Jonathan Tan e83e71c5e1 sha1_file: refactor has_sha1_file_with_flags
has_sha1_file_with_flags() implements many mechanisms in common with
sha1_object_info_extended(). Make has_sha1_file_with_flags() a
convenience function for sha1_object_info_extended() instead.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26 10:28:58 -07:00
Jonathan Tan dfdd4afcf9 sha1_file: teach sha1_object_info_extended more flags
Improve sha1_object_info_extended() by supporting additional
flags. This allows has_sha1_file_with_flags() to be modified to use
sha1_object_info_extended() in a subsequent patch.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-26 10:28:42 -07:00
René Scharfe 70c49050d4 sha1_file: guard against invalid loose subdirectory numbers
Loose object subdirectories have hexadecimal names based on the first
byte of the hash of contained objects, thus their numerical
representation can range from 0 (0x00) to 255 (0xff).  Change the type
of the corresponding variable in for_each_file_in_obj_subdir() and
associated callback functions to unsigned int and add a range check.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-24 11:09:52 -07:00
Brandon Williams e7d72d0753 path: create path.h
Move all path related declarations from cache.h to a new path.h header
file.  This makes cache.h smaller and makes it easier to add new path
related functions.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:24:34 -07:00
Brandon Williams c14c234f22 environment: place key repository state in the_repository
Migrate 'git_dir', 'git_common_dir', 'git_object_dir', 'git_index_file',
'git_graft_file', and 'namespace' to be stored in 'the_repository'.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:24:34 -07:00
Brandon Williams 73f192c991 setup: don't perform lazy initialization of repository state
Under some circumstances (bogus GIT_DIR value or the discovered gitdir
is '.git') 'setup_git_directory()' won't initialize key repository
state.  This leads to inconsistent state after running the setup code.
To account for this inconsistent state, lazy initialization is done once
a caller asks for the repository's gitdir or some other piece of
repository state.  This is confusing and can be error prone.

Instead let's tighten the expected outcome of 'setup_git_directory()'
and ensure that it initializes repository state in all cases that would
have been handled by lazy initialization.

This also lets us drop the requirement to have 'have_git_dir()' check if
the environment variable GIT_DIR was set as that will be handled by the
end of the setup code.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 18:24:34 -07:00
Junio C Hamano 25bf951381 Merge branches 'bw/ls-files-sans-the-index' and 'bw/config-h' into bw/repo-object
* bw/ls-files-sans-the-index:
  ls-files: factor out tag calculation
  ls-files: factor out debug info into a function
  ls-files: convert show_files to take an index
  ls-files: convert show_ce_entry to take an index
  ls-files: convert prune_cache to take an index
  ls-files: convert ce_excluded to take an index
  ls-files: convert show_ru_info to take an index
  ls-files: convert show_other_files to take an index
  ls-files: convert show_killed_files to take an index
  ls-files: convert write_eolinfo to take an index
  ls-files: convert overlay_tree_on_cache to take an index
  tree: convert read_tree to take an index parameter
  convert: convert renormalize_buffer to take an index
  convert: convert convert_to_git to take an index
  convert: convert convert_to_git_filter_fd to take an index
  convert: convert crlf_to_git to take an index
  convert: convert get_cached_convert_stats_ascii to take an index

* bw/config-h:
  config: don't implicitly use gitdir or commondir
  config: respect commondir
  setup: teach discover_git_directory to respect the commondir
  config: don't include config.h by default
  config: remove git_config_iter
  config: create config.h
  alias: use the early config machinery to expand aliases
  t7006: demonstrate a problem with aliases in subdirectories
  t1308: relax the test verifying that empty alias values are disallowed
  help: use early config when autocorrecting aliases
  config: report correct line number upon error
  discover_git_directory(): avoid setting invalid git_dir
2017-06-23 18:24:00 -07:00
René Scharfe cc817ca3ef sha1_name: cache readdir(3) results in find_short_object_filename()
Read each loose object subdirectory at most once when looking for unique
abbreviated hashes.  This speeds up commands like "git log --pretty=%h"
considerably, which previously caused one readdir(3) call for each
candidate, even for subdirectories that were visited before.

The new cache is kept until the program ends and never invalidated.  The
same is already true for pack indexes.  The inherent racy nature of
finding unique short hashes makes it still fit for this purpose -- a
conflicting new object may be added at any time.  Tasks with higher
consistency requirements should not use it, though.

The cached object names are stored in an oid_array, which is quite
compact.  The bitmap for remembering which subdir was already read is
stored as a char array, with one char per directory -- that's not quite
as compact, but really simple and incurs only an overhead equivalent to
11 hashes after all.

Suggested-by: Jeff King <peff@peff.net>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-22 12:07:51 -07:00
Jonathan Tan c84a1f3ed4 sha1_file: refactor read_object
read_object() and sha1_object_info_extended() both implement mechanisms
such as object replacement, retrying the packed store after failing to
find the object in the packed store then the loose store, and being able
to mark a packed object as bad and then retrying the whole process.
Consolidating these mechanisms would be a great help to maintainability.

Therefore, consolidate them by extending sha1_object_info_extended() to
support the functionality needed, and then modifying read_object() to
use sha1_object_info_extended().

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21 18:54:43 -07:00
Jonathan Tan 1f0c0d36c1 sha1_file: rename LOOKUP_REPLACE_OBJECT
The LOOKUP_REPLACE_OBJECT flag controls whether the
lookup_replace_object() function is invoked by
sha1_object_info_extended(), read_sha1_file_extended(), and
lookup_replace_object_extended(), but it is not immediately clear which
functions accept that flag.

Therefore restrict this flag to only sha1_object_info_extended(),
renaming it appropriately to OBJECT_INFO_LOOKUP_REPLACE and adding some
documentation. Update read_sha1_file_extended() to have a boolean
parameter instead, and delete lookup_replace_object_extended().

parse_sha1_header() also passes this flag to
parse_sha1_header_extended() since commit 46f0344 ("sha1_file: support
reading from a loose object of unknown type", 2015-05-03), but that has
had no effect since that commit. Therefore this patch also removes this
flag from that invocation.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21 18:54:43 -07:00
Jonathan Tan 19fc5e84a7 sha1_file: rename LOOKUP_UNKNOWN_OBJECT
The LOOKUP_UNKNOWN_OBJECT flag was introduced in commit 46f0344
("sha1_file: support reading from a loose object of unknown type",
2015-05-03) in order to support a feature in cat-file subsequently
introduced in commit 39e4ae3 ("cat-file: teach cat-file a
'--allow-unknown-type' option", 2015-05-03). Despite its name and
location in cache.h, this flag is used neither in
read_sha1_file_extended() nor in any of the lookup functions, but used
only in sha1_object_info_extended().

Therefore rename this flag to OBJECT_INFO_ALLOW_UNKNOWN_TYPE, taking the
name of the cat-file flag that invokes this feature, and move it closer
to the declaration of sha1_object_info_extended(). Also add
documentation for this flag.

OBJECT_INFO_ALLOW_UNKNOWN_TYPE is defined to 2, not 1, to avoid
conflicting with LOOKUP_REPLACE_OBJECT. Avoidance of this conflict is
necessary because sha1_object_info_extended() supports both flags.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-21 18:54:43 -07:00
Junio C Hamano a6f38c109b Merge branch 'bw/object-id'
Conversion from uchar[20] to struct object_id continues.

* bw/object-id: (33 commits)
  diff: rename diff_fill_sha1_info to diff_fill_oid_info
  diffcore-rename: use is_empty_blob_oid
  tree-diff: convert path_appendnew to object_id
  tree-diff: convert diff_tree_paths to struct object_id
  tree-diff: convert try_to_follow_renames to struct object_id
  builtin/diff-tree: cleanup references to sha1
  diff-tree: convert diff_tree_sha1 to struct object_id
  notes-merge: convert write_note_to_worktree to struct object_id
  notes-merge: convert verify_notes_filepair to struct object_id
  notes-merge: convert find_notes_merge_pair_ps to struct object_id
  notes-merge: convert merge_from_diffs to struct object_id
  notes-merge: convert notes_merge* to struct object_id
  tree-diff: convert diff_root_tree_sha1 to struct object_id
  combine-diff: convert find_paths_* to struct object_id
  combine-diff: convert diff_tree_combined to struct object_id
  diff: convert diff_flush_patch_id to struct object_id
  patch-ids: convert to struct object_id
  diff: finish conversion for prepare_temp_file to struct object_id
  diff: convert reuse_worktree_file to struct object_id
  diff: convert fill_filespec to struct object_id
  ...
2017-06-19 12:38:44 -07:00
Brandon Williams d3fb71b3cb setup: teach discover_git_directory to respect the commondir
Currently 'discover_git_directory' only looks at the gitdir to determine
if a git directory was discovered.  This causes a problem in the event
that the gitdir which was discovered was in fact a per-worktree git
directory and not the common git directory.  This is because the
repository config, which is checked to verify the repository's format,
is stored in the commondir and not in the per-worktree gitdir.  Correct
this behavior by checking the config stored in the commondir.

It will also be of use for callers to have access to the commondir, so
lets also return that upon successfully discovering a git directory.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Brandon Williams b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Brandon Williams e67a57fc51 config: create config.h
Move all config related declarations from cache.h to a new config.h
header file.  This makes cache.h smaller and allows for the opportunity
in a following patch to only include config.h when needed.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Brandon Williams 312c984a02 ls-files: convert overlay_tree_on_cache to take an index
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-13 11:40:51 -07:00
Junio C Hamano 7ef0d04738 Merge branch 'jk/diff-blob'
The result from "git diff" that compares two blobs, e.g. "git diff
$commit1:$path $commit2:$path", used to be shown with the full
object name as given on the command line, but it is more natural to
use the $path in the output and use it to look up .gitattributes.

* jk/diff-blob:
  diff: use blob path for blob/file diffs
  diff: use pending "path" if it is available
  diff: use the word "path" instead of "name" for blobs
  diff: pass whole pending entry in blobinfo
  handle_revision_arg: record paths for pending objects
  handle_revision_arg: record modes for "a..b" endpoints
  t4063: add tests of direct blob diffs
  get_sha1_with_context: dynamically allocate oc->path
  get_sha1_with_context: always initialize oc->symlink_path
  sha1_name: consistently refer to object_context as "oc"
  handle_revision_arg: add handle_dotdot() helper
  handle_revision_arg: hoist ".." check out of range parsing
  handle_revision_arg: stop using "dotdot" as a generic pointer
  handle_revision_arg: simplify commit reference lookups
  handle_revision_arg: reset "dotdot" consistently
2017-06-02 15:06:05 +09:00
Brandon Williams 1c41c82bc4 grep: convert to struct object_id
Convert the remaining parts of grep to use struct object_id.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-02 09:36:06 +09:00
Junio C Hamano fa0624f79f Merge branch 'dt/unpack-save-untracked-cache-extension'
When "git checkout", "git merge", etc. manipulates the in-core
index, various pieces of information in the index extensions are
discarded from the original state, as it is usually not the case
that they are kept up-to-date and in-sync with the operation on the
main index.  The untracked cache extension is copied across these
operations now, which would speed up "git status" (as long as the
cache is properly invalidated).

* dt/unpack-save-untracked-cache-extension:
  unpack-trees: preserve index extensions
2017-05-30 11:16:45 +09:00
Junio C Hamano 6b526ced6f Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (53 commits)
  object: convert parse_object* to take struct object_id
  tree: convert parse_tree_indirect to struct object_id
  sequencer: convert do_recursive_merge to struct object_id
  diff-lib: convert do_diff_cache to struct object_id
  builtin/ls-tree: convert to struct object_id
  merge: convert checkout_fast_forward to struct object_id
  sequencer: convert fast_forward_to to struct object_id
  builtin/ls-files: convert overlay_tree_on_cache to object_id
  builtin/read-tree: convert to struct object_id
  sha1_name: convert internals of peel_onion to object_id
  upload-pack: convert remaining parse_object callers to object_id
  revision: convert remaining parse_object callers to object_id
  revision: rename add_pending_sha1 to add_pending_oid
  http-push: convert process_ls_object and descendants to object_id
  refs/files-backend: convert many internals to struct object_id
  refs: convert struct ref_update to use struct object_id
  ref-filter: convert some static functions to struct object_id
  Convert struct ref_array_item to struct object_id
  Convert the verify_pack callback to struct object_id
  Convert lookup_tag to struct object_id
  ...
2017-05-29 12:34:43 +09:00
Jeff King dc944b65f1 get_sha1_with_context: dynamically allocate oc->path
When a sha1 lookup returns the tree path via "struct
object_context", it just copies it into a fixed-size buffer.
This means the result can be truncated, and it means our
"struct object_context" consumes a lot of stack space.

Instead, let's allocate a string on the heap. Because most
callers don't care about this information, we'll avoid doing
it by default (so they don't all have to start calling
free() on the result). There are basically two options for
the caller to signal to us that it's interested:

  1. By setting a pointer to storage in the object_context.

  2. By passing a flag in another parameter.

Doing (1) would match the way that sha1_object_info_extended()
works. But it would mean that every caller would have to
initialize the object_context, which they don't currently
have to do.

This patch does (2), and adds a new bit to the function's
flags field. All of the callers that look at the "path"
field are updated to pass the new flag.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
Jeff King c0a487eafb sha1_name: consistently refer to object_context as "oc"
An early version of the patch to add object_context used the
name object_resolve_context. This was later shortened to
just object_context, but the "orc" variable name stuck in a
few places.  Let's use "oc", which is used elsewhere in the
code.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-24 10:59:27 +09:00
David Turner edf3b90553 unpack-trees: preserve index extensions
Make git checkout (and other unpack_tree operations) preserve the
untracked cache. This is valuable for two reasons:

1. Often, an unpack_tree operation will not touch large parts of the
working tree, and thus most of the untracked cache will continue to be
valid.

2. Even if the untracked cache were entirely invalidated by such an
operation, the user has signaled their intention to have such a cache,
and we don't want to throw it away.

[jes: backed out the watchman-specific parts]

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Ben Peart <benpeart@microsoft.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-20 18:26:45 +09:00
Junio C Hamano b15667bbdc Merge branch 'js/larger-timestamps'
Some platforms have ulong that is smaller than time_t, and our
historical use of ulong for timestamp would mean they cannot
represent some timestamp that the platform allows.  Invent a
separate and dedicated timestamp_t (so that we can distingiuish
timestamps and a vanilla ulongs, which along is already a good
move), and then declare uintmax_t is the type to be used as the
timestamp_t.

* js/larger-timestamps:
  archive-tar: fix a sparse 'constant too large' warning
  use uintmax_t for timestamps
  date.c: abort if the system time cannot handle one of our timestamps
  timestamp_t: a new data type for timestamps
  PRItime: introduce a new "printf format" for timestamps
  parse_timestamp(): specify explicitly where we parse timestamps
  t0006 & t5000: skip "far in the future" test when time_t is too limited
  t0006 & t5000: prepare for 64-bit timestamps
  ref-filter: avoid using `unsigned long` for catch-all data type
2017-05-16 11:51:59 +09:00
brian m. carlson f06e90dac1 merge: convert checkout_fast_forward to struct object_id
Converting checkout_fast_forward is required to convert
parse_tree_indirect.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
Johannes Schindelin dddbad728c timestamp_t: a new data type for timestamps
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).

So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27 13:07:39 +09:00
Junio C Hamano 6cbc478d83 Merge branch 'jh/add-index-entry-optim'
"git checkout" that handles a lot of paths has been optimized by
reducing the number of unnecessary checks of paths in the
has_dir_name() function.

* jh/add-index-entry-optim:
  read-cache: speed up has_dir_name (part 2)
  read-cache: speed up has_dir_name (part 1)
  read-cache: speed up add_index_entry during checkout
  p0006-read-tree-checkout: perf test to time read-tree
  read-cache: add strcmp_offset function
2017-04-26 15:39:07 +09:00
Junio C Hamano c9672ba4c8 Merge branch 'nd/conditional-config-in-early-config'
The recently introduced conditional inclusion of configuration did
not work well when early-config mechanism was involved.

* nd/conditional-config-in-early-config:
  config: correct file reading order in read_early_config()
  config: handle conditional include when $GIT_DIR is not set up
  config: prepare to pass more info in git_config_with_options()
2017-04-26 15:39:05 +09:00
Junio C Hamano cdfe138b36 Merge branch 'jh/verify-index-checksum-only-in-fsck'
The index file has a trailing SHA-1 checksum to detect file
corruption, and historically we checked it every time the index
file is used.  Omit the validation during normal use, and instead
verify only in "git fsck".

* jh/verify-index-checksum-only-in-fsck:
  read-cache: force_verify_index_checksum
2017-04-23 22:07:49 -07:00