Commit graph

5 commits

Author SHA1 Message Date
Ævar Arnfjörð Bjarmason 6b851e536b sha1dc: update from upstream
Update sha1dc from the latest version by the upstream
maintainer[1].

See commit a0103914c2 ("sha1dc: update from upstream", 2017-05-20) for
the previous update. That update left out some whitespace changes made
by upstream, which is why the diff here isn't the same as the upstream
cc46554..e139984.

It also brings in a change[2] upstream made which should hopefully
address the breakage in 2.13.1 on Cygwin, see [3]. Cygwin defines both
_BIG_ENDIAN and _LITTLE_ENDIAN.

Adam Dinwoodie reports on the mailing list that this upstream commit
fixes the issue on Cygwin[4].

1. e1399840b5
2. a24eef58c0
3. <20170606100355.GC25777@dinwoodie.org> (https://public-inbox.org/git/20170606100355.GC25777@dinwoodie.org/)
4. <20170606124323.GD25777@dinwoodie.org> (https://public-inbox.org/git/20170606124323.GD25777@dinwoodie.org/)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-07 09:25:20 +09:00
Ævar Arnfjörð Bjarmason a0103914c2 sha1dc: update from upstream
Update sha1dc from the latest version by the upstream
maintainer[1].

This version includes a commit of mine which allows for replacing the
local modifications done to the upstream files in git.git with macro
definitions to monkeypatch it in place.

It also brings in a change[2] upstream made for the breakage 2.13.0
introduced on SPARC and other platforms that forbid unaligned
access[3].

This means that the code customizations done since the initial import
in commit 28dc98e343 ("sha1dc: add collision-detecting sha1
implementation", 2017-03-16) can be done purely via Makefile
definitions and by including the content of our own sha1dc_git.[ch] in
sha1dc/sha1.c via a macro.

1. cc465543b3
2. 33a694a9ee
3. "Git 2.13.0 segfaults on Solaris SPARC due to DC_SHA1=YesPlease
   being on by default"
   (https://public-inbox.org/git/CACBZZX6nmKK8af0-UpjCKWV4R+hV-uk2xWXVA5U+_UQ3VXU03g@mail.gmail.com/)

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-22 10:20:46 +09:00
Jeff King 8325e43b82 Makefile: add DC_SHA1 knob
This knob lets you use the sha1dc implementation from:

      https://github.com/cr-marcstevens/sha1collisiondetection

which can detect certain types of collision attacks (even
when we only see half of the colliding pair). So it
mitigates any attack which consists of getting the "good"
half of a collision into a trusted repository, and then
later replacing it with the "bad" half. The "good" half is
rejected by the victim's version of Git (and even if they
run an old version of Git, any sha1dc-enabled git will
complain loudly if it ever has to interact with the object).

The big downside is that it's slower than either the openssl
or block-sha1 implementations.

Here are some timings based off of linux.git:

  - compute sha1 over whole packfile
      sha1dc: 3.580s
    blk-sha1: 2.046s (-43%)
     openssl: 1.335s (-62%)

  - rev-list --all --objects
      sha1dc: 33.512s
    blk-sha1: 33.514s (+0.0%)
     openssl: 33.650s (+0.4%)

  - git log --no-merges -10000 -p
      sha1dc: 8.124s
    blk-sha1: 7.986s (-1.6%)
     openssl: 8.203s (+0.9%)

  - index-pack --verify
      sha1dc: 4m19s
    blk-sha1: 2m57s (-32%)
     openssl: 2m19s (-42%)

So overall the sha1 computation with collision detection is
about 1.75x slower than block-sha1, and 2.7x slower than
openssl. But of course most operations do more than just sha1.
Normal object access isn't really slowed at all (both the
+/- changes there are well within the run-to-run noise); any
changes are drowned out by the other work Git is doing.
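
The slowdown factors quoted above can be re-derived from the
whole-packfile timings. A minimal sketch in Python (the names
`timings` and `slowdown` are illustrative, not anything in Git):

```python
# Whole-packfile SHA-1 timings from the list above, in seconds.
timings = {"sha1dc": 3.580, "blk-sha1": 2.046, "openssl": 1.335}


def slowdown(baseline: str) -> float:
    """How many times slower sha1dc is than the given implementation."""
    return timings["sha1dc"] / timings[baseline]


for name in ("blk-sha1", "openssl"):
    print(f"sha1dc vs {name}: {slowdown(name):.2f}x")
```

This only covers the raw hashing case; the other benchmarks show
why the ratio mostly disappears in ordinary object access.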

The most-affected operation is `index-pack --verify`, which
is essentially just computing the sha1 on every object. This
is similar to the `index-pack` invocation that the receiver
of a push or fetch would perform. So clearly there's some
extra CPU load here.

There will also be some latency for the user, though keep in
mind that such an operation will generally be network bound
(this is about a 1.2GB packfile). Some of that extra CPU is
"free" in the sense that we use it while the pack is
streaming in anyway. But most of it comes during the
delta-resolution phase, after the whole pack has been
received. So we can imagine that for this (quite large)
push, the user might have to wait an extra 100 seconds over
openssl (which is what we use now). If we assume they can
push to us at 20Mbit/s, that's 480s for a 1.2GB pack, so the
extra 100 seconds makes the push only about 20% slower.
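
The back-of-the-envelope estimate above can be written out as a
short calculation. This is only a sketch under the stated
assumptions (a decimal 1.2GB pack, a 20Mbit/s link, roughly 100
extra seconds of CPU); the constant and function names are made
up for illustration:

```python
PACK_BYTES = 1.2e9        # "about a 1.2GB packfile" (decimal GB assumed)
LINK_BITS_PER_SEC = 20e6  # assumed 20Mbit/s push bandwidth
EXTRA_CPU_SECONDS = 100   # estimated extra wait compared to openssl


def transfer_seconds(size_bytes: float, bits_per_sec: float) -> float:
    """Time to send size_bytes over a link of the given speed."""
    return size_bytes * 8 / bits_per_sec


def relative_slowdown(extra_seconds: float, base_seconds: float) -> float:
    """Extra wait expressed as a fraction of the transfer time."""
    return extra_seconds / base_seconds


base = transfer_seconds(PACK_BYTES, LINK_BITS_PER_SEC)
ratio = relative_slowdown(EXTRA_CPU_SECONDS, base)
print(f"transfer: {base:.0f}s, slowdown: {ratio:.1%}")
```

On a faster link the same 100 seconds would be a larger share of
the total, so the 20% figure depends on the bandwidth assumption.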

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-17 10:40:25 -07:00
Jeff King 45a574eec8 sha1dc: adjust header includes for git
We can replace system includes with git-compat-util.h or
cache.h (and should make sure it is included first in all C
files).  And we can drop includes from headers entirely, as
every C file should include git-compat-util.h itself.

We will add in new include guards around the header files,
though (otherwise you get into trouble including both
sha1dc/sha1.h and cache.h).

And finally, we'll use the full "sha1dc/" path for including
related files. This isn't strictly necessary, but makes the
expected resolution more obvious.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-16 15:16:45 -07:00
Jeff King 28dc98e343 sha1dc: add collision-detecting sha1 implementation
This is pulled straight from:

  https://github.com/cr-marcstevens/sha1collisiondetection

with no modifications yet (though I've pulled in only the
subset of files necessary for Git to use).

This is commit 007905a93c973f55b2daed6585f9f6c23545bf66.

Further updates can be done like:

  git checkout -b vendor-sha1dc $this_commit
  cp /path/to/sha1dc/{LICENSE.txt,lib/*} sha1dc/
  git add -A sha1dc
  git commit -m "update sha1dc"

  git checkout -b update-sha1dc origin
  git merge vendor-sha1dc

Thanks to both Marc and Dan for making the code fit our
needs by doing optimization work, cutting down on the
object size, and making some syntactic changes to work better
with git. And to Linus for kicking off the "diet" work that
removed some of the unused code.

The license of the sha1dc code is the MIT license, which is
obviously compatible with the GPLv2 of git.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-03-16 15:16:40 -07:00