Commit graph

490 commits

Author SHA1 Message Date
Jeff King
14a9bd2898 prepare_commit_graft: treat non-repository as a noop
The parse_commit_buffer() function consults lookup_commit_graft()
to see if we need to rewrite parents. The latter will look
at $GIT_DIR/info/grafts. If you're outside of a repository,
then this will trigger a BUG() as of b1ef400eec (setup_git_env:
avoid blind fall-back to ".git", 2016-10-20).

It's probably uncommon to actually parse a commit outside of
a repository, but you can see it in action with:

  cd /not/a/git/repo
  git index-pack --strict /some/file.pack

This works fine without --strict, but the fsck checks will
try to parse any commits, triggering the BUG(). We can fix
that by teaching the graft code to behave as if there are no
grafts when we aren't in a repository.

Arguably index-pack (and fsck) are wrong to consider grafts
at all. So another solution is to disable grafts entirely
for those commands. But given that the graft feature is
deprecated anyway, it's not worth even thinking through the
ramifications that might have.

There is one other corner case I considered here. What
should:

  cd /not/a/git/repo
  export GIT_GRAFT_FILE=/file/with/grafts
  git index-pack --strict /some/file.pack

do? We don't have a repository, but the user has pointed us
directly at a graft file, which we could respect. I believe
this case did work that way prior to b1ef400eec. However,
fixing it now would be pretty invasive. Back then we would
just call into setup_git_env() even without a repository.
But these days it actually takes a git_dir argument. So
there would be a fair bit of refactoring of the setup code
involved.

Given the obscurity of this case, plus the fact that grafts
are deprecated and probably shouldn't work under index-pack
anyway, it's not worth pursuing further. This patch at least
un-breaks the common case where you're _not_ using grafts,
but we BUG() anyway trying to even find that out.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-06-01 11:48:54 +09:00
Junio C Hamano
42c8ce1c49 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (42 commits)
  merge-one-file: compute empty blob object ID
  add--interactive: compute the empty tree value
  Update shell scripts to compute empty tree object ID
  sha1_file: only expose empty object constants through git_hash_algo
  dir: use the_hash_algo for empty blob object ID
  sequencer: use the_hash_algo for empty tree object ID
  cache-tree: use is_empty_tree_oid
  sha1_file: convert cached object code to struct object_id
  builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
  builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
  wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
  submodule: convert several uses of EMPTY_TREE_SHA1_HEX
  sequencer: convert one use of EMPTY_TREE_SHA1_HEX
  merge: convert empty tree constant to the_hash_algo
  builtin/merge: switch tree functions to use object_id
  builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
  sha1-file: add functions for hex empty tree and blob OIDs
  builtin/receive-pack: avoid hard-coded constants for push certs
  diff: specify abbreviation size in terms of the_hash_algo
  upload-pack: replace use of several hard-coded constants
  ...
2018-05-30 14:04:10 +09:00
Junio C Hamano
352cf6cfe1 Merge branch 'js/deprecate-grafts'
The functionality of "$GIT_DIR/info/grafts" has been superseded by
the "refs/replace/" mechanism for some time now, but the internal
code had support for it in many places, which has been cleaned up
in order to drop support of the "grafts" mechanism.

* js/deprecate-grafts:
  Remove obsolete script to convert grafts to replace refs
  technical/shallow: describe why shallow cannot use replace refs
  technical/shallow: stop referring to grafts
  filter-branch: stop suggesting to use grafts
  Deprecate support for .git/info/grafts
  Add a test for `git replace --convert-graft-file`
  replace: introduce --convert-graft-file
  replace: prepare create_graft() for converting graft files wholesale
  replace: "libify" create_graft() and callees
  replace: avoid using die() to indicate a bug
  commit: Let the callback of for_each_mergetag return on error
  argv_array: offer to split a string by whitespace
2018-05-23 14:38:17 +09:00
Junio C Hamano
c89b6e136e Merge branch 'ds/lazy-load-trees'
The code has been taught to use the duplicated information stored
in the commit-graph file to learn the tree object name for a commit
to avoid opening and parsing the commit object when it makes sense
to do so.

* ds/lazy-load-trees:
  coccinelle: avoid wrong transformation suggestions from commit.cocci
  commit-graph: lazy-load trees for commits
  treewide: replace maybe_tree with accessor methods
  commit: create get_commit_tree() method
  treewide: rename tree to maybe_tree
2018-05-23 14:38:13 +09:00
Derrick Stolee
04bc8d1ecc commit: use generation number in remove_redundant()
The static remove_redundant() method is used to filter a list
of commits by removing those that are reachable from another
commit in the list. This is used to remove all possible merge-
bases except a maximal, mutually independent set.

To determine these commits are independent, we use a number of
paint_down_to_common() walks and use the PARENT1, PARENT2 flags
to determine reachability. Since we only care about reachability
and not the full set of merge-bases between 'one' and 'twos', we
can use the 'min_generation' parameter to short-circuit the walk.

When no commit-graph exists, there is no change in behavior.

For a copy of the Linux repository, we measured the following
performance improvements:

git merge-base v3.3 v4.5

Before: 234 ms
 After: 208 ms
 Rel %: -11%

git merge-base v4.3 v4.5

Before: 102 ms
 After:  83 ms
 Rel %: -19%

The experiments above were chosen to demonstrate that we are
improving the filtering of the merge-base set. In the first
example, more time is spent walking the history to find the
set of merge bases before the remove_redundant() call. The
starting commits are closer together in the second example,
therefore more time is spent in remove_redundant(). The relative
change in performance differs as expected.

Reported-by: Jakub Narebski <jnareb@gmail.com>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-22 12:36:34 +09:00
Derrick Stolee
d7c1ec3efd commit: add short-circuit to paint_down_to_common()
When running 'git branch --contains', the in_merge_bases_many()
method calls paint_down_to_common() to discover if a specific
commit is reachable from a set of branches. Commits with lower
generation number are not needed to correctly answer the
containment query of in_merge_bases_many().

Add a new parameter, min_generation, to paint_down_to_common() that
prevents walking commits with generation number strictly less than
min_generation. If 0 is given, then there is no functional change.

For in_merge_bases_many(), we can pass commit->generation as the
cutoff, and this saves time during 'git branch --contains' queries
that would otherwise walk "around" the commit we are inspecting.

For a copy of the Linux repository, where HEAD is checked out at
v4.13~100, we get the following performance improvement for
'git branch --contains' over the previous commit:

Before: 0.21s
After:  0.13s
Rel %: -38%

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-22 12:36:34 +09:00
Derrick Stolee
f9b8908b85 commit: use generation numbers for in_merge_bases()
The containment algorithm for 'git branch --contains' is different
from that for 'git tag --contains' in that it uses is_descendant_of()
instead of contains_tag_algo(). The expensive portion of the branch
algorithm is computing merge bases.

When a commit-graph file exists with generation numbers computed,
we can avoid this merge-base calculation when the target commit has
a larger generation number than the initial commits.

Performance tests were run on a copy of the Linux repository where
HEAD is contained in v4.13 but no earlier tag. Also, all tags were
copied to branches and 'git branch --contains' was tested:

Before: 60.0s
After:   0.4s
Rel %: -99.3%

Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-22 12:36:34 +09:00
Derrick Stolee
e2838d85b6 commit-graph: always load commit-graph information
Most code paths load commits using lookup_commit() and then
parse_commit(). In some cases, including some branch lookups, the commit
is parsed using parse_object_buffer() which side-steps parse_commit() in
favor of parse_commit_buffer().

With generation numbers in the commit-graph, we need to ensure that any
commit that exists in the commit-graph file has its generation number
loaded.

Create new load_commit_graph_info() method to fill in the information
for a commit that exists only in the commit-graph file. Call it from
parse_commit_buffer() after loading the other commit information from
the given buffer. Only fill this information when specified by the
'check_graph' parameter.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-22 12:36:34 +09:00
Derrick Stolee
3afc679b3c commit: use generations in paint_down_to_common()
Define compare_commits_by_gen_then_commit_date(), which uses generation
numbers as a primary comparison and commit date to break ties (or as a
comparison when both commits do not have computed generation numbers).

Since the commit-graph file is closed under reachability, we know that
all commits in the file have generation at most GENERATION_NUMBER_MAX
which is less than GENERATION_NUMBER_INFINITY.

This change does not affect the number of commits that are walked during
the execution of paint_down_to_common(), only the order that those
commits are inspected. In the case that commit dates violate topological
order (i.e. a parent is "newer" than a child), the previous code could
walk a commit twice: if a commit is reached with the PARENT1 bit, but
later is re-visited with the PARENT2 bit, then that PARENT2 bit must be
propagated to its parents. Using generation numbers avoids this extra
effort, even if it is somewhat rare.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-22 12:36:34 +09:00
Nguyễn Thái Ngọc Duy
e2e5ac2303 merge: use commit-slab in merge remote desc instead of commit->util
It's done so that commit->util can be removed. See more explanation in
the commit that removes commit->util.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-21 14:07:20 +09:00
Stefan Beller
b9dbddf6da commit: allow lookup_commit_graft to handle arbitrary repositories
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Stefan Beller
2f6c767fd4 commit: allow prepare_commit_graft to handle arbitrary repositories
Move the global variable 'commit_graft_prepared' into the object
pool and convert the function prepare_commit_graft to work
an arbitrary repositories.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Stefan Beller
0437a2e365 cache: convert get_graft_file to handle arbitrary repositories
This conversion was done without the #define trick used in the earlier
series refactoring to have better repository access, because this function
is easy to review, as all lines are converted and it has only one caller.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Brandon Williams
d0e5dd0ed4 commit: convert read_graft_file to handle arbitrary repositories
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Brandon Williams
a3b78e833b commit: convert register_commit_graft to handle arbitrary repositories
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Brandon Williams
e808656c46 commit: convert commit_graft_pos() to handle arbitrary repositories
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Stefan Beller
c88134870e shallow: add repository argument to is_repository_shallow
Add a repository argument to allow callers of is_repository_shallow
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Jonathan Nieder
1f93ecd1ab commit: add repository argument to lookup_commit_graft
Add a repository argument to allow callers of lookup_commit_graft to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-18 08:13:10 +09:00
Jonathan Nieder
3ee37656ee commit: add repository argument to prepare_commit_graft
Add a repository argument to allow the caller of prepare_commit_graft
to be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Jonathan Nieder
02ba3e1a05 commit: add repository argument to read_graft_file
Add a repository argument to allow the caller of read_graft_file to be
more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Jonathan Nieder
3f5787f806 commit: add repository argument to register_commit_graft
Add a repository argument to allow callers of register_commit_graft to
be more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Jonathan Nieder
be479e801d commit: add repository argument to commit_graft_pos
Add a repository argument to allow callers of commit_graft_pos to be
more specific about which repository to handle. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Jonathan Nieder
6a1a79fd14 object: move grafts to object parser
Grafts are only meaningful in the context of a single repository.
Therefore they cannot be global.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Stefan Beller
cbd53a2193 object-store: move object access functions to object-store.h
This should make these functions easier to find and cache.h less
overwhelming to read.

In particular, this moves:
- read_object_file
- oid_object_info
- write_object_file

As a result, most of the codebase needs to #include object-store.h.
In this patch the #include is only added to files that would fail to
compile otherwise.  It would be better to #include wherever
identifiers from the header are used.  That can happen later
when we have better tooling for it.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:42:03 +09:00
Stefan Beller
14ba97f81c alloc: allow arbitrary repositories for alloc functions
We have to convert all of the alloc functions at once, because alloc_report
uses a funky macro for reporting. It is better for the sake of mechanical
conversion to convert multiple functions at once rather than changing the
structure of the reporting function.

We record all memory allocation in alloc.c, and free them in
clear_alloc_state, which is called for all repositories except
the_repository.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-16 11:16:50 +09:00
Stefan Beller
8ba0e5ec57 alloc: add repository argument to alloc_commit_node
This is a small mechanical change; it doesn't change the
implementation to handle repositories other than the_repository yet.
Use a macro to catch callers passing a repository other than
the_repository at compile time.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-09 12:12:36 +09:00
Stefan Beller
68f95d382b object: add repository argument to create_object
Add a repository argument to allow the callers of create_object
to be more specific about which repository to act on. This is a small
mechanical change; it doesn't change the implementation to handle
repositories other than the_repository yet.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-09 12:12:36 +09:00
Junio C Hamano
b10edb2df5 Merge branch 'ds/commit-graph'
Precompute and store information necessary for ancestry traversal
in a separate file to optimize graph walking.

* ds/commit-graph:
  commit-graph: implement "--append" option
  commit-graph: build graph from starting commits
  commit-graph: read only from specific pack-indexes
  commit: integrate commit graph with commit parsing
  commit-graph: close under reachability
  commit-graph: add core.commitGraph setting
  commit-graph: implement git commit-graph read
  commit-graph: implement git-commit-graph write
  commit-graph: implement write_commit_graph()
  commit-graph: create git-commit-graph builtin
  graph: add commit graph design document
  commit-graph: add format document
  csum-file: refactor finalize_hashfile() method
  csum-file: rename hashclose() to finalize_hashfile()
2018-05-08 15:59:20 +09:00
brian m. carlson
26ea3e7dca commit: convert uses of get_sha1_hex to get_oid_hex
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 13:59:51 +09:00
Johannes Schindelin
f9f99b3f7d Deprecate support for .git/info/grafts
The grafts feature was a convenient way to "stitch together" ancient
history to the fresh start of linux.git.

Its implementation is, however, not up to Git's standards, as there are
too many ways where it can lead to surprising and unwelcome behavior.

For example, when pushing from a repository with active grafts, it is
possible to miss commits that have been "grafted out", resulting in a
broken state on the other side.

Also, the grafts feature is limited to "rewriting" commits' list of
parents, it cannot replace anything else.

The much younger feature implemented as `git replace` set out to remedy
those limitations and dangerous bugs.

Seeing as `git replace` is pretty mature by now (since 4228e8bc98
(replace: add --graft option, 2014-07-19) it can perform the graft
file's duties), it is time to deprecate support for the graft file, and
to retire it eventually.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-30 11:12:31 +09:00
Johannes Schindelin
fef461ea5d commit: Let the callback of for_each_mergetag return on error
This is yet another patch to be filed under the keyword "libification".

There is one subtle change in behavior here, where a `git log` that has
been asked to show the mergetags would now stop reporting the mergetags
upon the first failure, whereas previously, it would have continued to the
next mergetag, if any.

In practice, that change should not matter, as it is 1) uncommon to
perform octopus merges using multiple tags as merge heads, and 2) when the
user asks to be shown those tags, they really should be there.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-26 12:52:57 +09:00
Derrick Stolee
7b8a21dba1 commit-graph: lazy-load trees for commits
The commit-graph file provides quick access to commit data, including
the OID of the root tree for each commit in the graph. When performing
a deep commit-graph walk, we may not need to load most of the trees
for these commits.

Delay loading the tree object for a commit loaded from the graph
until requested via get_commit_tree(). Do not lazy-load trees for
commits not in the graph, since that requires duplicate parsing
and the relative peformance improvement when trees are not needed
is small.

On the Linux repository, performance tests were run for the following
command:

    git log --graph --oneline -1000

    Before: 0.92s
    After:  0.66s
    Rel %: -28.3%

Adding '-- kernel/' to the command requires loading the root tree
for every commit that is walked. There was no measureable performance
change as a result of this patch.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11 10:47:16 +09:00
Derrick Stolee
5bb03de102 commit: create get_commit_tree() method
While walking the commit graph, we load struct commit objects into
the object cache. During this process, we also load struct tree
objects for the root tree of each of these commits. We load these
objects even if we are only computing commit reachability information,
such as a merge base or ahead/behind information.

Create get_commit_tree() as a first step to removing direct
references to the 'maybe_tree' member of struct commit.

Create get_commit_tree_oid() as a shortcut for several references
to "&commit->maybe_tree->object.oid" in the codebase.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11 10:47:16 +09:00
Derrick Stolee
891435d55d treewide: rename tree to maybe_tree
Using the commit-graph file to walk commit history removes the large
cost of parsing commits during the walk. This exposes a performance
issue: lookup_tree() takes a large portion of the computation time,
even when Git never uses those trees.

In anticipation of lazy-loading these trees, rename the 'tree' member
of struct commit to 'maybe_tree'. This serves two purposes: it hints
at the future role of possibly being NULL even if the commit has a
valid tree, and it allows for unambiguous transformation from simple
member access (i.e. commit->maybe_tree) to method access.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11 10:47:16 +09:00
Junio C Hamano
2d5792f071 Merge branch 'bw/c-plus-plus' into ds/lazy-load-trees
* bw/c-plus-plus: (37 commits)
  replace: rename 'new' variables
  trailer: rename 'template' variables
  tempfile: rename 'template' variables
  wrapper: rename 'template' variables
  environment: rename 'namespace' variables
  diff: rename 'template' variables
  environment: rename 'template' variables
  init-db: rename 'template' variables
  unpack-trees: rename 'new' variables
  trailer: rename 'new' variables
  submodule: rename 'new' variables
  split-index: rename 'new' variables
  remote: rename 'new' variables
  ref-filter: rename 'new' variables
  read-cache: rename 'new' variables
  line-log: rename 'new' variables
  imap-send: rename 'new' variables
  http: rename 'new' variables
  entry: rename 'new' variables
  diffcore-delta: rename 'new' variables
  ...
2018-04-11 10:46:32 +09:00
Derrick Stolee
177722b344 commit: integrate commit graph with commit parsing
Teach Git to inspect a commit graph file to supply the contents of a
struct commit when calling parse_commit_gently(). This implementation
satisfies all post-conditions on the struct commit, including loading
parents, the root tree, and the commit date.

If core.commitGraph is false, then do not check graph files.

In test script t5318-commit-graph.sh, add output-matching conditions on
read-only graph operations.

By loading commits from the graph instead of parsing commit buffers, we
save a lot of time on long commit walks. Here are some performance
results for a copy of the Linux repository where 'master' has 678,653
reachable commits and is behind 'origin/master' by 59,929 commits.

| Command                          | Before | After  | Rel % |
|----------------------------------|--------|--------|-------|
| log --oneline --topo-order -1000 |  8.31s |  0.94s | -88%  |
| branch -vv                       |  1.02s |  0.14s | -86%  |
| rev-list --all                   |  5.89s |  1.07s | -81%  |
| rev-list --all --objects         | 66.15s | 58.45s | -11%  |

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-11 10:43:02 +09:00
brian m. carlson
b4f5aca40e sha1_file: convert read_sha1_file to struct object_id
Convert read_sha1_file to take a pointer to struct object_id and rename
it read_object_file.  Do the same for read_sha1_file_extended.

Convert one use in grep.c to use the new function without any other code
change, since the pointer being passed is a void pointer that is already
initialized with a pointer to struct object_id.  Update the declaration
and definitions of the modified functions, and apply the following
semantic patch to convert the remaining callers:

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1.hash, E2, E3)
+ read_object_file(&E1, E2, E3)

@@
expression E1, E2, E3;
@@
- read_sha1_file(E1->hash, E2, E3)
+ read_object_file(E1, E2, E3)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1.hash, E2, E3, E4)
+ read_object_file_extended(&E1, E2, E3, E4)

@@
expression E1, E2, E3, E4;
@@
- read_sha1_file_extended(E1->hash, E2, E3, E4)
+ read_object_file_extended(E1, E2, E3, E4)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:50 -07:00
brian m. carlson
e816caa07b sha1_file: convert assert_sha1_type to object_id
Convert this function to take a pointer to struct object_id and rename
it to assert_oid_type.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-14 09:23:49 -07:00
Junio C Hamano
169c9c0169 Merge branch 'bw/c-plus-plus'
Avoid using identifiers that clash with C++ keywords.  Even though
it is not a goal to compile Git with C++ compilers, changes like
this help use of code analysis tools that targets C++ on our
codebase.

* bw/c-plus-plus: (37 commits)
  replace: rename 'new' variables
  trailer: rename 'template' variables
  tempfile: rename 'template' variables
  wrapper: rename 'template' variables
  environment: rename 'namespace' variables
  diff: rename 'template' variables
  environment: rename 'template' variables
  init-db: rename 'template' variables
  unpack-trees: rename 'new' variables
  trailer: rename 'new' variables
  submodule: rename 'new' variables
  split-index: rename 'new' variables
  remote: rename 'new' variables
  ref-filter: rename 'new' variables
  read-cache: rename 'new' variables
  line-log: rename 'new' variables
  imap-send: rename 'new' variables
  http: rename 'new' variables
  entry: rename 'new' variables
  diffcore-delta: rename 'new' variables
  ...
2018-03-06 14:54:07 -08:00
Brandon Williams
258c03eb45 commit: rename 'new' variables
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-22 10:08:05 -08:00
Junio C Hamano
8be8342b4c Merge branch 'po/object-id'
Conversion from uchar[20] to struct object_id continues.

* po/object-id:
  sha1_file: rename hash_sha1_file_literally
  sha1_file: convert write_loose_object to object_id
  sha1_file: convert force_object_loose to object_id
  sha1_file: convert write_sha1_file to object_id
  notes: convert write_notes_tree to object_id
  notes: convert combine_notes_* to object_id
  commit: convert commit_tree* to object_id
  match-trees: convert splice_tree to object_id
  cache: clear whole hash buffer with oidclr
  sha1_file: convert hash_sha1_file to object_id
  dir: convert struct sha1_stat to use object_id
  sha1_file: convert pretend_sha1_file to object_id
2018-02-15 14:55:43 -08:00
Brandon Williams
debca9d2fe object: rename function 'typename' to 'type_name'
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-14 13:10:05 -08:00
Junio C Hamano
cbf0240f82 Merge branch 'sg/cocci-move-array'
Code clean-up.

* sg/cocci-move-array:
  Use MOVE_ARRAY
2018-02-13 13:39:13 -08:00
Patryk Obara
a09c985eae sha1_file: convert write_sha1_file to object_id
Convert the definition and declaration of write_sha1_file to
struct object_id and adjust usage of this function.

This commit also converts static function write_sha1_file_prepare, as it
is closely related.

Rename these functions to write_object_file and
write_object_file_prepare respectively.

Replace sha1_to_hex, hashcpy and hashclr with their oid equivalents
wherever possible.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-30 10:42:36 -08:00
Patryk Obara
5078f34459 commit: convert commit_tree* to object_id
Convert the definitions and declarations of commit_tree and
commit_tree_extended to use struct object_id and adjust all usages of
these functions.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-30 10:42:36 -08:00
SZEDER Gábor
f919ffebed Use MOVE_ARRAY
Use the helper macro MOVE_ARRAY to move arrays.  This is shorter and
safer, as it automatically infers the size of elements.

Patch generated by Coccinelle and contrib/coccinelle/array.cocci in
Travis CI's static analysis build job.

Signed-off-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-22 11:32:51 -08:00
René Scharfe
6fcec2f9ae commit: remove unused function clear_commit_marks_for_object_array()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-28 13:50:05 -08:00
René Scharfe
abc035126a commit: use clear_commit_marks_many() in remove_redundant()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-28 13:50:05 -08:00
René Scharfe
07f7d55a34 commit: avoid allocation in clear_commit_marks_many()
Pass the entries of the commit array directly to clear_commit_marks_1()
instead of adding them to a commit_list first.  The function clears the
commit and any first parent without allocation; only higher numbered
parents are added to a list for later treatment.  This change extends
that optimization to clear_commit_marks_many().

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-28 13:50:05 -08:00
Martin Ågren
4da72644b7 reduce_heads: fix memory leaks
We currently have seven callers of `reduce_heads(foo)`. Six of them do
not use the original list `foo` again, and actually, all six of those
end up leaking it.

Introduce and use `reduce_heads_replace(&foo)` as a leak-free version of
`foo = reduce_heads(foo)` to fix several of these. Fix the remaining
leaks using `free_commit_list()`.

While we're here, document `reduce_heads()` and mark it as `extern`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-08 11:34:00 +09:00
Junio C Hamano
69c54c7284 Merge branch 'ma/leakplugs'
Memory leaks in various codepaths have been plugged.

* ma/leakplugs:
  pack-bitmap[-write]: use `object_array_clear()`, don't leak
  object_array: add and use `object_array_pop()`
  object_array: use `object_array_clear()`, not `free()`
  leak_pending: use `object_array_clear()`, not `free()`
  commit: fix memory leak in `reduce_heads()`
  builtin/commit: fix memory leak in `prepare_index()`
2017-09-29 11:23:43 +09:00
Martin Ågren
cb7b29eb67 commit: fix memory leak in reduce_heads()
We don't free the temporary scratch space we use with
`remove_redundant()`. Free it similar to how we do it in
`get_merge_bases_many_0()`.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:05:51 +09:00
Junio C Hamano
d811ba1897 Merge branch 'rs/strbuf-leakfix'
Many leaks of strbuf have been fixed.

* rs/strbuf-leakfix: (34 commits)
  wt-status: release strbuf after use in wt_longstatus_print_tracking()
  wt-status: release strbuf after use in read_rebase_todolist()
  vcs-svn: release strbuf after use in end_revision()
  utf8: release strbuf on error return in strbuf_utf8_replace()
  userdiff: release strbuf after use in userdiff_get_textconv()
  transport-helper: release strbuf after use in process_connect_service()
  sequencer: release strbuf after use in save_head()
  shortlog: release strbuf after use in insert_one_record()
  sha1_file: release strbuf on error return in index_path()
  send-pack: release strbuf on error return in send_pack()
  remote: release strbuf after use in set_url()
  remote: release strbuf after use in migrate_file()
  remote: release strbuf after use in read_remote_branches()
  refs: release strbuf on error return in write_pseudoref()
  notes: release strbuf after use in notes_copy_from_stdin()
  merge: release strbuf after use in write_merge_heads()
  merge: release strbuf after use in save_state()
  mailinfo: release strbuf on error return in handle_boundary()
  mailinfo: release strbuf after use in handle_from()
  help: release strbuf on error return in exec_woman_emacs()
  ...
2017-09-19 10:47:57 +09:00
Rene Scharfe
e505146dac commit: release strbuf on error return in commit_tree_extended()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Patryk Obara
cfa5bf1608 commit: rewrite read_graft_line
Old implementation determined number of hashes by dividing length of
line by length of hash, which works only if all hash representations
have same length.

New graft line parser works in two phases:

  1. In first phase line is scanned to verify correctness and compute
     number of hashes, then graft struct is allocated.

  2. In second phase line is scanned again to fill up already allocated
     graft struct.

This way graft parsing code can support different sizes of hashes
without any further code adaptations.

A number of alternative implementations were considered and discarded:

  - Modifying graft structure to store oid_array instead of FLEXI_ARRAY
    indicates undesirable usage of struct to readers.

  - Parsing into temporary string_list or oid_array complicates code
    by adding more return paths, as these structures needs to be
    cleared before returning from function.

  - Determining number of hashes by counting separators might cause
    maintenance issues, if this function needs to be modified in future
    again.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-18 12:41:06 -07:00
Patryk Obara
bc65d2262d commit: allocate array using object_id size
struct commit_graft aggregates an array of object_id's, which have
size >= GIT_MAX_RAWSZ bytes. This change prevents memory allocation
error when size of object_id is larger than GIT_SHA1_RAWSZ.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-18 12:18:10 -07:00
Patryk Obara
9a9340329a commit: replace the raw buffer with strbuf in read_graft_line
This simplifies function declaration and allows for use of strbuf_rtrim
instead of modifying buffer directly.

Signed-off-by: Patryk Obara <patryk.obara@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-18 12:18:10 -07:00
Junio C Hamano
32f90258bd Merge branch 'rs/move-array'
Code clean-up.

* rs/move-array:
  ls-files: don't try to prune an empty index
  apply: use COPY_ARRAY and MOVE_ARRAY in update_image()
  use MOVE_ARRAY
  add MOVE_ARRAY
2017-08-11 13:26:57 -07:00
René Scharfe
f331ab9d4c use MOVE_ARRAY
Simplify the code for moving members inside of an array and make it more
robust by using the helper macro MOVE_ARRAY.  It calculates the size
based on the specified number of elements for us and supports NULL
pointers when that number is zero.  Raw memmove(3) calls with NULL can
cause the compiler to (over-eagerly) optimize out later NULL checks.

This patch was generated with contrib/coccinelle/array.cocci and spatch
(Coccinelle).

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 14:54:56 -07:00
brian m. carlson
e82caf384b sha1_name: convert get_sha1* to get_oid*
Now that all the callers of get_sha1 directly or indirectly use struct
object_id, rename the functions starting with get_sha1 to start with
get_oid.  Convert the internals in sha1_name.c to use struct object_id
as well, and eliminate explicit length checks where possible.  Convert a
use of 40 in get_oid_basic to GIT_SHA1_HEXSZ.

Outside of sha1_name.c and cache.h, this transition was made with the
following semantic patch:

@@
expression E1, E2;
@@
- get_sha1(E1, E2.hash)
+ get_oid(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1(E1, E2->hash)
+ get_oid(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_committish(E1, E2.hash)
+ get_oid_committish(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_committish(E1, E2->hash)
+ get_oid_committish(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_treeish(E1, E2.hash)
+ get_oid_treeish(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_treeish(E1, E2->hash)
+ get_oid_treeish(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_commit(E1, E2.hash)
+ get_oid_commit(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_commit(E1, E2->hash)
+ get_oid_commit(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_tree(E1, E2.hash)
+ get_oid_tree(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_tree(E1, E2->hash)
+ get_oid_tree(E1, E2)

@@
expression E1, E2;
@@
- get_sha1_blob(E1, E2.hash)
+ get_oid_blob(E1, &E2)

@@
expression E1, E2;
@@
- get_sha1_blob(E1, E2->hash)
+ get_oid_blob(E1, E2)

@@
expression E1, E2, E3, E4;
@@
- get_sha1_with_context(E1, E2, E3.hash, E4)
+ get_oid_with_context(E1, E2, &E3, E4)

@@
expression E1, E2, E3, E4;
@@
- get_sha1_with_context(E1, E2, E3->hash, E4)
+ get_oid_with_context(E1, E2, E3, E4)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-17 13:54:51 -07:00
Stefan Beller
8b65a34c4a commit: convert lookup_commit_graft to struct object_id
With this patch, commit.h doesn't contain the string 'sha1' any more.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-13 12:02:40 -07:00
Ævar Arnfjörð Bjarmason
6a83d90207 coccinelle: make use of the "type" FREE_AND_NULL() rule
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:03 -07:00
Junio C Hamano
b9a7d55d93 Merge branch 'nd/fopen-errors'
We often try to open a file for reading whose existence is
optional, and silently ignore errors from open/fopen; report such
errors if they are not due to missing files.

* nd/fopen-errors:
  mingw_fopen: report ENOENT for invalid file names
  mingw: verify that paths are not mistaken for remote nicknames
  log: fix memory leak in open_next_file()
  rerere.c: move error_errno() closer to the source system call
  print errno when reporting a system call error
  wrapper.c: make warn_on_inaccessible() static
  wrapper.c: add and use fopen_or_warn()
  wrapper.c: add and use warn_on_fopen_errors()
  config.mak.uname: set FREAD_READS_DIRECTORIES for Darwin, too
  config.mak.uname: set FREAD_READS_DIRECTORIES for Linux and FreeBSD
  clone: use xfopen() instead of fopen()
  use xfopen() in more places
  git_fopen: fix a sparse 'not declared' warning
2017-06-13 13:47:09 -07:00
Junio C Hamano
965993d1ef Merge branch 'bm/interpret-trailers-cut-line-is-eom'
"git interpret-trailers", when used as GIT_EDITOR for "git commit
-v", looked for and appended to a trailer block at the very end,
i.e. at the end of the "diff" output.  The command has been
corrected to pay attention to the cut-mark line "commit -v" adds to
the buffer---the real trailer block should appear just before it.

* bm/interpret-trailers-cut-line-is-eom:
  interpret-trailers: honor the cut line
2017-05-29 12:34:53 +09:00
Junio C Hamano
6b526ced6f Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (53 commits)
  object: convert parse_object* to take struct object_id
  tree: convert parse_tree_indirect to struct object_id
  sequencer: convert do_recursive_merge to struct object_id
  diff-lib: convert do_diff_cache to struct object_id
  builtin/ls-tree: convert to struct object_id
  merge: convert checkout_fast_forward to struct object_id
  sequencer: convert fast_forward_to to struct object_id
  builtin/ls-files: convert overlay_tree_on_cache to object_id
  builtin/read-tree: convert to struct object_id
  sha1_name: convert internals of peel_onion to object_id
  upload-pack: convert remaining parse_object callers to object_id
  revision: convert remaining parse_object callers to object_id
  revision: rename add_pending_sha1 to add_pending_oid
  http-push: convert process_ls_object and descendants to object_id
  refs/files-backend: convert many internals to struct object_id
  refs: convert struct ref_update to use struct object_id
  ref-filter: convert some static functions to struct object_id
  Convert struct ref_array_item to struct object_id
  Convert the verify_pack callback to struct object_id
  Convert lookup_tag to struct object_id
  ...
2017-05-29 12:34:43 +09:00
Nguyễn Thái Ngọc Duy
e9d983f116 wrapper.c: add and use fopen_or_warn()
When fopen() returns NULL, it could be because the given path does not
exist, but it could also be some other errors and the caller has to
check. Add a wrapper so we don't have to repeat the same error check
everywhere.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-26 12:33:56 +09:00
Brian Malehorn
d76650b8d1 interpret-trailers: honor the cut line
If a commit message is edited with the "verbose" option, the buffer
will have a cut line and diff after the log message, like so:

    my subject

    # ------------------------ >8 ------------------------
    # Do not touch the line above.
    # Everything below will be removed.
    diff --git a/foo.txt b/foo.txt
    index 5716ca5..7601807 100644
    --- a/foo.txt
    +++ b/foo.txt
    @@ -1 +1 @@
    -bar
    +baz

"git interpret-trailers" is unaware of the cut line, and assumes the
trailer block would be at the end of the whole thing.  This can easily
be seen with:

     $ GIT_EDITOR='git interpret-trailers --in-place --trailer Acked-by:me' \
       git commit --amend -v

Teach "git interpret-trailers" to notice the cut-line and ignore the
remainder of the input when looking for a place to add new trailer
block.  This makes it consistent with how "git commit -v -s" inserts a
new Signed-off-by: line.

This can be done by the same logic as the existing helper function,
wt_status_truncate_message_at_cut_line(), uses, but it wants the caller
to pass a strbuf to it.  Because the function ignore_non_trailer() used
by the command takes a <pointer, length> pair, not a strbuf, steal the
logic from wt_status_truncate_message_at_cut_line() to create a new
wt_status_locate_end() helper function that takes <pointer, length>
pair, and make ignore_non_trailer() call it to help "interpret-trailers".

Since there is only one caller of wt_status_truncate_message_at_cut_line()
in cmd_commit(), rewrite it to call wt_status_locate_end() helper instead
and remove the old helper that no longer has any caller.

Signed-off-by: Brian Malehorn <bmalehorn@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-18 15:00:48 +09:00
brian m. carlson
c251c83df2 object: convert parse_object* to take struct object_id
Make parse_object, parse_object_or_die, and parse_object_buffer take a
pointer to struct object_id.  Remove the temporary variables inserted
earlier, since they are no longer necessary.  Transform all of the
callers using the following semantic patch:

@@
expression E1;
@@
- parse_object(E1.hash)
+ parse_object(&E1)

@@
expression E1;
@@
- parse_object(E1->hash)
+ parse_object(E1)

@@
expression E1, E2;
@@
- parse_object_or_die(E1.hash, E2)
+ parse_object_or_die(&E1, E2)

@@
expression E1, E2;
@@
- parse_object_or_die(E1->hash, E2)
+ parse_object_or_die(E1, E2)

@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1.hash, E2, E3, E4, E5)
+ parse_object_buffer(&E1, E2, E3, E4, E5)

@@
expression E1, E2, E3, E4, E5;
@@
- parse_object_buffer(E1->hash, E2, E3, E4, E5)
+ parse_object_buffer(E1, E2, E3, E4, E5)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:58 +09:00
brian m. carlson
740ee055c6 Convert lookup_tree to struct object_id
Convert the lookup_tree function to take a pointer to struct object_id.

The commit was created with manual changes to tree.c, tree.h, and
object.c, plus the following semantic patch:

@@
@@
- lookup_tree(EMPTY_TREE_SHA1_BIN)
+ lookup_tree(&empty_tree_oid)

@@
expression E1;
@@
- lookup_tree(E1.hash)
+ lookup_tree(&E1)

@@
expression E1;
@@
- lookup_tree(E1->hash)
+ lookup_tree(E1)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:57 +09:00
brian m. carlson
bc83266abe Convert lookup_commit* to struct object_id
Convert lookup_commit, lookup_commit_or_die,
lookup_commit_reference, and lookup_commit_reference_gently to take
struct object_id arguments.

Introduce a temporary in parse_object buffer in order to convert this
function.  This is required since in order to convert parse_object and
parse_object_buffer, lookup_commit_reference_gently and
lookup_commit_or_die would need to be converted.  Not introducing a
temporary would therefore require that lookup_commit_or_die take a
struct object_id *, but lookup_commit would take unsigned char *,
leaving a confusing and hard-to-use interface.

parse_object_buffer will lose this temporary in a later patch.

This commit was created with manual changes to commit.c, commit.h, and
object.c, plus the following semantic patch:

@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1.hash, E2)
+ lookup_commit_reference_gently(&E1, E2)

@@
expression E1, E2;
@@
- lookup_commit_reference_gently(E1->hash, E2)
+ lookup_commit_reference_gently(E1, E2)

@@
expression E1;
@@
- lookup_commit_reference(E1.hash)
+ lookup_commit_reference(&E1)

@@
expression E1;
@@
- lookup_commit_reference(E1->hash)
+ lookup_commit_reference(E1)

@@
expression E1;
@@
- lookup_commit(E1.hash)
+ lookup_commit(&E1)

@@
expression E1;
@@
- lookup_commit(E1->hash)
+ lookup_commit(E1)

@@
expression E1, E2;
@@
- lookup_commit_or_die(E1.hash, E2)
+ lookup_commit_or_die(&E1, E2)

@@
expression E1, E2;
@@
- lookup_commit_or_die(E1->hash, E2)
+ lookup_commit_or_die(E1, E2)

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:57 +09:00
brian m. carlson
e92b848cb6 shallow: convert shallow registration functions to object_id
Convert register_shallow and unregister_shallow to take struct
object_id.  register_shallow is a caller of lookup_commit, which we will
convert later.  It doesn't make sense for the registration and
unregistration functions to have incompatible interfaces, so convert
them both.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-05-08 15:12:57 +09:00
Johannes Schindelin
dddbad728c timestamp_t: a new data type for timestamps
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).

So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.

By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.

As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-27 13:07:39 +09:00
Johannes Schindelin
1aeb7e756c parse_timestamp(): specify explicitly where we parse timestamps
Currently, Git's source code represents all timestamps as `unsigned
long`. In preparation for using a more appropriate data type, let's
introduce a symbol `parse_timestamp` (currently being defined to
`strtoul`) where appropriate, so that we can later easily switch to,
say, use `strtoull()` instead.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-23 20:19:15 -07:00
Junio C Hamano
98c96f8bff Merge branch 'rs/commit-parsing-optim'
The code that parses header fields in the commit object has been
updated for (micro)performance and code hygiene.

* rs/commit-parsing-optim:
  commit: don't check for space twice when looking for header
  commit: be more precise when searching for headers
2017-03-10 13:24:22 -08:00
René Scharfe
b072504ce1 commit: don't check for space twice when looking for header
Both standard_header_field() and excluded_header_field() check if
there's a space after the buffer that's handed to them.  We already
check in the caller if that space is present.  Don't bother calling
the functions if it's missing, as they are guaranteed to return 0 in
that case, and remove the now redundant checks from them.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-27 11:20:18 -08:00
René Scharfe
50a01cc48c commit: be more precise when searching for headers
Search for a space character only within the current line in
read_commit_extra_header_lines() instead of searching in the whole
buffer (and possibly beyond, if it's not NUL-terminated) and then
discarding any results after the end of the current line.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-27 11:20:17 -08:00
Junio C Hamano
9c9b03b1f1 commit.c: use strchrnul() to scan for one line
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-01 13:46:51 -08:00
Jonathan Tan
710714aaa8 commit: make ignore_non_trailer take buf/len
Make ignore_non_trailer take a buf/len pair instead of struct strbuf.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-11-29 14:22:18 -08:00
Junio C Hamano
a813b19190 Merge branch 'rs/copy-array' into maint
Code cleanup.

* rs/copy-array:
  use COPY_ARRAY
  add COPY_ARRAY
2016-10-11 14:18:32 -07:00
Junio C Hamano
b1f0a85660 Merge branch 'rs/copy-array'
Code cleanup.

* rs/copy-array:
  use COPY_ARRAY
  add COPY_ARRAY
2016-10-03 13:30:33 -07:00
René Scharfe
45ccef87b3 use COPY_ARRAY
Add a semantic patch for converting certain calls of memcpy(3) to
COPY_ARRAY() and apply that transformation to the code base.  The result
is
 shorter and safer code.  For now only consider calls where source and
destination have the same type, or in other words: easy cases.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-25 16:44:13 -07:00
Vasco Almeida
4fa4b31507 i18n: commit: mark message for translation
Mark message commit_utf8_warn for translation.

Update tests to reflect changes.

Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-19 10:55:36 -07:00
Junio C Hamano
aeb1b7f55d Merge branch 'rs/pull-signed-tag'
When "git merge-recursive" works on history with many criss-cross
merges in "verbose" mode, the names the command assigns to the
virtual merge bases could have overwritten each other by unintended
reuse of the same piece of memory.

* rs/pull-signed-tag:
  commit: use FLEX_ARRAY in struct merge_remote_desc
  merge-recursive: fix verbose output for multiple base trees
  commit: factor out set_merge_remote_desc()
  commit: use xstrdup() in get_merge_parent()
2016-08-19 15:34:14 -07:00
René Scharfe
5447a76aad commit: use FLEX_ARRAY in struct merge_remote_desc
Convert the name member of struct merge_remote_desc to a FLEX_ARRAY and
use FLEX_ALLOC_STR to build the struct.  This halves the number of
memory allocations, saves the storage for a pointer and avoids an
indirection when reading the name.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-13 19:48:07 -07:00
René Scharfe
beb518c985 commit: factor out set_merge_remote_desc()
Export a helper function for allocating, populating and attaching a
merge_remote_desc to a commit.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-13 19:48:00 -07:00
René Scharfe
c089320cf6 commit: use xstrdup() in get_merge_parent()
Handle allocation errors for the name member just like we already do
for the struct merge_remote_desc itself.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-13 19:47:49 -07:00
Junio C Hamano
475495ff5e Merge branch 'js/sign-empty-commit-fix' into maint
"git commit --amend --allow-empty-message -S" for a commit without
any message body could have misidentified where the header of the
commit object ends.

* js/sign-empty-commit-fix:
  commit -S: avoid invalid pointer with empty message
2016-07-28 11:25:53 -07:00
Junio C Hamano
4966b58f3e Merge branch 'js/find-commit-subject-ignore-leading-blanks' into maint
A helper function that takes the contents of a commit object and
finds its subject line did not ignore leading blank lines, as is
commonly done by other codepaths.  Make it ignore leading blank
lines to match.

* js/find-commit-subject-ignore-leading-blanks:
  reset --hard: skip blank lines when reporting the commit subject
  sequencer: use skip_blank_lines() to find the commit subject
  commit -C: skip blank lines at the beginning of the message
  commit.c: make find_commit_subject() more robust
  pretty: make the skip_blank_lines() function public
2016-07-28 11:25:50 -07:00
Junio C Hamano
96e08010ee Merge branch 'jk/printf-format'
Code clean-up to avoid using a variable string that compilers may
feel untrustable as printf-style format given to write_file()
helper function.

* jk/printf-format:
  commit.c: remove print_commit_list()
  avoid using sha1_to_hex output as printf format
  walker: let walker_say take arbitrary formats
2016-07-19 13:22:22 -07:00
Junio C Hamano
c510926691 Merge branch 'js/sign-empty-commit-fix'
"git commit --amend --allow-empty-message -S" for a commit without
any message body could have misidentified where the header of the
commit object ends.

* js/sign-empty-commit-fix:
  commit -S: avoid invalid pointer with empty message
2016-07-13 11:24:15 -07:00
Junio C Hamano
62e5e83f8d Merge branch 'js/find-commit-subject-ignore-leading-blanks'
A helper function that takes the contents of a commit object and
finds its subject line did not ignore leading blank lines, as is
commonly done by other codepaths.  Make it ignore leading blank
lines to match.

* js/find-commit-subject-ignore-leading-blanks:
  reset --hard: skip blank lines when reporting the commit subject
  sequencer: use skip_blank_lines() to find the commit subject
  commit -C: skip blank lines at the beginning of the message
  commit.c: make find_commit_subject() more robust
  pretty: make the skip_blank_lines() function public
2016-07-11 10:31:08 -07:00
Junio C Hamano
54307ea7c3 commit.c: remove print_commit_list()
The helper function tries to offer a way to conveniently show the
last one differently from others, presumably to allow you to say
something like

	A, B, and C.

while iterating over a list that has these three elements.

However, there is only one caller, and it passes the same format
string "%s\n" for both the last one and the other ones.  Retire the
helper function and update the caller with a simplified version.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-08 10:11:36 -07:00
Johannes Schindelin
3324dd8f26 commit -S: avoid invalid pointer with empty message
While it is not recommended, fsck.c says:

	Not having a body is not a crime [...]

... which means that we cannot assume that the commit buffer
contains an empty line to separate header from body.  A commit
object with only a header without any body, not even without
a blank line after the header, is valid.

So let's tread carefully here.  strstr("\n\n") may find nothing
and return NULL.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-29 15:07:02 -07:00
Johannes Schindelin
4e1b06da25 commit.c: make find_commit_subject() more robust
Just like the pretty printing machinery, we should simply ignore
blank lines at the beginning of the commit messages.

This discrepancy was noticed when an early version of the
rebase--helper produced commit objects with more than one empty line
between the header and the commit message.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-06-22 13:24:17 -07:00
Jeff King
50a6c8efa2 use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King
b32fa95fd8 convert trivial cases to ALLOC_ARRAY
Each of these cases can be converted to use ALLOC_ARRAY or
REALLOC_ARRAY, which has two advantages:

  1. It automatically checks the array-size multiplication
     for overflow.

  2. It always uses sizeof(*array) for the element-size,
     so that it can never go out of sync with the declared
     type of the array.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Junio C Hamano
0af22d6fff Merge branch 'rs/pop-commit' into maint
Code simplification.

* rs/pop-commit:
  use pop_commit() for consuming the first entry of a struct commit_list
2015-12-11 11:14:13 -08:00
brian m. carlson
ed1c9977cb Remove get_object_hash.
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object.  This provides no
functional change, as it is essentially a macro substitution.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
f2fd0760f6 Convert struct object to object_id
struct object is one of the major data structures dealing with object
IDs.  Convert it to use struct object_id instead of an unsigned char
array.  Convert get_object_hash to refer to the new member as well.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson
7999b2cf77 Add several uses of get_object_hash.
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash.  Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00