Commit graph

929 commits

Author SHA1 Message Date
Jeff King 50a6c8efa2 use st_add and st_mult for allocation size computation
If our size computation overflows size_t, we may allocate a
much smaller buffer than we expected and overflow it. It's
probably impossible to trigger an overflow in most of these
sites in practice, but it is easy enough convert their
additions and multiplications into overflow-checking
variants. This may be fixing real bugs, and it makes
auditing the code easier.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King 96ffc06f72 convert trivial cases to FLEX_ARRAY macros
Using FLEX_ARRAY macros reduces the amount of manual
computation size we have to do. It also ensures we don't
overflow size_t, and it makes sure we write the same number
of bytes that we allocated.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Jeff King 3733e69464 use xmallocz to avoid size arithmetic
We frequently allocate strings as xmalloc(len + 1), where
the extra 1 is for the NUL terminator. This can be done more
simply with xmallocz, which also checks for integer
overflow.

There's no case where switching xmalloc(n+1) to xmallocz(n)
is wrong; the result is the same length, and malloc made no
guarantees about what was in the buffer anyway. But in some
cases, we can stop manually placing NUL at the end of the
allocated buffer. But that's only safe if it's clear that
the contents will always fill the buffer.

In each case where this patch does so, I manually examined
the control flow, and I tried to err on the side of caution.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-22 14:51:09 -08:00
Junio C Hamano 844a9ce472 Merge branch 'bc/object-id'
More transition from "unsigned char[40]" to "struct object_id".

This needed a few merge fixups, but is mostly disentangled from other
topics.

* bc/object-id:
  remote: convert functions to struct object_id
  Remove get_object_hash.
  Convert struct object to object_id
  Add several uses of get_object_hash.
  object: introduce get_object_hash macro.
  ref_newer: convert to use struct object_id
  push_refs_with_export: convert to struct object_id
  get_remote_heads: convert to struct object_id
  parse_fetch: convert to use struct object_id
  add_sought_entry_mem: convert to struct object_id
  Convert struct ref to use object_id.
  sha1_file: introduce has_object_file helper.
2015-12-10 12:36:13 -08:00
Junio C Hamano b1cda70fff Merge branch 'dt/refs-backend-pre-vtable'
Code preparation for pluggable ref backends.

* dt/refs-backend-pre-vtable:
  refs: break out ref conflict checks
  files_log_ref_write: new function
  initdb: make safe_create_dir public
  refs: split filesystem-based refs code into a new file
  refs/refs-internal.h: new header file
  refname_is_safe(): improve docstring
  pack_if_possible_fn(): use ref_type() instead of is_per_worktree_ref()
  copy_msg(): rename to copy_reflog_msg()
  verify_refname_available(): new function
  verify_refname_available(): rename function
2015-12-08 14:14:49 -08:00
brian m. carlson ed1c9977cb Remove get_object_hash.
Convert all instances of get_object_hash to use an appropriate reference
to the hash member of the oid member of struct object.  This provides no
functional change, as it is essentially a macro substitution.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
brian m. carlson 7999b2cf77 Add several uses of get_object_hash.
Convert most instances where the sha1 member of struct object is
dereferenced to use get_object_hash.  Most instances that are passed to
functions that have versions taking struct object_id, such as
get_sha1_hex/get_oid_hex, or instances that can be trivially converted
to use struct object_id instead, are not converted.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
David Turner 0845122c39 refs: break out ref conflict checks
Create new function find_descendant_ref, to hold one of the ref
conflict checks used in verify_refname_available. Multiple backends
will need this function, so move it to the common code.

Also move rename_ref_available to the common code, because alternate
backends might need it and it has no files-backend-specific code.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
Michael Haggerty 7bd9bcf372 refs: split filesystem-based refs code into a new file
As another step in the move to pluggable reference backends, move the
code that is specific to the filesystem-based reference backend (i.e.,
the current system of storing references as loose and packed files) into
a separate file, refs/files-backend.c.

Aside from a tiny bit of file header boilerplate, this commit only moves
a subset of the code verbatim from refs.c to the new file, as can easily
be verified using patience diff:

    git diff --patience $commit^:refs.c $commit:refs.c
    git diff --patience $commit^:refs.c $commit:refs/files-backend.c

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
Michael Haggerty 4cb77009e1 refs/refs-internal.h: new header file
There are a number of constants, structs, and static functions defined
in refs.c and treated as private to the references module. But we want
to support multiple reference backends within the reference module,
and those backends will need access to some heretofore private
declarations.

We don't want those declarations to be visible to non-refs code, so we
don't want to move them to refs.h. Instead, add a new header file,
refs/refs-internal.h, that is intended to be included only from within
the refs module. Make some functions non-static and move some
declarations (and their corresponding docstrings) from refs.c to this
file.

In a moment we will add more content to the "refs" subdirectory.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
Michael Haggerty 03b32623d8 refname_is_safe(): improve docstring
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
Michael Haggerty a935ebd4a7 pack_if_possible_fn(): use ref_type() instead of is_per_worktree_ref()
is_per_worktree_ref() will soon be made private, so use the public
interface, ref_type(), in its place. And now that we're using
ref_type(), we can make it clear that we won't pack pseudorefs. This was
the case before, but due to the not-so-obvious reason that this function
is applied to references via the loose reference cache, which only
includes references that live inside "refs/".

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
David Turner f4a5721ccb copy_msg(): rename to copy_reflog_msg()
We will soon increase the visibility of this function, so make its name
more distinctive.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
Ronnie Sahlberg d336123160 verify_refname_available(): new function
Add a new verify_refname_available() function, which checks whether the
refname is available for use, taking all references (both packed and
loose) into account. This function, unlike the old
verify_refname_available(), has semantics independent of the choice of
reference storage, and can therefore be implemented by alternative
reference backends.

Use the new function in a couple of places.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
Ronnie Sahlberg 7003b3ce21 verify_refname_available(): rename function
Rename verify_refname_available() to verify_refname_available_dir() to
make the old name available for a more general purpose.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 04:52:01 -05:00
Lukas Fleischer 78a766ab6e hideRefs: add support for matching full refs
In addition to matching stripped refs, one can now add hideRefs
patterns that the full (unstripped) ref is matched against. To
distinguish between stripped and full matches, those new patterns
must be prefixed with a circumflex (^).

This commit also removes support for the undocumented and unintended
hideRefs settings ".have" (suppressing all "have" lines) and
"capabilities^{}" (suppressing the capabilities line).

Signed-off-by: Lukas Fleischer <lfleischer@lfos.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-05 11:25:02 -08:00
Junio C Hamano 78891795df Merge branch 'jk/war-on-sprintf'
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.

Macintosh-specific breakage was noticed and corrected in this
reroll.

* jk/war-on-sprintf: (70 commits)
  name-rev: use strip_suffix to avoid magic numbers
  use strbuf_complete to conditionally append slash
  fsck: use for_each_loose_file_in_objdir
  Makefile: drop D_INO_IN_DIRENT build knob
  fsck: drop inode-sorting code
  convert strncpy to memcpy
  notes: document length of fanout path with a constant
  color: add color_set helper for copying raw colors
  prefer memcpy to strcpy
  help: clean up kfmclient munging
  receive-pack: simplify keep_arg computation
  avoid sprintf and strcpy with flex arrays
  use alloc_ref rather than hand-allocating "struct ref"
  color: add overflow checks for parsing colors
  drop strcpy in favor of raw sha1_to_hex
  use sha1_to_hex_r() instead of strcpy
  daemon: use cld->env_array when re-spawning
  stat_tracking_info: convert to argv_array
  http-push: use an argv_array for setup_revisions
  fetch-pack: use argv_array for index-pack / unpack-objects
  ...
2015-10-20 15:24:01 -07:00
Junio C Hamano 8a54523f0f Merge branch 'kn/for-each-tag'
The "ref-filter" code was taught about many parts of what "tag -l"
does and then "tag -l" is being reimplemented in terms of "ref-filter".

* kn/for-each-tag:
  tag.c: implement '--merged' and '--no-merged' options
  tag.c: implement '--format' option
  tag.c: use 'ref-filter' APIs
  tag.c: use 'ref-filter' data structures
  ref-filter: add option to match literal pattern
  ref-filter: add support to sort by version
  ref-filter: add support for %(contents:lines=X)
  ref-filter: add option to filter out tags, branches and remotes
  ref-filter: implement an `align` atom
  ref-filter: introduce match_atom_name()
  ref-filter: introduce handler function for each atom
  utf8: add function to align a string into given strbuf
  ref-filter: introduce ref_formatting_state and ref_formatting_stack
  ref-filter: move `struct atom_value` to ref-filter.c
  strtoul_ui: reject negative values
2015-10-05 12:30:18 -07:00
Jeff King 00b6c178c3 use strbuf_complete to conditionally append slash
When working with paths in strbufs, we frequently want to
ensure that a directory contains a trailing slash before
appending to it. We can shorten this code (and make the
intent more obvious) by calling strbuf_complete.

Most of these cases are trivially identical conversions, but
there are two things to note:

  - in a few cases we did not check that the strbuf is
    non-empty (which would lead to an out-of-bounds memory
    access). These were generally not triggerable in
    practice, either from earlier assertions, or typically
    because we would have just fed the strbuf to opendir(),
    which would choke on an empty path.

  - in a few cases we indexed the buffer with "original_len"
    or similar, rather than the current sb->len, and it is
    not immediately obvious from the diff that they are the
    same. In all of these cases, I manually verified that
    the strbuf does not change between the assignment and
    the strbuf_complete call.

This does not convert cases which look like:

  if (sb->len && !is_dir_sep(sb->buf[sb->len - 1]))
	  strbuf_addch(sb, '/');

as those are obviously semantically different. Some of these
cases arguably should be doing that, but that is out of
scope for this change, which aims purely for cleanup with no
behavior change (and at least it will make such sites easier
to find and examine in the future, as we can grep for
strbuf_complete).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:06 -07:00
Jeff King c7ab0ba340 avoid sprintf and strcpy with flex arrays
When we are allocating a struct with a FLEX_ARRAY member, we
generally compute the size of the array and then sprintf or
strcpy into it. Normally we could improve a dynamic allocation
like this by using xstrfmt, but it doesn't work here; we
have to account for the size of the rest of the struct.

But we can improve things a bit by storing the length that
we use for the allocation, and then feeding it to xsnprintf
or memcpy, which makes it more obvious that we are not
writing more than the allocated number of bytes.

It would be nice if we had some kind of helper for
allocating generic flex arrays, but it doesn't work that
well:

 - the call signature is a little bit unwieldy:

      d = flex_struct(sizeof(*d), offsetof(d, path), fmt, ...);

   You need offsetof here instead of just writing to the
   end of the base size, because we don't know how the
   struct is packed (partially this is because FLEX_ARRAY
   might not be zero, though we can account for that; but
   the size of the struct may actually be rounded up for
   alignment, and we can't know that).

 - some sites do clever things, like over-allocating because
   they know they will write larger things into the buffer
   later (e.g., struct packed_git here).

So we're better off to just write out each allocation (or
add type-specific helpers, though many of these are one-off
allocations anyway).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-10-05 11:08:05 -07:00
Jeff King 495127dbcb resolve_ref: use strbufs for internal buffers
resolve_ref already uses a strbuf internally when generating
pathnames, but it uses fixed-size buffers for storing the
refname and symbolic refs. This means that you cannot
actually point HEAD to a ref that is larger than 256 bytes.

We can lift this limit by using strbufs here, too. Like
sb_path, we pass the the buffers into our helper function,
so that we can easily clean up all output paths. We can also
drop the "unsafe" name from our helper function, as it no
longer uses a single static buffer (but of course
resolve_ref_unsafe is still unsafe, because the static
buffers moved there).

As a bonus, we also get to drop some strcpy calls between
the two fixed buffers (that cannot currently overflow
because the two buffers are sized identically).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King 5096d4909f convert trivial sprintf / strcpy calls to xsnprintf
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.

However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Karthik Nayak 5b4f28510f ref-filter: add option to filter out tags, branches and remotes
Add a function called 'for_each_fullref_in()' to refs.{c,h} which
iterates through each ref for the given path without trimming the path
and also accounting for broken refs, if mentioned.

Add 'filter_ref_kind()' in ref-filter.c to check the kind of ref being
handled and return the kind to 'ref_filter_handler()', where we
discard refs which we do not need and assign the kind to needed refs.

Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Matthieu Moy <matthieu.moy@grenoble-inp.fr>
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-17 10:02:48 -07:00
David Turner ce414b33ec refs: make refs/bisect/* per-worktree
We need the place we stick refs for bisects in progress to not be
shared between worktrees.  So we make the refs/bisect/ hierarchy
per-worktree.

The is_per_worktree_ref function and associated docs learn that
refs/bisect/ is per-worktree, as does the git_path code in path.c

The ref-packing functions learn that per-worktree refs should not be
packed (since packed-refs is common rather than per-worktree).

Since refs/bisect is per-worktree, logs/refs/bisect should be too.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-01 10:37:39 -07:00
Junio C Hamano db86e61cbb Merge branch 'mh/tempfile'
The "lockfile" API has been rebuilt on top of a new "tempfile" API.

* mh/tempfile:
  credential-cache--daemon: use tempfile module
  credential-cache--daemon: delete socket from main()
  gc: use tempfile module to handle gc.pid file
  lock_repo_for_gc(): compute the path to "gc.pid" only once
  diff: use tempfile module
  setup_temporary_shallow(): use tempfile module
  write_shared_index(): use tempfile module
  register_tempfile(): new function to handle an existing temporary file
  tempfile: add several functions for creating temporary files
  prepare_tempfile_object(): new function, extracted from create_tempfile()
  tempfile: a new module for handling temporary files
  commit_lock_file(): use get_locked_file_path()
  lockfile: add accessor get_lock_file_path()
  lockfile: add accessors get_lock_file_fd() and get_lock_file_fp()
  create_bundle(): duplicate file descriptor to avoid closing it twice
  lockfile: move documentation to lockfile.h and lockfile.c
2015-08-25 14:57:09 -07:00
Junio C Hamano 080cc64663 Merge branch 'dt/refs-pseudo'
To prepare for allowing a different "ref" backend to be plugged in
to the system, update_ref()/delete_ref() have been taught about
ref-like things like MERGE_HEAD that are per-worktree (they will
always be written to the filesystem inside $GIT_DIR).

* dt/refs-pseudo:
  pseudoref: check return values from read_ref()
  sequencer: replace write_cherry_pick_head with update_ref
  bisect: use update_ref
  pseudorefs: create and use pseudoref update and delete functions
  refs: add ref_type function
  refs: introduce pseudoref and per-worktree ref concepts
2015-08-25 14:57:08 -07:00
Junio C Hamano 8c9155e031 Merge branch 'jk/git-path'
git_path() and mkpath() are handy helper functions but it is easy
to misuse, as the callers need to be careful to keep the number of
active results below 4.  Their uses have been reduced.

* jk/git-path:
  memoize common git-path "constant" files
  get_repo_path: refactor path-allocation
  find_hook: keep our own static buffer
  refs.c: remove_empty_directories can take a strbuf
  refs.c: avoid git_path assignment in lock_ref_sha1_basic
  refs.c: avoid repeated git_path calls in rename_tmp_log
  refs.c: simplify strbufs in reflog setup and writing
  path.c: drop git_path_submodule
  refs.c: remove extra git_path calls from read_loose_refs
  remote.c: drop extraneous local variable from migrate_file
  prefer mkpathdup to mkpath in assignments
  prefer git_pathdup to git_path in some possibly-dangerous cases
  add_to_alternates_file: don't add duplicate entries
  t5700: modernize style
  cache.h: complete set of git_path_submodule helpers
  cache.h: clarify documentation for git_path, et al
2015-08-19 14:48:56 -07:00
Junio C Hamano 824a0be6be Merge branch 'jk/negative-hiderefs'
A negative !ref entry in multi-value transfer.hideRefs
configuration can be used to say "don't hide this one".

* jk/negative-hiderefs:
  refs: support negative transfer.hideRefs
  docs/config.txt: reorder hideRefs config
2015-08-19 14:48:54 -07:00
David Turner 2c3aed1381 pseudoref: check return values from read_ref()
These codepaths attempt to compare the "expected" current value with
the actual current value, but did not check if we successfully read
the current value before comparison.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-11 15:52:20 -07:00
Jeff King 470e28d4e1 refs.c: remove_empty_directories can take a strbuf
The first thing we do in this function is copy the input
into a strbuf. Of the 4 callers, 3 of them already have a
strbuf we could use. Let's just take the strbuf, and convert
the remaining caller to use a strbuf, rather than a raw
git_path. This is safer, anyway, as remove_dir_recursively
is a non-trivial function that might use the pathname
buffers itself (this is _probably_ OK, as the likely culprit
would be calling resolve_gitlink_ref, but we do not pass the
proper flags to ask it to avoid blowing away gitlinks).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:13 -07:00
Jeff King 5f8ef5b848 refs.c: avoid git_path assignment in lock_ref_sha1_basic
Assigning the result of git_path is a bad pattern, because
it's not immediately obvious how long you expect the content
to stay valid (and it may be overwritten by subsequent
calls). Let's use a function-local strbuf here instead,
which we know is safe (we just have to remember to free it
in all code paths).

As a bonus, we get rid of a confusing variable-reuse
("ref_file" is used for two distinct purposes).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:13 -07:00
Jeff King d6549f3655 refs.c: avoid repeated git_path calls in rename_tmp_log
Because it's not safe to store the static-buffer results of
git_path for a long time, we end up formatting the same
filename over and over. We can fix this by using a
function-local strbuf to store the formatted pathname and
avoid repeating ourselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:13 -07:00
Jeff King 54b418f698 refs.c: simplify strbufs in reflog setup and writing
Commit 1a83c24 (git_snpath(): retire and replace with
strbuf_git_path(), 2014-11-30) taught log_ref_setup and
log_ref_write_1 to take a strbuf parameter, rather than a
bare string. It then makes an alias to the strbuf's "buf"
field under the original name.

This made the original diff much shorter, but the resulting
code is more complicated that it needs to be. Since we've
aliased the pointer, we drop our reference to the strbuf to
ensure we don't accidentally change it. But if we simply
drop our alias and use "logfile.buf" directly, we do not
have to worry about this aliasing. It's a larger diff, but
the resulting code is simpler.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:13 -07:00
Jeff King f5b2dec165 refs.c: remove extra git_path calls from read_loose_refs
In iterating over the loose refs in "refs/foo/", we keep a
running strbuf with "refs/foo/one", "refs/foo/two", etc. But
we also need to access these files in the filesystem, as
".git/refs/foo/one", etc. For this latter purpose, we make a
series of independent calls to git_path(). These are safe
(we only use the result to call stat()), but assigning the
result of git_path is a suspicious pattern that we'd rather
avoid.

This patch keeps a running buffer with ".git/refs/foo/", and
we can just append/reset each directory element as we loop.
This matches how we handle the refnames. It should also be
more efficient, as we do not keep formatting the same
".git/refs/foo" prefix (which can be arbitrarily deep).

Technically we are dropping a call to strbuf_cleanup() on
each generated filename, but that's OK; it wasn't doing
anything, as we are putting in single-level names we read
from the filesystem (so it could not possibly be cleaning up
cruft like "./" in this instance).

A clever reader may also note that the running refname
buffer ("refs/foo/") is actually a subset of the filesystem
path buffer (".git/refs/foo/"). We could get by with one
buffer, indexing the length of $GIT_DIR when we want the
refname. However, having tried this, the resulting code
actually ends up a little more confusing, and the efficiency
improvement is tiny (and almost certainly dwarfed by the
system calls we are making).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:13 -07:00
Jeff King e3cf230324 prefer mkpathdup to mkpath in assignments
As with the previous commit to git_path, assigning the
result of mkpath is suspicious, since it is not clear
whether we will still depend on the value after it may have
been overwritten by subsequent calls. This patch converts
low-hanging fruit to use mkpathdup instead of mkpath (with
the downside that we must remember to free the result).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:12 -07:00
Jeff King fcd12db6af prefer git_pathdup to git_path in some possibly-dangerous cases
Because git_path uses a static buffer that is shared with
calls to git_path, mkpath, etc, it can be dangerous to
assign the result to a variable or pass it to a non-trivial
function. The value may change unexpectedly due to other
calls.

None of the cases changed here has a known bug, but they're
worth converting away from git_path because:

  1. It's easy to use git_pathdup in these cases.

  2. They use constructs (like assignment) that make it
     hard to tell whether they're safe or not.

The extra malloc overhead should be trivial, as an
allocation should be an order of magnitude cheaper than a
system call (which we are clearly about to make, since we
are constructing a filename). The real cost is that we must
remember to free the result.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 15:37:12 -07:00
Michael Haggerty b4fb09e4da lockfile: add accessor get_lock_file_path()
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 12:57:14 -07:00
Michael Haggerty c99a4c2db3 lockfile: add accessors get_lock_file_fd() and get_lock_file_fp()
We are about to move those members, so change client code to read them
through accessor functions.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 12:57:14 -07:00
Jeff King 2bc31d1631 refs: support negative transfer.hideRefs
If you hide a hierarchy of refs using the transfer.hideRefs
config, there is no way to later override that config to
"unhide" it. This patch implements a "negative" hide which
causes matches to immediately be marked as unhidden, even if
another match would hide it. We take care to apply the
matches in reverse-order from how they are fed to us by the
config machinery, as that lets our usual "last one wins"
config precedence work (and entries in .git/config, for
example, will override /etc/gitconfig).

So you can now do:

  $ git config --system transfer.hideRefs refs/secret
  $ git config transfer.hideRefs '!refs/secret/not-so-secret'

to hide refs/secret in all repos, except for one public bit
in one specific repo. Or you can even do:

  $ git clone \
      -u "git -c transfer.hiderefs="!refs/foo" upload-pack" \
      remote:repo.git

to clone remote:repo.git, overriding any hiding it has
configured.

There are two alternatives that were considered and
rejected:

  1. A generic config mechanism for removing an item from a
     list. E.g.: (e.g., "[transfer] hideRefs -= refs/foo").

     This is nice because it could apply to other
     multi-valued config, as well. But it is not nearly as
     flexible. There is no way to say:

       [transfer]
       hideRefs = refs/secret
       hideRefs = refs/secret/not-so-secret

     Having explicit negative specifications means we can
     override previous entries, even if they are not the
     same literal string.

  2. Adding another variable to override some parts of
     hideRefs (e.g., "exposeRefs").

     This solves the problem from alternative (1), but it
     cannot easily obey the normal config precedence,
     because it would use two separate lists. For example:

       [transfer]
       hideRefs = refs/secret
       exposeRefs = refs/secret/not-so-secret
       hideRefs = refs/secret/not-so-secret/no-really-its-secret

     With two lists, we have to apply the "expose" rules
     first, and only then apply the "hide" rules. But that
     does not match what the above config intends.

     Of course we could internally parse that to a single
     list, respecting the ordering, which saves us having to
     invent the new "!" syntax. But using a single name
     communicates to the user that the ordering _is_
     important. And "!" is well-known for negation, and
     should not appear at the beginning of a ref (it is
     actually valid in a ref-name, but all entries here
     should be fully-qualified, starting with "refs/").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-07 11:47:36 -07:00
Junio C Hamano 8d3981ccbe Merge branch 'jk/refspec-parse-wildcard'
Allow an asterisk as a substring (as opposed to the entirety) of
a path component for both side of a refspec, e.g.
"refs/heads/o*:refs/remotes/heads/i*".

* jk/refspec-parse-wildcard:
  refs: loosen restriction on wildcard "*" refspecs
  refs: cleanup comments regarding check_refname_component()
2015-08-03 11:01:31 -07:00
Junio C Hamano b6d323f164 Merge branch 'dt/refs-backend-preamble'
In preparation for allowing different "backends" to store the refs
in a way different from the traditional "one ref per file in $GIT_DIR
or in a $GIT_DIR/packed-refs file" filesystem storage, reduce
direct filesystem access to ref-like things like CHERRY_PICK_HEAD
from scripts and programs.

* dt/refs-backend-preamble:
  git-stash: use update-ref --create-reflog instead of creating files
  update-ref and tag: add --create-reflog arg
  refs: add REF_FORCE_CREATE_REFLOG flag
  git-reflog: add exists command
  refs: new public ref function: safe_create_reflog
  refs: break out check for reflog autocreation
  refs.c: add err arguments to reflog functions
2015-08-03 11:01:29 -07:00
Junio C Hamano d939af12bd Merge branch 'jk/date-mode-format'
Teach "git log" and friends a new "--date=format:..." option to
format timestamps using system's strftime(3).

* jk/date-mode-format:
  strbuf: make strbuf_addftime more robust
  introduce "format" date-mode
  convert "enum date_mode" into a struct
  show-branch: use DATE_RELATIVE instead of magic number
2015-08-03 11:01:27 -07:00
Junio C Hamano be9cb560e3 Merge branch 'mh/init-delete-refs-api'
Clean up refs API and make "git clone" less intimate with the
implementation detail.

* mh/init-delete-refs-api:
  delete_ref(): use the usual convention for old_sha1
  cmd_update_ref(): make logic more straightforward
  update_ref(): don't read old reference value before delete
  check_branch_commit(): make first parameter const
  refs.h: add some parameter names to function declarations
  refs: move the remaining ref module declarations to refs.h
  initial_ref_transaction_commit(): check for ref D/F conflicts
  initial_ref_transaction_commit(): check for duplicate refs
  refs: remove some functions from the module's public interface
  initial_ref_transaction_commit(): function for initial ref creation
  repack_without_refs(): make function private
  prune_refs(): use delete_refs()
  prune_remote(): use delete_refs()
  delete_refs(): bail early if the packed-refs file cannot be rewritten
  delete_refs(): make error message more generic
  delete_refs(): new function for the refs API
  delete_ref(): handle special case more explicitly
  remove_branches(): remove temporary
  delete_ref(): move declaration to refs.h
2015-08-03 11:01:17 -07:00
Junio C Hamano 31a0ad5456 Merge branch 'mh/replace-refs'
Add an environment variable to tell Git to look into refs hierarchy
other than refs/replace/ for the object replacement data.

* mh/replace-refs:
  Allow to control where the replace refs are looked for
2015-08-03 11:01:10 -07:00
David Turner 74ec19d4be pseudorefs: create and use pseudoref update and delete functions
Pseudorefs should not be updated through the ref transaction
API, because alternate ref backends still need to store pseudorefs
in GIT_DIR (instead of wherever they store refs).  Instead,
change update_ref and delete_ref to call pseudoref-specific
functions.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-31 10:39:38 -07:00
David Turner 266b18273a refs: add ref_type function
Add a function ref_type, which categorizes refs as per-worktree,
pseudoref, or normal ref.

Later, we will use this in refs.c to treat pseudorefs specially.
Alternate ref backends may use it to treat both pseudorefs and
per-worktree refs differently.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-31 10:39:04 -07:00
Jacob Keller cd377f45c9 refs: loosen restriction on wildcard "*" refspecs
Loosen restrictions on refspecs by allowing patterns that have a "*"
within a component instead of only as the whole component.

Remove the logic to accept a single "*" as a whole component from
check_refname_format(), and implement an extended form of that logic
in check_refname_component().  Pass the pointer to the flags argument
to the latter, as it has to clear REFNAME_REFSPEC_PATTERN bit when
it sees "*".

Teach check_refname_component() function to allow an asterisk "*"
only when REFNAME_REFSPEC_PATTERN is set in the flags, and drop the
bit after seeing a "*", to ensure that one side of a refspec
contains at most one asterisk.

This will allow us to accept refspecs such as `for/bar*:foo/baz*`.
Any refspec which functioned before shall continue functioning with
the new logic.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-27 09:21:31 -07:00
Jacob Keller 53a8555ee4 refs: cleanup comments regarding check_refname_component()
Correctly specify all characters which are rejected under the '4: a
bad character' disposition, which did not list all characters that
are treated as such.

Cleanup comment style for rejected refs by inserting a ", or" at the
end of each statement.

Signed-off-by: Jacob Keller <jacob.keller@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-27 09:21:15 -07:00
David Turner 0f2a71d992 refs: add REF_FORCE_CREATE_REFLOG flag
Add a flag to allow forcing the creation of a reflog even if the ref
name and core.logAllRefUpdates setting would not ordinarily cause ref
creation.

In a moment, we will use this to add options to git tag and git
update-ref to force reflog creation.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-21 14:08:26 -07:00
David Turner abd0cd3a30 refs: new public ref function: safe_create_reflog
The safe_create_reflog function creates a reflog, if it does not
already exist.

The log_ref_setup function becomes private and gains a force_create
parameter to force the creation of a reflog even if log_all_ref_updates
is false or the refname is not one of the special refnames.

The new parameter also reduces the need to store, modify, and restore
the log_all_ref_updates global before reflog creation.

In a moment, we will use this to add reflog creation commands to
git-reflog.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-21 14:07:59 -07:00
David Turner 4e2bef57c9 refs: break out check for reflog autocreation
This is just for clarity.

Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-21 14:07:54 -07:00
David Turner a4c653dfcd refs.c: add err arguments to reflog functions
Add an err argument to log_ref_setup that can explain the reason
for a failure. This then eliminates the need to manage errno through
this function since we can just add strerror(errno) to the err string
when meaningful. No callers relied on errno from this function for
anything else than the error message.

Also add err arguments to private functions write_ref_to_lockfile,
log_ref_write_1, commit_ref_update. This again eliminates the need to
manage errno in these functions.

Some error messages are slightly reordered.

Update of a patch by Ronnie Sahlberg.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: David Turner <dturner@twopensource.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-07-21 14:07:28 -07:00
Jeff King a5481a6c94 convert "enum date_mode" into a struct
In preparation for adding date modes that may carry extra
information beyond the mode itself, this patch converts the
date_mode enum into a struct.

Most of the conversion is fairly straightforward; we pass
the struct as a pointer and dereference the type field where
necessary. Locations that declare a date_mode can use a "{}"
constructor.  However, the tricky case is where we use the
enum labels as constants, like:

  show_date(t, tz, DATE_NORMAL);

Ideally we could say:

  show_date(t, tz, &{ DATE_NORMAL });

but of course C does not allow that. Likewise, we cannot
cast the constant to a struct, because we need to pass an
actual address. Our options are basically:

  1. Manually add a "struct date_mode d = { DATE_NORMAL }"
     definition to each caller, and pass "&d". This makes
     the callers uglier, because they sometimes do not even
     have their own scope (e.g., they are inside a switch
     statement).

  2. Provide a pre-made global "date_normal" struct that can
     be passed by address. We'd also need "date_rfc2822",
     "date_iso8601", and so forth. But at least the ugliness
     is defined in one place.

  3. Provide a wrapper that generates the correct struct on
     the fly. The big downside is that we end up pointing to
     a single global, which makes our wrapper non-reentrant.
     But show_date is already not reentrant, so it does not
     matter.

This patch implements 3, along with a minor macro to keep
the size of the callers sane.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-29 11:39:07 -07:00
Junio C Hamano 9d71c5f408 Merge branch 'mh/reporting-broken-refs-from-for-each-ref'
"git for-each-ref" reported "missing object" for 0{40} when it
encounters a broken ref.  The lack of object whose name is 0{40} is
not the problem; the ref being broken is.

* mh/reporting-broken-refs-from-for-each-ref:
  read_loose_refs(): treat NULL_SHA1 loose references as broken
  read_loose_refs(): simplify function logic
  for-each-ref: report broken references correctly
  t6301: new tests of for-each-ref error handling
2015-06-24 12:21:52 -07:00
Michael Haggerty 1c03c4d347 delete_ref(): use the usual convention for old_sha1
The ref_transaction_update() family of functions use the following
convention for their old_sha1 parameters:

* old_sha1 == NULL: Don't check the old value at all.
* is_null_sha1(old_sha1): Ensure that the reference didn't exist
  before the transaction.
* otherwise: Ensure that the reference had the specified value before
  the transaction.

delete_ref() had a different convention, namely treating
is_null_sha1(old_sha1) as "don't care". Change it to adhere to the
standard convention to reduce the scope for confusion.

Please note that it is now a bug to pass old_sha1=NULL_SHA1 to
delete_ref() (because it doesn't make sense to delete a reference that
you already know doesn't exist). This is consistent with the behavior
of ref_transaction_delete().

Most of the callers of delete_ref() never pass old_sha1=NULL_SHA1 to
delete_ref(), and are therefore unaffected by this change. The
two exceptions are:

* The call in cmd_update_ref(), which passed NULL_SHA1 if the old
  value passed in on the command line was 0{40} or the empty string.
  Change that caller to pass NULL in those cases.

  Arguably, it should be an error to call "update-ref -d" with the old
  value set to "does not exist", just as it is for the `--stdin`
  command "delete". But since this usage was accepted until now,
  continue to accept it.

* The call in delete_branches(), which could pass NULL_SHA1 if
  deleting a broken or symbolic ref. Change it to pass NULL in these
  cases.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:14 -07:00
Michael Haggerty fb58c8d507 refs: move the remaining ref module declarations to refs.h
Some functions from the refs module were still declared in cache.h.
Move them to refs.h.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:12 -07:00
Michael Haggerty e426ff4222 initial_ref_transaction_commit(): check for ref D/F conflicts
In initial_ref_transaction_commit(), check for D/F conflicts (i.e.,
the type of conflict that exists between "refs/foo" and
"refs/foo/bar") among the references being created and between the
references being created and any hypothetical existing references.

Ideally, there shouldn't *be* any existing references when this
function is called. But, at least in the case of the "testgit" remote
helper, "clone" can be called after the remote-tracking "HEAD" and
"master" branches have already been created. So let's just do the
full-blown check.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:12 -07:00
Michael Haggerty fb802b3129 initial_ref_transaction_commit(): check for duplicate refs
Error out if the ref_transaction includes more than one update for any
refname.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:11 -07:00
Michael Haggerty 0a4b24ff14 refs: remove some functions from the module's public interface
The following functions are no longer used from outside the refs
module:

* lock_packed_refs()
* add_packed_ref()
* commit_packed_refs()
* rollback_packed_refs()

So make these functions private.

This is an important step, because it means that nobody outside of the
refs module needs to know the difference between loose and packed
references.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:11 -07:00
Michael Haggerty 58f233ce1e initial_ref_transaction_commit(): function for initial ref creation
"git clone" uses shortcuts when creating the initial set of
references:

* It writes them directly to packed-refs.

* It doesn't lock the individual references (though it does lock the
  packed-refs file).

* It doesn't check for refname conflicts between two new references or
  between one new reference and any hypothetical old ones.

* It doesn't create reflog entries for the reference creations.

This functionality was implemented in builtin/clone.c. But really that
file shouldn't have such intimate knowledge of how references are
stored. So provide a new function in the refs API,
initial_ref_transaction_commit(), which can be used for initial
reference creation. The new function is based on the ref_transaction
interface.

This means that we can make some other functions private to the refs
module. That will be done in a followup commit.

It would seem to make sense to add a test here that there are no
existing references, because that is how the function *should* be
used. But in fact, the "testgit" remote helper appears to call it
*after* having set up refs/remotes/<name>/HEAD and
refs/remotes/<name>/master, so we can't be so strict. For now, the
function trusts its caller to only call it when it makes sense. Future
commits will add some more limited sanity checks.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:11 -07:00
Michael Haggerty 79e4d8a9b8 repack_without_refs(): make function private
It is no longer called from outside of the refs module. Also move its
docstring and change it to imperative voice.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:11 -07:00
Michael Haggerty 7fa7dc8904 delete_refs(): bail early if the packed-refs file cannot be rewritten
If we fail to delete the doomed references from the packed-refs file,
then it is unsafe to delete their loose references, because doing so
might expose a value from the packed-refs file that is obsolete and
perhaps even points at an object that has been garbage collected.

So if repack_without_refs() fails, emit a more explicit error message
and bail.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:10 -07:00
Michael Haggerty 5d97861b9b delete_refs(): make error message more generic
Change the error message from

    Could not remove branch %s

to

    could not remove reference %s

First of all, the old error message referred to "branch
refs/remotes/origin/foo", which was awkward even for the existing
caller. Normally we would refer to a reference like that as either
"remote-tracking branch origin/foo" or "reference
refs/remotes/origin/foo". Here I take the lazier alternative.

Moreover, now that this function is part of the refs API, it might be
called for refs that are neither branches nor remote-tracking
branches.

While we're at it, convert the error message to lower case, as per our
usual convention.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:09 -07:00
Michael Haggerty 98ffd5ff67 delete_refs(): new function for the refs API
Move the function remove_branches() from builtin/remote.c to refs.c,
rename it to delete_refs(), and make it public.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:09 -07:00
Michael Haggerty fc67a0825c delete_ref(): handle special case more explicitly
delete_ref() uses a different convention for its old_sha1 parameter
than, say, ref_transaction_delete(): NULL_SHA1 means not to check the
old value. Make this fact a little bit clearer in the code by handling
it in explicit, commented code rather than burying it in a conditional
expression.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:09 -07:00
Michael Haggerty fc1c21689d delete_ref(): move declaration to refs.h
Also

* Add a docstring

* Rename the second parameter to "old_sha1", to be consistent with the
  convention used elsewhere in the refs module

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-22 13:17:08 -07:00
Mike Hommey 58d121b22b Allow to control where the replace refs are looked for
It can be useful to have grafts or replace refs for specific use-cases while
keeping the default "view" of the repository pristine (or with a different
set of grafts/replace refs).

It is possible to use a different graft file with GIT_GRAFT_FILE, but while
replace refs are more powerful, they don't have an equivalent override.

Add a GIT_REPLACE_REF_BASE environment variable to control where git is
going to look for replace refs.

Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-12 15:28:17 -07:00
Junio C Hamano 829f03e98c Merge branch 'mh/verify-lock-error-report'
Bring consistency to error reporting mechanism used in "refs" API.

* mh/verify-lock-error-report:
  ref_transaction_commit(): do not capitalize error messages
  verify_lock(): do not capitalize error messages
  verify_lock(): report errors via a strbuf
  verify_lock(): on errors, let the caller unlock the lock
  verify_lock(): return 0/-1 rather than struct ref_lock *
2015-06-11 09:29:54 -07:00
Michael Haggerty 501cf47cdd read_loose_refs(): treat NULL_SHA1 loose references as broken
NULL_SHA1 is used to indicate an "invalid object name" throughout our
code (and the code of other git implementations), so it is vastly more
likely that an on-disk reference was set to this value due to a
software bug than that NULL_SHA1 is the legitimate SHA-1 of an actual
object.  Therefore, if a loose reference has the value NULL_SHA1,
consider it to be broken.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-08 10:35:41 -07:00
Junio C Hamano 7c997bcbf6 Merge branch 'mh/write-refs-sooner-2.4' into maint
Multi-ref transaction support we merged a few releases ago
unnecessarily kept many file descriptors open, risking to fail with
resource exhaustion.  This is for 2.4.x track.

* mh/write-refs-sooner-2.4:
  ref_transaction_commit(): fix atomicity and avoid fd exhaustion
  ref_transaction_commit(): remove the local flags variable
  ref_transaction_commit(): inline call to write_ref_sha1()
  rename_ref(): inline calls to write_ref_sha1() from this function
  commit_ref_update(): new function, extracted from write_ref_sha1()
  write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
  t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
  update-ref: test handling large transactions properly
  ref_transaction_commit(): fix atomicity and avoid fd exhaustion
  ref_transaction_commit(): remove the local flags variable
  ref_transaction_commit(): inline call to write_ref_sha1()
  rename_ref(): inline calls to write_ref_sha1() from this function
  commit_ref_update(): new function, extracted from write_ref_sha1()
  write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
  t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
  update-ref: test handling large transactions properly
2015-06-05 12:00:17 -07:00
Michael Haggerty f5517074f8 read_loose_refs(): simplify function logic
Make it clearer that there are two possible ways to read the
reference, but that we handle read errors uniformly regardless of
which way it was read.

This refactoring also makes the following change easier to implement.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-03 11:44:25 -07:00
Michael Haggerty c2e0a718c6 ref_transaction_commit(): do not capitalize error messages
Our convention is for error messages to start with a lower-case
letter.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-27 15:58:42 -07:00
Michael Haggerty 000f0da57a verify_lock(): do not capitalize error messages
Our convention is for error messages to start with a lower-case
letter.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-27 15:58:42 -07:00
Michael Haggerty 33ffc176d6 verify_lock(): report errors via a strbuf
Instead of writing error messages directly to stderr, write them to
a "strbuf *err".  The caller, lock_ref_sha1_basic(), uses this error
reporting convention with all the other callees, and reports its
error this way to its callers.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-27 15:57:47 -07:00
Michael Haggerty f41d632970 verify_lock(): on errors, let the caller unlock the lock
The caller already knows how to do it, so always do it in the same
place.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-27 12:40:29 -07:00
Michael Haggerty a5e2499e54 verify_lock(): return 0/-1 rather than struct ref_lock *
Its return value wasn't conveying any extra information, but it made
the reader wonder whether the ref_lock that it returned might be
different than the one that was passed to it. So change the function
to the traditional "return 0 on success or a negative value on error".

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-27 12:39:41 -07:00
Michael Haggerty 5cb901a4b0 struct ref_lock: convert old_sha1 member to object_id
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:40 -07:00
Michael Haggerty 4e675d1732 warn_if_dangling_symref(): convert local variable "junk" to object_id
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:39 -07:00
Michael Haggerty 0a0c953217 each_ref_fn_adapter(): remove adapter
All of the callers of the for_each_ref family of functions have now
been rewritten to work with object_ids, so this adapter is no longer
needed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:39 -07:00
Michael Haggerty 2b2a5be394 each_ref_fn: change to take an object_id parameter
Change typedef each_ref_fn to take a "const struct object_id *oid"
parameter instead of "const unsigned char *sha1".

To aid this transition, implement an adapter that can be used to wrap
old-style functions matching the old typedef, which is now called
"each_ref_sha1_fn"), and make such functions callable via the new
interface. This requires the old function and its cb_data to be
wrapped in a "struct each_ref_fn_sha1_adapter", and that object to be
used as the cb_data for an adapter function, each_ref_fn_adapter().

This is an enormous diff, but most of it consists of simple,
mechanical changes to the sites that call any of the "for_each_ref"
family of functions. Subsequent to this change, the call sites can be
rewritten one by one to use the new interface.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:27 -07:00
brian m. carlson 8353847e85 refs: convert struct ref_entry to use struct object_id
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-25 12:19:27 -07:00
Junio C Hamano fb257bfa17 Merge branch 'mh/lockfile-retry'
Instead of dying immediately upon failing to obtain a lock, retry
after a short while with backoff.

* mh/lockfile-retry:
  lock_packed_refs(): allow retries when acquiring the packed-refs lock
  lockfile: allow file locking to be retried with a timeout
2015-05-22 12:41:55 -07:00
Junio C Hamano faa4b2ecbb Merge branch 'mh/ref-directory-file'
The ref API did not handle cases where 'refs/heads/xyzzy/frotz' is
removed at the same time as 'refs/heads/xyzzy' is added (or vice
versa) very well.

* mh/ref-directory-file:
  reflog_expire(): integrate lock_ref_sha1_basic() errors into ours
  ref_transaction_commit(): delete extra "the" from error message
  ref_transaction_commit(): provide better error messages
  rename_ref(): integrate lock_ref_sha1_basic() errors into ours
  lock_ref_sha1_basic(): improve diagnostics for ref D/F conflicts
  lock_ref_sha1_basic(): report errors via a "struct strbuf *err"
  verify_refname_available(): report errors via a "struct strbuf *err"
  verify_refname_available(): rename function
  refs: check for D/F conflicts among refs created in a transaction
  ref_transaction_commit(): use a string_list for detecting duplicates
  is_refname_available(): use dirname in first loop
  struct nonmatching_ref_data: store a refname instead of a ref_entry
  report_refname_conflict(): inline function
  entry_matches(): inline function
  is_refname_available(): convert local variable "dirname" to strbuf
  is_refname_available(): avoid shadowing "dir" variable
  is_refname_available(): revamp the comments
  t1404: new tests of ref D/F conflicts within transactions
2015-05-22 12:41:53 -07:00
Junio C Hamano 91c90876de Merge branch 'mh/write-refs-sooner-2.4'
Multi-ref transaction support we merged a few releases ago
unnecessarily kept many file descriptors open, risking to fail with
resource exhaustion.  This is for 2.4.x track.

* mh/write-refs-sooner-2.4:
  ref_transaction_commit(): fix atomicity and avoid fd exhaustion
  ref_transaction_commit(): remove the local flags variable
  ref_transaction_commit(): inline call to write_ref_sha1()
  rename_ref(): inline calls to write_ref_sha1() from this function
  commit_ref_update(): new function, extracted from write_ref_sha1()
  write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
  t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
  update-ref: test handling large transactions properly
  ref_transaction_commit(): fix atomicity and avoid fd exhaustion
  ref_transaction_commit(): remove the local flags variable
  ref_transaction_commit(): inline call to write_ref_sha1()
  rename_ref(): inline calls to write_ref_sha1() from this function
  commit_ref_update(): new function, extracted from write_ref_sha1()
  write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
  t7004: rename ULIMIT test prerequisite to ULIMIT_STACK_SIZE
  update-ref: test handling large transactions properly
2015-05-22 12:41:52 -07:00
Junio C Hamano 4295abc040 Merge branch 'sb/ref-lock-lose-lock-fd'
The refs API uses ref_lock struct which had its own "int fd", even
though the same file descriptor was in the lock struct it contains.
Clean-up the code to lose this redundant field.

* sb/ref-lock-lose-lock-fd:
  refs.c: remove lock_fd from struct ref_lock
2015-05-19 13:17:59 -07:00
Michael Haggerty f4ab4f3ab1 lock_packed_refs(): allow retries when acquiring the packed-refs lock
Currently, there is only one attempt to acquire any lockfile, and if
the lock is held by another process, the locking attempt fails
immediately.

This is not such a limitation for loose reference files. First, they
don't take long to rewrite. Second, most reference updates have a
known "old" value, so if another process is updating a reference at
the same moment that we are trying to lock it, then probably the
expected "old" value will not longer be valid, and the update will
fail anyway.

But these arguments do not hold for packed-refs:

* The packed-refs file can be large and take significant time to
  rewrite.

* Many references are stored in a single packed-refs file, so it could
  be that the other process was changing a different reference than
  the one that we are interested in.

Therefore, it is much more likely for there to be spurious lock
conflicts in connection to the packed-refs file, resulting in
unnecessary command failures.

So, if the first attempt to lock the packed-refs file fails, continue
retrying for a configurable length of time before giving up. The
default timeout is 1 second.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-14 14:51:51 -07:00
Michael Haggerty cf018ee0cd ref_transaction_commit(): fix atomicity and avoid fd exhaustion
The old code was roughly

    for update in updates:
        acquire locks and check old_sha
    for update in updates:
        if changing value:
            write_ref_to_lockfile()
            commit_ref_update()
    for update in updates:
        if deleting value:
            unlink()
    rewrite packed-refs file
    for update in updates:
        if reference still locked:
            unlock_ref()

This has two problems.

Non-atomic updates
==================

The atomicity of the reference transaction depends on all pre-checks
being done in the first loop, before any changes have started being
committed in the second loop. The problem is that
write_ref_to_lockfile() (previously part of write_ref_sha1()), which
is called from the second loop, contains two more checks:

* It verifies that new_sha1 is a valid object

* If the reference being updated is a branch, it verifies that
  new_sha1 points at a commit object (as opposed to a tag, tree, or
  blob).

If either of these checks fails, the "transaction" is aborted during
the second loop. But this might happen after some reference updates
have already been permanently committed. In other words, the
all-or-nothing promise of "git update-ref --stdin" could be violated.

So these checks have to be moved to the first loop.

File descriptor exhaustion
==========================

The old code locked all of the references in the first loop, leaving
all of the lockfiles open until later loops. Since we might be
updating a lot of references, this could result in file descriptor
exhaustion.

The solution
============

After this patch, the code looks like

    for update in updates:
        acquire locks and check old_sha
        if changing value:
            write_ref_to_lockfile()
        else:
            close_ref()
    for update in updates:
        if changing value:
            commit_ref_update()
    for update in updates:
        if deleting value:
            unlink()
    rewrite packed-refs file
    for update in updates:
        if reference still locked:
            unlock_ref()

This fixes both problems:

1. The pre-checks in write_ref_to_lockfile() are now done in the first
   loop, before any changes have been committed. If any of the checks
   fails, the whole transaction can now be rolled back correctly.

2. All lockfiles are closed in the first loop immediately after they
   are created (either by write_ref_to_lockfile() or by close_ref()).
   This means that there is never more than one open lockfile at a
   time, preventing file descriptor exhaustion.

To simplify the bookkeeping across loops, add a new REF_NEEDS_COMMIT
bit to update->flags, which keeps track of whether the corresponding
lockfile needs to be committed, as opposed to just unlocked. (Since
"struct ref_update" is internal to the refs module, this change is not
visible to external callers.)

This change fixes two tests in t1400.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-12 21:28:03 -07:00
Michael Haggerty cbf50f9e3d ref_transaction_commit(): remove the local flags variable
Instead, work directly with update->flags. This has the advantage that
the REF_DELETING bit, set in the first loop, can be read in the second
loop instead of having to be recomputed. Plus, it was potentially
confusing having both update->flags and flags, which sometimes had
different values.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-12 21:28:03 -07:00
Michael Haggerty 61e51e0000 ref_transaction_commit(): inline call to write_ref_sha1()
That was the last caller, so delete function write_ref_sha1().

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-12 21:28:03 -07:00
Michael Haggerty ba43b7f29c rename_ref(): inline calls to write_ref_sha1() from this function
Most of what it does is unneeded from these call sites.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-12 21:28:02 -07:00
Michael Haggerty ad4cd6c297 commit_ref_update(): new function, extracted from write_ref_sha1()
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-12 21:28:02 -07:00
Michael Haggerty e6fd3c6730 write_ref_to_lockfile(): new function, extracted from write_ref_sha1()
This is the first step towards separating the checking and writing of
the new reference value to committing the change.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-12 21:28:02 -07:00
Junio C Hamano 6cc983d0ad Merge branch 'jk/reading-packed-refs'
An earlier rewrite to use strbuf_getwholeline() instead of fgets(3)
to read packed-refs file revealed that the former is unacceptably
inefficient.

* jk/reading-packed-refs:
  t1430: add another refs-escape test
  read_packed_refs: avoid double-checking sane refs
  strbuf_getwholeline: use getdelim if it is available
  strbuf_getwholeline: avoid calling strbuf_grow
  strbuf_addch: avoid calling strbuf_grow
  config: use getc_unlocked when reading from file
  strbuf_getwholeline: use getc_unlocked
  git-compat-util: add fallbacks for unlocked stdio
  strbuf_getwholeline: use getc macro
2015-05-11 14:23:42 -07:00
Junio C Hamano 68a2e6a2c8 Merge branch 'nd/multiple-work-trees'
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.

* nd/multiple-work-trees: (41 commits)
  prune --worktrees: fix expire vs worktree existence condition
  t1501: fix test with split index
  t2026: fix broken &&-chain
  t2026 needs procondition SANITY
  git-checkout.txt: a note about multiple checkout support for submodules
  checkout: add --ignore-other-wortrees
  checkout: pass whole struct to parse_branchname_arg instead of individual flags
  git-common-dir: make "modules/" per-working-directory directory
  checkout: do not fail if target is an empty directory
  t2025: add a test to make sure grafts is working from a linked checkout
  checkout: don't require a work tree when checking out into a new one
  git_path(): keep "info/sparse-checkout" per work-tree
  count-objects: report unused files in $GIT_DIR/worktrees/...
  gc: support prune --worktrees
  gc: factor out gc.pruneexpire parsing code
  gc: style change -- no SP before closing parenthesis
  checkout: clean up half-prepared directories in --to mode
  checkout: reject if the branch is already checked out elsewhere
  prune: strategies for linked checkouts
  checkout: support checking out into a new working directory
  ...
2015-05-11 14:23:39 -07:00
Michael Haggerty c628edfddb reflog_expire(): integrate lock_ref_sha1_basic() errors into ours
Now that lock_ref_sha1_basic() gives us back its error messages via a
strbuf, incorporate its error message into our error message rather
than emitting two separate error messages.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:20 -07:00
Michael Haggerty 3553944aa8 ref_transaction_commit(): delete extra "the" from error message
While we are in the area, let's remove a superfluous definite article
from the error message that is emitted when the reference cannot be
locked. This improves how it reads and makes it a bit shorter.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:20 -07:00
Michael Haggerty cbaabcbc6f ref_transaction_commit(): provide better error messages
Now that lock_ref_sha1_basic() gives us back its error messages via a
strbuf, incorporate its error message into our error message rather
than emitting one error messages to stderr immediately and returning a
second to our caller.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:20 -07:00
Michael Haggerty abeef9c856 rename_ref(): integrate lock_ref_sha1_basic() errors into ours
Now that lock_ref_sha1_basic() gives us back its error messages via a
strbuf, incorporate its error message into our error message rather
than emitting two separate error messages.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:20 -07:00
Michael Haggerty 5b2d8d6f21 lock_ref_sha1_basic(): improve diagnostics for ref D/F conflicts
If there is a failure to lock a reference that is likely caused by a
D/F conflict (e.g., trying to lock "refs/foo/bar" when reference
"refs/foo" already exists), invoke verify_refname_available() to try
to generate a more helpful error message.

That function might not detect an error. For example, some
non-reference file might be blocking the deletion of an
otherwise-empty directory tree, or there might be a race with another
process that just deleted the offending reference. In such cases,
generate the strerror-based error message like before.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:20 -07:00
Michael Haggerty 4a32b2e08b lock_ref_sha1_basic(): report errors via a "struct strbuf *err"
For now, change the callers to spew the error to stderr like before.
But soon we will change them to incorporate the reason for the failure
into their own error messages.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:19 -07:00
Michael Haggerty 1146f17e2c verify_refname_available(): report errors via a "struct strbuf *err"
It shouldn't be spewing errors directly to stderr.

For now, change its callers to spew the errors to stderr.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:19 -07:00
Michael Haggerty 5baf37d383 verify_refname_available(): rename function
Rename is_refname_available() to verify_refname_available() and change
its return value from 1 for success to 0 for success, to be consistent
with our error-handling convention. In a moment it will also get a
"struct strbuf *err" parameter.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:19 -07:00
Michael Haggerty e911104c84 refs: check for D/F conflicts among refs created in a transaction
If two references that D/F conflict (e.g., "refs/foo" and
"refs/foo/bar") are created in a single transaction, the old code
discovered the problem only after the "commit" phase of
ref_transaction_commit() had already begun. This could leave some
references updated and others not, which violates the promise of
atomicity.

Instead, check for such conflicts during the "locking" phase:

* Teach is_refname_available() to take an "extras" parameter that can
  contain extra reference names with which the specified refname must
  not conflict.

* Change lock_ref_sha1_basic() to take an "extras" parameter, which it
  passes through to is_refname_available().

* Change ref_transaction_commit() to pass "affected_refnames" to
  lock_ref_sha1_basic() as its "extras" argument.

This change fixes a test case in t1404.

This code is a bit stricter than it needs to be. We could conceivably
allow reference "refs/foo/bar" to be created in the same transaction
as "refs/foo" is deleted (or vice versa). But that would be
complicated to implement, because it is not possible to lock
"refs/foo/bar" while "refs/foo" exists as a loose reference, but on
the other hand we don't want to delete some references before adding
others (because that could leave a gap during which required objects
are unreachable). There is also a complication that reflog files'
paths can conflict.

Any less-strict implementation would probably require tricks like the
packing of all references before the start of the real transaction, or
the use of temporary intermediate reference names.

So for now let's accept too-strict checks. Some reference update
transactions will be rejected unnecessarily, but they will be rejected
in their entirety rather than leaving the repository in an
intermediate state, as would happen now.

Please note that there is still one kind of D/F conflict that is *not*
handled correctly. If two processes are running at the same time, and
one tries to create "refs/foo" at the same time that the other tries
to create "refs/foo/bar", then they can race with each other. Both
processes can obtain their respective locks ("refs/foo.lock" and
"refs/foo/bar.lock"), proceed to the "commit" phase of
ref_transaction_commit(), and then the slower process will discover
that it cannot rename its lockfile into place (after possibly having
committed changes to other references). There appears to be no way to
fix this race without changing the locking policy, which in turn would
require a change to *all* Git clients.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:19 -07:00
Michael Haggerty 07f9c881d6 ref_transaction_commit(): use a string_list for detecting duplicates
Detect duplicates by storing the reference names in a string_list and
sorting that, instead of sorting the ref_updates directly.

* In a moment the string_list will be used for another purpose, too.

* This removes the need for the custom comparison function
  ref_update_compare().

* This means that we can carry out the updates in the order that the
  user specified them instead of reordering them. This might be handy
  someday if, we want to permit multiple updates to a single reference
  as long as they are compatible with each other.

Note: we can't use string_list_remove_duplicates() to check for
duplicates, because we need to know the name of the reference that
appeared multiple times, to be used in the error message.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:19 -07:00
Michael Haggerty 61da596992 is_refname_available(): use dirname in first loop
In the first loop (over prefixes of refname), use dirname to keep
track of the current prefix. This is not an improvement in itself, but
in a moment we will start using dirname for a role where a
NUL-terminated string is needed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:18 -07:00
Michael Haggerty 521331cc9f struct nonmatching_ref_data: store a refname instead of a ref_entry
Now that we don't need a ref_entry to pass to
report_refname_conflict(), it is sufficient to store the refname of
the conflicting reference.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:18 -07:00
Michael Haggerty 385e8af5a2 report_refname_conflict(): inline function
It wasn't pulling its weight. And we are about to need code similar to
this where no ref_entry is available and with more diverse error
messages. Rather than try to generalize the function, just inline it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:18 -07:00
Michael Haggerty 8bfac19ab4 entry_matches(): inline function
It wasn't pulling its weight. And in a moment we will need similar
tests that take a refname rather than a ref_entry as parameter, which
would have made entry_matches() even less useful.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:18 -07:00
Michael Haggerty 6075f3076e is_refname_available(): convert local variable "dirname" to strbuf
This change wouldn't be worth it by itself, but in a moment we will
use the strbuf for more string juggling.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:17 -07:00
Michael Haggerty 9ef6eaa287 is_refname_available(): avoid shadowing "dir" variable
The function had a "dir" parameter that was shadowed by a local "dir"
variable within a code block. Use the former in place of the latter.
(This is consistent with "dir"'s use elsewhere in the function.)

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:17 -07:00
Michael Haggerty 49e818762a is_refname_available(): revamp the comments
Change the comments to a running example of running the function with
refname set to "refs/foo/bar". Add some more explanation of the logic.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
2015-05-11 11:50:17 -07:00
Stefan Beller 1238ac8c5d refs.c: remove lock_fd from struct ref_lock
The 'lock_fd' is the same as 'lk->fd'. No need to store it twice so remove
it.

No functional changes intended.

Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-10 21:13:26 -07:00
Jeff King 03afcbee9b read_packed_refs: avoid double-checking sane refs
Prior to d0f810f (refs.c: allow listing and deleting badly
named refs, 2014-09-03), read_packed_refs would barf on any
malformed refnames by virtue of calling create_ref_entry
with the "check" parameter set to 1. That commit loosened
our reading so that we call check_refname_format ourselves
and just set a REF_BAD_NAME flag.

We then call create_ref_entry with the check parameter set
to 0. That function learned to do an extra safety check even
when the check parameter is 0, so that we don't load any
dangerous refnames (like "../../../etc/passwd"). This is
implemented by calling refname_is_safe() in
create_ref_entry().

However, we can observe that refname_is_safe() can only be
true if check_refname_format() also failed. So in the common
case of a sanely named ref, we perform _both_ checks, even
though we know that the latter will never trigger. This has
a noticeable performance impact when the packed-refs file is
large.

Let's drop the refname_is_safe check from create_ref_entry(),
and make it the responsibility of the caller.  Of the three
callers that pass a check parameter of "0", two will have
just called check_refname_format(), and can check the
refname-safety only when it fails. The third case,
pack_if_possible_fn, is copying from an existing ref entry,
which must have previously passed our safety check.

With this patch, running "git rev-parse refs/heads/does-not-exist"
on a repo with a large (1.6GB) packed-refs file went from:

  real    0m6.768s
  user    0m6.340s
  sys     0m0.432s

to:

  real    0m5.703s
  user    0m5.276s
  sys     0m0.432s

for a wall-clock speedup of 15%.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-04-16 08:15:05 -07:00
Junio C Hamano 05e816e37f Merge branch 'jk/prune-with-corrupt-refs'
"git prune" used to largely ignore broken refs when deciding which
objects are still being used, which could spread an existing small
damage and make it a larger one.

* jk/prune-with-corrupt-refs:
  refs.c: drop curate_packed_refs
  repack: turn on "ref paranoia" when doing a destructive repack
  prune: turn on ref_paranoia flag
  refs: introduce a "ref paranoia" flag
  t5312: test object deletion code paths in a corrupted repository
2015-03-25 12:54:26 -07:00
Jeff King ea56c4e02f refs.c: drop curate_packed_refs
When we delete a ref, we have to rewrite the entire
packed-refs file. We take this opportunity to "curate" the
packed-refs file and drop any entries that are crufty or
broken.

Dropping broken entries (e.g., with bogus names, or ones
that point to missing objects) is actively a bad idea, as it
means that we lose any notion that the data was there in the
first place. Aside from the general hackiness that we might
lose any information about ref "foo" while deleting an
unrelated ref "bar", this may seriously hamper any attempts
by the user at recovering from the corruption in "foo".

They will lose the sha1 and name of "foo"; the exact pointer
may still be useful even if they recover missing objects
from a different copy of the repository. But worse, once the
ref is gone, there is no trace of the corruption. A
follow-up "git prune" may delete objects, even though it
would otherwise bail when seeing corruption.

We could just drop the "broken" bits from
curate_packed_refs, and continue to drop the "crufty" bits:
refs whose loose counterpart exists in the filesystem. This
is not wrong to do, and it does have the advantage that we
may write out a slightly smaller packed-refs file. But it
has two disadvantages:

  1. It is a potential source of races or mistakes with
     respect to these refs that are otherwise unrelated to
     the operation. To my knowledge, there aren't any active
     problems in this area, but it seems like an unnecessary
     risk.

  2. We have to spend time looking up the matching loose
     refs for every item in the packed-refs file. If you
     have a large number of packed refs that do not change,
     that outweighs the benefit from writing out a smaller
     packed-refs file (it doesn't get smaller, and you do a
     bunch of directory traversal to find that out).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-20 12:41:41 -07:00
Jeff King 49672f26d9 refs: introduce a "ref paranoia" flag
Most operations that iterate over refs are happy to ignore
broken cruft. However, some operations should be performed
with knowledge of these broken refs, because it is better
for the operation to choke on a missing object than it is to
silently pretend that the ref did not exist (e.g., if we are
computing the set of reachable tips in order to prune
objects).

These processes could just call for_each_rawref, except that
ref iteration is often hidden behind other interfaces. For
instance, for a destructive "repack -ad", we would have to
inform "pack-objects" that we are destructive, and then it
would in turn have to tell the revision code that our
"--all" should include broken refs.

It's much simpler to just set a global for "dangerous"
operations that includes broken refs in all iterations.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-20 12:40:49 -07:00
Junio C Hamano 82b7e65199 Merge branch 'mh/expire-updateref-fixes'
Various issues around "reflog expire", e.g. using --updateref when
expiring a reflog for a symbolic reference, have been corrected
and/or made saner.

* mh/expire-updateref-fixes:
  reflog_expire(): never update a reference to null_sha1
  reflog_expire(): ignore --updateref for symbolic references
  reflog: improve and update documentation
  struct ref_lock: delete the force_write member
  lock_ref_sha1_basic(): do not set force_write for missing references
  write_ref_sha1(): move write elision test to callers
  write_ref_sha1(): remove check for lock == NULL
2015-03-10 13:52:40 -07:00
Michael Haggerty 423c688b85 reflog_expire(): never update a reference to null_sha1
Currently, if --updateref is specified and the very last reflog entry
is expired or deleted, the reference's value is set to 0{40}. This is
an invalid state of the repository, and breaks, for example, "git
fsck" and "git for-each-ref".

The only place we use --updateref in our own code is when dropping
stash entries. In that code, the very next step is to check if the
reflog has been made empty, and if so, delete the "refs/stash"
reference entirely. Thus that code path ultimately leaves the
repository in a valid state.

But we don't want to the repository in an invalid state even
temporarily, and we don't want to leave an invalid state if other
callers of "git reflog expire|delete --updateref" don't think to do
the extra cleanup step.

So, if "git reflog expire|delete" leaves no more entries in the
reflog, just leave the reference unchanged.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05 12:35:37 -08:00
Michael Haggerty 5e6f003ca8 reflog_expire(): ignore --updateref for symbolic references
If we are expiring reflog entries for a symbolic reference, then how
should --updateref be handled if the newest reflog entry is expired?

Option 1: Update the referred-to reference. (This is what the current
code does.) This doesn't make sense, because the referred-to reference
has its own reflog, which hasn't been rewritten.

Option 2: Update the symbolic reference itself (as in, REF_NODEREF).
This would convert the symbolic reference into a non-symbolic
reference (e.g., detaching HEAD), which is surely not what a user
would expect.

Option 3: Error out. This is plausible, but it would make the
following usage impossible:

    git reflog expire ... --updateref --all

Option 4: Ignore --updateref for symbolic references.

We choose to implement option 4.

Note: another problem in this code will be fixed in a moment.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05 12:35:37 -08:00
Stefan Beller 5a6f47077b struct ref_lock: delete the force_write member
Instead, compute the value when it is needed.

Signed-off-by: Stefan Beller <sbeller@google.com>
Edited-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05 12:35:36 -08:00
Michael Haggerty 074336e5ed lock_ref_sha1_basic(): do not set force_write for missing references
If a reference is missing, its SHA-1 will be null_sha1, which can't
possibly match a new value that ref_transaction_commit() is trying to
update it to. So there is no need to set force_write in this scenario.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05 12:35:36 -08:00
Michael Haggerty 706d5f816f write_ref_sha1(): move write elision test to callers
write_ref_sha1() previously skipped the write if the reference already
had the desired value, unless lock->force_write was set. Instead,
perform that test at the callers.

Two of the callers (in rename_ref()) unconditionally set force_write
just before calling write_ref_sha1(), so they don't need the extra
check at all. Nor do they need to set force_write anymore.

The last caller, in ref_transaction_commit(), still needs the test.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05 12:35:36 -08:00
Michael Haggerty 8280bbebd1 write_ref_sha1(): remove check for lock == NULL
None of the callers pass NULL to this function, and there doesn't seem
to be any usefulness to allowing them to do so.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-03-05 12:35:36 -08:00
Junio C Hamano faf723a631 Merge branch 'jk/blame-commit-label' into maint
"git blame HEAD -- missing" failed to correctly say "HEAD" when it
tried to say "No such path 'missing' in HEAD".

* jk/blame-commit-label:
  blame.c: fix garbled error message
  use xstrdup_or_null to replace ternary conditionals
  builtin/commit.c: use xstrdup_or_null instead of envdup
  builtin/apply.c: use xstrdup_or_null instead of null_strdup
  git-compat-util: add xstrdup_or_null helper
2015-02-24 22:09:54 -08:00
Michael Haggerty 4b7b520b9f update_ref(): improve documentation
Add a docstring for update_ref(), emphasizing its similarity to
ref_transaction_update(). Rename its parameters to match those of
ref_transaction_update().

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:25:03 -08:00
Michael Haggerty 1618033401 ref_transaction_verify(): new function to check a reference's value
If NULL is passed to ref_transaction_update()'s new_sha1 parameter,
then just verify old_sha1 (under lock) without trying to change the
new value of the reference.

Use this functionality to add a new function ref_transaction_verify(),
which checks the current value of the reference under lock but doesn't
change it.

Use ref_transaction_verify() in the implementation of "git update-ref
--stdin"'s "verify" command to avoid the awkward need to "update" the
reference to its existing value.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:24:59 -08:00
Michael Haggerty 60294596ba ref_transaction_delete(): check that old_sha1 is not null_sha1
It makes no sense to delete a reference that is already known not to
exist.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:24:55 -08:00
Michael Haggerty f04c5b5522 ref_transaction_create(): check that new_sha1 is valid
Creating a reference requires a new_sha1 that is not NULL and not
null_sha1.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:24:48 -08:00
Michael Haggerty fb5a6bb61c ref_transaction_delete(): remove "have_old" parameter
Instead, verify the reference's old value if and only if old_sha1 is
non-NULL.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:23:48 -08:00
Michael Haggerty 1d147bdff0 ref_transaction_update(): remove "have_old" parameter
Instead, verify the reference's old value if and only if old_sha1 is
non-NULL.

ref_transaction_delete() will get the same treatment in a moment.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:22:50 -08:00
Michael Haggerty 8df4e51138 struct ref_update: move "have_old" into "flags"
Instead of having a separate have_old field, record this boolean value
as a bit in the "flags" field.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:22:42 -08:00
Michael Haggerty fec14ec38c refs.c: change some "flags" to "unsigned int"
Change the following functions' "flags" arguments from "int" to
"unsigned int":

 * ref_transaction_update()
 * ref_transaction_create()
 * ref_transaction_delete()
 * update_ref()
 * delete_ref()
 * lock_ref_sha1_basic()

Also change the "flags" member in "struct ref_update" to unsigned.

Suggested-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-17 11:22:29 -08:00
Michael Haggerty 31e79f0a54 refs: remove the gap in the REF_* constant values
There is no reason to "reserve" a gap between the public and private
flags values.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-12 11:42:53 -08:00
Michael Haggerty 581d4e0cdb refs: move REF_DELETING to refs.c
It is only used internally now. Document it a little bit better, too.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-02-12 11:42:53 -08:00
Junio C Hamano 4d5c4e498a Merge branch 'mh/reflog-expire'
Restructure "reflog expire" to fit the reflogs better with the
recently updated ref API.

Looked reasonable (except that some shortlog entries stood out like
a sore thumb).

* mh/reflog-expire: (24 commits)
  refs.c: let fprintf handle the formatting
  refs.c: don't expose the internal struct ref_lock in the header file
  lock_any_ref_for_update(): inline function
  refs.c: remove unlock_ref/close_ref/commit_ref from the refs api
  reflog_expire(): new function in the reference API
  expire_reflog(): treat the policy callback data as opaque
  Move newlog and last_kept_sha1 to "struct expire_reflog_cb"
  expire_reflog(): move rewrite to flags argument
  expire_reflog(): move verbose to flags argument
  expire_reflog(): pass flags through to expire_reflog_ent()
  struct expire_reflog_cb: a new callback data type
  Rename expire_reflog_cb to expire_reflog_policy_cb
  expire_reflog(): move updateref to flags argument
  expire_reflog(): move dry_run to flags argument
  expire_reflog(): add a "flags" argument
  expire_reflog(): extract two policy-related functions
  Extract function should_expire_reflog_ent()
  expire_reflog(): use a lock_file for rewriting the reflog file
  expire_reflog(): return early if the reference has no reflog
  expire_reflog(): rename "ref" parameter to "refname"
  ...
2015-02-11 13:43:38 -08:00
Junio C Hamano 092c4be7f5 Merge branch 'jk/blame-commit-label'
"git blame HEAD -- missing" failed to correctly say "HEAD" when it
tried to say "No such path 'missing' in HEAD".

* jk/blame-commit-label:
  blame.c: fix garbled error message
  use xstrdup_or_null to replace ternary conditionals
  builtin/commit.c: use xstrdup_or_null instead of envdup
  builtin/apply.c: use xstrdup_or_null instead of null_strdup
  git-compat-util: add xstrdup_or_null helper
2015-02-11 13:39:50 -08:00
Junio C Hamano 61c9475221 Merge branch 'mh/reflog-expire' into mh/ref-trans-value-check
* mh/reflog-expire: (24 commits)
  refs.c: let fprintf handle the formatting
  refs.c: don't expose the internal struct ref_lock in the header file
  lock_any_ref_for_update(): inline function
  refs.c: remove unlock_ref/close_ref/commit_ref from the refs api
  reflog_expire(): new function in the reference API
  expire_reflog(): treat the policy callback data as opaque
  Move newlog and last_kept_sha1 to "struct expire_reflog_cb"
  expire_reflog(): move rewrite to flags argument
  expire_reflog(): move verbose to flags argument
  expire_reflog(): pass flags through to expire_reflog_ent()
  struct expire_reflog_cb: a new callback data type
  Rename expire_reflog_cb to expire_reflog_policy_cb
  expire_reflog(): move updateref to flags argument
  expire_reflog(): move dry_run to flags argument
  expire_reflog(): add a "flags" argument
  expire_reflog(): extract two policy-related functions
  Extract function should_expire_reflog_ent()
  expire_reflog(): use a lock_file for rewriting the reflog file
  expire_reflog(): return early if the reference has no reflog
  expire_reflog(): rename "ref" parameter to "refname"
  ...
2015-02-09 14:37:01 -08:00
Jeff King 8c53f0719b use xstrdup_or_null to replace ternary conditionals
This replaces "x ? xstrdup(x) : NULL" with xstrdup_or_null(x).
The change is fairly mechanical, with the exception of
resolve_refdup, which can eliminate a temporary variable.

There are still a few hits grepping for "?.*xstrdup", but
these are of slightly different forms and cannot be
converted (e.g., "x ? xstrdup(x->foo) : NULL").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-01-13 10:05:48 -08:00
René Scharfe 33adc83ddb refs: plug strbuf leak in lock_ref_sha1_basic()
Don't just reset, but release the resource held by the local
variable that is about to go out of scope.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-29 13:14:16 -08:00
Junio C Hamano aa9066fccd Merge branch 'jk/read-packed-refs-without-path-max'
Git did not correctly read an overlong refname from a packed refs
file.

* jk/read-packed-refs-without-path-max:
  read_packed_refs: use skip_prefix instead of static array
  read_packed_refs: pass strbuf to parse_ref_line
  read_packed_refs: use a strbuf for reading lines
2014-12-22 12:28:04 -08:00
Junio C Hamano 6f3abb7a87 Merge branch 'jk/for-each-reflog-ent-reverse'
The code that reads the reflog from the newer to the older entries
did not handle an entry that crosses a boundary of block it uses to
read them correctly.

* jk/for-each-reflog-ent-reverse:
  for_each_reflog_ent_reverse: turn leftover check into assertion
  for_each_reflog_ent_reverse: fix newlines on block boundaries
2014-12-22 12:27:32 -08:00
Junio C Hamano a7ddaa8eac Merge branch 'mh/simplify-repack-without-refs'
"git remote update --prune" to drop many refs has been optimized.

* mh/simplify-repack-without-refs:
  sort_string_list(): rename to string_list_sort()
  prune_remote(): iterate using for_each_string_list_item()
  prune_remote(): rename local variable
  repack_without_refs(): make the refnames argument a string_list
  prune_remote(): sort delete_refs_list references en masse
  prune_remote(): initialize both delete_refs lists in a single loop
  prune_remote(): exit early if there are no stale references
2014-12-22 12:26:50 -08:00
Stefan Beller c653e0343d refs.c: let fprintf handle the formatting
Instead of calculating whether to put a plus or minus sign, offload
the responsibilty to the fprintf function.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22 10:13:16 -08:00
Stefan Beller 3581d79335 refs.c: don't expose the internal struct ref_lock in the header file
Now the struct ref_lock is used completely internally, so let's
remove it from the header file.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22 10:13:15 -08:00
Michael Haggerty 31e07f76a9 lock_any_ref_for_update(): inline function
Inline the function at its one remaining caller (which is within
refs.c) and remove it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22 10:13:15 -08:00
Ronnie Sahlberg 0b1e654801 refs.c: remove unlock_ref/close_ref/commit_ref from the refs api
unlock|close|commit_ref can be made static since there are no more external
callers.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22 10:13:15 -08:00
Michael Haggerty fa5b1830b0 reflog_expire(): new function in the reference API
Move expire_reflog() into refs.c and rename it to reflog_expire().
Turn the three policy functions into function pointers that are passed
into reflog_expire(). Add function prototypes and documentation to
refs.h.

[jc: squashed in $gmane/261582, drop "extern" in function definition]

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Stefan Beller <sbeller@google.com>
Tweaked-by: Ramsay Jones
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-22 10:11:40 -08:00
Ronnie Sahlberg 2c6207abbd refs.c: add a function to append a reflog entry to a fd
Break out the code to create the string and writing it to the file
descriptor from log_ref_write and add it into a dedicated function
log_ref_write_fd. It is a nice unit of work.

For now this is only used from log_ref_write, but in the future it
might have other callers.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-12 11:42:00 -08:00
Jeff King ea417833ea read_packed_refs: use skip_prefix instead of static array
We want to recognize the packed-refs header and skip to the
"traits" part of the line. We currently do it by feeding
sizeof() a static const array to strncmp. However, it's a
bit simpler to just skip_prefix, which expresses the
intention more directly, and without remembering to account
for the NUL-terminator in each sizeof() call.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-10 09:40:33 -08:00
Jeff King 6a49870a72 read_packed_refs: pass strbuf to parse_ref_line
Now that we have a strbuf in read_packed_refs, we can pass
it straight to the line parser, which saves us an extra
strlen.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-10 09:28:54 -08:00
Jeff King 10c497aa0c read_packed_refs: use a strbuf for reading lines
Current code uses a fixed PATH_MAX-sized buffer for reading
packed-refs lines. This is a reasonable guess, in the sense
that git generally cannot work with refs larger than
PATH_MAX.  However, there are a few cases where it is not
great:

  1. Some systems may have a low value of PATH_MAX, but can
     actually handle larger paths in practice. Fixing this
     code path probably isn't enough to make them work
     completely with long refs, but it is a step in the
     right direction.

  2. We use fgets, which will happily give us half a line on
     the first read, and then the rest of the line on the
     second. This is probably OK in practice, because our
     refline parser is careful enough to look for the
     trailing newline on the first line. The second line may
     look like a peeled line to us, but since "^" is illegal
     in refnames, it is not likely to come up.

     Still, it does not hurt to be more careful.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-10 09:27:24 -08:00
Jeff King 69216bf72b for_each_reflog_ent_reverse: turn leftover check into assertion
Our loop should always process all lines, even if we hit the
beginning of the file. We have a conditional after the loop
ends to double-check that there is nothing left and to
process it. But this should never happen, and is a sign of a
logic bug in the loop. Let's turn it into a BUG assertion.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-05 11:11:52 -08:00
Jeff King e5e73ff20b for_each_reflog_ent_reverse: fix newlines on block boundaries
When we read a reflog file in reverse, we read whole chunks
of BUFSIZ bytes, then loop over the buffer, parsing any
lines we find. We find the beginning of each line by looking
for the newline from the previous line. If we don't find
one, we know that we are either at the beginning of
the file, or that we have to read another block.

In the latter case, we stuff away what we have into a
strbuf, read another block, and continue our parse. But we
missed one case here. If we did find a newline, and it is at
the beginning of the block, we must also stuff that newline
into the strbuf, as it belongs to the block we are about to
read.

The minimal fix here would be to add this special case to
the conditional that checks whether we found a newline.
But we can make the flow a little clearer by rearranging a
bit: we first handle lines that we are going to show, and
then at the end of each loop, stuff away any leftovers if
necessary. That lets us fold this special-case in with the
more common "we ended in the middle of a line" case.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-05 11:11:35 -08:00
Ronnie Sahlberg a785d3f77c refs.c: make ref_transaction_delete a wrapper for ref_transaction_update
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-04 15:39:37 -08:00
Ronnie Sahlberg bc9f2925fb refs.c: make ref_transaction_create a wrapper for ref_transaction_update
The ref_transaction_update function can already be used to create refs by
passing null_sha1 as the old_sha1 parameter. Simplify by replacing
transaction_create with a thin wrapper.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-04 15:39:36 -08:00
Nguyễn Thái Ngọc Duy 1a83c240f2 git_snpath(): retire and replace with strbuf_git_path()
In the previous patch, git_snpath() is modified to allocate a new
strbuf buffer because vsnpath() needs that. But that makes it
awkward because git_snpath() receives a pre-allocated buffer from
outside and has to copy data back. Rename it to strbuf_git_path()
and make it receive strbuf directly.

Using git_path() in update_refs_for_switch() which used to call
git_snpath() is safe because that function and all of its callers do
not keep any pointer to the round-robin buffer pool allocated by
get_pathname().

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01 11:00:11 -08:00
Nguyễn Thái Ngọc Duy dcf692625a path.c: make get_pathname() call sites return const char *
Before the previous commit, get_pathname returns an array of PATH_MAX
length. Even if git_path() and similar functions does not use the
whole array, git_path() caller can, in theory.

After the commit, get_pathname() may return a buffer that has just
enough room for the returned string and git_path() caller should never
write beyond that.

Make git_path(), mkpath() and git_path_submodule() return a const
buffer to make sure callers do not write in it at all.

This could have been part of the previous commit, but the "const"
conversion is too much distraction from the core changes in path.c.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-01 11:00:10 -08:00
Michael Haggerty 4a45b2f347 repack_without_refs(): make the refnames argument a string_list
Most of the callers have string_lists available already, whereas two
of them had to read data out of a string_list into an array of strings
just to call this function. So change repack_without_refs() to take
the list of refnames to omit as a string_list, and change the callers
accordingly.

Suggested-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-25 10:09:58 -08:00
Ronnie Sahlberg 068395150b lock_ref_sha1_basic: do not die on locking errors
lock_ref_sha1_basic is inconsistent about when it calls
die() and when it returns NULL to signal an error. This is
annoying to any callers that want to recover from a locking
error.

This seems to be mostly historical accident. It was added in
4bd18c4 (Improve abstraction of ref lock/write.,
2006-05-17), which returned an error in all cases except
calling safe_create_leading_directories, in which case it
died.  Later, 40aaae8 (Better error message when we are
unable to lock the index file, 2006-08-12) asked
hold_lock_file_for_update to die for us, leaving the
resolve_ref code-path the only one which returned NULL.

We tried to correct that in 5cc3cef (lock_ref_sha1(): do not
sometimes error() and sometimes die()., 2006-09-30),
by converting all of the die() calls into returns. But we
missed the "die" flag passed to the lock code, leaving us
inconsistent. This state persisted until e5c223e
(lock_ref_sha1_basic(): if locking fails with ENOENT, retry,
2014-01-18). Because of its retry scheme, it does not ask
the lock code to die, but instead manually dies with
unable_to_lock_die().

We can make this consistent with the other return paths by
converting this to use unable_to_lock_message(), and
returning NULL. This is safe to do because all callers
already needed to check the return value of the function,
since it could fail (and return NULL) for other reasons.

[jk: Added excessive history explanation]

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-20 08:25:03 -08:00
Junio C Hamano a1671dd82b Merge branch 'jk/fetch-reflog-df-conflict'
Corner-case bugfixes for "git fetch" around reflog handling.

* jk/fetch-reflog-df-conflict:
  ignore stale directories when checking reflog existence
  fetch: load all default config at startup
2014-11-06 10:52:32 -08:00
Jeff King 9233887cce ignore stale directories when checking reflog existence
When we update a ref, we have two rules for whether or not
we actually update the reflog:

  1. If the reflog already exists, we will always append to
     it.

  2. If log_all_ref_updates is set, we will create a new
     reflog file if necessary.

We do the existence check by trying to open the reflog file,
either with or without O_CREAT (depending on log_all_ref_updates).
If it fails, then we check errno to see what happened.

If we were not using O_CREAT and we got ENOENT, the file
doesn't exist, and we return success (there isn't a reflog
already, and we were not told to make a new one).

If we get EISDIR, then there is likely a stale directory
that needs to be removed (e.g., there used to be "foo/bar",
it was deleted, and the directory "foo" was left. Now we
want to create the ref "foo"). If O_CREAT is set, then we
catch this case, try to remove the directory, and retry our
open. So far so good.

But if we get EISDIR and O_CREAT is not set, then we treat
this as any other error, which is not right. Like ENOENT,
EISDIR is an indication that we do not have a reflog, and we
should silently return success (we were not told to create
it). Instead, the current code reports this as an error, and
we fail to update the ref at all.

Note that this is relatively unlikely to happen, as you
would have to have had reflogs turned on, and then later
turned them off (it could also happen due to a bug in fetch,
but that was fixed in the previous commit). However, it's
quite easy to fix: we just need to treat EISDIR like ENOENT
for the non-O_CREAT case, and silently return (note that
this early return means we can also simplify the O_CREAT
case).

Our new tests cover both cases (O_CREAT and non-O_CREAT).
The first one already worked, of course.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-11-04 12:18:44 -08:00
Jonathan Nieder 65732845e8 ref_transaction_commit: bail out on failure to remove a ref
When removal of a loose or packed ref fails, bail out instead of
trying to finish the transaction.  This way, a single error message
can be printed (instead of multiple messages being concatenated by
mistake) and the operator can try to solve the underlying problem
before there is a chance to muck things up even more.

In particular, when git fails to remove a ref, git goes on to try to
delete the reflog.  Exiting early lets us keep the reflog.

When git succeeds in deleting a ref A and fails to remove a ref B, it
goes on to try to delete both reflogs.  It would be better to just
remove the reflog for A, but that would be a more invasive change.
Failing early means we keep both reflogs, which puts the operator in a
good position to understand the problem and recover.

A long term goal is to avoid these problems altogether and roll back
the transaction on failure.  That kind of transactionality will have
to wait for a later series (the plan for which is to make all
destructive work happen in a single update of the packed-refs file).

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:27 -07:00
Jonathan Nieder 5a603b0463 refs.c: do not permit err == NULL
Some functions that take a strbuf argument to append an error treat
!err as an indication that the message should be suppressed (e.g.,
ref_update_reject_duplicates).  Others write the message to stderr on
!err (e.g., repack_without_refs).  Others crash (e.g.,
ref_transaction_update).

Some of these behaviors are for historical reasons and others were
accidents.  Luckily no callers pass err == NULL any more.  Simplify
by consistently requiring the strbuf argument.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:26 -07:00
Ronnie Sahlberg d0f810f0bc refs.c: allow listing and deleting badly named refs
We currently do not handle badly named refs well:

  $ cp .git/refs/heads/master .git/refs/heads/master.....@\*@\\.
  $ git branch
    fatal: Reference has invalid format: 'refs/heads/master.....@*@\.'
  $ git branch -D master.....@\*@\\.
    error: branch 'master.....@*@\.' not found.

Users cannot recover from a badly named ref without manually finding
and deleting the loose ref file or appropriate line in packed-refs.
Making that easier will make it easier to tweak the ref naming rules
in the future, for example to forbid shell metacharacters like '`'
and '"', without putting people in a state that is hard to get out of.

So allow "branch --list" to show these refs and allow "branch -d/-D"
and "update-ref -d" to delete them.  Other commands (for example to
rename refs) will continue to not handle these refs but can be changed
in later patches.

Details:

In resolving functions, refuse to resolve refs that don't pass the
git-check-ref-format(1) check unless the new RESOLVE_REF_ALLOW_BAD_NAME
flag is passed.  Even with RESOLVE_REF_ALLOW_BAD_NAME, refuse to
resolve refs that escape the refs/ directory and do not match the
pattern [A-Z_]* (think "HEAD" and "MERGE_HEAD").

In locking functions, refuse to act on badly named refs unless they
are being deleted and either are in the refs/ directory or match [A-Z_]*.

Just like other invalid refs, flag resolved, badly named refs with the
REF_ISBROKEN flag, treat them as resolving to null_sha1, and skip them
in all iteration functions except for for_each_rawref.

Flag badly named refs (but not symrefs pointing to badly named refs)
with a REF_BAD_NAME flag to make it easier for future callers to
notice and handle them specially.  For example, in a later patch
for-each-ref will use this flag to detect refs whose names can confuse
callers parsing for-each-ref output.

In the transaction API, refuse to create or update badly named refs,
but allow deleting them (unless they try to escape refs/ and don't match
[A-Z_]*).

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:26 -07:00
Jonathan Nieder f3cc52d840 packed-ref cache: forbid dot-components in refnames
Since v1.7.9-rc1~10^2 (write_head_info(): handle "extra refs" locally,
2012-01-06), this trick to keep track of ".have" refs that are only
valid on the wire and not on the filesystem is not needed any more.

Simplify by removing support for the REFNAME_DOT_COMPONENT flag.

This means we'll be slightly stricter with invalid refs found in a
packed-refs file or during clone.  read_loose_refs() already checks
for and skips refnames with .components so it is not affected.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:25 -07:00
Jonathan Nieder 62a2d52514 branch -d: avoid repeated symref resolution
If a repository gets in a broken state with too much symref nesting,
it cannot be repaired with "git branch -d":

 $ git symbolic-ref refs/heads/nonsense refs/heads/nonsense
 $ git branch -d nonsense
 error: branch 'nonsense' not found.

Worse, "git update-ref --no-deref -d" doesn't work for such repairs
either:

 $ git update-ref -d refs/heads/nonsense
 error: unable to resolve reference refs/heads/nonsense: Too many levels of symbolic links

Fix both by teaching resolve_ref_unsafe a new RESOLVE_REF_NO_RECURSE
flag and passing it when appropriate.

Callers can still read the value of a symref (for example to print a
message about it) with that flag set --- resolve_ref_unsafe will
resolve one level of symrefs and stop there.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:25 -07:00
Ronnie Sahlberg 7695d118e5 refs.c: change resolve_ref_unsafe reading argument to be a flags field
resolve_ref_unsafe takes a boolean argument for reading (a nonexistent ref
resolves successfully for writing but not for reading).  Change this to be
a flags field instead, and pass the new constant RESOLVE_REF_READING when
we want this behaviour.

While at it, swap two of the arguments in the function to put output
arguments at the end.  As a nice side effect, this ensures that we can
catch callers that were unaware of the new API so they can be audited.

Give the wrapper functions resolve_refdup and read_ref_full the same
treatment for consistency.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:24 -07:00
Ronnie Sahlberg aae383db8c refs.c: make write_ref_sha1 static
No external users call write_ref_sha1 any more so let's declare it static.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:23 -07:00
Ronnie Sahlberg 28e6a97e39 refs.c: ref_transaction_commit: distinguish name conflicts from other errors
In _commit, ENOTDIR can happen in the call to lock_ref_sha1_basic, either
when we lstat the new refname or if the name checking function reports that
the same type of conflict happened.  In both cases, it means that we can not
create the new ref due to a name conflict.

Start defining specific return codes for _commit.  TRANSACTION_NAME_CONFLICT
refers to a failure to create a ref due to a name conflict with another ref.
TRANSACTION_GENERIC_ERROR is for all other errors.

When "git fetch" is creating refs, name conflicts differ from other errors in
that they are likely to be resolved by running "git remote prune <remote>".
"git fetch" currently inspects errno to decide whether to give that advice.
Once it switches to the transaction API, it can check for
TRANSACTION_NAME_CONFLICT instead.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:23 -07:00
Ronnie Sahlberg 5fe7d825da refs.c: pass a list of names to skip to is_refname_available
Change is_refname_available to take a list of strings to exclude when
checking for conflicts instead of just one single name. We can already
exclude a single name for the sake of renames. This generalizes that support.

ref_transaction_commit already tracks a set of refs that are being deleted
in an array.  This array is then used to exclude refs from being written to
the packed-refs file.  At some stage we will want to change this array to a
struct string_list and then we can pass it to is_refname_available via the
call to lock_ref_sha1_basic.  That will allow us to perform transactions
that perform multiple renames as long as there are no conflicts within the
starting or ending state.

For example, that would allow a single transaction that contains two
renames that are both individually conflicting:

   m -> n/n
   n -> m/m

No functional change intended yet.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:23 -07:00
Ronnie Sahlberg 5d94a1b033 refs.c: call lock_ref_sha1_basic directly from commit
Skip using the lock_any_ref_for_update wrapper and call lock_ref_sha1_basic
directly from the commit function.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:23 -07:00
Ronnie Sahlberg 8a9df90d9a refs.c: refuse to lock badly named refs in lock_ref_sha1_basic
Move the check for check_refname_format from lock_any_ref_for_update to
lock_ref_sha1_basic.  At some later stage we will get rid of
lock_any_ref_for_update completely.  This has no visible impact to callers
except for the inability to lock badly named refs, which is not possible
today already for other reasons.(*)

Keep lock_any_ref_for_update as a no-op wrapper.  It is the public facing
version of this interface and keeping it as a separate function will make
it easier to experiment with the internal lock_ref_sha1_basic signature.

(*) For example, if lock_ref_sha1_basic checks the refname format and
refuses to lock badly named refs, it will not be possible to delete
such refs because the first step of deletion is to lock the ref.  We
currently already fail in that case because these refs are not recognized
to exist:

 $ cp .git/refs/heads/master .git/refs/heads/echo...\*\*
 $ git branch -D .git/refs/heads/echo...\*\*
 error: branch '.git/refs/heads/echo...**' not found.

This has been broken for a while.  Later patches in the series will start
repairing the handling of badly named refs.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:22 -07:00
Ronnie Sahlberg 7522e3dbcc rename_ref: don't ask read_ref_full where the ref came from
We call read_ref_full with a pointer to flags from rename_ref but since
we never actually use the returned flags we can just pass NULL here instead.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:22 -07:00
Ronnie Sahlberg db7516ab9f refs.c: pass the ref log message to _create/delete/update instead of _commit
Change the ref transaction API so that we pass the reflog message to the
create/delete/update functions instead of to ref_transaction_commit.
This allows different reflog messages for each ref update in a multi-ref
transaction.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:22 -07:00
Ronnie Sahlberg dbdcac7d5c refs.c: add an err argument to delete_ref_loose
Add an err argument to delete_ref_loose so that we can pass a descriptive
error string back to the caller. Pass the err argument from transaction
commit to this function so that transaction users will have a nice error
string if the transaction failed due to delete_ref_loose.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:21 -07:00
Ronnie Sahlberg 3c93c847ca refs.c: lock_ref_sha1_basic is used for all refs
lock_ref_sha1_basic is used to lock refs that sit directly in the .git
dir such as HEAD and MERGE_HEAD in addition to the more ordinary refs
under "refs/".  Remove the note claiming otherwise.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:21 -07:00
Ronnie Sahlberg 1054af7d04 wrapper.c: remove/unlink_or_warn: simplify, treat ENOENT as success
Simplify the function warn_if_unremovable slightly. Additionally, change
behaviour slightly. If we failed to remove the object because the object
does not exist, we can still return success back to the caller since none of
the callers depend on "fail if the file did not exist".

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-15 10:47:20 -07:00
Michael Haggerty 6e578a31e6 commit_packed_refs(): reimplement using fdopen_lock_file()
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 14:20:25 -07:00
Michael Haggerty 697cc8efd9 lockfile.h: extract new header file for the functions in lockfile.c
Move the interface declaration for the functions in lockfile.c from
cache.h to a new file, lockfile.h. Add #includes where necessary (and
remove some redundant includes of cache.h by files that already
include builtin.h).

Move the documentation of the lock_file state diagram from lockfile.c
to the new header file.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:56:14 -07:00
Michael Haggerty ec38b4e482 get_locked_file_path(): new function
Add a function to return the path of the file that is locked by a
lock_file object. This reduces the knowledge that callers have to have
about the lock_file layout.

Suggested-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:53:54 -07:00
Michael Haggerty 47ba4662bf lockfile: rename LOCK_NODEREF to LOCK_NO_DEREF
This makes it harder to misread the name as LOCK_NODE_REF.

Suggested-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:53:28 -07:00
Michael Haggerty cf6950d3bf lockfile: change lock_file::filename into a strbuf
For now, we still make sure to allocate at least PATH_MAX characters
for the strbuf because resolve_symlink() doesn't know how to expand
the space for its return value.  (That will be fixed in a moment.)

Another alternative would be to just use a strbuf as scratch space in
lock_file() but then store a pointer to the naked string in struct
lock_file.  But lock_file objects are often reused.  By reusing the
same strbuf, we can avoid having to reallocate the string most times
when a lock_file object is reused.

Helped-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:50:01 -07:00
Michael Haggerty 91f1f19184 delete_ref_loose(): don't muck around in the lock_file's filename
It's bad manners. Especially since there could be a signal during the
call to unlink_or_warn(), in which case the signal handler will see
the wrong filename and delete the reference file, leaving the lockfile
behind.

So make our own copy to work with.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:45:11 -07:00
Michael Haggerty 7108ad232f cache.h: define constants LOCK_SUFFIX and LOCK_SUFFIX_LEN
There are a few places that use these values, so define constants for
them.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:45:11 -07:00
Michael Haggerty e197c21807 unable_to_lock_die(): rename function from unable_to_lock_index_die()
This function is used for other things besides the index, so rename it
accordingly.

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-10-01 13:38:38 -07:00
Junio C Hamano 507fe835ed Merge branch 'da/rev-parse-verify-quiet'
"rev-parse --verify --quiet $name" is meant to quietly exit with a
non-zero status when $name is not a valid object name, but still
gave error messages in some cases.

* da/rev-parse-verify-quiet:
  stash: prefer --quiet over shell redirection of the standard error stream
  refs: make rev-parse --quiet actually quiet
  t1503: use test_must_be_empty
  Documentation: a note about stdout for git rev-parse --verify --quiet
2014-09-29 12:36:10 -07:00
Junio C Hamano 9bc4222746 Merge branch 'jk/faster-name-conflicts'
Optimize the check to see if a ref $F can be created by making sure
no existing ref has $F/ as its prefix, which especially matters in
a repository with a large number of existing refs.

* jk/faster-name-conflicts:
  refs: speed up is_refname_available
2014-09-26 14:39:43 -07:00
Junio C Hamano 69a5bbbbfa Merge branch 'jk/write-packed-refs-via-stdio'
Optimize the code path to write out the packed-refs file, which
especially matters in a repository with a large number of refs.

* jk/write-packed-refs-via-stdio:
  refs: write packed_refs file using stdio
2014-09-26 14:39:42 -07:00
Junio C Hamano fb6f843a8f Merge branch 'jk/prune-top-level-refs-after-packing' into maint
* jk/prune-top-level-refs-after-packing:
  pack-refs: prune top-level refs like "refs/foo"
2014-09-19 14:05:12 -07:00
David Aguilar c41a87dd80 refs: make rev-parse --quiet actually quiet
When a reflog is deleted, e.g. when "git stash" clears its stashes,
"git rev-parse --verify --quiet" dies:

	fatal: Log for refs/stash is empty.

The reason is that the get_sha1() code path does not allow us
to suppress this message.

Pass the flags bitfield through get_sha1_with_context() so that
read_ref_at() can suppress the message.

Use get_sha1_with_context1() instead of get_sha1() in rev-parse
so that the --quiet flag is honored.

Signed-off-by: David Aguilar <davvid@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-19 10:46:15 -07:00
Jeff King cbe7333181 refs: speed up is_refname_available
Our filesystem ref storage does not allow D/F conflicts; so
if "refs/heads/a/b" exists, we do not allow "refs/heads/a"
to exist (and vice versa). This falls out naturally for
loose refs, where the filesystem enforces the condition. But
for packed-refs, we have to make the check ourselves.

We do so by iterating over the entire packed-refs namespace
and checking whether each name creates a conflict. If you
have a very large number of refs, this is quite inefficient,
as you end up doing a large number of comparisons with
uninteresting bits of the ref tree (e.g., we know that all
of "refs/tags" is uninteresting in the example above, yet we
check each entry in it).

Instead, let's take advantage of the fact that we have the
packed refs stored as a trie of ref_entry structs. We can
find each component of the proposed refname as we walk
through the trie, checking for D/F conflicts as we go. For a
refname of depth N (i.e., 4 in the above example), we only
have to visit N nodes. And at each visit, we can binary
search the M names at that level, for a total complexity of
O(N lg M). ("M" is different at each level, of course, but
we can take the worst-case "M" as a bound).

In a pathological case of fetching 30,000 fresh refs into a
repository with 8.5 million refs, this dropped the time to
run "git fetch" from tens of minutes to ~30s.

This may also help smaller cases in which we check against
loose refs (which we do when renaming a ref), as we may
avoid a disk access for unrelated loose directories.

Note that the tests we add appear at first glance to be
redundant with what is already in t3210. However, the early
tests are not robust; they are run with reflogs turned on,
meaning that we are not actually testing
is_refname_available at all! The operations will still fail
because the reflogs will hit D/F conflicts in the
filesystem. To get a true test, we must turn off reflogs
(but we don't want to do so for the entire script, because
the point of turning them on was to cover some other cases).

Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-12 12:48:54 -07:00
Junio C Hamano 88e7dff93d Merge branch 'jk/prune-top-level-refs-after-packing'
After "pack-refs --prune" packed refs at the top-level, it failed
to prune them.

* jk/prune-top-level-refs-after-packing:
  pack-refs: prune top-level refs like "refs/foo"
2014-09-11 10:33:33 -07:00
Junio C Hamano 01d678a226 Merge branch 'rs/ref-transaction-1'
The second batch of the transactional ref update series.

* rs/ref-transaction-1: (22 commits)
  update-ref --stdin: pass transaction around explicitly
  update-ref --stdin: narrow scope of err strbuf
  refs.c: make delete_ref use a transaction
  refs.c: make prune_ref use a transaction to delete the ref
  refs.c: remove lock_ref_sha1
  refs.c: remove the update_ref_write function
  refs.c: remove the update_ref_lock function
  refs.c: make lock_ref_sha1 static
  walker.c: use ref transaction for ref updates
  fast-import.c: use a ref transaction when dumping tags
  receive-pack.c: use a reference transaction for updating the refs
  refs.c: change update_ref to use a transaction
  branch.c: use ref transaction for all ref updates
  fast-import.c: change update_branch to use ref transactions
  sequencer.c: use ref transactions for all ref updates
  commit.c: use ref transactions for updates
  replace.c: use the ref transaction functions for updates
  tag.c: use ref transactions when doing updates
  refs.c: add transaction.status and track OPEN/CLOSED
  refs.c: make ref_transaction_begin take an err argument
  ...
2014-09-11 10:33:31 -07:00
Jeff King 9540ce5030 refs: write packed_refs file using stdio
We write each line of a new packed-refs file individually
using a write() syscall (and sometimes 2, if the ref is
peeled). Since each line is only about 50-100 bytes long,
this creates a lot of system call overhead.

We can instead open a stdio handle around our descriptor and
use fprintf to write to it. The extra buffering is not a
problem for us, because nobody will read our new packed-refs
file until we call commit_lock_file (by which point we have
flushed everything).

On a pathological repository with 8.5 million refs, this
dropped the time to run `git pack-refs` from 20s to 6s.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-10 10:58:32 -07:00
Ronnie Sahlberg 7521cc4611 refs.c: make delete_ref use a transaction
Change delete_ref to use a ref transaction for the deletion. At the same time
since we no longer have any callers of repack_without_ref we can now delete
this function.

Change delete_ref to return 0 on success and 1 on failure instead of the
previous 0 on success either 1 or -1 on failure.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:18 -07:00
Ronnie Sahlberg 029cdb4ab2 refs.c: make prune_ref use a transaction to delete the ref
Change prune_ref to delete the ref using a ref transaction. To do this we also
need to add a new flag REF_ISPRUNING that will tell the transaction that we
do not want to delete this ref from the packed refs. This flag is private to
refs.c and not exposed to external callers.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:18 -07:00
Ronnie Sahlberg cba12021c3 refs.c: remove lock_ref_sha1
lock_ref_sha1 was only called from one place in refs.c and only provided
a check that the refname was sane before adding back the initial "refs/"
part of the ref path name, the initial "refs/" that this caller had already
stripped off before calling lock_ref_sha1.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:17 -07:00
Ronnie Sahlberg 04ad6223ec refs.c: remove the update_ref_write function
Since we only call update_ref_write from a single place and we only call it
with onerr==QUIET_ON_ERR we can just as well get rid of it and just call
write_ref_sha1 directly. This changes the return status for _commit from
1 to -1 on failures when writing to the ref. Eventually we will want
_commit to start returning more detailed error conditions than the current
simple success/failure.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:17 -07:00
Ronnie Sahlberg 45421e24e8 refs.c: remove the update_ref_lock function
Since we now only call update_ref_lock with onerr==QUIET_ON_ERR we no longer
need this function and can replace it with just calling lock_any_ref_for_update
directly.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:16 -07:00
Ronnie Sahlberg 88b680ae8d refs.c: make lock_ref_sha1 static
No external callers reference lock_ref_sha1 any more so let's declare it
static.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:15 -07:00
Ronnie Sahlberg b4d75ac1d1 refs.c: change update_ref to use a transaction
Change the update_ref helper function to use a ref transaction internally.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:13 -07:00
Ronnie Sahlberg 2bdc785fd7 refs.c: add transaction.status and track OPEN/CLOSED
Track the state of a transaction in a new state field. Check the field for
sanity, i.e. that state must be OPEN when _commit/_create/_delete or
_update is called or else die(BUG:...)

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:09 -07:00
Ronnie Sahlberg 93a644ea9d refs.c: make ref_transaction_begin take an err argument
Add an err argument to _begin so that on non-fatal failures in future ref
backends we can report a nice error back to the caller.
While _begin can currently never fail for other reasons than OOM, in which
case we die() anyway, we may add other types of backends in the future.
For example, a hypothetical MySQL backend could fail in _begin with
"Can not connect to MySQL server. No route to host".

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:08 -07:00
Ronnie Sahlberg 8c8bdc0d35 refs.c: update ref_transaction_delete to check for error and return status
Change ref_transaction_delete() to do basic error checking and return
non-zero on error. Update all callers to check the return for
ref_transaction_delete(). There are currently no conditions in _delete that
will return error but there will be in the future. Add an err argument that
will be updated on failure.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:08 -07:00
Ronnie Sahlberg b416af5bcd refs.c: change ref_transaction_create to do error checking and return status
Do basic error checking in ref_transaction_create() and make it return
non-zero on error. Update all callers to check the result of
ref_transaction_create(). There are currently no conditions in _create that
will return error but there will be in the future. Add an err argument that
will be updated on failure.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-09-03 10:04:07 -07:00
Jeff King afd11d3ebc pack-refs: prune top-level refs like "refs/foo"
After we have packed all refs, we prune any loose refs that
correspond to what we packed. We do so by first taking a
lock with lock_ref_sha1, and then deleting the loose ref
file.

However, lock_ref_sha1 will refuse to take a lock on any
refs that exist at the top-level of the "refs/" directory,
and we skip pruning the ref.  This is almost certainly not
what we want to happen here. The criteria to be pruned
should not differ from that to be packed; if a ref makes it
to prune_ref, it's because we want it both packed and
pruned (if there are refs you do not want to be packed, they
should be omitted much earlier by pack_ref_is_possible,
which we do in this case if --all is not given).

We can fix this by switching to lock_any_ref_for_update.
This behaves exactly the same with the exception of this
top-level check.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-25 12:19:50 -07:00
Junio C Hamano 5e6502288d Revert "Merge branch 'dt/refs-check-refname-component-sse'"
This reverts commit 6f92e5ff3c, reversing
changes made to a02ad882a1.
2014-07-28 10:41:53 -07:00
Junio C Hamano dad2e7f4bf Revert "Merge branch 'dt/refs-check-refname-component-sse-fix'"
This reverts commit 779c99fd68, reversing
changes made to df4d7d5646.
2014-07-28 10:41:16 -07:00
Jeff King c4ad00f8cc add object_as_type helper for casting objects
When we call lookup_commit, lookup_tree, etc, the logic goes
something like:

  1. Look for an existing object struct. If we don't have
     one, allocate and return a new one.

  2. Double check that any object we have is the expected
     type (and complain and return NULL otherwise).

  3. Convert an object with type OBJ_NONE (from a prior
     call to lookup_unknown_object) to the expected type.

We can encapsulate steps 2 and 3 in a helper function which
checks whether we have the expected object type, converts
OBJ_NONE as appropriate, and returns the object.

Not only does this shorten the code, but it also provides
one central location for converting OBJ_NONE objects into
objects of other types. Future patches will use that to
enforce type-specific invariants.

Since this is a refactoring, we would want it to behave
exactly as the current code. It takes a little reasoning to
see that this is the case:

  - for lookup_{commit,tree,etc} functions, we are just
    pulling steps 2 and 3 into a function that does the same
    thing.

  - for the call in peel_object, we currently only do step 3
    (but we want to consolidate it with the others, as
    mentioned above). However, step 2 is a noop here, as the
    surrounding conditional makes sure we have OBJ_NONE
    (which we want to keep to avoid an extraneous call to
    sha1_object_info).

  - for the call in lookup_commit_reference_gently, we are
    currently doing step 2 but not step 3. However, step 3
    is a noop here. The object we got will have just come
    from deref_tag, which must have figured out the type for
    each object in order to know when to stop peeling.
    Therefore the type will never be OBJ_NONE.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-28 10:14:33 -07:00
Junio C Hamano 10b944b37b Merge branch 'jk/alloc-commit-id'
Make sure all in-core commit objects are assigned a unique number
so that they can be annotated using the commit-slab API.

* jk/alloc-commit-id:
  diff-tree: avoid lookup_unknown_object
  object_as_type: set commit index
  alloc: factor out commit index
  add object_as_type helper for casting objects
  parse_object_buffer: do not set object type
  move setting of object->type to alloc_* functions
  alloc: write out allocator definitions
  alloc.c: remove the alloc_raw_commit_node() function
2014-07-22 10:59:25 -07:00
Junio C Hamano 528396a463 Merge branch 'rs/unify-is-branch'
* rs/unify-is-branch:
  refs.c: add a public is_branch function
2014-07-21 11:18:57 -07:00
Junio C Hamano 19a249ba83 Merge branch 'rs/ref-transaction-0'
Early part of the "ref transaction" topic.

* rs/ref-transaction-0:
  refs.c: change ref_transaction_update() to do error checking and return status
  refs.c: remove the onerr argument to ref_transaction_commit
  update-ref: use err argument to get error from ref_transaction_commit
  refs.c: make update_ref_write update a strbuf on failure
  refs.c: make ref_update_reject_duplicates take a strbuf argument for errors
  refs.c: log_ref_write should try to return meaningful errno
  refs.c: make resolve_ref_unsafe set errno to something meaningful on error
  refs.c: commit_packed_refs to return a meaningful errno on failure
  refs.c: make remove_empty_directories always set errno to something sane
  refs.c: verify_lock should set errno to something meaningful
  refs.c: make sure log_ref_setup returns a meaningful errno
  refs.c: add an err argument to repack_without_refs
  lockfile.c: make lock_file return a meaningful errno on failurei
  lockfile.c: add a new public function unable_to_lock_message
  refs.c: add a strbuf argument to ref_transaction_commit for error logging
  refs.c: allow passing NULL to ref_transaction_free
  refs.c: constify the sha arguments for ref_transaction_create|delete|update
  refs.c: ref_transaction_commit should not free the transaction
  refs.c: remove ref_transaction_rollback
2014-07-21 11:18:37 -07:00
Ronnie Sahlberg e7e0f26eb6 refs.c: add a public is_branch function
Both refs.c and fsck.c have their own private copies of the is_branch function.
Delete the is_branch function from fsck.c and make the version in refs.c
public.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-16 13:06:41 -07:00
Junio C Hamano 6e4094731a Merge branch 'jk/strip-suffix'
* jk/strip-suffix:
  prepare_packed_git_one: refactor duplicate-pack check
  verify-pack: use strbuf_strip_suffix
  strbuf: implement strbuf_strip_suffix
  index-pack: use strip_suffix to avoid magic numbers
  use strip_suffix instead of ends_with in simple cases
  replace has_extension with ends_with
  implement ends_with via strip_suffix
  add strip_suffix function
  sha1_file: replace PATH_MAX buffer with strbuf in prepare_packed_git_one()
2014-07-16 11:26:00 -07:00
Ronnie Sahlberg 8e34800e5b refs.c: change ref_transaction_update() to do error checking and return status
Update ref_transaction_update() do some basic error checking and return
non-zero on error. Update all callers to check ref_transaction_update() for
error. There are currently no conditions in _update that will return error but
there will be in the future. Add an err argument that will be updated on
failure. In future patches we will start doing both locking and checking
for name conflicts in _update instead of _commit at which time this function
will start returning errors for these conditions.

Also check for BUGs during update and die(BUG:...) if we are calling
_update with have_old but the old_sha1 pointer is NULL.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:42 -07:00
Ronnie Sahlberg 01319837c5 refs.c: remove the onerr argument to ref_transaction_commit
Since all callers now use QUIET_ON_ERR we no longer need to provide an onerr
argument any more. Remove the onerr argument from the ref_transaction_commit
signature.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:42 -07:00
Ronnie Sahlberg c1703d7634 refs.c: make update_ref_write update a strbuf on failure
Change update_ref_write to also update an error strbuf on failure.
This makes the error available to ref_transaction_commit callers if the
transaction failed due to update_ref_sha1/write_ref_sha1 failures.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:42 -07:00
Ronnie Sahlberg 038d005129 refs.c: make ref_update_reject_duplicates take a strbuf argument for errors
Make ref_update_reject_duplicates return any error that occurs through a
new strbuf argument. This means that when a transaction commit fails in
this function we will now be able to pass a helpful error message back to the
caller.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:42 -07:00
Ronnie Sahlberg dc615de861 refs.c: log_ref_write should try to return meaningful errno
Making errno from write_ref_sha1() meaningful, which should fix

* a bug in "git checkout -b" where it prints strerror(errno)
  despite errno possibly being zero or clobbered

* a bug in "git fetch"'s s_update_ref, which trusts the result of an
  errno == ENOTDIR check to detect D/F conflicts

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:42 -07:00
Ronnie Sahlberg 76d70dc0c6 refs.c: make resolve_ref_unsafe set errno to something meaningful on error
Making errno when returning from resolve_ref_unsafe() meaningful,
which should fix

 * a bug in lock_ref_sha1_basic, where it assumes EISDIR
   means it failed due to a directory being in the way

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:42 -07:00
Ronnie Sahlberg d3f6655505 refs.c: commit_packed_refs to return a meaningful errno on failure
Making errno when returning from commit_packed_refs() meaningful,
which should fix

 * a bug in "git clone" where it prints strerror(errno) based on
   errno, despite errno possibly being zero and potentially having
   been clobbered by that point
 * the same kind of bug in "git pack-refs"

and prepares for repack_without_refs() to get a meaningful
error message when commit_packed_refs() fails without falling into
the same bug.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:41 -07:00
Ronnie Sahlberg 470a91ef75 refs.c: make remove_empty_directories always set errno to something sane
Making errno when returning from remove_empty_directories() more
obviously meaningful, which should provide some peace of mind for
people auditing lock_ref_sha1_basic.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:41 -07:00
Ronnie Sahlberg 835e3c992f refs.c: verify_lock should set errno to something meaningful
Making errno when returning from verify_lock() meaningful, which
should almost but not completely fix

 * a bug in "git fetch"'s s_update_ref, which trusts the result of an
   errno == ENOTDIR check to detect D/F conflicts

ENOTDIR makes sense as a sign that a file was in the way of a
directory we wanted to create.  Should "git fetch" also look for
ENOTEMPTY or EEXIST to catch cases where a directory was in the way
of a file to be created?

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:41 -07:00
Ronnie Sahlberg bd3b02daec refs.c: make sure log_ref_setup returns a meaningful errno
Making errno when returning from log_ref_setup() meaningful,

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:41 -07:00
Ronnie Sahlberg 60bca085c8 refs.c: add an err argument to repack_without_refs
Update repack_without_refs to take an err argument and update it if there
is a failure. Pass the err variable from ref_transaction_commit to this
function so that callers can print a meaningful error message if _commit
fails due to this function.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:41 -07:00
Ronnie Sahlberg 447ff1bf0a lockfile.c: make lock_file return a meaningful errno on failurei
Making errno when returning from lock_file() meaningful, which should
fix

 * an existing almost-bug in lock_ref_sha1_basic where it assumes
   errno==ENOENT is meaningful and could waste some work on retries

 * an existing bug in repack_without_refs where it prints
   strerror(errno) and picks advice based on errno, despite errno
   potentially being zero and potentially having been clobbered by
   that point

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:41 -07:00
Ronnie Sahlberg 995f8746bc refs.c: add a strbuf argument to ref_transaction_commit for error logging
Add a strbuf argument to _commit so that we can pass an error string back to
the caller. So that we can do error logging from the caller instead of from
_commit.

Longer term plan is to first convert all callers to use onerr==QUIET_ON_ERR
and craft any log messages from the callers themselves and finally remove the
onerr argument completely.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:40 -07:00
Ronnie Sahlberg 1b07255c95 refs.c: allow passing NULL to ref_transaction_free
Allow ref_transaction_free(NULL) as a no-op. This makes ref_transaction_free
easier to use and more similar to plain 'free'.

In particular, it lets us rollback unconditionally as part of cleanup code
after setting 'transaction = NULL' if a transaction has been committed or
rolled back already.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:40 -07:00
Ronnie Sahlberg f1c9350ad7 refs.c: constify the sha arguments for ref_transaction_create|delete|update
ref_transaction_create|delete|update has no need to modify the sha1
arguments passed to it so it should use const unsigned char* instead
of unsigned char*.

Some functions, such as fast_forward_to(), already have its old/new
sha1 arguments as consts. This function will at some point need to
use ref_transaction_update() in which case this change is required.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:40 -07:00
Ronnie Sahlberg 33f9fc5932 refs.c: ref_transaction_commit should not free the transaction
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:40 -07:00
Ronnie Sahlberg 026bd1d3e2 refs.c: remove ref_transaction_rollback
We do not yet need both a rollback and a free function for transactions.
Remove ref_transaction_rollback and use ref_transaction_free instead.

At a later stage we may reintroduce a rollback function if we want to start
adding reusable transactions and similar.

Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
2014-07-14 11:54:40 -07:00
Jeff King 8ff226a9d5 add object_as_type helper for casting objects
When we call lookup_commit, lookup_tree, etc, the logic goes
something like:

  1. Look for an existing object struct. If we don't have
     one, allocate and return a new one.

  2. Double check that any object we have is the expected
     type (and complain and return NULL otherwise).

  3. Convert an object with type OBJ_NONE (from a prior
     call to lookup_unknown_object) to the expected type.

We can encapsulate steps 2 and 3 in a helper function which
checks whether we have the expected object type, converts
OBJ_NONE as appropriate, and returns the object.

Not only does this shorten the code, but it also provides
one central location for converting OBJ_NONE objects into
objects of other types. Future patches will use that to
enforce type-specific invariants.

Since this is a refactoring, we would want it to behave
exactly as the current code. It takes a little reasoning to
see that this is the case:

  - for lookup_{commit,tree,etc} functions, we are just
    pulling steps 2 and 3 into a function that does the same
    thing.

  - for the call in peel_object, we currently only do step 3
    (but we want to consolidate it with the others, as
    mentioned above). However, step 2 is a noop here, as the
    surrounding conditional makes sure we have OBJ_NONE
    (which we want to keep to avoid an extraneous call to
    sha1_object_info).

  - for the call in lookup_commit_reference_gently, we are
    currently doing step 2 but not step 3. However, step 3
    is a noop here. The object we got will have just come
    from deref_tag, which must have figured out the type for
    each object in order to know when to stop peeling.
    Therefore the type will never be OBJ_NONE.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-13 18:59:05 -07:00
Junio C Hamano 779c99fd68 Merge branch 'dt/refs-check-refname-component-sse-fix'
Fixes to a topic that is already in 'master'.

* dt/refs-check-refname-component-sse-fix:
  refs: fix valgrind suppression file
  refs.c: handle REFNAME_REFSPEC_PATTERN at end of page
2014-07-10 11:27:55 -07:00
David Turner 6d17dc1dd3 refs.c: handle REFNAME_REFSPEC_PATTERN at end of page
When a ref crosses a memory page boundary, we restart the parsing
at the beginning with the bytewise code.  Pass the original flags
to that code, rather than the current flags.

Reported-By: Øyvind A. Holm <sunny@sunbase.org>
Signed-off-by: David Turner <dturner@twitter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-07-07 11:05:43 -07:00
Junio C Hamano 6f92e5ff3c Merge branch 'dt/refs-check-refname-component-sse'
Further micro-optimization of a leaf-function.

* dt/refs-check-refname-component-sse:
  refs.c: SSE2 optimizations for check_refname_component
2014-07-02 12:53:07 -07:00
Jeff King 2975c770ca replace has_extension with ends_with
These two are almost the same function, with the exception
that has_extension only matches if there is content before
the suffix. So ends_with(".exe", ".exe") is true, but
has_extension would not be.

This distinction does not matter to any of the callers,
though, and we can just replace uses of has_extension with
ends_with. We prefer the "ends_with" name because it is more
generic, and there is nothing about the function that
requires it to be used for file extensions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-30 13:43:16 -07:00
David Turner 745224e04a refs.c: SSE2 optimizations for check_refname_component
Optimize check_refname_component using SSE2 on x86_64.

git rev-parse HEAD is a good test-case for this, since it does almost
nothing except parse refs.  For one particular repo with about 60k
refs, almost all packed, the timings are:

Look up table: 29 ms
SSE2:          23 ms

This cuts about 20% off of the runtime.

Ondřej Bílka <neleai@seznam.cz> suggested an SSE2 approach to the
substring searches, which netted a speed boost over the SSE4.2 code I
had initially written.

Signed-off-by: David Turner <dturner@twitter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-18 10:57:18 -07:00
Junio C Hamano ae7dd1a492 Merge branch 'dt/refs-check-refname-component-optim'
* dt/refs-check-refname-component-optim:
  refs.c: optimize check_refname_component()
2014-06-16 12:18:52 -07:00
Junio C Hamano bb0ced7581 Merge branch 'rs/read-ref-at'
* rs/read-ref-at:
  refs.c: change read_ref_at to use the reflog iterators
2014-06-16 12:18:48 -07:00
Junio C Hamano 474df928b1 Merge branch 'jl/remote-rm-prune'
"git remote rm" and "git remote prune" can involve removing many
refs at once, which is not a very efficient thing to do when very
many refs exist in the packed-refs file.

* jl/remote-rm-prune:
  remote prune: optimize "dangling symref" check/warning
  remote: repack packed-refs once when deleting multiple refs
  remote rm: delete remote configuration as the last
2014-06-16 12:17:58 -07:00
Junio C Hamano f7f349e138 Merge branch 'rs/reflog-exists'
* rs/reflog-exists:
  checkout.c: use ref_exists instead of file_exist
  refs.c: add new functions reflog_exists and delete_reflog
2014-06-06 11:23:04 -07:00
David Turner dde8a902c7 refs.c: optimize check_refname_component()
In a repository with many refs, check_refname_component can be a major
contributor to the runtime of some git commands. One such command is

git rev-parse HEAD

Timings for one particular repo, with about 60k refs, almost all
packed, are:

Old: 35 ms
New: 29 ms

Many other commands which read refs are also sped up.

Signed-off-by: David Turner <dturner@twitter.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-05 15:24:50 -07:00
Ronnie Sahlberg 4207ed285f refs.c: change read_ref_at to use the reflog iterators
read_ref_at has its own parsing of the reflog file for no really good reason
so lets change this to use the existing reflog iterators. This removes one
instance where we manually unmarshall the reflog file format.

Remove the now redundant ref_msg function.

Log messages for errors are changed slightly. We no longer print the file
name for the reflog, instead we refer to it as 'Log for ref <refname>'.
This might be a minor useability regression, but I don't really think so, since
experienced users would know where the log is anyway and inexperienced users
would not know what to do about/how to repair 'Log ... has gap ...' anyway.

Adapt the t1400 test to handle the change in log messages.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-06-03 11:09:32 -07:00
Jens Lindström e6bea66db6 remote prune: optimize "dangling symref" check/warning
When 'git remote prune' was used to delete many refs in a repository
with many refs, a lot of time was spent checking for (now) dangling
symbolic refs pointing to the deleted ref, since warn_dangling_symref()
was once per deleted ref to check all other refs in the repository.

Avoid this using the new warn_dangling_symrefs() function which
makes one pass over all refs and checks for all the deleted refs in
one go, after they have all been deleted.

Signed-off-by: Jens Lindström <jl@opera.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27 12:30:47 -07:00
Jens Lindström c9e768bb77 remote: repack packed-refs once when deleting multiple refs
When 'git remote rm' or 'git remote prune' were used in a repository
with many refs, and needed to delete many remote-tracking refs, a lot
of time was spent deleting those refs since for each deleted ref,
repack_without_refs() was called to rewrite packed-refs without just
that deleted ref.

To avoid this, call repack_without_refs() first to repack without all
the refs that will be deleted, before calling delete_ref() to delete
each one completely.  The call to repack_without_ref() in delete_ref()
then becomes a no-op, since packed-refs already won't contain any of
the deleted refs.

Signed-off-by: Jens Lindström <jl@opera.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-27 12:30:42 -07:00
Ronnie Sahlberg 4da588357a refs.c: add new functions reflog_exists and delete_reflog
Add two new functions, reflog_exists and delete_reflog, to hide the internal
reflog implementation (that they are files under .git/logs/...) from callers.
Update checkout.c to use these functions in update_refs_for_switch instead of
building pathnames and calling out to file access functions. Update reflog.c
to use these to check if the reflog exists. Now there are still many places
in reflog.c where we are still leaking the reflog storage implementation but
this at least reduces the number of such dependencies by one. Finally
change two places in refs.c itself to use the new function to check if a ref
exists or not isntead of build-path-and-stat(). Now, this is strictly not all
that important since these are in parts of refs that are implementing the
actual file storage backend but on the other hand it will not hurt either.

Signed-off-by: Ronnie Sahlberg <sahlberg@google.com>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-05-08 14:31:43 -07:00
Michael Haggerty 6a402338ec ref_transaction_commit(): work with transaction->updates in place
Now that we free the transaction when we are done, there is no need to
make a copy of transaction->updates before working with it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 12:09:16 -07:00
Michael Haggerty 84178db76f struct ref_update: add a type field
It used to be that ref_transaction_commit() allocated a temporary
array to hold the types of references while it is working.  Instead,
add a type field to ref_update that ref_transaction_commit() can use
as its scratch space.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 12:09:15 -07:00
Michael Haggerty 81c960e4dc struct ref_update: add a lock field
Now that we manage ref_update objects internally, we can use them to
hold some of the scratch space we need when actually carrying out the
updates.  Store the (struct ref_lock *) there.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 12:09:15 -07:00
Michael Haggerty cb198d21d3 ref_transaction_commit(): simplify code using temporary variables
Use temporary variables in the for-loop blocks to simplify expressions
in the rest of the loop.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-04-07 12:09:15 -07:00