Commit graph

319 commits

Author SHA1 Message Date
Junio C Hamano
4e40fb302e Merge branch 'mh/ref-locking-fix'
Transactions to update multiple references that involves a deletion
was quite broken in an error codepath and did not abort everything
correctly.

* mh/ref-locking-fix:
  files_transaction_prepare(): fix handling of ref lock failure
  t1404: add a bunch of tests of D/F conflicts
2017-10-26 12:29:23 +09:00
Michael Haggerty
da5267f1b6 files_transaction_prepare(): fix handling of ref lock failure
Since dc39e09942 (files_ref_store: use a transaction to update packed
refs, 2017-09-08), failure to lock a reference has been handled
incorrectly by `files_transaction_prepare()`. If
`lock_ref_for_update()` fails in the lock-acquisition loop of that
function, it sets `ret` then breaks out of that loop. Prior to
dc39e09942, that was OK, because the only thing following the loop was
the cleanup code. But dc39e09942 added another blurb of code between
the loop and the cleanup. That blurb sometimes resets `ret` to zero,
making the cleanup code think that the locking was successful.

Specifically, whenever

* One or more reference deletions have been processed successfully in
  the lock-acquisition loop. (Processing the first such reference
  causes a packed-ref transaction to be initialized.)

* Then `lock_ref_for_update()` fails for a subsequent reference. Such
  a failure can happen for a number of reasons, such as the old SHA-1
  not being correct, lock contention, etc. This causes a `break` out
  of the lock-acquisition loop.

* The `packed-refs` lock is acquired successfully and
  `ref_transaction_prepare()` succeeds for the packed-ref transaction.
  This has the effect of resetting `ret` back to 0, and making the
  cleanup code think that lock acquisition was successful.

In that case, any reference updates that were processed prior to
breaking out of the loop would be carried out (loose and packed), but
the reference that couldn't be locked and any subsequent references
would silently be ignored.

This can easily cause data loss if, for example, the user was trying
to push a new name for an existing branch while deleting the old name.
After the push, the branch could be left unreachable, and could even
subsequently be garbage-collected.

This problem was noticed in the context of deleting one reference and
creating another in a single transaction, when the two references D/F
conflict with each other, like

    git update-ref --stdin <<EOF
    delete refs/foo
    create refs/foo/bar HEAD
    EOF

This triggers the above bug because the deletion is processed
successfully for `refs/foo`, then the D/F conflict causes
`lock_ref_for_update()` to fail when `refs/foo/bar` is processed. In
this case the transaction *should* fail, but instead it causes
`refs/foo` to be deleted without creating `refs/foo`. This could
easily result in data loss.

The fix is simple: instead of just breaking out of the loop, jump
directly to the cleanup code. This fixes some tests in t1404 that were
added in the previous commit.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-25 15:08:26 +09:00
Junio C Hamano
e46ebc2754 Merge branch 'rs/cleanup-strbuf-users'
Code clean-up.

* rs/cleanup-strbuf-users:
  graph: use strbuf_addchars() to add spaces
  use strbuf_addstr() for adding strings to strbufs
  path: use strbuf_add_real_path()
2017-10-05 13:48:19 +09:00
Junio C Hamano
efe9d6ce33 Merge branch 'rs/resolve-ref-optional-result'
Code clean-up.

* rs/resolve-ref-optional-result:
  refs: pass NULL to resolve_refdup() if hash is not needed
  refs: pass NULL to refs_resolve_refdup() if hash is not needed
2017-10-05 13:48:19 +09:00
Junio C Hamano
1a2e1a76ec Merge branch 'mh/mmap-packed-refs'
Operations that do not touch (majority of) packed refs have been
optimized by making accesses to packed-refs file lazy; we no longer
pre-parse everything, and an access to a single ref in the
packed-refs does not touch majority of irrelevant refs, either.

* mh/mmap-packed-refs: (21 commits)
  packed-backend.c: rename a bunch of things and update comments
  mmapped_ref_iterator: inline into `packed_ref_iterator`
  ref_cache: remove support for storing peeled values
  packed_ref_store: get rid of the `ref_cache` entirely
  ref_store: implement `refs_peel_ref()` generically
  packed_read_raw_ref(): read the reference from the mmapped buffer
  packed_ref_iterator_begin(): iterate using `mmapped_ref_iterator`
  read_packed_refs(): ensure that references are ordered when read
  packed_ref_cache: keep the `packed-refs` file mmapped if possible
  packed-backend.c: reorder some definitions
  mmapped_ref_iterator_advance(): no peeled value for broken refs
  mmapped_ref_iterator: add iterator over a packed-refs file
  packed_ref_cache: remember the file-wide peeling state
  read_packed_refs(): read references with minimal copying
  read_packed_refs(): make parsing of the header line more robust
  read_packed_refs(): only check for a header at the top of the file
  read_packed_refs(): use mmap to read the `packed-refs` file
  die_unterminated_line(), die_invalid_line(): new functions
  packed_ref_cache: add a backlink to the associated `packed_ref_store`
  prefix_ref_iterator: break when we leave the prefix
  ...
2017-10-03 15:42:50 +09:00
Junio C Hamano
cb1083ca23 Merge branch 'jk/read-in-full'
Code clean-up to prevent future mistakes by copying and pasting
code that checks the result of read_in_full() function.

* jk/read-in-full:
  worktree: check the result of read_in_full()
  worktree: use xsize_t to access file size
  distinguish error versus short read from read_in_full()
  avoid looking at errno for short read_in_full() returns
  prefer "!=" when checking read_in_full() result
  notes-merge: drop dead zero-write code
  files-backend: prefer "0" for write_in_full() error check
2017-10-03 15:42:49 +09:00
Junio C Hamano
3b48045c6c Merge branch 'sd/branch-copy'
"git branch" learned "-c/-C" to create a new branch by copying an
existing one.

* sd/branch-copy:
  branch: fix "copy" to never touch HEAD
  branch: add a --copy (-c) option to go with --move (-m)
  branch: add test for -m renaming multiple config sections
  config: create a function to format section headers
2017-10-03 15:42:48 +09:00
René Scharfe
72d4a9a721 use strbuf_addstr() for adding strings to strbufs
Use strbuf_addstr() instead of strbuf_addf() for adding strings.  That's
simpler and makes the intent clearer.

Patch generated by Coccinelle and contrib/coccinelle/strbuf.cocci;
adjusted indentation in refs/packed-backend.c manually.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-02 13:13:46 +09:00
René Scharfe
872ccb2c69 refs: pass NULL to refs_resolve_refdup() if hash is not needed
This gets us rid of a write-only variable.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-01 17:26:58 +09:00
Junio C Hamano
73ecdc606e Merge branch 'rs/resolve-ref-optional-result'
Code clean-up.

* rs/resolve-ref-optional-result:
  refs: pass NULL to resolve_ref_unsafe() if hash is not needed
  refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed
  refs: make sha1 output parameter of refs_resolve_ref_unsafe() optional
2017-09-28 14:47:56 +09:00
Jeff King
88780c37b3 files-backend: prefer "0" for write_in_full() error check
Commit 06f46f237a (avoid "write_in_full(fd, buf, len) !=
len" pattern, 2017-09-13) converted this callsite from:

  write_in_full(...) != 1

to

  write_in_full(...) < 0

But during the conflict resolution in c50424a6f0 (Merge
branch 'jk/write-in-full-fix', 2017-09-25), this morphed
into

  write_in_full(...) < 1

This behaves as we want, but we prefer to avoid modeling the
"less than length" error-check which can be subtly buggy, as
shown in efacf609c8 (config: avoid "write_in_full(fd, buf,
len) < len" pattern, 2017-09-13).

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-26 12:54:43 +09:00
Michael Haggerty
cff28ca94c packed-backend.c: rename a bunch of things and update comments
We've made huge changes to this file, and some of the old names and
comments are no longer very fitting. So rename a bunch of things:

* `struct packed_ref_cache` → `struct snapshot`
* `acquire_packed_ref_cache()` → `acquire_snapshot()`
* `release_packed_ref_buffer()` → `clear_snapshot_buffer()`
* `release_packed_ref_cache()` → `release_snapshot()`
* `clear_packed_ref_cache()` → `clear_snapshot()`
* `struct packed_ref_entry` → `struct snapshot_record`
* `cmp_packed_ref_entries()` → `cmp_packed_ref_records()`
* `cmp_entry_to_refname()` → `cmp_record_to_refname()`
* `sort_packed_refs()` → `sort_snapshot()`
* `read_packed_refs()` → `create_snapshot()`
* `validate_packed_ref_cache()` → `validate_snapshot()`
* `get_packed_ref_cache()` → `get_snapshot()`
* Renamed local variables and struct members accordingly.

Also update a bunch of comments to reflect the renaming and the
accumulated changes that the code has undergone.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:46 +09:00
Michael Haggerty
523ee2d785 mmapped_ref_iterator: inline into packed_ref_iterator
Since `packed_ref_iterator` is now delegating to
`mmapped_ref_iterator` rather than `cache_ref_iterator` to do the
heavy lifting, there is no need to keep the two iterators separate. So
"inline" `mmapped_ref_iterator` into `packed_ref_iterator`. This
removes a bunch of boilerplate.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:46 +09:00
Michael Haggerty
a6e19bcdad ref_cache: remove support for storing peeled values
Now that the `packed-refs` backend doesn't use `ref_cache`, there is
nobody left who might want to store peeled values of references in
`ref_cache`. So remove that feature.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:46 +09:00
Michael Haggerty
9dd389f3d8 packed_ref_store: get rid of the ref_cache entirely
Now that everything has been changed to read what it needs directly
out of the `packed-refs` file, `packed_ref_store` doesn't need to
maintain a `ref_cache` at all. So get rid of it.

First of all, this will save a lot of memory and lots of little
allocations. Instead of needing to store complicated parsed data
structures in memory, we just mmap the file (potentially sharing
memory with other processes) and parse only what we need.

Moreover, since the mmapped access to the file reads only the parts of
the file that it needs, this might save reading all of the data from
disk at all (at least if the file starts out sorted).

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:46 +09:00
Michael Haggerty
ba1c052fa6 ref_store: implement refs_peel_ref() generically
We're about to stop storing packed refs in a `ref_cache`. That means
that the only way we have left to optimize `peel_ref()` is by checking
whether the reference being peeled is the one currently being iterated
over (in `current_ref_iter`), and if so, using `ref_iterator_peel()`.
But this can be done generically; it doesn't have to be implemented
per-backend.

So implement `refs_peel_ref()` in `refs.c` and remove the `peel_ref()`
method from the refs API.

This removes the last callers of a couple of functions, so delete
them. More cleanup to come...

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:46 +09:00
Michael Haggerty
f3987ab36d packed_read_raw_ref(): read the reference from the mmapped buffer
Instead of reading the reference from the `ref_cache`, read it
directly from the mmapped buffer.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
d1cf15516f packed_ref_iterator_begin(): iterate using mmapped_ref_iterator
Now that we have an efficient way to iterate, in order, over the
mmapped contents of the `packed-refs` file, we can use that directly
to implement reference iteration for the `packed_ref_store`, rather
than iterating over the `ref_cache`. This is the next step towards
getting rid of the `ref_cache` entirely.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
02b920f3f7 read_packed_refs(): ensure that references are ordered when read
It doesn't actually matter now, because the references are only
iterated over to fill the associated `ref_cache`, which itself puts
them in the correct order. But we want to get rid of the `ref_cache`,
so we want to be able to iterate directly over the `packed-refs`
buffer, and then the iteration will need to be ordered correctly.

In fact, we already write the `packed-refs` file sorted, but it is
possible that other Git clients don't get it right. So let's not
assume that a `packed-refs` file is sorted unless it is explicitly
declared to be so via a `sorted` trait in its header line.

If it is *not* declared to be sorted, then scan quickly through the
file to check. If it is found to be out of order, then sort the
records into a new memory-only copy. This checking and sorting is done
quickly, without parsing the full file contents. However, it needs a
little bit of care to avoid reading past the end of the buffer even if
the `packed-refs` file is corrupt.

Since *we* always write the file correctly sorted, include that trait
when we write or rewrite a `packed-refs` file. This means that the
scan described in the previous paragraph should only have to be done
for `packed-refs` files that were written by older versions of the Git
command-line client, or by other clients that haven't yet learned to
write the `sorted` trait.

If `packed-refs` was already sorted, then (if the system allows it) we
can use the mmapped file contents directly. But if the system doesn't
allow a file that is currently mmapped to be replaced using
`rename()`, then it would be bad for us to keep the file mmapped for
any longer than necessary. So, on such systems, always make a copy of
the file contents, either as part of the sorting process, or
afterwards.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
5b633610ec packed_ref_cache: keep the packed-refs file mmapped if possible
Keep a copy of the `packed-refs` file contents in memory for as long
as a `packed_ref_cache` object is in use:

* If the system allows it, keep the `packed-refs` file mmapped.

* If not (either because the system doesn't support `mmap()` at all,
  or because a file that is currently mmapped cannot be replaced via
  `rename()`), then make a copy of the file's contents in
  heap-allocated space, and keep that around instead.

We base the choice of behavior on a new build-time switch,
`MMAP_PREVENTS_DELETE`. By default, this switch is set for Windows
variants.

After this commit, `MMAP_NONE` and `MMAP_TEMPORARY` are still handled
identically. But the next commit will introduce a difference.

This whole change is still pointless, because we only read the
`packed-refs` file contents immediately after instantiating the
`packed_ref_cache`. But that will soon change.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
14b3c344ea packed-backend.c: reorder some definitions
No code has been changed. This will make subsequent patches more
self-contained.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
81b9b5aea7 mmapped_ref_iterator_advance(): no peeled value for broken refs
If a reference is broken, suppress its peeled value.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
9cfb3dc0d1 mmapped_ref_iterator: add iterator over a packed-refs file
Add a new `mmapped_ref_iterator`, which can iterate over the
references in an mmapped `packed-refs` file directly. Use this
iterator from `read_packed_refs()` to fill the packed refs cache.

Note that we are not yet willing to promise that the new iterator
generates its output in order. That doesn't matter for now, because
the packed refs cache doesn't care what order it is filled.

This change adds a lot of boilerplate without providing any obvious
benefits. The benefits will come soon, when we get rid of the
`ref_cache` for packed references altogether.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
daa45408c1 packed_ref_cache: remember the file-wide peeling state
Rather than store the peeling state (i.e., the one defined by traits
in the `packed-refs` file header line) in a local variable in
`read_packed_refs()`, store it permanently in `packed_ref_cache`. This
will be needed when we stop reading all packed refs at once.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty
6a9bc4034a read_packed_refs(): read references with minimal copying
Instead of copying data from the `packed-refs` file one line at time
and then processing it, process the data in place as much as possible.

Also, instead of processing one line per iteration of the main loop,
process a reference line plus its corresponding peeled line (if
present) together.

Note that this change slightly tightens up the parsing of the
`packed-refs` file. Previously, the parser would have accepted
multiple "peeled" lines for a single reference (ignoring all but the
last one). Now it would reject that.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Junio C Hamano
c50424a6f0 Merge branch 'jk/write-in-full-fix'
Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.

* jk/write-in-full-fix:
  read_pack_header: handle signed/unsigned comparison in read result
  config: flip return value of store_write_*()
  notes-merge: use ssize_t for write_in_full() return value
  pkt-line: check write_in_full() errors against "< 0"
  convert less-trivial versions of "write_in_full() != len"
  avoid "write_in_full(fd, buf, len) != len" pattern
  get-tar-commit-id: check write_in_full() return against 0
  config: avoid "write_in_full(fd, buf, len) < len" pattern
2017-09-25 15:24:06 +09:00
René Scharfe
e691b027b6 refs: pass NULL to refs_resolve_ref_unsafe() if hash is not needed
This allows us to get rid of two write-only variables, one of them
being a SHA1 buffer.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-24 10:18:18 +09:00
Junio C Hamano
07f0542da3 Merge branch 'mh/packed-ref-transactions'
Implement transactional update to the packed-ref representation of
references.

* mh/packed-ref-transactions:
  files_transaction_finish(): delete reflogs before references
  packed-backend: rip out some now-unused code
  files_ref_store: use a transaction to update packed refs
  t1404: demonstrate two problems with reference transactions
  files_initial_transaction_commit(): use a transaction for packed refs
  prune_refs(): also free the linked list
  files_pack_refs(): use a reference transaction to write packed refs
  packed_delete_refs(): implement method
  packed_ref_store: implement reference transactions
  struct ref_transaction: add a place for backends to store data
  packed-backend: don't adjust the reference count on lock/unlock
2017-09-19 10:47:56 +09:00
Junio C Hamano
89563ec379 Merge branch 'jk/incore-lockfile-removal'
The long-standing rule that an in-core lockfile instance, once it
is used, must not be freed, has been lifted and the lockfile and
tempfile APIs have been updated to reduce the chance of programming
errors.

* jk/incore-lockfile-removal:
  stop leaking lock structs in some simple cases
  ref_lock: stop leaking lock_files
  lockfile: update lifetime requirements in documentation
  tempfile: auto-allocate tempfiles on heap
  tempfile: remove deactivated list entries
  tempfile: use list.h for linked list
  tempfile: release deactivated strbufs instead of resetting
  tempfile: robustify cleanup handler
  tempfile: factor out deactivation
  tempfile: factor out activation
  tempfile: replace die("BUG") with BUG()
  tempfile: handle NULL tempfile pointers gracefully
  tempfile: prefer is_tempfile_active to bare access
  lockfile: do not rollback lock on failed close
  tempfile: do not delete tempfile on failed close
  always check return value of close_tempfile
  verify_signed_buffer: prefer close_tempfile() to close()
  setup_temporary_shallow: move tempfile struct into function
  setup_temporary_shallow: avoid using inactive tempfile
  write_index_as_tree: cleanup tempfile on error
2017-09-19 10:47:53 +09:00
Junio C Hamano
8a044c7f1d Merge branch 'nd/prune-in-worktree'
"git gc" and friends when multiple worktrees are used off of a
single repository did not consider the index and per-worktree refs
of other worktrees as the root for reachability traversal, making
objects that are in use only in other worktrees to be subject to
garbage collection.

* nd/prune-in-worktree:
  refs.c: reindent get_submodule_ref_store()
  refs.c: remove fallback-to-main-store code get_submodule_ref_store()
  rev-list: expose and document --single-worktree
  revision.c: --reflog add HEAD reflog from all worktrees
  files-backend: make reflog iterator go through per-worktree reflog
  revision.c: --all adds HEAD from all worktrees
  refs: remove dead for_each_*_submodule()
  refs.c: move for_each_remote_ref_submodule() to submodule.c
  revision.c: use refs_for_each*() instead of for_each_*_submodule()
  refs: add refs_head_ref()
  refs: move submodule slash stripping code to get_submodule_ref_store
  refs.c: refactor get_submodule_ref_store(), share common free block
  revision.c: --indexed-objects add objects from all worktrees
  revision.c: refactor add_index_objects_to_pending()
  refs.c: use is_dir_sep() in resolve_gitlink_ref()
  revision.h: new flag in struct rev_info wrt. worktree-related refs
2017-09-19 10:47:53 +09:00
Junio C Hamano
dafbe1993e Merge branch 'ma/split-symref-update-fix'
A leakfix.

* ma/split-symref-update-fix:
  refs/files-backend: add `refname`, not "HEAD", to list
  refs/files-backend: correct return value in lock_ref_for_update
  refs/files-backend: fix memory leak in lock_ref_for_update
  refs/files-backend: add longer-scoped copy of string to list
2017-09-19 10:47:53 +09:00
Michael Haggerty
a8811695e3 read_packed_refs(): make parsing of the header line more robust
The old code parsed the traits in the `packed-refs` header by looking
for the string " trait " (i.e., the name of the trait with a space on
either side) in the header line. This is fragile, because if any other
implementation of Git forgets to write the trailing space, the last
trait would silently be ignored (and the error might never be
noticed).

So instead, use `string_list_split_in_place()` to split the traits
into tokens then use `unsorted_string_list_has_string()` to look for
the tokens we are interested in. This means that we can read the
traits correctly even if the header line is missing a trailing
space (or indeed, if it is missing the space after the colon, or if it
has multiple spaces somewhere).

However, older Git clients (and perhaps other Git implementations)
still require the surrounding spaces, so we still have to output the
header with a trailing space.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
36f23534ae read_packed_refs(): only check for a header at the top of the file
This tightens up the parsing a bit; previously, stray header-looking
lines would have been processed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
49a03ef466 read_packed_refs(): use mmap to read the packed-refs file
It's still done in a pretty stupid way, involving more data copying
than necessary. That will improve in future commits.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
735267aa10 die_unterminated_line(), die_invalid_line(): new functions
Extract some helper functions for reporting errors. While we're at it,
prevent them from spewing unlimited output to the terminal. These
functions will soon have more callers.

These functions accept the problematic line as a `(ptr, len)` pair
rather than a NUL-terminated string, and `die_invalid_line()` checks
for an EOL itself, because these calling conventions will be
convenient for future callers. (Efficiency is not a concern here
because these functions are only ever called if the `packed-refs` file
is corrupt.)

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
f0a7dc86d2 packed_ref_cache: add a backlink to the associated packed_ref_store
It will prove convenient in upcoming patches.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Jeff King
157113c614 prefix_ref_iterator: break when we leave the prefix
If the underlying iterator is ordered, then `prefix_ref_iterator` can
stop as soon as it sees a refname that comes after the prefix. This
will rarely make a big difference now, because `ref_cache_iterator`
only iterates over the directory containing the prefix (and usually
the prefix will span a whole directory anyway). But if *hint, hint* a
future reference backend doesn't itself know where to stop the
iteration, then this optimization will be a big win.

Note that there is no guarantee that the underlying iterator doesn't
include output preceding the prefix, so we have to skip over any
unwanted references before we get to the ones that we want.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty
8738a8a4df ref_iterator: keep track of whether the iterator output is ordered
References are iterated over in order by refname, but reflogs are not.
Some consumers of reference iteration care about the difference. Teach
each `ref_iterator` to keep track of whether its output is ordered.

`overlay_ref_iterator` is one of the picky consumers. Add a sanity
check in `overlay_ref_iterator_begin()` to verify that its inputs are
ordered.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Jeff King
564bde9ae6 convert less-trivial versions of "write_in_full() != len"
The prior commit converted many sites to check the return
value of write_in_full() for negativity, rather than a
mismatch with the input length. This patch covers similar
cases, but where the return value is stored in an
intermediate variable. These should get the same treatment,
but they need to be reviewed more carefully since it would
be a bug if the return value is stored in an unsigned type
(which indeed, it is in one of the cases).

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Jeff King
06f46f237a avoid "write_in_full(fd, buf, len) != len" pattern
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).

So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:

  1. It can do a funny signed/unsigned comparison. If your
     "len" is signed (e.g., a size_t) then the compiler will
     promote the "-1" to its unsigned variant.

     This works out for "!= len" (unless you really were
     trying to write the maximum size_t bytes), but is a
     bug if you check "< len" (an example of which was fixed
     recently in config.c).

     We should avoid promoting the mental model that you
     need to check the length at all, so that new sites are
     not tempted to copy us.

  2. Checking for a negative value is shorter to type,
     especially when the length is an expression.

  3. Linus says so. In d34cf19b89 (Clean up write_in_full()
     users, 2007-01-11), right after the write_in_full()
     semantics were changed, he wrote:

       I really wish every "write_in_full()" user would just
       check against "<0" now, but this fixes the nasty and
       stupid ones.

     Appeals to authority aside, this makes it clear that
     writing it this way does not have an intentional
     benefit. It's a historical curiosity that we never
     bothered to clean up (and which was undoubtedly
     cargo-culted into new sites).

So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).

[1] A careful reader may notice there is one way that
    write_in_full() can return a different value. If we ask
    write() to write N bytes and get a return value that is
    _larger_ than N, we could return a larger total. But
    besides the fact that this would imply a totally broken
    version of write(), it would already invoke undefined
    behavior. Our internal remaining counter is an unsigned
    size_t, which means that subtracting too many byte will
    wrap it around to a very large number. So we'll instantly
    begin reading off the end of the buffer, trying to write
    gigabytes (or petabytes) of data.

Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:17:59 +09:00
Martin Ågren
276d0e35c0 refs/files-backend: add refname, not "HEAD", to list
An earlier patch rewrote `split_symref_update()` to add a copy of a
string to a string list instead of adding the original string. That was
so that the original string could be freed in a later patch, but it is
also conceptually cleaner, since now all calls to `string_list_insert()`
and `string_list_append()` add `update->refname`. --- Except a literal
"HEAD" is added in `split_head_update()`.

Restructure `split_head_update()` in the same way as the earlier patch
did for `split_symref_update()`. This does not correct any practical
problem, but makes things conceptually cleaner. The downside is a call
to `string_list_has_string()`, which should be relatively cheap.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10 16:36:58 +09:00
Martin Ågren
3f5ef95b5e refs/files-backend: correct return value in lock_ref_for_update
In one code path we return a literal -1 and not a symbolic constant. The
value -1 would be interpreted as TRANSACTION_NAME_CONFLICT, which is
wrong. Use TRANSACTION_GENERIC_ERROR instead (that is the only other
return value we have to choose from).

Noticed-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10 16:36:58 +09:00
Martin Ågren
851e1fbd01 refs/files-backend: fix memory leak in lock_ref_for_update
After the previous patch, none of the functions we call hold on to
`referent.buf`, so we can safely release the string buffer before
returning.

Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10 16:36:58 +09:00
Martin Ågren
c299468bd7 refs/files-backend: add longer-scoped copy of string to list
split_symref_update() receives a string-pointer `referent` and adds it
to the list of `affected_refnames`. The list simply holds on to the
pointers it is given, it does not copy the strings and it does not ever
free them. The `referent` string in split_symref_update() belongs to a
string buffer in the caller. After we return, the string will be leaked.

In the next patch, we want to properly release the string buffer in the
caller, but we can't safely do so until we've made sure that
`affected_refnames` will not be holding on to a pointer to the string.
We could configure the list to handle its own resources, but it would
mean some alloc/free-churning. The list is already handling other
strings (through other code paths) which we do not need to worry about,
and we'd be memory-churning those strings too, completely unnecessary.

Observe that split_symref_update() creates a `new_update`-object through
ref_transaction_add_update(), after which `new_update->refname` is a
copy of `referent`. The difference is, this copy will be freed, and it
will be freed *after* `affected_refnames` has been cleared.

Rearrange the handling of `referent`, so that we don't add it directly
to `affected_refnames`. Instead, first just check whether `referent`
exists in the string list, and later add `new_update->refname`.

Helped-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-10 16:36:58 +09:00
Michael Haggerty
5e00a6c873 files_transaction_finish(): delete reflogs before references
If the deletion steps unexpectedly fail, it is less bad to leave a
reference without its reflog than it is to leave a reflog without its
reference, since the latter is an invalid repository state.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00
Michael Haggerty
9939b33d6a packed-backend: rip out some now-unused code
Now the outside world interacts with the packed ref store only via the
generic refs API plus a few lock-related functions. This allows us to
delete some functions that are no longer used, thereby completing the
encapsulation of the packed ref store.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00
Michael Haggerty
dc39e09942 files_ref_store: use a transaction to update packed refs
When processing a `files_ref_store` transaction, it is sometimes
necessary to delete some references from the "packed-refs" file. Do
that using a reference transaction conducted against the
`packed_ref_store`.

This change further decouples `files_ref_store` from
`packed_ref_store`. It also fixes multiple problems, including the two
revealed by test cases added in the previous commit.

First, the old code didn't obtain the `packed-refs` lock until
`files_transaction_finish()`. This means that a failure to acquire the
`packed-refs` lock (e.g., due to contention with another process)
wasn't detected until it was too late (problems like this are supposed
to be detected in the "prepare" phase). The new code acquires the
`packed-refs` lock in `files_transaction_prepare()`, the same stage of
the processing when the loose reference locks are being acquired,
removing another reason why the "prepare" phase might succeed and the
"finish" phase might nevertheless fail.

Second, the old code deleted the loose version of a reference before
deleting any packed version of the same reference. This left a moment
when another process might think that the packed version of the
reference is current, which is incorrect. (Even worse, the packed
version of the reference can be arbitrarily old, and might even point
at an object that has since been garbage-collected.)

Third, if a reference deletion fails to acquire the `packed-refs` lock
altogether, then the old code might leave the repository in the
incorrect state (possibly corrupt) described in the previous
paragraph.

Now we activate the new "packed-refs" file (sans any references that
are being deleted) *before* deleting the corresponding loose
references. But we hold the "packed-refs" lock until after the loose
references have been finalized, thus preventing a simultaneous
"pack-refs" process from packing the loose version of the reference in
the time gap, which would otherwise defeat our attempt to delete it.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00
Michael Haggerty
1444bfe027 files_initial_transaction_commit(): use a transaction for packed refs
Use a `packed_ref_store` transaction in the implementation of
`files_initial_transaction_commit()` rather than using internal
features of the packed ref store. This further decouples
`files_ref_store` from `packed_ref_store`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00
Michael Haggerty
22b09cdfad prune_refs(): also free the linked list
At least since v1.7, the elements of the `refs_to_prune` linked list
have been leaked. Fix the leak by teaching `prune_refs()` to free the
list elements as it processes them.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00
Michael Haggerty
27d03d04d5 files_pack_refs(): use a reference transaction to write packed refs
Now that the packed reference store supports transactions, we can use
a transaction to write the packed versions of references that we want
to pack. This decreases the coupling between `files_ref_store` and
`packed_ref_store`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00