Commit graph

29 commits

Author SHA1 Message Date
Michael Haggerty 5b633610ec packed_ref_cache: keep the packed-refs file mmapped if possible
Keep a copy of the `packed-refs` file contents in memory for as long
as a `packed_ref_cache` object is in use:

* If the system allows it, keep the `packed-refs` file mmapped.

* If not (either because the system doesn't support `mmap()` at all,
  or because a file that is currently mmapped cannot be replaced via
  `rename()`), then make a copy of the file's contents in
  heap-allocated space, and keep that around instead.

We base the choice of behavior on a new build-time switch,
`MMAP_PREVENTS_DELETE`. By default, this switch is set for Windows
variants.

After this commit, `MMAP_NONE` and `MMAP_TEMPORARY` are still handled
identically. But the next commit will introduce a difference.

This whole change is still pointless, because we only read the
`packed-refs` file contents immediately after instantiating the
`packed_ref_cache`. But that will soon change.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty 14b3c344ea packed-backend.c: reorder some definitions
No code has been changed. This will make subsequent patches more
self-contained.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty 81b9b5aea7 mmapped_ref_iterator_advance(): no peeled value for broken refs
If a reference is broken, suppress its peeled value.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty 9cfb3dc0d1 mmapped_ref_iterator: add iterator over a packed-refs file
Add a new `mmapped_ref_iterator`, which can iterate over the
references in an mmapped `packed-refs` file directly. Use this
iterator from `read_packed_refs()` to fill the packed refs cache.

Note that we are not yet willing to promise that the new iterator
generates its output in order. That doesn't matter for now, because
the packed refs cache doesn't care what order it is filled.

This change adds a lot of boilerplate without providing any obvious
benefits. The benefits will come soon, when we get rid of the
`ref_cache` for packed references altogether.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty daa45408c1 packed_ref_cache: remember the file-wide peeling state
Rather than store the peeling state (i.e., the one defined by traits
in the `packed-refs` file header line) in a local variable in
`read_packed_refs()`, store it permanently in `packed_ref_cache`. This
will be needed when we stop reading all packed refs at once.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty 6a9bc4034a read_packed_refs(): read references with minimal copying
Instead of copying data from the `packed-refs` file one line at time
and then processing it, process the data in place as much as possible.

Also, instead of processing one line per iteration of the main loop,
process a reference line plus its corresponding peeled line (if
present) together.

Note that this change slightly tightens up the parsing of the
`packed-refs` file. Previously, the parser would have accepted
multiple "peeled" lines for a single reference (ignoring all but the
last one). Now it would reject that.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-25 18:02:45 +09:00
Michael Haggerty a8811695e3 read_packed_refs(): make parsing of the header line more robust
The old code parsed the traits in the `packed-refs` header by looking
for the string " trait " (i.e., the name of the trait with a space on
either side) in the header line. This is fragile, because if any other
implementation of Git forgets to write the trailing space, the last
trait would silently be ignored (and the error might never be
noticed).

So instead, use `string_list_split_in_place()` to split the traits
into tokens then use `unsorted_string_list_has_string()` to look for
the tokens we are interested in. This means that we can read the
traits correctly even if the header line is missing a trailing
space (or indeed, if it is missing the space after the colon, or if it
has multiple spaces somewhere).

However, older Git clients (and perhaps other Git implementations)
still require the surrounding spaces, so we still have to output the
header with a trailing space.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty 36f23534ae read_packed_refs(): only check for a header at the top of the file
This tightens up the parsing a bit; previously, stray header-looking
lines would have been processed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty 49a03ef466 read_packed_refs(): use mmap to read the packed-refs file
It's still done in a pretty stupid way, involving more data copying
than necessary. That will improve in future commits.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty 735267aa10 die_unterminated_line(), die_invalid_line(): new functions
Extract some helper functions for reporting errors. While we're at it,
prevent them from spewing unlimited output to the terminal. These
functions will soon have more callers.

These functions accept the problematic line as a `(ptr, len)` pair
rather than a NUL-terminated string, and `die_invalid_line()` checks
for an EOL itself, because these calling conventions will be
convenient for future callers. (Efficiency is not a concern here
because these functions are only ever called if the `packed-refs` file
is corrupt.)

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty f0a7dc86d2 packed_ref_cache: add a backlink to the associated packed_ref_store
It will prove convenient in upcoming patches.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty 8738a8a4df ref_iterator: keep track of whether the iterator output is ordered
References are iterated over in order by refname, but reflogs are not.
Some consumers of reference iteration care about the difference. Teach
each `ref_iterator` to keep track of whether its output is ordered.

`overlay_ref_iterator` is one of the picky consumers. Add a sanity
check in `overlay_ref_iterator_begin()` to verify that its inputs are
ordered.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-14 15:19:07 +09:00
Michael Haggerty 9939b33d6a packed-backend: rip out some now-unused code
Now the outside world interacts with the packed ref store only via the
generic refs API plus a few lock-related functions. This allows us to
delete some functions that are no longer used, thereby completing the
encapsulation of the packed ref store.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00
Michael Haggerty 2fb330ca72 packed_delete_refs(): implement method
Implement `packed_delete_refs()` using a reference transaction. This
means that `files_delete_refs()` can use `refs_delete_refs()` instead
of `repack_without_refs()` to delete any packed references, decreasing
the coupling between the classes.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:04 +09:00
Michael Haggerty 2775d8724d packed_ref_store: implement reference transactions
Implement the methods needed to support reference transactions for
the packed-refs backend. The new methods are not yet used.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:03 +09:00
Michael Haggerty 39c8df0cfe packed-backend: don't adjust the reference count on lock/unlock
The old code incremented the packed ref cache reference count when
acquiring the packed-refs lock, and decremented the count when
releasing the lock. This is unnecessary because:

* Another process cannot change the packed-refs file because it is
  locked.

* When we ourselves change the packed-refs file, we do so by first
  modifying the packed ref-cache, and then writing the data from the
  ref-cache to disk. So the packed ref-cache remains fresh because any
  changes that we plan to make to the file are made in the cache first
  anyway.

So there is no reason for the cache to become stale.

Moreover, the extra reference count causes a problem if we
intentionally clear the packed refs cache, as we sometimes need to do
if we change the cache in anticipation of writing a change to disk,
but then the write to disk fails. In that case, `packed_refs_unlock()`
would have no easy way to find the cache whose reference count it
needs to decrement.

This whole issue will soon become moot due to upcoming changes that
avoid changing the in-memory cache as part of updating the packed-refs
on disk, but this change makes that transition easier.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-09 03:18:03 +09:00
Junio C Hamano 44c2339e55 Merge branch 'mh/packed-ref-store'
The "ref-store" code reorganization continues.

* mh/packed-ref-store: (32 commits)
  files-backend: cheapen refname_available check when locking refs
  packed_ref_store: handle a packed-refs file that is a symlink
  read_packed_refs(): die if `packed-refs` contains bogus data
  t3210: add some tests of bogus packed-refs file contents
  repack_without_refs(): don't lock or unlock the packed refs
  commit_packed_refs(): remove call to `packed_refs_unlock()`
  clear_packed_ref_cache(): don't protest if the lock is held
  packed_refs_unlock(), packed_refs_is_locked(): new functions
  packed_refs_lock(): report errors via a `struct strbuf *err`
  packed_refs_lock(): function renamed from lock_packed_refs()
  commit_packed_refs(): use a staging file separate from the lockfile
  commit_packed_refs(): report errors rather than dying
  packed_ref_store: make class into a subclass of `ref_store`
  packed-backend: new module for handling packed references
  packed_read_raw_ref(): new function, replacing `resolve_packed_ref()`
  packed_ref_store: support iteration
  packed_peel_ref(): new function, extracted from `files_peel_ref()`
  repack_without_refs(): take a `packed_ref_store *` parameter
  get_packed_ref(): take a `packed_ref_store *` parameter
  rollback_packed_refs(): take a `packed_ref_store *` parameter
  ...
2017-08-22 10:29:16 -07:00
Michael Haggerty 198b808e20 packed_ref_store: handle a packed-refs file that is a symlink
One of the tricks that `contrib/workdir/git-new-workdir` plays is to
making `packed-refs` in the new workdir a symlink to the `packed-refs`
file in the original repository. Before
42dfa7ecef ("commit_packed_refs(): use a staging file separate from
the lockfile", 2017-06-23), a lockfile was used as the staging file,
and because the `LOCK_NO_DEREF` was not used, the pointed-to file was
locked and modified.

But after that commit, the staging file was created using a tempfile,
with the end result that rewriting the `packed-refs` file in the
workdir overwrote the symlink rather than the original `packed-refs`
file.

Change `commit_packed_refs()` to use `get_locked_file_path()` to find
the path of the file that it should overwrite. Since that path was
properly resolved when the lockfile was created, this restores the
pre-42dfa7ecef behavior.

Also add a test case to document this use case and prevent a
regression like this from recurring.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-27 10:19:56 -07:00
Michael Haggerty 9308b7f3ca read_packed_refs(): die if packed-refs contains bogus data
The old code ignored any lines that it didn't understand, including
unterminated lines. This is dangerous. Instead, `die()` if the
`packed-refs` file contains any unterminated lines or lines that we
don't know how to handle.

This fixes the tests added in the last commit.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:01:57 -07:00
Michael Haggerty e5cc7d7d2b repack_without_refs(): don't lock or unlock the packed refs
Change `repack_without_refs()` to expect the packed-refs lock to be
held already, and not to release the lock before returning. Change the
callers to deal with lock management.

This change makes it possible for callers to hold the packed-refs lock
for a longer span of time, a possibility that will eventually make it
possible to fix some longstanding races.

The only semantic change here is that `repack_without_refs()` used to
forget to release the lock in the `if (!removed)` exit path. That
omission is now fixed.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-03 10:01:56 -07:00
Michael Haggerty 42c7f7ff96 commit_packed_refs(): remove call to packed_refs_unlock()
Instead, change the callers of `commit_packed_refs()` to call
`packed_refs_unlock()`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty 9051198214 clear_packed_ref_cache(): don't protest if the lock is held
The existing callers already check that the lock isn't held just
before calling `clear_packed_ref_cache()`, and in the near future we
want to be able to call this function when the lock is held.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty 49aebcf432 packed_refs_unlock(), packed_refs_is_locked(): new functions
Add two new public functions, `packed_refs_unlock()` and
`packed_refs_is_locked()`, with which callers can manage and query the
`packed-refs` lock externally.

Call `packed_refs_unlock()` from `commit_packed_refs()` and
`rollback_packed_refs()`.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty c8bed835c2 packed_refs_lock(): report errors via a struct strbuf *err
That way the callers don't have to come up with error messages
themselves.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty b7de57d8d1 packed_refs_lock(): function renamed from lock_packed_refs()
Rename `lock_packed_refs()` to `packed_refs_lock()` for consistency
with how other methods are named. Also, it's about to get some
companions.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty 42dfa7ecef commit_packed_refs(): use a staging file separate from the lockfile
We will want to be able to hold the lockfile for `packed-refs` even
after we have activated the new values. So use a separate tempfile,
`packed-refs.new`, as a place to stage the new contents of the
`packed-refs` file. For now this is all done within
`commit_packed_refs()`, but that will change shortly.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty 3478983b51 commit_packed_refs(): report errors rather than dying
Report errors via a `struct strbuf *err` rather than by calling
`die()`. To enable this goal, change `write_packed_entry()` to report
errors via a return value and `errno` rather than dying.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty e0cc8ac820 packed_ref_store: make class into a subclass of ref_store
Add the infrastructure to make `packed_ref_store` implement
`ref_store`, at least formally (few of the methods are actually
implemented yet). Change the functions in its interface to take
`ref_store *` arguments. Change `files_ref_store` to store a pointer
to `ref_store *` and to call functions via the virtual `ref_store`
interface where possible. This also means that a few
`packed_ref_store` functions can become static.

This is a work in progress. Some more `ref_store` methods will soon be
implemented (e.g., those having to do with reference transactions).
But some of them will never be implemented (e.g., those having to do
with symrefs or reflogs).

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:33 -07:00
Michael Haggerty 67be7c5a59 packed-backend: new module for handling packed references
Now that the interface between `files_ref_store` and
`packed_ref_store` is relatively narrow, move the latter into a new
module, "refs/packed-backend.h" and "refs/packed-backend.c". It still
doesn't quite implement the `ref_store` interface, but it will soon.

This commit moves code around and adjusts its visibility, but doesn't
change anything.

Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-23 13:27:32 -07:00