In-core storage of the reverse index for .pack files (which lets
you go from a pack offset to an object name) has been streamlined.
* jk/pack-revindex:
pack-revindex: store entries directly in packed_git
pack-revindex: drop hash table
Some "git notes" operations, e.g. "git log --notes=<note>", should
be able to read notes from any tree-ish that is shaped like a notes
tree, but the notes infrastructure required that the argument must
be a ref under refs/notes/. Loosen it to require a valid ref only
when the operation would update the notes (in which case we must
have a place to store the updated notes tree, iow, a ref).
* mh/notes-allow-reading-treeish:
notes: allow treeish expressions as notes ref
"git grep" can now be configured (or told from the command line)
how many threads to use when searching in the working tree files.
* vl/grep-configurable-threads:
grep: add --threads=<num> option and grep.threads configuration
grep: slight refactoring to the code that disables threading
grep: allow threading even on a single-core machine
"git blame" learned to produce the progress eye-candy when it takes
too much time before emitting the first line of the result.
* ea/blame-progress:
blame: add support for --[no-]progress option
Add a framework to spawn a group of processes in parallel, and use
it to run "git fetch --recurse-submodules" in parallel.
Rerolled and this seems to be a lot cleaner. The merge of the
earlier one to 'next' has been reverted.
* sb/submodule-parallel-fetch:
submodules: allow parallel fetching, add tests and documentation
fetch_populated_submodules: use new parallel job processing
run-command: add an asynchronous parallel child processor
sigchain: add command to pop all common signals
strbuf: add strbuf_read_once to read without blocking
xread: poll on non blocking fds
submodule.c: write "Fetching submodule <foo>" to stderr
"branch --delete" has "branch -d" but "push --delete" does not.
* ps/push-delete-option:
push: add '-d' as shorthand for '--delete'
push: add '--delete' flag to synopsis
An earlier change in 2.5.x-era broke users' hooks and aliases by
exporting GIT_WORK_TREE to point at the root of the working tree,
interfering when they tried to use a different working tree without
setting GIT_WORK_TREE environment themselves.
* nd/stop-setenv-work-tree:
Revert "setup: set env $GIT_WORK_TREE when work tree is set, like $GIT_DIR"
init_notes() is the main point of entry to the notes API. It ensures
that the input can be used as ref, because it needs a ref to update to
store notes tree after modifying it.
There however are many use cases where notes tree is only read, e.g.
"git log --notes=...". Any notes-shaped treeish could be used for such
purpose, but it is not allowed due to existing restriction.
Allow treeish expressions to be used in the case the notes tree is going
to be used without write "permissions". Add a flag to distinguish
whether the notes tree is intended to be used read-only, or will be
updated.
With this change, operations that use notes read-only can be fed any
notes-shaped tree-ish can be used, e.g. git log --notes=notes@{1}.
Signed-off-by: Mike Hommey <mh@glandium.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
History traversal with "git log --source" that starts with an
annotated tag failed to report the tag as "source", due to an
old regression in the command line parser back in v2.2 days.
* jk/pending-keep-tag-name:
revision.c: propagate tag names from pending array
"git symbolic-ref" forgot to report a failure with its exit status.
* jk/symbolic-ref-maint:
t1401: test reflog creation for git-symbolic-ref
symbolic-ref: propagate error code from create_symref()
When getpwuid() on the system returned NULL (e.g. the user is not
in the /etc/passwd file or other uid-to-name mappings), the
codepath to find who the user is to record it in the reflog barfed
and died. Loosen the check in this codepath, which already accepts
questionable ident string (e.g. host part of the e-mail address is
obviously bogus), and in general when we operate fmt_ident() function
in non-strict mode.
* jk/ident-loosen-getpwuid:
ident: loosen getpwuid error in non-strict mode
ident: keep a flag for bogus default_email
ident: make xgetpwuid_self() a static local helper
The completion script (in contrib/) used to list "git column"
(which is not an end-user facing command) as one of the choices
* sg/completion-no-column:
completion: remove 'git column' from porcelain commands
"git p4" when interacting with multiple depots at the same time
used to incorrectly drop changes.
* sh/p4-multi-depot:
git-p4: reduce number of server queries for fetches
git-p4: support multiple depot paths in p4 submit
git-p4: failing test case for skipping changes with multiple depots
History traversal with "git log --source" that starts with an
annotated tag failed to report the tag as "source", due to an
old regression in the command line parser back in v2.2 days.
* jk/pending-keep-tag-name:
revision.c: propagate tag names from pending array
"git symbolic-ref" forgot to report a failure with its exit status.
* jk/symbolic-ref-maint:
t1401: test reflog creation for git-symbolic-ref
symbolic-ref: propagate error code from create_symref()
The write(2) emulation for Windows learned to set errno to EPIPE
when necessary.
* js/emu-write-epipe-on-windows:
mingw: emulate write(2) that fails with a EPIPE
A pack_revindex struct has two elements: the revindex
entries themselves, and a pointer to the packed_git. We need
both to do lookups, because only the latter knows things
like the number of objects in the pack.
Now that packed_git contains the pack_revindex struct it's
just as easy to pass around the packed_git itself, and we do
not need the extra back-pointer.
We can instead just store the entries directly in the pack.
All functions which took a pack_revindex now just take a
packed_git. We still lazy-load in find_pack_revindex, so
most callers are unaffected.
The exception is the bitmap code, which computes the
revindex and caches the pointer when we load the bitmaps. We
can continue to load, drop the extra cache pointer, and just
access bitmap_git.pack.revindex directly.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The main entry point to the pack-revindex code is
find_pack_revindex(). This calls revindex_for_pack(), which
lazily computes and caches the revindex for the pack.
We store the cache in a very simple hash table. It's created
by init_pack_revindex(), which inserts an entry for every
packfile we know about, and we never grow or shrink the
hash. If we ever need the revindex for a pack that isn't in
the hash, we die() with an internal error.
This can lead to a race, because we may load more packs
after having called init_pack_revindex(). For example,
imagine we have one process which needs to look at the
revindex for a variety of objects (e.g., cat-file's
"%(objectsize:disk)" format). Simultaneously, git-gc is
running, which is doing a `git repack -ad`. We might hit a
sequence like:
1. We need the revidx for some packed object. We call
find_pack_revindex() and end up in init_pack_revindex()
to create the hash table for all packs we know about.
2. We look up another object and can't find it, because
the repack has removed the pack it's in. We re-scan the
pack directory and find a new pack containing the
object. It gets added to our packed_git list.
3. We call find_pack_revindex() for the new object, which
hits revindex_for_pack() for our new pack. It can't
find the packed_git in the revindex hash, and dies.
You could also replace the `repack` above with a push or
fetch to create a new pack, though these are less likely
(you would have to somehow learn about the new objects to
look them up).
Prior to 1a6d8b9 (do not discard revindex when re-preparing
packfiles, 2014-01-15), this was safe, as we threw away the
revindex whenever we re-scanned the pack directory (and thus
re-created the revindex hash on the fly). However, we don't
want to simply revert that commit, as it was solving a
different race.
So we have a few options:
- We can fix the race in 1a6d8b9 differently, by having
the bitmap code look in the revindex hash instead of
caching the pointer. But this would introduce a lot of
extra hash lookups for common bitmap operations.
- We could teach the revindex to dynamically add new packs
to the hash table. This would perform the same, but
would mean adding extra code to the revindex hash (which
currently cannot be resized at all).
- We can get rid of the hash table entirely. There is
exactly one revindex per pack, so we can just store it
in the packed_git struct. Since it's initialized lazily,
it does not add to the startup cost.
This is the best of both worlds: less code and fewer
hash table lookups. The original code likely avoided
this in the name of encapsulation. But the packed_git
and reverse_index code are fairly intimate already, so
it's not much of a loss.
This patch implements the final option. It's a minimal
conversion that retains the pack_revindex struct. No callers
need to change, and we can do further cleanup in a follow-on
patch.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The current code writes a reflog entry whenever we update a
symbolic ref, but we never test that this is so. Let's add a
test to make sure upcoming refactoring doesn't cause a
regression.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If create_symref() fails, git-symbolic-ref will still exit
with code 0, and our caller has no idea that the command did
nothing.
This appears to have been broken since the beginning of time
(e.g., it is not a regression where create_symref() stopped
calling die() or something similar).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When fetching changes from a depot using a full client spec, there
is no need to perform as many queries as there are top-level paths
in the client spec. Instead we query all changes in chronological
order, also getting rid of the need to sort the results and remove
duplicates.
Signed-off-by: Sam Hocevar <sam@hocevar.net>
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When submitting from a repository that was cloned using a client spec,
use the full list of paths when ruling out files that are outside the
view. This fixes a bug where only files pertaining to the first path
would be included in the p4 submit.
Signed-off-by: Sam Hocevar <sam@hocevar.net>
Signed-off-by: Luke Diamand <luke@diamand.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>