Commit graph

94 commits

Author SHA1 Message Date
Taylor Blau
b757353676 builtin/pack-objects.c: --cruft without expiration
Teach `pack-objects` how to generate a cruft pack when no objects are
dropped (i.e., `--cruft-expiration=never`). Later patches will teach
`pack-objects` how to generate a cruft pack that prunes objects.

When generating a cruft pack which does not prune objects, we want to
collect all unreachable objects into a single pack (noting and updating
their mtimes as we accumulate them). Ordinary use will pass the result
of a `git repack -A` as a kept pack, so when this patch says "kept
pack", readers should think "reachable objects".

Generating a non-expiring cruft packs works as follows:

  - Callers provide a list of every pack they know about, and indicate
    which packs are about to be removed.

  - All packs which are going to be removed (we'll call these the
    redundant ones) are marked as kept in-core.

    Any packs the caller did not mention (but are known to the
    `pack-objects` process) are also marked as kept in-core. Packs not
    mentioned by the caller are assumed to be unknown to them, i.e.,
    they entered the repository after the caller decided which packs
    should be kept and which should be discarded.

    Since we do not want to include objects in these "unknown" packs
    (because we don't know which of their objects are or aren't
    reachable), these are also marked as kept in-core.

  - Then, we enumerate all objects in the repository, and add them to
    our packing list if they do not appear in an in-core kept pack.

This results in a new cruft pack which contains all known objects that
aren't included in the kept packs. When the kept pack is the result of
`git repack -A`, the resulting pack contains all unreachable objects.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-05-26 15:48:26 -07:00
Jean-Noël Avila
49cbad0edd doc: express grammar placeholders between angle brackets
This discerns user inputs from verbatim options in the synopsis.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-09 09:39:11 -08:00
Junio C Hamano
18b49be492 Merge branch 'jk/doc-max-pack-size'
Doc update.

* jk/doc-max-pack-size:
  doc: warn people against --max-pack-size
2021-07-08 13:15:03 -07:00
Jeff King
6fb9195f6c doc: warn people against --max-pack-size
This option is almost never a good idea, as the resulting repository is
larger and slower (see the new explanations in the docs).

I outlined the potential problems. We could go further and make the
option harder to find (or at least, make the command-line option
descriptions a much more terse "you probably don't want this; see
pack.packsizeLimit for details"). But this seems like a minimal change
that may prevent people from thinking it's more useful than it is.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-06-09 08:56:09 +09:00
Junio C Hamano
2744383cbd Merge branch 'tb/geometric-repack'
"git repack" so far has been only capable of repacking everything
under the sun into a single pack (or split by size).  A cleverer
strategy to reduce the cost of repacking a repository has been
introduced.

* tb/geometric-repack:
  builtin/pack-objects.c: ignore missing links with --stdin-packs
  builtin/repack.c: reword comment around pack-objects flags
  builtin/repack.c: be more conservative with unsigned overflows
  builtin/repack.c: assign pack split later
  t7703: test --geometric repack with loose objects
  builtin/repack.c: do not repack single packs with --geometric
  builtin/repack.c: add '--geometric' option
  packfile: add kept-pack cache for find_kept_pack_entry()
  builtin/pack-objects.c: rewrite honor-pack-keep logic
  p5303: measure time to repack with keep
  p5303: add missing &&-chains
  builtin/pack-objects.c: add '--stdin-packs' option
  revision: learn '--no-kept-objects'
  packfile: introduce 'find_kept_pack_entry()'
2021-03-24 14:36:27 -07:00
Taylor Blau
339bce27f4 builtin/pack-objects.c: add '--stdin-packs' option
In an upcoming commit, 'git repack' will want to create a pack comprised
of all of the objects in some packs (the included packs) excluding any
objects in some other packs (the excluded packs).

This caller could iterate those packs themselves and feed the objects it
finds to 'git pack-objects' directly over stdin, but this approach has a
few downsides:

  - It requires every caller that wants to drive 'git pack-objects' in
    this way to implement pack iteration themselves. This forces the
    caller to think about details like what order objects are fed to
    pack-objects, which callers would likely rather not do.

  - If the set of objects in included packs is large, it requires
    sending a lot of data over a pipe, which is inefficient.

  - The caller is forced to keep track of the excluded objects, too, and
    make sure that it doesn't send any objects that appear in both
    included and excluded packs.

But the biggest downside is the lack of a reachability traversal.
Because the caller passes in a list of objects directly, those objects
don't get a namehash assigned to them, which can have a negative impact
on the delta selection process, causing 'git pack-objects' to fail to
find good deltas even when they exist.

The caller could formulate a reachability traversal themselves, but the
only way to drive 'git pack-objects' in this way is to do a full
traversal, and then remove objects in the excluded packs after the
traversal is complete. This can be detrimental to callers who care
about performance, especially in repositories with many objects.

Introduce 'git pack-objects --stdin-packs' which remedies these four
concerns.

'git pack-objects --stdin-packs' expects a list of pack names on stdin,
where 'pack-xyz.pack' denotes that pack as included, and
'^pack-xyz.pack' denotes it as excluded. The resulting pack includes all
objects that are present in at least one included pack, and aren't
present in any excluded pack.

To address the delta selection problem, 'git pack-objects --stdin-packs'
works as follows. First, it assembles a list of objects that it is going
to pack, as above. Then, a reachability traversal is started, whose tips
are any commits mentioned in included packs. Upon visiting an object, we
find its corresponding object_entry in the to_pack list, and set its
namehash parameter appropriately.

To avoid the traversal visiting more objects than it needs to, the
traversal is halted upon encountering an object which can be found in an
excluded pack (by marking the excluded packs as kept in-core, and
passing --no-kept-objects=in-core to the revision machinery).

This can cause the traversal to halt early, for example if an object in
an included pack is an ancestor of ones in excluded packs. But stopping
early is OK, since filling in the namehash fields of objects in the
to_pack list is only additive (i.e., having it helps the delta selection
process, but leaving it blank doesn't impact the correctness of the
resulting pack).

Even still, it is unlikely that this hurts us much in practice, since
the 'git repack --geometric' caller (which is introduced in a later
commit) marks small packs as included, and large ones as excluded.
During ordinary use, the small packs usually represent pushes after a
large repack, and so are unlikely to be ancestors of objects that
already exist in the repository.

(I found it convenient while developing this patch to have 'git
pack-objects' report the number of objects which were visited and got
their namehash fields filled in during traversal. This is also included
in the below patch via trace2 data lines).

Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 23:30:52 -08:00
Christian Walther
3a837b58e3 doc: mention bigFileThreshold for packing
Knowing about the core.bigFileThreshold configuration variable is
helpful when examining pack file size differences between repositories.
Add a reference to it to the manpages a user is likely to read in this
situation.

Capitalize CONFIGURATION for consistency with other pages having such a
section.

Signed-off-by: Christian Walther <cwalther@gmx.ch>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-02-22 13:18:30 -08:00
Jonathan Tan
ee47243d76 pack-objects: no fetch when allow-{any,promisor}
The options --missing=allow-{any,promisor} were introduced in caf3827e2f
("rev-list: add list-objects filtering support", 2017-11-22) with the
following note in the commit message:

    This patch introduces handling of missing objects to help
    debugging and development of the "partial clone" mechanism,
    and once the mechanism is implemented, for a power user to
    perform operations that are missing-object aware without
    incurring the cost of checking if a missing link is expected.

The idea that these options are missing-object aware (and thus do not
need to lazily fetch objects, unlike unaware commands that assume that
all objects are present) are assumed in later commits such as 07ef3c6604
("fetch test: use more robust test for filtered objects", 2020-01-15).

However, the current implementations of these options use
has_object_file(), which indeed lazily fetches missing objects. Teach
these implementations not to do so. Also, update the documentation of
these options to be clearer.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-06 13:01:03 -07:00
Derrick Stolee
de3a864114 config: set pack.useSparse=true by default
The pack.useSparse config option was introduced by 3d036eb0
(pack-objects: create pack.useSparse setting, 2019-01-19) and was
first available in v2.21.0. When enabled, the pack-objects process
during 'git push' will use a sparse tree walk when deciding which
trees and blobs to send to the remote. The algorithm was introduced
by d5d2e93 (revision: implement sparse algorithm, 2019-01-16) and
has been in production use by VFS for Git since around that time.
The features.experimental config option also enabled pack.useSparse,
so hopefully that has also increased exposure.

It is worth noting that pack.useSparse has a possibility of
sending more objects across a push, but requires a special
arrangement of exact _copies_ across directories. There is a test
in t5322-pack-objects-sparse.sh that demonstrates this possibility.
This test uses the --sparse option to "git pack-objects" but we
can make it implied by the config value to demonstrate that the
default value has changed.

While updating that test, I noticed that the documentation did not
include an option for --no-sparse, which is now more important than
it was before.

Since the downside is unlikely but the upside is significant, set
the default value of pack.useSparse to true. Remove it from the
set of options implied by features.experimental.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-03-20 14:22:31 -07:00
Mark Rushakoff
24966cd982 doc: fix repeated words
Inspired by 21416f0a07 ("restore: fix typo in docs", 2019-08-03), I ran
"git grep -E '(\b[a-zA-Z]+) \1\b' -- Documentation/" to find other cases
where words were duplicated, e.g. "the the", and in most cases removed
one of the repeated words.

There were many false positives by this grep command, including
deliberate repeated words like "really really" or valid uses of "that
that" which I left alone, of course.

I also did not correct any of the legitimate, accidentally repeated
words in old RelNotes.

Signed-off-by: Mark Rushakoff <mark.rushakoff@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-08-11 17:40:07 -07:00
Derrick Stolee
4f6d26b167 list-objects: consume sparse tree walk
When creating a pack-file using 'git pack-objects --revs' we provide
a list of interesting and uninteresting commits. For example, a push
operation would make the local topic branch be interesting and the
known remote refs as uninteresting. We want to discover the set of
new objects to send to the server as a thin pack.

We walk these commits until we discover a frontier of commits such
that every commit walk starting at interesting commits ends in a root
commit or unintersting commit. We then need to discover which
non-commit objects are reachable from  uninteresting commits. This
commit walk is not changing during this series.

The mark_edges_uninteresting() method in list-objects.c iterates on
the commit list and does the following:

* If the commit is UNINTERSTING, then mark its root tree and every
  object it can reach as UNINTERESTING.

* If the commit is interesting, then mark the root tree of every
  UNINTERSTING parent (and all objects that tree can reach) as
  UNINTERSTING.

At the very end, we repeat the process on every commit directly
given to the revision walk from stdin. This helps ensure we properly
cover shallow commits that otherwise were not included in the
frontier.

The logic to recursively follow trees is in the
mark_tree_uninteresting() method in revision.c. The algorithm avoids
duplicate work by not recursing into trees that are already marked
UNINTERSTING.

Add a new 'sparse' option to the mark_edges_uninteresting() method
that performs this logic in a slightly different way. As we iterate
over the commits, we add all of the root trees to an oidset. Then,
call mark_trees_uninteresting_sparse() on that oidset. Note that we
include interesting trees in this process. The current implementation
of mark_trees_unintersting_sparse() will walk the same trees as
the old logic, but this will be replaced in a later change.

Add a '--sparse' flag in 'git pack-objects' to call this new logic.
Add a new test script t/t5322-pack-objects-sparse.sh that tests this
option. The tests currently demonstrate that the resulting object
list is the same as the old algorithm. This includes a case where
both algorithms pack an object that is not needed by a remote due to
limits on the explored set of trees. When the sparse algorithm is
changed in a later commit, we will add a test that demonstrates a
change of behavior in some cases.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-17 13:44:39 -08:00
Jeff King
28b8a73080 pack-objects: add delta-islands support
Implement support for delta islands in git pack-objects
and document how delta islands work in
"Documentation/git-pack-objects.txt" and Documentation/config.txt.

This allows users to setup delta islands in their config and
get the benefit of less disk usage while cloning and fetching
is still quite fast and not much more CPU intensive.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-16 10:51:17 -07:00
Junio C Hamano
5a97e7be88 Merge branch 'nd/pack-unreachable-objects-doc'
Doc update.

* nd/pack-unreachable-objects-doc:
  pack-objects: validation and documentation about unreachable options
2018-05-23 14:38:24 +09:00
Junio C Hamano
ad635e82d6 Merge branch 'nd/pack-objects-pack-struct'
"git pack-objects" needs to allocate tons of "struct object_entry"
while doing its work, and shrinking its size helps the performance
quite a bit.

* nd/pack-objects-pack-struct:
  ci: exercise the whole test suite with uncommon code in pack-objects
  pack-objects: reorder members to shrink struct object_entry
  pack-objects: shrink delta_size field in struct object_entry
  pack-objects: shrink size field in struct object_entry
  pack-objects: clarify the use of object_entry::size
  pack-objects: don't check size when the object is bad
  pack-objects: shrink z_delta_size field in struct object_entry
  pack-objects: refer to delta objects by index instead of pointer
  pack-objects: move in_pack out of struct object_entry
  pack-objects: move in_pack_pos out of struct object_entry
  pack-objects: use bitfield for object_entry::depth
  pack-objects: use bitfield for object_entry::dfs_state
  pack-objects: turn type and in_pack_type to bitfields
  pack-objects: a bit of document about struct object_entry
  read-cache.c: make $GIT_TEST_SPLIT_INDEX boolean
2018-05-23 14:38:19 +09:00
Nguyễn Thái Ngọc Duy
58bd77b66a pack-objects: validation and documentation about unreachable options
These options are added in [1] [2] [3]. All these depend on running
rev-list internally which is normally true since they are always used
with "--all --objects" which implies --revs. But let's keep this
dependency explicit.

While at there, add documentation for them. These are mostly used
internally by git-repack. But it's still good to not chase down the
right commit message to know how they work.

[1] ca11b212eb (let pack-objects do the writing of unreachable objects
    as loose objects - 2008-05-14)
[2] 08cdfb1337 (pack-objects --keep-unreachable - 2007-09-16)
[3] e26a8c4721 (repack: extend --keep-unreachable to loose objects -
    2016-06-13)

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06 18:49:32 +09:00
Nguyễn Thái Ngọc Duy
ed7e5fc3a2 repack: add --keep-pack option
We allow to keep existing packs by having companion .keep files. This
is helpful when a pack is permanently kept. In the next patch, git-gc
just wants to keep a pack temporarily, for one pack-objects
run. git-gc can use --keep-pack for this use case.

A note about why the pack_keep field cannot be reused and
pack_keep_in_core has to be added. This is about the case when
--keep-pack is specified together with either --keep-unreachable or
--unpack-unreachable, but --honor-pack-keep is NOT specified.

In this case, we want to exclude objects from the packs specified on
command line, not from ones with .keep files. If only one bit flag is
used, we have to clear pack_keep on pack files with the .keep file.

But we can't make any assumption about unreachable objects in .keep
packs. If "pack_keep" field is false for .keep packs, we could
potentially pull lots of unreachable objects into the new pack, or
unpack them loose. The safer approach is ignore all packs with either
.keep file or --keep-pack.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16 13:52:29 +09:00
Nguyễn Thái Ngọc Duy
b5c0cbd808 pack-objects: use bitfield for object_entry::depth
Because of struct packing from now on we can only handle max depth
4095 (or even lower when new booleans are added in this struct). This
should be ok since long delta chain will cause significant slow down
anyway.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-04-16 12:38:58 +09:00
Jonathan Tan
0c16cd499d gc: do not repack promisor packfiles
Teach gc to stop traversal at promisor objects, and to leave promisor
packfiles alone. This has the effect of only repacking non-promisor
packfiles, and preserves the distinction between promisor packfiles and
non-promisor packfiles.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-08 09:52:42 -08:00
Jeff Hostetler
4875c9791e list-objects-filter-options: support --no-filter
Teach opt_parse_list_objects_filter() to take --no-filter
option and to free the contents of struct filter_options.
This command line argument will be automatically inherited
by commands using OPT_PARSE_LIST_OBJECTS_FILTER(); this
includes pack-objects.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-05 09:44:36 -08:00
Jeff Hostetler
9535ce7337 pack-objects: add list-objects filtering
Teach pack-objects to use the filtering provided by the
traverse_commit_list_filtered() interface to omit unwanted
objects from the resulting packfile.

Filtering requires the use of the "--stdout" option.

Add t5317 test.

In the future, we will introduce a "partial clone" mechanism
wherein an object in a repo, obtained from a remote, may
reference a missing object that can be dynamically fetched from
that remote once needed.  This "partial clone" mechanism will
have a way, sometimes slow, of determining if a missing link
is one of the links expected to be produced by this mechanism.

This patch introduces handling of missing objects to help
debugging and development of the "partial clone" mechanism,
and once the mechanism is implemented, for a power user to
perform operations that are missing-object aware without
incurring the cost of checking if a missing link is expected.

Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-11-22 14:11:57 +09:00
Jonathan Tan
4a4becfb23 Doc: clarify that pack-objects makes packs, plural
The documentation for pack-objects describes that it creates "a packed
archive of objects", which is confusing because it may create multiple
packs if --max-pack-size is set. Update the documentation to clarify
this, and explaining in which cases such a feature would be useful.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 10:39:41 -07:00
Michael Stahl
954176c128 document git-repack interaction of pack.threads and pack.windowMemory
Signed-off-by: Michael Stahl <mstahl@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-10 10:55:13 -07:00
Eric Wong
9cea46cdda pack-objects: warn on split packs disabling bitmaps
It can be tempting for a server admin to want a stable set of
long-lived packs for dumb clients; but also want to enable bitmaps
to serve smart clients more quickly.

Unfortunately, such a configuration is impossible; so at least warn
users of this incompatibility since commit 21134714 (pack-objects:
turn off bitmaps when we split packs, 2014-10-16).

Tested the warning by inspecting the output of:

	make -C t t5310-pack-bitmaps.sh GIT_TEST_OPTS=-v

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-28 09:58:14 -07:00
Jeff King
1c262bb7b2 doc: convert \--option to --option
Older versions of AsciiDoc would convert the "--" in
"--option" into an emdash. According to 565e135
(Documentation: quote double-dash for AsciiDoc, 2011-06-29),
this is fixed in AsciiDoc 8.3.0. According to bf17126, we
don't support anything older than 8.4.1 anyway, so we no
longer need to worry about quoting.

Even though this does not change the output at all, there
are a few good reasons to drop the quoting:

  1. It makes the source prettier to read.

  2. We don't quote consistently, which may be confusing when
     reading the source.

  3. Asciidoctor does not like the quoting, and renders a
     literal backslash.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-12 22:14:46 -07:00
brian m. carlson
2dacf26d09 pack-objects: use --objects-edge-aggressive for shallow repos
When fetching into or pushing from a shallow repository, we want to
aggressively mark edges as uninteresting, since this decreases the pack
size.  However, aggressively marking edges can negatively affect
performance on large non-shallow repositories with lots of refs.

Teach pack-objects a --shallow option to indicate that we're pushing
from or fetching into a shallow repository.  Use
--objects-edge-aggressive only for shallow repositories and otherwise
use --objects-edge, which performs better in the general case.  Update
the callers to pass the --shallow option when they are dealing with a
shallow repository.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-12-29 09:58:25 -08:00
Nguyễn Thái Ngọc Duy
b790e0f67c upload-pack: send shallow info over stdin to pack-objects
Before cdab485 (upload-pack: delegate rev walking in shallow fetch to
pack-objects - 2013-08-16) upload-pack does not write to the source
repository. cdab485 starts to write $GIT_DIR/shallow_XXXXXX if it's a
shallow fetch, so the source repo must be writable.

git:// servers do not need write access to repos and usually don't
have it, which means cdab485 breaks shallow clone over git://

Instead of using a temporary file as the media for shallow points, we
can send them over stdin to pack-objects as well. Prepend shallow
SHA-1 with --shallow so pack-objects knows what is what.

Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-03-11 13:32:10 -07:00
Jeff King
40a4f5a7bf pack-objects doc: treat output filename as opaque
After 1190a1a (pack-objects: name pack files after trailer hash,
2013-12-05), the SHA-1 used to determine the filename is calculated
differently.  Update the documentation to not guarantee anything more
than that the SHA-1 depends on the pack content somehow.

Hopefully this will discourage readers from depending on the old or
the new calculation.

Reported-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-12-16 11:36:09 -08:00
Thomas Ackermann
d5fa1f1a69 The name of the hash function is "SHA-1", not "SHA1"
Use "SHA-1" instead of "SHA1" whenever we talk about the hash function.
When used as a programming symbol, we keep "SHA1".

Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-04-15 11:08:37 -07:00
Thomas Ackermann
2de9b71138 Documentation: the name of the system is 'Git', not 'git'
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2013-02-01 13:53:33 -08:00
Jim Meyering
0353a0c4ec remove doubled words, e.g., s/to to/to/, and fix related typos
I found that some doubled words had snuck back into projects from which
I'd already removed them, so now there's a "syntax-check" makefile rule in
gnulib to help prevent recurrence.

Running the command below spotted a few in git, too:

  git ls-files | xargs perl -0777 -n \
    -e 'while (/\b(then?|[iao]n|i[fst]|but|f?or|at|and|[dt])\s+\1\b/gims)' \
    -e '{$n=($` =~ tr/\n/\n/ + 1); ($v=$&)=~s/\n/\\n/g;' \
    -e 'print "$ARGV:$n:$v\n"}'

Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-13 11:59:11 -07:00
Junio C Hamano
8f84c95fb2 Sync with 1.7.4.3 2011-04-03 00:14:16 -07:00
Junio C Hamano
c14f372791 Doc: mention --delta-base-offset is the default for Porcelain commands
The underlying pack-objects plumbing command still needs an explicit
option from the command line, but these days Porcelain passes the
option, so there is no need for end users to worry about it anymore.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-02 23:08:13 -07:00
Junio C Hamano
17a0299807 Merge branch 'maint'
* maint:
  contrib/thunderbird-patch-inline: do not require bash to run the script
  t8001: check the exit status of the command being tested
  strbuf.h: remove a tad stale docs-in-comment and reference api-doc instead
  Typos: t/README
  Documentation/config.txt: make truth value of numbers more explicit
  git-pack-objects.txt: fix grammatical errors
  parse-remote: replace unnecessary sed invocation
2011-03-30 14:10:41 -07:00
Stephen Boyd
2f8ee02c49 git-pack-objects.txt: fix grammatical errors
Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-03-30 11:43:15 -07:00
Jeff King
48bb914ed6 doc: drop author/documentation sections from most pages
The point of these sections is generally to:

  1. Give credit where it is due.

  2. Give the reader an idea of where to ask questions or
     file bug reports.

But they don't do a good job of either case. For (1), they
are out of date and incomplete. A much more accurate answer
can be gotten through shortlog or blame.  For (2), the
correct contact point is generally git@vger, and even if you
wanted to cc the contact point, the out-of-date and
incomplete fields mean you're likely sending to somebody
useless.

So let's drop the fields entirely from all manpages except
git(1) itself. We already point people to the mailing list
for bug reports there, and we can update the Authors section
to give credit to the major contributors and point to
shortlog and blame for more information.

Each page has a "This is part of git" footer, so people can
follow that to the main git manpage.
2011-03-11 10:59:16 -05:00
Štěpán Němec
0adda9362a Use parentheses and `...' where appropriate
Remove some stray usage of other bracket types and asterisks for the
same purpose.

Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:31:07 -07:00
Štěpán Němec
62b4698e55 Use angles for placeholders consistently
Signed-off-by: Štěpán Němec <stepnem@gmail.com>
Acked-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-10-08 12:29:52 -07:00
Nelson Elhage
18879bc526 pack-objects documentation: Fix --honor-pack-keep as well.
Signed-off-by: Nelson Elhage <nelhage@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-24 19:10:22 -08:00
Junio C Hamano
3909f14f62 pack-objects documentation: reword "objects that appear in the standard input"
These were written back when we always read objects from the standard
input.  These days --revs and its friends can feed only the start and
end points and have the command internally enumerate the objects.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-24 15:41:27 -08:00
Nelson Elhage
21da426214 Documentation: pack-objects: Clarify --local's semantics.
The current documentation suggests that --local also ignores any
objects in local packs, which is incorrect. Change the language to be
clearer and more parallel to the other options that ignore objects.

While we're at it, fix a trivial error in --incremental's
documentation.

Signed-off-by: Nelson Elhage <nelhage@mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-20 09:24:19 -08:00
Stephen Boyd
738820a913 Documentation: describe --thin more accurately
The description for --thin was misleading and downright wrong. Correct
it with some inspiration from the description of index-pack's --fix-thin
and some background information from Nicolas Pitre <nico@fluxnic.net>.

Signed-off-by: Stephen Boyd <bebarino@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-18 17:13:18 -08:00
Jeff King
cc1b8d8bc6 docs: don't talk about $GIT_DIR/refs/ everywhere
It is misleading to say that we pull refs from $GIT_DIR/refs/*, because we
may also consult the packed refs mechanism. These days we tend to treat
the "refs hierarchy" as more of an abstract namespace that happens to be
represented as $GIT_DIR/refs. At best, this is a minor inaccuracy, but at
worst it can confuse users who then look in $GIT_DIR/refs and find that it
is missing some of the refs they expected to see.

This patch drops most uses of "$GIT_DIR/refs/*", changing them into just
"refs/*", under the assumption that users can handle the concept of an
abstract refs namespace. There are a few things to note:

  - most cases just dropped the $GIT_DIR/ portion. But for cases where
    that left _just_ the word "refs", I changed it to "refs/" to help
    indicate that it was a hierarchy.  I didn't do the same for longer
    paths (e.g., "refs/heads" remained, instead of becoming
    "refs/heads/").

  - in some cases, no change was made, as the text was explicitly about
    unpacked refs (e.g., the discussion in git-pack-refs).

  - In some cases it made sense instead to note the existence of packed
    refs (e.g., in check-ref-format and rev-parse).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-17 21:40:09 -08:00
Nicolas Pitre
07cf0f2407 make --max-pack-size argument to 'git pack-object' count in bytes
The value passed to --max-pack-size used to count in MiB which was
inconsistent with the corresponding configuration variable as well as
other command arguments which are defined to count in bytes with an
optional unit suffix.  This brings --max-pack-size in line with the
rest of Git.

Also, in order not to cause havoc with people used to the previous
megabyte scale, and because this is a sane thing to do anyway, a
minimum size of 1 MiB is enforced to avoid an explosion of pack files.

Adjust and extend test suite accordingly.

Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-02-03 20:39:56 -08:00
Thomas Rast
0b444cdb19 Documentation: spell 'git cmd' without dash throughout
The documentation was quite inconsistent when spelling 'git cmd' if it
only refers to the program, not to some specific invocation syntax:
both 'git-cmd' and 'git cmd' spellings exist.

The current trend goes towards dashless forms, and there is precedent
in 647ac70 (git-svn.txt: stop using dash-form of commands.,
2009-07-07) to actively eliminate the dashed variants.

Replace 'git-cmd' with 'git cmd' throughout, except where git-shell,
git-cvsserver, git-upload-pack, git-receive-pack, and
git-upload-archive are concerned, because those really live in the
$PATH.
2010-01-10 13:01:28 +01:00
Nicolas Pitre
4f36627518 pack-objects: split implications of --all-progress from progress activation
Currently the --all-progress flag is used to use force progress display
during the writing object phase even if output goes to stdout which is
primarily the case during a push operation.  This has the unfortunate
side effect of forcing progress display even if stderr is not a
terminal.

Let's introduce the --all-progress-implied argument which has the same
intent except for actually forcing the activation of any progress
display.  With this, progress display will be automatically inhibited
whenever stderr is not a terminal, or full progress display will be
included otherwise.  This should let people use 'git push' within a cron
job without filling their logs with useless percentage displays.

Signed-off-by: Nicolas Pitre <nico@fluxnic.net>
Tested-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-11-23 21:33:09 -08:00
Johannes Schindelin
7f3140cd23 git repack: keep commits hidden by a graft
When you have grafts that pretend that a given commit has different
parents than the ones recorded in the commit object, it is dangerous
to let 'git repack' remove those hidden parents, as you can easily
remove the graft and end up with a broken repository.

So let's play it safe and keep those parent objects and everything
that is reachable by them, in addition to the grafted parents.

As this behavior can only be triggered by git pack-objects, and as that
command handles duplicate parents gracefully, we do not bother to cull
duplicated parents that may result by using both true and grafted
parents.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2009-07-24 09:10:16 -07:00
Brandon Casey
daae062595 pack-objects: extend --local to mean ignore non-local loose objects too
With this patch, --local means pack only local objects that are not already
packed.

Additionally, this fixes t7700 testing whether loose objects in an alternate
object database are repacked.

Signed-off-by: Brandon Casey <drafnel@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 10:29:22 -08:00
Brandon Casey
e96fb9b8f9 pack-objects: new option --honor-pack-keep
This adds a new option to pack-objects which will cause it to ignore an
object which appears in a local pack which has a .keep file, even if it
was specified for packing.

This option will be used by the porcelain repack.

Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-11-12 10:28:10 -08:00
Jonathan Nieder
ba020ef5eb manpages: italicize git command names (which were in teletype font)
The names of git commands are not meant to be entered at the
commandline; they are just names. So we render them in italics,
as is usual for command names in manpages.

Using

	doit () {
	  perl -e 'for (<>) { s/\`(git-[^\`.]*)\`/'\''\1'\''/g; print }'
	}
	for i in git*.txt config.txt diff*.txt blame*.txt fetch*.txt i18n.txt \
	        merge*.txt pretty*.txt pull*.txt rev*.txt urls*.txt
	do
	  doit <"$i" >"$i+" && mv "$i+" "$i"
	done
	git diff

.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-05 11:24:40 -07:00
Jonathan Nieder
483bc4f045 Documentation formatting and cleanup
Following what appears to be the predominant style, format
names of commands and commandlines both as `teletype text`.

While we're at it, add articles ("a" and "the") in some
places, italicize the name of the command in the manual page
synopsis line, and add a comma or two where it seems appropriate.

Signed-off-by: Jonathan Nieder <jrnieder@uchicago.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2008-07-01 17:20:16 -07:00