git/Documentation/git-multi-pack-index.txt
Taylor Blau fcb2205b77 midx: implement support for writing incremental MIDX chains
Now that the rest of the MIDX subsystem and relevant callers have been
updated to learn about how to read and process incremental MIDX chains,
let's finally update the implementation in `write_midx_internal()` to be
able to write incremental MIDX chains.

This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:

    $ git multi-pack-index write --incremental

The implementation for doing so is relatively straightforward, and boils
down to a handful of different kinds of changes implemented in this
patch:

  - The `compute_sorted_entries()` function is taught to reject objects
    which appear in any existing MIDX layer.

  - Functions like `write_midx_revindex()` are adjusted to write
    pack_order values which are offset by the number of objects in the
    base MIDX layer.

  - The end of `write_midx_internal()` is adjusted to move
    non-incremental MIDX files when necessary (i.e. when creating an
    incremental chain with an existing non-incremental MIDX in the
    repository).

There are a handful of other changes that are introduced, like new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).

The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:

  1. Most of the "interesting" behavior is already thoroughly covered in
     t5319-multi-pack-index.sh, which handles the core logic of reading
     objects through a MIDX.

     The new tests in t5334-incremental-multi-pack-index.sh are mostly
     focused on creating and destroying incremental MIDXs, as well as
     stitching their results together across layers.

  2. A new GIT_TEST environment variable is added called
     "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
     entire test suite to write incremental MIDXs after repacking when
     combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.

     This exercises the long tail of other interesting behavior that is
     defined implicitly throughout the rest of the CI suite. It is
     likewise added to the linux-TEST-vars job.

Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-08-06 12:01:39 -07:00


git-multi-pack-index(1)
=======================

NAME
----
git-multi-pack-index - Write and verify multi-pack-indexes

SYNOPSIS
--------
[verse]
'git multi-pack-index' [--object-dir=<dir>] [--[no-]bitmap] <sub-command>

DESCRIPTION
-----------
Write or verify a multi-pack-index (MIDX) file.

OPTIONS
-------
--object-dir=<dir>::
Use given directory for the location of Git objects. We check
`<dir>/pack/multi-pack-index` for the current MIDX file, and
`<dir>/pack` for the pack-files to index.
+
`<dir>` must be an alternate of the current repository.
--[no-]progress::
Turn progress on/off explicitly. If neither is specified, progress is
shown if standard error is connected to a terminal. Supported by
sub-commands `write`, `verify`, `expire`, and `repack`.

The following sub-commands are available:

write::
Write a new MIDX file. The following options are available for
the `write` sub-command:
+
--
--preferred-pack=<pack>::
Optionally specify the tie-breaking pack used when
multiple packs contain the same object. `<pack>` must
contain at least one object. If not given, ties are
broken in favor of the pack with the lowest mtime.
--[no-]bitmap::
Control whether or not a multi-pack bitmap is written.
--stdin-packs::
Write a multi-pack index containing only the set of
line-delimited pack index basenames provided over stdin.
--refs-snapshot=<path>::
With `--bitmap`, optionally specify a file which
contains a "refs snapshot" taken prior to repacking.
+
A reference snapshot is composed of line-delimited OIDs corresponding to
the reference tips, usually taken by `git repack` prior to generating a
new pack. A line may optionally start with a `+` character to indicate
that the reference which corresponds to that OID is "preferred" (see
`pack.preferBitmapTips` in linkgit:git-config[1]).
+
The file given at `<path>` is expected to be readable, and can contain
duplicates. If an OID appears more than once, it is marked as
preferred if at least one instance of it begins with the special `+`
marker.
--incremental::
Write an incremental MIDX file containing only objects
and packs not present in an existing MIDX layer.
Migrates non-incremental MIDXs to incremental ones when
necessary. Incompatible with `--bitmap`.
--
verify::
Verify the contents of the MIDX file.
expire::
Delete the pack-files that are tracked by the MIDX file, but
have no objects referenced by the MIDX (with the exception of
`.keep` packs and cruft packs). Rewrite the MIDX file afterward
to remove all references to these pack-files.
+
NOTE: this mode is incompatible with incremental MIDX files.
repack::
Create a new pack-file containing objects in small pack-files
referenced by the multi-pack-index. If the size given by the
`--batch-size=<size>` argument is zero, then create a pack
containing all objects referenced by the multi-pack-index. For
a non-zero batch size, select the pack-files by examining packs
from oldest to newest. The "expected size" of a pack is the
number of its objects referenced by the multi-pack-index,
divided by the total number of objects in the pack and
multiplied by the pack size. Packs with expected size below the
batch size are selected until the selected packs have a total
expected size of at least the batch size, or all pack-files
have been considered. If only one pack-file is selected, then do
nothing. If a new pack-file is created, rewrite the
multi-pack-index to reference the new pack-file. A later run of
'git multi-pack-index expire' will delete the pack-files that
were part of this batch.
+
If `repack.packKeptObjects` is `false`, then any pack-files with an
associated `.keep` file will not be selected for the batch to repack.
+
NOTE: this mode is incompatible with incremental MIDX files.
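+
The pack selection that `repack` performs for a non-zero batch size can
be sketched with shell arithmetic. This only illustrates the description
above and is not the actual implementation; the pack names, sizes,
object counts, and batch size are made-up values.
+
-----------------------------------------------
batch_size=500000
total=0
selected=
# One hypothetical pack per line, oldest first:
# name, pack size in bytes, objects in pack, objects referenced by the MIDX
while read -r name pack_size objects_in_pack objects_in_midx
do
	# stop once the selected packs reach the batch size
	if test "$total" -ge "$batch_size"
	then
		break
	fi
	# multiply before dividing so integer division keeps precision
	expected=$(( pack_size * objects_in_midx / objects_in_pack ))
	# packs at or above the batch size are never selected
	if test "$expected" -lt "$batch_size"
	then
		selected="$selected $name"
		total=$(( total + expected ))
	fi
done <<EOF
old.pack 1000000 4000 1000
mid.pack 800000 1000 1000
new.pack 300000 3000 3000
EOF
echo "selected:$selected (total expected size: $total)"
-----------------------------------------------
+
With these numbers, `old.pack` (expected size 250000) and `new.pack`
(expected size 300000) are selected, while `mid.pack` (expected size
800000) is skipped because it is not below the batch size.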

EXAMPLES
--------
* Write a MIDX file for the packfiles in the current `.git` directory.
+
-----------------------------------------------
$ git multi-pack-index write
-----------------------------------------------
* Write a MIDX file for the packfiles in the current `.git` directory with a
corresponding bitmap.
+
-------------------------------------------------------------
$ git multi-pack-index write --preferred-pack=<pack> --bitmap
-------------------------------------------------------------
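* Write a MIDX file with a bitmap, passing a refs snapshot that marks the
current `HEAD` as preferred. This is a minimal sketch, not a recommended
workflow: it builds a throwaway repository so that it can run on its own,
and the file name `refs.snapshot`, the committer identity, and the file
contents are arbitrary choices.
+
-----------------------------------------------
set -e
# throwaway repository with one commit stored in one pack
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello >file
git add file
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m initial
git repack -a -d -q

# one OID per line for every reference tip
git for-each-ref --format='%(objectname)' >refs.snapshot
# a leading `+` marks the OID as preferred; duplicates are allowed
echo "+$(git rev-parse HEAD)" >>refs.snapshot
git multi-pack-index write --bitmap --refs-snapshot=refs.snapshot
-----------------------------------------------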
* Write a MIDX file for the packfiles in an alternate object store.
+
-----------------------------------------------
$ git multi-pack-index --object-dir <alt> write
-----------------------------------------------
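* Write an incremental MIDX chain, adding one layer per invocation. This
is a minimal sketch that assumes a Git version supporting `--incremental`.
It builds a throwaway repository so that it can run on its own; the
committer identity and file names are arbitrary. The chain file path
checked at the end is taken from the MIDX design document, not from this
page.
+
-----------------------------------------------
set -e
# throwaway repository with one commit stored in one pack
repo=$(mktemp -d)
cd "$repo"
git init -q
echo hello >file
git add file
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m initial
git repack -a -d -q
git multi-pack-index write --incremental

# pack a second commit separately, then add a layer for the new pack
echo more >file2
git add file2
git -c user.name=Example -c user.email=example@example.com \
	commit -q -m second
git repack -d -q
git multi-pack-index write --incremental

# the chain file lists the layers, one per line
cat .git/objects/pack/multi-pack-index.d/multi-pack-index-chain
-----------------------------------------------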
* Verify the MIDX file for the packfiles in the current `.git` directory.
+
-----------------------------------------------
$ git multi-pack-index verify
-----------------------------------------------

SEE ALSO
--------
See link:technical/multi-pack-index.html[The Multi-Pack-Index Design
Document] and linkgit:gitformat-pack[5] for more information on the
multi-pack-index feature and its file format.

GIT
---
Part of the linkgit:git[1] suite