New interface into the tmp-objdir API to help in-core use of the
quarantine feature.
* ns/tmp-objdir:
tmp-objdir: disable ref updates when replacing the primary odb
tmp-objdir: new API for creating temporary writable databases
When creating a subprocess with a temporary ODB, we set the
GIT_QUARANTINE_ENVIRONMENT env var to tell child Git processes not
to update refs, since the tmp-objdir may go away.
Introduce a similar mechanism for in-process temporary ODBs when
we call tmp_objdir_replace_primary_odb. Now both mechanisms set
the disable_ref_updates flag on the odb, which is queried by
the ref_transaction_prepare function.
Peff's test case [1] was invoking ref updates via the cachetextconv
setting. That particular code silently does nothing when a ref
update is forbidden. See the call to notes_cache_put in
fill_textconv where errors are ignored.
[1] https://lore.kernel.org/git/YVOn3hDsb5pnxR53@coredump.intra.peff.net/
Reported-by: Jeff King <peff@peff.net>
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
remote.c does not works with non-the_repository because it stores its
state as static variables. To support non-the_repository, we can use a
per-repository struct for the remotes subsystem.
Prepare for this change by defining a struct remote_state that holds
the remotes subsystem state and move the static variables of remote.c
into the_repository->remote_state.
This introduces no behavioral or API changes.
Signed-off-by: Glen Choo <chooglen@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code cleanup.
* ab/repo-settings-cleanup:
repository.h: don't use a mix of int and bitfields
repo-settings.c: simplify the setup
read-cache & fetch-negotiator: check "enum" values in switch()
environment.c: remove test-specific "ignore_untracked..." variable
wrapper.c: add x{un,}setenv(), and use xsetenv() in environment.c
Change the bitfield added in 58300f4743 (sparse-index: add
index.sparse config option, 2021-03-30) and 3964fc2aae (sparse-index:
add guard to ensure full index, 2021-03-30) to just use an "int"
boolean instead.
It might be smart to optimize the space here in the future, but by
consistently using an "int" we can take its address and pass it to
repo_cfg_bool(), and therefore don't need to handle "sparse_index" as
a special-case when reading the "index.sparse" setting.
There's no corresponding config for "command_requires_full_index", but
let's change it too for consistency and to prevent future bugs
creeping in due to one of these being "unsigned".
Using "int" consistently also prevents subtle bugs or undesired
control flow creeping in here. Before the preceding commit the
initialization of "command_requires_full_index" in
prepare_repo_settings() did nothing, i.e. this:
r->settings.command_requires_full_index = 1
Was redundant to the earlier memset() to -1. Likewise for
"sparse_index" added in 58300f4743 (sparse-index: add index.sparse
config option, 2021-03-30) the code and comment added there was
misleading, we weren't initializing it to off, but re-initializing it
from "1" to "0", and then finally checking the config, and perhaps
setting it to "1" again. I.e. we could have applied this patch before
the preceding commit:
+ assert(r->settings.command_requires_full_index == 1);
r->settings.command_requires_full_index = 1;
/*
* Initialize this as off.
*/
+ assert(r->settings.sparse_index == 1);
r->settings.sparse_index = 0;
if (!repo_config_get_bool(r, "index.sparse", &value) && value)
r->settings.sparse_index = 1;
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the setup code in repo-settings.c in various ways, making the
code shorter, easier to read, and requiring fewer hacks to do the same
thing as it did before:
Since 7211b9e753 (repo-settings: consolidate some config settings,
2019-08-13) we have memset() the whole "settings" structure to -1 in
prepare_repo_settings(), and subsequently relied on the -1 value.
Most of the fields did not need to be initialized to -1, and because
we were doing that we had the enum labels "UNTRACKED_CACHE_UNSET" and
"FETCH_NEGOTIATION_UNSET" purely to reflect the resulting state
created this memset() in prepare_repo_settings(). No other code used
or relied on them, more on that below.
For the rest most of the subsequent "are we -1, then read xyz" can
simply be removed by re-arranging what we read first. E.g. when
setting the "index.version" setting we should have first read
"feature.experimental", so that it (and "feature.manyfiles") can
provide a default for our "index.version".
Instead the code setting it, added when "feature.manyFiles"[1] was
created, was using the UPDATE_DEFAULT_BOOL() macro added in an earlier
commit[2]. That macro is now gone, since it was only needed for this
pattern of reading things in the wrong order.
This also fixes an (admittedly obscure) logic error where we'd
conflate an explicit "-1" value in the config with our own earlier
memset() -1.
We can also remove the UPDATE_DEFAULT_BOOL() wrapper added in
[3]. Using it is redundant to simply using the return value from
repo_config_get_bool(), which is non-zero if the provided key exists
in the config.
Details on edge cases relating to the memset() to -1, continued from
"more on that below" above:
* UNTRACKED_CACHE_KEEP:
In [4] the "unset" and "keep" handling for core.untrackedCache was
consolidated. But it while we understand the "keep" value, we don't
handle it differently than the case of any other unknown value.
So let's retain UNTRACKED_CACHE_KEEP and remove the
UNTRACKED_CACHE_UNSET setting (which was always implicitly
UNTRACKED_CACHE_KEEP before). We don't need to inform any code
after prepare_repo_settings() that the setting was "unset", as far
as anyone else is concerned it's core.untrackedCache=keep. if
"core.untrackedcache" isn't present in the config.
* FETCH_NEGOTIATION_UNSET & FETCH_NEGOTIATION_NONE:
Since these two two enum fields added in [5] don't rely on the
memzero() setting them to "-1" anymore we don't have to provide
them with explicit values.
1. c6cc4c5afd (repo-settings: create feature.manyFiles setting,
2019-08-13)
2. 31b1de6a09 (commit-graph: turn on commit-graph by default,
2019-08-13)
3. 31b1de6a09 (commit-graph: turn on commit-graph by default,
2019-08-13)
4. ad0fb65999 (repo-settings: parse core.untrackedCache,
2019-08-13)
5. aaf633c2ad (repo-settings: create feature.experimental setting,
2019-08-13)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In preparation for a subsequent commit that migrates code using
add_submodule_odb() to repo_submodule_init(), teach
repo_submodule_init() to support submodules with unabsorbed gitdirs.
(See the documentation for "git submodule absorbgitdirs" for more
information about absorbed and unabsorbed gitdirs.)
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of using global variables to store promisor remote information,
store this config in struct repository instead, and add
repository-agnostic non-static functions corresponding to the existing
non-static functions that only work on the_repository.
The actual lazy-fetching of missing objects currently does not work on
repositories other than the_repository, and will still not work after
this commit, so add a BUG message explaining this. A subsequent commit
will remove this limitation.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move repository_format_partial_clone, which is currently a global
variable, into struct repository. (Full support for per-repository
partial clone config will be done in a subsequent commit - this is split
into its own commit because of the extent of the changes needed.)
The new repo-specific variable cannot be set in
check_repository_format_gently() (as is currently), because that
function does not know which repo it is operating on (or even whether
the value is important); therefore this responsibility is delegated to
the outermost caller that knows. Of all the outermost callers that know
(found by looking at all functions that call clear_repository_format()),
I looked at those that either read from the main Git directory or write
into a struct repository. These callers have been modified accordingly
(write to the_repository in the former case and write to the given
struct repository in the latter case).
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reviewed-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When enabled, this config option signals that index writes should
attempt to use sparse-directory entries.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Upcoming changes will introduce modifications to the index format that
allow sparse directories. It will be useful to have a mechanism for
converting those sparse index files into full indexes by walking the
tree at those sparse directories. Name this method ensure_full_index()
as it will guarantee that the index is fully expanded.
This method is not implemented yet, and instead we focus on the
scaffolding to declare it and call it at the appropriate time.
Add a 'command_requires_full_index' member to struct repo_settings. This
will be an indicator that we need the index in full mode to do certain
index operations. This starts as being true for every command, then we
will set it to false as some commands integrate with sparse indexes.
If 'command_requires_full_index' is true, then we will immediately
expand a sparse index to a full one upon reading from disk. This
suffices for now, but we will want to add more callers to
ensure_full_index() later.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git commit-graph write" learned to limit the number of bloom
filters that are computed from scratch with the --max-new-filters
option.
* tb/bloom-improvements:
commit-graph: introduce 'commitGraph.maxNewFilters'
builtin/commit-graph.c: introduce '--max-new-filters=<n>'
commit-graph: rename 'split_commit_graph_opts'
bloom: encode out-of-bounds filters as non-empty
bloom/diff: properly short-circuit on max_changes
bloom: use provided 'struct bloom_filter_settings'
bloom: split 'get_bloom_filter()' in two
commit-graph.c: store maximum changed paths
commit-graph: respect 'commitGraph.readChangedPaths'
t/helper/test-read-graph.c: prepare repo settings
commit-graph: pass a 'struct repository *' in more places
t4216: use an '&&'-chain
commit-graph: introduce 'get_bloom_filter_settings()'
The core.multiPackIndex setting has been around since c4d25228eb
(config: create core.multiPackIndex setting, 2018-07-12), but has been
disabled by default. If a user wishes to use the multi-pack-index
feature, then they must enable this config and run 'git multi-pack-index
write'.
The multi-pack-index feature is relatively stable now, so make the
config option true by default. For users that do not use a
multi-pack-index, the only extra cost will be a file lookup to see if a
multi-pack-index file exists (once per process, per object directory).
Also, this config option will be referenced by an upcoming
"incremental-repack" task in the maintenance builtin, so move the config
option into the repository settings struct. Note that if
GIT_TEST_MULTI_PACK_INDEX=1, then we want to ignore the config option
and treat core.multiPackIndex as enabled.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git uses the 'core.commitGraph' configuration value to control whether
or not the commit graph is used when parsing commits or performing a
traversal.
Now that commit-graphs can also contain a section for changed-path Bloom
filters, administrators that already have commit-graphs may find it
convenient to use those graphs without relying on their changed-path
Bloom filters. This can happen, for example, during a staged roll-out,
or in the event of an incident.
Introduce 'commitGraph.readChangedPaths' to control whether or not Bloom
filters are read. Note that this configuration is independent from both:
- 'core.commitGraph', to allow flexibility in using all parts of a
commit-graph _except_ for its Bloom filters.
- The '--changed-paths' option for 'git commit-graph write', to allow
reading and writing Bloom filters to be controlled independently.
When the variable is set, pretend as if no Bloom data was specified at
all. This avoids adding additional special-casing outside of the
commit-graph internals.
Suggested-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a noop fetch negotiator. This is introduced to allow partial clones
to skip the unneeded negotiation step when fetching missing objects
using a "git fetch" subprocess. (The implementation of spawning a "git
fetch" subprocess will be done in a subsequent patch.) But this can also
be useful for end users, e.g. as a blunt fix for object corruption.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In version 1 of repository format, "extensions" gained special meaning
and it is safer to avoid upgrading when there are pre-existing
extensions.
Make list-objects-filter to use the helper function instead of setting
repository version directly as a prerequisite of exposing the upgrade
capability.
Signed-off-by: Xin Li <delphij@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We've left the command line parsing of "git log :/a/b/" broken for
about a full year without anybody noticing, which has been
corrected.
* jc/missing-ref-store-fix:
repository: mark the "refs" pointer as private
sha1-name: do not assume that the ref store is initialized
The "refs" pointer in a struct repository starts life as NULL, but then
is lazily initialized when it is accessed via get_main_ref_store().
However, it's easy for calling code to forget this and access it
directly, leading to code which works _some_ of the time, but fails if
it is called before anybody else accesses the refs.
This was the cause of the bug fixed by 5ff4b920eb (sha1-name: do not
assume that the ref store is initialized, 2020-04-09). In order to
prevent similar bugs, let's more clearly mark the "refs" field as
private.
In addition to helping future code, the name change will help us audit
any existing direct uses. Besides get_main_ref_store() itself, it turns
out there is only one. But we know it's OK as it is on the line directly
after the fix from 5ff4b920eb, which will have initialized the pointer.
However it's still a good idea for it to model the proper use of the
accessing function, so we'll convert it.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "upload-pack" (the counterpart of "git fetch") needs to disable
commit-graph when responding to a shallow clone/fetch request, but
the way this was done made Git panic, which has been corrected.
* jk/disable-commit-graph-during-upload-pack:
upload-pack: disable commit graph more gently for shallow traversal
commit-graph: bump DIE_ON_LOAD check to actual load-time
When the client has asked for certain shallow options like
"deepen-since", we do a custom rev-list walk that pretends to be
shallow. Before doing so, we have to disable the commit-graph, since it
is not compatible with the shallow view of the repository. That's
handled by 829a321569 (commit-graph: close_commit_graph before shallow
walk, 2018-08-20). That commit literally closes and frees our
repo->objects->commit_graph struct.
That creates an interesting problem for commits that have _already_ been
parsed using the commit graph. Their commit->object.parsed flag is set,
their commit->graph_pos is set, but their commit->maybe_tree may still
be NULL. When somebody later calls repo_get_commit_tree(), we see that
we haven't loaded the tree oid yet and try to get it from the commit
graph. But since it has been freed, we segfault!
So the root of the issue is a data dependency between the commit's
lazy-load of the tree oid and the fact that the commit graph can go
away mid-process. How can we resolve it?
There are a couple of general approaches:
1. The obvious answer is to avoid loading the tree from the graph when
we see that it's NULL. But then what do we return for the tree oid?
If we return NULL, our caller in do_traverse() will rightly
complain that we have no tree. We'd have to fallback to loading the
actual commit object and re-parsing it. That requires teaching
parse_commit_buffer() to understand re-parsing (i.e., not starting
from a clean slate and not leaking any allocated bits like parent
list pointers).
2. When we close the commit graph, walk through the set of in-memory
objects and clear any graph_pos pointers. But this means we also
have to "unparse" any such commits so that we know they still need
to open the commit object to fill in their trees. So it's no less
complicated than (1), and is more expensive (since we clear objects
we might not later need).
3. Stop freeing the commit-graph struct. Continue to let it be used
for lazy-loads of tree oids, but let upload-pack specify that it
shouldn't be used for further commit parsing.
4. Push the whole shallow rev-list out to its own sub-process, with
the commit-graph disabled from the start, giving it a clean memory
space to work from.
I've chosen (3) here. Options (1) and (2) would work, but are
non-trivial to implement. Option (4) is more expensive, and I'm not sure
how complicated it is (shelling out for the actual rev-list part is
easy, but we do then parse the resulting commits internally, and I'm not
clear which parts need to be handling shallow-ness).
The new test in t5500 triggers this segfault, but see the comments there
for how horribly intimate it has to be with how both upload-pack and
commit graphs work.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The commit-graph feature is now on by default, and is being
written during 'git gc' by default. Typically, Git only writes
a commit-graph when a 'git gc --auto' command passes the gc.auto
setting to actualy do work. This means that a commit-graph will
typically fall behind the commits that are being used every day.
To stay updated with the latest commits, add a step to 'git
fetch' to write a commit-graph after fetching new objects. The
fetch.writeCommitGraph config setting enables writing a split
commit-graph, so on average the cost of writing this file is
very small. Occasionally, the commit-graph chain will collapse
to a single level, and this could be slow for very large repos.
For additional use, adjust the default to be true when
feature.experimental is enabled.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The 'feature.experimental' setting includes config options that are
not committed to become defaults, but could use additional testing.
Update the following config settings to take new defaults, and to
use the repo_settings struct if not already using them:
* 'pack.useSparse=true'
* 'fetch.negotiationAlgorithm=skipping'
In the case of fetch.negotiationAlgorithm, the existing logic
would load the config option only when about to use the setting,
so had a die() statement on an unknown string value. This is
removed as now the config is parsed under prepare_repo_settings().
In general, this die() is probably misplaced and not valuable.
A test was removed that checked this die() statement executed.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The core.untrackedCache config setting is slightly complicated,
so clarify its use and centralize its parsing into the repo
settings.
The default value is "keep" (returned as -1), which persists the
untracked cache if it exists.
If the value is set as "false" (returned as 0), then remove the
untracked cache if it exists.
If the value is set as "true" (returned as 1), then write the
untracked cache and persist it.
Instead of relying on magic values of -1, 0, and 1, split these
options into an enum. This allows the use of "-1" as a
default value. After parsing the config options, if the value is
unset we can initialize it to UNTRACKED_CACHE_KEEP.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There are a few important config settings that are not loaded
during git_default_config. These are instead loaded on-demand.
Centralize these config options to a single scan, and store
all of the values in a repo_settings struct. The values for
each setting are initialized as negative to indicate "unset".
This centralization will be particularly important in a later
change to introduce "meta" config settings that change the
defaults for these config settings.
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new unified tracing facility for git. The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.
In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.
Trace2 defines 3 output targets. These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).
* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
summary data.
* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
It extends the output with columns for the command process, thread,
repo, absolute and relative elapsed times. It reports events for
child process start/stop, thread start/stop, and per-thread function
nesting.
* GIT_TR2_EVENT is a new structured format. It writes event data as a
series of JSON records.
Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The assumption to work on the single "in-core index" instance has
been reduced from the library-ish part of the codebase.
* nd/the-index-final:
cache.h: flip NO_THE_INDEX_COMPATIBILITY_MACROS switch
read-cache.c: remove the_* from index_has_changes()
merge-recursive.c: remove implicit dependency on the_repository
merge-recursive.c: remove implicit dependency on the_index
sha1-name.c: remove implicit dependency on the_index
read-cache.c: replace update_index_if_able with repo_&
read-cache.c: kill read_index()
checkout: avoid the_index when possible
repository.c: replace hold_locked_index() with repo_hold_locked_index()
notes-utils.c: remove the_repository references
grep: use grep_opt->repo instead of explict repo argument
read_index() shares the same problem as hold_locked_index(): it
assumes $GIT_DIR/index. Move all call sites to repo_read_index()
instead. read_index_preload() and read_index_unmerged() are also
killed as a consequence.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hold_locked_index() assumes the index path at $GIT_DIR/index. This is
not good for places that take an arbitrary index_state instead of
the_index, which is basically everywhere except builtin/.
Replace it with repo_hold_locked_index(). hold_locked_index() remains
as a wrapper around repo_hold_locked_index() to reduce changes in builtin/
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When constructing a struct repository for a submodule for some revision
of the superproject where the submodule is not contained in the index,
it may not be present in the working tree currently either. In that
situation giving a 'path' argument is not useful. Upgrade the
repo_submodule_init function to take a struct submodule instead.
The submodule struct can be obtained via submodule_from_{path, name} or
an artificial submodule struct can be passed in.
While we are at it, rename the repository struct in the repo_submodule_init
function, which is to be initialized, to a name that is not confused with
the struct submodule as easily. Perform such renames in similar functions
as well.
Also move its documentation into the header file.
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code hygiene improvement for the header files.
* en/incl-forward-decl:
Remove forward declaration of an enum
compat/precompose_utf8.h: use more common include guard style
urlmatch.h: fix include guard
Move definition of enum branch_track from cache.h to branch.h
alloc: make allocate_alloc_state and clear_alloc_state more consistent
Add missing includes and forward declarations
I looped over the toplevel header files, creating a temporary two-line C
program for each consisting of
#include "git-compat-util.h"
#include $HEADER
This patch is the result of manually fixing errors in compiling those
tiny programs.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-grafts:
commit: allow lookup_commit_graft to handle arbitrary repositories
commit: allow prepare_commit_graft to handle arbitrary repositories
shallow: migrate shallow information into the object parser
path.c: migrate global git_path_* to take a repository argument
cache: convert get_graft_file to handle arbitrary repositories
commit: convert read_graft_file to handle arbitrary repositories
commit: convert register_commit_graft to handle arbitrary repositories
commit: convert commit_graft_pos() to handle arbitrary repositories
shallow: add repository argument to is_repository_shallow
shallow: add repository argument to check_shallow_file_for_update
shallow: add repository argument to register_shallow
shallow: add repository argument to set_alternate_shallow_file
commit: add repository argument to lookup_commit_graft
commit: add repository argument to prepare_commit_graft
commit: add repository argument to read_graft_file
commit: add repository argument to register_commit_graft
commit: add repository argument to commit_graft_pos
object: move grafts to object parser
object-store: move object access functions to object-store.h
The conversion to pass "the_repository" and then "a_repository"
throughout the object access API continues.
* sb/object-store-alloc:
alloc: allow arbitrary repositories for alloc functions
object: allow create_object to handle arbitrary repositories
object: allow grow_object_hash to handle arbitrary repositories
alloc: add repository argument to alloc_commit_index
alloc: add repository argument to alloc_report
alloc: add repository argument to alloc_object_node
alloc: add repository argument to alloc_tag_node
alloc: add repository argument to alloc_commit_node
alloc: add repository argument to alloc_tree_node
alloc: add repository argument to alloc_blob_node
object: add repository argument to grow_object_hash
object: add repository argument to create_object
repository: introduce parsed objects field
Migrate all git_path_* functions that are defined in path.c to take a
repository argument. Unlike other patches in this series, do not use the
#define trick, as we rewrite the whole function, which is rather small.
This doesn't migrate all the functions, as other builtins have their own
local path functions defined using GIT_PATH_FUNC. So keep that macro
around to serve the other locations.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Convert the existing global cache for parsed objects (obj_hash) into
repository-specific parsed object caches. Existing code that uses
obj_hash are modified to use the parsed object cache of
the_repository; future patches will use the parsed object caches of
other repositories.
Another future use case for a pool of objects is ease of memory management
in revision walking: If we can free the rev-list related memory early in
pack-objects (e.g. part of repack operation) then it could lower memory
pressure significantly when running on large repos. While this has been
discussed on the mailing list lately, this series doesn't implement this.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The effort to pass the repository in-core structure throughout the
API continues. This round deals with the code that implements the
refs/replace/ mechanism.
* sb/object-store-replace:
replace-object: allow lookup_replace_object to handle arbitrary repositories
replace-object: allow do_lookup_replace_object to handle arbitrary repositories
replace-object: allow prepare_replace_object to handle arbitrary repositories
refs: allow for_each_replace_ref to handle arbitrary repositories
refs: store the main ref store inside the repository struct
replace-object: add repository argument to lookup_replace_object
replace-object: add repository argument to do_lookup_replace_object
replace-object: add repository argument to prepare_replace_object
refs: add repository argument to for_each_replace_ref
refs: add repository argument to get_main_ref_store
replace-object: check_replace_refs is safe in multi repo environment
replace-object: eliminate replace objects prepared flag
object-store: move lookup_replace_object to replace-object.h
replace-object: move replace_map to object store
replace_object: use oidmap
This moves the 'main_ref_store', which was a global variable in refs.c
into the repository struct.
This patch does not deal with the parts in the refs subsystem which deal
with the submodules there. A later patch needs to get rid of the submodule
exposure in the refs API, such as 'get_submodule_ref_store(path)'.
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
connect_work_tree_and_git_dir is used to connect a submodule worktree with
its git directory and vice versa after events that require a reconnection
such as moving around the working tree. As submodules can have nested
submodules themselves, we'd also want to fix the nested submodules when
asked to. Add an option to recurse into the nested submodules and connect
them as well.
As submodules are identified by their name (which determines their git
directory in relation to their superproject's git directory) internally
and by their path in the working tree of the superproject, we need to
make sure that the mapping of name <-> path is kept intact. We can do
that in the git-mv command by writing out the gitmodules file first
and then forcing a reload of the submodule config machinery.
Signed-off-by: Stefan Beller <sbeller@google.com>
Reviewed-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The raw object store field will contain any objects needed for access
to objects in a given repository.
This patch introduces the raw object store and populates it with the
`objectdir`, which used to be part of the repository struct.
As the struct gains members, we'll also populate the function to clear
the memory for these members.
In a later step, we'll introduce a struct object_parser, that will
complement the object parsing in a repository struct: The raw object
parser is the layer that will provide access to raw object content,
while the higher level object parser code will parse raw objects and
keeps track of parenthood and other object relationships using 'struct
object'. For now only add the lower level to the repository struct.
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The argument name "optional" may mislead the reader to think this
option could be NULL. But it can't be. While at there, document a bit
more about struct set_gitdir_args.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This variable was added because the repo_set_gitdir() was created to
cover both submodule and main repos, but these two are initialized a
bit differently so ignore_env == 0 means main repo, while ignore_env
!= 0 is submodules.
Since the difference part (env variables) has been moved out of
repo_set_gitdir(), this function works the same way for both repo
types and ignore_env is not needed anymore.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
getenv() is supposed to work on the main repository only. This delayed
getenv() code in sha1_file.c makes it more difficult to convert
sha1_file.c to a generic object store that could be used by both
submodule and main repositories.
Move the getenv() back in setup_git_env() where other env vars are
also fetched.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It does not make sense that generic repository code contains handling
of environment variables, which are specific for the main repository
only. Refactor repo_set_gitdir() function to take $GIT_DIR and
optionally _all_ other customizable paths. These optional paths can be
NULL and will be calculated according to the default directory layout.
Note that some dead functions are left behind to reduce diff
noise. They will be deleted in the next patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This simplifies initialization of struct repository and anything
inside. Easier to read. Easier to add/remove fields.
Everything will go through main() common-main.c so this should cover all
programs, including t/helper.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In future versions of Git, we plan to support an additional hash
algorithm. Integrate the enumeration of hash algorithms with repository
setup, and store a pointer to the enumerated data in struct repository.
Of course, we currently only support SHA-1, so hard-code this value in
read_repository_format. In the future, we'll enumerate this value from
the configuration.
Add a constant, the_hash_algo, which points to the hash_algo structure
pointer in the repository global. Note that this is the hash which is
used to serialize data to disk, not the hash which is used to display
items to the user. The transition plan anticipates that these may be
different. We can add an additional element in the future (say,
ui_hash_algo) to provide for this case.
Include repository.h in cache.h since we now need to have access to
these struct and variable definitions.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Have 'repo_read_index()' behave more like the other read_index family of
functions and don't discard the index if it has already been populated
and instead rely on the quick return of read_index_from which has:
/* istate->initialized covers both .git/index and .git/sharedindex.xxx */
if (istate->initialized)
return istate->cache_nr;
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>