Commit graph

423 commits

Author SHA1 Message Date
Junio C Hamano a8bf3c0cac Merge branch 'kl/allow-working-in-dot-git-in-non-bare-repository'
The "disable repository discovery of a bare repository" check,
triggered by setting safe.bareRepository configuration variable to
'explicit', has been loosened to exclude the ".git/" directory inside
a non-bare repository from the check.  So you can do "cd .git &&
git cmd" to run a Git command that works on a bare repository without
explicitly specifying $GIT_DIR now.

* kl/allow-working-in-dot-git-in-non-bare-repository:
  setup: allow cwd=.git w/ bareRepository=explicit
2024-01-30 13:34:12 -08:00
Junio C Hamano dc8ce995a2 Merge branch 'ps/worktree-refdb-initialization'
Instead of manually creating refs/ hierarchy on disk upon a
creation of a secondary worktree, which is only usable via the
files backend, use the refs API to populate it.

* ps/worktree-refdb-initialization:
  builtin/worktree: create refdb via ref backend
  worktree: expose interface to look up worktree by name
  builtin/worktree: move setup of commondir file earlier
  refs/files: skip creation of "refs/{heads,tags}" for worktrees
  setup: move creation of "refs/" into the files backend
  refs: prepare `refs_init_db()` for initializing worktree refs
2024-01-26 08:54:46 -08:00
Kyle Lippincott 45bb916248 setup: allow cwd=.git w/ bareRepository=explicit
The safe.bareRepository setting can be set to 'explicit' to disallow
implicit uses of bare repositories, preventing an attack [1] where an
artificial and malicious bare repository is embedded in another git
repository. Unfortunately, some tooling uses myrepo/.git/ as the cwd
when executing commands, and this is blocked when
safe.bareRepository=explicit. Blocking is unnecessary, as git already
prevents nested .git directories.

Teach git to not reject uses of git inside of the .git directory: check
if cwd is .git (or a subdirectory of it) and allow it even if
safe.bareRepository=explicit.

[1] https://github.com/justinsteven/advisories/blob/main/2022_git_buried_bare_repos_and_fsmonitor_various_abuses.md

Signed-off-by: Kyle Lippincott <spectral@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-20 14:11:49 -08:00
Junio C Hamano 32c6fc3e30 Merge branch 'ps/refstorage-extension'
Introduce a new extension "refstorage" so that we can mark a
repository that uses a non-default ref backend, like reftable.

* ps/refstorage-extension:
  t9500: write "extensions.refstorage" into config
  builtin/clone: introduce `--ref-format=` value flag
  builtin/init: introduce `--ref-format=` value flag
  builtin/rev-parse: introduce `--show-ref-format` flag
  t: introduce GIT_TEST_DEFAULT_REF_FORMAT envvar
  setup: introduce GIT_DEFAULT_REF_FORMAT envvar
  setup: introduce "extensions.refStorage" extension
  setup: set repository's formats on init
  setup: start tracking ref storage format
  refs: refactor logic to look up storage backends
  worktree: skip reading HEAD when repairing worktrees
  t: introduce DEFAULT_REPO_FORMAT prereq
2024-01-16 10:11:57 -08:00
Junio C Hamano 492ee03f60 Merge branch 'en/header-cleanup'
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-01-08 14:05:15 -08:00
Patrick Steinhardt c358d165f2 setup: move creation of "refs/" into the files backend
When creating the ref database we unconditionally create the "refs/"
directory in "setup.c". This is a mandatory prerequisite for all Git
repositories regardless of the ref backend in use, because Git will be
unable to detect the directory as a repository if "refs/" doesn't exist.

We are about to add another new caller that will want to create a ref
database when creating worktrees. We would require the same logic to
create the "refs/" directory even though the caller really should not
care about such low-level details. Ideally, the ref database should be
fully initialized after calling `refs_init_db()`.

Move the code to create the directory into the files backend itself to
make it so. This means that future ref backends will also need to have
equivalent logic around to ensure that the directory exists, but it
seems a lot more sensible to have it this way round than to require
callers to create the directory themselves.

An alternative to this would be to create "refs/" in `refs_init_db()`
directly. This feels conceptually unclean though as the creation of the
refdb is now cluttered across different callsites. Furthermore, both the
"files" and the upcoming "reftable" backend write backend-specific data
into the "refs/" directory anyway, so splitting up this logic would only
make it harder to reason about.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
Patrick Steinhardt 2e573d61ff refs: prepare refs_init_db() for initializing worktree refs
The purpose of `refs_init_db()` is to initialize the on-disk files of a
new ref database. The function is quite inflexible right now though, as
callers can neither specify the `struct ref_store` nor can they pass any
flags.

Refactor the interface to accept both of these. This will be required so
that we can start initializing per-worktree ref databases via the ref
backend instead of open-coding the initialization in "worktree.c".

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-08 13:17:30 -08:00
Patrick Steinhardt aa19619a98 setup: introduce GIT_DEFAULT_REF_FORMAT envvar
Introduce a new GIT_DEFAULT_REF_FORMAT environment variable that lets
users control the default ref format used by both git-init(1) and
git-clone(1). This is modeled after GIT_DEFAULT_OBJECT_FORMAT, which
does the same thing for the repository's object format.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
Patrick Steinhardt d7497a42b0 setup: introduce "extensions.refStorage" extension
Introduce a new "extensions.refStorage" extension that allows us to
specify the ref storage format used by a repository. For now, the only
supported format is the "files" format, but this list will likely soon
be extended to also support the upcoming "reftable" format.

There have been discussions on the Git mailing list in the past around
how exactly this extension should look like. One alternative [1] that
was discussed was whether it would make sense to model the extension in
such a way that backends are arbitrarily stackable. This would allow for
a combined value of e.g. "loose,packed-refs" or "loose,reftable", which
indicates that new refs would be written via "loose" files backend and
compressed into "packed-refs" or "reftable" backends, respectively.

It is arguable though whether this flexibility and the complexity that
it brings with it is really required for now. It is not foreseeable that
there will be a proliferation of backends in the near-term future, and
the current set of existing formats and formats which are on the horizon
can easily be configured with the much simpler proposal where we have a
single value, only.

Furthermore, if we ever see that we indeed want to gain the ability to
arbitrarily stack the ref formats, then we can adapt the current
extension rather easily. Given that Git clients will refuse any unknown
value for the "extensions.refStorage" extension they would also know to
ignore a stacked "loose,packed-refs" in the future.

So let's stick with the easy proposal for the time being and wire up the
extension.

[1]: <pull.1408.git.1667846164.gitgitgadget@gmail.com>

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:48 -08:00
Patrick Steinhardt 58be32fff9 setup: set repository's formats on init
The proper hash algorithm and ref storage format that will be used for a
newly initialized repository will be figured out in `init_db()` via
`validate_hash_algorithm()` and `validate_ref_storage_format()`. Until
now though, we never set up the hash algorithm or ref storage format of
`the_repository` accordingly.

There are only two callsites of `init_db()`, one in git-init(1) and one
in git-clone(1). The former function doesn't care for the formats to be
set up properly because it never access the repository after calling the
function in the first place.

For git-clone(1) it's a different story though, as we call `init_db()`
before listing remote refs. While we do indeed have the wrong hash
function in `the_repository` when `init_db()` sets up a non-default
object format for the repository, it never mattered because we adjust
the hash after learning about the remote's hash function via the listed
refs.

So the current state is correct for the hash algo, but it's not for the
ref storage format because git-clone(1) wouldn't know to set it up
properly. But instead of adjusting only the `ref_storage_format`, set
both the hash algo and the ref storage format so that `the_repository`
is in the correct state when `init_db()` exits. This is fine as we will
adjust the hash later on anyway and makes it easier to reason about the
end state of `the_repository`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
Patrick Steinhardt 173761e21b setup: start tracking ref storage format
In order to discern which ref storage format a repository is supposed to
use we need to start setting up and/or discovering the format. This
needs to happen in two separate code paths.

  - The first path is when we create a repository via `init_db()`. When
    we are re-initializing a preexisting repository we need to retain
    the previously used ref storage format -- if the user asked for a
    different format then this indicates an error and we error out.
    Otherwise we either initialize the repository with the format asked
    for by the user or the default format, which currently is the
    "files" backend.

  - The second path is when discovering repositories, where we need to
    read the config of that repository. There is not yet any way to
    configure something other than the "files" backend, so we can just
    blindly set the ref storage format to this backend.

Wire up this logic so that we have the ref storage format always readily
available when needed. As there is only a single backend and because it
is not configurable we cannot yet verify that this tracking works as
expected via tests, but tests will be added in subsequent commits. To
countermand this ommission now though, raise a BUG() in case the ref
storage format is not set up properly in `ref_store_init()`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-02 09:24:47 -08:00
Junio C Hamano 94e8e404a7 Merge branch 'ps/clone-into-reftable-repository'
"git clone" has been prepared to allow cloning a repository with
non-default hash function into a repository that uses the reftable
backend.

* ps/clone-into-reftable-repository:
  builtin/clone: create the refdb with the correct object format
  builtin/clone: skip reading HEAD when retrieving remote
  builtin/clone: set up sparse checkout later
  builtin/clone: fix bundle URIs with mismatching object formats
  remote-curl: rediscover repository when fetching refs
  setup: allow skipping creation of the refdb
  setup: extract function to create the refdb
2023-12-27 14:52:28 -08:00
Elijah Newren eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
Junio C Hamano 7a75e131d6 Merge branch 'ps/clone-into-reftable-repository' into ps/refstorage-extension
* ps/clone-into-reftable-repository:
  builtin/clone: create the refdb with the correct object format
  builtin/clone: skip reading HEAD when retrieving remote
  builtin/clone: set up sparse checkout later
  builtin/clone: fix bundle URIs with mismatching object formats
  remote-curl: rediscover repository when fetching refs
  setup: allow skipping creation of the refdb
  setup: extract function to create the refdb
2023-12-20 10:19:58 -08:00
Patrick Steinhardt 18c9cb7524 builtin/clone: create the refdb with the correct object format
We're currently creating the reference database with a potentially
incorrect object format when the remote repository's object format is
different from the local default object format. This works just fine for
now because the files backend never records the object format anywhere.
But this logic will fail with any new reference backend that encodes
this information in some form either on-disk or in-memory.

The preceding commits have reshuffled code in git-clone(1) so that there
is no code path that will access the reference database before we have
detected the remote's object format. With these refactorings we can now
defer initialization of the reference database until after we have
learned the remote's object format and thus initialize it with the
correct format from the get-go.

These refactorings are required to make git-clone(1) work with the
upcoming reftable backend when cloning repositories with the SHA256
object format.

This change breaks a test in "t5550-http-fetch-dumb.sh" when cloning an
empty repository with `GIT_TEST_DEFAULT_HASH=sha256`. The test expects
the resulting hash format of the empty cloned repository to match the
default hash, but now we always end up with a sha1 repository. The
problem is that for dumb HTTP fetches, we have no easy way to figure out
the remote's hash function except for deriving it based on the hash
length of refs in `info/refs`. But as the remote repository is empty we
cannot rely on this detection mechanism.

Before the change in this commit we already initialized the repository
with the default hash function and then left it as-is. With this patch
we always use the hash function detected via the remote, where we fall
back to "sha1" in case we cannot detect it.

Neither the old nor the new behaviour are correct as we second-guess the
remote hash function in both cases. But given that this is a rather
unlikely edge case (we use the dumb HTTP protocol, the remote repository
uses SHA256 and the remote repository is empty), let's simply adapt the
test to assert the new behaviour. If we want to properly address this
edge case in the future we will have to extend the dumb HTTP protocol so
that we can properly detect the hash function for empty repositories.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
Patrick Steinhardt 56cd0334f7 setup: allow skipping creation of the refdb
Allow callers to skip creation of the reference database via a new flag
`INIT_DB_SKIP_REFDB`, which is required for git-clone(1) so that we can
create it at a later point once the object format has been discovered
from the remote repository.

Note that we also uplift the call to `create_reference_database()` into
`init_db()`, which makes it easier to handle the new flag for us. This
changes the order in which we do initialization so that we now set up
the Git configuration before we create the reference database. In
practice this move should not result in any change in behaviour.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
Patrick Steinhardt 79543e760d setup: extract function to create the refdb
We're about to let callers skip creation of the reference database when
calling `init_db()`. Extract the logic into a standalone function so
that it becomes easier to do this refactoring.

While at it, expand the comment that explains why we always create the
"refs/" directory.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
Jeff King a62712696e setup: handle NULL value when parsing extensions
The "partialclone" extension config records a string, and hence it is an
error to have an implicit bool like:

  [extensions]
  partialclone

in your config. We should recognize and reject this, rather than
segfaulting (which is the current behavior). Note that it's OK to use
config_error_nonbool() here, even though the return value is an enum. We
explicitly document EXTENSION_ERROR as -1 for compatibility with
error(), etc.

This is the only extension value that has this problem. Most of the
others are bools that interpret this value naturally. The exception is
extensions.objectformat, which does correctly check for NULL.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-09 08:24:46 +09:00
Patrick Steinhardt 9972cd6004 setup: fix leaking repository format
While populating the `repository_format` structure may cause us to
allocate memory, we do not call `clear_repository_format()` in some
places and thus potentially leak memory. Fix this.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 08:51:41 +09:00
Patrick Steinhardt 4ce14e1325 setup: refactor upgrade_repository_format() to have common exit
The `upgrade_repository_format()` function has multiple exit paths,
which means that there is no common cleanup of acquired resources.
While this isn't much of a problem right now, we're about to fix a
memory leak that would require us to free the resource in every one of
those exit paths.

Refactor the code to have a common exit path so that the subsequent
memory leak fix becomes easier to implement.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-11-07 08:51:41 +09:00
Junio C Hamano 50758312f2 Merge branch 'ds/scalar-updates' into maint-2.42
Scalar updates.

* ds/scalar-updates:
  scalar reconfigure: help users remove buggy repos
  setup: add discover_git_directory_reason()
  scalar: add --[no-]src option
2023-11-02 16:53:15 +09:00
Derrick Stolee 26ae8da683 setup: add discover_git_directory_reason()
There are many reasons why discovering a Git directory may fail. In
particular, 8959555cee (setup_git_directory(): add an owner check for
the top-level directory, 2022-03-02) added ownership checks as a
security precaution.

Callers attempting to set up a Git directory may want to inform the user
about the reason for the failure. For that, expose the enum
discovery_result from within setup.c and move it into cache.h where
discover_git_directory() is defined.

I initially wanted to change the return type of discover_git_directory()
to be this enum, but several callers rely upon the "zero means success".
The two problems with this are:

1. The zero value of the enum is actually GIT_DIR_NONE, so nonpositive
   results are errors.

2. There are multiple successful states; positive results are
   successful.

It is worth noting that GIT_DIR_NONE is not returned, so we remove this
option from the enum. We must be careful to keep the successful reasons
as positive values, so they are given explicit positive values.

Instead of updating all callers immediately, add a new method,
discover_git_directory_reason(), and convert discover_git_directory() to
be a thin shim on top of it.

One thing that is important to note is that discover_git_directory()
previously returned -1 on error, so let's continue that into the future.
There is only one caller (in scalar.c) that depends on that signedness
instead of a non-zero check, so clean that up, too.

Because there are extra checks that discover_git_directory_reason() does
after setup_git_directory_gently_1(), there are other modes that can be
returned for failure states. Add these modes to the enum, but be sure to
explicitly add them as BUG() states in the switch of
setup_git_directory_gently().

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-08-28 09:16:06 -07:00
Junio C Hamano ce481ac8b3 Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.

* cw/compat-util-header-cleanup:
  git-compat-util: move alloc macros to git-compat-util.h
  treewide: remove unnecessary includes for wrapper.h
  kwset: move translation table from ctype
  sane-ctype.h: create header for sane-ctype macros
  git-compat-util: move wrapper.c funcs to its header
  git-compat-util: move strbuf.c funcs to its header
2023-07-17 11:30:42 -07:00
Junio C Hamano b3d1c85d48 Merge branch 'gc/config-context'
Reduce reliance on a global state in the config reading API.

* gc/config-context:
  config: pass source to config_parser_event_fn_t
  config: add kvi.path, use it to evaluate includes
  config.c: remove config_reader from configsets
  config: pass kvi to die_bad_number()
  trace2: plumb config kvi
  config.c: pass ctx with CLI config
  config: pass ctx with config files
  config.c: pass ctx in configsets
  config: add ctx arg to config_fn_t
  urlmatch.h: use config_fn_t type
  config: inline git_color_default_config
2023-07-06 11:54:48 -07:00
Calvin Wan da9502ff4d treewide: remove unnecessary includes for wrapper.h
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:41:59 -07:00
Junio C Hamano a1264a08a1 Merge branch 'en/header-split-cache-h-part-3'
Header files cleanup.

* en/header-split-cache-h-part-3: (28 commits)
  fsmonitor-ll.h: split this header out of fsmonitor.h
  hash-ll, hashmap: move oidhash() to hash-ll
  object-store-ll.h: split this header out of object-store.h
  khash: name the structs that khash declares
  merge-ll: rename from ll-merge
  git-compat-util.h: remove unneccessary include of wildmatch.h
  builtin.h: remove unneccessary includes
  list-objects-filter-options.h: remove unneccessary include
  diff.h: remove unnecessary include of oidset.h
  repository: remove unnecessary include of path.h
  log-tree: replace include of revision.h with simple forward declaration
  cache.h: remove this no-longer-used header
  read-cache*.h: move declarations for read-cache.c functions from cache.h
  repository.h: move declaration of the_index from cache.h
  merge.h: move declarations for merge.c from cache.h
  diff.h: move declaration for global in diff.c from cache.h
  preload-index.h: move declarations for preload-index.c from elsewhere
  sparse-index.h: move declarations for sparse-index.c from cache.h
  name-hash.h: move declarations for name-hash.c from cache.h
  run-command.h: move declarations for run-command.c from cache.h
  ...
2023-06-29 16:43:21 -07:00
Glen Choo 8868b1ebfb config: pass kvi to die_bad_number()
Plumb "struct key_value_info" through all code paths that end in
die_bad_number(), which lets us remove the helper functions that read
analogous values from "struct config_reader". As a result, nothing reads
config_reader.config_kvi any more, so remove that too.

In config.c, this requires changing the signature of
git_configset_get_value() to 'return' "kvi" in an out parameter so that
git_configset_get_<type>() can pass it to git_config_<type>(). Only
numeric types will use "kvi", so for non-numeric types (e.g.
git_configset_get_string()), pass NULL to indicate that the out
parameter isn't needed.

Outside of config.c, config callbacks now need to pass "ctx->kvi" to any
of the git_config_<type>() functions that parse a config string into a
number type. Included is a .cocci patch to make that refactor.

The only exceptional case is builtin/config.c, where git_config_<type>()
is called outside of a config callback (namely, on user-provided input),
so config source information has never been available. In this case,
die_bad_number() defaults to a generic, but perfectly descriptive
message. Let's provide a safe, non-NULL for "kvi" anyway, but make sure
not to change the message.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:40 -07:00
Glen Choo a4e7e317f8 config: add ctx arg to config_fn_t
Add a new "const struct config_context *ctx" arg to config_fn_t to hold
additional information about the config iteration operation.
config_context has a "struct key_value_info kvi" member that holds
metadata about the config source being read (e.g. what kind of config
source it is, the filename, etc). In this series, we're only interested
in .kvi, so we could have just used "struct key_value_info" as an arg,
but config_context makes it possible to add/adjust members in the future
without changing the config_fn_t signature. We could also consider other
ways of organizing the args (e.g. moving the config name and value into
config_context or key_value_info), but in my experiments, the
incremental benefit doesn't justify the added complexity (e.g. a
config_fn_t will sometimes invoke another config_fn_t but with a
different config value).

In subsequent commits, the .kvi member will replace the global "struct
config_reader" in config.c, making config iteration a global-free
operation. It requires much more work for the machinery to provide
meaningful values of .kvi, so for now, merely change the signature and
call sites, pass NULL as a placeholder value, and don't rely on the arg
in any meaningful way.

Most of the changes are performed by
contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every
config_fn_t:

- Modifies the signature to accept "const struct config_context *ctx"
- Passes "ctx" to any inner config_fn_t, if needed
- Adds UNUSED attributes to "ctx", if needed

Most config_fn_t instances are easily identified by seeing if they are
called by the various config functions. Most of the remaining ones are
manually named in the .cocci patch. Manual cleanups are still needed,
but the majority of it is trivial; it's either adjusting config_fn_t
that the .cocci patch didn't catch, or adding forward declarations of
"struct config_context ctx" to make the signatures make sense.

The non-trivial changes are in cases where we are invoking a config_fn_t
outside of config machinery, and we now need to decide what value of
"ctx" to pass. These cases are:

- trace2/tr2_cfg.c:tr2_cfg_set_fl()

  This is indirectly called by git_config_set() so that the trace2
  machinery can notice the new config values and update its settings
  using the tr2 config parsing function, i.e. tr2_cfg_cb().

- builtin/checkout.c:checkout_main()

  This calls git_xmerge_config() as a shorthand for parsing a CLI arg.
  This might be worth refactoring away in the future, since
  git_xmerge_config() can call git_default_config(), which can do much
  more than just parsing.

Handle them by creating a KVI_INIT macro that initializes "struct
key_value_info" to a reasonable default, and use that to construct the
"ctx" arg.

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 14:06:39 -07:00
Junio C Hamano 01202f5f68 Merge branch 'gc/discover-not-setup'
discover_git_directory() no longer touches the_repository.

* gc/discover-not-setup:
  setup.c: don't setup in discover_git_directory()
2023-06-23 11:21:16 -07:00
Elijah Newren c339932bd8 repository: remove unnecessary include of path.h
This also made it clear that several .c files that depended upon path.h
were missing a #include for it; add the missing includes while at it.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Elijah Newren e8cf8ef507 setup: adopt shared init-db & clone code
The functions init_db() and initialize_repository_version() were shared
by builtin/init-db.c and builtin/clone.c, and declared in cache.h.

Move these functions, plus their several helpers only used by these
functions, to setup.[ch].

Diff best viewed with `--color-moved`.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-21 13:39:53 -07:00
Glen Choo 69f4da8ead setup.c: don't setup in discover_git_directory()
discover_git_directory() started modifying the_repository in ebaf3bcf1a
(repository: move global r_f_p_c to repo struct, 2021-06-17), when, in
the repository setup process, we started copying members from the
"struct repository_format" we're inspecting to the appropriate "struct
repository". However, discover_git_directory() isn't actually used in
the setup process (its only caller in the Git binary is
read_early_config()), so it shouldn't be doing this setup at all!

As explained by 16ac8b8db6 (setup: introduce the
discover_git_directory() function, 2017-03-13) and the comment on its
declaration, discover_git_directory() is intended to be an entrypoint
into setup.c machinery that allows the Git directory to be discovered
without side effects, e.g. so that read_early_config() can read
".git/config" before the_repository has been set up.

Fortunately, we didn't start to rely on this unintended behavior between
then and now, so we let's just remove it. It isn't harming anyone, but
it's confusing.

Signed-off-by: Glen Choo <chooglen@google.com>
Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-16 08:35:06 -07:00
Victoria Dye 3867f6d650 repository: move 'repository_format_worktree_config' to repo scope
Move 'repository_format_worktree_config' out of the global scope and into
the 'repository' struct. This change is similar to how
'repository_format_partial_clone' was moved in ebaf3bcf1a (repository: move
global r_f_p_c to repo struct, 2021-06-17), adding it to the 'repository'
struct and updating 'setup.c' & 'repository.c' functions to assign the value
appropriately.

The primary goal of this change is to be able to load the worktree config of
a submodule depending on whether that submodule - not its superproject - has
'extensions.worktreeConfig' enabled. To ensure 'do_git_config_sequence()'
has access to the newly repo-scoped configuration, add a 'struct repository'
argument to 'do_git_config_sequence()' and pass it the 'repo' value from
'config_with_options()'.

Finally, add/update tests in 't3007-ls-files-recurse-submodules.sh' to
verify 'extensions.worktreeConfig' is read an used independently by
superprojects and submodules.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-26 13:53:41 +09:00
Glen Choo e35f202b45 setup: trace bare repository setups
safe.bareRepository=explicit is a safer default mode of operation, since
it guards against the embedded bare repository attack [1]. Most end
users don't use bare repositories directly, so they should be able to
set safe.bareRepository=explicit, with the expectation that they can
reenable bare repositories by specifying GIT_DIR or --git-dir.

However, the user might use a tool that invokes Git on bare repositories
without setting GIT_DIR (e.g. "go mod" will clone bare repositories
[2]), so even if a user wanted to use safe.bareRepository=explicit, it
wouldn't be feasible until their tools learned to set GIT_DIR.

To make this transition easier, add a trace message to note when we
attempt to set up a bare repository without setting GIT_DIR. This allows
users and tool developers to audit which of their tools are problematic
and report/fix the issue.  When they are sufficiently confident, they
would switch over to "safe.bareRepository=explicit".

Note that this uses trace2_data_string(), which isn't supported by the
"normal" GIT_TRACE2 target, only _EVENT or _PERF.

[1] https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com/
[2] https://go.dev/ref/mod

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-05-01 11:20:33 -07:00
Elijah Newren 65156bb7ec treewide: remove double forward declaration of read_in_full
cache.h's nature of a dumping ground of includes prevented it from
being included in some compat/ files, forcing us into a workaround
of having a double forward declaration of the read_in_full() function
(see commit 14086b0a13 ("compat/pread.c: Add a forward declaration to
fix a warning", 2007-11-17)).  Now that we have moved functions like
read_in_full() from cache.h to wrapper.h, and wrapper.h isn't littered
with unrelated and scary #defines, get rid of the extra forward
declaration and just have compat/pread.c include wrapper.h.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:11 -07:00
Elijah Newren e93fc5d721 treewide: remove cache.h inclusion due to object-name.h changes
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren dabab1d6e6 object-name.h: move declarations for object-name.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:09 -07:00
Elijah Newren 74ea5c9574 treewide: be explicit about dependence on trace.h & trace2.h
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h.  This made it more difficult
to find which files could remove a dependence on cache.h.  Make C files
explicitly include trace.h or trace2.h if they are using them.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:08 -07:00
Elijah Newren e38da487cc setup.h: move declarations for setup.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren 32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren 0b027f6ca7 abspath.h: move absolute path functions from cache.h
This is another step towards letting us remove the include of cache.h in
strbuf.c.  It does mean that we also need to add includes of abspath.h
in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:52 -07:00
Elijah Newren f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Junio C Hamano 7fac7b563b Merge branch 'js/safe-directory-plus'
Platform-specific code that determines if a directory is OK to use
as a repository has been taught to report more details, especially
on Windows.

* js/safe-directory-plus:
  mingw: handle a file owned by the Administrators group correctly
  mingw: be more informative when ownership check fails on FAT32
  mingw: provide details about unsafe directories' ownership
  setup: prepare for more detailed "dubious ownership" messages
  setup: fix some formatting
2022-08-14 23:19:28 -07:00
Johannes Schindelin 17d3883fe9 setup: prepare for more detailed "dubious ownership" messages
When verifying the ownership of the Git directory, we sometimes would
like to say a bit more about it, e.g. when using a platform-dependent
code path (think: Windows has the permission model that is so different
from Unix'), but only when it is a appropriate to actually say
something.

To allow for that, collect that information and hand it back to the
caller (whose responsibility it is to show it or not).

Note: We do not actually fill in any platform-dependent information yet,
this commit just adds the infrastructure to be able to do so.

Based-on-an-idea-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 09:25:40 -07:00
Johannes Schindelin d51e1dff98 setup: fix some formatting
In preparation for touching code that was introduced in 3b0bf27049
(setup: tighten ownership checks post CVE-2022-24765, 2022-05-10) and
that was formatted differently than preferred in the Git project, fix
the indentation before actually modifying the code.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 09:24:00 -07:00
Glen Choo 8d1a744820 setup.c: create safe.bareRepository
There is a known social engineering attack that takes advantage of the
fact that a working tree can include an entire bare repository,
including a config file. A user could run a Git command inside the bare
repository thinking that the config file of the 'outer' repository would
be used, but in reality, the bare repository's config file (which is
attacker-controlled) is used, which may result in arbitrary code
execution. See [1] for a fuller description and deeper discussion.

A simple mitigation is to forbid bare repositories unless specified via
`--git-dir` or `GIT_DIR`. In environments that don't use bare
repositories, this would be minimally disruptive.

Create a config variable, `safe.bareRepository`, that tells Git whether
or not to die() when working with a bare repository. This config is an
enum of:

- "all": allow all bare repositories (this is the default)
- "explicit": only allow bare repositories specified via --git-dir
  or GIT_DIR.

If we want to protect users from such attacks by default, neither value
will suffice - "all" provides no protection, but "explicit" is
impractical for bare repository users. A more usable default would be to
allow only non-embedded bare repositories ([2] contains one such
proposal), but detecting if a repository is embedded is potentially
non-trivial, so this work is not implemented in this series.

[1]: https://lore.kernel.org/git/kl6lsfqpygsj.fsf@chooglen-macbookpro.roam.corp.google.com
[2]: https://lore.kernel.org/git/5b969c5e-e802-c447-ad25-6acc0b784582@github.com

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
Glen Choo 6061601d9f safe.directory: use git_protected_config()
Use git_protected_config() to read `safe.directory` instead of
read_very_early_config(), making it 'protected configuration only'.

As a result, `safe.directory` now respects "-c", so update the tests and
docs accordingly. It used to ignore "-c" due to how it was implemented,
not because of security or correctness concerns [1].

[1] https://lore.kernel.org/git/xmqqlevabcsu.fsf@gitster.g/

Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-14 15:08:29 -07:00
Junio C Hamano f2e5255fc2 Git 2.37.1
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmLDUXkACgkQsLXohpav
 5sv4SA/6AyzOuaMiBVTLtiYETFj9UU1Z3C12XtUlnu4qmw4Ddd1rq8/E4BgNDYJ7
 cC4MDVfOp7tvyaVGBBSfzIvIieBnGa7PIQl4z1eqZqIm0xt4T1D65jx1CHeJ+UIK
 k63L879dodQnIgwd1ThoGng0BUvmTREFODbGzX6JYPlRuEYkOpswLdvMO14epjLu
 T+TV9etoD5UELTrwnXDOq2ydH1MguyFj21g6NhMvBDTMCbZlQFb9skuF8dx2mC7T
 TktGntaEnFbm1UoZAoHg7g9AzO0iD+Vl6MVkkkolxJyAqiNUf030Ct6RQq9kRj1W
 7kYJDrgN9Oh3g93tJGsnTHmeOSoNiEJsHIeddH4HU0gzRtcx32ygL+KYE6exl2F6
 S1aoWJMdiQ9lYgQRef6aTQEHl2A08rIr3a3wFhVZBNZZk4NpwGblXfI2oIgKIVAx
 cFt0ABGX6RlokUUFGP+F/pk2noPi4m2tMaYfpUsd3sca+uKhFgtf26tQtmLpXAbq
 LKOA9FE+kjGgcOKMSIBjZYFod1HYHgZ+0F87JAURiUJnK2zAYSj+Sq4EatyvcHlb
 QcWQX5+Zxf+/rA1ACEBY+y4iZoSA0f1VQv8aANRmQwoIcjYjcu+W8dQKGCpJB68I
 ftTTxoM/OeQONkdzCcVNtS6ZbgjgKaaaaurvzLMwkQowtWEHd08=
 =pXYr
 -----END PGP SIGNATURE-----

Sync with Git 2.37.1
2022-07-11 16:08:49 -07:00
Junio C Hamano 6d65013bb7 Merge branch 'cr/setup-bug-typo'
Typofix in a BUG() message.

* cr/setup-bug-typo:
  setup: fix function name in a BUG() message
2022-07-11 15:38:52 -07:00
Junio C Hamano 69ab3309e9 Sync with Git 2.36.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-06-27 12:41:41 -07:00