Many codepaths did not diagnose write failures correctly when disks
go full, due to their misuse of write_in_full() helper function,
which have been corrected.
* jk/write-in-full-fix:
read_pack_header: handle signed/unsigned comparison in read result
config: flip return value of store_write_*()
notes-merge: use ssize_t for write_in_full() return value
pkt-line: check write_in_full() errors against "< 0"
convert less-trivial versions of "write_in_full() != len"
avoid "write_in_full(fd, buf, len) != len" pattern
get-tar-commit-id: check write_in_full() return against 0
config: avoid "write_in_full(fd, buf, len) < len" pattern
The return value of write_in_full() is either "-1", or the
requested number of bytes[1]. If we make a partial write
before seeing an error, we still return -1, not a partial
value. This goes back to f6aa66cb95 (write_in_full: really
write in full or return error on disk full., 2007-01-11).
So checking anything except "was the return value negative"
is pointless. And there are a couple of reasons not to do
so:
1. It can do a funny signed/unsigned comparison. If your
"len" is signed (e.g., a size_t) then the compiler will
promote the "-1" to its unsigned variant.
This works out for "!= len" (unless you really were
trying to write the maximum size_t bytes), but is a
bug if you check "< len" (an example of which was fixed
recently in config.c).
We should avoid promoting the mental model that you
need to check the length at all, so that new sites are
not tempted to copy us.
2. Checking for a negative value is shorter to type,
especially when the length is an expression.
3. Linus says so. In d34cf19b89 (Clean up write_in_full()
users, 2007-01-11), right after the write_in_full()
semantics were changed, he wrote:
I really wish every "write_in_full()" user would just
check against "<0" now, but this fixes the nasty and
stupid ones.
Appeals to authority aside, this makes it clear that
writing it this way does not have an intentional
benefit. It's a historical curiosity that we never
bothered to clean up (and which was undoubtedly
cargo-culted into new sites).
So let's convert these obviously-correct cases (this
includes write_str_in_full(), which is just a wrapper for
write_in_full()).
[1] A careful reader may notice there is one way that
write_in_full() can return a different value. If we ask
write() to write N bytes and get a return value that is
_larger_ than N, we could return a larger total. But
besides the fact that this would imply a totally broken
version of write(), it would already invoke undefined
behavior. Our internal remaining counter is an unsigned
size_t, which means that subtracting too many byte will
wrap it around to a very large number. So we'll instantly
begin reading off the end of the buffer, trying to write
gigabytes (or petabytes) of data.
Signed-off-by: Jeff King <peff@peff.net>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers are only allowed to pass certain flags into
ref_transaction_update, other flags are internal to it. To prevent
mistakes from the callers, strip the internal only flags out before
continuing.
This was noticed because of a compiler warning gcc 7.1.1 issued about
passing a NULL parameter as second parameter to memcpy (through
hashcpy):
In file included from refs.c:5:0:
refs.c: In function ‘ref_transaction_verify’:
cache.h:948:2: error: argument 2 null where non-null expected [-Werror=nonnull]
memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from git-compat-util.h:165:0,
from cache.h:4,
from refs.c:5:
/usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared here
extern void *memcpy (void *__restrict __dest, const void *__restrict __src,
^~~~~~
The call to hascpy in ref_transaction_add_update is protected by the
passed in flags, but as we only add flags there, gcc notices
REF_HAVE_NEW or REF_HAVE_OLD flags could be passed in from the outside,
which would potentially result in passing in NULL as second parameter to
memcpy.
Fix both the compiler warning, and make the interface safer for its
users by stripping the internal flags out.
Suggested-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Thomas Gummerer <t.gummerer@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is shorter, makes the logic a bit easier to follow, and is
perhaps a bit faster too.
The logic is to make the final decision only when "subject" is there,
its early part matches "match", and the match is at the slash
boundary (or the whole thing).
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Update the hashmap API so that data to customize the behaviour of
the comparison function can be specified at the time a hashmap is
initialized.
* sb/hashmap-customize-comparison:
hashmap: migrate documentation from Documentation/technical into header
patch-ids.c: use hashmap correctly
hashmap.h: compare function has access to a data field
When using the hashmap a common need is to have access to caller provided
data in the compare function. A couple of times we abuse the keydata field
to pass in the data needed. This happens for example in patch-ids.c.
This patch changes the function signature of the compare function
to have one more void pointer available. The pointer given for each
invocation of the compare function must be defined in the init function
of the hashmap and is just passed through.
Documentation of this new feature is deferred to a later patch.
This is a rather mechanical conversion, just adding the new pass-through
parameter. However while at it improve the naming of the fields of all
compare functions used by hashmaps by ensuring unused parameters are
prefixed with 'unused_' and naming the parameters what they are (instead
of 'unused' make it 'unused_keydata').
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Bugfix for a topic that is (only) in 'master'.
* mh/packed-ref-store-prep:
for_each_bisect_ref(): don't trim refnames
lock_packed_refs(): fix cache validity check
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.
This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`for_each_bisect_ref()` is called by `for_each_bad_bisect_ref()` with
a term "bad". This used to make it call `for_each_ref_in_submodule()`
with a prefix "refs/bisect/bad". But the latter is the name of the
reference that is being sought, so the empty string was being passed
to the callback as the trimmed refname. Moreover, this questionable
practice was turned into an error by
b9c8e7f2fb prefix_ref_iterator: don't trim too much, 2017-05-22
It makes more sense (and agrees better with the documentation of
`--bisect`) for the callers to receive the full reference names. So
* Add a new function, `for_each_fullref_in_submodule()`, to the refs
API. This plugs a gap in the existing functionality, analogous to
`for_each_fullref_in()` but accepting a `submodule` argument.
* Change `for_each_bad_bisect_ref()` to call the new function rather
than `for_each_ref_in_submodule()`.
* Add a test.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of handling `GIT_REF_PARANOIA` in
`files_ref_iterator_begin()`, handle it in
`refs_ref_iterator_begin()`, where it will cover all reference stores.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It's pretty cheap to make sure that the caller didn't pass us an
unsorted list by accident, so do so.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eliminate a theoretical risk of integer overflow if the two types have
different sizes.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In the future, compound reference stores will sometimes need to modify
references in two different reference stores at the same time, meaning
that a single logical reference transaction might have to be
implemented as two internal sub-transactions. They won't want to call
`ref_transaction_commit()` for the two sub-transactions one after the
other, because that wouldn't be atomic (the first commit could succeed
and the second one fail). Instead, they will want to prepare both
sub-transactions (i.e., obtain any necessary locks and do any
pre-checks), and only if both prepare steps succeed, then commit both
sub-transactions.
Start preparing for that day by adding a new, optional
`ref_transaction_prepare()` step to the reference transaction
sequence, which obtains the locks and does any prechecks, reporting
any errors that occur. Also add a `ref_transaction_abort()` function
that can be used to abort a sub-transaction even if it has already
been prepared.
That is on the side of the public-facing API. On the side of the
`ref_store` VTABLE, get rid of `transaction_commit` and instead add
methods `transaction_prepare`, `transaction_finish`, and
`transaction_abort`. A `ref_transaction_commit()` now basically calls
methods `transaction_prepare` then `transaction_finish`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move the check that `transaction->state` is valid from
`files_transaction_commit()` to `ref_transaction_commit()`, where
other future reference backends can benefit from it as well.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Just because the files backend can't retain reflogs for deleted
references is no reason that they shouldn't be supported by the
virtual method interface. Also, `delete_ref()` and `refs_delete_ref()`
have already gained `msg` parameters. Now let's add them to
`delete_refs()` and `refs_delete_refs()`.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Eliminate any chance of integer overflow on platforms where the two
types have different sizes.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The backend already correctly restricts its output to references whose
names start with the prefix. By passing the prefix again to
`prefix_ref_iterator`, we were forcing that iterator to do redundant
prefix comparisons. So set it to the empty string.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* bc/object-id: (53 commits)
object: convert parse_object* to take struct object_id
tree: convert parse_tree_indirect to struct object_id
sequencer: convert do_recursive_merge to struct object_id
diff-lib: convert do_diff_cache to struct object_id
builtin/ls-tree: convert to struct object_id
merge: convert checkout_fast_forward to struct object_id
sequencer: convert fast_forward_to to struct object_id
builtin/ls-files: convert overlay_tree_on_cache to object_id
builtin/read-tree: convert to struct object_id
sha1_name: convert internals of peel_onion to object_id
upload-pack: convert remaining parse_object callers to object_id
revision: convert remaining parse_object callers to object_id
revision: rename add_pending_sha1 to add_pending_oid
http-push: convert process_ls_object and descendants to object_id
refs/files-backend: convert many internals to struct object_id
refs: convert struct ref_update to use struct object_id
ref-filter: convert some static functions to struct object_id
Convert struct ref_array_item to struct object_id
Convert the verify_pack callback to struct object_id
Convert lookup_tag to struct object_id
...
Some platforms have ulong that is smaller than time_t, and our
historical use of ulong for timestamp would mean they cannot
represent some timestamp that the platform allows. Invent a
separate and dedicated timestamp_t (so that we can distingiuish
timestamps and a vanilla ulongs, which along is already a good
move), and then declare uintmax_t is the type to be used as the
timestamp_t.
* js/larger-timestamps:
archive-tar: fix a sparse 'constant too large' warning
use uintmax_t for timestamps
date.c: abort if the system time cannot handle one of our timestamps
timestamp_t: a new data type for timestamps
PRItime: introduce a new "printf format" for timestamps
parse_timestamp(): specify explicitly where we parse timestamps
t0006 & t5000: skip "far in the future" test when time_t is too limited
t0006 & t5000: prepare for 64-bit timestamps
ref-filter: avoid using `unsigned long` for catch-all data type
"git gc" did not interact well with "git worktree"-managed
per-worktree refs.
* nd/worktree-kill-parse-ref:
refs: kill set_worktree_head_symref()
worktree.c: kill parse_ref() in favor of refs_resolve_ref_unsafe()
refs: introduce get_worktree_ref_store()
refs: add REFS_STORE_ALL_CAPS
refs.c: make submodule ref store hashmap generic
environment.c: fix potential segfault by get_git_common_dir()
Convert struct ref_array_item to use struct object_id by changing the
definition and applying the following semantic patch, plus the standard
object_id transforms:
@@
struct ref_update E1;
@@
- E1.new_sha1
+ E1.new_oid.hash
@@
struct ref_update *E1;
@@
- E1->new_sha1
+ E1->new_oid.hash
@@
struct ref_update E1;
@@
- E1.old_sha1
+ E1.old_oid.hash
@@
struct ref_update *E1;
@@
- E1->old_sha1
+ E1->old_oid.hash
This transformation allows us to convert write_ref_to_lockfile, which is
required to convert parse_object.
Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Git's source code assumes that unsigned long is at least as precise as
time_t. Which is incorrect, and causes a lot of problems, in particular
where unsigned long is only 32-bit (notably on Windows, even in 64-bit
versions).
So let's just use a more appropriate data type instead. In preparation
for this, we introduce the new `timestamp_t` data type.
By necessity, this is a very, very large patch, as it has to replace all
timestamps' data type in one go.
As we will use a data type that is not necessarily identical to `time_t`,
we need to be very careful to use `time_t` whenever we interact with the
system functions, and `timestamp_t` everywhere else.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internals of the refs API around the cached refs has been
streamlined.
* mh/separate-ref-cache:
do_for_each_entry_in_dir(): delete function
files_pack_refs(): use reference iteration
commit_packed_refs(): use reference iteration
cache_ref_iterator_begin(): make function smarter
get_loose_ref_cache(): new function
get_loose_ref_dir(): function renamed from get_loose_refs()
do_for_each_entry_in_dir(): eliminate `offset` argument
refs: handle "refs/bisect/" in `loose_fill_ref_dir()`
ref-cache: use a callback function to fill the cache
refs: record the ref_store in ref_cache, not ref_dir
ref-cache: introduce a new type, ref_cache
refs: split `ref_cache` code into separate files
ref-cache: rename `remove_entry()` to `remove_entry_from_dir()`
ref-cache: rename `find_ref()` to `find_ref_entry()`
ref-cache: rename `add_ref()` to `add_ref_entry()`
refs_verify_refname_available(): use function in more places
refs_verify_refname_available(): implement once for all backends
refs_ref_iterator_begin(): new function
refs_read_raw_ref(): new function
get_ref_dir(): don't call read_loose_refs() for "refs/bisect"
files-backend at this point is still aware of the per-repo/worktree
separation in refs, so it can handle a linked worktree.
Some refs operations are known not working when current files-backend is
used in a linked worktree (e.g. reflog). Tests will be written when
refs_* functions start to be called with worktree backend to verify that
they work as expected.
Note: accessing a worktree of a submodule remains unaddressed. Perhaps
after get_worktrees() can access submodule (or rather a new function
get_submodule_worktrees(), that lists worktrees of a submodule), we can
update this function to work with submodules as well.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add finishing touches to a recent topic.
* jk/quarantine-received-objects:
refs: reject ref updates while GIT_QUARANTINE_PATH is set
receive-pack: document user-visible quarantine effects
receive-pack: drop tmp_objdir_env from run_update_hook
The "submodule" specific field in the ref_store structure is
replaced with a more generic "gitdir" that can later be used also
when dealing with ref_store that represents the set of refs visible
from the other worktrees.
* nd/files-backend-git-dir: (28 commits)
refs.h: add a note about sorting order of for_each_ref_*
t1406: new tests for submodule ref store
t1405: some basic tests on main ref store
t/helper: add test-ref-store to test ref-store functions
refs: delete pack_refs() in favor of refs_pack_refs()
files-backend: avoid ref api targeting main ref store
refs: new transaction related ref-store api
refs: add new ref-store api
refs: rename get_ref_store() to get_submodule_ref_store() and make it public
files-backend: replace submodule_allowed check in files_downcast()
refs: move submodule code out of files-backend.c
path.c: move some code out of strbuf_git_path_submodule()
refs.c: make get_main_ref_store() public and use it
refs.c: kill register_ref_store(), add register_submodule_ref_store()
refs.c: flatten get_ref_store() a bit
refs: rename lookup_ref_store() to lookup_submodule_ref_store()
refs.c: introduce get_main_ref_store()
files-backend: remove the use of git_path()
files-backend: add and use files_ref_path()
files-backend: add and use files_reflog_path()
...
It turns out that we can now implement
`refs_verify_refname_available()` based on the other virtual
functions, so there is no need for it to be defined at the backend
level. Instead, define it once in `refs.c` and remove the
`files_backend` definition.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This removes the "submodule" from submodule_hash_entry and other
function names. The goal is to reuse the same code and data structure
for other ref store types. The first one is worktree ref stores.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As documented in git-receive-pack(1), updating a ref from
within the pre-receive hook is dangerous and can corrupt
your repo. This patch forbids ref updates entirely during
the hook to make it harder for adventurous hook writers to
shoot themselves in the foot.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract a new function from `do_for_each_ref()`. It will be useful
elsewhere.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Extract a new function from `refs_resolve_ref_unsafe()`. It will be
useful elsewhere.
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It only has one caller, not worth keeping just for convenience.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The transaction struct now takes a ref store at creation and will
operate on that ref store alone.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is not meant to cover all existing API. It adds enough to test ref
stores with the new test program test-ref-store, coming soon and to be
used by files-backend.c.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function is intended to replace *_submodule() refs API. It provides
a ref store for a specific submodule, which can be operated on by a new
set of refs API.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
files-backend.c is unlearning submodules. Instead of having a specific
check for submodules to see what operation is allowed, files backend
now takes a set of flags at init. Each operation will check if the
required flags is present before performing.
For now we have four flags: read, write and odb access. Main ref store
has all flags, obviously, while submodule stores are read-only and have
access to odb (*).
The "main" flag stays because many functions in the backend calls
frontend ones without a ref store, so these functions always target the
main ref store. Ideally the flag should be gone after ref-store-aware
api is in place and used by backends.
(*) Submodule code needs for_each_ref. Try take REF_STORE_ODB flag
out. At least t3404 would fail. The "have access to odb" in submodule is
a bit hacky since we don't know from he whether add_submodule_odb() has
been called.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Like the previous commit, we'd like to avoid the assumption
that refs fit into PATH_MAX-sized buffers. These callsites
have an extra twist, though: they write the refnames using
mksnpath. This does two things beyond a regular snprintf:
1. It quietly writes "/bad-path/" when truncation occurs.
This saves the caller having to check the error code,
but if you aren't actually feeding the result to a
system call (and we aren't here), it's questionable.
2. It calls cleanup_path(), which removes leading
instances of "./". That's questionable when dealing
with refnames, as we could silently canonicalize a
syntactically bogus refname into a valid one.
Let's convert each case to use a strbuf. This is preferable
to xstrfmt() because we can reuse the same buffer as we
loop.
Signed-off-by: Jeff King <peff@peff.net>
"git branch @" created refs/heads/@ as a branch, and in general the
code that handled @{-1} and @{upstream} was a bit too loose in
disambiguating.
* jk/interpret-branch-name:
checkout: restrict @-expansions when finding branch
strbuf_check_ref_format(): expand only local branches
branch: restrict @-expansions when deleting
t3204: test git-branch @-expansion corner cases
interpret_branch_name: allow callers to restrict expansions
strbuf_branchname: add docstring
strbuf_branchname: drop return value
interpret_branch_name: move docstring to header file
interpret_branch_name(): handle auto-namelen for @{-1}
The "parse_config_key()" API function has been cleaned up.
* jk/parse-config-key-cleanup:
parse_hide_refs_config: tell parse_config_key we don't want a subsection
parse_config_key: allow matching single-level config
parse_config_key: use skip_prefix instead of starts_with
refs: parse_hide_refs_config to use parse_config_key
files-backend is now initialized with a $GIT_DIR. Converting a submodule
path to where real submodule gitdir is located is done in get_ref_store().
This gives a slight performance improvement for submodules since we
don't convert submodule path to gitdir at every backend call like
before. We pay that once at ref-store creation.
More cleanup in files_downcast() and files_assert_main_repository()
follows shortly. It's separate to keep noises from this patch.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
get_ref_store() will soon be renamed to get_submodule_ref_store().
Together with future get_worktree_ref_store(), the three functions
provide an appropriate ref store for different operation modes. New APIs
will be added to operate directly on ref stores.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is the last function in this code (besides public API) that takes
submodule argument and handles both main/submodule cases. Break it down,
move main store registration in get_main_ref_store() and keep the rest
in register_submodule_ref_store().
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This helps the future changes in this code. And because get_ref_store()
is destined to become get_submodule_ref_store(), the "get main store"
code path will be removed eventually. After this the patch to delete
that code will be cleaner.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>