"git apply --inaccurate-eof" when used with "--ignore-space-change"
triggered an internal sanity check, which has been fixed.
* rs/apply-inaccurate-eof-with-incomplete-line:
apply: update line lengths for --inaccurate-eof
A fix for an ancient bug in "git apply --ignore-space-change" codepath.
* rs/apply-fuzzy-match-fix:
apply: avoid out-of-bounds access in fuzzy_matchlines()
Some diff implementations don't report missing newlines at the end of
files. Applying such a patch can cause a newline character to be
added inadvertently. The option --inaccurate-eof of git apply can be
used to remove trailing newlines if needed.
apply_one_fragment() cuts it off from the buffers for preimage and
postimage. Before it does, it builds an array with the lengths of each
line for both. Make sure to update the length of the last line in
these line info structures as well to keep them consistent with their
respective buffer.
Without this fix the added test fails; git apply dies and reports:
fatal: BUG: caller miscounted postlen: asked 1, orig = 1, used = 2
That sanity check is only called if whitespace changes are ignored.
Reported-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
fuzzy_matchlines() uses a pointers to the first and last characters of
two lines to keep track while matching them. This makes it impossible
to deal with empty strings. It accesses characters before the start of
empty lines. It can also access characters after the end when checking
for trailing whitespace in the main loop.
Avoid that by using pointers to the first character and the one *after*
the last one. This is well-defined as long as the latter is not
dereferenced. Basically rewrite the function based on that premise; it
becomes much simpler as a result. There is no need to check for
leading whitespace outside of the main loop anymore.
Reported-by: Mahmoud Al-Qudsi <mqudsi@neosmart.net>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Similar to a previous patch, we do not need to use `newfd` to signal
that we have a lockfile to clean up. We can just unconditionally call
`rollback_lock_file`. If we do not hold the lock, it will be a no-op.
Where we check `newfd` to decide whether we need to take the lock, we
can instead use `is_lock_file_locked()`.
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We have two users of `struct apply_state` and the related functionality
in apply.c. Each user sets up its `apply_state` by handing over a
pointer to its static `lock_file`. (Before 076aa2cbd (tempfile:
auto-allocate tempfiles on heap, 2017-09-05), we could never free
lockfiles, so making them static was a reasonable approach.)
Other than that, they never directly access their `lock_file`s, which
are instead handled by the functionality in apply.c.
To make life easier for the caller and to make it less tempting for a
future caller to mess with the lock, make apply.c fully responsible for
setting up the `lock_file`. As mentioned above, it is now safe to free a
`lock_file`, so we can make the `struct apply_state` contain an actual
`struct lock_file` instead of a pointer to one.
The user in builtin/apply.c is rather simple. For builtin/am.c, we might
worry that the lock state is actually meant to be inherited across
calls. But the lock is only taken as `apply_all_patches()` executes, and
code inspection shows that it will always be released.
Alternatively, we can observe that the lock itself is never queried
directly. When we decide whether we should lock, we check a related
variable `newfd`. That variable is not inherited, so from the point of
view of apply.c, the state machine really is reset with each call to
`init_apply_state()`. (It would be a bug if `newfd` and the lock status
were not in sync. The duplication of information in `newfd` and the lock
will be addressed in the next patch.)
Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Gcc 7 adds -Wimplicit-fallthrough, which can warn when a
switch case falls through to the next case. The general idea
is that the compiler can't tell if this was intentional or
not, so you should annotate any intentional fall-throughs as
such, leaving it to complain about any unannotated ones.
There's a GNU __attribute__ which can be used for
annotation, but of course we'd have to #ifdef it away on
non-gcc compilers. Gcc will also recognize
specially-formatted comments, which matches our current
practice. Let's extend that practice to all of the
unannotated sites (which I did look over and verify that
they were behaving as intended).
Ideally in each case we'd actually give some reasons in the
comment about why we're falling through, or what we're
falling through to. And gcc does support that with
-Wimplicit-fallthrough=2, which relaxes the comment pattern
matching to anything that contains "fallthrough" (or a
variety of spelling variants). However, this isn't the
default for -Wimplicit-fallthrough, nor for -Wextra. In the
name of simplicity, it's probably better for us to support
the default level, which requires "fallthrough" to be the
only thing in the comment (modulo some window dressing like
"else" and some punctuation; see the gcc manual for the
complete set of patterns).
This patch suppresses all warnings due to
-Wimplicit-fallthrough. We might eventually want to add that
to the DEVELOPER Makefile knob, but we should probably wait
until gcc 7 is more widely adopted (since earlier versions
will complain about the unknown warning type).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git apply" that is used as a better "patch -p1" failed to apply a
taken from a file with CRLF line endings to a file with CRLF line
endings. The root cause was because it misused convert_to_git()
that tried to do "safe-crlf" processing by looking at the index
entry at the same path, which is a nonsense---in that mode, "apply"
is not working on the data in (or derived from) the index at all.
This has been fixed.
* tb/apply-with-crlf:
apply: file commited with CRLF should roundtrip diff and apply
convert: add SAFE_CRLF_KEEP_CRLF
We check the date of epoch timestamp candidates already with
starts_with(). Move beyond that part using skip_prefix() instead of
checking it again using a regular expression. Also group the minutes
part, so that we can access them using a substring match instead of
using a magic number.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
has_epoch_timestamp() looks for time stamps that amount to either
1969-12-31 24:00 or 1970-01-01 00:00 after applying the time zone
offset. Move the check for these two dates up, set the expected hour
based on which one is found, or exit early if none of them are present,
thus avoiding to engage the regex machinery for newer dates.
This also gets rid of two magic string length constants.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a file had been commited with CRLF but now .gitattributes say
"* text=auto" (or core.autocrlf is true), the following does not
roundtrip, `git apply` fails:
printf "Added line\r\n" >>file &&
git diff >patch &&
git checkout -- . &&
git apply patch
Before applying the patch, the file from working tree is converted
into the index format (clean filter, CRLF conversion, ...). Here,
when commited with CRLF, the line endings should not be converted.
Note that `git apply --index` or `git apply --cache` doesn't call
convert_to_git() because the source material is already in index
format.
Analyze the patch if there is a) any context line with CRLF, or b)
if any line with CRLF is to be removed. In this case the patch file
`patch` has mixed line endings, for a) it looks like this:
diff --git a/one b/one
index 533790e..c30dea8 100644
--- a/one
+++ b/one
@@ -1 +1,2 @@
a\r
+b\r
And for b) it looks like this:
diff --git a/one b/one
index 533790e..485540d 100644
--- a/one
+++ b/one
@@ -1 +1 @@
-a\r
+b\r
If `git apply` detects that the patch itself has CRLF, (look at the
line " a\r" or "-a\r" above), the new flag crlf_in_old is set in
"struct patch" and two things will happen:
- read_old_data() will not convert CRLF into LF by calling
convert_to_git(..., SAFE_CRLF_KEEP_CRLF);
- The WS_CR_AT_EOL bit is set in the "white space rule",
CRLF are no longer treated as white space.
While at there, make it clear that read_old_data() in apply.c knows
what it wants convert_to_git() to do with respect to CRLF. In fact,
this codepath is about applying a patch to a file in the filesystem,
which may not exist in the index, or may exist but may not match
what is recorded in the index, or in the extreme case, we may not
even be in a Git repository. If convert_to_git() peeked at the
index while doing its work, it *would* be a bug.
Pass NULL instead of &the_index to convert_to_git() to make sure we
catch future bugs to clarify this.
Update the test in t4124: split one test case into 3:
- Detect the " a\r" line in the patch
- Detect the "-a\r" line in the patch
- Use LF in repo and CLRF in the worktree.
Reported-by: Anthony Sottile <asottile@umich.edu>
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Code clean-up.
* rs/move-array:
ls-files: don't try to prune an empty index
apply: use COPY_ARRAY and MOVE_ARRAY in update_image()
use MOVE_ARRAY
add MOVE_ARRAY
Use a NULL-and-NUL check to see if we have a prefix and consistently use
C string functions on it instead of storing its length in a member of
struct apply_state. This avoids strlen() calls and simplifies the code.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Simplify the code by using the helper macros COPY_ARRAY and MOVE_ARRAY,
which also makes them more robust in the case we copy or move no lines,
as they allow using NULL points in that case, while memcpy(3) and
memmove(3) don't.
Found with Clang's UBSan.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We don't know the length of the C string "another". It could be
shorter than "name", which we compare it to using memchr(3). Call
strcmp(3) instead to avoid running over the end of the former, and
get rid of a strlen(3) call as a bonus.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid running over the end of line -- a C string whose length is not
known to this function -- by using starts_with() instead of memcmp(3)
for checking if it starts with "/dev/null". Also simply include the
newline in the string constant to compare against. Drop a comment that
just states the obvious.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A file can either be added, removed, copied, or renamed, but no two of
these actions can be done by the same patch. Some of these combinations
provoke error messages due to missing file names, and some are only
caught by an assertion. Check git patches already as they are parsed
and report conflicting lines on sight.
Found by Vegard Nossum using AFL.
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
An empty string as mode specification is accepted silently by git apply,
as Vegard Nossum found out using AFL. It's interpreted as zero. Reject
such bogus file modes, and only accept ones consisting exclusively of
octal digits.
Reported-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2c93286a (fix "git apply --index ..." not to deref NULL) added a check
for git patches missing a +++ line, preventing a segfault. Check for
missing --- lines as well, and add a test for each case.
Found by Vegard Nossum using AFL.
Original-patch-by: Vegard Nossum <vegard.nossum@oracle.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.
* ab/free-and-null:
*.[ch] refactoring: make use of the FREE_AND_NULL() macro
coccinelle: make use of the "expression" FREE_AND_NULL() rule
coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
coccinelle: make use of the "type" FREE_AND_NULL() rule
coccinelle: add a rule to make "type" code use FREE_AND_NULL()
git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
Fix configuration codepath to pay proper attention to commondir
that is used in multi-worktree situation, and isolate config API
into its own header file.
* bw/config-h:
config: don't implicitly use gitdir or commondir
config: respect commondir
setup: teach discover_git_directory to respect the commondir
config: don't include config.h by default
config: remove git_config_iter
config: create config.h
Code clean-up.
* bw/ls-files-sans-the-index:
ls-files: factor out tag calculation
ls-files: factor out debug info into a function
ls-files: convert show_files to take an index
ls-files: convert show_ce_entry to take an index
ls-files: convert prune_cache to take an index
ls-files: convert ce_excluded to take an index
ls-files: convert show_ru_info to take an index
ls-files: convert show_other_files to take an index
ls-files: convert show_killed_files to take an index
ls-files: convert write_eolinfo to take an index
ls-files: convert overlay_tree_on_cache to take an index
tree: convert read_tree to take an index parameter
convert: convert renormalize_buffer to take an index
convert: convert convert_to_git to take an index
convert: convert convert_to_git_filter_fd to take an index
convert: convert crlf_to_git to take an index
convert: convert get_cached_convert_stats_ascii to take an index
Remove the unused wildopts placeholder struct from being passed to all
wildmatch() invocations, or rather remove all the boilerplate NULL
parameters.
This parameter was added back in commit 9b3497cab9 ("wildmatch: rename
constants and update prototype", 2013-01-01) as a placeholder for
future use. Over 4 years later nothing has made use of it, let's just
remove it. It can be added in the future if we find some reason to
start using such a parameter.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Stop including config.h by default in cache.h. Instead only include
config.h in those files which require use of the config system.
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our code often opens a path to an optional file, to work on its
contents when we can successfully open it. We can ignore a failure
to open if such an optional file does not exist, but we do want to
report a failure in opening for other reasons (e.g. we got an I/O
error, or the file is there, but we lack the permission to open).
The exact errors we need to ignore are ENOENT (obviously) and
ENOTDIR (less obvious). Instead of repeating comparison of errno
with these two constants, introduce a helper function to do so.
* jc/noent-notdir:
treewide: use is_missing_file_error() where ENOENT and ENOTDIR are checked
compat-util: is_missing_file_error()
Similar functions exist in apply.c and builtin/show-branch.c for
counting the number of slashes in a string. Also in the later
patches, we introduce a third caller for the same. Hence, we unify
it now by cleaning the existing functions and declaring a common
function count_slashes in dir.h and implementing it in dir.c to
remove this code duplication.
Mentored-by: Christian Couder <christian.couder@gmail.com>
Mentored-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Prathamesh Chavan <pc44800@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Using the is_missing_file_error() helper introduced in the previous
step, update all hits from
$ git grep -e ENOENT --and -e ENOTDIR
There are codepaths that only check ENOENT, and it is possible that
some of them should be checking both. Updating them is kept out of
this step deliberately, as we do not want to change behaviour in this
step.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
4af9a7d3 ("Merge branch 'bc/object-id'", 2016-09-19) involved
merging a lot of changes made to builtin/apply.c on the side branch
manually to apply.c as an intervening commit 13b5af22 ("apply: move
libified code from builtin/apply.c to apply.{c,h}", 2016-04-22)
moved a lot of the lines changed on the side branch to a different
file apply.c at the top-level, requiring manual patching of it.
Apparently, the maintainer screwed up and made the code indent in a
funny way while doing so.
Reported-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The prefix_filename() function returns a pointer to static
storage, which makes it easy to use dangerously. We already
fixed one buggy caller in hash-object recently, and the
calls in apply.c are suspicious (I didn't dig in enough to
confirm that there is a bug, but we call the function once
in apply_all_patches() and then again indirectly from
parse_chunk()).
Let's make it harder to get wrong by allocating the return
value. For simplicity, we'll do this even when the prefix is
empty (and we could just return the original file pointer).
That will cause us to allocate sometimes when we wouldn't
otherwise need to, but this function isn't called in
performance critical code-paths (and it already _might_
allocate on any given call, so a caller that cares about
performance is questionable anyway).
The downside is that the callers need to remember to free()
the result to avoid leaking. Most of them already used
xstrdup() on the result, so we know they are OK. The
remainder have been converted to use free() as appropriate.
I considered retaining a prefix_filename_unsafe() for cases
where we know the static lifetime is OK (and handling the
cleanup is awkward). This is only a handful of cases,
though, and it's not worth the mental energy in worrying
about whether the "unsafe" variant is OK to use in any
situation.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This function takes the prefix as a ptr/len pair, but in
every caller the length is exactly strlen(ptr). Let's
simplify the interface and just take the string. This saves
callers specifying it (and in some cases handling a NULL
prefix).
In a handful of cases we had the length already without
calling strlen, so this is technically slower. But it's not
likely to matter (after all, if the prefix is non-empty
we'll allocate and copy it into a buffer anyway).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the exported macro SWAP instead of the file-scoped macro swap and
remove the latter's definition.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Callers of the hold_locked_index() function pass 0 when they want to
prepare to write a new version of the index file without wishing to
die or emit an error message when the request fails (e.g. somebody
else already held the lock), and pass 1 when they want the call to
die upon failure.
This option is called LOCK_DIE_ON_ERROR by the underlying lockfile
API, and the hold_locked_index() function translates the paramter to
LOCK_DIE_ON_ERROR when calling the hold_lock_file_for_update().
Replace these hardcoded '1' with LOCK_DIE_ON_ERROR and stop
translating. Callers other than the ones that are replaced with
this change pass '0' to the function; no behaviour change is
intended with this patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
Among the callers of hold_locked_index() that passes 0:
- diff.c::refresh_index_quietly() at the end of "git diff" is an
opportunistic update; it leaks the lockfile structure but it is
just before the program exits and nobody should care.
- builtin/describe.c::cmd_describe(),
builtin/commit.c::cmd_status(),
sequencer.c::read_and_refresh_cache() are all opportunistic
updates and they are OK.
- builtin/update-index.c::cmd_update_index() takes a lock upfront
but we may end up not needing to update the index (i.e. the
entries may be fully up-to-date), in which case we do not need to
issue an error upon failure to acquire the lock. We do diagnose
and die if we indeed need to update, so it is OK.
- wt-status.c::require_clean_work_tree() IS BUGGY. It asks
silence, does not check the returned value. Compare with
callsites like cmd_describe() and cmd_status() to notice that it
is wrong to call update_index_if_able() unconditionally.
Mark error messages for translation passed to error() and die()
functions.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Mark messages for translation printed to stderr.
Signed-off-by: Vasco Almeida <vascomalmeida@sapo.pt>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a static initializer for struct checkout and use it throughout the
code base. It's shorter, avoids a memset(3) call and makes sure the
base_dir member is initialized to a valid (empty) string.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The "unsigned char sha1[20]" to "struct object_id" conversion
continues. Notable changes in this round includes that ce->sha1,
i.e. the object name recorded in the cache_entry, turns into an
object_id.
It had merge conflicts with a few topics in flight (Christian's
"apply.c split", Dscho's "cat-file --filters" and Jeff Hostetler's
"status --porcelain-v2"). Extra sets of eyes double-checking for
mismerges are highly appreciated.
* bc/object-id:
builtin/reset: convert to use struct object_id
builtin/commit-tree: convert to struct object_id
builtin/am: convert to struct object_id
refs: add an update_ref_oid function.
sha1_name: convert get_sha1_mb to struct object_id
builtin/update-index: convert file to struct object_id
notes: convert init_notes to use struct object_id
builtin/rm: convert to use struct object_id
builtin/blame: convert file to use struct object_id
Convert read_mmblob to take struct object_id.
notes-merge: convert struct notes_merge_pair to struct object_id
builtin/checkout: convert some static functions to struct object_id
streaming: make stream_blob_to_fd take struct object_id
builtin: convert textconv_object to use struct object_id
builtin/cat-file: convert some static functions to struct object_id
builtin/cat-file: convert struct expand_data to use struct object_id
builtin/log: convert some static functions to use struct object_id
builtin/blame: convert struct origin to use struct object_id
builtin/apply: convert static functions to struct object_id
cache: convert struct cache_entry to use struct object_id
Sometimes we want to apply in a different index file.
Before the apply functionality was libified it was possible to
use the GIT_INDEX_FILE environment variable, for this purpose.
But now, as the apply functionality has been libified, it should
be possible to do that in a libified way.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>