Commit graph

48725 commits

Author SHA1 Message Date
Rene Scharfe 5a612017eb diff: release strbuf after use in show_stats()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe 348eda249e diff: release strbuf after use in show_rename_copy()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe fa842d843d diff: release strbuf after use in diff_summary()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe f31f1d3951 convert: release strbuf on error return in filter_buffer_or_fd()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe f13992917b connect: release strbuf on error return in git_connect()
Reduce the scope of the variable cmd and release it before returning
early.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe e505146dac commit: release strbuf on error return in commit_tree_extended()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe 9c18b5488e clone: release strbuf after use in remove_junk()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe 25a8f80a84 clean: release strbuf after use in remove_dirs()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:27 +09:00
Rene Scharfe 861e65557f check-ref-format: release strbuf after use in check_ref_format_branch()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:26 +09:00
Rene Scharfe 28ac7aa79b am: release strbuf after use in safe_to_abort()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:26 +09:00
Rene Scharfe b36474ff6b am: release strbuf on error return in hg_patch_to_mail()
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:26 +09:00
Rene Scharfe 542627a4f7 am: release strbufs after use in detect_patch_format()
Don't reset the strbufs l2 and l3 before use as if they were static, but
release them at the end instead.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:49:26 +09:00
Jeff King 1d0538e486 rev-parse: don't trim bisect refnames
Using for_each_ref_in() with a full refname has always been
a questionable practice, but it became an error with
b9c8e7f2fb (prefix_ref_iterator: don't trim too much,
2017-05-22), making "git rev-parse --bisect" pretty reliably
show a BUG.

Commit 03df567fbf (for_each_bisect_ref(): don't trim
refnames, 2017-06-18) fixed this case for revision.c, but
rev-parse handles this option on its own. We can use the
same solution here (and piggy-back on its test).

Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-07 08:46:53 +09:00
Jeff King 1fb2b636c6 set_git_dir: handle feeding gitdir to itself
Ideally we'd free the existing gitdir field before assigning
the new one, to avoid a memory leak. But we can't do so
safely because some callers do the equivalent of:

  set_git_dir(get_git_dir());

We can detect that case as a noop, but there are even more
complicated cases like:

  set_git_dir(remove_leading_path(worktree, get_git_dir());

where we really do need to do some work, but the original
string must remain valid.

Rather than put the burden on callers to make a copy of the
string (only to free it later, since we'll make a copy of it
ourselves), let's solve the problem inside set_git_dir(). We
can make a copy of the pointer for the old gitdir, and then
avoid freeing it until after we've made our new copy.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King f9b7573f6b repository: free fields before overwriting them
It's possible that the repository data may be initialized
twice (e.g., after doing a chdir() to the top of the
worktree we may have to adjust a relative git_dir path). We
should free() any existing fields before assigning to them
to avoid leaks.

This should be safe, as the fields are set based on the
environment or on other strings like the gitdir or
commondir. That makes it impossible that we are feeding an
alias to the just-freed string.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King afbb8838b7 reset: free allocated tree buffers
We read the tree objects with fill_tree_descriptor(), but
never actually free the resulting buffers, causing a memory
leak. This isn't a huge deal because we call this code at
most twice per program invocation. But it does potentially
double our heap usage if you have large root trees. Let's
free the trees before returning.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King e9ce897b9f reset: make tree counting less confusing
Depending on whether we're in --keep mode, git-reset may
feed one or two trees to unpack_trees(). We start a counter
at "1" and then increment it to "2" only for the two-tree
case. But that means we must always subtract one to find the
correct array slot to fill with each descriptor.

Instead, let's start at "0" and just increment our counter
after adding each tree. This skips the extra subtraction,
and will make things much easier when we start to actually
free our tree buffers.

While we're at it, let's make the first allocation use the
slot at "desc + nr", too, even though we know "nr" is 0 at
that point. It makes the two fill_tree_descriptor() calls
consistent (always "desc + nr", followed by always
incrementing "nr").

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King 6c6b08d269 config: plug user_config leak
We generate filenames for the user_config ("~/.gitconfig")
and the xdg config ("$XDG_CONFIG_HOME/git/config") and then
decide which to use by looking at the filesystem. But after
selecting one, the unused string is just leaked.

This is a tiny leak, but it creates noise in leak-checker
output.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King baddc96b2c update-index: fix cache entry leak in add_one_file()
When we fail to add the cache entry to the index, we end up
just leaking the struct. We should follow the pattern of the
early-return above and free it.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King fe6a01af8a add: free leaked pathspec after add_files_to_cache()
After run_diff_files, we throw away the rev_info struct,
including the pathspec that we copied into it, leaking the
memory. this is probably not a big deal in practice. We
usually only run this once per process, and the leak is
proportional to the pathspec list we're already holding in
memory.

But it's still a leak, and it pollutes leak-checker output,
making it harder to find important leaks.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King 85b81b35ff test-lib: set LSAN_OPTIONS to abort by default
We already set ASAN_OPTIONS to abort if it finds any errors.
As we start to experiment with LSAN, the leak sanitizer,
it's convenient if we give it the same treatment.

Note that ASAN is actually a superset of LSAN and can do the
leak detection itself. So this only has an effect if you
specifically build with "make SANITIZE=leak" (leak detection
but not the rest of ASAN). Building with just LSAN results
in a build that runs much faster. That makes the
build-test-fix cycle more pleasant.

In the long run, once we've fixed or suppressed all the
leaks, it will probably be worth turning leak-detection on
for ASAN and just using that (to check both leaks _and_
memory errors in a single test run). But there's still a lot
of work before we get there.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King 88c6e9d31c test-lib: --valgrind should not override --verbose-log
The --verbose test option cannot be used with test harnesses
like "prove". Instead, you must use --verbose-log.

Since the --valgrind option implies --verbose, that means
that it cannot be used with prove. I.e., this does not work:

  prove t0000-basic.sh :: --valgrind

You'd think it could be fixed by doing:

  prove t0000-basic.sh :: --valgrind --verbose-log

but that doesn't work either, because the implied --verbose
takes precedence over --verbose-log. If the user has given
us a specific option, we should prefer that.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 18:06:26 +09:00
Jeff King bfffb48c5d stop leaking lock structs in some simple cases
Now that it's safe to declare a "struct lock_file" on the
stack, we can do so (and avoid an intentional leak). These
leaks were found by running t0000 and t0001 under valgrind
(though certainly other similar leaks exist and just don't
happen to be exercised by those tests).

Initializing the lock_file's inner tempfile with NULL is not
strictly necessary in these cases, but it's a good practice
to model.  It means that if we were to call a function like
rollback_lock_file() on a lock that was never taken in the
first place, it becomes a quiet noop (rather than undefined
behavior).

Likewise, it's always safe to rollback_lock_file() on a file
that has already been committed or deleted, since that
operation is a noop on an inactive lockfile (and that's why
the case in config.c can drop the "if (lock)" check as we
move away from using a pointer).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Jeff King ee4d8e455c ref_lock: stop leaking lock_files
Since the tempfile code recently relaxed the rule that
tempfile structs (and thus locks) need to hang around
forever, we no longer have to leak our lock_file structs.

In fact, we don't even need to heap-allocate them anymore,
since their lifetime can just match that of the surrounding
ref_lock (and if we forget to delete a lock, the effect is
the same as before: it will eventually go away at program
exit).

Note that there is a check in unlock_ref() to only rollback
a lock file if it has been allocated. We don't need that
check anymore; we zero the ref_lock (and thus the
lock_file), so at worst we pass a NULL pointer to
delete_tempfile(), which considers that a noop.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Jeff King 5e7f01c93e lockfile: update lifetime requirements in documentation
Now that the tempfile system we rely on has loosened the
lifetime requirements for storage, we can adjust our
documentation to match.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Jeff King 076aa2cbda tempfile: auto-allocate tempfiles on heap
The previous commit taught the tempfile code to give up
ownership over tempfiles that have been renamed or deleted.
That makes it possible to use a stack variable like this:

  struct tempfile t;

  create_tempfile(&t, ...);
  ...
  if (!err)
          rename_tempfile(&t, ...);
  else
          delete_tempfile(&t);

But doing it this way has a high potential for creating
memory errors. The tempfile we pass to create_tempfile()
ends up on a global linked list, and it's not safe for it to
go out of scope until we've called one of those two
deactivation functions.

Imagine that we add an early return from the function that
forgets to call delete_tempfile(). With a static or heap
tempfile variable, the worst case is that the tempfile hangs
around until the program exits (and some functions like
setup_shallow_temporary rely on this intentionally, creating
a tempfile and then leaving it for later cleanup).

But with a stack variable as above, this is a serious memory
error: the variable goes out of scope and may be filled with
garbage by the time the tempfile code looks at it.  Let's
see if we can make it harder to get this wrong.

Since many callers need to allocate arbitrary numbers of
tempfiles, we can't rely on static storage as a general
solution. So we need to turn to the heap. We could just ask
all callers to pass us a heap variable, but that puts the
burden on them to call free() at the right time.

Instead, let's have the tempfile code handle the heap
allocation _and_ the deallocation (when the tempfile is
deactivated and removed from the list).

This changes the return value of all of the creation
functions. For the cleanup functions (delete and rename),
we'll add one extra bit of safety: instead of taking a
tempfile pointer, we'll take a pointer-to-pointer and set it
to NULL after freeing the object. This makes it safe to
double-call functions like delete_tempfile(), as the second
call treats the NULL input as a noop. Several callsites
follow this pattern.

The resulting patch does have a fair bit of noise, as each
caller needs to be converted to handle:

  1. Storing a pointer instead of the struct itself.

  2. Passing the pointer instead of taking the struct
     address.

  3. Handling a "struct tempfile *" return instead of a file
     descriptor.

We could play games to make this less noisy. For example, by
defining the tempfile like this:

  struct tempfile {
	struct heap_allocated_part_of_tempfile {
                int fd;
                ...etc
        } *actual_data;
  }

Callers would continue to have a "struct tempfile", and it
would be "active" only when the inner pointer was non-NULL.
But that just makes things more awkward in the long run.
There aren't that many callers, so we can simply bite
the bullet and adjust all of them. And the compiler makes it
easy for us to find them all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Jeff King 422a21c6a0 tempfile: remove deactivated list entries
Once a "struct tempfile" is added to the global cleanup
list, it is never removed. This means that its storage must
remain valid for the lifetime of the program. For single-use
tempfiles and locks, this isn't a big deal: we just declare
the struct static. But for library code which may take
multiple simultaneous locks (like the ref code), they're
forced to allocate a struct on the heap and leak it.

This is mostly OK in practice. The size of the leak is
bounded by the number of refs, and most programs exit after
operating on a fixed number of refs (and allocate
simultaneous memory proportional to the number of ref
updates in the first place). But:

  1. It isn't hard to imagine a real leak: a program which
     runs for a long time taking a series of ref update
     instructions and fulfilling them one by one. I don't
     think we have such a program now, but it's certainly
     plausible.

  2. The leaked entries appear as false positives to
     tools like valgrind.

Let's relax this rule by keeping only "active" tempfiles on
the list. We can do this easily by moving the list-add
operation from prepare_tempfile_object to activate_tempfile,
and adding a deletion in deactivate_tempfile.

Existing callers do not need to be updated immediately.
They'll continue to leak any tempfile objects they may have
allocated, but that's no different than the status quo. We
can clean them up individually.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Jeff King 24d82185d2 tempfile: use list.h for linked list
The tempfile API keeps to-be-cleaned tempfiles in a
singly-linked list and never removes items from the list.  A
future patch would like to start removing items, but removal
from a singly linked list is O(n), as we have to walk the
list to find the predecessor element. This means that a
process which takes "n" simultaneous lockfiles (for example,
an atomic transaction on "n" refs) may end up quadratic in
"n".

Before we start allowing items to be removed, it would be
nice to have a way to cover this case in linear time.

The simplest solution is to make an assumption about the
order in which tempfiles are added and removed from the
list. If both operations iterate over the tempfiles in the
same order, then by putting new items at the end of the list
our removal search will always find its items at the
beginning of the list. And indeed, that would work for the
case of refs. But it creates a hidden dependency between
unrelated parts of the code. If anybody changes the ref code
(or if we add a new caller that opens multiple simultaneous
tempfiles) they may unknowingly introduce a performance
regression.

Another solution is to use a better data structure. A
doubly-linked list works fine, and we already have an
implementation in list.h. But there's one snag: the elements
of "struct tempfile" are all marked as "volatile", since a
signal handler may interrupt us and iterate over the list at
any moment (even if we were in the middle of adding a new
entry).

We can declare a "volatile struct list_head", but we can't
actually use it with the normal list functions. The compiler
complains about passing a pointer-to-volatile via a regular
pointer argument. And rightfully so, as the sub-function
would potentially need different code to deal with the
volatile case.

That leaves us with a few options:

  1. Drop the "volatile" modifier for the list items.

     This is probably a bad idea. I checked the assembly
     output from "gcc -O2", and the "volatile" really does
     impact the order in which it updates memory.

  2. Use macros instead of inline functions. The irony here
     is that list.h is entirely implemented as trivial
     inline functions. So we basically are already
     generating custom code for each call. But sadly there's no
     way in C to declare the inline function to take a more
     generic type.

     We could do so by switching the inline functions to
     macros, but it does make the end result harder to read.
     And it doesn't fully solve the problem (for instance,
     the declaration of list_head needs to change so that
     its "prev" and "next" pointers point to other volatile
     structs).

  3. Don't use list.h, and just make our own ad-hoc
     doubly-linked list. It's not that much code to
     implement the basics that we need here. But if we're
     going to do so, why not add the few extra lines
     required to model it after the actual list.h interface?
     We can even reuse a few of the macro helpers.

So this patch takes option 3, but actually implements a
parallel "volatile list" interface in list.h, where it could
potentially be reused by other code. This implements just
enough for tempfile.c's use, though we could easily port
other functions later if need be.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Jeff King 102cf7a6aa tempfile: release deactivated strbufs instead of resetting
When a tempfile is deactivated, we reset its strbuf to the
empty string, which means we hold onto the memory for later
reuse.

Since we'd like to move to a system where tempfile structs
can actually be freed, deactivating one should drop all
resources it is currently using. And thus "release" rather
than "reset" is the appropriate function to call.

In theory the reset may have saved a malloc() when a
tempfile (or a lockfile) is reused multiple times. But in
practice this happened rarely. Most of our tempfiles are
single-use, since in cases where we might actually use many
(like ref locking) we xcalloc() a fresh one for each ref. In
fact, we leak those locks (to appease the rule that tempfile
storage can never be freed). Which means that using reset is
actively hurting us: instead of leaking just the tempfile
struct, we're leaking the extra heap chunk for the filename,
too.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:54 +09:00
Jeff King 6b93506696 tempfile: robustify cleanup handler
We may call remove_tempfiles() from an atexit handler, or
from a signal handler. In the latter case we must take care
to avoid functions which may deadlock if the process is in
an unknown state, including looking at any stdio handles
(which may be in the middle of doing I/O and locked) or
calling malloc() or free().

The current implementation calls delete_tempfile(). We unset
the tempfile's stdio handle (if any) to avoid deadlocking
there. But delete_tempfile() still calls unlink_or_warn(),
which can deadlock writing to stderr if the unlink fails.

Since delete_tempfile() isn't very long, let's just
open-code our own simple conservative version of the same
thing.  Notably:

  1. The "skip_fclose" flag is now called "in_signal_handler",
     because it should inform more decisions than just the
     fclose handling.

  2. We can replace close_tempfile() with just close(fd).
     That skips the fclose() question altogether. This is
     fine for the atexit() case, too; there's no point
     flushing data to a file which we're about to delete
     anyway.

  3. We can choose between unlink/unlink_or_warn based on
     whether it's safe to use stderr.

  4. We can replace the deactivate_tempfile() call with a
     simple setting of the active flag. There's no need to
     do any further cleanup since we know the program is
     exiting.  And even though the current deactivation code
     is safe in a signal handler, this frees us up in future
     patches to make non-signal deactivation more
     complicated (e.g., by freeing resources).

  5. There's no need to remove items from the tempfile_list.
     The "active" flag is the ultimate answer to whether an
     entry has been handled or not. Manipulating the list
     just introduces more chance of recursive signals
     stomping on each other, and the whole list will go away
     when the program exits anyway. Less is more.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King b5f4dcb598 tempfile: factor out deactivation
When we deactivate a tempfile, we also have to clean up the
"filename" strbuf. Let's pull this out into its own function
to keep the logic in one place (which will become more
important when a future patch makes it more complicated).

Note that we can use the same function when deactivating an
object that _isn't_ actually active yet (like when we hit an
error creating a tempfile). These callsites don't currently
reset the "active" flag to 0, but it's OK to do so (it's
just a noop for these cases).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King 2933ebbac1 tempfile: factor out activation
There are a few steps required to "activate" a tempfile
struct. Let's pull these out into a function. That saves a
few repeated lines now, but more importantly will make it
easier to change the activation scheme later.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King 9b028aa45a tempfile: replace die("BUG") with BUG()
Compared to die(), using BUG() triggers abort(). That may
give us an actual coredump, which should make it easier to
get a stack trace. And since the programming error for these
assertions is not in the functions themselves but in their
callers, such a stack trace is needed to actually find the
source of the bug.

In addition, abort() raises SIGABRT, which is more likely to
be caught by our test suite.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King f5b4dc7668 tempfile: handle NULL tempfile pointers gracefully
The tempfile functions all take pointers to tempfile
objects, but do not check whether the argument is NULL.
This isn't a big deal in practice, since the lifetime of any
tempfile object is defined to last for the whole program. So
even if we try to call delete_tempfile() on an
already-deleted tempfile, our "active" check will tell us
that it's a noop.

In preparation for transitioning to a new system that
loosens the "tempfile objects can never be freed" rule,
let's tighten up our active checks:

  1. A NULL pointer is now defined as "inactive" (so it will
     BUG for most functions, but works as a silent noop for
     things like delete_tempfile).

  2. Functions should always do the "active" check before
     looking at any of the struct fields.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King e6fc267314 tempfile: prefer is_tempfile_active to bare access
The tempfile code keeps an "active" flag, and we have a
number of assertions to make sure that the objects are being
used in the right order. Most of these directly check
"active" rather than using the is_tempfile_active()
accessor.

Let's prefer using the accessor, in preparation for it
growing more complicated logic (like checking for NULL).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King 83a3069a38 lockfile: do not rollback lock on failed close
Since the lockfile code is based on the tempfile code, it
has some of the same problems, including that close_lock_file()
erases the tempfile's filename buf, making it hard for the
caller to write a good error message.

In practice this comes up less for lockfiles than for
straight tempfiles, since we usually just report the
refname. But there is at least one buggy case in
write_ref_to_lockfile(). Besides, given the coupling between
the lockfile and tempfile modules, it's less confusing if
their close() functions have the same semantics.

Just as the previous commit did for close_tempfile(), let's
teach close_lock_file() and its wrapper close_ref() not to
rollback on error. And just as before, we'll give them new
"gently" names to catch any new callers that are added.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King 49bd0fc222 tempfile: do not delete tempfile on failed close
When close_tempfile() fails, we delete the tempfile and
reset the fields of the tempfile struct. This makes it
easier for callers to return without cleaning up, but it
also makes this common pattern:

  if (close_tempfile(tempfile))
	return error_errno("error closing %s", tempfile->filename.buf);

wrong, because the "filename" field has been reset after the
failed close. And it's not easy to fix, as in many cases we
don't have another copy of the filename (e.g., if it was
created via one of the mks_tempfile functions, and we just
have the original template string).

Let's drop the feature that a failed close automatically
deletes the file. This puts the burden on the caller to do
the deletion themselves, but this isn't that big a deal.
Callers which do:

  if (write(...) || close_tempfile(...)) {
	delete_tempfile(...);
	return -1;
  }

already had to call delete when the write() failed, and so
aren't affected. Likewise, any caller which just calls die()
in the error path is OK; we'll delete the tempfile during
the atexit handler.

Because this patch changes the semantics of close_tempfile()
without changing its signature, all callers need to be
manually checked and converted to the new scheme. This patch
covers all in-tree callers, but there may be others for
not-yet-merged topics. To catch these, we rename the
function to close_tempfile_gently(), which will attract
compile-time attention to new callers. (Technically the
original could be considered "gentle" already in that it
didn't die() on errors, but this one is even more so).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King 45c6b1ed24 always check return value of close_tempfile
If close_tempfile() encounters an error, then it deletes the
tempfile and resets the "struct tempfile". But many code
paths ignore the return value and continue to use the
tempfile. Instead, we should generally treat this the same
as a write() error.

Note that in the postimage of some of these cases our error
message will be bogus after a failed close because we look
at tempfile->filename (either directly or via get_tempfile_path).
But after the failed close resets the tempfile object, this
is guaranteed to be the empty string. That will be addressed
in a future patch (because there are many more cases of the
same problem than just these instances).

Note also in the hunk in gpg-interface.c that it's fine to
call delete_tempfile() in the error path, even if
close_tempfile() failed and already deleted the file. The
tempfile code is smart enough to know the second deletion is
a noop.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:53 +09:00
Jeff King d88ef66051 verify_signed_buffer: prefer close_tempfile() to close()
We do a manual close() on the descriptor provided to us by
mks_tempfile. But this runs contrary to the advice in
tempfile.h, which notes that you should always use
close_tempfile(). Otherwise the descriptor may be reused
without the tempfile object knowing it, and the later call
to delete_tempfile() could close a random descriptor.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:52 +09:00
Jeff King c0e963b77c setup_temporary_shallow: move tempfile struct into function
The setup_temporary_shallow() function creates a temporary
file, but we never access the tempfile struct outside of the
function. This is OK, since it means we'll just clean up the
tempfile on exit.  But we can simplify the code a bit by
moving the global tempfile struct to the only function in
which it's used.

Note that it must remain "static" due to tempfile.c's
requirement that tempfile storage never goes away until
program exit.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:52 +09:00
Jeff King 0899013993 setup_temporary_shallow: avoid using inactive tempfile
When there are no shallow entries to write, we skip creating
the tempfile entirely and try to return the empty string.

But we do so by calling get_tempfile_path() on the inactive
tempfile object. This will trigger an assertion that kills
the program. The bug was introduced by 6e122b449b
(setup_temporary_shallow(): use tempfile module,
2015-08-10). But nobody seems to have noticed since then
because we do not end up calling this function at all when
there are no shallow items. In other words, this code path
is completely unexercised.

Since the tempfile object is a static global, it _is_
possible that we call the function twice, writing out
shallow info the first time and then "reusing" our tempfile
object the second time. But:

  1. It seems unlikely that this was the intent, as hitting
     this code path would imply somebody clearing the
     shallow_info list between calls.

     And if somebody _did_ call the function multiple times
     without clearing the shallow_info list, we'd hit a
     different BUG for trying to reuse an already-active
     tempfile.

  2. I verified by code inspection that the function is only
     called once per program. And also replacing this code
     with a BUG() and running the test suite demonstrates
     that it is not triggered there.

So we could probably just replace this with an assertion and
confirm that it's never called. However, the original intent
does seem to be that you _could_ call it when the
shallow_info is empty. And that's easy enough to do; since
the return value doesn't need to point to a writable buffer,
we can just return a string literal.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:52 +09:00
Jeff King c82c75b951 write_index_as_tree: cleanup tempfile on error
If we failed to write our new index file, we rollback our
lockfile to remove the temporary index. But if we fail
before we even get to the write step (because reading the
old index failed), we leave the lockfile in place, which
makes no sense.

In practice this hasn't been a big deal because failing at
write_index_as_tree() typically results in the whole program
exiting (and thus the tempfile handler kicking in and
cleaning up the files). But this function should
consistently take responsibility for the resources it
allocates.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 17:19:52 +09:00
Junio C Hamano 3ec7d702a8 The sixth batch post 2.14
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 13:15:24 +09:00
Junio C Hamano a7d7f125e3 Merge branch 'rs/archive-excluded-directory'
"git archive" did not work well with pathspecs and the
export-ignore attribute.

* rs/archive-excluded-directory:
  archive: don't queue excluded directories
  archive: factor out helper functions for handling attributes
  t5001: add tests for export-ignore attributes and exclude pathspecs
2017-09-06 13:11:25 +09:00
Junio C Hamano 8b36f0b196 Merge branch 'po/read-graft-line'
Conversion from uchar[20] to struct object_id continues; this is to
ensure that we do not assume sizeof(struct object_id) is the same
as the length of SHA-1 hash (or length of longest hash we support).

* po/read-graft-line:
  commit: rewrite read_graft_line
  commit: allocate array using object_id size
  commit: replace the raw buffer with strbuf in read_graft_line
  sha1_file: fix definition of null_sha1
2017-09-06 13:11:25 +09:00
Junio C Hamano 1fb77b3ee5 Merge branch 'ks/branch-set-upstream'
"branch --set-upstream" that has been deprecated in Git 1.8 has
finally been retired.

* ks/branch-set-upstream:
  branch: quote branch/ref names to improve readability
  builtin/branch: stop supporting the "--set-upstream" option
  t3200: cleanup cruft of a test
2017-09-06 13:11:24 +09:00
Martin Ågren 150efef1e7 pkt-line: re-'static'-ify buffer in packet_write_fmt_1()
The static-ness was silently dropped in commit 70428d1a5 ("pkt-line: add
packet_write_fmt_gently()", 2016-10-16). As a result, for each call to
packet_write_fmt_1, we allocate and leak a buffer.

We could keep the strbuf non-static and instead make sure we always
release it before returning (but not before we die, so that we don't
touch errno). That would also prepare us for threaded use. But until
that needs to happen, let's just restore the static-ness so that we get
back to a situation where we (eventually) do not continuosly keep
allocating memory.

Signed-off-by: Martin Ågren <martin.agren@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 13:11:14 +09:00
Kevin Willford ef9c4dc3b6 merge-recursive: remove return value from get_files_dirs
The return value of the get_files_dirs call is never being used.
Looking at the history of the file and it was originally only
being used for debug output statements.  Also when
read_tree_recursive return value is non zero it is changed to
zero.  This leads me to believe that it doesn't matter if
read_tree_recursive gets an error.

Since the debug output has been removed and the caller isn't
checking the return value there is no reason to keep calculating
and returning a value.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 13:10:21 +09:00
Kevin Willford e336bdc5b9 merge-recursive: fix memory leak
In merge_trees if process_renames or process_entry returns less
than zero, the method will just return and not free re_merge,
re_head, or entries.

This change cleans up the allocated variables before returning
to the caller.

Signed-off-by: Kevin Willford <kewillf@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 13:10:20 +09:00
Jeff King f991761eb8 config: use a static lock_file struct
When modifying git config, we xcalloc() a struct lock_file
but never free it. This is necessary because the tempfile
code (upon which the locking code is built) requires that
the resulting struct remain valid through the life of the
program. However, it also confuses leak-checkers like
valgrind because only the inner "struct tempfile" is still
reachable; no pointer to the outer lock_file is kept.

Other code paths solve this by using a single static lock
struct. We can do the same here, because we know that we'll
only lock and modify one config file at a time (and
assertions within the lockfile code will ensure that this
remains the case).

That removes a real leak (when we fail to free the struct
after locking fails) as well as removes the valgrind false
positive. It also means that doing N sequential
config-writes will use a constant amount of memory, rather
than leaving stale lock_files for each.

Note that since "lock" is no longer a pointer, it can't be
NULL anymore. But that's OK. We used that feature only to
avoid calling rollback_lock_file() on an already-committed
lock. Since the lockfile code keeps its own "active" flag,
it's a noop to rollback an inactive lock, and we don't have
to worry about this ourselves.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-06 12:59:53 +09:00