Several files were including cache.h solely to get other headers, such
as trace.h and trace2.h. Since the last few commits have modified
files to make these dependencies more explicit, the inclusion of cache.h
is no longer needed in several cases. Remove it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h. This made it more difficult
to find which files could remove a dependence on cache.h. Make C files
explicitly include trace.h or trace2.h if they are using them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is another step towards letting us remove the include of cache.h in
strbuf.c. It does mean that we also need to add includes of abspath.h
in a number of C files.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
trace_repo_setup() of trace.c is called with the argument 'prefix' from
only one location, run_builtin of git.c, which sets 'prefix' to the return
value of setup_git_directory() or setup_git_directory_gently() (a wrapper
of the former).
Now that "prefix" is in startup_info there is no need for the parameter
of trace_repo_setup() because setup_git_directory() sets "startup_info->prefix"
to the same value it returns. It would be less confusing to use "prefix"
from startup_info instead of passing it as an argument.
Signed-off-by: Idriss Fekir <mcsm224@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "else" branches of the HAVE_VARIADIC_MACROS macro, which
have been unconditionally omitted since 765dc16888 (git-compat-util:
always enable variadic macros, 2021-01-28).
Since were always omitted, anyone trying to use a compiler without
variadic macro support to compile a git since version
git v2.31.0 or later would have had a compilation error. 10 months
across a few releases since then should have been enough time for
anyone who cared to run into that and report the issue.
In addition to that, for anyone unsetting HAVE_VARIADIC_MACROS we've
been emitting extremely verbose warnings since at least
ee4512ed48 (trace2: create new combined trace facility,
2019-02-22). That's because there is no such thing as a
"region_enter_printf" or "region_leave_printf" format, so at least
under GCC and Clang everything that includes trace.h (almost every
file) emits a couple of warnings about that.
There's a large benefit to being able to have a hard dependency rely
on variadic macros, the code surrounding usage.c is hard to maintain
if we need to write two implementations of everything, and by relying
on "__FILE__" and "__LINE__" along with "__VA_ARGS__" we can in the
future make error(), die() etc. log where they were called from. We've
also recently merged d67fc4bf0b (Merge branch 'bc/require-c99',
2021-12-10) which further cements our hard dependency on C99.
So let's delete the fallback code, and update our CodingGuidelines to
note that we depend on this. The added bullet-point starts with
lower-case for consistency with other bullet-points in that section.
The diff in "trace.h" is relatively hard to read, since we need to
retain the existing API docs, which were comments on the code used if
HAVE_VARIADIC_MACROS was not defined.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Whenever GIT_CURL_VERBOSE is set, teach Git to behave as if
GIT_TRACE_CURL=1 and GIT_TRACE_CURL_NO_DATA=1 is set, instead of setting
CURLOPT_VERBOSE.
This is to prevent inadvertent revelation of sensitive data. In
particular, GIT_CURL_VERBOSE redacts neither the "Authorization" header
nor any cookies specified by GIT_REDACT_COOKIES.
Unifying the tracing mechanism also has the future benefit that any
improvements to the tracing mechanism will benefit both users of
GIT_CURL_VERBOSE and GIT_TRACE_CURL, and we do not need to remember to
implement any improvement twice.
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Long ago, in 97bfeb34df (Release pack windows before reporting out of
memory., 2006-12-24), we taught xmalloc() and friends to try unmapping
pack windows when malloc() failed. It's unlikely that his helps a lot in
practice, and it has some downsides. First, the downsides:
1. It makes xmalloc() not thread-safe. We've worked around this in
pack-objects.c, which installs its own locking version of the
try_to_free_routine(). But other threaded code doesn't.
2. It makes the system as a whole harder to reason about. Functions
which allocate heap memory under the hood may have farther-reaching
effects than expected.
That might be worth the tradeoff if there's a benefit. But in practice,
it seems unlikely. We're generally dealing with mmap'd files, so the OS
is going to do a much better job at responding to memory pressure by
dropping individual pages (the exception is systems with NO_MMAP, but
even there the OS can probably respond just as well with swapping).
So the only thing we're really freeing is address space. On 64-bit
systems, we have plenty of that to go around. On 32-bit systems, it
could possibly help. But around the same time we made two other changes:
77ccc5bbd1 (Introduce new config option for mmap limit., 2006-12-23) and
60bb8b1453 (Fully activate the sliding window pack access., 2006-12-23).
Together that means that a 32-bit system should have no more than 256MB
total of packed-git mmaps at one time, split between a few 32MB windows.
It's unlikely we have any address space problems since then, but we
don't have any data since the features were all added at the same time.
Likewise, xmmap() will try to free memory. At first glance, it seems
like we'd need this (when we try to mmap a new window, we might need to
close an old one to save address space on a 32-bit system). But we're
saved again by core.packedGitLimit: if we're going to exceed our 256MB
limit, we'll close an existing window before we even call mmap().
So it seems unlikely that this feature is actually doing anything
useful. And while we don't have reports of it harming anything (probably
because it rarely if ever kicks in), it would be nice to simplify the
system overall. This patch drops the whole try_to_free system from
xmalloc(), as well as the manual pack memory release in xmmap().
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Performance measurements are listed right now as a flat list, which is
fine when we measure big blocks. But when we start adding more and
more measurements, some of them could be just part of a bigger
measurement and a flat list gives a wrong impression that they are
executed at the same level instead of nested.
Add trace_performance_enter() and trace_performance_leave() to allow
indent these nested measurements. For now it does not help much
because the only nested thing is (lazy) name hash initialization
(e.g. called in diff-index from "git status"). This will help more
because I'm going to add some more tracing that's actually nested.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is so that we can print traces based on this key outside trace.c.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function is about printing a trace line, not releasing the buffer it
receives too. Move strbuf_release() back outside. This makes it easier
to see how strbuf is managed.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Trace output which contains arbitrary strings (e.g., the
arguments to commands which we are running) is always passed
through sq_quote_buf(). That function always adds
single-quotes, even if the output consists of vanilla
characters. This can make the output a bit hard to read.
Let's avoid the quoting if there are no characters which a
shell would interpret. Trace output doesn't necessarily need
to be shell-compatible, but:
- the shell language is a good ballpark for what humans
consider readable (well, humans versed in command line
tools)
- the run_command bits can be cut-and-pasted to a shell,
and we'll keep that property
- it covers any cases which would make the output
visually ambiguous (e.g., embedded whitespace or quotes)
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No caller passes anything but "0" for this parameter, which
requests that the function ignore it completely. In fact, in
all of history there was only one such caller, and it went
away in 7f51f8bc2b (alias: use run_command api to execute
aliases, 2011-01-07).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Move just enough code from trace.c into trace.h header so all code
necessary to determine that trace is disabled could be inlined to
calling functions. Then perform the check if the trace key is
enabled sooner in call chain.
Signed-off-by: Gennady Kupava <gkupava@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Trace key normalization is not used, not strictly necessary,
complicates the code and would negatively affect compilation speed if
moved to header.
New trace_default_key key or existing separate marco could be used
instead of passing NULL as a key.
Signed-off-by: Gennady Kupava <gkupava@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The mailing address for the FSF has changed over the years. Rather than
updating the address across all files, refer readers to gnu.org, as the
GNU GPL documentation now suggests for license notices. The mailing
address is retained in the full license files (COPYING and LGPL-2.1).
The old address is still present in t/diff-lib/COPYING. This is
intentional, as the file is used in tests and the contents are not
expected to change.
Signed-off-by: Todd Zullinger <tmz@pobox.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If the trace code cannot open a specified file, or does not
understand the contents of the GIT_TRACE variable, it falls
back to printing trace output to stderr.
This is an attempt to be helpful, but in practice it just
ends up annoying. The user was trying to get the output to
go somewhere else, so spewing it to stderr does not really
accomplish that. And as it's intended for debugging, they
can presumably re-run the command with their error
corrected.
So instead of falling back, this patch disables bogus trace
keys for the rest of the program, just as we do for write
errors. We can drop the "Defaulting to..." part of the error
message entirely; after seeing "cannot open '/foo'", the
user can assume that tracing is skipped.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If we get a write error writing to a trace descriptor, the
error isn't likely to go away if we keep writing. Instead,
you'll just get the same error over and over. E.g., try:
GIT_TRACE_PACKET=42 git ls-remote >/dev/null
You don't really need to see:
warning: unable to write trace for GIT_TRACE_PACKET: Bad file descriptor
hundreds of times. We could fallback to tracing to stderr,
as we do in the error code-path for open(), but there's not
much point. If the user fed us a bogus descriptor, they're
probably better off fixing their invocation. And if they
didn't, and we saw a transient error (e.g., ENOSPC writing
to a file), it probably doesn't help anybody to have half of
the trace in a file, and half on stderr.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Our error message for write() always mentions GIT_TRACE,
even though we may be writing for a different variable
entirely. It's also not quite accurate to say "fd given by
GIT_TRACE environment variable", as we may hit this error
based on a filename the user put in the variable (we do
complain and switch to stderr if the file cannot be opened,
but it's still possible to hit a write() error on the
descriptor later).
So let's fix those things, and switch to our more usual
"unable to do X: Y" format for the error.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The error messages for the trace code are often multi-line;
the first line gets a nice "warning:", but the rest are
left-aligned. Let's give them an indentation to make sure
they stand out as a unit.
While we're here, let's also downcase the first letter of
each error (our usual style), and break up a long line of
advice (since we're already using multiple lines, one more
doesn't hurt).
We also replace "What does 'foo' for GIT_TRACE mean?". While
cute, it's probably a good idea to give more context, and
follow our usual styles. So it's now "unknown trace value
for 'GIT_TRACE': foo".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Right now we just fprintf() straight to stderr, which can
make the output hard to distinguish. It would be helpful to
give it one of our usual prefixes like "error:", "warning:",
etc.
It doesn't make sense to use error() here, as the trace code
is "optional" debugging code. If something goes wrong, we
should warn the user, but saying "error" implies the actual
git operation had a problem. So warning() is the only sane
choice.
Note that this does end up calling warn_routine() to do the
formatting. This is probably a good thing, since they are
clearly trying to hook messages before they make it to
stderr. However, it also means that in theory somebody who
tries to trace from their warn_routine() could cause a loop.
This seems rather unlikely in practice (we've never even
overridden the default warn_builtin routine before, and
recent discussions to do so would just install a noop
routine).
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The write_or_whine_pipe function does two things:
1. it checks for EPIPE and converts it into a signal death
2. it prints a message to stderr on error
The first thing does not help us, and actively hurts.
Generally we would simply die from SIGPIPE in this case,
unless somebody has taken the time to ignore SIGPIPE for the
whole process. And if they _did_ do that, it seems rather
silly for the trace code, which otherwise takes pains to
continue even in the face of errors (e.g., by not using
write_or_die!), to take down the whole process for one
specific type of error.
Nor does the second thing help us; it just makes it harder
to write our error message, because we have to feed bits of
it as an argument to write_or_whine_pipe(). Translators
never get to see the full message, and it's hard for us to
customize it.
Let's switch to just using write_in_full() and writing our
own error string. For now, the error is identical to what
write_or_whine_pipe() would say, but now that it's more
under our control, we can improve it in future patches.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
All of the trace functions treat a NULL key as a synonym for
the default GIT_TRACE key. Except for trace_disable(), which
will segfault.
Fortunately, this can't cause any bugs, as the function has
no callers. But rather than drop it, let's fix the bug, as I
plan to add a caller.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When we output GIT_TRACE_SETUP paths, we quote any
meta-characters. But our buffer to hold the result is only
PATH_MAX bytes, and we could double the size of the input
path (if every character needs quoting). We could use a
2*PATH_MAX buffer, if we assume the input will never be more
than PATH_MAX. But it's easier still to just switch to a
strbuf and not worry about whether the input can exceed
PATH_MAX or not.
The original copied the "p2" pointer to "p1", advancing
both. Since this gets rid of "p1", let's also drop "p2",
whose name is now confusing. We can just advance the
original "path" pointer.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When debugging the pack protocol, it is sometimes useful to
store the verbatim pack that we sent or received on the
wire. Looking at the on-disk result is often not helpful for
a few reasons:
1. If the operation is a clone, we destroy the repo on
failure, leaving nothing on disk.
2. If the pack is small, we unpack it immediately, and the
full pack never hits the disk.
3. If we feed the pack to "index-pack --fix-thin", the
resulting pack has the extra delta bases added to it.
We already have a GIT_TRACE_PACKET mechanism for tracing
packets. Let's extend it with GIT_TRACE_PACKFILE to dump the
verbatim packfile.
There are a few other positive fallouts that come from
rearranging this code:
- We currently disable the packet trace after seeing the
PACK header, even though we may get human-readable lines
on other sidebands; now we include them in the trace.
- We currently try to print "PACK ..." in the trace to
indicate that the packfile has started. But because we
disable packet tracing, we never printed this line. We
will now do so.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A replacement for contrib/workdir/git-new-workdir that does not
rely on symbolic links and make sharing of objects and refs safer
by making the borrowee and borrowers aware of each other.
* nd/multiple-work-trees: (41 commits)
prune --worktrees: fix expire vs worktree existence condition
t1501: fix test with split index
t2026: fix broken &&-chain
t2026 needs procondition SANITY
git-checkout.txt: a note about multiple checkout support for submodules
checkout: add --ignore-other-wortrees
checkout: pass whole struct to parse_branchname_arg instead of individual flags
git-common-dir: make "modules/" per-working-directory directory
checkout: do not fail if target is an empty directory
t2025: add a test to make sure grafts is working from a linked checkout
checkout: don't require a work tree when checking out into a new one
git_path(): keep "info/sparse-checkout" per work-tree
count-objects: report unused files in $GIT_DIR/worktrees/...
gc: support prune --worktrees
gc: factor out gc.pruneexpire parsing code
gc: style change -- no SP before closing parenthesis
checkout: clean up half-prepared directories in --to mode
checkout: reject if the branch is already checked out elsewhere
prune: strategies for linked checkouts
checkout: support checking out into a new working directory
...
Set or clear Makefile variables HAVE_CLOCK_GETTIME and
HAVE_CLOCK_MONOTONIC based upon results of the checks (overriding
default values from config.mak.uname).
CLOCK_MONOTONIC isn't available on RHEL3, but there are still RHEL3
systems being used in production.
Signed-off-by: Reuben Hawkins <reubenhwk@gmail.com>
Reviewed-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Call strbuf_complete_line() instead of open-coding it. Also remove
surrounding comments indicating the intent to complete a line since
this information is already included in the function name.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The repo setup procedure is updated to detect $GIT_DIR/commondir and
set $GIT_COMMON_DIR properly.
The core.worktree is ignored when $GIT_COMMON_DIR is set. This is
because the config file is shared in multi-checkout setup, but
checkout directories _are_ different. Making core.worktree effective
in all checkouts mean it's back to a single checkout.
Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Oracle Studio compilers don't allow for static variables in
functions that are defined to be inline. GNU C does permit this.
Let's reference the C99 standard though, which doesn't allow for
inline functions to contain modifiable static variables.
Signed-off-by: Ben Walton <bdwalton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce the use of fixed sized buffer passed to getcwd() calls
by introducing xgetcwd() helper.
* rs/strbuf-getcwd:
use strbuf_add_absolute_path() to add absolute paths
abspath: convert absolute_path() to strbuf
use xgetcwd() to set $GIT_DIR
use xgetcwd() to get the current directory or die
wrapper: add xgetcwd()
abspath: convert real_path_internal() to strbuf
abspath: use strbuf_getcwd() to remember original working directory
setup: convert setup_git_directory_gently_1 et al. to strbuf
unix-sockets: use strbuf_getcwd()
strbuf: add strbuf_getcwd()
Convert several calls of getcwd() and die() to use xgetcwd() instead.
This way we get rid of fixed-size buffers (which can be too small
depending on the used file system) and gain consistent error messages.
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use trace_performance to measure and print execution time and command line
arguments of the entire main() function. In constrast to the shell's 'time'
utility, which measures total time of the parent process, this logs all
involved git commands recursively. This is particularly useful to debug
performance issues of scripted commands (i.e. which git commands were
called with which parameters, and how long did they execute).
Due to git's deliberate use of exit(), the implementation uses an atexit
routine rather than just adding trace_performance_since() at the end of
main().
Usage example: > GIT_TRACE_PERFORMANCE=~/git-trace.log git stash list
Creates a log file like this:
23:57:38.638765 trace.c:405 performance: 0.000310107 s: git command: 'git' 'rev-parse' '--git-dir'
23:57:38.644387 trace.c:405 performance: 0.000261759 s: git command: 'git' 'rev-parse' '--show-toplevel'
23:57:38.646207 trace.c:405 performance: 0.000304468 s: git command: 'git' 'config' '--get-colorbool' 'color.interactive'
23:57:38.648491 trace.c:405 performance: 0.000241667 s: git command: 'git' 'config' '--get-color' 'color.interactive.help' 'red bold'
23:57:38.650465 trace.c:405 performance: 0.000243063 s: git command: 'git' 'config' '--get-color' '' 'reset'
23:57:38.654850 trace.c:405 performance: 0.025126313 s: git command: 'git' 'stash' 'list'
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add trace_performance and trace_performance_since macros that print a
duration and an optional printf-formatted text to the file specified in
environment variable GIT_TRACE_PERFORMANCE.
These macros, in conjunction with getnanotime(), are intended to simplify
performance measurements from within the application (i.e. profiling via
manual instrumentation, rather than using an external profiling tool).
Unless enabled via GIT_TRACE_PERFORMANCE, these macros have no noticeable
impact on performance, so that test code for well known time killers may
be shipped in release builds. Alternatively, a developer could provide an
additional performance patch (not meant for master) that allows reviewers
to reproduce performance tests more easily, e.g. on other platforms or
using their own repositories.
Usage examples:
Simple use case (measure one code section):
uint64_t start = getnanotime();
/* code section to measure */
trace_performance_since(start, "foobar");
Complex use case (measure repetitive code sections):
uint64_t t = 0;
for (;;) {
/* ignore */
t -= getnanotime();
/* code section to measure */
t += getnanotime();
/* ignore */
}
trace_performance(t, "frotz");
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a getnanotime() function that returns nanoseconds since 01/01/1970 as
unsigned 64-bit integer (i.e. overflows in july 2554). This is easier to
work with than e.g. struct timeval or struct timespec. Basing the timer on
the epoch allows using the results with other time-related APIs.
To simplify adaption to different platforms, split the implementation into
a common getnanotime() and a platform-specific highres_nanos() function.
The common getnanotime() function handles errors, falling back to
gettimeofday() if highres_nanos() isn't implemented or doesn't work.
getnanotime() is also responsible for normalizing to the epoch. The offset
to the system clock is calculated only once on initialization, i.e.
manually setting the system clock has no impact on the timer (except if
the fallback gettimeofday() is in use). Git processes are typically short
lived, so we don't need to handle clock drift.
The highres_nanos() function returns monotonically increasing nanoseconds
relative to some arbitrary point in time (e.g. system boot), or 0 on
failure. Providing platform-specific implementations should be relatively
easy, e.g. adapting to clock_gettime() as defined by the POSIX realtime
extensions is seven lines of code.
This version includes highres_nanos() implementations for:
* Linux: using clock_gettime(CLOCK_MONOTONIC)
* Windows: using QueryPerformanceCounter()
Todo:
* enable clock_gettime() on more platforms
* add Mac OSX version, e.g. using mach_absolute_time + mach_timebase_info
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is useful to see where trace output came from.
Add 'const char *file, int line' parameters to the printing functions and
rename them to *_fl.
Add trace_printf* and trace_strbuf macros resolving to the *_fl functions
and let the preprocessor fill in __FILE__ and __LINE__.
As the trace_printf* functions take a variable number of arguments, this
requires variadic macros (i.e. '#define foo(...) foo_impl(__VA_ARGS__)'.
Though part of C99, it is unclear whether older compilers support this.
Thus keep the old functions and only enable variadic macros for GNUC and
MSVC 2005+ (_MSC_VER 1400). This has the nice side effect that the old
C-style declarations serve as documentation how the macros are to be used.
Print 'file:line ' as prefix to each trace line. Align the remaining trace
output at column 40 to accommodate 18 char file names + 4 digit line
number (currently there are 30 *.c files of length 18 and just 11 of 19).
Trace output from longer source files (e.g. builtin/receive-pack.c) will
not be aligned.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
No functional changes, just move stuff around so that the next patch isn't
that ugly...
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This is useful to tell apart trace output of separate test runs.
It can also be used for basic, coarse-grained performance analysis. Note
that the accuracy is tainted by writing to the trace file, and you have to
calculate the deltas yourself (which is next to impossible if multiple
threads or processes are involved).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some unit-tests use trace output to verify internal state, and unstable
output such as timestamps and line numbers are not useful there.
Disable additional trace output if GIT_TRACE_BARE is set.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To be able to add a common prefix or suffix to all trace output (e.g.
a timestamp or file:line of the caller), factor out common setup and
cleanup tasks of the trace* functions.
When adding a common prefix, it makes sense that the output of each trace
call starts on a new line. Add '\n' in case the caller forgot.
Note that this explicitly limits trace output to line-by-line, it is no
longer possible to trace-print just part of a line. Until now, this was
just an implicit assumption (trace-printing part of a line worked, but
messed up the trace file if multiple threads or processes were involved).
Thread-safety / inter-process-safety is also the reason why we need to do
the prefixing and suffixing in memory rather than issuing multiple write()
calls. Write_or_whine_pipe() / xwrite() is atomic unless the size exceeds
MAX_IO_SIZE (8MB, see wrapper.c). In case of trace_strbuf, this costs an
additional string copy (which should be irrelevant for performance in light
of actual file IO).
While we're at it, rename trace_strbuf's 'buf' argument, which suggests
that the function is modifying the buffer. Trace_strbuf() currently is the
only trace API that can print arbitrary binary data (without barfing on
'%' or stopping at '\0'), so 'data' seems more appropriate.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The trace API currently rechecks the environment variable and reopens the
trace file on every API call. This has the ugly side effect that errors
(e.g. file cannot be opened, or the user specified a relative path) are
also reported on every call. Performance can be improved by about factor
three by remembering the environment state and keeping the file open.
Replace the 'const char *key' parameter in the API with a pointer to a
'struct trace_key' that bundles the environment variable name with
additional, trace-internal state. Change the call sites of these APIs to
use a static 'struct trace_key' instead of a string constant.
In trace.c::get_trace_fd(), save and reuse the file descriptor in 'struct
trace_key'.
Add a 'trace_disable()' API, so that packet_trace() can cleanly disable
tracing when it encounters packed data (instead of using unsetenv()).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
trace_printf_key() is the only non-static function that duplicates the
printf format attribute in the .c file, remove it for consistency.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The format parameter to trace_printf functions is sometimes abbreviated
'fmt'. Rename to 'format' everywhere (consistent with POSIX' printf
specification).
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Suppose a fetch or push is requested between two shallow repositories
(with no history deepening or shortening). A pack that contains
necessary objects is transferred over together with .git/shallow of
the sender. The receiver has to determine whether it needs to update
.git/shallow if new refs needs new shallow comits.
The rule here is avoid updating .git/shallow by default. But we don't
want to waste the received pack. If the pack contains two refs, one
needs new shallow commits installed in .git/shallow and one does not,
we keep the latter and reject/warn about the former.
Even if .git/shallow update is allowed, we only add shallow commits
strictly necessary for the former ref (remember the sender can send
more shallow commits than necessary) and pay attention not to
accidentally cut the receiver history short (no history shortening is
asked for)
So the steps to figure out what ref need what new shallow commits are:
1. Split the sender shallow commit list into "ours" and "theirs" list
by has_sha1_file. Those that exist in current repo in "ours", the
remaining in "theirs".
2. Check the receiver .git/shallow, remove from "ours" the ones that
also exist in .git/shallow.
3. Fetch the new pack. Either install or unpack it.
4. Do has_sha1_file on "theirs" list again. Drop the ones that fail
has_sha1_file. Obviously the new pack does not need them.
5. If the pack is kept, remove from "ours" the ones that do not exist
in the new pack.
6. Walk the new refs to answer the question "what shallow commits,
both ours and theirs, are required in .git/shallow in order to add
this ref?". Shallow commits not associated to any refs are removed
from their respective list.
7. (*) Check reachability (from the current refs) of all remaining
commits in "ours". Those reachable are removed. We do not want to
cut any part of our (reachable) history. We only check up
commits. True reachability test is done by
check_everything_connected() at the end as usual.
8. Combine the final "ours" and "theirs" and add them all to
.git/shallow. Install new refs. The case where some hook rejects
some refs on a push is explained in more detail in the push
patches.
Of these steps, #6 and #7 are expensive. Both require walking through
some commits, or in the worst case all commits. And we rather avoid
them in at least common case, where the transferred pack does not
contain any shallow commits that the sender advertises. Let's look at
each scenario:
1) the sender has longer history than the receiver
All shallow commits from the sender will be put into "theirs" list
at step 1 because none of them exists in current repo. In the
common case, "theirs" becomes empty at step 4 and exit early.
2) the sender has shorter history than the receiver
All shallow commits from the sender are likely in "ours" list at
step 1. In the common case, if the new pack is kept, we could empty
"ours" and exit early at step 5.
If the pack is not kept, we hit the expensive step 6 then exit
after "ours" is emptied. There'll be only a handful of objects to
walk in fast-forward case. If it's forced update, we may need to
walk to the bottom.
3) the sender has same .git/shallow as the receiver
This is similar to case 2 except that "ours" should be emptied at
step 2 and exit early.
A fetch after "clone --depth=X" is case 1. A fetch after "clone" (from
a shallow repo) is case 3. Luckily they're cheap for the common case.
A push from "clone --depth=X" falls into case 2, which is expensive.
Some more work may be done at the sender/client side to avoid more
work on the server side: if the transferred pack does not contain any
shallow commits, send-pack should not send any shallow commits to the
receive-pack, effectively turning it into a normal push and avoid all
steps.
This patch implements all steps except #3, already handled by
fetch-pack and receive-pack, #6 and #7, which has their own patch due
to their size.
(*) in previous versions step 7 was put before step 3. I reorder it so
that the common case that keeps the pack does not need to walk
commits at all. In future if we implement faster commit
reachability check (maybe with the help of pack bitmaps or commit
cache), step 7 could become cheap and be moved up before 6 again.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>