Remove unused header "#include".
* en/header-cleanup:
treewide: remove unnecessary includes in source files
treewide: add direct includes currently only pulled in transitively
trace2/tr2_tls.h: remove unnecessary include
submodule-config.h: remove unnecessary include
pkt-line.h: remove unnecessary include
line-log.h: remove unnecessary include
http.h: remove unnecessary include
fsmonitor--daemon.h: remove unnecessary includes
blame.h: remove unnecessary includes
archive.h: remove unnecessary include
treewide: remove unnecessary includes in source files
treewide: remove unnecessary includes from header files
Each of these were checked with
gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).
...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file. These cases were:
* builtin/credential-cache.c
* builtin/pull.c
* builtin/send-pack.c
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It is an unsafe practice to call something like
git clone https://user:password@example.com/
This not only risks leaking the password "over the shoulder" or into the
readline history of the current Unix shell, it also gets logged via
Trace2 if enabled.
Let's at least avoid logging such secrets via Trace2, much like we avoid
logging secrets in `http.c`. Much like the code in `http.c` is guarded
via `GIT_TRACE_REDACT` (defaulting to `true`), we guard the new code via
`GIT_TRACE2_REDACT` (also defaulting to `true`).
The new tests added in this commit uncover leaks in `builtin/clone.c`
and `remote.c`. Therefore we need to turn off
`TEST_PASSES_SANITIZE_LEAK`. The reasons:
- We observed that `the_repository->remote_status` is not released
properly.
- We are using `url...insteadOf` and that runs into a code path where an
allocated URL is replaced with another URL, and the original URL is
never released.
- `remote_states` contains plenty of `struct remote`s whose refspecs
seem to be usually allocated by never released.
More investigation is needed here to identify the exact cause and
proper fixes for these leaks/bugs.
Co-authored-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Jeff Hostetler <jeffhostetler@github.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As mentioned in the thread starting at [1], trace2 counters should be
used to count events instead of ad-hoc static variables.
Convert the two fsync static variables to trace2 counters, reducing the
coupling between wrapper.c and the trace2 subsystem. Adjust t/t5351 to
match the trace2 counter output format.
The counters are not per-thread because the ones being replaced also
were not.
[1] https://lore.kernel.org/git/20230627195251.1973421-2-calvinwan@google.com/
Signed-off-by: Beat Bolli <dev+git@drbeat.li>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
There is a code path starting from trace2_def_param_fl() that eventually
calls current_config_scope(), and thus it needs to have "kvi" plumbed
through it. Additional plumbing is also needed to get "kvi" to
trace2_def_param_fl(), which gets called by two code paths:
- Through tr2_cfg_cb(), which is a config callback, so it trivially
receives "kvi" via the "struct config_context ctx" parameter.
- Through tr2_list_env_vars_fl(), which is a high level function that
lists environment variables for tracing. This has been secretly
behaving like git_config_from_parameters() (in that it parses config
from environment variables/the CLI), but does not set config source
information.
Teach tr2_list_env_vars_fl() to be well-behaved by using
kvi_from_param(), which is used elsewhere for CLI/environment
variable-based config.
As a result, current_config_scope() has no more callers, so remove it.
Signed-off-by: Glen Choo <chooglen@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
hash.h depends upon and includes repository.h, due to the definition and
use of the_hash_algo (defined as the_repository->hash_algo). However,
most headers trying to include hash.h are only interested in the layout
of the structs like object_id. Move the parts of hash.h that do not
depend upon repository.h into a new file hash-ll.h (the "low level"
parts of hash.h), and adjust other files to use this new header where
the convenience inline functions aren't needed.
This allows hash.h and object.h to be fairly small, minimal headers. It
also exposes a lot of hidden dependencies on both path.h (which was
brought in by repository.h) and repository.h (which was previously
implicitly brought in by object.h), so also adjust other files to be
more explicit about what they depend upon.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h. This made it more difficult
to find which files could remove a dependence on cache.h. Make C files
explicitly include trace.h or trace2.h if they are using them.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We had several C files include cache.h unnecessarily. Replace those
with an include of "git-compat-util.h" instead. Much like the previous
commit, these have all been verified via both ensuring that
gcc -E $SOURCE_FILE | grep '"cache.h"'
found no hits and that
make DEVELOPER=1 ${OBJECT_FILE_FOR_SOURCE_FILE}
successfully compiles without warnings.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add global counters mechanism to Trace2.
The Trace2 counters mechanism adds the ability to create a set of
global counter variables and an API to increment them efficiently.
Counters can optionally report per-thread usage in addition to the sum
across all threads.
Counter events are emitted to the Trace2 logs when a thread exits and
at process exit.
Counters are an alternative to `data` and `data_json` events.
Counters are useful when you want to measure something across the life
of the process, when you don't want per-measurement events for
performance reasons, when the data does not fit conveniently within a
region, or when your control flow does not easily let you write the
final total. For example, you might use this to report the number of
calls to unzip() or the number of de-delta steps during a checkout.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add stopwatch timer mechanism to Trace2.
Timers are an alternative to Trace2 Regions. Regions are useful for
measuring the time spent in various computation phases, such as the
time to read the index, time to scan for unstaged files, time to scan
for untracked files, and etc.
However, regions are not appropriate in all places. For example,
during a checkout, it would be very inefficient to use regions to
measure the total time spent inflating objects from the ODB from
across the entire lifetime of the process; a per-unzip() region would
flood the output and significantly slow the command; and some form of
post-processing would be requried to compute the time spent in unzip().
Timers can be used to measure a series of timer intervals and emit
a single summary event (at thread and/or process exit).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Rename the `thread_name` argument in `tr2tls_create_self()` and
`trace2_thread_start()` to be `thread_base_name` to make it clearer
that the passed argument is a component used in the construction of
the actual `struct tr2tls_thread_ctx.thread_name` variable.
The base name will be used along with the thread id to create a
unique thread name.
This commit does not change how the `thread_name` field is
allocated or stored within the `tr2tls_thread_ctx` structure.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reduce or eliminate use of the term "TLS" in the Trace2 code.
The term "TLS" has two popular meanings: "thread-local storage" and
"transport layer security". In the Trace2 source, the term is associated
with the former. There was concern on the mailing list about it refering
to the latter.
Update the source and documentation to eliminate the use of the "TLS" term
or replace it with the phrase "thread-local storage" to reduce ambiguity.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Change the exit() wrapper added in ee4512ed48 (trace2: create new
combined trace facility, 2019-02-22) so that we'll split up the trace2
logging concerns from wanting to wrap the "exit()" function itself for
other purposes.
This makes more sense structurally, as we won't seem to conflate
non-trace2 behavior with the trace2 code. I'd previously added an
explanation for this in 368b584315 (common-main.c: call exit(), don't
return, 2021-12-07), that comment is being adjusted here.
Now the only thing we'll do if we're not using trace2 is to truncate
the "code" argument to the lowest 8 bits.
We only need to do that truncation on non-POSIX systems, but in
ee4512ed48 that "if defined(__MINGW32__)" code added in
47e3de0e79 (MinGW: truncate exit()'s argument to lowest 8 bits,
2009-07-05) was made to run everywhere. It might be good for clarify
to narrow that down by an "ifdef" again, but I'm not certain that in
the interim we haven't had some other non-POSIX systems rely the
behavior. On a POSIX system taking the lowest 8 bits is implicit, see
exit(3)[1] and wait(2)[2]. Let's leave a comment about that instead.
1. https://man7.org/linux/man-pages/man3/exit.3.html
2. https://man7.org/linux/man-pages/man2/wait.2.html
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add some global trace2 statistics for the number of fsyncs performed
during the lifetime of a Git process.
These stats are printed as part of trace2_cmd_exit_fl, which is
presumably where we might want to print any other cross-cutting
statistics.
Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Remove the "else" branches of the HAVE_VARIADIC_MACROS macro, which
have been unconditionally omitted since 765dc16888 (git-compat-util:
always enable variadic macros, 2021-01-28).
Since were always omitted, anyone trying to use a compiler without
variadic macro support to compile a git since version
git v2.31.0 or later would have had a compilation error. 10 months
across a few releases since then should have been enough time for
anyone who cared to run into that and report the issue.
In addition to that, for anyone unsetting HAVE_VARIADIC_MACROS we've
been emitting extremely verbose warnings since at least
ee4512ed48 (trace2: create new combined trace facility,
2019-02-22). That's because there is no such thing as a
"region_enter_printf" or "region_leave_printf" format, so at least
under GCC and Clang everything that includes trace.h (almost every
file) emits a couple of warnings about that.
There's a large benefit to being able to have a hard dependency rely
on variadic macros, the code surrounding usage.c is hard to maintain
if we need to write two implementations of everything, and by relying
on "__FILE__" and "__LINE__" along with "__VA_ARGS__" we can in the
future make error(), die() etc. log where they were called from. We've
also recently merged d67fc4bf0b (Merge branch 'bc/require-c99',
2021-12-10) which further cements our hard dependency on C99.
So let's delete the fallback code, and update our CodingGuidelines to
note that we depend on this. The added bullet-point starts with
lower-case for consistency with other bullet-points in that section.
The diff in "trace.h" is relatively hard to read, since we need to
retain the existing API docs, which were comments on the code used if
HAVE_VARIADIC_MACROS was not defined.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create "child_ready" event to capture the state of a child process
created in the background.
When a child command is started a "child_start" event is generated in
the Trace2 log. For normal synchronous children, a "child_exit" event
is later generated when the child exits or is terminated. The two events
include information, such as the "child_id" and "pid", to allow post
analysis to match-up the command line and exit status.
When a child is started in the background (and may outlive the parent
process), it is not possible for the parent to emit a "child_exit"
event. Create a new "child_ready" event to indicate whether the
child was successfully started. Also include the "child_id" and "pid"
to allow similar post processing.
This will be used in a later commit with the new "start_bg_command()".
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
It can be useful to tell who invoked Git - was it invoked manually by a
user via CLI or script? By an IDE? In some cases - like 'repo' tool -
we can influence the source code and set the GIT_TRACE2_PARENT_SID
environment variable from the caller process. In 'repo''s case, that
parent SID is manipulated to include the string "repo", which means we
can positively identify when Git was invoked by 'repo' tool. However,
identifying parents that way requires both that we know which tools
invoke Git and that we have the ability to modify the source code of
those tools. It cannot scale to keep up with the various IDEs and
wrappers which use Git, most of which we don't know about. Learning
which tools and wrappers invoke Git, and how, would give us insight to
decide where to improve Git's usability and performance.
Unfortunately, there's no cross-platform reliable way to gather the name
of the parent process. If procfs is present, we can use that; otherwise
we will need to discover the name another way. However, the process ID
should be sufficient to look up the process name on most platforms, so
that code may be shareable.
Git for Windows gathers similar information and logs it as a "data_json"
event. However, since "data_json" has a variable format, it is difficult
to parse effectively in some languages; instead, let's pursue a
dedicated "cmd_ancestry" event to record information about the ancestry
of the current process and a consistent, parseable way.
Git for Windows also gathers information about more than one generation
of parent. In Linux further ancestry info can be gathered with procfs,
but it's unwieldy to do so. In the interest of later moving Git for
Windows ancestry logging to the 'cmd_ancestry' event, and in the
interest of later adding more ancestry to the Linux implementation - or
of adding this functionality to other platforms which have an easier
time walking the process tree - let's make 'cmd_ancestry' accept an
array of parentage.
Signed-off-by: Emily Shaffer <emilyshaffer@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add a public wrapper, trace2_session_id(), around tr2_sid_get(), which
is intended to be private trace2 implementation.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Via trace2, Git can already log interesting config parameters (see the
trace2_cmd_list_config() function). However, this can grant an
incomplete picture because many config parameters also allow overrides
via environment variables.
To allow for more complete logs, we add a new trace2_cmd_list_env_vars()
function and supporting implementation, modeled after the pre-existing
config param logging implementation.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Polishing of the new trace2 facility continues. The system-level
configuration can specify site-wide trace2 settings, which can be
overridden with per-user configuration and environment variables.
* jh/trace2-sid-fix:
trace2: fixup access problem on /etc/gitconfig in read_very_early_config
trace2: update docs to describe system/global config settings
trace2: make SIDs more unique
trace2: clarify UTC datetime formatting
trace2: report peak memory usage of the process
trace2: use system/global config for default trace2 settings
config: add read_very_early_config()
trace2: find exec-dir before trace2 initialization
trace2: add absolute elapsed time to start event
trace2: refactor setting process starting time
config: initialize opts structure in repo_read_config()
Fix trace2_data_json_fl() to check for the presence of pfn_data_json_fl
in its targets, rather than pfn_data_fl, which is not actually called.
Signed-off-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach Windows version of git to report peak memory usage
during exit() processing.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Teach git to read the system and global config files for
default Trace2 settings. This allows system-wide Trace2 settings to
be installed and inherited to make it easier to manage a collection of
systems.
The original GIT_TR2* environment variables are loaded afterwards and
can be used to override the system settings.
Only the system and global config files are used. Repo and worktree
local config files are ignored. Likewise, the "-c" command line
arguments are also ignored. These limits are for performance reasons.
(1) For users not using Trace2, there should be minimal overhead to
detect that Trace2 is not enabled. In particular, Trace2 should not
allocate lots of otherwise unused data strucutres.
(2) For accurate performance measurements, Trace2 should be initialized
as early in the git process as possible, and before most of the normal
git process initialization (which involves discovering the .git directory
and reading a hierarchy of config files).
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Add elapsed process time to "start" event to measure
the performance of early process startup.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create trace2_initialize_clock() and call from main() to capture
process start time in isolation and before other sub-systems are
ready.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some compilers don't allow NULL to be passed for a va_list,
and e.g. "gcc (Raspbian 6.3.0-18+rpi1+deb9u1) 6.3.0 20170516"
errors out like this:
trace2/tr2_tgt_event.c:193:18:
error: invalid operands to binary &&
(have ‘int’ and ‘va_list {aka __va_list}’)
if (fmt && *fmt && ap) {
^^
I couldn't find any hints that va_list and pointers can be mixed,
and no hints that they can't either. Morten Welinder comments:
"C99, Section 7.15, simply says that va_list "is an object type suitable for
holding information needed by the macros va_start, va_end, and
va_copy". So clearly not guaranteed to be mixable with pointers...
The portable solution is to use "va_list" everywhere in the callchain.
As a consequence, both trace2_region_enter_fl() and trace2_region_leave_fl()
now take a variable argument list.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Acked-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Create a new unified tracing facility for git. The eventual intent is to
replace the current trace_printf* and trace_performance* routines with a
unified set of git_trace2* routines.
In addition to the usual printf-style API, trace2 provides higer-level
event verbs with fixed-fields allowing structured data to be written.
This makes post-processing and analysis easier for external tools.
Trace2 defines 3 output targets. These are set using the environment
variables "GIT_TR2", "GIT_TR2_PERF", and "GIT_TR2_EVENT". These may be
set to "1" or to an absolute pathname (just like the current GIT_TRACE).
* GIT_TR2 is intended to be a replacement for GIT_TRACE and logs command
summary data.
* GIT_TR2_PERF is intended as a replacement for GIT_TRACE_PERFORMANCE.
It extends the output with columns for the command process, thread,
repo, absolute and relative elapsed times. It reports events for
child process start/stop, thread start/stop, and per-thread function
nesting.
* GIT_TR2_EVENT is a new structured format. It writes event data as a
series of JSON records.
Calls to trace2 functions log to any of the 3 output targets enabled
without the need to call different trace_printf* or trace_performance*
routines.
Signed-off-by: Jeff Hostetler <jeffhost@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>