Commit graph

5 commits

Author SHA1 Message Date
Stefan Beller 2a73b3dad0 run-command: do not pass child process data into callbacks
The expected way to pass data into the callback is to pass them via
the customizable callback pointer. The error reporting in
default_{start_failure, task_finished} is not user friendly enough, that
we want to encourage using the child data for such purposes.

Furthermore the struct child data is cleaned by the run-command API,
before we access them in the callbacks, leading to use-after-free
situations.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-03-01 09:42:01 -08:00
Stefan Beller c553c72eed run-command: add an asynchronous parallel child processor
This allows to run external commands in parallel with ordered output
on stderr.

If we run external commands in parallel we cannot pipe the output directly
to the our stdout/err as it would mix up. So each process's output will
flow through a pipe, which we buffer. One subprocess can be directly
piped to out stdout/err for a low latency feedback to the user.

Example:
Let's assume we have 5 submodules A,B,C,D,E and each fetch takes a
different amount of time as the different submodules vary in size, then
the output of fetches in sequential order might look like this:

 time -->
 output: |---A---| |-B-| |-------C-------| |-D-| |-E-|

When we schedule these submodules into maximal two parallel processes,
a schedule and sample output over time may look like this:

process 1: |---A---| |-D-| |-E-|

process 2: |-B-| |-------C-------|

output:    |---A---|B|---C-------|DE

So A will be perceived as it would run normally in the single child
version. As B has finished by the time A is done, we can dump its whole
progress buffer on stderr, such that it looks like it finished in no
time. Once that is done, C is determined to be the visible child and
its progress will be reported in real time.

So this way of output is really good for human consumption, as it only
changes the timing, not the actual output.

For machine consumption the output needs to be prepared in the tasks,
by either having a prefix per line or per block to indicate whose tasks
output is displayed, because the output order may not follow the
original sequential ordering:

 |----A----| |--B--| |-C-|

will be scheduled to be all parallel:

process 1: |----A----|
process 2: |--B--|
process 3: |-C-|
output:    |----A----|CB

This happens because C finished before B did, so it will be queued for
output before B.

To detect when a child has finished executing, we check interleaved
with other actions (such as checking the liveliness of children or
starting new processes) whether the stderr pipe still exists. Once a
child closed its stderr stream, we assume it is terminating very soon,
and use `finish_command()` from the single external process execution
interface to collect the exit status.

By maintaining the strong assumption of stderr being open until the
very end of a child process, we can avoid other hassle such as an
implementation using `waitpid(-1)`, which is not implemented in Windows.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-12-16 12:06:08 -08:00
René Scharfe d318027932 run-command: introduce CHILD_PROCESS_INIT
Most struct child_process variables are cleared using memset first after
declaration.  Provide a macro, CHILD_PROCESS_INIT, that can be used to
initialize them statically instead.  That's shorter, doesn't require a
function call and is slightly more readable (especially given that we
already have STRBUF_INIT, ARGV_ARRAY_INIT etc.).

Helped-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2014-08-20 09:53:37 -07:00
Jonathan Nieder c0f19bf3b9 tests: check error message from run_command
In git versions starting at v1.7.5-rc0~29^2 until v1.7.5-rc3~2 (Revert
"run-command: prettify -D_FORTIFY_SOURCE workaround", 2011-04-18)
fixed it, the run_command facility would write a truncated error
message when the command is present but cannot be executed for some
other reason.  For example, if I add a 'hello' command to git:

	$ echo 'echo hello' >git-hello
	$ chmod +x git-hello
	$ PATH=.:$PATH git hello
	hello

and make it non-executable, this is what I normally get:

	$ chmod -x git-hello
	$ git hello
	fatal: cannot exec 'git-hello': Permission denied

But with the problematic versions, we get disturbing output:

	$ PATH=.:$PATH git hello
	fatal: $

Add some tests to make sure it doesn't happen again.

The hello-script used in these tests uses cat instead of echo because
on Windows the bash spawned by git converts LF to CRLF in text written
by echo while the bash running tests does not, causing the test to
fail if "echo" is used.  Thanks to Hannes for noticing.

Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Improved-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2011-04-20 10:08:54 -07:00
Johannes Sixt 2b541bf8be start_command: detect execvp failures early
Previously, failures during execvp could be detected only by
finish_command. However, in some situations it is beneficial for the
parent process to know earlier that the child process will not run.

The idea to use a pipe to signal failures to the parent process and
the test case were lifted from patches by Ilari Liusvaara.

Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2010-01-10 10:15:03 -08:00