development/git - HydraGit

mirror of https://github.com/git/git synced 2024-11-05 18:59:29 +00:00

Author	SHA1	Message	Date
Derrick Stolee	dee8a1455c	daemon: clarify directory arguments The undecorated arguments to the 'git-daemon' command provide a list of directories. When at least one directory is specified, then 'git-daemon' only serves requests that are within that directory list. The boolean '--strict-paths' option makes the list more explicit in that subdirectories are no longer included. The existing documentation and error messages around this directory list refer to it and its behavior as a "whitelist". The word "whitelist" has cultural implications that are not inclusive. Thankfully, it is not difficult to reword and avoid its use. In the process, we can define the purpose of this directory list directly. In Documentation/git-daemon.txt, rewrite the OPTIONS section around the '<directory>' option. Add additional clarity to the other options that refer to these directories. Some error messages can also be improved in daemon.c. The '--strict-paths' option requires '<directory>' arguments, so refer to that section of the documentation directly. A logerror() call points out that a requested directory is not in the specified directory list. We can use "list" here without any loss of information. Signed-off-by: Derrick Stolee <derrickstolee@github.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-07-19 12:45:31 -07:00
Junio C Hamano	c21fa3bb54	Merge branch 'ab/env-array' Rename .env_array member to .env in the child_process structure. * ab/env-array: run-command API users: use "env" not "env_array" in comments & names run-command API: rename "env_array" to "env"	2022-06-10 15:04:13 -07:00
Ævar Arnfjörð Bjarmason	29fda24dd1	run-command API: rename "env_array" to "env" Start following-up on the rename mentioned in `c7c4bdeccf` (run-command API: remove "env" member, always use "env_array", 2021-11-25) of "env_array" to "env". The "env_array" name was picked in `19a583dc39` (run-command: add env_array, an optional argv_array for env, 2014-10-19) because "env" was taken. Let's not forever keep the oddity of "_array" for this "struct strvec", but not for its "args" sibling. This commit is almost entirely made with a coccinelle rule[1]. The only manual change here is in run-command.h to rename the struct member itself and to change "env_array" to "env" in the CHILD_PROCESS_INIT initializer. The rest of this is all a result of applying [1]: make contrib/coccinelle/run_command.cocci.patch * patch -p1 <contrib/coccinelle/run_command.cocci.patch * git add -u 1. cat contrib/coccinelle/run_command.pending.cocci @@ struct child_process E; @@ - E.env_array + E.env @@ struct child_process *E; @@ - E->env_array + E->env I've avoided changing any comments and derived variable names here, that will all be done in the next commit. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-06-02 14:31:16 -07:00
Junio C Hamano	2b0a58d164	Merge branch 'ep/maint-equals-null-cocci' for maint-2.35 * ep/maint-equals-null-cocci: tree-wide: apply equals-null.cocci contrib/coccinnelle: add equals-null.cocci	2022-05-02 10:06:04 -07:00
Junio C Hamano	afe8a9070b	tree-wide: apply equals-null.cocci Signed-off-by: Junio C Hamano <gitster@pobox.com>	2022-05-02 09:50:37 -07:00
Junio C Hamano	9afe4d9f6b	Merge branch 'rs/daemon-plug-leak' Plug a memory leak. * rs/daemon-plug-leak: daemon: plug memory leak on overlong path	2022-01-05 14:01:31 -08:00
René Scharfe	999bba3e0b	daemon: plug memory leak on overlong path Release the strbuf containing the interpolated path after copying it to a stack buffer and before erroring out if it's too long. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-12-20 12:43:18 -08:00
Ævar Arnfjörð Bjarmason	7f14609e29	run-command API users: use strvec_push(), not argv construction Change a pattern of hardcoding an "argv" array size, populating it and assigning to the "argv" member of "struct child_process" to instead use "strvec_push()" to add data to the "args" member. As noted in the preceding commit this moves us further towards being able to remove the "argv" member in a subsequent commit These callers could have used strvec_pushl(), but moving to strvec_push() makes the diff easier to read, and keeps the arguments aligned as before. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-11-25 22:15:07 -08:00
Ævar Arnfjörð Bjarmason	6def0ff878	run-command API users: use strvec_pushv(), not argv assignment Migrate those run-command API users that assign directly to the "argv" member to use a strvec_pushv() of "args" instead. In these cases it did not make sense to further refactor these callers, e.g. daemon.c could be made to construct the arguments closer to handle(), but that would require moving the construction from its cmd_main() and pass "argv" through two intermediate functions. It would be possible for a change like this to introduce a regression if we were doing: cp.argv = argv; argv[1] = "foo"; And changed the code, as is being done here, to: strvec_pushv(&cp.args, argv); argv[1] = "foo"; But as viewing this change with the "-W" flag reveals none of these functions modify variable that's being pushed afterwards in a way that would introduce such a logic error. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-11-25 22:15:07 -08:00
Junio C Hamano	06355d72dc	Merge branch 'ab/pkt-line-cleanup' Code clean-up. * ab/pkt-line-cleanup: pkt-line.[ch]: remove unused packet_read_line_buf() pkt-line.[ch]: remove unused packet_buf_write_len()	2021-10-25 16:07:00 -07:00
Ævar Arnfjörð Bjarmason	ec9a37d69b	pkt-line.[ch]: remove unused packet_read_line_buf() This function was added in `4981fe750b` (pkt-line: share buffer/descriptor reading implementation, 2013-02-23), but in `01f9ec64c8` (Use packet_reader instead of packet_read_line, 2018-12-29) the code that was using it was removed. Since it's being removed we can in turn remove the "src" and "src_len" arguments to packet_read(), all the remaining users just passed a NULL/NULL pair to it. That function is only a thin wrapper for packet_read_with_status() which still needs those arguments, but for the thin packet_read() convenience wrapper we can do away with it for now. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-10-15 13:09:40 -07:00
Ævar Arnfjörð Bjarmason	6e54a32468	daemon.c: refactor hostinfo_init() to HOSTINFO_INIT macro Remove the hostinfo_init() function added in `01cec54e13` (daemon: deglobalize hostname information, 2015-03-07) and instead initialize the "struct hostinfo" with a macro. This is the more idiomatic pattern in the codebase, and doesn't leave us wondering when we see the *_init() function if this struct needs more complex initialization than a macro can provide. Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-09-27 15:02:32 -07:00
Junio C Hamano	bde35a2a93	Merge branch 'rs/daemon-sanitize-dir-sep' "git daemon" has been tightened against systems that take backslash as directory separator. * rs/daemon-sanitize-dir-sep: daemon: sanitize all directory separators	2021-04-08 13:23:26 -07:00
René Scharfe	9a7f1ce8b7	daemon: sanitize all directory separators When sanitizing client-supplied strings on Windows, also strip off backslashes, not just slashes. Signed-off-by: René Scharfe <l.s.r@web.de> Acked-by: Johannes Schindelin <johannes.schindelin@gmx.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-26 22:00:12 -07:00
René Scharfe	ca56dadb4b	use CALLOC_ARRAY Add and apply a semantic patch for converting code that open-codes CALLOC_ARRAY to use it instead. It shortens the code and infers the element size automatically. Signed-off-by: René Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2021-03-13 16:00:09 -08:00
Jeff King	d70a9eb611	strvec: rename struct fields The "argc" and "argv" names made sense when the struct was argv_array, but now they're just confusing. Let's rename them to "nr" (which we use for counts elsewhere) and "v" (which is rather terse, but reads well when combined with typical variable names like "args.v"). Note that we have to update all of the callers immediately. Playing tricks with the preprocessor is hard here, because we wouldn't want to rewrite unrelated tokens. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-30 19:18:06 -07:00
Jeff King	f6d8942b1f	strvec: fix indentation in renamed calls Code which split an argv_array call across multiple lines, like: argv_array_pushl(&args, "one argument", "another argument", "and more", NULL); was recently mechanically renamed to use strvec, which results in mis-matched indentation like: strvec_pushl(&args, "one argument", "another argument", "and more", NULL); Let's fix these up to align the arguments with the opening paren. I did this manually by sifting through the results of: git jump grep 'strvec_.*,$' and liberally applying my editor's auto-format. Most of the changes are of the form shown above, though I also normalized a few that had originally used a single-tab indentation (rather than our usual style of aligning with the open paren). I also rewrapped a couple of obvious cases (e.g., where previously too-long lines became short enough to fit on one), but I wasn't aggressive about it. In cases broken to three or more lines, the grouping of arguments is sometimes meaningful, and it wasn't worth my time or reviewer time to ponder each case individually. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Jeff King	ef8d7ac42a	strvec: convert more callers away from argv_array name We eventually want to drop the argv_array name and just use strvec consistently. There's no particular reason we have to do it all at once, or care about interactions between converted and unconverted bits. Because of our preprocessor compat layer, the names are interchangeable to the compiler (so even a definition and declaration using different names is OK). This patch converts remaining files from the first half of the alphabet, to keep the diff to a manageable size. The conversion was done purely mechanically with: git ls-files '.c' '.h' \| xargs perl -i -pe ' s/ARGV_ARRAY/STRVEC/g; s/argv_array/strvec/g; ' and then selectively staging files with "git add '[abcdefghjkl]*'". We'll deal with any indentation/style fallouts separately. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2020-07-28 15:02:18 -07:00
Elijah Newren	15beaaa3d1	Fix spelling errors in code comments Reported-by: Jens Schleusener <Jens.Schleusener@fossies.org> Signed-off-by: Elijah Newren <newren@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2019-11-10 16:00:54 +09:00
Junio C Hamano	f9bcd751aa	Merge branch 'lw/daemon-log-destination' Recent introduction of "--log-destination" option to "git daemon" did not work well when the daemon was run under "--inetd" mode. * lw/daemon-log-destination: daemon.c: fix condition for redirecting stderr	2018-04-25 13:28:58 +09:00
Lucas Werkmeister	e67d906d73	daemon.c: fix condition for redirecting stderr Since the --log-destination option was added in `0c591cacb` ("daemon: add --log-destination=(stderr\|syslog\|none)", 2018-02-04) with the explicit goal of allowing logging to stderr when running in inetd mode, we should not always redirect stderr to /dev/null in inetd mode, but rather only when stderr is not being used for logging. Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-04-09 11:25:48 +09:00
Junio C Hamano	c2bd43d66d	Merge branch 'lw/daemon-log-destination' The log from "git daemon" can be redirected with a new option; one relevant use case is to send the log to standard error (instead of syslog) when running it from inetd. * lw/daemon-log-destination: daemon: add --log-destination=(stderr\|syslog\|none)	2018-02-21 12:45:04 -08:00
Lucas Werkmeister	0c591cacba	daemon: add --log-destination=(stderr\|syslog\|none) This new option can be used to override the implicit --syslog of --inetd, or to disable all logging. (While --detach also implies --syslog, --log-destination=stderr with --detach is useless since --detach disassociates the process from the original stderr.) --syslog is retained as an alias for --log-destination=syslog. --log-destination always overrides implicit --syslog regardless of option order. This is different than the “last one wins” logic that applies to some implicit options elsewhere in Git, but should hopefully be less confusing. (I also don’t know if all implicit options in Git follow “last one wins”.) The combination of --inetd with --log-destination=stderr is useful, for instance, when running `git daemon` as an instanced systemd service (with associated socket unit). In this case, log messages sent via syslog are received by the journal daemon, but run the risk of being processed at a time when the `git daemon` process has already exited (especially if the process was very short-lived, e.g. due to client error), so that the journal daemon can no longer read its cgroup and attach the message to the correct systemd unit (see systemd/systemd#2913 [1]). Logging to stderr instead can solve this problem, because systemd can connect stderr directly to the journal daemon, which then already knows which unit is associated with this stream. [1]: https://github.com/systemd/systemd/issues/2913 Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com> Helped-by: Junio C Hamano <gitster@pobox.com> Helped-by: Eric Sunshine <sunshine@sunshineco.com> Signed-off-by: Lucas Werkmeister <mail@lucaswerkmeister.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-02-05 10:30:44 -08:00
Jeff King	ed15e58efe	daemon: fix length computation in newline stripping When git-daemon gets a pktline request, we strip off any trailing newline, replacing it with a NUL. Clients prior to `5ad312bede` (in git v1.4.0) would send: git-upload-pack repo.git\n and we need to strip it off to understand their request. After `5ad312bede`, we send the host attribute but no newline, like: git-upload-pack repo.git\0host=example.com\0 Both of these are parsed correctly by git-daemon. But if some client were to combine the two: git-upload-pack repo.git\n\0host=example.com\0 we don't parse it correctly. The problem is that we use the "len" variable to record the position of the NUL separator, but then decrement it when we strip the newline. So we start with: git-upload-pack repo.git\n\0host=example.com\0 ^-- len and end up with: git-upload-pack repo.git\0\0host=example.com\0 ^-- len This is arguably correct, since "len" tells us the length of the initial string, but we don't actually use it for that. What we do use it for is finding the offset of the extended attributes; they used to be at len+1, but are now at len+2. We can solve that by just leaving "len" where it is. We don't have to care about the length of the shortened string, since we just treat it like a C string. No version of Git ever produced such a string, but it seems like the daemon code meant to handle this case (and it seems like a reasonable thing for somebody to do in a 3rd-party implementation). Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-25 13:50:17 -08:00
Jeff King	550fbcad1c	daemon: handle NULs in extended attribute string If we receive a request with extended attributes after the NUL, we try to write those attributes to the log. We do so with a "%s" format specifier, which will only show characters up to the first NUL. That's enough for printing a "host=" specifier. But since `dfe422d04d` (daemon: recognize hidden request arguments, 2017-10-16) we may have another NUL, followed by protocol parameters, and those are not logged at all. Let's cut out the attempt to show the whole string, and instead log when we parse individual attributes. We could leave the "extended attributes (%d bytes) exist" part of the log, which in theory could alert us to attributes that fail to parse. But anything we don't parse as a "host=" parameter gets blindly added to the "protocol" attribute, so we'd see it in that part of the log. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-25 13:50:17 -08:00
Jeff King	19136be3f8	daemon: fix off-by-one in logging extended attributes If receive a request like: git-upload-pack /foo.git\0host=localhost we mark the offset of the NUL byte as "len", and then log the bytes after the NUL with a "%.s" placeholder, using "pktlen - len" as the length, and "line + len + 1" as the start of the string. This is off-by-one, since the start of the string skips past the separating NUL byte, but the adjusted length includes it. Fortunately this doesn't actually read past the end of the buffer, since "%.s" will stop when it hits a NUL. And regardless of what is in the buffer, packet_read() will always add an extra NUL terminator for safety. As an aside, the git.git client sends an extra NUL after a "host" field, too, so we'd generally hit that one first, not the one added by packet_read(). You can see this in the test output which reports 15 bytes, even though the string has only 14 bytes of visible data. But the point is that even a client sending unusual data could not get us to read past the end of the buffer, so this is purely a cosmetic fix. Reported-by: Michael Haggerty <mhagger@alum.mit.edu> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2018-01-25 13:50:17 -08:00
Brandon Williams	dfe422d04d	daemon: recognize hidden request arguments A normal request to git-daemon is structured as "command path/to/repo\0host=..\0" and due to a bug introduced in `49ba83fb6` (Add virtualization support to git-daemon, 2006-09-19) we aren't able to place any extra arguments (separated by NULs) besides the host otherwise the parsing of those arguments would enter an infinite loop. This bug was fixed in `73bb33a94` (daemon: Strictly parse the "extra arg" part of the command, 2009-06-04) but a check was put in place to disallow extra arguments so that new clients wouldn't trigger this bug in older servers. In order to get around this limitation teach git-daemon to recognize additional request arguments hidden behind a second NUL byte. Requests can then be structured like: "command path/to/repo\0host=..\0\0version=1\0key=value\0". git-daemon can then parse out the extra arguments and set 'GIT_PROTOCOL' accordingly. By placing these extra arguments behind a second NUL byte we can skirt around both the infinite loop bug in `49ba83fb6` (Add virtualization support to git-daemon, 2006-09-19) as well as the explicit disallowing of extra arguments introduced in `73bb33a94` (daemon: Strictly parse the "extra arg" part of the command, 2009-06-04) because both of these versions of git-daemon check for a single NUL byte after the host argument before terminating the argument parsing. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-10-17 10:51:29 +09:00
Brandon Williams	b2141fc1d2	config: don't include config.h by default Stop including config.h by default in cache.h. Instead only include config.h in those files which require use of the config system. Signed-off-by: Brandon Williams <bmwill@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-06-15 12:56:22 -07:00
Junio C Hamano	5938454cbc	Merge branch 'dt/xgethostname-nul-termination' gethostname(2) may not NUL terminate the buffer if hostname does not fit; unfortunately there is no easy way to see if our buffer was too small, but at least this will make sure we will not end up using garbage past the end of the buffer. * dt/xgethostname-nul-termination: xgethostname: handle long hostnames use HOST_NAME_MAX to size buffers for gethostname(2)	2017-04-23 22:07:57 -07:00
René Scharfe	da25bdb776	use HOST_NAME_MAX to size buffers for gethostname(2) POSIX limits the length of host names to HOST_NAME_MAX. Export the fallback definition from daemon.c and use this constant to make all buffers used with gethostname(2) big enough for any possible result and a terminating NUL. Inspired-by: David Turner <dturner@twosigma.com> Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: David Turner <dturner@twosigma.com> Reviewed-by: Jonathan Nieder <jrnieder@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2017-04-18 19:57:41 -07:00
Jeff King	6a97da3964	daemon: use an argv_array to exec children Our struct child_process already has its own argv_array. Let's use that to avoid having to format options into separate buffers. Note that we'll need to declare the child process outside of the run_service_command() helper to do this. But that opens up a further simplification, which is that the helper can append to our argument list, saving each caller from specifying "." manually. Signed-off-by: Jeff King <peff@peff.net>	2017-03-30 14:59:50 -07:00
Junio C Hamano	aa22ef8a80	Merge branch 'jk/daemon-path-ok-check-truncation' into maint "git daemon" used fixed-length buffers to turn URL to the repository the client asked for into the server side directory path, using snprintf() to avoid overflowing these buffers, but allowed possibly truncated paths to the directory. This has been tightened to reject such a request that causes overlong path to be required to serve. * jk/daemon-path-ok-check-truncation: daemon: detect and reject too-long paths	2016-11-29 13:27:56 -08:00
Junio C Hamano	dbaa6bdce2	Merge branch 'ls/filter-process' The smudge/clean filter API expect an external process is spawned to filter the contents for each path that has a filter defined. A new type of "process" filter API has been added to allow the first request to run the filter for a path to spawn a single process, and all filtering need is served by this single process for multiple paths, reducing the process creation overhead. * ls/filter-process: contrib/long-running-filter: add long running filter example convert: add filter.<driver>.process option convert: prepare filter.<driver>.process option convert: make apply_filter() adhere to standard Git error handling pkt-line: add functions to read/write flush terminated packet streams pkt-line: add packet_write_gently() pkt-line: add packet_flush_gently() pkt-line: add packet_write_fmt_gently() pkt-line: extract set_packet_header() pkt-line: rename packet_write() to packet_write_fmt() run-command: add clean_on_exit_handler run-command: move check_pipe() from write_or_die to run_command convert: modernize tests convert: quote filter names in error messages	2016-10-31 13:15:21 -07:00
Junio C Hamano	ee87d47b36	Merge branch 'jk/daemon-path-ok-check-truncation' "git daemon" used fixed-length buffers to turn URL to the repository the client asked for into the server side directory path, using snprintf() to avoid overflowing these buffers, but allowed possibly truncated paths to the directory. This has been tightened to reject such a request that causes overlong path to be required to serve. * jk/daemon-path-ok-check-truncation: daemon: detect and reject too-long paths	2016-10-27 14:58:50 -07:00
Jeff King	6bdb0083be	daemon: detect and reject too-long paths When we are checking the path via path_ok(), we use some fixed PATH_MAX buffers. We write into them via snprintf(), so there's no possibility of overflow, but it does mean we may silently truncate the path, leading to potentially confusing errors when the partial path does not exist. We're better off to reject the path explicitly. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-24 09:59:29 -07:00
Lars Schneider	81c634e94f	pkt-line: rename packet_write() to packet_write_fmt() packet_write() should be called packet_write_fmt() because it is a printf-like function that takes a format string as first parameter. packet_write_fmt() should be used for text strings only. Arbitrary binary data should use a new packet_write() function that is introduced in a subsequent patch. Suggested-by: Junio C Hamano <gitster@pobox.com> Signed-off-by: Lars Schneider <larsxschneider@gmail.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-10-17 11:36:50 -07:00
Junio C Hamano	faacc8efe5	Merge branch 'jk/common-main' into maint There are certain house-keeping tasks that need to be performed at the very beginning of any Git program, and programs that are not built-in commands had to do them exactly the same way as "git" potty does. It was easy to make mistakes in one-off standalone programs (like test helpers). A common "main()" function that calls cmd_main() of individual program has been introduced to make it harder to make mistakes. * jk/common-main: mingw: declare main()'s argv as const common-main: call git_setup_gettext() common-main: call restore_sigpipe_to_default() common-main: call sanitize_stdfds() common-main: call git_extract_argv0_path() add an extra level of indirection to main()	2016-09-08 21:35:51 -07:00
Junio C Hamano	b48dfd86c9	Merge branch 'ew/daemon-socket-keepalive' Recent update to "git daemon" tries to enable the socket-level KEEPALIVE, but when it is spawned via inetd, the standard input file descriptor may not necessarily be connected to a socket. Suppress an ENOTSOCK error from setsockopt(). * ew/daemon-socket-keepalive: Windows: add missing definition of ENOTSOCK daemon: ignore ENOTSOCK from setsockopt	2016-07-28 10:34:43 -07:00
Junio C Hamano	d4c6375fd8	Merge branch 'jk/common-main' There are certain house-keeping tasks that need to be performed at the very beginning of any Git program, and programs that are not built-in commands had to do them exactly the same way as "git" potty does. It was easy to make mistakes in one-off standalone programs (like test helpers). A common "main()" function that calls cmd_main() of individual program has been introduced to make it harder to make mistakes. * jk/common-main: mingw: declare main()'s argv as const common-main: call git_setup_gettext() common-main: call restore_sigpipe_to_default() common-main: call sanitize_stdfds() common-main: call git_extract_argv0_path() add an extra level of indirection to main()	2016-07-19 13:22:19 -07:00
Eric Wong	49c58d86ce	daemon: ignore ENOTSOCK from setsockopt In inetd mode, we are not guaranteed stdin or stdout is a socket; callers could filter the data through a pipe or be testing with regular files. This prevents t5802 from polluting syslog. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-18 11:09:52 -07:00
Jeff King	5ce5f5fa5a	common-main: call git_setup_gettext() This should be part of every program, as otherwise users do not get translated error messages. However, some external commands forgot to do so (e.g., git-credential-store). This fixes them, and eliminates the repeated code in programs that did remember to use it. Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-01 15:09:10 -07:00
Jeff King	57f5d52a94	common-main: call sanitize_stdfds() This is setup that should be done in every program for safety, but we never got around to adding it everywhere (so builtins benefited from the call in git.c, but any external commands did not). Putting it in the common main() gives us this safety everywhere. Note that the case in daemon.c is a little funny. We wait until we know whether we want to daemonize, and then either: - call daemonize(), which will close stdio and reopen it to /dev/null under the hood - sanitize_stdfds(), to fix up any odd cases But that is way too late; the point of sanitizing is to give us reliable descriptors on 0/1/2, and we will already have executed code, possibly called die(), etc. The sanitizing should be the very first thing that happens. With this patch, git-daemon will sanitize first, and can remove the call in the non-daemonize case. It does mean that daemonize() may just end up closing the descriptors we opened, but that's not a big deal (it's not wrong to do so, nor is it really less optimal than the case where our parent process redirected us from /dev/null ahead of time). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-01 15:09:10 -07:00
Jeff King	650c449250	common-main: call git_extract_argv0_path() Every program which links against libgit.a must call this function, or risk hitting an assert() in system_path() that checks whether we have configured argv0_path (though only when RUNTIME_PREFIX is defined, so essentially only on Windows). Looking at the diff, you can see that putting it into the common main() saves us having to do it individually in each of the external commands. But what you can't see are the cases where we _should_ have been doing so, but weren't (e.g., git-credential-store, and all of the t/helper test programs). This has been an accident-waiting-to-happen for a long time, but wasn't triggered until recently because it involves one of those programs actually calling system_path(). That happened with git-credential-store in v2.8.0 with `ae5f677` (lazily load core.sharedrepository, 2016-03-11). The program: - takes a lock file, which... - opens a tempfile, which... - calls adjust_shared_perm to fix permissions, which... - lazy-loads the config (as of `ae5f677`), which... - calls system_path() to find the location of /etc/gitconfig On systems with RUNTIME_PREFIX, this means credential-store reliably hits that assert() and cannot be used. We never noticed in the test suite, because we set GIT_CONFIG_NOSYSTEM there, which skips the system_path() lookup entirely. But if we were to tweak git_config() to find /etc/gitconfig even when we aren't going to open it, then the test suite shows multiple failures (for credential-store, and for some other test helpers). I didn't include that tweak here because it's way too specific to this particular call to be worth carrying around what is essentially dead code. The implementation is fairly straightforward, with one exception: there is exactly one caller (git.c) that actually cares about the result of the function, and not the side-effect of setting up argv0_path. We can accommodate that by simply replacing the value of argv[0] in the array we hand down to cmd_main(). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-01 15:09:10 -07:00
Jeff King	3f2e2297b9	add an extra level of indirection to main() There are certain startup tasks that we expect every git process to do. In some cases this is just to improve the quality of the program (e.g., setting up gettext()). In others it is a requirement for using certain functions in libgit.a (e.g., system_path() expects that you have called git_extract_argv0_path()). Most commands are builtins and are covered by the git.c version of main(). However, there are still a few external commands that use their own main(). Each of these has to remember to include the correct startup sequence, and we are not always consistent. Rather than just fix the inconsistencies, let's make this harder to get wrong by providing a common main() that can run this standard startup. We basically have two options to do this: - the compat/mingw.h file already does something like this by adding a #define that replaces the definition of main with a wrapper that calls mingw_startup(). The upside is that the code in each program doesn't need to be changed at all; it's rewritten on the fly by the preprocessor. The downside is that it may make debugging of the startup sequence a bit more confusing, as the preprocessor is quietly inserting new code. - the builtin functions are all of the form cmd_foo(), and git.c's main() calls them. This is much more explicit, which may make things more obvious to somebody reading the code. It's also more flexible (because of course we have to figure out _which_ cmd_foo() to call). The downside is that each of the builtins must define cmd_foo(), instead of just main(). This patch chooses the latter option, preferring the more explicit approach, even though it is more invasive. We introduce a new file common-main.c, with the "real" main. It expects to call cmd_main() from whatever other objects it is linked against. We link common-main.o against anything that links against libgit.a, since we know that such programs will need to do this setup. Note that common-main.o can't actually go inside libgit.a, as the linker would not pick up its main() function automatically (it has no callers). The rest of the patch is just adjusting all of the various external programs (mostly in t/helper) to use cmd_main(). I've provided a global declaration for cmd_main(), which means that all of the programs also need to match its signature. In particular, many functions need to switch to "const char " instead of "char " for argv. This effect ripples out to a few other variables and functions, as well. This makes the patch even more invasive, but the end result is much better. We should be treating argv strings as const anyway, and now all programs conform to the same signature (which also matches the way builtins are defined). Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-07-01 15:09:10 -07:00
Eric Wong	a43b68a196	daemon: enable SO_KEEPALIVE for all sockets While --init-timeout and --timeout options exist and I've never run git-daemon without them, some users may forget to set them and encounter hung daemon processes when connections fail. Enable socket-level timeouts so the kernel can send keepalive probes as necessary to detect failed connections. Signed-off-by: Eric Wong <e@80x24.org> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-05-25 09:42:53 -07:00
Junio C Hamano	11529ecec9	Merge branch 'jk/tighten-alloc' Update various codepaths to avoid manually-counted malloc(). * jk/tighten-alloc: (22 commits) ewah: convert to REALLOC_ARRAY, etc convert ewah/bitmap code to use xmalloc diff_populate_gitlink: use a strbuf transport_anonymize_url: use xstrfmt git-compat-util: drop mempcpy compat code sequencer: simplify memory allocation of get_message test-path-utils: fix normalize_path_copy output buffer size fetch-pack: simplify add_sought_entry fast-import: simplify allocation in start_packfile write_untracked_extension: use FLEX_ALLOC helper prepare_{git,shell}_cmd: use argv_array use st_add and st_mult for allocation size computation convert trivial cases to FLEX_ARRAY macros use xmallocz to avoid size arithmetic convert trivial cases to ALLOC_ARRAY convert manual allocations to argv_array argv-array: add detach function add helpers for allocating flex-array structs harden REALLOC_ARRAY and xcalloc against size_t overflow tree-diff: catch integer overflow in combine_diff_path allocation ...	2016-02-26 13:37:16 -08:00
Jeff King	850d2fec53	convert manual allocations to argv_array There are many manual argv allocations that predate the argv_array API. Switching to that API brings a few advantages: 1. We no longer have to manually compute the correct final array size (so it's one less thing we can screw up). 2. In many cases we had to make a separate pass to count, then allocate, then fill in the array. Now we can do it in one pass, making the code shorter and easier to follow. 3. argv_array handles memory ownership for us, making it more obvious when things should be free()d and and when not. Most of these cases are pretty straightforward. In some, we switch from "run_command_v" to "run_command" which lets us directly use the argv_array embedded in "struct child_process". Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-02-22 14:50:32 -08:00
Junio C Hamano	8f309aeb82	strbuf: introduce strbuf_getline_{lf,nul}() The strbuf_getline() interface allows a byte other than LF or NUL as the line terminator, but this is only because I wrote these codepaths anticipating that there might be a value other than NUL and LF that could be useful when I introduced line_termination long time ago. No useful caller that uses other value has emerged. By now, it is clear that the interface is overly broad without a good reason. Many codepaths have hardcoded preference to read either LF terminated or NUL terminated records from their input, and then call strbuf_getline() with LF or NUL as the third parameter. This step introduces two thin wrappers around strbuf_getline(), namely, strbuf_getline_lf() and strbuf_getline_nul(), and mechanically rewrites these call sites to call either one of them. The changes contained in this patch are: * introduction of these two functions in strbuf.[ch] * mechanical conversion of all callers to strbuf_getline() with either '\n' or '\0' as the third parameter to instead call the respective thin wrapper. After this step, output from "git grep 'strbuf_getline('" would become a lot smaller. An interim goal of this series is to make this an empty set, so that we can have strbuf_getline_crlf() take over the shorter name strbuf_getline(). Signed-off-by: Junio C Hamano <gitster@pobox.com>	2016-01-15 10:12:51 -08:00
Junio C Hamano	c3c592ef95	Merge branch 'rs/daemon-plug-child-leak' "git daemon" uses "run_command()" without "finish_command()", so it needs to release resources itself, which it forgot to do. * rs/daemon-plug-child-leak: daemon: plug memory leak run-command: factor out child_process_clear()	2015-11-03 15:13:12 -08:00
René Scharfe	b1b49ff8d4	daemon: plug memory leak Call child_process_clear() when a child ends to release the memory allocated for its environment. This is necessary because unlike all other users of start_command() we don't call finish_command(), which would have taken care of that for us. This leak was introduced by `f063d38b` (daemon: use cld->env_array when re-spawning). Signed-off-by: Rene Scharfe <l.s.r@web.de> Signed-off-by: Junio C Hamano <gitster@pobox.com>	2015-11-02 15:01:23 -08:00

1 2 3 4 5 ...

253 commits