git/alias.c

136 lines
2.7 KiB
C
Raw Normal View History

#include "git-compat-util.h"
#include "alias.h"
#include "config.h"
#include "gettext.h"
#include "strbuf.h"
#include "string-list.h"
struct config_alias_data {
const char *alias;
char *v;
struct string_list *list;
};
config: add ctx arg to config_fn_t Add a new "const struct config_context *ctx" arg to config_fn_t to hold additional information about the config iteration operation. config_context has a "struct key_value_info kvi" member that holds metadata about the config source being read (e.g. what kind of config source it is, the filename, etc). In this series, we're only interested in .kvi, so we could have just used "struct key_value_info" as an arg, but config_context makes it possible to add/adjust members in the future without changing the config_fn_t signature. We could also consider other ways of organizing the args (e.g. moving the config name and value into config_context or key_value_info), but in my experiments, the incremental benefit doesn't justify the added complexity (e.g. a config_fn_t will sometimes invoke another config_fn_t but with a different config value). In subsequent commits, the .kvi member will replace the global "struct config_reader" in config.c, making config iteration a global-free operation. It requires much more work for the machinery to provide meaningful values of .kvi, so for now, merely change the signature and call sites, pass NULL as a placeholder value, and don't rely on the arg in any meaningful way. Most of the changes are performed by contrib/coccinelle/config_fn_ctx.pending.cocci, which, for every config_fn_t: - Modifies the signature to accept "const struct config_context *ctx" - Passes "ctx" to any inner config_fn_t, if needed - Adds UNUSED attributes to "ctx", if needed Most config_fn_t instances are easily identified by seeing if they are called by the various config functions. Most of the remaining ones are manually named in the .cocci patch. Manual cleanups are still needed, but the majority of it is trivial; it's either adjusting config_fn_t that the .cocci patch didn't catch, or adding forward declarations of "struct config_context ctx" to make the signatures make sense. The non-trivial changes are in cases where we are invoking a config_fn_t outside of config machinery, and we now need to decide what value of "ctx" to pass. These cases are: - trace2/tr2_cfg.c:tr2_cfg_set_fl() This is indirectly called by git_config_set() so that the trace2 machinery can notice the new config values and update its settings using the tr2 config parsing function, i.e. tr2_cfg_cb(). - builtin/checkout.c:checkout_main() This calls git_xmerge_config() as a shorthand for parsing a CLI arg. This might be worth refactoring away in the future, since git_xmerge_config() can call git_default_config(), which can do much more than just parsing. Handle them by creating a KVI_INIT macro that initializes "struct key_value_info" to a reasonable default, and use that to construct the "ctx" arg. Signed-off-by: Glen Choo <chooglen@google.com> Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-06-28 19:26:22 +00:00
static int config_alias_cb(const char *key, const char *value,
const struct config_context *ctx UNUSED, void *d)
{
struct config_alias_data *data = d;
const char *p;
if (!skip_prefix(key, "alias.", &p))
return 0;
if (data->alias) {
if (!strcasecmp(p, data->alias))
return git_config_string((const char **)&data->v,
key, value);
} else if (data->list) {
string_list_append(data->list, p);
}
return 0;
}
char *alias_lookup(const char *alias)
{
struct config_alias_data data = { alias, NULL };
read_early_config(config_alias_cb, &data);
return data.v;
}
void list_aliases(struct string_list *list)
{
struct config_alias_data data = { NULL, NULL, list };
read_early_config(config_alias_cb, &data);
}
void quote_cmdline(struct strbuf *buf, const char **argv)
{
for (const char **argp = argv; *argp; argp++) {
if (argp != argv)
strbuf_addch(buf, ' ');
strbuf_addch(buf, '"');
for (const char *p = *argp; *p; p++) {
const char c = *p;
if (c == '"' || c =='\\')
strbuf_addch(buf, '\\');
strbuf_addch(buf, c);
}
strbuf_addch(buf, '"');
}
}
#define SPLIT_CMDLINE_BAD_ENDING 1
#define SPLIT_CMDLINE_UNCLOSED_QUOTE 2
alias.c: reject too-long cmdline strings in split_cmdline() This function improperly uses an int to represent the number of entries in the resulting argument array. This allows a malicious actor to intentionally overflow the return value, leading to arbitrary heap writes. Because the resulting argv array is typically passed to execv(), it may be possible to leverage this attack to gain remote code execution on a victim machine. This was almost certainly the case for certain configurations of git-shell until the previous commit limited the size of input it would accept. Other calls to split_cmdline() are typically limited by the size of argv the OS is willing to hand us, so are similarly protected. So this is not strictly fixing a known vulnerability, but is a hardening of the function that is worth doing to protect against possible unknown vulnerabilities. One approach to fixing this would be modifying the signature of `split_cmdline()` to look something like: int split_cmdline(char *cmdline, const char ***argv, size_t *argc); Where the return value of `split_cmdline()` is negative for errors, and zero otherwise. If non-NULL, the `*argc` pointer is modified to contain the size of the `**argv` array. But this implies an absurdly large `argv` array, which more than likely larger than the system's argument limit. So even if split_cmdline() allowed this, it would fail immediately afterwards when we called execv(). So instead of converting all of `split_cmdline()`'s callers to work with `size_t` types in this patch, instead pursue the minimal fix here to prevent ever returning an array with more than INT_MAX entries in it. Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-09-28 22:53:32 +00:00
#define SPLIT_CMDLINE_ARGC_OVERFLOW 3
static const char *split_cmdline_errors[] = {
N_("cmdline ends with \\"),
alias.c: reject too-long cmdline strings in split_cmdline() This function improperly uses an int to represent the number of entries in the resulting argument array. This allows a malicious actor to intentionally overflow the return value, leading to arbitrary heap writes. Because the resulting argv array is typically passed to execv(), it may be possible to leverage this attack to gain remote code execution on a victim machine. This was almost certainly the case for certain configurations of git-shell until the previous commit limited the size of input it would accept. Other calls to split_cmdline() are typically limited by the size of argv the OS is willing to hand us, so are similarly protected. So this is not strictly fixing a known vulnerability, but is a hardening of the function that is worth doing to protect against possible unknown vulnerabilities. One approach to fixing this would be modifying the signature of `split_cmdline()` to look something like: int split_cmdline(char *cmdline, const char ***argv, size_t *argc); Where the return value of `split_cmdline()` is negative for errors, and zero otherwise. If non-NULL, the `*argc` pointer is modified to contain the size of the `**argv` array. But this implies an absurdly large `argv` array, which more than likely larger than the system's argument limit. So even if split_cmdline() allowed this, it would fail immediately afterwards when we called execv(). So instead of converting all of `split_cmdline()`'s callers to work with `size_t` types in this patch, instead pursue the minimal fix here to prevent ever returning an array with more than INT_MAX entries in it. Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-09-28 22:53:32 +00:00
N_("unclosed quote"),
N_("too many arguments"),
};
int split_cmdline(char *cmdline, const char ***argv)
{
alias.c: reject too-long cmdline strings in split_cmdline() This function improperly uses an int to represent the number of entries in the resulting argument array. This allows a malicious actor to intentionally overflow the return value, leading to arbitrary heap writes. Because the resulting argv array is typically passed to execv(), it may be possible to leverage this attack to gain remote code execution on a victim machine. This was almost certainly the case for certain configurations of git-shell until the previous commit limited the size of input it would accept. Other calls to split_cmdline() are typically limited by the size of argv the OS is willing to hand us, so are similarly protected. So this is not strictly fixing a known vulnerability, but is a hardening of the function that is worth doing to protect against possible unknown vulnerabilities. One approach to fixing this would be modifying the signature of `split_cmdline()` to look something like: int split_cmdline(char *cmdline, const char ***argv, size_t *argc); Where the return value of `split_cmdline()` is negative for errors, and zero otherwise. If non-NULL, the `*argc` pointer is modified to contain the size of the `**argv` array. But this implies an absurdly large `argv` array, which more than likely larger than the system's argument limit. So even if split_cmdline() allowed this, it would fail immediately afterwards when we called execv(). So instead of converting all of `split_cmdline()`'s callers to work with `size_t` types in this patch, instead pursue the minimal fix here to prevent ever returning an array with more than INT_MAX entries in it. Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-09-28 22:53:32 +00:00
size_t src, dst, count = 0, size = 16;
char quoted = 0;
ALLOC_ARRAY(*argv, size);
/* split alias_string */
(*argv)[count++] = cmdline;
for (src = dst = 0; cmdline[src];) {
char c = cmdline[src];
if (!quoted && isspace(c)) {
cmdline[dst++] = 0;
while (cmdline[++src]
&& isspace(cmdline[src]))
; /* skip */
ALLOC_GROW(*argv, count + 1, size);
(*argv)[count++] = cmdline + dst;
} else if (!quoted && (c == '\'' || c == '"')) {
quoted = c;
src++;
} else if (c == quoted) {
quoted = 0;
src++;
} else {
if (c == '\\' && quoted != '\'') {
src++;
c = cmdline[src];
if (!c) {
FREE_AND_NULL(*argv);
return -SPLIT_CMDLINE_BAD_ENDING;
}
}
cmdline[dst++] = c;
src++;
}
}
cmdline[dst] = 0;
if (quoted) {
FREE_AND_NULL(*argv);
return -SPLIT_CMDLINE_UNCLOSED_QUOTE;
}
alias.c: reject too-long cmdline strings in split_cmdline() This function improperly uses an int to represent the number of entries in the resulting argument array. This allows a malicious actor to intentionally overflow the return value, leading to arbitrary heap writes. Because the resulting argv array is typically passed to execv(), it may be possible to leverage this attack to gain remote code execution on a victim machine. This was almost certainly the case for certain configurations of git-shell until the previous commit limited the size of input it would accept. Other calls to split_cmdline() are typically limited by the size of argv the OS is willing to hand us, so are similarly protected. So this is not strictly fixing a known vulnerability, but is a hardening of the function that is worth doing to protect against possible unknown vulnerabilities. One approach to fixing this would be modifying the signature of `split_cmdline()` to look something like: int split_cmdline(char *cmdline, const char ***argv, size_t *argc); Where the return value of `split_cmdline()` is negative for errors, and zero otherwise. If non-NULL, the `*argc` pointer is modified to contain the size of the `**argv` array. But this implies an absurdly large `argv` array, which more than likely larger than the system's argument limit. So even if split_cmdline() allowed this, it would fail immediately afterwards when we called execv(). So instead of converting all of `split_cmdline()`'s callers to work with `size_t` types in this patch, instead pursue the minimal fix here to prevent ever returning an array with more than INT_MAX entries in it. Signed-off-by: Kevin Backhouse <kevinbackhouse@github.com> Signed-off-by: Taylor Blau <me@ttaylorr.com> Signed-off-by: Jeff King <peff@peff.net> Signed-off-by: Taylor Blau <me@ttaylorr.com>
2022-09-28 22:53:32 +00:00
if (count >= INT_MAX) {
FREE_AND_NULL(*argv);
return -SPLIT_CMDLINE_ARGC_OVERFLOW;
}
ALLOC_GROW(*argv, count + 1, size);
(*argv)[count] = NULL;
return count;
}
const char *split_cmdline_strerror(int split_cmdline_errno)
{
return split_cmdline_errors[-split_cmdline_errno - 1];
}