Merge branch 'tl/ls-tree-oid-only'

"git ls-tree" learns the "--object-only" option, similar to "--name-only",
and a more general "--format" option.

* tl/ls-tree-oid-only:
  ls-tree: split up "fast path" callbacks
  ls-tree: detect and error on --name-only --name-status
  ls-tree: support --object-only option for "git-ls-tree"
  ls-tree: introduce "--format" option
  cocci: allow padding with `strbuf_addf()`
  ls-tree: introduce struct "show_tree_data"
  ls-tree: slightly refactor `show_tree()`
  ls-tree: fix "--name-only" and "--long" combined use bug
  ls-tree: simplify nesting if/else logic in "show_tree()"
  ls-tree: rename "retval" to "recurse" in "show_tree()"
  ls-tree: use "size_t", not "int" for "struct strbuf"'s "len"
  ls-tree: use "enum object_type", not {blob,tree,commit}_type
  ls-tree: add missing braces to "else" arms
  ls-tree: remove commented-out code
  ls-tree tests: add tests for --name-status
Junio C Hamano 2022-04-04 10:56:21 -07:00
commit 1041d58b4d
6 changed files with 490 additions and 92 deletions


@ -10,7 +10,7 @@ SYNOPSIS
--------
[verse]
'git ls-tree' [-d] [-r] [-t] [-l] [-z]
[--name-only] [--name-status] [--full-name] [--full-tree] [--abbrev[=<n>]]
[--name-only] [--name-status] [--object-only] [--full-name] [--full-tree] [--abbrev[=<n>]] [--format=<format>]
<tree-ish> [<path>...]
DESCRIPTION
@ -59,6 +59,15 @@ OPTIONS
--name-only::
--name-status::
List only filenames (instead of the "long" output), one per line.
Cannot be combined with `--object-only`.
--object-only::
List only names of the objects, one per line. Cannot be combined
with `--name-only` or `--name-status`.
This is equivalent to specifying `--format='%(objectname)'`, but
for both this option and that exact format the command takes a
hand-optimized codepath instead of going through the generic
formatting mechanism.
--abbrev[=<n>]::
Instead of showing the full 40-byte hexadecimal object
@ -74,6 +83,16 @@ OPTIONS
Do not limit the listing to the current working directory.
Implies --full-name.
--format=<format>::
A string that interpolates `%(fieldname)` from the result
being shown. It also interpolates `%%` to `%`, and
`%xNN` where `NN` are hex digits interpolates to the character
with hex code `NN`; for example `%x00` interpolates to
`\0` (NUL), `%x09` to `\t` (TAB) and `%x0a` to `\n` (LF).
When specified, `--format` cannot be combined with other
format-altering options, including `--long`, `--name-only`
and `--object-only`.
[<path>...]::
When paths are given, show them (note that this isn't really raw
pathnames, but rather a list of patterns to match). Otherwise
@ -82,16 +101,29 @@ OPTIONS
Output Format
-------------
<mode> SP <type> SP <object> TAB <file>
The output format of `ls-tree` is determined by either the `--format`
option, or other format-altering options such as `--name-only` etc.
(see `--format` above).
The use of certain `--format` directives is equivalent to using those
options, but invoking the full formatting machinery can be slower than
using an appropriate formatting option.
In cases where the `--format` would exactly map to an existing option
`ls-tree` will use the appropriate faster path. Thus the default format
is equivalent to:
%(objectmode) %(objecttype) %(objectname)%x09%(path)
This output format is compatible with what `--index-info --stdin` of
'git update-index' expects.
When the `-l` option is used, format changes to
<mode> SP <type> SP <object> SP <object size> TAB <file>
%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)
Object size identified by <object> is given in bytes, and right-justified
Object size identified by <objectname> is given in bytes, and right-justified
with minimum width of 7 characters. Object size is given only for blobs
(file) entries; for other entries `-` character is used in place of size.
@ -100,6 +132,34 @@ quoted as explained for the configuration variable `core.quotePath`
(see linkgit:git-config[1]). Using `-z` the filename is output
verbatim and the line is terminated by a NUL byte.
Customized format:
It is possible to print in a custom format by using the `--format` option,
which is able to interpolate different fields using a `%(fieldname)` notation.
For example, if you only care about the "objectname" and "path" fields, you
can execute with a specific "--format" like
git ls-tree --format='%(objectname) %(path)' <tree-ish>
FIELD NAMES
-----------
Various values from structured fields can be used to interpolate
into the resulting output. For each output line, the following
names can be used:
objectmode::
The mode of the object.
objecttype::
The type of the object (`commit`, `blob` or `tree`).
objectname::
The name of the object.
objectsize[:padded]::
The size of a "blob" object ("-" if it's a "commit" or "tree").
It also supports a padded format of size with "%(objectsize:padded)".
path::
The pathname of the object.
GIT
---
Part of the linkgit:git[1] suite
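
The interpolation rules documented for `--format` above can be illustrated with a small standalone sketch. It is not git's implementation: `demo_entry`, `demo_hexval`, `demo_skip` and `demo_expand` are hypothetical names, only two fields are handled, and the object name abbreviation and path quoting that the real command performs are left out. It assumes the `%xNN` spelling used by the default format (`%x09`).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one tree entry; not git's show_tree_data. */
struct demo_entry {
	const char *objectname;
	const char *path;
};

/* Value of one hex digit, or -1 if c is not a hex digit. */
static int demo_hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	if (c >= 'a' && c <= 'f')
		return c - 'a' + 10;
	if (c >= 'A' && c <= 'F')
		return c - 'A' + 10;
	return -1;
}

/* Length of prefix if s starts with it, otherwise 0. */
static size_t demo_skip(const char *s, const char *prefix)
{
	size_t len = strlen(prefix);

	return strncmp(s, prefix, len) ? 0 : len;
}

/*
 * Expand fmt into out (capacity cap, NUL-terminated on success).
 * Returns the number of bytes written, or -1 on an unknown
 * placeholder or when out is too small.
 */
static int demo_expand(char *out, size_t cap, const char *fmt,
		       const struct demo_entry *e)
{
	size_t n = 0;

	assert(out && fmt && e && cap > 0);
	while (*fmt) {
		char ch;
		const char *piece = &ch;
		size_t piece_len = 1;
		size_t skip;

		if (*fmt != '%') {
			ch = *fmt++;			/* literal character */
		} else if (fmt[1] == '%') {
			ch = '%';			/* "%%" becomes "%" */
			fmt += 2;
		} else if (fmt[1] == 'x' && demo_hexval(fmt[2]) >= 0 &&
			   demo_hexval(fmt[3]) >= 0) {
			/* "%xNN" becomes the byte with hex code NN */
			ch = (char)(demo_hexval(fmt[2]) * 16 +
				    demo_hexval(fmt[3]));
			fmt += 4;
		} else if ((skip = demo_skip(fmt, "%(objectname)"))) {
			piece = e->objectname;
			piece_len = strlen(piece);
			fmt += skip;
		} else if ((skip = demo_skip(fmt, "%(path)"))) {
			piece = e->path;
			piece_len = strlen(piece);
			fmt += skip;
		} else {
			return -1;			/* unknown placeholder */
		}
		if (n + piece_len >= cap)
			return -1;
		memcpy(out + n, piece, piece_len);
		n += piece_len;
	}
	out[n] = '\0';
	return (int)n;
}
```

With an entry whose `objectname` is `abc123` and whose `path` is `dir/file`, the format `%(objectname)%x09%(path)` expands to the two values separated by a TAB, while an unknown placeholder such as `%(bogus)` is rejected rather than copied through.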


@ -16,22 +16,102 @@
static int line_termination = '\n';
#define LS_RECURSIVE 1
#define LS_TREE_ONLY 2
#define LS_SHOW_TREES 4
#define LS_NAME_ONLY 8
#define LS_SHOW_SIZE 16
#define LS_TREE_ONLY (1 << 1)
#define LS_SHOW_TREES (1 << 2)
static int abbrev;
static int ls_options;
static struct pathspec pathspec;
static int chomp_prefix;
static const char *ls_tree_prefix;
static const char *format;
struct show_tree_data {
unsigned mode;
enum object_type type;
const struct object_id *oid;
const char *pathname;
struct strbuf *base;
};
static const char * const ls_tree_usage[] = {
N_("git ls-tree [<options>] <tree-ish> [<path>...]"),
NULL
};
static int show_recursive(const char *base, int baselen, const char *pathname)
static enum ls_tree_cmdmode {
MODE_DEFAULT = 0,
MODE_LONG,
MODE_NAME_ONLY,
MODE_NAME_STATUS,
MODE_OBJECT_ONLY,
} cmdmode;
static void expand_objectsize(struct strbuf *line, const struct object_id *oid,
const enum object_type type, unsigned int padded)
{
if (type == OBJ_BLOB) {
unsigned long size;
if (oid_object_info(the_repository, oid, &size) < 0)
die(_("could not get object info about '%s'"),
oid_to_hex(oid));
if (padded)
strbuf_addf(line, "%7"PRIuMAX, (uintmax_t)size);
else
strbuf_addf(line, "%"PRIuMAX, (uintmax_t)size);
} else if (padded) {
strbuf_addf(line, "%7s", "-");
} else {
strbuf_addstr(line, "-");
}
}
static size_t expand_show_tree(struct strbuf *sb, const char *start,
void *context)
{
struct show_tree_data *data = context;
const char *end;
const char *p;
unsigned int errlen;
size_t len = strbuf_expand_literal_cb(sb, start, NULL);
if (len)
return len;
if (*start != '(')
die(_("bad ls-tree format: element '%s' does not start with '('"), start);
end = strchr(start + 1, ')');
if (!end)
die(_("bad ls-tree format: element '%s' does not end in ')'"), start);
len = end - start + 1;
if (skip_prefix(start, "(objectmode)", &p)) {
strbuf_addf(sb, "%06o", data->mode);
} else if (skip_prefix(start, "(objecttype)", &p)) {
strbuf_addstr(sb, type_name(data->type));
} else if (skip_prefix(start, "(objectsize:padded)", &p)) {
expand_objectsize(sb, data->oid, data->type, 1);
} else if (skip_prefix(start, "(objectsize)", &p)) {
expand_objectsize(sb, data->oid, data->type, 0);
} else if (skip_prefix(start, "(objectname)", &p)) {
strbuf_add_unique_abbrev(sb, data->oid, abbrev);
} else if (skip_prefix(start, "(path)", &p)) {
const char *name = data->base->buf;
const char *prefix = chomp_prefix ? ls_tree_prefix : NULL;
struct strbuf quoted = STRBUF_INIT;
struct strbuf sbuf = STRBUF_INIT;
strbuf_addstr(data->base, data->pathname);
name = relative_path(data->base->buf, prefix, &sbuf);
quote_c_style(name, &quoted, NULL, 0);
strbuf_addbuf(sb, &quoted);
strbuf_release(&sbuf);
strbuf_release(&quoted);
} else {
errlen = (unsigned long)len;
die(_("bad ls-tree format: %%%.*s"), errlen, start);
}
return len;
}
static int show_recursive(const char *base, size_t baselen, const char *pathname)
{
int i;
@ -43,7 +123,7 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
for (i = 0; i < pathspec.nr; i++) {
const char *spec = pathspec.items[i].match;
int len, speclen;
size_t len, speclen;
if (strncmp(base, spec, baselen))
continue;
@ -61,69 +141,197 @@ static int show_recursive(const char *base, int baselen, const char *pathname)
return 0;
}
static int show_tree(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, void *context)
static int show_tree_fmt(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, void *context)
{
int retval = 0;
int baselen;
const char *type = blob_type;
size_t baselen;
int recurse = 0;
struct strbuf sb = STRBUF_INIT;
enum object_type type = object_type(mode);
if (S_ISGITLINK(mode)) {
/*
* Maybe we want to have some recursive version here?
*
* Something similar to this incomplete example:
*
if (show_subprojects(base, baselen, pathname))
retval = READ_TREE_RECURSIVE;
*
*/
type = commit_type;
} else if (S_ISDIR(mode)) {
if (show_recursive(base->buf, base->len, pathname)) {
retval = READ_TREE_RECURSIVE;
if (!(ls_options & LS_SHOW_TREES))
return retval;
}
type = tree_type;
}
else if (ls_options & LS_TREE_ONLY)
struct show_tree_data data = {
.mode = mode,
.type = type,
.oid = oid,
.pathname = pathname,
.base = base,
};
if (type == OBJ_TREE && show_recursive(base->buf, base->len, pathname))
recurse = READ_TREE_RECURSIVE;
if (type == OBJ_TREE && recurse && !(ls_options & LS_SHOW_TREES))
return recurse;
if (type == OBJ_BLOB && (ls_options & LS_TREE_ONLY))
return 0;
if (!(ls_options & LS_NAME_ONLY)) {
if (ls_options & LS_SHOW_SIZE) {
char size_text[24];
if (!strcmp(type, blob_type)) {
unsigned long size;
if (oid_object_info(the_repository, oid, &size) == OBJ_BAD)
xsnprintf(size_text, sizeof(size_text),
"BAD");
else
xsnprintf(size_text, sizeof(size_text),
"%"PRIuMAX, (uintmax_t)size);
} else
xsnprintf(size_text, sizeof(size_text), "-");
printf("%06o %s %s %7s\t", mode, type,
find_unique_abbrev(oid, abbrev),
size_text);
} else
printf("%06o %s %s\t", mode, type,
find_unique_abbrev(oid, abbrev));
}
baselen = base->len;
strbuf_expand(&sb, format, expand_show_tree, &data);
strbuf_addch(&sb, line_termination);
fwrite(sb.buf, sb.len, 1, stdout);
strbuf_release(&sb);
strbuf_setlen(base, baselen);
return recurse;
}
static int show_tree_common(struct show_tree_data *data, int *recurse,
const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode)
{
enum object_type type = object_type(mode);
int ret = -1;
*recurse = 0;
data->mode = mode;
data->type = type;
data->oid = oid;
data->pathname = pathname;
data->base = base;
if (type == OBJ_BLOB) {
if (ls_options & LS_TREE_ONLY)
ret = 0;
} else if (type == OBJ_TREE &&
show_recursive(base->buf, base->len, pathname)) {
*recurse = READ_TREE_RECURSIVE;
if (!(ls_options & LS_SHOW_TREES))
ret = *recurse;
}
return ret;
}
static void show_tree_common_default_long(struct strbuf *base,
const char *pathname,
const size_t baselen)
{
strbuf_addstr(base, pathname);
write_name_quoted_relative(base->buf,
chomp_prefix ? ls_tree_prefix : NULL, stdout,
line_termination);
strbuf_setlen(base, baselen);
}
static int show_tree_default(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode,
void *context)
{
int early;
int recurse;
struct show_tree_data data = { 0 };
early = show_tree_common(&data, &recurse, oid, base, pathname, mode);
if (early >= 0)
return early;
printf("%06o %s %s\t", data.mode, type_name(data.type),
find_unique_abbrev(data.oid, abbrev));
show_tree_common_default_long(base, pathname, data.base->len);
return recurse;
}
static int show_tree_long(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, void *context)
{
int early;
int recurse;
struct show_tree_data data = { 0 };
char size_text[24];
early = show_tree_common(&data, &recurse, oid, base, pathname, mode);
if (early >= 0)
return early;
if (data.type == OBJ_BLOB) {
unsigned long size;
if (oid_object_info(the_repository, data.oid, &size) == OBJ_BAD)
xsnprintf(size_text, sizeof(size_text), "BAD");
else
xsnprintf(size_text, sizeof(size_text),
"%" PRIuMAX, (uintmax_t)size);
} else {
xsnprintf(size_text, sizeof(size_text), "-");
}
printf("%06o %s %s %7s\t", data.mode, type_name(data.type),
find_unique_abbrev(data.oid, abbrev), size_text);
show_tree_common_default_long(base, pathname, data.base->len);
return recurse;
}
static int show_tree_name_only(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, void *context)
{
int early;
int recurse;
const size_t baselen = base->len;
struct show_tree_data data = { 0 };
early = show_tree_common(&data, &recurse, oid, base, pathname, mode);
if (early >= 0)
return early;
strbuf_addstr(base, pathname);
write_name_quoted_relative(base->buf,
chomp_prefix ? ls_tree_prefix : NULL,
stdout, line_termination);
strbuf_setlen(base, baselen);
return retval;
return recurse;
}
static int show_tree_object(const struct object_id *oid, struct strbuf *base,
const char *pathname, unsigned mode, void *context)
{
int early;
int recurse;
struct show_tree_data data = { 0 };
early = show_tree_common(&data, &recurse, oid, base, pathname, mode);
if (early >= 0)
return early;
printf("%s%c", find_unique_abbrev(oid, abbrev), line_termination);
return recurse;
}
struct ls_tree_cmdmode_to_fmt {
enum ls_tree_cmdmode mode;
const char *const fmt;
read_tree_fn_t fn;
};
static struct ls_tree_cmdmode_to_fmt ls_tree_cmdmode_format[] = {
{
.mode = MODE_DEFAULT,
.fmt = "%(objectmode) %(objecttype) %(objectname)%x09%(path)",
.fn = show_tree_default,
},
{
.mode = MODE_LONG,
.fmt = "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)",
.fn = show_tree_long,
},
{
.mode = MODE_NAME_ONLY, /* And MODE_NAME_STATUS */
.fmt = "%(path)",
.fn = show_tree_name_only,
},
{
.mode = MODE_OBJECT_ONLY,
.fmt = "%(objectname)",
.fn = show_tree_object
},
{
/* fallback */
.fn = show_tree_default,
},
};
int cmd_ls_tree(int argc, const char **argv, const char *prefix)
{
struct object_id oid;
struct tree *tree;
int i, full_tree = 0;
read_tree_fn_t fn = NULL;
const struct option ls_tree_options[] = {
OPT_BIT('d', NULL, &ls_options, N_("only show trees"),
LS_TREE_ONLY),
@ -133,20 +341,26 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
LS_SHOW_TREES),
OPT_SET_INT('z', NULL, &line_termination,
N_("terminate entries with NUL byte"), 0),
OPT_BIT('l', "long", &ls_options, N_("include object size"),
LS_SHOW_SIZE),
OPT_BIT(0, "name-only", &ls_options, N_("list only filenames"),
LS_NAME_ONLY),
OPT_BIT(0, "name-status", &ls_options, N_("list only filenames"),
LS_NAME_ONLY),
OPT_CMDMODE('l', "long", &cmdmode, N_("include object size"),
MODE_LONG),
OPT_CMDMODE(0, "name-only", &cmdmode, N_("list only filenames"),
MODE_NAME_ONLY),
OPT_CMDMODE(0, "name-status", &cmdmode, N_("list only filenames"),
MODE_NAME_STATUS),
OPT_CMDMODE(0, "object-only", &cmdmode, N_("list only objects"),
MODE_OBJECT_ONLY),
OPT_SET_INT(0, "full-name", &chomp_prefix,
N_("use full path names"), 0),
OPT_BOOL(0, "full-tree", &full_tree,
N_("list entire tree; not just current directory "
"(implies --full-name)")),
OPT_STRING_F(0, "format", &format, N_("format"),
N_("format to use for the output"),
PARSE_OPT_NONEG),
OPT__ABBREV(&abbrev),
OPT_END()
};
struct ls_tree_cmdmode_to_fmt *m2f = ls_tree_cmdmode_format;
git_config(git_default_config, NULL);
ls_tree_prefix = prefix;
@ -159,11 +373,23 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
ls_tree_prefix = prefix = NULL;
chomp_prefix = 0;
}
/*
* We wanted to detect conflicts between --name-only and
* --name-status, but once we're done with that subsequent
* code should only need to check the primary name.
*/
if (cmdmode == MODE_NAME_STATUS)
cmdmode = MODE_NAME_ONLY;
/* -d -r should imply -t, but -d by itself should not have to. */
if ( (LS_TREE_ONLY|LS_RECURSIVE) ==
((LS_TREE_ONLY|LS_RECURSIVE) & ls_options))
ls_options |= LS_SHOW_TREES;
if (format && cmdmode)
usage_msg_opt(
_("--format can't be combined with other format-altering options"),
ls_tree_usage, ls_tree_options);
if (argc < 1)
usage_with_options(ls_tree_usage, ls_tree_options);
if (get_oid(argv[0], &oid))
@ -185,6 +411,24 @@ int cmd_ls_tree(int argc, const char **argv, const char *prefix)
tree = parse_tree_indirect(&oid);
if (!tree)
die("not a tree object");
return !!read_tree(the_repository, tree,
&pathspec, show_tree, NULL);
/*
* The generic show_tree_fmt() is slower than the mode-specific
* show_tree_*() callbacks, so take the fast path if possible.
*/
while (m2f) {
if (!m2f->fmt) {
fn = format ? show_tree_fmt : show_tree_default;
} else if (format && !strcmp(format, m2f->fmt)) {
cmdmode = m2f->mode;
fn = m2f->fn;
} else if (!format && cmdmode == m2f->mode) {
fn = m2f->fn;
} else {
m2f++;
continue;
}
break;
}
return !!read_tree(the_repository, tree, &pathspec, fn, NULL);
}
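
The fast-path selection at the end of `cmd_ls_tree()` can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the code from this commit: the `demo_*` names are hypothetical, the callbacks are replaced by an enum so the example stays self-contained, and the table is walked with a bounded loop instead of relying on a trailing fallback entry.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical mirror of the command modes. */
enum demo_mode {
	DEMO_DEFAULT = 0,
	DEMO_LONG,
	DEMO_NAME_ONLY,
	DEMO_OBJECT_ONLY
};

/* Which output routine would be used; stands in for a callback. */
enum demo_path {
	DEMO_FAST_DEFAULT,
	DEMO_FAST_LONG,
	DEMO_FAST_NAME,
	DEMO_FAST_OBJECT,
	DEMO_GENERIC
};

struct demo_mode_to_fmt {
	enum demo_mode mode;
	const char *fmt;
	enum demo_path path;
};

static const struct demo_mode_to_fmt demo_table[] = {
	{ DEMO_DEFAULT,
	  "%(objectmode) %(objecttype) %(objectname)%x09%(path)",
	  DEMO_FAST_DEFAULT },
	{ DEMO_LONG,
	  "%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)",
	  DEMO_FAST_LONG },
	{ DEMO_NAME_ONLY, "%(path)", DEMO_FAST_NAME },
	{ DEMO_OBJECT_ONLY, "%(objectname)", DEMO_FAST_OBJECT },
};

/*
 * Pick the output routine: a --format string that exactly matches a
 * built-in format takes the same fast path as the equivalent option;
 * any other --format string falls back to the generic formatter.
 */
static enum demo_path demo_pick(const char *format, enum demo_mode mode)
{
	size_t i;

	assert(mode <= DEMO_OBJECT_ONLY);
	for (i = 0; i < sizeof(demo_table) / sizeof(demo_table[0]); i++) {
		const struct demo_mode_to_fmt *m = &demo_table[i];

		if (format ? !strcmp(format, m->fmt) : mode == m->mode)
			return m->path;
	}
	return format ? DEMO_GENERIC : DEMO_FAST_DEFAULT;
}
```

The comparison is an exact string match, which is why the tests in `t/t3104-ls-tree-format.sh` can force the generic path simply by prefixing the format with `> `.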


@ -15,7 +15,7 @@ constant fmt !~ "%";
@@
expression E;
struct strbuf SB;
format F =~ "s";
format F =~ "^s$";
@@
- strbuf_addf(E, "%@F@", SB.buf);
+ strbuf_addbuf(E, &SB);
@ -23,7 +23,7 @@ format F =~ "s";
@@
expression E;
struct strbuf *SBP;
format F =~ "s";
format F =~ "^s$";
@@
- strbuf_addf(E, "%@F@", SBP->buf);
+ strbuf_addbuf(E, SBP);
@ -44,7 +44,7 @@ struct strbuf *SBP;
@@
expression E1, E2;
format F =~ "s";
format F =~ "^s$";
@@
- strbuf_addf(E1, "%@F@", E2);
+ strbuf_addstr(E1, E2);
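
Anchoring the pattern as `^s$` restricts the rewrite to a bare `%s`, so a padded conversion such as the `%7s` used for the size column is left alone. The following sketch shows why the two are not interchangeable. It uses plain `snprintf()` rather than git's strbuf API, and `demo_size_column` is a hypothetical helper that only mirrors the shape of `expand_objectsize()`.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write the size column for one entry into out. Blobs show their size,
 * everything else shows "-". With padded set, the value is
 * right-justified to a minimum width of 7, as in the "-l" output.
 * Returns what snprintf() returns.
 */
static int demo_size_column(char *out, size_t cap, int is_blob,
			    unsigned long size, int padded)
{
	assert(out && cap > 0);
	if (is_blob) {
		if (padded)
			return snprintf(out, cap, "%7lu", size);
		return snprintf(out, cap, "%lu", size);
	}
	if (padded)
		return snprintf(out, cap, "%7s", "-");	/* not the same as "%s" */
	return snprintf(out, cap, "%s", "-");
}
```

Replacing the `%7s` call with a plain string append would drop the six leading spaces, which is the behavior change the tightened rule avoids.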


@ -201,31 +201,34 @@ EOF
test_cmp expected check
'
test_expect_success 'ls-tree --name-only' '
git ls-tree --name-only $tree >current &&
cat >expected <<\EOF &&
1.txt
2.txt
path0
path1
path2
path3
EOF
test_output
'
for opt in --name-only --name-status
do
test_expect_success "ls-tree $opt" '
git ls-tree $opt $tree >current &&
cat >expected <<-\EOF &&
1.txt
2.txt
path0
path1
path2
path3
EOF
test_output
'
test_expect_success 'ls-tree --name-only -r' '
git ls-tree --name-only -r $tree >current &&
cat >expected <<\EOF &&
1.txt
2.txt
path0/a/b/c/1.txt
path1/b/c/1.txt
path2/1.txt
path3/1.txt
path3/2.txt
EOF
test_output
'
test_expect_success "ls-tree $opt -r" '
git ls-tree $opt -r $tree >current &&
cat >expected <<-\EOF &&
1.txt
2.txt
path0/a/b/c/1.txt
path1/b/c/1.txt
path2/1.txt
path3/1.txt
path3/2.txt
EOF
test_output
'
done
test_done


@ -23,4 +23,19 @@ test_expect_success 'ls-tree fails with non-zero exit code on broken tree' '
test_must_fail git ls-tree -r HEAD
'
for opts in \
"--long --name-only" \
"--name-only --name-status" \
"--name-status --object-only" \
"--object-only --long"
do
test_expect_success "usage: incompatible options: $opts" '
test_expect_code 129 git ls-tree $opts $tree
'
one_opt=$(echo "$opts" | cut -d' ' -f1)
test_expect_success "usage: incompatible options: $one_opt and --format" '
test_expect_code 129 git ls-tree $one_opt --format=fmt $tree
'
done
test_done

t/t3104-ls-tree-format.sh Executable file

@ -0,0 +1,76 @@
#!/bin/sh
test_description='ls-tree --format'
TEST_PASSES_SANITIZE_LEAK=true
. ./test-lib.sh
test_expect_success 'ls-tree --format usage' '
test_expect_code 129 git ls-tree --format=fmt -l HEAD &&
test_expect_code 129 git ls-tree --format=fmt --name-only HEAD &&
test_expect_code 129 git ls-tree --format=fmt --name-status HEAD
'
test_expect_success 'setup' '
mkdir dir &&
test_commit dir/sub-file &&
test_commit top-file
'
test_ls_tree_format () {
format=$1 &&
opts=$2 &&
fmtopts=$3 &&
shift 2 &&
test_expect_success "ls-tree '--format=<$format>' is like options '$opts $fmtopts'" '
git ls-tree $opts -r HEAD >expect &&
git ls-tree --format="$format" -r $fmtopts HEAD >actual &&
test_cmp expect actual
'
test_expect_success "ls-tree '--format=<$format>' on optimized vs. non-optimized path" '
git ls-tree --format="$format" -r $fmtopts HEAD >expect &&
git ls-tree --format="> $format" -r $fmtopts HEAD >actual.raw &&
sed "s/^> //" >actual <actual.raw &&
test_cmp expect actual
'
}
test_ls_tree_format \
"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
""
test_ls_tree_format \
"%(objectmode) %(objecttype) %(objectname) %(objectsize:padded)%x09%(path)" \
"--long"
test_ls_tree_format \
"%(path)" \
"--name-only"
test_ls_tree_format \
"%(objectname)" \
"--object-only"
test_ls_tree_format \
"%(objectname)" \
"--object-only --abbrev" \
"--abbrev"
test_ls_tree_format \
"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
"-t" \
"-t"
test_ls_tree_format \
"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
"--full-name" \
"--full-name"
test_ls_tree_format \
"%(objectmode) %(objecttype) %(objectname)%x09%(path)" \
"--full-tree" \
"--full-tree"
test_done