Fixes a handful of issues in the code to traverse working tree to
find untracked and/or ignored files, cleans up and optimizes the
codepath in general.
* kb/status-ignored-optim-2:
dir.c: git-status --ignored: don't scan the work tree twice
dir.c: git-status --ignored: don't scan the work tree three times
dir.c: git-status: avoid is_excluded checks for tracked files
dir.c: replace is_path_excluded with now equivalent is_excluded API
dir.c: unify is_excluded and is_path_excluded APIs
dir.c: move prep_exclude
dir.c: factor out parts of last_exclude_matching for later reuse
dir.c: git-clean -d -X: don't delete tracked directories
dir.c: make 'git-status --ignored' work within leading directories
dir.c: git-status --ignored: don't list empty directories as ignored
dir.c: git-ls-files --directories: don't hide empty directories
dir.c: git-status --ignored: don't list empty ignored directories
dir.c: git-status --ignored: don't list files in ignored directories
dir.c: git-status --ignored: don't drop ignored directories
* ta/glossary:
glossary: improve definitions of refspec and pathspec
The name of the hash function is "SHA-1", not "SHA1"
glossary: improve description of SHA-1 related topics
glossary: remove outdated/misleading/irrelevant entries
Teach "--human-readable" aka "-H" option to "git count-objects" to
show various large numbers in Ki/Mi/GiB scaled as necessary.
* ap/strbuf-humanize:
count-objects: add -H option to humanize sizes
strbuf: create strbuf_humanise_bytes() to show byte sizes
'git-status --ignored' still scans the work tree twice to collect
untracked and ignored files, respectively.
fill_directory / read_directory already supports collecting untracked and
ignored files in a single directory scan. However, the DIR_COLLECT_IGNORED
flag to enable this has some git-add specific side-effects (e.g. it
doesn't recurse into ignored directories, so listing ignored files with
--untracked=all doesn't work).
The DIR_SHOW_IGNORED flag doesn't list untracked files and returns ignored
files in dir_struct.entries[] (instead of dir_struct.ignored[] as
DIR_COLLECT_IGNORED). DIR_SHOW_IGNORED is used all throughout git.
We don't want to break the existing API, so lets introduce a new flag
DIR_SHOW_IGNORED_TOO that lists untracked as well as ignored files similar
to DIR_COLLECT_FILES, but will recurse into sub-directories based on the
other flags as DIR_SHOW_IGNORED does.
In dir.c::read_directory_recursive, add ignored files to either
dir_struct.entries[] or dir_struct.ignored[] based on the flags. Also move
the DIR_COLLECT_IGNORED case here so that filling result lists is in a
common place.
In wt-status.c::wt_status_collect_untracked, use the new flag and read
results from dir_struct.ignored[]. Remove the extra fill_directory call.
builtin/check-ignore.c doesn't call fill_directory, setting the git-add
specific DIR_COLLECT_IGNORED flag has no effect here. Remove for clarity.
Update API documentation to reflect the changes.
Performance: with this patch, 'git-status --ignored' is typically as fast
as 'git-status'.
Signed-off-by: Karsten Blees <blees@dcon.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use "SHA-1" instead of "SHA1" whenever we talk about the hash function.
When used as a programming symbol, we keep "SHA1".
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Most of these were found using Lucas De Marchi's codespell tool.
Signed-off-by: Stefano Lattarini <stefano.lattarini@gmail.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
eb32d236 introduced the OBJ_OFS_DELTA object that uses a relative offset to
identify the base object instead of the 20-byte SHA1 reference. The pack file
documentation only mentions the SHA1 based reference in its description of the
deltified object entry.
Update the pack format documentation to clarify that the deltified object
representation refers to its base using either a relative negative offset or
the absolute SHA1 identifier.
Signed-off-by: Stefan Saasen <ssaasen@atlassian.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Humanization of downloaded size is done in the same function as text
formatting in 'process.c'. The code cannot be reused easily elsewhere.
Separate text formatting from size simplification and make the
function public in strbuf so that it can easily be used by other
callers.
We now can use strbuf_humanise_bytes() for both downloaded size and
download speed calculation. One of the drawbacks is that speed will
now look like this when download is stalled: "0 bytes/s" instead of
"0 KiB/s".
Signed-off-by: Antoine Pelisse <apelisse@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* maint-1.8.1:
bundle: Add colons to list headings in "verify"
bundle: Fix "verify" output if history is complete
Documentation: filter-branch env-filter example
git-filter-branch.txt: clarify ident variables usage
git-compat-util.h: Provide missing netdb.h definitions
describe: Document --match pattern format
Documentation/githooks: Explain pre-rebase parameters
update-index: list supported idx versions and their features
diff-options: unconfuse description of --color
read-cache.c: use INDEX_FORMAT_{LB,UB} in verify_hdr()
index-format.txt: mention of v4 is missing in some places
Update the index format documentation to mention the v4 format.
* nd/doc-index-format:
update-index: list supported idx versions and their features
read-cache.c: use INDEX_FORMAT_{LB,UB} in verify_hdr()
index-format.txt: mention of v4 is missing in some places
Update documentation to change "GIT" which was a poor-man's small
caps to "Git". The latter was the intended spelling.
Also change "git" spelled in all-lowercase to "Git" when it refers
to the system as the whole or the concept it embodies, as opposed to
the command the end users would type.
* ta/doc-no-small-caps:
Documentation: StGit is the right spelling, not StGIT
Documentation: describe the "repository" in repository-layout
Documentation: add a description for 'gitfile' to glossary
Documentation: do not use undefined terms git-dir and git-file
Documentation: the name of the system is 'Git', not 'git'
Documentation: avoid poor-man's small caps GIT
Allow a configuration variable core.commentchar to customize the
character used to comment out the hint lines in the edited text from
the default '#'.
* jc/custom-comment-char:
Allow custom "comment char"
In the earlier days, we used to spell the name of the system as GIT,
to simulate as if it were typeset with capital G and IT in small
caps. Later we stopped doing so at around 1.6.5 days.
Let's stop doing so throughout the documentation. The name to refer
to the whole system (and the concept it embodies) is "Git"; the
command end-users type is "git". And document this in the coding
guideline.
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git fetch --depth" was broken in at least three ways. The
resulting history was deeper than specified by one commit, it was
unclear how to wipe the shallowness of the repository with the
command, and documentation was misleading.
* nd/fetch-depth-is-broken:
fetch: elaborate --depth action
upload-pack: fix off-by-one depth calculation in shallow clone
fetch: add --unshallow for turning shallow repo into complete one
An element on GIT_CEILING_DIRECTORIES list that does not name the
real path to a directory (i.e. a symbolic link) could have caused
the GIT_DIR discovery logic to escape the ceiling.
* mh/ceiling:
string_list_longest_prefix(): remove function
setup_git_directory_gently_1(): resolve symlinks in ceiling paths
longest_ancestor_length(): require prefix list entries to be normalized
longest_ancestor_length(): take a string_list argument for prefixes
longest_ancestor_length(): use string_list_split()
Introduce new function real_path_if_valid()
real_path_internal(): add comment explaining use of cwd
Introduce new static function real_path_internal()
Add a new command "git check-ignore" for debugging .gitignore
files.
The variable names may want to get cleaned up but that can be done
in-tree.
* as/check-ignore:
clean.c, ls-files.c: respect encapsulation of exclude_list_groups
t0008: avoid brace expansion
add git-check-ignore sub-command
setup.c: document get_pathspec()
add.c: extract new die_if_path_beyond_symlink() for reuse
add.c: extract check_path_for_gitlink() from treat_gitlinks() for reuse
pathspec.c: rename newly public functions for clarity
add.c: move pathspec matchers into new pathspec.c for reuse
add.c: remove unused argument from validate_pathspec()
dir.c: improve docs for match_pathspec() and match_pathspec_depth()
dir.c: provide clear_directory() for reclaiming dir_struct memory
dir.c: keep track of where patterns came from
dir.c: use a single struct exclude_list per source of excludes
Conflicts:
builtin/ls-files.c
dir.c
Some users do want to write a line that begin with a pound sign, #,
in their commit log message. Many tracking system recognise
a token of #<bugid> form, for example.
The support we offer these use cases is not very friendly to the end
users. They have a choice between
- Don't do it. Avoid such a line by rewrapping or indenting; and
- Use --cleanup=whitespace but remove all the hint lines we add.
Give them a way to set a custom comment char, e.g.
$ git -c core.commentchar="%" commit
so that they do not have to do either of the two workarounds.
[jc: although I started the topic, all the tests and documentation
updates, many of the call sites of the new strbuf_add_commented_*()
functions, and the change to git-submodule.sh scripted Porcelain are
from Ralf.]
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Ralf Thielow <ralf.thielow@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The internal logic had to deal with two representations of a death
of a child process by a signal.
* jk/unify-exit-code-by-receiving-signal:
run-command: encode signal death as a positive integer
The user can do --depth=2147483647 (*) for restoring complete repo
now. But it's hard to remember. Any other numbers larger than the
longest commit chain in the repository would also do, but some
guessing may be involved. Make easy-to-remember --unshallow an alias
for --depth=2147483647.
Make upload-pack recognize this special number as infinite depth. The
effect is essentially the same as before, except that upload-pack is
more efficient because it does not have to traverse to the bottom
anymore.
The chance of a user actually wanting exactly 2147483647 commits
depth, not infinite, on a repository with a history that long, is
probably too small to consider. The client can learn to add or
subtract one commit to avoid the special treatment when that actually
happens.
(*) This is the largest positive number a 32-bit signed integer can
contain. JGit and older C Git store depth as "int" so both are OK
with this number. Dulwich does not support shallow clone.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Refactor and generally clean up the directory traversal API
implementation.
* as/dir-c-cleanup:
dir.c: rename free_excludes() to clear_exclude_list()
dir.c: refactor is_path_excluded()
dir.c: refactor is_excluded()
dir.c: refactor is_excluded_from_list()
dir.c: rename excluded() to is_excluded()
dir.c: rename excluded_from_list() to is_excluded_from_list()
dir.c: rename path_excluded() to is_path_excluded()
dir.c: rename cryptic 'which' variable to more consistent name
Improve documentation and comments regarding directory traversal API
api-directory-listing.txt: update to match code
By the end of a directory traversal, a dir_struct instance will
typically contains pointers to various data structures on the heap.
clear_directory() provides a convenient way to reclaim that memory.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Previously each exclude_list could potentially contain patterns
from multiple sources. For example dir->exclude_list[EXC_FILE]
would typically contain patterns from .git/info/exclude and
core.excludesfile, and dir->exclude_list[EXC_DIRS] could contain
patterns from multiple per-directory .gitignore files during
directory traversal (i.e. when dir->exclude_stack was more than
one item deep).
We split these composite exclude_lists up into three groups of
exclude_lists (EXC_CMDL / EXC_DIRS / EXC_FILE as before), so that each
exclude_list now contains patterns from a single source. This will
allow us to cleanly track the origin of each pattern simply by adding
a src field to struct exclude_list, rather than to struct exclude,
which would make memory management of the source string tricky in the
EXC_DIRS case where its contents are dynamically generated.
Similarly, by moving the filebuf member from struct exclude_stack to
struct exclude_list, it allows us to track and subsequently free
memory buffers allocated during the parsing of all exclude files,
rather than only tracking buffers allocated for files in the EXC_DIRS
group.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The documentation for the ALLOC_GROW API implicitly encouraged
developers to use "ary" as the variable name for the array which is
dynamically grown. However "ary" is an unusual abbreviation hardly
used anywhere else in the source tree, and it is also better to name
variables based on their contents not on their type.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a sub-command dies due to a signal, we encode the
signal number into the numeric exit status as "signal -
128". This is easy to identify (versus a regular positive
error code), and when cast to an unsigned integer (e.g., by
feeding it to exit), matches what a POSIX shell would return
when reporting a signal death in $? or through its own exit
code.
So we have a negative value inside the code, but once it
passes across an exit() barrier, it looks positive (and any
code we receive from a sub-shell will have the positive
form). E.g., death by SIGPIPE (signal 13) will look like
-115 to us in inside git, but will end up as 141 when we
call exit() with it. And a program killed by SIGPIPE but run
via the shell will come to us with an exit code of 141.
Unfortunately, this means that when the "use_shell" option
is set, we need to be on the lookout for _both_ forms. We
might or might not have actually invoked the shell (because
we optimize out some useless shell calls). If we didn't invoke
the shell, we will will see the sub-process's signal death
directly, and run-command converts it into a negative value.
But if we did invoke the shell, we will see the shell's
128+signal exit status. To be thorough, we would need to
check both, or cast the value to an unsigned char (after
checking that it is not -1, which is a magic error value).
Fortunately, most callsites do not care at all whether the
exit was from a code or from a signal; they merely check for
a non-zero status, and sometimes propagate the error via
exit(). But for the callers that do care, we can make life
slightly easier by just using the consistent positive form.
This actually fixes two minor bugs:
1. In launch_editor, we check whether the editor died from
SIGINT or SIGQUIT. But we checked only the negative
form, meaning that we would fail to notice a signal
death exit code which was propagated through the shell.
2. In handle_alias, we assume that a negative return value
from run_command means that errno tells us something
interesting (like a fork failure, or ENOENT).
Otherwise, we simply propagate the exit code. Negative
signal death codes confuse us, and we print a useless
"unable to run alias 'foo': Success" message. By
encoding signal deaths using the positive form, the
existing code just propagates it as it would a normal
non-zero exit code.
The downside is that callers of run_command can no longer
differentiate between a signal received directly by the
sub-process, and one propagated. However, no caller
currently cares, and since we already optimize out some
calls to the shell under the hood, that distinction is not
something that should be relied upon by callers.
Fix the same logic in t/test-terminal.perl for consistency [jc:
raised by Jonathan in the discussion].
Signed-off-by: Jeff King <peff@peff.net>
Acked-by: Johannes Sixt <j6t@kdbg.org>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The --graph code fell into infinite loop when asked to do what the
code did not expect.
* mk/maint-graph-infinity-loop:
graph.c: infinite loop in git whatchanged --graph -m
An element on GIT_CEILING_DIRECTORIES list that does not name the
real path to a directory (i.e. a symbolic link) could have caused
the GIT_DIR discovery logic to escape the ceiling.
* mh/ceiling:
string_list_longest_prefix(): remove function
setup_git_directory_gently_1(): resolve symlinks in ceiling paths
longest_ancestor_length(): require prefix list entries to be normalized
longest_ancestor_length(): take a string_list argument for prefixes
longest_ancestor_length(): use string_list_split()
Introduce new function real_path_if_valid()
real_path_internal(): add comment explaining use of cwd
Introduce new static function real_path_internal()
traversal API has a few potentially confusing properties. These
comments clarify a few key aspects and will hopefully make it easier
to understand for other newcomers in the future.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
7c4c97c0ac turned the flags in struct dir_struct into a single bitfield
variable, but forgot to update this document.
Signed-off-by: Adam Spiers <git@adamspiers.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The contents of this document does not describe any particular API, but
is more about the way to add a new command, which belongs to the "How To"
section of the documentation suite.
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Start list with 1 instead of 0; ASCIIDOC will renumber it anyway.
Signed-off-by: Thomas Ackermann <th.acker@arcor.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A cache-tree entry with a negative entry count is considered invalid
by the current Git; it records that we do not know the object name
of a tree that would result by writing the directory covered by the
cache-tree as a tree object.
Clarify that any entry with a negative entry count is invalid, but
the implementations must write -1 there. This way, we can later
decide to allow writers to use negative values other than -1 to
encode optional information on such invalidated entries without
harming interoperability; we do not know what will be encoded and
how, so we keep these other negative values as reserved for now.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This document contains no new policies or proposals; it attempts to
document established practices and interface requirements.
Signed-off-by: Eric S. Raymond <esr@thyrsus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* ta/doc-cleanup:
Documentation: build html for all files in technical and howto
Documentation/howto: convert plain text files to asciidoc
Documentation/technical: convert plain text files to asciidoc
Change headline of technical/send-pack-pipeline.txt to not confuse its content with content from git-send-pack.txt
Shorten two over-long lines in git-bisect-lk2009.txt by abbreviating some sha1
Split over-long synopsis in git-fetch-pack.txt into several lines
Improve the asymptotic performance of the cat_sort_uniq notes merge
strategy.
* mh/notes-string-list:
string_list_add_refs_from_colon_sep(): use string_list_split()
notes: fix handling of colon-separated values
combine_notes_cat_sort_uniq(): sort and dedup lines all at once
Initialize sort_uniq_list using named constant
string_list: add a function string_list_remove_empty_items()
Document strbuf_split_buf(), strbuf_split_str(), strbuf_split_max(),
strbuf_split(), and strbuf_list_free() in the header file and in
api-strbuf.txt. (These functions were previously completely
undocumented.)
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Jeff King <peff@peff.net>