"git --attr-source=<tree> cmd $args" is a new way to have any
command to read attributes not from the working tree but from the
given tree object.
* jc/attr-source-tree:
attr: teach "--attr-source=<tree>" global option to "git"
Earlier, 47cfc9bd (attr: add flag `--source` to work with tree-ish,
2023-01-14) taught "git check-attr" the "--source=<tree>" option to
allow it to read attribute files from a tree-ish, but did so only
for the command. Just like "check-attr" users wanted a way to use
attributes from a tree-ish and not from the working tree files,
users of other commands (like "git diff") would benefit from the
same.
Undo most of the UI change the commit made, while keeping the
internal logic to read attributes from a given tree-ish. Expose the
internal logic via a new "--attr-source=<tree>" command line option
given to "git", so that it can be used with any git command that
runs as part of the main git process.
Additionally, add an environment variable GIT_ATTR_SOURCE that is set
when --attr-source is passed in, so that subprocesses use the same value
for the attributes source tree.
Signed-off-by: John Cai <johncai86@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git check-attr" learned to take an optional tree-ish to read the
.gitattributes file from.
* kn/attr-from-tree:
attr: add flag `--source` to work with tree-ish
t0003: move setup for `--all` into new block
The contents of the .gitattributes files may evolve over time, but "git
check-attr" always checks attributes against them in the working tree
and/or in the index. It may be beneficial to optionally allow the users
to check attributes taken from a commit other than HEAD against paths.
Add a new flag `--source` which will allow users to check the
attributes against a commit (actually any tree-ish would do). When the
user uses this flag, we go through the stack of .gitattributes files but
instead of checking the current working tree and/or in the index, we
check the blobs from the provided tree-ish object. This allows the
command to also be used in bare repositories.
Since we use a tree-ish object, the user can pass "--source
HEAD:subdirectory" and all the attributes will be looked up as if
subdirectory was the root directory of the repository.
We cannot simply use the `<rev>:<path>` syntax without the `--source`
flag, similar to how it is used in `git show` because any non-flag
parameter before `--` is treated as an attribute and any parameter after
`--` is treated as a pathname.
The change involves creating a new function `read_attr_from_blob`, which
given the path reads the blob for the path against the provided source and
parses the attributes line by line. This function is plugged into
`read_attr()` function wherein we go through the stack of attributes
files.
Signed-off-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: Toon Claes <toon@iotcl.com>
Co-authored-by: toon@iotcl.com
Signed-off-by: Junio C Hamano <gitster@pobox.com>
As with the previous patch, using skip_prefix() is more readable and
less error-prone than a raw strncmp(), because it avoids a
manually-computed length. These cases differ from the previous patch
that uses starts_with() because they care about the value after the
matched prefix.
We can convert these to use skip_prefix() by introducing an extra
variable to hold the out-pointer.
Note in the case in ws.c that to get rid of the magic number "9"
completely, we also switch out "len" for recomputing the pointer
difference. These are equivalent because "len" is always "ep - string".
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
We take a ws_rule parameter, but have never looked at it since the
function was added in 877f23ccb8 (Teach "diff --check" about new blank
lines at end, 2008-06-26). A comment in the function does mention how we
_could_ use it, but nobody has felt the need to do so for over a decade.
We could keep it around as reminder of what could be done, but the
comment serves that purpose. And in the meantime, it triggers
-Wunused-parameter.
So let's drop it, which in turn allows us to drop similar arguments
further up the callstack. I've left the comment intact. It does still
say "ws_rule", but that name is used consistently in the whitespace
code, so the meaning is clear.
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Various codepaths in the core-ish part learn to work on an
arbitrary in-core index structure, not necessarily the default
instance "the_index".
* nd/the-index: (23 commits)
revision.c: reduce implicit dependency the_repository
revision.c: remove implicit dependency on the_index
ws.c: remove implicit dependency on the_index
tree-diff.c: remove implicit dependency on the_index
submodule.c: remove implicit dependency on the_index
line-range.c: remove implicit dependency on the_index
userdiff.c: remove implicit dependency on the_index
rerere.c: remove implicit dependency on the_index
sha1-file.c: remove implicit dependency on the_index
patch-ids.c: remove implicit dependency on the_index
merge.c: remove implicit dependency on the_index
merge-blobs.c: remove implicit dependency on the_index
ll-merge.c: remove implicit dependency on the_index
diff-lib.c: remove implicit dependency on the_index
read-cache.c: remove implicit dependency on the_index
diff.c: remove implicit dependency on the_index
grep.c: remove implicit dependency on the_index
diff.c: remove the_index dependency in textconv() functions
blame.c: rename "repo" argument to "r"
combine-diff.c: remove implicit dependency on the_index
...
git_check_attr() returns always 0.
Remove all the error handling code of the callers, which is never executed.
Change git_check_attr() to be a void function.
Signed-off-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Make the attr API take an index_state instead of assuming the_index in
attr code. All call sites are converted blindly to keep the patch
simple and retain current behavior. Individual call sites may receive
further updates to use the right index instead of the_index.
There is one ugly temporary workaround added in attr.c that needs some
more explanation.
Commit c24f3abace (apply: file commited with CRLF should roundtrip
diff and apply - 2017-08-19) forces one convert_to_git() call to NOT
read the index at all. But what do you know, we read it anyway by
falling back to the_index. When "istate" from convert_to_git is now
propagated down to read_attr_from_array() we will hit segfault
somewhere inside read_blob_data_from_index.
The right way of dealing with this is to kill "use_index" variable and
only follow "istate" but at this stage we are not ready for that:
while most git_attr_set_direction() calls just passes the_index to be
assigned to use_index, unpack-trees passes a different one which is
used by entry.c code, which has no way to know what index to use if we
delete use_index. So this has to be done later.
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The remaining callers are all simple "I have N attributes I am
interested in. I'll ask about them with various paths one by one".
After this step, no caller to git_check_attrs() remains. After
removing it, we can extend "struct attr_check" struct with data
that can be used in optimizing the query for the specific N
attributes it contains.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The traditional API to check attributes is to prepare an N-element
array of "struct git_attr_check" and pass N and the array to the
function "git_check_attr()" as arguments.
In preparation to revamp the API to pass a single structure, in
which these N elements are held, rename the type used for these
individual array elements to "struct attr_check_item" and rename
the function to "git_check_attrs()".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Avoid scanning strings twice, once with strchr() and then with
strlen(), by using strchrnul().
Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Rohit Mani <rohit.mani@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Suggested by: Junio Hamano <gitster@pobox.com>
Signed-off-by: Michael Haggerty <mhagger@alum.mit.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A new whitespace "rule" is added that sets the tab width to use for
whitespace checks and fix-ups and replaces the hard-coded constant 8.
Since the setting is part of the rules, it can be set per file using
.gitattributes.
The new configuration is backwards compatible because older git versions
simply ignore unknown whitespace rules.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When the whitespace rule tab-in-indent is enabled, apply --whitespace=fix
replaces tabs by the appropriate amount of blanks. The code used
"dst->len % 8" as the criterion to stop adding blanks. But it forgot that
dst holds more than just the current line. Consequently, the modulus was
computed correctly only for the first added line, but not for the second
and subsequent lines. Fix it.
Signed-off-by: Johannes Sixt <j6t@kdbg.org>
Acked-by: Chris Webb <chris@arachsys.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a line contains nothing but whitespace with at least one tab
and the core.whitespace config option contains blank-at-eol, the
whitespace on the line is being printed twice, once unhighlighted
(unless otherwise matched by one of the other core.whitespace values),
and a second time highlighted for blank-at-eol.
Update the leading indentation check to stop checking when it reaches
the trailing whitespace.
Signed-off-by: Kevin Ballard <kevin@sb.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If tab-in-indent is set, --whitespace=fix will ensure that any stray tabs in
the initial indent are expanded to the correct number of space characters.
Signed-off-by: Chris Webb <chris@arachsys.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
To implement --whitespace=fix for tab-in-indent, we have to allow for the
possibility that whitespace can increase in size when it is fixed, expanding
tabs to to multiple spaces in the initial indent.
Signed-off-by: Chris Webb <chris@arachsys.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Some projects and languages use coding style where no tab character is used to
indent the lines.
This only adds support and documentation for "apply --whitespace=warn" and
"diff --check"; later patches add "apply --whitespace=fix" and tests.
Signed-off-by: Chris Webb <chris@arachsys.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Traditionally, "*.txt whitespace" in .gitattributes file has been an
instruction to catch _all_ classes of whitespace errors known to git.
This has to change, however, in order to introduce "tab-in-indent" which
is inherently incompatible with "indent-with-non-tab". As we do not want
to break configuration of existing users, add a mechanism to allow marking
selected rules to be excluded from "all rules known to git".
Signed-off-by: Chris Webb <chris@arachsys.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function took (name, namelen) as its arguments, but all the public
callers wanted to pass a full string.
Demote the counted-string interface to an internal API status, and allow
public callers to just pass the string to the function.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
* 'jc/maint-1.6.0-blank-at-eof' (early part):
diff --whitespace: fix blank lines at end
core.whitespace: split trailing-space into blank-at-{eol,eof}
diff --color: color blank-at-eof
diff --whitespace=warn/error: fix blank-at-eof check
diff --whitespace=warn/error: obey blank-at-eof
diff.c: the builtin_diff() deals with only two-file comparison
apply --whitespace: warn blank but not necessarily empty lines at EOF
apply --whitespace=warn/error: diagnose blank at EOF
apply.c: split check_whitespace() into two
apply --whitespace=fix: detect new blank lines at eof correctly
apply --whitespace=fix: fix handling of blank lines at the eof
People who configured trailing-space depended on it to catch both extra
white space at the end of line, and extra blank lines at the end of file.
Earlier attempt to introduce only blank-at-eof gave them an escape hatch
to keep the old behaviour, but it is a regression until they explicitly
specify the new error class.
This introduces a blank-at-eol that only catches extra white space at the
end of line, and makes the traditional trailing-space a convenient synonym
to catch both blank-at-eol and blank-at-eof. This way, people who used
trailing-space continue to catch both classes of errors.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git apply" strips new blank lines at EOF under --whitespace=fix option,
but neigher --whitespace=warn nor --whitespace=error paid any attention to
these errors.
Introduce a new whitespace error class, blank-at-eof, to make the
whitespace error handling more consistent.
The patch adds a new "linenr" field to the struct fragment in order to
record which line the hunk started in the input file, but this is needed
solely for reporting purposes. The detection of this class of whitespace
errors cannot be done while parsing a patch like we do for all the other
classes of whitespace errors. It instead has to wait until we find where
to apply the hunk, but at that point, we do not have an access to the
original line number in the input file anymore, hence the new field.
Depending on your point of view, this may be a bugfix that makes warn and
error in line with fix. Or you could call it a new feature. The line
between them is somewhat fuzzy in this case.
Strictly speaking, triggering more errors than before is a change in
behaviour that is not backward compatible, even though the reason for the
change is because the code was not checking for an error that it should
have. People who do not want added blank lines at EOF to trigger an error
can disable the new error class.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
735c674 (Trailing whitespace and no newline fix, 2009-07-22) completely
broke --whitespace=fix, causing it to lose all the empty lines in a patch.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If a patch adds a new line to the end of a file and this line ends with
one trailing whitespace character and has no newline, then
'--whitespace=fix' currently does not remove that trailing whitespace.
This patch fixes this by removing the check for trailing whitespace at
the end of the line at a hardcoded offset which does not take the
eventual absence of newline into account.
Signed-off-by: SZEDER Gábor <szeder@ira.uka.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
That is what the documentation says, but the code pretends as if all the
known whitespace error tokens were given.
Among the whitespace error tokens, there is one kind that loosens the rule
when set: cr-at-eol. Which means that whitespace error token that is set
to true ignores a newly introduced CR at the end, which is inconsistent
with the documentation.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Many call sites use strbuf_init(&foo, 0) to initialize local
strbuf variable "foo" which has not been accessed since its
declaration. These can be replaced with a static initialization
using the STRBUF_INIT macro which is just as readable, saves a
function call, and takes up fewer lines.
Signed-off-by: Brandon Casey <casey@nrlssc.navy.mil>
Signed-off-by: Shawn O. Pearce <spearce@spearce.org>
When a patch adds new blank lines at the end, "git apply --whitespace"
warns. This teaches "diff --check" to do the same.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The function name was too bland and not explicit enough as to what it is
checking. Split it into two, and call the one that checks if there is a
whitespace breakage "ws_check()", and call the other one that checks and
emits the line after color coding "ws_check_emit()".
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When a patch adds a whitespace followed by end-of-line, the
trailing whitespace error was detected correctly but was not
fixed, due to misconversion in 42ab241 (builtin-apply.c: do not
feed copy_wsfix() leading '+').
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This new error mode allows a line to have a carriage return at the
end of the line when checking and fixing trailing whitespace errors.
Some people like to keep CRLF line ending recorded in the repository,
and still want to take advantage of the automated trailing whitespace
stripping. We still show ^M in the diff output piped to "less" to
remind them that they do have the CR at the end, but these carriage
return characters at the end are no longer flagged as errors.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Instead of highlighting the entire initial indent, highlight only the
problematic spaces.
In the case of an indent like ' \t \t' there may be multiple problematic
ranges, so it's easiest to emit the highlighting as we go instead of
trying rember disjoint ranges and do it all at the end.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
After this patch, "written" counts the number of bytes up to and
including the most recently seen tab. This allows us to detect (and
count) spaces by comparing to "i".
This allows catching initial indents like '\t ' (a tab followed
by 8 spaces), while previously indent-with-non-tab caught only indents
that consisted entirely of spaces.
This also allows fixing an indent-with-non-tab regression, so we can
again detect indents like '\t \t'.
Also update tests to catch these cases.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The variable leading_space is initially used to represent the index of
the last space seen before a non-space. Then later it represents the
index of the first non-indent character.
It will prove simpler to replace it by a variable representing a number
of bytes. Eventually it will represent the number of bytes written so
far (in the stream != NULL case).
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Reorganize to emphasize the most complicated part of the code (the tab
case).
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
If there were no tabs, and the last space was at position 7, then
positions 0..7 had spaces, so there were 8 spaces.
Update test to check exactly this case.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The initial version of the whitespace_error_string() function took the
messages from builtin-apply.c rather than the shorter messages from
diff.c.
This commit addresses Junio's concern that these messages might be too
long (now that we can emit multiple warnings per line).
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
This commit unifies three separate places where whitespace checking was
performed:
- the whitespace checking previously done in builtin-apply.c is
extracted into a function in ws.c
- the equivalent logic in "git diff" is removed
- the emit_line_with_ws() function is also removed because that also
rechecks the whitespace, and its functionality is rolled into ws.c
The new function is called check_and_emit_line() and it does two things:
checks a line for whitespace errors and optionally emits it. The checking
is based on lines of content rather than patch lines (in other words, the
caller must strip the leading "+" or "-"); this was suggested by Junio on
the mailing list to allow for a future extension to "git show" to display
whitespace errors in blobs.
At the same time we teach it to report all classes of whitespace errors
found for a given line rather than reporting only the first found error.
Signed-off-by: Wincent Colaiuta <win@wincent.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The `core.whitespace` configuration variable allows you to define what
`diff` and `apply` should consider whitespace errors for all paths in
the project (See gitlink:git-config[1]). This attribute gives you finer
control per path.
For example, if you have these in the .gitattributes:
frotz whitespace
nitfol -whitespace
xyzzy whitespace=-trailing
all types of whitespace problems known to git are noticed in path 'frotz'
(i.e. diff shows them in diff.whitespace color, and apply warns about
them), no whitespace problem is noticed in path 'nitfol', and the
default types of whitespace problems except "trailing whitespace" are
noticed for path 'xyzzy'. A project with mixed Python and C might want
to have:
*.c whitespace
*.py whitespace=-indent-with-non-tab
in its toplevel .gitattributes file.
Signed-off-by: Junio C Hamano <gitster@pobox.com>