The index file has room only for lower 32-bit of the file size in
the cached stat information, which means cached stat information
will have 0 in its sd_size member for a file whose size is multiple
of 4GiB. This is mistaken for a racily clean path. Avoid it by
storing a bogus sd_size value instead for such files.
* bc/racy-4gb-files:
Prevent git from rehashing 4GiB files
t: add a test helper to truncate files
In a future commit, we're going to work with some large files which will
be at least 4 GiB in size. To take advantage of the sparseness
functionality on most Unix systems and avoid running the system out of
disk, it would be convenient to use truncate(2) to simply create a
sparse file of sufficient size.
However, the GNU truncate(1) utility isn't portable, so let's write a
tiny test helper that does the work for us.
Signed-off-by: brian m. carlson <bk2204@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Shuffle some bits across headers and sources to prepare for
libification effort.
* cw/prelim-cleanup:
parse: separate out parsing functions from config.h
config: correct bad boolean env value error message
wrapper: reduce scope of remove_or_warn()
hex-ll: separate out non-hash-algo functions
In a following commit, we will make it possible to separate objects in
different packfiles depending on a filter.
To make sure that the right objects are in the right packs, let's add a
new test-tool that can display which packfile(s) a given object is in.
Let's also make it possible to check if a given object is in the
expected number of packfiles with a `--check-count <n>` option.
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The files config.{h,c} contain functions that have to do with parsing,
but not config.
In order to further reduce all-in-one headers, separate out functions in
config.c that do not operate on config into its own file, parse.h,
and update the include directives in the .c files that need only such
functions accordingly.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In order to further reduce all-in-one headers, separate out functions in
hex.h that do not operate on object hashes into its own file, hex-ll.h,
and update the include directives in the .c files that need only such
functions accordingly.
Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
"git update-index" learns "--show-index-version" to inspect
the index format version used by the on-disk index file.
* jc/update-index-show-index-version:
test-tool: retire "index-version"
update-index: add --show-index-version
update-index doc: v4 is OK with JGit and libgit2
As "git update-index --show-index-version" can do the same thing,
the 'index-version' subcommand in the test-tool lost its reason to
exist. Remove it and replace its use with the end-user facing
'git update-index --show-index-version'.
Helped-by: Linus Arver <linusa@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Windows updates.
* ds/maintenance-on-windows-fix:
git maintenance: avoid console window in scheduled tasks on Windows
win32: add a helper to run `git.exe` without a foreground window
Adjust to OpenSSL 3+, which deprecates its SHA-1 functions based on
its traditional API, by using its EVP API instead.
* ew/hash-with-openssl-evp:
avoid SHA-1 functions deprecated in OpenSSL 3+
sha256: avoid functions deprecated in OpenSSL 3+
On Windows, there are two kinds of executables, console ones and
non-console ones. Git's executables are all console ones.
When launching the former e.g. in a scheduled task, a CMD window pops
up. This is not what we want for the tasks installed via the `git
maintenance` command.
To work around this, let's introduce `headless-git.exe`, which is a
non-console program that does _not_ pop up any window. All it does is to
re-launch `git.exe`, suppressing that console window, passing through
all command-line arguments as-are.
Helped-by: Carlo Marcelo Arenas Belón <carenas@gmail.com>
Helped-by: Yuyi Wang <Strawberry_Str@hotmail.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OpenSSL 3+ deprecates the SHA1_Init, SHA1_Update, and SHA1_Final
functions, leading to errors when building with `DEVELOPER=1'.
Use the newer EVP_* API with OpenSSL 3+ (only) despite being more
error-prone and less efficient due to heap allocations.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
OpenSSL 3+ deprecates the SHA256_Init, SHA256_Update, and SHA256_Final
functions, leading to errors when building with `DEVELOPER=1'.
Use the newer EVP_* API with OpenSSL 3+ despite being more
error-prone and less efficient due to heap allocations.
Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
A long term (but rather minor) pet-peeve of mine was the name
ll-merge.[ch]. I thought it made it harder to realize what stuff was
related to merging when I was working on the merge machinery and trying
to improve it.
Further, back in d1cbe1e6d8 ("hash-ll.h: split out of hash.h to remove
dependency on repository.h", 2023-04-22), we have split the portions of
hash.h that do not depend upon repository.h into a "hash-ll.h" (due to
the recommendation to use "ll" for "low-level" in its name[1], but which
I used as a suffix precisely because of my distaste for "ll-merge").
When we discussed adding additional "*-ll.h" files, a request was made
that we use "ll" consistently as either a prefix or a suffix. Since it
is already in use as both a prefix and a suffix, the only way to do so
is to rename some files.
Besides my distaste for the ll-merge.[ch] name, let me also note that
the files
ll-fsmonitor.h, ll-hash.h, ll-merge.h, ll-object-store.h, ll-read-cache.h
would have essentially nothing to do with each other and make no sense
to group. But giving them the common "ll-" prefix would group them. Using
"-ll" as a suffix thus seems just much more logical to me. Rename
ll-merge.[ch] to merge-ll.[ch] to achieve this consistency, and to
ensure we get a more logical grouping of files.
[1] https://lore.kernel.org/git/kl6lsfcu1g8w.fsf@chooglen-macbookpro.roam.corp.google.com/
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since this header showed up in some places besides just #include
statements, update/clean-up/remove those other places as well.
Note that compat/fsmonitor/fsm-path-utils-darwin.c previously got
away with violating the rule that all files must start with an include
of git-compat-util.h (or a short-list of alternate headers that happen
to include it first). This change exposed the violation and caused it
to stop building correctly; fix it by having it include
git-compat-util.h first, as per policy.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
These functions do not depend upon struct cache_entry or struct
index_state in any way, and it seems more logical to break them out into
this file, especially since statinfo.h already has the struct stat_data
declaration.
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
The functions init_db() and initialize_repository_version() were shared
by builtin/init-db.c and builtin/clone.c, and declared in cache.h.
Move these functions, plus their several helpers only used by these
functions, to setup.[ch].
Diff best viewed with `--color-moved`.
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Header clean-up.
* en/header-split-cache-h: (24 commits)
protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
mailmap, quote: move declarations of global vars to correct unit
treewide: reduce includes of cache.h in other headers
treewide: remove double forward declaration of read_in_full
cache.h: remove unnecessary includes
treewide: remove cache.h inclusion due to pager.h changes
pager.h: move declarations for pager.c functions from cache.h
treewide: remove cache.h inclusion due to editor.h changes
editor: move editor-related functions and declarations into common file
treewide: remove cache.h inclusion due to object.h changes
object.h: move some inline functions and defines from cache.h
treewide: remove cache.h inclusion due to object-file.h changes
object-file.h: move declarations for object-file.c functions from cache.h
treewide: remove cache.h inclusion due to git-zlib changes
git-zlib: move declarations for git-zlib functions from cache.h
treewide: remove cache.h inclusion due to object-name.h changes
object-name.h: move declarations for object-name.c functions from cache.h
treewide: remove unnecessary cache.h inclusion
treewide: be explicit about dependence on mem-pool.h
treewide: be explicit about dependence on oid-array.h
...
Move functions from cache.h for zlib.c into a new header file. Since
adding a "zlib.h" would cause issues with the real zlib, rename zlib.c
to git-zlib.c while we are at it.
Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
In 2007 the docbook project made the mistake of converting ' to \' for
man pages [1]. It's a problem because groff interprets \' as acute
accent which is rendered as ' in ASCII, but as ´ in utf-8.
This started a cascade of bug reports in git [2], debian [3], Arch Linux
[4], docbook itself [5], and probably many others.
A solution was to use the correct groff character: \(aq, which is always
rendered as ', but the problem is that such character doesn't work in
other troff programs.
A portable solution required the use of a conditional character that is
\(aq in groff, but ' in all others:
.ie \n(.g .ds Aq \(aq
.el .ds Aq '
The proper solution took time to be implemented in docbook, but in 2010
they did it [6]. So the docbook man page stylesheets were broken from
1.73 to 1.76.
Unfortunately by that point many workarounds already existed. In the
case of git, GNU_ROFF was introduced, and in the case of Arch Linux
a mapping from \' to ' was added to groff's man.local. Other
distributions might have done the same, or similar workarounds.
Since 2010 there is no need for this workaround, which is fixed
elsewhere, not just in docbook, but other layers as well.
Let's remove it.
[1] ea2a0bac56
[2] https://lore.kernel.org/git/20091012102926.GA3937@debian.b2j/
[3] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=507673#65
[4] https://bugs.archlinux.org/task/9643
[5] https://sourceforge.net/p/docbook/bugs/1022/
[6] fb55343426
Inspired-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Since [1] first released with Git v2.37.0 the built-in version of "add
-i" has been the default. That built-in implementation was added in
[2], first released with Git v2.25.0.
At this point enough time has passed to allow for finding any
remaining bugs in this new implementation, so let's remove the
fallback code.
As with similar migrations for "stash"[3] and "rebase"[4] we're
keeping a mention of "add.interactive.useBuiltin" in the
documentation, but adding a warning() to notify any outstanding users
that the built-in is now the default. As with [5] and [6] we should
follow-up in the future and eventually remove that warning.
1. 0527ccb1b5 (add -i: default to the built-in implementation,
2021-11-30)
2. f83dff60a7 (Start to implement a built-in version of `git add
--interactive`, 2019-11-13)
3. 8a2cd3f512 (stash: remove the stash.useBuiltin setting,
2020-03-03)
4. d03ebd411c (rebase: remove the rebase.useBuiltin setting,
2019-03-18)
5. deeaf5ee07 (stash: remove documentation for `stash.useBuiltin`,
2022-01-27)
6. 9bcde4d531 (rebase: remove transitory rebase.useBuiltin setting &
env, 2021-03-23)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Newer regex library macOS stopped enabling GNU-like enhanced BRE,
where '\(A\|B\)' works as alternation, unless explicitly asked with
the REG_ENHANCED flag. "git grep" now can be compiled to do so, to
retain the old behaviour.
* rs/use-enhanced-bre-on-macos:
use enhanced basic regular expressions on macOS
Since [1] there has been no reason for keeping "git env--helper" a
built-in. The reason it was a built-in to begin with was to support
the GIT_TEST_GETTEXT_POISON mode removed in that commit. I.e. unlike
the rest of "test-tool" it would potentially be called by the
installed git via "git-sh-i18n.sh".
As none of that applies since [1] we should stop carrying this
technical debt, and move it to t/helper/*. As this mostly move-only
change shows this has the nice bonus that we'll stop wasting time
translating the internal-only strings it emits.
Even though this was a built-in, it was intentionally never
documented, see its introduction in [2]. It never saw use outside of
the test suite, except for the "GIT_TEST_GETTEXT_POISON" use-case
noted above.
1. d162b25f95 (tests: remove support for GIT_TEST_GETTEXT_POISON,
2021-01-20)
2. b4f207f339 (env--helper: new undocumented builtin wrapping
git_env_*(), 2019-06-21)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
When 1819ad327b (grep: fix multibyte regex handling under macOS,
2022-08-26) started to use the native regex library instead of Git's
own (compat/regex/), it lost support for alternation in basic
regular expressions.
Bring it back by enabling the flag REG_ENHANCED on macOS when
compiling basic regular expressions.
Reported-by: Marco Nenciarini <marco.nenciarini@enterprisedb.com>
Suggested-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Use the SHA1DC implementation on macOS, just like other platforms,
by default.
* ab/darwin-default-to-sha1dc:
Makefile: use sha1collisiondetection by default on OSX and Darwin
When the sha1collisiondetection library was added and made the default
in [1] the interaction with APPLE_COMMON_CRYPTO added in [2] and [3]
seems to have been missed. On modern OSX and Darwin we are able to use
Apple's CommonCrypto both for SHA-1, and as a generic (but partial)
OpenSSL replacement.
This left OSX and Darwin without protection against the SHAttered
attack when building Git in its default configuration.
Let's also use sha1collisiondetection on OSX, to do so we'll need to
split up the "APPLE_COMMON_CRYPTO" flag into that flag and a new
"APPLE_COMMON_CRYPTO_SHA1".
Because of this we can stop conflating whether we want to use Apple's
CommonCrypto at all, and whether we want to use it for SHA-1. This
makes the CI recipe added in [4] simpler.
1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. 4dcd7732db (Makefile: add support for Apple CommonCrypto facility, 2013-05-19)
3. 61067954ce (cache.h: eliminate SHA-1 deprecation warnings on Mac OS X, 2013-05-19)
4. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
2022-10-20)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
`git bisect` becomes a builtin.
* dd/git-bisect-builtin:
bisect; remove unused "git-bisect.sh" and ".gitignore" entry
Turn `git bisect` into a full built-in
bisect--helper: log: allow arbitrary number of arguments
bisect--helper: handle states directly
bisect--helper: emit usage for "git bisect"
bisect test: test exit codes on bad usage
bisect--helper: identify as bisect when report error
bisect-run: verify_good: account for non-negative exit status
bisect run: keep some of the post-v2.30.0 output
bisect: fix output regressions in v2.30.0
bisect: refactor bisect_run() to match CodingGuidelines
bisect tests: test for v2.30.0 "bisect run" regressions
"make coccicheck" is time consuming. It has been made to run more
incrementally.
* ab/coccicheck-incremental:
Makefile: don't create a ".build/.build/" for cocci, fix output
spatchcache: add a ccache-alike for "spatch"
cocci: run against a generated ALL.cocci
cocci rules: remove <id>'s from rules that don't need them
Makefile: copy contrib/coccinelle/*.cocci to build/
cocci: optimistically use COMPUTE_HEADER_DEPENDENCIES
cocci: make "coccicheck" rule incremental
cocci: split off "--all-includes" from SPATCH_FLAGS
cocci: split off include-less "tests" from SPATCH_FLAGS
Makefile: split off SPATCH_BATCH_SIZE comment from "cocci" heading
Makefile: have "coccicheck" re-run if flags change
Makefile: add ability to TAB-complete cocci *.patch rules
cocci rules: remove unused "F" metavariable from pending rule
Makefile + shared.mak: rename and indent $(QUIET_SPATCH_T)
Avoid calling 'cache_tree_update()' when doing so would be redundant.
* vd/skip-cache-tree-update:
rebase: use 'skip_cache_tree_update' option
read-tree: use 'skip_cache_tree_update' option
reset: use 'skip_cache_tree_update' option
unpack-trees: add 'skip_cache_tree_update' option
cache-tree: add perf test comparing update and prime
Fix a couple of issues in the recently merged 0f3c55d4c2 (Merge
branch 'ab/coccicheck-incremental' into next, 2022-11-08):
In copying over the "contrib/coccinelle/" rules to
".build/contrib/coccinelle/" we inadvertently ended up with a
".build/.build/contrib/coccinelle/" as well. We'd generate the
per-file patches in the former, and keep the rule and overall result
in the latter. E.g. running:
make contrib/coccinelle/free.cocci.patch COCCI_SOURCES="attr.c grep.c"
Would, per "tree -a .build" yield the following result:
.build
├── .build
│ └── contrib
│ └── coccinelle
│ └── free.cocci.patch
│ ├── attr.c
│ ├── attr.c.log
│ ├── grep.c
│ └── grep.c.log
└── contrib
└── coccinelle
├── FOUND_H_SOURCES
├── free.cocci
└── free.cocci.patch
Now we'll instead generate all of our files in
".build/contrib/coccinelle/". Fixing this required renaming the
directory where we keep our per-file patches, as we'd otherwise
conflict with the result.
Now the per-file patch directory is named e.g. "free.cocci.d". And the
end result will now be:
.build
└── contrib
└── coccinelle
├── FOUND_H_SOURCES
├── free.cocci
├── free.cocci.d
│ ├── attr.c.patch
│ ├── attr.c.patch.log
│ ├── grep.c.patch
│ └── grep.c.patch.log
└── free.cocci.patch
The per-file patches now have a ".patch" file suffix, which fixes
another issue reported against 0f3c55d4c2: The summary output was
confusing. Before for the "make" command above we'd emit:
[...]
MKDIR -p .build/contrib/coccinelle
CP contrib/coccinelle/free.cocci .build/contrib/coccinelle/free.cocci
GEN .build/contrib/coccinelle/FOUND_H_SOURCES
MKDIR -p .build/.build/contrib/coccinelle/free.cocci.patch
SPATCH .build/.build/contrib/coccinelle/free.cocci.patch/grep.c
SPATCH .build/.build/contrib/coccinelle/free.cocci.patch/attr.c
SPATCH CAT $^ >.build/contrib/coccinelle/free.cocci.patch
CP .build/contrib/coccinelle/free.cocci.patch contrib/coccinelle/free.cocci.patch
But now we'll instead emit (identical output at the start omitted):
[...]
MKDIR -p .build/contrib/coccinelle/free.cocci.d
SPATCH grep.c >.build/contrib/coccinelle/free.cocci.d/grep.c.patch
SPATCH attr.c >.build/contrib/coccinelle/free.cocci.d/attr.c.patch
SPATCH CAT .build/contrib/coccinelle/free.cocci.d/**.patch >.build/contrib/coccinelle/free.cocci.patch
CP .build/contrib/coccinelle/free.cocci.patch contrib/coccinelle/free.cocci.patch
I.e. we have an "SPATCH" line that makes it clear that we're running
against the "{attr,grep}.c" file. The "SPATCH CAT" is then altered to
correspond to it, showing that we're concatenating the
"free.cocci.d/**.patch" files into one generated "free.cocci.patch" at
the end.
Reported-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Now that the shell script hands off to the `bisect--helper` to do
_anything_ (except to show the help), it is but a tiny step to let the
helper implement the actual `git bisect` command instead.
This retires `git-bisect.sh`, concluding a multi-year journey that many
hands helped with, in particular Pranit Bauna, Tanushree Tumane and
Miriam Rubio.
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Đoàn Trần Công Danh <congdanhqx@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Add a performance test comparing the execution times of 'prime_cache_tree()'
and 'cache_tree_update(_, WRITE_TREE_SILENT | WRITE_TREE_REPAIR)'. The goal
of comparing these two is to identify which is the faster method for
rebuilding an invalid cache tree, ultimately to remove one when both are
(reundantly) called in immediate succession.
Both methods are fast, so the new tests in 'p0090-cache-tree.sh' must call
each tested function multiple times to ensure the reported times (to 0.01s
resolution) convey the differences between them.
The tests compare the timing of a 'test-tool cache-tree' run as a no-op (to
capture a baseline for the overhead associated with running the tool),
'cache_tree_update()', and 'prime_cache_tree()' on four scenarios:
- A completely valid cache tree
- A cache tree with 2 invalid paths
- A cache tree with 50 invalid paths
- A completely empty cache tree
Example results:
Test this tree
-----------------------------------------------------------
0090.2: no-op, clean 1.27(0.48+0.52)
0090.3: prime_cache_tree, clean 2.02(0.83+0.85)
0090.4: cache_tree_update, clean 1.30(0.49+0.54)
0090.5: no-op, invalidate 2 1.29(0.48+0.54)
0090.6: prime_cache_tree, invalidate 2 1.98(0.81+0.83)
0090.7: cache_tree_update, invalidate 2 2.12(0.94+0.86)
0090.8: no-op, invalidate 50 1.32(0.50+0.55)
0090.9: prime_cache_tree, invalidate 50 2.10(0.86+0.89)
0090.10: cache_tree_update, invalidate 50 2.35(1.14+0.90)
0090.11: no-op, empty 1.33(0.50+0.54)
0090.12: prime_cache_tree, empty 2.04(0.84+0.87)
0090.13: cache_tree_update, empty 2.51(1.27+0.92)
These timings show that, while 'cache_tree_update()' is faster when the
cache tree is completely valid, it is equal to or slower than
'prime_cache_tree()' when there are any invalid paths. Since the redundant
calls are mostly in scenarios where the cache tree will be at least
partially invalid (e.g., 'git reset --hard'), 'prime_cache_tree()' will
likely perform better than 'cache_tree_update()' in typical cases.
Helped-by: SZEDER Gábor <szeder.dev@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Let's mention the SHAttered attack and more generally why we use the
sha1collisiondetection backend by default, and note that for SHA-256
the user should feel free to pick any of the supported backends as far
as hashing security is concerned.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since [1] the default SHA-1 backend on OSX has been
APPLE_COMMON_CRYPTO. Per [2] we'll skip using it on anything older
than Mac OS X 10.4 "Tiger"[3].
When "DC_SHA1" was made the default in [4] this interaction between it
and APPLE_COMMON_CRYPTO seems to have been missed in. Ever since
DC_SHA1 was "made the default" we've still used Apple's CommonCrypto
instead of sha1collisiondetection on modern versions of Darwin and
OSX.
1. 61067954ce (cache.h: eliminate SHA-1 deprecation warnings on Mac
OS X, 2013-05-19)
2. 9c7a0beee0 (config.mak.uname: set NO_APPLE_COMMON_CRYPTO on older
systems, 2014-08-15)
3. We could probably drop "NO_APPLE_COMMON_CRYPTO", as nobody's likely
to care about such on old version of OSX anymore. But let's leave that
for now.
4. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Address the root cause of technical debt we've been carrying since
sha1collisiondetection was made the default in [1]. In a preceding
commit we narrowly fixed a bug where the "DC_SHA1" variable would be
unset (in combination with "NO_APPLE_COMMON_CRYPTO=" on OSX), even
though we had the sha1collisiondetection library enabled.
But the only reason we needed to have such a user-exposed knob went
away with [1], and it's been doing nothing useful since then. We don't
care if you define DC_SHA1=*, we only care that you don't ask for any
other SHA-1 implementation. If it turns out that you didn't, we'll use
sha1collisiondetection, whether you had "DC_SHA1" set or not.
As a result of this being confusing we had e.g. [2] for cmake and the
recent [3] for ci/lib.sh setting "DC_SHA1" explicitly, even though
this was always a NOOP.
A much simpler way to do this is to stop having the Makefile and
CMakeLists.txt set "DC_SHA1" to be picked up by the test-lib.sh, let's
instead add a trivial "test-tool sha1-is-sha1dc". It returns zero if
we're using sha1collisiondetection, non-zero otherwise.
1. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
2. c4b2f41b5f (cmake: support for testing git with ctest, 2020-06-26)
3. 1ad5c3df35 (ci: use DC_SHA1=YesPlease on osx-clang job for CI,
2022-10-20)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
For the *_SHA1 and *_SHA256 flags we've discussed the various flags,
but not the fact that when you define multiple flags we'll pick one.
Which one we pick depends on the order they're listed in the Makefile,
which differed from the order we discussed them in this documentation.
Let's be explicit about how we select these, and re-arrange the
listings so that they're listed in the priority order we've picked.
I'd personally prefer that the selection was more explicit, and that
we'd error out if conflicting flags were provided, but per the
discussion downhtread of[1] the consensus was to keep theses semantics.
This behavior makes it easier to e.g. integrate with autoconf-like
systems, where the configuration can provide everything it can
support, and Git is tasked with picking the first one it prefers.
1. https://lore.kernel.org/git/220710.86mtdh81ty.gmgdl@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since 27dc04c545 (sha256: add an SHA-256 implementation using
libgcrypt, 2018-11-14) we've claimed to support a BLK_SHA256 flag, but
there's no such SHA-256 backend.
Instead we fall back on adding "sha256/block/sha256.o" to "LIB_OBJS"
and adding "-DSHA256_BLK" to BASIC_CFLAGS.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
In the preceding commit the discussion of the *_SHA1 knobs was left
as-is to benefit from a smaller diff, but since we're changing these
let's use the same phrasing we use for most other knobs. E.g. "define
X", not "define X environment variable", and get rid of the "when
running make to link with" entirely.
Furthermore the discussion of DC_SHA1* options is now under a "Options
for the sha1collisiondetection implementation" heading, so we don't
need to clarify that these options go along with DC_SHA1=Y, so let's
rephrase them accordingly.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Since the "Define ..." template of comments at the top of the Makefile
was started in 5bdac8b326 ([PATCH] Improve the compilation-time
settings interface, 2005-07-29) we've had a lot more flags added,
including flags that come in "groups". Not having any obvious
structure to the >500 line comment at the top of the Makefile has made
it hard to follow.
This change is almost entirely a move-only change, the two paragraphs
at the start of the first two sections are new, and so are the added
sections themselves, but other than that no lines are changed, only
moved.
We now list Makefile-only flags at the start, followed by stand-alone
flags, and then cover "optional library" flags in their respective
groups, followed by SHA-1 and SHA-256 flags, and finally
DEVELOPER-specific flags.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The claim that DC_SHA1 takes priority over other *_SHA1 knobs was true
when it was added in [1], But that hasn't been the case since it was
made the fallback default in [2].
We should be making it not only the default, but something that takes
priority over other *_SHA1 knobs, but that's outside the scope of this
change. For now let's correct the documentation to match reality.
Let's also remove the "unconditionally enable" wording, per the above
the enabling of "DC_SHA1" is conditional on these other flags.
The "Define DC_SHA1" here is also a lie, actually it's "we don't care
if you define DC_SHA1, just don't define anything else", but that's a
more general issue that'll be addressed in a subsequent commit. Let's
first stop pretending that this setting (which we actually don't even
use) takes priority over anything else.
1. 8325e43b82 (Makefile: add DC_SHA1 knob, 2017-03-16)
2. e6b07da278 (Makefile: make DC_SHA1 the default, 2017-03-17)
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Fix an edge case introduced in in e6b07da278 (Makefile: make DC_SHA1
the default, 2017-03-17), when DC_SHA1 was made the default fallback
we started unconditionally adding to BASIC_CFLAGS and LIB_OBJS, so
we'd use the sha1collisiondetection by default.
But the "DC_SHA1" variable remained unset, so e.g.:
make test DC_SHA1= T=t0013*.sh
Would skip the sha1collisiondetection tests, as we'd write
"DC_SHA1=''" to "GIT-BUILD-OPTIONS", but if we manually removed that
test prerequisite we'd pass the test (which we couldn't if we weren't
using sha1collisiondetection).
So let's have the fallback assignment use the 'override' directive
instead of the ":=" simply expanded variable introduced in
e6b07da278. In this case we explicitly want to override the user's
choice.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
The preceding commits to make the "coccicheck" target incremental made
it slower in some cases. As an optimization let's not have the
many=many mapping of <*.cocci>=<*.[ch]>, but instead concat the
<*.cocci> into an ALL.cocci, and then run one-to-many
ALL.cocci=<*.[ch]>.
A "make coccicheck" is now around 2x as fast as it was on "master",
and around 1.5x as fast as the preceding change to make the run
incremental:
$ git hyperfine -L rev origin/master,HEAD~,HEAD -p 'make clean' 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' -r 3
Benchmark 1: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'origin/master
Time (mean ± σ): 4.258 s ± 0.015 s [User: 27.432 s, System: 1.532 s]
Range (min … max): 4.241 s … 4.268 s 3 runs
Benchmark 2: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD~
Time (mean ± σ): 5.365 s ± 0.079 s [User: 36.899 s, System: 1.810 s]
Range (min … max): 5.281 s … 5.436 s 3 runs
Benchmark 3: make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD
Time (mean ± σ): 2.725 s ± 0.063 s [User: 14.796 s, System: 0.233 s]
Range (min … max): 2.667 s … 2.792 s 3 runs
Summary
'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD' ran
1.56 ± 0.04 times faster than 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'origin/master'
1.97 ± 0.05 times faster than 'make coccicheck SPATCH=spatch COCCI_SOURCES="$(echo $(ls o*.c builtin/h*.c))"' in 'HEAD~'
This can be turned off with SPATCH_CONCAT_COCCI, but as the
beneficiaries of "SPATCH_CONCAT_COCCI=" would mainly be those
developing the *.cocci rules themselves, let's leave this optimization
on by default.
For more information see my "Optimizing *.cocci rules by concat'ing
them" (<220901.8635dbjfko.gmgdl@evledraar.gmail.com>) on the
cocci@inria.fr mailing list.
This potentially changes the results of our *.cocci rules, but as
noted in that discussion it should be safe for our use. We don't name
rules, or if we do their names don't conflict across our *.cocci
files.
To the extent that we'd have any inter-dependencies between rules this
doesn't make that worse, as we'd have them now if we ran "make
coccicheck", applied the results, and would then have (due to
hypothetical interdependencies) suggested changes on the subsequent
"make coccicheck".
Our "coccicheck-test" target makes use of the ALL.cocci when running
tests, e.g. when testing unused.{c,out} we test it against ALL.cocci,
not unused.cocci. We thus assert (to the extent that we have test
coverage) that this concatenation doesn't change the expected results
of running these rules.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Change the "coccinelle" rule so that we first copy the *.cocci source
in e.g. "contrib/coccinelle/strbuf.cocci" to
".build/contrib/coccinelle/strbuf.cocci" before operating on it.
For now this serves as a rather pointless indirection, but prepares us
for the subsequent commit where we'll be able to inject generated
*.cocci files. Having the entire dependency tree live inside .build/*
simplifies both the globbing we'd need to do, and any "clean" rules.
It will also help for future targets which will want to act on the
generated patches or the logs, e.g. targets to alert if we can't parse
certain files (or, less so than usual) with "spatch", and e.g. a
replacement for "ci/run-static-analysis.sh". Such a replacement won't
care about placing the patches in the in-tree, only whether they're
"OK" (and about the diff).
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Improve the incremental rebuilding support of "coccicheck" by
piggy-backing on the computed dependency information of the
corresponding *.o file, rather than rebuilding all <RULE>/<FILE> pairs
if either their corresponding file changes, or if any header changes.
This in effect uses the same method that the "sparse" target was made
to use in c234e8a0ec (Makefile: make the "sparse" target non-.PHONY,
2021-09-23), except that the dependency on the *.o file isn't a hard
one, we check with $(wildcard) if the *.o file exists, and if so we'll
depend on it.
This means that the common case of:
make
make coccicheck
Will benefit from incremental rebuilding, now changing e.g. a header
will only re-run "spatch" on those those *.c files that make use of
it:
By depending on the *.o we piggy-back on
COMPUTE_HEADER_DEPENDENCIES. See c234e8a0ec (Makefile: make the
"sparse" target non-.PHONY, 2021-09-23) for prior art of doing that
for the *.sp files. E.g.:
make contrib/coccinelle/free.cocci.patch
make -W column.h contrib/coccinelle/free.cocci.patch
Will take around 15 seconds for the second command on my 8 core box if
I didn't run "make" beforehand to create the *.o files. But around 2
seconds if I did and we have those "*.o" files.
Notes about the approach of piggy-backing on *.o for dependencies:
* It *is* a trade-off since we'll pay the extra cost of running the C
compiler, but we're probably doing that anyway. The compiler is much
faster than "spatch", so even though we need to re-compile the *.o to
create the dependency info for the *.c for "spatch" it's
faster (especially if using "ccache").
* There *are* use-cases where some would like to have *.o files
around, but to have the "make coccicheck" ignore them. See:
https://lore.kernel.org/git/20220826104312.GJ1735@szeder.dev/
For those users a:
make
make coccicheck SPATCH_USE_O_DEPENDENCIES=
Will avoid considering the *.o files.
* If that *.o file doesn't exist we'll depend on an intermediate file
of ours which in turn depends on $(FOUND_H_SOURCES).
This covers both an initial build, or where "coccicheck" is run
without running "all" beforehand, and because we run "coccicheck"
on e.g. files in compat/* that we don't know how to build unless
the requisite flag was provided to the Makefile.
Most of the runtime of "incremental" runs is now spent on various
compat/* files, i.e. we conditionally add files to COMPAT_OBJS, and
therefore conflate whether we *can* compile an object and generate
dependency information for it with whether we'd like to link it
into our binary.
Before this change the distinction didn't matter, but now one way
to make this even faster on incremental builds would be to peel
those concerns apart so that we can see that e.g. compat/mmap.c
doesn't depend on column.h.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Optimize the very slow "coccicheck" target to take advantage of
incremental rebuilding, and fix outstanding dependency problems with
the existing rule.
The rule is now faster both on the initial run as we can make better
use of GNU make's parallelism than the old ad-hoc combination of
make's parallelism combined with $(SPATCH_BATCH_SIZE) and/or the
"--jobs" argument to "spatch(1)".
It also makes us *much* faster when incrementally building, it's now
viable to "make coccicheck" as topic branches are merged down.
The rule didn't use FORCE (or its equivalents) before, so a:
make coccicheck
make coccicheck
Would report nothing to do on the second iteration. But all of our
patch output depended on all $(COCCI_SOURCES) files, therefore e.g.:
make -W grep.c coccicheck
Would do a full re-run, i.e. a a change in a single file would force
us to do a full re-run.
The reason for this (not the initial rationale, but my analysis) is:
* Since we create a single "*.cocci.patch+" we don't know where to
pick up where we left off, or how to incrementally merge e.g. a
"grep.c" change with an existing *.cocci.patch.
* We've been carrying forward the dependency on the *.c files since
63f0a758a0 (add coccicheck make target, 2016-09-15) the rule was
initially added as a sort of poor man's dependency discovery.
As we don't include other *.c files depending on other *.c files
has always been broken, as could be trivially demonstrated
e.g. with:
make coccicheck
make -W strbuf.h coccicheck
However, depending on the corresponding *.c files has been doing
something, namely that *if* an API change modified both *.c and *.h
files we'd catch the change to the *.h we care about via the *.c
being changed.
For API changes that happened only via *.h files we'd do the wrong
thing before this change, but e.g. for function additions (not
"static inline" ones) catch the *.h change by proxy.
Now we'll instead:
* Create a <RULE>/<FILE> pair in the .build directory, E.g. for
swap.cocci and grep.c we'll create
.build/contrib/coccinelle/swap.cocci.patch/grep.c.
That file is the diff we'll apply for that <RULE>-<FILE>
combination, if there's no changes to me made (the common case)
it'll be an empty file.
* Our generated *.patch
file (e.g. contrib/coccinelle/swap.cocci.patch) is now a simple "cat
$^" of all of all of the <RULE>/<FILE> files for a given <RULE>.
In the case discussed above of "grep.c" being changed we'll do the
full "cat" every time, so they resulting *.cocci.patch will always
be correct and up-to-date, even if it's "incrementally updated".
See 1cc0425a27 (Makefile: have "make pot" not "reset --hard",
2022-05-26) for another recent rule that used that technique.
As before we'll:
* End up generating a contrib/coccinelle/swap.cocci.patch, if we
"fail" by creating a non-empty patch we'll still exit with a zero
exit code.
Arguably we should move to a more Makefile-native way of doing
this, i.e. fail early, and if we want all of the "failed" changes
we can use "make -k", but as the current
"ci/run-static-analysis.sh" expects us to behave this way let's
keep the existing behavior of exhaustively discovering all cocci
changes, and only failing if spatch itself errors out.
Further implementation details & notes:
* Before this change running "make coccicheck" would by default end
up pegging just one CPU at the very end for a while, usually as
we'd finish whichever *.cocci rule was the most expensive.
This could be mitigated by combining "make -jN" with
SPATCH_BATCH_SIZE, see 960154b9c1 (coccicheck: optionally batch
spatch invocations, 2019-05-06).
There will be cases where getting rid of "SPATCH_BATCH_SIZE" makes
things worse, but a from-scratch "make coccicheck" with the default
of SPATCH_BATCH_SIZE=1 (and tweaking it doesn't make a difference)
is faster (~3m36s v.s. ~3m56s) with this approach, as we can feed
the CPU more work in a less staggered way.
* Getting rid of "SPATCH_BATCH_SIZE" particularly helps in cases
where the default of 1 yields parallelism under "make coccicheck",
but then running e.g.:
make -W contrib/coccinelle/swap.cocci coccicheck
I.e. before that would use only one CPU core, until the user
remembered to adjust "SPATCH_BATCH_SIZE" differently than the
setting that makes sense when doing a non-incremental run of "make
coccicheck".
* Before the "make coccicheck" rule would have to clean
"contrib/coccinelle/*.cocci.patch*", since we'd create "*+" and
"*.log" files there. Now those are created in
.build/contrib/coccinelle/, which is covered by the "cocciclean" rule
already.
Outstanding issues & future work:
* We could get rid of "--all-includes" in favor of manually
specifying a list of includes to give to "spatch(1)".
As noted upthread of [1] a naïve removal of "--all-includes" will
result in broken *.cocci patches, but if we know the exhaustive
list of includes via COMPUTE_HEADER_DEPENDENCIES we don't need to
re-scan for them, we could grab the headers to include from the
.depend.d/<file>.o.d and supply them with the "--include" option to
spatch(1).q
1. https://lore.kernel.org/git/87ft18tcog.fsf@evledraar.gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Per the rationale in 7b63ea5750 (Makefile: remove mandatory "spatch"
arguments from SPATCH_FLAGS, 2022-07-05) we have certain flags that
are truly mandatory, such as "--sp-file" and "--patch .". The
"--all-includes" flag is also critical, but per [1] we might want to
ad-hoc tweak it occasionally for testing or one-offs.
But being unable to set e.g. SPATCH_FLAGS="--verbose-parsing" without
breaking how our "spatch" works isn't ideal, i.e. before this we'd
need to know about the default include flags, and specify:
SPATCH_FLAGS="--all-includes --verbose-parsing".
If we were then to change the default include flag (e.g. to
"--recursive-includes") in the future any such one-off commands would
need to be correspondingly updated.
Let's instead leave the SPATCH_FLAGS for the user, while creating a
new SPATCH_INCLUDE_FLAGS to allow for ad-hoc testing of the include
strategy itself.
1. https://lore.kernel.org/git/20220823095733.58685-1-szeder.dev@gmail.com/
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Amend the "coccicheck-test" rule added in f7ff6597a7 (cocci: add a
"coccicheck-test" target and test *.cocci rules, 2022-07-05) to stop
using "--all-includes". The flags we'll need for the tests are
different than the ones we'll need for our main source code.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>
Split off the "; setting[...]" part of the comment added in In
960154b9c1 (coccicheck: optionally batch spatch invocations,
2019-05-06), and restore what we had before that, which was a comment
indicating that variables for the "coccicheck" target were being set
here.
When 960154b9c1 amended the heading to discuss SPATCH_BATCH_SIZE it
left no natural place to add a new comment about other flags that
preceded it. As subsequent commits will add such comments we need to
split the existing comment up.
The wrapping for the "SPATCH_BATCH_SIZE" is now a bit odd, but
minimizes the diff size. As a subsequent commit will remove that
feature altogether this is worth it.
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Taylor Blau <me@ttaylorr.com>