Commit graph

67869 commits

Author SHA1 Message Date
Junio C Hamano f0e9754a27 Merge branch 'gc/git-reflog-doc-markup'
Doc mark-up fix.

* gc/git-reflog-doc-markup:
  Documentation/git-reflog: remove unneeded \ from \{
2022-08-12 13:19:08 -07:00
Junio C Hamano 8faaf690f7 Merge branch 'lt/symbolic-ref-sanity'
"git symbolic-ref symref non..sen..se" is now diagnosed as an error.

* lt/symbolic-ref-sanity:
  symbolic-ref: refuse to set syntactically invalid target
2022-08-12 13:19:08 -07:00
Teng Long 35ae40ead3 tr2: shows scope unconditionally in addition to key-value pair
When we specify GIT_TRACE2_CONFIG_PARAMS or trace2.configparams,
trace2 will prints "interesting" config values to log. Sometimes,
when a config set in multiple scope files, the following output
looks like (the irrelevant fields are omitted here as "..."):

...| def_param    |  ...  | core.multipackindex:false
...| def_param    |  ...  | core.multipackindex:false
...| def_param    |  ...  | core.multipackindex:false

As the log shows, even each config in different scope is dumped, but
we don't know which scope it comes from. Therefore, it's better to
add the scope names as well to make them be more recognizable. For
example, when execute:

    $ GIT_TRACE2_PERF=1 \
    > GIT_TRACE2_CONFIG_PARAMS=core.multipackIndex \
    > git rev-list --test-bitmap HEAD"

The following is the ouput (the irrelevant fields are omitted here
as "..."):

Format normal:
... git.c:461 ... def_param scope:system core.multipackindex=false
... git.c:461 ... def_param scope:global core.multipackindex=false
... git.c:461 ... def_param scope:local core.multipackindex=false

Format perf:

... | def_param    | ... | scope:system | core.multipackindex:false
... | def_param    | ... | scope:global | core.multipackindex:false
... | def_param    | ... | scope:local  | core.multipackindex:false

Format event:

{"event":"def_param", ... ,"scope":"system","param":"core.multipackindex","value":"false"}
{"event":"def_param", ... ,"scope":"global","param":"core.multipackindex","value":"false"}
{"event":"def_param", ... ,"scope":"local","param":"core.multipackindex","value":"false"}

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 21:05:00 -07:00
Teng Long 050d0dc241 api-trace2.txt: print config key-value pair
It's supported to print "interesting" config key-value paire
to tr2 log by setting "GIT_TRACE2_CONFIG_PARAMS" environment
variable and the "trace2.configparam" config, let's add the
related docs in Documentaion/technical/api-trace2.txt.

Signed-off-by: Teng Long <dyroneteng@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 21:05:00 -07:00
Li Linchao 9096451acd rev-list: support human-readable output for --disk-usage
The '--disk-usage' option for git-rev-list was introduced in 16950f8384
(rev-list: add --disk-usage option for calculating disk usage, 2021-02-09).
This is very useful for people inspect their git repo's objects usage
infomation, but the resulting number is quit hard for a human to read.

Teach git rev-list to output a human readable result when using
'--disk-usage'.

Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-11 13:45:23 -07:00
Junio C Hamano 5502f77b69 Git 2.37.2
-----BEGIN PGP SIGNATURE-----
 
 iQIzBAABCAAdFiEE4fA2sf7nIh/HeOzvsLXohpav5ssFAmL0iy8ACgkQsLXohpav
 5sumeA/+KwxAxk1VbIQBo0riJJqgAQygrs80H78dR6qQ3b/j0jLbshd7op9HWjeS
 58xmsvHA3MyzB2RBh1daZyzAl23ANZkViirbRU8q7Y3JrQdOT9PCd0W9BU9x3MCb
 OdnEZH5biiZzltK8aB8Yr0h3Y33K7SqBQjRI/r2zLIcQdErJLK4wsOYSQ8R6pJNg
 unFtFmZsaW8hksPFe4w4/S2ySwQForcDFScrxwegsYBqsLCbGDe0va+tw9glIaMe
 3zRA7U8Y92B9SWBaipydwmnjdbVj58v2C9DiBCw6dpBJIBfEXRDRBbwNeNMgsX3s
 1jpKFeEJxT6gEnZzQcwYDmMfOiifmB/R7Ii8ceAVOY+DLRDrDoJyGukBZ+pOeCAn
 6l1uJP50KbkZZ8v6Uu+wWkrSMvDTPQ01L3DnQKmcStNcyFviBfoOC2TkYAtJfVHB
 puP1Y6U84dpux3xOoBvIPers/9/NMvhvJUTxjxqbXsHN5C/XR8h/VXG3dT1WXZbW
 6X11ww8dut7iM8VM/Pvc70mRCiCZAJPSj0YEk1S/YtsD3726xOAnK/5p/19hdZdB
 V/ylCaxZ2FKG26s3gsao64tzGoTAi9iwXnAZuN8uBAlXlEVRMv/IBLTEJ2lyzY+O
 V09odT1ixIDRRVi26ExYIRkf5B465O3C+xyCwIB9AMeTnQTr0Tk=
 =WqCZ
 -----END PGP SIGNATURE-----

Sync with Git 2.37.2
2022-08-10 21:57:59 -07:00
Junio C Hamano ad60dddad7 Git 2.37.2
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 21:52:36 -07:00
Junio C Hamano b0fd38a515 Merge branch 'jc/string-list-cleanup' into maint
Code clean-up.
source: <xmqq7d471dns.fsf@gitster.g>

* jc/string-list-cleanup:
  builtin/remote.c: use the right kind of STRING_LIST_INIT
2022-08-10 21:52:36 -07:00
Junio C Hamano 3f4fa1fab8 Merge branch 'mt/pkt-line-comment-tweak' into maint
In-code comment clarification.
source: <6a14443c101fa132498297af6d7a483520688d75.1658488203.git.matheus.bernardino@usp.br>

* mt/pkt-line-comment-tweak:
  pkt-line.h: move comment closer to the associated code
2022-08-10 21:52:35 -07:00
Junio C Hamano 5856cb98c0 Merge branch 'ma/t4200-update' into maint
Test fix.
source: <20220718154322.2177166-1-martin.agren@gmail.com>

* ma/t4200-update:
  t4200: drop irrelevant code
2022-08-10 21:52:35 -07:00
Junio C Hamano 042159a509 Merge branch 'tb/commit-graph-genv2-upgrade-fix' into maint
There was a bug in the codepath to upgrade generation information
in commit-graph from v1 to v2 format, which has been corrected.
source: <cover.1657667404.git.me@ttaylorr.com>

* tb/commit-graph-genv2-upgrade-fix:
  commit-graph: fix corrupt upgrade from generation v1 to v2
  commit-graph: introduce `repo_find_commit_pos_in_graph()`
  t5318: demonstrate commit-graph generation v2 corruption
2022-08-10 21:52:35 -07:00
Junio C Hamano 4f049a16bf Merge branch 'tk/untracked-cache-with-uall' into maint
Fix for a bug that makes write-tree to fail to write out a
non-existent index as a tree, introduced in 2.37.
source: <20220722212232.833188-1-martin.agren@gmail.com>

* tk/untracked-cache-with-uall:
  read-cache: make `do_read_index()` always set up `istate->repo`
2022-08-10 21:52:34 -07:00
Junio C Hamano 340a6120e5 Merge branch 'mt/checkout-count-fix' into maint
"git checkout" miscounted the paths it updated, which has been
corrected.
source: <cover.1657799213.git.matheus.bernardino@usp.br>

* mt/checkout-count-fix:
  checkout: fix two bugs on the final count of updated entries
  checkout: show bug about failed entries being included in final report
  checkout: document bug where delayed checkout counts entries twice
2022-08-10 21:52:34 -07:00
Junio C Hamano acd3bce63f Merge branch 'cl/rerere-train-with-no-sign' into maint
"rerere-train" script (in contrib/) used to honor commit.gpgSign
while recreating the throw-away merges.
source: <PH7PR14MB5594A27B9295E95ACA4D6A69CE8F9@PH7PR14MB5594.namprd14.prod.outlook.com>

* cl/rerere-train-with-no-sign:
  contrib/rerere-train: avoid useless gpg sign in training
2022-08-10 21:52:33 -07:00
Junio C Hamano b1b489f4cc Merge branch 'kk/p4-client-name-encoding-fix' into maint
"git p4" did not handle non-ASCII client name well, which has been
corrected.
source: <pull.1285.v3.git.git.1658394440.gitgitgadget@gmail.com>

* kk/p4-client-name-encoding-fix:
  git-p4: refactoring of p4CmdList()
  git-p4: fix bug with encoding of p4 client name
2022-08-10 21:52:33 -07:00
Junio C Hamano 4fc4066c4a Merge branch 'mb/p4-utf16-crlf' into maint
"git p4" working on UTF-16 files on Windows did not implement
CRLF-to-LF conversion correctly, which has been corrected.
source: <pull.1294.v2.git.git.1658341065221.gitgitgadget@gmail.com>

* mb/p4-utf16-crlf:
  git-p4: fix CR LF handling for utf16 files
2022-08-10 21:52:32 -07:00
Junio C Hamano 312d5b7429 Merge branch 'hx/lookup-commit-in-graph-fix' into maint
A corner case bug where lazily fetching objects from a promisor
remote resulted in infinite recursion has been corrected.
source: <cover.1656593279.git.hanxin.hx@bytedance.com>

* hx/lookup-commit-in-graph-fix:
  t5330: remove run_with_limited_processses()
  commit-graph.c: no lazy fetch in lookup_commit_in_graph()
2022-08-10 21:52:32 -07:00
Junio C Hamano a6aeb2fef9 Merge branch 'jc/resolve-undo' into maint
The resolve-undo information in the index was not protected against
GC, which has been corrected.
source: <xmqq35f7kzad.fsf@gitster.g>

* jc/resolve-undo:
  fsck: do not dereference NULL while checking resolve-undo data
  revision: mark blobs needed for resolve-undo as reachable
2022-08-10 21:52:32 -07:00
Jeff King 4dd3b045f5 fsck: downgrade tree badFilemode to "info"
The previous commit un-broke the "badFileMode" check; before then it was
literally testing nothing. And as far as I can tell, it has been so
since the very initial version of fsck.

The current severity of "badFileMode" is just "warning". But in the
--strict mode used by transfer.fsckObjects, that is elevated to an
error. This will potentially cause hassle for users, because historical
objects with bad modes will suddenly start causing pushes to many server
operators to be rejected.

At the same time, these bogus modes aren't actually a big risk. Because
we canonicalize them everywhere besides fsck, they can't cause too much
mischief in the real world. The worst thing you can do is end up with
two almost-identical trees that have different hashes but are
interpreted the same. That will generally cause things to be inefficient
rather than wrong, and is a bug somebody working on a Git implementation
would want to fix, but probably not worth inconveniencing users by
refusing to push or fetch.

So let's downgrade this to "info" by default, which is our setting for
"mention this when fscking, but don't ever reject, even under strict
mode". If somebody really wants to be paranoid, they can still adjust
the level using config.

Suggested-by: Xavier Morel <xavier.morel@masklinn.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:26:29 -07:00
Jeff King 53602a937d fsck: actually detect bad file modes in trees
We use the normal tree_desc code to iterate over trees in fsck, meaning
we only see the canonicalized modes it returns. And hence we'd never see
anything unexpected, since it will coerce literally any garbage into one
of our normal and accepted modes.

We can use the new RAW_MODES flag to see the real modes, and then use
the existing code to actually analyze them. The existing code is written
as allow-known-good, so there's not much point in testing a variety of
breakages. The one tested here should be S_IFREG but with nonsense
permissions.

Do note that the error-reporting here isn't great. We don't mention the
specific bad mode, but just that the tree has one or more broken modes.
But when you go to look at it with "git ls-tree", we'll report the
canonicalized mode! This isn't ideal, but given that this should come up
rarely, and that any number of other tree corruptions might force you
into looking at the binary bytes via "cat-file", it's not the end of the
world. And it's something we can improve on top later if we choose.

Reported-by: Xavier Morel <xavier.morel@masklinn.net>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:26:27 -07:00
Jeff King ec18b10bf2 tree-walk: add a mechanism for getting non-canonicalized modes
When using init_tree_desc() and tree_entry() to iterate over a tree, we
always canonicalize the modes coming out of the tree. This is a good
thing to prevent bugs or oddities in normal code paths, but it's
counter-productive for tools like fsck that want to see the exact
contents.

We can address this by adding an option to avoid the extra
canonicalization. A few notes on the implementation:

  - I've attached the new option to the tree_desc struct itself. The
    actual code change is in decode_tree_entry(), which is in turn
    called by the public update_tree_entry(), tree_entry(), and
    init_tree_desc() functions, plus their "gently" counterparts.

    By letting it ride along in the struct, we can avoid changing the
    signature of those functions, which are called many times. Plus it's
    conceptually simpler: you really want a particular iteration of a
    tree to be "raw" or not, rather than individual calls.

  - We still have to set the new option somewhere. The struct is
    initialized by init_tree_desc(). I added the new flags field only to
    the "gently" version. That avoids disturbing the much more numerous
    non-gentle callers, and it makes sense that anybody being careful
    about looking at raw modes would also be careful about bogus trees
    (i.e., the caller will be something like fsck in the first place).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:26:25 -07:00
Derrick Stolee e21e663cd1 clone: --bundle-uri cannot be combined with --depth
A previous change added the '--bundle-uri' option, but did not check
if the --depth parameter was included. Since bundles are not compatible
with shallow clones, provide an error message to the user who is
attempting this combination.

I am leaving this as its own change, separate from the one that
implements '--bundle-uri', because this is more of an advisory for the
user. There is nothing wrong with bootstrapping with bundles and then
fetching a shallow clone. However, that is likely going to involve too
much work for the client _and_ the server. The client will download all
of this bundle information containing the full history of the
repository only to ignore most of it. The server will get a shallow
fetch request, but with a list of haves that might cause a more painful
computation of that shallow pack-file.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Derrick Stolee 59c1752ab6 bundle-uri: add support for http(s):// and file://
The previous change created the 'git clone --bundle-uri=<uri>' option.
Currently, <uri> must be a filename.

Update copy_uri_to_file() to first inspect the URI for an HTTP(S) prefix
and use git-remote-https as the way to download the data at that URI.
Otherwise, check to see if file:// is present and modify the prefix
accordingly.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Derrick Stolee 5556891961 clone: add --bundle-uri option
Cloning a remote repository is one of the most expensive operations in
Git. The server can spend a lot of CPU time generating a pack-file for
the client's request. The amount of data can clog the network for a long
time, and the Git protocol is not resumable. For users with poor network
connections or are located far away from the origin server, this can be
especially painful.

Add a new '--bundle-uri' option to 'git clone' to bootstrap a clone from
a bundle. If the user is aware of a bundle server, then they can tell
Git to bootstrap the new repository with these bundles before fetching
the remaining objects from the origin server.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Derrick Stolee 53a50892be bundle-uri: create basic file-copy logic
Before implementing a way to fetch bundles into a repository, create the
basic logic. Assume that the URI is actually a file path. Future logic
will make this more careful to other protocols.

For now, we also only succeed if the content at the URI is a bundle
file, not a bundle list. Bundle lists will be implemented in a future
change.

Note that the discovery of a temporary filename is slightly racy because
the odb_mkstemp() relies on the temporary file not existing. With the
current implementation being limited to file copies, we could replace
the copy_file() with copy_fd(). The tricky part comes in future changes
that send the filename to 'git remote-https' and its 'get' capability.
At that point, we need the file descriptor closed _and_ the file
unlinked. If we were to keep the file descriptor open for the sake of
normal file copies, then we would pollute the rest of the code for
little benefit. This is especially the case because we expect that most
bundle URI use will be based on HTTPS instead of file copies.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Derrick Stolee b5624a4474 remote-curl: add 'get' capability
A future change will want a way to download a file over HTTP(S) using
the simplest of download mechanisms. We do not want to assume that the
server on the other side understands anything about the Git protocol but
could be a simple static web server.

Create the new 'get' capability for the remote helpers which advertises
that the 'get' command is avalable. A caller can send a line containing
'get <url> <path>' to download the file at <url> into the file at
<path>.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Derrick Stolee d06ed85dcb bundle-uri: add example bundle organization
The previous change introduced the bundle URI design document. It
creates a flexible set of options that allow bundle providers many ways
to organize Git object data and speed up clones and fetches. It is
particularly important that we have flexibility so we can apply future
advancements as new ideas for efficiently organizing Git data are
discovered.

However, the design document does not provide even an example of how
bundles could be organized, and that makes it difficult to envision how
the feature should work at the end of the implementation plan.

Add a section that details how a bundle provider could work, including
using the Git server advertisement for multiple geo-distributed servers.
This organization is based on the GVFS Cache Servers which have
successfully used similar ideas to provide fast object access and
reduced server load for very large repositories.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:03:17 -07:00
Derrick Stolee 2da14fad8f docs: document bundle URI standard
Introduce the idea of bundle URIs to the Git codebase through an
aspirational design document. This document includes the full design
intended to include the feature in its fully-implemented form. This will
take several steps as detailed in the Implementation Plan section.

By committing this document now, it can be used to motivate changes
necessary to reach these final goals. The design can still be altered as
new information is discovered.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:03:11 -07:00
Felipe Contreras 34133d9658 mergetools: vimdiff: simplify tabfirst
If we wrap the tabdo command there's no need for a separate command
call.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 12:39:39 -07:00
Felipe Contreras b6014eeac0 mergetools: vimdiff: fix single window layouts
Layouts with a single window other than "MERGED" do not work (e.g.
"LOCAL" or "MERGED+LOCAL").

This is because as the documentation of bufdo says:

    The last buffer (or where an error occurred) becomes the current
    buffer.

And we do always do bufdo the end.

Additionally, we do it only once, when it should be per tab.

Fix this by doing it once per tab right after it's created and before
any buffer is switched.

Cc: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 12:39:35 -07:00
Felipe Contreras ffcc33f6a6 mergetools: vimdiff: rework tab logic
If we treat tabs especially, the logic becomes much simpler.

Cc: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 12:39:32 -07:00
Felipe Contreras 60184ab4d3 mergetools: vimdiff: fix for diffopt
When diffopt has hiddenoff set and there's only one window (as is the
case in the single window mode) the diff mode is turned off.

We don't want that, so turn that option off.

Cc: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 12:39:28 -07:00
Felipe Contreras 66dd83ad09 mergetools: vimdiff: silence annoying messages
When using the single window mode we are greeted with the following
warning:

  "./content_LOCAL_8975" 6L, 28B
  "./content_BASE_8975" 6 lines, 29 bytes
  "./content_REMOTE_8975" 6 lines, 29 bytes
  "content" 16 lines, 115 bytes
  Press ENTER or type command to continue

every time.

Silence that.

Suggested-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 12:39:24 -07:00
Felipe Contreras 79db50d821 mergetools: vimdiff: make vimdiff3 actually work
When vimdiff3 was added in 7c147b77d3 (mergetools: add vimdiff3 mode,
2014-04-20), the description made clear the intention:

    It's similar to the default, except that the other windows are
    hidden.  This ensures that removed/added colors are still visible on
    the main merge window, but the other windows not visible.

However, in 0041797449 (vimdiff: new implementation with layout support,
2022-03-30) this was broken by generating a command that never creates
windows, and therefore vim never shows the diff.

The layout support implementation broke the whole purpose of vimdiff3,
and simply shows MERGED, which is no different from simply opening the
file with vim.

In order to show the diff, the windows need to be created first, and
then when they are hidden the diff remains (if hidenoff isn't set), but
by setting the `hidden` option the initial buffers are marked as hidden
thus making the feature work.

Suggested-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 12:39:17 -07:00
Felipe Contreras d619183710 mergetools: vimdiff: fix comment
The name of the variable is wrong, and it can be set to anything, like
1.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
Reviewed-by: Fernando Ramos <greenfoo@u92.eu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 12:39:07 -07:00
Philip Oakley efae7ce692 doc add: renormalize is not idempotent for CRCRLF
Bug report
 https://lore.kernel.org/git/AM0PR02MB56357CC96B702244F3271014E8DC9@AM0PR02MB5635.eurprd02.prod.outlook.com/
noted that a file containing /r/r/n needed renormalising twice.

This is by design. Lone CR characters, not paired with an LF, are left
unchanged. Note this limitation of the "clean" filter in the documentation.

Renormalize was introduced at 9472935d81 (add: introduce "--renormalize",
Torsten Bögershausen, 2017-11-16)

Signed-off-by: Philip Oakley <philipoakley@iee.email>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 11:26:38 -07:00
Shaoxuan Yuan ede241c715 rm: integrate with sparse-index
Enable the sparse index within the `git-rm` command.

The `p2000` tests demonstrate a ~92% execution time reduction for
'git rm' using a sparse index.

Test                              HEAD~1            HEAD
--------------------------------------------------------------------------
2000.74: git rm ... (full-v3)     0.41(0.37+0.05)   0.43(0.36+0.07) +4.9%
2000.75: git rm ... (full-v4)     0.38(0.34+0.05)   0.39(0.35+0.05) +2.6%
2000.76: git rm ... (sparse-v3)   0.57(0.56+0.01)   0.05(0.05+0.00) -91.2%
2000.77: git rm ... (sparse-v4)   0.57(0.55+0.02)   0.03(0.03+0.00) -94.7%

----
Also, normalize a behavioral difference of `git-rm` under sparse-index.
See related discussion [1].

`git-rm` a sparse-directory entry within a sparse-index enabled repo
behaves differently from a sparse directory within a sparse-checkout
enabled repo.

For example, in a sparse-index repo, where 'folder1' is a
sparse-directory entry, `git rm -r --sparse folder1` provides this:

        rm 'folder1/'

Whereas in a sparse-checkout repo *without* sparse-index, doing so
provides this:

        rm 'folder1/0/0/0'
        rm 'folder1/0/1'
        rm 'folder1/a'

Because `git rm` a sparse-directory entry does not need to expand the
index, therefore we should accept the current behavior, which is faster
than "expand the sparse-directory entry to match the sparse-checkout
situation".

Modify a previous test so such difference is not considered as an error.

[1] https://github.com/ffyuanda/git/pull/6#discussion_r934861398

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Shaoxuan Yuan bcf96cfca6 rm: expand the index only when necessary
Remove the `ensure_full_index()` method so `git-rm` does not always
expand the index when the expansion is unnecessary, i.e. when
<pathspec> does not have any possibilities to match anything outside
of sparse-checkout definition.

Expand the index when the <pathspec> needs an expanded index, i.e. the
<pathspec> contains wildcard that may need a full-index or the
<pathspec> is simply outside of sparse-checkout definition.

Notice that the test 'rm pathspec expands index when necessary' in
t1092 *is* testing this code change behavior, though it will be marked
as 'test_expect_success' only in the next patch, where we officially
mark `command_requires_full_index = 0`, so the index does not expand
unless we tell it to do so.

Notice that because we also want `ensure_full_index` to record the
stdout and stderr from Git command, a corresponding modification
is also included in this patch. The reason we want the "sparse-index-out"
and "sparse-index-err", is that we need to make sure there is no error
from Git command itself, so we can rely on the `test_region` result
and determine if the index is expanded or not.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Shaoxuan Yuan b29ad38322 pathspec.h: move pathspec_needs_expanded_index() from reset.c to here
Method pathspec_needs_expanded_index() in reset.c from 4d1cfc1351
(reset: make --mixed sparse-aware, 2021-11-29) is reusable when we
need to verify if the index needs to be expanded when the command
is utilizing a pathspec rather than a literal path.

Move it to pathspec.h for reusability.

Add a few items to the function so it can better serve its purpose as
a standalone public function:

* Add a check in front so if the index is not sparse, return early since
  no expansion is needed.

* It now takes an arbitrary 'struct index_state' pointer instead of
  using `the_index` and `active_cache`.

* Add documentation to the function.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Shaoxuan Yuan ba808251aa t1092: add tests for git-rm
Add tests for `git-rm`, make sure it behaves as expected when
<pathspec> is both inside or outside of sparse-checkout definition.

Helped-by: Victoria Dye <vdye@github.com>
Helped-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:23:26 -07:00
Junio C Hamano 3f61790678 Merge branch 'vd/sparse-reset-checkout-fixes' into sy/sparse-rm
* vd/sparse-reset-checkout-fixes:
  unpack-trees: unpack new trees as sparse directories
  cache.h: create 'index_name_pos_sparse()'
  oneway_diff: handle removed sparse directories
  checkout: fix nested sparse directory diff in sparse index
2022-08-08 13:23:06 -07:00
Victoria Dye b15207b8cf unpack-trees: unpack new trees as sparse directories
If 'unpack_single_entry()' is unpacking a new directory tree (that is, one
not already present in the index) into a sparse index, unpack the tree as a
sparse directory rather than traversing its contents and unpacking each file
individually. This helps keep the sparse index as collapsed as possible in
cases such as 'git reset --hard' restoring a outside-of-cone directory
removed with 'git rm -r --sparse'.

Without this patch, 'unpack_single_entry()' will only unpack a directory
into the index as a sparse directory (rather than traversing into it and
unpacking its files one-by-one) if an entry with the same name already
exists in the index. This patch allows sparse directory unpacking without a
matching index entry when the following conditions are met:

1. the directory's path is outside the sparse cone, and
2. there are no children of the directory in the index

If a directory meets these requirements (as determined by
'is_new_sparse_dir()'), 'unpack_single_entry()' unpacks the sparse directory
index entry and propagates the decision back up to 'unpack_callback()' to
prevent unnecessary tree traversal into the unpacked directory.

Reported-by: Shaoxuan Yuan <shaoxuan.yuan02@gmail.com>
Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:21:50 -07:00
Victoria Dye 9553aa0f6c cache.h: create 'index_name_pos_sparse()'
Add 'index_name_pos_sparse()', which behaves the same as 'index_name_pos()',
except that it does not expand a sparse index to search for an entry inside
a sparse directory.

'index_entry_exists()' was originally implemented in 20ec2d034c (reset: make
sparse-aware (except --mixed), 2021-11-29) as an alternative to
'index_name_pos()' to allow callers to search for an index entry without
expanding a sparse index. However, that particular use case only required
knowing whether the requested entry existed, so 'index_entry_exists()' does
not return the index positioning information provided by 'index_name_pos()'.

This patch implements 'index_name_pos_sparse()' to accommodate callers that
need the positioning information of 'index_name_pos()', but do not want to
expand the index.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:21:50 -07:00
Victoria Dye 56d8a27124 oneway_diff: handle removed sparse directories
Update 'do_oneway_diff()' to perform a 'diff_tree_oid()' on removed sparse
directories, as it does for added or modified sparse directories (see
9eb00af562 (diff-lib: handle index diffs with sparse dirs, 2021-07-14)).

At the moment, this update is unreachable code because 'unpack_trees()'
(currently the only way 'oneway_diff()' can be called, via 'diff_cache()')
will always traverse trees down to the individual removed files of a deleted
sparse directory. A subsequent patch will change this to better preserve a
sparse index in other uses of 'unpack_tree()', e.g. 'git reset --hard'.
However, making that change without this patch would result in (among other
issues) 'git status' printing only the name of a deleted sparse directory,
not its contents. To avoid introducing that bug, 'do_oneway_diff()' is
updated before modifying 'unpack_trees()'.

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:21:49 -07:00
Victoria Dye 49ff3cb90f checkout: fix nested sparse directory diff in sparse index
Add the 'recursive' diff flag to the local changes reporting done by 'git
checkout' in 'show_local_changes()'. Without the flag enabled, unexpanded
sparse directories will not be recursed into to report the diff of each
file's contents, resulting in the reported local changes including
"modified" sparse directories.

The same issue was found and fixed for 'git status' in 2c521b0e49 (status:
fix nested sparse directory diff in sparse index, 2022-03-01)

Signed-off-by: Victoria Dye <vdye@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:21:49 -07:00
Junio C Hamano c50926e1f4 The eleventh batch
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-08 13:13:14 -07:00
Junio C Hamano bac92b1f39 Merge branch 'js/ort-clean-up-after-failed-merge'
Plug memory leaks in the failure code path in the "merge-ort" merge
strategy backend.

* js/ort-clean-up-after-failed-merge:
  merge-ort: do leave trace2 region even if checkout fails
  merge-ort: clean up after failed merge
2022-08-08 13:13:14 -07:00
Junio C Hamano b9654bee99 Merge branch 'jk/struct-zero-init-with-older-gcc'
Older gcc with -Wall complains about the universal zero initializer
"struct s = { 0 };" idiom, which makes developers' lives
inconvenient (as -Werror is enabled by DEVELOPER=YesPlease).  The
build procedure has been tweaked to help these compilers.

* jk/struct-zero-init-with-older-gcc:
  config.mak.dev: squelch -Wno-missing-braces for older gcc
2022-08-08 13:13:14 -07:00
Junio C Hamano 1b53bea29a Merge branch 'js/t5351-freebsd-fix'
Some tests assumed that core.fsyncMethod=batch is supported
everywhere, which broke FreeBSD.

* js/t5351-freebsd-fix:
  t5351: avoid using `test_cmp` for binary data
  t5351: avoid relying on `core.fsyncMethod = batch` to be supported
2022-08-08 13:13:14 -07:00
Junio C Hamano 6c5fbd866c Merge branch 'js/lstat-mingw-enotdir-fix'
Fix to lstat() emulation on Windows.

* js/lstat-mingw-enotdir-fix:
  lstat(mingw): correctly detect ENOTDIR scenarios
2022-08-08 13:13:14 -07:00