Commit graph

302 commits

Author SHA1 Message Date
Junio C Hamano
eba498a774 Merge branch 'jk/libcurl-8.7-regression-workaround'
Fix was added to work around a regression in libcURL 8.7.0 (which has
already been fixed in their tip of the tree).

* jk/libcurl-8.7-regression-workaround:
  remote-curl: add Transfer-Encoding header only for older curl
  INSTALL: bump libcurl version to 7.21.3
  http: reset POSTFIELDSIZE when clearing curl handle
2024-04-15 14:11:44 -07:00
Jeff King
92a209bf24 remote-curl: add Transfer-Encoding header only for older curl
As of curl 7.66.0, we don't need to manually specify a "chunked"
Transfer-Encoding header. Instead, modern curl deduces the need for it
in a POST that has a POSTFIELDSIZE of -1 and uses READFUNCTION rather
than POSTFIELDS.

That version is recent enough that we can't just drop the header; we
need to do so conditionally. Since it's only a single line, it seems
like the simplest thing would just be to keep setting it unconditionally
(after all, the #ifdefs are much longer than the actual code). But
there's another wrinkle: HTTP/2.

Curl may choose to use HTTP/2 under the hood if the server supports it.
And in that protocol, we do not use the chunked encoding for streaming
at all. Most versions of curl handle this just fine by recognizing and
removing the header. But there's a regression in curl 8.7.0 and 8.7.1
where it doesn't, and large requests over HTTP/2 are broken (which t5559
notices). That regression has since been fixed upstream, but not yet
released.

Make the setting of this header conditional, which will let Git work
even with those buggy curl versions. And as a bonus, it serves as a
reminder that we can eventually clean up the code as we bump the
supported curl versions.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-04-05 14:45:19 -07:00
Jeff King
d6f6b433a8 transport-helper: drop "object-format <algo>" option
The documentation in gitremote-helpers.txt claims that helpers should
accept an object-format option from Git whose value is either:

  1. "true", in which case the helper is merely told that Git
     understands the special ":object-format" response, and will send it

  2. an algorithm name that the helper should use

However, Git has never sent the second form, and it's not clear if it
would ever be useful.

When interacting with a remote Git repository, we generally discover
what _their_ object format is, and then decide what to do with a
mismatch (where that is currently just "bail out", but could eventually
be on-the-fly conversion and interop). And that is true for native
protocols, but also for transport helpers like remote-curl that talk to
remote Git repositories.  There we send back an ":object-format" line
telling Git what remote-curl detected on the other side.

And this is true even for pushes (since we get it via receive-pack's
advertisement). And it is even true for dumb-http, as we guess at the
algorithm based on the hash size, due to ac093d0790 (remote-curl: detect
algorithm for dumb HTTP by size, 2020-06-19).

The one case where it _isn't_ true is dumb-http talking to an empty
repository. There we have no clue what the remote hash is, so
remote-curl just sends back its default. If we kept the "object-format
<algo>" form then in theory Git could say "object-format sha256" to
change that default. But it doesn't really accomplish anything. We still
may or may not be mis-matched with the other side. For a fetch that's
OK, since it's by definition a noop. For a push into an empty
repository, it might matter (though the dumb http-push DAV code seems
happy to clobber a remote sha256 info/refs and corrupt the repository).
If we want to pursue making this work, I think we'd be better off
improving detection of the object format of empty repositories over
dumb-http (e.g., an "info/object-format" file).

But what about helpers that _aren't_ talking to another Git repo?
Consider something like git-cinnabar, which is converting on the fly
to/from hg. Most of the heavy lifting is done by fast-import/export, but
some oids may still pass between Git and the helper. Could
"object-format <algo>" be useful to tell the helper what oids we expect
to see?

Possibly, but in practice this isn't necessary. Git-cinnabar for example
already peeks at the local-repo .git/config to check its object-format
(and currently just bails if it is sha256).

So I think the "object-format" extension really is only useful for the
helper telling Git what object-format it found, and not the other way
around.

Note that this patch can't break any remote helpers; we're not changing
the code on the Git side at all, but just bringing the documentation in
line with what Git has always done. It does remove the receiving support
in remote-curl.c, but that code was never actually triggered.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-03-20 10:01:27 -07:00
Junio C Hamano
fa50e7a8a0 Merge branch 'jx/remote-archive-over-smart-http'
"git archive --remote=<remote>" learned to talk over the smart
http (aka stateless) transport.

* jx/remote-archive-over-smart-http:
  transport-helper: call do_take_over() in process_connect
  transport-helper: call do_take_over() in connect_helper
  http-backend: new rpc-service for git-upload-archive
  transport-helper: protocol v2 supports upload-archive
  remote-curl: supports git-upload-archive service
  transport-helper: no connection restriction in connect_helper
2024-01-30 13:34:12 -08:00
Jiang Xin
23b7d59a82 remote-curl: supports git-upload-archive service
Add new service (git-upload-archive) support in remote-curl, so we can
support remote archive over HTTP/HTTPS protocols. Differences between
git-upload-archive and other services:

 1. The git-archive program does not expect to see protocol version and
    capabilities when connecting to remote-helper, so do not send them
    in remote-curl for the git-upload-archive service.

 2. We need to detect protocol version by calling discover_refs().
    Fallback to use the git-upload-pack service (which, like
    git-upload-archive, is a read-only operation) to discover protocol
    version.

Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2024-01-22 07:54:37 -08:00
Junio C Hamano
492ee03f60 Merge branch 'en/header-cleanup'
Remove unused header "#include".

* en/header-cleanup:
  treewide: remove unnecessary includes in source files
  treewide: add direct includes currently only pulled in transitively
  trace2/tr2_tls.h: remove unnecessary include
  submodule-config.h: remove unnecessary include
  pkt-line.h: remove unnecessary include
  line-log.h: remove unnecessary include
  http.h: remove unnecessary include
  fsmonitor--daemon.h: remove unnecessary includes
  blame.h: remove unnecessary includes
  archive.h: remove unnecessary include
  treewide: remove unnecessary includes in source files
  treewide: remove unnecessary includes from header files
2024-01-08 14:05:15 -08:00
Elijah Newren
d57c671a51 treewide: remove unnecessary includes in source files
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:33 -08:00
Elijah Newren
f25e65e0fe http.h: remove unnecessary include
The unnecessary include in the header transitively pulled in some
other headers actually needed by source files, though.  Have those
source files explicitly include the headers they need.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:32 -08:00
Elijah Newren
eea0e59ffb treewide: remove unnecessary includes in source files
Each of these were checked with
   gcc -E -I. ${SOURCE_FILE} | grep ${HEADER_FILE}
to ensure that removing the direct inclusion of the header actually
resulted in that header no longer being included at all (i.e. that
no other header pulled it in transitively).

...except for a few cases where we verified that although the header
was brought in transitively, nothing from it was directly used in
that source file.  These cases were:
  * builtin/credential-cache.c
  * builtin/pull.c
  * builtin/send-pack.c

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-26 12:04:31 -08:00
Patrick Steinhardt
bab2283ec6 remote-curl: rediscover repository when fetching refs
The reftable format encodes the hash function used by the repository
inside of its tables. The reftable backend thus needs to be initialized
with the correct hash function right from the start, or otherwise we may
end up writing tables with the wrong hash function. But git-clone(1)
initializes the reference database before learning about the hash
function used by the remote repository, which has never been a problem
with the reffiles backend.

To fix this, we'll have to change git-clone(1) to be more careful and
only create the reference backend once it learned about the remote hash
function. This creates a problem for git-remote-curl(1), which will then
be spawned at a time where the repository is not yet fully-initialized.
Consequentially, git-remote-curl(1) will fail to detect the repository,
which eventually causes it to error out once it is asked to fetch remote
objects.

We can address this issue by trying to re-discover the Git repository in
case none was detected at startup time. With this change, the clone will
look as following:

  1. git-clone(1) sets up the initial repository, excluding the
     reference database.

  2. git-clone(1) spawns git-remote-curl(1), which will be unable to
     detect the repository due to a missing "HEAD".

  3. git-clone(1) asks git-remote-curl(1) to list remote references.
     This works just fine as this step does not require a local
     repository

  4. git-clone(1) creates the reference database as it has now learned
     about the hash function.

  5. git-clone(1) asks git-remote-curl(1) to fetch the remote packfile.
     The latter notices that it doesn't have a repository available, but
     it now knows to try and re-discover it.

If the re-discovery succeeds in the last step we can continue with the
clone.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-12-12 11:16:54 -08:00
Junio C Hamano
ce481ac8b3 Merge branch 'cw/compat-util-header-cleanup'
Further shuffling of declarations across header files to streamline
file dependencies.

* cw/compat-util-header-cleanup:
  git-compat-util: move alloc macros to git-compat-util.h
  treewide: remove unnecessary includes for wrapper.h
  kwset: move translation table from ctype
  sane-ctype.h: create header for sane-ctype macros
  git-compat-util: move wrapper.c funcs to its header
  git-compat-util: move strbuf.c funcs to its header
2023-07-17 11:30:42 -07:00
René Scharfe
3e81b896f7 pkt-line: add size parameter to packet_length()
hex2chr() takes care not to run over the end of a NUL-terminated string.
It's used in packet_length(), but both callers of that function pass a
four-byte buffer, making NUL-checks unnecessary.  packet_length() could
accidentally be used with a pointer to a buffer of unknown size at new
call-sites, though, and the compiler wouldn't complain.

Add a size parameter plus check, and remove the NUL-checks by calling
hexval() directly.  This trades three NUL checks against one size check
and the ability to report the use of a short buffer at runtime.

If any of the four bytes is NUL or -- more generally -- not a
hexadecimal digit, then packet_length() still returns a negative value.

Signed-off-by: René Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-07 15:30:16 -07:00
Calvin Wan
91c080dff5 git-compat-util: move alloc macros to git-compat-util.h
alloc_nr, ALLOC_GROW, and ALLOC_GROW_BY are commonly used macros for
dynamic array allocation. Moving these macros to git-compat-util.h with
the other alloc macros focuses alloc.[ch] to allocation for Git objects
and additionally allows us to remove inclusions to alloc.h from files
that solely used the above macros.

Signed-off-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-07-05 11:42:31 -07:00
Junio C Hamano
0807e57807 Merge branch 'en/header-split-cache-h'
Header clean-up.

* en/header-split-cache-h: (24 commits)
  protocol.h: move definition of DEFAULT_GIT_PORT from cache.h
  mailmap, quote: move declarations of global vars to correct unit
  treewide: reduce includes of cache.h in other headers
  treewide: remove double forward declaration of read_in_full
  cache.h: remove unnecessary includes
  treewide: remove cache.h inclusion due to pager.h changes
  pager.h: move declarations for pager.c functions from cache.h
  treewide: remove cache.h inclusion due to editor.h changes
  editor: move editor-related functions and declarations into common file
  treewide: remove cache.h inclusion due to object.h changes
  object.h: move some inline functions and defines from cache.h
  treewide: remove cache.h inclusion due to object-file.h changes
  object-file.h: move declarations for object-file.c functions from cache.h
  treewide: remove cache.h inclusion due to git-zlib changes
  git-zlib: move declarations for git-zlib functions from cache.h
  treewide: remove cache.h inclusion due to object-name.h changes
  object-name.h: move declarations for object-name.c functions from cache.h
  treewide: remove unnecessary cache.h inclusion
  treewide: be explicit about dependence on mem-pool.h
  treewide: be explicit about dependence on oid-array.h
  ...
2023-04-25 13:56:20 -07:00
Elijah Newren
74ea5c9574 treewide: be explicit about dependence on trace.h & trace2.h
Dozens of files made use of trace and trace2 functions, without
explicitly including trace.h or trace2.h.  This made it more difficult
to find which files could remove a dependence on cache.h.  Make C files
explicitly include trace.h or trace2.h if they are using them.

Signed-off-by: Elijah Newren <newren@gmail.com>
Acked-by: Calvin Wan <calvinwan@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-04-11 08:52:08 -07:00
Junio C Hamano
6047b28eb7 Merge branch 'en/header-split-cleanup'
Split key function and data structure definitions out of cache.h to
new header files and adjust the users.

* en/header-split-cleanup:
  csum-file.h: remove unnecessary inclusion of cache.h
  write-or-die.h: move declarations for write-or-die.c functions from cache.h
  treewide: remove cache.h inclusion due to setup.h changes
  setup.h: move declarations for setup.c functions from cache.h
  treewide: remove cache.h inclusion due to environment.h changes
  environment.h: move declarations for environment.c functions from cache.h
  treewide: remove unnecessary includes of cache.h
  wrapper.h: move declarations for wrapper.c functions from cache.h
  path.h: move function declarations for path.c functions from cache.h
  cache.h: remove expand_user_path()
  abspath.h: move absolute path functions from cache.h
  environment: move comment_line_char from cache.h
  treewide: remove unnecessary cache.h inclusion from several sources
  treewide: remove unnecessary inclusion of gettext.h
  treewide: be explicit about dependence on gettext.h
  treewide: remove unnecessary cache.h inclusion from a few headers
2023-04-06 13:38:31 -07:00
Junio C Hamano
f879501ad0 Merge branch 'jk/fix-proto-downgrade-to-v0'
Transports that do not support protocol v2 did not correctly fall
back to protocol v0 under certain conditions, which has been
corrected.

* jk/fix-proto-downgrade-to-v0:
  git_connect(): fix corner cases in downgrading v2 to v0
2023-03-28 10:51:52 -07:00
Elijah Newren
d48be35ca6 write-or-die.h: move declarations for write-or-die.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren
e38da487cc setup.h: move declarations for setup.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:54 -07:00
Elijah Newren
32a8f51061 environment.h: move declarations for environment.c functions from cache.h
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:53 -07:00
Elijah Newren
f394e093df treewide: be explicit about dependence on gettext.h
Dozens of files made use of gettext functions, without explicitly
including gettext.h.  This made it more difficult to find which files
could remove a dependence on cache.h.  Make C files explicitly include
gettext.h if they are using it.

However, while compat/fsmonitor/fsm-ipc-darwin.c should also gain an
include of gettext.h, it was left out to avoid conflicting with an
in-flight topic.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-21 10:56:51 -07:00
Jeff King
eaa0fd6584 git_connect(): fix corner cases in downgrading v2 to v0
There's code in git_connect() that checks whether we are doing a push
with protocol_v2, and if so, drops us to protocol_v0 (since we know
how to do v2 only for fetches). But it misses some corner cases:

  1. it checks the "prog" variable, which is actually the path to
     receive-pack on the remote side. By default this is just
     "git-receive-pack", but it could be an arbitrary string (like
     "/path/to/git receive-pack", etc). We'd accidentally stay in v2
     mode in this case.

  2. besides "receive-pack" and "upload-pack", there's one other value
     we'd expect: "upload-archive" for handling "git archive --remote".
     Like receive-pack, this doesn't understand v2, and should use the
     v0 protocol.

In practice, neither of these causes bugs in the real world so far. We
do send a "we understand v2" probe to the server, but since no server
implements v2 for anything but upload-pack, it's simply ignored. But
this would eventually become a problem if we do implement v2 for those
endpoints, as older clients would falsely claim to understand it,
leading to a server response they can't parse.

We can fix (1) by passing in both the program path and the "name" of the
operation. I treat the name as a string here, because that's the pattern
set in transport_connect(), which is one of our callers (we were simply
throwing away the "name" value there before).

We can fix (2) by allowing only known-v2 protocols ("upload-pack"),
rather than blocking unknown ones ("receive-pack" and "upload-archive").
That will mean whoever eventually implements v2 push will have to adjust
this list, but that's reasonable. We'll do the safe, conservative thing
(sticking to v0) by default, and anybody working on v2 will quickly
realize this spot needs to be updated.

The new tests cover the receive-pack and upload-archive cases above, and
re-confirm that we allow v2 with an arbitrary "--upload-pack" path (that
already worked before this patch, of course, but it would be an easy
thing to break if we flipped the allow/block logic without also handling
"name" separately).

Here are a few miscellaneous implementation notes, since I had to do a
little head-scratching to understand who calls what:

  - transport_connect() is called only for git-upload-archive. For
    non-http git remotes, that resolves to the virtual connect_git()
    function (which then calls git_connect(); confused yet?). So
    plumbing through "name" in connect_git() covers that.

  - for regular fetches and pushes, callers use higher-level functions
    like transport_fetch_refs(). For non-http git remotes, that means
    calling git_connect() under the hood via connect_setup(). And that
    uses the "for_push" flag to decide which name to use.

  - likewise, plumbing like fetch-pack and send-pack may call
    git_connect() directly; they each know which name to use.

  - for remote helpers (including http), we already have separate
    parameters for "name" and "exec" (another name for "prog"). In
    process_connect_service(), we feed the "name" to the helper via
    "connect" or "stateless-connect" directives.

    There's also a "servpath" option, which can be used to tell the
    helper about the "exec" path. But no helpers we implement support
    it! For http it would be useless anyway (no reasonable server
    implementation will allow you to send a shell command to run the
    server). In theory it would be useful for more obscure helpers like
    remote-ext, but even there it is not implemented.

    It's tempting to get rid of it simply to reduce confusion, but we
    have publicly documented it since it was added in fa8c097cc9
    (Support remote helpers implementing smart transports, 2009-12-09),
    so it's possible some helper in the wild is using it.

  - So for v2, helpers (again, including http) are mainly used via
    stateless-connect, driven by the main program. But they do still
    need to decide whether to do a v2 probe. And so there's similar
    logic in remote-curl.c's discover_refs() that looks for
    "git-receive-pack". But it's not buggy in the same way. Since it
    doesn't support servpath, it is always dealing with a "service"
    string like "git-receive-pack". And since it doesn't support
    straight "connect", it can't be used for "upload-archive".

    So we could leave that spot alone. But I've updated it here to match
    the logic we're changing in connect_git(). That seems like the least
    confusing thing for somebody who has to touch both of these spots
    later (say, to add v2 push support). I didn't add a new test to make
    sure this doesn't break anything; we already have several tests (in
    t5551 and elsewhere) that make sure we are using v2 over http.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-03-17 15:15:59 -07:00
Elijah Newren
41771fa435 cache.h: remove dependence on hex.h; make other files include it explicitly
Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:29 -08:00
Elijah Newren
36bf195890 alloc.h: move ALLOC_GROW() functions from cache.h
This allows us to replace includes of cache.h with includes of the much
smaller alloc.h in many places.  It does mean that we also need to add
includes of alloc.h in a number of C files.

Signed-off-by: Elijah Newren <newren@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-02-23 17:25:28 -08:00
Johannes Schindelin
6487e9c459 Sync with 2.37.6
* maint-2.37:
  Git 2.37.6
  Git 2.36.5
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:43:28 +01:00
Johannes Schindelin
40843216c5 Sync with 2.35.7
* maint-2.35:
  Git 2.35.7
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:37:52 +01:00
Johannes Schindelin
6a53a59bf9 Sync with 2.34.7
* maint-2.34:
  Git 2.34.7
  http: support CURLOPT_PROTOCOLS_STR
  http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
  http-push: prefer CURLOPT_UPLOAD to CURLOPT_PUT
  Git 2.33.7
  Git 2.32.6
  Git 2.31.7
  Git 2.30.8
  apply: fix writing behind newly created symbolic links
  dir-iterator: prevent top-level symlinks without FOLLOW_SYMLINKS
  clone: delay picking a transport until after get_repo_path()
  t5619: demonstrate clone_local() with ambiguous transport
2023-02-06 09:29:44 +01:00
Jeff King
4bd481e0ad http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.

But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros).  But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.

Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).

Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.

Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
2023-02-06 09:27:09 +01:00
Jeff King
fe7e44e1ab http: prefer CURLOPT_SEEKFUNCTION to CURLOPT_IOCTLFUNCTION
The IOCTLFUNCTION option has been deprecated, and generates a compiler
warning in recent versions of curl. We can switch to using SEEKFUNCTION
instead. It was added in 2008 via curl 7.18.0; our INSTALL file already
indicates we require at least curl 7.19.4.

But there's one catch: curl says we should use CURL_SEEKFUNC_{OK,FAIL},
and those didn't arrive until 7.19.5. One workaround would be to use a
bare 0/1 here (or define our own macros).  But let's just bump the
minimum required version to 7.19.5. That version is only a minor version
bump from our existing requirement, and is only a 2 month time bump for
versions that are almost 13 years old. So it's not likely that anybody
cares about the distinction.

Switching means we have to rewrite the ioctl functions into seek
functions. In some ways they are simpler (seeking is the only
operation), but in some ways more complex (the ioctl allowed only a full
rewind, but now we can seek to arbitrary offsets).

Curl will only ever use SEEK_SET (per their documentation), so I didn't
bother implementing anything else, since it would naturally be
completely untested. This seems unlikely to change, but I added an
assertion just in case.

Likewise, I doubt curl will ever try to seek outside of the buffer sizes
we've told it, but I erred on the defensive side here, rather than do an
out-of-bounds read.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2023-01-17 08:03:08 -08:00
Derrick Stolee
b5624a4474 remote-curl: add 'get' capability
A future change will want a way to download a file over HTTP(S) using
the simplest of download mechanisms. We do not want to assume that the
server on the other side understands anything about the Git protocol but
could be a simple static web server.

Create the new 'get' capability for the remote helpers which advertises
that the 'get' command is avalable. A caller can send a line containing
'get <url> <path>' to download the file at <url> into the file at
<path>.

Reviewed-by: Josh Steadmon <steadmon@google.com>
Signed-off-by: Derrick Stolee <derrickstolee@github.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-08-10 14:07:37 -07:00
Li Linchao
b0c4adcdd7 remote-curl: send Accept-Language header to server
Git server end's ability to accept Accept-Language header was introduced
in f18604bbf2 (http: add Accept-Language header if possible, 2015-01-28),
but this is only used by very early phase of the transfer, which is HTTP
GET request to discover references. For other phases, like POST request
in the smart HTTP, the server does not know what language the client
speaks.

Teach git client to learn end-user's preferred language and throw
accept-language header to the server side. Once the server gets this header,
it has the ability to talk to end-user with language they understand.
This would be very helpful for many non-English speakers.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Li Linchao <lilinchao@oschina.cn>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-07-11 12:24:28 -07:00
Junio C Hamano
0f5e885173 Merge branch 'rc/fetch-refetch'
"git fetch --refetch" learned to fetch everything without telling
the other side what we already have, which is useful when you
cannot trust what you have in the local object store.

* rc/fetch-refetch:
  docs: mention --refetch fetch option
  fetch: after refetch, encourage auto gc repacking
  t5615-partial-clone: add test for fetch --refetch
  fetch: add --refetch option
  builtin/fetch-pack: add --refetch option
  fetch-pack: add refetch
  fetch-negotiator: add specific noop initializer
2022-04-04 10:56:23 -07:00
Robert Coup
869a0eb4eb builtin/fetch-pack: add --refetch option
Add a refetch option to fetch-pack to force a full fetch. Use when
applying a new partial clone filter to refetch all matching objects.

Signed-off-by: Robert Coup <robert@coup.net.nz>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-28 10:25:52 -07:00
Ævar Arnfjörð Bjarmason
b07fa8f1b2 remote-curl.c: free memory in cmd_main()
Plug a trivial memory leak in code added in a2d725b7bd (Use an
external program to implement fetching with curl, 2009-08-05).

To do this have the cmd_main() use a "goto cleanup" pattern, and to
return an error of 1 unless we can fall through to the http_cleanup()
at the end.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2022-03-04 13:24:18 -08:00
Ævar Arnfjörð Bjarmason
6def0ff878 run-command API users: use strvec_pushv(), not argv assignment
Migrate those run-command API users that assign directly to the "argv"
member to use a strvec_pushv() of "args" instead.

In these cases it did not make sense to further refactor these
callers, e.g. daemon.c could be made to construct the arguments closer
to handle(), but that would require moving the construction from its
cmd_main() and pass "argv" through two intermediate functions.

It would be possible for a change like this to introduce a regression
if we were doing:

      cp.argv = argv;
      argv[1] = "foo";

And changed the code, as is being done here, to:

      strvec_pushv(&cp.args, argv);
      argv[1] = "foo";

But as viewing this change with the "-W" flag reveals none of these
functions modify variable that's being pushed afterwards in a way that
would introduce such a logic error.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-11-25 22:15:07 -08:00
Junio C Hamano
06355d72dc Merge branch 'ab/pkt-line-cleanup'
Code clean-up.

* ab/pkt-line-cleanup:
  pkt-line.[ch]: remove unused packet_read_line_buf()
  pkt-line.[ch]: remove unused packet_buf_write_len()
2021-10-25 16:07:00 -07:00
Ævar Arnfjörð Bjarmason
ec9a37d69b pkt-line.[ch]: remove unused packet_read_line_buf()
This function was added in 4981fe750b (pkt-line: share
buffer/descriptor reading implementation, 2013-02-23), but in
01f9ec64c8 (Use packet_reader instead of packet_read_line,
2018-12-29) the code that was using it was removed.

Since it's being removed we can in turn remove the "src" and "src_len"
arguments to packet_read(), all the remaining users just passed a
NULL/NULL pair to it.

That function is only a thin wrapper for packet_read_with_status()
which still needs those arguments, but for the thin packet_read()
convenience wrapper we can do away with it for now.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-10-15 13:09:40 -07:00
Junio C Hamano
97492aacff Merge branch 'ab/http-pinned-public-key-mismatch'
HTTPS error handling updates.

* ab/http-pinned-public-key-mismatch:
  http: check CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors
2021-10-11 10:21:47 -07:00
Ævar Arnfjörð Bjarmason
abf897bacd string-list.[ch]: remove string_list_init() compatibility function
Remove this function left over to accommodate in-flight changes, see
770fedaf9f (string-list.[ch]: add a string_list_init_{nodup,dup}(),
2021-07-01) for the recent change to add
"string_list_init_{nodup,dup}()" initializers.

There was only one user of the API left in remote-curl.c. I don't know
why I didn't include this change to remote-curl.c in
bc40dfb10a (string-list.h users: change to use *_{nodup,dup}(),
2021-07-01), perhaps I just missed it.

In any case, let's change that one user to use the new API, as of
writing this there are no in-flight changes that use, so this seems
like a good time to drop this before we get any new users of this
compatibility API.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-28 14:43:38 -07:00
Ævar Arnfjörð Bjarmason
3e8084f188 http: check CURLE_SSL_PINNEDPUBKEYNOTMATCH when emitting errors
Change the error shown when a http.pinnedPubKey doesn't match to point
the http.pinnedPubKey variable added in aeff8a6121 (http: implement
public key pinning, 2016-02-15), e.g.:

    git -c http.pinnedPubKey=sha256/someNonMatchingKey ls-remote https://github.com/git/git.git
    fatal: unable to access 'https://github.com/git/git.git/' with http.pinnedPubkey configuration: SSL: public key does not match pinned public key!

Before this we'd emit the exact same thing without the " with
http.pinnedPubkey configuration". The advantage of doing this is that
we're going to get a translated message (everything after the ":" is
hardcoded in English in libcurl), and we've got a reference to the
git-specific configuration variable that's causing the error.

Unfortunately we can't test this easily, as there are no tests that
require https:// in the test suite, and t/lib-httpd.sh doesn't know
how to set up such tests. See [1] for the start of a discussion about
what it would take to have divergent "t/lib-httpd/apache.conf" test
setups. #leftoverbits

1. https://lore.kernel.org/git/YUonS1uoZlZEt+Yd@coredump.intra.peff.net/

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-09-27 10:58:07 -07:00
Junio C Hamano
e48a623dea Merge branch 'ab/http-drop-old-curl'
Support for ancient versions of cURL library (pre 7.19.4) has been
dropped.

* ab/http-drop-old-curl:
  http: rename CURLOPT_FILE to CURLOPT_WRITEDATA
  http: drop support for curl < 7.19.3 and < 7.17.0 (again)
  http: drop support for curl < 7.19.4
  http: drop support for curl < 7.16.0
  http: drop support for curl < 7.11.1
2021-08-24 15:32:36 -07:00
Ævar Arnfjörð Bjarmason
8dda4cbdf2 http: rename CURLOPT_FILE to CURLOPT_WRITEDATA
The CURLOPT_FILE name is an alias for CURLOPT_WRITEDATA, the
CURLOPT_WRITEDATA name has been preferred since curl 7.9.7, released
in May 2002[1].

1. https://curl.se/libcurl/c/CURLOPT_WRITEDATA.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 16:01:54 -07:00
Jeff King
013c7e2b07 http: drop support for curl < 7.16.0
In the last commit we dropped support for curl < 7.11.1, let's
continue that and drop support for versions older than 7.16.0. This
allows us to get rid of some now-obsolete #ifdefs.

Choosing 7.16.0 is a somewhat arbitrary cutoff:

  1. It came out in October of 2006, almost 15 years ago.
     Besides being a nice round number, around 10 years is
     a common end-of-life support period, even for conservative
     distributions.

  2. That version introduced the curl_multi interface, which
     gives us a lot of bang for the buck in removing #ifdefs

RHEL 5 came with curl 7.15.5[1] (released in August 2006). RHEL 5's
extended life cycle program ended on 2020-11-30[1]. RHEL 6 comes with
curl 7.19.7 (released in November 2009), and RHEL 7 comes with
7.29.0 (released in February 2013).

1. http://lore.kernel.org/git/873e1f31-2a96-5b72-2f20-a5816cad1b51@jupiterrise.com

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:11:15 -07:00
Jeff King
1119a15b5c http: drop support for curl < 7.11.1
Drop support for this ancient version of curl and simplify the code by
allowing us get rid of some "#ifdef"'s.

Git will not build with vanilla curl older than 7.11.1 due our use of
CURLOPT_POSTFIELDSIZE in 37ee680d9b
(http.postbuffer: allow full range of ssize_t values,
2017-04-11). This field was introduced in curl 7.11.1.

We could solve these compilation problems with more #ifdefs,
but it's not worth the trouble. Version 7.11.1 came out in
March of 2004, over 17 years ago. Let's declare that too old
and drop any existing ifdefs that go further back. One
obvious benefit is that we'll have fewer conditional bits
cluttering the code.

This patch drops all #ifdefs that reference older versions
(note that curl's preprocessor macros are in hex, so we're
looking for 070b01, not 071101).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-30 09:11:15 -07:00
Junio C Hamano
1e893a1216 Merge branch 'dl/packet-read-response-end-fix'
Error message update.

* dl/packet-read-response-end-fix:
  pkt-line: replace "stateless separator" with "response end"
2021-07-28 13:17:56 -07:00
Denton Liu
8232a0ff48 pkt-line: replace "stateless separator" with "response end"
In 0181b600a6 (pkt-line: define PACKET_READ_RESPONSE_END, 2020-05-19),
the Response End packet was defined for Git's network protocol. When the
patch was sent, it included an oversight where the error messages
referenced "stateless separator", the work-in-progress name, over
"response end", the final name chosen.

Correct these error messages by having them correctly reference
a "response end" packet.

Signed-off-by: Denton Liu <liu.denton@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-07-09 11:51:17 -07:00
Eric Wong
00bc8390d8 remote-curl: fix clone on sha256 repos
The remote-https process needs to update it's own instance of
`the_repository' when it sees an HTTP(S) remote is using sha256.
Without this, parse_oid_hex() fails to handle sha256 OIDs when
it's eventually called by parse_fetch().

Tested with:

	git clone https://yhbt.net/sha256test.git
	GIT_SMART_HTTP=0 git clone https://yhbt.net/sha256test.git
	(plain http:// also works)

Cloning the URL via git:// required no changes

Signed-off-by: Eric Wong <e@80x24.org>
Acked-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2021-05-12 12:14:44 +09:00
Srinidhi Kaushik
3b990aa645 push: parse and set flag for "--force-if-includes"
The previous commit added the necessary machinery to implement the
"--force-if-includes" protection, when "--force-with-lease" is used
without giving exact object the remote still ought to have. Surface
the feature by adding a command line option and a configuration
variable to enable it.

 - Add a flag: "TRANSPORT_PUSH_FORCE_IF_INCLUDES" to indicate that the
   new option was passed from the command line of via configuration
   settings; update command line and configuration parsers to set the
   new flag accordingly.

 - Introduce a new configuration option "push.useForceIfIncludes", which
   is equivalent to setting "--force-if-includes" in the command line.

 - Update "remote-curl" to recognize and pass this option to "send-pack"
   when enabled.

 - Update "advise" to catch the reject reason "REJECT_REF_NEEDS_UPDATE",
   set when the ref status is "REF_STATUS_REJECT_REMOTE_UPDATED" and
   (optionally) print a help message when the push fails.

 - The new option is a "no-op" in the following scenarios:
    * When used without "--force-with-lease".
    * When used with "--force-with-lease", and if the expected commit
      on the remote side is specified as an argument.

Signed-off-by: Srinidhi Kaushik <shrinidhi.kaushik@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-10-03 09:59:19 -07:00
Junio C Hamano
b4100f366c Merge branch 'jt/lazy-fetch'
Updates to on-demand fetching code in lazily cloned repositories.

* jt/lazy-fetch:
  fetch: no FETCH_HEAD display if --no-write-fetch-head
  fetch-pack: remove no_dependents code
  promisor-remote: lazy-fetch objects in subprocess
  fetch-pack: do not lazy-fetch during ref iteration
  fetch: only populate existing_refs if needed
  fetch: avoid reading submodule config until needed
  fetch: allow refspecs specified through stdin
  negotiator/noop: add noop fetch negotiator
2020-09-03 12:37:04 -07:00
Jonathan Tan
42d418df90 (various): document from_promisor parameter
88e2f9ed8e ("introduce fetch-object: fetch one promisor object",
2017-12-05) plumbed through the from_promisor parameter but did
not document it everywhere it appeared. Add the documentation.

(It also plumbed through the no_dependents parameter, but I have left
that alone because it is being removed in a commit under review [1].)

[1] https://lore.kernel.org/git/e8f16d69089a5011c355d5939c56fa53b7a1eb2d.1597184949.git.jonathantanmy@google.com/

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2020-08-24 17:30:08 -07:00