Commit graph

450 commits

Author SHA1 Message Date
Jeff King
514c5fdd03 sha1-file: modernize loose object file functions
The loose object access code in sha1-file.c is some of the oldest in
Git, and could use some modernizing. It mostly uses "unsigned char *"
for object ids, which these days should be "struct object_id".

It also uses the term "sha1_file" in many functions, which is confusing.
The term "loose_objects" is much better. It clearly distinguishes
them from packed objects (which didn't even exist back when the name
"sha1_file" came into being). And it also distinguishes it from the
checksummed-file concept in csum-file.c (which until recently was
actually called "struct sha1file"!).

This patch converts the functions {open,close,map,stat}_sha1_file() into
open_loose_object(), etc, and switches their sha1 arguments for
object_id structs. Similarly, path functions like fill_sha1_path()
become fill_loose_path() and use object_ids.

The function sha1_loose_object_info() already says "loose", so we can
just drop the "sha1" (and teach it to use object_id).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08 09:40:19 -08:00
Jeff King
f0be0db13d http: use struct object_id instead of bare sha1
The dumb-http walker code still passes around and stores object ids as
"unsigned char *sha1". Let's modernize it.

There's probably still more work to be done to handle dumb-http fetches
with a new, larger hash. But that can wait; this is enough that we can
now convert some of the low-level object routines that we call into from
here (and in fact, some of the "oid.hash" references added here will be
further improved in the next patch).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2019-01-08 09:40:19 -08:00
Junio C Hamano
3b2f8a02fa Merge branch 'jk/loose-object-cache'
Code clean-up with optimization for the codepath that checks
(non-)existence of loose objects.

* jk/loose-object-cache:
  odb_load_loose_cache: fix strbuf leak
  fetch-pack: drop custom loose object cache
  sha1-file: use loose object cache for quick existence check
  object-store: provide helpers for loose_objects_cache
  sha1-file: use an object_directory for the main object dir
  handle alternates paths the same as the main object dir
  sha1_file_name(): overwrite buffer instead of appending
  rename "alternate_object_database" to "object_directory"
  submodule--helper: prefer strip_suffix() to ends_with()
  fsck: do not reuse child_process structs
2019-01-04 13:33:32 -08:00
Junio C Hamano
13d9919298 Merge branch 'fc/http-version'
The "http.version" configuration variable can be used with recent
enough cURL library to force the version of HTTP used to talk when
fetching and pushing.

* fc/http-version:
  http: add support selecting http version
2019-01-04 13:33:32 -08:00
Jean-Noël Avila
d355e46a15 i18n: fix small typos
Translating the new strings introduced for v2.20 showed some typos.

Signed-off-by: Jean-Noël Avila <jn.avila@free.fr>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-29 15:45:31 +09:00
Jeff King
b69fb867b4 sha1_file_name(): overwrite buffer instead of appending
The sha1_file_name() function is used to generate the path to a loose
object in the object directory. It doesn't make much sense for it to
append, since the the path we write may be absolute (i.e., you cannot
reliably build up a path with it). Because many callers use it with a
static buffer, they have to strbuf_reset() manually before each call
(and the other callers always use an empty buffer, so they don't care
either way). Let's handle this automatically.

Since we're changing the semantics, let's take the opportunity to give
it a more hash-neutral name (which will also catch any callers from
topics in flight).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-13 14:22:02 +09:00
Force Charlie
d73019feb4 http: add support selecting http version
Usually we don't need to set libcurl to choose which version of the
HTTP protocol to use to communicate with a server.
But different versions of libcurl, the default value is not the same.

CURL >= 7.62.0: CURL_HTTP_VERSION_2TLS
CURL < 7.62: CURL_HTTP_VERSION_1_1

In order to give users the freedom to control the HTTP version,
we need to add a setting to choose which HTTP version to use.

Signed-off-by: Force Charlie <charlieio@outlook.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-11-09 13:21:24 +09:00
Junio C Hamano
cd69ec8cde Merge branch 'jc/http-curlver-warnings'
Warning message fix.

* jc/http-curlver-warnings:
  http: give curl version warnings consistently
2018-11-03 00:53:59 +09:00
Junio C Hamano
d7b1859732 Merge branch 'js/mingw-http-ssl'
On platforms with recent cURL library, http.sslBackend configuration
variable can be used to choose a different SSL backend at runtime.
The Windows port uses this mechanism to switch between OpenSSL and
Secure Channel while talking over the HTTPS protocol.

* js/mingw-http-ssl:
  http: when using Secure Channel, ignore sslCAInfo by default
  http: add support for disabling SSL revocation checks in cURL
  http: add support for selecting SSL backends at runtime
2018-11-03 00:53:58 +09:00
Junio C Hamano
d6b1d3b2d1 http: give curl version warnings consistently
When a requested feature cannot be activated because the version of
cURL library used to build Git with is too old, most of the codepaths
give a warning like "$Feature is not supported with cURL < $Version",
marked for l10n.  A few of them, however, did not follow that pattern
and said things like "$Feature is not activated, your curl version is
too old (>= $Version)", and without marking them for l10n.

Update these to match the style of the majority of warnings and mark
them for l10n.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-26 11:16:30 +09:00
Johannes Schindelin
b67d40adbb http: when using Secure Channel, ignore sslCAInfo by default
As of cURL v7.60.0, the Secure Channel backend can use the certificate
bundle provided via `http.sslCAInfo`, but that would override the
Windows Certificate Store. Since this is not desirable by default, let's
tell Git to not ask cURL to use that bundle by default when the `schannel`
backend was configured via `http.sslBackend`, unless
`http.schannelUseSSLCAInfo` overrides this behavior.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-26 11:15:49 +09:00
Brendan Forster
93aef7c79b http: add support for disabling SSL revocation checks in cURL
This adds support for a new http.schannelCheckRevoke config value.

This config value is only used if http.sslBackend is set to "schannel",
which forces cURL to use the Windows Certificate Store when validating
server certificates associated with a remote server.

This config value should only be set to "false" if you are in an
environment where revocation checks are blocked by the network, with
no alternative options.

This is only supported in cURL 7.44 or later.

Note: originally, we wanted to call the config setting
`http.schannel.checkRevoke`. This, however, does not work: the `http.*`
config settings can be limited to specific URLs via `http.<url>.*`
(and this feature would mistake `schannel` for a URL).

Helped by Agustín Martín Barbero.

Signed-off-by: Brendan Forster <github@brendanforster.com>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-26 11:15:49 +09:00
Johannes Schindelin
21084e84a4 http: add support for selecting SSL backends at runtime
As of version 7.56.0, curl supports being compiled with multiple SSL
backends.

This patch adds the Git side of that feature: by setting http.sslBackend
to "openssl" or "schannel", Git for Windows can now choose the SSL
backend at runtime.

This comes in handy on Windows because Secure Channel ("schannel") is
the native solution, accessing the Windows Credential Store, thereby
allowing for enterprise-wide management of certificates. For historical
reasons, Git for Windows needs to support OpenSSL still, as it has
previously been the only supported SSL backend in Git for Windows for
almost a decade.

The patch has been carried in Git for Windows for over a year, and is
considered mature.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-10-16 13:24:34 +09:00
Jeff King
67947c34ae convert "hashcmp() != 0" to "!hasheq()"
This rounds out the previous three patches, covering the
inequality logic for the "hash" variant of the functions.

As with the previous three, the accompanying code changes
are the mechanical result of applying the coccinelle patch;
see those patches for more discussion.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-29 11:32:49 -07:00
Ævar Arnfjörð Bjarmason
ce528de023 refactor various if (x) FREE_AND_NULL(x) to just FREE_AND_NULL(x)
Change the few conditional uses of FREE_AND_NULL(x) to be
unconditional. As noted in the standard[1] free(NULL) is perfectly
valid, so we might as well leave this check up to the C library.

1. http://pubs.opengroup.org/onlinepubs/9699919799/functions/free.html

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-08-17 10:08:56 -07:00
Junio C Hamano
13e8be95db Merge branch 'bw/remote-curl-compressed-responses'
Our HTTP client code used to advertise that we accept gzip encoding
from the other side; instead, just let cURL library to advertise
and negotiate the best one.

* bw/remote-curl-compressed-responses:
  remote-curl: accept compressed responses with protocol v2
  remote-curl: accept all encodings supported by curl
2018-05-30 21:51:29 +09:00
Junio C Hamano
7c3d15fe31 Merge branch 'jk/snprintf-truncation'
Avoid unchecked snprintf() to make future code auditing easier.

* jk/snprintf-truncation:
  fmt_with_err: add a comment that truncation is OK
  shorten_unambiguous_ref: use xsnprintf
  fsmonitor: use internal argv_array of struct child_process
  log_write_email_headers: use strbufs
  http: use strbufs instead of fixed buffers
2018-05-30 21:51:28 +09:00
Junio C Hamano
42c8ce1c49 Merge branch 'bc/object-id'
Conversion from uchar[20] to struct object_id continues.

* bc/object-id: (42 commits)
  merge-one-file: compute empty blob object ID
  add--interactive: compute the empty tree value
  Update shell scripts to compute empty tree object ID
  sha1_file: only expose empty object constants through git_hash_algo
  dir: use the_hash_algo for empty blob object ID
  sequencer: use the_hash_algo for empty tree object ID
  cache-tree: use is_empty_tree_oid
  sha1_file: convert cached object code to struct object_id
  builtin/reset: convert use of EMPTY_TREE_SHA1_BIN
  builtin/receive-pack: convert one use of EMPTY_TREE_SHA1_HEX
  wt-status: convert two uses of EMPTY_TREE_SHA1_HEX
  submodule: convert several uses of EMPTY_TREE_SHA1_HEX
  sequencer: convert one use of EMPTY_TREE_SHA1_HEX
  merge: convert empty tree constant to the_hash_algo
  builtin/merge: switch tree functions to use object_id
  builtin/am: convert uses of EMPTY_TREE_SHA1_BIN to the_hash_algo
  sha1-file: add functions for hex empty tree and blob OIDs
  builtin/receive-pack: avoid hard-coded constants for push certs
  diff: specify abbreviation size in terms of the_hash_algo
  upload-pack: replace use of several hard-coded constants
  ...
2018-05-30 14:04:10 +09:00
Junio C Hamano
50f08db594 Merge branch 'js/use-bug-macro'
Developer support update, by using BUG() macro instead of die() to
mark codepaths that should not happen more clearly.

* js/use-bug-macro:
  BUG_exit_code: fix sparse "symbol not declared" warning
  Convert remaining die*(BUG) messages
  Replace all die("BUG: ...") calls by BUG() ones
  run-command: use BUG() to report bugs, not die()
  test-tool: help verifying BUG() code paths
2018-05-30 14:04:07 +09:00
Brandon Williams
1a53e692af remote-curl: accept all encodings supported by curl
Configure curl to accept all encodings which curl supports instead of
only accepting gzip responses.

This fixes an issue when using an installation of curl which is built
without the "zlib" feature. Since aa90b9697 (Enable info/refs gzip
decompression in HTTP client, 2012-09-19) we end up requesting "gzip"
encoding anyway despite libcurl not being able to decode it.  Worse,
instead of getting a clear error message indicating so, we end up
falling back to "dumb" http, producing a confusing and difficult to
debug result.

Since curl doesn't do any checking to verify that it supports the a
requested encoding, instead set the curl option `CURLOPT_ENCODING` with
an empty string indicating that curl should send an "Accept-Encoding"
header containing only the encodings supported by curl.

Reported-by: Anton Golubev <anton.golubev@gmail.com>
Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-23 10:24:12 +09:00
Jeff King
390c6cbc5e http: use strbufs instead of fixed buffers
We keep the names of incoming packs and objects in fixed
PATH_MAX-size buffers, and snprintf() into them. This is
unlikely to end up with truncated filenames, but it is
possible (especially on systems where PATH_MAX is shorter
than actual paths can be). Let's switch to using strbufs,
which makes the question go away entirely.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-21 09:54:30 +09:00
Junio C Hamano
9bfa0f9be3 Merge branch 'bw/protocol-v2'
The beginning of the next-gen transfer protocol.

* bw/protocol-v2: (35 commits)
  remote-curl: don't request v2 when pushing
  remote-curl: implement stateless-connect command
  http: eliminate "# service" line when using protocol v2
  http: don't always add Git-Protocol header
  http: allow providing extra headers for http requests
  remote-curl: store the protocol version the server responded with
  remote-curl: create copy of the service name
  pkt-line: add packet_buf_write_len function
  transport-helper: introduce stateless-connect
  transport-helper: refactor process_connect_service
  transport-helper: remove name parameter
  connect: don't request v2 when pushing
  connect: refactor git_connect to only get the protocol version once
  fetch-pack: support shallow requests
  fetch-pack: perform a fetch using v2
  upload-pack: introduce fetch server command
  push: pass ref prefixes when pushing
  fetch: pass ref prefixes when fetching
  ls-remote: pass ref prefixes when requesting a remote's refs
  transport: convert transport_get_remote_refs to take a list of ref prefixes
  ...
2018-05-08 15:59:16 +09:00
Johannes Schindelin
033abf97fc Replace all die("BUG: ...") calls by BUG() ones
In d8193743e0 (usage.c: add BUG() function, 2017-05-12), a new macro
was introduced to use for reporting bugs instead of die(). It was then
subsequently used to convert one single caller in 588a538ae5
(setup_git_env: convert die("BUG") to BUG(), 2017-05-12).

The cover letter of the patch series containing this patch
(cf 20170513032414.mfrwabt4hovujde2@sigill.intra.peff.net) is not
terribly clear why only one call site was converted, or what the plan
is for other, similar calls to die() to report bugs.

Let's just convert all remaining ones in one fell swoop.

This trick was performed by this invocation:

	sed -i 's/die("BUG: /BUG("/g' $(git grep -l 'die("BUG' \*.c)

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-06 19:06:13 +09:00
brian m. carlson
dd724bcb2f http: eliminate hard-coded constants
Use the_hash_algo to find the right size for parsing pack names.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-05-02 13:59:51 +09:00
Junio C Hamano
9b59d8869d Merge branch 'lv/tls-1.3'
When built with more recent cURL, GIT_SSL_VERSION can now specify
"tlsv1.3" as its value.

* lv/tls-1.3:
  http: allow use of TLS 1.3
2018-04-11 13:09:57 +09:00
Junio C Hamano
3a1ec60c43 Merge branch 'sb/packfiles-in-repository'
Refactoring of the internal global data structure continues.

* sb/packfiles-in-repository:
  packfile: keep prepare_packed_git() private
  packfile: allow find_pack_entry to handle arbitrary repositories
  packfile: add repository argument to find_pack_entry
  packfile: allow reprepare_packed_git to handle arbitrary repositories
  packfile: allow prepare_packed_git to handle arbitrary repositories
  packfile: allow prepare_packed_git_one to handle arbitrary repositories
  packfile: add repository argument to reprepare_packed_git
  packfile: add repository argument to prepare_packed_git
  packfile: add repository argument to prepare_packed_git_one
  packfile: allow install_packed_git to handle arbitrary repositories
  packfile: allow rearrange_packed_git to handle arbitrary repositories
  packfile: allow prepare_packed_git_mru to handle arbitrary repositories
2018-04-11 13:09:55 +09:00
Junio C Hamano
cf0b1793ea Merge branch 'sb/object-store'
Refactoring the internal global data structure to make it possible
to open multiple repositories, work with and then close them.

Rerolled by Duy on top of a separate preliminary clean-up topic.
The resulting structure of the topics looked very sensible.

* sb/object-store: (27 commits)
  sha1_file: allow sha1_loose_object_info to handle arbitrary repositories
  sha1_file: allow map_sha1_file to handle arbitrary repositories
  sha1_file: allow map_sha1_file_1 to handle arbitrary repositories
  sha1_file: allow open_sha1_file to handle arbitrary repositories
  sha1_file: allow stat_sha1_file to handle arbitrary repositories
  sha1_file: allow sha1_file_name to handle arbitrary repositories
  sha1_file: add repository argument to sha1_loose_object_info
  sha1_file: add repository argument to map_sha1_file
  sha1_file: add repository argument to map_sha1_file_1
  sha1_file: add repository argument to open_sha1_file
  sha1_file: add repository argument to stat_sha1_file
  sha1_file: add repository argument to sha1_file_name
  sha1_file: allow prepare_alt_odb to handle arbitrary repositories
  sha1_file: allow link_alt_odb_entries to handle arbitrary repositories
  sha1_file: add repository argument to prepare_alt_odb
  sha1_file: add repository argument to link_alt_odb_entries
  sha1_file: add repository argument to read_info_alternates
  sha1_file: add repository argument to link_alt_odb_entry
  sha1_file: add raw_object_store argument to alt_odb_usable
  pack: move approximate object count to object store
  ...
2018-04-11 13:09:55 +09:00
Loganaden Velvindron
d81b651f56 http: allow use of TLS 1.3
Add a tlsv1.3 option to http.sslVersion in addition to the existing
tlsv1.[012] options. libcurl has supported this since 7.52.0.

This requires OpenSSL 1.1.1 with TLS 1.3 enabled or curl built with
recent versions of NSS or BoringSSL as the TLS backend.

Signed-off-by: Loganaden Velvindron <logan@hackers.mu>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-29 13:54:31 -07:00
Stefan Beller
5babff16d9 packfile: allow install_packed_git to handle arbitrary repositories
This conversion was done without the #define trick used in the earlier
series refactoring to have better repository access, because this function
is easy to review, as it only has one caller and all lines but the first
two are converted.

We must not convert 'pack_open_fds' to be a repository specific variable,
as it is used to monitor resource usage of the machine that Git executes
on.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26 10:07:43 -07:00
Stefan Beller
cf78ae4f3d sha1_file: add repository argument to sha1_file_name
Add a repository argument to allow sha1_file_name callers to be more
specific about which repository to handle. This is a small mechanical
change; it doesn't change the implementation to handle repositories
other than the_repository yet.

As with the previous commits, use a macro to catch callers passing a
repository other than the_repository at compile time.

While at it, move the declaration to object-store.h, where it should
be easier to find.

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26 10:05:55 -07:00
Stefan Beller
a80d72db2a object-store: move packed_git and packed_git_mru to object store
In a process with multiple repositories open, packfile accessors
should be associated to a single repository and not shared globally.
Move packed_git and packed_git_mru into the_repository and adjust
callers to reflect this.

[nd: while at there, wrap access to these two fields in get_packed_git()
and get_packed_git_mru(). This allows us to lazily initialize these
fields without caller doing that explicitly]

Signed-off-by: Stefan Beller <sbeller@google.com>
Signed-off-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-26 10:05:46 -07:00
Junio C Hamano
79cc6432ec Merge branch 'rj/http-code-cleanup'
There was an unused file-scope static variable left in http.c when
building for versions of libCURL that is older than 7.19.4, which
has been fixed.

* rj/http-code-cleanup:
  http: fix an unused variable warning for 'curl_no_proxy'
2018-03-21 11:30:12 -07:00
Ramsay Jones
b8fd6008ec http: fix an unused variable warning for 'curl_no_proxy'
Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 13:11:52 -07:00
Brandon Williams
884e586f9e http: don't always add Git-Protocol header
Instead of always sending the Git-Protocol header with the configured
version with every http request, explicitly send it when discovering
refs and then only send it on subsequent http requests if the server
understood the version requested.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:01:09 -07:00
Brandon Williams
8ff14ed412 http: allow providing extra headers for http requests
Add a way for callers to request that extra headers be included when
making http requests.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-03-15 12:01:09 -07:00
Junio C Hamano
169c9c0169 Merge branch 'bw/c-plus-plus'
Avoid using identifiers that clash with C++ keywords.  Even though
it is not a goal to compile Git with C++ compilers, changes like
this help use of code analysis tools that targets C++ on our
codebase.

* bw/c-plus-plus: (37 commits)
  replace: rename 'new' variables
  trailer: rename 'template' variables
  tempfile: rename 'template' variables
  wrapper: rename 'template' variables
  environment: rename 'namespace' variables
  diff: rename 'template' variables
  environment: rename 'template' variables
  init-db: rename 'template' variables
  unpack-trees: rename 'new' variables
  trailer: rename 'new' variables
  submodule: rename 'new' variables
  split-index: rename 'new' variables
  remote: rename 'new' variables
  ref-filter: rename 'new' variables
  read-cache: rename 'new' variables
  line-log: rename 'new' variables
  imap-send: rename 'new' variables
  http: rename 'new' variables
  entry: rename 'new' variables
  diffcore-delta: rename 'new' variables
  ...
2018-03-06 14:54:07 -08:00
Brandon Williams
ee6e065351 http: rename 'new' variables
Rename C++ keyword in order to bring the codebase closer to being able
to be compiled with a C++ compiler.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-02-22 10:08:05 -08:00
Junio C Hamano
5327463725 Merge branch 'jt/http-redact-cookies'
The http tracing code, often used to debug connection issues,
learned to redact potentially sensitive information from its output
so that it can be more safely sharable.

* jt/http-redact-cookies:
  http: support omitting data from traces
  http: support cookie redaction when tracing
2018-02-13 13:39:11 -08:00
Junio C Hamano
9238941618 Merge branch 'cc/sha1-file-name'
Code clean-up.

* cc/sha1-file-name:
  sha1_file: improve sha1_file_name() perfs
  sha1_file: remove static strbuf from sha1_file_name()
2018-02-13 13:39:10 -08:00
Jonathan Tan
8ba18e6fa4 http: support omitting data from traces
GIT_TRACE_CURL provides a way to debug what is being sent and received
over HTTP, with automatic redaction of sensitive information. But it
also logs data transmissions, which significantly increases the log file
size, sometimes unnecessarily. Add an option "GIT_TRACE_CURL_NO_DATA" to
allow the user to omit such data transmissions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-19 13:06:57 -08:00
Jonathan Tan
83411783c3 http: support cookie redaction when tracing
When using GIT_TRACE_CURL, Git already redacts the "Authorization:" and
"Proxy-Authorization:" HTTP headers. Extend this redaction to a
user-specified list of cookies, specified through the
"GIT_REDACT_COOKIES" environment variable.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-19 13:06:50 -08:00
Christian Couder
ea6577303f sha1_file: remove static strbuf from sha1_file_name()
Using a static buffer in sha1_file_name() is error prone
and the performance improvements it gives are not needed
in many of the callers.

So let's get rid of this static buffer and, if necessary
or helpful, let's use one in the caller.

Suggested-by: Jeff Hostetler <git@jeffhostetler.com>
Helped-by: Kevin Daudt <me@ikke.info>
Signed-off-by: Christian Couder <chriscool@tuxfamily.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2018-01-17 12:21:32 -08:00
Junio C Hamano
0956eaa621 Merge branch 'rs/use-argv-array-in-child-process'
Code cleanup.

* rs/use-argv-array-in-child-process:
  send-pack: use internal argv_array of struct child_process
  http: use internal argv_array of struct child_process
2018-01-05 13:28:10 -08:00
Junio C Hamano
fc4a226bf6 Merge branch 'ws/curl-http-proxy-over-https'
Git has been taught to support an https:// URL used for http.proxy
when using recent versions of libcurl.

* ws/curl-http-proxy-over-https:
  http: support CURLPROXY_HTTPS
2017-12-28 14:08:50 -08:00
René Scharfe
c7191fa510 http: use internal argv_array of struct child_process
Avoid a strangely magic array size (it's slightly too big) and explicit
index numbers by building the command line for index-pack using the
embedded argv_array of the child_process.  Add the flag -o and its
argument with argv_array_pushl() to make it obvious that they belong
together.  The resulting code is shorter and easier to extend.

Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-22 13:33:48 -08:00
Wei Shuyu
82b6803aee http: support CURLPROXY_HTTPS
HTTP proxy over SSL is supported by curl since 7.52.0.
This is very useful for networks with protocol whitelist.

Signed-off-by: Wei Shuyu <wsy@dogben.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-12-19 10:20:14 -08:00
Junio C Hamano
4c6dad0059 Merge branch 'bw/protocol-v1'
A new mechanism to upgrade the wire protocol in place is proposed
and demonstrated that it works with the older versions of Git
without harming them.

* bw/protocol-v1:
  Documentation: document Extra Parameters
  ssh: introduce a 'simple' ssh variant
  i5700: add interop test for protocol transition
  http: tell server that the client understands v1
  connect: tell server that the client understands v1
  connect: teach client to recognize v1 server response
  upload-pack, receive-pack: introduce protocol version 1
  daemon: recognize hidden request arguments
  protocol: introduce protocol extension mechanisms
  pkt-line: add packet_write function
  connect: in ref advertisement, shallows are last
2017-12-06 09:23:44 -08:00
Brandon Williams
19113a26b6 http: tell server that the client understands v1
Tell a server that protocol v1 can be used by sending the http header
'Git-Protocol' with 'version=1' indicating this.

Also teach the apache http server to pass through the 'Git-Protocol'
header as an environment variable 'GIT_PROTOCOL'.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-10-17 10:51:29 +09:00
Jeff King
d0e9983980 curl_trace(): eliminate switch fallthrough
Our trace handler is called by curl with a curl_infotype
variable to interpret its data field. For most types we
print the data and then break out of the switch. But for
CURLINFO_TEXT, we print data and then fall through to the
"default" case, which does the exact same thing (nothing!)
that breaking out of the switch would.

This is probably a leftover from an early iteration of the
patch where the code after the switch _did_ do something
interesting that was unique to the non-text case arms.
But in its current form, this fallthrough is merely
confusing (and causes gcc's -Wimplicit-fallthrough to
complain).

Let's make CURLINFO_TEXT like the other case arms, and push
the default arm to the end where it's more obviously a
catch-all.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-09-22 12:49:55 +09:00
Junio C Hamano
eabdcd4ab4 Merge branch 'jt/packmigrate'
Code movement to make it easier to hack later.

* jt/packmigrate: (23 commits)
  pack: move for_each_packed_object()
  pack: move has_pack_index()
  pack: move has_sha1_pack()
  pack: move find_pack_entry() and make it global
  pack: move find_sha1_pack()
  pack: move find_pack_entry_one(), is_pack_valid()
  pack: move check_pack_index_ptr(), nth_packed_object_offset()
  pack: move nth_packed_object_{sha1,oid}
  pack: move clear_delta_base_cache(), packed_object_info(), unpack_entry()
  pack: move unpack_object_header()
  pack: move get_size_from_delta()
  pack: move unpack_object_header_buffer()
  pack: move {,re}prepare_packed_git and approximate_object_count
  pack: move install_packed_git()
  pack: move add_packed_git()
  pack: move unuse_pack()
  pack: move use_pack()
  pack: move pack-closing functions
  pack: move release_pack_memory()
  pack: move open_pack_index(), parse_pack_index()
  ...
2017-08-26 22:55:09 -07:00
Junio C Hamano
6ea13d8845 Merge branch 'tc/curl-with-backports'
Updates to the HTTP layer we made recently unconditionally used
features of libCurl without checking the existence of them, causing
compilation errors, which has been fixed.  Also migrate the code to
check feature macros, not version numbers, to cope better with
libCurl that vendor ships with backported features.

* tc/curl-with-backports:
  http: use a feature check to enable GSSAPI delegation control
  http: fix handling of missing CURLPROTO_*
2017-08-24 10:20:02 -07:00
Jonathan Tan
4f39cd821d pack: move pack name-related functions
Currently, sha1_file.c and cache.h contain many functions, both related
to and unrelated to packfiles. This makes both files very large and
causes an unclear separation of concerns.

Create a new file, packfile.c, to hold all packfile-related functions
currently in sha1_file.c. It has a corresponding header packfile.h.

In this commit, the pack name-related functions are moved. Subsequent
commits will move the other functions.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-23 15:12:06 -07:00
Tom G. Christensen
dd5df538b5 http: use a feature check to enable GSSAPI delegation control
Turn the version check into a feature check to ensure this functionality
is also enabled with vendor supported curl versions where the feature
may have been backported.

Signed-off-by: Tom G. Christensen <tgc@jupiterrise.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11 15:12:41 -07:00
Tom G. Christensen
f18777ba6e http: fix handling of missing CURLPROTO_*
Commit aeae4db1 refactored the handling of the curl protocol
restriction support into a function but failed to add a version
check for older versions of curl that lack CURLPROTO_* support.

Add the missing check and at the same time convert it to a feature
check instead of a version based check.  This is done to ensure that
vendor supported curl versions that have had CURLPROTO_* support
backported are handled correctly.

Signed-off-by: Tom G. Christensen <tgc@jupiterrise.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-08-11 15:12:07 -07:00
Junio C Hamano
17b1e1d76c Merge branch 'jc/http-sslkey-and-ssl-cert-are-paths'
The http.{sslkey,sslCert} configuration variables are to be
interpreted as a pathname that honors "~[username]/" prefix, but
weren't, which has been fixed.

* jc/http-sslkey-and-ssl-cert-are-paths:
  http.c: http.sslcert and http.sslkey are both pathnames
2017-08-11 13:26:59 -07:00
Junio C Hamano
8d1549643e http.c: http.sslcert and http.sslkey are both pathnames
Back when the modern http_options() codepath was created to parse
various http.* options at 29508e1e ("Isolate shared HTTP request
functionality", 2005-11-18), and then later was corrected for
interation between the multiple configuration files in 7059cd99
("http_init(): Fix config file parsing", 2009-03-09), we parsed
configuration variables like http.sslkey, http.sslcert as plain
vanilla strings, because git_config_pathname() that understands
"~[username]/" prefix did not exist.  Later, we converted some of
them (namely, http.sslCAPath and http.sslCAInfo) to use the
function, and added variables like http.cookeyFile http.pinnedpubkey
to use the function from the beginning.  Because of that, these
variables all understand "~[username]/" prefix.

Make the remaining two variables, http.sslcert and http.sslkey, also
aware of the convention, as they are both clearly pathnames to
files.

Noticed-by: Victor Toni <victor.toni@gmail.com>
Helped-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-07-20 13:37:24 -07:00
Junio C Hamano
50f03c6676 Merge branch 'ab/free-and-null'
A common pattern to free a piece of memory and assign NULL to the
pointer that used to point at it has been replaced with a new
FREE_AND_NULL() macro.

* ab/free-and-null:
  *.[ch] refactoring: make use of the FREE_AND_NULL() macro
  coccinelle: make use of the "expression" FREE_AND_NULL() rule
  coccinelle: add a rule to make "expression" code use FREE_AND_NULL()
  coccinelle: make use of the "type" FREE_AND_NULL() rule
  coccinelle: add a rule to make "type" code use FREE_AND_NULL()
  git-compat-util: add a FREE_AND_NULL() wrapper around free(ptr); ptr = NULL
2017-06-24 14:28:41 -07:00
Ævar Arnfjörð Bjarmason
6a83d90207 coccinelle: make use of the "type" FREE_AND_NULL() rule
Apply the result of the just-added coccinelle rule. This manually
excludes a few occurrences, mostly things that resulted in many
FREE_AND_NULL() on one line, that'll be manually fixed in a subsequent
change.

Signed-off-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-16 12:44:03 -07:00
Brandon Williams
b2141fc1d2 config: don't include config.h by default
Stop including config.h by default in cache.h.  Instead only include
config.h in those files which require use of the config system.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-06-15 12:56:22 -07:00
Junio C Hamano
4c01f67d91 Merge branch 'dt/http-postbuffer-can-be-large'
Allow the http.postbuffer configuration variable to be set to a
size that can be expressed in size_t, which can be larger than
ulong on some platforms.

* dt/http-postbuffer-can-be-large:
  http.postbuffer: allow full range of ssize_t values
2017-04-23 22:07:45 -07:00
Junio C Hamano
6b51cb6181 Merge branch 'sr/http-proxy-configuration-fix'
"http.proxy" set to an empty string is used to disable the usage of
proxy.  We broke this early last year.

* sr/http-proxy-configuration-fix:
  http: fix the silent ignoring of proxy misconfiguraion
  http: honor empty http.proxy option to bypass proxy
2017-04-23 22:07:44 -07:00
David Turner
37ee680d9b http.postbuffer: allow full range of ssize_t values
Unfortunately, in order to push some large repos where a server does
not support chunked encoding, the http postbuffer must sometimes
exceed two gigabytes.  On a 64-bit system, this is OK: we just malloc
a larger buffer.

This means that we need to use CURLOPT_POSTFIELDSIZE_LARGE to set the
buffer size.

Signed-off-by: David Turner <dturner@twosigma.com>
Reviewed-by: Jonathan Nieder <jrnieder@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13 18:24:32 -07:00
Sergey Ryazanov
ae51d91105 http: fix the silent ignoring of proxy misconfiguraion
Earlier, the whole http.proxy option string was passed to curl without
any preprocessing so curl could complain about the invalid proxy
configuration.

After the commit 372370f167 ("http: use credential API to handle proxy
authentication", 2016-01-26), if the user specified an invalid HTTP
proxy option in the configuration, then the option parsing silently
fails and NULL will be passed to curl as a proxy. This forces curl to
fall back to detecting the proxy configuration from the environment,
causing the http.proxy option ignoring.

Fix this issue by checking the proxy option parsing result. If parsing
failed then print an error message and die. Such behaviour allows the
user to quickly figure the proxy misconfiguration and correct it.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13 15:51:33 -07:00
Sergey Ryazanov
57415089bd http: honor empty http.proxy option to bypass proxy
Curl distinguishes between an empty proxy address and a NULL proxy
address. In the first case it completely disables proxy usage, but if
the proxy address option is NULL then curl attempts to determine the
proxy address from the http_proxy environment variable.

According to the documentation, if the http.proxy option is set to an
empty string, git should bypass proxy and connect to the server
directly:

    export http_proxy=http://network-proxy/
    cd ~/foobar-project
    git config remote.origin.proxy ""
    git fetch

Previously, proxy host was configured by one line:

    curl_easy_setopt(result, CURLOPT_PROXY, curl_http_proxy);

Commit 372370f167 ("http: use credential API to handle proxy
authentication", 2016-01-26) parses the proxy option, then extracts the
proxy host address and updates the curl configuration, making the
previous call a noop:

    credential_from_url(&proxy_auth, curl_http_proxy);
    curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host);

But if the proxy option is empty then the proxy host field becomes NULL.
This forces curl to fall back to detecting the proxy configuration from
the environment, causing the http.proxy option to not work anymore.

Fix this issue by explicitly handling http.proxy being set the empty
string. This also makes the code a bit more clear and should help us
avoid such regressions in the future.

Helped-by: Ævar Arnfjörð Bjarmason <avarab@gmail.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Sergey Ryazanov <ryazanov.s.a@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-04-13 15:51:19 -07:00
Jeff King
1a168e5c86 convert unchecked snprintf into xsnprintf
These calls to snprintf should always succeed, because their
input is small and fixed. Let's use xsnprintf to make sure
this is the case (and to make auditing for actual truncation
easier).

These could be candidates for turning into heap buffers, but
they fall into a few broad categories that make it not worth
doing:

  - formatting single numbers is simple enough that we can
    see the result should fit

  - the size of a sha1 is likewise well-known, and I didn't
    want to cause unnecessary conflicts with the ongoing
    process to convert these constants to GIT_MAX_HEXSZ

  - the interface for curl_errorstr is dictated by curl

Signed-off-by: Jeff King <peff@peff.net>
2017-03-30 14:59:50 -07:00
Junio C Hamano
68e12d7d97 Merge branch 'jt/http-base-url-update-upon-redirect' into maint
When a redirected http transport gets an error during the
redirected request, we ignored the error we got from the server,
and ended up giving a not-so-useful error message.

* jt/http-base-url-update-upon-redirect:
  http: attempt updating base URL only if no error
2017-03-16 13:56:42 -07:00
Junio C Hamano
d880bfd947 Merge branch 'jk/http-auth' into maint
Reduce authentication round-trip over HTTP when the server supports
just a single authentication method.

* jk/http-auth:
  http: add an "auto" mode for http.emptyauth
  http: restrict auth methods to what the server advertises
2017-03-16 13:56:41 -07:00
Junio C Hamano
d0f549f403 Merge branch 'jt/http-base-url-update-upon-redirect'
When a redirected http transport gets an error during the
redirected request, we ignored the error we got from the server,
and ended up giving a not-so-useful error message.

* jt/http-base-url-update-upon-redirect:
  http: attempt updating base URL only if no error
2017-03-10 13:24:23 -08:00
Junio C Hamano
92718f57c2 Merge branch 'jk/http-auth'
Reduce authentication round-trip over HTTP when the server supports
just a single authentication method.

* jk/http-auth:
  http: add an "auto" mode for http.emptyauth
  http: restrict auth methods to what the server advertises
2017-03-10 13:24:23 -08:00
Jonathan Tan
8e27391a5f http: attempt updating base URL only if no error
http.c supports HTTP redirects of the form

  http://foo/info/refs?service=git-upload-pack
  -> http://anything
  -> http://bar/info/refs?service=git-upload-pack

(that is to say, as long as the Git part of the path and the query
string is preserved in the final redirect destination, the intermediate
steps can have any URL). However, if one of the intermediate steps
results in an HTTP exception, a confusing "unable to update url base
from redirection" message is printed instead of a Curl error message
with the HTTP exception code.

This was introduced by 2 commits. Commit c93c92f ("http: update base
URLs when we see redirects", 2013-09-28) introduced a best-effort
optimization that required checking if only the "base" part of the URL
differed between the initial request and the final redirect destination,
but it performed the check before any HTTP status checking was done. If
something went wrong, the normal code path was still followed, so this
did not cause any confusing error messages until commit 6628eb4 ("http:
always update the base URL for redirects", 2016-12-06), which taught
http to die if the non-"base" part of the URL differed.

Therefore, teach http to check the HTTP status before attempting to
check if only the "base" part of the URL differed. This commit teaches
http_request_reauth to return early without updating options->base_url
upon an error; the only invoker of this function that passes a non-NULL
"options" is remote-curl.c (through "http_get_strbuf"), which only uses
options->base_url for an informational message in the situations that
this commit cares about (that is, when the return value is not HTTP_OK).

The included test checks that the redirect scheme at the beginning of
this commit message works, and that returning a 502 in the middle of the
redirect scheme produces the correct result. Note that this is different
from the test in commit 6628eb4 ("http: always update the base URL for
redirects", 2016-12-06) in that this commit tests that a Git-shaped URL
(http://.../info/refs?service=git-upload-pack) works, whereas commit
6628eb4 tests that a non-Git-shaped URL
(http://.../info/refs/foo?service=git-upload-pack) does not work (even
though Git is processing that URL) and is an error that is fatal, not
silently swallowed.

Signed-off-by: Jonathan Tan <jonathantanmy@google.com>
Acked-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-28 11:35:53 -08:00
Jeff King
40a18fc77c http: add an "auto" mode for http.emptyauth
This variable needs to be specified to make some types of
non-basic authentication work, but ideally this would just
work out of the box for everyone.

However, simply setting it to "1" by default introduces an
extra round-trip for cases where it _isn't_ useful. We end
up sending a bogus empty credential that the server rejects.

Instead, let's introduce an automatic mode, that works like
this:

  1. We won't try to send the bogus credential on the first
     request. We'll wait to get an HTTP 401, as usual.

  2. After seeing an HTTP 401, the empty-auth hack will kick
     in only when we know there is an auth method available
     that might make use of it (i.e., something besides
     "Basic" or "Digest").

That should make it work out of the box, without incurring
any extra round-trips for people hitting Basic-only servers.

This _does_ incur an extra round-trip if you really want to
use "Basic" but your server advertises other methods (the
emptyauth hack will kick in but fail, and then Git will
actually ask for a password).

The auto mode may incur an extra round-trip over setting
http.emptyauth=true, because part of the emptyauth hack is
to feed this blank password to curl even before we've made a
single request.

Helped-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-27 10:35:24 -08:00
Jeff King
840398feb8 http: restrict auth methods to what the server advertises
By default, we tell curl to use CURLAUTH_ANY, which does not
limit its set of auth methods. However, this results in an
extra round-trip to the server when authentication is
required. After we've fed the credential to curl, it wants
to probe the server to find its list of available methods
before sending an Authorization header.

We can shortcut this by limiting our http_auth_methods by
what the server told us it supports. In some cases (such as
when the server only supports Basic), that lets curl skip
the extra probe request.

The end result should look the same to the user, but you can
use GIT_TRACE_CURL to verify the sequence of requests:

  GIT_TRACE_CURL=1 \
  git ls-remote https://example.com/repo.git \
  2>&1 >/dev/null |
  egrep '(Send|Recv) header: (GET|HTTP|Auth)'

Before this patch, hitting a Basic-only server like
github.com results in:

  Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1
  Recv header: HTTP/1.1 401 Authorization Required
  Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1
  Recv header: HTTP/1.1 401 Authorization Required
  Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1
  Send header: Authorization: Basic <redacted>
  Recv header: HTTP/1.1 200 OK

And after:

  Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1
  Recv header: HTTP/1.1 401 Authorization Required
  Send header: GET /repo.git/info/refs?service=git-upload-pack HTTP/1.1
  Send header: Authorization: Basic <redacted>
  Recv header: HTTP/1.1 200 OK

The possible downsides are:

  - This only helps for a Basic-only server; for a server
    with multiple auth options, curl may still send a probe
    request to see which ones are available (IOW, there's no
    way to say "don't probe, I already know what the server
    will say").

  - The http_auth_methods variable is global, so this will
    apply to all further requests. That's acceptable for
    Git's usage of curl, though, which also treats the
    credentials as global. I.e., in any given program
    invocation we hit only one conceptual server (we may be
    redirected at the outset, but in that case that's whose
    auth_avail field we'd see).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2017-02-23 11:11:56 -08:00
Junio C Hamano
5ce6f51ff7 Merge branch 'jk/http-walker-limit-redirect' into maint
Update the error messages from the dumb-http client when it fails
to obtain loose objects; we used to give sensible error message
only upon 404 but we now forbid unexpected redirects that needs to
be reported with something sensible.

* jk/http-walker-limit-redirect:
  http-walker: complain about non-404 loose object errors
  http: treat http-alternates like redirects
  http: make redirects more obvious
  remote-curl: rename shadowed options variable
  http: always update the base URL for redirects
  http: simplify update_url_from_redirect
2017-01-17 14:49:29 -08:00
Junio C Hamano
9d540e9726 Merge branch 'bw/transport-protocol-policy'
Finer-grained control of what protocols are allowed for transports
during clone/fetch/push have been enabled via a new configuration
mechanism.

* bw/transport-protocol-policy:
  http: respect protocol.*.allow=user for http-alternates
  transport: add from_user parameter to is_transport_allowed
  http: create function to get curl allowed protocols
  transport: add protocol policy config option
  http: always warn if libcurl version is too old
  lib-proto-disable: variable name fix
2016-12-27 00:11:41 -08:00
Junio C Hamano
da72ee87fb Merge branch 'jk/http-walker-limit-redirect'
Update the error messages from the dumb-http client when it fails
to obtain loose objects; we used to give sensible error message
only upon 404 but we now forbid unexpected redirects that needs to
be reported with something sensible.

* jk/http-walker-limit-redirect:
  http-walker: complain about non-404 loose object errors
2016-12-19 14:45:32 -08:00
Junio C Hamano
8a2882f23e Merge branch 'jk/http-walker-limit-redirect-2.9'
Transport with dumb http can be fooled into following foreign URLs
that the end user does not intend to, especially with the server
side redirects and http-alternates mechanism, which can lead to
security issues.  Tighten the redirection and make it more obvious
to the end user when it happens.

* jk/http-walker-limit-redirect-2.9:
  http: treat http-alternates like redirects
  http: make redirects more obvious
  remote-curl: rename shadowed options variable
  http: always update the base URL for redirects
  http: simplify update_url_from_redirect
2016-12-19 14:45:32 -08:00
Brandon Williams
a768a02265 transport: add from_user parameter to is_transport_allowed
Add a from_user parameter to is_transport_allowed() to allow http to be
able to distinguish between protocol restrictions for redirects versus
initial requests.  CURLOPT_REDIR_PROTOCOLS can now be set differently
from CURLOPT_PROTOCOLS to disallow use of protocols with the "user"
policy in redirects.

This change allows callers to query if a transport protocol is allowed,
given that the caller knows that the protocol is coming from the user
(1) or not from the user (0) such as redirects in libcurl.  If unknown a
-1 should be provided which falls back to reading
`GIT_PROTOCOL_FROM_USER` to determine if the protocol came from the
user.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-15 09:29:13 -08:00
Brandon Williams
aeae4db174 http: create function to get curl allowed protocols
Move the creation of an allowed protocols whitelist to a helper
function. This will be useful when we need to compute the set of
allowed protocols differently for normal and redirect cases.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-15 09:29:13 -08:00
Brandon Williams
f962ddf6ed http: always warn if libcurl version is too old
Always warn if libcurl version is too old because:

1. Even without a protocol whitelist, newer versions of curl have all
   non-standard protocols disabled by default.
2. A future patch will introduce default "known-good" and "known-bad"
   protocols which are allowed/disallowed by 'is_transport_allowed'
   which older version of libcurl can't respect.

Signed-off-by: Brandon Williams <bmwill@google.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-15 09:28:37 -08:00
Jeff King
3680f16f9d http-walker: complain about non-404 loose object errors
Since commit 17966c0a6 (http: avoid disconnecting on 404s
for loose objects, 2016-07-11), we turn off curl's
FAILONERROR option and instead manually deal with failing
HTTP codes.

However, the logic to do so only recognizes HTTP 404 as a
failure. This is probably the most common result, but if we
were to get another code, the curl result remains CURLE_OK,
and we treat it as success. We still end up detecting the
failure when we try to zlib-inflate the object (which will
fail), but instead of reporting the HTTP error, we just
claim that the object is corrupt.

Instead, let's catch anything in the 300's or above as an
error (300's are redirects which are not an error at the
HTTP level, but are an indication that we've explicitly
disabled redirects, so we should treat them as such; we
certainly don't have the resulting object content).

Note that we also fill in req->errorstr, which we didn't do
before. Without FAILONERROR, curl will not have filled this
in, and it will remain a blank string. This never mattered
for the 404 case, because in the logic below we hit the
"missing_target()" branch and print nothing. But for other
errors, we'd want to say _something_, if only to fill in the
blank slot in the error message.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06 12:43:34 -08:00
Junio C Hamano
43ec089eea Merge branch 'ew/http-walker' into jk/http-walker-limit-redirect
* ew/http-walker:
  list: avoid incompatibility with *BSD sys/queue.h
  http-walker: reduce O(n) ops with doubly-linked list
  http: avoid disconnecting on 404s for loose objects
  http-walker: remove unused parameter from fetch_object
2016-12-06 12:43:23 -08:00
Jeff King
cb4d2d35c4 http: treat http-alternates like redirects
The previous commit made HTTP redirects more obvious and
tightened up the default behavior. However, there's another
way for a server to ask a git client to fetch arbitrary
content: by having an http-alternates file (or a regular
alternates file, which is used as a backup).

Similar to the HTTP redirect case, a malicious server can
claim to have refs pointing at object X, return a 404 when
the client asks for X, but point to some other URL via
http-alternates, which the client will transparently fetch.
The end result is that it looks from the user's perspective
like the objects came from the malicious server, as the
other URL is not mentioned at all.

Worse, because we feed the new URL to curl ourselves, the
usual protocol restrictions do not kick in (neither curl's
default of disallowing file://, nor the protocol
whitelisting in f4113cac0 (http: limit redirection to
protocol-whitelist, 2015-09-22).

Let's apply the same rules here as we do for HTTP redirects.
Namely:

  - unless http.followRedirects is set to "always", we will
    not follow remote redirects from http-alternates (or
    alternates) at all

  - set CURLOPT_PROTOCOLS alongside CURLOPT_REDIR_PROTOCOLS
    restrict ourselves to a known-safe set and respect any
    user-provided whitelist.

  - mention alternate object stores on stderr so that the
    user is aware another source of objects may be involved

The first item may prove to be too restrictive. The most
common use of alternates is to point to another path on the
same server. While it's possible for a single-server
redirect to be an attack, it takes a fairly obscure setup
(victim and evil repository on the same host, host speaks
dumb http, and evil repository has access to edit its own
http-alternates file).

So we could make the checks more specific, and only cover
cross-server redirects. But that means parsing the URLs
ourselves, rather than letting curl handle them. This patch
goes for the simpler approach. Given that they are only used
with dumb http, http-alternates are probably pretty rare.
And there's an escape hatch: the user can allow redirects on
a specific server by setting http.<url>.followRedirects to
"always".

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06 12:32:48 -08:00
Jeff King
50d3413740 http: make redirects more obvious
We instruct curl to always follow HTTP redirects. This is
convenient, but it creates opportunities for malicious
servers to create confusing situations. For instance,
imagine Alice is a git user with access to a private
repository on Bob's server. Mallory runs her own server and
wants to access objects from Bob's repository.

Mallory may try a few tricks that involve asking Alice to
clone from her, build on top, and then push the result:

  1. Mallory may simply redirect all fetch requests to Bob's
     server. Git will transparently follow those redirects
     and fetch Bob's history, which Alice may believe she
     got from Mallory. The subsequent push seems like it is
     just feeding Mallory back her own objects, but is
     actually leaking Bob's objects. There is nothing in
     git's output to indicate that Bob's repository was
     involved at all.

     The downside (for Mallory) of this attack is that Alice
     will have received Bob's entire repository, and is
     likely to notice that when building on top of it.

  2. If Mallory happens to know the sha1 of some object X in
     Bob's repository, she can instead build her own history
     that references that object. She then runs a dumb http
     server, and Alice's client will fetch each object
     individually. When it asks for X, Mallory redirects her
     to Bob's server. The end result is that Alice obtains
     objects from Bob, but they may be buried deep in
     history. Alice is less likely to notice.

Both of these attacks are fairly hard to pull off. There's a
social component in getting Mallory to convince Alice to
work with her. Alice may be prompted for credentials in
accessing Bob's repository (but not always, if she is using
a credential helper that caches). Attack (1) requires a
certain amount of obliviousness on Alice's part while making
a new commit. Attack (2) requires that Mallory knows a sha1
in Bob's repository, that Bob's server supports dumb http,
and that the object in question is loose on Bob's server.

But we can probably make things a bit more obvious without
any loss of functionality. This patch does two things to
that end.

First, when we encounter a whole-repo redirect during the
initial ref discovery, we now inform the user on stderr,
making attack (1) much more obvious.

Second, the decision to follow redirects is now
configurable. The truly paranoid can set the new
http.followRedirects to false to avoid any redirection
entirely. But for a more practical default, we will disallow
redirects only after the initial ref discovery. This is
enough to thwart attacks similar to (2), while still
allowing the common use of redirects at the repository
level. Since c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28) we re-root all further requests from
the redirect destination, which should generally mean that
no further redirection is necessary.

As an escape hatch, in case there really is a server that
needs to redirect individual requests, the user can set
http.followRedirects to "true" (and this can be done on a
per-server basis via http.*.followRedirects config).

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06 12:32:48 -08:00
Jeff King
6628eb41db http: always update the base URL for redirects
If a malicious server redirects the initial ref
advertisement, it may be able to leak sha1s from other,
unrelated servers that the client has access to. For
example, imagine that Alice is a git user, she has access to
a private repository on a server hosted by Bob, and Mallory
runs a malicious server and wants to find out about Bob's
private repository.

Mallory asks Alice to clone an unrelated repository from her
over HTTP. When Alice's client contacts Mallory's server for
the initial ref advertisement, the server issues an HTTP
redirect for Bob's server. Alice contacts Bob's server and
gets the ref advertisement for the private repository. If
there is anything to fetch, she then follows up by asking
the server for one or more sha1 objects. But who is the
server?

If it is still Mallory's server, then Alice will leak the
existence of those sha1s to her.

Since commit c93c92f30 (http: update base URLs when we see
redirects, 2013-09-28), the client usually rewrites the base
URL such that all further requests will go to Bob's server.
But this is done by textually matching the URL. If we were
originally looking for "http://mallory/repo.git/info/refs",
and we got pointed at "http://bob/other.git/info/refs", then
we know that the right root is "http://bob/other.git".

If the redirect appears to change more than just the root,
we punt and continue to use the original server. E.g.,
imagine the redirect adds a URL component that Bob's server
will ignore, like "http://bob/other.git/info/refs?dummy=1".

We can solve this by aborting in this case rather than
silently continuing to use Mallory's server. In addition to
protecting from sha1 leakage, it's arguably safer and more
sane to refuse a confusing redirect like that in general.
For example, part of the motivation in c93c92f30 is
avoiding accidentally sending credentials over clear http,
just to get a response that says "try again over https". So
even in a non-malicious case, we'd prefer to err on the side
of caution.

The downside is that it's possible this will break a
legitimate but complicated server-side redirection scheme.
The setup given in the newly added test does work, but it's
convoluted enough that we don't need to care about it. A
more plausible case would be a server which redirects a
request for "info/refs?service=git-upload-pack" to just
"info/refs" (because it does not do smart HTTP, and for some
reason really dislikes query parameters).  Right now we
would transparently downgrade to dumb-http, but with this
patch, we'd complain (and the user would have to set
GIT_SMART_HTTP=0 to fetch).

Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06 12:32:48 -08:00
Jeff King
986d7f4d37 http: simplify update_url_from_redirect
This function looks for a common tail between what we asked
for and where we were redirected to, but it open-codes the
comparison. We can avoid some confusing subtractions by
using strip_suffix_mem().

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-12-06 12:32:48 -08:00
Junio C Hamano
c6400bf8d5 Merge branch 'dt/http-empty-auth'
http.emptyauth configuration is a way to allow an empty username to
pass when attempting to authenticate using mechanisms like
Kerberos.  We took an unspecified (NULL) username and sent ":"
(i.e. no username, no password) to CURLOPT_USERPWD, but did not do
the same when the username is explicitly set to an empty string.

* dt/http-empty-auth:
  http: http.emptyauth should allow empty (not just NULL) usernames
2016-10-17 13:25:18 -07:00
Junio C Hamano
fbfe878f97 Merge branch 'ps/http-gssapi-cred-delegation'
In recent versions of cURL, GSSAPI credential delegation is
disabled by default due to CVE-2011-2192; introduce a configuration
to selectively allow enabling this.

* ps/http-gssapi-cred-delegation:
  http: control GSSAPI credential delegation
2016-10-06 14:53:11 -07:00
David Turner
5275c3081c http: http.emptyauth should allow empty (not just NULL) usernames
When using Kerberos authentication with newer versions of libcurl,
CURLOPT_USERPWD must be set to a value, even if it is an empty value.
The value is never sent to the server.  Previous versions of libcurl
did not require this variable to be set.  One way that some users
express the empty username/password is http://:@gitserver.example.com,
which http.emptyauth was designed to support.  Another, equivalent,
URL is http://@gitserver.example.com.  The latter leads to a username
of zero-length, rather than a NULL username, but CURLOPT_USERPWD still
needs to be set (if http.emptyauth is set).  Do so.

Signed-off-by: David Turner <dturner@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-10-04 12:02:00 -07:00
Petr Stodulka
26a7b23429 http: control GSSAPI credential delegation
Delegation of credentials is disabled by default in libcurl since
version 7.21.7 due to security vulnerability CVE-2011-2192. Which
makes troubles with GSS/kerberos authentication when delegation
of credentials is required. This can be changed with option
CURLOPT_GSSAPI_DELEGATION in libcurl with set expected parameter
since libcurl version 7.22.0.

This patch provides new configuration variable http.delegation
which corresponds to curl parameter "--delegation" (see man 1 curl).

The following values are supported:

* none (default).
* policy
* always

Signed-off-by: Petr Stodulka <pstodulk@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-29 20:39:23 -07:00
Junio C Hamano
e007a094d4 Merge branch 'ew/http-do-not-forget-to-call-curl-multi-remove-handle' into maint
The http transport (with curl-multi option, which is the default
these days) failed to remove curl-easy handle from a curlm session,
which led to unnecessary API failures.

* ew/http-do-not-forget-to-call-curl-multi-remove-handle:
  http: always remove curl easy from curlm session on release
  http: consolidate #ifdefs for curl_multi_remove_handle
  http: warn on curl_multi_add_handle failures
2016-09-29 16:49:39 -07:00
Junio C Hamano
35ec7fd479 Merge branch 'jk/fix-remote-curl-url-wo-proto' into maint
"git fetch http::/site/path" did not die correctly and segfaulted
instead.

* jk/fix-remote-curl-url-wo-proto:
  remote-curl: handle URLs without protocol
2016-09-29 16:49:38 -07:00
Junio C Hamano
ac8ddd7ba3 Merge branch 'ew/http-do-not-forget-to-call-curl-multi-remove-handle'
The http transport (with curl-multi option, which is the default
these days) failed to remove curl-easy handle from a curlm session,
which led to unnecessary API failures.

* ew/http-do-not-forget-to-call-curl-multi-remove-handle:
  http: always remove curl easy from curlm session on release
  http: consolidate #ifdefs for curl_multi_remove_handle
  http: warn on curl_multi_add_handle failures
2016-09-21 15:15:23 -07:00
Junio C Hamano
c13f458d86 Merge branch 'jk/fix-remote-curl-url-wo-proto'
"git fetch http::/site/path" did not die correctly and segfaulted
instead.

* jk/fix-remote-curl-url-wo-proto:
  remote-curl: handle URLs without protocol
2016-09-15 14:11:15 -07:00
Eric Wong
2abc848d54 http: always remove curl easy from curlm session on release
We must call curl_multi_remove_handle when releasing the slot to
prevent subsequent calls to curl_multi_add_handle from failing
with CURLM_ADDED_ALREADY (in curl 7.32.1+; older versions
returned CURLM_BAD_EASY_HANDLE)

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 13:34:04 -07:00
Eric Wong
d8b6b84df0 http: consolidate #ifdefs for curl_multi_remove_handle
I find #ifdefs makes code difficult-to-follow.

An early version of this patch had error checking for
curl_multi_remove_handle calls, but caused some tests (e.g.
t5541) to fail under curl 7.26.0 on old Debian wheezy.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 13:34:03 -07:00
Eric Wong
9f1b58842a http: warn on curl_multi_add_handle failures
This will be useful for tracking down curl usage errors.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-13 13:34:01 -07:00
Jeff King
d63ed6ef24 remote-curl: handle URLs without protocol
Generally remote-curl would never see a URL that did not
have "proto:" at the beginning, as that is what tells git to
run the "git-remote-proto" helper (and git-remote-http, etc,
are aliases for git-remote-curl).

However, the special syntax "proto::something" will run
git-remote-proto with only "something" as the URL. So a
malformed URL like:

  http::/example.com/repo.git

will feed the URL "/example.com/repo.git" to
git-remote-http. The resulting URL has no protocol, but the
code added by 372370f (http: use credential API to handle
proxy authentication, 2016-01-26) does not handle this case
and segfaults.

For the purposes of this code, we don't really care what the
exact protocol; only whether or not it is https. So let's
just assume that a missing protocol is not, and curl will
handle the real error (which is that the URL is nonsense).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-09-08 11:23:43 -07:00
Junio C Hamano
940622bc8b Merge branch 'rs/use-strbuf-addstr'
* rs/use-strbuf-addstr:
  use strbuf_addstr() instead of strbuf_addf() with "%s"
  use strbuf_addstr() for adding constant strings to a strbuf
2016-08-08 14:48:41 -07:00
René Scharfe
bc57b9c0cc use strbuf_addstr() instead of strbuf_addf() with "%s"
Call strbuf_addstr() for adding a simple string to a strbuf instead of
using the heavier strbuf_addf().  This is shorter and documents the
intent more clearly.

Signed-off-by: Rene Scharfe <l.s.r@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-08-05 15:09:25 -07:00
Junio C Hamano
4067a45438 Merge branch 'ew/http-walker'
Dumb http transport on the client side has been optimized.

* ew/http-walker:
  list: avoid incompatibility with *BSD sys/queue.h
  http-walker: reduce O(n) ops with doubly-linked list
  http: avoid disconnecting on 404s for loose objects
  http-walker: remove unused parameter from fetch_object
2016-08-03 15:10:24 -07:00
Eric Wong
17966c0a63 http: avoid disconnecting on 404s for loose objects
404s are common when fetching loose objects on static HTTP
servers, and reestablishing a connection for every single
404 adds additional latency.

Signed-off-by: Eric Wong <e@80x24.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-07-12 15:17:42 -07:00
Junio C Hamano
2f84df2ca0 Merge branch 'ep/http-curl-trace'
HTTP transport gained an option to produce more detailed debugging
trace.

* ep/http-curl-trace:
  imap-send.c: introduce the GIT_TRACE_CURL enviroment variable
  http.c: implement the GIT_TRACE_CURL environment variable
2016-07-06 13:38:06 -07:00
Junio C Hamano
e646a82ce2 Merge branch 'bn/http-cookiefile-config' into maint
"http.cookieFile" configuration variable clearly wants a pathname,
but we forgot to treat it as such by e.g. applying tilde expansion.

* bn/http-cookiefile-config:
  http: expand http.cookieFile as a path
  Documentation: config: improve word ordering for http.cookieFile
2016-05-31 14:08:28 -07:00
Elia Pinto
74c682d3c6 http.c: implement the GIT_TRACE_CURL environment variable
Implement the GIT_TRACE_CURL environment variable to allow a
greater degree of detail of GIT_CURL_VERBOSE, in particular
the complete transport header and all the data payload exchanged.
It might be useful if a particular situation could require a more
thorough debugging analysis. Document the new GIT_TRACE_CURL
environment variable.

Helped-by: Torsten Bögershausen <tboegi@web.de>
Helped-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Jeff King <peff@peff.net>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-24 15:48:18 -07:00
Junio C Hamano
40cfc95856 Merge branch 'nd/error-errno'
The code for warning_errno/die_errno has been refactored and a new
error_errno() reporting helper is introduced.

* nd/error-errno: (41 commits)
  wrapper.c: use warning_errno()
  vcs-svn: use error_errno()
  upload-pack.c: use error_errno()
  unpack-trees.c: use error_errno()
  transport-helper.c: use error_errno()
  sha1_file.c: use {error,die,warning}_errno()
  server-info.c: use error_errno()
  sequencer.c: use error_errno()
  run-command.c: use error_errno()
  rerere.c: use error_errno() and warning_errno()
  reachable.c: use error_errno()
  mailmap.c: use error_errno()
  ident.c: use warning_errno()
  http.c: use error_errno() and warning_errno()
  grep.c: use error_errno()
  gpg-interface.c: use error_errno()
  fast-import.c: use error_errno()
  entry.c: use error_errno()
  editor.c: use error_errno()
  diff-no-index.c: use error_errno()
  ...
2016-05-17 14:38:28 -07:00
Junio C Hamano
459000ef63 Merge branch 'bn/http-cookiefile-config'
"http.cookieFile" configuration variable clearly wants a pathname,
but we forgot to treat it as such by e.g. applying tilde expansion.

* bn/http-cookiefile-config:
  http: expand http.cookieFile as a path
  Documentation: config: improve word ordering for http.cookieFile
2016-05-17 14:38:18 -07:00
Nguyễn Thái Ngọc Duy
d2e255eefa http.c: use error_errno() and warning_errno()
Signed-off-by: Nguyễn Thái Ngọc Duy <pclouds@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-09 12:29:08 -07:00
Junio C Hamano
e250f495b2 Merge branch 'js/http-custom-headers'
HTTP transport clients learned to throw extra HTTP headers at the
server, specified via http.extraHeader configuration variable.

* js/http-custom-headers:
  http: support sending custom HTTP headers
2016-05-06 14:45:43 -07:00
Brian Norris
e5a39ad8e6 http: expand http.cookieFile as a path
This should handle .gitconfig files that specify things like:

[http]
	cookieFile = "~/.gitcookies"

Signed-off-by: Brian Norris <computersforpeace@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-05-04 15:59:26 -07:00
Junio C Hamano
6b9eee2bb2 Merge branch 'jc/http-socks5h' into maint
The socks5:// proxy support added back in 2.6.4 days was not aware
that socks5h:// proxies behave differently.

* jc/http-socks5h:
  http: differentiate socks5:// and socks5h://
2016-05-02 14:24:10 -07:00
Johannes Schindelin
8cb01e2fd3 http: support sending custom HTTP headers
We introduce a way to send custom HTTP headers with all requests.

This allows us, for example, to send an extra token from build agents
for temporary access to private repositories. (This is the use case that
triggered this patch.)

This feature can be used like this:

	git -c http.extraheader='Secret: sssh!' fetch $URL $REF

Note that `curl_easy_setopt(..., CURLOPT_HTTPHEADER, ...)` takes only
a single list, overriding any previous call. This means we have to
collect _all_ of the headers we want to use into a single list, and
feed it to cURL in one shot. Since we already unconditionally set a
"pragma" header when initializing the curl handles, we can add our new
headers to that list.

For callers which override the default header list (like probe_rpc),
we provide `http_copy_default_headers()` so they can do the same
trick.

Big thanks to Jeff King and Junio Hamano for their outstanding help and
patient reviews.

Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-27 14:02:33 -07:00
Junio C Hamano
1c4f476900 Merge branch 'jc/http-socks5h'
The socks5:// proxy support added back in 2.6.4 days was not aware
that socks5h:// proxies behave differently.

* jc/http-socks5h:
  http: differentiate socks5:// and socks5h://
2016-04-22 15:45:09 -07:00
Junio C Hamano
87f8a0b279 http: differentiate socks5:// and socks5h://
Felix Ruess <felix.ruess@gmail.com> noticed that with configuration

    $ git config --global 'http.proxy=socks5h://127.0.0.1:1080'

connections to remote sites time out, waiting for DNS resolution.

The logic to detect various flavours of SOCKS proxy and ask the
libcurl layer to use appropriate one understands the proxy string
that begin with socks5, socks4a, etc., but does not know socks5h,
and we end up using CURLPROXY_SOCKS5.  The correct one to use is
CURLPROXY_SOCKS5_HOSTNAME.

https://curl.haxx.se/libcurl/c/CURLOPT_PROXY.html says

  ..., socks5h:// (the last one to enable socks5 and asking the
  proxy to do the resolving, also known as CURLPROXY_SOCKS5_HOSTNAME
  type).

which is consistent with the way the breakage was reported.

Tested-by: Felix Ruess <felix.ruess@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-04-10 11:03:17 -07:00
Junio C Hamano
f4a48e8708 Merge branch 'jx/http-no-proxy'
* jx/http-no-proxy:
  http: honor no_http env variable to bypass proxy
2016-03-10 10:56:43 -08:00
Jiang Xin
d445fda44d http: honor no_http env variable to bypass proxy
Curl and its families honor several proxy related environment variables:

* http_proxy and https_proxy define proxy for http/https connections.
* no_proxy (a comma separated hosts) defines hosts bypass the proxy.

This command will bypass the bad-proxy and connect to the host directly:

    no_proxy=* https_proxy=http://bad-proxy/ \
    curl -sk https://google.com/

Before commit 372370f (http: use credential API to handle proxy auth...),
Environment variable "no_proxy" will take effect if the config variable
"http.proxy" is not set.  So the following comamnd won't fail if not
behind a firewall.

    no_proxy=* https_proxy=http://bad-proxy/ \
    git ls-remote https://github.com/git/git

But commit 372370f not only read git config variable "http.proxy", but
also read "http_proxy" and "https_proxy" environment variables, and set
the curl option using:

    curl_easy_setopt(result, CURLOPT_PROXY, proxy_auth.host);

This caused "no_proxy" environment variable not working any more.

Set extra curl option "CURLOPT_NOPROXY" will fix this issue.

Signed-off-by: Jiang Xin <xin.jiang@huawei.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-29 11:28:39 -08:00
Junio C Hamano
e79112d210 Merge branch 'ce/https-public-key-pinning'
You can now set http.[<url>.]pinnedpubkey to specify the pinned
public key when building with recent enough versions of libcURL.

* ce/https-public-key-pinning:
  http: implement public key pinning
2016-02-24 13:25:58 -08:00
Junio C Hamano
65ba75ba7d Merge branch 'bc/http-empty-auth'
Some authentication methods do not need username or password, but
libcurl needs some hint that it needs to perform authentication.
Supplying an empty username and password string is a valid way to
do so, but you can set the http.[<url>.]emptyAuth configuration
variable to achieve the same, if you find it cleaner.

* bc/http-empty-auth:
  http: add option to try authentication without username
2016-02-24 13:25:57 -08:00
Junio C Hamano
e84d5e9fa1 Merge branch 'ew/force-ipv4'
"git fetch" and friends that make network connections can now be
told to only use ipv4 (or ipv6).

* ew/force-ipv4:
  connect & http: support -4 and -6 switches for remote operations
2016-02-24 13:25:54 -08:00
Christoph Egger
aeff8a6121 http: implement public key pinning
Add the http.pinnedpubkey configuration option for public key
pinning. It allows any string supported by libcurl --
base64(sha256(pubkey)) or filename of the full public key.

If cURL does not support pinning (is too old) output a warning to the
user.

Signed-off-by: Christoph Egger <christoph@christoph-egger.org>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-15 19:21:48 -08:00
brian m. carlson
121061f67f http: add option to try authentication without username
Performing GSS-Negotiate authentication using Kerberos does not require
specifying a username or password, since that information is already
included in the ticket itself.  However, libcurl refuses to perform
authentication if it has not been provided with a username and password.
Add an option, http.emptyAuth, that provides libcurl with an empty
username and password to make it attempt authentication anyway.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-15 14:13:37 -08:00
Eric Wong
c915f11eb4 connect & http: support -4 and -6 switches for remote operations
Sometimes it is necessary to force IPv4-only or IPv6-only operation
on networks where name lookups may return a non-routable address and
stall remote operations.

The ssh(1) command has an equivalent switches which we may pass when
we run them.  There may be old ssh(1) implementations out there
which do not support these switches; they should report the
appropriate error in that case.

rsync support is untouched for now since it is deprecated and
scheduled to be removed.

Signed-off-by: Eric Wong <normalperson@yhbt.net>
Reviewed-by: Torsten Bögershausen <tboegi@web.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-02-12 11:34:14 -08:00
Knut Franke
372370f167 http: use credential API to handle proxy authentication
Currently, the only way to pass proxy credentials to curl is by including them
in the proxy URL. Usually, this means they will end up on disk unencrypted, one
way or another (by inclusion in ~/.gitconfig, shell profile or history). Since
proxy authentication often uses a domain user, credentials can be security
sensitive; therefore, a safer way of passing credentials is desirable.

If the configured proxy contains a username but not a password, query the
credential API for one. Also, make sure we approve/reject proxy credentials
properly.

For consistency reasons, add parsing of http_proxy/https_proxy/all_proxy
environment variables, which would otherwise be evaluated as a fallback by curl.
Without this, we would have different semantics for git configuration and
environment variables.

Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26 10:53:25 -08:00
Knut Franke
ef976395e2 http: allow selection of proxy authentication method
CURLAUTH_ANY does not work with proxies which answer unauthenticated requests
with a 307 redirect to an error page instead of a 407 listing supported
authentication methods. Therefore, allow the authentication method to be set
using the environment variable GIT_HTTP_PROXY_AUTHMETHOD or configuration
variables http.proxyAuthmethod and remote.<name>.proxyAuthmethod (in analogy
to http.proxy and remote.<name>.proxy).

The following values are supported:

* anyauth (default)
* basic
* digest
* negotiate
* ntlm

Signed-off-by: Knut Franke <k.franke@science-computing.de>
Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Helped-by: Junio C Hamano <gitster@pobox.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Helped-by: Elia Pinto <gitter.spiros@gmail.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2016-01-26 10:53:09 -08:00
Junio C Hamano
844a9ce472 Merge branch 'bc/object-id'
More transition from "unsigned char[40]" to "struct object_id".

This needed a few merge fixups, but is mostly disentangled from other
topics.

* bc/object-id:
  remote: convert functions to struct object_id
  Remove get_object_hash.
  Convert struct object to object_id
  Add several uses of get_object_hash.
  object: introduce get_object_hash macro.
  ref_newer: convert to use struct object_id
  push_refs_with_export: convert to struct object_id
  get_remote_heads: convert to struct object_id
  parse_fetch: convert to use struct object_id
  add_sought_entry_mem: convert to struct object_id
  Convert struct ref to use object_id.
  sha1_file: introduce has_object_file helper.
2015-12-10 12:36:13 -08:00
Jeff King
23bc35f3dc Merge branch 'dt/http-range'
Portability fix for a topic already in 'master'.

* dt/http-range:
  http: fix some printf format warnings
2015-12-01 18:54:28 -05:00
Jeff King
40fdcc5357 Merge branch 'maint'
* maint:
  http: treat config options sslCAPath and sslCAInfo as paths
  Documentation/diff: give --word-diff-regex=. example
  filter-branch: deal with object name vs. pathname ambiguity in tree-filter
  check-ignore: correct documentation about output
  git-p4: clean up after p4 submit failure
  git-p4: work with a detached head
  git-p4: add option to system() to return subshell status
  git-p4: add failing test for submit from detached head
  remote-http(s): support SOCKS proxies
  t5813: avoid creating urls that break on cygwin
  Escape Git's exec path in contrib/rerere-train.sh script
  allow hooks to ignore their standard input stream
  rebase-i-exec: Allow space in SHELL_PATH
  Documentation: make environment variable formatting more consistent
2015-12-01 17:32:38 -05:00
Jeff King
712a12e506 Merge branch 'cb/ssl-config-pathnames' into maint
Allow tilde-expansion in some http config variables.

* cb/ssl-config-pathnames:
  http: treat config options sslCAPath and sslCAInfo as paths
2015-12-01 17:21:01 -05:00
Jeff King
92b9bf4a15 Merge branch 'pt/http-socks-proxy' into maint
Add support for talking http/https over socks proxy.

* pt/http-socks-proxy:
  remote-http(s): support SOCKS proxies
2015-12-01 17:19:12 -05:00
Charles Bailey
bf9acba2c1 http: treat config options sslCAPath and sslCAInfo as paths
This enables ~ and ~user expansion for these config options.

Signed-off-by: Charles Bailey <cbailey32@bloomberg.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-24 18:51:00 -05:00
brian m. carlson
f4e54d02b8 Convert struct ref to use object_id.
Use struct object_id in three fields in struct ref and convert all the
necessary places that use it.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-20 08:02:05 -05:00
Pat Thoyts
6d7afe07f2 remote-http(s): support SOCKS proxies
With this patch we properly support SOCKS proxies, configured e.g. like
this:

	git config http.proxy socks5://192.168.67.1:32767

Without this patch, Git mistakenly tries to use SOCKS proxies as if they
were HTTP proxies, resulting in a error message like:

	fatal: unable to access 'http://.../': Proxy CONNECT aborted

This patch was required to work behind a faulty AP and scraped from
http://stackoverflow.com/questions/15227130/#15228479 and guarded with
an appropriate cURL version check by Johannes Schindelin.

Signed-off-by: Pat Thoyts <patthoyts@users.sourceforge.net>
Signed-off-by: Johannes Schindelin <johannes.schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-20 07:31:39 -05:00
Ramsay Jones
838ecf0b0f http: fix some printf format warnings
Commit f8117f55 ("http: use off_t to store partial file size",
02-11-2015) changed the type of some variables from long to off_t.
Unfortunately, the off_t type is not portable and can be represented
by several different actual types (even multiple types on the same
platform). This makes it difficult to print an off_t variable in
a platform independent way. As a result, this commit causes gcc to
issue some printf format warnings on a couple of different platforms.

In order to suppress the warnings, change the format specifier to use
the PRIuMAX macro and cast the off_t argument to uintmax_t. (See also
the http_opt_request_remainder() function, which uses the same
solution).

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
Signed-off-by: Jeff King <peff@peff.net>
2015-11-11 19:10:41 -05:00
Jeff King
f8117f550b http: use off_t to store partial file size
When we try to resume transfer of a partially-downloaded
object or pack, we fopen() the existing file for append,
then use ftell() to get the current position. We use a
"long", which can hold only 2GB on a 32-bit system, even
though packfiles may be larger than that.

Let's switch to using off_t, which should hold any file size
our system is capable of storing. We need to use ftello() to
get the off_t. This is in POSIX and hopefully available
everywhere; if not, we should be able to wrap it by falling
back to ftell(), which would presumably return "-1" on such
a large file (and we would simply skip resuming in that case).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-02 14:19:54 -08:00
David Turner
835c4d3689 http.c: use CURLOPT_RANGE for range requests
A HTTP server is permitted to return a non-range response to a HTTP
range request (and Apache httpd in fact does this in some cases).
While libcurl knows how to correctly handle this (by skipping bytes
before and after the requested range), it only turns on this handling
if it is aware that a range request is being made.  By manually
setting the range header instead of using CURLOPT_RANGE, we were
hiding the fact that this was a range request from libcurl.  This
could cause corruption.

Signed-off-by: David Turner <dturner@twopensource.com>
Reviewed-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-11-02 14:18:06 -08:00
Junio C Hamano
78891795df Merge branch 'jk/war-on-sprintf'
Many allocations that is manually counted (correctly) that are
followed by strcpy/sprintf have been replaced with a less error
prone constructs such as xstrfmt.

Macintosh-specific breakage was noticed and corrected in this
reroll.

* jk/war-on-sprintf: (70 commits)
  name-rev: use strip_suffix to avoid magic numbers
  use strbuf_complete to conditionally append slash
  fsck: use for_each_loose_file_in_objdir
  Makefile: drop D_INO_IN_DIRENT build knob
  fsck: drop inode-sorting code
  convert strncpy to memcpy
  notes: document length of fanout path with a constant
  color: add color_set helper for copying raw colors
  prefer memcpy to strcpy
  help: clean up kfmclient munging
  receive-pack: simplify keep_arg computation
  avoid sprintf and strcpy with flex arrays
  use alloc_ref rather than hand-allocating "struct ref"
  color: add overflow checks for parsing colors
  drop strcpy in favor of raw sha1_to_hex
  use sha1_to_hex_r() instead of strcpy
  daemon: use cld->env_array when re-spawning
  stat_tracking_info: convert to argv_array
  http-push: use an argv_array for setup_revisions
  fetch-pack: use argv_array for index-pack / unpack-objects
  ...
2015-10-20 15:24:01 -07:00
Junio C Hamano
3adc4ec7b9 Sync with v2.5.4 2015-09-28 19:16:54 -07:00
Junio C Hamano
11a458befc Sync with 2.4.10 2015-09-28 15:33:56 -07:00
Junio C Hamano
6343e2f6f2 Sync with 2.3.10 2015-09-28 15:28:31 -07:00
Blake Burkhart
b258116462 http: limit redirection depth
By default, libcurl will follow circular http redirects
forever. Let's put a cap on this so that somebody who can
trigger an automated fetch of an arbitrary repository (e.g.,
for CI) cannot convince git to loop infinitely.

The value chosen is 20, which is the same default that
Firefox uses.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 15:32:28 -07:00
Blake Burkhart
f4113cac0c http: limit redirection to protocol-whitelist
Previously, libcurl would follow redirection to any protocol
it was compiled for support with. This is desirable to allow
redirection from HTTP to HTTPS. However, it would even
successfully allow redirection from HTTP to SFTP, a protocol
that git does not otherwise support at all. Furthermore
git's new protocol-whitelisting could be bypassed by
following a redirect within the remote helper, as it was
only enforced at transport selection time.

This patch limits redirects within libcurl to HTTP, HTTPS,
FTP and FTPS. If there is a protocol-whitelist present, this
list is limited to those also allowed by the whitelist. As
redirection happens from within libcurl, it is impossible
for an HTTP redirect to a protocol implemented within
another remote helper.

When the curl version git was compiled with is too old to
support restrictions on protocol redirection, we warn the
user if GIT_ALLOW_PROTOCOL restrictions were requested. This
is a little inaccurate, as even without that variable in the
environment, we would still restrict SFTP, etc, and we do
not warn in that case. But anything else means we would
literally warn every time git accesses an http remote.

This commit includes a test, but it is not as robust as we
would hope. It redirects an http request to ftp, and checks
that curl complained about the protocol, which means that we
are relying on curl's specific error message to know what
happened. Ideally we would redirect to a working ftp server
and confirm that we can clone without protocol restrictions,
and not with them. But we do not have a portable way of
providing an ftp server, nor any other protocol that curl
supports (https is the closest, but we would have to deal
with certificates).

[jk: added test and version warning]

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 15:30:39 -07:00
Jeff King
9ae97018fb use strip_suffix and xstrfmt to replace suffix
When we want to convert "foo.pack" to "foo.idx", we do it by
duplicating the original string and then munging the bytes
in place. Let's use strip_suffix and xstrfmt instead, which
has several advantages:

  1. It's more clear what the intent is.

  2. It does not implicitly rely on the fact that
     strlen(".idx") <= strlen(".pack") to avoid an overflow.

  3. We communicate the assumption that the input file ends
     with ".pack" (and get a run-time check that this is so).

  4. We drop calls to strcpy, which makes auditing the code
     base easier.

Likewise, we can do this to convert ".pack" to ".bitmap",
avoiding some manual memory computation.

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Jeff King
5096d4909f convert trivial sprintf / strcpy calls to xsnprintf
We sometimes sprintf into fixed-size buffers when we know
that the buffer is large enough to fit the input (either
because it's a constant, or because it's numeric input that
is bounded in size). Likewise with strcpy of constant
strings.

However, these sites make it hard to audit sprintf and
strcpy calls for buffer overflows, as a reader has to
cross-reference the size of the array with the input. Let's
use xsnprintf instead, which communicates to a reader that
we don't expect this to overflow (and catches the mistake in
case we do).

Signed-off-by: Jeff King <peff@peff.net>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-09-25 10:18:18 -07:00
Junio C Hamano
ed070a4007 Merge branch 'ep/http-configure-ssl-version'
A new configuration variable http.sslVersion can be used to specify
what specific version of SSL/TLS to use to make a connection.

* ep/http-configure-ssl-version:
  http: add support for specifying the SSL version
2015-08-26 15:45:31 -07:00
Junio C Hamano
51a22ce147 Merge branch 'jc/finalize-temp-file'
Long overdue micro clean-up.

* jc/finalize-temp-file:
  sha1_file.c: rename move_temp_to_file() to finalize_object_file()
2015-08-19 14:48:55 -07:00
Elia Pinto
01861cb7a2 http: add support for specifying the SSL version
Teach git about a new option, "http.sslVersion", which permits one
to specify the SSL version to use when negotiating SSL connections.
The setting can be overridden by the GIT_SSL_VERSION environment
variable.

Signed-off-by: Elia Pinto <gitter.spiros@gmail.com>
Helped-by: Eric Sunshine <sunshine@sunshineco.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-17 10:16:34 -07:00
Junio C Hamano
cb5add5868 sha1_file.c: rename move_temp_to_file() to finalize_object_file()
Since 5a688fe4 ("core.sharedrepository = 0mode" should set, not
loosen, 2009-03-25), we kept reminding ourselves:

    NEEDSWORK: this should be renamed to finalize_temp_file() as
    "moving" is only a part of what it does, when no patch between
    master to pu changes the call sites of this function.

without doing anything about it.  Let's do so.

The purpose of this function was not to move but to finalize.  The
detail of the primarily implementation of finalizing was to link the
temporary file to its final name and then to unlink, which wasn't
even "moving".  The alternative implementation did "move" by calling
rename(2), which is a fun tangent.

Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-08-10 11:10:37 -07:00
Junio C Hamano
0e521a41b5 Merge branch 'et/http-proxyauth'
We used to ask libCURL to use the most secure authentication method
available when talking to an HTTP proxy only when we were told to
talk to one via configuration variables.  We now ask libCURL to
always use the most secure authentication method, because the user
can tell libCURL to use an HTTP proxy via an environment variable
without using configuration variables.

* et/http-proxyauth:
  http: always use any proxy auth method available
2015-07-13 14:00:24 -07:00
Enrique Tobis
5841520b03 http: always use any proxy auth method available
We set CURLOPT_PROXYAUTH to use the most secure authentication
method available only when the user has set configuration variables
to specify a proxy.  However, libcurl also supports specifying a
proxy through environment variables.  In that case libcurl defaults
to only using the Basic proxy authentication method, because we do
not use CURLOPT_PROXYAUTH.

Set CURLOPT_PROXYAUTH to always use the most secure authentication
method available, even when there is no git configuration telling us
to use a proxy. This allows the user to use environment variables to
configure a proxy that requires an authentication method different
from Basic.

Signed-off-by: Enrique A. Tobis <etobis@twosigma.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-06-29 09:57:43 -07:00
Junio C Hamano
39fa79178f Merge branch 'ls/http-ssl-cipher-list'
Introduce http.<url>.SSLCipherList configuration variable to tweak
the list of cipher suite to be used with libcURL when talking with
https:// sites.

* ls/http-ssl-cipher-list:
  http: add support for specifying an SSL cipher list
2015-05-22 12:41:45 -07:00
Lars Kellogg-Stedman
f6f2a9e42d http: add support for specifying an SSL cipher list
Teach git about a new option, "http.sslCipherList", which permits one to
specify a list of ciphers to use when negotiating SSL connections.  The
setting can be overwridden by the GIT_SSL_CIPHER_LIST environment
variable.

Signed-off-by: Lars Kellogg-Stedman <lars@redhat.com>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
2015-05-08 10:56:26 -07:00