The hope is that this will result in more resilient comment handling that is
more consistent with rustdoc.
I also hoped for less code, but `syn` does less than I had expected,
requiring us to copy code over from other parts of Rust. It seems every
proc macro has to do this, yet there is no guide for it, so they all do it
differently, covering only the cases they thought to test for.
Note that this still won't support `include_str!()`.
Previously, fetches and clones would routinely fail with a panic
indicating that pack negotiation can't take longer than one round,
a limitation of our previous `Naive` approach.
With this version of `gitoxide` there is now faithful support for both
the `consecutive` and the `skipping` algorithms and for multiple rounds of
negotiation, which should make all clones and fetches possible.
Support for shallow clones and fetches with `gitoxide`
This PR makes it possible to enable shallow clones and fetches for git dependencies and crate indices independently, with `-Zgitoxide=fetch,shallow_deps` and `-Zgitoxide=fetch,shallow_index` respectively.
### Tasks
* [x] set up the shallow option when fetching, differentiated by 'registry' and 'git-dependency'
* [x] validate registries are cloned shallowly *and* fetched shallowly
* [x] validate git-dependencies are cloned shallowly *and* fetched shallowly
* [x] a test to show what happens if a shallow index is opened with `git2` (*it can open it and fetch like normal, no issues*)
* [x] assure that `git2` can safely operate on a shallow clone - we unshallow it beforehand, both for registries and git dependencies
* [x] assure git-deps with revisions are handled correctly (they should just not be shallow, and they should unshallow themselves if they are)
* [x] make sure shallow index clones aren't seen by older cargos
* [x] make sure shallow git dependency clones aren't seen by older cargos
* [x] shallow.lock test and more test-suite runs with shallow clones enabled for everything
* [x] release new version of `gix` with full shallow support and use it here
* [x] check why `shallow` files remain after unshallowing. Should they not rather be deleted if empty? - Yes, `git` does so as well, implemented [with this commit](2cd5054b0a)
* ~~see if it can be avoided to ever unshallow an existing `-shallow` clone by using the right location from the start. If not, test that we can go `shallow->unshallow->shallow` without a hitch.~~ Cannot happen anymore as it can predict the final location perfectly.
* [x] `Cargo.lock` files don't prevent shallow clones
* [x] assure all other tests work with shallow cloning enabled (or fix the ones that don't with regression protection)
* [x] can the 'split-brain' issue be solved for good?
### Review Notes
* there is a chance of 'split brain' in git-dependencies as the logic for determining whether the clone/fetch is shallow is repeated in two places. This isn't the case for registries though.
### Notes
* I am highlighting that this is the `gitoxide` version of shallow clones as the `git2` version [might soon be available](https://github.com/libgit2/libgit2/pull/6396) as well. Having that would be good as it would ensure interoperability remains intact.
* Maybe for when `git2` has been phased out, i.e. everything else is working: I think (unscientifically) there might be benefits in using worktrees for checkouts. Admittedly I don't know the history of why they weren't used in the first place. Also: `gitoxide` doesn't yet support local clones and might not have to if worktrees were used instead.
The implementation hinges on passing information about the kind of clone
and fetch to the `fetch()` method, which then configures the fetch accordingly.
Note that it doesn't differentiate between initial clones and fetches as
the shallow-ness of the repository is maintained nonetheless.
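As a rough illustration of that idea, here is a minimal sketch; every name in it is hypothetical and invented for illustration, not Cargo's actual internal API:
```rust
// Hypothetical sketch: the caller tells `fetch()` how much history it
// needs, and the fetch is configured accordingly.
#[derive(Clone, Copy)]
enum HistoryDepth {
    /// Full history, e.g. for git dependencies pinned to a revision.
    Full,
    /// Shallow: only the most recent commits.
    Shallow,
}

fn fetch(url: &str, depth: HistoryDepth) {
    match depth {
        HistoryDepth::Shallow => {
            // configure the transport for a shallow fetch (depth = 1)
            println!("shallow fetch of {url}");
        }
        HistoryDepth::Full => {
            // fetch (or unshallow to) the full history
            println!("full fetch of {url}");
        }
    }
}

fn main() {
    fetch("https://github.com/rust-lang/crates.io-index", HistoryDepth::Shallow);
}
```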
This is a follow-up to #12039.
This makes it easier for tools to report less irrelevant information.
I set both `publish = false` and `version = "0.0.0"` to help draw
attention to the fact that these crates are internal (inspired by a
matklad post).
I left `cargo-test-macro` and `cargo-test-support` in for my own
personal bias of one day wanting to see those crates published...
The only one removed that had previously been published was `mdman`, but
seeing as that was at `0.0.0`, I'm assuming that was a mistake or just
reserving the name.
Before:
```console
$ cargo unpublished
name published current
==== ========= =======
cargo-platform 0.1.2 0.1.3
cargo-test-macro - 0.1.0
cargo-test-support - 0.1.0
cargo-util 0.2.3 0.2.4
crates-io 0.36.0 0.36.1
mdman 0.0.0 0.1.0
resolver-tests - 0.1.0
cargo 0.70.1 0.72.0
semver-check - 0.1.0
cargo-credential 0.1.0 0.2.0
cargo-credential-1password 0.1.0 0.2.0
cargo-credential-gnome-secret 0.1.0 0.2.0
cargo-credential-macos-keychain 0.1.0 0.2.0
cargo-credential-wincred 0.1.0 0.2.0
benchsuite - 0.1.0
```
After:
```console
name published current
==== ========= =======
cargo-platform 0.1.2 0.1.3
cargo-test-macro - 0.1.0
cargo-test-support - 0.1.0
cargo-util 0.2.3 0.2.4
crates-io 0.36.0 0.36.1
cargo 0.70.1 0.72.0
cargo-credential 0.1.0 0.2.0
cargo-credential-1password 0.1.0 0.2.0
cargo-credential-gnome-secret 0.1.0 0.2.0
cargo-credential-macos-keychain 0.1.0 0.2.0
cargo-credential-wincred 0.1.0 0.2.0
```
chore(xtask): Add `cargo xtask unpublished`
### What does this PR try to resolve?
This tries to make it easy to see which crate versions have not been published yet. A future version of this could post to a PR the current delta in version numbers for touched crates, so reviewers have more context when deciding whether they should ask for a crate version to be bumped
```console
$ cargo unpublished
Finished dev [unoptimized + debuginfo] target(s) in 0.12s
Running `/home/epage/src/personal/cargo/target/debug/xtask unpublished`
Updating crates.io index
name published current
==== ========= =======
cargo-test-macro - 0.1.0
cargo-test-support - 0.1.0
cargo-util 0.2.3 0.2.4
mdman 0.0.0 0.1.0
resolver-tests - 0.1.0
cargo 0.70.0 0.71.0
cargo-credential 0.1.0 0.2.0
cargo-credential-1password 0.1.0 0.2.0
cargo-credential-gnome-secret 0.1.0 0.2.0
cargo-credential-macos-keychain 0.1.0 0.2.0
cargo-credential-wincred 0.1.0 0.2.0
benchsuite - 0.1.0
```
Room for improvement
- Aligning the start of each column
- Filtering the list by a commit range
- Adding this to an action to post to a review
- Maybe sorting the output
- Marking some of our crates as `package.publish = false`, like benchsuite and resolver-tests
### How should we test and review this PR?
This is broken down commit by commit to make it easier to see the building blocks for our first xtask
This will allow running an xtask without having to build the world.
In most cases, a user will already have built cargo, but that won't be
true in CI.
The packages keep an `xtask-` prefix to help raise awareness of them, but
they are exposed as `cargo <suffix>` to avoid needing a proxy that wraps
`cargo run -p xtask-<suffix>` as `cargo xtask <suffix>`.
Update windows-sys
This updates the windows-sys dependency from 0.45 to 0.48. This shouldn't add or remove any duplicate dependencies (since there are other dependencies still using 0.45 and 0.42). The intent is to move it along the direction towards unifying in the future (though it seems like a moving target that will be difficult to ever hit).
This also bumps the home crate version. I think it should be OK to make the migration from winapi to windows-sys a patch version, though there seem to be some issues with the way windows-sys works that could introduce some build-time problems in some situations (such as those encountered in https://github.com/rust-lang/rust/pull/108665 and https://github.com/rust-lang/rust/pull/106610). However, I don't expect too much of an issue.
Bump libc to 0.2.142
libc 0.2.141 cannot build successfully on AIX. (CI on AIX is not available yet)
Upgrade libc to 0.2.142 to make cargo build on AIX.
Update openssl-src to 111.25.3+1.1.1t
### What does this PR try to resolve?
Support for LoongArch has been added to `openssl-src` starting from `v111.25.3`. Therefore, we have updated the version to include this support.
Thanks
There's now a lock file upstream in rust-lang/rust so the one here isn't
actually used, and otherwise this crate is used as a dependency so the lock file
isn't respected anyway!
Fixes #4326
`cargo package` (and so `cargo publish`) parses a crate’s `Cargo.toml`,
makes some modifications, and re-serializes it.
Because the `TomlManifest` struct uses `HashMap`
with its default `RandomState` hasher,
the maps’ iteration order changed on every run.
As a result, when using `cargo vendor`,
updating a dependency would generate a diff larger than necessary,
with non-significant order-changes obscuring significant changes.
This replaces some uses of `HashMap` with `BTreeMap`,
whose iteration order is deterministic (based on `Ord`).
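A minimal sketch of why this fixes the diff churn, using only the standard library:
```rust
use std::collections::BTreeMap;

fn main() {
    let mut deps = BTreeMap::new();
    deps.insert("serde", "1.0");
    deps.insert("libc", "0.2");
    deps.insert("anyhow", "1.0");

    // Iteration follows `Ord` on the keys (anyhow, libc, serde): the same
    // order on every run, so re-serialized manifests diff cleanly. A
    // HashMap with RandomState would yield a different order each run.
    for (name, version) in &deps {
        println!("{name} = \"{version}\"");
    }
}
```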
Use `same-file` to avoid unnecessary hard links
This is targeted at removing the need for a workaround in rust-lang/rust#39518,
allowing the main rust build system to move back to hard links which should be
much more efficient.
Add gitignore-like pattern matching logic to `list_files()` and warn
for paths that get different inclusion/exclusion results from
the old and the new methods.
Migration Tracking: <https://github.com/rust-lang/cargo/issues/4268>
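For illustration, a hedged sketch of gitignore-style matching using the `ignore` crate (assumed here; Cargo's actual integration may differ, and the patterns and paths are made up):
```rust
use ignore::gitignore::GitignoreBuilder;

fn main() {
    let mut builder = GitignoreBuilder::new("/project");
    // Each `exclude` entry behaves like a line in a .gitignore file.
    builder.add_line(None, "*.txt").unwrap();
    builder.add_line(None, "!keep.txt").unwrap(); // negation re-includes
    let matcher = builder.build().unwrap();

    let hit = matcher.matched("/project/notes.txt", /* is_dir */ false);
    println!("notes.txt excluded: {}", hit.is_ignore()); // true

    let kept = matcher.matched("/project/keep.txt", false);
    println!("keep.txt excluded:  {}", kept.is_ignore()); // false
}
```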
The API of `termcolor` fits what the system gives us much more nicely and should
be well battle-tested from ripgrep. Additionally, we don't really need huge
terminfo parsers; that was never really the intention of the color support
here.
Removing some allocations around the stored hashes by having nested hash maps
instead of tuple keys. Also remove an intermediate array when parsing
dependencies through a custom implementation of `Deserialize`. While this
doesn't make this code path blazingly fast, it definitely knocks it down in the
profiles below other higher-value targets.
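For illustration, a small sketch of the nested-map shape (types simplified): each level can be queried with a borrowed `&str`, whereas a tuple key forces building an owned `(String, String)` for every lookup.
```rust
use std::collections::HashMap;

fn main() {
    // Before (conceptually): HashMap<(String, String), u64>, where every
    // lookup needs an owned tuple. After: nested maps, queried by &str.
    let mut hashes: HashMap<String, HashMap<String, u64>> = HashMap::new();
    hashes
        .entry("serde".to_string())
        .or_default()
        .insert("1.0.0".to_string(), 42);

    // No allocation on the lookup path:
    if let Some(hash) = hashes.get("serde").and_then(|m| m.get("1.0.0")) {
        println!("hash = {hash}");
    }
}
```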
This commit is a relatively serious optimization pass of the resolution phase in
Cargo, targeted at removing as many allocations as possible from this phase.
Executed as an iterative loop this phase of Cargo can often be costly for large
graphs but it's run on every single build!
The main optimization here is to avoid cloning the context and/or pushing a
backtracking frame if there are no candidates left in the current list of
candidates. That optimizes a fast-path for crates with lock files (almost all of
them) and gets us to the point where cloning the context basically disappears
from all profiling.
Add a GNU make jobserver implementation to Cargo
This commit adds a GNU make jobserver implementation to Cargo, both as a client
of existing jobservers and also a creator of new jobservers. The jobserver is
actually just an IPC semaphore which manifests itself as a pipe with N bytes
of tokens on Unix and a literal IPC semaphore on Windows. The rough protocol
is then: if you want to run a job you acquire the semaphore (read a byte on
Unix or wait on the semaphore on Windows) and then you release it when you're
done.
All the hairy details of the jobserver implementation are housed in the
`jobserver` crate on crates.io instead of Cargo. This should hopefully make it
much easier for the compiler to also share a jobserver implementation
eventually.
The main tricky bit here is that on Unix and Windows acquiring a jobserver token
will block the calling thread. We need to either wait for a running job to exit
or to acquire a new token when we want to spawn a new job. To handle this the
current implementation spawns a helper thread that does the blocking and sends a
message back to Cargo when it receives a token. It's a little trickier with
shutting down this thread gracefully as well but more details can be found in
the `jobserver` crate.
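As a rough illustration of the client side, a minimal sketch against the `jobserver` crate's public API (the token count and job body here are made up):
```rust
use jobserver::Client;

fn main() -> std::io::Result<()> {
    // Inherit a jobserver from the environment (e.g. an outer `make`), or
    // create a fresh one with 4 tokens if none was passed down. `from_env`
    // is unsafe because it trusts inherited file descriptors.
    let client = match unsafe { Client::from_env() } {
        Some(client) => client,
        None => Client::new(4)?,
    };

    // Block until a token is available. The returned guard gives the
    // token back to the semaphore when dropped.
    let token = client.acquire()?;
    // ... run one parallel job here ...
    drop(token); // release the token so another job may start

    Ok(())
}
```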
Unfortunately crates are unlikely to see an immediate benefit of this once
implemented. Most crates are run with a manual `make -jN` and this overrides the
jobserver in the environment, creating a new jobserver in the sub-make. If the
`-jN` argument is removed, however, then `make` will share Cargo's jobserver and
properly limit parallelism.
Closes #1744
Convert CargoResult and CargoError into an implementation provided by error-chain. The previous is_human machinery is mostly removed; now errors are displayed unless they are of the Internal kind, and verbose mode will print all errors.
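A rough sketch of the error-chain pattern being adopted; the exact generated type names and error kinds in Cargo's tree may differ:
```rust
#[macro_use]
extern crate error_chain;

error_chain! {
    // Generated type names (illustrative):
    types {
        CargoError, CargoErrorKind, CargoResultExt, CargoResult;
    }
    errors {
        // Internal errors are hidden from normal output and only shown
        // in verbose mode, replacing the old `is_human` flag.
        Internal(msg: String) {
            description("internal cargo error")
            display("internal cargo error: {}", msg)
        }
    }
}

fn main() {
    let err: CargoError = CargoErrorKind::Internal("oops".into()).into();
    println!("{}", err);
}
```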
Add year to project template variables
This adds the current year as a `year` variable for project templates. Some license files / headers include the year, so this should make it easier to include those in a template.
This commit migrates Cargo as much as possible from rustc-serialize to
Serde. This not only provides an excellent testing ground for the toml
0.3 release but it also is a big boost to the speed of parsing the JSON
bits of the registry.
This doesn't completely excise the dependency just yet as docopt still
requires it along with handlebars. I'm sure though that in time those
crates will migrate to serde!
Update libgit2-sys, openssl-sys and openssl
as the versions currently in Cargo.lock are
not compatible with libressl.
Also update tests accordingly
Signed-off-by: Marc-Antoine Perennou <Marc-Antoine@Perennou.com>
PR #3004: This is a resubmission of PR #1747 (from scratch), which adds
support for templating in Cargo. The templates are implemented using the
handlebars crate (where the original PR used mustache).
Examples:
cargo new --template https://url/to/template somedir foo
cargo new --template https://url/to/templates --template-subdir somedir foo
cargo new --template ../path/to/template somedir foo
Update libssh2 to fix a segfault on Windows
There's some more discussion on #3401, but this essentially is just an inclusion
of libssh2/libssh2#163.
Closes #3401
This just switches libz to always link statically instead of relying on the
system zlib. For MinGW it seems that linkage may by default pull in a DLL, which is
almost never what we want.
Also update curl-sys to fix a build issue on MinGW.
Closes #3384
This updates our AppVeyor builds to compile with `-Ctarget-feature=+crt-static`
to help Cargo be a bit more portable and not rely on the MSVC redistributable
artifacts. Over time this may even let us converge on only releasing one build
of Cargo and just pairing that with all Windows toolchains...
This commit includes alexcrichton/git2-rs@a8f4a7faa which switches the order of
initialization of libgit2. That commit ensures that the relevant env vars which
a statically linked OpenSSL needs to function are set before libgit2 is
initialized to ensure that libgit2 uses them.
This was regressed accidentally in alexcrichton/git2-rs@071902aa when
initialization was tweaked.
Closes #3340
This updates libgit2/libssh2 bindings to fix initialization races in
OpenSSL. This should fix some of the spurious segfaults we've been
seeing on Travis OSX.
The primary targets here are openssl and openssl-sys crates 0.9,
bringing support for OpenSSL 1.1.0. This requires updating the curl
and git2 related dependencies as well.
A small change is required in cargo itself for the new Hasher API.
Results from the hasher are simply unwrapped for now, matching the
Windows behavior that already panics on error.
Leak mspdbsrv.exe processes on Windows
Instead of having our job object tear them down, leak them intentionally
if everything succeeded.
Closes #3161
This commit alters Cargo's behavior when the `-vv` option is passed (two verbose
flags) to stream the output of all build scripts to the console. Cargo makes no
attempt to prevent interleaving or to indicate *which* build script is producing
output; rather, it simply forwards all output to the console.
Cargo still acts as a middle-man, capturing the output, so it can parse build
script output and interpret the results. The parsing is still deferred to
completion, but the streaming happens while the build script is running.
On Unix this is implemented via `select` and on Windows this is implemented via
IOCP.
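A simplified, portable sketch of the "stream while capturing" idea (a blocking reader loop rather than `select`/IOCP; the script path is made up):
```rust
use std::io::{BufRead, BufReader};
use std::process::{Command, Stdio};

fn main() -> std::io::Result<()> {
    let mut child = Command::new("./target/build/foo/build-script-build")
        .stdout(Stdio::piped())
        .spawn()?;

    let mut captured = String::new();
    let reader = BufReader::new(child.stdout.take().unwrap());
    for line in reader.lines() {
        let line = line?;
        println!("{line}"); // forward to the console immediately
        captured.push_str(&line); // keep a copy for deferred parsing
        captured.push('\n');
    }
    child.wait()?;
    // ... once the script exits, parse `captured` for `cargo:` directives ...
    Ok(())
}
```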
Closes #1106
Update TOML parser to pick up a bugfix
Cargo has previously accepted invalid TOML as valid, but this bugfix should fix
the problem. In order to prevent breaking all crates immediately toml-rs has a
compatibility mode which emulates the bug that was fixed. Cargo will issue a
warning if this compatibility is required to parse a crate.
Compiling everything in one binary was getting annoying as it just took forever
to build; instead, shard it all up so we can build just particular test suites
at a time.
Dearest Reviewer,
This branch resolves #1602, which relates to retrying network
issues automatically. There is a new utility helper for retrying
any network call.
There is a new config value called `net.retry` in the `.cargo/config`
file. The default value is 2. The documentation has also been
updated to reflect the new value.
Thanks
Becker
Picks up a fix that hopefully configures OpenSSL correctly to be enabled in
cross-compiled situations where OpenSSL comes from a different location
(currently specified by the `OPENSSL_ROOT_DIR` environment variable that libssh2
also reads).
This commit beefs up Cargo's makefiles to support nightly builds of Cargo for
multiple platforms. This primarily involves vendoring the logic of how to build
OpenSSL for statically linking against Cargo into the Makefiles directly. We'll
have to update the version of OpenSSL as releases are made, but we essentially
already do that with the normal docker container.
The Linux nightlies will still run in the normal dist docker container (a really
old CentOS build) and builds for new platforms will happen in the standard
linux-cross container we use for other cross builds. The nightly versions of
these will produce Cargo tarballs for a whole bunch of platforms to get
uploaded.
This has been tested in the `alexcrichton/rust-slave-linux-cross:2016-03-17b`
docker container for the 3 ARM targets and FreeBSD target. NetBSD will come once
rust-lang/rust#32407 lands.
Cargo has historically had no protections against running it concurrently. This
is pretty unfortunate, however, as it essentially just means that you can only
run one instance of Cargo at a time **globally on a system**.
An "easy solution" to this would be the use of file locks, except they need to
be applied judiciously. It'd be a pretty bad experience to just lock the entire
system globally for Cargo (although it would work), but otherwise Cargo must be
principled how it accesses the filesystem to ensure that locks are properly
held. This commit intends to solve all of these problems.
A new utility module is added to cargo, `util::flock`, which contains two types:
* `FileLock` - a locked version of a `File`. This RAII guard will unlock the
lock on `Drop` and I/O can be performed through this object. The actual
underlying `Path` can be read from this object as well.
* `Filesystem` - an unlocked representation of a `Path`. There is no "safe"
method to access the underlying path without locking a file on the filesystem
first.
Built on the [fs2] library, these locks use the `flock` system call on Unix and
`LockFileEx` on Windows. File locking on Unix is [documented as not so
great][unix-bad], largely because of NFS, because the locks are merely advisory,
and because there's no byte-range locking; these issues don't necessarily plague
Cargo, however, so we should still be able to leverage the locks. On both
Windows and Unix the file
locks are released when the underlying OS handle is closed, which means that
if the process dies the locks are released.
Cargo has a number of global resources which it now needs to lock, and the
strategy is done in a fairly straightforward way:
* Each registry's index contains one lock (a dotfile in the index). Updating the
index requires a read/write lock while reading the index requires a shared
lock. This should allow each process to ensure a registry update happens while
not blocking out others for an unnecessarily long time. Additionally any
number of processes can read the index.
* When downloading crates, each downloaded crate is individually locked. A lock
for the downloaded crate implies a lock on the output directory as well.
Because downloaded crates are immutable, once the downloaded directory exists
the lock is no longer needed as it won't be modified, so it can be released.
This granularity of locking allows multiple Cargo instances to download
dependencies in parallel.
* Git repositories have separate locks for the database and for the project
checkout. The database and checkout are locked for read/write access when an
update is performed, and the lock of the checkout is held for the entire
lifetime of the git source. This is done to ensure that any other Cargo
processes must wait while we use the git repository. Unfortunately there's
just not that much parallelism here.
* Binaries managed by `cargo install` are locked by the local metadata file that
Cargo manages. This is relatively straightforward.
* The actual artifact output directory is just globally locked for the entire
build. It's hypothesized that running Cargo concurrently in *one directory* is
less of a needed feature than running multiple instances of Cargo
globally (for now at least). It would be possible to have finer-grained
locking here, but that can likely be deferred to a future PR.
So with all of this infrastructure in place, Cargo is now ready to grab some
locks and ensure that you can call it concurrently anywhere at any time and
everything always works out as one might expect.
One interesting question, however, is what does Cargo do on contention? On one
hand Cargo could immediately abort, but this would lead to a pretty poor UI as
any Cargo process on the system could kick out any other. Instead this PR takes
a more nuanced approach.
* First, Cargo tries to acquire all locks without blocking (a "try lock"). If this
succeeds, we're done.
* Next, Cargo prints a message to the console that it's going to block waiting
for a lock. This is done because it's indeterminate how long Cargo will wait
for the lock to become available, and most long-lasting operations in Cargo
have a message printed for them.
* Finally, a blocking acquisition of the lock is issued and we wait for it to
become available.
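A minimal sketch of that acquisition sequence using the [fs2] extension trait (the wrapper type and message wording are illustrative):
```rust
use fs2::FileExt;
use std::fs::File;
use std::io;

/// RAII guard in the spirit of Cargo's `FileLock`: unlocks on Drop.
struct FileLock(File);

impl Drop for FileLock {
    fn drop(&mut self) {
        let _ = self.0.unlock();
    }
}

fn acquire(path: &str) -> io::Result<FileLock> {
    let f = File::create(path)?;
    // 1. Try to acquire the lock without blocking.
    if f.try_lock_exclusive().is_err() {
        // 2. Tell the user we're about to block for an unknown duration.
        eprintln!("Blocking: waiting for file lock on {path}");
        // 3. Block until the lock becomes available
        //    (flock(2) on Unix, LockFileEx on Windows).
        f.lock_exclusive()?;
    }
    Ok(FileLock(f))
}

fn main() -> io::Result<()> {
    let _guard = acquire("/tmp/example.lock")?;
    // ... exclusive access while `_guard` lives ...
    Ok(())
}
```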
So all in all this should help Cargo fix any future concurrency bugs with file
locking in a principled fashion while also allowing concurrent Cargo processes
to proceed reasonably across the system.
[fs2]: https://github.com/danburkert/fs2-rs
[unix-bad]: http://0pointer.de/blog/projects/locking.html
Closes #354
`build.rustflags` is treated exactly like `RUSTFLAGS`.
It is a list, so argument lists with spaces work.
`RUSTFLAGS` takes precedence, then `build.rustflags`.
This passes RUSTFLAGS to rustc builds for the target architecture.
We don't want to pass the RUSTFLAGS args to multiple architectures because
they may contain architecture-specific flags. Ideally, the scheme
we would use would treat plugins and build scripts - which may not
be for the target architecture - consistently. Unfortunately it's
quite difficult in the current Cargo architecture to separately
identify build scripts, plugins and their dependencies from
code used by the target.
So the scheme here is very simple:
1) If --target is not specified, RUSTFLAGS applies to all builds.
2) If --target is specified, RUSTFLAGS only applies to builds
with the Kind::Target target kind, which indicates build units
derived from the requested --target.
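A hedged, free-standing sketch of that decision logic (illustrative only, not Cargo's internal API):
```rust
// Which units receive RUSTFLAGS under the scheme above.
#[derive(PartialEq)]
enum Kind {
    Host,   // build scripts, plugins, and their dependencies
    Target, // units derived from the requested --target
}

fn applies_rustflags(explicit_target: bool, kind: Kind) -> bool {
    if explicit_target {
        kind == Kind::Target // rule 2: only Kind::Target units
    } else {
        true // rule 1: no --target, so flags apply to all builds
    }
}

fn rustflags(env: Option<Vec<String>>, config: Option<Vec<String>>) -> Vec<String> {
    // RUSTFLAGS takes precedence over build.rustflags.
    env.or(config).unwrap_or_default()
}

fn main() {
    assert!(applies_rustflags(false, Kind::Host));
    assert!(!applies_rustflags(true, Kind::Host));
    println!("{:?}", rustflags(None, Some(vec!["-Cdebuginfo=2".into()])));
}
```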
Closes #2112
This crate was recently updated to the next release of libgit2, and I've noticed
historically that a noop `cargo build` was slow in the git2-rs repository.
Curious to see if the new libgit2 version helped speed things up at all, I
tested it out.
Before this commit, a noop `cargo build` produced 599108 syscalls. After this
commit, a noop build produced 86925 syscalls, an 85% reduction in the number of
syscalls! Needless to say it's much faster.
Before v0.4, term used to return `Ok(true)` if something succeeded,
`Ok(false)` if the operation was unsupported, and `Err(io::Error)` if
there was an IO error. Now, it returns `Ok(())` if the operation
succeeds and `Err(term::Error)` if the operation fails. If the operation
is unsupported, it returns `Err(term::Error::NotSupported)`. This means
that, if `op` is unsupported, `try!(term.op())` will now return an error
instead of silently failing (well, returning false, but that's effectively
silent).
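For illustration, the kind of call-site change this implies; a hedged sketch against the `term` crate's post-0.4 API:
```rust
use std::io::Write;

// Unsupported operations are now errors, so a caller that wants the old
// "silently skip" behavior must match on them explicitly.
fn set_color(t: &mut dyn term::Terminal<Output = std::io::Stdout>) -> term::Result<()> {
    match t.fg(term::color::GREEN) {
        Ok(()) => Ok(()),
        // Treat "unsupported" like the old Ok(false): skip coloring.
        Err(term::Error::NotSupported) => Ok(()),
        Err(e) => Err(e),
    }
}

fn main() {
    if let Some(mut t) = term::stdout() {
        let _ = set_color(&mut *t);
        let _ = writeln!(t, "hello");
        let _ = t.reset();
    }
}
```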
Fixes #2338
This is undoing part of commit dd34296.
Fix #2338
(Unfortunately I do not have a suggestion for how to make a unit test
for this problem; doing so would require putting in a bit more effort
than I have time for at the moment.)
This commit is targeted at fixing #2102 via two routes:
1. The dependency on `tar` was upgraded to include more contextual information
in error messages about why the unpack failed. This should help diagnose
these sorts of issues that happen in the first place.
2. Packaging crates that have files with odd filenames is no longer allowed.
An error is returned indicating that the files cannot be packaged as they're
not cross-platform. The currently rejected files are those with non-UTF-8
filenames (a check already present) and those containing characters that are
special on Windows.
Closes #2102
* Move along the same rails for all dependencies, picking up various small perf,
build, and portability improvements.
* Update pinned rustc to pick up perf improvements and such
* Tweak expected error message from tests to continue to work
* Updates git2-rs back to 0.3 now that the distribution issue on OSX has been
fixed.
* Updates libgit2-sys to using the `cmake` crate so building with VS 2015 can
work.
* Update pkg-config to totally disable it on MSVC (basically guaranteed to never
work)
This commit overhauls how a `Fingerprint` is stored on the filesystem and
in-memory to help provide much better diagnostics as to why crates are being
rebuilt. This involves storing more structured data on the filesystem in order
to have a finer-grained comparison with the previous state. This is not
currently surfaced in the output of cargo and still requires
`RUST_LOG=cargo::ops::cargo_rustc::fingerprint=info` but if it turns out to be
useful we can perhaps surface the output.
There are performance considerations here to ensure that a noop build is still
quite speedy for a few reasons:
1. JSON decoding is slow (these are just big structures to decode)
2. Each fingerprint stores all recursive fingerprints, so we can't just "vanilla
decode" as it would decode O(n^2) items
3. Hashing is actually somewhat nontrivial for this many items here and there,
so we still need as much memoization as possible.
To ensure that builds are just as speedy tomorrow as they are today, a few
strategies are taken:
* The same fingerprint strategy is used today as a "first line of defense" where
a small text file with a string contains the "total fingerprint" hash. A
separately stored file then contains the more detailed JSON structure of the
old fingerprint, and that's only decoded if there's a mismatch of the short
hashes. The rationale here is that most crates don't need to be rebuilt so we
shouldn't decode JSON, but if it does need to be rebuilt then the work of
compiling far dwarfs the work of decoding the JSON.
* When encoding a full fingerprint as JSON we don't actually include any
dependencies, just the resolved u64 of them. This helps the O(n^2) problem in
terms of decoding time and storage space on the filesystem.
* Short hashes continue to be memoized to ensure we don't recompute a hash if
we've already done so (e.g. shared dependencies).
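A sketch of that first line of defense (the file names and layout here are invented for illustration):
```rust
use std::fs;

// Compare the cheap short hash first; only decode the detailed JSON
// fingerprint when the short hashes disagree and a rebuild is likely.
fn needs_rebuild(dir: &str, current_short_hash: &str) -> bool {
    match fs::read_to_string(format!("{dir}/fingerprint")) {
        // Fast path: hashes match, so skip JSON decoding entirely.
        Ok(old) if old == current_short_hash => false,
        _ => {
            // Slow path: load the structured fingerprint so we can report
            // *why* the crate is dirty; this cost is dwarfed by compiling.
            let _detail = fs::read_to_string(format!("{dir}/fingerprint.json"));
            true
        }
    }
}

fn main() {
    println!("{}", needs_rebuild("target/debug/.fingerprint/foo", "abc123"));
}
```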
Overall, when profiling with Servo, this commit does not regress noop build
times, but should hopefully help diagnose why crates are being rebuilt!
Closes #2011
This commit started out identifying a relatively simple bug in Cargo. A recent
change made it such that the resolution graph included all target-specific
dependencies, relying on the structure of the backend to filter out those which
don't need to get built. This was unfortunately not accounted for in the portion
of the backend that schedules work, mistakenly causing spurious rebuilds if
different runs of the graph pulled in new crates. For example if `cargo build`
didn't build any target-specific dependencies but then later `cargo test` did
(e.g. a dev-dep pulled in a target-specific dep unconditionally) then it would
cause a rebuild of the entire graph.
This class of bug is certainly not the first in a long and storied history of
the backend having multiple points where dependencies are calculated and those
often don't quite agree with one another. The purpose of this rewrite is
twofold:
1. The `Stage` enum in the backend for scheduling work and ensuring that maximum
parallelism is achieved is removed entirely. There is already a function on
`Context` which expresses the dependency between targets (`dep_targets`)
which takes a much finer grain of dependencies into account as well as
already having all the logic for what-depends-on-what. This duplication has
caused numerous problems in the past, and unifying these two will truly grant
maximum parallelism while ensuring that everyone agrees on what their
dependencies are.
2. A large number of locations in the backend have grown to take a (Package,
Target, Profile, Kind) tuple, or some subset of this tuple. In general this
represents a "unit of work" and is much easier to pass around as one
variable, so a `Unit` was introduced which references all of these variables.
Almost the entire backend was altered to take a `Unit` instead of these
variables specifically, typically providing all of the contextual information
necessary for an operation.
A crucial part of this change is the inclusion of `Kind` in a `Unit` to ensure
that everyone *also* agrees on what architecture they're compiling everything
for. There have been many bugs in the past where one part of the backend
determined that a package was built for one architecture and then another part
thought it was built for another. With the inclusion of `Kind` in dependency
management this is handled in a much cleaner fashion as it's only calculated in
one location.
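For illustration, a sketch of the `Unit` shape described above, with stub types standing in for Cargo's real ones:
```rust
// Stub types standing in for Cargo's real definitions.
struct Package;
struct Target;
struct Profile;

/// Which architecture a unit is compiled for.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
enum Kind {
    Host,
    Target,
}

/// A "unit of work": everything the backend needs to compile one thing.
/// Including `Kind` means every pass agrees on the architecture instead
/// of recomputing it (and disagreeing) in multiple places.
struct Unit<'a> {
    pkg: &'a Package,
    target: &'a Target,
    profile: &'a Profile,
    kind: Kind,
}

fn main() {
    let (pkg, target, profile) = (Package, Target, Profile);
    let _unit = Unit { pkg: &pkg, target: &target, profile: &profile, kind: Kind::Host };
}
```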
Some other miscellaneous changes made were:
* The `Platform` enumeration has finally been removed. This has been entirely
subsumed by `Kind`.
* The hokey logic for "build this crate once" even though it may be depended on
by both the host/target kinds has been removed. This is now handled in a much
nicer fashion where if there's no target then Kind::Target is just never used,
and multiple requests for a package are just naturally deduplicated.
* There's no longer any need to build up the "requirements" for a package in
terms of what platforms it's compiled for, this now just naturally falls out
of the dependency graph.
* If a build script is overridden then its entire tree of dependencies is not
compiled, not just the build script itself.
* The `threadpool` dependency has been replaced with one on `crossbeam`. The
method of calculating dependencies has quite a few non-static lifetimes and
the scoped threads of `crossbeam` are now used instead of a thread pool.
* Once any thread fails to execute a command, no more work is scheduled, unlike
before, where some extra pending work might continue to start.
* Many functions used early on, such as `compile` and `build_map` have been
massively simplified by farming out dependency management to
`Context::dep_targets`.
* There is now a new profile to represent running a build script. This is used
to inject dependencies as well as represent that a library depends on running
a build script, not just building it.
This change has currently been tested against cross-compiling Servo to Android
and passes the test suite (which has quite a few corner cases for build scripts
and such), so I'm pretty confident that this refactoring won't have too
many regressions, at least!
The builds on the Linux bots are currently broken because the recent
modifications to this build script forgot to set up PKG_CONFIG_PATH for custom
installations of OpenSSL.