Auto merge of #8864 - bk2204:reproducible-crates, r=alexcrichton

Reproducible crate builds

This series introduces reproducible crate builds.  Since crates are essentially gzipped tar archives, we canonicalize the tar header fields so that they don't contain extraneous and potentially privacy-leaking data such as user and group names and IDs, device major and minor numbers, and system timestamps.  Apart from the timestamps, the user probably did not intend to share information about their user or system, so this also improves developer privacy somewhat.
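As a rough illustration of what canonicalizing these fields means, here is a minimal std-only sketch.  `EntryMeta` and `canonicalize` are hypothetical names invented for this example; they are not part of cargo or the `tar` crate, and the exact values chosen (zeroed IDs, empty names, `0o644`/`0o755` modes) are assumptions modelled on the fields listed above.

```rust
/// Hypothetical, simplified view of the per-entry metadata in a tar header.
/// This is an illustration only, not a type from cargo or the `tar` crate.
#[derive(Debug, Clone, PartialEq)]
struct EntryMeta {
    mode: u32,
    uid: u64,
    gid: u64,
    username: String,
    groupname: String,
    device_major: u32,
    device_minor: u32,
    mtime: u64,
}

/// Returns a copy of `meta` with every host-specific field reset to a fixed
/// value, so that two machines packaging the same file agree on the result.
fn canonicalize(meta: &EntryMeta, mtime: u64) -> EntryMeta {
    // Keep only "is it executable by the owner?"; drop all other mode bits.
    let mode = if meta.mode & 0o100 != 0 { 0o755 } else { 0o644 };
    EntryMeta {
        mode,
        uid: 0,
        gid: 0,
        username: String::new(),
        groupname: String::new(),
        device_major: 0,
        device_minor: 0,
        mtime,
    }
}

fn main() {
    // The same file as seen by two different (made-up) developers.
    let on_alice = EntryMeta {
        mode: 0o600,
        uid: 1000,
        gid: 1000,
        username: "alice".to_string(),
        groupname: "staff".to_string(),
        device_major: 8,
        device_minor: 1,
        mtime: 1_605_712_898,
    };
    let on_bob = EntryMeta {
        mode: 0o664,
        uid: 501,
        gid: 20,
        username: "bob".to_string(),
        groupname: "users".to_string(),
        device_major: 259,
        device_minor: 0,
        mtime: 1_605_799_999,
    };

    // After canonicalization the two headers are indistinguishable.
    assert_eq!(canonicalize(&on_alice, 0), canonicalize(&on_bob, 0));
    println!("{:?}", canonicalize(&on_alice, 0));
}
```

The point of the sketch is only that the output depends on the file's contents and executable bit, not on who packaged it or where.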

The individual commit messages describe each change and its rationale in detail.  Roughly, the idea is that setting the environment variable `SOURCE_DATE_EPOCH`, which is [the preferred way to specify a fixed timestamp by the Reproducible Builds project](https://reproducible-builds.org/docs/source-date-epoch/), produces a fully reproducible archive.  In any event, we now produce consistent timestamps throughout the archive and avoid looking up the system time repeatedly.
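The timestamp policy described above could be sketched as follows.  This is a std-only illustration under stated assumptions, not the code in this series: `parse_epoch` and `archive_mtime` are hypothetical helper names, and falling back to the current system time when the variable is unset or invalid is an assumption made for the example.

```rust
use std::time::{SystemTime, UNIX_EPOCH};

/// Parses a `SOURCE_DATE_EPOCH`-style value (seconds since the Unix epoch).
/// Returns `None` if the value is missing or not a non-negative integer.
fn parse_epoch(raw: Option<&str>) -> Option<u64> {
    raw.and_then(|s| s.trim().parse::<u64>().ok())
}

/// Chooses a single timestamp for the whole archive: the fixed value if one
/// was supplied, otherwise the current system time (assumed fallback).
fn archive_mtime(raw: Option<&str>) -> u64 {
    parse_epoch(raw).unwrap_or_else(|| {
        SystemTime::now()
            .duration_since(UNIX_EPOCH)
            .map(|d| d.as_secs())
            .unwrap_or(0)
    })
}

fn main() {
    // Look the value up once, before any entry is written.
    let raw = std::env::var("SOURCE_DATE_EPOCH").ok();
    let mtime = archive_mtime(raw.as_deref());

    // Every entry reuses the same value, so the archive is internally
    // consistent even when no fixed timestamp was supplied.
    for name in &["Cargo.toml", "src/main.rs"] {
        println!("{}: mtime={}", name, mtime);
    }
}
```

Running the program twice with the same `SOURCE_DATE_EPOCH` value set prints identical timestamps for every entry; without the variable, the timestamps still agree with each other within a single run.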

If desired, I could hash the produced crate in the tests, but I feel that would be a little overkill, especially since it's possible that one of our dependencies (e.g., flate2) might change and result in us producing an equivalent but different archive.  Since reproducible builds use a consistent toolchain, that's not a problem here.

Fixes #8612
bors 2020-11-18 15:21:38 +00:00
commit 668a6c6292
2 changed files with 38 additions and 9 deletions


@@ -5,12 +5,11 @@ use std::io::SeekFrom;
use std::path::{Path, PathBuf};
use std::rc::Rc;
use std::sync::Arc;
-use std::time::SystemTime;
use flate2::read::GzDecoder;
use flate2::{Compression, GzBuilder};
use log::debug;
-use tar::{Archive, Builder, EntryType, Header};
+use tar::{Archive, Builder, EntryType, Header, HeaderMode};
use crate::core::compiler::{BuildConfig, CompileMode, DefaultExecutor, Executor};
use crate::core::{Feature, Shell, Verbosity, Workspace};
@@ -510,7 +509,7 @@ fn tar(
let metadata = file.metadata().chain_err(|| {
format!("could not learn metadata for: `{}`", disk_path.display())
})?;
-header.set_metadata(&metadata);
+header.set_metadata_in_mode(&metadata, HeaderMode::Deterministic);
header.set_cksum();
ar.append_data(&mut header, &ar_path, &mut file)
.chain_err(|| {
@@ -525,12 +524,6 @@ fn tar(
};
header.set_entry_type(EntryType::file());
header.set_mode(0o644);
-header.set_mtime(
-    SystemTime::now()
-        .duration_since(SystemTime::UNIX_EPOCH)
-        .unwrap()
-        .as_secs(),
-);
header.set_size(contents.len() as u64);
header.set_cksum();
ar.append_data(&mut header, &ar_path, contents.as_bytes())


@@ -6,8 +6,10 @@ use cargo_test_support::registry::{self, Package};
use cargo_test_support::{
basic_manifest, cargo_process, git, path2url, paths, project, symlink_supported, t,
};
+use flate2::read::GzDecoder;
use std::fs::{self, read_to_string, File};
use std::path::Path;
+use tar::Archive;
#[cargo_test]
fn simple() {
@@ -1917,3 +1919,37 @@ src/main.rs
))
.run();
}
+#[cargo_test]
+fn reproducible_output() {
+    let p = project()
+        .file(
+            "Cargo.toml",
+            r#"
+            [project]
+            name = "foo"
+            version = "0.0.1"
+            authors = []
+            exclude = ["*.txt"]
+            license = "MIT"
+            description = "foo"
+        "#,
+        )
+        .file("src/main.rs", r#"fn main() { println!("hello"); }"#)
+        .build();
+    p.cargo("package").run();
+    assert!(p.root().join("target/package/foo-0.0.1.crate").is_file());
+    let f = File::open(&p.root().join("target/package/foo-0.0.1.crate")).unwrap();
+    let decoder = GzDecoder::new(f);
+    let mut archive = Archive::new(decoder);
+    for ent in archive.entries().unwrap() {
+        let ent = ent.unwrap();
+        let header = ent.header();
+        assert_eq!(header.mode().unwrap(), 0o644);
+        assert_eq!(header.mtime().unwrap(), 0);
+        assert_eq!(header.username().unwrap().unwrap(), "");
+        assert_eq!(header.groupname().unwrap().unwrap(), "");
+    }
+}