// cargo/core/compiler/fingerprint/mod.rs
//! Tracks changes to determine if something needs to be recompiled.
//!
//! This module implements change-tracking so that Cargo can know whether or
//! not something needs to be recompiled. A Cargo [`Unit`] can be either "dirty"
//! (needs to be recompiled) or "fresh" (it does not need to be recompiled).
//!
//! ## Mechanisms affecting freshness
//!
//! There are several mechanisms that influence a Unit's freshness:
//!
//! - The [`Fingerprint`] is a hash, saved to the filesystem in the
//!   `.fingerprint` directory, that tracks information about the Unit. If the
//!   fingerprint is missing (such as the first time the unit is being
//!   compiled), then the unit is dirty. If any of the fingerprint fields
//!   change (like the name of the source file), then the Unit is considered
//!   dirty.
//!
//!   The `Fingerprint` also tracks the fingerprints of all its dependencies,
//!   so a change in a dependency will propagate the "dirty" status up.
//!
//! - Filesystem mtime tracking is also used to check if a unit is dirty.
//!   See the section below on "Mtime comparison" for more details. There
//!   are essentially two parts to mtime tracking:
//!
//!   1. The mtime of a Unit's output files is compared to the mtimes of the
//!      output files of all its dependencies (see
//!      [`check_filesystem`]). If any output is missing, or is
//!      older than a dependency's output, then the unit is dirty.
//!   2. The mtime of a Unit's source files is compared to the mtime of its
//!      dep-info file in the fingerprint directory (see [`find_stale_file`]).
//!      The dep-info file is used as an anchor to know when the last build of
//!      the unit was done. See the "dep-info files" section below for more
//!      details. If any input files are missing, or are newer than the
//!      dep-info, then the unit is dirty.
//!
//! - Alternatively, if you're using the unstable feature `checksum-freshness`,
//!   mtimes are ignored entirely in favor of comparing first the file size, and
//!   then the checksum, with a known prior value emitted by rustc. Only nightly
//!   rustc will emit the needed metadata at the time of writing. This is dependent
//!   on the unstable feature `-Z checksum-hash-algorithm`.
//!
//! Note: Fingerprinting is not a perfect solution. Filesystem mtime tracking
//! is notoriously imprecise and problematic. Only a small part of the
//! environment is captured. This is a balance of performance, simplicity, and
//! completeness. Sandboxing, hashing file contents, tracking every file
//! access, environment variable, and network operation would ensure more
//! reliable and reproducible builds at the cost of being complex, slow, and
//! platform-dependent.
//!
//! ## Fingerprints and [`UnitHash`]s
//!
//! [`Metadata`] tracks several [`UnitHash`]s, including
//! [`Metadata::unit_id`], [`Metadata::c_metadata`], and [`Metadata::c_extra_filename`].
//! See its documentation for more details.
//!
//! NOTE: Not all output files are isolated via filename hashes (like dylibs).
//! The fingerprint directory uses a hash, but sometimes units share the same
//! fingerprint directory (when they don't have Metadata) so care should be
//! taken to handle this!
//!
//! Fingerprints and [`UnitHash`]s are similar, and track some of the same things.
//! [`UnitHash`]s contain information that is required to keep Units separate.
//! The Fingerprint includes additional information that should cause a
//! recompile, but for which it is desirable to reuse the same filenames. A
//! comparison of what is tracked:
//!
//! Value                                      | Fingerprint | `Metadata::unit_id` | `Metadata::c_metadata` | `Metadata::c_extra_filename`
//! -------------------------------------------|-------------|---------------------|------------------------|----------
//! rustc                                      | ✓           | ✓                   | ✓                      | ✓
//! [`Profile`]                                | ✓           | ✓                   | ✓                      | ✓
//! `cargo rustc` extra args                   | ✓           | ✓[^7]               |                        | ✓[^7]
//! [`CompileMode`]                            | ✓           | ✓                   | ✓                      | ✓
//! Target Name                                | ✓           | ✓                   | ✓                      | ✓
//! `TargetKind` (bin/lib/etc.)                | ✓           | ✓                   | ✓                      | ✓
//! Enabled Features                           | ✓           | ✓                   | ✓                      | ✓
//! Declared Features                          | ✓           |                     |                        |
//! Immediate dependency’s hashes              | ✓[^1]       | ✓                   | ✓                      | ✓
//! [`CompileKind`] (host/target)              | ✓           | ✓                   | ✓                      | ✓
//! `__CARGO_DEFAULT_LIB_METADATA`[^4]         |             | ✓                   | ✓                      | ✓
//! `package_id`                               |             | ✓                   | ✓                      | ✓
//! Target src path relative to ws             | ✓           |                     |                        |
//! Target flags (test/bench/for_host/edition) | ✓           |                     |                        |
//! -C incremental=… flag                      | ✓           |                     |                        |
//! mtime of sources                           | ✓[^3]       |                     |                        |
//! RUSTFLAGS/RUSTDOCFLAGS                     | ✓           | ✓[^7]               |                        | ✓[^7]
//! [`Lto`] flags                              | ✓           | ✓                   | ✓                      | ✓
//! config settings[^5]                        | ✓           |                     |                        |
//! `is_std`                                   |             | ✓                   | ✓                      | ✓
//! `[lints]` table[^6]                        | ✓           |                     |                        |
//! `[lints.rust.unexpected_cfgs.check-cfg]`   | ✓           |                     |                        |
//!
//! [^1]: Bin dependencies are not included.
//!
//! [^3]: See below for details on mtime tracking.
//!
//! [^4]: `__CARGO_DEFAULT_LIB_METADATA` is set by rustbuild to embed the
//!     release channel (bootstrap/stable/beta/nightly) in libstd.
//!
//! [^5]: Config settings that are not otherwise captured anywhere else.
//!     Currently, this is only `doc.extern-map`.
//!
//! [^6]: Via [`Manifest::lint_rustflags`][crate::core::Manifest::lint_rustflags]
//!
//! [^7]: extra-flags and RUSTFLAGS are conditionally excluded when `--remap-path-prefix` is
//!     present, to avoid breaking build reproducibility while we wait for trim-paths.
//!
//! When deciding what should go in the Metadata vs the Fingerprint, consider
//! that some files (like dylibs) do not have a hash in their filename. Thus,
//! if a value changes, only the fingerprint will detect the change (consider,
//! for example, swapping between different features). Fields that are only in
//! Metadata generally aren't relevant to the fingerprint because they
//! fundamentally change the output (like target vs host changes the directory
//! where it is emitted).
//!
//! ## Fingerprint files
//!
//! Fingerprint information is stored in the
//! `target/{debug,release}/.fingerprint/` directory. Each Unit is stored in a
//! separate directory. Each Unit directory contains:
//!
//! - A file with a 16 hex-digit hash. This is the Fingerprint hash, used for
//!   quick loading and comparison.
//! - A `.json` file that contains details about the Fingerprint. This is only
//!   used to log details about *why* a fingerprint is considered dirty.
//!   `CARGO_LOG=cargo::core::compiler::fingerprint=trace cargo build` can be
//!   used to display this log information.
//! - A "dep-info" file which is a translation of rustc's `*.d` dep-info files
//!   to a Cargo-specific format that tweaks file names and is optimized for
//!   reading quickly.
//! - An `invoked.timestamp` file whose filesystem mtime is updated every time
//!   the Unit is built. This is used for capturing the time when the build
//!   starts, to detect if files are changed in the middle of the build. See
//!   below for more details.
//!
//! Note that some units are a little different. A Unit for *running* a build
//! script or for `rustdoc` does not have a dep-info file (it's not
//! applicable). Build script `invoked.timestamp` files are in the build
//! output directory.
//!
//! ## Fingerprint calculation
//!
//! After the list of Units has been calculated, the Units are added to the
//! [`JobQueue`]. As each one is added, the fingerprint is calculated, and the
//! dirty/fresh status is recorded. A closure is used to update the fingerprint
//! on-disk when the Unit successfully finishes. The closure will recompute the
//! Fingerprint based on the updated information. If the Unit fails to compile,
//! the fingerprint is not updated.
//!
//! Fingerprints are cached in the [`BuildRunner`]. This makes computing
//! Fingerprints faster, but also is necessary for properly updating
//! dependency information. Since a Fingerprint includes the Fingerprints of
//! all dependencies, when it is updated, by using `Arc` clones, it
//! automatically picks up the updates to its dependencies.
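//!
//! The `Arc` sharing described above can be illustrated with a minimal
//! sketch. `Node`, `node`, and this `hash_u64` are hypothetical stand-ins
//! rather than the types in this module; the sketch uses `DefaultHasher`
//! and does not memoize the hash.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//! use std::sync::{Arc, Mutex};
//!
//! /// Hypothetical stand-in for a fingerprint: one local value plus shared
//! /// handles to the fingerprints of its dependencies.
//! struct Node {
//!     local: Mutex<u64>,
//!     deps: Vec<Arc<Node>>,
//! }
//!
//! impl Node {
//!     /// Hashes the local value together with every dependency's hash.
//!     fn hash_u64(&self) -> u64 {
//!         let mut hasher = DefaultHasher::new();
//!         self.local.lock().unwrap().hash(&mut hasher);
//!         for dep in &self.deps {
//!             dep.hash_u64().hash(&mut hasher);
//!         }
//!         hasher.finish()
//!     }
//! }
//!
//! /// Builds a shared node from a local value and its dependencies.
//! fn node(local: u64, deps: Vec<Arc<Node>>) -> Arc<Node> {
//!     Arc::new(Node { local: Mutex::new(local), deps })
//! }
//! ```
//!
//! Because a parent holds an `Arc` clone of each dependency, changing a
//! dependency's `local` value in place changes the parent's next
//! `hash_u64()` result without rebuilding the parent.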
//!
//! ### dep-info files
//!
//! Cargo has several kinds of "dep info" files:
//!
//! * dep-info files generated by `rustc`.
//! * Fingerprint dep-info files translated from the first one.
//! * dep-info for external build system integration.
//! * Unstable `-Zbinary-dep-depinfo`.
//!
//! #### `rustc` dep-info files
//!
//! Cargo passes the `--emit=dep-info` flag to `rustc` so that `rustc` will
//! generate a "dep info" file (with the `.d` extension). This is a
//! Makefile-like syntax that includes all of the source files used to build
//! the crate. This file is used by Cargo to know which files to check to see
//! if the crate will need to be rebuilt. Example:
//!
//! ```makefile
//! /path/to/target/debug/deps/cargo-b6219d178925203d: src/bin/main.rs src/bin/cargo/cli.rs # … etc.
//! ```
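//!
//! As a rough illustration of the Makefile-like syntax, one rule can be
//! split into its target and dependencies as sketched below. `parse_rule`
//! is a hypothetical helper, not the parser in this module, and it assumes
//! paths contain no backslash-escaped spaces, which a real parser has to
//! handle.
//!
//! ```rust
//! /// Splits one `target: dep dep ...` rule into its target and
//! /// dependencies. Returns `None` for blank lines and `#` comments.
//! fn parse_rule(line: &str) -> Option<(&str, Vec<&str>)> {
//!     let line = line.trim();
//!     if line.is_empty() || line.starts_with('#') {
//!         return None;
//!     }
//!     match line.split_once(": ") {
//!         Some((target, deps)) => Some((target, deps.split_whitespace().collect())),
//!         // A rule with a target but no dependencies, e.g. `src/main.rs:`.
//!         None => line.strip_suffix(':').map(|target| (target, Vec::new())),
//!     }
//! }
//! ```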
//!
//! #### Fingerprint dep-info files
//!
//! After `rustc` exits successfully, Cargo will read the first kind of dep
//! info file and translate it into a binary format that is stored in the
//! fingerprint directory ([`translate_dep_info`]).
//!
//! These are used to quickly scan for any changed files. The mtime of the
//! fingerprint dep-info file itself is used as the reference for comparing the
//! source files to determine if any of the source files have been modified
//! (see [below](#mtime-comparison) for more detail).
//!
//! Note that Cargo parses the special `# env-var:...` comments in dep-info
//! files to learn about environment variables that the rustc compile depends on.
//! Cargo then later uses this to trigger a recompile if a referenced env var
//! changes (even if the source didn't change).
//! This also includes env vars generated from Cargo metadata like `CARGO_PKG_DESCRIPTION`.
//! (See [`crate::core::manifest::ManifestMetadata`].)
//!
//! #### dep-info files for build system integration
//!
//! There is also a third dep-info file. Cargo will extend the file created by
//! rustc with some additional information and save it into the output
//! directory. This is intended for build system integration. See the
//! [`output_depinfo`] function for more detail.
//!
//! #### -Zbinary-dep-depinfo
//!
//! `rustc` has an experimental flag `-Zbinary-dep-depinfo`. This causes
//! `rustc` to include binary files (like rlibs) in the dep-info file. This is
//! primarily to support rustc development, so that Cargo can check the
//! implicit dependency to the standard library (which lives in the sysroot).
//! We want Cargo to recompile whenever the standard library rlib/dylibs
//! change, and this is a generic mechanism to make that work.
//!
//! ### Mtime comparison
//!
//! The use of modification timestamps is the most common way a unit will be
//! determined to be dirty or fresh between builds. There are many subtle
//! issues and edge cases with mtime comparisons. This gives a high-level
//! overview, but you'll need to read the code for the gritty details. Mtime
//! handling is different for different unit kinds. The different styles are
//! driven by the [`Fingerprint::local`] field, which is set based on the unit
//! kind.
//!
//! The status of whether or not the mtime is "stale" or "up-to-date" is
//! stored in [`Fingerprint::fs_status`].
//!
//! Every unit compares the mtime of its newest output file with the mtimes
//! of the outputs of all its dependencies. If any output file is missing,
//! then the unit is stale. If any dependency is newer, the unit is stale.
//!
//! #### Normal package mtime handling
//!
//! [`LocalFingerprint::CheckDepInfo`] is used for checking the mtime of
//! packages. It compares the mtime of the input files (the source files) to
//! the mtime of the dep-info file (which is written last after a build is
//! finished). If the dep-info is missing, the unit is stale (it has never
//! been built). The list of input files comes from the dep-info file. See the
//! section above for details on dep-info files.
//!
//! Also note that although registry and git packages use [`CheckDepInfo`], none
//! of their source files are included in the dep-info (see
//! [`translate_dep_info`]), so for those kinds no mtime checking is done
//! (unless `-Zbinary-dep-depinfo` is used). Registry and git packages are
//! static, so there is no need to check anything.
//!
//! When a build is complete, the mtime of the dep-info file in the
//! fingerprint directory is modified to rewind it to the time when the build
//! started. This is done by creating an `invoked.timestamp` file when the
//! build starts to capture the start time. The mtime is rewound to the start
//! to handle the case where the user modifies a source file while a build is
//! running. Cargo can't know whether or not the file was included in the
//! build, so it takes a conservative approach of assuming the file was *not*
//! included, and it should be rebuilt during the next build.
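//!
//! The comparison against the rewound reference time can be sketched as
//! follows. `find_stale` here is a hypothetical, simplified helper working
//! on in-memory timestamps, not the function in this module. It assumes
//! that an input whose mtime equals the reference counts as fresh and that
//! a missing input (`None`) counts as stale.
//!
//! ```rust
//! use std::time::SystemTime;
//!
//! /// Returns the first input that should trigger a rebuild: one that is
//! /// missing (`None`) or strictly newer than the reference mtime.
//! fn find_stale<'a>(
//!     reference: SystemTime,
//!     inputs: &'a [(&'a str, Option<SystemTime>)],
//! ) -> Option<&'a str> {
//!     inputs.iter().find_map(|(path, mtime)| match mtime {
//!         // Modified at or before the reference: still fresh.
//!         Some(t) if *t <= reference => None,
//!         // Missing, or modified after the reference: stale.
//!         _ => Some(*path),
//!     })
//! }
//! ```
//!
//! With the reference set to the build start time, a source file edited
//! while the build was running has a newer mtime than the reference, so
//! the next build reports it as stale.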
//!
//! #### Rustdoc mtime handling
//!
//! Rustdoc does not emit a dep-info file, so Cargo currently has a relatively
//! simple system for detecting rebuilds. [`LocalFingerprint::Precalculated`] is
//! used for rustdoc units. For registry packages, this is the package
//! version. For git packages, it is the git hash. For path packages, it is
//! a string of the mtime of the newest file in the package.
//!
//! There are some known bugs with how this works, so it should be improved at
//! some point.
//!
//! #### Build script mtime handling
//!
//! Build script mtime handling runs in different modes. There is the "old
//! style" where the build script does not emit any `rerun-if` directives. In
//! this mode, Cargo will use [`LocalFingerprint::Precalculated`]. See the
//! "rustdoc" section above for how it works.
//!
//! In the new style, each `rerun-if` directive is translated to the
//! corresponding [`LocalFingerprint`] variant. The [`RerunIfChanged`] variant
//! compares the mtime of the given filenames against the mtime of the
//! "output" file.
//!
//! Similar to normal units, the build script "output" file mtime is rewound
//! to the time just before the build script is executed to handle mid-build
//! modifications.
//!
//! ## Considerations for inclusion in a fingerprint
//!
//! Over time we've realized a few items which historically were included in
//! fingerprint hashings should not actually be included. Examples are:
//!
//! * Modification time values. We strive to never include a modification time
//!   inside a `Fingerprint` to get hashed into an actual value. While
//!   theoretically fine to do, in practice this causes issues with common
//!   applications like Docker. Docker, after a layer is built, will zero out
//!   the nanosecond part of all filesystem modification times. This means that
//!   the actual modification time is different for all build artifacts, which
//!   if we tracked the actual values of modification times would cause
//!   unnecessary recompiles. To fix this we instead only track paths which are
//!   relevant. These paths are checked dynamically to see if they're up to
//!   date, and the modification time doesn't make its way into the fingerprint
//!   hash.
//!
//! * Absolute path names. We strive to maintain a property where if you rename
//!   a project directory Cargo will continue to preserve all build artifacts
//!   and reuse the cache. This means that we can't ever hash an absolute path
//!   name. Instead we always hash relative path names and the "root" is passed
//!   in at runtime dynamically. Some of this is best effort, but the general
//!   idea is that we assume all accesses within a crate stay within that
//!   crate.
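//!
//! The relative-path property can be sketched as below. `hash_relative` is
//! a hypothetical helper, and `DefaultHasher` stands in for whatever hasher
//! is actually used; the point is only that the root is stripped before
//! hashing, so renaming the project directory leaves the hash unchanged.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//! use std::path::Path;
//!
//! /// Hashes `path` relative to `root`, falling back to the full path when
//! /// `path` does not live under `root`.
//! fn hash_relative(root: &Path, path: &Path) -> u64 {
//!     let relative = path.strip_prefix(root).unwrap_or(path);
//!     let mut hasher = DefaultHasher::new();
//!     relative.hash(&mut hasher);
//!     hasher.finish()
//! }
//! ```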
//!
//! These are pretty tricky to test for unfortunately, but we should have a good
//! test suite nowadays and lord knows Cargo gets enough testing in the wild!
//!
//! ## Build scripts
//!
//! The *running* of a build script ([`CompileMode::RunCustomBuild`]) is treated
//! significantly differently from all other Unit kinds. It has its own function
//! for calculating the Fingerprint ([`calculate_run_custom_build`]) and has some
//! unique considerations. It does not track the same information as a normal
//! Unit. The information tracked depends on the `rerun-if-changed` and
//! `rerun-if-env-changed` statements produced by the build script. If the
//! script does not emit either of these statements, the Fingerprint runs in
//! "old style" mode where an mtime change of *any* file in the package will
//! cause the build script to be re-run. Otherwise, the fingerprint *only*
//! tracks the individual "rerun-if" items listed by the build script.
//!
//! The "rerun-if" statements from a *previous* build are stored in the build
//! output directory in a file called `output`. Cargo parses this file when
//! the Unit for that build script is prepared for the [`JobQueue`]. The
//! Fingerprint code can then use that information to compute the Fingerprint
//! and compare against the old fingerprint hash.
//!
//! Care must be taken with build script Fingerprints because the
//! [`Fingerprint::local`] value may be changed after the build script runs
//! (such as if the build script adds or removes "rerun-if" items).
//!
//! Another complication is if a build script is overridden. In that case, the
//! fingerprint is the hash of the output of the override.
//!
//! ## Special considerations
//!
//! Registry dependencies do not track the mtime of files. This is because
//! registry dependencies are not expected to change (if a new version is
//! used, the Package ID will change, causing a rebuild).
//!
//! Cargo currently partially works with Docker caching. When a Docker image is
//! built, it has normal mtime information. However, when a step is cached, the
//! nanosecond portion of every file's mtime is zeroed out. Currently this
//! works, but care must be taken for situations like these.
//!
//! HFS on macOS only supports 1 second timestamps. This causes a significant
//! number of problems, particularly with Cargo's testsuite which does rapid
//! builds in succession. Other filesystems have various degrees of
//! resolution.
//!
//! Various weird filesystems (such as network filesystems) also can cause
//! complications. Network filesystems may track the time on the server
//! (except when the time is set manually such as with
//! `filetime::set_file_times`). Not all filesystems support modifying the
//! mtime.
//!
//! See the [`A-rebuild-detection`] label on the issue tracker for more.
//!
//! [`check_filesystem`]: Fingerprint::check_filesystem
//! [`Metadata`]: crate::core::compiler::Metadata
//! [`Metadata::unit_id`]: crate::core::compiler::Metadata::unit_id
//! [`Metadata::c_metadata`]: crate::core::compiler::Metadata::c_metadata
//! [`Metadata::c_extra_filename`]: crate::core::compiler::Metadata::c_extra_filename
//! [`UnitHash`]: crate::core::compiler::UnitHash
//! [`Profile`]: crate::core::profiles::Profile
//! [`CompileMode`]: crate::core::compiler::CompileMode
//! [`Lto`]: crate::core::compiler::Lto
//! [`CompileKind`]: crate::core::compiler::CompileKind
//! [`JobQueue`]: super::job_queue::JobQueue
//! [`output_depinfo`]: super::output_depinfo()
//! [`CheckDepInfo`]: LocalFingerprint::CheckDepInfo
//! [`RerunIfChanged`]: LocalFingerprint::RerunIfChanged
//! [`CompileMode::RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
//! [`A-rebuild-detection`]: https://github.com/rust-lang/cargo/issues?q=is%3Aissue+is%3Aopen+label%3AA-rebuild-detection

mod dep_info;
mod dirty_reason;
mod rustdoc;

use std::collections::hash_map::{Entry, HashMap};
use std::env;
use std::ffi::OsString;
use std::fs;
use std::fs::File;
use std::hash::{self, Hash, Hasher};
use std::io::{self};
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::time::SystemTime;

use anyhow::Context as _;
use anyhow::format_err;
use cargo_util::paths;
use filetime::FileTime;
use serde::de;
use serde::ser;
use serde::{Deserialize, Serialize};
use tracing::{debug, info};

use crate::core::Package;
use crate::core::compiler::unit_graph::UnitDep;
use crate::util;
use crate::util::errors::CargoResult;
use crate::util::interning::InternedString;
use crate::util::log_message::LogMessage;
use crate::util::{StableHasher, internal, path_args};
use crate::{CARGO_ENV, GlobalContext};

use super::custom_build::BuildDeps;
use super::{BuildContext, BuildRunner, FileFlavor, Job, Unit, Work};

pub use self::dep_info::Checksum;
pub use self::dep_info::parse_dep_info;
pub use self::dep_info::parse_rustc_dep_info;
pub use self::dep_info::translate_dep_info;
pub use self::dirty_reason::DirtyReason;
pub use self::rustdoc::RustdocFingerprint;

/// Determines if a [`Unit`] is up-to-date, and if not prepares necessary work to
/// update the persisted fingerprint.
///
/// This function will inspect `Unit`, calculate a fingerprint for it, and then
/// return an appropriate [`Job`] to run. The returned `Job` will be a noop if
/// `unit` is considered "fresh", or if it was previously built and cached.
/// Otherwise the `Job` returned will write out the true fingerprint to the
/// filesystem, to be executed after the unit's work has completed.
///
/// The `force` flag is a way to force the `Job` to be "dirty", or always
/// update the fingerprint. **Beware using this flag** because it does not
/// transitively propagate throughout the dependency graph; it only forces this
/// one unit, which is very unlikely to be what you want unless you're
/// exclusively talking about top-level units.
#[tracing::instrument(
    skip(build_runner, unit),
    fields(package_id = %unit.pkg.package_id(), target = unit.target.name())
)]
pub fn prepare_target(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
    force: bool,
) -> CargoResult<Job> {
    let bcx = build_runner.bcx;
    let loc = build_runner.files().fingerprint_file_path(unit, "");

    debug!("fingerprint at: {}", loc.display());

    // Figure out if this unit is up to date. After calculating the fingerprint
    // compare it to an old version, if any, and attempt to print diagnostic
    // information about failed comparisons to aid in debugging.
    let fingerprint = calculate(build_runner, unit)?;
    let mtime_on_use = build_runner.bcx.gctx.cli_unstable().mtime_on_use;
    let dirty_reason = compare_old_fingerprint(unit, &loc, &*fingerprint, mtime_on_use, force);

    let Some(dirty_reason) = dirty_reason else {
        return Ok(Job::new_fresh());
    };

    if let Some(logger) = bcx.logger {
        // Don't log FreshBuild as it is noisy.
        if !dirty_reason.is_fresh_build() {
            logger.log(LogMessage::Rebuild {
                package_id: unit.pkg.package_id().to_spec(),
                target: (&unit.target).into(),
                mode: unit.mode,
                cause: dirty_reason.clone(),
            });
        }
    }

    // We're going to rebuild, so ensure the source of the crate passes all
    // verification checks before we build it.
    //
    // The `Source::verify` method is intended to allow sources to execute
    // pre-build checks to ensure that the relevant source code is all
    // up-to-date and as expected. This is currently used primarily for
    // directory sources which will use this hook to perform an integrity check
    // on all files in the source to ensure they haven't changed. If they have
    // changed then an error is issued.
    let source_id = unit.pkg.package_id().source_id();
    let sources = bcx.packages.sources();
    let source = sources
        .get(source_id)
        .ok_or_else(|| internal("missing package source"))?;
    source.verify(unit.pkg.package_id())?;

    // Clear out the old fingerprint file if it exists. This protects when
    // compilation is interrupted leaving a corrupt file. For example, a
    // project with a lib.rs and integration test (two units):
    //
    // 1. Build the library and integration test.
    // 2. Make a change to lib.rs (NOT the integration test).
    // 3. Build the integration test, hit Ctrl-C while linking. With gcc, this
    //    will leave behind an incomplete executable (zero size, or partially
    //    written). NOTE: The library builds successfully, it is the linking
    //    of the integration test that we are interrupting.
    // 4. Build the integration test again.
    //
    // Without the following line, step 3 will leave a valid fingerprint
    // on the disk. Then step 4 will think the integration test is "fresh"
    // because:
    //
    // - There is a valid fingerprint hash on disk (written in step 1).
    // - The mtime of the output file (the corrupt integration executable
    //   written in step 3) is newer than all of its dependencies.
    // - The mtime of the integration test fingerprint dep-info file (written
    //   in step 1) is newer than the integration test's source files, because
    //   we haven't modified any of its source files.
    //
    // But the executable is corrupt and needs to be rebuilt. Clearing the
    // fingerprint at step 3 ensures that Cargo never mistakes a partially
    // written output as up-to-date.
    if loc.exists() {
        // Truncate instead of delete so that compare_old_fingerprint will
        // still log the reason for the fingerprint failure instead of just
        // reporting "failed to read fingerprint" during the next build if
        // this build fails.
        paths::write(&loc, b"")?;
    }

    let write_fingerprint = if unit.mode.is_run_custom_build() {
        // For build scripts the `local` field of the fingerprint may change
        // while we're executing it. For example it could be in the legacy
        // "consider everything a dependency" mode and then we switch to "deps
        // are explicitly specified" mode.
        //
        // To handle this movement we need to regenerate the `local` field of a
        // build script's fingerprint after it's executed. We do this by
        // using the `build_script_local_fingerprints` function which returns a
        // thunk we can invoke on a foreign thread to calculate this.
        let build_script_outputs = Arc::clone(&build_runner.build_script_outputs);
        let metadata = build_runner.get_run_build_script_metadata(unit);
        let (gen_local, _overridden) = build_script_local_fingerprints(build_runner, unit)?;
        let output_path = build_runner.build_explicit_deps[unit]
            .build_script_output
            .clone();
        Work::new(move |_| {
            let outputs = build_script_outputs.lock().unwrap();
            let output = outputs
                .get(metadata)
                .expect("output must exist after running");
            let deps = BuildDeps::new(&output_path, Some(output));

            // FIXME: it's basically buggy that we pass `None` to `call_box`
            // here. See documentation on `build_script_local_fingerprints`
            // below for more information. Despite this just try to proceed and
            // hobble along if it happens to return `Some`.
            if let Some(new_local) = (gen_local)(&deps, None)? {
                *fingerprint.local.lock().unwrap() = new_local;
            }

            write_fingerprint(&loc, &fingerprint)
        })
    } else {
        Work::new(move |_| write_fingerprint(&loc, &fingerprint))
    };

    Ok(Job::new_dirty(write_fingerprint, dirty_reason))
}

/// Dependency edge information for fingerprints. This is generated for each
/// dependency and is stored in a [`Fingerprint`].
#[derive(Clone)]
struct DepFingerprint {
    /// The hash of the package id that this dependency points to.
    pkg_id: u64,
    /// The crate name we're using for this dependency, which if we change we'll
    /// need to recompile!
    name: InternedString,
    /// Whether this dependency is flagged as a public dependency.
    public: bool,
    /// Whether or not this dependency is an rmeta dependency or a "full"
    /// dependency. In the case of an rmeta dependency our dependency edge only
    /// actually requires the rmeta from what we depend on, so when checking
    /// mtime information all files other than the rmeta can be ignored.
    only_requires_rmeta: bool,
    /// The dependency's fingerprint we recursively point to, containing all the
    /// other hash information we'd otherwise need.
    fingerprint: Arc<Fingerprint>,
}

577/// A fingerprint can be considered to be a "short string" representing the
578/// state of a world for a package.
579///
580/// If a fingerprint ever changes, then the package itself needs to be
581/// recompiled. Inputs to the fingerprint include source code modifications,
582/// compiler flags, compiler version, etc. This structure is not simply a
583/// `String` due to the fact that some fingerprints cannot be calculated lazily.
584///
585/// Path sources, for example, use the mtime of the corresponding dep-info file
586/// as a fingerprint (all source files must be modified *before* this mtime).
587/// This dep-info file is not generated, however, until after the crate is
588/// compiled. As a result, this structure can be thought of as a fingerprint
589/// to-be. The actual value can be calculated via [`hash_u64()`], but the operation
590/// may fail as some files may not have been generated.
591///
592/// Note that dependencies are taken into account for fingerprints because rustc
593/// requires that whenever an upstream crate is recompiled that all downstream
594/// dependents are also recompiled. This is typically tracked through
595/// [`DependencyQueue`], but it also needs to be retained here because Cargo can
596/// be interrupted while executing, losing the state of the [`DependencyQueue`]
597/// graph.
598///
599/// [`hash_u64()`]: crate::core::compiler::fingerprint::Fingerprint::hash_u64
600/// [`DependencyQueue`]: crate::util::DependencyQueue
601#[derive(Serialize, Deserialize)]
602pub struct Fingerprint {
603 /// Hash of the version of `rustc` used.
604 rustc: u64,
605 /// Sorted list of cfg features enabled.
606 features: String,
607 /// Sorted list of all the declared cfg features.
608 declared_features: String,
609 /// Hash of the `Target` struct, including the target name,
610 /// package-relative source path, edition, etc.
611 target: u64,
612 /// Hash of the [`Profile`], [`CompileMode`], and any extra flags passed via
613 /// `cargo rustc` or `cargo rustdoc`.
614 ///
615 /// [`Profile`]: crate::core::profiles::Profile
616 /// [`CompileMode`]: crate::core::compiler::CompileMode
617 profile: u64,
618 /// Hash of the path to the base source file. This is relative to the
619 /// workspace root for path members, or absolute for other sources.
620 path: u64,
621 /// Fingerprints of dependencies.
622 deps: Vec<DepFingerprint>,
623 /// Information about the inputs that affect this Unit (such as source
624 /// file mtimes or build script environment variables).
625 local: Mutex<Vec<LocalFingerprint>>,
626 /// Cached hash of the [`Fingerprint`] struct. Used to improve performance
627 /// for hashing.
628 #[serde(skip)]
629 memoized_hash: Mutex<Option<u64>>,
630 /// RUSTFLAGS/RUSTDOCFLAGS environment variable value (or config value).
631 rustflags: Vec<String>,
632 /// Hash of various config settings that change how things are compiled.
633 config: u64,
634 /// The rustc target. This is only relevant for `.json` files, otherwise
635 /// the metadata hash segregates the units.
636 compile_kind: u64,
637 /// Description of whether the filesystem status for this unit is up to date
638 /// or should be considered stale.
639 #[serde(skip)]
640 fs_status: FsStatus,
    /// Files, relative to `target_root`, that are produced by the step that
    /// this `Fingerprint` represents. This is used to detect when the whole
    /// fingerprint is out of date: either one of these files is missing, or
    /// previous fingerprints' output files were regenerated and look newer
    /// than this one.
645 #[serde(skip)]
646 outputs: Vec<PathBuf>,
647}
648
649/// Indication of the status on the filesystem for a particular unit.
650#[derive(Clone, Default, Debug, Serialize)]
651#[serde(tag = "fs_status", rename_all = "kebab-case")]
652pub enum FsStatus {
653 /// This unit is to be considered stale, even if hash information all
654 /// matches.
655 #[default]
656 Stale,
657
658 /// File system inputs have changed (or are missing), or there were
659 /// changes to the environment variables that affect this unit. See
660 /// the variants of [`StaleItem`] for more information.
661 StaleItem(StaleItem),
662
    /// A dependency's output is newer (by mtime) than this unit's newest output.
664 StaleDependency {
665 name: InternedString,
666 #[serde(serialize_with = "serialize_file_time")]
667 dep_mtime: FileTime,
668 #[serde(serialize_with = "serialize_file_time")]
669 max_mtime: FileTime,
670 },
671
    /// A dependency's own filesystem status was not up to date.
673 StaleDepFingerprint { name: InternedString },
674
675 /// This unit is up-to-date. All outputs and their corresponding mtime are
676 /// listed in the payload here for other dependencies to compare against.
677 #[serde(skip)]
678 UpToDate { mtimes: HashMap<PathBuf, FileTime> },
679}
680
681impl FsStatus {
682 fn up_to_date(&self) -> bool {
683 match self {
684 FsStatus::UpToDate { .. } => true,
685 FsStatus::Stale
686 | FsStatus::StaleItem(_)
687 | FsStatus::StaleDependency { .. }
688 | FsStatus::StaleDepFingerprint { .. } => false,
689 }
690 }
691}
692
/// Serializes a `FileTime` as a floating-point number of milliseconds, with
/// the nanosecond component folded in.
694fn serialize_file_time<S>(ft: &FileTime, s: S) -> Result<S::Ok, S::Error>
695where
696 S: serde::Serializer,
697{
698 let secs_as_millis = ft.unix_seconds() as f64 * 1000.0;
699 let nanos_as_millis = ft.nanoseconds() as f64 / 1_000_000.0;
700 (secs_as_millis + nanos_as_millis).serialize(s)
701}
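
/*
Illustrative sketch, not part of Cargo: the conversion in `serialize_file_time`
scales whole seconds up to milliseconds and the nanosecond remainder down to
milliseconds, then adds the two. The hypothetical helper `to_fractional_millis`
below takes plain integers in place of a `FileTime` so that it compiles without
the `filetime` crate.

```rust
/// Hypothetical stand-in for the arithmetic in `serialize_file_time`.
fn to_fractional_millis(unix_seconds: i64, nanoseconds: u32) -> f64 {
    // Whole seconds scaled up to milliseconds.
    let secs_as_millis = unix_seconds as f64 * 1000.0;
    // Nanosecond remainder scaled down to milliseconds.
    let nanos_as_millis = nanoseconds as f64 / 1_000_000.0;
    secs_as_millis + nanos_as_millis
}
```
*/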
702
703impl Serialize for DepFingerprint {
704 fn serialize<S>(&self, ser: S) -> Result<S::Ok, S::Error>
705 where
706 S: ser::Serializer,
707 {
708 (
709 &self.pkg_id,
710 &self.name,
711 &self.public,
712 &self.fingerprint.hash_u64(),
713 )
714 .serialize(ser)
715 }
716}
717
718impl<'de> Deserialize<'de> for DepFingerprint {
719 fn deserialize<D>(d: D) -> Result<DepFingerprint, D::Error>
720 where
721 D: de::Deserializer<'de>,
722 {
723 let (pkg_id, name, public, hash) = <(u64, String, bool, u64)>::deserialize(d)?;
724 Ok(DepFingerprint {
725 pkg_id,
726 name: name.into(),
727 public,
728 fingerprint: Arc::new(Fingerprint {
729 memoized_hash: Mutex::new(Some(hash)),
730 ..Fingerprint::new()
731 }),
732 // This field is never read since it's only used in
733 // `check_filesystem` which isn't used by fingerprints loaded from
734 // disk.
735 only_requires_rmeta: false,
736 })
737 }
738}
739
740/// A `LocalFingerprint` represents something that we use to detect direct
741/// changes to a `Fingerprint`.
742///
743/// This is where we track file information, env vars, etc. This
744/// `LocalFingerprint` struct is hashed and if the hash changes will force a
745/// recompile of any fingerprint it's included into. Note that the "local"
746/// terminology comes from the fact that it only has to do with one crate, and
747/// `Fingerprint` tracks the transitive propagation of fingerprint changes.
748///
/// Note that because this is hashed its contents are carefully managed. As
/// mentioned in the module docs above, we don't want to hash absolute paths or
/// mtime information.
752///
753/// Also note that a `LocalFingerprint` is used in `check_filesystem` to detect
754/// when the filesystem contains stale information (based on mtime currently).
755/// The paths here don't change much between compilations but they're used as
756/// inputs when we probe the filesystem looking at information.
757#[derive(Debug, Serialize, Deserialize, Hash)]
758enum LocalFingerprint {
759 /// This is a precalculated fingerprint which has an opaque string we just
760 /// hash as usual. This variant is primarily used for rustdoc where we
761 /// don't have a dep-info file to compare against.
762 ///
763 /// This is also used for build scripts with no `rerun-if-*` statements, but
764 /// that's overall a mistake and causes bugs in Cargo. We shouldn't use this
765 /// for build scripts.
766 Precalculated(String),
767
768 /// This is used for crate compilations. The `dep_info` file is a relative
769 /// path anchored at `target_root(...)` to the dep-info file that Cargo
770 /// generates (which is a custom serialization after parsing rustc's own
771 /// `dep-info` output).
772 ///
773 /// The `dep_info` file, when present, also lists a number of other files
774 /// for us to look at. If any of those files are newer than this file then
775 /// we need to recompile.
776 ///
777 /// If the `checksum` bool is true then the `dep_info` file is expected to
778 /// contain file checksums instead of file mtimes.
779 CheckDepInfo { dep_info: PathBuf, checksum: bool },
780
781 /// This represents a nonempty set of `rerun-if-changed` annotations printed
782 /// out by a build script. The `output` file is a relative file anchored at
783 /// `target_root(...)` which is the actual output of the build script. That
784 /// output has already been parsed and the paths printed out via
785 /// `rerun-if-changed` are listed in `paths`. The `paths` field is relative
    /// to `pkg.root()`.
787 ///
788 /// This is considered up-to-date if all of the `paths` are older than
789 /// `output`, otherwise we need to recompile.
790 RerunIfChanged {
791 output: PathBuf,
792 paths: Vec<PathBuf>,
793 },
794
795 /// This represents a single `rerun-if-env-changed` annotation printed by a
796 /// build script. The exact env var and value are hashed here. There's no
797 /// filesystem dependence here, and if the values are changed the hash will
798 /// change forcing a recompile.
799 RerunIfEnvChanged { var: String, val: Option<String> },
800}
801
802/// See [`FsStatus::StaleItem`].
803#[derive(Clone, Debug, Serialize)]
804#[serde(tag = "stale_item", rename_all = "kebab-case")]
805pub enum StaleItem {
806 MissingFile {
807 path: PathBuf,
808 },
809 UnableToReadFile {
810 path: PathBuf,
811 },
812 FailedToReadMetadata {
813 path: PathBuf,
814 },
815 FileSizeChanged {
816 path: PathBuf,
817 old_size: u64,
818 new_size: u64,
819 },
820 ChangedFile {
821 reference: PathBuf,
822 #[serde(serialize_with = "serialize_file_time")]
823 reference_mtime: FileTime,
824 stale: PathBuf,
825 #[serde(serialize_with = "serialize_file_time")]
826 stale_mtime: FileTime,
827 },
828 ChangedChecksum {
829 source: PathBuf,
830 stored_checksum: Checksum,
831 new_checksum: Checksum,
832 },
833 MissingChecksum {
834 path: PathBuf,
835 },
836 ChangedEnv {
837 var: String,
838 previous: Option<String>,
839 current: Option<String>,
840 },
841}
842
843impl LocalFingerprint {
    /// Reads the environment variable for the given `key` and creates a new
    /// [`LocalFingerprint::RerunIfEnvChanged`] for it. The `env_config` is
    /// checked first, since some env vars need to be overridden by the config
    /// system. If the key is not set there, this falls back to `std::env::var`.
848 ///
849 // TODO: `std::env::var` is allowed at this moment. Should figure out
850 // if it makes sense if permitting to read env from the env snapshot.
851 #[allow(clippy::disallowed_methods)]
852 fn from_env<K: AsRef<str>>(
853 key: K,
854 env_config: &Arc<HashMap<String, OsString>>,
855 ) -> LocalFingerprint {
856 let key = key.as_ref();
857 let var = key.to_owned();
858 let val = if let Some(val) = env_config.get(key) {
859 val.to_str().map(ToOwned::to_owned)
860 } else {
861 env::var(key).ok()
862 };
863 LocalFingerprint::RerunIfEnvChanged { var, val }
864 }
865
866 /// Checks dynamically at runtime if this `LocalFingerprint` has a stale
867 /// item inside of it.
868 ///
869 /// The main purpose of this function is to handle two different ways
870 /// fingerprints can be invalidated:
871 ///
    /// * One is when a dependency listed in rustc's dep-info files is invalid.
    ///   Note that these could either be env vars or files. We check both here.
    ///
    /// * Another is the `rerun-if-changed` directive from build scripts. This
    ///   is where we'll find whether files have actually changed.
877 fn find_stale_item(
878 &self,
879 mtime_cache: &mut HashMap<PathBuf, FileTime>,
880 checksum_cache: &mut HashMap<PathBuf, Checksum>,
881 pkg: &Package,
882 build_root: &Path,
883 cargo_exe: &Path,
884 gctx: &GlobalContext,
885 ) -> CargoResult<Option<StaleItem>> {
886 let pkg_root = pkg.root();
887 match self {
            // We need to parse `dep_info` to learn about the crate's dependencies.
889 //
890 // For each env var we see if our current process's env var still
891 // matches, and for each file we see if any of them are newer than
892 // the `dep_info` file itself whose mtime represents the start of
893 // rustc.
894 LocalFingerprint::CheckDepInfo { dep_info, checksum } => {
895 let dep_info = build_root.join(dep_info);
896 let Some(info) = parse_dep_info(pkg_root, build_root, &dep_info)? else {
897 return Ok(Some(StaleItem::MissingFile { path: dep_info }));
898 };
899 for (key, previous) in info.env.iter() {
900 if let Some(value) = pkg.manifest().metadata().env_var(key.as_str()) {
901 if Some(value.as_ref()) == previous.as_deref() {
902 continue;
903 }
904 }
905
906 let current = if key == CARGO_ENV {
907 Some(cargo_exe.to_str().ok_or_else(|| {
908 format_err!(
909 "cargo exe path {} must be valid UTF-8",
910 cargo_exe.display()
911 )
912 })?)
913 } else {
914 if let Some(value) = gctx.env_config()?.get(key) {
915 value.to_str()
916 } else {
917 gctx.get_env(key).ok()
918 }
919 };
920 if current == previous.as_deref() {
921 continue;
922 }
923 return Ok(Some(StaleItem::ChangedEnv {
924 var: key.clone(),
925 previous: previous.clone(),
926 current: current.map(Into::into),
927 }));
928 }
929 if *checksum {
930 Ok(find_stale_file(
931 mtime_cache,
932 checksum_cache,
933 &dep_info,
934 info.files.iter().map(|(file, checksum)| (file, *checksum)),
935 *checksum,
936 ))
937 } else {
938 Ok(find_stale_file(
939 mtime_cache,
940 checksum_cache,
941 &dep_info,
942 info.files.into_keys().map(|p| (p, None)),
943 *checksum,
944 ))
945 }
946 }
947
948 // We need to verify that no paths listed in `paths` are newer than
949 // the `output` path itself, or the last time the build script ran.
950 LocalFingerprint::RerunIfChanged { output, paths } => Ok(find_stale_file(
951 mtime_cache,
952 checksum_cache,
953 &build_root.join(output),
954 paths.iter().map(|p| (pkg_root.join(p), None)),
955 false,
956 )),
957
            // These have no dependencies on the filesystem, and their values
            // are included natively in the `Fingerprint` hash, so there is
            // nothing to check for here.
961 LocalFingerprint::RerunIfEnvChanged { .. } => Ok(None),
962 LocalFingerprint::Precalculated(..) => Ok(None),
963 }
964 }
965
966 fn kind(&self) -> &'static str {
967 match self {
968 LocalFingerprint::Precalculated(..) => "precalculated",
969 LocalFingerprint::CheckDepInfo { .. } => "dep-info",
970 LocalFingerprint::RerunIfChanged { .. } => "rerun-if-changed",
971 LocalFingerprint::RerunIfEnvChanged { .. } => "rerun-if-env-changed",
972 }
973 }
974}
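
/*
Illustrative sketch, not part of Cargo: the lookup order used by `from_env` is
"config value first, process environment second". The hypothetical helper
`resolve_env` below shows that order in isolation. The `fallback` closure is an
assumption made for this sketch; it stands in for `std::env::var(key).ok()` so
the example does not depend on the real process environment.

```rust
use std::collections::HashMap;
use std::ffi::OsString;

/// Hypothetical helper mirroring the lookup order in `from_env`.
fn resolve_env(
    key: &str,
    env_config: &HashMap<String, OsString>,
    fallback: impl Fn(&str) -> Option<String>,
) -> Option<String> {
    match env_config.get(key) {
        // A configured value wins. If it is not valid UTF-8 the result is
        // `None`; the fallback is not consulted in that case.
        Some(val) => val.to_str().map(ToOwned::to_owned),
        // Not configured: defer to the fallback lookup.
        None => fallback(key),
    }
}
```

Note that a key present in the config never reaches the fallback, even when
its value cannot be represented as a `String`.
*/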
975
976impl Fingerprint {
977 fn new() -> Fingerprint {
978 Fingerprint {
979 rustc: 0,
980 target: 0,
981 profile: 0,
982 path: 0,
983 features: String::new(),
984 declared_features: String::new(),
985 deps: Vec::new(),
986 local: Mutex::new(Vec::new()),
987 memoized_hash: Mutex::new(None),
988 rustflags: Vec::new(),
989 config: 0,
990 compile_kind: 0,
991 fs_status: FsStatus::Stale,
992 outputs: Vec::new(),
993 }
994 }
995
996 /// For performance reasons fingerprints will memoize their own hash, but
997 /// there's also internal mutability with its `local` field which can
998 /// change, for example with build scripts, during a build.
999 ///
1000 /// This method can be used to bust all memoized hashes just before a build
1001 /// to ensure that after a build completes everything is up-to-date.
1002 pub fn clear_memoized(&self) {
1003 *self.memoized_hash.lock().unwrap() = None;
1004 }
1005
1006 fn hash_u64(&self) -> u64 {
1007 if let Some(s) = *self.memoized_hash.lock().unwrap() {
1008 return s;
1009 }
1010 let ret = util::hash_u64(self);
1011 *self.memoized_hash.lock().unwrap() = Some(ret);
1012 ret
1013 }
1014
1015 /// Compares this fingerprint with an old version which was previously
1016 /// serialized to filesystem.
1017 ///
1018 /// The purpose of this is exclusively to produce a diagnostic message
1019 /// [`DirtyReason`], indicating why we're recompiling something.
1020 fn compare(&self, old: &Fingerprint) -> DirtyReason {
1021 if self.rustc != old.rustc {
1022 return DirtyReason::RustcChanged;
1023 }
1024 if self.features != old.features {
1025 return DirtyReason::FeaturesChanged {
1026 old: old.features.clone(),
1027 new: self.features.clone(),
1028 };
1029 }
1030 if self.declared_features != old.declared_features {
1031 return DirtyReason::DeclaredFeaturesChanged {
1032 old: old.declared_features.clone(),
1033 new: self.declared_features.clone(),
1034 };
1035 }
1036 if self.target != old.target {
1037 return DirtyReason::TargetConfigurationChanged;
1038 }
1039 if self.path != old.path {
1040 return DirtyReason::PathToSourceChanged;
1041 }
1042 if self.profile != old.profile {
1043 return DirtyReason::ProfileConfigurationChanged;
1044 }
1045 if self.rustflags != old.rustflags {
1046 return DirtyReason::RustflagsChanged {
1047 old: old.rustflags.clone(),
1048 new: self.rustflags.clone(),
1049 };
1050 }
1051 if self.config != old.config {
1052 return DirtyReason::ConfigSettingsChanged;
1053 }
1054 if self.compile_kind != old.compile_kind {
1055 return DirtyReason::CompileKindChanged;
1056 }
1057 let my_local = self.local.lock().unwrap();
1058 let old_local = old.local.lock().unwrap();
1059 if my_local.len() != old_local.len() {
1060 return DirtyReason::LocalLengthsChanged;
1061 }
1062 for (new, old) in my_local.iter().zip(old_local.iter()) {
1063 match (new, old) {
1064 (LocalFingerprint::Precalculated(a), LocalFingerprint::Precalculated(b)) => {
1065 if a != b {
1066 return DirtyReason::PrecalculatedComponentsChanged {
1067 old: b.to_string(),
1068 new: a.to_string(),
1069 };
1070 }
1071 }
1072 (
1073 LocalFingerprint::CheckDepInfo {
1074 dep_info: a_dep,
1075 checksum: checksum_a,
1076 },
1077 LocalFingerprint::CheckDepInfo {
1078 dep_info: b_dep,
1079 checksum: checksum_b,
1080 },
1081 ) => {
1082 if a_dep != b_dep {
1083 return DirtyReason::DepInfoOutputChanged {
1084 old: b_dep.clone(),
1085 new: a_dep.clone(),
1086 };
1087 }
1088 if checksum_a != checksum_b {
1089 return DirtyReason::ChecksumUseChanged { old: *checksum_b };
1090 }
1091 }
1092 (
1093 LocalFingerprint::RerunIfChanged {
1094 output: a_out,
1095 paths: a_paths,
1096 },
1097 LocalFingerprint::RerunIfChanged {
1098 output: b_out,
1099 paths: b_paths,
1100 },
1101 ) => {
1102 if a_out != b_out {
1103 return DirtyReason::RerunIfChangedOutputFileChanged {
1104 old: b_out.clone(),
1105 new: a_out.clone(),
1106 };
1107 }
1108 if a_paths != b_paths {
1109 return DirtyReason::RerunIfChangedOutputPathsChanged {
1110 old: b_paths.clone(),
1111 new: a_paths.clone(),
1112 };
1113 }
1114 }
1115 (
1116 LocalFingerprint::RerunIfEnvChanged {
1117 var: a_key,
1118 val: a_value,
1119 },
1120 LocalFingerprint::RerunIfEnvChanged {
1121 var: b_key,
1122 val: b_value,
1123 },
1124 ) => {
1125 if *a_key != *b_key {
1126 return DirtyReason::EnvVarsChanged {
1127 old: b_key.clone(),
1128 new: a_key.clone(),
1129 };
1130 }
1131 if *a_value != *b_value {
1132 return DirtyReason::EnvVarChanged {
1133 name: a_key.clone(),
1134 old_value: b_value.clone(),
1135 new_value: a_value.clone(),
1136 };
1137 }
1138 }
1139 (a, b) => {
1140 return DirtyReason::LocalFingerprintTypeChanged {
1141 old: b.kind(),
1142 new: a.kind(),
1143 };
1144 }
1145 }
1146 }
1147
1148 if self.deps.len() != old.deps.len() {
1149 return DirtyReason::NumberOfDependenciesChanged {
1150 old: old.deps.len(),
1151 new: self.deps.len(),
1152 };
1153 }
1154 for (a, b) in self.deps.iter().zip(old.deps.iter()) {
1155 if a.name != b.name {
1156 return DirtyReason::UnitDependencyNameChanged {
1157 old: b.name,
1158 new: a.name,
1159 };
1160 }
1161
1162 if a.fingerprint.hash_u64() != b.fingerprint.hash_u64() {
1163 return DirtyReason::UnitDependencyInfoChanged {
1164 new_name: a.name,
1165 new_fingerprint: a.fingerprint.hash_u64(),
1166 old_name: b.name,
1167 old_fingerprint: b.fingerprint.hash_u64(),
1168 };
1169 }
1170 }
1171
1172 if !self.fs_status.up_to_date() {
1173 return DirtyReason::FsStatusOutdated(self.fs_status.clone());
1174 }
1175
1176 // This typically means some filesystem modifications happened or
1177 // something transitive was odd. In general we should strive to provide
1178 // a better error message than this, so if you see this message a lot it
1179 // likely means this method needs to be updated!
1180 DirtyReason::NothingObvious
1181 }
1182
1183 /// Dynamically inspect the local filesystem to update the `fs_status` field
1184 /// of this `Fingerprint`.
1185 ///
1186 /// This function is used just after a `Fingerprint` is constructed to check
1187 /// the local state of the filesystem and propagate any dirtiness from
1188 /// dependencies up to this unit as well. This function assumes that the
1189 /// unit starts out as [`FsStatus::Stale`] and then it will optionally switch
1190 /// it to `UpToDate` if it can.
1191 fn check_filesystem(
1192 &mut self,
1193 mtime_cache: &mut HashMap<PathBuf, FileTime>,
1194 checksum_cache: &mut HashMap<PathBuf, Checksum>,
1195 pkg: &Package,
1196 build_root: &Path,
1197 cargo_exe: &Path,
1198 gctx: &GlobalContext,
1199 ) -> CargoResult<()> {
1200 assert!(!self.fs_status.up_to_date());
1201
1202 let pkg_root = pkg.root();
1203 let mut mtimes = HashMap::new();
1204
        // Get the `mtime` of all outputs. Optionally update their mtime
        // afterwards based on the `mtime_on_use` flag. Afterwards we want the
        // maximum mtime as it's the one we'll be comparing to inputs and
        // dependencies.
1209 for output in self.outputs.iter() {
1210 let Ok(mtime) = paths::mtime(output) else {
                // This path failed to report its `mtime`. It probably doesn't
                // exist, so leave ourselves as stale and bail out.
1213 let item = StaleItem::FailedToReadMetadata {
1214 path: output.clone(),
1215 };
1216 self.fs_status = FsStatus::StaleItem(item);
1217 return Ok(());
1218 };
1219 assert!(mtimes.insert(output.clone(), mtime).is_none());
1220 }
1221
1222 let opt_max = mtimes.iter().max_by_key(|kv| kv.1);
1223 let Some((max_path, max_mtime)) = opt_max else {
1224 // We had no output files. This means we're an overridden build
1225 // script and we're just always up to date because we aren't
1226 // watching the filesystem.
1227 self.fs_status = FsStatus::UpToDate { mtimes };
1228 return Ok(());
1229 };
1230 debug!(
1231 "max output mtime for {:?} is {:?} {}",
1232 pkg_root, max_path, max_mtime
1233 );
1234
1235 for dep in self.deps.iter() {
1236 let dep_mtimes = match &dep.fingerprint.fs_status {
1237 FsStatus::UpToDate { mtimes } => mtimes,
1238 // If our dependency is stale, so are we, so bail out.
1239 FsStatus::Stale
1240 | FsStatus::StaleItem(_)
1241 | FsStatus::StaleDependency { .. }
1242 | FsStatus::StaleDepFingerprint { .. } => {
1243 self.fs_status = FsStatus::StaleDepFingerprint { name: dep.name };
1244 return Ok(());
1245 }
1246 };
1247
1248 // If our dependency edge only requires the rmeta file to be present
1249 // then we only need to look at that one output file, otherwise we
1250 // need to consider all output files to see if we're out of date.
1251 let (dep_path, dep_mtime) = if dep.only_requires_rmeta {
1252 dep_mtimes
1253 .iter()
1254 .find(|(path, _mtime)| {
1255 path.extension().and_then(|s| s.to_str()) == Some("rmeta")
1256 })
1257 .expect("failed to find rmeta")
1258 } else {
1259 match dep_mtimes.iter().max_by_key(|kv| kv.1) {
1260 Some(dep_mtime) => dep_mtime,
                    // If our dependency is up to date and has no filesystem
                    // interactions, then we can move on to the next dependency.
1263 None => continue,
1264 }
1265 };
1266 debug!(
1267 "max dep mtime for {:?} is {:?} {}",
1268 pkg_root, dep_path, dep_mtime
1269 );
1270
1271 // If the dependency is newer than our own output then it was
1272 // recompiled previously. We transitively become stale ourselves in
1273 // that case, so bail out.
1274 //
            // Note that this comparison should probably be `>=`, not `>`, but
            // for why it's `>` see the discussion about #5918 below in
            // `find_stale_file`.
1278 if dep_mtime > max_mtime {
1279 info!(
1280 "dependency on `{}` is newer than we are {} > {} {:?}",
1281 dep.name, dep_mtime, max_mtime, pkg_root
1282 );
1283
1284 self.fs_status = FsStatus::StaleDependency {
1285 name: dep.name,
1286 dep_mtime: *dep_mtime,
1287 max_mtime: *max_mtime,
1288 };
1289
1290 return Ok(());
1291 }
1292 }
1293
1294 // If we reached this far then all dependencies are up to date. Check
1295 // all our `LocalFingerprint` information to see if we have any stale
1296 // files for this package itself. If we do find something log a helpful
1297 // message and bail out so we stay stale.
1298 for local in self.local.get_mut().unwrap().iter() {
1299 if let Some(item) = local.find_stale_item(
1300 mtime_cache,
1301 checksum_cache,
1302 pkg,
1303 build_root,
1304 cargo_exe,
1305 gctx,
1306 )? {
1307 item.log();
1308 self.fs_status = FsStatus::StaleItem(item);
1309 return Ok(());
1310 }
1311 }
1312
1313 // Everything was up to date! Record such.
1314 self.fs_status = FsStatus::UpToDate { mtimes };
1315 debug!("filesystem up-to-date {:?}", pkg_root);
1316
1317 Ok(())
1318 }
1319}
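
/*
Illustrative sketch, not part of Cargo: the dependency loop in
`check_filesystem` reduces to "take our newest output, then look for a
dependency whose newest output is strictly newer". The hypothetical function
`first_newer_dep` below shows that comparison with plain integer timestamps in
place of `FileTime`. It leaves out the rmeta-only case and the
`LocalFingerprint` checks.

```rust
use std::collections::HashMap;
use std::path::PathBuf;

/// Hypothetical, simplified check: returns the name of the first dependency
/// whose newest output is strictly newer than our newest output.
fn first_newer_dep<'a>(
    outputs: &HashMap<PathBuf, u64>,
    deps: &'a [(&'a str, HashMap<PathBuf, u64>)],
) -> Option<&'a str> {
    // With no outputs of our own there is nothing on disk to compare.
    let max_mtime = outputs.values().max()?;
    for (name, dep_outputs) in deps {
        // A dependency without outputs cannot invalidate us.
        let Some(dep_mtime) = dep_outputs.values().max() else {
            continue;
        };
        // Strict `>`: equal timestamps are treated as fresh.
        if dep_mtime > max_mtime {
            return Some(*name);
        }
    }
    None
}
```
*/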
1320
1321impl hash::Hash for Fingerprint {
1322 fn hash<H: Hasher>(&self, h: &mut H) {
1323 let Fingerprint {
1324 rustc,
1325 ref features,
1326 ref declared_features,
1327 target,
1328 path,
1329 profile,
1330 ref deps,
1331 ref local,
1332 config,
1333 compile_kind,
1334 ref rustflags,
1335 ..
1336 } = *self;
1337 let local = local.lock().unwrap();
1338 (
1339 rustc,
1340 features,
1341 declared_features,
1342 target,
1343 path,
1344 profile,
1345 &*local,
1346 config,
1347 compile_kind,
1348 rustflags,
1349 )
1350 .hash(h);
1351
1352 h.write_usize(deps.len());
1353 for DepFingerprint {
1354 pkg_id,
1355 name,
1356 public,
1357 fingerprint,
1358 only_requires_rmeta: _, // static property, no need to hash
1359 } in deps
1360 {
1361 pkg_id.hash(h);
1362 name.hash(h);
1363 public.hash(h);
1364 // use memoized dep hashes to avoid exponential blowup
1365 h.write_u64(fingerprint.hash_u64());
1366 }
1367 }
1368}
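
/*
Illustrative sketch, not part of Cargo: the `Hash` impl above feeds each
dependency's memoized `u64` into the hasher instead of hashing the dependency
recursively, so a dependency shared by many units is hashed once. The
hypothetical `Node` type below reproduces that pattern with the standard
library's `DefaultHasher`, which is an assumption of this sketch; Cargo uses
its own `util::hash_u64`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};

/// Hypothetical, pared-down node: a payload plus shared dependencies.
struct Node {
    payload: u64,
    deps: Vec<Arc<Node>>,
    memoized_hash: Mutex<Option<u64>>,
}

impl Node {
    fn new(payload: u64, deps: Vec<Arc<Node>>) -> Arc<Node> {
        Arc::new(Node { payload, deps, memoized_hash: Mutex::new(None) })
    }

    /// Returns the cached hash, computing and storing it on first use.
    fn hash_u64(&self) -> u64 {
        if let Some(h) = *self.memoized_hash.lock().unwrap() {
            return h;
        }
        let mut hasher = DefaultHasher::new();
        self.hash(&mut hasher);
        let h = hasher.finish();
        *self.memoized_hash.lock().unwrap() = Some(h);
        h
    }
}

impl Hash for Node {
    fn hash<H: Hasher>(&self, h: &mut H) {
        self.payload.hash(h);
        h.write_usize(self.deps.len());
        for dep in &self.deps {
            // Feed in the dependency's cached u64 rather than walking its
            // whole subtree again.
            h.write_u64(dep.hash_u64());
        }
    }
}
```
*/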
1369
1370impl DepFingerprint {
1371 fn new(
1372 build_runner: &mut BuildRunner<'_, '_>,
1373 parent: &Unit,
1374 dep: &UnitDep,
1375 ) -> CargoResult<DepFingerprint> {
1376 let fingerprint = calculate(build_runner, &dep.unit)?;
1377 // We need to be careful about what we hash here. We have a goal of
1378 // supporting renaming a project directory and not rebuilding
1379 // everything. To do that, however, we need to make sure that the cwd
1380 // doesn't make its way into any hashes, and one source of that is the
1381 // `SourceId` for `path` packages.
1382 //
1383 // We already have a requirement that `path` packages all have unique
1384 // names (sort of for this same reason), so if the package source is a
1385 // `path` then we just hash the name, but otherwise we hash the full
1386 // id as it won't change when the directory is renamed.
1387 let pkg_id = if dep.unit.pkg.package_id().source_id().is_path() {
1388 util::hash_u64(dep.unit.pkg.package_id().name())
1389 } else {
1390 util::hash_u64(dep.unit.pkg.package_id())
1391 };
1392
1393 Ok(DepFingerprint {
1394 pkg_id,
1395 name: dep.extern_crate_name,
1396 public: dep.public,
1397 fingerprint,
1398 only_requires_rmeta: build_runner.only_requires_rmeta(parent, &dep.unit),
1399 })
1400 }
1401}
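
/*
Illustrative sketch, not part of Cargo: `DepFingerprint::new` hashes only the
package name for `path` packages so that moving the project directory does not
change the hash, and hashes the full package id otherwise. The hypothetical
`PkgId` struct and `dep_pkg_id_hash` function below model that choice; the
field names and the use of `DefaultHasher` are assumptions made for the sketch.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a package identity.
#[derive(Hash)]
struct PkgId<'a> {
    name: &'a str,
    version: &'a str,
    /// For a path package this is the directory on disk.
    source: &'a str,
    is_path: bool,
}

fn hash_u64<T: Hash>(value: T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Path packages contribute only their name, so moving the checkout does
/// not change the result; other sources contribute the full identity.
fn dep_pkg_id_hash(id: &PkgId<'_>) -> u64 {
    if id.is_path {
        hash_u64(id.name)
    } else {
        hash_u64(id)
    }
}
```
*/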
1402
1403impl StaleItem {
1404 /// Use the `log` crate to log a hopefully helpful message in diagnosing
1405 /// what file is considered stale and why. This is intended to be used in
1406 /// conjunction with `CARGO_LOG` to determine why Cargo is recompiling
1407 /// something. Currently there's no user-facing usage of this other than
1408 /// that.
1409 fn log(&self) {
1410 match self {
1411 StaleItem::MissingFile { path } => {
1412 info!("stale: missing {:?}", path);
1413 }
1414 StaleItem::UnableToReadFile { path } => {
1415 info!("stale: unable to read {:?}", path);
1416 }
1417 StaleItem::FailedToReadMetadata { path } => {
1418 info!("stale: couldn't read metadata {:?}", path);
1419 }
1420 StaleItem::ChangedFile {
1421 reference,
1422 reference_mtime,
1423 stale,
1424 stale_mtime,
1425 } => {
1426 info!("stale: changed {:?}", stale);
1427 info!(" (vs) {:?}", reference);
1428 info!(" {:?} < {:?}", reference_mtime, stale_mtime);
1429 }
1430 StaleItem::FileSizeChanged {
1431 path,
1432 new_size,
1433 old_size,
1434 } => {
1435 info!("stale: changed {:?}", path);
1436 info!("prior file size {old_size}");
1437 info!(" new file size {new_size}");
1438 }
1439 StaleItem::ChangedChecksum {
1440 source,
1441 stored_checksum,
1442 new_checksum,
1443 } => {
1444 info!("stale: changed {:?}", source);
1445 info!("prior checksum {stored_checksum}");
1446 info!(" new checksum {new_checksum}");
1447 }
1448 StaleItem::MissingChecksum { path } => {
1449 info!("stale: no prior checksum {:?}", path);
1450 }
1451 StaleItem::ChangedEnv {
1452 var,
1453 previous,
1454 current,
1455 } => {
1456 info!("stale: changed env {:?}", var);
1457 info!(" {:?} != {:?}", previous, current);
1458 }
1459 }
1460 }
1461}
1462
1463/// Calculates the fingerprint for a [`Unit`].
1464///
/// This fingerprint is used by Cargo to detect changes such as:
1466///
1467/// * A non-path package changes (changes version, changes revision, etc).
1468/// * Any dependency changes
1469/// * The compiler changes
1470/// * The set of features a package is built with changes
1471/// * The profile a target is compiled with changes (e.g., opt-level changes)
1472/// * Any other compiler flags change that will affect the result
1473///
1474/// Information like file modification time is only calculated for path
1475/// dependencies.
1476fn calculate(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> CargoResult<Arc<Fingerprint>> {
1477 // This function is slammed quite a lot, so the result is memoized.
1478 if let Some(s) = build_runner.fingerprints.get(unit) {
1479 return Ok(Arc::clone(s));
1480 }
1481 let mut fingerprint = if unit.mode.is_run_custom_build() {
1482 calculate_run_custom_build(build_runner, unit)?
1483 } else if unit.mode.is_doc_test() {
1484 panic!("doc tests do not fingerprint");
1485 } else {
1486 calculate_normal(build_runner, unit)?
1487 };
1488
    // After building the initial `Fingerprint`, be sure to update its
    // `fs_status` field.
1491 let build_root = build_root(build_runner);
1492 let cargo_exe = build_runner.bcx.gctx.cargo_exe()?;
1493 fingerprint.check_filesystem(
1494 &mut build_runner.mtime_cache,
1495 &mut build_runner.checksum_cache,
1496 &unit.pkg,
1497 &build_root,
1498 cargo_exe,
1499 build_runner.bcx.gctx,
1500 )?;
1501
1502 let fingerprint = Arc::new(fingerprint);
1503 build_runner
1504 .fingerprints
1505 .insert(unit.clone(), Arc::clone(&fingerprint));
1506 Ok(fingerprint)
1507}
1508
1509/// Calculate a fingerprint for a "normal" unit, or anything that's not a build
1510/// script. This is an internal helper of [`calculate`], don't call directly.
1511fn calculate_normal(
1512 build_runner: &mut BuildRunner<'_, '_>,
1513 unit: &Unit,
1514) -> CargoResult<Fingerprint> {
1515 let deps = {
1516 // Recursively calculate the fingerprint for all of our dependencies.
1517 //
1518 // Skip fingerprints of binaries because they don't actually induce a
1519 // recompile, they're just dependencies in the sense that they need to be
        // built. The only exception is artifact dependencies, which are
        // actual dependencies that need a recompile.
1522 //
1523 // Create Vec since mutable build_runner is needed in closure.
1524 let deps = Vec::from(build_runner.unit_deps(unit));
1525 let mut deps = deps
1526 .into_iter()
1527 .filter(|dep| !dep.unit.target.is_bin() || dep.unit.artifact.is_true())
1528 .map(|dep| DepFingerprint::new(build_runner, unit, &dep))
1529 .collect::<CargoResult<Vec<_>>>()?;
1530 deps.sort_by(|a, b| a.pkg_id.cmp(&b.pkg_id));
1531 deps
1532 };
1533
1534 // Afterwards calculate our own fingerprint information.
1535 let build_root = build_root(build_runner);
1536 let is_any_doc_gen = unit.mode.is_doc() || unit.mode.is_doc_scrape();
1537 let rustdoc_depinfo_enabled = build_runner.bcx.gctx.cli_unstable().rustdoc_depinfo;
1538 let local = if is_any_doc_gen && !rustdoc_depinfo_enabled {
1539 // rustdoc does not have dep-info files.
1540 let fingerprint = pkg_fingerprint(build_runner.bcx, &unit.pkg).with_context(|| {
1541 format!(
1542 "failed to determine package fingerprint for documenting {}",
1543 unit.pkg
1544 )
1545 })?;
1546 vec![LocalFingerprint::Precalculated(fingerprint)]
1547 } else {
1548 let dep_info = dep_info_loc(build_runner, unit);
1549 let dep_info = dep_info.strip_prefix(&build_root).unwrap().to_path_buf();
1550 vec![LocalFingerprint::CheckDepInfo {
1551 dep_info,
1552 checksum: build_runner.bcx.gctx.cli_unstable().checksum_freshness,
1553 }]
1554 };
1555
    // Figure out what the outputs of our unit are, and we'll be storing them
    // into the fingerprint as well.
1558 let outputs = build_runner
1559 .outputs(unit)?
1560 .iter()
1561 .filter(|output| {
1562 !matches!(
1563 output.flavor,
1564 FileFlavor::DebugInfo | FileFlavor::Auxiliary | FileFlavor::Sbom
1565 )
1566 })
1567 .map(|output| output.path.clone())
1568 .collect();
1569
    // Fill out a bunch more information that we'll be tracking. It is
    // typically hashed to take up less space on disk, as we just need to know
    // when things change.
1573 let extra_flags = if unit.mode.is_doc() || unit.mode.is_doc_scrape() {
1574 &unit.rustdocflags
1575 } else {
1576 &unit.rustflags
1577 }
1578 .to_vec();
1579
1580 let profile_hash = util::hash_u64((
1581 &unit.profile,
1582 unit.mode,
1583 build_runner.bcx.extra_args_for(unit),
1584 build_runner.lto[unit],
1585 unit.pkg.manifest().lint_rustflags(),
1586 ));
1587 let mut config = StableHasher::new();
1588 if let Some(linker) = build_runner.compilation.target_linker(unit.kind) {
1589 linker.hash(&mut config);
1590 }
1591 if unit.mode.is_doc() && build_runner.bcx.gctx.cli_unstable().rustdoc_map {
1592 if let Ok(map) = build_runner.bcx.gctx.doc_extern_map() {
1593 map.hash(&mut config);
1594 }
1595 }
1596 if let Some(allow_features) = &build_runner.bcx.gctx.cli_unstable().allow_features {
1597 allow_features.hash(&mut config);
1598 }
1599 let compile_kind = unit.kind.fingerprint_hash();
1600 let mut declared_features = unit.pkg.summary().features().keys().collect::<Vec<_>>();
    // Sort to avoid a useless rebuild if the user orders its features
    // differently.
    declared_features.sort();
1603 Ok(Fingerprint {
1604 rustc: util::hash_u64(&build_runner.bcx.rustc().verbose_version),
1605 target: util::hash_u64(&unit.target),
1606 profile: profile_hash,
1607 // Note that .0 is hashed here, not .1 which is the cwd. That doesn't
1608 // actually affect the output artifact so there's no need to hash it.
1609 path: util::hash_u64(path_args(build_runner.bcx.ws, unit).0),
1610 features: format!("{:?}", unit.features),
1611 declared_features: format!("{declared_features:?}"),
1612 deps,
1613 local: Mutex::new(local),
1614 memoized_hash: Mutex::new(None),
1615 config: Hasher::finish(&config),
1616 compile_kind,
1617 rustflags: extra_flags,
1618 fs_status: FsStatus::Stale,
1619 outputs,
1620 })
1621}
1622
/// Calculate a fingerprint for an "execute a build script" unit. This is an
/// internal helper of [`calculate`], don't call directly.
fn calculate_run_custom_build(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<Fingerprint> {
    assert!(unit.mode.is_run_custom_build());
    // Using the `BuildDeps` information we'll have previously parsed and
    // inserted into `build_explicit_deps`, build an initial snapshot of the
    // `LocalFingerprint` list for this build script. If we previously executed
    // the build script this means we'll be watching files and env vars.
    // Otherwise if we haven't previously executed it we'll just start watching
    // the whole crate.
    let (gen_local, overridden) = build_script_local_fingerprints(build_runner, unit)?;
    let deps = &build_runner.build_explicit_deps[unit];
    let local = (gen_local)(
        deps,
        Some(&|| {
            const IO_ERR_MESSAGE: &str = "\
An I/O error happened. Please make sure you can access the file.

By default, if your project contains a build script, cargo scans all files in
it to determine whether a rebuild is needed. If you don't expect to access the
file, specify `rerun-if-changed` in your build script.
See https://doc.rust-lang.org/cargo/reference/build-scripts.html#rerun-if-changed for more information.";
            pkg_fingerprint(build_runner.bcx, &unit.pkg).map_err(|err| {
                let mut message = format!("failed to determine package fingerprint for build script for {}", unit.pkg);
                if err.root_cause().is::<io::Error>() {
                    message = format!("{}\n{}", message, IO_ERR_MESSAGE)
                }
                err.context(message)
            })
        }),
    )?
    .unwrap();
    let output = deps.build_script_output.clone();

    // Include any dependencies of our execution, which is typically just the
    // compilation of the build script itself. (if the build script changes we
    // should be rerun!). Note though that if we're an overridden build script
    // we have no dependencies so no need to recurse in that case.
    let deps = if overridden {
        // Overridden build scripts don't need to track deps.
        vec![]
    } else {
        // Create Vec since mutable build_runner is needed in closure.
        let deps = Vec::from(build_runner.unit_deps(unit));
        deps.into_iter()
            .map(|dep| DepFingerprint::new(build_runner, unit, &dep))
            .collect::<CargoResult<Vec<_>>>()?
    };

    let rustflags = unit.rustflags.to_vec();

    Ok(Fingerprint {
        local: Mutex::new(local),
        rustc: util::hash_u64(&build_runner.bcx.rustc().verbose_version),
        deps,
        outputs: if overridden { Vec::new() } else { vec![output] },
        rustflags,

        // Most of the other info is blank here as we don't really include it
        // in the execution of the build script, but... this may be a latent
        // bug in Cargo.
        ..Fingerprint::new()
    })
}

/// Get ready to compute the [`LocalFingerprint`] values
/// for a [`RunCustomBuild`] unit.
///
/// This function has what is, on the surface, a seriously wonky interface.
/// You'll call this function and it'll return a closure and a boolean. The
/// boolean is pretty simple in that it indicates whether the `unit` has been
/// overridden via `.cargo/config.toml`. The closure is much more complicated.
///
/// This closure is intended to capture any local state necessary to compute
/// the `LocalFingerprint` values for this unit. It is `Send` and `'static` to
/// be sent to other threads as well (such as when we're executing build
/// scripts). That deduplication is the rationale for the closure at least.
///
/// The arguments to the closure are a bit weirder, though, and I'll apologize
/// in advance for the weirdness too. The first argument to the closure is a
/// `&BuildDeps`. This is the parsed version of a build script, and when Cargo
/// starts up this is cached from previous runs of a build script. After a
/// build script executes the output file is reparsed and passed in here.
///
/// The second argument is the weirdest: it's *optionally* a closure to
/// call [`pkg_fingerprint`]. `pkg_fingerprint` requires access to the
/// "source map" located in `Context`. That's very non-`'static` and
/// non-`Send`, so it can't be used on other threads, such as when we invoke
/// this after a build script has finished. The `Option` allows us to for sure
/// calculate it on the main thread at the beginning, and then swallow the bug
/// for now where a worker thread after a build script has finished doesn't
/// have access. Ideally there would be no second argument or it would be more
/// "first class" and not an `Option` but something that can be sent between
/// threads. In any case, it's a bug for now.
///
/// This isn't the greatest of interfaces, and if there are suggestions to
/// improve it, please share them!
///
/// FIXME(#6779) - see all the words above
///
/// [`RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
fn build_script_local_fingerprints(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<(
    Box<
        dyn FnOnce(
                &BuildDeps,
                Option<&dyn Fn() -> CargoResult<String>>,
            ) -> CargoResult<Option<Vec<LocalFingerprint>>>
            + Send,
    >,
    bool,
)> {
    assert!(unit.mode.is_run_custom_build());
    // First up, if this build script is entirely overridden, then we just
    // return the hash of what we overrode it with. This is the easy case!
    if let Some(fingerprint) = build_script_override_fingerprint(build_runner, unit) {
        debug!("override local fingerprints deps {}", unit.pkg);
        return Ok((
            Box::new(
                move |_: &BuildDeps, _: Option<&dyn Fn() -> CargoResult<String>>| {
                    Ok(Some(vec![fingerprint]))
                },
            ),
            true, // this is an overridden build script
        ));
    }

    // ... Otherwise this is a "real" build script and we need to return a real
    // closure. Our returned closure classifies the build script based on
    // whether it prints `rerun-if-*`. If it *doesn't* print this it's where the
    // magical second argument comes into play, which fingerprints a whole
    // package. Remember that the fact that this is an `Option` is a bug, but a
    // longstanding bug, in Cargo. Recent refactorings just made it painfully
    // obvious.
    let pkg_root = unit.pkg.root().to_path_buf();
    let build_dir = build_root(build_runner);
    let env_config = Arc::clone(build_runner.bcx.gctx.env_config()?);
    let calculate =
        move |deps: &BuildDeps, pkg_fingerprint: Option<&dyn Fn() -> CargoResult<String>>| {
            if deps.rerun_if_changed.is_empty() && deps.rerun_if_env_changed.is_empty() {
                match pkg_fingerprint {
                    // FIXME: this is somewhat buggy with respect to docker and
                    // weird filesystems. The `Precalculated` variant
                    // constructed below will, for `path` dependencies, contain
                    // a stringified version of the mtime for the local crate.
                    // This violates one of the things we describe in this
                    // module's doc comment, never hashing mtimes. We should
                    // figure out a better scheme where a package fingerprint
                    // may be a string (like for a registry) or a list of files
                    // (like for a path dependency). That list of files would
                    // be stored here rather than the mtime of them.
                    Some(f) => {
                        let s = f()?;
                        debug!(
                            "old local fingerprints deps {:?} precalculated={:?}",
                            pkg_root, s
                        );
                        return Ok(Some(vec![LocalFingerprint::Precalculated(s)]));
                    }
                    None => return Ok(None),
                }
            }

            // Ok so now we're in "new mode" where we can have files listed as
            // dependencies as well as env vars listed as dependencies. Process
            // them all here.
            Ok(Some(local_fingerprints_deps(
                deps,
                &build_dir,
                &pkg_root,
                &env_config,
            )))
        };

    // Note that `false` == "not overridden"
    Ok((Box::new(calculate), false))
}

/// Create a [`LocalFingerprint`] for an overridden build script.
/// Returns None if it is not overridden.
fn build_script_override_fingerprint(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> Option<LocalFingerprint> {
    // Build script output is only populated at this stage when it is
    // overridden.
    let build_script_outputs = build_runner.build_script_outputs.lock().unwrap();
    let metadata = build_runner.get_run_build_script_metadata(unit);
    // Returns None if it is not overridden.
    let output = build_script_outputs.get(metadata)?;
    let s = format!(
        "overridden build state with hash: {}",
        util::hash_u64(output)
    );
    Some(LocalFingerprint::Precalculated(s))
}

/// Compute the [`LocalFingerprint`] values for a [`RunCustomBuild`] unit for
/// non-overridden new-style build scripts only. This is only used when `deps`
/// is already known to have a nonempty `rerun-if-*` somewhere.
///
/// [`RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
fn local_fingerprints_deps(
    deps: &BuildDeps,
    build_root: &Path,
    pkg_root: &Path,
    env_config: &Arc<HashMap<String, OsString>>,
) -> Vec<LocalFingerprint> {
    debug!("new local fingerprints deps {:?}", pkg_root);
    let mut local = Vec::new();

    if !deps.rerun_if_changed.is_empty() {
        // Note that like the module comment above says we are careful to never
        // store an absolute path in `LocalFingerprint`, so ensure that we strip
        // absolute prefixes from them.
        let output = deps
            .build_script_output
            .strip_prefix(build_root)
            .unwrap()
            .to_path_buf();
        let paths = deps
            .rerun_if_changed
            .iter()
            .map(|p| p.strip_prefix(pkg_root).unwrap_or(p).to_path_buf())
            .collect();
        local.push(LocalFingerprint::RerunIfChanged { output, paths });
    }

    local.extend(
        deps.rerun_if_env_changed
            .iter()
            .map(|s| LocalFingerprint::from_env(s, env_config)),
    );

    local
}

/// Writes the short fingerprint hash value to `<loc>`
/// and logs detailed JSON information to `<loc>.json`.
fn write_fingerprint(loc: &Path, fingerprint: &Fingerprint) -> CargoResult<()> {
    debug_assert_ne!(fingerprint.rustc, 0);
    // `Fingerprint::new().rustc == 0`; make sure it doesn't make it to the file system.
    // This is mostly so outside tools can reliably find out what rust version this file is for,
    // as we can use the full hash.
    let hash = fingerprint.hash_u64();
    debug!("write fingerprint ({:x}) : {}", hash, loc.display());
    paths::write(loc, util::to_hex(hash).as_bytes())?;

    let json = serde_json::to_string(fingerprint).unwrap();
    if cfg!(debug_assertions) {
        let f: Fingerprint = serde_json::from_str(&json).unwrap();
        assert_eq!(f.hash_u64(), hash);
    }
    paths::write(&loc.with_extension("json"), json.as_bytes())?;
    Ok(())
}

/// Prepare for work when a package starts to build
pub fn prepare_init(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> CargoResult<()> {
    let new1 = build_runner.files().fingerprint_dir(unit);

    // Doc tests have no output, thus no fingerprint.
    if !new1.exists() && !unit.mode.is_doc_test() {
        paths::create_dir_all(&new1)?;
    }

    Ok(())
}

/// Returns the location that the dep-info file will show up at
/// for the [`Unit`] specified.
pub fn dep_info_loc(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> PathBuf {
    build_runner.files().fingerprint_file_path(unit, "dep-")
}

/// Returns the absolute path of the build directory.
/// All paths are rewritten to be relative to this.
fn build_root(build_runner: &BuildRunner<'_, '_>) -> PathBuf {
    build_runner.bcx.ws.build_dir().into_path_unlocked()
}

/// Reads the value from the old fingerprint hash file and compares it.
///
/// If dirty, it then restores the detailed information
/// from the fingerprint JSON file, and provides a rich dirty reason.
fn compare_old_fingerprint(
    unit: &Unit,
    old_hash_path: &Path,
    new_fingerprint: &Fingerprint,
    mtime_on_use: bool,
    forced: bool,
) -> Option<DirtyReason> {
    if mtime_on_use {
        // update the mtime so other cleaners know we used it
        let t = FileTime::from_system_time(SystemTime::now());
        debug!("mtime-on-use forcing {:?} to {}", old_hash_path, t);
        paths::set_file_time_no_err(old_hash_path, t);
    }

    let compare = _compare_old_fingerprint(old_hash_path, new_fingerprint);

    match compare.as_ref() {
        Ok(None) => {}
        Ok(Some(reason)) => {
            info!(
                "fingerprint dirty for {}/{:?}/{:?}",
                unit.pkg, unit.mode, unit.target,
            );
            info!("    dirty: {reason:?}");
        }
        Err(e) => {
            info!(
                "fingerprint error for {}/{:?}/{:?}",
                unit.pkg, unit.mode, unit.target,
            );
            info!("    err: {e:?}");
        }
    }

    match compare {
        Ok(None) if forced => Some(DirtyReason::Forced),
        Ok(reason) => reason,
        Err(_) => Some(DirtyReason::FreshBuild),
    }
}

fn _compare_old_fingerprint(
    old_hash_path: &Path,
    new_fingerprint: &Fingerprint,
) -> CargoResult<Option<DirtyReason>> {
    let old_fingerprint_short = paths::read(old_hash_path)?;

    let new_hash = new_fingerprint.hash_u64();

    if util::to_hex(new_hash) == old_fingerprint_short && new_fingerprint.fs_status.up_to_date() {
        return Ok(None);
    }

    let old_fingerprint_json = paths::read(&old_hash_path.with_extension("json"))?;
    let old_fingerprint: Fingerprint = serde_json::from_str(&old_fingerprint_json)
        .with_context(|| internal("failed to deserialize json"))?;
    // Fingerprint can be empty after a failed rebuild (see comment in prepare_target).
    if !old_fingerprint_short.is_empty() {
        debug_assert_eq!(
            util::to_hex(old_fingerprint.hash_u64()),
            old_fingerprint_short
        );
    }

    Ok(Some(new_fingerprint.compare(&old_fingerprint)))
}

/// Calculates the fingerprint of a unit that contains no dep-info files.
fn pkg_fingerprint(bcx: &BuildContext<'_, '_>, pkg: &Package) -> CargoResult<String> {
    let source_id = pkg.package_id().source_id();
    let sources = bcx.packages.sources();

    let source = sources
        .get(source_id)
        .ok_or_else(|| internal("missing package source"))?;
    source.fingerprint(pkg)
}

/// The `reference` file is considered as "stale" if any file from `paths` has a newer mtime.
fn find_stale_file<I, P>(
    mtime_cache: &mut HashMap<PathBuf, FileTime>,
    checksum_cache: &mut HashMap<PathBuf, Checksum>,
    reference: &Path,
    paths: I,
    use_checksums: bool,
) -> Option<StaleItem>
where
    I: IntoIterator<Item = (P, Option<(u64, Checksum)>)>,
    P: AsRef<Path>,
{
    let reference_mtime = match paths::mtime(reference) {
        Ok(mtime) => mtime,
        Err(..) => {
            return Some(StaleItem::MissingFile {
                path: reference.to_path_buf(),
            });
        }
    };

    let skippable_dirs = if let Ok(cargo_home) = home::cargo_home() {
        let skippable_dirs: Vec<_> = ["git", "registry"]
            .into_iter()
            .map(|subfolder| cargo_home.join(subfolder))
            .collect();
        Some(skippable_dirs)
    } else {
        None
    };
    for (path, prior_checksum) in paths {
        let path = path.as_ref();

        // Assuming anything in cargo_home/{git, registry} is immutable
        // (see also #9455 about marking the src directory readonly) which avoids rebuilds when CI
        // caches $CARGO_HOME/registry/{index, cache} and $CARGO_HOME/git/db across runs, keeping
        // the content the same but changing the mtime.
        if let Some(ref skippable_dirs) = skippable_dirs {
            if skippable_dirs.iter().any(|dir| path.starts_with(dir)) {
                continue;
            }
        }
        if use_checksums {
            let Some((file_len, prior_checksum)) = prior_checksum else {
                return Some(StaleItem::MissingChecksum {
                    path: path.to_path_buf(),
                });
            };
            let path_buf = path.to_path_buf();

            let path_checksum = match checksum_cache.entry(path_buf) {
                Entry::Occupied(o) => *o.get(),
                Entry::Vacant(v) => {
                    let Ok(current_file_len) = fs::metadata(&path).map(|m| m.len()) else {
                        return Some(StaleItem::FailedToReadMetadata {
                            path: path.to_path_buf(),
                        });
                    };
                    if current_file_len != file_len {
                        return Some(StaleItem::FileSizeChanged {
                            path: path.to_path_buf(),
                            new_size: current_file_len,
                            old_size: file_len,
                        });
                    }
                    let Ok(file) = File::open(path) else {
                        return Some(StaleItem::MissingFile {
                            path: path.to_path_buf(),
                        });
                    };
                    let Ok(checksum) = Checksum::compute(prior_checksum.algo(), file) else {
                        return Some(StaleItem::UnableToReadFile {
                            path: path.to_path_buf(),
                        });
                    };
                    *v.insert(checksum)
                }
            };
            if path_checksum == prior_checksum {
                continue;
            }
            return Some(StaleItem::ChangedChecksum {
                source: path.to_path_buf(),
                stored_checksum: prior_checksum,
                new_checksum: path_checksum,
            });
        } else {
            let path_mtime = match mtime_cache.entry(path.to_path_buf()) {
                Entry::Occupied(o) => *o.get(),
                Entry::Vacant(v) => {
                    let Ok(mtime) = paths::mtime_recursive(path) else {
                        return Some(StaleItem::MissingFile {
                            path: path.to_path_buf(),
                        });
                    };
                    *v.insert(mtime)
                }
            };

            // TODO: fix #5918.
            // Note that equal mtimes should be considered "stale". For filesystems with
            // not much timestamp precision like 1s this would be a conservative approximation
            // to handle the case where a file is modified within the same second after
            // a build starts. We want to make sure that incremental rebuilds pick that up!
            //
            // For filesystems with nanosecond precision it's been seen in the wild that
            // its "nanosecond precision" isn't really nanosecond-accurate. It turns out that
            // kernels may cache the current time so files created at different times actually
            // list the same nanosecond precision. Some digging on #5919 picked up that the
            // kernel caches the current time between timer ticks, which could mean that if
            // a file is updated at most 10ms after a build starts then Cargo may not
            // pick up the build changes.
            //
            // All in all, an equality check here would be a conservative assumption that,
            // if equal, files were changed just after a previous build finished.
            // Unfortunately this became problematic when (in #6484) cargo switched to more
            // accurately measuring the start time of builds.
            if path_mtime <= reference_mtime {
                continue;
            }

            return Some(StaleItem::ChangedFile {
                reference: reference.to_path_buf(),
                reference_mtime,
                stale: path.to_path_buf(),
                stale_mtime: path_mtime,
            });
        }
    }

    debug!(
        "all paths up-to-date relative to {:?} mtime={}",
        reference, reference_mtime
    );
    None
}