cargo/core/compiler/fingerprint/
mod.rs

//! Tracks changes to determine if something needs to be recompiled.
//!
//! This module implements change-tracking so that Cargo can know whether or
//! not something needs to be recompiled. A Cargo [`Unit`] can be either "dirty"
//! (needs to be recompiled) or "fresh" (it does not need to be recompiled).
//!
//! ## Mechanisms affecting freshness
//!
//! There are several mechanisms that influence a Unit's freshness:
//!
//! - The [`Fingerprint`] is a hash, saved to the filesystem in the
//!   `.fingerprint` directory, that tracks information about the Unit. If the
//!   fingerprint is missing (such as the first time the unit is being
//!   compiled), then the unit is dirty. If any of the fingerprint fields
//!   change (like the name of the source file), then the Unit is considered
//!   dirty.
//!
//!   The `Fingerprint` also tracks the fingerprints of all its dependencies,
//!   so a change in a dependency will propagate the "dirty" status up.
//!
//! - Filesystem mtime tracking is also used to check if a unit is dirty.
//!   See the section below on "Mtime comparison" for more details. There
//!   are essentially two parts to mtime tracking:
//!
//!   1. The mtime of a Unit's output files is compared to the mtimes of all
//!      its dependencies' output files (see
//!      [`check_filesystem`]). If any output is missing, or is
//!      older than a dependency's output, then the unit is dirty.
//!   2. The mtime of a Unit's source files is compared to the mtime of its
//!      dep-info file in the fingerprint directory (see [`find_stale_file`]).
//!      The dep-info file is used as an anchor to know when the last build of
//!      the unit was done. See the "dep-info files" section below for more
//!      details. If any input files are missing, or are newer than the
//!      dep-info, then the unit is dirty.
//!
//! - Alternatively, if you're using the unstable feature `checksum-freshness`,
//!   mtimes are ignored entirely in favor of comparing first the file size, and
//!   then the checksum, with a known prior value emitted by rustc. Only nightly
//!   rustc will emit the needed metadata at the time of writing. This is dependent
//!   on the unstable feature `-Z checksum-hash-algorithm`.
//!
//! Note: Fingerprinting is not a perfect solution. Filesystem mtime tracking
//! is notoriously imprecise and problematic. Only a small part of the
//! environment is captured. This is a balance of performance, simplicity, and
//! completeness. Sandboxing, hashing file contents, tracking every file
//! access, environment variable, and network operation would ensure more
//! reliable and reproducible builds at the cost of being complex, slow, and
//! platform-dependent.
//!
//! ## Fingerprints and [`UnitHash`]s
//!
//! [`Metadata`] tracks several [`UnitHash`]s, including
//! [`Metadata::unit_id`], [`Metadata::c_metadata`], and [`Metadata::c_extra_filename`].
//! See its documentation for more details.
//!
//! NOTE: Not all output files are isolated via filename hashes (like dylibs).
//! The fingerprint directory uses a hash, but sometimes units share the same
//! fingerprint directory (when they don't have Metadata) so care should be
//! taken to handle this!
//!
//! Fingerprints and [`UnitHash`]s are similar, and track some of the same things.
//! [`UnitHash`]s contain information that is required to keep Units separate.
//! The Fingerprint includes additional information that should cause a
//! recompile while still reusing the same filenames. A comparison
//! of what is tracked:
//!
//! Value                                      | Fingerprint | `Metadata::unit_id` [^8] | `Metadata::c_metadata`
//! -------------------------------------------|-------------|--------------------------|-----------------------
//! rustc                                      | ✓           | ✓                        | ✓
//! [`Profile`]                                | ✓           | ✓                        | ✓
//! `cargo rustc` extra args                   | ✓           | ✓[^7]                    |
//! [`CompileMode`]                            | ✓           | ✓                        | ✓
//! Target Name                                | ✓           | ✓                        | ✓
//! `TargetKind` (bin/lib/etc.)                | ✓           | ✓                        | ✓
//! Enabled Features                           | ✓           | ✓                        | ✓
//! Declared Features                          | ✓           |                          |
//! Immediate dependency’s hashes              | ✓[^1]       | ✓                        | ✓
//! [`CompileKind`] (host/target)              | ✓           | ✓                        | ✓
//! `__CARGO_DEFAULT_LIB_METADATA`[^4]         |             | ✓                        | ✓
//! `package_id`                               |             | ✓                        | ✓
//! Target src path relative to ws             | ✓           |                          |
//! Target flags (test/bench/for_host/edition) | ✓           |                          |
//! -C incremental=… flag                      | ✓           |                          |
//! mtime of sources                           | ✓[^3]       |                          |
//! RUSTFLAGS/RUSTDOCFLAGS                     | ✓           | ✓[^7]                    |
//! [`Lto`] flags                              | ✓           | ✓                        | ✓
//! config settings[^5]                        | ✓           |                          |
//! `is_std`                                   |             | ✓                        | ✓
//! `[lints]` table[^6]                        | ✓           |                          |
//! `[lints.rust.unexpected_cfgs.check-cfg]`   | ✓           |                          |
//! `--extern priv:`                           | ✓           |                          |
//!
//! [^1]: Bin dependencies are not included.
//!
//! [^3]: See below for details on mtime tracking.
//!
//! [^4]: `__CARGO_DEFAULT_LIB_METADATA` is set by rustbuild to embed the
//!       release channel (bootstrap/stable/beta/nightly) in libstd.
//!
//! [^5]: Config settings that are not otherwise captured anywhere else.
//!       Currently, this is only `doc.extern-map`.
//!
//! [^6]: Via [`Manifest::lint_rustflags`][crate::core::Manifest::lint_rustflags].
//!
//! [^7]: extra-flags and RUSTFLAGS are conditionally excluded when `--remap-path-prefix` is
//!       present to avoid breaking build reproducibility while we wait for trim-paths.
//!
//! [^8]: Including `-Cextra-filename`.
//!
//! When deciding what should go in the Metadata vs the Fingerprint, consider
//! that some files (like dylibs) do not have a hash in their filename. Thus,
//! if a value changes, only the fingerprint will detect the change (consider,
//! for example, swapping between different features). Fields that are only in
//! Metadata generally aren't relevant to the fingerprint because they
//! fundamentally change the output (like target vs host changes the directory
//! where it is emitted).
//!
//! ## Fingerprint files
//!
//! Fingerprint information is stored in the
//! `target/{debug,release}/.fingerprint/` directory. Each Unit is stored in a
//! separate directory. Each Unit directory contains:
//!
//! - A file with a 16 hex-digit hash. This is the Fingerprint hash, used for
//!   quick loading and comparison.
//! - A `.json` file that contains details about the Fingerprint. This is only
//!   used to log details about *why* a fingerprint is considered dirty.
//!   `CARGO_LOG=cargo::core::compiler::fingerprint=trace cargo build` can be
//!   used to display this log information.
//! - A "dep-info" file which is a translation of rustc's `*.d` dep-info files
//!   to a Cargo-specific format that tweaks file names and is optimized for
//!   reading quickly.
//! - An `invoked.timestamp` file whose filesystem mtime is updated every time
//!   the Unit is built. This is used for capturing the time when the build
//!   starts, to detect if files are changed in the middle of the build. See
//!   below for more details.
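//!
//! As a rough illustration of the hash file described above, the sketch below
//! hashes a value and renders it as 16 hex digits. It is a stand-in under
//! stated assumptions rather than Cargo's own code: `fingerprint_hex` is a
//! hypothetical helper, and the standard library's `DefaultHasher` is used in
//! place of the `StableHasher` that this module imports.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//!
//! /// Hypothetical helper: hash any `Hash` value into a 64-bit digest and
//! /// render it as exactly 16 lowercase hex digits.
//! fn fingerprint_hex<T: Hash>(value: &T) -> String {
//!     // Assumption: `DefaultHasher` stands in for Cargo's stable hasher.
//!     let mut hasher = DefaultHasher::new();
//!     value.hash(&mut hasher);
//!     // Zero-pad so the string is always 16 characters wide.
//!     format!("{:016x}", hasher.finish())
//! }
//! ```
//!
//! Comparing two such short strings is cheap, which is why the hash file can
//! be used for quick loading and comparison while the `.json` file is kept
//! only for diagnostics.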
//!
//! Note that some units are a little different. A Unit for *running* a build
//! script or for `rustdoc` does not have a dep-info file (it's not
//! applicable). Build script `invoked.timestamp` files are in the build
//! output directory.
//!
//! ## Fingerprint calculation
//!
//! After the list of Units has been calculated, the Units are added to the
//! [`JobQueue`]. As each one is added, the fingerprint is calculated, and the
//! dirty/fresh status is recorded. A closure is used to update the fingerprint
//! on-disk when the Unit successfully finishes. The closure will recompute the
//! Fingerprint based on the updated information. If the Unit fails to compile,
//! the fingerprint is not updated.
//!
//! Fingerprints are cached in the [`BuildRunner`]. This makes computing
//! Fingerprints faster, but also is necessary for properly updating
//! dependency information. Because a Fingerprint holds `Arc` clones of the
//! Fingerprints of all its dependencies, it automatically picks up updates
//! to its dependencies when it is recomputed.
//!
//! ### dep-info files
//!
//! Cargo has several kinds of "dep info" files:
//!
//! * dep-info files generated by `rustc`.
//! * Fingerprint dep-info files translated from the first one.
//! * dep-info for external build system integration.
//! * Unstable `-Zbinary-dep-depinfo`.
//!
//! #### `rustc` dep-info files
//!
//! Cargo passes the `--emit=dep-info` flag to `rustc` so that `rustc` will
//! generate a "dep info" file (with the `.d` extension). This is a
//! Makefile-like syntax that includes all of the source files used to build
//! the crate. This file is used by Cargo to know which files to check to see
//! if the crate will need to be rebuilt. Example:
//!
//! ```makefile
//! /path/to/target/debug/deps/cargo-b6219d178925203d: src/bin/main.rs src/bin/cargo/cli.rs # … etc.
//! ```
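//!
//! To make the Makefile-like shape concrete, here is a minimal sketch that
//! splits one such rule into its target and its list of inputs. It is a
//! simplification and not the parser Cargo uses: `parse_dep_rule` is a
//! hypothetical name, paths containing spaces or backslash escapes are not
//! handled, and a rule with no inputs is reported as `None`.
//!
//! ```rust
//! /// Hypothetical helper: split a `target: dep dep ...` rule into its parts.
//! /// Simplification: escaped spaces in paths are not supported.
//! fn parse_dep_rule(line: &str) -> Option<(String, Vec<String>)> {
//!     // Drop a trailing ` # ...` comment, if any. `split` always yields at
//!     // least one item, so `next()` cannot fail here.
//!     let line = line.split(" #").next()?.trim();
//!     // Use ": " as the separator so a Windows drive prefix such as `C:\`
//!     // is not mistaken for the end of the target.
//!     let (target, deps) = line.split_once(": ")?;
//!     let deps = deps.split_whitespace().map(str::to_string).collect();
//!     Some((target.to_string(), deps))
//! }
//! ```
//!
//! The returned list of inputs is what a freshness check would then walk to
//! decide whether any source file changed.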
//!
//! #### Fingerprint dep-info files
//!
//! After `rustc` exits successfully, Cargo will read the first kind of dep
//! info file and translate it into a binary format that is stored in the
//! fingerprint directory ([`translate_dep_info`]).
//!
//! These are used to quickly scan for any changed files. The mtime of the
//! fingerprint dep-info file itself is used as the reference for comparing the
//! source files to determine if any of the source files have been modified
//! (see [below](#mtime-comparison) for more detail).
//!
//! Note that Cargo parses the special `# env-var:...` comments in dep-info
//! files to learn about environment variables that the rustc compile depends on.
//! Cargo then later uses this to trigger a recompile if a referenced env var
//! changes (even if the source didn't change).
//! This also includes env vars generated from Cargo metadata like `CARGO_PKG_DESCRIPTION`.
//! (See [`crate::core::manifest::ManifestMetadata`].)
//!
//! #### dep-info files for build system integration
//!
//! There is also a third dep-info file. Cargo will extend the file created by
//! rustc with some additional information and save this into the output
//! directory. This is intended for build system integration. See the
//! [`output_depinfo`] function for more detail.
//!
//! #### -Zbinary-dep-depinfo
//!
//! `rustc` has an experimental flag `-Zbinary-dep-depinfo`. This causes
//! `rustc` to include binary files (like rlibs) in the dep-info file. This is
//! primarily to support rustc development, so that Cargo can check the
//! implicit dependency to the standard library (which lives in the sysroot).
//! We want Cargo to recompile whenever the standard library rlib/dylibs
//! change, and this is a generic mechanism to make that work.
//!
//! ### Mtime comparison
//!
//! The use of modification timestamps is the most common way a unit will be
//! determined to be dirty or fresh between builds. There are many subtle
//! issues and edge cases with mtime comparisons. This gives a high-level
//! overview, but you'll need to read the code for the gritty details. Mtime
//! handling is different for different unit kinds. The different styles are
//! driven by the [`Fingerprint::local`] field, which is set based on the unit
//! kind.
//!
//! The status of whether or not the mtime is "stale" or "up-to-date" is
//! stored in [`Fingerprint::fs_status`].
//!
//! Every unit compares the mtime of its newest output file with the mtimes
//! of the outputs of all its dependencies. If any output file is missing,
//! then the unit is stale. If any dependency is newer, the unit is stale.
//!
//! #### Normal package mtime handling
//!
//! [`LocalFingerprint::CheckDepInfo`] is used for checking the mtime of
//! packages. It compares the mtime of the input files (the source files) to
//! the mtime of the dep-info file (which is written last after a build is
//! finished). If the dep-info is missing, the unit is stale (it has never
//! been built). The list of input files comes from the dep-info file. See the
//! section above for details on dep-info files.
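//!
//! The comparison described above can be sketched roughly as follows. This is
//! an illustration under stated assumptions, not the logic of
//! [`find_stale_file`]: `first_stale_input` is a hypothetical helper, it uses
//! plain `std::fs` metadata, and it ignores the checksum mode and the mtime
//! caching that the real code deals with.
//!
//! ```rust
//! use std::fs;
//! use std::io;
//! use std::path::{Path, PathBuf};
//!
//! /// Hypothetical helper: return the first input that is missing or newer
//! /// than the reference file. An `Err` means the reference itself could not
//! /// be read, which a caller would treat as "never built".
//! fn first_stale_input(reference: &Path, inputs: &[PathBuf]) -> io::Result<Option<PathBuf>> {
//!     let reference_mtime = fs::metadata(reference)?.modified()?;
//!     for input in inputs {
//!         match fs::metadata(input).and_then(|m| m.modified()) {
//!             // Not modified after the reference was written.
//!             Ok(mtime) if mtime <= reference_mtime => {}
//!             // Missing, unreadable, or modified after the reference.
//!             _ => return Ok(Some(input.clone())),
//!         }
//!     }
//!     Ok(None)
//! }
//! ```
//!
//! A result of `Ok(None)` corresponds to "up-to-date" in this sketch, while
//! any returned path would mark the unit as stale.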
//!
//! Also note that although registry and git packages use [`CheckDepInfo`], none
//! of their source files are included in the dep-info (see
//! [`translate_dep_info`]), so for those kinds no mtime checking is done
//! (unless `-Zbinary-dep-depinfo` is used). Registry and git packages are
//! static, so there is no need to check anything.
//!
//! When a build is complete, the mtime of the dep-info file in the
//! fingerprint directory is modified to rewind it to the time when the build
//! started. This is done by creating an `invoked.timestamp` file when the
//! build starts to capture the start time. The mtime is rewound to the start
//! to handle the case where the user modifies a source file while a build is
//! running. Cargo can't know whether or not the file was included in the
//! build, so it takes a conservative approach of assuming the file was *not*
//! included, and it should be rebuilt during the next build.
//!
//! #### Rustdoc mtime handling
//!
//! Rustdoc does not emit a dep-info file, so Cargo currently has a relatively
//! simple system for detecting rebuilds. [`LocalFingerprint::Precalculated`] is
//! used for rustdoc units. For registry packages, this is the package
//! version. For git packages, it is the git hash. For path packages, it is
//! a string of the mtime of the newest file in the package.
//!
//! There are some known bugs with how this works, so it should be improved at
//! some point.
//!
//! #### Build script mtime handling
//!
//! Build script mtime handling runs in different modes. There is the "old
//! style" where the build script does not emit any `rerun-if` directives. In
//! this mode, Cargo will use [`LocalFingerprint::Precalculated`]. See the
//! "Rustdoc mtime handling" section above for how it works.
//!
//! In the new style, each `rerun-if` directive is translated to the
//! corresponding [`LocalFingerprint`] variant. The [`RerunIfChanged`] variant
//! compares the mtime of the given filenames against the mtime of the
//! "output" file.
//!
//! Similar to normal units, the build script "output" file mtime is rewound
//! to the time just before the build script is executed to handle mid-build
//! modifications.
//!
//! ## Considerations for inclusion in a fingerprint
//!
//! Over time we've realized a few items which historically were included in
//! fingerprint hashings should not actually be included. Examples are:
//!
//! * Modification time values. We strive to never include a modification time
//!   inside a `Fingerprint` to get hashed into an actual value. While
//!   theoretically fine to do, in practice this causes issues with common
//!   applications like Docker. Docker, after a layer is built, will zero out
//!   the nanosecond part of all filesystem modification times. This means that
//!   the actual modification time is different for all build artifacts, which
//!   if we tracked the actual values of modification times would cause
//!   unnecessary recompiles. To fix this we instead only track paths which are
//!   relevant. These paths are checked dynamically to see if they're up to
//!   date, and the modification time doesn't make its way into the fingerprint
//!   hash.
//!
//! * Absolute path names. We strive to maintain a property where if you rename
//!   a project directory Cargo will continue to preserve all build artifacts
//!   and reuse the cache. This means that we can't ever hash an absolute path
//!   name. Instead we always hash relative path names and the "root" is passed
//!   in at runtime dynamically. Some of this is best effort, but the general
//!   idea is that we assume all accesses within a crate stay within that
//!   crate.
//!
//! These are pretty tricky to test for unfortunately, but we should have a good
//! test suite nowadays and lord knows Cargo gets enough testing in the wild!
//!
//! ## Build scripts
//!
//! The *running* of a build script ([`CompileMode::RunCustomBuild`]) is treated
//! significantly differently than all other Unit kinds. It has its own function
//! for calculating the Fingerprint ([`calculate_run_custom_build`]) and has some
//! unique considerations. It does not track the same information as a normal
//! Unit. The information tracked depends on the `rerun-if-changed` and
//! `rerun-if-env-changed` statements produced by the build script. If the
//! script does not emit either of these statements, the Fingerprint runs in
//! "old style" mode where an mtime change of *any* file in the package will
//! cause the build script to be re-run. Otherwise, the fingerprint *only*
//! tracks the individual "rerun-if" items listed by the build script.
//!
//! The "rerun-if" statements from a *previous* build are stored in the build
//! output directory in a file called `output`. Cargo parses this file when
//! the Unit for that build script is prepared for the [`JobQueue`]. The
//! Fingerprint code can then use that information to compute the Fingerprint
//! and compare against the old fingerprint hash.
//!
//! Care must be taken with build script Fingerprints because the
//! [`Fingerprint::local`] value may be changed after the build script runs
//! (such as if the build script adds or removes "rerun-if" items).
//!
//! Another complication is if a build script is overridden. In that case, the
//! fingerprint is the hash of the output of the override.
//!
//! ## Special considerations
//!
//! Registry dependencies do not track the mtime of files. This is because
//! registry dependencies are not expected to change (if a new version is
//! used, the Package ID will change, causing a rebuild).
//!
//! Cargo currently partially works with Docker caching. When a Docker image
//! is built, it has normal mtime information. However, when a step is cached,
//! the nanosecond portions of all file mtimes are zeroed out. Currently this
//! works, but care must be taken for situations like these.
//!
//! HFS on macOS only supports 1-second timestamps. This causes a significant
//! number of problems, particularly with Cargo's testsuite which does rapid
//! builds in succession. Other filesystems have various degrees of
//! resolution.
//!
//! Various weird filesystems (such as network filesystems) also can cause
//! complications. Network filesystems may track the time on the server
//! (except when the time is set manually such as with
//! `filetime::set_file_times`). Not all filesystems support modifying the
//! mtime.
//!
//! See the [`A-rebuild-detection`] label on the issue tracker for more.
//!
//! [`check_filesystem`]: Fingerprint::check_filesystem
//! [`Metadata`]: crate::core::compiler::Metadata
//! [`Metadata::unit_id`]: crate::core::compiler::Metadata::unit_id
//! [`Metadata::c_metadata`]: crate::core::compiler::Metadata::c_metadata
//! [`Metadata::c_extra_filename`]: crate::core::compiler::Metadata::c_extra_filename
//! [`UnitHash`]: crate::core::compiler::UnitHash
//! [`Profile`]: crate::core::profiles::Profile
//! [`CompileMode`]: crate::core::compiler::CompileMode
//! [`Lto`]: crate::core::compiler::Lto
//! [`CompileKind`]: crate::core::compiler::CompileKind
//! [`JobQueue`]: super::job_queue::JobQueue
//! [`output_depinfo`]: super::output_depinfo()
//! [`CheckDepInfo`]: LocalFingerprint::CheckDepInfo
//! [`RerunIfChanged`]: LocalFingerprint::RerunIfChanged
//! [`CompileMode::RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
//! [`A-rebuild-detection`]: https://github.com/rust-lang/cargo/issues?q=is%3Aissue+is%3Aopen+label%3AA-rebuild-detection

mod dep_info;
mod dirty_reason;
mod rustdoc;

use std::collections::hash_map::{Entry, HashMap};
use std::env;
use std::ffi::OsString;
use std::fs;
use std::fs::File;
use std::hash::{self, Hash, Hasher};
use std::io::{self};
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::time::SystemTime;

use anyhow::Context as _;
use anyhow::format_err;
use cargo_util::paths;
use filetime::FileTime;
use serde::de;
use serde::ser;
use serde::{Deserialize, Serialize};
use tracing::{debug, info};

use crate::core::Package;
use crate::core::compiler::unit_graph::UnitDep;
use crate::util;
use crate::util::errors::CargoResult;
use crate::util::interning::InternedString;
use crate::util::log_message::LogMessage;
use crate::util::{StableHasher, internal, path_args};
use crate::{CARGO_ENV, GlobalContext};

use super::BuildContext;
use super::BuildRunner;
use super::FileFlavor;
use super::Job;
use super::Unit;
use super::UnitIndex;
use super::Work;
use super::custom_build::BuildDeps;

pub use self::dep_info::Checksum;
pub use self::dep_info::parse_dep_info;
pub use self::dep_info::parse_rustc_dep_info;
pub use self::dep_info::translate_dep_info;
pub use self::dirty_reason::DirtyReason;
pub use self::rustdoc::RustdocFingerprint;

/// Result of comparing fingerprints between the current and previous builds.
enum FingerprintComparison {
    /// The unit does not need rebuilding.
    Fresh,
    /// The unit needs rebuilding.
    Dirty {
        /// The reason why the unit is dirty.
        reason: DirtyReason,
    },
}

/// Determines if a [`Unit`] is up-to-date, and if not prepares necessary work to
/// update the persisted fingerprint.
///
/// This function will inspect `Unit`, calculate a fingerprint for it, and then
/// return an appropriate [`Job`] to run. The returned `Job` will be a noop if
/// `unit` is considered "fresh", or if it was previously built and cached.
/// Otherwise the `Job` returned will write out the true fingerprint to the
/// filesystem, to be executed after the unit's work has completed.
///
/// The `force` flag is a way to force the `Job` to be "dirty", or always
/// update the fingerprint. **Beware using this flag** because it does not
/// transitively propagate throughout the dependency graph, it only forces this
/// one unit which is very unlikely to be what you want unless you're
/// exclusively talking about top-level units.
#[tracing::instrument(
    skip(build_runner, unit),
    fields(package_id = %unit.pkg.package_id(), target = unit.target.name())
)]
pub fn prepare_target(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
    force: bool,
) -> CargoResult<Job> {
    let bcx = build_runner.bcx;
    let loc = build_runner.files().fingerprint_file_path(unit, "");

    debug!("fingerprint at: {}", loc.display());

    // Figure out if this unit is up to date. After calculating the fingerprint
    // compare it to an old version, if any, and attempt to print diagnostic
    // information about failed comparisons to aid in debugging.
    let fingerprint = calculate(build_runner, unit)?;
    let mtime_on_use = build_runner.bcx.gctx.cli_unstable().mtime_on_use;
    let dirty_reason = match compare_old_fingerprint(unit, &loc, &*fingerprint, mtime_on_use, force)
    {
        FingerprintComparison::Fresh => None,
        FingerprintComparison::Dirty { reason } => Some(reason),
    };

    if let Some(logger) = bcx.logger {
        let index = bcx.unit_to_index[unit];
        let mut cause = None;
        let status = match dirty_reason.as_ref() {
            Some(reason) if reason.is_fresh_build() => util::log_message::FingerprintStatus::New,
            Some(reason) => {
                cause = Some(reason.clone());
                util::log_message::FingerprintStatus::Dirty
            }
            None => util::log_message::FingerprintStatus::Fresh,
        };
        logger.log(LogMessage::UnitFingerprint {
            index,
            status,
            cause,
        });
    }

    let Some(dirty_reason) = dirty_reason else {
        return Ok(Job::new_fresh());
    };

    // We're going to rebuild, so ensure the source of the crate passes all
    // verification checks before we build it.
    //
    // The `Source::verify` method is intended to allow sources to execute
    // pre-build checks to ensure that the relevant source code is all
    // up-to-date and as expected. This is currently used primarily for
    // directory sources which will use this hook to perform an integrity check
    // on all files in the source to ensure they haven't changed. If they have
    // changed then an error is issued.
    let source_id = unit.pkg.package_id().source_id();
    let sources = bcx.packages.sources();
    let source = sources
        .get(source_id)
        .ok_or_else(|| internal("missing package source"))?;
    source.verify(unit.pkg.package_id())?;

    // Clear out the old fingerprint file if it exists. This protects when
    // compilation is interrupted leaving a corrupt file. For example, a
    // project with a lib.rs and integration test (two units):
    //
    // 1. Build the library and integration test.
    // 2. Make a change to lib.rs (NOT the integration test).
    // 3. Build the integration test, hit Ctrl-C while linking. With gcc, this
    //    will leave behind an incomplete executable (zero size, or partially
    //    written). NOTE: The library builds successfully, it is the linking
    //    of the integration test that we are interrupting.
    // 4. Build the integration test again.
    //
    // Without the following line, then step 3 will leave a valid fingerprint
    // on the disk. Then step 4 will think the integration test is "fresh"
    // because:
    //
    // - There is a valid fingerprint hash on disk (written in step 1).
    // - The mtime of the output file (the corrupt integration executable
    //   written in step 3) is newer than all of its dependencies.
    // - The mtime of the integration test fingerprint dep-info file (written
    //   in step 1) is newer than the integration test's source files, because
    //   we haven't modified any of its source files.
    //
    // But the executable is corrupt and needs to be rebuilt. Clearing the
    // fingerprint at step 3 ensures that Cargo never mistakes a partially
    // written output as up-to-date.
    if loc.exists() {
        // Truncate instead of delete so that compare_old_fingerprint will
        // still log the reason for the fingerprint failure instead of just
        // reporting "failed to read fingerprint" during the next build if
        // this build fails.
        paths::write(&loc, b"")?;
    }

    let write_fingerprint = if unit.mode.is_run_custom_build() {
        // For build scripts the `local` field of the fingerprint may change
        // while we're executing it. For example it could be in the legacy
        // "consider everything a dependency mode" and then we switch to "deps
        // are explicitly specified" mode.
        //
        // To handle this movement we need to regenerate the `local` field of a
        // build script's fingerprint after it's executed. We do this by
        // using the `build_script_local_fingerprints` function which returns a
        // thunk we can invoke on a foreign thread to calculate this.
        let build_script_outputs = Arc::clone(&build_runner.build_script_outputs);
        let metadata = build_runner.get_run_build_script_metadata(unit);
        let (gen_local, _overridden) = build_script_local_fingerprints(build_runner, unit)?;
        let output_path = build_runner.build_explicit_deps[unit]
            .build_script_output
            .clone();
        Work::new(move |_| {
            let outputs = build_script_outputs.lock().unwrap();
            let output = outputs
                .get(metadata)
                .expect("output must exist after running");
            let deps = BuildDeps::new(&output_path, Some(output));

            // FIXME: it's basically buggy that we pass `None` to `call_box`
            // here. See documentation on `build_script_local_fingerprints`
            // below for more information. Despite this just try to proceed and
            // hobble along if it happens to return `Some`.
            if let Some(new_local) = (gen_local)(&deps, None)? {
                *fingerprint.local.lock().unwrap() = new_local;
            }

            write_fingerprint(&loc, &fingerprint)
        })
    } else {
        Work::new(move |_| write_fingerprint(&loc, &fingerprint))
    };

    Ok(Job::new_dirty(write_fingerprint, dirty_reason))
}

/// Dependency edge information for fingerprints. This is generated for each
/// dependency and is stored in a [`Fingerprint`].
#[derive(Clone)]
struct DepFingerprint {
    /// The hash of the package id that this dependency points to
    pkg_id: u64,
    /// The crate name we're using for this dependency, which if we change we'll
    /// need to recompile!
    name: InternedString,
    /// Whether or not this dependency is flagged as a public dependency or not.
    public: bool,
    /// Whether or not this dependency is an rmeta dependency or a "full"
    /// dependency. In the case of an rmeta dependency our dependency edge only
    /// actually requires the rmeta from what we depend on, so when checking
    /// mtime information all files other than the rmeta can be ignored.
    only_requires_rmeta: bool,
    /// The dependency's fingerprint we recursively point to, containing all the
    /// other hash information we'd otherwise need.
    fingerprint: Arc<Fingerprint>,
}

/// A fingerprint can be considered to be a "short string" representing the
/// state of a world for a package.
///
/// If a fingerprint ever changes, then the package itself needs to be
/// recompiled. Inputs to the fingerprint include source code modifications,
/// compiler flags, compiler version, etc. This structure is not simply a
/// `String` due to the fact that some fingerprints cannot be calculated lazily.
///
/// Path sources, for example, use the mtime of the corresponding dep-info file
/// as a fingerprint (all source files must be modified *before* this mtime).
/// This dep-info file is not generated, however, until after the crate is
/// compiled. As a result, this structure can be thought of as a fingerprint
/// to-be. The actual value can be calculated via [`hash_u64()`], but the operation
/// may fail as some files may not have been generated.
///
/// Note that dependencies are taken into account for fingerprints because rustc
/// requires that whenever an upstream crate is recompiled that all downstream
/// dependents are also recompiled. This is typically tracked through
/// [`DependencyQueue`], but it also needs to be retained here because Cargo can
/// be interrupted while executing, losing the state of the [`DependencyQueue`]
/// graph.
///
/// [`hash_u64()`]: crate::core::compiler::fingerprint::Fingerprint::hash_u64
/// [`DependencyQueue`]: crate::util::DependencyQueue
#[derive(Serialize, Deserialize)]
pub struct Fingerprint {
    /// Hash of the version of `rustc` used.
    rustc: u64,
    /// Sorted list of cfg features enabled.
    features: String,
    /// Sorted list of all the declared cfg features.
    declared_features: String,
    /// Hash of the `Target` struct, including the target name,
    /// package-relative source path, edition, etc.
    target: u64,
    /// Hash of the [`Profile`], [`CompileMode`], and any extra flags passed via
    /// `cargo rustc` or `cargo rustdoc`.
    ///
    /// [`Profile`]: crate::core::profiles::Profile
641    target: u64,
642    /// Hash of the [`Profile`], [`CompileMode`], and any extra flags passed via
643    /// `cargo rustc` or `cargo rustdoc`.
644    ///
645    /// [`Profile`]: crate::core::profiles::Profile
646    /// [`CompileMode`]: crate::core::compiler::CompileMode
647    profile: u64,
648    /// Hash of the path to the base source file. This is relative to the
649    /// workspace root for path members, or absolute for other sources.
650    path: u64,
651    /// Fingerprints of dependencies.
652    deps: Vec<DepFingerprint>,
653    /// Information about the inputs that affect this Unit (such as source
654    /// file mtimes or build script environment variables).
655    local: Mutex<Vec<LocalFingerprint>>,
656    /// Cached hash of the [`Fingerprint`] struct. Used to improve performance
657    /// for hashing.
658    #[serde(skip)]
659    memoized_hash: Mutex<Option<u64>>,
660    /// RUSTFLAGS/RUSTDOCFLAGS environment variable value (or config value).
661    rustflags: Vec<String>,
662    /// Hash of various config settings that change how things are compiled.
663    config: u64,
664    /// The rustc target. This is only relevant for `.json` files, otherwise
665    /// the metadata hash segregates the units.
666    compile_kind: u64,
667    /// Unit index for this fingerprint, used for tracing cascading rebuilds.
668    /// Not persisted to disk as indices can change between builds.
669    #[serde(skip)]
670    index: UnitIndex,
671    /// Description of whether the filesystem status for this unit is up to date
672    /// or should be considered stale.
673    #[serde(skip)]
674    fs_status: FsStatus,
675    /// Files, relative to `target_root`, that are produced by the step that
676    /// this `Fingerprint` represents. This is used to detect when the whole
677    /// fingerprint is out of date if any of these are missing, or if previous
678    /// fingerprints' output files are regenerated and look newer than these.
679    #[serde(skip)]
680    outputs: Vec<PathBuf>,
681}
682
683/// Indication of the status on the filesystem for a particular unit.
684#[derive(Clone, Default, Debug, Serialize, Deserialize)]
685#[serde(tag = "fs_status", rename_all = "kebab-case")]
686pub enum FsStatus {
687    /// This unit is to be considered stale, even if hash information all
688    /// matches.
689    #[default]
690    Stale,
691
692    /// File system inputs have changed (or are missing), or there were
693    /// changes to the environment variables that affect this unit. See
694    /// the variants of [`StaleItem`] for more information.
695    StaleItem(StaleItem),
696
697    /// A dependency was stale.
698    StaleDependency {
699        unit: UnitIndex,
700        #[serde(with = "serde_file_time")]
701        dep_mtime: FileTime,
702        #[serde(with = "serde_file_time")]
703        max_mtime: FileTime,
704    },
705
706    /// A dependency's fingerprint was stale.
707    StaleDepFingerprint { unit: UnitIndex },
708
709    /// This unit is up-to-date. All outputs and their corresponding mtime are
710    /// listed in the payload here for other dependencies to compare against.
711    #[serde(skip)]
712    UpToDate { mtimes: HashMap<PathBuf, FileTime> },
713}
714
715impl FsStatus {
716    fn up_to_date(&self) -> bool {
717        match self {
718            FsStatus::UpToDate { .. } => true,
719            FsStatus::Stale
720            | FsStatus::StaleItem(_)
721            | FsStatus::StaleDependency { .. }
722            | FsStatus::StaleDepFingerprint { .. } => false,
723        }
724    }
725}
726
727mod serde_file_time {
728    use filetime::FileTime;
729    use serde::Deserialize;
730    use serde::Serialize;
731
732    /// Serialize `FileTime` as milliseconds, with nanoseconds in the fraction.
733    pub(super) fn serialize<S>(ft: &FileTime, s: S) -> Result<S::Ok, S::Error>
734    where
735        S: serde::Serializer,
736    {
737        let secs_as_millis = ft.unix_seconds() as f64 * 1000.0;
738        let nanos_as_millis = ft.nanoseconds() as f64 / 1_000_000.0;
739        (secs_as_millis + nanos_as_millis).serialize(s)
740    }
741
742    /// Deserialize `FileTime` from milliseconds, with nanoseconds in the fraction.
743    pub(super) fn deserialize<'de, D>(d: D) -> Result<FileTime, D::Error>
744    where
745        D: serde::Deserializer<'de>,
746    {
747        let millis = f64::deserialize(d)?;
748        let secs = (millis / 1000.0) as i64;
749        let nanos = ((millis % 1000.0) * 1_000_000.0) as u32;
750        Ok(FileTime::from_unix_time(secs, nanos))
751    }
752}
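
// The round trip performed by `serde_file_time` can be sketched without the
// `filetime` crate. This is an illustration only, not Cargo's code: `to_millis`
// and `from_millis` are hypothetical helpers that work on a plain
// `(seconds, nanoseconds)` pair and assume a non-negative timestamp.

```rust
/// Encode `(secs, nanos)` as fractional milliseconds, mirroring `serialize`.
fn to_millis(secs: i64, nanos: u32) -> f64 {
    secs as f64 * 1000.0 + nanos as f64 / 1_000_000.0
}

/// Decode fractional milliseconds back into `(secs, nanos)`, mirroring
/// `deserialize`. The casts truncate toward zero.
fn from_millis(millis: f64) -> (i64, u32) {
    let secs = (millis / 1000.0) as i64;
    let nanos = ((millis % 1000.0) * 1_000_000.0) as u32;
    (secs, nanos)
}
```

// An `f64` carries a 53-bit significand, so for present-day timestamps (on the
// order of 1.7e12 ms) this encoding resolves roughly a quarter of a
// microsecond. Whole-millisecond values survive the round trip exactly, while
// arbitrary nanosecond values may come back a few hundred nanoseconds off.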
753
754impl Serialize for DepFingerprint {
755    fn serialize<S>(&self, ser: S) -> Result<S::Ok, S::Error>
756    where
757        S: ser::Serializer,
758    {
759        (
760            &self.pkg_id,
761            &self.name,
762            &self.public,
763            &self.fingerprint.hash_u64(),
764        )
765            .serialize(ser)
766    }
767}
768
769impl<'de> Deserialize<'de> for DepFingerprint {
770    fn deserialize<D>(d: D) -> Result<DepFingerprint, D::Error>
771    where
772        D: de::Deserializer<'de>,
773    {
774        let (pkg_id, name, public, hash) = <(u64, String, bool, u64)>::deserialize(d)?;
775        Ok(DepFingerprint {
776            pkg_id,
777            name: name.into(),
778            public,
779            fingerprint: Arc::new(Fingerprint {
780                memoized_hash: Mutex::new(Some(hash)),
781                ..Fingerprint::new()
782            }),
783            // This field is never read since it's only used in
784            // `check_filesystem` which isn't used by fingerprints loaded from
785            // disk.
786            only_requires_rmeta: false,
787        })
788    }
789}
790
791/// A `LocalFingerprint` represents something that we use to detect direct
792/// changes to a `Fingerprint`.
793///
794/// This is where we track file information, env vars, etc. This
795/// `LocalFingerprint` struct is hashed and if the hash changes will force a
796/// recompile of any fingerprint it's included into. Note that the "local"
797/// terminology comes from the fact that it only has to do with one crate, and
798/// `Fingerprint` tracks the transitive propagation of fingerprint changes.
799///
800    /// Note that because this is hashed its contents are carefully managed. As
801    /// mentioned in the module docs above, we don't want to hash absolute paths or
802    /// mtime information.
803///
804/// Also note that a `LocalFingerprint` is used in `check_filesystem` to detect
805/// when the filesystem contains stale information (based on mtime currently).
806/// The paths here don't change much between compilations but they're used as
807/// inputs when we probe the filesystem looking at information.
808#[derive(Debug, Serialize, Deserialize, Hash)]
809enum LocalFingerprint {
810    /// This is a precalculated fingerprint which has an opaque string we just
811    /// hash as usual. This variant is primarily used for rustdoc where we
812    /// don't have a dep-info file to compare against.
813    ///
814    /// This is also used for build scripts with no `rerun-if-*` statements, but
815    /// that's overall a mistake and causes bugs in Cargo. We shouldn't use this
816    /// for build scripts.
817    Precalculated(String),
818
819    /// This is used for crate compilations. The `dep_info` file is a relative
820    /// path anchored at `target_root(...)` to the dep-info file that Cargo
821    /// generates (which is a custom serialization after parsing rustc's own
822    /// `dep-info` output).
823    ///
824    /// The `dep_info` file, when present, also lists a number of other files
825    /// for us to look at. If any of those files are newer than this file then
826    /// we need to recompile.
827    ///
828    /// If the `checksum` bool is true then the `dep_info` file is expected to
829    /// contain file checksums instead of file mtimes.
830    CheckDepInfo { dep_info: PathBuf, checksum: bool },
831
832    /// This represents a nonempty set of `rerun-if-changed` annotations printed
833    /// out by a build script. The `output` file is a relative file anchored at
834    /// `target_root(...)` which is the actual output of the build script. That
835    /// output has already been parsed and the paths printed out via
836    /// `rerun-if-changed` are listed in `paths`. The `paths` field is relative
837    /// to `pkg.root()`.
838    ///
839    /// This is considered up-to-date if all of the `paths` are older than
840    /// `output`, otherwise we need to recompile.
841    RerunIfChanged {
842        output: PathBuf,
843        paths: Vec<PathBuf>,
844    },
845
846    /// This represents a single `rerun-if-env-changed` annotation printed by a
847    /// build script. The exact env var and value are hashed here. There's no
848    /// filesystem dependence here, and if the values are changed the hash will
849    /// change forcing a recompile.
850    RerunIfEnvChanged { var: String, val: Option<String> },
851}
852
853/// See [`FsStatus::StaleItem`].
854#[derive(Clone, Debug, Serialize, Deserialize)]
855#[serde(tag = "stale_item", rename_all = "kebab-case")]
856pub enum StaleItem {
857    MissingFile {
858        path: PathBuf,
859    },
860    UnableToReadFile {
861        path: PathBuf,
862    },
863    FailedToReadMetadata {
864        path: PathBuf,
865    },
866    FileSizeChanged {
867        path: PathBuf,
868        old_size: u64,
869        new_size: u64,
870    },
871    ChangedFile {
872        reference: PathBuf,
873        #[serde(with = "serde_file_time")]
874        reference_mtime: FileTime,
875        stale: PathBuf,
876        #[serde(with = "serde_file_time")]
877        stale_mtime: FileTime,
878    },
879    ChangedChecksum {
880        source: PathBuf,
881        stored_checksum: Checksum,
882        new_checksum: Checksum,
883    },
884    MissingChecksum {
885        path: PathBuf,
886    },
887    ChangedEnv {
888        var: String,
889        previous: Option<String>,
890        current: Option<String>,
891    },
892}
893
894impl LocalFingerprint {
895    /// Reads the environment variable for the given `key` and creates a new
896    /// [`LocalFingerprint::RerunIfEnvChanged`] for it. The `env_config` is checked first,
897    /// since some env vars need to be overridden by the config system.
898    /// If the key is not set there, this falls back to `std::env::var`.
899    ///
900    // TODO: `std::env::var` is allowed at this moment. We should figure out
901    // whether it makes sense to permit reading env from the env snapshot.
902    #[allow(clippy::disallowed_methods)]
903    fn from_env<K: AsRef<str>>(
904        key: K,
905        env_config: &Arc<HashMap<String, OsString>>,
906    ) -> LocalFingerprint {
907        let key = key.as_ref();
908        let var = key.to_owned();
909        let val = if let Some(val) = env_config.get(key) {
910            val.to_str().map(ToOwned::to_owned)
911        } else {
912            env::var(key).ok()
913        };
914        LocalFingerprint::RerunIfEnvChanged { var, val }
915    }
916
917    /// Checks dynamically at runtime if this `LocalFingerprint` has a stale
918    /// item inside of it.
919    ///
920    /// The main purpose of this function is to handle two different ways
921    /// fingerprints can be invalidated:
922    ///
923    /// * One is that a dependency listed in rustc's dep-info files is invalid. Note
924    ///   that these could either be env vars or files. We check both here.
925    ///
926    /// * Another is the `rerun-if-changed` directive from build scripts. This
927    ///   is where we'll find whether files have actually changed.
928    fn find_stale_item(
929        &self,
930        mtime_cache: &mut HashMap<PathBuf, FileTime>,
931        checksum_cache: &mut HashMap<PathBuf, Checksum>,
932        pkg: &Package,
933        build_root: &Path,
934        cargo_exe: &Path,
935        gctx: &GlobalContext,
936    ) -> CargoResult<Option<StaleItem>> {
937        let pkg_root = pkg.root();
938        match self {
939            // We need to parse `dep_info` to learn about the crate's dependencies.
940            //
941            // For each env var we see if our current process's env var still
942            // matches, and for each file we see if any of them are newer than
943            // the `dep_info` file itself whose mtime represents the start of
944            // rustc.
945            LocalFingerprint::CheckDepInfo { dep_info, checksum } => {
946                let dep_info = build_root.join(dep_info);
947                let Some(info) = parse_dep_info(pkg_root, build_root, &dep_info)? else {
948                    return Ok(Some(StaleItem::MissingFile { path: dep_info }));
949                };
950                for (key, previous) in info.env.iter() {
951                    if let Some(value) = pkg.manifest().metadata().env_var(key.as_str()) {
952                        if Some(value.as_ref()) == previous.as_deref() {
953                            continue;
954                        }
955                    }
956
957                    let current = if key == CARGO_ENV {
958                        Some(cargo_exe.to_str().ok_or_else(|| {
959                            format_err!(
960                                "cargo exe path {} must be valid UTF-8",
961                                cargo_exe.display()
962                            )
963                        })?)
964                    } else {
965                        if let Some(value) = gctx.env_config()?.get(key) {
966                            value.to_str()
967                        } else {
968                            gctx.get_env(key).ok()
969                        }
970                    };
971                    if current == previous.as_deref() {
972                        continue;
973                    }
974                    return Ok(Some(StaleItem::ChangedEnv {
975                        var: key.clone(),
976                        previous: previous.clone(),
977                        current: current.map(Into::into),
978                    }));
979                }
980                if *checksum {
981                    Ok(find_stale_file(
982                        mtime_cache,
983                        checksum_cache,
984                        &dep_info,
985                        info.files.iter().map(|(file, checksum)| (file, *checksum)),
986                        *checksum,
987                    ))
988                } else {
989                    Ok(find_stale_file(
990                        mtime_cache,
991                        checksum_cache,
992                        &dep_info,
993                        info.files.into_keys().map(|p| (p, None)),
994                        *checksum,
995                    ))
996                }
997            }
998
999            // We need to verify that no paths listed in `paths` are newer than
1000            // the `output` path itself, or the last time the build script ran.
1001            LocalFingerprint::RerunIfChanged { output, paths } => Ok(find_stale_file(
1002                mtime_cache,
1003                checksum_cache,
1004                &build_root.join(output),
1005                paths.iter().map(|p| (pkg_root.join(p), None)),
1006                false,
1007            )),
1008
1009            // These have no dependencies on the filesystem, and their values
1010            // are included natively in the `Fingerprint` hash so nothing
1011            // to check for here.
1012            LocalFingerprint::RerunIfEnvChanged { .. } => Ok(None),
1013            LocalFingerprint::Precalculated(..) => Ok(None),
1014        }
1015    }
1016
1017    fn kind(&self) -> &'static str {
1018        match self {
1019            LocalFingerprint::Precalculated(..) => "precalculated",
1020            LocalFingerprint::CheckDepInfo { .. } => "dep-info",
1021            LocalFingerprint::RerunIfChanged { .. } => "rerun-if-changed",
1022            LocalFingerprint::RerunIfEnvChanged { .. } => "rerun-if-env-changed",
1023        }
1024    }
1025}
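
// The lookup order used by `from_env` (config override first, process
// environment second) can be sketched as below. This is an illustration under
// stated assumptions, not Cargo's code: `resolve_env` is a hypothetical helper,
// and the process environment is passed in as a map so the behaviour is
// deterministic.

```rust
use std::collections::HashMap;
use std::ffi::OsString;

/// Hypothetical helper: resolve `key` from config overrides first, then from
/// a snapshot of the process environment.
fn resolve_env(
    key: &str,
    env_config: &HashMap<String, OsString>,
    process_env: &HashMap<String, String>,
) -> Option<String> {
    if let Some(val) = env_config.get(key) {
        // A config entry wins even when it is not valid UTF-8; in that case
        // the result is `None` rather than the process value.
        val.to_str().map(ToOwned::to_owned)
    } else {
        process_env.get(key).cloned()
    }
}
```

// The resulting `Option<String>` is what ends up in the `val` field of
// `RerunIfEnvChanged`, so an unset variable (`None`) and an empty one
// (`Some("")`) are distinct values.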
1026
1027impl Fingerprint {
1028    fn new() -> Fingerprint {
1029        Fingerprint {
1030            rustc: 0,
1031            target: 0,
1032            profile: 0,
1033            path: 0,
1034            features: String::new(),
1035            declared_features: String::new(),
1036            deps: Vec::new(),
1037            local: Mutex::new(Vec::new()),
1038            memoized_hash: Mutex::new(None),
1039            rustflags: Vec::new(),
1040            config: 0,
1041            compile_kind: 0,
1042            index: UnitIndex::default(),
1043            fs_status: FsStatus::Stale,
1044            outputs: Vec::new(),
1045        }
1046    }
1047
1048    /// For performance reasons fingerprints will memoize their own hash, but
1049    /// there's also internal mutability with its `local` field which can
1050    /// change, for example with build scripts, during a build.
1051    ///
1052    /// This method can be used to bust all memoized hashes just before a build
1053    /// to ensure that after a build completes everything is up-to-date.
1054    pub fn clear_memoized(&self) {
1055        *self.memoized_hash.lock().unwrap() = None;
1056    }
1057
1058    fn hash_u64(&self) -> u64 {
1059        if let Some(s) = *self.memoized_hash.lock().unwrap() {
1060            return s;
1061        }
1062        let ret = util::hash_u64(self);
1063        *self.memoized_hash.lock().unwrap() = Some(ret);
1064        ret
1065    }
1066
1067    /// Compares this fingerprint with an old version which was previously
1068    /// serialized to filesystem.
1069    ///
1070    /// The purpose of this is exclusively to produce a diagnostic message
1071    /// [`DirtyReason`], indicating why we're recompiling something.
1072    fn compare(&self, old: &Fingerprint) -> DirtyReason {
1073        if self.rustc != old.rustc {
1074            return DirtyReason::RustcChanged;
1075        }
1076        if self.features != old.features {
1077            return DirtyReason::FeaturesChanged {
1078                old: old.features.clone(),
1079                new: self.features.clone(),
1080            };
1081        }
1082        if self.declared_features != old.declared_features {
1083            return DirtyReason::DeclaredFeaturesChanged {
1084                old: old.declared_features.clone(),
1085                new: self.declared_features.clone(),
1086            };
1087        }
1088        if self.target != old.target {
1089            return DirtyReason::TargetConfigurationChanged;
1090        }
1091        if self.path != old.path {
1092            return DirtyReason::PathToSourceChanged;
1093        }
1094        if self.profile != old.profile {
1095            return DirtyReason::ProfileConfigurationChanged;
1096        }
1097        if self.rustflags != old.rustflags {
1098            return DirtyReason::RustflagsChanged {
1099                old: old.rustflags.clone(),
1100                new: self.rustflags.clone(),
1101            };
1102        }
1103        if self.config != old.config {
1104            return DirtyReason::ConfigSettingsChanged;
1105        }
1106        if self.compile_kind != old.compile_kind {
1107            return DirtyReason::CompileKindChanged;
1108        }
1109        let my_local = self.local.lock().unwrap();
1110        let old_local = old.local.lock().unwrap();
1111        if my_local.len() != old_local.len() {
1112            return DirtyReason::LocalLengthsChanged;
1113        }
1114        for (new, old) in my_local.iter().zip(old_local.iter()) {
1115            match (new, old) {
1116                (LocalFingerprint::Precalculated(a), LocalFingerprint::Precalculated(b)) => {
1117                    if a != b {
1118                        return DirtyReason::PrecalculatedComponentsChanged {
1119                            old: b.to_string(),
1120                            new: a.to_string(),
1121                        };
1122                    }
1123                }
1124                (
1125                    LocalFingerprint::CheckDepInfo {
1126                        dep_info: a_dep,
1127                        checksum: checksum_a,
1128                    },
1129                    LocalFingerprint::CheckDepInfo {
1130                        dep_info: b_dep,
1131                        checksum: checksum_b,
1132                    },
1133                ) => {
1134                    if a_dep != b_dep {
1135                        return DirtyReason::DepInfoOutputChanged {
1136                            old: b_dep.clone(),
1137                            new: a_dep.clone(),
1138                        };
1139                    }
1140                    if checksum_a != checksum_b {
1141                        return DirtyReason::ChecksumUseChanged { old: *checksum_b };
1142                    }
1143                }
1144                (
1145                    LocalFingerprint::RerunIfChanged {
1146                        output: a_out,
1147                        paths: a_paths,
1148                    },
1149                    LocalFingerprint::RerunIfChanged {
1150                        output: b_out,
1151                        paths: b_paths,
1152                    },
1153                ) => {
1154                    if a_out != b_out {
1155                        return DirtyReason::RerunIfChangedOutputFileChanged {
1156                            old: b_out.clone(),
1157                            new: a_out.clone(),
1158                        };
1159                    }
1160                    if a_paths != b_paths {
1161                        return DirtyReason::RerunIfChangedOutputPathsChanged {
1162                            old: b_paths.clone(),
1163                            new: a_paths.clone(),
1164                        };
1165                    }
1166                }
1167                (
1168                    LocalFingerprint::RerunIfEnvChanged {
1169                        var: a_key,
1170                        val: a_value,
1171                    },
1172                    LocalFingerprint::RerunIfEnvChanged {
1173                        var: b_key,
1174                        val: b_value,
1175                    },
1176                ) => {
1177                    if *a_key != *b_key {
1178                        return DirtyReason::EnvVarsChanged {
1179                            old: b_key.clone(),
1180                            new: a_key.clone(),
1181                        };
1182                    }
1183                    if *a_value != *b_value {
1184                        return DirtyReason::EnvVarChanged {
1185                            name: a_key.clone(),
1186                            old_value: b_value.clone(),
1187                            new_value: a_value.clone(),
1188                        };
1189                    }
1190                }
1191                (a, b) => {
1192                    return DirtyReason::LocalFingerprintTypeChanged {
1193                        old: b.kind().to_owned(),
1194                        new: a.kind().to_owned(),
1195                    };
1196                }
1197            }
1198        }
1199
1200        if self.deps.len() != old.deps.len() {
1201            return DirtyReason::NumberOfDependenciesChanged {
1202                old: old.deps.len(),
1203                new: self.deps.len(),
1204            };
1205        }
1206        for (a, b) in self.deps.iter().zip(old.deps.iter()) {
1207            if a.name != b.name {
1208                return DirtyReason::UnitDependencyNameChanged {
1209                    old: b.name,
1210                    new: a.name,
1211                };
1212            }
1213
1214            if a.fingerprint.hash_u64() != b.fingerprint.hash_u64() {
1215                return DirtyReason::UnitDependencyInfoChanged {
1216                    unit: a.fingerprint.index,
1217                };
1218            }
1219        }
1220
1221        if !self.fs_status.up_to_date() {
1222            return DirtyReason::FsStatusOutdated(self.fs_status.clone());
1223        }
1224
1225        // This typically means some filesystem modifications happened or
1226        // something transitive was odd. In general we should strive to provide
1227        // a better error message than this, so if you see this message a lot it
1228        // likely means this method needs to be updated!
1229        DirtyReason::NothingObvious
1230    }
1231
1232    /// Dynamically inspect the local filesystem to update the `fs_status` field
1233    /// of this `Fingerprint`.
1234    ///
1235    /// This function is used just after a `Fingerprint` is constructed to check
1236    /// the local state of the filesystem and propagate any dirtiness from
1237    /// dependencies up to this unit as well. This function assumes that the
1238    /// unit starts out as [`FsStatus::Stale`] and then it will optionally switch
1239    /// it to `UpToDate` if it can.
1240    fn check_filesystem(
1241        &mut self,
1242        mtime_cache: &mut HashMap<PathBuf, FileTime>,
1243        checksum_cache: &mut HashMap<PathBuf, Checksum>,
1244        pkg: &Package,
1245        build_root: &Path,
1246        cargo_exe: &Path,
1247        gctx: &GlobalContext,
1248    ) -> CargoResult<()> {
1249        assert!(!self.fs_status.up_to_date());
1250
1251        let pkg_root = pkg.root();
1252        let mut mtimes = HashMap::new();
1253
1254        // Get the `mtime` of all outputs. Optionally update their mtime
1255        // afterwards based on the `mtime_on_use` flag. Afterwards we want the
1256            // maximum mtime as it's the one we'll be comparing to inputs and
1257        // dependencies.
1258        for output in self.outputs.iter() {
1259            let Ok(mtime) = paths::mtime(output) else {
1260                // This path failed to report its `mtime`. It probably doesn't
1261                // exist, so leave ourselves as stale and bail out.
1262                let item = StaleItem::FailedToReadMetadata {
1263                    path: output.clone(),
1264                };
1265                self.fs_status = FsStatus::StaleItem(item);
1266                return Ok(());
1267            };
1268            assert!(mtimes.insert(output.clone(), mtime).is_none());
1269        }
1270
1271        let opt_max = mtimes.iter().max_by_key(|kv| kv.1);
1272        let Some((max_path, max_mtime)) = opt_max else {
1273            // We had no output files. This means we're an overridden build
1274            // script and we're just always up to date because we aren't
1275            // watching the filesystem.
1276            self.fs_status = FsStatus::UpToDate { mtimes };
1277            return Ok(());
1278        };
1279        debug!(
1280            "max output mtime for {:?} is {:?} {}",
1281            pkg_root, max_path, max_mtime
1282        );
1283
1284        for dep in self.deps.iter() {
1285            let dep_mtimes = match &dep.fingerprint.fs_status {
1286                FsStatus::UpToDate { mtimes } => mtimes,
1287                // If our dependency is stale, so are we, so bail out.
1288                FsStatus::Stale
1289                | FsStatus::StaleItem(_)
1290                | FsStatus::StaleDependency { .. }
1291                | FsStatus::StaleDepFingerprint { .. } => {
1292                    self.fs_status = FsStatus::StaleDepFingerprint {
1293                        unit: dep.fingerprint.index,
1294                    };
1295                    return Ok(());
1296                }
1297            };
1298
1299            // If our dependency edge only requires the rmeta file to be present
1300            // then we only need to look at that one output file, otherwise we
1301            // need to consider all output files to see if we're out of date.
1302            let (dep_path, dep_mtime) = if dep.only_requires_rmeta {
1303                dep_mtimes
1304                    .iter()
1305                    .find(|(path, _mtime)| {
1306                        path.extension().and_then(|s| s.to_str()) == Some("rmeta")
1307                    })
1308                    .expect("failed to find rmeta")
1309            } else {
1310                match dep_mtimes.iter().max_by_key(|kv| kv.1) {
1311                    Some(dep_mtime) => dep_mtime,
1312                    // If our dependency is up to date and has no filesystem
1313                    // interactions, then we can move on to the next dependency.
1314                    None => continue,
1315                }
1316            };
1317            debug!(
1318                "max dep mtime for {:?} is {:?} {}",
1319                pkg_root, dep_path, dep_mtime
1320            );
1321
1322            // If the dependency is newer than our own output then it was
1323            // recompiled previously. We transitively become stale ourselves in
1324            // that case, so bail out.
1325            //
1326            // Note that this comparison should probably be `>=`, not `>`, but
1327            // for a discussion of why it's `>` see the discussion about #5918
1328            // below in `find_stale_file`.
1329            if dep_mtime > max_mtime {
1330                info!(
1331                    "dependency on `{}` is newer than we are {} > {} {:?}",
1332                    dep.name, dep_mtime, max_mtime, pkg_root
1333                );
1334
1335                self.fs_status = FsStatus::StaleDependency {
1336                    unit: dep.fingerprint.index,
1337                    dep_mtime: *dep_mtime,
1338                    max_mtime: *max_mtime,
1339                };
1340
1341                return Ok(());
1342            }
1343        }
1344
1345        // If we reached this far then all dependencies are up to date. Check
1346        // all our `LocalFingerprint` information to see if we have any stale
1347        // files for this package itself. If we do find something log a helpful
1348        // message and bail out so we stay stale.
1349        for local in self.local.get_mut().unwrap().iter() {
1350            if let Some(item) = local.find_stale_item(
1351                mtime_cache,
1352                checksum_cache,
1353                pkg,
1354                build_root,
1355                cargo_exe,
1356                gctx,
1357            )? {
1358                item.log();
1359                self.fs_status = FsStatus::StaleItem(item);
1360                return Ok(());
1361            }
1362        }
1363
1364        // Everything was up to date! Record such.
1365        self.fs_status = FsStatus::UpToDate { mtimes };
1366        debug!("filesystem up-to-date {:?}", pkg_root);
1367
1368        Ok(())
1369    }
1370}
1371
impl hash::Hash for Fingerprint {
    fn hash<H: Hasher>(&self, h: &mut H) {
        let Fingerprint {
            rustc,
            ref features,
            ref declared_features,
            target,
            path,
            profile,
            ref deps,
            ref local,
            config,
            compile_kind,
            ref rustflags,
            ..
        } = *self;
        let local = local.lock().unwrap();
        (
            rustc,
            features,
            declared_features,
            target,
            path,
            profile,
            &*local,
            config,
            compile_kind,
            rustflags,
        )
            .hash(h);

        h.write_usize(deps.len());
        for DepFingerprint {
            pkg_id,
            name,
            public,
            fingerprint,
            only_requires_rmeta: _, // static property, no need to hash
        } in deps
        {
            pkg_id.hash(h);
            name.hash(h);
            public.hash(h);
            // use memoized dep hashes to avoid exponential blowup
            h.write_u64(fingerprint.hash_u64());
        }
    }
}

impl DepFingerprint {
    fn new(
        build_runner: &mut BuildRunner<'_, '_>,
        parent: &Unit,
        dep: &UnitDep,
    ) -> CargoResult<DepFingerprint> {
        let fingerprint = calculate(build_runner, &dep.unit)?;
        // We need to be careful about what we hash here. We have a goal of
        // supporting renaming a project directory and not rebuilding
        // everything. To do that, however, we need to make sure that the cwd
        // doesn't make its way into any hashes, and one source of that is the
        // `SourceId` for `path` packages.
        //
        // We already have a requirement that `path` packages all have unique
        // names (sort of for this same reason), so if the package source is a
        // `path` then we just hash the name, but otherwise we hash the full
        // id as it won't change when the directory is renamed.
        let pkg_id = if dep.unit.pkg.package_id().source_id().is_path() {
            util::hash_u64(dep.unit.pkg.package_id().name())
        } else {
            util::hash_u64(dep.unit.pkg.package_id())
        };

        Ok(DepFingerprint {
            pkg_id,
            name: dep.extern_crate_name,
            public: dep.public,
            fingerprint,
            only_requires_rmeta: build_runner.only_requires_rmeta(parent, &dep.unit),
        })
    }
}

impl StaleItem {
    /// Use the `log` crate to log a hopefully helpful message in diagnosing
    /// what file is considered stale and why. This is intended to be used in
    /// conjunction with `CARGO_LOG` to determine why Cargo is recompiling
    /// something. Currently there's no user-facing usage of this other than
    /// that.
    fn log(&self) {
        match self {
            StaleItem::MissingFile { path } => {
                info!("stale: missing {:?}", path);
            }
            StaleItem::UnableToReadFile { path } => {
                info!("stale: unable to read {:?}", path);
            }
            StaleItem::FailedToReadMetadata { path } => {
                info!("stale: couldn't read metadata {:?}", path);
            }
            StaleItem::ChangedFile {
                reference,
                reference_mtime,
                stale,
                stale_mtime,
            } => {
                info!("stale: changed {:?}", stale);
                info!("          (vs) {:?}", reference);
                info!("               {:?} < {:?}", reference_mtime, stale_mtime);
            }
            StaleItem::FileSizeChanged {
                path,
                new_size,
                old_size,
            } => {
                info!("stale: changed {:?}", path);
                info!("prior file size {old_size}");
                info!("  new file size {new_size}");
            }
            StaleItem::ChangedChecksum {
                source,
                stored_checksum,
                new_checksum,
            } => {
                info!("stale: changed {:?}", source);
                info!("prior checksum {stored_checksum}");
                info!("  new checksum {new_checksum}");
            }
            StaleItem::MissingChecksum { path } => {
                info!("stale: no prior checksum {:?}", path);
            }
            StaleItem::ChangedEnv {
                var,
                previous,
                current,
            } => {
                info!("stale: changed env {:?}", var);
                info!("       {:?} != {:?}", previous, current);
            }
        }
    }
}

/// Calculates the fingerprint for a [`Unit`].
///
/// This fingerprint is used by Cargo to learn about when information such as:
///
/// * A non-path package changes (changes version, changes revision, etc).
/// * Any dependency changes
/// * The compiler changes
/// * The set of features a package is built with changes
/// * The profile a target is compiled with changes (e.g., opt-level changes)
/// * Any other compiler flags change that will affect the result
///
/// Information like file modification time is only calculated for path
/// dependencies.
fn calculate(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> CargoResult<Arc<Fingerprint>> {
    // This function is slammed quite a lot, so the result is memoized.
    if let Some(s) = build_runner.fingerprints.get(unit) {
        return Ok(Arc::clone(s));
    }
    let mut fingerprint = if unit.mode.is_run_custom_build() {
        calculate_run_custom_build(build_runner, unit)?
    } else if unit.mode.is_doc_test() {
        panic!("doc tests do not fingerprint");
    } else {
        calculate_normal(build_runner, unit)?
    };

    // After we've built the initial `Fingerprint`, be sure to update its
    // `fs_status` field.
    let build_root = build_root(build_runner);
    let cargo_exe = build_runner.bcx.gctx.cargo_exe()?;
    fingerprint.check_filesystem(
        &mut build_runner.mtime_cache,
        &mut build_runner.checksum_cache,
        &unit.pkg,
        &build_root,
        cargo_exe,
        build_runner.bcx.gctx,
    )?;

    let fingerprint = Arc::new(fingerprint);
    build_runner
        .fingerprints
        .insert(unit.clone(), Arc::clone(&fingerprint));
    Ok(fingerprint)
}

/// Calculate a fingerprint for a "normal" unit, or anything that's not a build
/// script. This is an internal helper of [`calculate`], don't call directly.
fn calculate_normal(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<Fingerprint> {
    let deps = {
        // Recursively calculate the fingerprint for all of our dependencies.
        //
        // Skip fingerprints of binaries because they don't actually induce a
        // recompile, they're just dependencies in the sense that they need to be
        // built. The only exception is artifact dependencies, which are
        // actual dependencies that need a recompile.
        //
        // Create Vec since mutable build_runner is needed in closure.
        let deps = Vec::from(build_runner.unit_deps(unit));
        let mut deps = deps
            .into_iter()
            .filter(|dep| !dep.unit.target.is_bin() || dep.unit.artifact.is_true())
            .map(|dep| DepFingerprint::new(build_runner, unit, &dep))
            .collect::<CargoResult<Vec<_>>>()?;
        deps.sort_by(|a, b| a.pkg_id.cmp(&b.pkg_id));
        deps
    };

    // Afterwards calculate our own fingerprint information.
    let build_root = build_root(build_runner);
    let is_any_doc_gen = unit.mode.is_doc() || unit.mode.is_doc_scrape();
    let rustdoc_depinfo_enabled = build_runner.bcx.gctx.cli_unstable().rustdoc_depinfo;
    let local = if is_any_doc_gen && !rustdoc_depinfo_enabled {
        // rustdoc does not have dep-info files.
        let fingerprint = pkg_fingerprint(build_runner.bcx, &unit.pkg).with_context(|| {
            format!(
                "failed to determine package fingerprint for documenting {}",
                unit.pkg
            )
        })?;
        vec![LocalFingerprint::Precalculated(fingerprint)]
    } else {
        let dep_info = dep_info_loc(build_runner, unit);
        let dep_info = dep_info.strip_prefix(&build_root).unwrap().to_path_buf();
        vec![LocalFingerprint::CheckDepInfo {
            dep_info,
            checksum: build_runner.bcx.gctx.cli_unstable().checksum_freshness,
        }]
    };

    // Figure out what the outputs of our unit are, and we'll be storing them
    // into the fingerprint as well.
    let outputs = build_runner
        .outputs(unit)?
        .iter()
        .filter(|output| {
            !matches!(
                output.flavor,
                FileFlavor::DebugInfo | FileFlavor::Auxiliary | FileFlavor::Sbom
            )
        })
        .map(|output| output.path.clone())
        .collect();

    // Fill out a bunch more information that we'll be tracking, typically
    // hashed to take up less space on disk, as we just need to know when
    // things change.
    let extra_flags = if unit.mode.is_doc() || unit.mode.is_doc_scrape() {
        &unit.rustdocflags
    } else {
        &unit.rustflags
    }
    .to_vec();

    let profile_hash = util::hash_u64((
        &unit.profile,
        unit.mode,
        build_runner.bcx.extra_args_for(unit),
        build_runner.lto[unit],
        unit.pkg.manifest().lint_rustflags(),
    ));
    let mut config = StableHasher::new();
    let linker = if unit.target.for_host() && !build_runner.bcx.gctx.target_applies_to_host()? {
        build_runner.compilation.host_linker()
    } else {
        build_runner.compilation.target_linker(unit.kind)
    };
    if let Some(linker) = linker {
        linker.hash(&mut config);
    }
    if unit.mode.is_doc() && build_runner.bcx.gctx.cli_unstable().rustdoc_map {
        if let Ok(map) = build_runner.bcx.gctx.doc_extern_map() {
            map.hash(&mut config);
        }
    }
    if let Some(allow_features) = &build_runner.bcx.gctx.cli_unstable().allow_features {
        allow_features.hash(&mut config);
    }
    // -Zpublic-dependency changes how library units pass dependency privacy
    // to rustc via `--extern`.
    (unit.target.is_lib()
        && build_runner.unit_deps(unit).iter().any(|dep| !dep.public)
        && super::is_public_dependency_enabled(build_runner, unit))
    .hash(&mut config);
    // -Zno-embed-metadata changes how all units are compiled, and it also changes how we tell
    // rustc to link to deps using `--extern`. If it changes, we should rebuild everything.
    build_runner
        .bcx
        .gctx
        .cli_unstable()
        .no_embed_metadata
        .hash(&mut config);

    let compile_kind = unit.kind.fingerprint_hash();
    let mut declared_features = unit.pkg.summary().features().keys().collect::<Vec<_>>();
    // Sort to avoid a useless rebuild if the user orders its features differently.
    declared_features.sort();
    Ok(Fingerprint {
        rustc: util::hash_u64(&build_runner.bcx.rustc().verbose_version),
        target: util::hash_u64(&unit.target),
        profile: profile_hash,
        // Note that .0 is hashed here, not .1 which is the cwd. That doesn't
        // actually affect the output artifact so there's no need to hash it.
        path: util::hash_u64(path_args(build_runner.bcx.ws, unit).0),
        features: format!("{:?}", unit.features),
        declared_features: format!("{declared_features:?}"),
        deps,
        local: Mutex::new(local),
        memoized_hash: Mutex::new(None),
        config: Hasher::finish(&config),
        compile_kind,
        index: build_runner.bcx.unit_to_index[unit],
        rustflags: extra_flags,
        fs_status: FsStatus::Stale,
        outputs,
    })
}

/// Calculate a fingerprint for an "execute a build script" unit.  This is an
/// internal helper of [`calculate`], don't call directly.
fn calculate_run_custom_build(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<Fingerprint> {
    assert!(unit.mode.is_run_custom_build());
    // Using the `BuildDeps` information we'll have previously parsed and
    // inserted into `build_explicit_deps`, build an initial snapshot of the
    // `LocalFingerprint` list for this build script. If we previously executed
    // the build script this means we'll be watching files and env vars.
    // Otherwise if we haven't previously executed it we'll just start watching
    // the whole crate.
    let (gen_local, overridden) = build_script_local_fingerprints(build_runner, unit)?;
    let deps = &build_runner.build_explicit_deps[unit];
    let local = (gen_local)(
        deps,
        Some(&|| {
            const IO_ERR_MESSAGE: &str = "\
An I/O error happened. Please make sure you can access the file.

By default, if your project contains a build script, cargo scans all files in
it to determine whether a rebuild is needed. If you don't expect to access the
file, specify `rerun-if-changed` in your build script.
See https://doc.rust-lang.org/cargo/reference/build-scripts.html#rerun-if-changed for more information.";
            pkg_fingerprint(build_runner.bcx, &unit.pkg).map_err(|err| {
                let mut message = format!("failed to determine package fingerprint for build script for {}", unit.pkg);
                if err.root_cause().is::<io::Error>() {
                    message = format!("{}\n{}", message, IO_ERR_MESSAGE)
                }
                err.context(message)
            })
        }),
    )?
    .unwrap();
    let output = deps.build_script_output.clone();

    // Include any dependencies of our execution, which is typically just the
    // compilation of the build script itself. (if the build script changes we
    // should be rerun!). Note though that if we're an overridden build script
    // we have no dependencies so no need to recurse in that case.
    let deps = if overridden {
        // Overridden build scripts don't need to track deps.
        vec![]
    } else {
        // Create Vec since mutable build_runner is needed in closure.
        let deps = Vec::from(build_runner.unit_deps(unit));
        deps.into_iter()
            .map(|dep| DepFingerprint::new(build_runner, unit, &dep))
            .collect::<CargoResult<Vec<_>>>()?
    };

    let rustflags = unit.rustflags.to_vec();

    Ok(Fingerprint {
        local: Mutex::new(local),
        rustc: util::hash_u64(&build_runner.bcx.rustc().verbose_version),
        deps,
        outputs: if overridden { Vec::new() } else { vec![output] },
        rustflags,
        index: build_runner.bcx.unit_to_index[unit],

        // Most of the other info is blank here as we don't really include it
        // in the execution of the build script, but... this may be a latent
        // bug in Cargo.
        ..Fingerprint::new()
    })
}

/// Get ready to compute the [`LocalFingerprint`] values
/// for a [`RunCustomBuild`] unit.
///
/// This function has what is, on the surface, a seriously wonky interface.
/// You'll call this function and it'll return a closure and a boolean. The
/// boolean is pretty simple in that it indicates whether the `unit` has been
/// overridden via `.cargo/config.toml`. The closure is much more complicated.
///
/// This closure is intended to capture any local state necessary to compute
/// the `LocalFingerprint` values for this unit. It is `Send` and `'static` to
/// be sent to other threads as well (such as when we're executing build
/// scripts). That deduplication is the rationale for the closure at least.
///
/// The arguments to the closure are a bit weirder, though, and I'll apologize
/// in advance for the weirdness too. The first argument to the closure is a
/// `&BuildDeps`. This is the parsed version of a build script, and when Cargo
/// starts up this is cached from previous runs of a build script.  After a
/// build script executes the output file is reparsed and passed in here.
///
/// The second argument is the weirdest: it's *optionally* a closure to
/// call [`pkg_fingerprint`]. `pkg_fingerprint` requires access to the
/// "source map" located in `Context`. That's very non-`'static` and
/// non-`Send`, so it can't be used on other threads, such as when we invoke
/// this after a build script has finished. The `Option` allows us to for sure
/// calculate it on the main thread at the beginning, and then swallow the bug
/// for now where a worker thread after a build script has finished doesn't
/// have access. Ideally there would be no second argument or it would be more
/// "first class" and not an `Option` but something that can be sent between
/// threads. In any case, it's a bug for now.
///
/// This isn't the greatest of interfaces, and if there are suggestions to
/// improve it, please do so!
///
/// FIXME(#6779) - see all the words above
///
/// [`RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
fn build_script_local_fingerprints(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<(
    Box<
        dyn FnOnce(
                &BuildDeps,
                Option<&dyn Fn() -> CargoResult<String>>,
            ) -> CargoResult<Option<Vec<LocalFingerprint>>>
            + Send,
    >,
    bool,
)> {
    assert!(unit.mode.is_run_custom_build());
    // First up, if this build script is entirely overridden, then we just
    // return the hash of what we overrode it with. This is the easy case!
    if let Some(fingerprint) = build_script_override_fingerprint(build_runner, unit) {
        debug!("override local fingerprints deps {}", unit.pkg);
        return Ok((
            Box::new(
                move |_: &BuildDeps, _: Option<&dyn Fn() -> CargoResult<String>>| {
                    Ok(Some(vec![fingerprint]))
                },
            ),
            true, // this is an overridden build script
        ));
    }

    // ... Otherwise this is a "real" build script and we need to return a real
    // closure. Our returned closure classifies the build script based on
    // whether it prints `rerun-if-*`. If it *doesn't* print this it's where the
    // magical second argument comes into play, which fingerprints a whole
    // package. Remember that the fact that this is an `Option` is a bug, but a
    // longstanding bug, in Cargo. Recent refactorings just made it painfully
    // obvious.
    let pkg_root = unit.pkg.root().to_path_buf();
    let build_dir = build_root(build_runner);
    let env_config = Arc::clone(build_runner.bcx.gctx.env_config()?);
    let calculate =
        move |deps: &BuildDeps, pkg_fingerprint: Option<&dyn Fn() -> CargoResult<String>>| {
            if deps.rerun_if_changed.is_empty() && deps.rerun_if_env_changed.is_empty() {
                match pkg_fingerprint {
                    // FIXME: this is somewhat buggy with respect to docker and
                    // weird filesystems. The `Precalculated` variant
                    // constructed below will, for `path` dependencies, contain
                    // a stringified version of the mtime for the local crate.
                    // This violates one of the things we describe in this
                    // module's doc comment, never hashing mtimes. We should
                    // figure out a better scheme where a package fingerprint
                    // may be a string (like for a registry) or a list of files
                    // (like for a path dependency). That list of files would
                    // be stored here rather than their mtimes.
                    Some(f) => {
                        let s = f()?;
                        debug!(
                            "old local fingerprints deps {:?} precalculated={:?}",
                            pkg_root, s
                        );
                        return Ok(Some(vec![LocalFingerprint::Precalculated(s)]));
                    }
                    None => return Ok(None),
                }
            }

            // Ok so now we're in "new mode" where we can have files listed as
            // dependencies as well as env vars listed as dependencies. Process
            // them all here.
            Ok(Some(local_fingerprints_deps(
                deps,
                &build_dir,
                &pkg_root,
                &env_config,
            )))
        };

    // Note that `false` == "not overridden"
    Ok((Box::new(calculate), false))
}

/// Create a [`LocalFingerprint`] for an overridden build script.
/// Returns None if it is not overridden.
fn build_script_override_fingerprint(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> Option<LocalFingerprint> {
    // Build script output is only populated at this stage when it is
    // overridden.
    let build_script_outputs = build_runner.build_script_outputs.lock().unwrap();
    let metadata = build_runner.get_run_build_script_metadata(unit);
    // Returns None if it is not overridden.
    let output = build_script_outputs.get(metadata)?;
    let s = format!(
        "overridden build state with hash: {}",
        util::hash_u64(output)
    );
    Some(LocalFingerprint::Precalculated(s))
}

/// Compute the [`LocalFingerprint`] values for a [`RunCustomBuild`] unit for
/// non-overridden new-style build scripts only. This is only used when `deps`
/// is already known to have a nonempty `rerun-if-*` somewhere.
///
/// [`RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
fn local_fingerprints_deps(
    deps: &BuildDeps,
    build_root: &Path,
    pkg_root: &Path,
    env_config: &Arc<HashMap<String, OsString>>,
) -> Vec<LocalFingerprint> {
    debug!("new local fingerprints deps {:?}", pkg_root);
    let mut local = Vec::new();

    if !deps.rerun_if_changed.is_empty() {
        // Note that like the module comment above says we are careful to never
        // store an absolute path in `LocalFingerprint`, so ensure that we strip
        // absolute prefixes from them.
        let output = deps
            .build_script_output
            .strip_prefix(build_root)
            .unwrap()
            .to_path_buf();
        let paths = deps
            .rerun_if_changed
            .iter()
            .map(|p| p.strip_prefix(pkg_root).unwrap_or(p).to_path_buf())
            .collect();
        local.push(LocalFingerprint::RerunIfChanged { output, paths });
    }

    local.extend(
        deps.rerun_if_env_changed
            .iter()
            .map(|s| LocalFingerprint::from_env(s, env_config)),
    );

    local
}

/// Writes the short fingerprint hash value to `<loc>`
/// and logs detailed JSON information to `<loc>.json`.
fn write_fingerprint(loc: &Path, fingerprint: &Fingerprint) -> CargoResult<()> {
    // `Fingerprint::new().rustc == 0`; make sure that value doesn't make it to
    // the file system. This is mostly so outside tools can reliably find out
    // what rust version this file is for, as we can use the full hash.
    debug_assert_ne!(fingerprint.rustc, 0);
    let hash = fingerprint.hash_u64();
    debug!("write fingerprint ({:x}) : {}", hash, loc.display());
    paths::write(loc, util::to_hex(hash).as_bytes())?;

    let json = serde_json::to_string(fingerprint).unwrap();
    if cfg!(debug_assertions) {
        let f: Fingerprint = serde_json::from_str(&json).unwrap();
        assert_eq!(f.hash_u64(), hash);
    }
    paths::write(&loc.with_extension("json"), json.as_bytes())?;
    Ok(())
}

/// Prepare for work when a package starts to build
pub fn prepare_init(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> CargoResult<()> {
    let new1 = build_runner.files().fingerprint_dir(unit);

    // Doc tests have no output, thus no fingerprint.
    if !new1.exists() && !unit.mode.is_doc_test() {
        paths::create_dir_all(&new1)?;
    }

    Ok(())
}

/// Returns the location that the dep-info file will show up at
/// for the [`Unit`] specified.
pub fn dep_info_loc(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> PathBuf {
    build_runner.files().fingerprint_file_path(unit, "dep-")
}

/// Returns the absolute path of the build directory.
/// All paths are rewritten to be relative to this.
fn build_root(build_runner: &BuildRunner<'_, '_>) -> PathBuf {
    build_runner.bcx.ws.build_dir().into_path_unlocked()
}

/// Reads the value from the old fingerprint hash file and compares it.
///
/// If dirty, it then restores the detailed information
/// from the fingerprint JSON file, and provides a rich dirty reason.
fn compare_old_fingerprint(
    unit: &Unit,
    old_hash_path: &Path,
    new_fingerprint: &Fingerprint,
    mtime_on_use: bool,
    forced: bool,
) -> FingerprintComparison {
    if mtime_on_use {
        // update the mtime so other cleaners know we used it
        let t = FileTime::from_system_time(SystemTime::now());
        debug!("mtime-on-use forcing {:?} to {}", old_hash_path, t);
        paths::set_file_time_no_err(old_hash_path, t);
    }

    let compare = _compare_old_fingerprint(old_hash_path, new_fingerprint);

    match compare.as_ref() {
        Ok(FingerprintComparison::Fresh) => {}
        Ok(FingerprintComparison::Dirty { reason }) => {
            info!(
                "fingerprint dirty for {}/{:?}/{:?}",
                unit.pkg, unit.mode, unit.target,
            );
            info!("    dirty: {reason:?}");
        }
        Err(e) => {
            info!(
                "fingerprint error for {}/{:?}/{:?}",
                unit.pkg, unit.mode, unit.target,
            );
            info!("    err: {e:?}");
        }
    }

    match compare {
        Ok(FingerprintComparison::Fresh) if forced => FingerprintComparison::Dirty {
            reason: DirtyReason::Forced,
        },
        Ok(cmp) => cmp,
        Err(_) => FingerprintComparison::Dirty {
            reason: DirtyReason::FreshBuild,
        },
    }
}

fn _compare_old_fingerprint(
    old_hash_path: &Path,
    new_fingerprint: &Fingerprint,
) -> CargoResult<FingerprintComparison> {
    let old_fingerprint_short = paths::read(old_hash_path)?;

    let new_hash = new_fingerprint.hash_u64();

    if util::to_hex(new_hash) == old_fingerprint_short && new_fingerprint.fs_status.up_to_date() {
        return Ok(FingerprintComparison::Fresh);
    }

    let old_fingerprint_json = paths::read(&old_hash_path.with_extension("json"))?;
    let old_fingerprint: Fingerprint = serde_json::from_str(&old_fingerprint_json)
        .with_context(|| internal("failed to deserialize json"))?;
    // Fingerprint can be empty after a failed rebuild (see comment in prepare_target).
    if !old_fingerprint_short.is_empty() {
        debug_assert_eq!(
            util::to_hex(old_fingerprint.hash_u64()),
            old_fingerprint_short
        );
    }

    let reason = new_fingerprint.compare(&old_fingerprint);
    Ok(FingerprintComparison::Dirty { reason })
}

/// Calculates the fingerprint of a unit that contains no dep-info files.
fn pkg_fingerprint(bcx: &BuildContext<'_, '_>, pkg: &Package) -> CargoResult<String> {
    let source_id = pkg.package_id().source_id();
    let sources = bcx.packages.sources();

    let source = sources
        .get(source_id)
        .ok_or_else(|| internal("missing package source"))?;
    source.fingerprint(pkg)
}

2069/// The `reference` file is considered as "stale" if any file from `paths` has a newer mtime.
2070fn find_stale_file<I, P>(
2071    mtime_cache: &mut HashMap<PathBuf, FileTime>,
2072    checksum_cache: &mut HashMap<PathBuf, Checksum>,
2073    reference: &Path,
2074    paths: I,
2075    use_checksums: bool,
2076) -> Option<StaleItem>
2077where
2078    I: IntoIterator<Item = (P, Option<(u64, Checksum)>)>,
2079    P: AsRef<Path>,
2080{
2081    let reference_mtime = match paths::mtime(reference) {
2082        Ok(mtime) => mtime,
2083        Err(..) => {
2084            return Some(StaleItem::MissingFile {
2085                path: reference.to_path_buf(),
2086            });
2087        }
2088    };

    let skippable_dirs = if let Ok(cargo_home) = home::cargo_home() {
        let skippable_dirs: Vec<_> = ["git", "registry"]
            .into_iter()
            .map(|subfolder| cargo_home.join(subfolder))
            .collect();
        Some(skippable_dirs)
    } else {
        None
    };
    for (path, prior_checksum) in paths {
        let path = path.as_ref();

        // Assume anything in `$CARGO_HOME/{git, registry}` is immutable (see also #9455 about
        // marking the src directory read-only). This avoids rebuilds when CI caches
        // `$CARGO_HOME/registry/{index, cache}` and `$CARGO_HOME/git/db` across runs, which
        // keeps the content the same but changes the mtime.
        if let Some(ref skippable_dirs) = skippable_dirs {
            if skippable_dirs.iter().any(|dir| path.starts_with(dir)) {
                continue;
            }
        }
        if use_checksums {
            let Some((file_len, prior_checksum)) = prior_checksum else {
                return Some(StaleItem::MissingChecksum {
                    path: path.to_path_buf(),
                });
            };
            let path_buf = path.to_path_buf();

            let path_checksum = match checksum_cache.entry(path_buf) {
                Entry::Occupied(o) => *o.get(),
                Entry::Vacant(v) => {
                    let Ok(current_file_len) = fs::metadata(&path).map(|m| m.len()) else {
                        return Some(StaleItem::FailedToReadMetadata {
                            path: path.to_path_buf(),
                        });
                    };
                    if current_file_len != file_len {
                        return Some(StaleItem::FileSizeChanged {
                            path: path.to_path_buf(),
                            new_size: current_file_len,
                            old_size: file_len,
                        });
                    }
                    let Ok(file) = File::open(path) else {
                        return Some(StaleItem::MissingFile {
                            path: path.to_path_buf(),
                        });
                    };
                    let Ok(checksum) = Checksum::compute(prior_checksum.algo(), file) else {
                        return Some(StaleItem::UnableToReadFile {
                            path: path.to_path_buf(),
                        });
                    };
                    *v.insert(checksum)
                }
            };
            if path_checksum == prior_checksum {
                continue;
            }
            return Some(StaleItem::ChangedChecksum {
                source: path.to_path_buf(),
                stored_checksum: prior_checksum,
                new_checksum: path_checksum,
            });
        } else {
            let path_mtime = match mtime_cache.entry(path.to_path_buf()) {
                Entry::Occupied(o) => *o.get(),
                Entry::Vacant(v) => {
                    let Ok(mtime) = paths::mtime_recursive(path) else {
                        return Some(StaleItem::MissingFile {
                            path: path.to_path_buf(),
                        });
                    };
                    *v.insert(mtime)
                }
            };

            // TODO: fix #5918.
            // Note that equal mtimes should be considered "stale". For filesystems with
            // coarse timestamp precision, such as 1s, this would be a conservative approximation
            // to handle the case where a file is modified within the same second after
            // a build starts. We want to make sure that incremental rebuilds pick that up!
            //
            // For filesystems with nanosecond precision it's been seen in the wild that
            // the "nanosecond precision" isn't really nanosecond-accurate. It turns out that
            // kernels may cache the current time, so files created at different times actually
            // list the same nanosecond timestamp. Some digging on #5919 picked up that the
            // kernel caches the current time between timer ticks, which could mean that if
            // a file is updated at most 10ms after a build starts then Cargo may not
            // pick up the build changes.
            //
            // All in all, an equality check here would be a conservative assumption that,
            // if equal, files were changed just after a previous build finished.
            // Unfortunately this became problematic when (in #6484) Cargo switched to more
            // accurately measuring the start time of builds, so equal mtimes are
            // currently treated as fresh.
            if path_mtime <= reference_mtime {
                continue;
            }

            return Some(StaleItem::ChangedFile {
                reference: reference.to_path_buf(),
                reference_mtime,
                stale: path.to_path_buf(),
                stale_mtime: path_mtime,
            });
        }
    }

    debug!(
        "all paths up-to-date relative to {:?} mtime={}",
        reference, reference_mtime
    );
    None
}
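/*
Illustration only, not part of Cargo: a minimal sketch of the mtime branch of
`find_stale_file`, assuming the mtimes have already been read and are passed in
as `SystemTime` values. The names `Stale` and `first_stale` are hypothetical, and
the sketch leaves out the mtime cache and the missing-file handling.

```rust
use std::path::PathBuf;
use std::time::SystemTime;

/// Hypothetical, simplified stand-in for `StaleItem::ChangedFile`.
#[derive(Debug, PartialEq)]
pub struct Stale {
    pub path: PathBuf,
    pub mtime: SystemTime,
}

/// Returns the first input that is strictly newer than `reference_mtime`,
/// ignoring inputs located under any directory in `skippable_dirs`.
pub fn first_stale(
    reference_mtime: SystemTime,
    inputs: &[(PathBuf, SystemTime)],
    skippable_dirs: &[PathBuf],
) -> Option<Stale> {
    for (path, mtime) in inputs {
        // `Path::starts_with` matches whole path components, so a sibling
        // directory such as `registry2` is not skipped by `registry`.
        if skippable_dirs.iter().any(|dir| path.starts_with(dir)) {
            continue;
        }
        // Equal mtimes count as fresh, mirroring `path_mtime <= reference_mtime`.
        if *mtime <= reference_mtime {
            continue;
        }
        return Some(Stale {
            path: path.clone(),
            mtime: *mtime,
        });
    }
    None
}
```

Because the comparison is `<=`, an input written in the same timestamp tick as the
reference file is treated as fresh, which is the trade-off described in the TODO
comment inside `find_stale_file`.
*/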