//! Tracks changes to determine if something needs to be recompiled.
//!
//! This module implements change-tracking so that Cargo can know whether or
//! not something needs to be recompiled. A Cargo [`Unit`] can be either "dirty"
//! (needs to be recompiled) or "fresh" (it does not need to be recompiled).
//!
//! ## Mechanisms affecting freshness
//!
//! There are several mechanisms that influence a Unit's freshness:
//!
//! - The [`Fingerprint`] is a hash, saved to the filesystem in the
//!   `.fingerprint` directory, that tracks information about the Unit. If the
//!   fingerprint is missing (such as the first time the unit is being
//!   compiled), then the unit is dirty. If any of the fingerprint fields
//!   change (like the name of the source file), then the Unit is considered
//!   dirty.
//!
//!   The `Fingerprint` also tracks the fingerprints of all its dependencies,
//!   so a change in a dependency will propagate the "dirty" status up.
//!
//! - Filesystem mtime tracking is also used to check if a unit is dirty.
//!   See the section below on "Mtime comparison" for more details. There
//!   are essentially two parts to mtime tracking:
//!
//!   1. The mtime of a Unit's output files is compared to the mtimes of all
//!      its dependencies' output files (see [`check_filesystem`]). If any
//!      output is missing, or is older than a dependency's output, then the
//!      unit is dirty.
//!   2. The mtime of a Unit's source files is compared to the mtime of its
//!      dep-info file in the fingerprint directory (see [`find_stale_file`]).
//!      The dep-info file is used as an anchor to know when the last build of
//!      the unit was done. See the "dep-info files" section below for more
//!      details. If any input files are missing, or are newer than the
//!      dep-info, then the unit is dirty.
//!
//! - Alternatively, if you're using the unstable feature `checksum-freshness`,
//!   mtimes are ignored entirely in favor of comparing first the file size, and
//!   then the checksum, against a known prior value emitted by rustc. Only
//!   nightly rustc will emit the needed metadata at the time of writing. This
//!   is dependent on the unstable feature `-Z checksum-hash-algorithm`.
//!
//! Note: Fingerprinting is not a perfect solution. Filesystem mtime tracking
//! is notoriously imprecise and problematic. Only a small part of the
//! environment is captured. This is a balance of performance, simplicity, and
//! completeness. Sandboxing, hashing file contents, tracking every file
//! access, environment variable, and network operation would ensure more
//! reliable and reproducible builds at the cost of being complex, slow, and
//! platform-dependent.
//!
//! ## Fingerprints and [`UnitHash`]s
//!
//! [`Metadata`] tracks several [`UnitHash`]s, including
//! [`Metadata::unit_id`], [`Metadata::c_metadata`], and [`Metadata::c_extra_filename`].
//! See its documentation for more details.
//!
//! NOTE: Not all output files are isolated via filename hashes (like dylibs).
//! The fingerprint directory uses a hash, but sometimes units share the same
//! fingerprint directory (when they don't have Metadata) so care should be
//! taken to handle this!
//!
//! Fingerprints and [`UnitHash`]s are similar, and track some of the same things.
//! [`UnitHash`]s contain information that is required to keep Units separate.
//! The Fingerprint includes additional information that should cause a
//! recompile while still reusing the same filenames. A comparison
//! of what is tracked:
//!
//! Value                                      | Fingerprint | `Metadata::unit_id` [^8] | `Metadata::c_metadata`
//! -------------------------------------------|-------------|--------------------------|-----------------------
//! rustc                                      | ✓           | ✓                        | ✓
//! [`Profile`]                                | ✓           | ✓                        | ✓
//! `cargo rustc` extra args                   | ✓           | ✓[^7]                    |
//! [`CompileMode`]                            | ✓           | ✓                        | ✓
//! Target Name                                | ✓           | ✓                        | ✓
//! `TargetKind` (bin/lib/etc.)                | ✓           | ✓                        | ✓
//! Enabled Features                           | ✓           | ✓                        | ✓
//! Declared Features                          | ✓           |                          |
//! Immediate dependency’s hashes              | ✓[^1]       | ✓                        | ✓
//! [`CompileKind`] (host/target)              | ✓           | ✓                        | ✓
//! `__CARGO_DEFAULT_LIB_METADATA`[^4]         |             | ✓                        | ✓
//! `package_id`                               |             | ✓                        | ✓
//! Target src path relative to ws             | ✓           |                          |
//! Target flags (test/bench/for_host/edition) | ✓           |                          |
//! -C incremental=… flag                      | ✓           |                          |
//! mtime of sources                           | ✓[^3]       |                          |
//! RUSTFLAGS/RUSTDOCFLAGS                     | ✓           | ✓[^7]                    |
//! [`Lto`] flags                              | ✓           | ✓                        | ✓
//! config settings[^5]                        | ✓           |                          |
//! `is_std`                                   |             | ✓                        | ✓
//! `[lints]` table[^6]                        | ✓           |                          |
//! `[lints.rust.unexpected_cfgs.check-cfg]`   | ✓           |                          |
//!
//! [^1]: Bin dependencies are not included.
//!
//! [^3]: See below for details on mtime tracking.
//!
//! [^4]: `__CARGO_DEFAULT_LIB_METADATA` is set by rustbuild to embed the
//!        release channel (bootstrap/stable/beta/nightly) in libstd.
//!
//! [^5]: Config settings that are not otherwise captured anywhere else.
//!       Currently, this is only `doc.extern-map`.
//!
//! [^6]: Via [`Manifest::lint_rustflags`][crate::core::Manifest::lint_rustflags]
//!
//! [^7]: extra-flags and RUSTFLAGS are conditionally excluded when `--remap-path-prefix` is
//!       present to avoid breaking build reproducibility while we wait for trim-paths
//!
//! [^8]: including `-Cextra-filename`
//!
//! When deciding what should go in the Metadata vs the Fingerprint, consider
//! that some files (like dylibs) do not have a hash in their filename. Thus,
//! if a value changes, only the fingerprint will detect the change (consider,
//! for example, swapping between different features). Fields that are only in
//! Metadata generally aren't relevant to the fingerprint because they
//! fundamentally change the output (like target vs host changes the directory
//! where it is emitted).
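//!
//! As a rough illustration of the split described above, the following sketch
//! hashes a hypothetical `UnitInputs` struct two ways. The struct, its fields,
//! and both functions are made up for this example, and it uses the standard
//! library's `DefaultHasher` rather than Cargo's `StableHasher`. The point is
//! only that a fingerprint-only value (here the source path) changes the
//! fingerprint without changing the hash that would end up in filenames.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//!
//! /// Hypothetical, heavily reduced set of inputs for one unit.
//! struct UnitInputs {
//!     rustc_version: &'static str,
//!     enabled_features: Vec<&'static str>,
//!     /// Tracked by the fingerprint only.
//!     src_path: &'static str,
//! }
//!
//! /// Stand-in for a filename hash: only values that keep units separate.
//! fn filename_hash(unit: &UnitInputs) -> u64 {
//!     let mut hasher = DefaultHasher::new();
//!     unit.rustc_version.hash(&mut hasher);
//!     unit.enabled_features.hash(&mut hasher);
//!     hasher.finish()
//! }
//!
//! /// Stand-in for the fingerprint: the same values plus rebuild-only ones.
//! fn fingerprint_hash(unit: &UnitInputs) -> u64 {
//!     let mut hasher = DefaultHasher::new();
//!     unit.rustc_version.hash(&mut hasher);
//!     unit.enabled_features.hash(&mut hasher);
//!     unit.src_path.hash(&mut hasher);
//!     hasher.finish()
//! }
//! ```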
//!
//! ## Fingerprint files
//!
//! Fingerprint information is stored in the
//! `target/{debug,release}/.fingerprint/` directory. Each Unit is stored in a
//! separate directory. Each Unit directory contains:
//!
//! - A file with a 16 hex-digit hash. This is the Fingerprint hash, used for
//!   quick loading and comparison.
//! - A `.json` file that contains details about the Fingerprint. This is only
//!   used to log details about *why* a fingerprint is considered dirty.
//!   `CARGO_LOG=cargo::core::compiler::fingerprint=trace cargo build` can be
//!   used to display this log information.
//! - A "dep-info" file which is a translation of rustc's `*.d` dep-info files
//!   to a Cargo-specific format that tweaks file names and is optimized for
//!   reading quickly.
//! - An `invoked.timestamp` file whose filesystem mtime is updated every time
//!   the Unit is built. This is used for capturing the time when the build
//!   starts, to detect if files are changed in the middle of the build. See
//!   below for more details.
//!
//! Note that some units are a little different. A Unit for *running* a build
//! script or for `rustdoc` does not have a dep-info file (it's not
//! applicable). Build script `invoked.timestamp` files are in the build
//! output directory.
//!
//! ## Fingerprint calculation
//!
//! After the list of Units has been calculated, the Units are added to the
//! [`JobQueue`]. As each one is added, the fingerprint is calculated, and the
//! dirty/fresh status is recorded. A closure is used to update the fingerprint
//! on-disk when the Unit successfully finishes. The closure will recompute the
//! Fingerprint based on the updated information. If the Unit fails to compile,
//! the fingerprint is not updated.
//!
//! Fingerprints are cached in the [`BuildRunner`]. This makes computing
//! Fingerprints faster, but also is necessary for properly updating
//! dependency information. Since a Fingerprint includes the Fingerprints of
//! all dependencies, when it is updated, by using `Arc` clones, it
//! automatically picks up the updates to its dependencies.
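//!
//! The sharing described above can be pictured with a small sketch. The
//! `MiniFingerprint` type and its fields are hypothetical simplifications
//! (the real `Fingerprint` has many more fields and memoizes its hash). The
//! sketch only shows how a parent that holds an `Arc` to a dependency
//! observes an update made through that dependency's `Mutex`.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//! use std::sync::{Arc, Mutex};
//!
//! /// Hypothetical reduced fingerprint: local inputs plus shared dependencies.
//! struct MiniFingerprint {
//!     local: Mutex<Vec<String>>,
//!     deps: Vec<Arc<MiniFingerprint>>,
//! }
//!
//! impl MiniFingerprint {
//!     /// Hashes the local inputs and, recursively, every dependency's hash.
//!     fn hash_u64(&self) -> u64 {
//!         let mut hasher = DefaultHasher::new();
//!         {
//!             // Hold the lock only while reading the local inputs.
//!             let local = self.local.lock().unwrap();
//!             local.as_slice().hash(&mut hasher);
//!         }
//!         for dep in &self.deps {
//!             dep.hash_u64().hash(&mut hasher);
//!         }
//!         hasher.finish()
//!     }
//! }
//! ```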
//!
//! ### dep-info files
//!
//! Cargo has several kinds of "dep info" files:
//!
//! * dep-info files generated by `rustc`.
//! * Fingerprint dep-info files translated from the first one.
//! * dep-info for external build system integration.
//! * Unstable `-Zbinary-dep-depinfo`.
//!
//! #### `rustc` dep-info files
//!
//! Cargo passes the `--emit=dep-info` flag to `rustc` so that `rustc` will
//! generate a "dep info" file (with the `.d` extension). This is a
//! Makefile-like syntax that includes all of the source files used to build
//! the crate. This file is used by Cargo to know which files to check to see
//! if the crate will need to be rebuilt. Example:
//!
//! ```makefile
//! /path/to/target/debug/deps/cargo-b6219d178925203d: src/bin/main.rs src/bin/cargo/cli.rs # … etc.
//! ```
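//!
//! A minimal sketch of reading one such line follows. The `parse_dep_line`
//! helper is hypothetical and deliberately naive: it splits on whitespace and
//! ignores the backslash escaping that real dep-info files use for paths
//! containing spaces, so it is not a substitute for Cargo's own parser.
//!
//! ```rust
//! /// Splits a `target: dep dep ...` line into the target and its inputs.
//! /// Returns `None` for blank lines, comments, and lines without a `: `
//! /// separator.
//! fn parse_dep_line(line: &str) -> Option<(&str, Vec<&str>)> {
//!     let line = line.trim();
//!     if line.is_empty() || line.starts_with('#') {
//!         return None;
//!     }
//!     let (target, deps) = line.split_once(": ")?;
//!     Some((target, deps.split_whitespace().collect()))
//! }
//! ```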
//!
//! #### Fingerprint dep-info files
//!
//! After `rustc` exits successfully, Cargo will read the first kind of dep
//! info file and translate it into a binary format that is stored in the
//! fingerprint directory ([`translate_dep_info`]).
//!
//! These are used to quickly scan for any changed files. The mtime of the
//! fingerprint dep-info file itself is used as the reference for comparing the
//! source files to determine if any of the source files have been modified
//! (see [below](#mtime-comparison) for more detail).
//!
//! Note that Cargo parses the special `# env-dep:...` comments in dep-info
//! files to learn about environment variables that the rustc compile depends on.
//! Cargo then later uses this to trigger a recompile if a referenced env var
//! changes (even if the source didn't change).
//! This also includes env vars generated from Cargo metadata like `CARGO_PKG_DESCRIPTION`.
//! (See [`crate::core::manifest::ManifestMetadata`].)
//!
//! #### dep-info files for build system integration
//!
//! There is also a third dep-info file. Cargo will extend the file created by
//! rustc with some additional information and save it in the output
//! directory. This is intended for build system integration. See the
//! [`output_depinfo`] function for more detail.
//!
//! #### -Zbinary-dep-depinfo
//!
//! `rustc` has an experimental flag `-Zbinary-dep-depinfo`. This causes
//! `rustc` to include binary files (like rlibs) in the dep-info file. This is
//! primarily to support rustc development, so that Cargo can check the
//! implicit dependency to the standard library (which lives in the sysroot).
//! We want Cargo to recompile whenever the standard library rlib/dylibs
//! change, and this is a generic mechanism to make that work.
//!
//! ### Mtime comparison
//!
//! The use of modification timestamps is the most common way a unit will be
//! determined to be dirty or fresh between builds. There are many subtle
//! issues and edge cases with mtime comparisons. This gives a high-level
//! overview, but you'll need to read the code for the gritty details. Mtime
//! handling is different for different unit kinds. The different styles are
//! driven by the [`Fingerprint::local`] field, which is set based on the unit
//! kind.
//!
//! The status of whether or not the mtime is "stale" or "up-to-date" is
//! stored in [`Fingerprint::fs_status`].
//!
//! Every unit compares the mtime of its newest output file with the mtimes
//! of the outputs of all its dependencies. If any output file is missing,
//! then the unit is stale. If any dependency is newer, the unit is stale.
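//!
//! The rule above can be sketched as a pure function over timestamps. The
//! `outputs_are_stale` name and its signature are assumptions made for this
//! example. The real check works on paths and also handles details such as
//! rmeta-only dependencies, which this sketch leaves out.
//!
//! ```rust
//! use std::time::SystemTime;
//!
//! /// `outputs` holds the mtime of each output file (`None` if missing);
//! /// `dep_outputs` holds the mtimes of the dependencies' outputs.
//! fn outputs_are_stale(outputs: &[Option<SystemTime>], dep_outputs: &[SystemTime]) -> bool {
//!     // A missing output always makes the unit stale.
//!     let Some(mtimes) = outputs.iter().copied().collect::<Option<Vec<_>>>() else {
//!         return true;
//!     };
//!     // Compare against the newest output, as described above.
//!     let Some(newest) = mtimes.into_iter().max() else {
//!         return false; // no outputs to check
//!     };
//!     dep_outputs.iter().any(|dep| *dep > newest)
//! }
//! ```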
//!
//! #### Normal package mtime handling
//!
//! [`LocalFingerprint::CheckDepInfo`] is used for checking the mtime of
//! packages. It compares the mtime of the input files (the source files) to
//! the mtime of the dep-info file (which is written last after a build is
//! finished). If the dep-info is missing, the unit is stale (it has never
//! been built). The list of input files comes from the dep-info file. See the
//! section above for details on dep-info files.
//!
//! Also note that although registry and git packages use [`CheckDepInfo`], none
//! of their source files are included in the dep-info (see
//! [`translate_dep_info`]), so for those kinds no mtime checking is done
//! (unless `-Zbinary-dep-depinfo` is used). Registry and git packages are
//! static, so there is no need to check anything.
//!
//! When a build is complete, the mtime of the dep-info file in the
//! fingerprint directory is modified to rewind it to the time when the build
//! started. This is done by creating an `invoked.timestamp` file when the
//! build starts to capture the start time. The mtime is rewound to the start
//! to handle the case where the user modifies a source file while a build is
//! running. Cargo can't know whether or not the file was included in the
//! build, so it takes a conservative approach of assuming the file was *not*
//! included, and it should be rebuilt during the next build.
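//!
//! A sketch of that conservative rule, assuming the timestamps are already in
//! hand: the anchor is the build *start* time rather than the finish time, so
//! a file touched mid-build counts as stale. `find_stale_input` is a
//! hypothetical helper for this example and not the signature of any function
//! in this module.
//!
//! ```rust
//! use std::time::SystemTime;
//!
//! /// Returns the first input that is missing or newer than `build_start`.
//! /// Each input is a (name, mtime) pair; a `None` mtime means the file is
//! /// missing.
//! fn find_stale_input<'a>(
//!     build_start: SystemTime,
//!     inputs: &[(&'a str, Option<SystemTime>)],
//! ) -> Option<&'a str> {
//!     inputs
//!         .iter()
//!         .find(|(_, mtime)| match mtime {
//!             None => true,
//!             Some(mtime) => *mtime > build_start,
//!         })
//!         .map(|(name, _)| *name)
//! }
//! ```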
//!
//! #### Rustdoc mtime handling
//!
//! Rustdoc does not emit a dep-info file, so Cargo currently has a relatively
//! simple system for detecting rebuilds. [`LocalFingerprint::Precalculated`] is
//! used for rustdoc units. For registry packages, this is the package
//! version. For git packages, it is the git hash. For path packages, it is
//! a string of the mtime of the newest file in the package.
//!
//! There are some known bugs with how this works, so it should be improved at
//! some point.
//!
//! #### Build script mtime handling
//!
//! Build script mtime handling runs in different modes. There is the "old
//! style" where the build script does not emit any `rerun-if` directives. In
//! this mode, Cargo will use [`LocalFingerprint::Precalculated`]. See the
//! "rustdoc" section above for how it works.
//!
//! In the new-style, each `rerun-if` directive is translated to the
//! corresponding [`LocalFingerprint`] variant. The [`RerunIfChanged`] variant
//! compares the mtime of the given filenames against the mtime of the
//! "output" file.
//!
//! Similar to normal units, the build script "output" file mtime is rewound
//! to the time just before the build script is executed to handle mid-build
//! modifications.
//!
//! ## Considerations for inclusion in a fingerprint
//!
//! Over time we've realized a few items which historically were included in
//! fingerprint hashings should not actually be included. Examples are:
//!
//! * Modification time values. We strive to never include a modification time
//!   inside a `Fingerprint` to get hashed into an actual value. While
//!   theoretically fine to do, in practice this causes issues with common
//!   applications like Docker. Docker, after a layer is built, will zero out
//!   the nanosecond part of all filesystem modification times. This means that
//!   the actual modification time is different for all build artifacts, which
//!   if we tracked the actual values of modification times would cause
//!   unnecessary recompiles. To fix this we instead only track paths which are
//!   relevant. These paths are checked dynamically to see if they're up to
//!   date, and the modification time doesn't make its way into the fingerprint
//!   hash.
//!
//! * Absolute path names. We strive to maintain a property where if you rename
//!   a project directory Cargo will continue to preserve all build artifacts
//!   and reuse the cache. This means that we can't ever hash an absolute path
//!   name. Instead we always hash relative path names and the "root" is passed
//!   in at runtime dynamically. Some of this is best effort, but the general
//!   idea is that we assume all accesses within a crate stay within that
//!   crate.
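//!
//! The relative-path idea can be sketched as follows. `hash_relative` is a
//! hypothetical helper using the standard library's `DefaultHasher`; it
//! assumes the path lies inside the root and otherwise falls back to hashing
//! the path as given.
//!
//! ```rust
//! use std::collections::hash_map::DefaultHasher;
//! use std::hash::{Hash, Hasher};
//! use std::path::Path;
//!
//! /// Hashes `path` relative to `root`, so moving the whole project
//! /// directory does not change the result.
//! fn hash_relative(root: &Path, path: &Path) -> u64 {
//!     let relative = path.strip_prefix(root).unwrap_or(path);
//!     let mut hasher = DefaultHasher::new();
//!     relative.hash(&mut hasher);
//!     hasher.finish()
//! }
//! ```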
//!
//! These are pretty tricky to test for unfortunately, but we should have a good
//! test suite nowadays and lord knows Cargo gets enough testing in the wild!
//!
//! ## Build scripts
//!
//! The *running* of a build script ([`CompileMode::RunCustomBuild`]) is treated
//! significantly differently from all other Unit kinds. It has its own function
//! for calculating the Fingerprint ([`calculate_run_custom_build`]) and has some
//! unique considerations. It does not track the same information as a normal
//! Unit. The information tracked depends on the `rerun-if-changed` and
//! `rerun-if-env-changed` statements produced by the build script. If the
//! script does not emit either of these statements, the Fingerprint runs in
//! "old style" mode where an mtime change of *any* file in the package will
//! cause the build script to be re-run. Otherwise, the fingerprint *only*
//! tracks the individual "rerun-if" items listed by the build script.
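//!
//! The mode selection can be sketched like this. `TrackingMode` and
//! `tracking_mode` are hypothetical names for this example, and the sketch
//! only recognizes the `cargo::` and older `cargo:` prefixes on the two
//! directives named above. It is an illustration, not Cargo's actual parser.
//!
//! ```rust
//! /// Hypothetical summary of what a build script asked Cargo to track.
//! #[derive(Debug, PartialEq)]
//! enum TrackingMode {
//!     /// No `rerun-if` directive: any file change in the package reruns
//!     /// the script.
//!     OldStyle,
//!     /// Only the listed paths and environment variables are tracked.
//!     Explicit { paths: Vec<String>, env_vars: Vec<String> },
//! }
//!
//! fn tracking_mode(build_script_stdout: &str) -> TrackingMode {
//!     let mut paths = Vec::new();
//!     let mut env_vars = Vec::new();
//!     for line in build_script_stdout.lines() {
//!         // Accept both the `cargo::` and the older `cargo:` prefix.
//!         let Some(directive) = line
//!             .strip_prefix("cargo::")
//!             .or_else(|| line.strip_prefix("cargo:"))
//!         else {
//!             continue;
//!         };
//!         if let Some(path) = directive.strip_prefix("rerun-if-changed=") {
//!             paths.push(path.to_string());
//!         } else if let Some(var) = directive.strip_prefix("rerun-if-env-changed=") {
//!             env_vars.push(var.to_string());
//!         }
//!     }
//!     if paths.is_empty() && env_vars.is_empty() {
//!         TrackingMode::OldStyle
//!     } else {
//!         TrackingMode::Explicit { paths, env_vars }
//!     }
//! }
//! ```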
//!
//! The "rerun-if" statements from a *previous* build are stored in the build
//! output directory in a file called `output`. Cargo parses this file when
//! the Unit for that build script is prepared for the [`JobQueue`]. The
//! Fingerprint code can then use that information to compute the Fingerprint
//! and compare against the old fingerprint hash.
//!
//! Care must be taken with build script Fingerprints because the
//! [`Fingerprint::local`] value may be changed after the build script runs
//! (such as if the build script adds or removes "rerun-if" items).
//!
//! Another complication is if a build script is overridden. In that case, the
//! fingerprint is the hash of the output of the override.
//!
//! ## Special considerations
//!
//! Registry dependencies do not track the mtime of files. This is because
//! registry dependencies are not expected to change (if a new version is
//! used, the Package ID will change, causing a rebuild). Cargo currently
//! partially works with Docker caching. When a Docker image is built, it has
//! normal mtime information. However, when a step is cached, the nanosecond
//! portion of every file's mtime is zeroed out. Currently this works, but care
//! must be taken for situations like these.
//!
//! HFS on macOS only supports 1 second timestamps. This causes a significant
//! number of problems, particularly with Cargo's testsuite which does rapid
//! builds in succession. Other filesystems have various degrees of
//! resolution.
//!
//! Various weird filesystems (such as network filesystems) also can cause
//! complications. Network filesystems may track the time on the server
//! (except when the time is set manually such as with
//! `filetime::set_file_times`). Not all filesystems support modifying the
//! mtime.
//!
//! See the [`A-rebuild-detection`] label on the issue tracker for more.
//!
//! [`check_filesystem`]: Fingerprint::check_filesystem
//! [`Metadata`]: crate::core::compiler::Metadata
//! [`Metadata::unit_id`]: crate::core::compiler::Metadata::unit_id
//! [`Metadata::c_metadata`]: crate::core::compiler::Metadata::c_metadata
//! [`Metadata::c_extra_filename`]: crate::core::compiler::Metadata::c_extra_filename
//! [`UnitHash`]: crate::core::compiler::UnitHash
//! [`Profile`]: crate::core::profiles::Profile
//! [`CompileMode`]: crate::core::compiler::CompileMode
//! [`Lto`]: crate::core::compiler::Lto
//! [`CompileKind`]: crate::core::compiler::CompileKind
//! [`JobQueue`]: super::job_queue::JobQueue
//! [`output_depinfo`]: super::output_depinfo()
//! [`CheckDepInfo`]: LocalFingerprint::CheckDepInfo
//! [`RerunIfChanged`]: LocalFingerprint::RerunIfChanged
//! [`CompileMode::RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
//! [`A-rebuild-detection`]: https://github.com/rust-lang/cargo/issues?q=is%3Aissue+is%3Aopen+label%3AA-rebuild-detection

mod dep_info;
mod dirty_reason;
mod rustdoc;

use std::collections::hash_map::{Entry, HashMap};
use std::env;
use std::ffi::OsString;
use std::fs;
use std::fs::File;
use std::hash::{self, Hash, Hasher};
use std::io::{self};
use std::path::{Path, PathBuf};
use std::sync::{Arc, Mutex};
use std::time::SystemTime;

use anyhow::Context as _;
use anyhow::format_err;
use cargo_util::paths;
use filetime::FileTime;
use serde::de;
use serde::ser;
use serde::{Deserialize, Serialize};
use tracing::{debug, info};

use crate::core::Package;
use crate::core::compiler::unit_graph::UnitDep;
use crate::util;
use crate::util::errors::CargoResult;
use crate::util::interning::InternedString;
use crate::util::log_message::LogMessage;
use crate::util::{StableHasher, internal, path_args};
use crate::{CARGO_ENV, GlobalContext};

use super::custom_build::BuildDeps;
use super::{BuildContext, BuildRunner, FileFlavor, Job, Unit, Work};

pub use self::dep_info::Checksum;
pub use self::dep_info::parse_dep_info;
pub use self::dep_info::parse_rustc_dep_info;
pub use self::dep_info::translate_dep_info;
pub use self::dirty_reason::DirtyReason;
pub use self::rustdoc::RustdocFingerprint;

/// Determines if a [`Unit`] is up-to-date, and if not prepares necessary work to
/// update the persisted fingerprint.
///
/// This function will inspect `Unit`, calculate a fingerprint for it, and then
/// return an appropriate [`Job`] to run. The returned `Job` will be a noop if
/// `unit` is considered "fresh", or if it was previously built and cached.
/// Otherwise the `Job` returned will write out the true fingerprint to the
/// filesystem, to be executed after the unit's work has completed.
///
/// The `force` flag is a way to force the `Job` to be "dirty", or always
/// update the fingerprint. **Beware using this flag** because it does not
/// transitively propagate throughout the dependency graph, it only forces this
/// one unit which is very unlikely to be what you want unless you're
/// exclusively talking about top-level units.
#[tracing::instrument(
    skip(build_runner, unit),
    fields(package_id = %unit.pkg.package_id(), target = unit.target.name())
)]
pub fn prepare_target(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
    force: bool,
) -> CargoResult<Job> {
    let bcx = build_runner.bcx;
    let loc = build_runner.files().fingerprint_file_path(unit, "");

    debug!("fingerprint at: {}", loc.display());

    // Figure out if this unit is up to date. After calculating the fingerprint
    // compare it to an old version, if any, and attempt to print diagnostic
    // information about failed comparisons to aid in debugging.
    let fingerprint = calculate(build_runner, unit)?;
    let mtime_on_use = build_runner.bcx.gctx.cli_unstable().mtime_on_use;
    let dirty_reason = compare_old_fingerprint(unit, &loc, &*fingerprint, mtime_on_use, force);

    let Some(dirty_reason) = dirty_reason else {
        return Ok(Job::new_fresh());
    };

    if let Some(logger) = bcx.logger {
        // Don't log FreshBuild as it is noisy.
        if !dirty_reason.is_fresh_build() {
            logger.log(LogMessage::Rebuild {
                package_id: unit.pkg.package_id().to_spec(),
                target: (&unit.target).into(),
                mode: unit.mode,
                cause: dirty_reason.clone(),
            });
        }
    }

    // We're going to rebuild, so ensure the source of the crate passes all
    // verification checks before we build it.
    //
    // The `Source::verify` method is intended to allow sources to execute
    // pre-build checks to ensure that the relevant source code is all
    // up-to-date and as expected. This is currently used primarily for
    // directory sources which will use this hook to perform an integrity check
    // on all files in the source to ensure they haven't changed. If they have
    // changed then an error is issued.
    let source_id = unit.pkg.package_id().source_id();
    let sources = bcx.packages.sources();
    let source = sources
        .get(source_id)
        .ok_or_else(|| internal("missing package source"))?;
    source.verify(unit.pkg.package_id())?;

    // Clear out the old fingerprint file if it exists. This protects when
    // compilation is interrupted leaving a corrupt file. For example, a
    // project with a lib.rs and integration test (two units):
    //
    // 1. Build the library and integration test.
    // 2. Make a change to lib.rs (NOT the integration test).
    // 3. Build the integration test, hit Ctrl-C while linking. With gcc, this
    //    will leave behind an incomplete executable (zero size, or partially
    //    written). NOTE: The library builds successfully, it is the linking
    //    of the integration test that we are interrupting.
    // 4. Build the integration test again.
    //
    // Without the following line, step 3 would leave a valid fingerprint
    // on the disk. Then step 4 would think the integration test is "fresh"
    // because:
    //
    // - There is a valid fingerprint hash on disk (written in step 1).
    // - The mtime of the output file (the corrupt integration executable
    //   written in step 3) is newer than all of its dependencies.
    // - The mtime of the integration test fingerprint dep-info file (written
    //   in step 1) is newer than the integration test's source files, because
    //   we haven't modified any of its source files.
    //
    // But the executable is corrupt and needs to be rebuilt. Clearing the
    // fingerprint at step 3 ensures that Cargo never mistakes a partially
    // written output as up-to-date.
    if loc.exists() {
        // Truncate instead of delete so that compare_old_fingerprint will
        // still log the reason for the fingerprint failure instead of just
        // reporting "failed to read fingerprint" during the next build if
        // this build fails.
        paths::write(&loc, b"")?;
    }

    let write_fingerprint = if unit.mode.is_run_custom_build() {
        // For build scripts the `local` field of the fingerprint may change
        // while we're executing it. For example it could be in the legacy
        // "consider everything a dependency" mode and then we switch to "deps
        // are explicitly specified" mode.
        //
        // To handle this movement we need to regenerate the `local` field of a
        // build script's fingerprint after it's executed. We do this by
        // using the `build_script_local_fingerprints` function which returns a
        // thunk we can invoke on a foreign thread to calculate this.
        let build_script_outputs = Arc::clone(&build_runner.build_script_outputs);
        let metadata = build_runner.get_run_build_script_metadata(unit);
        let (gen_local, _overridden) = build_script_local_fingerprints(build_runner, unit)?;
        let output_path = build_runner.build_explicit_deps[unit]
            .build_script_output
            .clone();
        Work::new(move |_| {
            let outputs = build_script_outputs.lock().unwrap();
            let output = outputs
                .get(metadata)
                .expect("output must exist after running");
            let deps = BuildDeps::new(&output_path, Some(output));

            // FIXME: it's basically buggy that we pass `None` to `call_box`
            // here. See documentation on `build_script_local_fingerprints`
            // below for more information. Despite this just try to proceed and
            // hobble along if it happens to return `Some`.
            if let Some(new_local) = (gen_local)(&deps, None)? {
                *fingerprint.local.lock().unwrap() = new_local;
            }

            write_fingerprint(&loc, &fingerprint)
        })
    } else {
        Work::new(move |_| write_fingerprint(&loc, &fingerprint))
    };

    Ok(Job::new_dirty(write_fingerprint, dirty_reason))
}

/// Dependency edge information for fingerprints. This is generated for each
559/// dependency and is stored in a [`Fingerprint`].
560#[derive(Clone)]
561struct DepFingerprint {
562    /// The hash of the package id that this dependency points to
563    pkg_id: u64,
564    /// The crate name we're using for this dependency, which if we change we'll
565    /// need to recompile!
566    name: InternedString,
567    /// Whether or not this dependency is flagged as a public dependency or not.
568    public: bool,
569    /// Whether or not this dependency is an rmeta dependency or a "full"
570    /// dependency. In the case of an rmeta dependency our dependency edge only
571    /// actually requires the rmeta from what we depend on, so when checking
572    /// mtime information all files other than the rmeta can be ignored.
573    only_requires_rmeta: bool,
574    /// The dependency's fingerprint we recursively point to, containing all the
575    /// other hash information we'd otherwise need.
576    fingerprint: Arc<Fingerprint>,
577}
578
/// A fingerprint can be considered to be a "short string" representing the
/// state of a world for a package.
///
/// If a fingerprint ever changes, then the package itself needs to be
/// recompiled. Inputs to the fingerprint include source code modifications,
/// compiler flags, compiler version, etc. This structure is not simply a
/// `String` because some fingerprints cannot be calculated lazily.
///
/// Path sources, for example, use the mtime of the corresponding dep-info file
/// as a fingerprint (all source files must be modified *before* this mtime).
/// This dep-info file is not generated, however, until after the crate is
/// compiled. As a result, this structure can be thought of as a fingerprint
/// to-be. The actual value can be calculated via [`hash_u64()`], but the operation
/// may fail as some files may not have been generated.
///
/// Note that dependencies are taken into account for fingerprints because rustc
/// requires that whenever an upstream crate is recompiled, all downstream
/// dependents are also recompiled. This is typically tracked through
/// [`DependencyQueue`], but it also needs to be retained here because Cargo can
/// be interrupted while executing, losing the state of the [`DependencyQueue`]
/// graph.
///
/// [`hash_u64()`]: crate::core::compiler::fingerprint::Fingerprint::hash_u64
/// [`DependencyQueue`]: crate::util::DependencyQueue
#[derive(Serialize, Deserialize)]
pub struct Fingerprint {
    /// Hash of the version of `rustc` used.
    rustc: u64,
    /// Sorted list of cfg features enabled.
    features: String,
    /// Sorted list of all the declared cfg features.
    declared_features: String,
    /// Hash of the `Target` struct, including the target name,
    /// package-relative source path, edition, etc.
    target: u64,
    /// Hash of the [`Profile`], [`CompileMode`], and any extra flags passed via
    /// `cargo rustc` or `cargo rustdoc`.
    ///
    /// [`Profile`]: crate::core::profiles::Profile
    /// [`CompileMode`]: crate::core::compiler::CompileMode
    profile: u64,
    /// Hash of the path to the base source file. This is relative to the
    /// workspace root for path members, or absolute for other sources.
    path: u64,
    /// Fingerprints of dependencies.
    deps: Vec<DepFingerprint>,
    /// Information about the inputs that affect this Unit (such as source
    /// file mtimes or build script environment variables).
    local: Mutex<Vec<LocalFingerprint>>,
    /// Cached hash of the [`Fingerprint`] struct. Used to improve performance
    /// for hashing.
    #[serde(skip)]
    memoized_hash: Mutex<Option<u64>>,
    /// RUSTFLAGS/RUSTDOCFLAGS environment variable value (or config value).
    rustflags: Vec<String>,
    /// Hash of various config settings that change how things are compiled.
    config: u64,
    /// The rustc target. This is only relevant for `.json` files, otherwise
    /// the metadata hash segregates the units.
    compile_kind: u64,
    /// Description of whether the filesystem status for this unit is up to date
    /// or should be considered stale.
    #[serde(skip)]
    fs_status: FsStatus,
    /// Files, relative to `target_root`, that are produced by the step that
    /// this `Fingerprint` represents. This is used to detect when the whole
    /// fingerprint is out of date because an output is missing, or because a
    /// previous fingerprint's output files were regenerated and look newer
    /// than this one.
    #[serde(skip)]
    outputs: Vec<PathBuf>,
}

/// Indication of the status on the filesystem for a particular unit.
#[derive(Clone, Default, Debug, Serialize)]
#[serde(tag = "fs_status", rename_all = "kebab-case")]
pub enum FsStatus {
    /// This unit is to be considered stale, even if hash information all
    /// matches.
    #[default]
    Stale,

    /// File system inputs have changed (or are missing), or there were
    /// changes to the environment variables that affect this unit. See
    /// the variants of [`StaleItem`] for more information.
    StaleItem(StaleItem),

    /// A dependency's output is newer than this unit's newest output
    /// (`dep_mtime > max_mtime`).
    StaleDependency {
        name: InternedString,
        #[serde(serialize_with = "serialize_file_time")]
        dep_mtime: FileTime,
        #[serde(serialize_with = "serialize_file_time")]
        max_mtime: FileTime,
    },

    /// A dependency's own filesystem status was not up to date.
    StaleDepFingerprint { name: InternedString },

    /// This unit is up-to-date. All outputs and their corresponding mtime are
    /// listed in the payload here for other dependencies to compare against.
    #[serde(skip)]
    UpToDate { mtimes: HashMap<PathBuf, FileTime> },
}

impl FsStatus {
    fn up_to_date(&self) -> bool {
        match self {
            FsStatus::UpToDate { .. } => true,
            FsStatus::Stale
            | FsStatus::StaleItem(_)
            | FsStatus::StaleDependency { .. }
            | FsStatus::StaleDepFingerprint { .. } => false,
        }
    }
}

/// Serializes a `FileTime` as a floating-point number of milliseconds,
/// including the sub-second nanoseconds.
fn serialize_file_time<S>(ft: &FileTime, s: S) -> Result<S::Ok, S::Error>
where
    S: serde::Serializer,
{
    let secs_as_millis = ft.unix_seconds() as f64 * 1000.0;
    let nanos_as_millis = ft.nanoseconds() as f64 / 1_000_000.0;
    (secs_as_millis + nanos_as_millis).serialize(s)
}

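// Illustrative sketch, not part of Cargo: the arithmetic performed by
// `serialize_file_time` above, written over plain integers so it can be
// checked by hand. The name `sketch_file_time_millis` is hypothetical; the
// real function reads both components from a `FileTime`.
#[allow(dead_code)]
fn sketch_file_time_millis(unix_seconds: i64, nanoseconds: u32) -> f64 {
    // Whole seconds become milliseconds.
    let secs_as_millis = unix_seconds as f64 * 1000.0;
    // Sub-second nanoseconds become (possibly fractional) milliseconds.
    let nanos_as_millis = nanoseconds as f64 / 1_000_000.0;
    secs_as_millis + nanos_as_millis
}
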
impl Serialize for DepFingerprint {
    fn serialize<S>(&self, ser: S) -> Result<S::Ok, S::Error>
    where
        S: ser::Serializer,
    {
        (
            &self.pkg_id,
            &self.name,
            &self.public,
            &self.fingerprint.hash_u64(),
        )
            .serialize(ser)
    }
}

impl<'de> Deserialize<'de> for DepFingerprint {
    fn deserialize<D>(d: D) -> Result<DepFingerprint, D::Error>
    where
        D: de::Deserializer<'de>,
    {
        let (pkg_id, name, public, hash) = <(u64, String, bool, u64)>::deserialize(d)?;
        Ok(DepFingerprint {
            pkg_id,
            name: name.into(),
            public,
            fingerprint: Arc::new(Fingerprint {
                memoized_hash: Mutex::new(Some(hash)),
                ..Fingerprint::new()
            }),
            // This field is never read since it's only used in
            // `check_filesystem` which isn't used by fingerprints loaded from
            // disk.
            only_requires_rmeta: false,
        })
    }
}

/// A `LocalFingerprint` represents something that we use to detect direct
/// changes to a `Fingerprint`.
///
/// This is where we track file information, env vars, etc. This
/// `LocalFingerprint` struct is hashed and if the hash changes will force a
/// recompile of any fingerprint it's included into. Note that the "local"
/// terminology comes from the fact that it only has to do with one crate, and
/// `Fingerprint` tracks the transitive propagation of fingerprint changes.
///
/// Note that because this is hashed its contents are carefully managed. As
/// mentioned in the module docs above, we don't want to hash absolute paths or
/// mtime information.
///
/// Also note that a `LocalFingerprint` is used in `check_filesystem` to detect
/// when the filesystem contains stale information (based on mtime currently).
/// The paths here don't change much between compilations but they're used as
/// inputs when we probe the filesystem looking at information.
#[derive(Debug, Serialize, Deserialize, Hash)]
enum LocalFingerprint {
    /// This is a precalculated fingerprint which has an opaque string we just
    /// hash as usual. This variant is primarily used for rustdoc where we
    /// don't have a dep-info file to compare against.
    ///
    /// This is also used for build scripts with no `rerun-if-*` statements, but
    /// that's overall a mistake and causes bugs in Cargo. We shouldn't use this
    /// for build scripts.
    Precalculated(String),

    /// This is used for crate compilations. The `dep_info` file is a relative
    /// path anchored at `target_root(...)` to the dep-info file that Cargo
    /// generates (which is a custom serialization after parsing rustc's own
    /// `dep-info` output).
    ///
    /// The `dep_info` file, when present, also lists a number of other files
    /// for us to look at. If any of those files are newer than this file then
    /// we need to recompile.
    ///
    /// If the `checksum` bool is true then the `dep_info` file is expected to
    /// contain file checksums instead of file mtimes.
    CheckDepInfo { dep_info: PathBuf, checksum: bool },

    /// This represents a nonempty set of `rerun-if-changed` annotations printed
    /// out by a build script. The `output` file is a relative file anchored at
    /// `target_root(...)` which is the actual output of the build script. That
    /// output has already been parsed and the paths printed out via
    /// `rerun-if-changed` are listed in `paths`. The `paths` field is relative
    /// to `pkg.root()`.
    ///
    /// This is considered up-to-date if all of the `paths` are older than
    /// `output`, otherwise we need to recompile.
    RerunIfChanged {
        output: PathBuf,
        paths: Vec<PathBuf>,
    },

    /// This represents a single `rerun-if-env-changed` annotation printed by a
    /// build script. The exact env var and value are hashed here. There's no
    /// filesystem dependence here, and if the values are changed the hash will
    /// change forcing a recompile.
    RerunIfEnvChanged { var: String, val: Option<String> },
}

/// See [`FsStatus::StaleItem`].
#[derive(Clone, Debug, Serialize)]
#[serde(tag = "stale_item", rename_all = "kebab-case")]
pub enum StaleItem {
    MissingFile {
        path: PathBuf,
    },
    UnableToReadFile {
        path: PathBuf,
    },
    FailedToReadMetadata {
        path: PathBuf,
    },
    FileSizeChanged {
        path: PathBuf,
        old_size: u64,
        new_size: u64,
    },
    ChangedFile {
        reference: PathBuf,
        #[serde(serialize_with = "serialize_file_time")]
        reference_mtime: FileTime,
        stale: PathBuf,
        #[serde(serialize_with = "serialize_file_time")]
        stale_mtime: FileTime,
    },
    ChangedChecksum {
        source: PathBuf,
        stored_checksum: Checksum,
        new_checksum: Checksum,
    },
    MissingChecksum {
        path: PathBuf,
    },
    ChangedEnv {
        var: String,
        previous: Option<String>,
        current: Option<String>,
    },
}

impl LocalFingerprint {
    /// Reads the environment variable of the given env `key`, and creates a new
    /// [`LocalFingerprint::RerunIfEnvChanged`] for it. The `env_config` is checked
    /// first to see if the env var is set in the config system, as some envs need
    /// to be overridden. If not, it will fall back to `std::env::var`.
    ///
    // TODO: `std::env::var` is allowed at this moment. Should figure out
    // whether it makes sense to read env from the env snapshot instead.
    #[allow(clippy::disallowed_methods)]
    fn from_env<K: AsRef<str>>(
        key: K,
        env_config: &Arc<HashMap<String, OsString>>,
    ) -> LocalFingerprint {
        let key = key.as_ref();
        let var = key.to_owned();
        let val = if let Some(val) = env_config.get(key) {
            val.to_str().map(ToOwned::to_owned)
        } else {
            env::var(key).ok()
        };
        LocalFingerprint::RerunIfEnvChanged { var, val }
    }

    /// Checks dynamically at runtime if this `LocalFingerprint` has a stale
    /// item inside of it.
    ///
    /// The main purpose of this function is to handle two different ways
    /// fingerprints can be invalidated:
    ///
    /// * One is that a dependency listed in rustc's dep-info files is invalid.
    ///   Note that these could either be env vars or files. We check both here.
    ///
    /// * Another is the `rerun-if-changed` directive from build scripts. This
    ///   is where we'll find whether files have actually changed.
    fn find_stale_item(
        &self,
        mtime_cache: &mut HashMap<PathBuf, FileTime>,
        checksum_cache: &mut HashMap<PathBuf, Checksum>,
        pkg: &Package,
        build_root: &Path,
        cargo_exe: &Path,
        gctx: &GlobalContext,
    ) -> CargoResult<Option<StaleItem>> {
        let pkg_root = pkg.root();
        match self {
            // We need to parse `dep_info` to learn about the crate's
            // dependencies.
            //
            // For each env var we see if our current process's env var still
            // matches, and for each file we see if any of them are newer than
            // the `dep_info` file itself whose mtime represents the start of
            // rustc.
            LocalFingerprint::CheckDepInfo { dep_info, checksum } => {
                let dep_info = build_root.join(dep_info);
                let Some(info) = parse_dep_info(pkg_root, build_root, &dep_info)? else {
                    return Ok(Some(StaleItem::MissingFile { path: dep_info }));
                };
                for (key, previous) in info.env.iter() {
                    if let Some(value) = pkg.manifest().metadata().env_var(key.as_str()) {
                        if Some(value.as_ref()) == previous.as_deref() {
                            continue;
                        }
                    }

                    let current = if key == CARGO_ENV {
                        Some(cargo_exe.to_str().ok_or_else(|| {
                            format_err!(
                                "cargo exe path {} must be valid UTF-8",
                                cargo_exe.display()
                            )
                        })?)
                    } else {
                        if let Some(value) = gctx.env_config()?.get(key) {
                            value.to_str()
                        } else {
                            gctx.get_env(key).ok()
                        }
                    };
                    if current == previous.as_deref() {
                        continue;
                    }
                    return Ok(Some(StaleItem::ChangedEnv {
                        var: key.clone(),
                        previous: previous.clone(),
                        current: current.map(Into::into),
                    }));
                }
                if *checksum {
                    Ok(find_stale_file(
                        mtime_cache,
                        checksum_cache,
                        &dep_info,
                        info.files.iter().map(|(file, checksum)| (file, *checksum)),
                        *checksum,
                    ))
                } else {
                    Ok(find_stale_file(
                        mtime_cache,
                        checksum_cache,
                        &dep_info,
                        info.files.into_keys().map(|p| (p, None)),
                        *checksum,
                    ))
                }
            }

            // We need to verify that no paths listed in `paths` are newer than
            // the `output` path itself, or the last time the build script ran.
            LocalFingerprint::RerunIfChanged { output, paths } => Ok(find_stale_file(
                mtime_cache,
                checksum_cache,
                &build_root.join(output),
                paths.iter().map(|p| (pkg_root.join(p), None)),
                false,
            )),

            // These have no dependencies on the filesystem, and their values
            // are included natively in the `Fingerprint` hash so there is
            // nothing to check for here.
            LocalFingerprint::RerunIfEnvChanged { .. } => Ok(None),
            LocalFingerprint::Precalculated(..) => Ok(None),
        }
    }

    fn kind(&self) -> &'static str {
        match self {
            LocalFingerprint::Precalculated(..) => "precalculated",
            LocalFingerprint::CheckDepInfo { .. } => "dep-info",
            LocalFingerprint::RerunIfChanged { .. } => "rerun-if-changed",
            LocalFingerprint::RerunIfEnvChanged { .. } => "rerun-if-env-changed",
        }
    }
}

impl Fingerprint {
    fn new() -> Fingerprint {
        Fingerprint {
            rustc: 0,
            target: 0,
            profile: 0,
            path: 0,
            features: String::new(),
            declared_features: String::new(),
            deps: Vec::new(),
            local: Mutex::new(Vec::new()),
            memoized_hash: Mutex::new(None),
            rustflags: Vec::new(),
            config: 0,
            compile_kind: 0,
            fs_status: FsStatus::Stale,
            outputs: Vec::new(),
        }
    }

    /// For performance reasons fingerprints will memoize their own hash, but
    /// there's also internal mutability with its `local` field which can
    /// change, for example with build scripts, during a build.
    ///
    /// This method can be used to bust all memoized hashes just before a build
    /// to ensure that after a build completes everything is up-to-date.
    pub fn clear_memoized(&self) {
        *self.memoized_hash.lock().unwrap() = None;
    }

    fn hash_u64(&self) -> u64 {
        if let Some(s) = *self.memoized_hash.lock().unwrap() {
            return s;
        }
        let ret = util::hash_u64(self);
        *self.memoized_hash.lock().unwrap() = Some(ret);
        ret
    }

    /// Compares this fingerprint with an old version which was previously
    /// serialized to filesystem.
    ///
    /// The purpose of this is exclusively to produce a diagnostic message
    /// [`DirtyReason`], indicating why we're recompiling something.
    fn compare(&self, old: &Fingerprint) -> DirtyReason {
        if self.rustc != old.rustc {
            return DirtyReason::RustcChanged;
        }
        if self.features != old.features {
            return DirtyReason::FeaturesChanged {
                old: old.features.clone(),
                new: self.features.clone(),
            };
        }
        if self.declared_features != old.declared_features {
            return DirtyReason::DeclaredFeaturesChanged {
                old: old.declared_features.clone(),
                new: self.declared_features.clone(),
            };
        }
        if self.target != old.target {
            return DirtyReason::TargetConfigurationChanged;
        }
        if self.path != old.path {
            return DirtyReason::PathToSourceChanged;
        }
        if self.profile != old.profile {
            return DirtyReason::ProfileConfigurationChanged;
        }
        if self.rustflags != old.rustflags {
            return DirtyReason::RustflagsChanged {
                old: old.rustflags.clone(),
                new: self.rustflags.clone(),
            };
        }
        if self.config != old.config {
            return DirtyReason::ConfigSettingsChanged;
        }
        if self.compile_kind != old.compile_kind {
            return DirtyReason::CompileKindChanged;
        }
        let my_local = self.local.lock().unwrap();
        let old_local = old.local.lock().unwrap();
        if my_local.len() != old_local.len() {
            return DirtyReason::LocalLengthsChanged;
        }
        for (new, old) in my_local.iter().zip(old_local.iter()) {
            match (new, old) {
                (LocalFingerprint::Precalculated(a), LocalFingerprint::Precalculated(b)) => {
                    if a != b {
                        return DirtyReason::PrecalculatedComponentsChanged {
                            old: b.to_string(),
                            new: a.to_string(),
                        };
                    }
                }
                (
                    LocalFingerprint::CheckDepInfo {
                        dep_info: a_dep,
                        checksum: checksum_a,
                    },
                    LocalFingerprint::CheckDepInfo {
                        dep_info: b_dep,
                        checksum: checksum_b,
                    },
                ) => {
                    if a_dep != b_dep {
                        return DirtyReason::DepInfoOutputChanged {
                            old: b_dep.clone(),
                            new: a_dep.clone(),
                        };
                    }
                    if checksum_a != checksum_b {
                        return DirtyReason::ChecksumUseChanged { old: *checksum_b };
                    }
                }
                (
                    LocalFingerprint::RerunIfChanged {
                        output: a_out,
                        paths: a_paths,
                    },
                    LocalFingerprint::RerunIfChanged {
                        output: b_out,
                        paths: b_paths,
                    },
                ) => {
                    if a_out != b_out {
                        return DirtyReason::RerunIfChangedOutputFileChanged {
                            old: b_out.clone(),
                            new: a_out.clone(),
                        };
                    }
                    if a_paths != b_paths {
                        return DirtyReason::RerunIfChangedOutputPathsChanged {
                            old: b_paths.clone(),
                            new: a_paths.clone(),
                        };
                    }
                }
                (
                    LocalFingerprint::RerunIfEnvChanged {
                        var: a_key,
                        val: a_value,
                    },
                    LocalFingerprint::RerunIfEnvChanged {
                        var: b_key,
                        val: b_value,
                    },
                ) => {
                    if *a_key != *b_key {
                        return DirtyReason::EnvVarsChanged {
                            old: b_key.clone(),
                            new: a_key.clone(),
                        };
                    }
                    if *a_value != *b_value {
                        return DirtyReason::EnvVarChanged {
                            name: a_key.clone(),
                            old_value: b_value.clone(),
                            new_value: a_value.clone(),
                        };
                    }
                }
                (a, b) => {
                    return DirtyReason::LocalFingerprintTypeChanged {
                        old: b.kind(),
                        new: a.kind(),
                    };
                }
            }
        }

        if self.deps.len() != old.deps.len() {
            return DirtyReason::NumberOfDependenciesChanged {
                old: old.deps.len(),
                new: self.deps.len(),
            };
        }
        for (a, b) in self.deps.iter().zip(old.deps.iter()) {
            if a.name != b.name {
                return DirtyReason::UnitDependencyNameChanged {
                    old: b.name,
                    new: a.name,
                };
            }

            if a.fingerprint.hash_u64() != b.fingerprint.hash_u64() {
                return DirtyReason::UnitDependencyInfoChanged {
                    new_name: a.name,
                    new_fingerprint: a.fingerprint.hash_u64(),
                    old_name: b.name,
                    old_fingerprint: b.fingerprint.hash_u64(),
                };
            }
        }

        if !self.fs_status.up_to_date() {
            return DirtyReason::FsStatusOutdated(self.fs_status.clone());
        }

        // This typically means some filesystem modifications happened or
        // something transitive was odd. In general we should strive to provide
        // a better error message than this, so if you see this message a lot it
        // likely means this method needs to be updated!
        DirtyReason::NothingObvious
    }

    /// Dynamically inspect the local filesystem to update the `fs_status` field
    /// of this `Fingerprint`.
    ///
    /// This function is used just after a `Fingerprint` is constructed to check
    /// the local state of the filesystem and propagate any dirtiness from
    /// dependencies up to this unit as well. This function assumes that the
    /// unit starts out as [`FsStatus::Stale`] and then it will optionally switch
    /// it to `UpToDate` if it can.
    fn check_filesystem(
        &mut self,
        mtime_cache: &mut HashMap<PathBuf, FileTime>,
        checksum_cache: &mut HashMap<PathBuf, Checksum>,
        pkg: &Package,
        build_root: &Path,
        cargo_exe: &Path,
        gctx: &GlobalContext,
    ) -> CargoResult<()> {
        assert!(!self.fs_status.up_to_date());

        let pkg_root = pkg.root();
        let mut mtimes = HashMap::new();

        // Get the `mtime` of all outputs. Optionally update their mtime
        // afterwards based on the `mtime_on_use` flag. Afterwards we want the
        // maximum mtime as it's the one we'll be comparing to inputs and
        // dependencies.
        for output in self.outputs.iter() {
            let Ok(mtime) = paths::mtime(output) else {
                // This path failed to report its `mtime`. It probably doesn't
                // exist, so leave ourselves as stale and bail out.
                let item = StaleItem::FailedToReadMetadata {
                    path: output.clone(),
                };
                self.fs_status = FsStatus::StaleItem(item);
                return Ok(());
            };
            assert!(mtimes.insert(output.clone(), mtime).is_none());
        }

        let opt_max = mtimes.iter().max_by_key(|kv| kv.1);
        let Some((max_path, max_mtime)) = opt_max else {
            // We had no output files. This means we're an overridden build
            // script and we're just always up to date because we aren't
            // watching the filesystem.
            self.fs_status = FsStatus::UpToDate { mtimes };
            return Ok(());
        };
        debug!(
            "max output mtime for {:?} is {:?} {}",
            pkg_root, max_path, max_mtime
        );

        for dep in self.deps.iter() {
            let dep_mtimes = match &dep.fingerprint.fs_status {
                FsStatus::UpToDate { mtimes } => mtimes,
                // If our dependency is stale, so are we, so bail out.
                FsStatus::Stale
                | FsStatus::StaleItem(_)
                | FsStatus::StaleDependency { .. }
                | FsStatus::StaleDepFingerprint { .. } => {
                    self.fs_status = FsStatus::StaleDepFingerprint { name: dep.name };
                    return Ok(());
                }
            };

            // If our dependency edge only requires the rmeta file to be present
            // then we only need to look at that one output file, otherwise we
            // need to consider all output files to see if we're out of date.
            let (dep_path, dep_mtime) = if dep.only_requires_rmeta {
                dep_mtimes
                    .iter()
                    .find(|(path, _mtime)| {
                        path.extension().and_then(|s| s.to_str()) == Some("rmeta")
                    })
                    .expect("failed to find rmeta")
            } else {
                match dep_mtimes.iter().max_by_key(|kv| kv.1) {
                    Some(dep_mtime) => dep_mtime,
                    // If our dependency is up to date and has no filesystem
                    // interactions, then we can move on to the next dependency.
                    None => continue,
                }
            };
            debug!(
                "max dep mtime for {:?} is {:?} {}",
                pkg_root, dep_path, dep_mtime
            );

            // If the dependency is newer than our own output then it was
            // recompiled previously. We transitively become stale ourselves in
            // that case, so bail out.
            //
            // Note that this comparison should probably be `>=`, not `>`, but
            // for why it's `>` see the discussion about #5918 below in
            // `find_stale_file`.
            if dep_mtime > max_mtime {
                info!(
                    "dependency on `{}` is newer than we are {} > {} {:?}",
                    dep.name, dep_mtime, max_mtime, pkg_root
                );

                self.fs_status = FsStatus::StaleDependency {
                    name: dep.name,
                    dep_mtime: *dep_mtime,
                    max_mtime: *max_mtime,
                };

                return Ok(());
            }
        }

        // If we reached this far then all dependencies are up to date. Check
        // all our `LocalFingerprint` information to see if we have any stale
        // files for this package itself. If we do find something, log a helpful
        // message and bail out so we stay stale.
        for local in self.local.get_mut().unwrap().iter() {
            if let Some(item) = local.find_stale_item(
                mtime_cache,
                checksum_cache,
                pkg,
                build_root,
                cargo_exe,
                gctx,
            )? {
                item.log();
                self.fs_status = FsStatus::StaleItem(item);
                return Ok(());
            }
        }

        // Everything was up to date! Record such.
        self.fs_status = FsStatus::UpToDate { mtimes };
        debug!("filesystem up-to-date {:?}", pkg_root);

        Ok(())
    }
}

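// Illustrative sketch, not part of Cargo: the dependency mtime rule used by
// `check_filesystem`, reduced to plain integers. `sketch_first_newer_dep` is a
// hypothetical helper. Timestamps are assumed to be simple `u64` ticks, and
// the `only_requires_rmeta` special case is deliberately left out.
#[allow(dead_code)]
fn sketch_first_newer_dep<'a>(
    own_output_mtimes: &[u64],
    dep_mtimes: &[(&'a str, u64)],
) -> Option<&'a str> {
    // With no outputs there is nothing to compare against, which mirrors the
    // "always up to date" early return for units without output files.
    let max_own = own_output_mtimes.iter().copied().max()?;
    // A dependency makes us stale only if it is strictly newer (`>`, not `>=`)
    // than our newest output.
    dep_mtimes
        .iter()
        .find(|(_, mtime)| *mtime > max_own)
        .map(|(name, _)| *name)
}
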
1323impl hash::Hash for Fingerprint {
1324    fn hash<H: Hasher>(&self, h: &mut H) {
1325        let Fingerprint {
1326            rustc,
1327            ref features,
1328            ref declared_features,
1329            target,
1330            path,
1331            profile,
1332            ref deps,
1333            ref local,
1334            config,
1335            compile_kind,
1336            ref rustflags,
1337            ..
1338        } = *self;
1339        let local = local.lock().unwrap();
1340        (
1341            rustc,
1342            features,
1343            declared_features,
1344            target,
1345            path,
1346            profile,
1347            &*local,
1348            config,
1349            compile_kind,
1350            rustflags,
1351        )
1352            .hash(h);
1353
1354        h.write_usize(deps.len());
1355        for DepFingerprint {
1356            pkg_id,
1357            name,
1358            public,
1359            fingerprint,
1360            only_requires_rmeta: _, // static property, no need to hash
1361        } in deps
1362        {
1363            pkg_id.hash(h);
1364            name.hash(h);
1365            public.hash(h);
1366            // use memoized dep hashes to avoid exponential blowup
1367            h.write_u64(fingerprint.hash_u64());
1368        }
1369    }
1370}
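
/*
The "memoized dep hashes" comment in the `Hash` impl above can be illustrated
with a minimal, standalone sketch. `Node`, its fields, and the use of
`DefaultHasher` are assumptions made for illustration only; they are not
Cargo's types, and Cargo's real `Fingerprint::hash_u64` may differ in detail.
The idea shown is that each node caches its own `u64` hash, and a parent feeds
in that cached value instead of re-hashing the whole dependency subtree.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for `Fingerprint`: one own value plus dependencies.
pub struct Node {
    pub value: u64,
    pub deps: Vec<Arc<Node>>,
    memoized: Mutex<Option<u64>>,
}

impl Node {
    pub fn new(value: u64, deps: Vec<Arc<Node>>) -> Arc<Node> {
        Arc::new(Node {
            value,
            deps,
            memoized: Mutex::new(None),
        })
    }

    /// Computes the hash once and caches it; later calls return the cache.
    pub fn hash_u64(&self) -> u64 {
        if let Some(h) = *self.memoized.lock().unwrap() {
            return h;
        }
        let mut hasher = DefaultHasher::new();
        self.value.hash(&mut hasher);
        hasher.write_usize(self.deps.len());
        for dep in &self.deps {
            // Feed in the dependency's cached u64, not its whole subtree.
            hasher.write_u64(dep.hash_u64());
        }
        let h = hasher.finish();
        *self.memoized.lock().unwrap() = Some(h);
        h
    }
}
```

With a shared dependency reachable through many paths, each node is hashed at
most once, so the work stays proportional to the number of nodes and edges.
*/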

impl DepFingerprint {
    fn new(
        build_runner: &mut BuildRunner<'_, '_>,
        parent: &Unit,
        dep: &UnitDep,
    ) -> CargoResult<DepFingerprint> {
        let fingerprint = calculate(build_runner, &dep.unit)?;
        // We need to be careful about what we hash here. We have a goal of
        // supporting renaming a project directory and not rebuilding
        // everything. To do that, however, we need to make sure that the cwd
        // doesn't make its way into any hashes, and one source of that is the
        // `SourceId` for `path` packages.
        //
        // We already have a requirement that `path` packages all have unique
        // names (sort of for this same reason), so if the package source is a
        // `path` then we just hash the name, but otherwise we hash the full
        // id as it won't change when the directory is renamed.
        let pkg_id = if dep.unit.pkg.package_id().source_id().is_path() {
            util::hash_u64(dep.unit.pkg.package_id().name())
        } else {
            util::hash_u64(dep.unit.pkg.package_id())
        };

        Ok(DepFingerprint {
            pkg_id,
            name: dep.extern_crate_name,
            public: dep.public,
            fingerprint,
            only_requires_rmeta: build_runner.only_requires_rmeta(parent, &dep.unit),
        })
    }
}
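
/*
The reasoning in `DepFingerprint::new` (hash only the name for `path`
packages, the full id otherwise) can be sketched in isolation. `PkgId`, its
fields, and the local `hash_u64` helper are hypothetical simplifications for
illustration, not Cargo's `PackageId` or `util::hash_u64`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Hypothetical, simplified package identity (not Cargo's `PackageId`).
#[derive(Hash)]
pub struct PkgId {
    pub name: String,
    pub version: String,
    /// For a path package this holds the directory; otherwise a URL.
    pub source: String,
    pub is_path: bool,
}

fn hash_u64<T: Hash>(value: &T) -> u64 {
    let mut hasher = DefaultHasher::new();
    value.hash(&mut hasher);
    hasher.finish()
}

/// Hash only the name for path packages so the directory never enters the
/// hash; hash the full identity for everything else.
pub fn dep_pkg_id_hash(id: &PkgId) -> u64 {
    if id.is_path {
        hash_u64(&id.name)
    } else {
        hash_u64(id)
    }
}
```

Under these assumptions, moving a path package to another directory leaves
its hash unchanged, while a version bump of a registry package changes it.
*/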

impl StaleItem {
    /// Use the `log` crate to log a hopefully helpful message in diagnosing
    /// what file is considered stale and why. This is intended to be used in
    /// conjunction with `CARGO_LOG` to determine why Cargo is recompiling
    /// something. Currently there's no user-facing usage of this other than
    /// that.
    fn log(&self) {
        match self {
            StaleItem::MissingFile { path } => {
                info!("stale: missing {:?}", path);
            }
            StaleItem::UnableToReadFile { path } => {
                info!("stale: unable to read {:?}", path);
            }
            StaleItem::FailedToReadMetadata { path } => {
                info!("stale: couldn't read metadata {:?}", path);
            }
            StaleItem::ChangedFile {
                reference,
                reference_mtime,
                stale,
                stale_mtime,
            } => {
                info!("stale: changed {:?}", stale);
                info!("          (vs) {:?}", reference);
                info!("               {:?} < {:?}", reference_mtime, stale_mtime);
            }
            StaleItem::FileSizeChanged {
                path,
                new_size,
                old_size,
            } => {
                info!("stale: changed {:?}", path);
                info!("prior file size {old_size}");
                info!("  new file size {new_size}");
            }
            StaleItem::ChangedChecksum {
                source,
                stored_checksum,
                new_checksum,
            } => {
                info!("stale: changed {:?}", source);
                info!("prior checksum {stored_checksum}");
                info!("  new checksum {new_checksum}");
            }
            StaleItem::MissingChecksum { path } => {
                info!("stale: no prior checksum {:?}", path);
            }
            StaleItem::ChangedEnv {
                var,
                previous,
                current,
            } => {
                info!("stale: changed env {:?}", var);
                info!("       {:?} != {:?}", previous, current);
            }
        }
    }
}

/// Calculates the fingerprint for a [`Unit`].
///
/// This fingerprint is how Cargo learns that one of the following has
/// happened:
///
/// * A non-path package changes (changes version, changes revision, etc).
/// * Any dependency changes
/// * The compiler changes
/// * The set of features a package is built with changes
/// * The profile a target is compiled with changes (e.g., opt-level changes)
/// * Any other compiler flags change that will affect the result
///
/// Information like file modification time is only calculated for path
/// dependencies.
fn calculate(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> CargoResult<Arc<Fingerprint>> {
    // This function is slammed quite a lot, so the result is memoized.
    if let Some(s) = build_runner.fingerprints.get(unit) {
        return Ok(Arc::clone(s));
    }
    let mut fingerprint = if unit.mode.is_run_custom_build() {
        calculate_run_custom_build(build_runner, unit)?
    } else if unit.mode.is_doc_test() {
        panic!("doc tests do not fingerprint");
    } else {
        calculate_normal(build_runner, unit)?
    };

    // After building the initial `Fingerprint`, be sure to update its
    // `fs_status` field.
    let build_root = build_root(build_runner);
    let cargo_exe = build_runner.bcx.gctx.cargo_exe()?;
    fingerprint.check_filesystem(
        &mut build_runner.mtime_cache,
        &mut build_runner.checksum_cache,
        &unit.pkg,
        &build_root,
        cargo_exe,
        build_runner.bcx.gctx,
    )?;

    let fingerprint = Arc::new(fingerprint);
    build_runner
        .fingerprints
        .insert(unit.clone(), Arc::clone(&fingerprint));
    Ok(fingerprint)
}

/// Calculate a fingerprint for a "normal" unit, or anything that's not a build
/// script. This is an internal helper of [`calculate`], don't call directly.
fn calculate_normal(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<Fingerprint> {
    let deps = {
        // Recursively calculate the fingerprint for all of our dependencies.
        //
        // Skip fingerprints of binaries because they don't actually induce a
        // recompile, they're just dependencies in the sense that they need to be
        // built. The only exception is artifact dependencies, which are
        // actual dependencies that need a recompile.
        //
        // Create Vec since mutable build_runner is needed in closure.
        let deps = Vec::from(build_runner.unit_deps(unit));
        let mut deps = deps
            .into_iter()
            .filter(|dep| !dep.unit.target.is_bin() || dep.unit.artifact.is_true())
            .map(|dep| DepFingerprint::new(build_runner, unit, &dep))
            .collect::<CargoResult<Vec<_>>>()?;
        deps.sort_by(|a, b| a.pkg_id.cmp(&b.pkg_id));
        deps
    };

    // Afterwards calculate our own fingerprint information.
    let build_root = build_root(build_runner);
    let is_any_doc_gen = unit.mode.is_doc() || unit.mode.is_doc_scrape();
    let rustdoc_depinfo_enabled = build_runner.bcx.gctx.cli_unstable().rustdoc_depinfo;
    let local = if is_any_doc_gen && !rustdoc_depinfo_enabled {
        // rustdoc does not have dep-info files.
        let fingerprint = pkg_fingerprint(build_runner.bcx, &unit.pkg).with_context(|| {
            format!(
                "failed to determine package fingerprint for documenting {}",
                unit.pkg
            )
        })?;
        vec![LocalFingerprint::Precalculated(fingerprint)]
    } else {
        let dep_info = dep_info_loc(build_runner, unit);
        let dep_info = dep_info.strip_prefix(&build_root).unwrap().to_path_buf();
        vec![LocalFingerprint::CheckDepInfo {
            dep_info,
            checksum: build_runner.bcx.gctx.cli_unstable().checksum_freshness,
        }]
    };

    // Figure out what the outputs of our unit are, and we'll be storing them
    // into the fingerprint as well.
    let outputs = build_runner
        .outputs(unit)?
        .iter()
        .filter(|output| {
            !matches!(
                output.flavor,
                FileFlavor::DebugInfo | FileFlavor::Auxiliary | FileFlavor::Sbom
            )
        })
        .map(|output| output.path.clone())
        .collect();

    // Fill out a bunch more information that we'll be tracking, typically
    // hashed to take up less space on disk as we just need to know when things
    // change.
    let extra_flags = if unit.mode.is_doc() || unit.mode.is_doc_scrape() {
        &unit.rustdocflags
    } else {
        &unit.rustflags
    }
    .to_vec();

    let profile_hash = util::hash_u64((
        &unit.profile,
        unit.mode,
        build_runner.bcx.extra_args_for(unit),
        build_runner.lto[unit],
        unit.pkg.manifest().lint_rustflags(),
    ));
    let mut config = StableHasher::new();
    if let Some(linker) = build_runner.compilation.target_linker(unit.kind) {
        linker.hash(&mut config);
    }
    if unit.mode.is_doc() && build_runner.bcx.gctx.cli_unstable().rustdoc_map {
        if let Ok(map) = build_runner.bcx.gctx.doc_extern_map() {
            map.hash(&mut config);
        }
    }
    if let Some(allow_features) = &build_runner.bcx.gctx.cli_unstable().allow_features {
        allow_features.hash(&mut config);
    }
    let compile_kind = unit.kind.fingerprint_hash();
    let mut declared_features = unit.pkg.summary().features().keys().collect::<Vec<_>>();
    // Sort to avoid useless rebuilds if the user orders the features
    // differently.
    declared_features.sort();
    Ok(Fingerprint {
        rustc: util::hash_u64(&build_runner.bcx.rustc().verbose_version),
        target: util::hash_u64(&unit.target),
        profile: profile_hash,
        // Note that .0 is hashed here, not .1 which is the cwd. That doesn't
        // actually affect the output artifact so there's no need to hash it.
        path: util::hash_u64(path_args(build_runner.bcx.ws, unit).0),
        features: format!("{:?}", unit.features),
        declared_features: format!("{declared_features:?}"),
        deps,
        local: Mutex::new(local),
        memoized_hash: Mutex::new(None),
        config: Hasher::finish(&config),
        compile_kind,
        rustflags: extra_flags,
        fs_status: FsStatus::Stale,
        outputs,
    })
}

/// Calculate a fingerprint for an "execute a build script" unit.  This is an
/// internal helper of [`calculate`], don't call directly.
fn calculate_run_custom_build(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<Fingerprint> {
    assert!(unit.mode.is_run_custom_build());
    // Using the `BuildDeps` information we'll have previously parsed and
    // inserted into `build_explicit_deps`, build an initial snapshot of the
    // `LocalFingerprint` list for this build script. If we previously executed
    // the build script this means we'll be watching files and env vars.
    // Otherwise if we haven't previously executed it we'll just start watching
    // the whole crate.
    let (gen_local, overridden) = build_script_local_fingerprints(build_runner, unit)?;
    let deps = &build_runner.build_explicit_deps[unit];
    let local = (gen_local)(
        deps,
        Some(&|| {
            const IO_ERR_MESSAGE: &str = "\
An I/O error happened. Please make sure you can access the file.

By default, if your project contains a build script, cargo scans all files in
it to determine whether a rebuild is needed. If you don't expect to access the
file, specify `rerun-if-changed` in your build script.
See https://doc.rust-lang.org/cargo/reference/build-scripts.html#rerun-if-changed for more information.";
            pkg_fingerprint(build_runner.bcx, &unit.pkg).map_err(|err| {
                let mut message = format!("failed to determine package fingerprint for build script for {}", unit.pkg);
                if err.root_cause().is::<io::Error>() {
                    message = format!("{}\n{}", message, IO_ERR_MESSAGE)
                }
                err.context(message)
            })
        }),
    )?
    .unwrap();
    let output = deps.build_script_output.clone();

    // Include any dependencies of our execution, which is typically just the
    // compilation of the build script itself. (if the build script changes we
    // should be rerun!). Note though that if we're an overridden build script
    // we have no dependencies so no need to recurse in that case.
    let deps = if overridden {
        // Overridden build scripts don't need to track deps.
        vec![]
    } else {
        // Create Vec since mutable build_runner is needed in closure.
        let deps = Vec::from(build_runner.unit_deps(unit));
        deps.into_iter()
            .map(|dep| DepFingerprint::new(build_runner, unit, &dep))
            .collect::<CargoResult<Vec<_>>>()?
    };

    let rustflags = unit.rustflags.to_vec();

    Ok(Fingerprint {
        local: Mutex::new(local),
        rustc: util::hash_u64(&build_runner.bcx.rustc().verbose_version),
        deps,
        outputs: if overridden { Vec::new() } else { vec![output] },
        rustflags,

        // Most of the other info is blank here as we don't really include it
        // in the execution of the build script, but... this may be a latent
        // bug in Cargo.
        ..Fingerprint::new()
    })
}

/// Get ready to compute the [`LocalFingerprint`] values
/// for a [`RunCustomBuild`] unit.
///
/// This function has what is, on the surface, a seriously wonky interface.
/// You'll call this function and it'll return a closure and a boolean. The
/// boolean is pretty simple in that it indicates whether the `unit` has been
/// overridden via `.cargo/config.toml`. The closure is much more complicated.
///
/// This closure is intended to capture any local state necessary to compute
/// the `LocalFingerprint` values for this unit. It is `Send` and `'static` to
/// be sent to other threads as well (such as when we're executing build
/// scripts). That deduplication is the rationale for the closure at least.
///
/// The arguments to the closure are a bit weirder, though, and I'll apologize
/// in advance for the weirdness too. The first argument to the closure is a
/// `&BuildDeps`. This is the parsed version of a build script, and when Cargo
/// starts up this is cached from previous runs of a build script.  After a
/// build script executes the output file is reparsed and passed in here.
///
/// The second argument is the weirdest: it's *optionally* a closure to
/// call [`pkg_fingerprint`]. `pkg_fingerprint` requires access to the
/// "source map" located in `Context`. That's very non-`'static` and
/// non-`Send`, so it can't be used on other threads, such as when we invoke
/// this after a build script has finished. The `Option` allows us to reliably
/// calculate it on the main thread at the beginning, and then swallow the bug
/// for now where a worker thread after a build script has finished doesn't
/// have access. Ideally there would be no second argument or it would be more
/// "first class" and not an `Option` but something that can be sent between
/// threads. In any case, it's a bug for now.
///
/// This isn't the greatest of interfaces, and if you have suggestions for
/// improving it, please share them!
///
/// FIXME(#6779) - see all the words above
///
/// [`RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
fn build_script_local_fingerprints(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> CargoResult<(
    Box<
        dyn FnOnce(
                &BuildDeps,
                Option<&dyn Fn() -> CargoResult<String>>,
            ) -> CargoResult<Option<Vec<LocalFingerprint>>>
            + Send,
    >,
    bool,
)> {
    assert!(unit.mode.is_run_custom_build());
    // First up, if this build script is entirely overridden, then we just
    // return the hash of what we overrode it with. This is the easy case!
    if let Some(fingerprint) = build_script_override_fingerprint(build_runner, unit) {
        debug!("override local fingerprints deps {}", unit.pkg);
        return Ok((
            Box::new(
                move |_: &BuildDeps, _: Option<&dyn Fn() -> CargoResult<String>>| {
                    Ok(Some(vec![fingerprint]))
                },
            ),
            true, // this is an overridden build script
        ));
    }

    // ... Otherwise this is a "real" build script and we need to return a real
    // closure. Our returned closure classifies the build script based on
    // whether it prints `rerun-if-*`. If it *doesn't* print this it's where the
    // magical second argument comes into play, which fingerprints a whole
    // package. Remember that the fact that this is an `Option` is a bug, but a
    // longstanding bug, in Cargo. Recent refactorings just made it painfully
    // obvious.
    let pkg_root = unit.pkg.root().to_path_buf();
    let build_dir = build_root(build_runner);
    let env_config = Arc::clone(build_runner.bcx.gctx.env_config()?);
    let calculate =
        move |deps: &BuildDeps, pkg_fingerprint: Option<&dyn Fn() -> CargoResult<String>>| {
            if deps.rerun_if_changed.is_empty() && deps.rerun_if_env_changed.is_empty() {
                match pkg_fingerprint {
                    // FIXME: this is somewhat buggy with respect to docker and
                    // weird filesystems. The `Precalculated` variant
                    // constructed below will, for `path` dependencies, contain
                    // a stringified version of the mtime for the local crate.
                    // This violates one of the things we describe in this
                    // module's doc comment, never hashing mtimes. We should
                    // figure out a better scheme where a package fingerprint
                    // may be a string (like for a registry) or a list of files
                    // (like for a path dependency). That list of files would
                    // be stored here rather than the mtime of them.
                    Some(f) => {
                        let s = f()?;
                        debug!(
                            "old local fingerprints deps {:?} precalculated={:?}",
                            pkg_root, s
                        );
                        return Ok(Some(vec![LocalFingerprint::Precalculated(s)]));
                    }
                    None => return Ok(None),
                }
            }

            // Ok so now we're in "new mode" where we can have files listed as
            // dependencies as well as env vars listed as dependencies. Process
            // them all here.
            Ok(Some(local_fingerprints_deps(
                deps,
                &build_dir,
                &pkg_root,
                &env_config,
            )))
        };

    // Note that `false` == "not overridden"
    Ok((Box::new(calculate), false))
}
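
/*
The classification performed by the closure returned above can be sketched on
its own. `Local`, `classify`, and the `String`-based error type are
hypothetical simplifications for illustration; the real closure works with
`BuildDeps`, `LocalFingerprint`, and `CargoResult`.

```rust
/// Hypothetical, simplified stand-in for `LocalFingerprint`.
#[derive(Debug, PartialEq)]
pub enum Local {
    /// Whole-package fingerprint, used when no `rerun-if-*` was printed.
    Precalculated(String),
    /// Explicit files and env vars listed by the build script.
    Listed(Vec<String>),
}

pub fn classify(
    rerun_if_changed: &[String],
    rerun_if_env_changed: &[String],
    pkg_fingerprint: Option<&dyn Fn() -> Result<String, String>>,
) -> Result<Option<Vec<Local>>, String> {
    if rerun_if_changed.is_empty() && rerun_if_env_changed.is_empty() {
        // "Old mode": fall back to fingerprinting the whole package, which is
        // only possible when the optional callback was supplied.
        return match pkg_fingerprint {
            Some(f) => Ok(Some(vec![Local::Precalculated(f()?)])),
            None => Ok(None),
        };
    }
    // "New mode": track exactly what the build script asked for.
    let mut listed = rerun_if_changed.to_vec();
    listed.extend_from_slice(rerun_if_env_changed);
    Ok(Some(vec![Local::Listed(listed)]))
}
```

The `Ok(None)` result is the case the doc comment calls a bug: the caller had
no way to fingerprint the whole package from where it was running.
*/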

/// Create a [`LocalFingerprint`] for an overridden build script.
/// Returns None if it is not overridden.
fn build_script_override_fingerprint(
    build_runner: &mut BuildRunner<'_, '_>,
    unit: &Unit,
) -> Option<LocalFingerprint> {
    // Build script output is only populated at this stage when it is
    // overridden.
    let build_script_outputs = build_runner.build_script_outputs.lock().unwrap();
    let metadata = build_runner.get_run_build_script_metadata(unit);
    // Returns None if it is not overridden.
    let output = build_script_outputs.get(metadata)?;
    let s = format!(
        "overridden build state with hash: {}",
        util::hash_u64(output)
    );
    Some(LocalFingerprint::Precalculated(s))
}

/// Compute the [`LocalFingerprint`] values for a [`RunCustomBuild`] unit for
/// non-overridden new-style build scripts only. This is only used when `deps`
/// is already known to have a nonempty `rerun-if-*` somewhere.
///
/// [`RunCustomBuild`]: crate::core::compiler::CompileMode::RunCustomBuild
fn local_fingerprints_deps(
    deps: &BuildDeps,
    build_root: &Path,
    pkg_root: &Path,
    env_config: &Arc<HashMap<String, OsString>>,
) -> Vec<LocalFingerprint> {
    debug!("new local fingerprints deps {:?}", pkg_root);
    let mut local = Vec::new();

    if !deps.rerun_if_changed.is_empty() {
        // Note that like the module comment above says we are careful to never
        // store an absolute path in `LocalFingerprint`, so ensure that we strip
        // absolute prefixes from them.
        let output = deps
            .build_script_output
            .strip_prefix(build_root)
            .unwrap()
            .to_path_buf();
        let paths = deps
            .rerun_if_changed
            .iter()
            .map(|p| p.strip_prefix(pkg_root).unwrap_or(p).to_path_buf())
            .collect();
        local.push(LocalFingerprint::RerunIfChanged { output, paths });
    }

    local.extend(
        deps.rerun_if_env_changed
            .iter()
            .map(|s| LocalFingerprint::from_env(s, env_config)),
    );

    local
}
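
/*
The prefix stripping in `local_fingerprints_deps` relies on
`Path::strip_prefix` failing for paths outside the package root, in which case
the original path is kept. A small standalone sketch of that pattern follows;
`relativize` is a hypothetical helper name, not a Cargo function.

```rust
use std::path::{Path, PathBuf};

/// Make each path relative to `pkg_root` when possible; paths that do not
/// live under `pkg_root` are kept unchanged.
pub fn relativize(pkg_root: &Path, paths: &[PathBuf]) -> Vec<PathBuf> {
    paths
        .iter()
        .map(|p| p.strip_prefix(pkg_root).unwrap_or(p).to_path_buf())
        .collect()
}
```

Paths inside the package become relative, so they stay stable if the project
directory is renamed. A path outside it (for example a system header named in
`rerun-if-changed`) stays as written.
*/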

/// Writes the short fingerprint hash value to `<loc>`
/// and logs detailed JSON information to `<loc>.json`.
fn write_fingerprint(loc: &Path, fingerprint: &Fingerprint) -> CargoResult<()> {
    debug_assert_ne!(fingerprint.rustc, 0);
    // `Fingerprint::new().rustc == 0`; make sure it doesn't make it to the file system.
    // This is mostly so outside tools can reliably find out what rust version this file is for,
    // as we can use the full hash.
    let hash = fingerprint.hash_u64();
    debug!("write fingerprint ({:x}) : {}", hash, loc.display());
    paths::write(loc, util::to_hex(hash).as_bytes())?;

    let json = serde_json::to_string(fingerprint).unwrap();
    if cfg!(debug_assertions) {
        let f: Fingerprint = serde_json::from_str(&json).unwrap();
        assert_eq!(f.hash_u64(), hash);
    }
    paths::write(&loc.with_extension("json"), json.as_bytes())?;
    Ok(())
}

/// Prepare for work when a package starts to build.
pub fn prepare_init(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> CargoResult<()> {
    let new1 = build_runner.files().fingerprint_dir(unit);

    // Doc tests have no output, thus no fingerprint.
    if !new1.exists() && !unit.mode.is_doc_test() {
        paths::create_dir_all(&new1)?;
    }

    Ok(())
}

/// Returns the location that the dep-info file will show up at
/// for the [`Unit`] specified.
pub fn dep_info_loc(build_runner: &mut BuildRunner<'_, '_>, unit: &Unit) -> PathBuf {
    build_runner.files().fingerprint_file_path(unit, "dep-")
}

/// Returns the absolute path of the build directory.
/// All paths are rewritten to be relative to this.
fn build_root(build_runner: &BuildRunner<'_, '_>) -> PathBuf {
    build_runner.bcx.ws.build_dir().into_path_unlocked()
}

/// Reads the value from the old fingerprint hash file and compares it with
/// the new fingerprint.
///
/// If dirty, it then restores the detailed information
/// from the fingerprint JSON file, and provides a rich dirty reason.
fn compare_old_fingerprint(
    unit: &Unit,
    old_hash_path: &Path,
    new_fingerprint: &Fingerprint,
    mtime_on_use: bool,
    forced: bool,
) -> Option<DirtyReason> {
    if mtime_on_use {
        // update the mtime so other cleaners know we used it
        let t = FileTime::from_system_time(SystemTime::now());
        debug!("mtime-on-use forcing {:?} to {}", old_hash_path, t);
        paths::set_file_time_no_err(old_hash_path, t);
    }

    let compare = _compare_old_fingerprint(old_hash_path, new_fingerprint);

    match compare.as_ref() {
        Ok(None) => {}
        Ok(Some(reason)) => {
            info!(
                "fingerprint dirty for {}/{:?}/{:?}",
                unit.pkg, unit.mode, unit.target,
            );
            info!("    dirty: {reason:?}");
        }
        Err(e) => {
            info!(
                "fingerprint error for {}/{:?}/{:?}",
                unit.pkg, unit.mode, unit.target,
            );
            info!("    err: {e:?}");
        }
    }

    match compare {
        Ok(None) if forced => Some(DirtyReason::Forced),
        Ok(reason) => reason,
        Err(_) => Some(DirtyReason::FreshBuild),
    }
}
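
/*
The final `match` in `compare_old_fingerprint` folds three outcomes into an
optional dirty reason. The sketch below isolates that mapping; `Dirty`,
`resolve`, and the `String` error type are hypothetical simplifications of
`DirtyReason` and `CargoResult`, used only for illustration.

```rust
/// Hypothetical, simplified stand-in for `DirtyReason`.
#[derive(Debug, PartialEq)]
pub enum Dirty {
    Forced,
    FreshBuild,
    Changed(String),
}

/// `None` means fresh; `Some(..)` explains why a rebuild is needed.
pub fn resolve(compare: Result<Option<Dirty>, String>, forced: bool) -> Option<Dirty> {
    match compare {
        // Fingerprints match, but the caller demanded a rebuild anyway.
        Ok(None) if forced => Some(Dirty::Forced),
        // Either fresh (`None`) or dirty with a concrete reason.
        Ok(reason) => reason,
        // The old fingerprint could not be read: treat it as a first build.
        Err(_) => Some(Dirty::FreshBuild),
    }
}
```

Note that `forced` only matters when the unit would otherwise be fresh; a
concrete dirty reason or a read error takes precedence.
*/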

fn _compare_old_fingerprint(
    old_hash_path: &Path,
    new_fingerprint: &Fingerprint,
) -> CargoResult<Option<DirtyReason>> {
    let old_fingerprint_short = paths::read(old_hash_path)?;

    let new_hash = new_fingerprint.hash_u64();

    if util::to_hex(new_hash) == old_fingerprint_short && new_fingerprint.fs_status.up_to_date() {
        return Ok(None);
    }

    let old_fingerprint_json = paths::read(&old_hash_path.with_extension("json"))?;
    let old_fingerprint: Fingerprint = serde_json::from_str(&old_fingerprint_json)
        .with_context(|| internal("failed to deserialize json"))?;
    // Fingerprint can be empty after a failed rebuild (see comment in prepare_target).
    if !old_fingerprint_short.is_empty() {
        debug_assert_eq!(
            util::to_hex(old_fingerprint.hash_u64()),
            old_fingerprint_short
        );
    }

    Ok(Some(new_fingerprint.compare(&old_fingerprint)))
}

/// Calculates the fingerprint of a unit that contains no dep-info files.
fn pkg_fingerprint(bcx: &BuildContext<'_, '_>, pkg: &Package) -> CargoResult<String> {
    let source_id = pkg.package_id().source_id();
    let sources = bcx.packages.sources();

    let source = sources
        .get(source_id)
        .ok_or_else(|| internal("missing package source"))?;
    source.fingerprint(pkg)
}
1992
1993/// The `reference` file is considered as "stale" if any file from `paths` has a newer mtime.
1994fn find_stale_file<I, P>(
1995    mtime_cache: &mut HashMap<PathBuf, FileTime>,
1996    checksum_cache: &mut HashMap<PathBuf, Checksum>,
1997    reference: &Path,
1998    paths: I,
1999    use_checksums: bool,
2000) -> Option<StaleItem>
2001where
2002    I: IntoIterator<Item = (P, Option<(u64, Checksum)>)>,
2003    P: AsRef<Path>,
2004{
2005    let reference_mtime = match paths::mtime(reference) {
2006        Ok(mtime) => mtime,
2007        Err(..) => {
2008            return Some(StaleItem::MissingFile {
2009                path: reference.to_path_buf(),
2010            });
2011        }
2012    };
2013
2014    let skippable_dirs = if let Ok(cargo_home) = home::cargo_home() {
2015        let skippable_dirs: Vec<_> = ["git", "registry"]
2016            .into_iter()
2017            .map(|subfolder| cargo_home.join(subfolder))
2018            .collect();
2019        Some(skippable_dirs)
2020    } else {
2021        None
2022    };
2023    for (path, prior_checksum) in paths {
2024        let path = path.as_ref();
2025
2026        // Assuming anything in cargo_home/{git, registry} is immutable
2027        // (see also #9455 about marking the src directory readonly) which avoids rebuilds when CI
2028        // caches $CARGO_HOME/registry/{index, cache} and $CARGO_HOME/git/db across runs, keeping
2029        // the content the same but changing the mtime.
2030        if let Some(ref skippable_dirs) = skippable_dirs {
2031            if skippable_dirs.iter().any(|dir| path.starts_with(dir)) {
2032                continue;
2033            }
2034        }
2035        if use_checksums {
2036            let Some((file_len, prior_checksum)) = prior_checksum else {
2037                return Some(StaleItem::MissingChecksum {
2038                    path: path.to_path_buf(),
2039                });
2040            };
2041            let path_buf = path.to_path_buf();
2042
2043            let path_checksum = match checksum_cache.entry(path_buf) {
2044                Entry::Occupied(o) => *o.get(),
2045                Entry::Vacant(v) => {
                    let Ok(current_file_len) = fs::metadata(&path).map(|m| m.len()) else {
                        return Some(StaleItem::FailedToReadMetadata {
                            path: path.to_path_buf(),
                        });
                    };
                    if current_file_len != file_len {
                        return Some(StaleItem::FileSizeChanged {
                            path: path.to_path_buf(),
                            new_size: current_file_len,
                            old_size: file_len,
                        });
                    }
                    let Ok(file) = File::open(path) else {
                        return Some(StaleItem::MissingFile {
                            path: path.to_path_buf(),
                        });
                    };
                    let Ok(checksum) = Checksum::compute(prior_checksum.algo(), file) else {
                        return Some(StaleItem::UnableToReadFile {
                            path: path.to_path_buf(),
                        });
                    };
                    *v.insert(checksum)
                }
            };
            if path_checksum == prior_checksum {
                continue;
            }
            return Some(StaleItem::ChangedChecksum {
                source: path.to_path_buf(),
                stored_checksum: prior_checksum,
                new_checksum: path_checksum,
            });
        } else {
            let path_mtime = match mtime_cache.entry(path.to_path_buf()) {
                Entry::Occupied(o) => *o.get(),
                Entry::Vacant(v) => {
                    let Ok(mtime) = paths::mtime_recursive(path) else {
                        return Some(StaleItem::MissingFile {
                            path: path.to_path_buf(),
                        });
                    };
                    *v.insert(mtime)
                }
            };

            // TODO: fix #5918.
            // Note that equal mtimes should be considered "stale". For filesystems with
            // coarse timestamp precision (such as 1s), this would be a conservative
            // approximation to handle the case where a file is modified within the same
            // second after a build starts. We want to make sure that incremental rebuilds
            // pick that up!
            //
            // For filesystems with nanosecond precision it's been seen in the wild that
            // their "nanosecond precision" isn't really nanosecond-accurate. It turns out that
            // kernels may cache the current time, so files created at different times can
            // actually carry the same nanosecond timestamp. Some digging on #5919 picked up
            // that the kernel caches the current time between timer ticks, which could mean
            // that if a file is updated at most 10ms after a build starts then Cargo may not
            // pick up the build changes.
            //
            // All in all, an equality check here would be a conservative assumption that,
            // if equal, files were changed just after a previous build finished.
            // Unfortunately this became problematic when (in #6484) Cargo switched to
            // measuring the start time of builds more accurately, so the check below
            // currently treats equal mtimes as fresh.
            if path_mtime <= reference_mtime {
                continue;
            }

            return Some(StaleItem::ChangedFile {
                reference: reference.to_path_buf(),
                reference_mtime,
                stale: path.to_path_buf(),
                stale_mtime: path_mtime,
            });
        }
    }

    debug!(
        "all paths up-to-date relative to {:?} mtime={}",
        reference, reference_mtime
    );
    None
}