//! This module manages how the incremental compilation cache is represented in
//! the file system.
//!
//! Incremental compilation caches are managed according to a copy-on-write
//! strategy: Once a complete, consistent cache version is finalized, it is
//! never modified. Instead, when a subsequent compilation session is started,
//! the compiler will allocate a new version of the cache that starts out as
//! a copy of the previous version. Then only this new copy is modified and it
//! will not be visible to other processes until it is finalized. This ensures
//! that multiple compiler processes can be executed concurrently for the same
//! crate without interfering with each other or blocking each other.
//!
//! More concretely this is implemented via the following protocol:
//!
//! 1. For a newly started compilation session, the compiler allocates a
//!    new `session` directory within the incremental compilation directory.
//!    This session directory will have a unique name that ends with the suffix
//!    "-working" and that contains a creation timestamp.
//! 2. Next, the compiler looks for the newest finalized session directory,
//!    that is, a session directory from a previous compilation session that
//!    has been marked as valid and consistent. A session directory is
//!    considered finalized if the "-working" suffix in the directory name has
//!    been replaced by the SVH of the crate.
//! 3. Once the compiler has found a valid, finalized session directory, it will
//!    hard-link/copy its contents into the new "-working" directory. If all
//!    goes well, it will have its own, private copy of the source directory and
//!    subsequently not have to worry about synchronizing with other compiler
//!    processes.
//! 4. Now the compiler can do its normal compilation process, which involves
//!    reading and updating its private session directory.
//! 5. When compilation finishes without errors, the private session directory
//!    will be in a state where it can be used as input for other compilation
//!    sessions. That is, it will contain a dependency graph and cache artifacts
//!    that are consistent with the state of the source code it was compiled
//!    from, with no need to change them ever again. At this point, the compiler
//!    finalizes and "publishes" its private session directory by renaming it
//!    from "s-{timestamp}-{random}-working" to "s-{timestamp}-{SVH}".
//! 6. At this point the "old" session directory that we copied our data from
//!    at the beginning of the session has become obsolete because we have just
//!    published a more current version. Thus the compiler will delete it.
//!
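/*!
The renaming in steps 1 and 5 can be sketched as follows. This is only an
illustration: `finalized_dir_name` is a hypothetical helper that does not
exist in this module, and it assumes the SVH has already been encoded as a
string. It keeps the `s-{timestamp}-{random}-` prefix and swaps the `working`
suffix for the SVH.

```rust
/// Hypothetical helper: turns `s-{timestamp}-{random}-working` into
/// `s-{timestamp}-{random}-{svh}`. Returns `None` for malformed names.
fn finalized_dir_name(working_name: &str, svh: &str) -> Option<String> {
    // Drop the "working" suffix but keep the dash in front of it.
    let prefix = working_name.strip_suffix("working")?;
    // Expect exactly three dashes: after `s`, the timestamp, and the random part.
    if !prefix.ends_with('-') || prefix.matches('-').count() != 3 {
        return None;
    }
    Some(format!("{prefix}{svh}"))
}
```

For a name such as `s-abc-def-working` and an encoded SVH `svh1`, the sketch
produces `s-abc-def-svh1`; a name that does not end in `working` is rejected.

*/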
//! ## Garbage Collection
//!
//! Naively following the above protocol might lead to old session directories
//! piling up if a compiler instance crashes for some reason before it's able to
//! remove its private session directory. In order to avoid wasting disk space,
//! the compiler also does some garbage collection each time it is started in
//! incremental compilation mode. Specifically, it will scan the incremental
//! compilation directory for private session directories that are not in use
//! any more and will delete those. It will also delete any finalized session
//! directories for a given crate except for the most recent one.
//!
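/*!
The "keep only the most recent finalized directory" rule can be sketched as
below. This is a simplified illustration, not the code used by this module:
`finalized_dirs_to_delete` is a hypothetical function, timestamps are plain
`u64` values, and the locking and minimum-age checks are left out.

```rust
/// Hypothetical sketch: given `(directory name, timestamp)` pairs for the
/// finalized session directories of one crate, returns the names that may be
/// deleted, that is, all of them except the newest one.
fn finalized_dirs_to_delete(mut dirs: Vec<(String, u64)>) -> Vec<String> {
    // Sort newest first; ties are broken by name so the result is deterministic.
    dirs.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    // The first entry is the most recent directory and is kept.
    dirs.into_iter().skip(1).map(|(name, _)| name).collect()
}
```

With zero or one finalized directories, nothing is returned for deletion.

*/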
//! ## Synchronization
//!
//! There is some synchronization needed in order for the compiler to be able to
//! determine whether a given private session directory is not in use any more.
//! This is done by creating a lock file for each session directory and
//! locking it while the directory is still being used. Since file locks have
//! operating system support, we can rely on the lock being released if the
//! compiler process dies for some unexpected reason. Thus, when garbage
//! collecting private session directories, the collecting process can determine
//! whether the directory is still in use by trying to acquire a lock on the
//! file. If locking the file fails, the original process must still be alive.
//! If locking the file succeeds, we know that the owning process is not alive
//! any more and we can safely delete the directory.
//! There is still a small time window between the original process creating the
//! lock file and actually locking it. In order to minimize the chance that
//! another process tries to acquire the lock in just that instant, only
//! session directories that are older than a few seconds are considered for
//! garbage collection.
//!
//! Another case that has to be considered is what happens if one process
//! deletes a finalized session directory that another process is currently
//! trying to copy from. This case is also handled via the lock file. Before
//! a process starts copying a finalized session directory, it will acquire a
//! shared lock on the directory's lock file. Any garbage collecting process,
//! on the other hand, will acquire an exclusive lock on the lock file.
//! Thus, if a directory is being collected, any reader process will fail
//! acquiring the shared lock and will leave the directory alone. Conversely,
//! if a collecting process can't acquire the exclusive lock because the
//! directory is currently being read from, it will leave collecting that
//! directory to another process at a later point in time.
//! The exact same scheme is also used when reading the metadata hashes file
//! from an extern crate. When a crate is compiled, the hash values of its
//! metadata are stored in a file in its session directory. When the
//! compilation session of another crate imports the first crate's metadata,
//! it also has to read in the accompanying metadata hashes. It thus will access
//! the finalized session directory of all crates it links to and while doing
//! so, it will also place a read lock on the respective session directory
//! so that it won't be deleted while the metadata hashes are loaded.
//!
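/*!
The interplay between readers (shared lock) and collectors (exclusive lock)
can be illustrated with a toy in-memory model. This is an assumption-laden
sketch: `LockState` is a hypothetical type, and the real implementation uses
operating system file locks rather than anything like this.

```rust
/// Toy model of the state of one lock file.
#[derive(Debug, PartialEq)]
enum LockState {
    Unlocked,
    /// Number of readers currently copying from the directory.
    Shared(usize),
    /// A collector is deleting the directory.
    Exclusive,
}

impl LockState {
    /// A reader tries to take a shared lock without blocking.
    fn try_shared(&mut self) -> bool {
        match self {
            LockState::Unlocked => {
                *self = LockState::Shared(1);
                true
            }
            LockState::Shared(n) => {
                *n += 1;
                true
            }
            LockState::Exclusive => false,
        }
    }

    /// A collector tries to take the exclusive lock without blocking.
    fn try_exclusive(&mut self) -> bool {
        if *self == LockState::Unlocked {
            *self = LockState::Exclusive;
            true
        } else {
            false
        }
    }
}
```

In this model a collector fails while any reader holds the lock, and a reader
fails while a collector holds it, which matches the two cases described above.

*/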
//! ## Preconditions
//!
//! This system relies on two features being available in the file system in
//! order to work really well: file locking and hard linking.
//! If hard linking is not available (like on FAT) the data in the cache
//! actually has to be copied at the beginning of each session.
//! If file locking does not work reliably (like on NFS), some of the
//! synchronization will go haywire.
//! In both cases we recommend locating the incremental compilation directory
//! on a file system that supports these things.
//! It might be a good idea though to try and detect whether we are on an
//! unsupported file system and emit a warning in that case. This is not yet
//! implemented.

use std::fs as std_fs;
use std::io::{self, ErrorKind};
use std::path::{Path, PathBuf};
use std::time::{Duration, SystemTime, UNIX_EPOCH};

use rand::{RngCore, rng};
use rustc_data_structures::base_n::{BaseNString, CASE_INSENSITIVE, ToBaseN};
use rustc_data_structures::fx::{FxHashSet, FxIndexSet};
use rustc_data_structures::svh::Svh;
use rustc_data_structures::unord::{UnordMap, UnordSet};
use rustc_data_structures::{base_n, flock};
use rustc_fs_util::{LinkOrCopy, link_or_copy, try_canonicalize};
use rustc_middle::bug;
use rustc_session::{Session, StableCrateId};
use rustc_span::Symbol;
use tracing::debug;

use crate::errors;

#[cfg(test)]
mod tests;

const LOCK_FILE_EXT: &str = ".lock";
const DEP_GRAPH_FILENAME: &str = "dep-graph.bin";
const STAGING_DEP_GRAPH_FILENAME: &str = "dep-graph.part.bin";
const WORK_PRODUCTS_FILENAME: &str = "work-products.bin";
const QUERY_CACHE_FILENAME: &str = "query-cache.bin";

// We encode integers using the following base, so they are shorter than decimal
// or hexadecimal numbers (we want short file and directory names). Since these
// numbers will be used in file names, we choose an encoding that is not
// case-sensitive (as opposed to base64, for example).
const INT_ENCODE_BASE: usize = base_n::CASE_INSENSITIVE;

/// Returns the path to a session's dependency graph.
pub(crate) fn dep_graph_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, DEP_GRAPH_FILENAME)
}

/// Returns the path to a session's staging dependency graph.
///
/// On the difference between dep-graph and staging dep-graph,
/// see `build_dep_graph`.
pub(crate) fn staging_dep_graph_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, STAGING_DEP_GRAPH_FILENAME)
}

pub(crate) fn work_products_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, WORK_PRODUCTS_FILENAME)
}

/// Returns the path to a session's query cache.
pub(crate) fn query_cache_path(sess: &Session) -> PathBuf {
    in_incr_comp_dir_sess(sess, QUERY_CACHE_FILENAME)
}

/// Returns the path of the lock file for a given session directory.
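/**
The mapping from a session directory name to its lock file name can be
sketched as follows. This is a hypothetical, panic-free variant written for
illustration only: `lock_file_for` is not part of this module, and it returns
`None` where the real function would report a compiler bug.

```rust
use std::path::{Path, PathBuf};

/// Hypothetical helper: maps `{crate_dir}/s-{timestamp}-{random}-{suffix}`
/// to `{crate_dir}/s-{timestamp}-{random}.lock`.
fn lock_file_for(session_dir: &Path) -> Option<PathBuf> {
    let name = session_dir.file_name()?.to_str()?;
    let dashes: Vec<usize> = name.match_indices('-').map(|(idx, _)| idx).collect();
    // A well-formed session directory name contains exactly three dashes.
    if dashes.len() != 3 {
        return None;
    }
    // Keep `s-{timestamp}-{random}` and append the lock file extension.
    let stem = &name[..dashes[2]];
    Some(session_dir.parent()?.join(format!("{stem}.lock")))
}
```

Because the suffix after the third dash is dropped, a "-working" directory
and its finalized counterpart map to the same lock file in this sketch.
*/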
fn lock_file_path(session_dir: &Path) -> PathBuf {
    let crate_dir = session_dir.parent().unwrap();

    let directory_name = session_dir
        .file_name()
        .unwrap()
        .to_str()
        .expect("malformed session dir name: contains non-Unicode characters");

    let dash_indices: Vec<_> = directory_name.match_indices('-').map(|(idx, _)| idx).collect();
    if dash_indices.len() != 3 {
        bug!(
            "Encountered incremental compilation session directory with \
              malformed name: {}",
            session_dir.display()
        )
    }

    crate_dir.join(&directory_name[0..dash_indices[2]]).with_extension(&LOCK_FILE_EXT[1..])
}

/// Returns the path for a given filename within the incremental compilation directory
/// in the current session.
pub fn in_incr_comp_dir_sess(sess: &Session, file_name: &str) -> PathBuf {
    in_incr_comp_dir(&sess.incr_comp_session_dir(), file_name)
}

/// Returns the path for a given filename within the incremental compilation directory,
/// not necessarily from the current session.
///
/// To ensure the file is part of the current session, use [`in_incr_comp_dir_sess`].
pub fn in_incr_comp_dir(incr_comp_session_dir: &Path, file_name: &str) -> PathBuf {
    incr_comp_session_dir.join(file_name)
}

/// Allocates the private session directory.
///
/// If the result of this function is `Ok`, we have a valid incremental
/// compilation session directory. A valid session
/// directory is one that contains a locked lock file. It may or may not contain
/// a dep-graph and work products from a previous session.
///
/// This always attempts to load a dep-graph from the directory.
/// If loading fails for some reason, we fall back to a disabled `DepGraph`.
/// See [`rustc_interface::queries::dep_graph`].
///
/// If this function returns an error, it may leave behind an invalid session directory.
/// The garbage collection will take care of it.
///
/// [`rustc_interface::queries::dep_graph`]: ../../rustc_interface/struct.Queries.html#structfield.dep_graph
pub(crate) fn prepare_session_directory(
    sess: &Session,
    crate_name: Symbol,
    stable_crate_id: StableCrateId,
) {
    if sess.opts.incremental.is_none() {
        return;
    }

    let _timer = sess.timer("incr_comp_prepare_session_directory");

    debug!("prepare_session_directory");

    // {incr-comp-dir}/{crate-name-and-disambiguator}
    let crate_dir = crate_path(sess, crate_name, stable_crate_id);
    debug!("crate-dir: {}", crate_dir.display());
    create_dir(sess, &crate_dir, "crate");

    // Hack: canonicalize the path *after creating the directory*
    // because, on Windows, long paths can cause problems;
    // canonicalization inserts this weird prefix that makes Windows
    // tolerate long paths.
    let crate_dir = match try_canonicalize(&crate_dir) {
        Ok(v) => v,
        Err(err) => {
            sess.dcx().emit_fatal(errors::CanonicalizePath { path: crate_dir, err });
        }
    };

    let mut source_directories_already_tried = FxHashSet::default();

    loop {
        // Generate a session directory of the form:
        //
        // {incr-comp-dir}/{crate-name-and-disambiguator}/s-{timestamp}-{random}-working
        let session_dir = generate_session_dir_path(&crate_dir);
        debug!("session-dir: {}", session_dir.display());

        // Lock the new session directory. If this fails, return an
        // error without retrying
        let (directory_lock, lock_file_path) = lock_directory(sess, &session_dir);

        // Now that we have the lock, we can actually create the session
        // directory
        create_dir(sess, &session_dir, "session");

        // Find a suitable source directory to copy from. Ignore those that we
        // have already tried before.
        let source_directory = find_source_directory(&crate_dir, &source_directories_already_tried);

        let Some(source_directory) = source_directory else {
            // There's nowhere to copy from, we're done
            debug!(
                "no source directory found. Continuing with empty session \
                    directory."
            );

            sess.init_incr_comp_session(session_dir, directory_lock);
            return;
        };

        debug!("attempting to copy data from source: {}", source_directory.display());

        // Try copying over all files from the source directory
        if let Ok(allows_links) = copy_files(sess, &session_dir, &source_directory) {
            debug!("successfully copied data from: {}", source_directory.display());

            if !allows_links {
                sess.dcx().emit_warn(errors::HardLinkFailed { path: &session_dir });
            }

            sess.init_incr_comp_session(session_dir, directory_lock);
            return;
        } else {
            debug!("copying failed - trying next directory");

            // Something went wrong while trying to copy/link files from the
            // source directory. Try again with a different one.
            source_directories_already_tried.insert(source_directory);

            // Try to remove the session directory we just allocated. We don't
            // know if there's any garbage in it from the failed copy action.
            if let Err(err) = std_fs::remove_dir_all(&session_dir) {
                sess.dcx().emit_warn(errors::DeletePartial { path: &session_dir, err });
            }

            delete_session_dir_lock_file(sess, &lock_file_path);
            drop(directory_lock);
        }
    }
}

/// This function finalizes and thus 'publishes' the session directory by
/// renaming it to `s-{timestamp}-{svh}` and releasing the file lock.
/// If there have been compilation errors, however, this function will just
/// delete the presumably invalid session directory.
pub fn finalize_session_directory(sess: &Session, svh: Option<Svh>) {
    if sess.opts.incremental.is_none() {
        return;
    }
    // The svh is always produced when incr. comp. is enabled.
    let svh = svh.unwrap();

    let _timer = sess.timer("incr_comp_finalize_session_directory");

    let incr_comp_session_dir: PathBuf = sess.incr_comp_session_dir().clone();

    if sess.dcx().has_errors_or_delayed_bugs().is_some() {
        // If there have been any errors during compilation, we don't want to
        // publish this session directory. Rather, we'll just delete it.

        debug!(
            "finalize_session_directory() - invalidating session directory: {}",
            incr_comp_session_dir.display()
        );

        if let Err(err) = std_fs::remove_dir_all(&*incr_comp_session_dir) {
            sess.dcx().emit_warn(errors::DeleteFull { path: &incr_comp_session_dir, err });
        }

        let lock_file_path = lock_file_path(&*incr_comp_session_dir);
        delete_session_dir_lock_file(sess, &lock_file_path);
        sess.mark_incr_comp_session_as_invalid();
    }

    debug!("finalize_session_directory() - session directory: {}", incr_comp_session_dir.display());

    let mut sub_dir_name = incr_comp_session_dir
        .file_name()
        .unwrap()
        .to_str()
        .expect("malformed session dir name: contains non-Unicode characters")
        .to_string();

    // Keep the 's-{timestamp}-{random-number}' prefix, but replace "working" with the SVH of the crate
    sub_dir_name.truncate(sub_dir_name.len() - "working".len());
    // Double-check that we kept this: "s-{timestamp}-{random-number}-"
    assert!(sub_dir_name.ends_with('-'), "{:?}", sub_dir_name);
    assert!(sub_dir_name.as_bytes().iter().filter(|b| **b == b'-').count() == 3);

    // Append the SVH
    sub_dir_name.push_str(&svh.as_u128().to_base_fixed_len(CASE_INSENSITIVE));

    // Create the full path
    let new_path = incr_comp_session_dir.parent().unwrap().join(&*sub_dir_name);
    debug!("finalize_session_directory() - new path: {}", new_path.display());

    match rename_path_with_retry(&*incr_comp_session_dir, &new_path, 3) {
        Ok(_) => {
            debug!("finalize_session_directory() - directory renamed successfully");

            // This unlocks the directory
            sess.finalize_incr_comp_session(new_path);
        }
        Err(e) => {
            // Warn about the error. However, no need to abort compilation now.
            sess.dcx().emit_warn(errors::Finalize { path: &incr_comp_session_dir, err: e });

            debug!("finalize_session_directory() - error, marking as invalid");
            // Drop the file lock, so we can garbage collect
            sess.mark_incr_comp_session_as_invalid();
        }
    }

    let _ = garbage_collect_session_directories(sess);
}

pub(crate) fn delete_all_session_dir_contents(sess: &Session) -> io::Result<()> {
    let sess_dir_iterator = sess.incr_comp_session_dir().read_dir()?;
    for entry in sess_dir_iterator {
        let entry = entry?;
        safe_remove_file(&entry.path())?
    }
    Ok(())
}

fn copy_files(sess: &Session, target_dir: &Path, source_dir: &Path) -> Result<bool, ()> {
    // We acquire a shared lock on the lock file of the directory, so that
    // nobody deletes it out from under us while we are reading from it.
    let lock_file_path = lock_file_path(source_dir);

    let Ok(_lock) = flock::Lock::new(
        &lock_file_path,
        false, // don't wait
        false, // don't create
        false, // not exclusive
    ) else {
        // Could not acquire the lock, don't try to copy from here
        return Err(());
    };

    let Ok(source_dir_iterator) = source_dir.read_dir() else {
        return Err(());
    };

    let mut files_linked = 0;
    let mut files_copied = 0;

    for entry in source_dir_iterator {
        match entry {
            Ok(entry) => {
                let file_name = entry.file_name();

                let target_file_path = target_dir.join(file_name);
                let source_path = entry.path();

                debug!("copying into session dir: {}", source_path.display());
                match link_or_copy(source_path, target_file_path) {
                    Ok(LinkOrCopy::Link) => files_linked += 1,
                    Ok(LinkOrCopy::Copy) => files_copied += 1,
                    Err(_) => return Err(()),
                }
            }
            Err(_) => return Err(()),
        }
    }

    if sess.opts.unstable_opts.incremental_info {
        eprintln!(
            "[incremental] session directory: \
                  {files_linked} files hard-linked"
        );
        eprintln!(
            "[incremental] session directory: \
                 {files_copied} files copied"
        );
    }

    Ok(files_linked > 0 || files_copied == 0)
}

/// Generates a unique directory path of the form:
/// {crate_dir}/s-{timestamp}-{random-number}-working
fn generate_session_dir_path(crate_dir: &Path) -> PathBuf {
    let timestamp = timestamp_to_string(SystemTime::now());
    debug!("generate_session_dir_path: timestamp = {}", timestamp);
    let random_number = rng().next_u32();
    debug!("generate_session_dir_path: random_number = {}", random_number);

    // Chop the first 3 characters off the timestamp. Those 3 bytes will be zero for a while.
    let (zeroes, timestamp) = timestamp.split_at(3);
    assert_eq!(zeroes, "000");
    let directory_name =
        format!("s-{}-{}-working", timestamp, random_number.to_base_fixed_len(CASE_INSENSITIVE));
    debug!("generate_session_dir_path: directory_name = {}", directory_name);
    let directory_path = crate_dir.join(directory_name);
    debug!("generate_session_dir_path: directory_path = {}", directory_path.display());
    directory_path
}

fn create_dir(sess: &Session, path: &Path, dir_tag: &str) {
    match std_fs::create_dir_all(path) {
        Ok(()) => {
            debug!("{} directory created successfully", dir_tag);
        }
        Err(err) => sess.dcx().emit_fatal(errors::CreateIncrCompDir { tag: dir_tag, path, err }),
    }
}

/// Allocate the lock-file and lock it.
fn lock_directory(sess: &Session, session_dir: &Path) -> (flock::Lock, PathBuf) {
    let lock_file_path = lock_file_path(session_dir);
    debug!("lock_directory() - lock_file: {}", lock_file_path.display());

    match flock::Lock::new(
        &lock_file_path,
        false, // don't wait
        true,  // create the lock file
        true,  // the lock should be exclusive
    ) {
        Ok(lock) => (lock, lock_file_path),
        Err(lock_err) => {
            let is_unsupported_lock = flock::Lock::error_unsupported(&lock_err);
            sess.dcx().emit_fatal(errors::CreateLock {
                lock_err,
                session_dir,
                is_unsupported_lock,
                is_cargo: rustc_session::utils::was_invoked_from_cargo(),
            });
        }
    }
}

fn delete_session_dir_lock_file(sess: &Session, lock_file_path: &Path) {
    if let Err(err) = safe_remove_file(lock_file_path) {
        sess.dcx().emit_warn(errors::DeleteLock { path: lock_file_path, err });
    }
}

/// Finds the most recent published session directory that is not in the
/// ignore-list.
fn find_source_directory(
    crate_dir: &Path,
    source_directories_already_tried: &FxHashSet<PathBuf>,
) -> Option<PathBuf> {
    let iter = crate_dir
        .read_dir()
        .unwrap() // FIXME
        .filter_map(|e| e.ok().map(|e| e.path()));

    find_source_directory_in_iter(iter, source_directories_already_tried)
}

fn find_source_directory_in_iter<I>(
    iter: I,
    source_directories_already_tried: &FxHashSet<PathBuf>,
) -> Option<PathBuf>
where
    I: Iterator<Item = PathBuf>,
{
    let mut best_candidate = (UNIX_EPOCH, None);

    for session_dir in iter {
        debug!("find_source_directory_in_iter - inspecting `{}`", session_dir.display());

        let Some(directory_name) = session_dir.file_name().unwrap().to_str() else {
            debug!("find_source_directory_in_iter - ignoring");
            continue;
        };

        if source_directories_already_tried.contains(&session_dir)
            || !is_session_directory(&directory_name)
            || !is_finalized(&directory_name)
        {
            debug!("find_source_directory_in_iter - ignoring");
            continue;
        }

        let timestamp = match extract_timestamp_from_session_dir(&directory_name) {
            Ok(timestamp) => timestamp,
            Err(e) => {
                debug!("unexpected incr-comp session dir: {}: {}", session_dir.display(), e);
                continue;
            }
        };

        if timestamp > best_candidate.0 {
            best_candidate = (timestamp, Some(session_dir.clone()));
        }
    }

    best_candidate.1
}

fn is_finalized(directory_name: &str) -> bool {
    !directory_name.ends_with("-working")
}

fn is_session_directory(directory_name: &str) -> bool {
    directory_name.starts_with("s-") && !directory_name.ends_with(LOCK_FILE_EXT)
}

fn is_session_directory_lock_file(file_name: &str) -> bool {
    file_name.starts_with("s-") && file_name.ends_with(LOCK_FILE_EXT)
}

fn extract_timestamp_from_session_dir(directory_name: &str) -> Result<SystemTime, &'static str> {
    if !is_session_directory(directory_name) {
        return Err("not a directory");
    }

    let dash_indices: Vec<_> = directory_name.match_indices('-').map(|(idx, _)| idx).collect();
    if dash_indices.len() != 3 {
        return Err("not three dashes in name");
    }

    string_to_timestamp(&directory_name[dash_indices[0] + 1..dash_indices[1]])
}

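/**
The timestamp round trip can be sketched without the compiler-internal
`base_n` helpers. The sketch below is an illustration under stated
assumptions: base 36 with lowercase digits and a fixed width of 13 characters
(enough for any `u64`) are assumed here, and `encode_micros` and
`decode_micros` are hypothetical names.

```rust
use std::time::{Duration, SystemTime, UNIX_EPOCH};

/// Assumed parameters for this sketch.
const BASE: u64 = 36;
const LEN: usize = 13;

/// Encodes microseconds since the Unix epoch as a zero-padded base-36 string.
fn encode_micros(mut micros: u64) -> String {
    let digits = b"0123456789abcdefghijklmnopqrstuvwxyz";
    let mut buf = [b'0'; LEN];
    // Fill the buffer from the least significant digit backwards.
    for slot in buf.iter_mut().rev() {
        *slot = digits[(micros % BASE) as usize];
        micros /= BASE;
    }
    String::from_utf8(buf.to_vec()).expect("digits are ASCII")
}

/// Decodes such a string back into a `SystemTime`.
fn decode_micros(s: &str) -> Result<SystemTime, &'static str> {
    let micros = u64::from_str_radix(s, BASE as u32).map_err(|_| "timestamp not an int")?;
    Ok(UNIX_EPOCH + Duration::from_micros(micros))
}
```

Under these assumptions, present-day timestamps encode with three leading
zeros, which is the prefix that `generate_session_dir_path` chops off.
*/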
fn timestamp_to_string(timestamp: SystemTime) -> BaseNString {
    let duration = timestamp.duration_since(UNIX_EPOCH).unwrap();
    let micros: u64 = duration.as_micros().try_into().unwrap();
    micros.to_base_fixed_len(CASE_INSENSITIVE)
}

fn string_to_timestamp(s: &str) -> Result<SystemTime, &'static str> {
    let micros_since_unix_epoch = match u64::from_str_radix(s, INT_ENCODE_BASE as u32) {
        Ok(micros) => micros,
        Err(_) => return Err("timestamp not an int"),
    };

    let duration = Duration::from_micros(micros_since_unix_epoch);
    Ok(UNIX_EPOCH + duration)
}

fn crate_path(sess: &Session, crate_name: Symbol, stable_crate_id: StableCrateId) -> PathBuf {
    let incr_dir = sess.opts.incremental.as_ref().unwrap().clone();

    let crate_name =
        format!("{crate_name}-{}", stable_crate_id.as_u64().to_base_fixed_len(CASE_INSENSITIVE));
    incr_dir.join(crate_name)
}

fn is_old_enough_to_be_collected(timestamp: SystemTime) -> bool {
    timestamp < SystemTime::now() - Duration::from_secs(10)
}

/// Runs garbage collection for the current session.
pub(crate) fn garbage_collect_session_directories(sess: &Session) -> io::Result<()> {
    debug!("garbage_collect_session_directories() - begin");

    let session_directory = sess.incr_comp_session_dir();
    debug!(
        "garbage_collect_session_directories() - session directory: {}",
        session_directory.display()
    );

    let crate_directory = session_directory.parent().unwrap();
    debug!(
        "garbage_collect_session_directories() - crate directory: {}",
        crate_directory.display()
    );

    // First do a pass over the crate directory, collecting lock files and
    // session directories
    let mut session_directories = FxIndexSet::default();
    let mut lock_files = UnordSet::default();

    for dir_entry in crate_directory.read_dir()? {
        let Ok(dir_entry) = dir_entry else {
            // Ignore any errors
            continue;
        };

        let entry_name = dir_entry.file_name();
        let Some(entry_name) = entry_name.to_str() else {
            continue;
        };

        if is_session_directory_lock_file(&entry_name) {
            lock_files.insert(entry_name.to_string());
        } else if is_session_directory(&entry_name) {
            session_directories.insert(entry_name.to_string());
        } else {
            // This is something we don't know, leave it alone
        }
    }
    session_directories.sort();

    // Now map from lock files to session directories
    let lock_file_to_session_dir: UnordMap<String, Option<String>> = lock_files
        .into_items()
        .map(|lock_file_name| {
            assert!(lock_file_name.ends_with(LOCK_FILE_EXT));
            let dir_prefix_end = lock_file_name.len() - LOCK_FILE_EXT.len();
            let session_dir = {
                let dir_prefix = &lock_file_name[0..dir_prefix_end];
                session_directories.iter().find(|dir_name| dir_name.starts_with(dir_prefix))
            };
            (lock_file_name, session_dir.map(String::clone))
        })
        .into();

    // Delete all lock files that don't have an associated directory. They must
    // be some kind of leftover
    for (lock_file_name, directory_name) in
        lock_file_to_session_dir.items().into_sorted_stable_ord()
    {
        if directory_name.is_none() {
            let Ok(timestamp) = extract_timestamp_from_session_dir(lock_file_name) else {
                debug!(
                    "found lock-file with malformed timestamp: {}",
                    crate_directory.join(&lock_file_name).display()
                );
                // Ignore it
                continue;
            };

            let lock_file_path = crate_directory.join(&*lock_file_name);

            if is_old_enough_to_be_collected(timestamp) {
                debug!(
                    "garbage_collect_session_directories() - deleting \
                    garbage lock file: {}",
                    lock_file_path.display()
                );
                delete_session_dir_lock_file(sess, &lock_file_path);
            } else {
                debug!(
                    "garbage_collect_session_directories() - lock file with \
                    no session dir not old enough to be collected: {}",
                    lock_file_path.display()
                );
            }
        }
    }

    // Filter out `None` directories
    let lock_file_to_session_dir: UnordMap<String, String> = lock_file_to_session_dir
        .into_items()
        .filter_map(|(lock_file_name, directory_name)| directory_name.map(|n| (lock_file_name, n)))
        .into();

    // Delete all session directories that don't have a lock file.
    for directory_name in session_directories {
        if !lock_file_to_session_dir.items().any(|(_, dir)| *dir == directory_name) {
            let path = crate_directory.join(directory_name);
            if let Err(err) = std_fs::remove_dir_all(&path) {
                sess.dcx().emit_warn(errors::InvalidGcFailed { path: &path, err });
            }
        }
    }

718    let current_session_directory_name =
719        session_directory.file_name().expect("session directory is not `..`");
720
721    // Now garbage collect the valid session directories.
722    let deletion_candidates =
723        lock_file_to_session_dir.items().filter_map(|(lock_file_name, directory_name)| {
724            debug!("garbage_collect_session_directories() - inspecting: {}", directory_name);
725
726            if directory_name.as_str() == current_session_directory_name {
727                // Skipping our own directory is, unfortunately, important for correctness.
728                //
729                // To summarize #147821: we will try to lock directories before deciding they can be
730                // garbage collected, but the ability of `flock::Lock` to detect a lock held *by the
731                // same process* varies across file locking APIs. Then, if our own session directory
732                // has become old enough to be eligible for GC, we are beholden to platform-specific
733                // details about detecting the our own lock on the session directory.
734                //
735                // POSIX `fcntl(F_SETLK)`-style file locks are maintained across a process. On
736                // systems where this is the mechanism for `flock::Lock`, there is no way to
737                // discover if an `flock::Lock` has been created in the same process on the same
738                // file. Attempting to set a lock on the lockfile again will succeed, even if the
739                // lock was set by another thread, on another file descriptor. Then we would
740                // garbage collect our own live directory, unable to tell it was locked perhaps by
741                // this same thread.
742                //
743                // It's not clear that `flock::Lock` can be fixed for this in general, and our own
744                // incremental session directory is the only one which this process may own, so skip
745                // it here and avoid the problem. We know it's not garbage anyway: we're using it.
746                return None;
            }

            let Ok(timestamp) = extract_timestamp_from_session_dir(directory_name) else {
                debug!(
                    "found session-dir with malformed timestamp: {}",
                    crate_directory.join(directory_name).display()
                );
                // Ignore it
                return None;
            };

            if is_finalized(directory_name) {
                let lock_file_path = crate_directory.join(lock_file_name);
                match flock::Lock::new(
                    &lock_file_path,
                    false, // don't wait
                    false, // don't create the lock-file
                    true,  // get an exclusive lock
                ) {
                    Ok(lock) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            successfully acquired lock"
                        );
                        debug!(
                            "garbage_collect_session_directories() - adding \
                            deletion candidate: {}",
                            directory_name
                        );

                        // Note that we are holding on to the lock
                        return Some((
                            (timestamp, crate_directory.join(directory_name)),
                            Some(lock),
                        ));
                    }
                    Err(_) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            not collecting, still in use"
                        );
                    }
                }
            } else if is_old_enough_to_be_collected(timestamp) {
                // When cleaning out "-working" session directories, i.e.
                // session directories that might still be in use by another
                // compiler instance, we only look at directories that are
                // at least ten seconds old. This is supposed to reduce the
                // chance of deleting a directory in the time window where
                // the process has allocated the directory but has not yet
                // acquired the file-lock on it.

                // Try to acquire the directory lock. If we can't, it
                // means that the owning process is still alive and we
                // leave this directory alone.
                let lock_file_path = crate_directory.join(lock_file_name);
                match flock::Lock::new(
                    &lock_file_path,
                    false, // don't wait
                    false, // don't create the lock-file
                    true,  // get an exclusive lock
                ) {
                    Ok(lock) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            successfully acquired lock"
                        );

                        delete_old(sess, &crate_directory.join(directory_name));

                        // Let's make it explicit that the file lock is released at this point,
                        // or rather, that we held on to it until here
                        drop(lock);
                    }
                    Err(_) => {
                        debug!(
                            "garbage_collect_session_directories() - \
                            not collecting, still in use"
                        );
                    }
                }
            } else {
                debug!(
                    "garbage_collect_session_directories() - not finalized, not \
                    old enough"
                );
            }
            None
        });
    let deletion_candidates = deletion_candidates.into();

    // Delete all but the most recent of the candidates
    all_except_most_recent(deletion_candidates).into_items().all(|(path, lock)| {
        debug!("garbage_collect_session_directories() - deleting `{}`", path.display());

        if let Err(err) = std_fs::remove_dir_all(&path) {
            sess.dcx().emit_warn(errors::FinalizedGcFailed { path: &path, err });
        } else {
            delete_session_dir_lock_file(sess, &lock_file_path(&path));
        }

        // Let's make it explicit that the file lock is released at this point,
        // or rather, that we held on to it until here
        drop(lock);
        true
    });

    Ok(())
}

fn delete_old(sess: &Session, path: &Path) {
    debug!("garbage_collect_session_directories() - deleting `{}`", path.display());

    if let Err(err) = std_fs::remove_dir_all(path) {
        sess.dcx().emit_warn(errors::SessionGcFailed { path, err });
    } else {
        delete_session_dir_lock_file(sess, &lock_file_path(path));
    }
}

fn all_except_most_recent(
    deletion_candidates: UnordMap<(SystemTime, PathBuf), Option<flock::Lock>>,
) -> UnordMap<PathBuf, Option<flock::Lock>> {
    let most_recent = deletion_candidates.items().map(|(&(timestamp, _), _)| timestamp).max();

    if let Some(most_recent) = most_recent {
        deletion_candidates
            .into_items()
            .filter(|&((timestamp, _), _)| timestamp != most_recent)
            .map(|((_, path), lock)| (path, lock))
            .collect()
    } else {
        UnordMap::default()
    }
}

fn safe_remove_file(p: &Path) -> io::Result<()> {
    match std_fs::remove_file(p) {
        Err(err) if err.kind() == io::ErrorKind::NotFound => Ok(()),
        result => result,
    }
}

// On Windows the compiler would sometimes fail to rename the session directory because
// the OS thought something was still being accessed in it. So we retry a few times to give
// the OS time to catch up.
// See https://github.com/rust-lang/rust/issues/86929.
fn rename_path_with_retry(from: &Path, to: &Path, mut retries_left: usize) -> std::io::Result<()> {
    loop {
        match std_fs::rename(from, to) {
            Ok(()) => return Ok(()),
            Err(e) => {
                if retries_left > 0 && e.kind() == ErrorKind::PermissionDenied {
                    // Try again after a short waiting period.
                    std::thread::sleep(Duration::from_millis(50));
                    retries_left -= 1;
                } else {
                    return Err(e);
                }
            }
        }
    }
}