rustc_mir_transform/
dest_prop.rs

//! Propagates assignment destinations backwards in the CFG to eliminate redundant assignments.
//!
//! # Motivation
//!
//! MIR building can insert a lot of redundant copies, and Rust code in general often tends to move
//! values around a lot. The result is a lot of assignments of the form `dest = {move} src;` in MIR.
//! MIR building for constants in particular tends to create additional locals that are only used
//! inside a single block to shuffle a value around unnecessarily.
//!
//! LLVM by itself is not good enough at eliminating these redundant copies (eg. see
//! <https://github.com/rust-lang/rust/issues/32966>), so this leaves some performance on the table
//! that we can regain by implementing an optimization for removing these assign statements in rustc
//! itself. When this optimization runs fast enough, it can also speed up the constant evaluation
//! and code generation phases of rustc due to the reduced number of statements and locals.
//!
//! # The Optimization
//!
//! Conceptually, this optimization is "destination propagation". It is similar to the Named Return
//! Value Optimization, or NRVO, known from the C++ world, except that it isn't limited to return
//! values or the return place `_0`. On a very high level, independent of the actual implementation
//! details, it does the following:
//!
//! 1) Identify `dest = src;` statements with values for `dest` and `src` whose storage can soundly
//!    be merged.
//! 2) Replace all mentions of `src` with `dest` ("unifying" them and propagating the destination
//!    backwards).
//! 3) Delete the `dest = src;` statement (by making it a `nop`).
//!
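//! Schematically (an illustrative sketch in pseudo-MIR, not output from a real body), the pass
//! turns
//!
//! ```ignore (syntax-highlighting-only)
//! _2 = move _1;
//! _3 = move _2;
//! ```
//!
//! into
//!
//! ```ignore (syntax-highlighting-only)
//! nop;
//! _3 = move _1;
//! ```
//!
//! by replacing every mention of `_2` with `_1`, which turns the first assignment into a
//! self-assignment that can be deleted.
//!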
//! Step 1) is by far the hardest, so it is explained in more detail below.
//!
//! ## Soundness
//!
//! We have a pair of places `p` and `q`, whose memory we would like to merge. In order for this to
//! be sound, we need to check a number of conditions:
//!
//! * `p` and `q` must both be *constant* - it does not make much sense to talk about merging them
//!   if they do not consistently refer to the same place in memory. This is satisfied if they do
//!   not contain any indirection through a pointer or any indexing projections.
//!
//! * `p` and `q` must have the **same type**. If we replace a local with a subtype or supertype,
//!   we may end up with a different vtable for that local. See the `subtyping-impacts-selection`
//!   tests for an example where that causes issues.
//!
//! * We need to make sure that the goal of "merging the memory" is actually structurally possible
//!   in MIR. For example, even if all the other conditions are satisfied, there is no way to
//!   "merge" `_5.foo` and `_6.bar`. For now, we ensure this by requiring that both `p` and `q` are
//!   locals with no further projections. Future iterations of this pass should improve on this.
//!
//! * Finally, we want `p` and `q` to use the same memory - however, we still need to make sure that
//!   each of them has enough "ownership" of that memory to continue "doing its job." More
//!   precisely, what we will check is that whenever the program performs a write to `p`, it does
//!   not currently care about what the value in `q` is (and vice versa). We formalize the notion
//!   of "does not care what the value in `q` is" by checking the *liveness* of `q`.
//!
//!   Because of the difficulty of computing liveness of places that have their address taken, we do
//!   not even attempt to do it. Any place that is part of a local that has its address taken is
//!   excluded from the optimization.
//!
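//! As a sketch of the liveness condition (illustrative pseudo-MIR; `opaque` stands in for any use
//! of the value), merging `_1` and `_2` here would be unsound:
//!
//! ```ignore (syntax-highlighting-only)
//! _1 = 5;
//! _2 = 6;      // writes `_2` while `_1` is still live
//! opaque(_1);  // `_1` is read afterwards
//! ```
//!
//! If the two locals shared storage, the write to `_2` would clobber the value of `_1` that is
//! still needed by the later read.
//!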
//! The first three conditions are simple structural requirements on the `Assign` statements that
//! can be trivially checked. The last requirement however is more difficult and costly to check.
//!
//! ## Future Improvements
//!
//! There are a number of ways in which this pass could be improved in the future:
//!
//! * Merging storage liveness ranges instead of removing storage statements completely. This may
//!   improve stack usage.
//!
//! * Allow merging locals into places with projections, eg `_5` into `_6.foo`.
//!
//! * Liveness analysis with more precision than whole locals at a time. The smaller benefit of this
//!   is that it would allow us to dest prop at "sub-local" levels in some cases. The bigger benefit
//!   of this is that such liveness analysis can report more accurate results about whole locals at
//!   a time. For example, consider:
//!
//!   ```ignore (syntax-highlighting-only)
//!   _1 = u;
//!   // unrelated code
//!   _1.f1 = v;
//!   _2 = _1.f1;
//!   ```
//!
//!   Because the current analysis only thinks in terms of locals, it does not have enough
//!   information to report that `_1` is dead in the "unrelated code" section.
//!
//! * Liveness analysis enabled by alias analysis. This would allow us to not just bail on locals
//!   that ever have their address taken. Of course that requires actually having alias analysis
//!   (and a model to build it on), so this might be a bit of a ways off.
//!
//! * Various perf improvements. There are a bunch of comments in here marked `PERF` with ideas for
//!   how to do things more efficiently. However, the complexity of the pass as a whole should be
//!   kept in mind.
//!
//! ## Previous Work
//!
//! A [previous attempt][attempt 1] at implementing an optimization like this turned out to be a
//! significant regression in compiler performance. Fixing the regressions introduced a lot of
//! undesirable complexity to the implementation.
//!
//! A [subsequent approach][attempt 2] tried to avoid the costly computation by limiting itself to
//! acyclic CFGs, but still turned out to be far too costly to run due to suboptimal performance
//! within individual basic blocks, requiring a walk across the entire block for every assignment
//! found within the block. For the `tuple-stress` benchmark, which has 458745 statements in a
//! single block, this proved to be far too costly.
//!
//! [Another approach after that][attempt 3] was much closer to correct, but had some soundness
//! issues - it was failing to consider stores outside live ranges, and failed to uphold some of the
//! requirements that MIR has for non-overlapping places within statements. However, it also had
//! performance issues caused by `O(l² * s)` runtime, where `l` is the number of locals and `s` is
//! the number of statements and terminators.
//!
//! Since the first attempt at this, the compiler has improved dramatically, and new analysis
//! frameworks have been added that should make this approach viable without requiring a limited
//! approach that only works for some classes of CFGs:
//! - rustc now has a powerful dataflow analysis framework that can handle forwards and backwards
//!   analyses efficiently.
//! - Layout optimizations for coroutines have been added to improve code generation for
//!   async/await, which are very similar in spirit to what this optimization does.
//!
//! Also, rustc now has a simple NRVO pass (see `nrvo.rs`), which handles a subset of the cases that
//! this destination propagation pass handles, proving that similar optimizations can be performed
//! on MIR.
//!
//! ## Pre/Post Optimization
//!
//! It is recommended to run `SimplifyCfg` and then `SimplifyLocals` some time after this pass, as
//! it replaces the eliminated assign statements with `nop`s and leaves unused locals behind.
//!
//! [liveness]: https://en.wikipedia.org/wiki/Live_variable_analysis
//! [attempt 1]: https://github.com/rust-lang/rust/pull/47954
//! [attempt 2]: https://github.com/rust-lang/rust/pull/71003
//! [attempt 3]: https://github.com/rust-lang/rust/pull/72632

use rustc_data_structures::fx::{FxIndexMap, IndexEntry, IndexOccupiedEntry};
use rustc_index::bit_set::DenseBitSet;
use rustc_index::interval::SparseIntervalMatrix;
use rustc_middle::bug;
use rustc_middle::mir::visit::{MutVisitor, PlaceContext, Visitor};
use rustc_middle::mir::{
    Body, HasLocalDecls, InlineAsmOperand, Local, LocalKind, Location, Operand, PassWhere, Place,
    Rvalue, Statement, StatementKind, TerminatorKind, dump_mir, traversal,
};
use rustc_middle::ty::TyCtxt;
use rustc_mir_dataflow::Analysis;
use rustc_mir_dataflow::impls::MaybeLiveLocals;
use rustc_mir_dataflow::points::{DenseLocationMap, PointIndex, save_as_intervals};
use tracing::{debug, trace};

pub(super) struct DestinationPropagation;

impl<'tcx> crate::MirPass<'tcx> for DestinationPropagation {
    fn is_enabled(&self, sess: &rustc_session::Session) -> bool {
        // For now, only run at MIR opt level 3. Two things need to be changed before this can be
        // turned on by default:
        //  1. Because of the overeager removal of storage statements, this can cause stack space
        //     regressions. This opt is not the place to fix this though, it's a more general
        //     problem in MIR.
        //  2. Despite being an overall perf improvement, this still causes a 30% regression in
        //     keccak. We can temporarily fix this by bounding function size, but in the long term
        //     we should fix this by being smarter about invalidating analysis results.
        sess.mir_opt_level() >= 3
    }

    fn run_pass(&self, tcx: TyCtxt<'tcx>, body: &mut Body<'tcx>) {
        let def_id = body.source.def_id();
        let mut candidates = Candidates::default();
        let mut write_info = WriteInfo::default();
        trace!(func = ?tcx.def_path_str(def_id));

        let borrowed = rustc_mir_dataflow::impls::borrowed_locals(body);

        let live = MaybeLiveLocals.iterate_to_fixpoint(tcx, body, Some("MaybeLiveLocals-DestProp"));
        let points = DenseLocationMap::new(body);
        let mut live = save_as_intervals(&points, body, live);

        // In order to avoid having to collect data for every single pair of locals in the body, we
        // do not allow doing more than one merge for places that are derived from the same local at
        // once. To avoid missed opportunities, we instead iterate to a fixed point - we'll refer to
        // each of these iterations as a "round."
        //
        // Reaching a fixed point could in theory take up to `min(l, s)` rounds - however, we do not
        // expect to see MIR like that. To verify this, a test was run against `[rust-lang/regex]` -
        // the average MIR body saw 1.32 full iterations of this loop. The most that was hit was 30
        // for a single function. Only 80/2801 (2.9%) of functions saw at least 5.
        //
        // [rust-lang/regex]:
        //     https://github.com/rust-lang/regex/tree/b5372864e2df6a2f5e543a556a62197f50ca3650
        let mut round_count = 0;
        loop {
            // PERF: Can we do something smarter than recalculating the candidates and liveness
            // results?
            candidates.reset_and_find(body, &borrowed);
            trace!(?candidates);
            dest_prop_mir_dump(tcx, body, &points, &live, round_count);

            FilterInformation::filter_liveness(
                &mut candidates,
                &points,
                &live,
                &mut write_info,
                body,
            );

            // Because we only filter once per round, it is unsound to use a local for more than
            // one merge operation within a single round of optimizations. We store here which ones
            // we have already used.
            let mut merged_locals: DenseBitSet<Local> =
                DenseBitSet::new_empty(body.local_decls.len());

            // This is the set of merges we will apply this round. It is a subset of the candidates.
            let mut merges = FxIndexMap::default();

            for (src, candidates) in candidates.c.iter() {
                if merged_locals.contains(*src) {
                    continue;
                }
                let Some(dest) = candidates.iter().find(|dest| !merged_locals.contains(**dest))
                else {
                    continue;
                };

                // Replace `src` by `dest` everywhere.
                merges.insert(*src, *dest);
                merged_locals.insert(*src);
                merged_locals.insert(*dest);

                // Update liveness information based on the merge we just performed.
                // Every location where `src` was live, `dest` will be live.
                live.union_rows(*src, *dest);
            }
            trace!(merging = ?merges);

            if merges.is_empty() {
                break;
            }
            round_count += 1;

            apply_merges(body, tcx, merges, merged_locals);
        }

        trace!(round_count);
    }

    fn is_required(&self) -> bool {
        false
    }
}

#[derive(Debug, Default)]
struct Candidates {
    /// The set of candidates we are considering in this optimization.
    ///
    /// We will always merge the key into at most one of its values.
    ///
    /// Whether a place ends up in the key or the value does not correspond to whether it appears as
    /// the lhs or rhs of any assignment. As a matter of fact, the places in here might never appear
    /// in an assignment at all. This happens because if we see an assignment like this:
    ///
    /// ```ignore (syntax-highlighting-only)
    /// _1.0 = _2.0
    /// ```
    ///
    /// We will still report that we would like to merge `_1` and `_2` in an attempt to allow us to
    /// remove that assignment.
    c: FxIndexMap<Local, Vec<Local>>,

    /// A reverse index of the `c` set; if the `c` set contains `a => b`, then this contains
    /// `b => a`.
    // PERF: Possibly these should be `SmallVec`s?
    reverse: FxIndexMap<Local, Vec<Local>>,
}

//////////////////////////////////////////////////////////
// Merging
//
// Applies the actual optimization

fn apply_merges<'tcx>(
    body: &mut Body<'tcx>,
    tcx: TyCtxt<'tcx>,
    merges: FxIndexMap<Local, Local>,
    merged_locals: DenseBitSet<Local>,
) {
    let mut merger = Merger { tcx, merges, merged_locals };
    merger.visit_body_preserves_cfg(body);
}

struct Merger<'tcx> {
    tcx: TyCtxt<'tcx>,
    merges: FxIndexMap<Local, Local>,
    merged_locals: DenseBitSet<Local>,
}

impl<'tcx> MutVisitor<'tcx> for Merger<'tcx> {
    fn tcx(&self) -> TyCtxt<'tcx> {
        self.tcx
    }

    fn visit_local(&mut self, local: &mut Local, _: PlaceContext, _location: Location) {
        if let Some(dest) = self.merges.get(local) {
            *local = *dest;
        }
    }

    fn visit_statement(&mut self, statement: &mut Statement<'tcx>, location: Location) {
        match &statement.kind {
            // FIXME: Don't delete storage statements, but "merge" the storage ranges instead.
            StatementKind::StorageDead(local) | StatementKind::StorageLive(local)
                if self.merged_locals.contains(*local) =>
            {
                statement.make_nop();
                return;
            }
            _ => (),
        };
        self.super_statement(statement, location);
        match &statement.kind {
            StatementKind::Assign(box (dest, rvalue)) => {
                match rvalue {
                    Rvalue::CopyForDeref(place)
                    | Rvalue::Use(Operand::Copy(place) | Operand::Move(place)) => {
                        // These might've been turned into self-assignments by the replacement
                        // (this includes the original statement we wanted to eliminate).
                        if dest == place {
                            debug!("{:?} turned into self-assignment, deleting", location);
                            statement.make_nop();
                        }
                    }
                    _ => {}
                }
            }

            _ => {}
        }
    }
}

//////////////////////////////////////////////////////////
// Liveness filtering
//
// This section enforces the liveness condition (the last bullet point in the module docs)

struct FilterInformation<'a, 'tcx> {
    body: &'a Body<'tcx>,
    points: &'a DenseLocationMap,
    live: &'a SparseIntervalMatrix<Local, PointIndex>,
    candidates: &'a mut Candidates,
    write_info: &'a mut WriteInfo,
    at: Location,
}

// We first implement some utility functions, which we will then use for removing candidates
// according to different needs. Throughout the liveness filtering, the `candidates` are only ever
// accessed through these methods, and not directly.
impl Candidates {
    /// Collects the candidates for merging.
    ///
    /// This is responsible for enforcing the first and third bullet point.
    fn reset_and_find<'tcx>(&mut self, body: &Body<'tcx>, borrowed: &DenseBitSet<Local>) {
        self.c.clear();
        self.reverse.clear();
        let mut visitor = FindAssignments { body, candidates: &mut self.c, borrowed };
        visitor.visit_body(body);
        // Deduplicate candidates.
        for (_, cands) in self.c.iter_mut() {
            cands.sort();
            cands.dedup();
        }
        // Generate the reverse map.
        for (src, cands) in self.c.iter() {
            for dest in cands.iter().copied() {
                self.reverse.entry(dest).or_default().push(*src);
            }
        }
    }

    /// Just `Vec::retain`, but the condition is inverted and we add debugging output
    fn vec_filter_candidates(
        src: Local,
        v: &mut Vec<Local>,
        mut f: impl FnMut(Local) -> CandidateFilter,
        at: Location,
    ) {
        v.retain(|dest| {
            let remove = f(*dest);
            if remove == CandidateFilter::Remove {
                trace!("eliminating {:?} => {:?} due to conflict at {:?}", src, dest, at);
            }
            remove == CandidateFilter::Keep
        });
    }

    /// `vec_filter_candidates` but for an `Entry`
    fn entry_filter_candidates(
        mut entry: IndexOccupiedEntry<'_, Local, Vec<Local>>,
        p: Local,
        f: impl FnMut(Local) -> CandidateFilter,
        at: Location,
    ) {
        let candidates = entry.get_mut();
        Self::vec_filter_candidates(p, candidates, f, at);
        if candidates.is_empty() {
            // FIXME(#120456) - is `swap_remove` correct?
            entry.swap_remove();
        }
    }

    /// For all candidates `(p, q)` or `(q, p)`, removes the candidate if `f(q)` says to do so
    fn filter_candidates_by(
        &mut self,
        p: Local,
        mut f: impl FnMut(Local) -> CandidateFilter,
        at: Location,
    ) {
        // Cover the cases where `p` appears as a `src`
        if let IndexEntry::Occupied(entry) = self.c.entry(p) {
            Self::entry_filter_candidates(entry, p, &mut f, at);
        }
        // And the cases where `p` appears as a `dest`
        let Some(srcs) = self.reverse.get_mut(&p) else {
            return;
        };
        // We use `retain` here to remove the elements from the reverse set if we've removed the
        // matching candidate in the forward set.
        srcs.retain(|src| {
            if f(*src) == CandidateFilter::Keep {
                return true;
            }
            let IndexEntry::Occupied(entry) = self.c.entry(*src) else {
                return false;
            };
            Self::entry_filter_candidates(
                entry,
                *src,
                |dest| {
                    if dest == p { CandidateFilter::Remove } else { CandidateFilter::Keep }
                },
                at,
            );
            false
        });
    }
}

#[derive(Copy, Clone, PartialEq, Eq)]
enum CandidateFilter {
    Keep,
    Remove,
}

impl<'a, 'tcx> FilterInformation<'a, 'tcx> {
    /// Filters the set of candidates to remove those that conflict.
    ///
    /// The steps we take are exactly those that are outlined at the top of the file. For each
    /// statement/terminator, we collect the set of locals that are written to in that
    /// statement/terminator, and then we remove all pairs of candidates that contain one such local
    /// and another one that is live.
    ///
    /// We need to be careful about the ordering of operations within each statement/terminator
    /// here. Many statements might write and read from more than one place, and we need to consider
    /// them all. The strategy for doing this is as follows: We first gather all the places that are
    /// written to within the statement/terminator via `WriteInfo`. Then, we use the liveness
    /// analysis from *before* the statement/terminator (in the control flow sense) to eliminate
    /// candidates - this is because we want to conservatively treat a pair of locals that is both
    /// read and written in the statement/terminator to be conflicting, and the liveness analysis
    /// before the statement/terminator will correctly report locals that are read in the
    /// statement/terminator to be live. We are additionally conservative by treating all written-to
    /// locals as also being read from.
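    ///
    /// As an illustrative sketch (pseudo-MIR, not real output):
    ///
    /// ```ignore (syntax-highlighting-only)
    /// _1 = Add(_2, _2); // writes `_1`; `_2` is read, so it is live before this statement
    /// ```
    ///
    /// Here any candidate pair `(_1, _2)` is removed: `_1` is written to while `_2` is live.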
    fn filter_liveness(
        candidates: &mut Candidates,
        points: &DenseLocationMap,
        live: &SparseIntervalMatrix<Local, PointIndex>,
        write_info: &mut WriteInfo,
        body: &Body<'tcx>,
    ) {
        let mut this = FilterInformation {
            body,
            points,
            live,
            candidates,
            // We don't actually store anything at this scope, we just keep things here to be able
            // to reuse the allocation.
            write_info,
            // Doesn't matter what we put here, will be overwritten before being used
            at: Location::START,
        };
        this.internal_filter_liveness();
    }

    fn internal_filter_liveness(&mut self) {
        for (block, data) in traversal::preorder(self.body) {
            self.at = Location { block, statement_index: data.statements.len() };
            self.write_info.for_terminator(&data.terminator().kind);
            self.apply_conflicts();

            for (i, statement) in data.statements.iter().enumerate().rev() {
                self.at = Location { block, statement_index: i };
                self.write_info.for_statement(&statement.kind, self.body);
                self.apply_conflicts();
            }
        }
    }

    fn apply_conflicts(&mut self) {
        let writes = &self.write_info.writes;
        for p in writes {
            let other_skip = self.write_info.skip_pair.and_then(|(a, b)| {
                if a == *p {
                    Some(b)
                } else if b == *p {
                    Some(a)
                } else {
                    None
                }
            });
            let at = self.points.point_from_location(self.at);
            self.candidates.filter_candidates_by(
                *p,
                |q| {
                    if Some(q) == other_skip {
                        return CandidateFilter::Keep;
                    }
                    // It is possible that a local may be live for less than the duration of a
                    // statement. This happens in the case of function calls or inline asm.
                    // Because of this, we also mark locals as conflicting when both of them
                    // are written to in the same statement.
                    if self.live.contains(q, at) || writes.contains(&q) {
                        CandidateFilter::Remove
                    } else {
                        CandidateFilter::Keep
                    }
                },
                self.at,
            );
        }
    }
}

/// Describes where a statement/terminator writes to
#[derive(Default, Debug)]
struct WriteInfo {
    writes: Vec<Local>,
    /// If this pair of locals is a candidate pair, completely skip processing it during this
    /// statement. All other candidates are unaffected.
    skip_pair: Option<(Local, Local)>,
}

impl WriteInfo {
    fn for_statement<'tcx>(&mut self, statement: &StatementKind<'tcx>, body: &Body<'tcx>) {
        self.reset();
        match statement {
            StatementKind::Assign(box (lhs, rhs)) => {
                self.add_place(*lhs);
                match rhs {
                    Rvalue::Use(op) => {
                        self.add_operand(op);
                        self.consider_skipping_for_assign_use(*lhs, op, body);
                    }
                    Rvalue::Repeat(op, _) => {
                        self.add_operand(op);
                    }
                    Rvalue::Cast(_, op, _)
                    | Rvalue::UnaryOp(_, op)
                    | Rvalue::ShallowInitBox(op, _) => {
                        self.add_operand(op);
                    }
                    Rvalue::BinaryOp(_, ops) => {
                        for op in [&ops.0, &ops.1] {
                            self.add_operand(op);
                        }
                    }
                    Rvalue::Aggregate(_, ops) => {
                        for op in ops {
                            self.add_operand(op);
                        }
                    }
                    Rvalue::WrapUnsafeBinder(op, _) => {
                        self.add_operand(op);
                    }
                    Rvalue::ThreadLocalRef(_)
                    | Rvalue::NullaryOp(_, _)
                    | Rvalue::Ref(_, _, _)
                    | Rvalue::RawPtr(_, _)
                    | Rvalue::Len(_)
                    | Rvalue::Discriminant(_)
                    | Rvalue::CopyForDeref(_) => {}
                }
            }
            // Retags are technically also reads, but reporting them as a write suffices
            StatementKind::SetDiscriminant { place, .. }
            | StatementKind::Deinit(place)
            | StatementKind::Retag(_, place) => {
                self.add_place(**place);
            }
            StatementKind::Intrinsic(_)
            | StatementKind::ConstEvalCounter
            | StatementKind::Nop
            | StatementKind::Coverage(_)
            | StatementKind::StorageLive(_)
            | StatementKind::StorageDead(_)
            | StatementKind::BackwardIncompatibleDropHint { .. }
            | StatementKind::PlaceMention(_) => {}
            StatementKind::FakeRead(_) | StatementKind::AscribeUserType(_, _) => {
                bug!("{:?} not found in this MIR phase", statement)
            }
        }
    }

    fn consider_skipping_for_assign_use<'tcx>(
        &mut self,
        lhs: Place<'tcx>,
        rhs: &Operand<'tcx>,
        body: &Body<'tcx>,
    ) {
        let Some(rhs) = rhs.place() else { return };
        if let Some(pair) = places_to_candidate_pair(lhs, rhs, body) {
            self.skip_pair = Some(pair);
        }
    }

    fn for_terminator<'tcx>(&mut self, terminator: &TerminatorKind<'tcx>) {
        self.reset();
        match terminator {
            TerminatorKind::SwitchInt { discr: op, .. }
            | TerminatorKind::Assert { cond: op, .. } => {
                self.add_operand(op);
            }
            TerminatorKind::Call { destination, func, args, .. } => {
                self.add_place(*destination);
                self.add_operand(func);
                for arg in args {
                    self.add_operand(&arg.node);
                }
            }
            TerminatorKind::TailCall { func, args, .. } => {
                self.add_operand(func);
                for arg in args {
                    self.add_operand(&arg.node);
                }
            }
            TerminatorKind::InlineAsm { operands, .. } => {
                for asm_operand in operands {
                    match asm_operand {
                        InlineAsmOperand::In { value, .. } => {
                            self.add_operand(value);
                        }
                        InlineAsmOperand::Out { place, .. } => {
                            if let Some(place) = place {
                                self.add_place(*place);
                            }
                        }
                        // Note that the `late` field in `InOut` is about whether the registers used
                        // for these things overlap, and is of absolutely no interest to us.
                        InlineAsmOperand::InOut { in_value, out_place, .. } => {
                            if let Some(place) = out_place {
                                self.add_place(*place);
                            }
                            self.add_operand(in_value);
                        }
                        InlineAsmOperand::Const { .. }
                        | InlineAsmOperand::SymFn { .. }
                        | InlineAsmOperand::SymStatic { .. }
                        | InlineAsmOperand::Label { .. } => {}
                    }
                }
            }
            TerminatorKind::Goto { .. }
            | TerminatorKind::UnwindResume
            | TerminatorKind::UnwindTerminate(_)
            | TerminatorKind::Return
            | TerminatorKind::Unreachable { .. } => (),
            TerminatorKind::Drop { .. } => {
                // `Drop`s create a `&mut` and so are not considered
            }
            TerminatorKind::Yield { .. }
            | TerminatorKind::CoroutineDrop
            | TerminatorKind::FalseEdge { .. }
            | TerminatorKind::FalseUnwind { .. } => {
                bug!("{:?} not found in this MIR phase", terminator)
            }
        }
    }

685    fn add_place(&mut self, place: Place<'_>) {
686        self.writes.push(place.local);
687    }
688
689    fn add_operand<'tcx>(&mut self, op: &Operand<'tcx>) {
690        match op {
691            // FIXME(JakobDegen): In a previous version, the `Move` case was incorrectly treated as
692            // being a read only. This was unsound, however we cannot add a regression test because
693            // it is not possible to set this off with current MIR. Once we have that ability, a
694            // regression test should be added.
695            Operand::Move(p) => self.add_place(*p),
696            Operand::Copy(_) | Operand::Constant(_) => (),
697        }
698    }

    fn reset(&mut self) {
        self.writes.clear();
        self.skip_pair = None;
    }
}

/////////////////////////////////////////////////////
// Candidate accumulation

/// If the pair of places is being considered for merging, returns the candidate that would have
/// to be merged in order to accomplish this.
///
/// The contract here is in one direction only: there is a guarantee that merging the locals
/// returned by this function would result in an assignment between the inputs becoming a
/// self-assignment. However, there is no guarantee that the returned pair is actually suitable
/// for merging - candidate collection must still check this independently.
///
/// This output is unique for each unordered pair of input places.
fn places_to_candidate_pair<'tcx>(
    a: Place<'tcx>,
    b: Place<'tcx>,
    body: &Body<'tcx>,
) -> Option<(Local, Local)> {
    let (mut a, mut b) = if a.projection.is_empty() && b.projection.is_empty() {
        (a.local, b.local)
    } else {
        return None;
    };

    // By sorting, we make sure we're independent of the input order.
    if a > b {
        std::mem::swap(&mut a, &mut b);
    }

    // We could now return `(a, b)`, but then we would miss some candidates in the case where `a`
    // can't be used as a `src`.
    if is_local_required(a, body) {
        std::mem::swap(&mut a, &mut b);
    }
    // We could check `is_local_required` again here, but there's no need - after all, we make no
    // promise that the candidate pair is actually valid.
    Some((a, b))
}
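Outside of rustc, the canonicalization contract above can be sketched with a self-contained model; the `u32` locals and the `is_required` closure below are hypothetical stand-ins for `Local` and `is_local_required`:

```rust
// A sketch of the unordered-pair canonicalization, assuming locals can be
// modeled as plain `u32` indices and `is_required` stands in for
// `is_local_required` (true for arguments and the return place).
fn canonicalize(a: u32, b: u32, is_required: impl Fn(u32) -> bool) -> (u32, u32) {
    let (mut a, mut b) = (a, b);
    // Sort so the result is independent of the input order.
    if a > b {
        std::mem::swap(&mut a, &mut b);
    }
    // If the would-be `src` cannot be removed, try the other orientation; no
    // promise is made that the result is valid - later checks handle that.
    if is_required(a) {
        std::mem::swap(&mut a, &mut b);
    }
    (a, b)
}

fn main() {
    let required = |l: u32| l == 0; // e.g. the return place `_0`
    // Unordered pairs canonicalize identically...
    assert_eq!(canonicalize(3, 5, required), canonicalize(5, 3, required));
    // ...and a required local never ends up in the `src` slot when avoidable.
    assert_eq!(canonicalize(0, 5, required), (5, 0));
}
```

When both locals are required, the sketch (like the real function) still returns a pair; rejecting it is left to the later `is_local_required` check during candidate collection.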

struct FindAssignments<'a, 'tcx> {
    body: &'a Body<'tcx>,
    candidates: &'a mut FxIndexMap<Local, Vec<Local>>,
    borrowed: &'a DenseBitSet<Local>,
}

impl<'tcx> Visitor<'tcx> for FindAssignments<'_, 'tcx> {
    fn visit_statement(&mut self, statement: &Statement<'tcx>, _: Location) {
        if let StatementKind::Assign(box (
            lhs,
            Rvalue::CopyForDeref(rhs) | Rvalue::Use(Operand::Copy(rhs) | Operand::Move(rhs)),
        )) = &statement.kind
        {
            let Some((src, dest)) = places_to_candidate_pair(*lhs, *rhs, self.body) else {
                return;
            };

            // As described at the top of the file, we do not go near things that have
            // their address taken.
            if self.borrowed.contains(src) || self.borrowed.contains(dest) {
                return;
            }

            // As described at the top of this file, we do not touch locals that have
            // different types.
            let src_ty = self.body.local_decls()[src].ty;
            let dest_ty = self.body.local_decls()[dest].ty;
            if src_ty != dest_ty {
                // FIXME(#112651): This check can be removed once that issue is fixed. Also update
                // the module description.
                trace!("skipped `{src:?} = {dest:?}` due to subtyping: {src_ty} != {dest_ty}");
                return;
            }

            // Also, we need to make sure that MIR actually allows the `src` to be removed.
            if is_local_required(src, self.body) {
                return;
            }

            // We may insert duplicates here, but that's fine.
            self.candidates.entry(src).or_default().push(dest);
        }
    }
}
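The filtering sequence in `visit_statement` can be sketched as a standalone function; the `u32` locals, string type names, and `required` closure here are hypothetical stand-ins for the real `Local`, `Ty`, and `is_local_required`:

```rust
use std::collections::HashMap;

// A sketch of candidate accumulation under the stated stand-ins: reject pairs
// where either side has its address taken, the types differ, or `src` is part
// of the function's interface; otherwise record `dest` as a candidate for `src`.
fn collect_candidate(
    src: u32,
    dest: u32,
    borrowed: &[u32],
    types: &HashMap<u32, &'static str>,
    required: impl Fn(u32) -> bool,
    candidates: &mut HashMap<u32, Vec<u32>>,
) {
    if borrowed.contains(&src) || borrowed.contains(&dest) {
        return; // address taken: do not go near it
    }
    if types[&src] != types[&dest] {
        return; // differing types are skipped (see the FIXME above)
    }
    if required(src) {
        return; // `src` must actually be removable
    }
    // Duplicates are fine here, matching the real visitor.
    candidates.entry(src).or_default().push(dest);
}

fn main() {
    let types: HashMap<u32, &'static str> =
        [(0, "i32"), (1, "i32"), (2, "u8")].into_iter().collect();
    let mut candidates = HashMap::new();
    collect_candidate(1, 0, &[], &types, |l| l == 0, &mut candidates);
    collect_candidate(1, 2, &[], &types, |l| l == 0, &mut candidates); // type mismatch
    collect_candidate(0, 1, &[], &types, |l| l == 0, &mut candidates); // `src` required
    assert_eq!(candidates.get(&1), Some(&vec![0]));
    assert_eq!(candidates.get(&0), None);
}
```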

/// Some locals are part of the function's interface and cannot be removed.
///
/// Note that these locals *can* still be merged with non-required locals by removing that other
/// local.
fn is_local_required(local: Local, body: &Body<'_>) -> bool {
    match body.local_kind(local) {
        LocalKind::Arg | LocalKind::ReturnPointer => true,
        LocalKind::Temp => false,
    }
}

/////////////////////////////////////////////////////////
// MIR Dump

fn dest_prop_mir_dump<'tcx>(
    tcx: TyCtxt<'tcx>,
    body: &Body<'tcx>,
    points: &DenseLocationMap,
    live: &SparseIntervalMatrix<Local, PointIndex>,
    round: usize,
) {
    let locals_live_at = |location| {
        let location = points.point_from_location(location);
        live.rows().filter(|&r| live.contains(r, location)).collect::<Vec<_>>()
    };
    dump_mir(tcx, false, "DestinationPropagation-dataflow", &round, body, |pass_where, w| {
        if let PassWhere::BeforeLocation(loc) = pass_where {
            writeln!(w, "        // live: {:?}", locals_live_at(loc))?;
        }

        Ok(())
    });
}