Expand description
As explained in crate::usefulness
, values and patterns are made from constructors applied to
fields. This file defines a Constructor
enum and various operations to manipulate them.
There are two important bits of core logic in this file: constructor inclusion and constructor
splitting. Constructor inclusion, i.e. whether a constructor is included in/covered by another,
is straightforward and defined in Constructor::is_covered_by
.
Constructor splitting is mentioned in crate::usefulness
but not detailed. We describe it
precisely here.
§Constructor grouping and splitting
As explained in the corresponding section in crate::usefulness
, to make usefulness tractable
we need to group together constructors that have the same effect when they are used to
specialize the matrix.
Example:
match (0, false) {
(0 ..=100, true) => {}
(50..=150, false) => {}
(0 ..=200, _) => {}
}
In this example we can restrict specialization to 5 cases: 0..50
, 50..=100
, 101..=150
,
151..=200
and 200..
.
In crate::usefulness
, we had said that specialize
only takes value-only constructors. We
now relax this restriction: we allow specialize
to take constructors like 0..50
as long as
we’re careful to only do that with constructors that make sense. For example, specialize(0..50, (0..=100, true))
is sensible, but specialize(50..=200, (0..=100, true))
is not.
Constructor splitting looks at the constructors in the first column of the matrix and constructs such a sensible set of constructors. Formally, we want to find a smallest disjoint set of constructors:
- Whose union covers the whole type, and
- That have no non-trivial intersection with any of the constructors in the column (i.e. they’re each either disjoint with or covered by any given column constructor).
We compute this in two steps: first PatCx::ctors_for_ty
determines the
set of all possible constructors for the type. Then ConstructorSet::split
looks at the
column of constructors and splits the set into groups accordingly. The precise invariants of
ConstructorSet::split
is described in SplitConstructorSet
.
Constructor splitting has two interesting special cases: integer range splitting (see
IntRange::split
) and slice splitting (see Slice::split
).
§The Missing
constructor
We detail a special case of constructor splitting that is a bit subtle. Take the following:
enum Direction { North, South, East, West }
match wind {
(Direction::North, 50..) => {}
(_, _) => {}
}
Here we expect constructor splitting to output two cases: North
, and “everything else”. This
“everything else” is represented by Constructor::Missing
. Unlike other constructors, it’s a
bit contextual: to know the exact list of constructors it represents we have to look at the
column. In practice however we don’t need to, because by construction it only matches rows that
have wildcards. This is how this constructor is special: the only constructor that covers it is
Wildcard
.
The only place where we care about which constructors Missing
represents is in diagnostics
(see crate::usefulness::WitnessMatrix::apply_constructor
).
We choose whether to specialize with Missing
in
crate::usefulness::compute_exhaustiveness_and_usefulness
.
§Empty types, empty constructors, and the exhaustive_patterns
feature
An empty type is a type that has no valid value, like !
, enum Void {}
, or Result<!, !>
.
They require careful handling.
First, for soundness reasons related to the possible existence of invalid values, by default we
don’t treat empty types as empty. We force them to be matched with wildcards. Except if the
exhaustive_patterns
feature is turned on, in which case we do treat them as empty. And also
except if the type has no constructors (like enum Void {}
but not like Result<!, !>
), we
specifically allow match void {}
to be exhaustive. There are additionally considerations of
place validity that are handled in crate::usefulness
. Yes this is a bit tricky.
The second thing is that regardless of the above, it is always allowed to use all the constructors of a type. For example, all the following is ok:
fn foo(x: Option<!>) {
match x {
None => {}
Some(_) => {}
}
}
fn bar(x: &[!]) -> u32 {
match x {
[] => 1,
[_] => 2,
[_, _] => 3,
}
}
Moreover, take the following:
match x {
None => {}
}
On a normal type, we would identify Some
as missing and tell the user. If x: Option<!>
however (and exhaustive_patterns
is on), it’s ok to omit Some
. When listing the constructors
of a type, we must therefore track which can be omitted.
Let’s call “empty” a constructor that matches no valid value for the type, like Some
for the
type Option<!>
. What this all means is that ConstructorSet
must know which constructors are
empty. The difference between empty and nonempty constructors is that empty constructors need
not be present for the match to be exhaustive.
A final remark: empty constructors of arity 0 break specialization, we must avoid them. The
reason is that if we specialize by them, nothing remains to witness the emptiness; the rest of
the algorithm can’t distinguish them from a nonempty constructor. The only known case where this
could happen is the [..]
pattern on [!; N]
with N > 0
so we must take care to not emit it.
This is all handled by PatCx::ctors_for_ty
and
ConstructorSet::split
. The invariants of SplitConstructorSet
are also of interest.
§Unions
Unions allow us to match a value via several overlapping representations at the same time. For
example, the following is exhaustive because when seeing the value as a boolean we handled all
possible cases (other cases such as n == 3
would trigger UB).
union U8AsBool {
n: u8,
b: bool,
}
let x = U8AsBool { n: 1 };
unsafe {
match x {
U8AsBool { n: 2 } => {}
U8AsBool { b: true } => {}
U8AsBool { b: false } => {}
}
}
Pattern-matching has no knowledge that e.g. false as u8 == 0
, so the values we consider in the
algorithm look like U8AsBool { b: true, n: 2 }
. In other words, for the most part a union is
treated like a struct with the same fields. The difference lies in how we construct witnesses of
non-exhaustiveness.
§Opaque patterns
Some patterns, such as constants that are not allowed to be matched structurally, cannot be
inspected, which we handle with Constructor::Opaque
. Since we know nothing of these patterns,
we assume they never cover each other. In order to respect the invariants of
SplitConstructorSet
, we give each Opaque
constructor a unique id so we can recognize it.
Structs§
- An exclusive interval, used for precise integer exhaustiveness checking.
IntRange
s always store a contiguous range. - A globally unique id to distinguish
Opaque
patterns. - A constructor for array and slice patterns.
- Describes the result of analyzing the constructors in a column of a match.
Enums§
- A value can be decomposed into a constructor applied to some fields. This struct represents the constructor. See also
Fields
. - Describes the set of all constructors for a type. For details, in particular about the emptiness of constructors, see the top of the file.
- A possibly infinite integer. Values are encoded such that the ordering on
u128
matches the natural order on the original type. For example,-128i8
is encoded as0
and127i8
as255
. Seesigned_bias
for details. - Presence 🔒Whether we have seen a constructor in the column or not.