For now, this reference is a best-effort document. We strive for validity and completeness, but are not yet there. In the future, the docs and lang teams will work together to figure out how best to do this. Until then, if you find something wrong or missing, file an issue or send in a pull request.

# Introduction

This book is the primary reference for the Rust programming language. It provides three kinds of material:

• Chapters that informally describe each language construct and its use.
• Chapters that informally describe the memory model, concurrency model, runtime services, linkage model, and debugging facilities.
• Appendix chapters providing rationale and references to languages that influenced the design.

Warning: This book is incomplete. Documenting everything takes a while. See the GitHub issues for what is not documented in this book.

## What The Reference is Not

This book does not serve as an introduction to the language. Background familiarity with the language is assumed. A separate book is available to help acquire such background familiarity.

This book also does not serve as a reference to the standard library included in the language distribution. Those libraries are documented separately by extracting documentation attributes from their source code. Many of the features that one might expect to be language features are library features in Rust, so what you're looking for may be there, not here.

Similarly, this book does not usually document the specifics of rustc as a tool or of Cargo. rustc has its own book. Cargo has a book that contains a reference. There are a few pages such as linkage that still describe how rustc works.

This book also only serves as a reference to what is available in stable Rust. For unstable features being worked on, see the Unstable Book.

Finally, this book is not normative. It may include details that are specific to rustc itself, and should not be taken as a specification for the Rust language. We intend to produce such a book someday, and until then, the reference is the closest thing we have to one.

## How to Use This Book

This book does not assume you are reading it sequentially. Each chapter can generally be read standalone, but will cross-link to other chapters for facets of the language it refers to but does not discuss.

There are two main ways to read this document.

The first is to answer a specific question. If you know which chapter answers that question, you can jump to that chapter in the table of contents. Otherwise, you can press s or click the magnifying glass on the top bar to search for keywords related to your question. For example, say you wanted to know when a temporary value created in a let statement is dropped. If you didn't already know that the lifetime of temporaries is defined in the expressions chapter, you could search "temporary let" and the first search result will take you to that section.

That said, there is no wrong way to read this book. Read it however you feel helps you best.

### Conventions

Like all technical books, this book has certain conventions in how it displays information. These conventions are documented here.

• Statements that define a term contain that term in italics. Whenever that term is used outside of that chapter, it is usually a link to the section that has this definition.

An example term is an example of a term being defined.

• Differences in the language that depend on the edition the crate is compiled under are in a blockquote that starts with the words "Edition Differences:" in bold.

Edition Differences: In the 2015 edition, this syntax is valid; it is disallowed as of the 2018 edition.

• Notes that contain useful information about the state of the book or point out useful, but mostly out of scope, information are in blockquotes that start with the word "Note:" in bold.

Note: This is an example note.

• Warnings that show unsound behavior in the language or possibly confusing interactions of language features are in a special warning box.

Warning: This is an example warning.

• Code snippets inline in the text are inside <code> tags.

Longer code examples are in a syntax highlighted box that has controls for copying, executing, and showing hidden lines in the top right corner.

# // This is a hidden line.
fn main() {
    println!("This is a code example");
}

• The grammar and lexical structure is in blockquotes with either "Lexer" or "Syntax" in bold superscript as the first line.

Syntax
ExampleGrammar:
~ Expression
| box Expression

See Notation for more detail.

## Contributing

We welcome contributions of all kinds.

You can contribute to this book by opening an issue or sending a pull request to the Rust Reference repository. If this book does not answer your question, and you think its answer is in scope of this book, please do not hesitate to file an issue or ask about it in the #docs channels on Discord. Knowing what people use this book for the most helps direct our attention to making those sections the best that they can be.

# Notation

## Grammar

The following notations are used by the Lexer and Syntax grammar snippets:

| Notation | Examples | Meaning |
|----------|----------|---------|
| CAPITAL | KW_IF, INTEGER_LITERAL | A token produced by the lexer |
| ItalicCamelCase | LetStatement, Item | A syntactical production |
| string | x, while, * | The exact character(s) |
| \x | \n, \r, \t, \0 | The character represented by this escape |
| x? | pub? | An optional item |
| x\* | OuterAttribute\* | 0 or more of x |
| x+ | MacroMatch+ | 1 or more of x |
| xa..b | HEX_DIGIT1..6 | a to b repetitions of x |
| \| | u8 \| u16, Block \| Item | Either one or another |
| [ ] | [b B] | Any of the characters listed |
| [ - ] | [a-z] | Any of the characters in the range |
| ~[ ] | ~[b B] | Any characters, except those listed |
| ~string | ~\n, ~*/ | Any characters, except this sequence |
| ( ) | (, Parameter)? | Groups items |

## String table productions

Some rules in the grammar — notably unary operators, binary operators, and keywords — are given in a simplified form: as a listing of printable strings. These cases form a subset of the rules regarding the token rule, and are assumed to be the result of a lexical-analysis phase feeding the parser, driven by a DFA, operating over the disjunction of all such string table entries.

When such a string in monospace font occurs inside the grammar, it is an implicit reference to a single member of such a string table production. See tokens for more information.

# Input format

Rust input is interpreted as a sequence of Unicode code points encoded in UTF-8.

# Keywords

Rust divides keywords into three categories:

## Strict keywords

These keywords can only be used in their correct contexts. They cannot be used as the names of:

• Items
• Variables and function parameters
• Fields and variants
• Type parameters
• Lifetime parameters or loop labels
• Macros or attributes
• Macro placeholders
• Crates

Lexer:
KW_AS : as
KW_BREAK : break
KW_CONST : const
KW_CONTINUE : continue
KW_CRATE : crate
KW_ELSE : else
KW_ENUM : enum
KW_EXTERN : extern
KW_FALSE : false
KW_FN : fn
KW_FOR : for
KW_IF : if
KW_IMPL : impl
KW_IN : in
KW_LET : let
KW_LOOP : loop
KW_MATCH : match
KW_MOD : mod
KW_MOVE : move
KW_MUT : mut
KW_PUB : pub
KW_REF : ref
KW_RETURN : return
KW_SELFVALUE : self
KW_SELFTYPE : Self
KW_STATIC : static
KW_STRUCT : struct
KW_SUPER : super
KW_TRAIT : trait
KW_TRUE : true
KW_TYPE : type
KW_UNSAFE : unsafe
KW_USE : use
KW_WHERE : where
KW_WHILE : while

The following keywords were added beginning in the 2018 edition.

Lexer 2018+
KW_DYN : dyn

## Reserved keywords

These keywords aren't used yet, but they are reserved for future use. They have the same restrictions as strict keywords. The reasoning behind this is to make current programs forward compatible with future versions of Rust by forbidding them from using these keywords.

Lexer
KW_ABSTRACT : abstract
KW_BECOME : become
KW_BOX : box
KW_DO : do
KW_FINAL : final
KW_MACRO : macro
KW_OVERRIDE : override
KW_PRIV : priv
KW_TYPEOF : typeof
KW_UNSIZED : unsized
KW_VIRTUAL : virtual
KW_YIELD : yield

The following keywords are reserved beginning in the 2018 edition.

Lexer 2018+
KW_ASYNC : async
KW_AWAIT : await
KW_TRY : try

## Weak keywords

These keywords have special meaning only in certain contexts. For example, it is possible to declare a variable or method with the name union.

• union is used to declare a union and is only a keyword when used in a union declaration.

• 'static is used for the static lifetime and cannot be used as a generic lifetime parameter.

// error[E0262]: invalid lifetime parameter name: 'static
fn invalid_lifetime_parameter<'static>(s: &'static str) -> &'static str { s }

• In the 2015 edition, dyn is a keyword when used in a type position followed by a path that does not start with ::.

Beginning in the 2018 edition, dyn has been promoted to a strict keyword.

Lexer
KW_UNION : union
KW_STATICLIFETIME : 'static

Lexer 2015
KW_DYN : dyn
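As a sketch of the union rule above (the identifier names are illustrative):

```rust
// `union` is a weak keyword: it is a keyword only in a union declaration,
// and remains usable as a function or variable name.
union U {
    i: i32,
    f: f32,
}

fn union(a: i32, b: i32) -> i32 { a + b } // legal function name

fn main() {
    let union = union(1, 2);              // legal variable name
    assert_eq!(union, 3);
    let u = U { i: 5 };
    // Reading a union field is unsafe.
    assert_eq!(unsafe { u.i }, 5);
}
```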

# Identifiers

Lexer:
IDENTIFIER_OR_KEYWORD :
[a-z A-Z] [a-z A-Z 0-9 _]\*
| _ [a-z A-Z 0-9 _]+

RAW_IDENTIFIER : r# IDENTIFIER_OR_KEYWORD Except crate, extern, self, super, Self

NON_KEYWORD_IDENTIFIER : IDENTIFIER_OR_KEYWORD Except a strict or reserved keyword

IDENTIFIER :
NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER

An identifier is any nonempty ASCII string of the following form:

Either

• The first character is a letter.
• The remaining characters are alphanumeric or _.

Or

• The first character is _.
• The identifier is more than one character. _ alone is not an identifier.
• The remaining characters are alphanumeric or _.

A raw identifier is like a normal identifier, but prefixed by r#. (Note that the r# prefix is not included as part of the actual identifier.) Unlike a normal identifier, a raw identifier may be any strict or reserved keyword except the ones listed above for RAW_IDENTIFIER.
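A sketch of a raw identifier in use (requires the 2018 edition or later; the names are illustrative):

```rust
// `match` is a strict keyword, so it cannot name a function directly,
// but the raw identifier `r#match` can.
fn r#match(needle: &str, haystack: &str) -> bool {
    haystack.contains(needle)
}

fn main() {
    // The `r#` prefix is not part of the identifier itself.
    assert!(r#match("foo", "foobar"));
}
```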

# Comments

Lexer
LINE_COMMENT :
// (~[/ !] | //) ~\n\*
| //

BLOCK_COMMENT :
/* (~[* !] | ** | BlockCommentOrDoc) (BlockCommentOrDoc | ~*/)\* */
| /**/
| /***/

INNER_LINE_DOC :
//! ~[\n IsolatedCR]\*

INNER_BLOCK_DOC :
/*! ( BlockCommentOrDoc | ~[*/ IsolatedCR] )\* */

OUTER_LINE_DOC :
/// (~/ ~[\n IsolatedCR]\*)?

OUTER_BLOCK_DOC :
/** (~* | BlockCommentOrDoc ) (BlockCommentOrDoc | ~[*/ IsolatedCR])\* */

BlockCommentOrDoc :
BLOCK_COMMENT
| OUTER_BLOCK_DOC
| INNER_BLOCK_DOC

IsolatedCR :
A \r not followed by a \n

Comments in Rust code follow the general C++ style of line (//) and block (/* ... */) comment forms. Nested block comments are supported.

Non-doc comments are interpreted as a form of whitespace.

Line doc comments beginning with exactly three slashes (///), and block doc comments (/** ... */), both outer doc comments, are interpreted as a special syntax for doc attributes. That is, they are equivalent to writing #[doc="..."] around the body of the comment, i.e., /// Foo turns into #[doc="Foo"] and /** Bar */ turns into #[doc="Bar"].

Line comments beginning with //! and block comments /*! ... */ are doc comments that apply to the parent of the comment, rather than the item that follows. That is, they are equivalent to writing #![doc="..."] around the body of the comment. //! comments are usually used to document modules that occupy a source file.

Isolated CRs (\r), i.e. not followed by LF (\n), are not allowed in doc comments.

## Examples


# #![allow(unused_variables)]
#fn main() {
//! A doc comment that applies to the implicit anonymous module of this crate

pub mod outer_module {

    //!  - Inner line doc
    //!! - Still an inner line doc (but with a bang at the beginning)

    /*!  - Inner block doc */
    /*!! - Still an inner block doc (but with a bang at the beginning) */

    //   - Only a comment
    ///  - Outer line doc (exactly 3 slashes)
    //// - Only a comment

    /*   - Only a comment */
    /**  - Outer block doc (exactly 2 asterisks) */
    /*** - Only a comment */

    pub mod inner_module {}

    /* In Rust /* we can /* nest comments */ */ */

    // All three types of block comments can contain or be nested inside
    // any other type:

    /*   /* */  /** */  /*! */  */
    /*!  /* */  /** */  /*! */  */
    /**  /* */  /** */  /*! */  */
    pub mod dummy_item {}
}

pub mod degenerate_cases {
    // empty inner line doc
    //!

    // empty inner block doc
    /*!*/

    // empty line comment
    //

    // empty outer line doc
    ///

    // empty block comment
    /**/

    pub mod dummy_item {}

    // empty 2-asterisk block isn't a doc block, it is a block comment
    /***/
}

/* The next one isn't allowed because outer doc comments
   require an item that will receive the doc */

/// Where is my item?
#   mod boo {}
#}

# Whitespace

Whitespace is any non-empty string containing only characters that have the Pattern_White_Space Unicode property, namely:

• U+0009 (horizontal tab, '\t')
• U+000A (line feed, '\n')
• U+000B (vertical tab)
• U+000C (form feed)
• U+000D (carriage return, '\r')
• U+0020 (space, ' ')
• U+0085 (next line)
• U+200E (left-to-right mark)
• U+200F (right-to-left mark)
• U+2028 (line separator)
• U+2029 (paragraph separator)

Rust is a "free-form" language, meaning that all forms of whitespace serve only to separate tokens in the grammar, and have no semantic significance.

A Rust program has identical meaning if each whitespace element is replaced with any other legal whitespace element, such as a single space character.
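For example, these two statements are tokenized identically despite different whitespace (a minimal sketch):

```rust
fn main() {
    // Whitespace only separates tokens; its amount and kind are irrelevant.
    let x=1+2;
    let y =
        1    +    2 ;
    assert_eq!(x, y);
}
```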

# Tokens

Tokens are primitive productions in the grammar defined by regular (non-recursive) languages. Rust source input can be broken down into the following kinds of tokens:

• Keywords
• Identifiers
• Literals
• Lifetimes
• Punctuation
• Delimiters

Within this documentation's grammar, "simple" tokens are given in string table production form, and appear in monospace font.

## Literals

A literal is an expression consisting of a single token, rather than a sequence of tokens, that immediately and directly denotes the value it evaluates to, rather than referring to it by name or some other evaluation rule. A literal is a form of constant expression, so is evaluated (primarily) at compile time.

### Examples

#### Characters and strings

|  | Example | # sets | Characters | Escapes |
|---|---------|--------|------------|---------|
| Character | 'H' | 0 | All Unicode | Quote & ASCII & Unicode |
| String | "hello" | 0 | All Unicode | Quote & ASCII & Unicode |
| Raw string | r#"hello"# | 0 or more* | All Unicode | N/A |
| Byte | b'H' | 0 | All ASCII | Quote & Byte |
| Byte string | b"hello" | 0 | All ASCII | Quote & Byte |
| Raw byte string | br#"hello"# | 0 or more* | All ASCII | N/A |

* The number of #s on each side of the same literal must be equivalent
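The matching-# rule can be sketched as follows:

```rust
fn main() {
    // One `#` on each side lets the body contain bare double quotes.
    let s = r#"contains "quotes" freely"#;
    assert_eq!(s, "contains \"quotes\" freely");

    // Two `#`s on each side allow the sequence `"#` to appear in the body.
    let t = r##"ends with "# inside"##;
    assert_eq!(t, "ends with \"# inside");
}
```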

#### ASCII escapes

|  | Name |
|---|------|
| \x41 | 7-bit character code (exactly 2 digits, up to 0x7F) |
| \n | Newline |
| \r | Carriage return |
| \t | Tab |
| \\ | Backslash |
| \0 | Null |

#### Byte escapes

|  | Name |
|---|------|
| \x7F | 8-bit character code (exactly 2 digits) |
| \n | Newline |
| \r | Carriage return |
| \t | Tab |
| \\ | Backslash |
| \0 | Null |

#### Unicode escapes

|  | Name |
|---|------|
| \u{7FFF} | 24-bit Unicode character code (up to 6 digits) |

#### Quote escapes

|  | Name |
|---|------|
| \' | Single quote |
| \" | Double quote |

#### Numbers

| Number literals* | Example | Exponentiation | Suffixes |
|------------------|---------|----------------|----------|
| Decimal integer | 98_222 | N/A | Integer suffixes |
| Hex integer | 0xff | N/A | Integer suffixes |
| Octal integer | 0o77 | N/A | Integer suffixes |
| Binary integer | 0b1111_0000 | N/A | Integer suffixes |
| Floating-point | 123.0E+77 | Optional | Floating-point suffixes |

* All number literals allow _ as a visual separator: 1_234.0E+18f64

#### Suffixes

A suffix is a non-raw identifier immediately (without whitespace) following the primary part of a literal.

Any kind of literal (string, integer, etc) with any suffix is valid as a token, and can be passed to a macro without producing an error.
The macro itself will decide how to interpret such a token and whether to produce an error or not.


# #![allow(unused_variables)]
#fn main() {
macro_rules! blackhole { ($tt:tt) => () }
blackhole!("string"suffix); // OK
#}

However, suffixes on literal tokens parsed as Rust code are restricted. Any suffixes are rejected on non-numeric literal tokens, and numeric literal tokens are accepted only with suffixes from the list below.

| Integer | Floating-point |
|---------|----------------|
| u8, i8, u16, i16, u32, i32, u64, i64, u128, i128, usize, isize | f32, f64 |

### Character and string literals

#### Character literals

Lexer
CHAR_LITERAL :
' ( ~[' \ \n \r \t] | QUOTE_ESCAPE | ASCII_ESCAPE | UNICODE_ESCAPE ) '

QUOTE_ESCAPE :
\' | \"

ASCII_ESCAPE :
\x OCT_DIGIT HEX_DIGIT
| \n | \r | \t | \\ | \0

UNICODE_ESCAPE :
\u{ ( HEX_DIGIT _\* )1..6 }

A character literal is a single Unicode character enclosed within two U+0027 (single-quote) characters, with the exception of U+0027 itself, which must be escaped by a preceding U+005C character (\).

#### String literals

Lexer
STRING_LITERAL :
" ( ~[" \ IsolatedCR] | QUOTE_ESCAPE | ASCII_ESCAPE | UNICODE_ESCAPE | STRING_CONTINUE )\* "

STRING_CONTINUE :
\ followed by \n

A string literal is a sequence of any Unicode characters enclosed within two U+0022 (double-quote) characters, with the exception of U+0022 itself, which must be escaped by a preceding U+005C character (\).

Line-break characters are allowed in string literals. Normally they represent themselves (i.e. no translation), but as a special exception, when an unescaped U+005C character (\) occurs immediately before the newline (U+000A), the U+005C character, the newline, and all whitespace at the beginning of the next line are ignored. Thus a and b are equal:

# #![allow(unused_variables)]
#fn main() {
let a = "foobar";
let b = "foo\
         bar";
assert_eq!(a, b);
#}

#### Character escapes

Some additional escapes are available in either character or non-raw string literals.
An escape starts with a U+005C (\) and continues with one of the following forms:

• A 7-bit code point escape starts with U+0078 (x) and is followed by exactly two hex digits with value up to 0x7F. It denotes the ASCII character with value equal to the provided hex value. Higher values are not permitted because it is ambiguous whether they mean Unicode code points or byte values.
• A 24-bit code point escape starts with U+0075 (u) and is followed by up to six hex digits surrounded by braces U+007B ({) and U+007D (}). It denotes the Unicode code point equal to the provided hex value.
• A whitespace escape is one of the characters U+006E (n), U+0072 (r), or U+0074 (t), denoting the Unicode values U+000A (LF), U+000D (CR) or U+0009 (HT) respectively.
• The null escape is the character U+0030 (0) and denotes the Unicode value U+0000 (NUL).
• The backslash escape is the character U+005C (\) which must be escaped in order to denote itself.

#### Raw string literals

Lexer
RAW_STRING_LITERAL :
r RAW_STRING_CONTENT

RAW_STRING_CONTENT :
" ( ~ IsolatedCR )* (non-greedy) "
| # RAW_STRING_CONTENT #

Raw string literals do not process any escapes. They start with the character U+0072 (r), followed by zero or more of the character U+0023 (#) and a U+0022 (double-quote) character. The raw string body can contain any sequence of Unicode characters and is terminated only by another U+0022 (double-quote) character, followed by the same number of U+0023 (#) characters that preceded the opening U+0022 (double-quote) character.

All Unicode characters contained in the raw string body represent themselves; the characters U+0022 (double-quote) (except when followed by at least as many U+0023 (#) characters as were used to start the raw string literal) and U+005C (\) do not have any special meaning.
Examples for string literals:

# #![allow(unused_variables)]
#fn main() {
"foo"; r"foo";                     // foo
"\"foo\""; r#""foo""#;             // "foo"

"foo #\"# bar";
r##"foo #"# bar"##;                // foo #"# bar

"\x52"; "R"; r"R";                 // R
"\\x52"; r"\x52";                  // \x52
#}

### Byte and byte string literals

#### Byte literals

Lexer
BYTE_LITERAL :
b' ( ASCII_FOR_CHAR | BYTE_ESCAPE ) '

ASCII_FOR_CHAR :
any ASCII (i.e. 0x00 to 0x7F), except ', \, \n, \r or \t

BYTE_ESCAPE :
\x HEX_DIGIT HEX_DIGIT
| \n | \r | \t | \\ | \0

A byte literal is a single ASCII character (in the U+0000 to U+007F range) or a single escape preceded by the characters U+0062 (b) and U+0027 (single-quote), and followed by the character U+0027. If the character U+0027 is present within the literal, it must be escaped by a preceding U+005C (\) character. It is equivalent to a u8 unsigned 8-bit integer number literal.

#### Byte string literals

Lexer
BYTE_STRING_LITERAL :
b" ( ASCII_FOR_STRING | BYTE_ESCAPE | STRING_CONTINUE )\* "

ASCII_FOR_STRING :
any ASCII (i.e. 0x00 to 0x7F), except ", \ and IsolatedCR

A non-raw byte string literal is a sequence of ASCII characters and escapes, preceded by the characters U+0062 (b) and U+0022 (double-quote), and followed by the character U+0022. If the character U+0022 is present within the literal, it must be escaped by a preceding U+005C (\) character. Alternatively, a byte string literal can be a raw byte string literal, defined below. The type of a byte string literal of length n is &'static [u8; n].

Some additional escapes are available in either byte or non-raw byte string literals. An escape starts with a U+005C (\) and continues with one of the following forms:

• A byte escape starts with U+0078 (x) and is followed by exactly two hex digits. It denotes the byte equal to the provided hex value.
• A whitespace escape is one of the characters U+006E (n), U+0072 (r), or U+0074 (t), denoting the byte values 0x0A (ASCII LF), 0x0D (ASCII CR) or 0x09 (ASCII HT) respectively.
• The null escape is the character U+0030 (0) and denotes the byte value 0x00 (ASCII NUL).
• The backslash escape is the character U+005C (\) which must be escaped in order to denote its ASCII encoding 0x5C.

#### Raw byte string literals

Lexer
RAW_BYTE_STRING_LITERAL :
br RAW_BYTE_STRING_CONTENT

RAW_BYTE_STRING_CONTENT :
" ASCII* (non-greedy) "
| # RAW_BYTE_STRING_CONTENT #

ASCII :
any ASCII (i.e. 0x00 to 0x7F)

Raw byte string literals do not process any escapes. They start with the character U+0062 (b), followed by U+0072 (r), followed by zero or more of the character U+0023 (#), and a U+0022 (double-quote) character. The raw string body can contain any sequence of ASCII characters and is terminated only by another U+0022 (double-quote) character, followed by the same number of U+0023 (#) characters that preceded the opening U+0022 (double-quote) character. A raw byte string literal cannot contain any non-ASCII byte.

All characters contained in the raw string body represent their ASCII encoding; the characters U+0022 (double-quote) (except when followed by at least as many U+0023 (#) characters as were used to start the raw string literal) and U+005C (\) do not have any special meaning.

Examples for byte string literals:

# #![allow(unused_variables)]
#fn main() {
b"foo"; br"foo";                     // foo
b"\"foo\""; br#""foo""#;             // "foo"

b"foo #\"# bar";
br##"foo #"# bar"##;                 // foo #"# bar

b"\x52"; b"R"; br"R";                // R
b"\\x52"; br"\x52";                  // \x52
#}

### Number literals

A number literal is either an integer literal or a floating-point literal. The grammar for recognizing the two kinds of literals is mixed.

#### Integer literals

Lexer
INTEGER_LITERAL :
( DEC_LITERAL | BIN_LITERAL | OCT_LITERAL | HEX_LITERAL ) INTEGER_SUFFIX?
DEC_LITERAL :
DEC_DIGIT (DEC_DIGIT|_)\*

TUPLE_INDEX :
0 | NON_ZERO_DEC_DIGIT DEC_DIGIT\*

BIN_LITERAL :
0b (BIN_DIGIT|_)\* BIN_DIGIT (BIN_DIGIT|_)\*

OCT_LITERAL :
0o (OCT_DIGIT|_)\* OCT_DIGIT (OCT_DIGIT|_)\*

HEX_LITERAL :
0x (HEX_DIGIT|_)\* HEX_DIGIT (HEX_DIGIT|_)\*

BIN_DIGIT : [0-1]

OCT_DIGIT : [0-7]

DEC_DIGIT : [0-9]

NON_ZERO_DEC_DIGIT : [1-9]

HEX_DIGIT : [0-9 a-f A-F]

INTEGER_SUFFIX :
u8 | u16 | u32 | u64 | u128 | usize
| i8 | i16 | i32 | i64 | i128 | isize

An integer literal has one of four forms:

• A decimal literal starts with a decimal digit and continues with any mixture of decimal digits and underscores.
• A tuple index is either 0, or starts with a non-zero decimal digit and continues with zero or more decimal digits. Tuple indexes are used to refer to the fields of tuples, tuple structs and tuple variants.
• A hex literal starts with the character sequence U+0030 U+0078 (0x) and continues as any mixture (with at least one digit) of hex digits and underscores.
• An octal literal starts with the character sequence U+0030 U+006F (0o) and continues as any mixture (with at least one digit) of octal digits and underscores.
• A binary literal starts with the character sequence U+0030 U+0062 (0b) and continues as any mixture (with at least one digit) of binary digits and underscores.

Like any literal, an integer literal may be followed (immediately, without any spaces) by an integer suffix, which forcibly sets the type of the literal. The integer suffix must be the name of one of the integral types: u8, i8, u16, i16, u32, i32, u64, i64, u128, i128, usize, or isize.

The type of an unsuffixed integer literal is determined by type inference:

• If an integer type can be uniquely determined from the surrounding program context, the unsuffixed integer literal has that type.
• If the program context under-constrains the type, it defaults to the signed 32-bit integer i32.
• If the program context over-constrains the type, it is considered a static type error.
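The TUPLE_INDEX form described above appears in field access expressions; a minimal sketch:

```rust
fn main() {
    // A tuple index is a plain decimal literal with no leading zero.
    let pair = ("ten", 10);
    assert_eq!(pair.0, "ten");
    assert_eq!(pair.1, 10);

    // The same form indexes the fields of a tuple struct.
    struct Point(i32, i32);
    let p = Point(3, 4);
    assert_eq!(p.0 + p.1, 7);
}
```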
Examples of integer literals of various forms:

# #![allow(unused_variables)]
#fn main() {
123;                               // type i32
123i32;                            // type i32
123u32;                            // type u32
123_u32;                           // type u32
let a: u64 = 123;                  // type u64

0xff;                              // type i32
0xff_u8;                           // type u8

0o70;                              // type i32
0o70_i16;                          // type i16

0b1111_1111_1001_0000;             // type i32
0b1111_1111_1001_0000i64;          // type i64
0b________1;                       // type i32

0usize;                            // type usize
#}

Examples of invalid integer literals:

// invalid suffixes
0invalidSuffix;

// uses numbers of the wrong base
123AFB43;
0b0102;
0o0581;

// integers too big for their type (they overflow)
128_i8;
256_u8;

// bin, hex and octal literals must have at least one digit
0b_;
0b____;

Note that the Rust syntax considers -1i8 as an application of the unary minus operator to an integer literal 1i8, rather than a single integer literal.

#### Floating-point literals

Lexer
FLOAT_LITERAL :
DEC_LITERAL . (not immediately followed by ., _ or an identifier)
| DEC_LITERAL FLOAT_EXPONENT
| DEC_LITERAL . DEC_LITERAL FLOAT_EXPONENT?
| DEC_LITERAL (. DEC_LITERAL)? FLOAT_EXPONENT? FLOAT_SUFFIX

FLOAT_EXPONENT :
(e|E) (+|-)? (DEC_DIGIT|_)\* DEC_DIGIT (DEC_DIGIT|_)\*

FLOAT_SUFFIX :
f32 | f64

A floating-point literal has one of two forms:

• A decimal literal followed by a period character U+002E (.). This is optionally followed by another decimal literal, with an optional exponent.
• A single decimal literal followed by an exponent.

Like integer literals, a floating-point literal may be followed by a suffix, so long as the pre-suffix part does not end with U+002E (.). The suffix forcibly sets the type of the literal. There are two valid floating-point suffixes, f32 and f64 (the 32-bit and 64-bit floating point types), which explicitly determine the type of the literal.

The type of an unsuffixed floating-point literal is determined by type inference:

• If a floating-point type can be uniquely determined from the surrounding program context, the unsuffixed floating-point literal has that type.
• If the program context under-constrains the type, it defaults to f64.
• If the program context over-constrains the type, it is considered a static type error.

Examples of floating-point literals of various forms:

# #![allow(unused_variables)]
#fn main() {
123.0f64;        // type f64
0.1f64;          // type f64
0.1f32;          // type f32
12E+99_f64;      // type f64
let x: f64 = 2.; // type f64
#}

This last example is different because it is not possible to use the suffix syntax with a floating point literal ending in a period. 2.f64 would attempt to call a method named f64 on 2.

The representation semantics of floating-point numbers are described in "Machine Types".

### Boolean literals

Lexer
BOOLEAN_LITERAL :
true
| false

The two values of the boolean type are written true and false.

## Lifetimes and loop labels

Lexer
LIFETIME_TOKEN :
' IDENTIFIER_OR_KEYWORD
| '_

LIFETIME_OR_LABEL :
' NON_KEYWORD_IDENTIFIER

Lifetime parameters and loop labels use LIFETIME_OR_LABEL tokens. Any LIFETIME_TOKEN will be accepted by the lexer, and for example, can be used in macros.

## Punctuation

Punctuation symbol tokens are listed here for completeness. Their individual usages and meanings are defined in the linked pages.
| Symbol | Name | Usage |
|--------|------|-------|
| + | Plus | Addition, Trait Bounds, Macro Kleene Matcher |
| - | Minus | Subtraction, Negation |
| * | Star | Multiplication, Dereference, Raw Pointers, Macro Kleene Matcher |
| / | Slash | Division |
| % | Percent | Remainder |
| ^ | Caret | Bitwise and Logical XOR |
| ! | Not | Bitwise and Logical NOT, Macro Calls, Inner Attributes, Never Type |
| & | And | Bitwise and Logical AND, Borrow, References, Reference patterns |
| \| | Or | Bitwise and Logical OR, Closures, Match |
| && | AndAnd | Lazy AND, Borrow, References, Reference patterns |
| \|\| | OrOr | Lazy OR, Closures |
| << | Shl | Shift Left, Nested Generics |
| >> | Shr | Shift Right, Nested Generics |
| += | PlusEq | Addition assignment |
| -= | MinusEq | Subtraction assignment |
| *= | StarEq | Multiplication assignment |
| /= | SlashEq | Division assignment |
| %= | PercentEq | Remainder assignment |
| ^= | CaretEq | Bitwise XOR assignment |
| &= | AndEq | Bitwise And assignment |
| \|= | OrEq | Bitwise Or assignment |
| <<= | ShlEq | Shift Left assignment |
| >>= | ShrEq | Shift Right assignment, Nested Generics |
| = | Eq | Assignment, Attributes, Various type definitions |
| == | EqEq | Equal |
| != | Ne | Not Equal |
| > | Gt | Greater than, Generics, Paths |
| < | Lt | Less than, Generics, Paths |
| >= | Ge | Greater than or equal to, Generics |
| <= | Le | Less than or equal to |
| @ | At | Subpattern binding |
| _ | Underscore | Wildcard patterns, Inferred types |
| . | Dot | Field access, Tuple index |
| .. | DotDot | Range, Struct expressions, Patterns |
| ... | DotDotDot | Variadic functions, Range patterns |
| ..= | DotDotEq | Inclusive Range, Range patterns |
| , | Comma | Various separators |
| ; | Semi | Terminator for various items and statements, Array types |
| : | Colon | Various separators |
| :: | PathSep | Path separator |
| -> | RArrow | Function return type, Closure return type |
| => | FatArrow | Match arms, Macros |
| # | Pound | Attributes |
| $ | Dollar | Macros |
| ? | Question | Question mark operator, Questionably sized |

## Delimiters

Bracket punctuation is used in various parts of the grammar. An open bracket must always be paired with a close bracket. Brackets and the tokens within them are referred to as "token trees" in macros. The three types of brackets are:

| Bracket | Type |
|---------|------|
| { } | Curly braces |
| [ ] | Square brackets |
| ( ) | Parentheses |

# Paths

A path is a sequence of one or more path segments logically separated by a namespace qualifier (::). If a path consists of only one segment, it refers to either an item or a variable in a local control scope. If a path has multiple segments, it always refers to an item.

Two examples of simple paths consisting of only identifier segments:

x;
x::y::z;
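A runnable sketch of the two cases (the module and item names are illustrative):

```rust
mod x {
    pub mod y {
        pub fn z() -> i32 { 7 }
    }
}

fn main() {
    let x = 1;            // `x`: one segment, here a local variable
    let v = x::y::z();    // `x::y::z`: multiple segments, always an item
    assert_eq!(x + v, 8);
}
```

Note that the single-segment path `x` and the multi-segment path starting with `x` resolve in different namespaces, so the local variable does not shadow the module.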


## Types of paths

### Simple Paths

Syntax
SimplePath :
::? SimplePathSegment (:: SimplePathSegment)\*

SimplePathSegment :
IDENTIFIER | super | self | crate | $crate

Simple paths are used in visibility markers, attributes, macros, and use items. Examples:

# #![allow(unused_variables)]
#fn main() {
use std::io::{self, Write};
mod m {
    #[clippy::cyclomatic_complexity = "0"]
    pub (in super) fn f1() {}
}
#}

### Paths in expressions

Syntax
PathInExpression :
::? PathExprSegment (:: PathExprSegment)\*

PathExprSegment :
PathIdentSegment (:: GenericArgs)?

PathIdentSegment :
IDENTIFIER | super | self | Self | crate | $crate

GenericArgs :
< >
| < GenericArgsLifetimes ,? >
| < GenericArgsTypes ,? >
| < GenericArgsBindings ,? >
| < GenericArgsTypes , GenericArgsBindings ,? >
| < GenericArgsLifetimes , GenericArgsTypes ,? >
| < GenericArgsLifetimes , GenericArgsBindings ,? >
| < GenericArgsLifetimes , GenericArgsTypes , GenericArgsBindings ,? >

GenericArgsLifetimes :
Lifetime (, Lifetime)\*

GenericArgsTypes :
Type (, Type)\*

GenericArgsBindings :
GenericArgsBinding (, GenericArgsBinding)\*

GenericArgsBinding :
IDENTIFIER = Type

Paths in expressions allow for paths with generic arguments to be specified. They are used in various places in expressions and patterns.

The :: token is required before the opening < for generic arguments to avoid ambiguity with the less-than operator. This is colloquially known as "turbofish" syntax.


# #![allow(unused_variables)]
#fn main() {
(0..10).collect::<Vec<_>>();
Vec::<u8>::with_capacity(1024);
#}

## Qualified paths

Syntax
QualifiedPathInExpression :
QualifiedPathType (:: PathExprSegment)+

QualifiedPathType :
< Type (as TypePath)? >

QualifiedPathInType :
QualifiedPathType (:: TypePathSegment)+

Fully qualified paths allow for disambiguating the path for trait implementations and for specifying canonical paths. When used in a type specification, it supports using the type syntax specified below.


# #![allow(unused_variables)]
#fn main() {
struct S;
impl S {
fn f() { println!("S"); }
}
trait T1 {
fn f() { println!("T1 f"); }
}
impl T1 for S {}
trait T2 {
fn f() { println!("T2 f"); }
}
impl T2 for S {}
S::f();  // Calls the inherent impl.
<S as T1>::f();  // Calls the T1 trait function.
<S as T2>::f();  // Calls the T2 trait function.
#}
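A qualified path can also appear in type position (the QualifiedPathInType production above). As a minimal sketch, the alias below names an associated type through a specific trait implementation; the name Elem is chosen for this example and is not from the text above:

```rust
// Qualified path in type position: name IntoIterator's associated
// Item type for Vec<u8> explicitly. It resolves to u8.
type Elem = <Vec<u8> as IntoIterator>::Item;

fn main() {
    let x: Elem = 7;
    assert_eq!(x, 7u8);
    println!("ok");
}
```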

### Paths in types

Syntax
TypePath :
::? TypePathSegment (:: TypePathSegment)\*

TypePathSegment :
PathIdentSegment ::? (GenericArgs | TypePathFn)?

TypePathFn :
( TypePathFnInputs? ) (-> Type)?

TypePathFnInputs :
Type (, Type)\* ,?

Type paths are used within type definitions, trait bounds, type parameter bounds, and qualified paths.

Although the :: token is allowed before the generic arguments, it is not required because there is no ambiguity like there is in PathInExpression.

impl ops::Index<ops::Range<usize>> for S { /*...*/ }
fn i() -> impl Iterator<Item = ops::Example<'a>> { /*...*/ }
type G = std::boxed::Box<std::ops::FnOnce(isize) -> isize>;
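The fragments above are illustrative rather than complete programs. A minimal self-contained sketch of the parenthesized Fn-style sugar (the TypePathFn production) might look like this; the names apply and add_one are chosen for the example:

```rust
// `Fn(isize) -> isize` is the TypePathFn form of a path to the Fn
// trait, with the parenthesized inputs and `->` output type.
fn apply(f: impl Fn(isize) -> isize, x: isize) -> isize {
    f(x)
}

fn main() {
    let add_one = |x| x + 1;
    assert_eq!(apply(add_one, 41), 42);
    println!("ok");
}
```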


## Path qualifiers

Paths can be denoted with various leading qualifiers that change how they are resolved.

### ::

Paths starting with :: are considered to be global paths where the segments of the path start being resolved from the crate root. Each identifier in the path must resolve to an item.

Edition Differences: In the 2015 Edition, the crate root contains a variety of different items, including external crates, default crates such as std and core, and items in the top level of the crate (including use imports).

Beginning with the 2018 Edition, paths starting with :: can only reference crates.

mod a {
pub fn foo() {}
}
mod b {
pub fn foo() {
::a::foo(); // call a's foo function
}
}
# fn main() {}


### self

self resolves the path relative to the current module. self can only be used as the first segment, without a preceding ::.

fn foo() {}
fn bar() {
self::foo();
}
# fn main() {}


### Self

Self, with a capital "S", is used to refer to the implementing type within traits and implementations.

Self can only be used as the first segment, without a preceding ::.


# #![allow(unused_variables)]
#fn main() {
trait T {
type Item;
const C: i32;
// Self will be whatever type that implements T.
fn new() -> Self;
// Self::Item will be the type alias in the implementation.
fn f(&self) -> Self::Item;
}
struct S;
impl T for S {
type Item = i32;
const C: i32 = 9;
fn new() -> Self {           // Self is the type S.
S
}
fn f(&self) -> Self::Item {  // Self::Item is the type i32.
Self::C                  // Self::C is the constant value 9.
}
}
#}

### super

super in a path resolves to the parent module. It may only be used in leading segments of the path, possibly after an initial self segment.

mod a {
pub fn foo() {}
}
mod b {
pub fn foo() {
super::a::foo(); // call a's foo function
}
}
# fn main() {}


super may be repeated several times after the first super or self to refer to ancestor modules.

mod a {
fn foo() {}

mod b {
mod c {
fn foo() {
super::super::foo(); // call a's foo function
self::super::super::foo(); // call a's foo function
}
}
}
}
# fn main() {}


### crate

crate resolves the path relative to the current crate. crate can only be used as the first segment, without a preceding ::.

fn foo() {}
mod a {
fn bar() {
crate::foo();
}
}
# fn main() {}


### $crate

$crate is only used within macro transcribers, and can only be used as the first segment, without a preceding ::. $crate will expand to a path to access items from the top level of the crate where the macro is defined, regardless of which crate the macro is invoked in.

pub fn increment(x: u32) -> u32 {
x + 1
}

#[macro_export]
macro_rules! inc {
($x:expr) => ( $crate::increment($x) )
}
# fn main() { }


## Canonical paths

Items defined in a module or implementation have a canonical path that corresponds to where within its crate it is defined. All other paths to these items are aliases. The canonical path is defined as a path prefix appended by the path segment the item itself defines.

Implementations and use declarations do not have canonical paths, although the items that implementations define do have them. Items defined in block expressions do not have canonical paths. Items defined in a module that does not have a canonical path do not have a canonical path. Associated items defined in an implementation that refers to an item without a canonical path, e.g. as the implementing type, the trait being implemented, a type parameter or bound on a type parameter, do not have canonical paths.

The path prefix for modules is the canonical path to that module. For bare implementations, it is the canonical path of the item being implemented surrounded by angle (<>) brackets. For trait implementations, it is the canonical path of the item being implemented followed by as followed by the canonical path to the trait all surrounded in angle (<>) brackets.

The canonical path is only meaningful within a given crate. There is no global namespace across crates; an item's canonical path merely identifies it within the crate.

// Comments show the canonical path of the item.

mod a { // ::a
pub struct Struct; // ::a::Struct

pub trait Trait { // ::a::Trait
fn f(&self); // ::a::Trait::f
}

impl Trait for Struct {
fn f(&self) {} // <::a::Struct as ::a::Trait>::f
}

impl Struct {
fn g(&self) {} // <::a::Struct>::g
}
}

mod without { // ::without
fn canonicals() { // ::without::canonicals
struct OtherStruct; // None

trait OtherTrait { // None
fn g(&self); // None
}

impl OtherTrait for OtherStruct {
fn g(&self) {} // None
}

impl OtherTrait for ::a::Struct {
fn g(&self) {} // None
}

impl ::a::Trait for OtherStruct {
fn f(&self) {} // None
}
}
}

# fn main() {}


# Macros

The functionality and syntax of Rust can be extended with custom definitions called macros. They are given names, and invoked through a consistent syntax: some_extension!(...).

There are two ways to define new macros:

• Macros by Example define new syntax in a higher-level, declarative way.
• Procedural Macros define new syntax with functions that operate on input tokens.

## Macro Invocation

Syntax
MacroInvocation :
SimplePath ! DelimTokenTree

DelimTokenTree :
( TokenTree\* )
| [ TokenTree\* ]
| { TokenTree\* }

TokenTree :
Token except delimiters | DelimTokenTree

MacroInvocationSemi :
SimplePath ! ( TokenTree\* ) ;
| SimplePath ! [ TokenTree\* ] ;
| SimplePath ! { TokenTree\* }

A macro invocation executes a macro at compile time and replaces the invocation with the result of the macro. Macros may be invoked in the following situations, as shown in the example below: expressions and statements, patterns, types, items including associated items, and macro_rules transcribers.

When used as an item or a statement, the MacroInvocationSemi form is used where a semicolon is required at the end when not using curly braces. Visibility qualifiers are never allowed before a macro invocation or macro_rules definition.


# #![allow(unused_variables)]
#fn main() {
// Used as an expression.
let x = vec![1,2,3];

// Used as a statement.
println!("Hello!");

// Used in a pattern.
macro_rules! pat {
($i:ident) => (Some($i))
}

if let pat!(x) = Some(1) {
assert_eq!(x, 1);
}

// Used in a type.
macro_rules! Tuple {
{ $A:ty,$B:ty } => { ($A,$B) };
}

type N2 = Tuple!(i32, i32);

// Used as an item.
# use std::cell::RefCell;

// Used as an associated item.
macro_rules! const_maker {
($t:ty,$v:tt) => { const CONST: $t =$v; };
}
trait T {
const_maker!{i32, 7}
}

// Macro calls within macros.
macro_rules! example {
() => { println!("Macro call in a macro!") };
}
// Outer macro example is expanded, then inner macro println is expanded.
example!();
#}

# Macros By Example

Syntax
MacroRulesDefinition :
macro_rules ! IDENTIFIER MacroRulesDef

MacroRulesDef :
( MacroRules ) ;
| [ MacroRules ] ;
| { MacroRules }

MacroRules :
MacroRule ( ; MacroRule )\* ;?

MacroRule :
MacroMatcher => MacroTranscriber

MacroMatcher :
( MacroMatch\* )
| [ MacroMatch\* ]
| { MacroMatch\* }

MacroMatch :
Token except $ and delimiters
| MacroMatcher
| $ IDENTIFIER : MacroFragSpec
| $ ( MacroMatch+ ) MacroRepSep? MacroRepOp

MacroFragSpec :
block | expr | ident | item | lifetime | literal | meta | pat | path | stmt | tt | ty | vis

MacroRepSep :
Token except delimiters and repetition operators

MacroRepOp :
* | + | ?

MacroTranscriber :
DelimTokenTree

macro_rules allows users to define syntax extensions in a declarative way. We call such extensions "macros by example" or simply "macros".

Each macro by example has a name, and one or more rules. Each rule has two parts: a matcher, describing the syntax that it matches, and a transcriber, describing the syntax that will replace a successfully matched invocation. Both the matcher and the transcriber must be surrounded by delimiters. Macros can expand to expressions, statements, items (including traits, impls, and foreign items), types, or patterns.

## Transcribing

When a macro is invoked, the macro expander looks up macro invocations by name, and tries each macro rule in turn. It transcribes the first successful match; if this results in an error, then future matches are not tried. When matching, no lookahead is performed; if the compiler cannot unambiguously determine how to parse the macro invocation one token at a time, then it is an error. In the following example, the compiler does not look ahead past the identifier to see if the following token is a ), even though that would allow it to parse the invocation unambiguously:

# #![allow(unused_variables)]
#fn main() {
macro_rules! ambiguity {
($( $i:ident )* $j:ident) => { };
}

ambiguity!(error); // Error: local ambiguity
#}

In both the matcher and the transcriber, the $ token is used to invoke special behaviours from the macro engine (described below in Metavariables and Repetitions). Tokens that aren't part of such an invocation are matched and transcribed literally, with one exception. The exception is that the outer delimiters for the matcher will match any pair of delimiters. Thus, for instance, the matcher (()) will match {()} but not {{}}. The character $ cannot be matched or transcribed literally.
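As a minimal sketch of the delimiter rule (the macro name m is chosen for this example), the outer delimiters of the invocation may differ from those written in the matcher, while the inner delimiters are matched literally:

```rust
// The matcher's outer delimiters match any pair of delimiters in the
// invocation; the inner `()` must appear literally.
macro_rules! m {
    (()) => { "matched" };
}

fn main() {
    assert_eq!(m!(()), "matched");  // outer parentheses
    assert_eq!(m![()], "matched");  // outer brackets also accepted
    assert_eq!(m! {()}, "matched"); // outer braces also accepted
    // m!([]) would be an error: the inner delimiters are literal.
    println!("ok");
}
```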

When forwarding a matched fragment to another macro-by-example, matchers in the second macro will see an opaque AST of the fragment type. The second macro can't use literal tokens to match the fragments in the matcher, only a fragment specifier of the same type. The ident, lifetime, and tt fragment types are an exception, and can be matched by literal tokens. The following illustrates this restriction:


# #![allow(unused_variables)]
#fn main() {
macro_rules! foo {
($l:expr) => { bar!($l); }
// ERROR:               ^^ no rules expected this token in macro call
}

macro_rules! bar {
(3) => {}
}

foo!(3);
#}

The following illustrates how tokens can be directly matched after matching a tt fragment:


# #![allow(unused_variables)]
#fn main() {
// compiles OK
macro_rules! foo {
($l:tt) => { bar!($l); }
}

macro_rules! bar {
(3) => {}
}

foo!(3);
#}

## Metavariables

In the matcher, $ name : fragment-specifier matches a Rust syntax fragment of the kind specified and binds it to the metavariable $name. Valid fragment specifiers are:

• item: an Item
• block: a BlockExpression
• stmt: a Statement without the trailing semicolon (except for item statements that require semicolons)
• pat: a Pattern
• expr: an Expression
• ty: a Type
• ident: an IDENTIFIER_OR_KEYWORD
• path: a TypePath style path
• tt: a TokenTree (a single token or tokens in matching delimiters (), [], or {})
• meta: a MetaItem, the contents of an attribute
• lifetime: a LIFETIME_TOKEN
• vis: a possibly empty Visibility qualifier
• literal: matches -?LiteralExpression

In the transcriber, metavariables are referred to simply by $name, since the fragment kind is specified in the matcher. Metavariables are replaced with the syntax element that matched them. The keyword metavariable $crate can be used to refer to the current crate; see Hygiene below. Metavariables can be transcribed more than once or not at all.
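A small sketch of multiple transcription (the macro name twice is chosen for this example): the metavariable is bound once in the matcher but transcribed twice, and the captured expression is evaluated at each transcription site:

```rust
// `$e` is matched once but transcribed twice; `twice!(2 + 3)` expands
// to the tuple expression `(2 + 3, 2 + 3)`.
macro_rules! twice {
    ($e:expr) => { ($e, $e) };
}

fn main() {
    assert_eq!(twice!(2 + 3), (5, 5));
    println!("ok");
}
```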

## Repetitions

In both the matcher and transcriber, repetitions are indicated by placing the tokens to be repeated inside $(), followed by a repetition operator, optionally with a separator token between. The separator token can be any token other than a delimiter or one of the repetition operators, but ; and , are the most common. For instance, $( $i:ident ),* represents any number of identifiers separated by commas. Nested repetitions are permitted.

The repetition operators are:

• * — indicates any number of repetitions.
• + — indicates any number but at least one.
• ? — indicates an optional fragment with zero or one occurrences.

Since ? represents at most one occurrence, it cannot be used with a separator.

The repeated fragment both matches and transcribes to the specified number of the fragment, separated by the separator token. Metavariables are matched to every repetition of their corresponding fragment. For instance, the $( $i:ident ),* example above matches $i to all of the identifiers in the list.
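A short sketch of a matcher and transcriber repetition working together (the macro name myvec is chosen for this example, modeled on the standard vec! pattern):

```rust
// `$( $x:expr ),*` matches zero or more comma-separated expressions;
// the transcriber repeats `v.push($x);` once per matched fragment.
macro_rules! myvec {
    ( $( $x:expr ),* ) => {{
        let mut v = Vec::new();
        $( v.push($x); )*
        v
    }};
}

fn main() {
    assert_eq!(myvec![1, 2, 3], vec![1, 2, 3]);
    let empty: Vec<i32> = myvec![];
    assert!(empty.is_empty());
    println!("ok");
}
```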

During transcription, additional restrictions apply to repetitions so that the compiler knows how to expand them properly:

1. A metavariable must appear in exactly the same number, kind, and nesting order of repetitions in the transcriber as it did in the matcher. So for the matcher $( $i:ident ),*, the transcribers => { $i }, => { $( $( $i )* )* }, and => { $( $i )+ } are all illegal, but => { $( $i );* } is correct and replaces a comma-separated list of identifiers with a semicolon-separated list.
2. Each repetition in the transcriber must contain at least one metavariable to decide how many times to expand it. If multiple metavariables appear in the same repetition, they must be bound to the same number of fragments. For instance, ( $( $i:ident ),* ; $( $j:ident ),* ) => ( $( ($i, $j) ),* ) must bind the same number of $i fragments as $j fragments. This means that invoking the macro with (a, b, c; d, e, f) is legal and expands to ((a,d), (b,e), (c,f)), but (a, b, c; d, e) is illegal because it does not have the same number. This requirement applies to every layer of nested repetitions.

## Scoping, Exporting, and Importing

For historical reasons, the scoping of macros by example does not work entirely like items. Macros have two forms of scope: textual scope, and path-based scope. Textual scope is based on the order that things appear in source files, or even across multiple files, and is the default scoping. It is explained further below. Path-based scope works exactly the same way that item scoping does. The scoping, exporting, and importing of macros is controlled largely by attributes.

When a macro is invoked by an unqualified identifier (not part of a multi-part path), it is first looked up in textual scoping. If this does not yield any results, then it is looked up in path-based scoping. If the macro's name is qualified with a path, then it is only looked up in path-based scoping.

use lazy_static::lazy_static; // Path-based import.

macro_rules! lazy_static { // Textual definition.
(lazy) => {};
}

lazy_static!{lazy} // Textual lookup finds our macro first.
self::lazy_static!{} // Path-based lookup ignores our macro, finds imported one.

### Textual Scope

Textual scope is based largely on the order that things appear in source files, and works similarly to the scope of local variables declared with let except it also applies at the module level. When macro_rules!
is used to define a macro, the macro enters the scope after the definition (note that it can still be used recursively, since names are looked up from the invocation site), up until its surrounding scope, typically a module, is closed. This can enter child modules and even span across multiple files:

//// src/lib.rs
mod has_macro {
// m!{} // Error: m is not in scope.

macro_rules! m {
() => {};
}
m!{} // OK: appears after declaration of m.

mod uses_macro;
}
// m!{} // Error: m is not in scope.

//// src/has_macro/uses_macro.rs

m!{} // OK: appears after declaration of m in src/lib.rs

It is not an error to define a macro multiple times; the most recent declaration will shadow the previous one unless it has gone out of scope.

# #![allow(unused_variables)]
#fn main() {
macro_rules! m {
(1) => {};
}

m!(1);

mod inner {
m!(1);

macro_rules! m {
(2) => {};
}
// m!(1); // Error: no rule matches '1'
m!(2);

macro_rules! m {
(3) => {};
}
m!(3);
}

m!(1);
#}

Macros can be declared and used locally inside functions as well, and work similarly:

# #![allow(unused_variables)]
#fn main() {
fn foo() {
// m!(); // Error: m is not in scope.
macro_rules! m {
() => {};
}
m!();
}
// m!(); // Error: m is not in scope.
#}

### The macro_use attribute

The macro_use attribute has two purposes. First, it can be used to make a module's macro scope not end when the module is closed, by applying it to a module:

# #![allow(unused_variables)]
#fn main() {
#[macro_use]
mod inner {
macro_rules! m {
() => {};
}
}

m!();
#}

Second, it can be used to import macros from another crate, by attaching it to an extern crate declaration appearing in the crate's root module. Macros imported this way are imported into the prelude of the crate, not textually, which means that they can be shadowed by any other name. While macros imported by #[macro_use] can be used before the import statement, in case of a conflict, the last macro imported wins.
Optionally, a list of macros to import can be specified using the MetaListIdents syntax; this is not supported when #[macro_use] is applied to a module.

#[macro_use(lazy_static)] // Or #[macro_use] to import all macros.
extern crate lazy_static;

lazy_static!{}
// self::lazy_static!{} // Error: lazy_static is not defined in self

Macros to be imported with #[macro_use] must be exported with #[macro_export], which is described below.

### Path-Based Scope

By default, a macro has no path-based scope. However, if it has the #[macro_export] attribute, then it is declared in the crate root scope and can be referred to normally as such:

# #![allow(unused_variables)]
#fn main() {
self::m!();
m!(); // OK: Path-based lookup finds m in the current module.

mod inner {
super::m!();
crate::m!();
}

mod mac {
#[macro_export]
macro_rules! m {
() => {};
}
}
#}

Macros labeled with #[macro_export] are always pub and can be referred to by other crates, either by path or by #[macro_use] as described above.

## Hygiene

By default, all identifiers referred to in a macro are expanded as-is, and are looked up at the macro's invocation site. This can lead to issues if a macro refers to an item or macro which isn't in scope at the invocation site. To alleviate this, the $crate metavariable can be used at the start of a path to force lookup to occur inside the crate defining the macro.

//// Definitions in the helper_macro crate.
#[macro_export]
macro_rules! helped {
// () => { helper!() } // This might lead to an error due to 'helper' not being in scope.
() => { $crate::helper!() }
}

#[macro_export]
macro_rules! helper {
() => { () }
}

//// Usage in another crate.
// Note that helper_macro::helper is not imported!
use helper_macro::helped;

fn unit() {
helped!();
}

Note that, because $crate refers to the current crate, it must be used with a fully qualified module path when referring to non-macro items:


# #![allow(unused_variables)]
#fn main() {
pub mod inner {
#[macro_export]
macro_rules! call_foo {
() => { $crate::inner::foo() };
}

pub fn foo() {}
}
#}

Additionally, even though $crate allows a macro to refer to items within its own crate when expanding, its use has no effect on visibility. An item or macro referred to must still be visible from the invocation site. In the following example, any attempt to invoke call_foo!() from outside its crate will fail because foo() is not public.


# #![allow(unused_variables)]
#fn main() {
#[macro_export]
macro_rules! call_foo {
() => { $crate::foo() };
}

fn foo() {}
#}

Version & Edition Differences: Prior to Rust 1.30, $crate and local_inner_macros (below) were unsupported. They were added alongside path-based imports of macros (described above), to ensure that helper macros did not need to be manually imported by users of a macro-exporting crate. Crates written for earlier versions of Rust that use helper macros need to be modified to use $crate or local_inner_macros to work well with path-based imports.

When a macro is exported, the #[macro_export] attribute can have the local_inner_macros keyword added to automatically prefix all contained macro invocations with $crate::. This is intended primarily as a tool to migrate code written before $crate was added to the language to work with Rust 2018's path-based imports of macros. Its use is discouraged in new code.

# #![allow(unused_variables)]
#fn main() {
#[macro_export(local_inner_macros)]
macro_rules! helped {
() => { helper!() } // Automatically converted to $crate::helper!().
}

#[macro_export]
macro_rules! helper {
() => { () }
}
#}

## Follow-set Ambiguity Restrictions

The parser used by the macro system is reasonably powerful, but it is limited in order to prevent ambiguity in current or future versions of the language. In particular, in addition to the rule about ambiguous expansions, a nonterminal matched by a metavariable must be followed by a token which has been decided can be safely used after that kind of match.
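As a small sketch of a matcher the follow-set rules do accept (the macro name pair is chosen for this example): , is in the follow set of expr, so an expr fragment may be followed directly by a comma.

```rust
// `,` may follow an `expr` fragment, so this matcher is legal; a
// matcher like `$a:expr [ , ] $b:expr` would be rejected.
macro_rules! pair {
    ($a:expr, $b:expr) => { ($a, $b) };
}

fn main() {
    assert_eq!(pair!(1 + 1, 3), (2, 3));
    println!("ok");
}
```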

As an example, a macro matcher like $i:expr [ , ] could in theory be accepted in Rust today, since [,] cannot be part of a legal expression and therefore the parse would always be unambiguous. However, because [ can start trailing expressions, [ is not a character which can safely be ruled out as coming after an expression. If [,] were accepted in a later version of Rust, this matcher would become ambiguous or would misparse, breaking working code. Matchers like $i:expr, or $i:expr; would be legal, however, because , and ; are legal expression separators.

The specific rules are:

• expr and stmt may only be followed by one of: =>, ,, or ;.
• pat may only be followed by one of: =>, ,, =, |, if, or in.
• path and ty may only be followed by one of: =>, ,, =, |, ;, :, >, >>, [, {, as, where, or a macro variable of block fragment specifier.
• vis may only be followed by one of: ,, an identifier other than a non-raw priv, any token that can begin a type, or a metavariable with an ident, ty, or path fragment specifier.
• All other fragment specifiers have no restrictions.

When repetitions are involved, then the rules apply to every possible number of expansions, taking separators into account. This means:

• If the repetition includes a separator, that separator must be able to follow the contents of the repetition.
• If the repetition can repeat multiple times (* or +), then the contents must be able to follow themselves.
• The contents of the repetition must be able to follow whatever comes before, and whatever comes after must be able to follow the contents of the repetition.
• If the repetition can match zero times (* or ?), then whatever comes after must be able to follow whatever comes before.

For more detail, see the formal specification.

## Procedural Macros

Procedural macros allow creating syntax extensions as execution of a function.
Procedural macros come in one of three flavors:

• Function-like macros - custom!(...)
• Derive macros - #[derive(CustomDerive)]
• Attribute macros - #[CustomAttribute]

Procedural macros allow you to run code at compile time that operates over Rust syntax, both consuming and producing Rust syntax. You can sort of think of procedural macros as functions from an AST to another AST.

Procedural macros must be defined in a crate with the crate type of proc-macro.

Note: When using Cargo, procedural macro crates are defined with the proc-macro key in your manifest:

[lib]
proc-macro = true

As functions, they must either return syntax, panic, or loop endlessly. Returned syntax either replaces or adds the syntax depending on the kind of procedural macro. Panics are caught by the compiler and are turned into a compiler error. Endless loops are not caught by the compiler and will hang the compiler.

Procedural macros run during compilation, and thus have the same resources that the compiler has. For example, standard input, error, and output are the same that the compiler has access to. Similarly, file access is the same. Because of this, procedural macros have the same security concerns that Cargo's build scripts have.

Procedural macros have two ways of reporting errors. The first is to panic. The second is to emit a compile_error macro invocation.

### The proc_macro crate

Procedural macro crates almost always will link to the compiler-provided proc_macro crate. The proc_macro crate provides types required for writing procedural macros and facilities to make it easier.

This crate primarily contains a TokenStream type. Procedural macros operate over token streams instead of AST nodes, which is a far more stable interface over time for both the compiler and for procedural macros to target. A token stream is roughly equivalent to Vec<TokenTree> where a TokenTree can roughly be thought of as a lexical token. For example, foo is an Ident token, . is a Punct token, and 1.2 is a Literal token. The TokenStream type, unlike Vec<TokenTree>, is cheap to clone.

All tokens have an associated Span.
A Span is an opaque value that cannot be modified but can be manufactured. Spans represent an extent of source code within a program and are primarily used for error reporting. While you cannot modify a Span itself, you can change which Span is associated with any token.

### Procedural macro hygiene

Procedural macros are unhygienic. This means they behave as if the output token stream was simply written inline to the code it's next to. This means that it's affected by external items and also affects external imports.

Macro authors need to be careful to ensure their macros work in as many contexts as possible given this limitation. This often includes using absolute paths to items in libraries (for example, ::std::option::Option instead of Option) or by ensuring that generated functions have names that are unlikely to clash with other functions (like __internal_foo instead of foo).

### Function-like procedural macros

Function-like procedural macros are procedural macros that are invoked using the macro invocation operator (!).

These macros are defined by a public function with the proc_macro attribute and a signature of (TokenStream) -> TokenStream. The input TokenStream is what is inside the delimiters of the macro invocation and the output TokenStream replaces the entire macro invocation. It may contain an arbitrary number of items. These macros cannot expand to syntax that defines new macro_rules style macros.

For example, the following macro definition ignores its input and outputs a function answer into its scope.

extern crate proc_macro;
use proc_macro::TokenStream;

#[proc_macro]
pub fn make_answer(_item: TokenStream) -> TokenStream {
"fn answer() -> u32 { 42 }".parse().unwrap()
}

And then we use it in a binary crate to print "42" to standard output.

extern crate proc_macro_examples;
use proc_macro_examples::make_answer;

make_answer!();

fn main() {
println!("{}", answer());
}

These macros are only invokable in modules. They cannot even be invoked to create item declaration statements.
Furthermore, they must either be invoked with curly braces and no semicolon or a different delimiter followed by a semicolon. For example, make_answer from the previous example can be invoked as make_answer!{}, make_answer!(); or make_answer![];.

### Derive macros

Derive macros define new inputs for the derive attribute. These macros can create new items given the token stream of a struct, enum, or union. They can also define derive macro helper attributes.

Custom derive macros are defined by a public function with the proc_macro_derive attribute and a signature of (TokenStream) -> TokenStream. The input TokenStream is the token stream of the item that has the derive attribute on it. The output TokenStream must be a set of items that are then appended to the module or block that the item from the input TokenStream is in.

The following is an example of a derive macro. Instead of doing anything useful with its input, it just appends a function answer.

extern crate proc_macro;
use proc_macro::TokenStream;

#[proc_macro_derive(AnswerFn)]
pub fn derive_answer_fn(_item: TokenStream) -> TokenStream {
"fn answer() -> u32 { 42 }".parse().unwrap()
}

And then using said derive macro:

extern crate proc_macro_examples;
use proc_macro_examples::AnswerFn;

#[derive(AnswerFn)]
struct Struct;

fn main() {
assert_eq!(42, answer());
}

#### Derive macro helper attributes

Derive macros can add additional attributes into the scope of the item they are on. Said attributes are called derive macro helper attributes. These attributes are inert, and their only purpose is to be fed into the derive macro that defined them. That said, they can be seen by all macros.

The way to define helper attributes is to put an attributes key in the proc_macro_derive macro with a comma separated list of identifiers that are the names of the helper attributes.

For example, the following derive macro defines a helper attribute helper, but ultimately doesn't do anything with it.
# #![crate_type="proc-macro"]
# extern crate proc_macro;
# use proc_macro::TokenStream;

#[proc_macro_derive(HelperAttr, attributes(helper))]
pub fn derive_helper_attr(_item: TokenStream) -> TokenStream {
TokenStream::new()
}

And then usage of the derive macro on a struct:

# extern crate proc_macro_examples;
# use proc_macro_examples::HelperAttr;

#[derive(HelperAttr)]
struct Struct {
#[helper] field: ()
}

### Attribute macros

Attribute macros define new attributes which can be attached to items.

Attribute macros are defined by a public function with the proc_macro_attribute attribute that has a signature of (TokenStream, TokenStream) -> TokenStream. The first TokenStream is the delimited token tree following the attribute's name, not including the outer delimiters. If the attribute is written as a bare attribute name, the attribute TokenStream is empty. The second TokenStream is the rest of the item including other attributes on the item. The returned TokenStream replaces the item with an arbitrary number of items. These macros cannot expand to syntax that defines new macro_rules style macros.

For example, this attribute macro takes the input stream and returns it as is, effectively being the no-op of attributes.

# #![crate_type = "proc-macro"]
# extern crate proc_macro;
# use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn return_as_is(_attr: TokenStream, item: TokenStream) -> TokenStream {
item
}

The following example shows the stringified TokenStreams that the attribute macros see. The output will show in the output of the compiler. The output is shown in the comments after the function prefixed with "out:".
// my-macro/src/lib.rs
# extern crate proc_macro;
# use proc_macro::TokenStream;

#[proc_macro_attribute]
pub fn show_streams(attr: TokenStream, item: TokenStream) -> TokenStream {
println!("attr: \"{}\"", attr.to_string());
println!("item: \"{}\"", item.to_string());
item
}

// src/lib.rs
extern crate my_macro;
use my_macro::show_streams;

// Example: Basic function
#[show_streams]
fn invoke1() {}
// out: attr: ""
// out: item: "fn invoke1() { }"

// Example: Attribute with input
#[show_streams(bar)]
fn invoke2() {}
// out: attr: "bar"
// out: item: "fn invoke2() {}"

// Example: Multiple tokens in the input
#[show_streams(multiple => tokens)]
fn invoke3() {}
// out: attr: "multiple => tokens"
// out: item: "fn invoke3() {}"

// Example:
#[show_streams { delimiters }]
fn invoke4() {}
// out: attr: "delimiters"
// out: item: "fn invoke4() {}"

# Crates and source files

Syntax
Crate :
UTF8BOM?
SHEBANG?
InnerAttribute\*
Item\*

Lexer
UTF8BOM : \uFEFF
SHEBANG : #! ~[[ \n] ~\n\*

Note: Although Rust, like any other language, can be implemented by an interpreter as well as a compiler, the only existing implementation is a compiler, and the language has always been designed to be compiled. For these reasons, this section assumes a compiler.

Rust's semantics obey a phase distinction between compile-time and run-time.1 Semantic rules that have a static interpretation govern the success or failure of compilation, while semantic rules that have a dynamic interpretation govern the behavior of the program at run-time.

The compilation model centers on artifacts called crates. Each compilation processes a single crate in source form, and if successful, produces a single crate in binary form: either an executable or some sort of library.2

A crate is a unit of compilation and linking, as well as versioning, distribution and runtime loading. A crate contains a tree of nested module scopes.
The top level of this tree is a module that is anonymous (from the point of view of paths within the module) and any item within a crate has a canonical module path denoting its location within the crate's module tree.

The Rust compiler is always invoked with a single source file as input, and always produces a single output crate. The processing of that source file may result in other source files being loaded as modules. Source files have the extension .rs.

A Rust source file describes a module, the name and location of which — in the module tree of the current crate — are defined from outside the source file: either by an explicit Module item in a referencing source file, or by the name of the crate itself. Every source file is a module, but not every module needs its own source file: module definitions can be nested within one file.

Each source file contains a sequence of zero or more Item definitions, and may optionally begin with any number of attributes that apply to the containing module, most of which influence the behavior of the compiler. The anonymous crate module can have additional attributes that apply to the crate as a whole.

```rust
# #![allow(unused_variables)]
#fn main() {
// Specify the crate name.
#![crate_name = "projx"]
// Specify the type of output artifact.
#![crate_type = "lib"]
// Turn on a warning.
// This can be done in any module, not just the anonymous crate module.
#![warn(non_camel_case_types)]
#}
```

The optional UTF8 byte order mark (UTF8BOM production) indicates that the file is encoded in UTF8. It can only occur at the beginning of the file and is ignored by the compiler.

A source file can have a shebang (SHEBANG production), which indicates to the operating system what program to use to execute this file. It serves essentially to treat the source file as an executable script. The shebang can only occur at the beginning of the file (but after the optional UTF8BOM). It is ignored by the compiler.
For example:

```rust
#!/usr/bin/env rustx

fn main() {
    println!("Hello!");
}
```

## Preludes and no_std

All crates have a prelude that automatically inserts names from a specific module, the prelude module, into the scope of each module and an extern crate into the crate root module. By default, the standard prelude is used. The linked crate is std and the prelude module is std::prelude::v1.

The prelude can be changed to the core prelude by using the no_std attribute on the root crate module. The linked crate is core and the prelude module is core::prelude::v1. Using the core prelude over the standard prelude is useful when either the crate is targeting a platform that does not support the standard library or is purposefully not using the capabilities of the standard library. Those capabilities are mainly dynamic memory allocation (e.g. Box and Vec) and file and network capabilities (e.g. std::fs and std::io).

Warning: Using no_std does not prevent the standard library from being linked in. It is still valid to put extern crate std; into the crate and dependencies can also link it in.

## Main Functions

A crate that contains a main function can be compiled to an executable. If a main function is present, it must take no arguments, must not declare any trait or lifetime bounds, must not have any where clauses, and its return type must be one of the following:

• ()
• Result<(), E> where E: Error

Note: The implementation of which return types are allowed is determined by the unstable Termination trait.

### The no_main attribute

The no_main attribute may be applied at the crate level to disable emitting the main symbol for an executable binary. This is useful when some other object being linked to defines main.

## The crate_name attribute

The crate_name attribute may be applied at the crate level to specify the name of the crate with the MetaNameValueStr syntax.
```rust
#![crate_name = "mycrate"]
```

The crate name must not be empty, and must only contain Unicode alphanumeric or _ (U+005F) characters.

1 This distinction would also exist in an interpreter. Static checks like syntactic analysis, type checking, and lints should happen before the program is executed regardless of when it is executed.

2 A crate is somewhat analogous to an assembly in the ECMA-335 CLI model, a library in the SML/NJ Compilation Manager, a unit in the Owens and Flatt module system, or a configuration in Mesa.

# Conditional compilation

Syntax
ConfigurationPredicate :
ConfigurationOption
| ConfigurationAll
| ConfigurationAny
| ConfigurationNot

ConfigurationOption :
IDENTIFIER (= (STRING_LITERAL | RAW_STRING_LITERAL))?

ConfigurationAll :
all ( ConfigurationPredicateList? )

ConfigurationAny :
any ( ConfigurationPredicateList? )

ConfigurationNot :
not ( ConfigurationPredicate )

ConfigurationPredicateList :
ConfigurationPredicate (, ConfigurationPredicate)\* ,?

Conditionally compiled source code is source code that may or may not be considered a part of the source code depending on certain conditions. Source code can be conditionally compiled using the attributes cfg and cfg_attr and the built-in cfg macro. These conditions are based on the target architecture of the compiled crate, arbitrary values passed to the compiler, and a few other miscellaneous things further described below in detail.

Each form of conditional compilation takes a configuration predicate that evaluates to true or false. The predicate is one of the following:

• A configuration option. It is true if the option is set and false if it is unset.
• all() with a comma separated list of configuration predicates. It is false if at least one predicate is false. If there are no predicates, it is true.
• any() with a comma separated list of configuration predicates. It is true if at least one predicate is true. If there are no predicates, it is false.
• not() with a configuration predicate.
It is true if its predicate is false and false if its predicate is true.

Configuration options are names and key-value pairs that are either set or unset. Names are written as a single identifier such as, for example, unix. Key-value pairs are written as an identifier, =, and then a string. For example, target_arch = "x86_64" is a configuration option.

Note: Whitespace around the = is ignored. foo="bar" and foo = "bar" are equivalent configuration options.

Keys are not unique in the set of key-value configuration options. For example, both feature = "std" and feature = "serde" can be set at the same time.

## Set Configuration Options

Which configuration options are set is determined statically during the compilation of the crate. Certain options are compiler-set based on data about the compilation. Other options are arbitrarily-set based on input passed to the compiler outside of the code. It is not possible to set a configuration option from within the source code of the crate being compiled.

Note: For rustc, arbitrarily-set configuration options are set using the --cfg flag.

Warning: It is possible for arbitrarily-set configuration options to have the same value as compiler-set configuration options. For example, it is possible to do rustc --cfg "unix" program.rs while compiling to a Windows target, and have both unix and windows configuration options set at the same time. It is unwise to actually do this.

### target_arch

Key-value option set once with the target's CPU architecture. The value is similar to the first element of the platform's target triple, but not identical.

Example values:

• "x86"
• "x86_64"
• "mips"
• "powerpc"
• "powerpc64"
• "arm"
• "aarch64"

### target_feature

Key-value option set for each platform feature available for the current compilation target.

Example values:

• "avx"
• "avx2"
• "crt-static"
• "rdrand"
• "sse"
• "sse2"
• "sse4.1"

See the target_feature attribute for more details on the available features.
An additional feature of crt-static is available to the target_feature option to indicate that a static C runtime is available.

### target_os

Key-value option set once with the target's operating system. This value is similar to the second and third element of the platform's target triple.

Example values:

• "windows"
• "macos"
• "ios"
• "linux"
• "android"
• "freebsd"
• "dragonfly"
• "openbsd"
• "netbsd"

### target_family

Key-value option set at most once with the target's operating system family.

Example values:

• "unix"
• "windows"

### unix and windows

unix is set if target_family = "unix" is set and windows is set if target_family = "windows" is set.

### target_env

Key-value option set with further disambiguating information about the target platform with information about the ABI or libc used. For historical reasons, this value is only defined as not the empty-string when actually needed for disambiguation. Thus, for example, on many GNU platforms, this value will be empty. This value is similar to the fourth element of the platform's target triple. One difference is that embedded ABIs such as gnueabihf will simply define target_env as "gnu".

Example values:

• ""
• "gnu"
• "msvc"
• "musl"
• "sgx"

### target_endian

Key-value option set once with either a value of "little" or "big" depending on the endianness of the target's CPU.

### target_pointer_width

Key-value option set once with the target's pointer width in bits. For example, for targets with 32-bit pointers, this is set to "32". Likewise, it is set to "64" for targets with 64-bit pointers.

### target_vendor

Key-value option set once with the vendor of the target.

Example values:

• "apple"
• "fortanix"
• "pc"
• "unknown"

### test

Enabled when compiling the test harness. Done with rustc by using the --test flag. See Testing for more on testing support.

### debug_assertions

Enabled by default when compiling without optimizations.
This can be used to enable extra debugging code in development but not in production. For example, it controls the behavior of the standard library's debug_assert! macro.

### proc_macro

Set when the crate being compiled is being compiled with the proc_macro crate type.

## Forms of conditional compilation

### The cfg attribute

Syntax
CfgAttribute :
cfg ( ConfigurationPredicate )

The cfg attribute conditionally includes the thing it is attached to based on a configuration predicate. It is written as cfg, (, a configuration predicate, and finally ). If the predicate is true, the thing is rewritten to not have the cfg attribute on it. If the predicate is false, the thing is removed from the source code.

Some examples on functions:

```rust
# #![allow(unused_variables)]
#fn main() {
// The function is only included in the build when compiling for macOS
#[cfg(target_os = "macos")]
fn macos_only() {
    // ...
}

// This function is only included when either foo or bar is defined
#[cfg(any(foo, bar))]
fn needs_foo_or_bar() {
    // ...
}

// This function is only included when compiling for a unixish OS with a 32-bit
// architecture
#[cfg(all(unix, target_pointer_width = "32"))]
fn on_32bit_unix() {
    // ...
}

// This function is only included when foo is not defined
#[cfg(not(foo))]
fn needs_not_foo() {
    // ...
}
#}
```

The cfg attribute is allowed anywhere attributes are allowed.

### The cfg_attr attribute

Syntax
CfgAttrAttribute :
cfg_attr ( ConfigurationPredicate , CfgAttrs? )

CfgAttrs :
Attr (, Attr)\* ,?

The cfg_attr attribute conditionally includes attributes based on a configuration predicate. When the configuration predicate is true, this attribute expands out to the attributes listed after the predicate. For example, the following module will either be found at linux.rs or windows.rs based on the target.

```rust
#[cfg_attr(linux, path = "linux.rs")]
#[cfg_attr(windows, path = "windows.rs")]
mod os;
```

Zero, one, or more attributes may be listed.
Multiple attributes will each be expanded into separate attributes. For example:

```rust
#[cfg_attr(feature = "magic", sparkles, crackles)]
fn bewitched() {}

// When the magic feature flag is enabled, the above will expand to:
#[sparkles]
#[crackles]
fn bewitched() {}
```

Note: The cfg_attr can expand to another cfg_attr. For example, #[cfg_attr(linux, cfg_attr(feature = "multithreaded", some_other_attribute))] is valid. This example would be equivalent to #[cfg_attr(all(linux, feature = "multithreaded"), some_other_attribute)].

The cfg_attr attribute is allowed anywhere attributes are allowed.

### The cfg macro

The built-in cfg macro takes in a single configuration predicate and evaluates to the true literal when the predicate is true and the false literal when it is false. For example:

```rust
# #![allow(unused_variables)]
#fn main() {
let machine_kind = if cfg!(unix) {
    "unix"
} else if cfg!(windows) {
    "windows"
} else {
    "unknown"
};

println!("I'm running on a {} machine!", machine_kind);
#}
```

# Items

Syntax:
Item:
OuterAttribute\* VisItem
| MacroItem

MacroItem:
MacroInvocationSemi
| MacroRulesDefinition

An item is a component of a crate. Items are organized within a crate by a nested set of modules. Every crate has a single "outermost" anonymous module; all further items within the crate have paths within the module tree of the crate.

Items are entirely determined at compile-time, generally remain fixed during execution, and may reside in read-only memory.

There are several kinds of items:

Some items form an implicit scope for the declaration of sub-items. In other words, within a function or module, declarations of items can (in many cases) be mixed with the statements, control blocks, and similar artifacts that otherwise compose the item body.
The meaning of these scoped items is the same as if the item was declared outside the scope — it is still a static item — except that the item's path name within the module namespace is qualified by the name of the enclosing item, or is private to the enclosing item (in the case of functions). The grammar specifies the exact locations in which sub-item declarations may appear.

# Modules

Syntax:
Module :
mod IDENTIFIER ;
| mod IDENTIFIER { InnerAttribute\* Item\* }

A module is a container for zero or more items.

A module item is a module, surrounded in braces, named, and prefixed with the keyword mod. A module item introduces a new, named module into the tree of modules making up a crate. Modules can nest arbitrarily.

An example of a module:

```rust
# #![allow(unused_variables)]
#fn main() {
mod math {
    type Complex = (f64, f64);
    fn sin(f: f64) -> f64 {
        /* ... */
#       unimplemented!();
    }
    fn cos(f: f64) -> f64 {
        /* ... */
#       unimplemented!();
    }
    fn tan(f: f64) -> f64 {
        /* ... */
#       unimplemented!();
    }
}
#}
```

Modules and types share the same namespace. Declaring a named type with the same name as a module in scope is forbidden: that is, a type definition, trait, struct, enumeration, union, type parameter or crate can't shadow the name of a module in scope, or vice versa. Items brought into scope with use also have this restriction.

## Module Source Filenames

A module without a body is loaded from an external file. When the module does not have a path attribute, the path to the file mirrors the logical module path. Ancestor module path components are directories, and the module's contents are in a file with the name of the module plus the .rs extension.
For example, the following module structure can have this corresponding filesystem structure:

Module Path | Filesystem Path | File Contents
----------- | --------------- | -------------
crate | lib.rs | mod util;
crate::util | util.rs | mod config;
crate::util::config | util/config.rs |

Module filenames may also be the name of the module as a directory with the contents in a file named mod.rs within that directory. The above example can alternately be expressed with crate::util's contents in a file named util/mod.rs. It is not allowed to have both util.rs and util/mod.rs.

Note: Previous to rustc 1.30, using mod.rs files was the way to load a module with nested children. It is encouraged to use the new naming convention as it is more consistent, and avoids having many files named mod.rs within a project.

### The path attribute

The directories and files used for loading external file modules can be influenced with the path attribute.

For path attributes on modules not inside inline module blocks, the file path is relative to the directory the source file is located. For example, the following code snippet would use the paths shown based on where it is located:

```rust
#[path = "foo.rs"]
mod c;
```

Source File | c's File Location | c's Module Path
----------- | ----------------- | ---------------
src/a/b.rs | src/a/foo.rs | crate::a::b::c
src/a/mod.rs | src/a/foo.rs | crate::a::c

For path attributes inside inline module blocks, the relative location of the file path depends on the kind of source file the path attribute is located in. "mod-rs" source files are root modules (such as lib.rs or main.rs) and modules with files named mod.rs. "non-mod-rs" source files are all other module files. Paths for path attributes inside inline module blocks in a mod-rs file are relative to the directory of the mod-rs file including the inline module components as directories. For non-mod-rs files, it is the same except the path starts with a directory with the name of the non-mod-rs module.
For example, the following code snippet would use the paths shown based on where it is located:

```rust
mod inline {
    #[path = "other.rs"]
    mod inner;
}
```

Source File | inner's File Location | inner's Module Path
----------- | --------------------- | -------------------
src/a/b.rs | src/a/b/inline/other.rs | crate::a::b::inline::inner
src/a/mod.rs | src/a/inline/other.rs | crate::a::inline::inner

An example of combining the above rules of path attributes on inline modules and nested modules within (applies to both mod-rs and non-mod-rs files):

```rust
#[path = "thread_files"]
mod thread {
    // Load the local_data module from thread_files/tls.rs relative to
    // this source file's directory.
    #[path = "tls.rs"]
    mod local_data;
}
```

## Prelude Items

Modules implicitly have some names in scope. These names come from built-in types, macros imported with #[macro_use] on an extern crate, and the crate's prelude. These names are all made of a single identifier. These names are not part of the module, so, for example, for any such name name, self::name is not a valid path.

The names added by the prelude can be removed by placing the no_implicit_prelude attribute onto the module.

## Attributes on Modules

Modules, like all items, accept outer attributes. They also accept inner attributes: either after { for a module with a body, or at the beginning of the source file, after the optional BOM and shebang.

The built-in attributes that have meaning on a module are cfg, deprecated, doc, the lint check attributes, path, and no_implicit_prelude. Modules also accept macro attributes.

# Extern crate declarations

Syntax:
ExternCrate :
extern crate CrateRef AsClause? ;

CrateRef :
IDENTIFIER | self

AsClause :
as ( IDENTIFIER | _ )

An extern crate declaration specifies a dependency on an external crate. The external crate is then bound into the declaring scope as the identifier provided in the extern crate declaration. The as clause can be used to bind the imported crate to a different name.
The external crate is resolved to a specific soname at compile time, and a runtime linkage requirement to that soname is passed to the linker for loading at runtime. The soname is resolved at compile time by scanning the compiler's library path and matching the optional crateid provided against the crateid attributes that were declared on the external crate when it was compiled. If no crateid is provided, a default name attribute is assumed, equal to the identifier given in the extern crate declaration.

The self crate may be imported which creates a binding to the current crate. In this case the as clause must be used to specify the name to bind it to.

Three examples of extern crate declarations:

```rust
extern crate pcre;

extern crate std; // equivalent to: extern crate std as std;

extern crate std as ruststd; // linking to 'std' under another name
```

When naming Rust crates, hyphens are disallowed. However, Cargo packages may make use of them. In such case, when Cargo.toml doesn't specify a crate name, Cargo will transparently replace - with _ (Refer to RFC 940 for more details).

Here is an example:

```rust
// Importing the Cargo package hello-world
extern crate hello_world; // hyphen replaced with an underscore
```

## Extern Prelude

External crates imported with extern crate in the root module or provided to the compiler (as with the --extern flag with rustc) are added to the "extern prelude". Crates in the extern prelude are in scope in the entire crate, including inner modules. If imported with extern crate orig_name as new_name, then the symbol new_name is instead added to the prelude.

The core crate is always added to the extern prelude. The std crate is added as long as the no_std attribute is not specified in the crate root.

The no_implicit_prelude attribute can be used on a module to disable prelude lookups within that module.
Edition Differences: In the 2015 edition, crates in the extern prelude cannot be referenced via use declarations, so it is generally standard practice to include extern crate declarations to bring them into scope. Beginning in the 2018 edition, use declarations can reference crates in the extern prelude, so it is considered unidiomatic to use extern crate.

Note: Additional crates that ship with rustc, such as proc_macro, alloc, and test, are not automatically included with the --extern flag when using Cargo. They must be brought into scope with an extern crate declaration, even in the 2018 edition.

```rust
# #![allow(unused_variables)]
#fn main() {
extern crate proc_macro;
use proc_macro::TokenStream;
#}
```

## Underscore Imports

An external crate dependency can be declared without binding its name in scope by using an underscore with the form extern crate foo as _. This may be useful for crates that only need to be linked, but are never referenced, and will avoid being reported as unused.

The macro_use attribute works as usual and imports the macro names into the macro-use prelude.

## The no_link attribute

The no_link attribute may be specified on an extern crate item to prevent linking the crate into the output. This is commonly used to load a crate to access only its macros.

# Use declarations

Syntax:
UseDeclaration :
use UseTree ;

UseTree :
(SimplePath? ::)? *
| (SimplePath? ::)? { (UseTree ( , UseTree )\* ,?)? }
| SimplePath ( as ( IDENTIFIER | _ ) )?

A use declaration creates one or more local name bindings synonymous with some other path. Usually a use declaration is used to shorten the path required to refer to a module item. These declarations may appear in modules and blocks, usually at the top.
Use declarations support a number of convenient shortcuts:

• Simultaneously binding a list of paths with a common prefix, using the glob-like brace syntax use a::b::{c, d, e::f, g::h::i};
• Simultaneously binding a list of paths with a common prefix and their common parent module, using the self keyword, such as use a::b::{self, c, d::e};
• Rebinding the target name as a new local name, using the syntax use p::q::r as x;. This can also be used with the last two features: use a::b::{self as ab, c as abc}.
• Binding all paths matching a given prefix, using the asterisk wildcard syntax use a::b::*;.
• Nesting groups of the previous features multiple times, such as use a::b::{self as ab, c, d::{*, e::f}};

An example of use declarations:

```rust
use std::option::Option::{Some, None};
use std::collections::hash_map::{self, HashMap};

fn foo<T>(_: T) {}
fn bar(map1: HashMap<String, usize>, map2: hash_map::HashMap<String, usize>) {}

fn main() {
    // Equivalent to 'foo(vec![std::option::Option::Some(1.0f64),
    // std::option::Option::None]);'
    foo(vec![Some(1.0f64), None]);

    // Both hash_map and HashMap are in scope.
    let map1 = HashMap::new();
    let map2 = hash_map::HashMap::new();
    bar(map1, map2);
}
```

## use Visibility

Like items, use declarations are private to the containing module, by default. Also like items, a use declaration can be public, if qualified by the pub keyword. Such a use declaration serves to re-export a name. A public use declaration can therefore redirect some public name to a different target definition: even a definition with a private canonical path, inside a different module. If a sequence of such redirections forms a cycle or cannot be resolved unambiguously, they represent a compile-time error.

An example of re-exporting:

```rust
# fn main() { }
mod quux {
    pub use quux::foo::{bar, baz};

    pub mod foo {
        pub fn bar() { }
        pub fn baz() { }
    }
}
```

In this example, the module quux re-exports two public names defined in foo.
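To see the re-export from the caller's side, here is a small self-contained sketch. The string return values are added purely for illustration, and the re-export uses self::foo so the path is valid in both editions:

```rust
mod quux {
    // Re-export `bar` and `baz` so they are reachable directly as
    // `quux::bar` and `quux::baz` from outside this module.
    pub use self::foo::{bar, baz};

    pub mod foo {
        pub fn bar() -> &'static str { "bar" }
        pub fn baz() -> &'static str { "baz" }
    }
}

fn main() {
    // Both the re-exported path and the original path work.
    assert_eq!(quux::bar(), "bar");
    assert_eq!(quux::foo::baz(), "baz");
}
```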
## use Paths

Paths in use items must start with a crate name or one of the path qualifiers crate, self, super, or ::. crate refers to the current crate. self refers to the current module. super refers to the parent module. :: can be used to explicitly refer to a crate, requiring an extern crate name to follow.

An example of what will and will not work for use items:

```rust
# #![allow(unused_imports)]
use std::path::{self, Path, PathBuf};  // good: std is a crate name

use crate::foo::baz::foobaz;    // good: foo is at the root of the crate

mod foo {

    mod example {
        pub mod iter {}
    }

    use crate::foo::example::iter; // good: foo is at crate root
//  use example::iter;      // bad: relative paths are not allowed without self
    use self::baz::foobaz;  // good: self refers to module 'foo'
    use crate::foo::bar::foobar;   // good: foo is at crate root

    pub mod bar {
        pub fn foobar() { }
    }

    pub mod baz {
        use super::bar::foobar; // good: super refers to module 'foo'
        pub fn foobaz() { }
    }
}

fn main() {}
```

Edition Differences: In the 2015 edition, use paths also allow accessing items in the crate root. Using the example above, the following use paths work in 2015 but not 2018:

```rust
use foo::example::iter;
use ::foo::baz::foobaz;
```

The 2015 edition does not allow use declarations to reference the extern prelude. Thus extern crate declarations are still required in 2015 to reference an external crate in a use declaration. Beginning with the 2018 edition, use declarations can specify an external crate dependency the same way extern crate can.

In the 2018 edition, if an in-scope item has the same name as an external crate, then use of that crate name requires a leading :: to unambiguously select the crate name. This is to retain compatibility with potential future changes.

```rust
// use std::fs; // Error, this is ambiguous.
use ::std::fs; // Imports from the std crate, not the module below.
use self::std::fs as self_fs; // Imports the module below.
```
```rust
mod std {
    pub mod fs {}
}
# fn main() {}
```

## Underscore Imports

Items can be imported without binding to a name by using an underscore with the form use path as _. This is particularly useful to import a trait so that its methods may be used without importing the trait's symbol, for example if the trait's symbol may conflict with another symbol. Another example is to link an external crate without importing its name.

Asterisk glob imports will import items imported with _ in their unnameable form.

```rust
mod foo {
    pub trait Zoo {
        fn zoo(&self) {}
    }

    impl<T> Zoo for T {}
}

use self::foo::Zoo as _;
struct Zoo;  // Underscore import avoids name conflict with this item.

fn main() {
    let z = Zoo;
    z.zoo();
}
```

The unique, unnameable symbols are created after macro expansion so that macros may safely emit multiple references to _ imports. For example, the following should not produce an error:

```rust
# #![allow(unused_variables)]
#fn main() {
macro_rules! m {
    ($item: item) => { $item$item }
}

m!(use std as _;);
// This expands to:
// use std as _;
// use std as _;
#}
```

# Functions

Syntax
Function :
FunctionQualifiers fn IDENTIFIER Generics?
( FunctionParameters? )
FunctionReturnType? WhereClause?
BlockExpression

FunctionQualifiers :
const? unsafe? (extern Abi?)?

FunctionParameters :
FunctionParam (, FunctionParam)\* ,?

FunctionParam :
Pattern : Type

FunctionReturnType :
-> Type

A function consists of a block, along with a name and a set of parameters. Other than the name and the block, these are optional. Functions are declared with the keyword fn. Functions may declare a set of input variables as parameters, through which the caller passes arguments into the function, and the output type of the value the function will return to its caller on completion.

When referred to, a function yields a first-class value of the corresponding zero-sized function item type, which when called evaluates to a direct call to the function.

For example, this is a simple function:


```rust
# #![allow(unused_variables)]
#fn main() {
fn answer_to_life_the_universe_and_everything() -> i32 {
    return 42;
}
#}
```

As with let bindings, function arguments are irrefutable patterns, so any pattern that is valid in a let binding is also valid as an argument:


```rust
# #![allow(unused_variables)]
#fn main() {
fn first((value, _): (i32, i32)) -> i32 { value }
#}
```

The block of a function is conceptually wrapped in a block that binds the argument patterns and then returns the value of the function's block. This means that the tail expression of the block, if evaluated, ends up being returned to the caller. As usual, an explicit return expression within the body of the function will short-cut that implicit return, if reached.

For example, the function above behaves as if it was written as:

```rust
// argument_0 is the actual first argument passed from the caller
let (value, _) = argument_0;
return {
    value
};
```
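The behavior described above can be checked with a short runnable sketch; first_explicit is a hypothetical variant added here for comparison, not part of the reference's example:

```rust
// Tail expression: the last expression of the block is the return value.
fn first((value, _): (i32, i32)) -> i32 {
    value
}

// Hypothetical variant with an explicit `return`; behavior is identical.
fn first_explicit((value, _): (i32, i32)) -> i32 {
    return value;
}

fn main() {
    assert_eq!(first((1, 2)), 1);
    assert_eq!(first_explicit((3, 4)), 3);
}
```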


## Generic functions

A generic function allows one or more parameterized types to appear in its signature. Each type parameter must be explicitly declared in an angle-bracket-enclosed and comma-separated list, following the function name.


```rust
# #![allow(unused_variables)]
#fn main() {
// foo is generic over A and B

fn foo<A, B>(x: A, y: B) {
# }
#}
```

Inside the function signature and body, the name of the type parameter can be used as a type name. Trait bounds can be specified for type parameters to allow methods with that trait to be called on values of that type. This is specified using the where syntax:


```rust
# #![allow(unused_variables)]
#fn main() {
# use std::fmt::Debug;
fn foo<T>(x: T) where T: Debug {
# }
#}
```

When a generic function is referenced, its type is instantiated based on the context of the reference. For example, calling the foo function here:


```rust
# #![allow(unused_variables)]
#fn main() {
use std::fmt::Debug;

fn foo<T>(x: &[T]) where T: Debug {
    // details elided
}

foo(&[1, 2]);
#}
```

will instantiate type parameter T with i32.

The type parameters can also be explicitly supplied in a trailing path component after the function name. This might be necessary if there is not sufficient context to determine the type parameters. For example, mem::size_of::<u32>() == 4.
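A brief runnable illustration of explicitly supplied type parameters; the vector at the end is an added example of inference from context, not from the reference:

```rust
use std::mem;

fn main() {
    // Type parameter supplied explicitly in a trailing path component
    // (the "turbofish" syntax).
    assert_eq!(mem::size_of::<u32>(), 4);

    // Usually the context determines the parameter instead; here T = u8
    // is inferred from the element type of the vector.
    let bytes = vec![1u8, 2, 3];
    let total: usize = bytes.iter().map(|b| mem::size_of_val(b)).sum();
    assert_eq!(total, 3);
}
```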

## Extern functions

Extern functions are part of Rust's foreign function interface, providing the opposite functionality to external blocks. Whereas external blocks allow Rust code to call foreign code, extern functions with bodies defined in Rust code can be called by foreign code. They are defined in the same way as any other Rust function, except that they have the extern qualifier.


```rust
# #![allow(unused_variables)]
#fn main() {
// Declares an extern fn, the ABI defaults to "C"
extern fn new_i32() -> i32 { 0 }

// Declares an extern fn with "stdcall" ABI
# #[cfg(target_arch = "x86_64")]
extern "stdcall" fn new_i32_stdcall() -> i32 { 0 }
#}
```

Unlike normal functions, extern fns have type extern "ABI" fn(). This is the same type as the functions declared in an extern block.


```rust
# #![allow(unused_variables)]
#fn main() {
# extern fn new_i32() -> i32 { 0 }
let fptr: extern "C" fn() -> i32 = new_i32;
#}
```
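As a runnable sketch of these function types, using a hypothetical double function defined here for illustration:

```rust
// An extern function defined in Rust; it can be called through a
// C-ABI function pointer as well as directly from Rust code.
extern "C" fn double(x: i32) -> i32 {
    x * 2
}

fn main() {
    // The function item coerces to an `extern "C" fn` pointer type.
    let fptr: extern "C" fn(i32) -> i32 = double;
    assert_eq!(fptr(21), 42);

    // From Rust, it is also callable like any other function.
    assert_eq!(double(5), 10);
}
```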

As non-Rust calling conventions do not support unwinding, unwinding past the end of an extern function will cause the process to abort. In LLVM, this is implemented by executing an illegal instruction.

## Const functions

Functions qualified with the const keyword are const functions. Const functions can be called from within const contexts. When called from a const context, the function is interpreted by the compiler at compile time. The interpretation happens in the environment of the compilation target and not the host. So usize is 32 bits if you are compiling against a 32 bit system, regardless of whether you are building on a 64 bit or a 32 bit system.

If a const function is called outside a const context, it is indistinguishable from any other function. You can freely do anything with a const function that you can do with a regular function.
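A minimal runnable sketch of both uses, with a hypothetical square function for illustration:

```rust
// A simple const function: only integer arithmetic, so it can be
// evaluated at compile time.
const fn square(x: i32) -> i32 {
    x * x
}

// Called in const contexts: evaluated by the compiler.
const NINE: i32 = square(3);
const LEN: usize = square(2) as usize;
static ZEROS: [u8; LEN] = [0; LEN];

fn main() {
    // Called at runtime: indistinguishable from a regular function.
    assert_eq!(NINE, 9);
    assert_eq!(square(5), 25);
    assert_eq!(ZEROS.len(), 4);
}
```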

Const functions have various restrictions to make sure that they can be evaluated at compile-time. It is, for example, not possible to write a random number generator as a const function. Calling a const function at compile-time will always yield the same result as calling it at runtime, even when called multiple times. There's one exception to this rule: if you are doing complex floating point operations in extreme situations, then you might get (very slightly) different results. It is advisable to not make array lengths and enum discriminants depend on floating point computations.

Exhaustive list of permitted structures in const functions:

Note: this list is more restrictive than what you can write in regular constants

• Type parameters where the parameters only have trait bounds of the following kind: lifetimes, Sized or ?Sized.

This means that <T: 'a + ?Sized>, <T: 'b + Sized> and <T> are all permitted.

This rule also applies to type parameters of impl blocks that contain const methods

• Arithmetic and comparison operators on integers

• All boolean operators except for && and || which are banned since they are short-circuiting.

• Any kind of aggregate constructor (array, struct, enum, tuple, ...)

• Calls to other safe const functions (whether by function call or method call)

• Index expressions on arrays and slices

• Field accesses on structs and tuples

• Reading from constants (but not statics, not even taking a reference to a static)

• & and * (only dereferencing of references, not raw pointers)

• Casts except for raw pointer to integer casts

• unsafe blocks and const unsafe fn are allowed, but the body/block may only do the following unsafe operations:

• calls to const unsafe functions

## Attributes on functions

Outer attributes are allowed on functions. Inner attributes are allowed directly after the { inside its block.

This example shows an inner attribute on a function. The function will only be available while running tests.

fn test_only() {
    #![test]
}


Note: Except for lints, it is idiomatic to only use outer attributes on function items.

The attributes that have meaning on a function are cfg, deprecated, doc, export_name, link_section, no_mangle, the lint check attributes, must_use, the procedural macro attributes, the testing attributes, and the optimization hint attributes. Functions also accept attribute macros.
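As a small sketch of one listed attribute, must_use placed on a function makes the compiler lint when the return value is discarded (the function name answer here is made up for illustration):

```rust
// #[must_use] triggers the unused_must_use lint whenever
// the function's return value is ignored at a call site.
#[must_use]
fn answer() -> i32 {
    42
}

fn main() {
    let a = answer(); // binding the value satisfies the lint
    assert_eq!(a, 42);
}
```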

# Type aliases

Syntax
TypeAlias :
type IDENTIFIER Generics? WhereClause? = Type ;

A type alias defines a new name for an existing type. Type aliases are declared with the keyword type. Every value has a single, specific type, but may implement several different traits, or be compatible with several different type constraints.

For example, the following defines the type Point as a synonym for the type (u8, u8), the type of pairs of unsigned 8 bit integers:


# #![allow(unused_variables)]
#fn main() {
type Point = (u8, u8);
let p: Point = (41, 68);
#}

A type alias to an enum type cannot be used to qualify the constructors:


# #![allow(unused_variables)]
#fn main() {
enum E { A }
type F = E;
let _: F = E::A;  // OK
// let _: F = F::A;  // Doesn't work
#}

# Structs

Syntax
Struct :
StructStruct
| TupleStruct

StructStruct :
struct IDENTIFIER  Generics? WhereClause? ( { StructFields? } | ; )

TupleStruct :
struct IDENTIFIER  Generics? ( TupleFields? ) WhereClause? ;

StructFields :
StructField (, StructField)\* ,?

StructField :
OuterAttribute\*
Visibility?
IDENTIFIER : Type

TupleFields :
TupleField (, TupleField)\* ,?

TupleField :
OuterAttribute\*
Visibility?
Type

A struct is a nominal struct type defined with the keyword struct.

An example of a struct item and its use:


# #![allow(unused_variables)]
#fn main() {
struct Point {x: i32, y: i32}
let p = Point {x: 10, y: 11};
let px: i32 = p.x;
#}

A tuple struct is a nominal tuple type, also defined with the keyword struct. For example:


# #![allow(unused_variables)]
#fn main() {
struct Point(i32, i32);
let p = Point(10, 11);
let px: i32 = match p { Point(x, _) => x };
#}

A unit-like struct is a struct without any fields, defined by leaving off the list of fields entirely. Such a struct implicitly defines a constant of its type with the same name. For example:


# #![allow(unused_variables)]
#fn main() {
struct Cookie;
let c = [Cookie, Cookie {}, Cookie, Cookie {}];
#}

is equivalent to


# #![allow(unused_variables)]
#fn main() {
struct Cookie {}
const Cookie: Cookie = Cookie {};
let c = [Cookie, Cookie {}, Cookie, Cookie {}];
#}

The precise memory layout of a struct is not specified. One can specify a particular layout using the repr attribute.
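As a sketch of the repr attribute (the struct name CPoint is made up for illustration), the C representation gives a struct a defined field order and, for two i32 fields, a predictable size:

```rust
use std::mem;

// With repr(C), fields are laid out in declaration order
// following C layout rules.
#[repr(C)]
struct CPoint {
    x: i32,
    y: i32,
}

fn main() {
    // Two 4-byte fields with 4-byte alignment: 8 bytes, no padding.
    assert_eq!(mem::size_of::<CPoint>(), 8);
}
```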

# Enumerations

Syntax
Enumeration :
enum IDENTIFIER  Generics? WhereClause? { EnumItems? }

EnumItems :
EnumItem ( , EnumItem )\* ,?

EnumItem :
OuterAttribute\*
IDENTIFIER ( EnumItemTuple | EnumItemStruct | EnumItemDiscriminant )?

EnumItemTuple :
( TupleFields? )

EnumItemStruct :
{ StructFields? }

EnumItemDiscriminant :
= Expression

An enumeration, also referred to as an enum, is a simultaneous definition of a nominal enumerated type as well as a set of constructors that can be used to create or pattern-match values of the corresponding enumerated type.

Enumerations are declared with the keyword enum.

An example of an enum item and its use:


# #![allow(unused_variables)]
#fn main() {
enum Animal {
    Dog,
    Cat,
}

let mut a: Animal = Animal::Dog;
a = Animal::Cat;
#}

Enum constructors can have either named or unnamed fields:


# #![allow(unused_variables)]
#fn main() {
enum Animal {
    Dog(String, f64),
    Cat { name: String, weight: f64 },
}

let mut a: Animal = Animal::Dog("Cocoa".to_string(), 37.2);
a = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };
#}

In this example, Cat is a struct-like enum variant, whereas Dog is simply called an enum variant. Each enum instance has a discriminant which is an integer associated to it that is used to determine which variant it holds. An opaque reference to this discriminant can be obtained with the mem::discriminant function.
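A brief sketch of mem::discriminant in use (repeating the Animal definition from above): two values of the same variant compare equal by discriminant, regardless of their field contents.

```rust
use std::mem;

enum Animal {
    Dog(String, f64),
    Cat { name: String, weight: f64 },
}

fn main() {
    let a = Animal::Dog("Cocoa".to_string(), 37.2);
    let b = Animal::Dog("Rex".to_string(), 12.0);
    let c = Animal::Cat { name: "Spotty".to_string(), weight: 2.7 };

    // Same variant: same discriminant, even with different field values.
    assert_eq!(mem::discriminant(&a), mem::discriminant(&b));
    // Different variants: different discriminants.
    assert_ne!(mem::discriminant(&a), mem::discriminant(&c));
}
```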

## Custom Discriminant Values for Field-Less Enumerations

If there is no data attached to any of the variants of an enumeration, then the discriminant can be directly chosen and accessed.

These enumerations can be cast to integer types with the as operator by a numeric cast. The enumeration can optionally specify which integer each discriminant gets by following the variant name with = followed by a constant expression. If the first variant in the declaration is unspecified, then it is set to zero. For every other unspecified discriminant, it is set to one higher than the previous variant in the declaration.


# #![allow(unused_variables)]
#fn main() {
enum Foo {
    Bar,            // 0
    Baz = 123,      // 123
    Quux,           // 124
}

let baz_discriminant = Foo::Baz as u32;
assert_eq!(baz_discriminant, 123);
#}

Under the default representation, the specified discriminant is interpreted as an isize value although the compiler is allowed to use a smaller type in the actual memory layout. The size and thus acceptable values can be changed by using a primitive representation or the C representation.

It is an error when two variants share the same discriminant.

enum SharedDiscriminantError {
    SharedA = 1,
    SharedB = 1
}

enum SharedDiscriminantError2 {
    Zero,       // 0
    One,        // 1
    OneToo = 1  // 1 (collision with previous!)
}


It is also an error to have an unspecified discriminant where the previous discriminant is the maximum value for the size of the discriminant.

#[repr(u8)]
enum OverflowingDiscriminantError {
    Max = 255,
    MaxPlusOne // Would be 256, but that overflows the enum.
}

#[repr(u8)]
enum OverflowingDiscriminantError2 {
    MaxMinusOne = 254, // 254
    Max,               // 255
    MaxPlusOne         // Would be 256, but that overflows the enum.
}


## Zero-variant Enums

Enums with zero variants are known as zero-variant enums. As they have no valid values, they cannot be instantiated.


# #![allow(unused_variables)]
#fn main() {
enum ZeroVariants {}
#}

# Unions

Syntax
Union :
union IDENTIFIER Generics? WhereClause? { StructFields }

A union declaration uses the same syntax as a struct declaration, except with union in place of struct.


# #![allow(unused_variables)]
#fn main() {
#[repr(C)]
union MyUnion {
    f1: u32,
    f2: f32,
}
#}

The key property of unions is that all fields of a union share common storage. As a result, writes to one field of a union can overwrite its other fields, and the size of a union is determined by the size of its largest field.
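The size property can be checked with a quick sketch (the union name Storage and its fields are made up for illustration):

```rust
use std::mem;

// A hypothetical union whose fields have different sizes.
#[repr(C)]
union Storage {
    small: u8,
    big: [u8; 8],
}

fn main() {
    // All fields share one allocation, so the union is at least
    // as large as its largest field.
    assert!(mem::size_of::<Storage>() >= mem::size_of::<[u8; 8]>());
}
```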

## Initialization of a union

A value of a union type can be created using the same syntax that is used for struct types, except that it must specify exactly one field:


# #![allow(unused_variables)]
#fn main() {
# union MyUnion { f1: u32, f2: f32 }
#
let u = MyUnion { f1: 1 };
#}

The expression above creates a value of type MyUnion and initializes the storage using field f1. The union can be accessed using the same syntax as struct fields:

let f = u.f1;


## Reading and writing union fields

Unions have no notion of an "active field". Instead, every union access just interprets the storage at the type of the field used for the access. Reading a union field reads the bits of the union at the field's type. Fields might have a non-zero offset (except when #[repr(C)] is used); in that case the bits starting at the offset of the fields are read. It is the programmer's responsibility to make sure that the data is valid at the field's type. Failing to do so results in undefined behavior. For example, reading the value 3 at type bool is undefined behavior. Effectively, writing to and then reading from a #[repr(C)] union is analogous to a transmute from the type used for writing to the type used for reading.

Consequently, all reads of union fields have to be placed in unsafe blocks:


# #![allow(unused_variables)]
#fn main() {
# union MyUnion { f1: u32, f2: f32 }
# let u = MyUnion { f1: 1 };
#
unsafe {
    let f = u.f1;
}
#}
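The transmute analogy can be made concrete with a sketch (the union name Bits is made up for illustration): writing one field and reading the other reinterprets the stored bits.

```rust
#[repr(C)]
union Bits {
    f: f32,
    u: u32,
}

fn main() {
    let b = Bits { f: 1.0 };
    // Reading the other field reinterprets the stored bits,
    // analogous to a transmute from f32 to u32.
    let bits = unsafe { b.u };
    assert_eq!(bits, 0x3f80_0000); // IEEE 754 encoding of 1.0f32
}
```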

Writes to Copy union fields do not require reads for running destructors, so these writes don't have to be placed in unsafe blocks:


# #![allow(unused_variables)]
#fn main() {
# union MyUnion { f1: u32, f2: f32 }
# let mut u = MyUnion { f1: 1 };
#
u.f1 = 2;
#}

Commonly, code using unions will provide safe wrappers around unsafe union field accesses.
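One possible shape for such a wrapper, sketched here with made-up names (U32Only and MyUnion are illustrative, not a standard pattern from the text): the safe constructor controls which field is ever initialized, so the unsafe read inside the getter is sound.

```rust
#[repr(C)]
union MyUnion {
    f1: u32,
    f2: f32,
}

// A safe wrapper: it only ever initializes f1, so reading
// f1 back is always valid.
struct U32Only(MyUnion);

impl U32Only {
    fn new(v: u32) -> Self {
        U32Only(MyUnion { f1: v })
    }

    fn get(&self) -> u32 {
        // Sound: this wrapper never writes any field but f1.
        unsafe { self.0.f1 }
    }
}

fn main() {
    let u = U32Only::new(7);
    assert_eq!(u.get(), 7);
}
```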

## Pattern matching on unions

Another way to access union fields is to use pattern matching. Pattern matching on union fields uses the same syntax as struct patterns, except that the pattern must specify exactly one field. Since pattern matching is like reading the union with a particular field, it has to be placed in unsafe blocks as well.


# #![allow(unused_variables)]
#fn main() {
# union MyUnion { f1: u32, f2: f32 }
#
fn f(u: MyUnion) {
    unsafe {
        match u {
            MyUnion { f1: 10 } => { println!("ten"); }
            MyUnion { f2 } => { println!("{}", f2); }
        }
    }
}
#}

Pattern matching may match a union as a field of a larger structure. In particular, when using a Rust union to implement a C tagged union via FFI, this allows matching on the tag and the corresponding field simultaneously:


# #![allow(unused_variables)]
#fn main() {
#[repr(u32)]
enum Tag { I, F }

#[repr(C)]
union U {
    i: i32,
    f: f32,
}

#[repr(C)]
struct Value {
    tag: Tag,
    u: U,
}

fn is_zero(v: Value) -> bool {
    unsafe {
        match v {
            Value { tag: Tag::I, u: U { i: 0 } } => true,
            Value { tag: Tag::F, u: U { f: 0.0 } } => true,
            _ => false,
        }
    }
}
#}

## References to union fields

Since union fields share common storage, gaining write access to one field of a union can give write access to all its remaining fields. Borrow checking rules have to be adjusted to account for this fact. As a result, if one field of a union is borrowed, all its remaining fields are borrowed as well for the same lifetime.

// ERROR: cannot borrow u (via u.f2) as mutable more than once at a time
fn test() {
    let mut u = MyUnion { f1: 1 };
    unsafe {
        let b1 = &mut u.f1;
                 ---- first mutable borrow occurs here (via u.f1)
        let b2 = &mut u.f2;
                 ^^^^ second mutable borrow occurs here (via u.f2)
        *b1 = 5;
    }
    - first borrow ends here
    assert_eq!(unsafe { u.f1 }, 5);
}


As you can see, in many aspects (except for layout, safety and ownership) unions behave exactly like structs, largely as a consequence of inheriting their syntactic shape from structs. This is also true for many unmentioned aspects of the Rust language (such as privacy, name resolution, type inference, generics, trait implementations, inherent implementations, coherence, pattern checking, and so on).

# Constant items

Syntax
ConstantItem :
const ( IDENTIFIER | _ ) : Type = Expression ;

A constant item is an optionally named constant value which is not associated with a specific memory location in the program. Constants are essentially inlined wherever they are used, meaning that they are copied directly into the relevant context when used. References to the same constant are not necessarily guaranteed to refer to the same memory address.

Constants must be explicitly typed. The type must have a 'static lifetime: any references it contains must have 'static lifetimes.

Constants may refer to the address of other constants, in which case the address will have elided lifetimes where applicable, otherwise – in most cases – defaulting to the static lifetime. (See static lifetime elision.) The compiler is, however, still at liberty to translate the constant many times, so the address referred to may not be stable.
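A brief sketch of the "not necessarily the same address" point: two references to the same constant always see equal values, but nothing guarantees they point at the same copy, so the example below deliberately asserts only on the values.

```rust
const X: i32 = 5;

fn main() {
    // Each use of X may be a fresh inlined copy; the values are
    // equal, but the two references need not share an address.
    let r1: &'static i32 = &X;
    let r2: &'static i32 = &X;
    assert_eq!(*r1, *r2);
}
```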


# #![allow(unused_variables)]
#fn main() {
const BIT1: u32 = 1 << 0;
const BIT2: u32 = 1 << 1;

const BITS: [u32; 2] = [BIT1, BIT2];
const STRING: &'static str = "bitstring";

struct BitsNStrings<'a> {
mybits: [u32; 2],
mystring: &'a str,
}

const BITS_N_STRINGS: BitsNStrings<'static> = BitsNStrings {
mybits: BITS,
mystring: STRING,
};
#}

## Constants with Destructors

Constants can contain destructors. Destructors are run when the value goes out of scope.


# #![allow(unused_variables)]
#fn main() {
struct TypeWithDestructor(i32);

impl Drop for TypeWithDestructor {
    fn drop(&mut self) {
        println!("Dropped. Held {}.", self.0);
    }
}

const ZERO_WITH_DESTRUCTOR: TypeWithDestructor = TypeWithDestructor(0);

fn create_and_drop_zero_with_destructor() {
    let x = ZERO_WITH_DESTRUCTOR;
    // x gets dropped at end of function, calling drop.
    // prints "Dropped. Held 0.".
}
#}

## Unnamed constant

Unlike an associated constant, a free constant may be unnamed by using an underscore instead of the name. For example:


# #![allow(unused_variables)]
#fn main() {
const _: () =  { struct _SameNameTwice; };

// OK although it is the same name as above:
const _: () =  { struct _SameNameTwice; };
#}

As with underscore imports, macros may safely emit the same unnamed constant in the same scope more than once. For example, the following should not produce an error:


# #![allow(unused_variables)]
#fn main() {
macro_rules! m {
    ($item: item) => { $item $item }
}

m!(const _: () = (););
// This expands to:
// const _: () = ();
// const _: () = ();
#}

## The no_mangle attribute

The no_mangle attribute may be used on any item to disable standard symbol name mangling. The symbol for the item will be the identifier of the item's name.

## The link_section attribute

The link_section attribute specifies the section of the object file that a function or static's content will be placed into. It uses the MetaNameValueStr syntax to specify the section name.

#[no_mangle]
#[link_section = ".example_section"]
pub static VAR1: u32 = 1;


## The export_name attribute

The export_name attribute specifies the name of the symbol that will be exported on a function or static. It uses the MetaNameValueStr syntax to specify the symbol name.

#[export_name = "exported_symbol_name"]
pub fn name_in_rust() { }


# The Rust runtime

This section documents features that define some aspects of the Rust runtime.

## The panic_handler attribute

The panic_handler attribute can only be applied to a function with signature fn(&PanicInfo) -> !. The function marked with this attribute defines the behavior of panics. The PanicInfo struct contains information about the location of the panic. There must be a single panic_handler function in the dependency graph of a binary, dylib or cdylib crate.

Below is shown a panic_handler function that logs the panic message and then halts the thread.

#![no_std]

use core::fmt::{self, Write};
use core::panic::PanicInfo;

struct Sink {
    // ..
#    _0: (),
}
#
# impl Sink {
#     fn new() -> Sink { Sink { _0: () } }
# }
#
# impl fmt::Write for Sink {
#     fn write_str(&mut self, _: &str) -> fmt::Result { Ok(()) }
# }

#[panic_handler]
fn panic(info: &PanicInfo) -> ! {
    let mut sink = Sink::new();

    // logs "panicked at '$reason', src/main.rs:27:4" to some sink
    let _ = writeln!(sink, "{}", info);

    loop {}
}


### Standard behavior

The standard library provides an implementation of panic_handler that defaults to unwinding the stack but that can be changed to abort the process. The standard library's panic behavior can be modified at runtime with the set_hook function.
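A sketch of changing the standard behavior at runtime with set_hook; catch_unwind is used here (an addition for the sake of a self-contained example) so the program can observe that the panic occurred and continue:

```rust
use std::panic;

fn main() {
    // Replace the standard panic hook at runtime.
    panic::set_hook(Box::new(|info| {
        eprintln!("custom hook: {}", info);
    }));

    // The custom hook runs when this closure panics;
    // catch_unwind then lets the program keep going.
    let result = panic::catch_unwind(|| panic!("boom"));
    assert!(result.is_err());
}
```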

## The global_allocator attribute

The global_allocator attribute is used on a static item implementing the GlobalAlloc trait to set the global allocator.

## The windows_subsystem attribute

The windows_subsystem attribute may be applied at the crate level to set the subsystem when linking on a Windows target. It uses the MetaNameValueStr syntax to specify the subsystem with a value of either console or windows. This attribute is ignored on non-Windows targets, and for non-bin crate types.

#![windows_subsystem = "windows"]


# Appendix: Macro Follow-Set Ambiguity Formal Specification

This page documents the formal specification of the follow rules for Macros By Example. They were originally specified in RFC 550, from which the bulk of this text is copied, and expanded upon in subsequent RFCs.

## Definitions & Conventions

• macro: anything invokable as foo!(...) in source code.
• MBE: macro-by-example, a macro defined by macro_rules.
• matcher: the left-hand-side of a rule in a macro_rules invocation, or a subportion thereof.
• macro parser: the bit of code in the Rust parser that will parse the input using a grammar derived from all of the matchers.
• fragment: The class of Rust syntax that a given matcher will accept (or "match").
• repetition: a fragment that follows a regular repeating pattern
• NT: non-terminal, the various "meta-variables" or repetition matchers that can appear in a matcher, specified in MBE syntax with a leading $ character.
• simple NT: a "meta-variable" non-terminal (further discussion below).
• complex NT: a repetition matching non-terminal, specified via repetition operators (\*, +, ?).
• token: an atomic element of a matcher; i.e. identifiers, operators, open/close delimiters, and simple NT's.
• token tree: a tree structure formed from tokens (the leaves), complex NT's, and finite sequences of token trees.
• delimiter token: a token that is meant to divide the end of one fragment and the start of the next fragment.
• separator token: an optional delimiter token in a complex NT that separates each pair of elements in the matched repetition.
• separated complex NT: a complex NT that has its own separator token.
• delimited sequence: a sequence of token trees with appropriate open- and close-delimiters at the start and end of the sequence.
• empty fragment: the class of invisible Rust syntax that separates tokens, i.e. whitespace, or (in some lexical contexts), the empty token sequence.
• fragment specifier: the identifier in a simple NT that specifies which fragment the NT accepts.
• language: a context-free language.

Example:


# #![allow(unused_variables)]
#fn main() {
macro_rules! i_am_an_mbe {
    (start $foo:expr $($i:ident),* end) => ($foo)
}
#}

(start $foo:expr $($i:ident),\* end) is a matcher. The whole matcher is a delimited sequence (with open- and close-delimiters ( and )), and $foo and $i are simple NT's with expr and ident as their respective fragment specifiers.

$($i:ident),\* is also an NT; it is a complex NT that matches a comma-separated repetition of identifiers. The , is the separator token for the complex NT; it occurs in between each pair of elements (if any) of the matched fragment.

Another example of a complex NT is $(hi $e:expr ;)+, which matches any fragment of the form hi <expr>; hi <expr>; ... where hi <expr>; occurs at least once. Note that this complex NT does not have a dedicated separator token.

(Note that Rust's parser ensures that delimited sequences always occur with proper nesting of token tree structure and correct matching of open- and close-delimiters.)

We will tend to use the variable "M" to stand for a matcher, variables "t" and "u" for arbitrary individual tokens, and the variables "tt" and "uu" for arbitrary token trees. (The use of "tt" does present potential ambiguity with its additional role as a fragment specifier; but it will be clear from context which interpretation is meant.)

"SEP" will range over separator tokens, "OP" over the repetition operators \*, +, and ?, "OPEN"/"CLOSE" over matching token pairs surrounding a delimited sequence (e.g. [ and ]).

Greek letters "α" "β" "γ" "δ" stand for potentially empty token-tree sequences. (However, the Greek letter "ε" (epsilon) has a special role in the presentation and does not stand for a token-tree sequence.)

• This Greek letter convention is usually just employed when the presence of a sequence is a technical detail; in particular, when we wish to emphasize that we are operating on a sequence of token-trees, we will use the notation "tt ..." for the sequence, not a Greek letter.

Note that a matcher is merely a token tree. A "simple NT", as mentioned above, is a meta-variable NT; thus it is a non-repetition. For example, $foo:ty is a simple NT but $($foo:ty)+ is a complex NT.

Note also that in the context of this formalism, the term "token" generally includes simple NTs.

Finally, it is useful for the reader to keep in mind that according to the definitions of this formalism, no simple NT matches the empty fragment, and likewise no token matches the empty fragment of Rust syntax. (Thus, the only NT that can match the empty fragment is a complex NT.) This is not actually true, because the vis matcher can match an empty fragment. Thus, for the purposes of the formalism, we will treat $v:vis as actually being $($v:vis)?, with a requirement that the matcher match an empty fragment.

### The Matcher Invariants

To be valid, a matcher must meet the following three invariants. The definitions of FIRST and FOLLOW are described later.

1. For any two successive token tree sequences in a matcher M (i.e. M = ... tt uu ...) with uu ... nonempty, we must have FOLLOW(... tt) ∪ {ε} ⊇ FIRST(uu ...).
2. For any separated complex NT in a matcher, M = ... $(tt ...) SEP OP ..., we must have SEP ∈ FOLLOW(tt ...).
3. For an unseparated complex NT in a matcher, M = ... $(tt ...) OP ..., if OP = \* or +, we must have FOLLOW(tt ...) ⊇ FIRST(tt ...).

The first invariant says that whatever actual token comes after a matcher, if any, must be somewhere in the predetermined follow set. This ensures that a legal macro definition will continue to assign the same determination as to where ... tt ends and uu ... begins, even as new syntactic forms are added to the language.

The second invariant says that a separated complex NT must use a separator token that is part of the predetermined follow set for the internal contents of the NT. This ensures that a legal macro definition will continue to parse an input fragment into the same delimited sequence of tt ...'s, even as new syntactic forms are added to the language.

The third invariant says that when we have a complex NT that can match two or more copies of the same thing with no separation in between, it must be permissible for them to be placed next to each other as per the first invariant. This invariant also requires they be nonempty, which eliminates a possible ambiguity.

NOTE: The third invariant is currently unenforced due to historical oversight and significant reliance on the behaviour. It is currently undecided what to do about this going forward. Macros that do not respect the behaviour may become invalid in a future edition of Rust. See the tracking issue.

### FIRST and FOLLOW, informally

A given matcher M maps to three sets: FIRST(M), LAST(M) and FOLLOW(M). Each of the three sets is made up of tokens. FIRST(M) and LAST(M) may also contain a distinguished non-token element ε ("epsilon"), which indicates that M can match the empty fragment. (But FOLLOW(M) is always just a set of tokens.)

Informally:

• FIRST(M): collects the tokens potentially used first when matching a fragment to M.
• LAST(M): collects the tokens potentially used last when matching a fragment to M.
• FOLLOW(M): the set of tokens allowed to follow immediately after some fragment matched by M.

In other words: t ∈ FOLLOW(M) if and only if there exist (potentially empty) token sequences α, β, γ, δ where:

• M matches β,
• t matches γ, and
• the concatenation α β γ δ is a parseable Rust program.

We use the shorthand ANYTOKEN to denote the set of all tokens (including simple NTs). For example, if any token is legal after a matcher M, then FOLLOW(M) = ANYTOKEN.

(To review one's understanding of the above informal descriptions, the reader at this point may want to jump ahead to the examples of FIRST/LAST before reading their formal definitions.)

### FIRST, LAST

Below are formal inductive definitions for FIRST and LAST.

"A ∪ B" denotes set union, "A ∩ B" denotes set intersection, and "A \ B" denotes set difference (i.e. all elements of A that are not present in B).

#### FIRST

FIRST(M) is defined by case analysis on the sequence M and the structure of its first token-tree (if any):

• if M is the empty sequence, then FIRST(M) = { ε },

• if M starts with a token t, then FIRST(M) = { t },

(Note: this covers the case where M starts with a delimited token-tree sequence, M = OPEN tt ... CLOSE ..., in which case t = OPEN and thus FIRST(M) = { OPEN }.)

(Note: this critically relies on the property that no simple NT matches the empty fragment.)

• Otherwise, M is a token-tree sequence starting with a complex NT: M = $( tt ... ) OP α, or M = $( tt ... ) SEP OP α, (where α is the (potentially empty) sequence of token trees for the rest of the matcher).

• Let SEP_SET(M) = { SEP } if SEP is present and ε ∈ FIRST(tt ...); otherwise SEP_SET(M) = {}.
• Let ALPHA_SET(M) = FIRST(α) if OP = \* or ? and ALPHA_SET(M) = {} if OP = +.
• FIRST(M) = (FIRST(tt ...) \ {ε}) ∪ SEP_SET(M) ∪ ALPHA_SET(M).

The definition for complex NTs deserves some justification. SEP_SET(M) defines the possibility that the separator could be a valid first token for M, which happens when there is a separator defined and the repeated fragment could be empty. ALPHA_SET(M) defines the possibility that the complex NT could be empty, meaning that M's valid first tokens are those of the following token-tree sequence α. This occurs when either \* or ? is used, in which case there could be zero repetitions. In theory, this could also occur if + was used with a potentially-empty repeating fragment, but this is forbidden by the third invariant.

From there, clearly FIRST(M) can include any token from SEP_SET(M) or ALPHA_SET(M), and if the complex NT match is nonempty, then any token starting FIRST(tt ...) could work too. The last piece to consider is ε. SEP_SET(M) and FIRST(tt ...) \ {ε} cannot contain ε, but ALPHA_SET(M) could. Hence, this definition allows M to accept ε if and only if ALPHA_SET(M) does. This is correct because for M to accept ε in the complex NT case, both the complex NT and α must accept it. If OP = +, meaning that the complex NT cannot be empty, then by definition ε ∉ ALPHA_SET(M). Otherwise, the complex NT can accept zero repetitions, and then ALPHA_SET(M) = FIRST(α). So this definition is correct with respect to ε as well.

#### LAST

LAST(M) is defined by case analysis on M itself (a sequence of token-trees):

• if M is the empty sequence, then LAST(M) = { ε }

• if M is a singleton token t, then LAST(M) = { t }

• if M is the singleton complex NT repeating zero or more times, M = $( tt ... ) \*, or M = $( tt ... ) SEP \*

• Let sep_set = { SEP } if SEP present; otherwise sep_set = {}.
• if ε ∈ LAST(tt ...) then LAST(M) = LAST(tt ...) ∪ sep_set
• otherwise, the sequence tt ... must be non-empty; LAST(M) = LAST(tt ...) ∪ {ε}

• if M is the singleton complex NT repeating one or more times, M = $( tt ... ) +, or M = $( tt ... ) SEP +

• Let sep_set = { SEP } if SEP present; otherwise sep_set = {}.
• if ε ∈ LAST(tt ...) then LAST(M) = LAST(tt ...) ∪ sep_set
• otherwise, the sequence tt ... must be non-empty; LAST(M) = LAST(tt ...)

• if M is the singleton complex NT repeating zero or one time, M = $( tt ... ) ?, then LAST(M) = LAST(tt ...) ∪ {ε}.

• if M is a delimited token-tree sequence OPEN tt ... CLOSE, then LAST(M) = { CLOSE }.

• if M is a non-empty sequence of token-trees tt uu ...,

• If ε ∈ LAST(uu ...), then LAST(M) = LAST(tt) ∪ (LAST(uu ...) \ { ε }).

• Otherwise, the sequence uu ... must be non-empty; then LAST(M) = LAST(uu ...).

### Examples of FIRST and LAST

Below are some examples of FIRST and LAST. (Note in particular how the special ε element is introduced and eliminated based on the interaction between the pieces of the input.)

Our first example is presented in a tree structure to elaborate on how the analysis of the matcher composes. (Some of the simpler subtrees have been elided.)

INPUT:  $(  $d:ident   $e:expr  );*    $( $( h )* );*    $( f ; )+   g
            ~~~~~~~~   ~~~~~~~                ~
               |          |                   |
FIRST:   { $d:ident }  { $e:expr }          { h }


INPUT:  $(  $d:ident   $e:expr  );*    $( $( h )* );*    $( f ; )+
        ~~~~~~~~~~~~~~~~~~~~~~~            ~~~~~~~           ~~~
                  |                           |               |
FIRST:      { $d:ident }                  { h, ε }          { f }


INPUT:  $(  $d:ident   $e:expr  );*    $( $( h )* );*    $( f ; )+   g
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~    ~~~~~~~~~~~~~~    ~~~~~~~~~   ~
                  |                           |              |       |
FIRST:    { $d:ident, ε }             {  h, ε, ;  }        { f }   { g }


INPUT:  $(  $d:ident   $e:expr  );*    $( $( h )* );*    $( f ; )+   g
        ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                                      |
FIRST:                  { $d:ident, h, ;, f }

Thus:

• FIRST($( $d:ident $e:expr );* $( $(h)* );* $( f ;)+ g) = { $d:ident, h, ;, f }

Note however that:

• FIRST($( $d:ident $e:expr );* $( $(h)* );* $( $( f ;)+ g )*) = { $d:ident, h, ;, f, ε }

Here are similar examples but now for LAST.

• LAST($d:ident $e:expr) = { $e:expr }
• LAST($( $d:ident $e:expr );*) = { $e:expr, ε }
• LAST($( $d:ident $e:expr );* $(h)*) = { $e:expr, ε, h }
• LAST($( $d:ident $e:expr );* $(h)* $( f ;)+) = { ; }
• LAST($( $d:ident $e:expr );* $(h)* $( f ;)+ g) = { g }

### FOLLOW(M)

Finally, the definition for FOLLOW(M) is built up as follows. pat, expr, etc. represent simple nonterminals with the given fragment specifier.

• FOLLOW(pat) = {=>, ,, =, |, if, in}.

• FOLLOW(expr) = FOLLOW(stmt) = {=>, ,, ;}.

• FOLLOW(ty) = FOLLOW(path) = {{, [, ,, =>, :, =, >, >>, ;, |, as, where, block nonterminals}.

• FOLLOW(vis) = {, (comma); any keyword or identifier except a non-raw priv; any token that can begin a type; ident, ty, and path nonterminals}.

• FOLLOW(t) = ANYTOKEN for any other simple token, including block, ident, tt, item, lifetime, literal and meta simple nonterminals, and all terminals.

• FOLLOW(M), for any other M, is defined as the intersection, as t ranges over (LAST(M) \ {ε}), of FOLLOW(t).

The tokens that can begin a type are, as of this writing, {(, [, !, \*, &, &&, ?, lifetimes, >, >>, ::, any non-keyword identifier, super, self, Self, extern, crate, $crate, _, for, impl, fn, unsafe, typeof, dyn}, although this list may not be complete because people won't always remember to update the appendix when new ones are added.

Examples of FOLLOW for complex M:

• FOLLOW($( $d:ident $e:expr )\*) = FOLLOW($e:expr)
• FOLLOW($( $d:ident $e:expr )\* $(;)\*) = FOLLOW($e:expr) ∩ ANYTOKEN = FOLLOW($e:expr)
• FOLLOW($( $d:ident $e:expr )\* $(;)\* $( f |)+) = ANYTOKEN

### Examples of valid and invalid matchers

With the above specification in hand, we can present arguments for why particular matchers are legal and others are not.

• ($ty:ty < foo ,) : illegal, because FIRST(< foo ,) = { < } ⊈ FOLLOW(ty).
• ($ty:ty , foo <) : legal, because FIRST(, foo <) = { , } is ⊆ FOLLOW(ty).

• ($pa:pat $pb:pat $ty:ty ,) : illegal, because FIRST($pb:pat $ty:ty ,) = { $pb:pat } ⊈ FOLLOW(pat), and also FIRST($ty:ty ,) = { $ty:ty } ⊈ FOLLOW(pat).

• ( $($a:tt $b:tt)* ; ) : legal, because FIRST($b:tt) = { $b:tt } is ⊆ FOLLOW(tt) = ANYTOKEN, as is FIRST(;) = { ; }.
• ( $($t:tt),* , $(t:tt),* ) : legal (though any attempt to actually use this macro will signal a local ambiguity error during expansion).

• ($ty:ty $(; not sep)* -) : illegal, because FIRST($(; not sep)* -) = { ;, - } is not in FOLLOW(ty).
• ($($ty:ty)-+) : illegal, because the separator - is not in FOLLOW(ty).
• ($($e:expr)*) : illegal, because expr NTs are not in FOLLOW(expr NT).
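As a concrete illustration, the legal matcher ($ty:ty , foo <) from the first bullet can be used in a real, if contrived, macro (the macro name below is invented for this example): the comma after the ty fragment is in FOLLOW(ty), so rustc accepts the definition.

```rust
// A contrived macro using the legal matcher ($t:ty , foo <) analyzed
// above: the `,` after the `ty` fragment is in FOLLOW(ty), so the
// definition is accepted at macro-definition time.
macro_rules! size_of_ty {
    ($t:ty , foo <) => {
        std::mem::size_of::<$t>()
    };
}

// By contrast, reordering the tokens to ($t:ty < foo ,) is rejected at
// definition time, because `<` is not in FOLLOW(ty).

fn main() {
    assert_eq!(size_of_ty!(u32, foo <), 4);
    assert_eq!(size_of_ty!((u8, u8), foo <), 2);
    println!("ok");
}
```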

# Influences

Rust is not a particularly original language, with design elements coming from a wide range of sources. Some of these are listed below (including elements that have since been removed):

• SML, OCaml: algebraic data types, pattern matching, type inference, semicolon statement separation
• C++: references, RAII, smart pointers, move semantics, monomorphization, memory model
• ML Kit, Cyclone: region based memory management
• Haskell (GHC): typeclasses, type families
• Newsqueak, Alef, Limbo: channels, concurrency
• Swift: optional bindings
• Scheme: hygienic macros
• C#: attributes
• Ruby: block syntax
• NIL, Hermes: typestate
• Unicode Annex #31: identifier and pattern syntax

# Glossary

### Abstract syntax tree

An ‘abstract syntax tree’, or ‘AST’, is an intermediate representation of the structure of the program when the compiler is compiling it.

### Alignment

The alignment of a value specifies which addresses the value is preferred to start at. It is always a power of two. References to a value must be aligned. More.
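A brief sketch of inspecting alignment with the standard `std::mem::align_of` function (exact alignments of wider types vary by target, so only target-independent properties are asserted):

```rust
use std::mem::align_of;

fn main() {
    // Alignment is always a power of two; u8 may start at any address.
    assert_eq!(align_of::<u8>(), 1);
    assert!(align_of::<u64>().is_power_of_two());

    // A reference must point to properly aligned memory.
    let x: u64 = 7;
    let addr = &x as *const u64 as usize;
    assert_eq!(addr % align_of::<u64>(), 0);
}
```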

### Arity

Arity refers to the number of arguments a function or operator takes. For example, f(2, 3) and g(4, 6) have arity 2, while h(8, 2, 6) has arity 3. The ! operator has arity 1.

### Array

An array, sometimes also called a fixed-size array or an inline array, is a value describing a collection of elements, each selected by an index that can be computed at run time by the program. It occupies a contiguous region of memory.
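For example:

```rust
fn main() {
    // A fixed-size array: the length (4) is part of the type.
    let a: [i32; 4] = [10, 20, 30, 40];

    // Elements are selected by an index computed at run time.
    let i = 2 * 1 + 0;
    assert_eq!(a[i], 30);

    // The array occupies a contiguous region of memory.
    assert_eq!(std::mem::size_of_val(&a), 4 * std::mem::size_of::<i32>());
}
```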

### Associated item

An associated item is an item that is associated with another item. Associated items are defined in implementations and declared in traits. Only functions, constants, and type aliases can be associated. Contrast to a free item.
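A sketch showing one associated item of each kind, declared in a trait and defined in an implementation (the trait and type names are invented for this example):

```rust
// A trait declaring one associated item of each allowed kind.
trait Shape {
    const SIDES: u32;               // associated constant
    type Unit;                      // associated type
    fn area(&self) -> Self::Unit;   // associated function (a method)
}

struct Square(f64);

impl Shape for Square {
    const SIDES: u32 = 4;
    type Unit = f64;
    fn area(&self) -> f64 {
        self.0 * self.0
    }
}

fn main() {
    assert_eq!(Square::SIDES, 4);
    assert_eq!(Square(3.0).area(), 9.0);
}
```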

### Bound

Bounds are constraints on a type or trait. For example, if a bound is placed on the argument a function takes, types passed to that function must abide by that constraint.
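For instance, a generic function with a trait bound (a minimal sketch; the function name is invented for this example):

```rust
use std::fmt::Display;

// The bound `T: Display` constrains which types may be passed:
// any argument must implement the Display trait.
fn describe<T: Display>(value: T) -> String {
    format!("value = {}", value)
}

fn main() {
    assert_eq!(describe(42), "value = 42");
    assert_eq!(describe("hi"), "value = hi");
    // describe(vec![1, 2]) would fail to compile:
    // Vec<i32> does not implement Display.
}
```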

### Combinator

Combinators are higher-order functions that apply only functions and earlier defined combinators to provide a result from their arguments. They can be used to manage control flow in a modular fashion.
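The standard library's Option combinators are a familiar example:

```rust
fn main() {
    // `map` and `and_then` on Option are combinators: higher-order
    // functions that build a result by applying other functions.
    let n: Option<i32> = Some(4);
    let result = n
        .map(|x| x * x)            // Some(16)
        .and_then(|x| if x > 10 { Some(x) } else { None });
    assert_eq!(result, Some(16));

    // Control flow short-circuits modularly on None.
    let none: Option<i32> = None;
    assert_eq!(none.map(|x| x * x), None);
}
```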

### Dispatch

Dispatch is the mechanism to determine which specific version of code is actually run when it involves polymorphism. Two major forms of dispatch are static dispatch and dynamic dispatch. While Rust favors static dispatch, it also supports dynamic dispatch through a mechanism called ‘trait objects’.
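A sketch contrasting the two forms (the trait and type names are invented for this example):

```rust
trait Greet {
    fn greet(&self) -> String;
}

struct English;
impl Greet for English {
    fn greet(&self) -> String { "hello".to_string() }
}

// Static dispatch: the compiler monomorphizes a copy per concrete type.
fn greet_static<T: Greet>(g: &T) -> String {
    g.greet()
}

// Dynamic dispatch: the call goes through a trait object's vtable.
fn greet_dynamic(g: &dyn Greet) -> String {
    g.greet()
}

fn main() {
    let e = English;
    assert_eq!(greet_static(&e), "hello");
    assert_eq!(greet_dynamic(&e), "hello");
}
```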

### Dynamically sized type

A dynamically sized type (DST) is a type without a statically known size or alignment.
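Slices and str are the common examples; a DST is used behind a pointer, and that pointer carries the extra size information:

```rust
fn main() {
    // `[i32]` is a dynamically sized type: its size is not known at
    // compile time, so it is used behind a pointer such as `&[i32]`.
    let v = vec![1, 2, 3, 4];
    let s: &[i32] = &v; // a "fat" pointer: address + length

    // The reference itself is twice the size of a thin pointer.
    assert_eq!(
        std::mem::size_of::<&[i32]>(),
        2 * std::mem::size_of::<&i32>()
    );
    // The size of the pointed-to value is known only at run time.
    assert_eq!(std::mem::size_of_val(s), 4 * std::mem::size_of::<i32>());
}
```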

### Expression

An expression is a combination of values, constants, variables, operators and functions that evaluates to a single value, with or without side-effects.

For example, 2 + (3 * 4) is an expression that returns the value 14.

### Free item

An item that is not a member of an implementation, such as a free function or a free const. Contrast to an associated item.

### Inherent implementation

An implementation that applies to a nominal type, not to a trait-type pair. More.

### Inherent method

A method defined in an inherent implementation, not in a trait implementation.
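A sketch covering this entry and the previous one (the type and method names are invented for this example):

```rust
struct Point {
    x: i32,
    y: i32,
}

// An inherent implementation: it applies to the nominal type `Point`
// itself, with no trait involved.
impl Point {
    // An inherent method.
    fn magnitude_squared(&self) -> i32 {
        self.x * self.x + self.y * self.y
    }
}

fn main() {
    let p = Point { x: 3, y: 4 };
    assert_eq!(p.magnitude_squared(), 25);
}
```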

### Initialized

A variable is initialized if it has been assigned a value and hasn't since been moved from. All other memory locations are assumed to be uninitialized. Only unsafe Rust can create such a memory location without initializing it.

### Nominal types

Types that can be referred to by a path directly. Specifically enums, structs, unions, and trait objects.

### Object safe traits

Traits that can be used as trait objects. Only traits that follow specific rules are object safe.
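For instance (the trait and type names are invented for this example):

```rust
// An object safe trait: its methods take &self and do not mention
// Self in ways that would prevent building a vtable.
trait Draw {
    fn draw(&self) -> String;
}

struct Circle;
impl Draw for Circle {
    fn draw(&self) -> String { "circle".to_string() }
}

fn main() {
    // Because Draw is object safe, it can be used as a trait object.
    let shapes: Vec<Box<dyn Draw>> = vec![Box::new(Circle)];
    assert_eq!(shapes[0].draw(), "circle");
    // A trait with e.g. `fn new() -> Self` would not be object safe.
}
```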

### Prelude

Prelude, or The Rust Prelude, is a small collection of items - mostly traits - that are imported into every module of every crate. The traits in the prelude are pervasive.

### Scrutinee

A scrutinee is the expression that is matched on in match expressions and similar pattern matching constructs. For example, in match x { A => 1, B => 2 }, the expression x is the scrutinee.

### Size

The size of a value has two definitions.

The first is that it is how much memory must be allocated to store that value.

The second is that it is the offset in bytes between successive elements in an array with that item type.

It is a multiple of the alignment, including zero. The size can change depending on compiler version (as new optimizations are made) and target platform (similar to how usize varies per-platform).

More.
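Both definitions, and the multiple-of-alignment property, can be observed with `std::mem::size_of`:

```rust
use std::mem::{align_of, size_of};

fn main() {
    // Size is the stride between successive array elements,
    // and is always a multiple of the alignment.
    assert_eq!(size_of::<u16>(), 2);
    assert_eq!(size_of::<[u16; 3]>(), 3 * size_of::<u16>());
    assert_eq!(size_of::<u16>() % align_of::<u16>(), 0);

    // Zero is a valid size: the unit type () occupies no memory.
    assert_eq!(size_of::<()>(), 0);
}
```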

### Slice

A slice is a dynamically-sized view into a contiguous sequence, written as [T].

It is often seen in its borrowed forms, either mutable or shared. The shared slice type is &[T], while the mutable slice type is &mut [T], where T represents the element type.
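For example:

```rust
fn main() {
    let data = [1, 2, 3, 4, 5];

    // A shared slice (&[T]) borrows a view into the array.
    let middle: &[i32] = &data[1..4];
    assert_eq!(middle, &[2, 3, 4]);

    // A mutable slice (&mut [T]) allows modifying the viewed elements.
    let mut owned = vec![1, 2, 3];
    let view: &mut [i32] = &mut owned[..];
    view[0] = 10;
    assert_eq!(owned, vec![10, 2, 3]);
}
```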

### Statement

A statement is the smallest standalone element of a programming language that commands a computer to perform an action.

### String literal

A string literal is a string stored directly in the final binary, and so will be valid for the 'static duration.

Its type is a 'static-duration borrowed string slice, &'static str.

### String slice

A string slice is the most primitive string type in Rust, written as str. It is often seen in its borrowed forms, either mutable or shared. The shared string slice type is &str, while the mutable string slice type is &mut str.

String slices are always valid UTF-8.
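The UTF-8 guarantee means that byte length and character count can differ, and that slicing must respect character boundaries:

```rust
fn main() {
    let s = String::from("héllo");

    // A shared string slice borrows a view of the string's bytes.
    let slice: &str = &s[..];
    assert_eq!(slice.len(), 6);          // length in bytes
    assert_eq!(slice.chars().count(), 5); // length in characters

    // Slicing must fall on UTF-8 character boundaries;
    // &s[..2] would panic because 'é' occupies bytes 1..3.
    assert_eq!(&s[..1], "h");
}
```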

### Trait

A trait is a language item that is used for describing the functionalities a type must provide. It allows a type to make certain promises about its behavior.

Generic functions and generic structs can use traits to constrain, or bound, the types they accept.

### Undefined behavior

Compile-time or run-time behavior that is not specified. This may result in, but is not limited to: process termination or corruption; improper, incorrect, or unintended computation; or platform-specific results. More.