Notation

Grammar

The following notations are used by the Lexer and Syntax grammar snippets:

Notation         Examples                            Meaning
CAPITAL          KW_IF, INTEGER_LITERAL              A token produced by the lexer
ItalicCamelCase  LetStatement, Item                  A syntactical production
string           x, while, *                         The exact character(s)
x?               pub?                                An optional item
x*               OuterAttribute*                     0 or more of x
x+               MacroMatch+                         1 or more of x
x^a..b           HEX_DIGIT^1..6                      a to b repetitions of x
Rule1 Rule2      fn Name Parameters                  Sequence of rules in order
|                u8 | u16, Block | Item              Either one or another
[ ]              [b B]                               Any of the characters listed
[ - ]            [a-z]                               Any of the characters in the range
~[ ]             ~[b B]                              Any characters, except those listed
~string          ~\n, ~*/                            Any characters, except this sequence
( )              (, Parameter)?                      Groups items
U+xxxx           U+0060                              A single unicode character
<text>           <any ASCII char except CR>          An English description of what should be matched
Rule suffix      IDENTIFIER_OR_KEYWORD except crate  A modification to the previous rule

Sequences have a higher precedence than | alternation.
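
As an illustration only, two of the notations above can be read as hand-written matchers. The sketch below is not how any real lexer or parser is implemented. The function names and the sample rule `a b | c` are hypothetical, and the bounds of the repetition notation are assumed to be inclusive.

```rust
/// Matches the exact string `s` at the start of `input` and returns the rest.
fn lit<'a>(input: &'a str, s: &str) -> Option<&'a str> {
    input.strip_prefix(s)
}

/// Hypothetical rule written as: a b | c
/// Sequences bind tighter than `|`, so this reads as (a b) | c, not a (b | c).
fn seq_then_alt(input: &str) -> Option<&str> {
    lit(input, "a")
        .and_then(|rest| lit(rest, "b")) // the sequence: a followed by b
        .or_else(|| lit(input, "c")) // the alternative: c on its own
}

/// Matches HEX_DIGIT^1..6 at the start of `input`: at least 1 and at most 6
/// hexadecimal digits (both bounds assumed inclusive).
/// Returns the number of digits consumed, or `None` if there are none.
fn hex_digits_1_to_6(input: &str) -> Option<usize> {
    let count = input
        .chars()
        .take(6) // never consume more than the upper bound
        .take_while(|c| c.is_ascii_hexdigit())
        .count();
    if count >= 1 { Some(count) } else { None }
}
```

Under these assumptions, `seq_then_alt("ac")` returns `None`, whereas the other grouping, a (b | c), would have accepted that input.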

String table productions

Some rules in the grammar (notably unary operators, binary operators, and keywords) are given in a simplified form: as a listing of printable strings. These cases form a subset of the token rules and are assumed to be the result of a lexical-analysis phase that feeds the parser, driven by a DFA operating over the disjunction of all such string table entries.

When such a string in monospace font occurs inside the grammar, it is an implicit reference to a single member of such a string table production. See tokens for more information.
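
To make the idea of matching against a string table concrete, here is a minimal sketch. It substitutes a linear scan for the DFA mentioned above and assumes longest-match (maximal munch) behavior. `STRING_TABLE` is a small hypothetical subset chosen for illustration, and `longest_match` is a made-up helper name.

```rust
/// Hypothetical subset of a string table (an assumption for illustration;
/// the actual entries are defined by the token rules).
const STRING_TABLE: &[&str] = &["<", "<<", "<<=", "<=", "=", "==", "if", "while"];

/// Returns the longest table entry that is a prefix of `input`, if any.
/// A linear scan stands in for the DFA here.
fn longest_match(input: &str) -> Option<&'static str> {
    STRING_TABLE
        .iter()
        .copied()
        .filter(|entry| input.starts_with(*entry))
        .max_by_key(|entry| entry.len())
}
```

With this table, `longest_match("<<= 1")` picks `<<=` rather than `<` or `<<`. The sketch ignores identifier boundaries, so it would also report `if` for the input `iffy`, which a real lexer has to rule out separately.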

Railroad visualizations

Below each grammar block is a button to toggle the display of a railroad diagram. A square element is a non-terminal rule, and a rounded rectangle is a terminal.

Common productions

The following are common definitions used in the grammar.

Lexer
CHAR → <a Unicode scalar value>

NUL → U+0000

TAB → U+0009

LF → U+000A

CR → U+000D
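
For readers who prefer code, the productions above correspond roughly to the following Rust values. This is a sketch for orientation: the constant names mirror the grammar, and `is_char` is a hypothetical helper, not part of any API.

```rust
// The four named code points from the Lexer productions.
const NUL: char = '\u{0000}';
const TAB: char = '\u{0009}';
const LF: char = '\u{000A}';
const CR: char = '\u{000D}';

/// CHAR is any Unicode scalar value: every code point up to U+10FFFF
/// except the surrogate range U+D800 to U+DFFF. Rust's `char` type has
/// the same domain, so `char::from_u32` can serve as the check.
fn is_char(code_point: u32) -> bool {
    char::from_u32(code_point).is_some()
}
```

For example, `is_char(0x41)` holds, while `is_char(0xD800)` does not because U+D800 is a surrogate.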
