Module syntax::ext::tt::macro_parser

⚙️ This is an internal compiler API. (rustc_private)

This crate is being loaded from the sysroot, a permanently unstable location for private compiler dependencies. It is not intended for general use. Prefer using a public version of this crate from crates.io via Cargo.toml.

This is an NFA-based parser, which calls out to the main Rust parser for named non-terminals (which it commits to fully when it hits one in a grammar). There's a set of current NFA threads and a set of next ones. Instead of NTs, we have a special case for Kleene star. The big-O, in pathological cases, is worse than traditional use of NFA or Earley parsing, but it's an easier fit for Macro-by-Example-style rules.

(In order to prevent the pathological case, we'd need to lazily construct the resulting NamedMatches at the very end. It'd be a pain, and require more memory to keep around old items, but it would also save overhead.)

We don't say this parser uses the Earley algorithm, because it's unnecessarily inaccurate. The macro parser restricts itself to the features of finite state automata. Earley parsers can be described as an extension of NFAs with completion rules, prediction rules, and recursion.

Quick intro to how the parser works:

A 'position' is a location in the middle of a matcher, usually written as a dot (·). For example · a $( a )* a b is a position, as is a $( · a )* a b.

The parser walks through the input a token at a time, maintaining a list of threads consistent with the current position in the input: cur_items.

As it processes them, it fills up eof_items with threads that would be valid if the macro invocation is now over, bb_items with threads that are waiting on a Rust non-terminal like $e:expr, and next_items with threads that are waiting on a particular token. Most of the logic concerns moving the · through the repetitions indicated by Kleene stars. The rules for moving the · without consuming any input are called epsilon transitions. It only advances or calls out to the real Rust parser when no cur_items threads remain.

Example:

Start parsing a a a a b against [· a $( a )* a b].

Remaining input: a a a a b
next: [· a $( a )* a b]

- - - Advance over an a. - - -

Remaining input: a a a b
cur: [a · $( a )* a b]
Descend/Skip (first item).
next: [a $( · a )* a b]  [a $( a )* · a b].

- - - Advance over an a. - - -

Remaining input: a a b
cur: [a $( a · )* a b]  [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first item)
next: [a $( a )* · a b]  [a $( · a )* a b]  [a $( a )* a · b]

- - - Advance over an a. - - - (this looks exactly like the last step)

Remaining input: a b
cur: [a $( a · )* a b]  [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first item)
next: [a $( a )* · a b]  [a $( · a )* a b]  [a $( a )* a · b]

- - - Advance over an a. - - - (this looks exactly like the last step)

Remaining input: b
cur: [a $( a · )* a b]  [a $( a )* a · b]
Follow epsilon transition: Finish/Repeat (first item)
next: [a $( a )* · a b]  [a $( · a )* a b]  [a $( a )* a · b]

- - - Advance over a b. - - -

Remaining input: ''
eof: [a $( a )* a b ·]
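The walk-through above can be sketched as a small thread simulation. This is a toy model, not the compiler's implementation: ToyTt, ToyPos, enclosing, with_dot, doc_matcher and toy_matches are hypothetical names, tokens are plain strings, there are no non-terminals (so no bb_items), and a per-step seen set is added so that empty repetition bodies cannot loop forever.

```rust
use std::collections::HashSet;

/// Toy matcher grammar (hypothetical): literal tokens and `$( ... )*` only.
#[derive(Debug)]
enum ToyTt {
    Tok(&'static str),
    Star(Vec<ToyTt>),
}

/// A position: indices of the enclosing repetitions, then the index of the
/// dot in the innermost sequence. `[1, 0]` is `a $( · a )* a b`.
type ToyPos = Vec<usize>;

/// Returns the sequence that directly contains the dot of `pos`.
fn enclosing<'a>(ms: &'a [ToyTt], pos: &[usize]) -> &'a [ToyTt] {
    let mut seq = ms;
    for &i in &pos[..pos.len() - 1] {
        match &seq[i] {
            ToyTt::Star(inner) => seq = inner.as_slice(),
            ToyTt::Tok(_) => unreachable!("a path only descends into repetitions"),
        }
    }
    seq
}

/// Returns a copy of `pos` with the dot moved to index `idx`.
fn with_dot(pos: &[usize], idx: usize) -> ToyPos {
    let mut p = pos.to_vec();
    *p.last_mut().unwrap() = idx;
    p
}

/// The matcher from the example: `a $( a )* a b`.
fn doc_matcher() -> Vec<ToyTt> {
    vec![
        ToyTt::Tok("a"),
        ToyTt::Star(vec![ToyTt::Tok("a")]),
        ToyTt::Tok("a"),
        ToyTt::Tok("b"),
    ]
}

/// Returns true if `input` matches `ms` in full.
fn toy_matches(ms: &[ToyTt], input: &[&str]) -> bool {
    let mut cur: Vec<ToyPos> = vec![vec![0]];
    let mut tokens = input.iter();
    loop {
        let mut next: Vec<ToyPos> = Vec::new(); // threads waiting on a token
        let mut eof = false; // true if some thread sits at the very end
        let mut seen: HashSet<ToyPos> = HashSet::new();
        // Follow epsilon transitions until no current threads remain.
        while let Some(pos) = cur.pop() {
            if !seen.insert(pos.clone()) {
                continue;
            }
            let seq = enclosing(ms, &pos);
            let idx = *pos.last().unwrap();
            if idx == seq.len() {
                if pos.len() == 1 {
                    eof = true;
                } else {
                    let parent = &pos[..pos.len() - 1];
                    cur.push(with_dot(parent, parent[parent.len() - 1] + 1)); // Finish
                    cur.push(with_dot(&pos, 0)); // Repeat
                }
            } else {
                match &seq[idx] {
                    ToyTt::Tok(_) => next.push(pos),
                    ToyTt::Star(_) => {
                        cur.push(with_dot(&pos, idx + 1)); // Skip
                        let mut down = pos.clone();
                        down.push(0);
                        cur.push(down); // Descend
                    }
                }
            }
        }
        // Only now consume one token and advance the threads waiting on it.
        let tok = match tokens.next() {
            Some(tok) => *tok,
            None => return eof,
        };
        for pos in next {
            let idx = *pos.last().unwrap();
            if matches!(&enclosing(ms, &pos)[idx], ToyTt::Tok(t) if *t == tok) {
                cur.push(with_dot(&pos, idx + 1));
            }
        }
        if cur.is_empty() {
            return false;
        }
    }
}
```

Calling toy_matches(&doc_matcher(), &["a", "a", "a", "a", "b"]) follows the same Descend/Skip and Finish/Repeat steps as the trace, while an input such as a b is rejected because no thread is waiting on b after the first a.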

Re-exports

pub use NamedMatch::*;
pub use ParseResult::*;

Structs

MatcherPos

Represents a single "position" (aka "matcher position", aka "item"), as described in the module documentation.

MatcherTtFrame

An unzipping of TokenTrees... see the stack field of MatcherPos.

Enums

MatcherPosHandle
NamedMatch

NamedMatch is a pattern-match result for a single token::MATCH_NONTERMINAL: so it is associated with a single ident in a parse, and all MatchedNonterminals in the NamedMatch have the same non-terminal type (expr, item, etc). Each leaf in a single NamedMatch corresponds to a single token::MATCH_NONTERMINAL in the TokenTree that produced it.
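The tree shape described here can be illustrated with a simplified model. This is a sketch under assumptions: ToyNamedMatch, nt, leaves and depth are hypothetical, the leaf holds plain text in place of a parsed non-terminal, and the real type carries more data than this.

```rust
/// Simplified stand-in for a match result bound to one metavariable.
#[derive(Debug)]
enum ToyNamedMatch {
    /// One entry per iteration of an enclosing `$( ... )*`.
    MatchedSeq(Vec<ToyNamedMatch>),
    /// A leaf: the text of one matched non-terminal (an assumption; the
    /// compiler stores a parsed fragment, not a string).
    MatchedNonterminal(String),
}

/// Convenience constructor for a leaf.
fn nt(text: &str) -> ToyNamedMatch {
    ToyNamedMatch::MatchedNonterminal(text.to_string())
}

/// Collects the leaves in source order.
fn leaves(m: &ToyNamedMatch) -> Vec<&str> {
    match m {
        ToyNamedMatch::MatchedNonterminal(text) => vec![text.as_str()],
        ToyNamedMatch::MatchedSeq(items) => {
            items.iter().flat_map(|item| leaves(item)).collect()
        }
    }
}

/// Number of repetition levels wrapped around the leaves.
fn depth(m: &ToyNamedMatch) -> usize {
    match m {
        ToyNamedMatch::MatchedNonterminal(_) => 0,
        ToyNamedMatch::MatchedSeq(items) => {
            1 + items.iter().map(|item| depth(item)).max().unwrap_or(0)
        }
    }
}
```

For a matcher such as $( $( $x:expr ),* );* applied to 1, 2; 3, the binding for x would be modeled as MatchedSeq([MatchedSeq([1, 2]), MatchedSeq([3])]): two levels of sequence, with every leaf being an expr.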

ParseResult

Represents the possible results of an attempted parse.

TokenTreeOrTokenTreeSlice

Either a sequence of token trees or a single one. This is used as the representation of the sequence of tokens that make up a matcher.

Functions

count_names

Count how many metavars are named in the given matcher ms.
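A minimal sketch of such a count over a toy matcher tree, assuming the count simply recurses into repetitions and delimited groups. ToyMatcherTt and toy_count_names are hypothetical and only mirror the idea, not the compiler's token tree types.

```rust
/// Toy matcher token tree (hypothetical, much simpler than the real one).
#[allow(dead_code)]
enum ToyMatcherTt {
    /// A literal token such as `,` or `=>`.
    Token(&'static str),
    /// A metavariable declaration such as `$e:expr`.
    MetaVarDecl { name: &'static str, kind: &'static str },
    /// A repetition `$( ... )*`.
    Sequence(Vec<ToyMatcherTt>),
    /// A delimited group `( ... )`, `[ ... ]` or `{ ... }`.
    Delimited(Vec<ToyMatcherTt>),
}

/// Counts metavariable declarations anywhere in the matcher.
fn toy_count_names(ms: &[ToyMatcherTt]) -> usize {
    ms.iter()
        .map(|tt| match tt {
            ToyMatcherTt::Token(_) => 0,
            ToyMatcherTt::MetaVarDecl { .. } => 1,
            ToyMatcherTt::Sequence(inner) | ToyMatcherTt::Delimited(inner) => {
                toy_count_names(inner)
            }
        })
        .sum()
}
```

For the matcher $a:expr , $( $b:ident )* ( $c:ty ) this counts three names: a, b and c.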

create_matches

Creates len Vecs (initially shared and empty) that will store matches of metavars.
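One way to get vectors that start out "shared and empty" is reference counting with copy-on-write. The sketch below assumes that approach: ToyMatchVec, toy_create_matches and toy_push_match are hypothetical, and matches are stored as plain strings.

```rust
use std::rc::Rc;

/// Toy stand-in for the per-metavar match storage.
type ToyMatchVec = Vec<String>;

/// Creates `len` slots that all point at one shared empty vector.
fn toy_create_matches(len: usize) -> Vec<Rc<ToyMatchVec>> {
    let empty = Rc::new(ToyMatchVec::new());
    vec![empty; len]
}

/// Records a match in slot `idx`. `Rc::make_mut` clones the vector first
/// if it is still shared, so the other slots are left untouched.
fn toy_push_match(matches: &mut [Rc<ToyMatchVec>], idx: usize, m: &str) {
    Rc::make_mut(&mut matches[idx]).push(m.to_string());
}
```

The design choice this illustrates: creating the slots costs a single allocation, and a slot only gets its own vector once something is pushed into it.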

get_macro_name

The token is an identifier, but not _. We prohibit passing _ to macros expecting ident for now.

initial_matcher_pos

Generates the top-level matcher position in which the "dot" is before the first token of the matcher ms and we are going to start matching at the span open in the source.

inner_parse_loop

Process the matcher positions of cur_items until it is empty. In the process, this will produce more items in next_items, eof_items, and bb_items.

may_begin_with

Checks whether a non-terminal may begin with a particular token.

nameize

Takes a sequence of token trees ms representing a matcher which successfully matched input and an iterator of items that matched input and produces a NamedParseResult.

parse

Use the given sequence of token trees (ms) as a matcher. Match the given token stream tts against it and return the match.

parse_failure_msg

Generates an appropriate parsing failure message. For EOF, this is "unexpected end...". For other tokens, this is "unexpected token...".

parse_nt

A call to the "black-box" parser to parse some Rust non-terminal.

token_name_eq

Performs a token equality check, ignoring syntax context (that is, an unhygienic comparison).
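To illustrate what ignoring the syntax context means, here is a toy comparison. ToyIdent, ToyToken and toy_token_name_eq are hypothetical; the ctxt field is a plain number standing in for hygiene information.

```rust
/// Toy identifier: a name plus a number standing in for its syntax context.
#[derive(Clone, Copy, Debug, PartialEq)]
struct ToyIdent {
    name: &'static str,
    ctxt: u32,
}

/// Toy token kinds (hypothetical, far fewer than the real token type).
#[derive(Clone, Copy, Debug, PartialEq)]
enum ToyToken {
    Ident(ToyIdent),
    Lifetime(ToyIdent),
    Punct(char),
}

/// Compares identifiers and lifetimes by name only; every other token
/// falls back to ordinary equality.
fn toy_token_name_eq(a: &ToyToken, b: &ToyToken) -> bool {
    match (a, b) {
        (ToyToken::Ident(x), ToyToken::Ident(y))
        | (ToyToken::Lifetime(x), ToyToken::Lifetime(y)) => x.name == y.name,
        _ => a == b,
    }
}
```

Two x identifiers with different ctxt values are unequal under the derived ==, but equal under toy_token_name_eq.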

Type Definitions

NamedMatchVec
NamedParseResult

A ParseResult where the Success variant contains a mapping of Idents to NamedMatches. This represents the mapping of metavars to the token trees they bind to.