Crate rustc_lexer

⚙️ This is an internal compiler API. (rustc_private)

This crate is being loaded from the sysroot, a permanently unstable location for private compiler dependencies. It is not intended for general use. Prefer using a public version of this crate from crates.io via Cargo.toml.

Low-level Rust lexer.

The idea with librustc_lexer is to make a reusable library by separating pure lexing from rustc-specific concerns, like spans, error reporting, and interning. So, rustc_lexer operates directly on &str, produces simple tokens which are a pair of a type tag and a bit of original text, and does not report errors, instead storing them as flags on the token.

Tokens produced by this lexer are not yet ready for parsing Rust syntax. For that, see librustc_parse::lexer, which converts this basic token stream into the wide tokens used by the actual parser.

The purpose of this crate is to convert raw sources into a labeled sequence of well-known token types, so building an actual Rust token stream will be easier.

The main entity of this crate is the TokenKind enum, which represents common lexeme types.

Modules

cursor
unescape

Utilities for validating string and char literals and turning them into values they represent.

Structs

Token

Parsed token. It doesn't contain information about data that has been parsed, only the type of the token and its size.
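Because a token carries only its kind and its size, the text it covers has to be recovered by slicing the original input. The following is a minimal sketch of that idea; `SketchKind`, `SketchToken`, and `token_texts` are hypothetical names for illustration and are not this crate's actual types.

```rust
/// Hypothetical, simplified token kinds (not the crate's real `TokenKind`).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchKind {
    Whitespace,
    Ident,
}

/// A token that stores only its kind and its length in bytes.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct SketchToken {
    kind: SketchKind,
    len: usize,
}

/// Pairs each token with the slice of `src` it covers by accumulating lengths.
/// Assumes the token lengths sum to at most `src.len()` and fall on char boundaries.
fn token_texts<'a>(src: &'a str, tokens: &[SketchToken]) -> Vec<(SketchKind, &'a str)> {
    let mut offset = 0;
    let mut out = Vec::with_capacity(tokens.len());
    for tok in tokens {
        out.push((tok.kind, &src[offset..offset + tok.len]));
        offset += tok.len;
    }
    out
}
```

Keeping tokens this small means the lexer never copies source text; a consumer that needs the text tracks a running offset, as `token_texts` does.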

UnvalidatedRawStr

Represents something that looks like a raw string, but may have some problems. Use .validate() to convert it into something usable.

ValidatedRawStr

Raw string that contains a valid prefix (#+") and postfix ("#+) with a matching number of # characters in both. Note that this will not consume extra trailing # characters: r###"abcde"#### is lexed as a ValidatedRawStr { n_hashes: 3 } followed by a # token.
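The hash-matching rule can be illustrated with a small standalone scanner. This is only a sketch under simplifying assumptions: `scan_raw_str` is a hypothetical helper, it expects the leading `r`, and it returns `None` instead of the richer error information the crate's types carry.

```rust
/// Returns `(n_hashes, consumed_bytes)` for a raw string at the start of
/// `input`, or `None` if the input does not form a complete raw string.
fn scan_raw_str(input: &str) -> Option<(usize, usize)> {
    let rest = input.strip_prefix('r')?;
    // Count the `#` characters that open the literal.
    let n_hashes = rest.bytes().take_while(|&b| b == b'#').count();
    let body = rest[n_hashes..].strip_prefix('"')?;
    // The closing delimiter is a quote followed by as many hashes as were opened.
    let closer = format!("\"{}", "#".repeat(n_hashes));
    let end = body.find(closer.as_str())?;
    // 'r' + opening hashes + opening quote + body + closing delimiter.
    Some((n_hashes, 1 + n_hashes + 1 + end + closer.len()))
}
```

For the input `r###"abcde"####`, this sketch consumes everything except the final `#`, which is left for the next token, matching the behavior described above.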

Enums

Base

Base of numeric literal encoding according to its prefix.
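As an illustration of how a prefix determines the base, here is a minimal sketch. `SketchBase` and `base_of` are hypothetical names, and the sketch assumes the usual Rust prefixes `0b`, `0o`, and `0x`, with everything else treated as decimal.

```rust
/// Hypothetical stand-in for a numeric base enum.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SketchBase {
    Binary,
    Octal,
    Decimal,
    Hexadecimal,
}

/// Chooses a base from the literal's prefix; no prefix means decimal.
fn base_of(literal: &str) -> SketchBase {
    if literal.starts_with("0b") {
        SketchBase::Binary
    } else if literal.starts_with("0o") {
        SketchBase::Octal
    } else if literal.starts_with("0x") {
        SketchBase::Hexadecimal
    } else {
        SketchBase::Decimal
    }
}
```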

LexRawStrError

Error produced validating a raw string. Represents cases like:

LiteralKind
TokenKind

Enum representing common lexeme types.

Functions

first_token

Parses the first token from the provided input string.

is_id_continue

True if c is valid as a non-first character of an identifier. See the Rust language reference for a formal definition of a valid identifier name.

is_id_start

True if c is valid as a first character of an identifier. See the Rust language reference for a formal definition of a valid identifier name.
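The split between "start" and "continue" characters can be sketched for the ASCII subset only. This is a deliberate simplification: the formal definition in the language reference also admits non-ASCII characters, which this sketch rejects. All three function names are hypothetical.

```rust
/// ASCII-only approximation: a letter or underscore may start an identifier.
fn is_ascii_id_start(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_'
}

/// ASCII-only approximation: letters, digits, and underscore may follow.
fn is_ascii_id_continue(c: char) -> bool {
    c.is_ascii_alphanumeric() || c == '_'
}

/// Checks the first character with the "start" rule and the rest with the
/// "continue" rule. An empty string is not an identifier.
fn looks_like_ascii_ident(s: &str) -> bool {
    let mut chars = s.chars();
    match chars.next() {
        Some(first) => is_ascii_id_start(first) && chars.all(is_ascii_id_continue),
        None => false,
    }
}
```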

is_whitespace

True if c is considered whitespace according to the Rust language definition. See the Rust language reference for definitions of these classes.
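For illustration, a whitespace check of this kind can be written as a single `matches!` over a fixed set of code points. The sketch below assumes the set is the Unicode `Pattern_White_Space` property; `is_pattern_white_space` is a hypothetical name, not this crate's function.

```rust
/// Sketch of a whitespace predicate, assuming the Unicode
/// `Pattern_White_Space` set is the intended definition.
fn is_pattern_white_space(c: char) -> bool {
    matches!(
        c,
        '\u{0009}'   // horizontal tab
        | '\u{000A}' // line feed
        | '\u{000B}' // vertical tab
        | '\u{000C}' // form feed
        | '\u{000D}' // carriage return
        | '\u{0020}' // space
        | '\u{0085}' // next line
        | '\u{200E}' // left-to-right mark
        | '\u{200F}' // right-to-left mark
        | '\u{2028}' // line separator
        | '\u{2029}' // paragraph separator
    )
}
```

Note that this set is narrower than `char::is_whitespace`: for example, the no-break space U+00A0 is not included.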

strip_shebang

rustc allows files to have a shebang, e.g. "#!/usr/bin/rustrun", but a shebang isn't part of Rust syntax.
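A shebang check has to avoid mistaking an inner attribute such as `#![allow(unused)]` for a shebang. The sketch below shows one simple way to do that; `shebang_len` is a hypothetical helper, and the rule it uses (a first line starting with `#!` but not `#![`) is an assumption for illustration, not a statement of this crate's exact logic.

```rust
/// Returns the byte length of a leading shebang line (without its newline),
/// or `None` if the input does not appear to start with one.
fn shebang_len(input: &str) -> Option<usize> {
    // Assumption: `#![` begins an inner attribute, so it is not a shebang.
    if input.starts_with("#!") && !input.starts_with("#![") {
        // The shebang runs to the end of the first line.
        Some(input.find('\n').unwrap_or(input.len()))
    } else {
        None
    }
}
```

A caller could then lex `&input[len..]` so that the shebang never reaches the tokenizer.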

tokenize

Creates an iterator that produces tokens from the input string.
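One plausible way to build such an iterator is to repeatedly take the first token and advance past it. The sketch below demonstrates the pattern with `std::iter::from_fn` and a toy classifier; `ToyKind`, `ToyToken`, `toy_first_token`, and `toy_tokenize` are hypothetical and far simpler than the real lexer.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum ToyKind {
    Whitespace,
    Word,
    Other,
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct ToyToken {
    kind: ToyKind,
    len: usize,
}

fn classify(c: char) -> ToyKind {
    if c.is_whitespace() {
        ToyKind::Whitespace
    } else if c.is_alphanumeric() || c == '_' {
        ToyKind::Word
    } else {
        ToyKind::Other
    }
}

/// Classifies the first token of a non-empty `input`.
fn toy_first_token(input: &str) -> ToyToken {
    let first = input.chars().next().expect("input must be non-empty");
    let kind = classify(first);
    let len = match kind {
        // Punctuation-like characters are emitted one at a time.
        ToyKind::Other => first.len_utf8(),
        // Runs of whitespace or word characters are merged into one token.
        _ => input
            .chars()
            .take_while(|&c| classify(c) == kind)
            .map(char::len_utf8)
            .sum::<usize>(),
    };
    ToyToken { kind, len }
}

/// Yields tokens lazily by taking the first token and slicing past it.
fn toy_tokenize(mut input: &str) -> impl Iterator<Item = ToyToken> + '_ {
    std::iter::from_fn(move || {
        if input.is_empty() {
            return None;
        }
        let token = toy_first_token(input);
        input = &input[token.len..];
        Some(token)
    })
}
```

Since every token has a non-zero length, the remaining input shrinks on each step and the iterator terminates once the input is empty.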