Identifiers

Lexer:
IDENTIFIER_OR_KEYWORD :
      XID_Start XID_Continue*
   | _ XID_Continue+

RAW_IDENTIFIER : r# IDENTIFIER_OR_KEYWORD Except crate, self, super, Self

NON_KEYWORD_IDENTIFIER : IDENTIFIER_OR_KEYWORD Except a strict or reserved keyword

IDENTIFIER :
NON_KEYWORD_IDENTIFIER | RAW_IDENTIFIER

RESERVED_RAW_IDENTIFIER : r#_

Identifiers follow the specification in Unicode Standard Annex #31 for Unicode version 16.0, with the additions described below. Some examples of identifiers:

  • foo
  • _identifier
  • r#true
  • Москва
  • 東京

The profile used from UAX #31 is:

with the additional constraint that a single underscore character is not an identifier.

Note: Identifiers starting with an underscore are typically used to indicate an identifier that is intentionally unused, and will silence the unused warning in rustc.

Identifiers may not be a strict or reserved keyword without the r# prefix described below in raw identifiers.

Zero width non-joiner (ZWNJ U+200C) and zero width joiner (ZWJ U+200D) characters are not allowed in identifiers.

Identifiers are restricted to the ASCII subset of XID_Start and XID_Continue in the following situations:

Normalization

Identifiers are normalized using Normalization Form C (NFC) as defined in Unicode Standard Annex #15. Two identifiers are equal if their NFC forms are equal.

Procedural and declarative macros receive normalized identifiers in their input.

Raw identifiers

A raw identifier is like a normal identifier, but prefixed by r#. (Note that the r# prefix is not included as part of the actual identifier.)

Unlike a normal identifier, a raw identifier may be any strict or reserved keyword except the ones listed above for RAW_IDENTIFIER.

It is an error to use the RESERVED_RAW_IDENTIFIER token r#_ in order to avoid confusion with the WildcardPattern.