Primitive Type char []

Character manipulation (char type, Unicode Scalar Value)

This module provides the CharExt trait, as well as its implementation for the primitive char type, in order to allow basic character manipulation.

A char actually represents a Unicode Scalar Value, as it can contain any Unicode code point except high-surrogate and low-surrogate code points.

As such, only values in the ranges [0x0,0xD7FF] and [0xE000,0x10FFFF] (inclusive) are allowed. A char can always be safely cast to a u32; however the converse is not always true due to the above range limits and, as such, should be performed via the from_u32 function.

Methods

impl char

fn is_digit(self, radix: u32) -> bool

Checks if a char parses as a numeric digit in the given radix.

Compared to is_numeric(), this function only recognizes the characters 0-9, a-z and A-Z.

Return value

Returns true if c is a valid digit under radix, and false otherwise.

Panics

Panics if given a radix > 36.

Examples

fn main() { let c = '1'; assert!(c.is_digit(10)); assert!('f'.is_digit(16)); }
let c = '1';

assert!(c.is_digit(10));

assert!('f'.is_digit(16));

fn to_digit(self, radix: u32) -> Option<u32>

Converts a character to the corresponding digit.

Return value

If c is between '0' and '9', the corresponding value between 0 and 9. If c is 'a' or 'A', 10. If c is 'b' or 'B', 11, etc. Returns none if the character does not refer to a digit in the given radix.

Panics

Panics if given a radix outside the range [0..36].

Examples

fn main() { let c = '1'; assert_eq!(c.to_digit(10), Some(1)); assert_eq!('f'.to_digit(16), Some(15)); }
let c = '1';

assert_eq!(c.to_digit(10), Some(1));

assert_eq!('f'.to_digit(16), Some(15));

fn escape_unicode(self) -> EscapeUnicode

Returns an iterator that yields the hexadecimal Unicode escape of a character, as chars.

All characters are escaped with Rust syntax of the form \\u{NNNN} where NNNN is the shortest hexadecimal representation of the code point.

Examples

fn main() { for i in '❤'.escape_unicode() { println!("{}", i); } }
for i in '❤'.escape_unicode() {
    println!("{}", i);
}

This prints:

\
u
{
2
7
6
4
}

Collecting into a String:

fn main() { let heart: String = '❤'.escape_unicode().collect(); assert_eq!(heart, r"\u{2764}"); }
let heart: String = '❤'.escape_unicode().collect();

assert_eq!(heart, r"\u{2764}");

fn escape_default(self) -> EscapeDefault

Returns an iterator that yields the 'default' ASCII and C++11-like literal escape of a character, as chars.

The default is chosen with a bias toward producing literals that are legal in a variety of languages, including C++11 and similar C-family languages. The exact rules are:

  • Tab, CR and LF are escaped as '\t', '\r' and '\n' respectively.
  • Single-quote, double-quote and backslash chars are backslash- escaped.
  • Any other chars in the range [0x20,0x7e] are not escaped.
  • Any other chars are given hex Unicode escapes; see escape_unicode.

Examples

fn main() { for i in '"'.escape_default() { println!("{}", i); } }
for i in '"'.escape_default() {
    println!("{}", i);
}

This prints:

\
"

Collecting into a String:

fn main() { let quote: String = '"'.escape_default().collect(); assert_eq!(quote, "\\\""); }
let quote: String = '"'.escape_default().collect();

assert_eq!(quote, "\\\"");

fn len_utf8(self) -> usize

Returns the number of bytes this character would need if encoded in UTF-8.

Examples

fn main() { let n = 'ß'.len_utf8(); assert_eq!(n, 2); }
let n = 'ß'.len_utf8();

assert_eq!(n, 2);

fn len_utf16(self) -> usize

Returns the number of 16-bit code units this character would need if encoded in UTF-16.

Examples

fn main() { let n = 'ß'.len_utf16(); assert_eq!(n, 1); }
let n = 'ß'.len_utf16();

assert_eq!(n, 1);

fn encode_utf8(self, dst: &mut [u8]) -> Option<usize>

Unstable

: pending decision about Iterator/Writer/Reader

Encodes this character as UTF-8 into the provided byte buffer, and then returns the number of bytes written.

If the buffer is not large enough, nothing will be written into it and a None will be returned. A buffer of length four is large enough to encode any char.

Examples

In both of these examples, 'ß' takes two bytes to encode.

#![feature(unicode)] fn main() { let mut b = [0; 2]; let result = 'ß'.encode_utf8(&mut b); assert_eq!(result, Some(2)); }
let mut b = [0; 2];

let result = 'ß'.encode_utf8(&mut b);

assert_eq!(result, Some(2));

A buffer that's too small:

#![feature(unicode)] fn main() { let mut b = [0; 1]; let result = 'ß'.encode_utf8(&mut b); assert_eq!(result, None); }
let mut b = [0; 1];

let result = 'ß'.encode_utf8(&mut b);

assert_eq!(result, None);

fn encode_utf16(self, dst: &mut [u16]) -> Option<usize>

Unstable

: pending decision about Iterator/Writer/Reader

Encodes this character as UTF-16 into the provided u16 buffer, and then returns the number of u16s written.

If the buffer is not large enough, nothing will be written into it and a None will be returned. A buffer of length 2 is large enough to encode any char.

Examples

In both of these examples, 'ß' takes one u16 to encode.

#![feature(unicode)] fn main() { let mut b = [0; 1]; let result = 'ß'.encode_utf16(&mut b); assert_eq!(result, Some(1)); }
let mut b = [0; 1];

let result = 'ß'.encode_utf16(&mut b);

assert_eq!(result, Some(1));

A buffer that's too small:

#![feature(unicode)] fn main() { let mut b = [0; 0]; let result = 'ß'.encode_utf8(&mut b); assert_eq!(result, None); }
let mut b = [0; 0];

let result = 'ß'.encode_utf8(&mut b);

assert_eq!(result, None);

fn is_alphabetic(self) -> bool

Returns whether the specified character is considered a Unicode alphabetic code point.

fn is_xid_start(self) -> bool

Unstable

: mainly needed for compiler internals

Returns whether the specified character satisfies the 'XID_Start' Unicode property.

'XID_Start' is a Unicode Derived Property specified in UAX #31, mostly similar to ID_Start but modified for closure under NFKx.

fn is_xid_continue(self) -> bool

Unstable

: mainly needed for compiler internals

Returns whether the specified char satisfies the 'XID_Continue' Unicode property.

'XID_Continue' is a Unicode Derived Property specified in UAX #31, mostly similar to 'ID_Continue' but modified for closure under NFKx.

fn is_lowercase(self) -> bool

Indicates whether a character is in lowercase.

This is defined according to the terms of the Unicode Derived Core Property Lowercase.

fn is_uppercase(self) -> bool

Indicates whether a character is in uppercase.

This is defined according to the terms of the Unicode Derived Core Property Uppercase.

fn is_whitespace(self) -> bool

Indicates whether a character is whitespace.

Whitespace is defined in terms of the Unicode Property White_Space.

fn is_alphanumeric(self) -> bool

Indicates whether a character is alphanumeric.

Alphanumericness is defined in terms of the Unicode General Categories 'Nd', 'Nl', 'No' and the Derived Core Property 'Alphabetic'.

fn is_control(self) -> bool

Indicates whether a character is a control code point.

Control code points are defined in terms of the Unicode General Category Cc.

fn is_numeric(self) -> bool

Indicates whether the character is numeric (Nd, Nl, or No).

fn to_lowercase(self) -> ToLowercase

Converts a character to its lowercase equivalent.

The case-folding performed is the common or simple mapping. See to_uppercase() for references and more information.

Return value

Returns an iterator which yields the characters corresponding to the lowercase equivalent of the character. If no conversion is possible then the input character is returned.

fn to_uppercase(self) -> ToUppercase

Converts a character to its uppercase equivalent.

The case-folding performed is the common or simple mapping: it maps one Unicode codepoint to its uppercase equivalent according to the Unicode database 1. The additional [SpecialCasing.txt] is not yet considered here, but the iterator returned will soon support this form of case folding.

A full reference can be found here 2.

Return value

Returns an iterator which yields the characters corresponding to the uppercase equivalent of the character. If no conversion is possible then the input character is returned.

fn width(self, is_cjk: bool) -> Option<usize>

Deprecated since 1.0.0

: use the crates.io unicode-width library instead

Returns this character's displayed width in columns, or None if it is a control character other than '\x00'.

is_cjk determines behavior for characters in the Ambiguous category: if is_cjk is true, these are 2 columns wide; otherwise, they are 1. In CJK contexts, is_cjk should be true, else it should be false. Unicode Standard Annex #11 recommends that these characters be treated as 1 column (i.e., is_cjk = false) if the context cannot be reliably determined.

Trait Implementations

impl PartialEq<char> for char

fn eq(&self, other: &char) -> bool

fn ne(&self, other: &char) -> bool

impl Eq for char

impl PartialOrd<char> for char

fn partial_cmp(&self, other: &char) -> Option<Ordering>

fn lt(&self, other: &char) -> bool

fn le(&self, other: &char) -> bool

fn ge(&self, other: &char) -> bool

fn gt(&self, other: &char) -> bool

impl Ord for char

fn cmp(&self, other: &char) -> Ordering

impl Clone for char

fn clone(&self) -> char

fn clone_from(&mut self, source: &Self)

impl Default for char

fn default() -> char

impl<'a> Pattern<'a> for char

Searches for chars that are equal to a given char

type Searcher = CharSearcher<'a>

fn into_searcher(self, haystack: &'a str) -> CharSearcher<'a>

fn is_contained_in(self, haystack: &'a str) -> bool

fn is_prefix_of(self, haystack: &'a str) -> bool

fn is_suffix_of(self, haystack: &'a str) -> bool where CharSearcher<'a>: ReverseSearcher<'a>

impl Hash for char

fn hash<H>(&self, state: &mut H) where H: Hasher

fn hash_slice<H>(data: &[Self], state: &mut H) where H: Hasher

impl Debug for char

fn fmt(&self, f: &mut Formatter) -> Result<(), Error>

impl Display for char

fn fmt(&self, f: &mut Formatter) -> Result<(), Error>

impl AsciiExt for char

type Owned = char

fn is_ascii(&self) -> bool

fn to_ascii_uppercase(&self) -> char

fn to_ascii_lowercase(&self) -> char

fn eq_ignore_ascii_case(&self, other: &char) -> bool

fn make_ascii_uppercase(&mut self)

fn make_ascii_lowercase(&mut self)

impl Rand for char

fn rand<R>(rng: &mut R) -> char where R: Rng