Module std::str

Unicode string manipulation (str type)

Basic Usage

Rust's string type is one of the core primitive types of the language. Although it goes by the name str, str is not by itself a valid type in Rust: each string must also be decorated with its ownership. This means that there are three common kinds of strings in Rust: owned strings (~str), managed strings (@str), and borrowed strings (&str).

As an example, here are a few different kinds of strings.

#[feature(managed_boxes)];

fn main() {
    let owned_string = ~"I am an owned string";
    let managed_string = @"This string is garbage-collected";
    let borrowed_string1 = "This string is borrowed with the 'static lifetime";
    let borrowed_string2: &str = owned_string;   // owned strings can be borrowed
    let borrowed_string3: &str = managed_string; // managed strings can also be borrowed
}

From the example above, you can see that Rust has three different kinds of string literals. The owned and managed literals correspond to the owned and managed string types, but the "borrowed literal" is actually more akin to C's concept of a static string.

When a string is declared without a ~ or @ sigil, the string is allocated statically in the rodata of the executable/library. The string then has the type &'static str, meaning that it is valid for the 'static lifetime, otherwise known as the lifetime of the entire program. As can be inferred from the type, these static strings are not mutable.
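The point about static strings can be sketched as follows. This is a minimal illustration written in present-day Rust syntax rather than the sigil-based syntax used on this page, and the function name greeting is hypothetical. It shows that a literal can be returned from a function with the type &'static str, since nothing needs to own it.

```rust
// A string literal is stored in the binary's read-only data, so a reference
// to it stays valid for the whole program: its type is &'static str.
// `greeting` is a hypothetical helper used only for this illustration.
fn greeting() -> &'static str {
    "hello from static memory"
}

fn main() {
    let s: &'static str = greeting();
    // The literal can be read freely, but it cannot be mutated through `s`.
    println!("{} ({} bytes)", s, s.len());
}
```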

Mutability

Many languages have immutable strings by default, and Rust has its own take on this idea. As with the rest of Rust's types, strings are immutable by default. If a string is declared as mut, however, it may be mutated. This works the same way as the rest of Rust's type system: if there is a mutable reference to a string, there may be only one mutable reference to that string. With these guarantees, strings can easily transition between being mutable and immutable, with the same benefits that mutable strings have in other languages.

let mut buf = ~"testing";
buf.push_char(' ');
buf.push_str("123");
assert_eq!(buf, ~"testing 123");

Representation

Rust's string type, str, is a sequence of Unicode code points encoded as a stream of UTF-8 bytes. All safely-created strings are guaranteed to be validly encoded UTF-8 sequences. Additionally, strings are not null-terminated and can contain null code points.
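To make the UTF-8 representation concrete, here is a small sketch in present-day Rust syntax (not the syntax of the release this page documents). The helper byte_and_char_len is hypothetical. It compares the length in bytes with the number of code points, and shows that an embedded NUL is ordinary content.

```rust
// Returns (length in bytes, number of code points) for a string slice.
// Hypothetical helper for illustration only.
fn byte_and_char_len(s: &str) -> (usize, usize) {
    (s.len(), s.chars().count())
}

fn main() {
    // U+00E9 needs two bytes in UTF-8, so the byte length exceeds the
    // number of code points.
    let (bytes, chars) = byte_and_char_len("caf\u{e9}");
    println!("bytes = {}, chars = {}", bytes, chars);

    // A NUL code point is just another character, not a terminator.
    let with_nul = "a\0b";
    println!("length with embedded NUL = {}", with_nul.len());
}
```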

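The actual representation of strings has a direct mapping to vectors: ~str is represented the same way as ~[u8], &str the same way as &[u8], and @str the same way as @[u8].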
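The relationship between strings and byte vectors can be sketched like this. The example uses present-day Rust, where String and Vec<u8> stand in for the owned string and owned byte vector types; that substitution is an assumption made for illustration, and round_trip is a hypothetical helper.

```rust
// Round-trips an owned string through its underlying byte vector.
// Hypothetical helper for illustration only.
fn round_trip(s: String) -> String {
    // The string hands over the same UTF-8 bytes; nothing is re-encoded.
    let bytes: Vec<u8> = s.into_bytes();
    String::from_utf8(bytes).expect("bytes came from a valid string")
}

fn main() {
    let s = "hi!";
    // A borrowed string can be viewed as a borrowed byte slice.
    let view: &[u8] = s.as_bytes();
    println!("{:?}", view);
    println!("{}", round_trip(String::from(s)));
}
```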

Modules

not_utf8
raw

Unsafe operations

Structs

CharIterator

External iterator for a string's characters. Use with the std::iter module.

CharOffsetIterator

External iterator for a string's characters and their byte offsets. Use with the std::iter module.

CharRange

Struct that contains a char and the index of the first byte of the next char in a string. This can be used as a data structure for iterating over the UTF-8 bytes of a string.

CharSplitIterator

An iterator over the substrings of a string, separated by sep.

CharSplitNIterator

An iterator over the substrings of a string, separated by sep, splitting at most count times.

MatchesIndexIterator

An iterator over the start and end indices of the matches of a substring within a larger string

StrSplitIterator

An iterator over the substrings of a string separated by a given search string
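As a rough sketch of what the iterators listed above are for, the following uses the string methods of present-day Rust. Their names and return types differ from the structs on this page, so treat the mapping as an assumption. The helpers char_offsets, split_on, and match_starts are hypothetical.

```rust
// Collects each character together with the byte offset where it starts.
fn char_offsets(s: &str) -> Vec<(usize, char)> {
    s.char_indices().collect()
}

// Collects the substrings separated by a single character.
fn split_on(s: &str, sep: char) -> Vec<&str> {
    s.split(sep).collect()
}

// Collects the starting byte index of every match of `needle`.
fn match_starts(haystack: &str, needle: &str) -> Vec<usize> {
    haystack.match_indices(needle).map(|(start, _)| start).collect()
}

fn main() {
    // U+00E9 occupies two bytes, so the character after it starts at offset 3.
    println!("{:?}", char_offsets("a\u{e9}b"));
    // Adjacent separators produce an empty substring between them.
    println!("{:?}", split_on("a,b,,c", ','));
    println!("{:?}", match_starts("abcabc", "bc"));
    // Splitting on a search string rather than a single character.
    let parts: Vec<&str> = "a--b--c".split("--").collect();
    println!("{:?}", parts);
}
```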

Traits

CharEq

Something that can be used to compare against a character

OwnedStr

Methods for owned strings

Str

Any string that can be represented as a slice

StrSlice

Methods for string slices

StrVector

Methods for vectors of strings

Functions

eq

Bytewise string equality

eq_slice

Bytewise slice equality

from_byte

Convert a byte to a UTF-8 string

from_char

Convert a char to a string

from_chars

Convert a vector of chars to a string

from_utf16

Allocates a new string from the UTF-16 slice provided

from_utf8

Converts a vector to a string slice without performing any allocations.

from_utf8_opt

Converts a vector to a string slice without performing any allocations. Returns None if the vector contains invalid UTF-8.

from_utf8_owned

Consumes a vector of bytes to create a new UTF-8 string

from_utf8_owned_opt

Consumes a vector of bytes to create a new UTF-8 string. Returns None if the vector contains invalid UTF-8.

is_utf16

Determines if a vector of u16 contains valid UTF-16

is_utf8

Determines if a vector of bytes contains valid UTF-8

replace

Replace all occurrences of one string with another

utf16_chars

Iterates over the UTF-16 characters in the specified slice, yielding each decoded Unicode character to the function provided.

utf8_char_width

Given a first byte, determine how many bytes are in this UTF-8 character

with_capacity

Allocates a new string with the specified capacity. The string returned is the empty string, but has capacity for much more.
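Several of the functions above have close counterparts in present-day Rust, and the sketch below uses those counterparts rather than the signatures of the release documented here, which this page does not show. The helpers checked_str and char_width_from_first_byte are hypothetical; the second is a hand-written table and not the library's own implementation.

```rust
use std::str;

// Validates bytes as UTF-8 and borrows them as a string slice without
// copying. Returns None when the bytes are not valid UTF-8.
fn checked_str(bytes: &[u8]) -> Option<&str> {
    str::from_utf8(bytes).ok()
}

// Given the first byte of an encoded character, reports how many bytes the
// whole character occupies, or 0 if the byte cannot start a character.
fn char_width_from_first_byte(b: u8) -> usize {
    match b {
        0x00..=0x7F => 1,
        0xC2..=0xDF => 2,
        0xE0..=0xEF => 3,
        0xF0..=0xF4 => 4,
        _ => 0, // continuation byte, or a byte that never starts a character
    }
}

fn main() {
    println!("{:?}", checked_str(b"hello"));
    println!("{:?}", checked_str(&[0xffu8, 0xfe]));
    println!("{}", char_width_from_first_byte(0xE2));

    // Decoding UTF-16 code units allocates a new owned string.
    let decoded = String::from_utf16(&[0x68, 0x69]).unwrap();
    println!("{}", decoded);

    // Replacing every occurrence of one string with another.
    println!("{}", "a-b-c".replace("-", "+"));

    // The new string is empty but has room reserved up front.
    let mut buf = String::with_capacity(16);
    buf.push_str("abc");
    println!("{} {}", buf.len(), buf.capacity() >= 16);
}
```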

Type Definitions

AnyLineIterator

An iterator over the lines of a string, separated by either \n or \r\n.

ByteIterator

External iterator for a string's bytes. Use with the std::iter module.

ByteRevIterator

External iterator for a string's bytes in reverse order. Use with the std::iter module.

CharOffsetRevIterator

External iterator for a string's characters and their byte offsets in reverse order. Use with the std::iter module.

CharRSplitIterator

An iterator over the substrings of a string, separated by sep, starting from the back of the string.

CharRevIterator

External iterator for a string's characters in reverse order. Use with the std::iter module.

WordIterator

An iterator over the words of a string, separated by a sequence of whitespace
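The line, reverse, and word iterators above can be illustrated with their present-day Rust equivalents. The method names used here come from current Rust and are assumed to correspond to the type definitions on this page. The helpers reversed, words, and lines_of are hypothetical.

```rust
// Builds a new string with the characters in reverse order.
fn reversed(s: &str) -> String {
    s.chars().rev().collect()
}

// Collects the words of a string, separated by any run of whitespace.
fn words(s: &str) -> Vec<&str> {
    s.split_whitespace().collect()
}

// Collects the lines of a string; both \n and \r\n end a line.
fn lines_of(s: &str) -> Vec<&str> {
    s.lines().collect()
}

fn main() {
    // Reversal works on whole characters, not on individual bytes.
    println!("{}", reversed("ab\u{e9}"));
    println!("{:?}", words("  one \t two\nthree "));
    println!("{:?}", lines_of("a\r\nb\nc"));
    // Splitting from the back of the string yields the pieces in reverse.
    let from_back: Vec<&str> = "a,b,c".rsplit(',').collect();
    println!("{:?}", from_back);
}
```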