文字列

Rustには文字列を扱う型が2つあります。Stringと&strです。

Stringは有効なUTF-8の配列であることを保証されたバイトのベクタ(Vec<u8>)として保持されます。ヒープ上に保持され、伸長可能で、末端にnull文字を含みません。

&strは有効なUTF-8の配列のスライス（&[u8]）で、いつでもStringに変換することができます。&[T]がいつでもVec<T>に変換できるのと同様です。

fn main() {
    // （以下の例では型を明示していますが、これらは必須ではありません。）
    // read only memory上に割り当てられた文字列への参照
    let pangram: &'static str = "the quick brown fox jumps over the lazy dog";
    println!("Pangram: {}", pangram);

    // 単語を逆順にイテレートします。新しい文字列の割り当ては起こりません。
    println!("Words in reverse");
    for word in pangram.split_whitespace().rev() {
        println!("> {}", word);
    }

    // 文字をベクタにコピー。ソートして重複を除去。
    let mut chars: Vec<char> = pangram.chars().collect();
    chars.sort();
    chars.dedup();

    // 中身が空で、伸長可能な`String`を作成。
    let mut string = String::new();
    for c in chars {
        // 文字を文字列の末端に挿入。
        string.push(c);
        // 文字列を文字列の末端に挿入。
        string.push_str(", ");
    }

    // 文字列のトリミング（特定文字種の除去）はオリジナルの文字列のスライスを
    // 返すので、新規のメモリ割り当ては発生しません。
    let chars_to_trim: &[char] = &[' ', ','];
    let trimmed_str: &str = string.trim_matches(chars_to_trim);
    println!("Used characters: {}", trimmed_str);

    // 文字列をヒープに割り当てます。
    let alice = String::from("I like dogs");
    // 新しくメモリを確保し、変更を加えた文字列をそこに割り当てます。
    let bob: String = alice.replace("dog", "cat");

    println!("Alice says: {}", alice);
    println!("Bob says: {}", bob);
}

str/Stringのメソッドをもっと見たい場合はstd::str、std::stringモジュールを参照してください。

Literals and escapes

There are multiple ways to write string literals with special characters in them. All result in a similar &str so it’s best to use the form that is the most convenient to write. Similarly there are multiple ways to write byte string literals, which all result in &[u8; N].

Generally special characters are escaped with a backslash character: \. This way you can add any character to your string, even unprintable ones and ones that you don’t know how to type. If you want a literal backslash, escape it with another one: \\

String or character literal delimiters occurring within a literal must be escaped: "\"", '\''.

fn main() {
    // You can use escapes to write bytes by their hexadecimal values...
    let byte_escape = "I'm writing \x52\x75\x73\x74!";
    println!("What are you doing\x3F (\\x3F means ?) {}", byte_escape);

    // ...or Unicode code points.
    let unicode_codepoint = "\u{211D}";
    let character_name = "\"DOUBLE-STRUCK CAPITAL R\"";

    println!("Unicode character {} (U+211D) is called {}",
                unicode_codepoint, character_name );


    let long_string = "String literals
                        can span multiple lines.
                        The linebreak and indentation here ->\
                        <- can be escaped too!";
    println!("{}", long_string);
}

Sometimes there are just too many characters that need to be escaped or it’s just much more convenient to write a string out as-is. This is where raw string literals come into play.

fn main() {
    let raw_str = r"Escapes don't work here: \x3F \u{211D}";
    println!("{}", raw_str);

    // If you need quotes in a raw string, add a pair of #s
    let quotes = r#"And then I said: "There is no escape!""#;
    println!("{}", quotes);

    // If you need "# in your string, just use more #s in the delimiter.
    // You can use up to 255 #s.
    let longer_delimiter = r###"A string with "# in it. And even "##!"###;
    println!("{}", longer_delimiter);
}

Want a string that’s not UTF-8? (Remember, str and String must be valid UTF-8). Or maybe you want an array of bytes that’s mostly text? Byte strings to the rescue!

use std::str;

fn main() {
    // Note that this is not actually a `&str`
    let bytestring: &[u8; 21] = b"this is a byte string";

    // Byte arrays don't have the `Display` trait, so printing them is a bit limited
    println!("A byte string: {:?}", bytestring);

    // Byte strings can have byte escapes...
    let escaped = b"\x52\x75\x73\x74 as bytes";
    // ...but no unicode escapes
    // let escaped = b"\u{211D} is not allowed";
    println!("Some escaped bytes: {:?}", escaped);


    // Raw byte strings work just like raw strings
    let raw_bytestring = br"\u{211D} is not escaped here";
    println!("{:?}", raw_bytestring);

    // Converting a byte array to `str` can fail
    if let Ok(my_str) = str::from_utf8(raw_bytestring) {
        println!("And the same as text: '{}'", my_str);
    }

    let _quotes = br#"You can also use "fancier" formatting, \
                    like with normal raw strings"#;

    // Byte strings don't have to be UTF-8
    let shift_jis = b"\x82\xe6\x82\xa8\x82\xb1\x82\xbb"; // "ようこそ" in SHIFT-JIS

    // But then they can't always be converted to `str`
    match str::from_utf8(shift_jis) {
        Ok(my_str) => println!("Conversion successful: '{}'", my_str),
        Err(e) => println!("Conversion failed: {:?}", e),
    };
}

For conversions between character encodings check out the encoding crate.

A more detailed listing of the ways to write string literals and escape characters is given in the ‘Tokens’ chapter of the Rust Reference.

Keyboard shortcuts

Rust By Example

文字列

Literals and escapes