Windows-specific extensions to primitives in the std::ffi module.
For historical reasons, the Windows API uses a form of potentially ill-formed UTF-16 encoding for strings. Specifically, the 16-bit code units in Windows strings may contain isolated surrogate code points which are not paired together. The Unicode standard requires that surrogate code points (those in the range U+D800 to U+DFFF) always be paired, because in the UTF-16 encoding a surrogate code unit pair is used to encode a single character. For compatibility with code that does not enforce these pairings, Windows does not enforce them, either.
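To make the pairing rule concrete, the following minimal sketch checks whether a sequence of 16-bit code units is well-formed UTF-16. It uses only the standard library's char::decode_utf16, which reports each unpaired surrogate as an error. The helper name is_well_formed_utf16 is an assumption made for illustration; it is not part of this module.

```rust
use std::char::decode_utf16;

/// Hypothetical helper (not a std API): returns true if every surrogate
/// code unit in `units` belongs to a valid high/low pair.
fn is_well_formed_utf16(units: &[u16]) -> bool {
    // decode_utf16 yields Err for each unpaired surrogate it encounters.
    decode_utf16(units.iter().copied()).all(|r| r.is_ok())
}

fn main() {
    // 'a' followed by U+1D11E encoded as the surrogate pair D834 DD1E.
    let paired: [u16; 3] = [0x0061, 0xD834, 0xDD1E];
    // The same high surrogate with no low surrogate after it.
    let lone: [u16; 2] = [0x0061, 0xD834];

    assert!(is_well_formed_utf16(&paired));
    assert!(!is_well_formed_utf16(&lone));
}
```

A Windows string may look like either of these inputs, which is why code that receives one cannot assume it is valid UTF-16.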
While it is not always possible to convert such a string losslessly into a valid UTF-16 string (or even UTF-8), it is often desirable to be able to round-trip such a string from and to Windows APIs losslessly. For example, some Rust code may be “bridging” some Windows APIs together, just passing WCHAR strings among those APIs without ever really looking into the strings.
If Rust code does need to look into those strings, it can convert them to valid UTF-8, possibly lossily, by substituting invalid sequences with U+FFFD REPLACEMENT CHARACTER, as is conventionally done in other Rust APIs that deal with string encodings.
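As a rough illustration of the lossy path, the sketch below uses the platform-independent String::from_utf16_lossy, which substitutes U+FFFD for each unpaired surrogate. The wrapper name wide_to_string_lossy is a hypothetical helper introduced only for this example.

```rust
/// Hypothetical helper (not a std API): converts possibly ill-formed
/// UTF-16 into a valid Rust String, replacing unpaired surrogates.
fn wide_to_string_lossy(wide: &[u16]) -> String {
    String::from_utf16_lossy(wide)
}

fn main() {
    // 'h', 'i', then a lone low surrogate with no partner.
    let wide: [u16; 3] = [0x0068, 0x0069, 0xDC00];

    // The strict conversion refuses ill-formed input.
    assert!(String::from_utf16(&wide).is_err());

    // The lossy conversion substitutes U+FFFD for the lone surrogate.
    assert_eq!(wide_to_string_lossy(&wide), "hi\u{FFFD}");
}
```

The result is valid UTF-8, but the original code units can no longer be recovered from it, so this path suits display and logging rather than passing the string back to the operating system.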
OsString is the Rust wrapper for owned strings in the preferred representation of the operating system. On Windows, this struct gets augmented with an implementation of the OsStringExt trait, which has an OsStringExt::from_wide method. This lets you create an OsString from a &[u16] slice; presumably you get such a slice out of a WCHAR Windows API.
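A minimal sketch of from_wide follows. The helper name os_string_from_wide and the sample code units are assumptions made for illustration. Because std::os::windows only exists on Windows targets, the sketch includes a lossy stand-in so that it also compiles elsewhere; only the cfg(windows) version demonstrates the real API.

```rust
use std::ffi::OsString;

/// Hypothetical helper: wraps code units, such as those returned by a
/// WCHAR API, in an OsString.
#[cfg(windows)]
fn os_string_from_wide(wide: &[u16]) -> OsString {
    use std::os::windows::ffi::OsStringExt;
    // from_wide accepts ill-formed UTF-16; unpaired surrogates are kept.
    OsString::from_wide(wide)
}

/// Stand-in for non-Windows targets, where from_wide is unavailable.
/// Unlike the Windows version, this conversion is lossy.
#[cfg(not(windows))]
fn os_string_from_wide(wide: &[u16]) -> OsString {
    OsString::from(String::from_utf16_lossy(wide))
}

fn main() {
    // "Rust" followed by a lone high surrogate.
    let wide: [u16; 5] = [0x0052, 0x0075, 0x0073, 0x0074, 0xD800];
    let s = os_string_from_wide(&wide);

    // For display, the lone surrogate is rendered as U+FFFD.
    assert_eq!(s.to_string_lossy(), "Rust\u{FFFD}");
}
```

On Windows the resulting OsString still holds the lone surrogate internally; only the to_string_lossy view replaces it.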
OsStr is the Rust wrapper for borrowed strings in the preferred representation of the operating system. On Windows, the OsStrExt trait provides the OsStrExt::encode_wide method, which returns an EncodeWide iterator. You can collect this iterator, for example, to obtain a Vec<u16>; you can later get a pointer to this vector’s contents and feed it to Windows APIs.
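The sketch below shows one way to collect encode_wide into a buffer suitable for a Windows API. The helper name to_wide_nul is hypothetical. Appending a trailing 0 is an assumption about the receiving API: many Windows functions expect NUL-terminated strings, and encode_wide does not add a terminator itself. As before, a lossy stand-in is included only so the sketch compiles on non-Windows targets.

```rust
use std::ffi::OsStr;
use std::iter::once;

/// Hypothetical helper: encodes `s` as UTF-16 code units plus a trailing
/// NUL, the form many Windows APIs expect.
#[cfg(windows)]
fn to_wide_nul(s: &OsStr) -> Vec<u16> {
    use std::os::windows::ffi::OsStrExt;
    // encode_wide yields the code units losslessly, without a terminator.
    s.encode_wide().chain(once(0)).collect()
}

/// Stand-in for non-Windows targets, where encode_wide is unavailable.
/// Unlike the Windows version, this conversion may be lossy.
#[cfg(not(windows))]
fn to_wide_nul(s: &OsStr) -> Vec<u16> {
    s.to_string_lossy().encode_utf16().chain(once(0)).collect()
}

fn main() {
    let wide = to_wide_nul(OsStr::new("hi"));
    assert_eq!(wide, [0x0068, 0x0069, 0x0000]);

    // The pointer is valid only while `wide` is alive and not reallocated.
    let ptr: *const u16 = wide.as_ptr();
    // A real program would pass `ptr` to a Windows API here.
    let _ = ptr;
}
```

Keeping the Vec<u16> in a named binding for the duration of the call is the important design point: a pointer taken from a temporary vector would dangle as soon as the temporary is dropped.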
EncodeWide generates a wide character sequence for potentially ill-formed UTF-16.