byCodeUnit

auto byCodeUnit(R)(R r) if ((isConvertibleToString!R && !isStaticArray!R) || (isInputRange!R && isSomeChar!(ElementEncodingType!R)))

Iterate a range of char, wchar, or dchars by code unit.

The purpose is to bypass the special-case decoding that front does to character arrays. As a result, code using ranges with byCodeUnit can be nothrow, whereas front throws when it encounters invalid Unicode sequences.
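As a minimal sketch of the nothrow point, the helper below (countUnit is a hypothetical name, not part of Phobos) walks a string by code unit inside a function marked nothrow. It assumes the default Phobos build, where strings are auto-decoded.

```d
import std.utf : byCodeUnit;

// Hypothetical helper, not part of Phobos: counts the code units equal to `unit`.
// byCodeUnit never decodes, so the whole function can be marked nothrow.
size_t countUnit(string s, char unit) @safe pure nothrow @nogc
{
    size_t n = 0;
    foreach (c; s.byCodeUnit) // c is a single UTF-8 code unit
    {
        if (c == unit)
            ++n;
    }
    return n;
}

void main()
{
    // "\xFF" is not valid UTF-8. Iterating by code unit passes over it
    // without attempting to decode it.
    assert(countUnit("a\xFFba", 'a') == 2);
}
```

Iterating the same string with a plain foreach over dchar would decode it, so that loop could not carry the nothrow attribute.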

A code unit is a building block of the UTF encodings. Generally, an individual code unit does not represent what's perceived as a full character (a.k.a. a grapheme cluster in Unicode terminology). Many characters are encoded with multiple code units. For example, the UTF-8 code units for ø are 0xC3 0xB8. This means that an individual element of byCodeUnit often does not form a character on its own. Attempting to treat it as one while iterating over the resulting range will give nonsensical results.
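The ø example can be checked directly. The sketch below uses a hypothetical helper, codeUnitValues (not part of Phobos), to collect the raw code unit values of a UTF-8 string. It assumes the source file is saved as UTF-8.

```d
import std.utf : byCodeUnit;

// Hypothetical helper, not part of Phobos: collects the numeric value
// of every UTF-8 code unit in `s`.
ubyte[] codeUnitValues(string s) @safe pure nothrow
{
    ubyte[] result;
    foreach (c; s.byCodeUnit)
        result ~= cast(ubyte) c;
    return result;
}

void main()
{
    auto units = codeUnitValues("ø"); // one character, U+00F8
    assert(units.length == 2);        // but two UTF-8 code units
    assert(units[0] == 0xC3 && units[1] == 0xB8);

    // The same character is a single code unit in UTF-16 and UTF-32.
    assert("ø"w.byCodeUnit.length == 1);
    assert("ø"d.byCodeUnit.length == 1);
}
```

Neither 0xC3 nor 0xB8 means anything as a character on its own, which is why treating elements of the range as characters gives wrong results for non-ASCII text.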

Parameters

r: an input range of characters (including strings) or a type that implicitly converts to a string type.

Returns

If r is not an auto-decodable string (i.e. it is neither a narrow string nor a user-defined type that implicitly converts to a string type), then r is returned.

Otherwise, r is converted to its corresponding string type (if it's not already a string) and wrapped in a random-access range whose element type is the element encoding type of the string (its code unit), and that range is returned. The range has slicing.

If r is quirky enough to be a struct or class which is an input range of characters on its own (i.e. it has the input range API as member functions), and it's implicitly convertible to a string type, then r is returned, and no implicit conversion takes place.

If r is wrapped in a new range, then that range has a source property for returning the string that's currently contained within that range.
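The return cases above can be sketched as follows. The example assumes the default Phobos build with auto-decoding enabled. tailFrom is a hypothetical helper, not part of Phobos, which slices the wrapper range and reads the remaining string back through source.

```d
import std.range.primitives : ElementType, hasLength, hasSlicing,
    isRandomAccessRange;
import std.utf : byCodeUnit;

// Hypothetical helper, not part of Phobos: slices the wrapper range and
// returns the string it still contains. Assumes auto-decoding is enabled,
// so that `s` is wrapped and `source` exists.
string tailFrom(string s, size_t start)
{
    auto r = s.byCodeUnit;
    return r[start .. $].source;
}

void main()
{
    // A narrow string is wrapped in a random-access range of code units.
    auto r = "Hello, World!".byCodeUnit;
    static assert(isRandomAccessRange!(typeof(r)));
    static assert(hasSlicing!(typeof(r)));
    static assert(hasLength!(typeof(r)));
    static assert(is(ElementType!(typeof(r)) == immutable char));

    assert(r.length == 13);
    assert(r[7] == 'W');
    assert(tailFrom("Hello, World!", 7) == "World!");

    // A dstring is not auto-decodable, so it is returned as-is
    // and has no source property.
    auto d = "Hello"d.byCodeUnit;
    static assert(is(typeof(d) == dstring));
    static assert(!__traits(compiles, d.source));
}
```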

See Also

Refer to the std.uni docs for a reference on Unicode terminology.

For a range that iterates by grapheme cluster (written character) see byGrapheme.
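To illustrate how the three levels of iteration differ, here is a small sketch comparing byCodeUnit with std.utf.byDchar (code points) and std.uni.byGrapheme. The Counts struct and the counts function are hypothetical names introduced only for this example.

```d
import std.range : walkLength;
import std.uni : byGrapheme;
import std.utf : byCodeUnit, byDchar;

// Hypothetical types for this example only, not part of Phobos.
struct Counts
{
    size_t codeUnits;
    size_t codePoints;
    size_t graphemes;
}

Counts counts(string s)
{
    return Counts(
        s.byCodeUnit.walkLength,  // UTF-8 code units
        s.byDchar.walkLength,     // decoded code points
        s.byGrapheme.walkLength); // grapheme clusters
}

void main()
{
    // "noël" written as e followed by a combining diaeresis (U+0308).
    auto c = counts("noe\u0308l");
    assert(c.codeUnits == 6);
    assert(c.codePoints == 5);
    assert(c.graphemes == 4);
}
```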