std.uni
The std.uni module provides an implementation
of fundamental Unicode algorithms and data structures. It does not include UTF encoding and decoding primitives; see decode and encode in std.utf for that functionality.
All primitives listed operate on Unicode characters and
sets of characters. For functions which operate on ASCII characters and ignore Unicode characters, see std.ascii. For definitions of Unicode character, code point and other terms used throughout this module, see the Terminology section below.
The focus of this module is the core needs of developing Unicode-aware
applications. To that effect it provides the following optimized primitives:
- Character classification by category and common properties: isAlpha, isWhite and others.
- Case-insensitive string comparison (sicmp, icmp).
- Converting text to any of the four normalization forms via normalize.
- Decoding (decodeGrapheme) and iteration (byGrapheme, graphemeStride)
by user-perceived characters, that is by Grapheme clusters.
- Decomposing and composing of individual characters according to canonical
or compatibility rules; see compose and decompose, including the specific versions for Hangul syllables, composeJamo and decomposeHangul. A short demonstration of these primitives follows this list.
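A minimal sketch exercising the primitives above (the sample strings are arbitrary):
import std.algorithm.comparison : equal;
import std.range : walkLength;
import std.uni;
void main()
{
// classification by category and common properties
assert(isAlpha('ä') && isWhite('\u2028'));
// case-insensitive comparison via simple case folding
assert(sicmp("Пример", "пРИМЕР") == 0);
// composing and decomposing individual characters
assert(compose('A', '\u0308') == 'Ä');
assert(equal(decompose('Ä')[], "A\u0308"));
// iteration by user-perceived characters (grapheme clusters)
assert("noe\u0308l".byGrapheme.walkLength == 4);
}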
It's recognized that an application may need further enhancements
and extensions, such as less commonly known algorithms, or tailoring existing ones for region specific needs. To help users with building any extra functionality beyond the core primitives, the module provides:
- CodepointSet, a type for easy manipulation of sets of characters.
Besides the typical set algebra it provides an unusual feature: a D source code generator for detection of code points in this set. This is a boon for meta-programming parser frameworks, and is used internally to power classification in small sets like isWhite.
- A way to construct optimal packed multi-stage tables also known as a
special case of Trie. The functions codepointTrie and codepointSetTrie construct custom tries that map dchar to a value. The end result is a fast and predictable O(1) lookup that powers functions like isAlpha and combiningClass, but for user-defined data sets.
- A useful technique for Unicode-aware parsers that perform
character classification of encoded code points is to avoid unnecessary decoding at all costs.
utfMatcher provides an improvement over the usual workflow of decode-classify-process, combining the decoding and classification steps. By extracting the necessary bits directly from encoded
code units, matchers achieve significant performance improvements. See MatcherConcept for the common interface of UTF matchers.
- Generally useful building blocks for customized normalization:
combiningClass for querying the combining class
and allowedIn for testing the Quick_Check property of a given normalization form; a short demonstration follows this list.
- Access to a large selection of commonly used sets of code points.
Supported sets include Script,
Block and General Category. The exact contents of a set can be observed in the CLDR utility, on the
property index page of the Unicode website. See unicode for easy and (optionally) compile-time checked set queries.
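For instance, the building blocks for customized normalization can be used directly; a minimal sketch (the code points chosen are arbitrary examples):
import std.uni;
void main()
{
// the combining diaeresis (U+0308) has combining class 230
assert(combiningClass('\u0308') == 230);
// starters and ordinary characters have combining class 0
assert(combiningClass('A') == 0);
// Quick_Check: 'ä' may appear as-is in NFC text, but not in NFD text
assert(allowedIn!NFC('ä'));
assert(!allowedIn!NFD('ä'));
}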
Synopsis
import std.algorithm.searching : find;
import std.uni;
void main()
{
// initialize code point sets using script/block or property name
// now 'set' contains code points from both scripts.
auto set = unicode("Cyrillic") | unicode("Armenian");
// same thing but simpler and checked at compile-time
auto ascii = unicode.ASCII;
auto currency = unicode.Currency_Symbol;
// easy set ops
auto a = set & ascii;
assert(a.empty); // as it has no intersection with ascii
a = set | ascii;
auto b = currency - a; // subtract all ASCII, Cyrillic and Armenian
// some properties of code point sets
assert(b.length > 45); // 46 items in Unicode 6.1, even more in 6.2
// testing presence of a code point in a set
// is just fine, it is O(log n)
assert(!b['$']);
assert(!b['\u058F']); // Armenian dram sign
assert(b['¥']);
// building fast lookup tables, these guarantee O(1) complexity
// 1-level Trie lookup table, essentially a huge bit-set of ~262Kb
auto oneTrie = toTrie!1(b);
// 2-level far more compact but typically slightly slower
auto twoTrie = toTrie!2(b);
// 3-level even smaller, and a bit slower yet
auto threeTrie = toTrie!3(b);
assert(oneTrie['£']);
assert(twoTrie['£']);
assert(threeTrie['£']);
// build the trie with the most sensible trie level
// and bind it as a functor
auto cyrillicOrArmenian = toDelegate(set);
auto balance = find!(cyrillicOrArmenian)("Hello ընկեր!");
assert(balance == "ընկեր!");
// compatible with bool delegate(dchar)
bool delegate(dchar) bindIt = cyrillicOrArmenian;
// Normalization
string s = "Plain ascii (and not only), is always normalized!";
assert(s is normalize(s)); // it is the same string
string nonS = "A\u0308ffin"; // 'A' + combining diaeresis (U+0308)
auto nS = normalize(nonS); // to NFC, the W3C endorsed standard
assert(nS == "Äffin");
assert(nS != nonS);
string composed = "Äffin";
assert(normalize!NFD(composed) == "A\u0308ffin");
// to NFKD, compatibility decomposition useful for fuzzy matching/searching
assert(normalize!NFKD("2¹⁰") == "210");
}
Terminology
The following is a list of important Unicode notions
and definitions. Any conventions used specifically in this module alone are marked as such. The descriptions are based on the formal definition as found in chapter three of The Unicode Standard Core Specification.
Abstract character: A unit of information used for the organization, control, or representation of textual data. Note that:
- When representing data, the nature of that data
is generally symbolic as opposed to some other kind of data (for example, visual).
- An abstract character has no concrete form
and should not be confused with a glyph.
- An abstract character does not necessarily
correspond to what a user thinks of as a “character” and should not be confused with a Grapheme.
- The abstract characters encoded (see Encoded character)
are known as Unicode abstract characters.
- Abstract characters not directly
encoded by the Unicode Standard can often be represented by the use of combining character sequences.
Canonical decomposition: The decomposition of a character or character sequence that results from recursively applying the canonical mappings found in the Unicode Character Database and those described in Conjoining Jamo Behavior (section 12 of Unicode Conformance).
Canonical composition: The precise definition of the canonical composition is the algorithm as specified in Unicode Conformance, section 11. Informally, it's the process that does the reverse of the canonical decomposition with the addition of certain rules that e.g. prevent legacy characters from appearing in the composed result.
Canonical equivalent: Two character sequences are said to be canonical equivalents if their full canonical decompositions are identical.
Character: Typically differs by context. For the purpose of this documentation the term character implies encoded character, that is, a code point having an assigned abstract character (a symbolic meaning).
Code point: Any value in the Unicode codespace; that is, the range of integers from 0 to 10FFFF (hex). Not all code points are assigned to encoded characters.
Code unit: The minimal bit combination that can represent a unit of encoded text for processing or interchange. Depending on the encoding this could be: 8-bit code units in UTF-8 (char), 16-bit code units in UTF-16 (wchar), and 32-bit code units in UTF-32 (dchar). Note that in UTF-32 a code unit is a code point and is represented by the D dchar type.
Combining character: A character with the General Category of Combining Mark (M).
- All characters with non-zero canonical combining class
are combining characters, but the reverse is not the case: there are combining characters with a zero combining class.
- These characters are not normally used in isolation
unless they are being described. They include such characters as accents, diacritics, Hebrew points, Arabic vowel signs, and Indic matras.
Combining class: A numerical value used by the Unicode Canonical Ordering Algorithm to determine which sequences of combining marks are to be considered canonically equivalent and which are not.
Compatibility decomposition: The decomposition of a character or character sequence that results from recursively applying both the compatibility mappings and the canonical mappings found in the Unicode Character Database, and those described in Conjoining Jamo Behavior, until no characters can be further decomposed.
Compatibility equivalent: Two character sequences are said to be compatibility equivalents if their full compatibility decompositions are identical.
Encoded character: An association (or mapping) between an abstract character and a code point.
Glyph image: The actual, concrete image of a glyph representation having been rasterized or otherwise imaged onto some display surface.
Grapheme base: A character with the property Grapheme_Base, or any standard Korean syllable block.
Grapheme cluster: Defined as the text between grapheme boundaries as specified by Unicode Standard Annex #29,
Unicode text segmentation. Important general properties of a grapheme:
- The grapheme cluster represents a horizontally segmentable
unit of text, consisting of some grapheme base (which may consist of a Korean syllable) together with any number of nonspacing marks applied to it.
- A grapheme cluster typically starts with a grapheme base
and then extends across any subsequent sequence of nonspacing marks. A grapheme cluster is most directly relevant to text rendering and processes such as cursor placement and text selection in editing, but may also be relevant to comparison and searching.
- For many processes, a grapheme cluster behaves as if it was a
single character with the same properties as its grapheme base. Effectively, nonspacing marks apply graphically to the base, but do not change its properties.
This module defines a number of primitives that work with graphemes:
Grapheme, decodeGrapheme and graphemeStride. All of them use extended grapheme boundaries as defined in the aforementioned standard annex.
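A minimal sketch of these grapheme primitives (the sample string is arbitrary):
import std.uni;
void main()
{
// 'e' followed by a combining diaeresis is one user-perceived character
string s = "e\u0308lan";
auto g = decodeGrapheme(s); // advances s past the first cluster
assert(g.length == 2);      // two code points in the cluster
assert(g[0] == 'e' && g[1] == '\u0308');
assert(s == "lan");
// graphemeStride instead reports the cluster length in code units
assert(graphemeStride("e\u0308lan", 0) == 3); // 1 + 2 UTF-8 code units
}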
Nonspacing mark: A combining character with the General Category of Nonspacing Mark (Mn) or Enclosing Mark (Me).
Spacing mark: A combining character that is not a nonspacing mark.
Normalization
The concepts of canonical equivalent
or compatibility equivalent characters in the Unicode Standard make it necessary to have a full, formal definition of equivalence for Unicode strings. String equivalence is determined by a process called normalization, whereby strings are converted into forms which are compared directly for identity. This is the primary goal of the normalization process; see the function normalize to convert into any of the four defined forms.
A very important attribute of the Unicode Normalization Forms
is that they must remain stable between versions of the Unicode Standard. A Unicode string normalized to a particular Unicode Normalization Form in one version of the standard is guaranteed to remain in that Normalization Form for implementations of future versions of the standard.
The Unicode Standard specifies four normalization forms.
Informally, two of these forms are defined by maximal decomposition of equivalent sequences, and two of these forms are defined by maximal composition of equivalent sequences.
- Normalization Form D (NFD): The canonical decomposition of a character sequence.
- Normalization Form KD (NFKD): The compatibility decomposition of a character sequence.
- Normalization Form C (NFC): The canonical composition of the
canonical decomposition
of a coded character sequence.
- Normalization Form KC (NFKC): The canonical composition
of the compatibility decomposition of a character sequence.
The choice of the normalization form depends on the particular use case.
NFC is the best form for general text, since it's more compatible with strings converted from legacy encodings. NFKC is the preferred form for identifiers, especially where there are security concerns. NFD and NFKD are the most useful for internal processing.
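A short sketch contrasting the forms (the sample characters are arbitrary):
import std.uni;
void main()
{
// NFD: canonical decomposition
assert(normalize!NFD("é") == "e\u0301");
// NFC: canonical composition of the canonical decomposition
assert(normalize!NFC("e\u0301") == "é");
// NFKC/NFKD additionally apply compatibility mappings,
// e.g. the ligature ﬁ (U+FB01) becomes plain "fi"
assert(normalize!NFKC("\uFB01") == "fi");
}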
Construction of lookup tables
The Unicode standard describes a set of algorithms that
depend on having the ability to quickly look up various properties of a code point. Given the codespace of about 1 million code points, it is not a trivial task to provide a space-efficient solution for the multitude of properties.
Common approaches such as hash-tables or binary search over
sorted code point intervals (as in InversionList) are insufficient. Hash-tables have enormous memory footprint and binary search over intervals is not fast enough for some heavy-duty algorithms.
The recommended solution (see Unicode Implementation Guidelines)
is using multi-stage tables that are an implementation of the
Trie data structure with integer keys and a fixed number of stages. For the remainder of the section this will be called a fixed trie. The following describes a particular implementation aimed at speed of access at the expense of ideal size savings.
Taking a 2-level Trie as an example, the principle of operation is as follows.
Split the number of bits in a key (code point, 21 bits) into 2 components (e.g. 13 and 8). The first is the number of bits in the index of the trie and the other is the number of bits in each page of the trie. The layout of the trie is then an array of size 2^^bits-of-index followed by an array of memory chunks of size 2^^bits-of-page/bits-per-element.
The number of pages is variable (but not less than 1),
unlike the number of entries in the index. The slots of the index all have to contain the number of a page that is present. The lookup is then just a couple of operations: slice off the upper bits, look up the index for these, take the page at that index and use the lower bits as an offset within this page.
Assuming that pages are laid out consecutively in one array at pages, the pseudo-code is:
auto elemsPerPage = (2 ^^ bits_per_page) / Value.sizeOfInBits;
pages[index[n >> bits_per_page]][n & (elemsPerPage - 1)];
Where, if elemsPerPage is a power of 2, the whole process is
a handful of simple instructions and 2 array reads. Subsequent levels of the trie are introduced by recursing on this notion: the index array is treated as values. The number of bits in the index is then again split into 2 parts, with pages over the 'current-index' and the new 'upper-index'.
For completeness a level 1 trie is simply an array.
The current implementation takes advantage of bit-packing values when the range is known to be limited in advance (such as bool). See also BitPacked for enforcing it manually. The major size advantage however comes from the fact that multiple identical pages on every level are merged by construction.
The process of constructing a trie is more involved and is hidden from
the user in the form of the convenience functions codepointTrie,
codepointSetTrie and the even more convenient toTrie. In general, a set or a built-in AA with dchar keys can be turned into a trie. The trie object in this module is read-only (immutable); it's effectively frozen after construction.
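A minimal sketch of the AA route just described; the mapping here is made up purely for illustration:
import std.uni;
void main()
{
// made-up classification: map a few code points to small numeric tags
uint[dchar] tags = ['a' : 1, 'б' : 2, '£' : 3];
// a 3-level trie; the bits per level must sum up to 21
auto trie = codepointTrie!(uint, 8, 5, 8)(tags);
assert(trie['a'] == 1 && trie['£'] == 3);
assert(trie['z'] == 0); // unmapped code points yield uint.init
}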
Unicode properties
This is a full list of Unicode properties accessible through unicode
with specific helpers per category nested within. Consult the
CLDR utility when in doubt about the contents of a particular set.
General category sets listed below are only accessible with the unicode shorthand accessor.
Sets for other commonly useful properties that are
accessible with unicode:
Below is the table with block names accepted by unicode.block.
Note that the shorthand version unicode requires "In" to be prepended to the names of blocks so as to disambiguate scripts and blocks.
Below is the table with script names accepted by unicode.script
and by the shorthand version unicode:
Below is the table of names accepted by unicode.hangulSyllableType.
References:
ASCII Table (Wikipedia), The Unicode Consortium, Unicode normalization forms, Unicode text segmentation, Unicode Implementation Guidelines, Unicode Conformance.
Trademarks: Unicode(tm) is a trademark of Unicode, Inc.
Copyright
Dmitry Olshansky. Distributed under the Boost Software License, Version 1.0.
Types
CodepointSet: The recommended default type for a set of code points. For details, see the current implementation: InversionList.
CodepointInterval: The recommended type of Tuple to represent [a, b) intervals of code points, as used in InversionList. Any interval type should pass the isIntegralPair trait.
InversionList: A set of code points
represented as an array of open-right [a, b) intervals (see CodepointInterval above). The name comes from the way the representation reads left to right. For instance, a set of all values [10, 50), [80, 90), plus a singular value 60, looks like this:
10, 50, 60, 61, 80, 90
The way to read this is: start with negative, meaning that all numbers
smaller than the next one are not present in this set (and positive means the contrary). Then switch positive/negative after each number passed from left to right.
This way negative spans until 10, then positive until 50,
then negative until 60, then positive until 61, and so on. As seen, this provides space-efficient storage of highly redundant data that comes in long runs, a description that Unicode properties fit nicely. The technique itself can be seen as a variation on RLE encoding.
Sets are value types (just like int is) thus they
are never aliased.
Example:
auto a = CodepointSet('a', 'z'+1);
auto b = CodepointSet('A', 'Z'+1);
auto c = a;
a = a | b;
assert(a == CodepointSet('A', 'Z'+1, 'a', 'z'+1));
assert(a != c);
See also unicode for simpler construction of sets
from predefined ones.
Memory usage is 8 bytes per contiguous interval in a set.
The value semantics are achieved by using the
COW technique, and thus it's not safe to cast this type to shared.
Note
It's not recommended to rely on the template parameters
or the exact type of the current code point set in std.uni. The type and parameters may change when the standard allocators design is finalized. Use isCodepointSet with templates, or just stick with the default alias CodepointSet throughout the whole code base.
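A minimal sketch of the source code generator mentioned earlier; the set and the function name isHexDigit are arbitrary choices:
import std.stdio, std.uni;
void main()
{
// a set of the ASCII hex digits, built from plain intervals
auto hex = CodepointSet('0', '9' + 1, 'A', 'F' + 1, 'a', 'f' + 1);
// print D source of a unary predicate named isHexDigit;
// such output can be mixed into generated parser code
writeln(hex.toSourceCode("isHexDigit"));
}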
Selected members of InversionList:
This opBinary(string op, U)(U rhs) if (isCodepointSet!U || is(U : dchar)) - Sets support natural syntax for set algebra.
ref This opOpAssign(string op, U)(U rhs) if (isCodepointSet!U || is(U : dchar)) - The 'op=' versions of the above overloaded operators.
bool opBinaryRight(string op : "in", U)(U ch) if (is(U : dchar)) const - Tests the presence of code point ch in this set, the same as opIndex.
auto opUnary(string op : "!")() - Obtains a set that is the inversion of this set.
@property auto byCodepoint() - A range that spans each code point in this set.
void toString(Writer)(scope Writer sink, scope const ref FormatSpec!char fmt) - Obtains a textual representation of this InversionList in the form of open-right intervals.
ref add()(uint a, uint b) - Adds an interval [a, b) to this set.
@property auto inverted() - Obtains a set that is the inversion of this set.
string toSourceCode(string funcName = "") - Generates a string with the D source code of a unary function with the name funcName, taking a single dchar argument. If funcName is empty, the code is adjusted to be a lambda function.
this(Set set) - Constructs from another code point set of any type.
this(uint[] intervals...) - Constructs a set from plain values of code point intervals.
Selected members of TrieBuilder:
void putRange(Key a, Key b, Value v) - Puts a value v into the interval mapped by keys from a to b. All slots prior to a are filled with the default filler.
void putValue(Key key, Value v) - Puts a value v into the slot mapped by key. All slots prior to key are filled with the default filler.
auto build() - Finishes construction of the Trie, yielding an immutable Trie instance.
Trie: A generic Trie data structure for a fixed number of stages.
The design goal is optimal speed with smallest footprint size.
It's intentionally read-only and doesn't provide constructors.
To construct one use a special builder, see TrieBuilder and buildTrie.
MatcherConcept: A conceptual type that outlines the common properties of all UTF matchers.
Note
Any attempt to call its methods results in an assertion failure; MatcherConcept only documents the interface. Use utfMatcher to obtain a concrete matcher for UTF-8 or UTF-16 encodings.
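A minimal sketch of a concrete matcher in action (the sample string is arbitrary):
import std.uni;
void main()
{
// a matcher for UTF-8 encoded letters (General Category L)
auto letters = utfMatcher!char(unicode.L);
string s = "ä1";
// match classifies the code point at the front and advances s on success
assert(letters.match(s));
assert(s == "1"); // both code units of 'ä' were consumed
assert(!letters.match(s)); // '1' is not a letter; s is unchanged
}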
bool match(Range)(ref Range inp) if (isRandomAccessRange!Range && is(ElementType!Range : char)) - Performs the semantic equivalent of 2 operations: decoding a code point at the front of inp and testing if it belongs to the set of code points of this matcher.
@property auto subMatcher(Lengths...)() - Advanced feature: provides direct access to a subset of the matcher based on a set of known encoding lengths. Lengths are provided in code units. The sub-matcher may then do fewer operations per test.
BitPacked: An opaque wrapper around unsigned built-in integers and
code unit (char/wchar/dchar) types. The parameter sz indicates that the value is confined to the range [0, 2^^sz). With this knowledge it can be packed more tightly when stored in certain data structures like a trie.
Note
The BitPacked!(T, sz) is implicitly convertible to T
but not vice versa. Users have to ensure the value fits in the required range and use the cast operator to perform the conversion.
Functions
auto utfMatcher(Char, Set)(Set set) if (isCodepointSet!Set) - Constructs a matcher object to classify code points from the set, for an encoding that has Char as its code unit.
auto toTrie(size_t level, Set)(Set set) if (isCodepointSet!Set) - Convenience function to construct optimal configurations for a packed Trie from any set of code points.
auto toDelegate(Set)(Set set) if (isCodepointSet!Set) - Builds a Trie with a typically optimal speed-size trade-off and wraps it into a delegate of the type bool delegate(dchar ch).
int comparePropertyName(Char1, Char2)(const(Char1)[] a, const(Char2)[] b) if (is(Char1 : dchar) && is(Char2 : dchar)) @safe pure
bool propertyNameLess(Char1, Char2)(const(Char1)[] a, const(Char2)[] b) if (is(Char1 : dchar) && is(Char2 : dchar)) @safe pure
ubyte[] compressIntervals(Range)(Range intervals) if (isInputRange!Range && isIntegralPair!(ElementType!Range))
Variables
lineSep = '\u2028' - line separator
paraSep = '\u2029' - paragraph separator
nelSep = '\u0085' - next line (NEL)
lastDchar = 0x10FFFF - the last valid Unicode code point
Templates
isCodepointSet: Tests if T is some kind of set of code points. Intended for template constraints.
isIntegralPair: Tests if T is a pair of integers that implicitly convert to V. The following code must compile for any pair T:
(T x){ V a = x[0]; V b = x[1]; }
The following must not compile:
(T x){ V c = x[2]; }
mapTrieIndex: Maps Key to a suitable integer index within the range of size_t. The mapping is constructed by applying predicates from Prefix left to right and concatenating the resulting bits.
The first (leftmost) predicate defines the most significant bits of the resulting index.
codepointSetTrie (constraint: if (sumOfIntegerTuple!sizes == 21)): A shorthand for creating a custom multi-level fixed Trie from a CodepointSet. sizes are the numbers of bits per level, with the most significant bits used first.
Note: The sum of sizes must equal 21.
Example:
import std.stdio, std.uni;
void main()
{
    auto set = unicode("Number");
    auto trie = codepointSetTrie!(8, 5, 8)(set);
    writeln("Input code points to test:");
    foreach (line; stdin.byLine)
    {
        int count = 0;
        foreach (dchar ch; line)
            if (trie[ch]) // is a number code point
                count++;
        writefln("Contains %d number code points.", count);
    }
}
CodepointSetTrie (constraint: if (sumOfIntegerTuple!sizes == 21)): The type of Trie generated by the codepointSetTrie function.
codepointTrie (constraint: if (sumOfIntegerTuple!sizes == 21)): A slightly more general tool for building a fixed Trie for Unicode data.
Specifically, unlike codepointSetTrie, it allows creating mappings of dchar to an arbitrary type T.
Note
CodepointSets will naturally convert only to bool-mapping Tries.
CodepointTrie: The type of Trie as generated by the codepointTrie function.
buildTrie: The most general utility for constructing Tries, short of using TrieBuilder directly.
Provides a number of convenience overloads. Args is a tuple of the maximum key value followed by predicates to construct an index from a key.
Alternatively, if the first argument is not a value convertible to Key, then the whole tuple of Args is treated as predicates and the maximum Key is deduced from the predicates.
isBitPacked: Tests if T is some instantiation of BitPacked!(U, x) and thus suitable for packing.
TypeOfBitPacked: Gives the type U from BitPacked!(U, x), or T itself for every other type.