- A _lexer for D is available here.
- A _lexer for Lua is available here.
- A _lexer for JSON is available here.
$(DDOC_ANCHOR TemplateParameters) Template Parameter Definitions
$(DT $(DDOC_ANCHOR defaultTokenFunction) defaultTokenFunction)
$(DD A function that serves as the default token lexing function. For most languages this will be the identifier lexing function.)
$(DT $(DDOC_ANCHOR tokenSeparatingFunction) tokenSeparatingFunction)
$(DD A function that determines whether an identifier or keyword has come to an end. This function must return bool and take a single size_t argument representing the number of bytes to skip over before looking for a separating character.)
$(DT $(DDOC_ANCHOR staticTokens) staticTokens)
$(DD A listing of the tokens whose exact value never changes and which cannot possibly be a token handled by the default token lexing function. The most common example of this kind of token is an operator such as $(D_STRING "*") or $(D_STRING "-") in a programming language.)
$(DT $(DDOC_ANCHOR dynamicTokens) dynamicTokens)
$(DD A listing of tokens whose value is variable, such as whitespace, identifiers, number literals, and string literals.)
$(DT $(DDOC_ANCHOR possibleDefaultTokens) possibleDefaultTokens)
$(DD A listing of tokens that could possibly be one of the tokens handled by the default token handling function. A common example of this is a keyword such as $(D_STRING "for"), which looks like the beginning of the identifier $(D_STRING "fortunate"). tokenSeparatingFunction is called to determine whether the character after the $(D_STRING 'r') separates the identifier, indicating that the token is $(D_STRING "for"), or whether lexing should be turned over to the defaultTokenFunction.)
$(DT $(DDOC_ANCHOR tokenHandlers) tokenHandlers)
$(DD A mapping of prefixes to custom token handling function names. The generated _lexer will search for the even-index elements of this array, and then call the function whose name is the element immediately after the even-indexed element. This is used for lexing complex tokens whose prefix is fixed.)
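The interplay between possibleDefaultTokens and tokenSeparatingFunction can be illustrated with a minimal sketch. Both function names below are hypothetical and are not part of this module. A real tokenSeparatingFunction takes only the size_t offset and reads from the lexer's own input; the extra `input` parameter is an assumption made here so that the example stands alone.

```d
// Hypothetical helper: reports whether the byte at `offset` ends an
// identifier or keyword. The `input` parameter is an assumption of this
// sketch; the real function only receives the offset.
bool isSeparating(const(char)[] input, size_t offset)
{
    // Running off the end of the input always terminates the token.
    if (offset >= input.length)
        return true;
    immutable char c = input[offset];
    immutable bool identifierChar =
        (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9') || c == '_';
    return !identifierChar;
}

// Hypothetical helper: true if `input` begins with `keyword` as a complete
// token, false if it is absent or is only the start of a longer identifier
// that should be handed to the default token function.
bool startsWithKeyword(const(char)[] input, string keyword)
{
    if (input.length < keyword.length)
        return false;
    if (input[0 .. keyword.length] != keyword)
        return false;
    // Skip over the keyword's bytes, then look for a separating character.
    return isSeparating(input, keyword.length);
}
```

Under these assumptions, `startsWithKeyword("for (;;)", "for")` is true because a space follows the $(D_STRING 'r'), while `startsWithKeyword("fortunate", "for")` is false, so lexing would fall through to the default token function.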
Here are some example constants for a simple calculator _lexer:
// There are a near infinite number of valid number literals, so numbers are
// dynamic tokens.
enum string[] dynamicTokens = ["numberLiteral", "whitespace"];
// The operators are always the same and cannot start a numberLiteral, so
// they are staticTokens.
enum string[] staticTokens = ["-", "+", "*", "/"];
// In this simple example there are no keywords or other tokens that could
// look like dynamic tokens, so this is blank.
enum string[] possibleDefaultTokens = [];
// If any whitespace character or digit is encountered, pass lexing over to
// our custom handler functions. These will be demonstrated in an example
// later on.
enum string[] tokenHandlers = [
    "0", "lexNumber",
    "1", "lexNumber",
    "2", "lexNumber",
    "3", "lexNumber",
    "4", "lexNumber",
    "5", "lexNumber",
    "6", "lexNumber",
    "7", "lexNumber",
    "8", "lexNumber",
    "9", "lexNumber",
    " ", "lexWhitespace",
    "\n", "lexWhitespace",
    "\t", "lexWhitespace",
    "\r", "lexWhitespace"
];
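To make the even/odd layout of tokenHandlers concrete, the following sketch reads such an array at run time and returns the handler name for a given input. `findHandler` is a hypothetical function written for illustration only. The generated _lexer resolves handlers at compile time and calls them directly, and choosing the longest matching prefix is an assumption of this sketch, not documented behaviour.

```d
// Hypothetical lookup, for illustration only: returns the handler name
// registered for the longest prefix that `input` starts with, or null if
// no prefix matches. `handlers` uses the same layout as tokenHandlers:
// prefixes at even indexes, handler names at the following odd indexes.
string findHandler(const(string)[] handlers, const(char)[] input)
{
    assert(handlers.length % 2 == 0, "handlers must hold prefix/name pairs");
    string best = null;
    size_t bestLength = 0;
    for (size_t i = 0; i < handlers.length; i += 2)
    {
        string prefix = handlers[i];
        if (prefix.length > bestLength
            && input.length >= prefix.length
            && input[0 .. prefix.length] == prefix)
        {
            // The handler name sits immediately after its prefix.
            best = handlers[i + 1];
            bestLength = prefix.length;
        }
    }
    return best;
}
```

With the calculator's tokenHandlers, an input beginning with $(D_STRING "7") would map to lexNumber and one beginning with a tab to lexWhitespace, while an operator such as $(D_STRING "+") matches no prefix and is left to the static token logic.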