License
BSD-3-Clause
CF Lexer - Tokenizer for CF Configuration Files
This module provides a lexer that converts CF source text into a stream of tokens. It implements the input range interface for convenient iteration and preserves comments as tokens for roundtrip-preserving document model support.
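As a rough illustration of what "implements the input range interface" means in D, the sketch below defines a toy range with the three primitives `empty`, `front`, and `popFront`. The `WordRange` type is hypothetical and is not part of this module; it only splits on ASCII whitespace, whereas the real Lexer produces Token values.

```d
import std.ascii : isWhite;
import std.range.primitives : isInputRange;

/// Hypothetical toy tokenizer (not the CF Lexer): yields
/// whitespace-separated words through the input range primitives.
struct WordRange
{
    private string src;     // input not yet consumed
    private string current; // word currently exposed by front
    private bool done;      // true once the input is exhausted

    this(string input) @safe pure nothrow
    {
        src = input;
        popFront(); // prime the first word
    }

    @property bool empty() @safe pure nothrow { return done; }

    @property string front() @safe pure nothrow { return current; }

    void popFront() @safe pure nothrow
    {
        size_t i = 0;
        while (i < src.length && isWhite(src[i]))
            ++i; // skip separators
        if (i == src.length)
        {
            done = true;
            current = null;
            src = null;
            return;
        }
        immutable start = i;
        while (i < src.length && !isWhite(src[i]))
            ++i; // consume one word
        current = src[start .. i];
        src = src[i .. $];
    }
}

// Compile-time check that the three primitives are in place.
static assert(isInputRange!WordRange);

void main()
{
    import std.stdio : writeln;

    // Any input range works directly with foreach.
    foreach (word; WordRange(`{ key = "value" }`))
        writeln(word);
}
```

Because the type satisfies `isInputRange`, it can also be passed to generic range algorithms, which is presumably the convenience the module description refers to.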
The lexer handles:
Lexer for CF source text.
Converts CF source into a stream of tokens, implementing the input range interface. Comments are preserved as tokens to support the roundtrip-preserving document model.
Example:
auto lexer = Lexer(`{ key = "value" }`);
foreach (token; lexer) {
    writeln(token);
}

string source
string filename
size_t pos
size_t line
size_t col
Token currentToken
bool initialized
bool eofReturned

void skipWhitespaceExceptNewline() @safe pure
Skips whitespace characters except newlines. Newlines are handled separately because they can be significant separators.

void lexString() @safe pure
Lexes a string literal (double or single quoted, including triple-quoted). Validates escape sequences and rejects unescaped control characters.

bool isControlChar(char c) @safe pure nothrow
Checks if a character is a control character (U+0000 to U+001F, except tab U+0009).

void lexDecimalNumber(size_t startPos, Location startLoc) @safe pure
Lexes a decimal number (integer or float).

void lexEnvVar() @safe pure
Lexes an environment variable reference: ${VAR}, ${VAR:-default}, ${VAR:?message}.

bool looksLikeTemporalStart() @safe pure nothrow
Checks if the current position looks like the start of a temporal literal. Pattern: YYYY-MM-DD (4 digits, hyphen, 2 digits, hyphen, 2 digits).

bool looksLikeTimeStart() @safe pure nothrow
Checks if the current position looks like the start of a time literal.

void lexTime() @safe pure
Lexes a standalone time literal (HH:MM:SS with optional fractional seconds).

TokenType classifyKeyword(string text) @safe pure nothrow
Classifies a keyword or returns IDENTIFIER.

Token makeToken(TokenType type, string value) @safe pure
Creates a token with the current location.

bool isIdStart(char c) @safe pure nothrow
Checks if a character is a valid identifier start. Per UAX #31: XID_Start or underscore. Uses std.uni.isAlpha for Unicode support.

bool isIdStartAt(size_t idx) @safe pure
Checks if the UTF-8 character at position idx is a valid identifier start. Decodes multi-byte UTF-8 sequences for proper Unicode support.

bool isIdContinue(char c) @safe pure nothrow
Checks if a character is a valid identifier continuation. Per UAX #31: XID_Continue or hyphen (CF-specific extension). Uses std.uni.isAlphaNum for Unicode support.

bool isIdContinueAt(size_t idx) @safe pure
Checks if the UTF-8 character at position idx is a valid identifier continuation. Decodes multi-byte UTF-8 sequences for proper Unicode support.

void consumeUtf8Char() @safe pure
Consumes a UTF-8 character (potentially multi-byte) and updates position/column.

this(string input, string sourceFilename = "")
Constructs a lexer for the given CF source text.

Lexer tokenize(string input, string filename = "") @safe pure
Creates a lexer for the given CF source text.

Token[] tokenizeAll(string input, string filename = "") @safe pure
Tokenizes the input and eagerly collects all tokens into an array.
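The character checks documented above are simple enough to sketch. The following is a minimal illustration written only from the descriptions, not the module's actual code. `isControlChar` follows the documented range (U+0000 to U+001F, tab excluded). `looksLikeDateAt` is a hypothetical free-function variant of `looksLikeTemporalStart` that takes the source and position as parameters, since the real member reads them from the lexer's own state.

```d
import std.ascii : isDigit;

/// Control characters U+0000..U+001F, with tab (U+0009) excluded,
/// as stated in the isControlChar description.
bool isControlChar(char c) @safe pure nothrow @nogc
{
    return c < 0x20 && c != '\t';
}

/// Hypothetical helper: true if src[pos .. $] starts with the
/// YYYY-MM-DD shape (4 digits, hyphen, 2 digits, hyphen, 2 digits).
/// Only the shape is checked, not whether the date is a valid one.
bool looksLikeDateAt(string src, size_t pos) @safe pure nothrow @nogc
{
    if (pos > src.length || src.length - pos < 10)
        return false; // not enough characters left for the pattern

    foreach (i, ch; src[pos .. pos + 10])
    {
        immutable wantHyphen = (i == 4 || i == 7);
        if (wantHyphen ? ch != '-' : !isDigit(ch))
            return false;
    }
    return true;
}

void main()
{
    assert(isControlChar('\n'));  // newline is in the control range
    assert(!isControlChar('\t')); // tab is explicitly allowed
    assert(!isControlChar(' '));

    assert(looksLikeDateAt("at = 2024-01-15", 5));
    assert(!looksLikeDateAt("2024-1-15", 0));  // month has one digit
    assert(!looksLikeDateAt("20240115xx", 0)); // hyphens missing
}
```

A shape-only lookahead like this lets the lexer decide between a number and a temporal literal before committing to either path; validation of the actual field values can then happen while lexing the literal.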