ddn.compressor.lz4
LZ4 compression provider for ddn.api.compressor.
This module provides a complete LZ4 compression and decompression implementation conforming to the LZ4 frame format specification. It includes:
- Streaming Compressor and Decompressor classes implementing the ddn.api.compressor interface.
- Hash-based match finding with configurable compression levels (0-9).
- Support for both greedy and lazy (HC) matching strategies.
- Full LZ4 frame format support including magic numbers, frame descriptors, block headers, and checksums.
- Interoperability with the standard lz4 command-line tool.
Compression levels control the trade-off between speed and compression ratio:
- Level 0: Store-only mode (no compression, fastest).
- Levels 1-3: Fast compression with smaller hash tables (4KB-16KB).
- Levels 4-6: Default compression with 64KB hash table.
- Levels 7-9: High compression (HC mode) with lazy matching for better compression ratios at the cost of speed.
The default compression level is 3, matching the behaviour of the standard lz4 tool.
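The level bands listed above can be sketched as a small classification function. This is an illustration only: the `Strategy` enum and `strategyForLevel` are hypothetical names that do not come from the module, and clamping out-of-range levels into 0..9 is an assumption about how invalid values might be handled.

```d
/// Hypothetical strategy names; the module's real types may differ.
enum Strategy { store, fast, standard, highCompression }

/// Map a numeric level to the band described in the level list.
Strategy strategyForLevel(int level) @safe pure nothrow @nogc
{
    // Assumption: out-of-range levels are clamped rather than rejected.
    if (level < 0) level = 0;
    if (level > 9) level = 9;

    if (level == 0) return Strategy.store;     // store-only, no compression
    if (level <= 3) return Strategy.fast;      // smaller hash tables
    if (level <= 6) return Strategy.standard;  // 64KB hash table
    return Strategy.highCompression;           // lazy (HC) matching
}

unittest
{
    assert(strategyForLevel(0) == Strategy.store);
    assert(strategyForLevel(3) == Strategy.fast);
    assert(strategyForLevel(4) == Strategy.standard);
    assert(strategyForLevel(9) == Strategy.highCompression);
}
```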
Module Initializers 1
()
Types 4
Hash table used by the match finder.
The table stores the last seen position (byte index) for a given hash value. Unused entries contain 0xFFFF_FFFF.
uint[] table
uint _hashBits

Represents a single LZ4 sequence (literals followed by an optional match).
In the LZ4 block format, data is encoded as a series of sequences. Each sequence consists of:
- A run of literal bytes (may be empty for non-first sequences)
- An optional back-reference match (offset + length)
The final sequence in a block has no match (matchLen == 0).
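To make the sequence layout concrete, the sketch below shows how the token byte and the extra length bytes of one sequence are formed in the LZ4 block format. It is a sketch under assumptions: `SequenceSketch`, `tokenFor` and `appendExtraLength` are hypothetical names, and it assumes `matchLen` holds the full match length (so 4, the format's minimum match length, is subtracted before encoding).

```d
/// Hypothetical stand-in for the module's sequence type.
struct SequenceSketch
{
    size_t literalStart, literalLen, matchOffset, matchLen;
}

/// Build the token: high nibble = literal length, low nibble = match length - 4,
/// each saturating at 15.
ubyte tokenFor(in SequenceSketch s) @safe pure nothrow @nogc
{
    enum size_t minMatch = 4; // minimum match length in the LZ4 block format
    immutable size_t lit = s.literalLen < 15 ? s.literalLen : 15;
    size_t m = 0;
    if (s.matchLen >= minMatch)
    {
        immutable size_t extra = s.matchLen - minMatch;
        m = extra < 15 ? extra : 15;
    }
    return cast(ubyte)((lit << 4) | m);
}

/// Append the extension bytes for a length whose nibble saturated at 15:
/// a run of 255s followed by one final byte below 255.
void appendExtraLength(ref ubyte[] outBuf, size_t length) @safe pure nothrow
{
    assert(length >= 15);
    size_t rest = length - 15;
    while (rest >= 255)
    {
        outBuf ~= cast(ubyte) 255;
        rest -= 255;
    }
    outBuf ~= cast(ubyte) rest;
}

unittest
{
    // 3 literals followed by an 8-byte match: nibbles 3 and 8 - 4 = 4.
    assert(tokenFor(SequenceSketch(0, 3, 10, 8)) == 0x34);
    // Final sequence: 20 literals, no match.
    assert(tokenFor(SequenceSketch(0, 20, 0, 0)) == 0xF0);

    ubyte[] buf;
    appendExtraLength(buf, 20); // 20 - 15 = 5
    assert(buf == [5]);
}
```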
size_t literalStart
size_t literalLen
size_t matchOffset
size_t matchLen

Streaming LZ4 compressor implementing the Compressor interface.
The compressor currently emits a single LZ4 frame per stream using independent blocks, no block checksums, and a content checksum. The frame layout is compatible with the lz4 CLI tool.
private CompressionOptions _opts
private OutputSink _sink
private bool _finished
private ulong _bytesIn
private ulong _bytesOut
private bool _dictSet
private const(ubyte)[] _dictionary
private ubyte[] _inBuffer
int effectiveLevel() const @safe pure nothrow @nogc
Compute an effective numeric level in range 0..9 from options.
size_t targetChunkSize(int lvl) const @safe pure nothrow @nogc
Decide a target uncompressed chunk size based on level (smaller at low levels, larger at high levels).
void setOutputSink(OutputSink sink)
Set the output sink that will receive compressed bytes.
void setProgressCallback(ProgressCallback callback)
Set an optional progress callback.
void write(const(ubyte)[] data)
Feed more uncompressed data into the compressor.
void flush(FlushMode mode = FlushMode.SYNC)
Flush any buffered output according to the specified mode.
void finish()
Finalise the compression stream.
void reset()
Reset the compressor to its initial state.
bool setDictionary(const(ubyte)[] dict)
Set or update the compression dictionary.
bool isFinished() @property const
Returns true if finish() has been called and the stream is closed for further writes.
this(CompressionOptions opts)
Construct a new LZ4 compressor with the given options.

Streaming LZ4 decompressor implementing the Decompressor interface.
The implementation supports two input forms:
- A single raw LZ4 block as produced by Lz4Compressor (no frame header), decoded via lz4DecompressBlock.
- A standard LZ4 frame stream as produced by the lz4 CLI tool, including frame header, independent blocks and optional content checksum, decoded block-by-block.
private DecompressionOptions _opts
private OutputSink _sink
private bool _finished
private ulong _bytesIn
private ulong _bytesOut
private bool _dictSet
private const(ubyte)[] _dictionary
private ubyte[] _inBuffer
DecompressionOptions options() @property const
Return the options this decompressor was created with.
void setOutputSink(OutputSink sink)
Set the output sink that will receive decompressed bytes.
void setProgressCallback(ProgressCallback callback)
Set an optional progress callback.
void write(const(ubyte)[] data)
Feed more compressed data into the decompressor.
void finish()
Signal end-of-input to the decompressor.
void reset()
Reset the decompressor to its initial state.
bool setDictionary(const(ubyte)[] dict)
Set or update the decompression dictionary.
bool isFinished() @property const
Returns true if finish() has been called and the stream is closed for further writes.
void decompressSingleBlock()
Decompress a raw single LZ4 block stored in `_inBuffer` and emit it to the configured output sink.
this(DecompressionOptions opts)
Construct a new LZ4 decompressor with the given options.
Functions 28
uint hashBitsForLevel(int level)
Compute the number of hash bits to use for a given compression level.
uint hashTableSize(uint hashBits)
Compute the hash-table size from the number of hash bits.
uint readLE32p(const(ubyte) * p) nothrow @nogc
Read a 32-bit little-endian word from an unaligned pointer.
uint lz4HashIndex(const(ubyte) * p, uint hashBits) nothrow @nogc
Compute the LZ4 hash index for the 4-byte sequence at `p`.
size_t countMatch(const(ubyte)[] src, size_t pos1, size_t pos2, size_t maxLen)
Count the number of matching bytes starting at two positions.
size_t countMatch(const(ubyte) * srcPtr, size_t srcLen, size_t pos1, size_t pos2, size_t maxLen) nothrow @nogc
Pointer-based variant of `countMatch` for use from `@system`/`@trusted` contexts where the source pointer and length are already available.
size_t countMatchWithDict(const(ubyte)[] dict, const(ubyte)[] src, size_t matchPos, size_t srcPos, size_t maxLen)
Count matching bytes when the match candidate may be in the dictionary.
uint computeDictId(const(ubyte)[] dict)
Compute the XXH32 hash of dictionary data to produce a dictionary ID.
Lz4Sequence[] findGreedySequences(const(ubyte)[] src, uint hashBits)
Find sequences using a greedy match-finding algorithm.
Lz4Sequence[] findHCSequences(const(ubyte)[] src, uint hashBits)
Find sequences using a lazy matching algorithm (HC mode).
Lz4Sequence[] findGreedySequencesWithDict(const(ubyte)[] src, uint hashBits, const(ubyte)[] dict)
Find sequences using a greedy match-finding algorithm with dictionary support.
Lz4Sequence[] findHCSequencesWithDict(const(ubyte)[] src, uint hashBits, const(ubyte)[] dict)
Find sequences using a lazy matching algorithm (HC mode) with dictionary support.
void encodeLength(ref ubyte * p, size_t length) nothrow @nogc
Encode a length value using LZ4's variable-length encoding scheme.
size_t encodeSequences(const(Lz4Sequence)[] sequences, const(ubyte)[] src, ubyte[] buf)
Encode a sequence of LZ4 sequences into the LZ4 block format.
size_t lz4MaxCompressedSize(size_t srcSize)
Compute an upper bound on the compressed size of a single LZ4 block.
size_t lz4CompressBlockGreedyCore(const(ubyte) * srcPtr, size_t srcLen, ubyte * outPtr, size_t maxOut, uint * hashTblPtr, uint hashBits) nothrow @trusted @nogc
Core greedy compression loop operating entirely on raw pointers.
ubyte[] lz4CompressBlockGreedy(const(ubyte)[] src, uint hashBits)
Compress a block using greedy matching in a single pass.
ubyte[] lz4CompressBlock(const(ubyte)[] src, int level = 1)
Compress a single block of data using the LZ4 block format.
size_t lz4DecompressBlock(const(ubyte)[] src, ubyte[] dst)
Decompress a single LZ4 block into a caller-provided buffer.
ubyte[] lz4CompressBlockWithDict(const(ubyte)[] src, int level, const(ubyte)[] dict)
Compress a single block of data using the LZ4 block format with dictionary.
size_t lz4DecompressBlockWithDict(const(ubyte)[] src, ubyte[] dst, const(ubyte)[] dict)
Decompress a single LZ4 block with dictionary support.
uint readLE32(const(ubyte)[] buf, size_t offset)
Read a 32-bit little-endian unsigned integer from `buf` at `offset`.
uint xxh32(const(ubyte)[] input, uint seed = 0)
Compute the XXH32 hash of `input` with the given `seed`.
ubyte lz4HeaderChecksum(const(ubyte)[] header)
Compute the single-byte LZ4 frame header checksum (HC).
Compressor makeLz4Compressor(CompressionOptions opts)
Create a new LZ4 compressor instance using the provided options.
Decompressor makeLz4Decompressor(DecompressionOptions opts)
Create a new LZ4 decompressor instance using the provided options.
Variables 11
size_t LZ4_MIN_MATCH
Minimum match length in LZ4 format.
size_t LZ4_END_LITERALS
Number of bytes that must remain as literals at the end of a block. The LZ4 format requires the last 5 bytes to always be literals for safe decoding (provides a safety margin for the decoder).
size_t LZ4_MAX_INPUT_SIZE
Maximum input size supported by the simple block helpers.
LZ4_HASH_MULT = 0x9E37_79B1u
Multiplicative constant used by LZ4 for hashing 4-byte sequences.
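As a rough illustration of how this multiplier is typically used, the sketch below hashes the 4 bytes at a position into a table index and keeps the last seen position per index, with 0xFFFF_FFFF marking unused slots as described for the hash table type. All names here (`hashIndexSketch`, `TableSketch`, `lookupAndStore`) are hypothetical, and the multiply-then-shift form is the conventional LZ4 hash, assumed rather than confirmed to match this module's `lz4HashIndex`.

```d
enum uint hashMult = 0x9E37_79B1u; // same value as LZ4_HASH_MULT

/// Read 4 bytes at `offset` as a little-endian 32-bit word.
uint le32(const(ubyte)[] buf, size_t offset) @safe pure nothrow @nogc
{
    return cast(uint) buf[offset]
        | (cast(uint) buf[offset + 1] << 8)
        | (cast(uint) buf[offset + 2] << 16)
        | (cast(uint) buf[offset + 3] << 24);
}

/// Multiplicative hash: keep the top `hashBits` bits of the 32-bit product.
uint hashIndexSketch(const(ubyte)[] src, size_t pos, uint hashBits) @safe pure nothrow @nogc
{
    assert(hashBits >= 1 && hashBits <= 32);
    return (le32(src, pos) * hashMult) >> (32 - hashBits);
}

/// Hypothetical table keeping the last position seen for each hash value.
struct TableSketch
{
    uint[] table;
    uint hashBits;

    this(uint bits) @safe pure nothrow
    {
        hashBits = bits;
        table = new uint[](size_t(1) << bits);
        table[] = uint.max; // 0xFFFF_FFFF marks an unused entry
    }

    /// Return the previous position for the bytes at `pos`, then record `pos`.
    uint lookupAndStore(const(ubyte)[] src, size_t pos) @safe pure nothrow @nogc
    {
        immutable idx = hashIndexSketch(src, pos, hashBits);
        immutable prev = table[idx];
        table[idx] = cast(uint) pos;
        return prev;
    }
}

unittest
{
    const(ubyte)[] data = cast(const(ubyte)[]) "abcdabcd";
    auto t = TableSketch(12);
    assert(t.lookupAndStore(data, 0) == uint.max); // first sighting of "abcd"
    assert(t.lookupAndStore(data, 4) == 0);        // candidate match at position 0
}
```

A candidate returned by the table is only a hint: different 4-byte sequences can share an index, so the bytes still have to be compared before a match is emitted.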
ubyte[4] LZ4_FRAME_MAGIC
LZ4 frame magic number (little-endian 0x184D2204).
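Because the magic number is stored little-endian, the bytes on the wire are 04 22 4D 18. A minimal sketch of a frame check follows; `frameMagic` and `startsWithFrameMagic` are hypothetical names, and whether the decompressor distinguishes framed input from a raw block this way is an assumption.

```d
/// 0x184D2204 written in little-endian byte order.
immutable ubyte[4] frameMagic = [0x04, 0x22, 0x4D, 0x18];

/// True when `data` begins with the LZ4 frame magic number.
bool startsWithFrameMagic(const(ubyte)[] data) @safe pure nothrow @nogc
{
    return data.length >= 4 && data[0 .. 4] == frameMagic[];
}

unittest
{
    const(ubyte)[] framed = [0x04, 0x22, 0x4D, 0x18, 0x64];
    const(ubyte)[] reversed = [0x18, 0x4D, 0x22, 0x04];
    assert(startsWithFrameMagic(framed));
    assert(!startsWithFrameMagic(reversed)); // big-endian order does not match
}
```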
LZ4_FRAME_MAX_BLOCK_SIZE = 4 * 1024 * 1024
Maximum uncompressed block size advertised in the LZ4 frame header (4 MiB, corresponding to block size code 7 in the BD field).
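The relationship between the block size code and the byte count can be sketched as below. In the LZ4 frame format the valid codes are 4 to 7 (64 KiB, 256 KiB, 1 MiB, 4 MiB) and the code occupies bits 6-4 of the BD byte. The function names are hypothetical and not part of the module.

```d
/// Maximum block size in bytes for a BD block size code (4..7).
size_t maxBlockSizeForCode(uint code) @safe pure nothrow @nogc
{
    assert(code >= 4 && code <= 7);
    return size_t(1) << (8 + 2 * code); // each step multiplies the size by 4
}

/// BD byte carrying the given block size code; all other bits are reserved (zero).
ubyte bdByteForCode(uint code) @safe pure nothrow @nogc
{
    assert(code >= 4 && code <= 7);
    return cast(ubyte)(code << 4);
}

unittest
{
    assert(maxBlockSizeForCode(7) == 4 * 1024 * 1024);
    assert(bdByteForCode(7) == 0x70);
}
```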
XXH32_PRIME1 = 0x9E37_79B1u
XXH32 prime constants used by the checksum implementation.
XXH32_PRIME2 = 0x85EB_CA77u
XXH32_PRIME3 = 0xC2B2_AE3Du
XXH32_PRIME4 = 0x27D4_EB2Fu
XXH32_PRIME5 = 0x1656_67B1u
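To show where these primes enter the computation, here is a compact sketch of XXH32 and of the frame header checksum, which the LZ4 frame format defines as the second byte of the XXH32 (seed 0) of the frame descriptor. It follows the published XXH32 algorithm, but `xxh32Sketch`, `headerChecksumSketch` and the helper names are hypothetical and this is not the module's own implementation.

```d
enum uint P1 = 0x9E37_79B1u, P2 = 0x85EB_CA77u, P3 = 0xC2B2_AE3Du,
          P4 = 0x27D4_EB2Fu, P5 = 0x1656_67B1u;

uint rotl(uint x, uint r) @safe pure nothrow @nogc
{
    return (x << r) | (x >> (32 - r));
}

uint le32At(const(ubyte)[] b, size_t i) @safe pure nothrow @nogc
{
    return cast(uint) b[i] | (cast(uint) b[i + 1] << 8)
        | (cast(uint) b[i + 2] << 16) | (cast(uint) b[i + 3] << 24);
}

/// One accumulator step of the 16-byte stripe loop.
uint xxhRound(uint acc, uint lane) @safe pure nothrow @nogc
{
    acc += lane * P2;
    return rotl(acc, 13) * P1;
}

uint xxh32Sketch(const(ubyte)[] input, uint seed = 0) @safe pure nothrow @nogc
{
    size_t i = 0;
    uint h;
    if (input.length >= 16)
    {
        // Four parallel accumulators consume 16 bytes per iteration.
        uint v1 = seed + P1 + P2, v2 = seed + P2, v3 = seed, v4 = seed - P1;
        for (; i + 16 <= input.length; i += 16)
        {
            v1 = xxhRound(v1, le32At(input, i));
            v2 = xxhRound(v2, le32At(input, i + 4));
            v3 = xxhRound(v3, le32At(input, i + 8));
            v4 = xxhRound(v4, le32At(input, i + 12));
        }
        h = rotl(v1, 1) + rotl(v2, 7) + rotl(v3, 12) + rotl(v4, 18);
    }
    else
    {
        h = seed + P5;
    }
    h += cast(uint) input.length;

    // Remaining 4-byte words, then remaining single bytes.
    for (; i + 4 <= input.length; i += 4)
        h = rotl(h + le32At(input, i) * P3, 17) * P4;
    for (; i < input.length; i++)
        h = rotl(h + input[i] * P5, 11) * P1;

    // Final avalanche.
    h ^= h >> 15; h *= P2;
    h ^= h >> 13; h *= P3;
    h ^= h >> 16;
    return h;
}

/// Header checksum byte: bits 8..15 of the XXH32 of the frame descriptor.
ubyte headerChecksumSketch(const(ubyte)[] descriptor) @safe pure nothrow @nogc
{
    return cast(ubyte)((xxh32Sketch(descriptor) >> 8) & 0xFF);
}

unittest
{
    assert(xxh32Sketch([]) == 0x02CC_5D05); // XXH32 of empty input, seed 0
    const(ubyte)[] descriptor = [0x64, 0x40]; // FLG, BD with 64 KiB blocks
    assert(headerChecksumSketch(descriptor) == 0xA7);
}
```

The descriptor passed to the checksum covers the FLG and BD bytes plus any optional content size and dictionary ID fields; the magic number is not included.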