ddn.compressor.lz4

LZ4 compression provider for ddn.api.compressor.

This module provides a complete LZ4 compression and decompression implementation conforming to the LZ4 frame format specification. It includes:

  • Streaming Compressor and Decompressor classes implementing the ddn.api.compressor interface.

  • Hash-based match finding with configurable compression levels (0-9).
  • Support for both greedy and lazy (HC) matching strategies.
  • Full LZ4 frame format support including magic numbers, frame descriptors, block headers, and checksums.

  • Interoperability with the standard lz4 command-line tool.

Compression levels control the trade-off between speed and compression ratio:

  • Level 0: Store-only mode (no compression, fastest).
  • Levels 1-3: Fast compression with smaller hash tables (4KB-16KB).
  • Levels 4-6: Default compression with 64KB hash table.
  • Levels 7-9: High compression (HC mode) with lazy matching for better compression ratios at the cost of speed.

The default compression level is 3. (The standard lz4 command-line tool defaults to level 1.)
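
As a rough illustration of the level bands listed above, the sketch below maps a numeric level to a matching strategy and a hash-table width. The names `Strategy`, `LevelPlan`, and `planForLevel` are hypothetical. The hash-bit values are assumptions chosen so that a table of 4-byte entries lands on the sizes quoted above, and treating levels 1-6 as greedy is inferred rather than stated; the module's own hashBitsForLevel may choose differently.

```d
enum Strategy { store, greedy, lazyHC }

struct LevelPlan
{
    Strategy strategy;
    uint hashBits; // 0 means no hash table is needed
}

// Hypothetical mapping from level to strategy and table width.
LevelPlan planForLevel(int level) pure nothrow @nogc @safe
{
    // Clamp out-of-range input into 0..9.
    if (level < 0) level = 0;
    if (level > 9) level = 9;

    if (level == 0)
        return LevelPlan(Strategy.store, 0);
    if (level <= 3)
        return LevelPlan(Strategy.greedy, 9u + level); // 10..12 bits: 4 KB..16 KB of uint entries
    if (level <= 6)
        return LevelPlan(Strategy.greedy, 14);         // 14 bits: 64 KB of uint entries
    return LevelPlan(Strategy.lazyHC, 14);             // assumed: HC reuses the 64 KB table
}
```

With these assumed values, `planForLevel(3)` selects a 12-bit table (4096 entries, 16 KB) and `planForLevel(8)` selects lazy matching.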

Module Initializers 1

shared static this()

Types 4

private struct Lz4HashTable

Hash table used by the match finder.

The table stores the last seen position (byte index) for a given hash value. Unused entries contain 0xFFFF_FFFF.

Fields
uint[] table
uint _hashBits
Methods
void allocate(uint hashBits): Allocate or reallocate the table for `hashBits`.
void reset(): Reset all entries to the empty-sentinel value.
uint length() @property const: Number of addressable entries in the table.
uint get(uint index) const: Fetch the stored position for `index`.
void set(uint index, uint pos): Store `pos` at `index`.
uint indexFor(const(ubyte) * p) const: Compute the hash index for pointer `p`.
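
The empty-sentinel convention described above can be sketched as follows. This is a minimal stand-in, not the module's Lz4HashTable: `SketchHashTable` and `lookup` are hypothetical names, and the only behaviour taken from the documentation is that unused entries hold 0xFFFF_FFFF.

```d
struct SketchHashTable
{
    enum uint EMPTY = 0xFFFF_FFFF; // sentinel for "no position stored"

    uint[] table;

    // Allocate 2^hashBits entries and mark them all empty.
    void allocate(uint hashBits)
    {
        table = new uint[](size_t(1) << hashBits);
        reset();
    }

    void reset()
    {
        table[] = EMPTY;
    }

    // Returns true and fills `pos` only if a real position was stored.
    bool lookup(uint index, out uint pos) const
    {
        pos = table[index];
        return pos != EMPTY;
    }

    void set(uint index, uint pos)
    {
        table[index] = pos;
    }
}
```

Using an all-ones sentinel rather than zero keeps position 0 usable as a genuine match candidate.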

private struct Lz4Sequence

Represents a single LZ4 sequence (literals followed by an optional match).

In the LZ4 block format, data is encoded as a series of sequences. Each sequence consists of:

  • A run of literal bytes (may be empty for non-first sequences)
  • An optional back-reference match (offset + length)

The final sequence in a block has no match (matchLen == 0).

Fields
size_t literalStart
size_t literalLen
size_t matchOffset
size_t matchLen
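
To make the sequence layout concrete, here is a small sketch of how the token byte that opens each sequence in the LZ4 block format is formed from a literal length and a match length. `makeToken` is a hypothetical helper, not part of this module; it assumes the standard block-format rule that the high nibble holds the literal length, the low nibble holds the match length minus 4, and each nibble saturates at 15 when extra length bytes follow.

```d
// Build the token byte for one sequence.
// matchLen == 0 denotes the final, match-less sequence of a block.
ubyte makeToken(size_t literalLen, size_t matchLen) pure nothrow @nogc @safe
{
    immutable size_t lit = literalLen < 15 ? literalLen : 15;

    size_t ml = 0;
    if (matchLen >= 4)
    {
        ml = matchLen - 4;      // the format stores length minus the minimum match
        if (ml > 15) ml = 15;   // 15 signals that extra length bytes follow
    }
    return cast(ubyte)((lit << 4) | ml);
}
```

For example, 3 literals followed by a 4-byte match give token 0x30, while 20 literals followed by a 10-byte match give 0xF6 plus one extra literal-length byte.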

Streaming LZ4 compressor implementing the Compressor interface.

The compressor currently emits a single LZ4 frame per stream using independent blocks, no block checksums, and a content checksum. The frame layout is compatible with the lz4 CLI tool.
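
The frame options named above correspond to bits of the FLG byte in the LZ4 frame descriptor. The sketch below assembles that byte following the frame format specification; `makeFlg` and its parameter names are hypothetical and are not taken from this module.

```d
// Assemble the FLG byte of an LZ4 frame descriptor.
ubyte makeFlg(bool blockIndependence, bool blockChecksum,
              bool contentSize, bool contentChecksum, bool dictId)
    pure nothrow @nogc @safe
{
    uint flg = 0b01 << 6;                  // bits 7-6: version, always 01
    if (blockIndependence) flg |= 1 << 5;  // bit 5: blocks are independent
    if (blockChecksum)     flg |= 1 << 4;  // bit 4: per-block checksum present
    if (contentSize)       flg |= 1 << 3;  // bit 3: content size field present
    if (contentChecksum)   flg |= 1 << 2;  // bit 2: content checksum present
    // bit 1 is reserved and must stay zero
    if (dictId)            flg |= 1 << 0;  // bit 0: dictionary ID present
    return cast(ubyte) flg;
}
```

Independent blocks, no block checksums, and a content checksum give `makeFlg(true, false, false, true, false) == 0x64`.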

Fields
private CompressionOptions _opts
private OutputSink _sink
private bool _finished
private ulong _bytesIn
private ulong _bytesOut
private bool _dictSet
private const(ubyte)[] _dictionary
private ubyte[] _inBuffer
Methods
private int effectiveLevel() const @safe pure nothrow @nogc: Compute an effective numeric level in range 0..9 from options.
private size_t targetChunkSize(int lvl) const @safe pure nothrow @nogc: Decide a target uncompressed chunk size based on level (smaller at low levels, larger at high levels).
CompressionOptions options() @property const: Return the options this compressor was created with.
void setOutputSink(OutputSink sink): Set the output sink that will receive compressed bytes.
void setProgressCallback(ProgressCallback callback): Set an optional progress callback.
ulong bytesInTotal() @property const: Total uncompressed bytes accepted since the last `reset`.
ulong bytesOutTotal() @property const: Total compressed bytes produced since the last `reset`.
void write(const(ubyte)[] data): Feed more uncompressed data into the compressor.
void flush(FlushMode mode = FlushMode.SYNC): Flush any buffered output according to the specified mode.
void finish(): Finalise the compression stream.
void reset(): Reset the compressor to its initial state.
bool setDictionary(const(ubyte)[] dict): Set or update the compression dictionary.
bool isFinished() @property const: Returns true if finish() has been called and the stream is closed for further writes.
Constructors
this(CompressionOptions opts): Construct a new LZ4 compressor with the given options.

Streaming LZ4 decompressor implementing the Decompressor interface.

The implementation supports two input forms:

  • A single raw LZ4 block as produced by Lz4Compressor (no frame header), decoded via lz4DecompressBlock.
  • A standard LZ4 frame stream as produced by the lz4 CLI tool, including frame header, independent blocks and optional content checksum, decoded block-by-block.

Fields
private DecompressionOptions _opts
private OutputSink _sink
private bool _finished
private ulong _bytesIn
private ulong _bytesOut
private bool _dictSet
private const(ubyte)[] _dictionary
private ubyte[] _inBuffer
Methods
DecompressionOptions options() @property const: Return the options this decompressor was created with.
void setOutputSink(OutputSink sink): Set the output sink that will receive decompressed bytes.
void setProgressCallback(ProgressCallback callback): Set an optional progress callback.
ulong bytesInTotal() @property const: Total compressed bytes accepted since the last `reset`.
ulong bytesOutTotal() @property const: Total decompressed bytes produced since the last `reset`.
void write(const(ubyte)[] data): Feed more compressed data into the decompressor.
void finish(): Signal end-of-input to the decompressor.
void reset(): Reset the decompressor to its initial state.
bool setDictionary(const(ubyte)[] dict): Set or update the decompression dictionary.
bool isFinished() @property const: Returns true if finish() has been called and the stream is closed for further writes.
private void decompressSingleBlock(): Decompress a raw single LZ4 block stored in `_inBuffer` and emit it to the configured output sink.
private void decompressFrame(): Decompress an LZ4 frame contained in `_inBuffer`.
Constructors
this(DecompressionOptions opts): Construct a new LZ4 decompressor with the given options.

Functions 28

private fn uint hashBitsForLevel(int level): Compute the number of hash bits to use for a given compression level.
private fn uint hashTableSize(uint hashBits): Compute the hash-table size from the number of hash bits.
private fn uint readLE32p(const(ubyte) * p) nothrow @nogc: Read a 32-bit little-endian word from an unaligned pointer.
private fn uint lz4HashIndex(const(ubyte) * p, uint hashBits) nothrow @nogc: Compute the LZ4 hash index for the 4-byte sequence at `p`.
private fn size_t countMatch(const(ubyte)[] src, size_t pos1, size_t pos2, size_t maxLen): Count the number of matching bytes starting at two positions.
private fn size_t countMatch(const(ubyte) * srcPtr, size_t srcLen, size_t pos1, size_t pos2, size_t maxLen) nothrow @nogc: Pointer-based variant of `countMatch` for use from `@system`/`@trusted` contexts where the source pointer and length are already available.
private fn size_t countMatchWithDict(const(ubyte)[] dict, const(ubyte)[] src, size_t matchPos, size_t srcPos, size_t maxLen): Count matching bytes when the match candidate may be in the dictionary.
private fn uint computeDictId(const(ubyte)[] dict): Compute the XXH32 hash of dictionary data to produce a dictionary ID.
private fn Lz4Sequence[] findGreedySequences(const(ubyte)[] src, uint hashBits): Find sequences using a greedy match-finding algorithm.
private fn Lz4Sequence[] findHCSequences(const(ubyte)[] src, uint hashBits): Find sequences using a lazy matching algorithm (HC mode).
private fn Lz4Sequence[] findGreedySequencesWithDict(const(ubyte)[] src, uint hashBits, const(ubyte)[] dict): Find sequences using a greedy match-finding algorithm with dictionary support.
private fn Lz4Sequence[] findHCSequencesWithDict(const(ubyte)[] src, uint hashBits, const(ubyte)[] dict): Find sequences using a lazy matching algorithm (HC mode) with dictionary support.
private fn void encodeLength(ref ubyte * p, size_t length) nothrow @nogc: Encode a length value using LZ4's variable-length encoding scheme.
private fn size_t encodeSequences(const(Lz4Sequence)[] sequences, const(ubyte)[] src, ubyte[] buf): Encode a sequence of LZ4 sequences into the LZ4 block format.
private fn size_t lz4MaxCompressedSize(size_t srcSize): Compute an upper bound on the compressed size of a single LZ4 block.
private fn size_t lz4CompressBlockGreedyCore(const(ubyte) * srcPtr, size_t srcLen, ubyte * outPtr, size_t maxOut, uint * hashTblPtr, uint hashBits) nothrow @trusted @nogc: Core greedy compression loop operating entirely on raw pointers.
private fn ubyte[] lz4CompressBlockGreedy(const(ubyte)[] src, uint hashBits): Compress a block using greedy matching in a single pass.
private fn ubyte[] lz4CompressBlock(const(ubyte)[] src, int level = 1): Compress a single block of data using the LZ4 block format.
private fn size_t lz4DecompressBlock(const(ubyte)[] src, ubyte[] dst): Decompress a single LZ4 block into a caller-provided buffer.
private fn ubyte[] lz4CompressBlockWithDict(const(ubyte)[] src, int level, const(ubyte)[] dict): Compress a single block of data using the LZ4 block format with dictionary.
private fn size_t lz4DecompressBlockWithDict(const(ubyte)[] src, ubyte[] dst, const(ubyte)[] dict): Decompress a single LZ4 block with dictionary support.
private fn uint rotl32(uint x, uint r): Rotate a 32-bit value left by `r` bits.
private fn uint readLE32(const(ubyte)[] buf, size_t offset): Read a 32-bit little-endian unsigned integer from `buf` at `offset`.
private fn uint xxh32Round(uint acc, uint input): XXH32 round transformation (internal helper).
private fn uint xxh32(const(ubyte)[] input, uint seed = 0): Compute the XXH32 hash of `input` with the given `seed`.
private fn ubyte lz4HeaderChecksum(const(ubyte)[] header): Compute the single-byte LZ4 frame header checksum (HC).
private fn Compressor makeLz4Compressor(CompressionOptions opts): Create a new LZ4 compressor instance using the provided options.
private fn Decompressor makeLz4Decompressor(DecompressionOptions opts): Create a new LZ4 decompressor instance using the provided options.
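
Several of the helpers above revolve around the LZ4 block format: a token, variable-length literal and match lengths, and a 2-byte little-endian offset. The following is a minimal decoder sketch under those rules. It is not the module's lz4DecompressBlock: the name `decodeBlockSketch` is hypothetical, it has no dictionary support, and it grows an output array rather than writing into a caller-provided buffer.

```d
import std.exception : enforce;

/// Decode one raw LZ4 block and return the decoded bytes.
ubyte[] decodeBlockSketch(const(ubyte)[] src)
{
    ubyte[] dst;
    size_t i = 0;
    while (i < src.length)
    {
        immutable ubyte token = src[i++];

        // Literal length: high nibble, extended by following bytes
        // for as long as each extension byte equals 255.
        size_t litLen = token >> 4;
        if (litLen == 15)
        {
            ubyte b;
            do
            {
                enforce(i < src.length, "truncated literal length");
                b = src[i++];
                litLen += b;
            } while (b == 255);
        }
        enforce(litLen <= src.length - i, "truncated literals");
        dst ~= src[i .. i + litLen];
        i += litLen;

        // The final sequence of a block ends after its literals.
        if (i == src.length)
            break;

        // Match offset: 2 bytes, little-endian, non-zero, within the output.
        enforce(src.length - i >= 2, "truncated offset");
        immutable size_t offset = src[i] | (src[i + 1] << 8);
        i += 2;
        enforce(offset != 0 && offset <= dst.length, "invalid offset");

        // Match length: low nibble plus 4, extended the same way.
        size_t matchLen = (token & 0x0F) + 4;
        if ((token & 0x0F) == 15)
        {
            ubyte b;
            do
            {
                enforce(i < src.length, "truncated match length");
                b = src[i++];
                matchLen += b;
            } while (b == 255);
        }

        // Copy byte by byte so that overlapping matches replicate correctly.
        size_t from = dst.length - offset;
        foreach (_; 0 .. matchLen)
            dst ~= dst[from++];
    }
    return dst;
}
```

As a usage example, the 10-byte block `11 61 01 00 50 62 63 64 65 66` decodes to the 11-byte string "aaaaaabcdef": one literal `a`, a 5-byte match at offset 1, then five trailing literals.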

Variables 11

private var size_t LZ4_MIN_MATCH

Minimum match length in LZ4 format.

private var size_t LZ4_END_LITERALS

Number of bytes that must remain as literals at the end of a block. The LZ4 block format requires the last 5 bytes of a block to be encoded as literals, which gives the decoder a safety margin.
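
One way a match finder might honour this rule is to cap every candidate match so that it stops before the trailing literal region. The sketch below assumes the values 4 and 5 for the two constants and uses a hypothetical helper name, `maxMatchLenAt`. It models only the trailing-literals rule; the block format additionally requires the last match to start at least 12 bytes before the end of the block, which is not modelled here.

```d
enum size_t MIN_MATCH = 4;     // assumed value of LZ4_MIN_MATCH
enum size_t END_LITERALS = 5;  // assumed value of LZ4_END_LITERALS

/// Longest match allowed at `pos` in a block of `srcLen` bytes,
/// or 0 if not even a minimum-length match fits.
size_t maxMatchLenAt(size_t srcLen, size_t pos) pure nothrow @nogc @safe
{
    if (srcLen < END_LITERALS || pos > srcLen - END_LITERALS)
        return 0;
    immutable size_t room = srcLen - END_LITERALS - pos;
    return room >= MIN_MATCH ? room : 0;
}
```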

private var size_t LZ4_MAX_INPUT_SIZE

Maximum input size supported by the simple block helpers.

private enum var LZ4_HASH_MULT = 0x9E37_79B1u

Multiplicative constant used by LZ4 for hashing 4-byte sequences.
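
A multiplicative hash of this kind is typically computed by reading 4 bytes as a little-endian word, multiplying by the constant modulo 2^32, and keeping the top `hashBits` bits. The sketch below shows that scheme; `readWordSketch` and `hashIndexSketch` are hypothetical names, and whether lz4HashIndex shifts in exactly this way is an assumption.

```d
// Read 4 bytes starting at `offset` as a little-endian 32-bit word.
uint readWordSketch(const(ubyte)[] buf, size_t offset) pure nothrow @nogc @safe
{
    return cast(uint) buf[offset]
         | (cast(uint) buf[offset + 1] << 8)
         | (cast(uint) buf[offset + 2] << 16)
         | (cast(uint) buf[offset + 3] << 24);
}

// Map a 32-bit word to a table index in [0, 2^hashBits).
// Assumes 1 <= hashBits <= 31.
uint hashIndexSketch(uint word, uint hashBits) pure nothrow @nogc @safe
{
    // Unsigned multiplication wraps modulo 2^32; the high bits of the
    // product are the best mixed, so they form the index.
    return (word * 0x9E37_79B1u) >> (32 - hashBits);
}
```

Taking the high bits rather than the low bits means that a change in any input byte can affect the index.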

private var ubyte[4] LZ4_FRAME_MAGIC

LZ4 frame magic number 0x184D2204, stored little-endian as the bytes 04 22 4D 18.
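
A decoder that accepts both raw blocks and frames needs some way to tell them apart, and checking for the magic number is the usual first step. The sketch below shows such a check; `SKETCH_FRAME_MAGIC` and `hasFrameMagic` are hypothetical names, and how the module's decompressor actually distinguishes the two input forms is not documented here.

```d
// The frame magic 0x184D2204 as it appears on the wire (little-endian).
immutable ubyte[4] SKETCH_FRAME_MAGIC = [0x04, 0x22, 0x4D, 0x18];

/// True if `data` begins with the LZ4 frame magic number.
bool hasFrameMagic(const(ubyte)[] data) pure nothrow @nogc @safe
{
    return data.length >= 4 && data[0 .. 4] == SKETCH_FRAME_MAGIC[];
}
```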

private enum var LZ4_FRAME_MAX_BLOCK_SIZE = 4 * 1024 * 1024

Maximum uncompressed block size advertised in the LZ4 frame header (4 MiB, corresponding to block size code 7 in the BD field).
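
In the frame format, the block size code occupies bits 6-4 of the BD byte, and the valid codes 4, 5, 6 and 7 stand for 64 KiB, 256 KiB, 1 MiB and 4 MiB. A sketch of the decoding, using the hypothetical helper name `blockMaxSizeFromBD`:

```d
/// Decode the maximum block size from a BD byte.
/// Returns 0 for codes below 4, which the frame format does not define.
size_t blockMaxSizeFromBD(ubyte bd) pure nothrow @nogc @safe
{
    immutable uint code = (bd >> 4) & 0x7;
    if (code < 4)
        return 0;
    // Each step of the code multiplies the size by 4: 2^16, 2^18, 2^20, 2^22.
    return size_t(1) << (8 + 2 * code);
}
```

A BD byte of 0x70 therefore yields 4 * 1024 * 1024, and 0x40 yields 64 * 1024.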

private enum var XXH32_PRIME1 = 0x9E37_79B1u

XXH32 prime constants used by the checksum implementation.

private enum var XXH32_PRIME2 = 0x85EB_CA77u
private enum var XXH32_PRIME3 = 0xC2B2_AE3Du
private enum var XXH32_PRIME4 = 0x27D4_EB2Fu
private enum var XXH32_PRIME5 = 0x1656_67B1u
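
For orientation, here is a compact sketch of how these five primes are used in the XXH32 algorithm, together with the frame header checksum, which the LZ4 frame format defines as the second byte of the XXH32 digest of the frame descriptor. The function names (`xxh32Sketch`, `headerChecksumSketch`, and the helpers) are hypothetical. The code follows the published XXH32 algorithm rather than this module's source, and it assumes the checksum helper receives the descriptor bytes only, without the magic number.

```d
enum uint P1 = 0x9E37_79B1u, P2 = 0x85EB_CA77u, P3 = 0xC2B2_AE3Du,
          P4 = 0x27D4_EB2Fu, P5 = 0x1656_67B1u;

uint rotlSketch(uint x, uint r) pure nothrow @nogc @safe
{
    return (x << r) | (x >> (32 - r)); // assumes 1 <= r <= 31
}

uint le32Sketch(const(ubyte)[] b, size_t i) pure nothrow @nogc @safe
{
    return cast(uint) b[i] | (cast(uint) b[i + 1] << 8)
         | (cast(uint) b[i + 2] << 16) | (cast(uint) b[i + 3] << 24);
}

uint roundSketch(uint acc, uint lane) pure nothrow @nogc @safe
{
    acc += lane * P2;
    acc = rotlSketch(acc, 13);
    return acc * P1;
}

uint xxh32Sketch(const(ubyte)[] input, uint seed = 0) pure nothrow @nogc @safe
{
    immutable size_t n = input.length;
    size_t i = 0;
    uint h;

    if (n >= 16)
    {
        // Four accumulators each consume one 32-bit lane per 16-byte stripe.
        uint v1 = seed + P1 + P2, v2 = seed + P2, v3 = seed, v4 = seed - P1;
        for (; i + 16 <= n; i += 16)
        {
            v1 = roundSketch(v1, le32Sketch(input, i));
            v2 = roundSketch(v2, le32Sketch(input, i + 4));
            v3 = roundSketch(v3, le32Sketch(input, i + 8));
            v4 = roundSketch(v4, le32Sketch(input, i + 12));
        }
        h = rotlSketch(v1, 1) + rotlSketch(v2, 7)
          + rotlSketch(v3, 12) + rotlSketch(v4, 18);
    }
    else
    {
        h = seed + P5;
    }

    h += cast(uint) n;

    // Remaining 4-byte words, then remaining single bytes.
    for (; i + 4 <= n; i += 4)
        h = rotlSketch(h + le32Sketch(input, i) * P3, 17) * P4;
    for (; i < n; ++i)
        h = rotlSketch(h + input[i] * P5, 11) * P1;

    // Final avalanche.
    h ^= h >> 15; h *= P2;
    h ^= h >> 13; h *= P3;
    h ^= h >> 16;
    return h;
}

/// Second byte of the XXH32 digest of the frame descriptor.
ubyte headerChecksumSketch(const(ubyte)[] descriptor) pure nothrow @nogc @safe
{
    return cast(ubyte)((xxh32Sketch(descriptor, 0) >> 8) & 0xFF);
}
```

Note that XXH32_PRIME1 has the same value as LZ4_HASH_MULT, so the match finder and the checksum share one constant.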