Hellishly Slow Level 13 Deflate Compression
4 days ago
- #optimization techniques
- #DEFLATE compression
- #libdeflate level 13
- DEFLATE level 13 is a deliberately impractical compression level that saves 0.134% in size over level 12 but runs 56.4x slower, useful when data is compressed once and distributed many times.
- It optimizes DEFLATE encoders without changing the decoder contract, improving compatibility in formats like HTTP, ZIP, PNG, and software distribution.
- Mechanics include searching the full 32 KiB window, 15 optimization passes, static Huffman optimization for blocks up to 50,000 bytes, and delayed block splitting based on byte distribution.
- For text-like data, it extends soft block size from 300,000 to 1,000,000 bytes if sample conditions (no NULL byte, at most 97 distinct byte values) are met.
- The parser expands minimum cost search, considering cheapest offsets, Huffman cost estimates, and comparing dynamic vs. static Huffman and literal-only encodings.
- Block splitting is delayed with up to nine candidates, scoring via bounded shortest path; a split must beat the full block by at least 512 bits.
- Development followed a zero compression regression policy using the Silesia corpus, ensuring no file size increased while reducing at least one.
- Results show size reductions across files (e.g., -32,370 bytes for mozilla, -0.296% for nci) but significant time increases (e.g., +665,713 ms for nci).