Hasty Briefs

Benchmarking Language Modeling for Lossless Compression of Full-Fidelity Audio

2 days ago
  • #language models
  • #lossless compression
  • #audio processing
  • Autoregressive language models (LMs) trained on raw waveforms can be repurposed for lossless audio compression.
  • Prior work was limited to 8-bit audio, leaving practical settings (16- and 24-bit) poorly understood.
  • The study benchmarks LM-based compression on full-fidelity audio across diverse domains, sampling rates, and bit depths.
  • Standard sample-level tokenization becomes intractable at higher bit depths because vocabulary size grows exponentially with bit depth.
  • The paper proposes Trilobyte, a byte-level tokenization scheme that improves vocabulary scaling from O(2^b) to O(1), enabling tractable 24-bit LM-based lossless compression.
  • LMs consistently outperform FLAC and achieve state-of-the-art compression at 8- and 16-bit depths.
  • Compression gains become more modest as bit depth increases beyond 8-bit.
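The core vocabulary argument above can be sketched in a few lines. This is not the paper's actual Trilobyte implementation, whose exact schema is not described here; it is a minimal illustration of the byte-level idea, assuming each b-bit sample is simply split into b/8 little-endian bytes, so the token vocabulary stays at 256 regardless of bit depth:

```python
def sample_tokenize(samples, bit_depth):
    # Sample-level tokenization: one token per sample.
    # Vocabulary size grows as O(2^b) with bit depth b,
    # e.g. 16,777,216 tokens for 24-bit audio.
    vocab_size = 2 ** bit_depth
    return list(samples), vocab_size

def byte_tokenize(samples, bit_depth):
    # Byte-level tokenization: each b-bit sample becomes b/8 byte tokens.
    # Vocabulary size is fixed at 256, i.e. O(1) in bit depth.
    # (Little-endian splitting is an assumption for illustration.)
    n_bytes = bit_depth // 8
    tokens = []
    for s in samples:
        for i in range(n_bytes):
            tokens.append((s >> (8 * i)) & 0xFF)
    return tokens, 256

def byte_detokenize(tokens, bit_depth):
    # Inverse mapping, reassembling samples from their bytes
    # (lossless round trip is what makes this usable for compression).
    n_bytes = bit_depth // 8
    return [sum(tokens[i + j] << (8 * j) for j in range(n_bytes))
            for i in range(0, len(tokens), n_bytes)]

# Two 24-bit samples: each becomes three byte tokens.
samples = [0x123456, 0x00FF01]
tokens, vocab = byte_tokenize(samples, 24)
assert byte_detokenize(tokens, 24) == samples
```

The trade-off is sequence length: a 24-bit stream yields three times as many tokens as sample-level tokenization, but the LM's output softmax stays a fixed 256-way prediction instead of an intractable 2^24-way one.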