Can gzip be a language model?
12 hours ago
- #compression
- #language_modeling
- #gzip
- Explores using gzip as a language model, building on the equivalence between compression and prediction: a model that predicts text well can compress it well, and vice versa.
- Demonstrates gzip generating text after priming on a Shakespeare corpus, producing somewhat coherent output.
- Explains how compressors like gzip, through the DEFLATE algorithm, implicitly apply a probability model: input that is more predictable from earlier data gets a shorter encoding.
- Describes scoring candidate continuations by compressed length: the shorter the result, the more "predicted" the continuation.
- Details a beam search approach that improves generation quality by looking several bytes ahead across multiple candidate sequences.
- Highlights the use of a sliding window context and tail bytes to avoid verbatim loops in generation.
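The scoring and beam search ideas summarized above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the post's actual implementation: the function names (`compressed_len`, `score`, `beam_generate`), the candidate alphabet, the beam width, and the window size are all hypothetical, and Python's `zlib` module is used to produce a raw DEFLATE stream in place of the `gzip` tool itself. The "tail bytes" technique for avoiding verbatim loops is not reproduced here, since the summary does not describe how it works.

```python
import zlib


def compressed_len(data: bytes) -> int:
    """Size in bytes of a raw DEFLATE stream for `data` (no gzip header)."""
    # Positional args: level=9, method=DEFLATED, wbits=-15 (raw DEFLATE).
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    return len(comp.compress(data) + comp.flush())


def score(context: bytes, candidate: bytes) -> int:
    """Extra compressed bytes needed for `candidate` after `context`.

    Lower means the compressor found the candidate more "predicted".
    """
    return compressed_len(context + candidate) - compressed_len(context)


def beam_generate(
    context: bytes,
    n_bytes: int,
    alphabet: bytes,
    beam_width: int = 4,   # assumption: not taken from the post
    window: int = 4096,    # assumption: size of the sliding context
) -> bytes:
    """Generate `n_bytes` bytes by beam search over compressed length."""
    ctx = context[-window:]          # keep only the most recent bytes
    base = compressed_len(ctx)
    beams = [(0, b"")]               # (cost in extra bytes, sequence so far)
    for _ in range(n_bytes):
        expanded = []
        for _, seq in beams:
            for b in alphabet:       # iterating bytes yields ints
                cand = seq + bytes([b])
                cost = compressed_len(ctx + cand) - base
                expanded.append((cost, cand))
        # Stable sort: ties keep their alphabet order, which is arbitrary.
        expanded.sort(key=lambda item: item[0])
        beams = expanded[:beam_width]
    return beams[0][1]


if __name__ == "__main__":
    corpus = b"to be or not to be, that is the question. " * 20
    alphabet = bytes(sorted(set(corpus)))
    print(score(corpus, b"to be or not to be"))
    print(beam_generate(corpus, 12, alphabet))
```

Because compressed sizes are whole bytes, many single-byte candidates tie on cost, which is one reason looking several bytes ahead helps: differences between continuations only become visible once enough bytes have accumulated. Each step costs `beam_width * len(alphabet)` compressions of the full window, so this sketch is only practical for short outputs.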