Compressed filesystems à la language models
15 days ago
- #LLMs
- #filesystems
- #compression
- Systems engineers often aspire to write a filesystem, a task that is simpler than it seems.
- Coding models can generate functional filesystems, which raises the question of whether a model could act as the filesystem engine itself.
- A LoggingLoopbackFS was created to generate fine-tuning data by logging filesystem operations.
- A filesystem interaction simulator was used to generate diverse FUSE prompt/completion pairs.
- Fine-tuning was performed using an XML representation of the filesystem state, achieving 98% accuracy.
- A minimal FUSE filesystem was implemented, with operations passed through to an LLM.
- Arithmetic coding enables reversible compression using LLMs, achieving significant compression ratios.
- Fine-tuned models achieve better compression on XML filesystem trees due to a 'self-compression' effect.
- A comparison shows that llmfuse achieves ~8x better compression than squashfs on text data.
- The experiment raises questions about real-world applications and the potential of LLM-based compression.
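
The arithmetic-coding point above can be illustrated with a minimal sketch. This is not the llmfuse implementation: a simple adaptive frequency model (`AdaptiveModel`, a hypothetical stand-in) takes the place of the LLM's next-token distribution, and exact fractions replace the fixed-precision integer arithmetic a practical coder would use. The idea it shows is the same, though: encoder and decoder run the same predictive model in lockstep, so better predictions yield a wider final interval and therefore fewer bits, and decoding is exactly reversible.

```python
import math
from fractions import Fraction


class AdaptiveModel:
    """Hypothetical stand-in for an LLM: predicts the next symbol from counts so far."""

    def __init__(self, alphabet):
        # Start every symbol at count 1 so nothing has zero probability.
        self.counts = {s: 1 for s in alphabet}

    def intervals(self):
        """Map each symbol to its [lo, hi) slice of [0, 1) under current predictions."""
        total = sum(self.counts.values())
        lo = Fraction(0)
        out = {}
        for s in sorted(self.counts):  # fixed order shared by encoder and decoder
            p = Fraction(self.counts[s], total)
            out[s] = (lo, lo + p)
            lo += p
        return out

    def update(self, symbol):
        self.counts[symbol] += 1


def encode(symbols, alphabet):
    """Return (k, nbits) such that k / 2**nbits lies inside the final interval."""
    model = AdaptiveModel(alphabet)
    low, width = Fraction(0), Fraction(1)
    for s in symbols:
        lo, hi = model.intervals()[s]
        # Narrow the current interval to the slice assigned to this symbol.
        low, width = low + width * lo, width * (hi - lo)
        model.update(s)
    # Find the shortest binary fraction that falls inside [low, low + width).
    nbits = 0
    while True:
        nbits += 1
        scale = 1 << nbits
        k = math.ceil(low * scale)
        if Fraction(k, scale) < low + width:
            return k, nbits


def decode(k, nbits, length, alphabet):
    """Invert encode(). The symbol count is passed in, as there is no end marker."""
    model = AdaptiveModel(alphabet)
    x = Fraction(k, 1 << nbits)
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (x - low) / width  # position of x within the current interval
        for s, (lo, hi) in model.intervals().items():
            if lo <= target < hi:
                out.append(s)
                low, width = low + width * lo, width * (hi - lo)
                model.update(s)
                break
    return out


text = "abracadabra" * 4
alphabet = set(text)
code, nbits = encode(text, alphabet)
restored = "".join(decode(code, nbits, len(text), alphabet))
print(restored == text, nbits, "bits vs", 8 * len(text), "bits raw")
```

Swapping `AdaptiveModel` for a language model's per-token probabilities would leave the coding loop unchanged; the only requirement is that the encoder and decoder see identical distributions at every step.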