Hasty Briefs (beta)

Compressed filesystems à la language models

15 days ago
  • #LLMs
  • #filesystems
  • #compression
  • Systems engineers often aspire to write a filesystem, a task that is simpler than it seems.
  • Coding models can generate functional filesystems, prompting exploration into modeling the filesystem engine itself.
  • A LoggingLoopbackFS was created to generate fine-tuning data by logging filesystem operations.
  • A filesystem interaction simulator was used to generate diverse FUSE prompt/completion pairs.
  • Fine-tuning was performed using an XML representation of filesystem state, achieving 98% accuracy.
  • A minimal FUSE filesystem was implemented, with operations passed through to an LLM.
  • Arithmetic coding enables reversible compression using LLMs, achieving significant compression ratios.
  • Fine-tuned models achieve better compression on XML filesystem trees due to a 'self-compression' effect.
  • Comparison shows llmfuse achieves ~8x better compression than squashfs on text data.
  • The experiment raises questions about real-world applications and the potential of LLM-based compression.
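
The arithmetic-coding point above can be illustrated with a minimal sketch. It is not the original project's code: `AdaptiveModel`, `encode`, and `decode` are hypothetical names, and a simple adaptive character-count model stands in for the LLM's next-token distribution. The idea is the same in either case: encoder and decoder run the identical predictive model, so the better the model predicts the next symbol, the narrower the interval shrinks per symbol and the fewer bits are needed, while decoding remains exact.

```python
from fractions import Fraction
from math import ceil


class AdaptiveModel:
    """Stand-in for the LLM (assumption): predicts the next symbol from counts so far."""

    def __init__(self, alphabet):
        self.alphabet = sorted(alphabet)
        self.counts = {s: 1 for s in self.alphabet}  # start every symbol at count 1

    def intervals(self):
        """Map each symbol to its exact sub-interval (low, high) of [0, 1)."""
        total = sum(self.counts.values())
        out, cum = {}, 0
        for s in self.alphabet:
            out[s] = (Fraction(cum, total), Fraction(cum + self.counts[s], total))
            cum += self.counts[s]
        return out

    def update(self, symbol):
        self.counts[symbol] += 1


def encode(text, alphabet):
    """Narrow [0, 1) once per symbol, then name a point inside the final interval."""
    model = AdaptiveModel(alphabet)
    low, width = Fraction(0), Fraction(1)
    for ch in text:
        lo, hi = model.intervals()[ch]
        low, width = low + width * lo, width * (hi - lo)
        model.update(ch)
    # Choose enough bits that the grid step 2**-bits is no larger than the width,
    # which guarantees some multiple k / 2**bits lies inside [low, low + width).
    n, d = width.numerator, width.denominator
    bits = d.bit_length() - n.bit_length() + 1
    k = ceil(low * 2**bits)
    return k, bits


def decode(k, bits, length, alphabet):
    """Replay the same model and pick, at each step, the symbol whose interval holds the value."""
    model = AdaptiveModel(alphabet)
    value = Fraction(k, 2**bits)
    low, width = Fraction(0), Fraction(1)
    out = []
    for _ in range(length):
        target = (value - low) / width
        for s, (lo, hi) in model.intervals().items():
            if lo <= target < hi:
                out.append(s)
                low, width = low + width * lo, width * (hi - lo)
                model.update(s)
                break
    return "".join(out)


# Usage: round-trip a small XML-like filesystem tree (made-up sample data).
text = "<dir name='a'><file name='b.txt'/></dir>"
alphabet = set(text)
k, bits = encode(text, alphabet)
assert decode(k, bits, len(text), alphabet) == text
print(f"{len(text) * 8} bits raw -> {bits} bits coded")
```

Exact fractions keep the sketch short and obviously reversible; practical coders use fixed-width integer ranges with renormalization instead, and the decoder needs the message length (passed explicitly here) or an end-of-stream symbol.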