Hugging Face Storage Buckets: Mutable, non-versioned object storage at $12/TB
3 days ago
- #machine-learning
- #cloud-storage
- #huggingface
- Hugging Face introduces Storage Buckets for mutable, S3-like object storage on the Hub.
- Buckets are designed for intermediate ML files like checkpoints, logs, and processed data that don't need version control.
- Buckets are built on Xet, which deduplicates stored content so that less data has to be transferred, reducing bandwidth use and speeding up uploads and downloads.
- A pre-warming feature places data closer to compute resources, improving performance for distributed training.
- Buckets can be managed via CLI, Python, or JavaScript, with filesystem integration through HfFileSystem.
- Enterprise billing is based on deduplicated storage, so duplicated data is not billed twice.
- The roadmap includes direct transfers between Buckets and versioned repos, so that stable deliverables can be moved into version control.
- Early adopters like Jasper, Arcee, IBM, and PixAI helped shape the feature during private beta.
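
To make the deduplication and billing points above concrete, here is a minimal sketch of how chunk-level deduplication shrinks the stored footprint when two checkpoints share most of their bytes. It is an illustration only, not Xet's actual algorithm: the fixed 4-byte chunk size, the `chunk` and `deduplicated_size` helpers, and the file names are all hypothetical, chosen to keep the example small.

```python
import hashlib

# Assumption: tiny fixed-size chunks, purely so the example is easy to follow.
# A real system would use much larger chunks and a different chunking scheme.
CHUNK_SIZE = 4


def chunk(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split data into consecutive pieces of at most `size` bytes."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def deduplicated_size(files: dict[str, bytes]) -> tuple[int, int]:
    """Return (logical_bytes, stored_bytes) for a set of files.

    logical_bytes counts every file in full; stored_bytes counts each
    distinct chunk only once, keyed by its SHA-256 digest.
    """
    store: dict[str, bytes] = {}
    logical = 0
    for content in files.values():
        logical += len(content)
        for piece in chunk(content):
            # Keep only the first copy of each distinct chunk.
            store.setdefault(hashlib.sha256(piece).hexdigest(), piece)
    stored = sum(len(piece) for piece in store.values())
    return logical, stored


# Two hypothetical checkpoints that differ only in their last chunk.
checkpoints = {
    "step-100.bin": b"AAAABBBBCCCCDDDD",
    "step-200.bin": b"AAAABBBBCCCCEEEE",
}

logical, stored = deduplicated_size(checkpoints)
print(f"logical: {logical} bytes, stored after dedup: {stored} bytes")
```

Here the two files total 32 bytes, but only five distinct chunks (20 bytes) need to be stored. Under billing based on deduplicated storage, the smaller figure is the one that would count.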