Package managers keep using Git as a database, it never works out
4 months ago
- #package-managers
- #scalability
- #git
- Using git as a database is tempting because it offers free version history, review workflows via pull requests, and a distributed design, but it often fails at scale.
- Package managers like Cargo, Homebrew, CocoaPods, Nixpkgs, vcpkg, and Go modules all started out on git and ran into performance problems. Most moved to alternatives such as HTTP-based protocols or CDNs, while Nixpkgs and vcpkg remain tied to git.
- Cargo switched to a sparse HTTP protocol to avoid cloning the entire index, significantly improving performance for users.
- Homebrew moved to JSON downloads for tap updates after GitHub requested they stop using shallow clones due to high costs and slow performance.
- CocoaPods abandoned git for a CDN, reducing disk usage and speeding up operations.
- Nixpkgs struggles with GitHub's infrastructure due to its massive size and frequent CI queries, but cannot easily migrate away from git.
- vcpkg relies on git tree hashes for versioning, making shallow clones problematic, with no easy HTTP-based solution available.
- Go modules cut dependency resolution time from 18 minutes to 12 seconds by using a module proxy instead of cloning repositories.
- Git-based wikis and CMS platforms face scalability issues, leading to slow performance and rate limits.
- Git inherits filesystem limitations (directory size limits, case sensitivity issues, path length constraints) and lacks database features like constraints and indexes, which makes it a poor database substitute.
- The pattern: git is excellent for distributed collaboration on source code but ill-suited for use as a database or package registry, so projects that use it that way eventually migrate to more suitable solutions.
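
To make the Cargo point concrete, here is a minimal sketch of why a sparse HTTP index avoids a full clone: each crate's metadata lives at a path that can be computed from the crate name alone, so a client can fetch just the files it needs. The helper names `index_path` and `sparse_url` are hypothetical, and the base URL and directory layout are assumptions based on Cargo's documented index format, not details taken from the summary above.

```python
def index_path(crate_name: str) -> str:
    """Return the relative index path for a crate name.

    Assumed layout (per Cargo's registry index documentation):
      1-char names -> 1/<name>
      2-char names -> 2/<name>
      3-char names -> 3/<first char>/<name>
      longer names -> <chars 1-2>/<chars 3-4>/<name>
    Names are lowercased so lookups do not depend on filesystem case rules.
    """
    name = crate_name.lower()
    if not name:
        raise ValueError("crate name must not be empty")
    if len(name) == 1:
        return f"1/{name}"
    if len(name) == 2:
        return f"2/{name}"
    if len(name) == 3:
        return f"3/{name[0]}/{name}"
    return f"{name[:2]}/{name[2:4]}/{name}"


def sparse_url(crate_name: str, base: str = "https://index.crates.io") -> str:
    """Build the URL a client would request for a single crate (base is an assumption)."""
    return f"{base.rstrip('/')}/{index_path(crate_name)}"


if __name__ == "__main__":
    for crate in ["serde", "syn", "a"]:
        print(crate, "->", sparse_url(crate))
```

Resolving a dependency graph this way costs one small HTTP request per crate touched, rather than a download of the history for every crate ever published.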