NanoChat – The best ChatGPT that $100 can buy
7 hours ago
- #AI-development
- #LLM
- #ChatGPT-clone
- nanochat is a full-stack implementation of an LLM like ChatGPT designed to run on a single 8XH100 node.
- The speedrun.sh script trains the $100 tier of nanochat and runs inference with it, taking about 4 hours on an 8XH100 node.
- After training, users can interact with the LLM via a ChatGPT-like web UI by running python -m scripts.chat_web.
- The project includes evaluations and metrics in a report.md file, showing performance across various benchmarks.
- nanochat can scale to higher tiers, such as the $300 tier (d26 model) and the $1000 tier, with adjustments to data shards and batch sizes.
- The code is minimal, hackable, and designed to be accessible; it runs on vanilla PyTorch and may need adjustments for other GPU setups.
- nanochat is inspired by nanoGPT and modded-nanoGPT, with acknowledgments to HuggingFace, Lambda, and Alec Radford.
- The project is open source under the MIT license and encourages citation when used in research.
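
The tier prices above can be related to training time with some rough arithmetic. The sketch below is illustrative only and is not part of nanochat: it takes the two figures given in the summary ($100 and about 4 hours on one 8XH100 node), derives an implied hourly rate, and assumes that the same flat rate applies to the larger tiers. The helper `estimated_hours` and the constant names are hypothetical, and actual run times depend on the provider's pricing and the model configuration.

```python
# Illustrative budget arithmetic only; not part of nanochat.
SPEEDRUN_COST_USD = 100.0  # from the summary: the $100 tier
SPEEDRUN_HOURS = 4.0       # from the summary: about 4 hours on one 8XH100 node

# Assumption: the node is billed at one flat hourly rate for every tier.
IMPLIED_RATE_USD_PER_HOUR = SPEEDRUN_COST_USD / SPEEDRUN_HOURS


def estimated_hours(budget_usd: float,
                    rate_usd_per_hour: float = IMPLIED_RATE_USD_PER_HOUR) -> float:
    """Return the approximate node-hours a budget buys at a flat hourly rate."""
    if rate_usd_per_hour <= 0:
        raise ValueError("rate_usd_per_hour must be positive")
    return budget_usd / rate_usd_per_hour


if __name__ == "__main__":
    for tier_usd in (100, 300, 1000):
        print(f"${tier_usd} tier: ~{estimated_hours(tier_usd):.0f} hours")
```

Under that flat-rate assumption, the $300 tier corresponds to roughly 12 hours and the $1000 tier to roughly 40 hours on the same node.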