NanoChat – The best ChatGPT that $100 can buy

7 hours ago
  • #AI-development
  • #LLM
  • #ChatGPT-clone
  • nanochat is a full-stack implementation of a ChatGPT-like LLM, designed to run on a single 8xH100 node.
  • The speedrun.sh script runs training and inference for the $100 tier of nanochat in about 4 hours on an 8xH100 node.
  • After training, users can interact with the LLM via a ChatGPT-like web UI by running python -m scripts.chat_web.
  • The project includes evaluations and metrics in a report.md file, showing performance across various benchmarks.
  • nanochat scales to higher tiers, such as the $300 tier (d26 model) and the $1000 tier, by adjusting the number of data shards and the batch size.
  • The code is minimal, hackable, and designed to be accessible, running on vanilla PyTorch with potential adjustments for different GPU setups.
  • nanochat is inspired by nanoGPT and modded-nanoGPT, with acknowledgments to HuggingFace, Lambda, and Alec Radford.
  • The project is open source under the MIT license and encourages citation in research.
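
The cost figures in the summary can be checked with a little arithmetic. The sketch below uses only the numbers quoted above ($100, about 4 hours, 8 GPUs) to derive the hourly rate they imply. The projected run times for the $300 and $1000 tiers assume the same hourly rate applies, which the summary does not state, and the function names are illustrative rather than part of nanochat.

```python
# Figures quoted in the summary: the $100 tier takes about 4 hours on one 8xH100 node.
BUDGET_USD = 100.0
RUN_HOURS = 4.0
GPUS_PER_NODE = 8


def implied_node_rate(budget_usd: float, hours: float) -> float:
    """Hourly node price implied by a total budget and a run time."""
    return budget_usd / hours


def hours_for_budget(budget_usd: float, node_rate: float) -> float:
    """Node-hours that a budget buys at a fixed hourly rate."""
    return budget_usd / node_rate


node_rate = implied_node_rate(BUDGET_USD, RUN_HOURS)
gpu_rate = node_rate / GPUS_PER_NODE

print(f"implied node rate: ${node_rate:.2f}/hour")
print(f"implied per-GPU rate: ${gpu_rate:.3f}/hour")

# Assumption (not from the summary): higher tiers are billed at the same hourly rate.
for tier_usd in (300.0, 1000.0):
    tier_hours = hours_for_budget(tier_usd, node_rate)
    print(f"${tier_usd:.0f} tier: about {tier_hours:.0f} node-hours")
```

Under these assumptions the node costs about $25 per hour, so the $300 and $1000 budgets would buy roughly 12 and 40 node-hours. Actual cloud prices and run times will vary.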