Show HN: Create-LLM – Train your own LLM in 60 seconds
6 months ago
- #language-model
- #machine-learning
- #training
- create-llm is a tool for quickly setting up custom language model training projects.
- It offers four templates: NANO (1M params), TINY (6M params), SMALL (100M params), and BASE (1B params).
- Each generated project includes the full training pipeline: model architecture, data preprocessing, tokenizer training, and deployment tools.
- It includes features like auto-detection of vocab size, warnings for model/data mismatches, and suggestions for optimal hyperparameters.
- Optional integrations with WandB, HuggingFace, and SynthexAI are available.
- The templates span use cases from learning and prototyping up to production and research.
- It supports real-time training monitoring, model comparison, and automatic checkpoint management.
- Minimum data requirements and data quality tips are provided to avoid overfitting.
- Deployment options include Hugging Face Hub, Replicate, Docker, and cloud platforms.
- Common issues and solutions are documented, such as vocab size mismatch and CUDA out of memory errors.
- The tool requires Node.js, npm, Python, and PyTorch, with specific hardware recommendations for each template.
- Contributions are welcome, with areas ranging from bug fixes to new features and integrations.
- Future plans include more model architectures, distributed training support, and advanced optimization techniques.
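The vocab-size mismatch mentioned above (a tokenizer producing ids that exceed the model's embedding table) is the kind of thing the tool auto-detects. A minimal sketch of such a check, with a hypothetical `check_vocab` helper that is not part of create-llm's actual API:

```python
def check_vocab(tokenizer_vocab_size: int, model_vocab_size: int) -> str:
    """Return a short diagnostic string, mirroring the kind of warning
    create-llm emits. Hypothetical illustration, not the tool's real code."""
    if tokenizer_vocab_size > model_vocab_size:
        # Any token id >= model_vocab_size would index past the embedding
        # table and crash (or silently corrupt) training.
        return (f"mismatch: tokenizer has {tokenizer_vocab_size} ids but the "
                f"embedding table only has {model_vocab_size} rows")
    return "ok"


print(check_vocab(32_000, 16_000))  # flags the mismatch
print(check_vocab(8_000, 8_000))    # passes
```

The fix is usually either retraining the tokenizer with a smaller vocab or resizing the model's embedding layer to match.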
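On minimum data requirements: a common rule of thumb (echoing the Chinchilla scaling result) is on the order of 20 training tokens per model parameter; much less than that and small models memorize their corpus and overfit. A rough sketch of that arithmetic, assuming the 20x heuristic rather than any figure published by create-llm itself:

```python
def min_tokens(params: int, tokens_per_param: int = 20) -> int:
    """Rough floor on training-set size in tokens.

    The 20x ratio is a widely used heuristic (assumption, not a
    create-llm-documented number); treat it as an order-of-magnitude guide.
    """
    return params * tokens_per_param


# The TINY template (6M params) would want on the order of 120M tokens.
print(f"{min_tokens(6_000_000):,}")
```

By this measure the NANO (1M) and TINY (6M) templates are trainable on modest corpora, while BASE (1B) needs a genuinely large dataset.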
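Automatic checkpoint management typically means keeping only the best few checkpoints by validation loss and deleting the rest. A minimal sketch of that policy, with a hypothetical `prune_checkpoints` helper (the actual mechanism inside create-llm may differ):

```python
import heapq


def prune_checkpoints(checkpoints: dict[str, float], keep: int = 3) -> list[str]:
    """Given {checkpoint_path: validation_loss}, return the paths to delete,
    keeping the `keep` checkpoints with the lowest loss. Illustrative only."""
    keepers = set(heapq.nsmallest(keep, checkpoints, key=checkpoints.get))
    return sorted(path for path in checkpoints if path not in keepers)


ckpts = {"step100.pt": 3.2, "step200.pt": 2.9, "step300.pt": 2.7, "step400.pt": 2.8}
print(prune_checkpoints(ckpts, keep=3))  # only the worst checkpoint is pruned
```

Keeping losses alongside paths also makes "model comparison" cheap: the best checkpoint is just the minimum of the same mapping.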