Show HN: Create-LLM – Train your own LLM in 60 seconds
6 months ago
- #language-model
- #machine-learning
- #training
- create-llm is a tool for quickly setting up custom language model training projects.
- It offers four templates: NANO (1M params), TINY (6M params), SMALL (100M params), and BASE (1B params).
- Each generated project includes the full training pipeline: model architecture, data preprocessing, tokenizer training, and deployment tools.
- It includes features like auto-detection of vocab size, warnings for model/data mismatches, and suggestions for optimal hyperparameters.
- Optional integrations with WandB, HuggingFace, and SynthexAI are available.
- The templates span use cases from learning and prototyping up to production and research.
- It supports real-time training monitoring, model comparison, and automatic checkpoint management.
- Minimum data requirements and data quality tips are provided to avoid overfitting.
- Deployment options include Hugging Face Hub, Replicate, Docker, and cloud platforms.
- Common issues and solutions are documented, such as vocab size mismatch and CUDA out of memory errors.
- The tool requires Node.js, npm, Python, and PyTorch, with specific hardware recommendations for each template.
- Contributions are welcome, with areas ranging from bug fixes to new features and integrations.
- Future plans include more model architectures, distributed training support, and advanced optimization techniques.
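The vocab-size mismatch mentioned above (a tokenizer producing ids that exceed the model's embedding table) is the kind of thing the tool auto-detects. A minimal sketch of such a check, with a hypothetical `check_vocab` helper that is not part of create-llm's actual API:

```python
def check_vocab(tokenizer_vocab_size: int, model_vocab_size: int) -> str:
    """Return a short diagnostic string, mirroring the kind of warning
    create-llm emits. Hypothetical illustration, not the tool's real code."""
    if tokenizer_vocab_size > model_vocab_size:
        # Any token id >= model_vocab_size would index past the embedding
        # table and crash (or silently corrupt) training.
        return (f"mismatch: tokenizer has {tokenizer_vocab_size} ids but the "
                f"embedding table only has {model_vocab_size} rows")
    return "ok"


print(check_vocab(32_000, 16_000))  # flags the mismatch
print(check_vocab(8_000, 8_000))    # passes
```

The fix is usually either retraining the tokenizer with a smaller vocab or resizing the model's embedding layer to match.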
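On minimum data requirements: a common rule of thumb (echoing the Chinchilla scaling result) is on the order of 20 training tokens per model parameter; much less than that and small models memorize their corpus and overfit. A rough sketch of that arithmetic, assuming the 20x heuristic rather than any figure published by create-llm itself:

```python
def min_tokens(params: int, tokens_per_param: int = 20) -> int:
    """Rough floor on training-set size in tokens.

    The 20x ratio is a widely used heuristic (assumption, not a
    create-llm-documented number); treat it as an order-of-magnitude guide.
    """
    return params * tokens_per_param


# The TINY template (6M params) would want on the order of 120M tokens.
print(f"{min_tokens(6_000_000):,}")
```

By this measure the NANO (1M) and TINY (6M) templates are trainable on modest corpora, while BASE (1B) needs a genuinely large dataset.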
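Automatic checkpoint management typically means keeping only the best few checkpoints by validation loss and deleting the rest. A minimal sketch of that policy, with a hypothetical `prune_checkpoints` helper (the actual mechanism inside create-llm may differ):

```python
import heapq


def prune_checkpoints(checkpoints: dict[str, float], keep: int = 3) -> list[str]:
    """Given {checkpoint_path: validation_loss}, return the paths to delete,
    keeping the `keep` checkpoints with the lowest loss. Illustrative only."""
    keepers = set(heapq.nsmallest(keep, checkpoints, key=checkpoints.get))
    return sorted(path for path in checkpoints if path not in keepers)


ckpts = {"step100.pt": 3.2, "step200.pt": 2.9, "step300.pt": 2.7, "step400.pt": 2.8}
print(prune_checkpoints(ckpts, keep=3))  # only the worst checkpoint is pruned
```

Keeping losses alongside paths also makes "model comparison" cheap: the best checkpoint is just the minimum of the same mapping.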