GitHub - unslothai/unsloth: Fine-tuning & Reinforcement Learning for LLMs. 🦥 Train OpenAI gpt-oss, DeepSeek, Qwen, Llama, Gemma, TTS 2x faster with 70% less VRAM.
- #fine-tuning
- #machine-learning
- #performance-optimization
- Notebooks are beginner-friendly, allowing users to add datasets, run, and deploy trained models.
- Performance comparison of various models (e.g., gpt-oss, Qwen3, Gemma 3) showing speed and memory improvements.
- Unsloth supports faster embedding fine-tuning (~1.8-3.3x) and new batching algorithms for longer context RL.
- New RoPE and MLP Triton kernels, plus padding-free training and sequence packing, deliver 3x faster training with 30% less VRAM.
- Training a 20B model with >500K context is now possible on an 80GB GPU.
- FP8 Reinforcement Learning is now supported on consumer GPUs.
- Fine-tuning DeepSeek-OCR with Unsloth improved its language understanding by 89%.
- The Unsloth Docker image simplifies setup and avoids environment/dependency issues.
- Vision RL now supports training VLMs with GRPO or GSPO.
- Quantization-Aware Training recovers ~70% of the accuracy lost to quantization.
- Memory-efficient RL delivers faster training with 50% less VRAM and 10× longer context.
- Support for Mistral 3, Gemma 3n, Qwen3, and other models.
- Dynamic 2.0 quants set new benchmarks on 5-shot MMLU & Aider Polyglot.
- Unsloth supports all model types (TTS, BERT, Mamba), full fine-tuning (FFT), and multi-GPU training.
- Long-context Reasoning (GRPO) allows training reasoning models with just 5GB VRAM.
- Unsloth Dynamic 4-bit Quantization increases accuracy with <10% more VRAM than BnB 4-bit.
- Support for Llama 4, Phi-4, Vision models, and Llama 3.3 (70B).
- Cut Cross Entropy supports 89K context for Llama 3.3 (70B) on an 80GB GPU.
- Memory usage cut by 30%, supporting 4x longer context windows.
- Installation guides for pip, Conda, and Docker.
- Example code for fine-tuning gpt-oss-20b provided.
- RL support includes GRPO, GSPO, FP8 training, DrGRPO, DAPO, PPO, and more.
- Benchmarks show Unsloth's speed, VRAM reduction, and longer context capabilities.
- Citations and acknowledgments for contributors and libraries used.
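The fine-tuning example mentioned in the list typically follows Unsloth's standard pattern: load a 4-bit base model, attach LoRA adapters, and hand the result to a TRL trainer. The sketch below is a hedged illustration of that pattern, not the repo's exact example: it assumes the `FastLanguageModel` API, a CUDA GPU, `pip install unsloth`, and illustrative choices for model name, dataset, and LoRA hyperparameters. Imports are deferred inside the function so the file can be inspected without unsloth installed.

```python
# Hedged sketch of an Unsloth QLoRA fine-tuning setup. Running it requires a
# CUDA GPU and `pip install unsloth`; all names and numbers are illustrative.

# Illustrative LoRA hyperparameters, kept at module level so they can be
# inspected without loading any model.
LORA_CONFIG = {
    "r": 16,              # LoRA rank
    "lora_alpha": 16,     # scaling factor
    "target_modules": [   # attention + MLP projections, the usual targets
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
}

def build_trainer(model_name="unsloth/gpt-oss-20b", max_seq_length=2048):
    """Load a 4-bit base model, attach LoRA adapters, return an SFT trainer.

    Imports are deferred so this module imports cleanly on machines
    without unsloth/trl installed.
    """
    from unsloth import FastLanguageModel            # requires CUDA
    from trl import SFTTrainer, SFTConfig
    from datasets import load_dataset

    model, tokenizer = FastLanguageModel.from_pretrained(
        model_name=model_name,
        max_seq_length=max_seq_length,
        load_in_4bit=True,        # 4-bit quantization to cut VRAM
    )
    model = FastLanguageModel.get_peft_model(model, **LORA_CONFIG)

    dataset = load_dataset("yahma/alpaca-cleaned", split="train")  # example
    return SFTTrainer(
        model=model,
        tokenizer=tokenizer,
        train_dataset=dataset,
        args=SFTConfig(max_seq_length=max_seq_length, max_steps=60),
    )
```

Calling `build_trainer().train()` on a GPU machine would run the short fine-tune; the 60-step cap mirrors the quick-start style of the beginner notebooks.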
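The RL methods listed (GRPO, GSPO, DrGRPO, DAPO) all score sampled completions with user-supplied reward functions. As a concrete anchor for what "GRPO" means, here is a minimal self-contained sketch of its group-relative advantage: rewards are z-scored within a group of completions, so no value network is needed. The reward function is a toy stand-in, not Unsloth's.

```python
# Minimal sketch of GRPO's group-relative advantage: each sampled completion
# in a group is scored by a reward function, and its advantage is the
# z-scored reward within the group. The reward function is an illustrative toy.

from statistics import mean, pstdev

def toy_reward(completion: str) -> float:
    """Toy reward: +1 if the answer is wrapped in the expected tags."""
    return 1.0 if "<answer>" in completion and "</answer>" in completion else 0.0

def group_advantages(completions, reward_fn=toy_reward, eps=1e-6):
    """Group-relative advantages: (r_i - mean(r)) / (std(r) + eps)."""
    rewards = [reward_fn(c) for c in completions]
    mu, sigma = mean(rewards), pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]

group = [
    "<answer>42</answer>",   # formatted -> reward 1
    "42",                    # unformatted -> reward 0
    "<answer>41</answer>",   # formatted -> reward 1
    "no idea",               # unformatted -> reward 0
]
print(group_advantages(group))
```

In practice such reward functions are passed to a trainer (e.g. TRL's `GRPOTrainer`, which Unsloth patches); well-formatted completions get positive advantage and are reinforced, the rest are pushed down.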
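Several bullets above tie VRAM savings to longer context (10× longer context, >500K context on 80GB). A back-of-the-envelope KV-cache estimate shows why: cache size grows linearly with context length, so halving per-token memory buys proportionally longer context. All model dimensions below are illustrative assumptions, not Unsloth's measured numbers.

```python
# Back-of-the-envelope KV-cache estimate, illustrating why context length
# dominates VRAM in long-context training/RL. Model dimensions are
# illustrative assumptions, not any specific model's.

def kv_cache_gb(context_len, n_layers=32, n_kv_heads=8, head_dim=128,
                bytes_per_value=2):
    """Key+value cache size for one sequence, in GB.

    2 tensors (K and V) * layers * kv_heads * head_dim * context tokens,
    at `bytes_per_value` bytes each (2 for fp16/bf16).
    """
    values = 2 * n_layers * n_kv_heads * head_dim * context_len
    return values * bytes_per_value / 1024**3

# The cache scales linearly with context: 10x the tokens, 10x the memory,
# so halving per-token cost doubles the context that fits in fixed VRAM.
short = kv_cache_gb(8_192)
long = kv_cache_gb(81_920)   # 10x the context
print(f"{short:.2f} GB vs {long:.2f} GB")   # -> 1.00 GB vs 10.00 GB
```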