Llama-Nemotron: Efficient Reasoning Models

a year ago
  • #AI
  • #Open Source
  • #Machine Learning
  • Introduces the Llama-Nemotron series: an open family of heterogeneous reasoning models built for strong capability and inference efficiency.
  • Three model sizes: Nano (8B), Super (49B), and Ultra (253B); all are competitive with state-of-the-art models such as DeepSeek-R1.
  • The training procedure combines neural architecture search, knowledge distillation, continued pretraining, and reasoning-focused post-training (a generic distillation sketch follows this list).
  • These are the first open-source models with a dynamic reasoning toggle for switching between standard chat and reasoning modes at inference time (see the usage example after this list).
  • The release includes the model weights under the NVIDIA Open Model License, the post-training dataset, and the training codebases (NeMo, NeMo-Aligner, and Megatron-LM).
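
The brief names the training stages without detail, so here is a minimal, generic sketch of the knowledge-distillation step in PyTorch. The temperature, the loss form, and its weighting are illustrative assumptions, not the objective actually used for Llama-Nemotron.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits: torch.Tensor,
                      teacher_logits: torch.Tensor,
                      temperature: float = 2.0) -> torch.Tensor:
    # Soften both distributions with a temperature, then penalize the KL
    # divergence from the teacher's distribution to the student's.
    # NOTE: temperature=2.0 is an assumed, illustrative value.
    t = temperature
    log_p_student = F.log_softmax(student_logits / t, dim=-1)
    p_teacher = F.softmax(teacher_logits / t, dim=-1)
    # The t**2 factor keeps gradient magnitudes comparable across temperatures.
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (t ** 2)

# Example with random logits over a 32k-token vocabulary:
student = torch.randn(4, 32000, requires_grad=True)
teacher = torch.randn(4, 32000)
loss = distillation_loss(student, teacher)
loss.backward()
```

In practice this term is usually mixed with the ordinary cross-entropy loss on ground-truth tokens; the mixing weight is another hyperparameter the brief does not specify.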
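
The reasoning toggle is exposed as plain text in the system prompt rather than as a separate API flag. Below is a minimal sketch using Hugging Face transformers; the model id and the "detailed thinking on/off" wording follow NVIDIA's published model cards, but check the card for your model before relying on them.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nvidia/Llama-3.1-Nemotron-Nano-8B-v1"  # assumed released model id

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

def generate(prompt: str, reasoning: bool) -> str:
    # The toggle lives in the system turn: "detailed thinking on" elicits a
    # long chain-of-thought trace, "detailed thinking off" a direct answer.
    system = "detailed thinking on" if reasoning else "detailed thinking off"
    messages = [
        {"role": "system", "content": system},
        {"role": "user", "content": prompt},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    output = model.generate(input_ids, max_new_tokens=1024)
    # Decode only the newly generated tokens.
    return tokenizer.decode(output[0, input_ids.shape[-1]:],
                            skip_special_tokens=True)

print(generate("How many prime numbers are there below 100?", reasoning=True))
```

Because the toggle is just prompt text, the same deployed weights can serve both cheap chat traffic and expensive reasoning traffic without a model swap.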