Hasty Briefs (beta)

Nvidia Nemotron 3 Ultra

8 hours ago
  • #LLM
  • #NVIDIA
  • #AI Research
  • Nemotron 3 Ultra is NVIDIA's most capable model with 550B total and 55B active parameters.
  • Uses a hybrid Mamba-Attention Mixture-of-Experts (MoE) architecture, with LatentMoE for accuracy and Multi-Token Prediction (MTP) layers for faster inference.
  • Pretrained in NVFP4 and post-trained with SFT, RL, and Multi-teacher On-Policy Distillation for improved accuracy.
  • Achieves up to 5.9x higher inference throughput compared to other models and supports up to 1M token context length.
  • Open-source release includes pre-trained, post-trained, and quantized checkpoints, as well as training datasets.
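
To put the parameter counts above in perspective, here is a rough back-of-the-envelope sketch. It uses only the 550B total and 55B active figures from the summary. The memory estimate assumes exactly 4 bits per weight for a 4-bit format and 16 bits for a BF16 baseline, and ignores any scaling-factor or metadata overhead, so treat it as an illustrative lower bound rather than a published figure. The function and variable names are made up for this example.

```python
# Back-of-the-envelope numbers for a sparse MoE model.
# Only the two parameter counts come from the summary; everything else
# is an illustrative assumption.

TOTAL_PARAMS = 550e9   # total parameters (from the summary)
ACTIVE_PARAMS = 55e9   # parameters active per token (from the summary)


def active_fraction(total: float, active: float) -> float:
    """Share of the model's weights used for any single token."""
    return active / total


def weight_gigabytes(params: float, bits_per_weight: int) -> float:
    """Raw weight storage in GB, ignoring scale factors and metadata."""
    return params * bits_per_weight / 8 / 1e9


frac = active_fraction(TOTAL_PARAMS, ACTIVE_PARAMS)
fp4_gb = weight_gigabytes(TOTAL_PARAMS, 4)    # assumed 4 bits per weight
bf16_gb = weight_gigabytes(TOTAL_PARAMS, 16)  # assumed 16-bit baseline

print(f"active fraction: {frac:.0%}")
print(f"weights at 4 bits:  ~{fp4_gb:.0f} GB")
print(f"weights at 16 bits: ~{bf16_gb:.0f} GB")
```

Under these assumptions, roughly one tenth of the weights are exercised per token, which is the usual reason a sparse MoE model can be cheaper to run than a dense model of the same total size.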