
Arcee Trinity Mini: US-Trained MoE Model

9 days ago
  • #AI
  • #Open Source
  • #Machine Learning
  • Mergekit returns to the GNU Lesser General Public License v3, effective October 31, 2025.
  • Arcee introduces Trinity Mini, a compact MoE model trained in the U.S., offering open weights and strong reasoning.
  • Chinese labs such as Qwen and DeepSeek currently lead in open-weight MoE models.
  • With the Trinity family, Arcee AI aims to provide open-weight models trained end-to-end in the United States.
  • Trinity Nano and Mini are available now (a minimal loading sketch follows this list); Trinity Large is still training and will arrive in January 2026.
  • Trinity Mini is a fully post-trained reasoning model, while Trinity Nano is an experimental chat model.
  • Arcee shifted from post-training open base models to training its own foundation models, enabling long-term improvements.
  • AFM-4.5B was its initial dense-model experiment and led to the development of Trinity.
  • Trinity uses the afmoe architecture, combining gated attention, the Muon optimizer, and a U.S.-controlled data pipeline.
  • The attention stack combines grouped-query attention, gated attention, and an interleaved local/global attention pattern (see the attention sketch after this list).
  • MoE layers follow the DeepSeekMoE design, with 128 routed experts and 8 active per token (see the routing sketch below).
  • Training uses the Muon optimizer with TorchTitan in bf16 precision, over a curriculum of 10T tokens across three phases (a sketch of Muon's core update also follows).
  • Trinity Large is a 420B-parameter model with 13B active parameters per token.
  • Arcee encourages the community to test and provide feedback on Trinity models to shape future developments.
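
Since Trinity Nano and Mini ship as open weights, a standard Hugging Face transformers loading flow should apply. The sketch below is a minimal example under assumptions: the repository id arcee-ai/Trinity-Mini and the trust_remote_code flag for the afmoe architecture are guesses, not details confirmed by the post.

```python
# Minimal sketch of loading an open-weight Trinity checkpoint with transformers.
# The repo id and the trust_remote_code flag are assumptions, not confirmed details.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "arcee-ai/Trinity-Mini"  # hypothetical repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # the post says training ran in bf16
    device_map="auto",
    trust_remote_code=True,       # afmoe may not yet be in mainline transformers
)

messages = [{"role": "user", "content": "Explain mixture-of-experts in one sentence."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
output = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```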
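
The attention bullets mention three ingredients: grouped-query attention, a learned output gate, and an interleave of local (sliding-window) and global layers. The PyTorch sketch below illustrates each generically; head counts, window size, and the sigmoid gate placement are illustrative assumptions, not the actual afmoe layer.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GatedGQAttention(nn.Module):
    """Toy grouped-query attention with a sigmoid output gate.

    Dimensions, gate placement, and the sliding-window mask are assumptions
    for illustration, not the actual afmoe layer.
    """
    def __init__(self, d_model=512, n_heads=8, n_kv_heads=2, window=None):
        super().__init__()
        self.n_heads, self.n_kv_heads, self.window = n_heads, n_kv_heads, window
        self.d_head = d_model // n_heads
        self.q_proj = nn.Linear(d_model, n_heads * self.d_head, bias=False)
        self.kv_proj = nn.Linear(d_model, 2 * n_kv_heads * self.d_head, bias=False)
        self.gate_proj = nn.Linear(d_model, d_model, bias=False)  # learned output gate
        self.o_proj = nn.Linear(d_model, d_model, bias=False)

    def forward(self, x):
        b, t, _ = x.shape
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k, v = self.kv_proj(x).chunk(2, dim=-1)
        k = k.view(b, t, self.n_kv_heads, self.d_head).transpose(1, 2)
        v = v.view(b, t, self.n_kv_heads, self.d_head).transpose(1, 2)
        # Grouped-query attention: each KV head serves n_heads // n_kv_heads query heads.
        rep = self.n_heads // self.n_kv_heads
        k, v = k.repeat_interleave(rep, dim=1), v.repeat_interleave(rep, dim=1)

        # Causal mask; a "local" layer additionally restricts attention to a window.
        mask = torch.ones(t, t, dtype=torch.bool, device=x.device).tril()
        if self.window is not None:
            mask &= ~torch.ones(t, t, dtype=torch.bool, device=x.device).tril(-self.window)
        out = F.scaled_dot_product_attention(q, k, v, attn_mask=mask)
        out = out.transpose(1, 2).reshape(b, t, -1)
        # Gated attention: modulate the attention output with a sigmoid gate of the input.
        out = out * torch.sigmoid(self.gate_proj(x))
        return self.o_proj(out)

# Interleaved local/global pattern: e.g. three sliding-window layers, then one global.
layers = nn.ModuleList(
    GatedGQAttention(window=None if (i + 1) % 4 == 0 else 256) for i in range(8)
)
```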
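
For the MoE layers, a DeepSeekMoE-style router sends each token to a small subset of experts. The sketch below shows plain top-k routing with 128 routed experts and 8 active per token, matching the numbers above; the shared experts and load-balancing terms that DeepSeekMoE also uses are omitted, and all hidden sizes are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKMoE(nn.Module):
    """Sketch of a DeepSeekMoE-style layer: 128 routed experts, 8 active per token.

    Hidden sizes, the softmax-then-top-k router, and the omission of shared
    experts and the auxiliary load-balancing loss are simplifications.
    """
    def __init__(self, d_model=256, d_ff=512, n_experts=128, top_k=8):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):                                   # x: (tokens, d_model)
        probs = F.softmax(self.router(x), dim=-1)
        weights, idx = probs.topk(self.top_k, dim=-1)       # 8 experts per token
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize gates
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e in idx[:, slot].unique():
                rows = idx[:, slot] == e                    # tokens routed to expert e
                out[rows] += weights[rows, slot, None] * self.experts[int(e)](x[rows])
        return out

tokens = torch.randn(16, 256)
print(TopKMoE()(tokens).shape)   # torch.Size([16, 256])
```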
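
Muon, as publicly described, replaces the raw momentum update for 2-D weight matrices with an approximately orthogonalized one computed by a Newton-Schulz iteration. The sketch below shows that core step in isolation; it is a generic illustration of Muon using the commonly cited quintic coefficients, not Arcee's TorchTitan training code.

```python
import torch

def newton_schulz_orthogonalize(g, steps=5, eps=1e-7):
    """Approximately map a momentum matrix to the nearest orthogonal matrix.

    This is the core of Muon as publicly described; the coefficients are the
    widely used quintic Newton-Schulz values, not Arcee-specific choices.
    """
    a, b, c = 3.4445, -4.7750, 2.0315
    x = g.float()
    transposed = x.shape[0] > x.shape[1]
    if transposed:                       # iterate on the wide orientation
        x = x.T
    x = x / (x.norm() + eps)             # normalize so the iteration converges
    for _ in range(steps):
        s = x @ x.T
        x = a * x + (b * s + c * s @ s) @ x
    return (x.T if transposed else x).to(g.dtype)

def muon_step(weight, grad, momentum, lr=0.02, beta=0.95):
    """One hedged Muon-style update for a single 2-D weight tensor."""
    momentum.mul_(beta).add_(grad)                    # heavy-ball momentum
    update = newton_schulz_orthogonalize(momentum)    # orthogonalized direction
    weight.add_(update, alpha=-lr)
    return weight, momentum

w = torch.randn(64, 128)
m = torch.zeros_like(w)
g = torch.randn_like(w)
muon_step(w, g, m)
```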