Hasty Briefs (beta)

A new, 200% faster DeepSeek R1-0528 variant appears from German lab TNG

10 months ago
  • #Enterprise
  • #AI
  • #Open Source
  • DeepSeek, a Chinese AI startup, released its latest open-source model, R1-0528, which is being widely adopted due to its permissive Apache 2.0 license.
  • TNG Technology Consulting GmbH released DeepSeek-TNG R1T2 Chimera, an adaptation of R1-0528, which is 20% faster than the original R1 and more than twice as fast as R1-0528.
  • R1T2 achieves 90% of R1-0528's intelligence benchmarks while using less than 40% of its output tokens, making it more efficient and cost-effective.
  • The model uses TNG's Assembly-of-Experts (AoE) method, which merges weight tensors from multiple pre-trained models without further fine-tuning.
  • AoE differs from Mixture-of-Experts (MoE) by focusing on merging expert tensors rather than dynamically activating experts at runtime.
  • R1T2 is designed for high reasoning capability with concise responses, making it ideal for enterprise and research use.
  • The model is available under an MIT License on Hugging Face, but has limitations in function calling and tool use.
  • EU users should assess compliance with the upcoming EU AI Act, while U.S. companies have more flexibility.
  • TNG Technology Consulting GmbH, founded in 2001, specializes in AI and software development, serving major industries.
  • For enterprises, R1T2 offers lower inference costs, high reasoning quality, and open-source modifiability, though with some limitations.
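The Assembly-of-Experts idea described above (merging weight tensors from multiple pre-trained parent models, with no further fine-tuning) can be sketched in a few lines. This is a minimal illustrative sketch, not TNG's actual recipe: the function name, the `"experts"` naming convention, and the simple linear interpolation are all assumptions for illustration.

```python
def assemble_experts(parent_a, parent_b, alpha=0.5, expert_key="experts"):
    """Merge two checkpoints (dicts of name -> weight list).

    Tensors whose names contain `expert_key` are linearly interpolated
    between the two parents; all other tensors are copied from parent_a.
    This mirrors the AoE idea of selectively merging expert tensors
    offline, as opposed to MoE's dynamic expert routing at runtime.
    """
    merged = {}
    for name, w_a in parent_a.items():
        if expert_key in name and name in parent_b:
            # Interpolate matching expert weights: (1 - alpha)*A + alpha*B.
            merged[name] = [(1 - alpha) * x + alpha * y
                            for x, y in zip(w_a, parent_b[name])]
        else:
            # Non-expert tensors are taken from one parent unchanged.
            merged[name] = list(w_a)
    return merged

# Toy usage with two tiny "checkpoints" (flattened weights as lists).
a = {"experts.0.w": [1.0, 1.0], "embed.w": [0.0, 0.0]}
b = {"experts.0.w": [0.0, 0.0], "embed.w": [1.0, 1.0]}
child = assemble_experts(a, b, alpha=0.5)
# Expert weights are blended; the embedding stays as in parent_a.
```

In a real setting the "weights" would be model tensors (e.g. PyTorch state dicts) and the merge rule per tensor would be chosen far more carefully, but the key property survives even in this toy: the child is built purely from existing weights, with zero additional training.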
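The token-efficiency claim translates directly into the lower inference costs mentioned for enterprises: at the same per-token price, under 40% of the output tokens means under 40% of the output cost. A toy calculation (the price and token budget are hypothetical placeholders, not published figures):

```python
PRICE_PER_M_TOKENS = 2.00  # hypothetical USD price per 1M output tokens

def output_cost(tokens: int) -> float:
    """Cost of generating `tokens` output tokens at the assumed price."""
    return tokens / 1_000_000 * PRICE_PER_M_TOKENS

r1_0528_tokens = 1_000_000                 # illustrative baseline budget
r1t2_tokens = int(0.4 * r1_0528_tokens)    # <40% of baseline, per the claim

saving = 1 - output_cost(r1t2_tokens) / output_cost(r1_0528_tokens)
# saving == 0.6, i.e. roughly a 60% cut in output-token spend.
```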