A new, 200% faster DeepSeek R1-0528 variant appears from German lab
10 months ago
- #Enterprise
- #AI
- #Open Source
- DeepSeek, a Chinese AI startup, released R1-0528, an updated version of its open-source R1 reasoning model, which has been widely adopted thanks to its permissive MIT license.
- TNG Technology Consulting GmbH released DeepSeek-TNG R1T2 Chimera, an adaptation of R1-0528, which is 20% faster than the original R1 and more than twice as fast as R1-0528.
- R1T2 retains roughly 90% of R1-0528's scores on intelligence benchmarks while generating fewer than 40% as many output tokens, making it markedly more efficient and cheaper to run.
- The model uses TNG's Assembly-of-Experts (AoE) method, which merges weight tensors from multiple pre-trained models without further fine-tuning.
- AoE differs from Mixture-of-Experts (MoE): MoE is a runtime architecture that activates only a subset of experts per token, whereas AoE merges the expert weight tensors of several parent models at build time to produce a single new model.
- R1T2 is designed for high reasoning capability with concise responses, making it ideal for enterprise and research use.
- The model is available under the MIT License on Hugging Face, though it currently has limitations in function calling and tool use.
- EU users should assess compliance with the upcoming EU AI Act, while U.S. companies have more flexibility.
- TNG Technology Consulting GmbH, founded in 2001, specializes in AI and software development, serving major industries.
- For enterprises, R1T2 offers lower inference costs, high reasoning quality, and open-source modifiability, though with some limitations.
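To make the merging idea above concrete, here is a minimal sketch of combining matching weight tensors from several pre-trained checkpoints by weighted averaging. This is an illustration of the general model-merging concept only; the tensor names, the uniform interpolation, and the `assemble_experts` helper are assumptions, not TNG's actual Assembly-of-Experts procedure, which selects and interpolates specific expert tensors.

```python
import numpy as np

def assemble_experts(checkpoints, weights):
    """Merge same-named weight tensors from several checkpoints by
    weighted averaging (simplified illustration; AoE itself applies
    more selective, per-tensor interpolation)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "mixing weights should sum to 1"
    merged = {}
    # Every checkpoint is assumed to share the same tensor names/shapes.
    for name in checkpoints[0]:
        merged[name] = sum(w * ckpt[name] for w, ckpt in zip(weights, checkpoints))
    return merged

# Toy example: two "parent models" with one shared expert tensor each.
ckpt_a = {"mlp.experts.0.w": np.ones((2, 2))}
ckpt_b = {"mlp.experts.0.w": np.zeros((2, 2))}
merged = assemble_experts([ckpt_a, ckpt_b], weights=[0.75, 0.25])
print(merged["mlp.experts.0.w"])  # every entry is 0.75
```

Because the merge happens on stored weights, no gradient steps or fine-tuning data are needed, which is what makes the approach cheap relative to retraining.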
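The lower-inference-cost claim follows directly from the token figures: if output tokens dominate the bill, using under 40% as many tokens cuts per-answer cost by more than 60%. A back-of-envelope check, where the answer length and per-token price are hypothetical placeholders and only the 40% ratio comes from the article:

```python
PRICE_PER_1K_OUTPUT_TOKENS = 0.01  # hypothetical USD rate, not a real provider price

def answer_cost(output_tokens, price_per_1k=PRICE_PER_1K_OUTPUT_TOKENS):
    """Cost of one answer, billed on output tokens only."""
    return output_tokens / 1000 * price_per_1k

r1_0528_tokens = 10_000                    # assumed answer length for R1-0528
r1t2_tokens = int(r1_0528_tokens * 0.40)   # R1T2 uses <40% of the output tokens

saving = 1 - answer_cost(r1t2_tokens) / answer_cost(r1_0528_tokens)
print(f"{saving:.0%}")  # 60%
```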