A new, 200% faster DeepSeek R1-0528 variant appears from German lab
10 months ago
- #Enterprise
- #AI
- #Open Source
- DeepSeek, a Chinese AI startup, released R1-0528, an updated version of its open-source R1 reasoning model, which has been widely adopted thanks to its permissive MIT license.
- TNG Technology Consulting GmbH released DeepSeek-TNG R1T2 Chimera, an adaptation of R1-0528, which is 20% faster than the original R1 and more than twice as fast as R1-0528.
- R1T2 retains roughly 90% of R1-0528's scores on intelligence benchmarks while generating fewer than 40% as many output tokens, making it markedly more efficient and cheaper to run.
- The model uses TNG's Assembly-of-Experts (AoE) method, which merges weight tensors from multiple pre-trained models without further fine-tuning.
- AoE differs from Mixture-of-Experts (MoE): MoE is a runtime architecture that activates only a subset of experts per token, whereas AoE merges the expert weight tensors of several parent models at build time to produce a single new model.
- R1T2 is designed for high reasoning capability with concise responses, making it ideal for enterprise and research use.
- The model is available under the MIT License on Hugging Face, though it currently has limitations in function calling and tool use.
- EU users should assess compliance with the upcoming EU AI Act, while U.S. companies have more flexibility.
- TNG Technology Consulting GmbH, founded in 2001, specializes in AI and software development, serving major industries.
- For enterprises, R1T2 offers lower inference costs, high reasoning quality, and open-source modifiability, though with some limitations.
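To make the merging idea above concrete, here is a minimal sketch of combining matching weight tensors from several pre-trained checkpoints by weighted averaging. This is an illustration of the general model-merging concept only; the tensor names, the uniform interpolation, and the `assemble_experts` helper are assumptions, not TNG's actual Assembly-of-Experts procedure, which selects and interpolates specific expert tensors.

```python
import numpy as np

def assemble_experts(checkpoints, weights):
    """Merge same-named weight tensors from several checkpoints by
    weighted averaging (simplified illustration; AoE itself applies
    more selective, per-tensor interpolation)."""
    assert abs(sum(weights) - 1.0) < 1e-9, "mixing weights should sum to 1"
    merged = {}
    # Every checkpoint is assumed to share the same tensor names/shapes.
    for name in checkpoints[0]:
        merged[name] = sum(w * ckpt[name] for w, ckpt in zip(weights, checkpoints))
    return merged

# Toy example: two "parent models" with one shared expert tensor each.
ckpt_a = {"mlp.experts.0.w": np.ones((2, 2))}
ckpt_b = {"mlp.experts.0.w": np.zeros((2, 2))}
merged = assemble_experts([ckpt_a, ckpt_b], weights=[0.75, 0.25])
print(merged["mlp.experts.0.w"])  # every entry is 0.75
```

Because the merge happens on stored weights, no gradient steps or fine-tuning data are needed, which is what makes the approach cheap relative to retraining.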
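The lower-inference-cost claim follows directly from the token figures: if output tokens dominate the bill, using under 40% as many tokens cuts per-answer cost by more than 60%. A back-of-envelope check, where the answer length and per-token price are hypothetical placeholders and only the 40% ratio comes from the article:

```python
PRICE_PER_1K_OUTPUT_TOKENS = 0.01  # hypothetical USD rate, not a real provider price

def answer_cost(output_tokens, price_per_1k=PRICE_PER_1K_OUTPUT_TOKENS):
    """Cost of one answer, billed on output tokens only."""
    return output_tokens / 1000 * price_per_1k

r1_0528_tokens = 10_000                    # assumed answer length for R1-0528
r1t2_tokens = int(r1_0528_tokens * 0.40)   # R1T2 uses <40% of the output tokens

saving = 1 - answer_cost(r1t2_tokens) / answer_cost(r1_0528_tokens)
print(f"{saving:.0%}")  # 60%
```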