Hasty Briefs

Sarvam 105B, the first competitive Indian open-source LLM

6 hours ago
  • #AI
  • #OpenSource
  • #IndiaAI
  • Sarvam releases two open-source models: Sarvam 30B and Sarvam 105B, trained from scratch in India.
  • Both models are optimized for reasoning, programming, and agentic tasks, with strong performance on Indian language benchmarks.
  • Sarvam 30B is designed for real-time deployment, powering the conversational agent platform Samvaad.
  • Sarvam 105B excels in complex reasoning and agentic workflows, powering the AI assistant Indus.
  • The models use a Mixture-of-Experts (MoE) Transformer architecture for efficient training and deployment (see the routing sketch after this list).
  • Pre-training used large corpora (16T tokens for the 30B model, 12T for the 105B) with an emphasis on reasoning and multilingual content.
  • Supervised fine-tuning drew on high-quality prompts, with dedicated safety tuning for India-specific risk scenarios.
  • Reinforcement learning combined a diverse prompt pool with adaptive sampling to keep the training signal effective.
  • Benchmarks show Sarvam 105B outperforming comparable models in knowledge, reasoning, and agentic tasks.
  • Sarvam 30B performs well on coding and reasoning benchmarks while remaining efficient to deploy.
  • Tokenizer efficiency is optimized for Indian languages, reducing cost and latency (see the tokenizer comparison below).
  • Inference optimizations include kernel-level rewrites and advanced scheduling for high throughput (a serving sketch follows the list).
  • Demos showcase practical applications, including webpage generation, tutoring, and competitive programming.
  • The models are available via API and can be downloaded from AI Kosh and Hugging Face (see the loading example below).
  • The release is framed as a step toward building sovereign AI infrastructure in India.
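
For readers unfamiliar with MoE routing, here is a minimal sketch of a top-k MoE feed-forward layer in PyTorch. It illustrates the general technique only; the layer sizes, expert count, and routing details are illustrative assumptions, not Sarvam's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MoELayer(nn.Module):
    """Toy Mixture-of-Experts feed-forward layer with top-k routing."""

    def __init__(self, d_model=512, d_ff=2048, n_experts=8, top_k=2):
        super().__init__()
        self.top_k = top_k
        self.router = nn.Linear(d_model, n_experts)  # learned gating network
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )

    def forward(self, x):  # x: (tokens, d_model)
        gate_probs = F.softmax(self.router(x), dim=-1)
        weights, idx = torch.topk(gate_probs, self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)  # renormalize over chosen experts
        out = torch.zeros_like(x)
        # Each token is processed only by its top-k experts, so per-token
        # compute scales with k, not with the total number of experts.
        for e, expert in enumerate(self.experts):
            token_ids, slot = (idx == e).nonzero(as_tuple=True)
            if token_ids.numel():
                out[token_ids] += weights[token_ids, slot].unsqueeze(-1) * expert(x[token_ids])
        return out

x = torch.randn(16, 512)
print(MoELayer()(x).shape)  # torch.Size([16, 512])
```

Because each token activates only `top_k` experts, total parameter count can grow without a proportional increase in per-token compute, which is the efficiency argument behind MoE training and serving.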
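The tokenizer-efficiency claim can be checked directly by counting tokens per word ("fertility") on Indian-language text: lower fertility means fewer tokens per request, hence lower cost and latency. A rough sketch using Hugging Face `transformers`; `sarvamai/sarvam-1` is Sarvam's earlier public model, used as a stand-in because the summary does not give the new checkpoints' repo ids.

```python
from transformers import AutoTokenizer

text = "भारत एक विशाल और विविधतापूर्ण देश है।"  # Hindi: "India is a vast and diverse country."

# gpt2 is an English-centric baseline; sarvamai/sarvam-1 is an earlier
# Sarvam release standing in for the new models' tokenizers.
for repo in ["gpt2", "sarvamai/sarvam-1"]:
    tok = AutoTokenizer.from_pretrained(repo)
    n_tokens = len(tok.encode(text))
    n_words = len(text.split())
    print(f"{repo}: {n_tokens} tokens / {n_words} words = fertility {n_tokens / n_words:.2f}")
```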
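The throughput bullet refers to engine-level work (custom kernels, scheduling). As a point of reference only, this is how such models are commonly served with vLLM, which implements paged attention and continuous batching; nothing in the summary confirms vLLM is Sarvam's stack, and the repo id is a placeholder.

```python
from vllm import LLM, SamplingParams

# Placeholder repo id; substitute the published checkpoint name.
llm = LLM(model="sarvamai/sarvam-30b", tensor_parallel_size=1)
params = SamplingParams(temperature=0.7, max_tokens=128)

# vLLM batches incoming prompts continuously, keeping GPU utilization high.
outputs = llm.generate(["Summarize Mixture-of-Experts routing in two sentences."], params)
print(outputs[0].outputs[0].text)
```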
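Finally, a minimal download-and-generate example with `transformers`. The repo id is a hypothetical placeholder; the actual checkpoint names are listed on Sarvam's Hugging Face organization and on AI Kosh.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "sarvamai/sarvam-105b"  # hypothetical id; check the official listing

tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "भारत की राजधानी क्या है?"}]  # "What is the capital of India?"
inputs = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
out = model.generate(inputs, max_new_tokens=64)
print(tok.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```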