Hasty Briefs (beta)


Alibaba open-sources Qwen3.6-35B-A3B, a 35B MoE model with 3B active parameters

6 hours ago
  • #open-source-ai
  • #multimodal-ai
  • #large-language-model
  • Qwen3.6-35B-A3B is a post-trained model released by the Qwen team, offering improved stability and utility based on community feedback.
  • It features 35B total parameters (3B activated), a context length of up to 262,144 tokens (extendable to 1,010,000 with YaRN), and supports multimodal inputs including images and videos.
  • Key enhancements include better agentic coding for frontend workflows and repository-level reasoning, along with thinking preservation, which retains reasoning context from earlier messages in a conversation.
  • The model is compatible with various inference frameworks such as SGLang, vLLM, KTransformers, and Hugging Face Transformers for deployment and serving.
  • It achieves strong benchmark results in coding (e.g., SWE-bench, Terminal-Bench), knowledge (e.g., MMLU-Pro, C-Eval), and vision-language tasks (e.g., MMMU, MathVista).
  • The team recommends different sampling parameters depending on mode (thinking vs. non-thinking) and task, and the model supports tool calling and agent applications via Qwen-Agent.
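
Since vLLM and SGLang both expose an OpenAI-compatible HTTP API once a model is served, querying a deployment can be sketched as below. The endpoint URL, port, and model identifier are illustrative assumptions for a local setup, not values from the brief; the payload is only constructed here, not sent.

```python
import json

# Hypothetical endpoint for a locally served instance (e.g. started with
# vLLM's `vllm serve` or SGLang's launch_server); adjust to your deployment.
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_chat_request(prompt: str, model: str = "Qwen/Qwen3.6-35B-A3B") -> dict:
    """Build an OpenAI-compatible chat-completion payload for the served model."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": 1024,
    }

payload = build_chat_request("Summarize repository-level reasoning in one sentence.")
print(json.dumps(payload, indent=2))
```

Posting this payload with any HTTP client, or pointing the `openai` Python SDK at the local base URL, returns a standard chat-completion response regardless of which serving framework is behind it.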
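
The mode-dependent sampling and tool-calling support mentioned above can be sketched together. The numeric sampling values and the `get_weather` tool below are illustrative placeholders, not the Qwen team's official recommendations; consult the model card for the real settings.

```python
def sampling_params(thinking: bool) -> dict:
    """Pick sampling parameters by mode. Values are illustrative only;
    the official per-mode recommendations live in the model card."""
    if thinking:
        return {"temperature": 0.6, "top_p": 0.95}
    return {"temperature": 0.7, "top_p": 0.8}

# A tool definition in the OpenAI function-calling schema, which
# OpenAI-compatible servers accept. `get_weather` is a hypothetical tool.
WEATHER_TOOL = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Return the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

# Assemble a request that combines mode-specific sampling with a tool list.
request = {
    "model": "Qwen/Qwen3.6-35B-A3B",
    "messages": [{"role": "user", "content": "What's the weather in Hangzhou?"}],
    "tools": [WEATHER_TOOL],
    **sampling_params(thinking=True),
}
print(request["temperature"])
```

For higher-level agent applications, the brief points to Qwen-Agent, which wraps this tool-calling plumbing behind an agent abstraction.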