Hasty Briefs (beta)

Qwen 3.5 small models out

6 hours ago
  • #AI
  • #LanguageModel
  • #Multimodal
  • Qwen3.5-35B-A3B is a post-trained model available in Hugging Face Transformers format, compatible with various inference frameworks like vLLM, SGLang, and KTransformers.
  • Alibaba Cloud Model Studio provides a managed API service for Qwen3.5, featuring Qwen3.5-Flash with extended context length and built-in tools.
  • Qwen3.5 introduces significant advancements in multimodal learning, architectural efficiency, reinforcement learning scalability, and global linguistic coverage (201 languages).
  • Key enhancements include Unified Vision-Language Foundation, Efficient Hybrid Architecture, Scalable RL Generalization, and Next-Generation Training Infrastructure.
  • Model specifications: 35B total parameters (3B activated), 40 layers, a 262,144-token native context length (extendable to 1M tokens), and a Mixture-of-Experts architecture.
  • Benchmark results show competitive performance across knowledge, instruction following, long context, STEM & reasoning, coding, and multilingual tasks.
  • Vision-language benchmarks demonstrate strong capabilities in STEM, general VQA, text recognition, spatial intelligence, video understanding, and medical VQA.
  • Quickstart guides for serving Qwen3.5 via SGLang, vLLM, KTransformers, and Hugging Face Transformers, with recommendations for optimal performance.
  • API usage examples for text, image, and video inputs, including sampling parameters for different task types (general, coding, reasoning).
  • Agentic usage with Qwen-Agent and Qwen Code covers terminal automation, tool-calling capabilities, and processing ultra-long texts with YaRN scaling.
  • Best practices include recommended sampling parameters, adequate output length (32K-81K tokens), and standardized output formats for benchmarking.
  • A citation is provided for referencing Qwen3.5's work on native multimodal agents.
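For the ultra-long-text bullet, YaRN scaling is typically enabled through a `rope_scaling` entry in the model's configuration. The fragment below is an assumption inferred from the stated lengths (262,144 native × factor 4 ≈ 1M tokens), not a snippet from the post; the quickstart guides give the exact settings.

```json
{
  "rope_scaling": {
    "rope_type": "yarn",
    "factor": 4.0,
    "original_max_position_embeddings": 262144
  }
}
```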
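The API-usage bullet above can be sketched as an OpenAI-compatible chat request with per-task sampling presets. This is a minimal sketch, not the post's own code: the model name, endpoint style, and every numeric value below are illustrative assumptions — consult the Qwen3.5 model card for the actual recommended parameters.

```python
# Sketch of an OpenAI-compatible chat request payload for Qwen3.5,
# as served by vLLM or SGLang. All numeric values are illustrative
# assumptions, not values quoted in the post.

# Hypothetical per-task sampling presets (NOT official values); the
# max_tokens choices reflect the post's 32K-81K output-length guidance.
SAMPLING_PRESETS = {
    "general":   {"temperature": 0.7, "top_p": 0.8,  "max_tokens": 32768},
    "coding":    {"temperature": 0.6, "top_p": 0.95, "max_tokens": 32768},
    "reasoning": {"temperature": 0.6, "top_p": 0.95, "max_tokens": 81920},
}

def build_chat_request(messages, task="general",
                       model="Qwen/Qwen3.5-35B-A3B"):
    """Build a chat-completion payload for an OpenAI-compatible server."""
    if task not in SAMPLING_PRESETS:
        raise ValueError(f"unknown task type: {task!r}")
    return {"model": model, "messages": messages, **SAMPLING_PRESETS[task]}

payload = build_chat_request(
    [{"role": "user", "content": "Explain YaRN context scaling briefly."}],
    task="reasoning",
)
```

The same payload shape works for image and video inputs by swapping the message `content` for the multimodal content-part format the serving framework expects.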