How to run Qwen 3.5 locally


Qwen3.5 is Alibaba’s new model family. It includes Qwen3.5-27B, Qwen3.5-35B-A3B, Qwen3.5-122B-A10B, and Qwen3.5-397B-A17B, plus smaller models (0.8B, 2B, 4B, 9B).
The models support a 256K context window and 201 languages, and excel at agentic coding, vision, chat, and long-context tasks.
An updated quantization algorithm improves chat, coding, long-context, and tool-calling performance.
Thinking and non-thinking modes are available, each with different settings for general and precise tasks.
Hardware requirements vary by model size; llama.cpp is recommended for local inference.
An LM Studio guide is provided for turning the thinking toggle on and off.
Tool calling is supported, and benchmarks show state-of-the-art (SOTA) performance for the quantized models.
Qwen3.5-397B-A17B competes with top-tier models like Gemini 3 Pro and Claude Opus 4.5.
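
Since hardware requirements depend on model size, a rough back-of-the-envelope estimate can help decide which model to try. The sketch below is not from the guide: it assumes the leading number in each model name is the total parameter count in billions, that all weights must be held in memory even for the mixture-of-experts models (where the "A" suffix marks the active parameters), and that a 4-bit quantization averages about 4.5 bits per weight. That 4.5 figure is a hypothetical placeholder, and the estimate ignores KV cache and runtime overhead, so actual usage will be higher.

```python
# Rough weight-memory estimate for quantized models.
# Assumptions (not from the source): the parameter counts are read off the
# model names, and 4.5 bits per weight is an illustrative average for a
# 4-bit quantization. KV cache and runtime overhead are NOT included.

def estimate_weight_memory_gib(total_params_billion: float,
                               bits_per_weight: float) -> float:
    """Return the approximate size of the weights in GiB."""
    total_bytes = total_params_billion * 1e9 * bits_per_weight / 8
    return total_bytes / 2**30


# Total parameters in billions, taken from the model names (assumption).
MODELS = {
    "Qwen3.5-27B": 27,
    "Qwen3.5-35B-A3B": 35,
    "Qwen3.5-122B-A10B": 122,
    "Qwen3.5-397B-A17B": 397,
}

if __name__ == "__main__":
    bits = 4.5  # hypothetical average for a 4-bit quantization
    for name, params in MODELS.items():
        gib = estimate_weight_memory_gib(params, bits)
        print(f"{name}: about {gib:.0f} GiB of weights at {bits} bits/weight")
```

Treat the result as a lower bound when comparing it with available RAM plus VRAM; long contexts add a KV cache on top of the weights.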
