Qwen 3.6 27B is the sweet spot for local development

3 days ago

Qwen 3.6 27B is a powerful dense local model praised for punching above its weight, making it the first local model that works well as a general intelligence.
The author recommends Qwen 3.6 27B over the faster mixture-of-experts 35B A3B variant for higher quality output, despite being slower, and demonstrates its capabilities in creative writing and coding tasks like creating a hexagonal minesweeper.
Running Qwen 3.6 locally is straightforward with llama.cpp, supporting quantizations like 8-bit for reduced size and multi-token prediction for speed, and it can be integrated into tools like OpenCode for vibe coding.
Performance tests on an M5 Macbook show Qwen 3.6 27B achieving around 30 tokens per second efficiently using GPU resources, with quantizations allowing it to run on devices with as little as 32 GB RAM.
Benchmarks indicate Qwen 3.6 27B outperforms alternatives like Gemma 4 31B, and the author highlights the benefits of local models for privacy, fine-tuning, and avoiding reliance on subsidized proprietary APIs.

Hasty Briefsbeta