Qwen3-4B-Thinking-2507
18 days ago
- #AI
- #Language Model
- #Qwen3
- Qwen3-4B-Thinking-2507 delivers significantly improved performance on reasoning tasks, including logical reasoning, mathematics, science, and coding, as well as on academic benchmarks that typically require human expertise.
- Enhanced general capabilities such as instruction following, tool usage, text generation, and alignment with human preferences.
- Supports enhanced 256K-token long-context understanding and an increased thinking length, recommended for highly complex reasoning tasks.
- Model features include 4.0B parameters, 36 layers, grouped-query attention with 32 query heads and 8 key/value heads, and a native context length of 262,144 tokens (see the config-check sketch below).
- Performance benchmarks show improvements in knowledge, reasoning, coding, alignment, agent tasks, and multilingualism.
- Quickstart guide provided for using the model with Hugging Face transformers, including code snippets for text generation (a condensed sketch appears below).
- Deployment options include SGLang, vLLM, Ollama, LM Studio, MLX-LM, llama.cpp, and KTransformers; a vLLM sketch is shown below.
- Agentic use is supported via Qwen-Agent, which encapsulates tool-calling templates and parsers to reduce coding complexity (see the sketch below).
- Best practices include recommended sampling parameters, an adequate output length, and standardized output formats for benchmarking (a sampling-config sketch follows below).
- Citation information provided for referencing the Qwen3 Technical Report.
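
As a quick check on the architecture bullet above, here is a minimal sketch that reads the published config through Hugging Face transformers; the field names are the standard Qwen3 config keys, assumed rather than confirmed against this specific repository:

```python
from transformers import AutoConfig

# Fetch the model configuration from the Hugging Face Hub.
cfg = AutoConfig.from_pretrained("Qwen/Qwen3-4B-Thinking-2507")

# These standard fields should line up with the spec bullet:
# 36 layers, 32 query heads, 8 key/value heads, 262,144-token context.
print("layers:", cfg.num_hidden_layers)
print("query heads:", cfg.num_attention_heads)
print("key/value heads:", cfg.num_key_value_heads)
print("native context length:", cfg.max_position_embeddings)
```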
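A condensed quickstart sketch with Hugging Face transformers, in the spirit of the snippets the card provides; the `</think>` token id (151668) and the reasoning/answer split follow the pattern used in published Qwen3 quickstarts and should be verified against the actual README:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-4B-Thinking-2507"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, torch_dtype="auto", device_map="auto"
)

messages = [{"role": "user", "content": "Give me a short introduction to LLMs."}]
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer([text], return_tensors="pt").to(model.device)

# Thinking models need a generous budget for the reasoning trace.
generated = model.generate(**inputs, max_new_tokens=32768)
output_ids = generated[0][len(inputs.input_ids[0]):].tolist()

# Split the reasoning trace from the final answer at the closing
# </think> token (id 151668 per the published quickstart pattern).
try:
    idx = len(output_ids) - output_ids[::-1].index(151668)
except ValueError:
    idx = 0  # no </think> found; treat everything as the answer

thinking = tokenizer.decode(output_ids[:idx], skip_special_tokens=True)
answer = tokenizer.decode(output_ids[idx:], skip_special_tokens=True)
print("thinking:", thinking)
print("answer:", answer)
```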
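Of the listed deployment options, vLLM also exposes an offline Python API; a hedged sketch, where the context-length cap and sampling values are assumptions chosen for a single-GPU setup rather than figures from the card:

```python
from vllm import LLM, SamplingParams

# The native context is 262,144 tokens, but a smaller cap keeps the
# KV cache within a single GPU's memory for this sketch.
llm = LLM(model="Qwen/Qwen3-4B-Thinking-2507", max_model_len=32768)

# Thinking-mode style sampling (assumed values; check the model card).
params = SamplingParams(temperature=0.6, top_p=0.95, top_k=20, max_tokens=8192)

outputs = llm.chat(
    [{"role": "user", "content": "Explain grouped-query attention briefly."}],
    params,
)
print(outputs[0].outputs[0].text)
```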
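A hedged Qwen-Agent sketch following the library's Assistant pattern; the endpoint URL, API key, and tool choice are illustrative placeholders, with the model assumed to be served behind an OpenAI-compatible endpoint (e.g. via vLLM or SGLang):

```python
from qwen_agent.agents import Assistant

# Point Qwen-Agent at an OpenAI-compatible endpoint serving the model.
llm_cfg = {
    "model": "Qwen3-4B-Thinking-2507",
    "model_server": "http://localhost:8000/v1",  # hypothetical local endpoint
    "api_key": "EMPTY",
}

# Qwen-Agent bundles tools such as a code interpreter; it handles the
# tool-calling templates and parsers internally, as the bullet notes.
bot = Assistant(llm=llm_cfg, function_list=["code_interpreter"])

messages = [{"role": "user", "content": "Use the tool to compute 12345 * 6789."}]
responses = []
for responses in bot.run(messages=messages):
    pass  # bot.run streams partial results; keep the final response list
print(responses)
```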
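For the best-practices bullet, a small sketch collecting the sampling settings Qwen typically recommends for its thinking models into a transformers GenerationConfig; treat the exact values as assumptions to confirm against the model card:

```python
from transformers import GenerationConfig

gen_cfg = GenerationConfig(
    do_sample=True,
    temperature=0.6,   # assumed thinking-mode default
    top_p=0.95,
    top_k=20,
    min_p=0.0,
    max_new_tokens=32768,  # "adequate output length" for long reasoning
)

# Pass it to generate, e.g.:
# model.generate(**inputs, generation_config=gen_cfg)
```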