Qwen3-ASR Technical Report

a month ago

The report introduces the Qwen3-ASR family, which comprises two ASR models and a forced alignment model.
Qwen3-ASR-1.7B and Qwen3-ASR-0.6B support 52 languages and dialects, leveraging large-scale training data.
Qwen3-ASR-1.7B achieves state-of-the-art (SOTA) performance among open-source ASR models.
Qwen3-ASR-0.6B offers the best accuracy-efficiency trade-off with low latency.
Qwen3-ForcedAligner-0.6B outperforms existing forced alignment models in efficiency and versatility.
Models are released under the Apache 2.0 license to accelerate ASR and audio understanding research.
