Drax: Speech Recognition with Discrete Flow Matching
13 days ago
- #flow-matching
- #ASR
- #non-autoregressive
- Drax is a discrete flow matching framework for automatic speech recognition (ASR) that enables efficient parallel decoding.
- It constructs an audio-conditioned probability path whose intermediate states resemble the errors the model is likely to make during inference.
- Theoretical analysis links the generalization gap to divergences between training and inference occupancies.
- Empirical evaluation shows Drax achieves state-of-the-art recognition accuracy with improved efficiency.
- Drax offers better control over the accuracy-efficiency trade-off than autoregressive models.
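
To make the idea of an audio-conditioned probability path more concrete, here is a minimal sketch of how a noisy training state could be sampled token by token from a three-way mixture of a source (noise) sequence, an audio-conditioned intermediate sequence, and the target transcript. Everything specific here is an assumption for illustration and is not taken from the Drax paper: the names `path_weights` and `sample_path_state`, the quadratic weight schedule, the uniform-noise source, and the `mid_probs` array standing in for an audio-conditioned distribution.

```python
import numpy as np


def path_weights(t):
    """Hypothetical schedule: mixture weights for (source, intermediate, target).

    The three weights sum to 1 for any t in [0, 1]; at t=0 all mass is on the
    source, at t=1 all mass is on the target.
    """
    return np.array([(1.0 - t) ** 2, 2.0 * t * (1.0 - t), t ** 2])


def sample_path_state(target, mid_probs, t, vocab_size, rng):
    """Sample a noisy state x_t position by position from the mixture path.

    target:    (L,) int array of ground-truth token ids
    mid_probs: (L, V) array; each row is a distribution over the vocabulary,
               standing in for an audio-conditioned intermediate distribution
    t:         time in [0, 1]
    """
    seq_len = target.shape[0]
    # Source tokens: uniform noise over the vocabulary (an assumption).
    source = rng.integers(0, vocab_size, size=seq_len)
    # Intermediate tokens: plausible-but-possibly-wrong guesses given the audio.
    middle = np.array(
        [rng.choice(vocab_size, p=mid_probs[i]) for i in range(seq_len)]
    )
    # For each position, pick which of the three components supplies the token.
    choice = rng.choice(3, size=seq_len, p=path_weights(t))
    candidates = np.stack([source, middle, target])  # shape (3, L)
    return candidates[choice, np.arange(seq_len)]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    target = np.array([3, 1, 4, 1, 5])
    mid_probs = np.full((5, 8), 1.0 / 8.0)  # placeholder distribution
    print(sample_path_state(target, mid_probs, 0.5, 8, rng))
```

In this sketch the middle component is what exposes the model to error-like states during training: at intermediate times many positions hold tokens drawn from `mid_probs` rather than pure noise or the ground truth. A real system would presumably obtain that distribution from the audio rather than from a fixed placeholder array.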