NVIDIA releases open dataset and two models for multilingual speech AI
7 days ago
- #Multilingual
- #AI
- #Speech Recognition
- NVIDIA introduces a new dataset and models supporting 25 European languages for AI speech recognition and translation.
- Granary, an open-source multilingual speech dataset, contains around a million hours of audio for AI training.
- Two models accompany the dataset: NVIDIA Canary-1b-v2, the larger of the two, targets high-accuracy transcription and translation, while the smaller Parakeet-tdt-0.6b-v3 targets fast, high-throughput transcription.
- Granary addresses data scarcity by processing public speech audio into training-ready data without human annotation, which extends coverage to underrepresented languages.
- The new models and dataset are available on Hugging Face, with the methodology shared to accelerate speech AI innovation.