
Meta Omnilingual ASR: Advancing Automatic Speech Recognition for 1600 Languages

  • #Multilingual
  • #AI
  • #Speech Recognition
  • Meta's FAIR team introduces Omnilingual ASR, supporting over 1,600 languages, including 500 low-resource languages.
  • Omnilingual wav2vec 2.0, a 7B-parameter model for multilingual speech representation, is open-sourced.
  • The Omnilingual ASR Corpus is released, featuring transcribed speech in 350 underserved languages.
  • The architecture pairs a scaled wav2vec 2.0 encoder with two decoder variants that emit character tokens: a CTC decoder and an LLM-inspired transformer decoder (LLM-ASR).
  • LLM-ASR achieves state-of-the-art performance, with character error rates below 10% for 78% of languages (see the CER sketch after this list).
  • In-context learning allows transcription of unsupported languages from just a few paired audio-text samples (a hypothetical sketch follows this list).
  • A suite of models is released under the Apache 2.0 license, from lightweight 300M versions to powerful 7B models.
  • Collaboration with global partners and local communities to collect and transcribe underrepresented languages.
  • The Omnilingual ASR Corpus is the largest ultra-low-resource spontaneous ASR dataset available.
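
For reference, the character error rate (CER) cited above is the character-level edit distance between the model's hypothesis and the reference transcript, divided by the reference length and reported as a percentage. A minimal, self-contained sketch of the metric (not code from the Omnilingual ASR release):

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance over reference length, in percent."""
    ref, hyp = list(reference), list(hypothesis)
    # Dynamic-programming edit distance between the two character sequences.
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, h in enumerate(hyp, start=1):
            curr[j] = min(
                prev[j] + 1,              # deletion
                curr[j - 1] + 1,          # insertion
                prev[j - 1] + (r != h),   # substitution (or match)
            )
        prev = curr
    return 100.0 * prev[-1] / max(len(ref), 1)


if __name__ == "__main__":
    # A CER below 10 means fewer than one character error per ten reference characters.
    print(cer("omnilingual speech recognition", "omnilingual speach recogniton"))  # ~6.7
```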
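The in-context transcription flow mentioned in the list can be pictured as below. This is a hypothetical sketch only: `ContextExample`, `transcribe_with_context`, and the `encode_audio`/`decode` methods are illustrative placeholders, not the released Omnilingual ASR API. It shows the shape of the interaction, in which a few paired audio/text examples in the new language are supplied as context and no fine-tuning is performed.

```python
# Hypothetical sketch of the in-context learning flow described above.
# None of these names come from the released package; they only illustrate
# that the LLM-style decoder conditions on a few (audio, transcript) pairs
# before decoding the target utterance in an otherwise unsupported language.
from dataclasses import dataclass
from typing import List


@dataclass
class ContextExample:
    audio_path: str   # path to a short recording in the target language
    transcript: str   # its ground-truth transcription


def transcribe_with_context(model, target_audio: str,
                            examples: List[ContextExample]) -> str:
    """Build a context of paired examples, then decode the target audio.

    `model` stands in for an Omnilingual ASR checkpoint with an LLM-style
    decoder; `encode_audio` and `decode` are assumed methods for illustration.
    """
    context = [(model.encode_audio(ex.audio_path), ex.transcript)
               for ex in examples]
    # The decoder attends over the example pairs as context, so the new
    # language is handled without any gradient updates.
    return model.decode(context=context, audio=model.encode_audio(target_audio))
```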