GitHub - OpenBMB/VoxCPM: VoxCPM2: Tokenizer-Free TTS for Multilingual Speech Generation, Creative Voice Design, and True-to-Life Cloning

8 hours ago
  • #Voice Cloning
  • #Text-to-Speech
  • #Multilingual AI
  • VoxCPM2 is a tokenizer-free Text-to-Speech system with 2B parameters, trained on over 2 million hours of multilingual speech data.
  • It supports 30 languages, Voice Design, and Controllable Voice Cloning, and outputs 48 kHz studio-quality audio via an end-to-end diffusion-autoregressive architecture.
  • Features include real-time streaming with a low real-time factor (RTF), fully open-source Apache-2.0 licensing, and fine-tuning options such as SFT and LoRA.
  • Performance benchmarks show state-of-the-art results in multilingual TTS tasks, with high intelligibility and similarity scores across languages.
  • Risks include potential misuse for impersonation, variability in controllable generation, and limited coverage of languages outside the supported set.
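
The RTF figure mentioned in the features is the real-time factor: the time spent synthesizing divided by the duration of the audio produced, so a value below 1 means speech is generated faster than it plays back. The sketch below only illustrates that arithmetic. The function name `real_time_factor` and the sample numbers are hypothetical, not part of VoxCPM, and the 48 kHz default simply mirrors the output rate stated in the summary.

```python
def real_time_factor(synthesis_seconds: float, num_samples: int,
                     sample_rate: int = 48_000) -> float:
    """Return synthesis time divided by the duration of the generated audio.

    Hypothetical helper for illustration; it does not call any TTS library.
    """
    if synthesis_seconds < 0:
        raise ValueError("synthesis_seconds must be non-negative")
    if num_samples <= 0 or sample_rate <= 0:
        raise ValueError("num_samples and sample_rate must be positive")
    # Duration of the generated audio in seconds.
    audio_seconds = num_samples / sample_rate
    return synthesis_seconds / audio_seconds


# Made-up example: 96,000 samples at 48 kHz is 2 s of audio.
# If synthesis took 0.5 s, the RTF is 0.5 / 2 = 0.25.
rtf = real_time_factor(synthesis_seconds=0.5, num_samples=96_000)
print(f"RTF = {rtf:.2f}")
```

In practice the synthesis time would be measured around the actual generation call with a timer such as `time.perf_counter()`, and for streaming the figure is usually computed per chunk as well as for the whole utterance.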