Chatterbox, Resemble AI's production-grade open source TTS model
a year ago
- #TTS
- #Open Source
- #AI
- Resemble AI introduces Chatterbox, its first production-grade open source TTS model under MIT license.
- Chatterbox benchmarks favorably against closed-source systems like ElevenLabs in side-by-side evaluations.
- Supports emotion exaggeration control, a feature Resemble AI presents as unique among open source TTS models.
- Suited to memes, videos, games, and AI agents, with a demo available as a Hugging Face Gradio app.
- Resemble AI also offers a competitively priced hosted TTS service with ultra-low latency (under 200 ms) for users who need to scale or tune the model.
- Features include state-of-the-art (SoTA) zero-shot TTS, a 0.5B-parameter Llama backbone, and watermarked outputs.
- Includes easy voice conversion scripts and alignment-informed inference for stability.
- Trained on 0.5M hours of cleaned data.
- Provides usage tips for general and expressive/dramatic speech settings.
- Includes Python code examples for generating TTS with optional different voice prompts.
- All generated audio includes imperceptible neural watermarks for security.
- Encourages community engagement via Discord while emphasizing ethical use.
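
The usage tips and Python examples mentioned in the summary can be sketched roughly as follows. This is a minimal illustration, not the project's own code: `SpeechSettings`, `GENERAL`, `EXPRESSIVE`, `build_generate_kwargs`, and `synthesize` are hypothetical names introduced here, and the preset values (0.5/0.5 for general speech, about 0.7 exaggeration with about 0.3 CFG weight for dramatic speech) are assumptions recalled from the project README. The calls `ChatterboxTTS.from_pretrained`, `model.generate`, and `model.sr` are likewise written as remembered from that README and should be checked against the installed version.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SpeechSettings:
    """Hypothetical container for the two knobs the usage tips discuss."""
    exaggeration: float  # emotion exaggeration intensity
    cfg_weight: float    # classifier-free guidance weight (affects pacing)


# Assumed presets; the exact numbers are a recollection, not taken from the summary.
GENERAL = SpeechSettings(exaggeration=0.5, cfg_weight=0.5)
EXPRESSIVE = SpeechSettings(exaggeration=0.7, cfg_weight=0.3)


def build_generate_kwargs(settings: SpeechSettings,
                          audio_prompt_path: Optional[str] = None) -> dict:
    """Assemble keyword arguments for generation; the voice prompt is optional."""
    kwargs = {
        "exaggeration": settings.exaggeration,
        "cfg_weight": settings.cfg_weight,
    }
    if audio_prompt_path is not None:
        # A short reference clip whose voice the output should imitate.
        kwargs["audio_prompt_path"] = audio_prompt_path
    return kwargs


def synthesize(text: str, out_path: str,
               settings: SpeechSettings = GENERAL,
               audio_prompt_path: Optional[str] = None,
               device: str = "cuda") -> None:
    """Generate speech and save it as a WAV file.

    Imports are deferred so the helpers above work without the model installed.
    """
    import torchaudio as ta
    from chatterbox.tts import ChatterboxTTS  # assumed import path

    model = ChatterboxTTS.from_pretrained(device=device)
    wav = model.generate(text, **build_generate_kwargs(settings, audio_prompt_path))
    ta.save(out_path, wav, model.sr)
```

With the package installed, a call such as `synthesize("Hello there!", "out.wav", EXPRESSIVE, audio_prompt_path="reference.wav")` would generate dramatic speech in the reference voice; omitting `audio_prompt_path` falls back to the model's default voice.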