Eleven v3 (Alpha)
a year ago
- #AI
- #Text-to-Speech
- #VoiceSynthesis
- Eleven v3 is presented as ElevenLabs' most expressive Text to Speech model, with emotion, delivery, and direction controlled through audio tags.
- It supports dynamic conversations between multiple speakers, making dialogue sound natural and human.
- Human-like speech is available in 70+ languages, including English, Portuguese, and Chinese.
- Eleven v3 offers a broad dynamic range controlled through inline audio tags, with features like Dialogue Mode and full emotional range.
- The model is 80% off until June 2025 for self-serve users via the UI.
- Text to Dialogue weaves multiple voices together, matching prosody and emotional range for engaging conversations.
- A public API for Eleven v3 (alpha) is coming soon; early access is available by contacting sales.
- Audio tags are voice- and context-dependent, so the same tag may behave differently across voices; a prompting guide covers the details.
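
To make the idea of inline audio tags and multi-speaker dialogue concrete, here is a minimal sketch in Python. It assumes tags are written in square brackets inside the text (for example `[whispers]`), which the summary above does not spell out. The `Turn` class, the speaker labels, and the helper functions are hypothetical names made up for illustration. The sketch does not call any ElevenLabs API; it only shows how a tagged script might be prepared and inspected before being sent to the model.

```python
import re
from dataclasses import dataclass

# Assumption: audio tags are inline and square-bracketed, e.g. "[whispers]".
TAG_PATTERN = re.compile(r"\[([^\[\]]+)\]")


@dataclass
class Turn:
    speaker: str  # hypothetical speaker label, not a real voice ID
    text: str     # one line of dialogue, possibly containing audio tags


def extract_tags(text: str) -> list[str]:
    """Return the audio tags found in a line, in order of appearance."""
    return [tag.strip() for tag in TAG_PATTERN.findall(text)]


def strip_tags(text: str) -> str:
    """Return only the spoken words, with tags removed and spacing tidied."""
    without_tags = TAG_PATTERN.sub("", text)
    return re.sub(r"\s+", " ", without_tags).strip()


# A two-speaker exchange where each line carries its own delivery cues.
dialogue = [
    Turn("Speaker A", "[excited] Did you hear the news?"),
    Turn("Speaker B", "[whispers] Not so loud. [sighs] Yes, I heard."),
]

for turn in dialogue:
    print(turn.speaker, extract_tags(turn.text), strip_tags(turn.text))
```

Keeping the tags inside the text, rather than in a separate field, mirrors the "inline" description above. Since tags are voice- and context-dependent, a script like this would still need to be checked by ear for each voice.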