Eleven v3 (Alpha)

a year ago

Eleven v3 is ElevenLabs' most expressive Text to Speech model, with emotion, delivery, and direction controllable through audio tags.
It supports dynamic conversations between multiple speakers, making dialogue sound natural and human.
Human-like speech is available in 70+ languages, including English, Portuguese, and Chinese.
Inline audio tags give the model a broad dynamic range, and it adds features such as Dialogue Mode and a full emotional range.
Self-serve users get the model at 80% off through the UI until June 2025.
Text to Dialogue weaves multiple voices together, matching prosody and emotional range for engaging conversations.
A public API for Eleven v3 (alpha) is coming soon; early access is available by contacting sales.
Audio tags are voice- and context-dependent; a prompting guide is available for further details.
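
To make the idea of inline audio tags in a multi-speaker script concrete, here is a minimal sketch that only assembles the text a dialogue request might contain. It makes no API call, since the public API is not yet available according to the summary above. The bracketed tag syntax, the tag names (`whispers`, `laughs`), the speaker labels, and the helper names `Line`, `render_line`, and `render_script` are all illustrative assumptions, not the documented interface; the prompting guide is the reference for which tags a given voice actually responds to.

```python
from dataclasses import dataclass


@dataclass
class Line:
    """One turn of dialogue (hypothetical structure, for illustration only)."""
    speaker: str
    text: str
    tags: tuple[str, ...] = ()  # assumed audio tags, e.g. ("whispers",)


def render_line(line: Line) -> str:
    # Place each tag inline, in square brackets, before the spoken text.
    prefix = "".join(f"[{tag}] " for tag in line.tags)
    return f"{line.speaker}: {prefix}{line.text}"


def render_script(lines: list[Line]) -> str:
    # One turn per output line, in speaking order.
    return "\n".join(render_line(line) for line in lines)


script = render_script([
    Line("Speaker 1", "Did you hear that?", ("whispers",)),
    Line("Speaker 2", "It's just the wind.", ("laughs",)),
])
print(script)
```

Because tags are voice- and context-dependent, the same tag can land differently across voices, so a script like this is best treated as a starting point to iterate on by ear.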
