Gemini 3.1 Flash TTS: the next generation of expressive AI speech

8 hours ago
Tags: #Google Gemini, #AI Speech, #Text-to-Speech
  • Gemini 3.1 Flash TTS is a new text-to-speech model offering improved controllability, expressivity, and quality.
  • It is now available in preview for developers via the Gemini API and Google AI Studio, for enterprises on Vertex AI, and for Workspace users via Google Vids.
  • The model scores an Elo of 1,211 on the Artificial Analysis TTS leaderboard and is noted for combining high speech quality with low cost.
  • It supports multi-speaker dialogue, over 70 languages, and granular control through natural language and new audio tags.
  • Audio tags allow intuitive control of vocal style, pace, and delivery by embedding natural language commands in text input.
  • Features include scene direction, speaker-level specificity with Audio Profiles and Director's Notes, and seamless export of parameters as API code.
  • The model is built for global scale, enabling localized, expressive speech experiences across major markets.
  • All generated audio is watermarked with SynthID to help detect AI-generated content and prevent misinformation.
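
To make the audio-tag and multi-speaker points above more concrete, here is a minimal sketch of how a tagged, two-speaker request body might be assembled. It only builds the JSON payload and makes no network call. Several details are assumptions not confirmed by this summary: the model id `gemini-3.1-flash-tts-preview`, the bracketed tag syntax such as `[whispering]`, the voice names `Kore` and `Puck`, and the request shape, which follows the `speechConfig` layout documented for earlier Gemini TTS previews. The helper names (`tag`, `build_dialogue`, `build_request`) are hypothetical.

```python
import json

# Assumption: illustrative model id; check the official model list before use.
MODEL_ID = "gemini-3.1-flash-tts-preview"


def tag(direction, text):
    """Prefix text with a natural-language audio tag (assumed bracket syntax)."""
    return f"[{direction}] {text}"


def build_dialogue(turns):
    """Render (speaker, direction_or_None, text) tuples as a transcript."""
    lines = []
    for speaker, direction, text in turns:
        spoken = tag(direction, text) if direction else text
        lines.append(f"{speaker}: {spoken}")
    return "\n".join(lines)


def build_request(turns, voices, scene=None):
    """Build a request body; `scene` is an optional free-text scene direction."""
    transcript = build_dialogue(turns)
    prompt = f"{scene}\n\n{transcript}" if scene else transcript
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "responseModalities": ["AUDIO"],
            "speechConfig": {
                # Assumed shape, mirroring earlier Gemini TTS previews.
                "multiSpeakerVoiceConfig": {
                    "speakerVoiceConfigs": [
                        {
                            "speaker": speaker,
                            "voiceConfig": {
                                "prebuiltVoiceConfig": {"voiceName": voice}
                            },
                        }
                        for speaker, voice in voices.items()
                    ]
                }
            },
        },
    }


if __name__ == "__main__":
    request = build_request(
        turns=[
            ("Ana", "whispering", "Did you hear that?"),
            ("Ben", None, "It was only the wind."),
        ],
        voices={"Ana": "Kore", "Ben": "Puck"},
        scene="Two friends in a quiet library at night.",
    )
    print(MODEL_ID)
    print(json.dumps(request, indent=2))
```

Keeping the tags inline in the transcript, rather than in a separate parameter, matches the summary's description of embedding natural-language commands directly in the text input; the per-speaker voice mapping is kept separate so the same transcript can be re-rendered with different voices.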