Updated Gemini 2.5 Flash Native Audio Model
a day ago
- #AI
- #Google Gemini
- #Voice Technology
- Google introduced an updated Gemini 2.5 Flash Native Audio model for live voice agents, improving complex workflow handling and natural conversations.
- The model is now available across Google products, including Google AI Studio, Vertex AI, Gemini Live, and Search Live.
- Key improvements include sharper function calling (71.5% score on ComplexFuncBench Audio), robust instruction following (90% adherence rate), and smoother multi-turn conversations.
- Live speech translation is new, supporting more than 70 languages and 2,000 language pairs, with features such as style transfer, multilingual input, automatic language detection, and noise robustness.
- A beta of live speech translation is rolling out in the Google Translate app on Android in the US, Mexico, and India, with iOS and more regions coming soon.
- Customers like Shopify, United Wholesale Mortgage, and Newo.ai report significant benefits from using Gemini 2.5 Flash Native Audio.
- Developers can start building voice agents with Gemini 2.5 Flash Native Audio on Vertex AI and the Gemini API.