Updated Gemini 2.5 Flash Native Audio Model

5 months ago

Google introduced an updated Gemini 2.5 Flash Native Audio model for live voice agents, improving complex workflow handling and natural conversations.
The model is now available across Google products like Google AI Studio, Vertex AI, Gemini Live, and Search Live.
Key improvements include sharper function calling (71.5% score on ComplexFuncBench Audio), robust instruction following (90% adherence rate), and smoother multi-turn conversations.
The update also introduces live speech translation, supporting over 70 languages and 2,000 language pairs, with features such as style transfer, multilingual input, automatic language detection, and noise robustness.
A beta of live speech translation is rolling out in the Google Translate app for Android in the US, Mexico, and India, with iOS support and more regions coming soon.
Customers like Shopify, United Wholesale Mortgage, and Newo.ai report significant benefits from using Gemini 2.5 Flash Native Audio.
Developers can start building voice agents with Gemini 2.5 Flash Native Audio on Vertex AI and the Gemini API.
