Hasty Briefs (beta)

Building the Next Generation of Physical Agents with Gemini Robotics-ER 1.5

18 hours ago
  • #AI
  • #machine learning
  • #robotics
  • Gemini Robotics-ER 1.5 is now available to all developers as the first broadly accessible Gemini Robotics model.
  • The model specializes in visual and spatial understanding, task planning, and progress estimation, and it can call tools such as Google Search or vision-language-action models.
  • It is designed for complex robotics tasks requiring contextual information and multi-step execution, such as sorting objects based on local recycling rules.
  • Gemini Robotics-ER 1.5 acts as a high-level reasoning brain for robots, capable of understanding natural language commands and orchestrating complex behaviors.
  • The model excels at spatial-temporal reasoning: it processes video to understand how objects relate to one another and how actions unfold over time.
  • Developers can balance latency and accuracy by adjusting the thinking token budget for different task complexities.
  • Enhanced safety features include filters for harmful content and unsafe physical actions, though additional safety engineering is recommended.
  • The model is available in preview via Google AI Studio and the Gemini API, serving as a foundational component of the broader Gemini Robotics system.
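
The latency versus accuracy trade-off mentioned above can be illustrated with a minimal sketch that picks a thinking token budget by task complexity and builds a request body. Everything here is an assumption for illustration: the tier names and token counts are hypothetical, the model identifier is assumed to be the preview name, and the JSON field names (`generationConfig`, `thinkingConfig`, `thinkingBudget`) follow the Gemini API's generateContent request format as commonly documented, so check them against the current API reference before relying on them. The sketch only builds the payload and sends nothing over the network.

```python
import json

# Hypothetical budget tiers; the token counts are illustrative, not recommended values.
THINKING_BUDGETS = {
    "simple": 0,       # e.g. pointing at a single object; favors low latency
    "moderate": 1024,  # e.g. a short multi-step plan
    "complex": 8192,   # e.g. sorting items against looked-up local recycling rules
}

# Assumed preview model identifier; verify against the current model list.
MODEL_ID = "gemini-robotics-er-1.5-preview"


def build_request(prompt: str, complexity: str) -> dict:
    """Build a generateContent-style JSON body with a thinking budget.

    Raises ValueError if the complexity tier is not one of the keys above.
    """
    if complexity not in THINKING_BUDGETS:
        raise ValueError(f"unknown complexity: {complexity!r}")
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": THINKING_BUDGETS[complexity]}
        },
    }


if __name__ == "__main__":
    # A quick spatial query gets the smallest budget in this sketch.
    body = build_request("Point to the blue block.", "simple")
    print(f"model: {MODEL_ID}")
    print(json.dumps(body, indent=2))
```

Keeping the tiers in a single table makes the trade-off explicit: a simple pointing query can skip extended reasoning, while a multi-step task that depends on outside context gets a larger budget at the cost of response time.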