Hasty Briefs (beta)

Gemini 2.5 Computer Use model

4 hours ago
  • #AI
  • #User Interface
  • #Automation
  • The Gemini 2.5 Computer Use model has been released; it is built on Gemini 2.5 Pro’s visual understanding and reasoning capabilities.
  • The model enables agents to interact with user interfaces (UIs) for tasks like filling forms, clicking, and scrolling.
  • It outperforms leading alternatives on web and mobile control benchmarks while running at lower latency.
  • Inputs to the model are the user request, a screenshot of the environment, and a history of recent actions; developers can optionally exclude specific UI actions or add custom functions.
  • The model operates in a loop: it analyzes the inputs and generates a UI action, client-side code executes that action, and a new screenshot is sent back to the model; the cycle repeats until the task is complete.
  • It is optimized for web browsers and shows promise for mobile UI control, but it is not yet optimized for desktop OS-level tasks.
  • Includes safety features to mitigate risks like misuse, unexpected behavior, and prompt injections.
  • Developers can implement additional safety controls, such as per-step safety checks and system instructions.
  • Early testers have used the model for UI testing, personal assistants, and workflow automation.
  • The model is available in public preview via the Gemini API on Google AI Studio and Vertex AI, with demos and documentation.
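
The loop described in the bullets above can be sketched in a few lines of Python. This is a minimal illustration, not the Gemini API: `Action`, `ModelInput`, `run_agent`, and the `propose_action`, `execute`, `take_screenshot`, and `confirm` callbacks are all hypothetical names, and the model is replaced by a scripted stub so the example runs on its own. The sketch assumes the client executes each action, that a "done" action signals completion, and that a per-step safety check can be modeled as a confirmation callback.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List


@dataclass
class Action:
    """A UI action proposed by the model (hypothetical shape)."""
    name: str                          # e.g. "click", "type_text", "done"
    args: dict = field(default_factory=dict)
    needs_confirmation: bool = False   # stands in for a per-step safety flag


@dataclass
class ModelInput:
    """The three inputs named in the brief, plus optional exclusions."""
    user_request: str
    screenshot: bytes
    history: List[Action]
    excluded_actions: FrozenSet[str] = frozenset()


def run_agent(
    user_request: str,
    propose_action: Callable[[ModelInput], Action],
    execute: Callable[[Action], None],
    take_screenshot: Callable[[], bytes],
    confirm: Callable[[Action], bool] = lambda action: True,
    excluded_actions: FrozenSet[str] = frozenset(),
    max_steps: int = 10,
) -> List[Action]:
    """Analyze inputs, get an action, execute it, repeat until done."""
    history: List[Action] = []
    for _ in range(max_steps):
        model_input = ModelInput(
            user_request, take_screenshot(), list(history), excluded_actions
        )
        action = propose_action(model_input)
        if action.name == "done":
            break
        if action.name in excluded_actions:
            raise ValueError(f"model proposed excluded action: {action.name}")
        # Per-step safety check: stop if the action is not confirmed.
        if action.needs_confirmation and not confirm(action):
            break
        execute(action)          # client-side code performs the UI action
        history.append(action)   # becomes part of the next step's input
    return history


def scripted_model(model_input: ModelInput) -> Action:
    """Stub standing in for the model: replays a fixed script."""
    script = [
        Action("click", {"x": 10, "y": 20}),
        Action("type_text", {"text": "hello"}),
        Action("done"),
    ]
    return script[len(model_input.history)]


executed: List[Action] = []
history = run_agent(
    "Fill in the form",
    propose_action=scripted_model,
    execute=executed.append,
    take_screenshot=lambda: b"",   # a real client would capture the browser
)
print([a.name for a in history])
```

In a real client, `propose_action` would wrap a call to the model and `execute` would drive a browser; the `max_steps` cap is an added safeguard so that a model that never emits "done" cannot loop indefinitely.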