Ferret-UI Lite: Lessons from Building Small On-Device GUI Agents
3 days ago
- #On-Device Models
- #Autonomous Interaction
- #GUI Agents
- Ferret-UI Lite is a compact, end-to-end GUI agent for mobile, web, and desktop platforms.
- Developed using techniques optimized for small models, including diverse GUI data curation, chain-of-thought reasoning, visual tool-use, and reinforcement learning.
- Achieves competitive performance in GUI grounding with scores of 91.6% (ScreenSpot-V2), 53.3% (ScreenSpot-Pro), and 61.2% (OSWorld-G).
- For GUI navigation, success rates are 28.0% (AndroidWorld) and 19.8% (OSWorld).
- The work shares the methods and lessons learned from developing compact, on-device GUI agents.
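
To make the grounding numbers above easier to interpret, here is a minimal sketch of how a grounding score of this kind is commonly computed. It assumes the usual point-in-box convention, where a prediction counts as correct when the predicted click point falls inside the target element's bounding box. The function names, the `(left, top, right, bottom)` box layout, and the sample data are illustrative assumptions, not code or data from Ferret-UI Lite or the benchmarks.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]               # (x, y) predicted click location
Box = Tuple[float, float, float, float]   # assumed layout: (left, top, right, bottom)


def point_in_box(point: Point, box: Box) -> bool:
    """Return True if the point lies inside the box (edges included)."""
    x, y = point
    left, top, right, bottom = box
    return left <= x <= right and top <= y <= bottom


def grounding_accuracy(predictions: Sequence[Point], targets: Sequence[Box]) -> float:
    """Fraction of predicted points that land inside their target boxes."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not predictions:
        return 0.0
    hits = sum(point_in_box(p, b) for p, b in zip(predictions, targets))
    return hits / len(predictions)


# Hypothetical example: two of four predicted clicks hit their target element.
example_predictions = [(5.0, 5.0), (50.0, 50.0), (12.0, 8.0), (0.0, 0.0)]
example_targets = [
    (0.0, 0.0, 10.0, 10.0),    # hit
    (0.0, 0.0, 10.0, 10.0),    # miss
    (10.0, 5.0, 20.0, 10.0),   # hit
    (1.0, 1.0, 2.0, 2.0),      # miss
]
example_score = grounding_accuracy(example_predictions, example_targets)
```

Under this convention, a reported score such as 91.6% would mean that roughly nine in ten predicted click points landed inside the intended element.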