Show HN: I built an Android OS in the browser
8 hours ago
- #reinforcement-learning
- #simulation-platform
- #mobile-gui-agents
- MobileGym is a verifiable simulation platform enabling online RL training and deterministic evaluation on mobile apps, overcoming real-device limitations.
- It covers 28 in-browser apps with structured JSON state for reliable, parallelizable, and consequence-free testing.
- Scoring against structured state eliminates false judgments: the platform made no scoring errors across 416 tasks, versus a 10.2% error rate for VLM-based scoring on real devices.
- GRPO fine-tuning of Qwen3-VL-4B boosts simulation success rates significantly, with gains largely retained on real devices (95.1% retention).
- MobileGym addresses three real-device challenges (unreadable state, irreversibility, and real-world consequences) by using sandboxed, structured environments.
- Leaderboard results show current agents struggle, especially on frontier tasks (L4), indicating room for improvement in mobile GUI agents.
- The architecture is browser-based (React/TypeScript), so adding a new app takes little effort (~3-4 person-days for daily apps).
- Efficiency gains include one-tenth the memory and less than one-hundredth the disk footprint compared to Android emulators.
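
To make the idea of scoring against structured JSON state concrete, here is a minimal sketch of a deterministic task check. It is not MobileGym's actual API or schema: the `AppState` and `TaskCheck` types, the `getPath` and `verifyTask` helpers, and the clock-app state are all hypothetical names chosen for illustration. The point is only that when the final state is readable as JSON, success can be decided by exact comparison rather than by asking a VLM to judge a screenshot.

```typescript
// Hypothetical state shape; not MobileGym's real schema.
type AppState = Record<string, unknown>;

// One assertion about the final state (hypothetical structure).
interface TaskCheck {
  path: string;      // dot-separated path into the state, e.g. "clock.alarms.0.enabled"
  expected: unknown; // value the task should have produced at that path
}

// Walk a dot-separated path through nested objects and arrays.
// Returns undefined if any segment is missing.
function getPath(state: unknown, path: string): unknown {
  let current: unknown = state;
  for (const key of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// A task passes only if every check matches exactly, so the verdict is
// deterministic for a given final state.
function verifyTask(state: AppState, checks: TaskCheck[]): boolean {
  return checks.every(
    (c) => JSON.stringify(getPath(state, c.path)) === JSON.stringify(c.expected)
  );
}

// Example: "set an enabled alarm for 07:30" in an assumed clock app.
const finalState: AppState = {
  clock: { alarms: [{ time: "07:30", enabled: true }] },
};
const checks: TaskCheck[] = [
  { path: "clock.alarms.0.time", expected: "07:30" },
  { path: "clock.alarms.0.enabled", expected: true },
];

console.log(verifyTask(finalState, checks)); // true
```

Because the check reads state directly, the same final state always yields the same verdict, which is what makes this style of scoring usable as an RL reward signal.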