Show HN: Live VNC for web agents – debugging native captcha on Cloud Run
4 months ago
- #web-agents
- #cloud-run
- #debugging
- Live VNC view and takeover for ephemeral cloud browsers improves debugging for web agents.
- Web agents fail because the web behaves like a distributed system, not only because of UI issues.
- DOM-native automation avoids screenshots/vision and CDP-based tools like Playwright/Puppeteer.
- Live VNC helps debug real-time issues like captcha solving and iframe targeting.
- Cloud Run constraints require a relay-based architecture for VNC to avoid sticky routing issues.
- The solution involves a runner, relay, and signed pairing tokens for secure, scalable VNC access.
- Minimal display stack (Xvfb + x11vnc + xsetroot) replaces heavier solutions like Fluxbox.
- Security model rests on no public VNC ports, short-lived tokens, and role separation.
- Live VNC observability improves debugging for captchas, iframes, shadow DOM, and parallel execution.
- Performance metrics show stable relay service but some errors in web agent runners.
- Bandwidth is a critical factor for scaling VNC sessions.
- Lessons learned include avoiding VNC in runners and unnecessary complexity like Fluxbox.
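
The signed, short-lived pairing tokens with role separation mentioned above could be sketched roughly as follows. This is a minimal illustration and not the post's actual implementation: the HMAC-SHA256 scheme, the claim names (`sid`, `role`, `exp`), the role names (`runner`, `viewer`), and the functions `mint_token` and `verify_token` are all assumptions made for the example.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional

# Hypothetical role names: the runner exposes the display, the viewer watches it.
ROLES = {"runner", "viewer"}


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding."""
    return base64.urlsafe_b64encode(data).decode("ascii").rstrip("=")


def _unb64(text: str) -> bytes:
    """Inverse of _b64: restore the padding, then decode."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def mint_token(secret: bytes, session_id: str, role: str,
               ttl_s: int = 60, now: Optional[float] = None) -> str:
    """Return '<payload>.<signature>' binding a session, a role, and an expiry."""
    if role not in ROLES:
        raise ValueError(f"unknown role: {role}")
    issued = time.time() if now is None else now
    claims = {"sid": session_id, "role": role, "exp": int(issued) + ttl_s}
    payload = json.dumps(claims, separators=(",", ":"), sort_keys=True).encode()
    sig = hmac.new(secret, payload, hashlib.sha256).digest()
    return f"{_b64(payload)}.{_b64(sig)}"


def verify_token(secret: bytes, token: str, expected_role: str,
                 now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the token is authentic, unexpired, and for this role."""
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = _unb64(payload_b64)
        sig = _unb64(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(secret, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(sig, expected):  # constant-time comparison
        return None
    claims = json.loads(payload)
    current = time.time() if now is None else now
    if claims.get("role") != expected_role or claims.get("exp", 0) <= current:
        return None
    return claims
```

In a setup like the one summarized, a relay would presumably call `verify_token` on each incoming connection and only pair a runner and a viewer whose tokens carry the same `sid`. The short TTL limits how long a leaked token stays useful.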