Hasty Briefs (beta)

Show HN: Live VNC for web agents – debugging native captcha on Cloud Run

4 months ago
  • #web-agents
  • #cloud-run
  • #debugging
  • Live VNC view and takeover for ephemeral cloud browsers improves debugging for web agents.
  • Web agents fail because the web behaves like a distributed system, not only because of UI issues.
  • The agent uses DOM-native automation rather than screenshots/vision or CDP-based tools such as Playwright and Puppeteer.
  • Live VNC helps debug real-time issues like captcha solving and iframe targeting.
  • Cloud Run constraints require a relay-based architecture for VNC to avoid sticky routing issues.
  • The solution involves a runner, relay, and signed pairing tokens for secure, scalable VNC access.
  • Minimal display stack (Xvfb + x11vnc + xsetroot) replaces heavier solutions like Fluxbox.
  • Security model ensures no public VNC ports, short-lived tokens, and role separation.
  • Live VNC observability improves debugging for captchas, iframes, shadow DOM, and parallel execution.
  • Performance metrics show stable relay service but some errors in web agent runners.
  • Bandwidth is a critical factor for scaling VNC sessions.
  • Lessons learned include avoiding VNC in runners and unnecessary complexity like Fluxbox.
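
The summary mentions signed pairing tokens, short-lived tokens, and role separation but does not show a token format. The following is a minimal sketch of one way such a token could work, not the author's implementation: an HMAC-SHA256 signature over a small JSON payload, using only the Python standard library. The names `SECRET`, `issue_token`, `verify_token`, the claim keys `sid`/`role`/`exp`, and the role names `runner` and `viewer` are all assumptions made for illustration.

```python
import base64
import hashlib
import hmac
import json
import time

# Hypothetical shared secret known to the token issuer and the relay.
SECRET = b"demo-secret-change-me"


def _b64(data: bytes) -> str:
    """URL-safe base64 without padding, so tokens fit in a query string."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _unb64(text: str) -> bytes:
    """Inverse of _b64: restore the stripped padding before decoding."""
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def issue_token(session_id, role, ttl_seconds=60, now=None):
    """Return 'payload.signature' binding a session, a role, and an expiry."""
    now = time.time() if now is None else now
    claims = {"sid": session_id, "role": role, "exp": int(now) + ttl_seconds}
    payload = json.dumps(claims, separators=(",", ":"), sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).digest()
    return _b64(payload) + "." + _b64(sig)


def verify_token(token, expected_role, now=None):
    """Return the claims if the token is authentic, unexpired, and for
    expected_role; otherwise return None."""
    now = time.time() if now is None else now
    try:
        payload_b64, sig_b64 = token.split(".")
        payload = _unb64(payload_b64)
        sig = _unb64(sig_b64)
    except ValueError:  # wrong number of parts or invalid base64
        return None
    expected = hmac.new(SECRET, payload, hashlib.sha256).digest()
    # Constant-time comparison avoids leaking signature bytes via timing.
    if not hmac.compare_digest(sig, expected):
        return None
    claims = json.loads(payload)
    if claims["role"] != expected_role or claims["exp"] <= now:
        return None
    return claims
```

In this sketch, a relay would call `verify_token(token, "viewer")` on the viewer's connection and `verify_token(token, "runner")` on the runner's connection, then pair the two sockets that share the same `sid`. Because the role is part of the signed payload, a viewer token cannot be replayed on the runner side, and the short `exp` limits how long a leaked URL stays useful.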