Show HN: Live VNC for web agents – debugging native captcha on Cloud Run (rtrvr.ai)
10 points by quarkcarbon279 4 days ago | 1 comment
Hi HN, Bhavani here (rtrvr.ai). We build DOM-native web agents (no screenshot-based vision, no CDP/Playwright debugger-port control). We handle captchas natively, including Google reCAPTCHA image challenges, by traversing cross-origin iframes and shadow DOM. The latency is high on this one currently.

The problem: when debugging image selection captchas ("select all images with traffic lights"), logs don't tell you why the agent clicked the wrong tiles. I found myself staring at execution logs thinking "did it even see the grid correctly?" and realized I just wanted to watch it work.

So we built live VNC view + takeover for serverless Chrome workers on Cloud Run.

Key learnings:

1. Session affinity is best-effort; "attach later" can hit a different instance

2. A separate relay service that pairs viewer↔runner by short-lived tokens makes attach deterministic

3. Runner stays clean: concurrency=1, one browser per container, no mixed traffic

Would love feedback from folks who've shipped similar:

1. What replaced VNC for you (WebRTC etc) and why?

2. Best approach for recording/replay without huge storage?

3. How do you handle "attach later" safely in serverless?
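
To make learning 2 concrete, here is a rough sketch of the pairing idea in Python. It is not our actual relay code: the names (`PairingRelay`, `register_runner`, `attach_viewer`), the 60-second TTL, and the single-use rule are all assumptions for illustration. The point is only that the viewer is routed by a token the runner registered, not by whichever instance the load balancer happens to pick.

```python
import secrets
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass
class Pending:
    runner_id: str     # hypothetical identifier for the container that registered
    expires_at: float  # deadline (on the relay's clock) after which the token is dead


class PairingRelay:
    """In-memory sketch: pairs a viewer with a runner via a short-lived token."""

    def __init__(self, ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._pending: Dict[str, Pending] = {}

    def register_runner(self, runner_id: str) -> str:
        # The runner calls this once its browser is up and gets an unguessable token.
        token = secrets.token_urlsafe(32)
        self._pending[token] = Pending(runner_id, self._clock() + self._ttl)
        return token

    def attach_viewer(self, token: str) -> Optional[str]:
        # pop() makes the token single-use (an assumption of this sketch).
        entry = self._pending.pop(token, None)
        if entry is None or self._clock() >= entry.expires_at:
            return None  # unknown, already used, or expired
        return entry.runner_id


if __name__ == "__main__":
    relay = PairingRelay(ttl_seconds=60.0)
    token = relay.register_runner("runner-a")
    print(relay.attach_viewer(token))  # the registered runner id
    print(relay.attach_viewer(token))  # None, the token was consumed
```

In a real deployment the relay would presumably hold the runner's outbound connection and splice the viewer's stream onto it, since a specific Cloud Run instance cannot be addressed directly; the dictionary lookup above stands in for that step.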