Yeah, that’s totally fair! Not everybody will. In our experience, the problems we solve are the creeping kind that develop over time and show themselves as maintenance costs.
Let’s say we want to add web automation to our app, and we've set up a single instance of Chrome hosted on a server somewhere that we connect to and drive with Puppeteer. Great, now we need to make sure incoming requests/sessions are managed properly once there's more than one active session at a time. If we have a ton of active requests, this gets annoying because the resources on our server get eaten up, so we need to scale horizontally (or vertically), and that comes with potential downtime and cold start issues. If we decide to keep this instance of Chrome running 24/7, then we need to bake in resource management to handle memory leaks and connection issues. If we don't keep it alive, then we're paying significant cold start times (10s+).
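To make that concrete, here's roughly what the single-remote-Chrome setup looks like with Puppeteer. This is just a sketch: the WebSocket endpoint is a placeholder for wherever your hosted Chrome exposes its DevTools interface, and none of the session management mentioned above is handled here.

```typescript
// Minimal sketch: driving one remote Chrome instance with Puppeteer.
// puppeteer-core attaches to an existing browser instead of launching one.
import puppeteer from "puppeteer-core";

async function runSession(url: string): Promise<string> {
  // The endpoint below is a placeholder; it comes from the hosted Chrome
  // (e.g. started with --remote-debugging-port=9222).
  const browser = await puppeteer.connect({
    browserWSEndpoint: "ws://your-chrome-host:9222/devtools/browser/<id>",
  });

  try {
    const page = await browser.newPage();
    await page.goto(url, { waitUntil: "networkidle2" });
    return await page.title();
  } finally {
    // Disconnect rather than close() so the shared Chrome stays up
    // for whatever other sessions are using it.
    await browser.disconnect();
  }
}
```

It works great for one session; everything after that (queueing, concurrency limits, restarting a leaky Chrome) is the part you end up building yourself.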
Now, let’s say we want to support a real-time scraping use case that requires multiple browsers in parallel. At this point, we would need to use something like Kubernetes with warm pods, or maybe Lambda/Cloud Run. But even those come with their own challenges and costs. Managing Kubernetes can get complicated and expensive quickly, especially if you’re not already using it for your app. On the other hand, serverless options like Lambda or Cloud Run introduce latency issues (cold starts) and often don’t provide the flexibility you need for long-running sessions or custom configurations.
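Just to make the workload concrete, here's a toy version of "multiple browsers in parallel" (the function and URL list are made up for illustration). It runs fine on one machine, but every extra Chrome process costs hundreds of MB of RAM, which is exactly what pushes you toward warm pods or serverless once the work is continuous:

```typescript
// Toy version of the parallel-scraping workload: one Chrome per URL.
// No pooling, recycling, or backpressure, which is the hard part at scale.
import puppeteer, { Browser } from "puppeteer";

async function scrapeInParallel(urls: string[]): Promise<string[]> {
  // One browser per URL keeps sessions isolated, at the cost of a full
  // Chrome process each.
  const browsers: Browser[] = await Promise.all(
    urls.map(() => puppeteer.launch({ headless: true }))
  );

  try {
    return await Promise.all(
      urls.map(async (url, i) => {
        const page = await browsers[i].newPage();
        await page.goto(url, { waitUntil: "domcontentloaded" });
        return page.title();
      })
    );
  } finally {
    await Promise.all(browsers.map((b) => b.close()));
  }
}
```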
Then, beyond the infra, we'll probably need to build capabilities like proxy management, fingerprint rotation, captcha solving, etc., so we don't get blocked by anti-bot measures.
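As a rough sketch of what that layer looks like (the proxy host, credentials, and user agents here are all placeholders, and a real setup needs far more than this: TLS fingerprints, canvas noise, third-party captcha solvers, retry logic, and so on):

```typescript
// Sketch of the "don't get blocked" layer, assuming you're paying for a
// proxy pool separately. All hosts, credentials, and user agents are placeholders.
import puppeteer from "puppeteer";

const USER_AGENTS = [
  "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36",
  "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36",
];

async function fetchThroughProxy(url: string): Promise<string> {
  // Route this browser's traffic through one proxy from the pool (placeholder host).
  const browser = await puppeteer.launch({
    args: ["--proxy-server=http://proxy.example.com:8000"],
  });

  try {
    const page = await browser.newPage();
    // Authenticate against the proxy and rotate the reported user agent.
    await page.authenticate({ username: "proxy-user", password: "proxy-pass" });
    await page.setUserAgent(
      USER_AGENTS[Math.floor(Math.random() * USER_AGENTS.length)]
    );
    await page.goto(url, { waitUntil: "networkidle2" });
    return page.content();
  } finally {
    await browser.close();
  }
}
```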
Pretty quickly, as you grow, a small-scale project can turn into a big maintenance burden. Building the parsers/agents/scripts is hard enough on its own, so we're hoping to make the infra side as easy as possible.