Remix.run Logo
jmp1062 3 days ago

i would probably use Playwright with custom code, create chunks based on similar products, then run it on a large cluster in parallel (https://github.com/Burla-Cloud/burla).

if you have a single worker trying to scrape a shit ton of products back to back to back you're going to get rate limited or their bot detection will catch you.