Remix.run Logo
mind_heist 3 days ago

how did you scrape all the reviews?

jperryjperry 3 days ago | parent [-]

open source dataset from McAuley Lab at UCSD https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2....

I'm going to publish an Airbnb example tomorrow where I scraped 1,406,718 photo URLs from public listing pages. For that I used https://docs.burla.dev/ which is a high-performance parallel processing python library I've been working on for a few years now.