Remix.run Logo
jperryjperry 3 days ago

open source dataset from McAuley Lab at UCSD https://huggingface.co/datasets/McAuley-Lab/Amazon-Reviews-2....

I'm going to publish an Airbnb example tomorrow where I scraped 1,406,718 photo URLs from public listing pages. For that I used https://docs.burla.dev/ which is a high-performance parallel processing python library I've been working on for a few years now.