dspillett 15 hours ago

Is there a public dump of the data anywhere that this is based upon, or have they scraped it themselves?

Such a DB might be entertaining to play with, and the threadedness of comments would be useful for beginners to practise efficient recursive queries (more so than the StackExchange dumps, for instance).
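
For anyone wondering what such a recursive query might look like, here is a minimal sketch using Python's built-in sqlite3 module and a recursive CTE. The `comments` table, its columns, and the sample rows are all made up for illustration; they are not the schema of any actual HN dump.

```python
import sqlite3

# Hypothetical schema: one row per comment; parent_id is NULL for top-level comments.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE comments (id INTEGER PRIMARY KEY, parent_id INTEGER, author TEXT, body TEXT)"
)
conn.executemany(
    "INSERT INTO comments VALUES (?, ?, ?, ?)",
    [
        (1, None, "alice", "root"),
        (2, 1, "bob", "reply to root"),
        (3, 2, "carol", "reply to bob"),
        (4, 1, "dave", "second reply to root"),
        (5, None, "erin", "unrelated thread"),
    ],
)


def fetch_thread(conn, root_id):
    """Return (id, depth, author) for root_id and all of its descendants."""
    sql = """
    WITH RECURSIVE thread(id, parent_id, author, depth) AS (
        -- Anchor row: the comment we start from.
        SELECT id, parent_id, author, 0 FROM comments WHERE id = ?
        UNION ALL
        -- Recursive step: every comment whose parent is already in the thread.
        SELECT c.id, c.parent_id, c.author, t.depth + 1
        FROM comments AS c
        JOIN thread AS t ON c.parent_id = t.id
    )
    SELECT id, depth, author FROM thread ORDER BY id
    """
    return conn.execute(sql, (root_id,)).fetchall()


rows = fetch_thread(conn, 1)
for comment_id, depth, author in rows:
    # Indent by depth to show the thread structure.
    print("  " * depth + f"{comment_id} {author}")
```

On a real dataset an index on `parent_id` would probably matter a lot, since the recursive step looks up children by parent on every iteration.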

thomasmarton 14 hours ago | parent | next [-]

While not a dump per se, there is an API where you can get HN data programmatically, no scraping needed.

https://github.com/HackerNews/API
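
As a rough sketch of how that API could be used to pull a whole comment thread: each item is fetched from the `/v0/item/<id>.json` endpoint, and replies are listed in the item's `kids` field. The helper names `fetch_item` and `walk_thread` are my own, not part of the API, and this makes one HTTP request per item, so it is only sensible for small threads.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://hacker-news.firebaseio.com/v0"


def fetch_item(item_id):
    """Fetch a single story or comment as a dict (None if the item does not exist)."""
    with urlopen(f"{BASE_URL}/item/{item_id}.json", timeout=10) as resp:
        return json.load(resp)


def walk_thread(item_id, fetch=fetch_item, depth=0):
    """Yield (depth, item) for an item and all of its replies, depth-first.

    `fetch` is injectable so the traversal can be tried without network access.
    """
    item = fetch(item_id)
    if item is None:
        return
    yield depth, item
    # "kids" holds the ids of direct replies; it is absent when there are none.
    for kid_id in item.get("kids", []):
        yield from walk_thread(kid_id, fetch, depth + 1)
```

Something like `for depth, item in walk_thread(story_id): print("  " * depth, item.get("by"))` would then print the authors indented by nesting level, where `story_id` is whatever item you want to start from.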

keepamovin 8 hours ago | parent | prev [-]

Yes, you can see the download HN bash script in the repository now. It simply extracts the data from BigQuery to your local machine and saves it as a series of gzipped JSON files.
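
For reading files like that back in, a minimal Python sketch might look like the following. It assumes each file is newline-delimited JSON (one record per line), which is a guess on my part and not something stated here; `iter_records` is a hypothetical helper name, and the file layout should be checked against the script in the repository.

```python
import gzip
import json


def iter_records(path):
    """Yield one dict per line from a gzipped file.

    Assumption: the file is newline-delimited JSON (one object per line).
    """
    # "rt" decompresses and decodes to text in one step.
    with gzip.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

Because it streams line by line, this never holds a whole file in memory, which helps if the individual files turn out to be large.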

dspillett 34 minutes ago | parent [-]

Ah, the repo was 404ing for me last time I checked (seems fine now) so I couldn't inspect that. I'll have a play later.