diggan 2 days ago

> There are already “infinite” websites like these on the Internet.

Cool. And how much of the software driving these websites is FOSS that I can download and run for my own website (one popular enough to be crawled more than daily by multiple scrapers)?

gruez 2 days ago | parent | next [-]

Off the top of my head: https://everyuuid.com/

https://github.com/nolenroyalty/every-uuid

johnisgood a day ago | parent | next [-]

How is that infinite if the last one is always the same? Am I misunderstanding this? I assumed it is almost like an infinite scroll or something.

gruez a day ago | parent [-]

Here's another site that does something similar (iterating over bitcoin private keys rather than uuids), but has separate pages and would theoretically catch a crawler:

https://allprivatekeys.com/all-bitcoin-private-keys-list

johnisgood a day ago | parent [-]

503 :D

diggan 2 days ago | parent | prev [-]

Aren't those finite lists? How is a scraper (normal or LLM) supposed to "get stuck" on those?

gruez 2 days ago | parent [-]

Even though 2^128 UUIDs is technically "finite", for all intents and purposes it is infinite to a scraper.
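
To put a rough number on that, here is a back-of-the-envelope sketch. The page size (100 UUIDs per page) and the crawl rate (1,000 pages per second) are made-up assumptions for illustration, not measurements of everyuuid.com or of any real crawler.

```python
# Rough estimate of how long a crawler would need to enumerate every UUID.
# UUIDS_PER_PAGE and PAGES_PER_SECOND are hypothetical values chosen only
# for illustration.
TOTAL_UUIDS = 2 ** 128
UUIDS_PER_PAGE = 100        # assumption
PAGES_PER_SECOND = 1_000    # assumption
SECONDS_PER_YEAR = 365 * 24 * 60 * 60


def years_to_crawl(total=TOTAL_UUIDS,
                   per_page=UUIDS_PER_PAGE,
                   pages_per_second=PAGES_PER_SECOND):
    """Return the number of years needed to fetch every page once."""
    pages = -(-total // per_page)          # ceiling division on integers
    seconds = pages / pages_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    print(f"{years_to_crawl():.2e} years")
```

Under those assumptions the result is on the order of 10^26 years, so the list never ends in any practical sense, even for a much faster crawler.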

hartator 2 days ago | parent | prev [-]

Every not-found page that doesn’t return a 404 HTTP status code is basically an infinite trap.

It’s useless to do this though as all crawlers have a way to handle this. It’s very crawler 101.
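
For anyone curious what "handling this" can look like, one common approach is to probe a path that almost certainly doesn't exist and check whether the server still answers 200. A minimal sketch follows; `serves_soft_404` and the `fetch` callable are hypothetical names for illustration, and this is not a claim about how any particular crawler implements it.

```python
import uuid
from typing import Callable, Tuple
from urllib.parse import urljoin

# Hypothetical fetch interface: takes a URL, returns (status_code, body).
Fetch = Callable[[str], Tuple[int, str]]


def serves_soft_404(base_url: str, fetch: Fetch) -> bool:
    """Return True if the host answers 200 for a path that should not exist.

    A random 32-character hex path is vanishingly unlikely to be a real page,
    so a 200 response suggests the site returns "soft 404s" and that unknown
    URLs on it cannot be trusted to terminate a crawl.
    """
    probe_url = urljoin(base_url, f"/{uuid.uuid4().hex}")
    status, _body = fetch(probe_url)
    return status == 200


if __name__ == "__main__":
    # Stand-in fetchers so the sketch runs without network access.
    def trap_site(url: str) -> Tuple[int, str]:
        return 200, "<html>Welcome!</html>"

    def well_behaved_site(url: str) -> Tuple[int, str]:
        return 404, "Not Found"

    print(serves_soft_404("https://example.com/", trap_site))
    print(serves_soft_404("https://example.com/", well_behaved_site))
```

A real crawler would presumably combine a check like this with per-host page budgets and duplicate-content detection, but the probe alone is enough to flag the "everything returns 200" case.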