hobs 2 days ago

Can't you just request ICANN’s zone files and have the canonical list for the day?

renegat0x0 2 days ago | parent | next [-]

A link list or domain list is not worth much without ratings or metadata. I lead a hobby project, and I am not an expert, so I provide ratings based on what kind of data pages provide (title, social, description) and on my own manual voting system. It is not ideal, but it is something. I also provide tags, so it is easy to see what a domain offers, and domains can be filtered by tag.

I know that you cannot count and visit every domain, so the list will never be finished, but I am happy with the results.
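To make the rating idea concrete, here is a rough sketch of how metadata-based scoring plus manual votes and tag filtering could fit together. It is not the project's actual code: the `DomainEntry` fields, the one-point-per-field weighting, and the `rate` and `filter_by_tag` helpers are all hypothetical names and assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class DomainEntry:
    """Hypothetical record for one domain in the list."""
    domain: str
    title: str = ""
    description: str = ""
    social_links: int = 0   # assumed: number of social-media links found on the page
    manual_votes: int = 0   # assumed: net result of manual up/down votes
    tags: set[str] = field(default_factory=set)


def rate(entry: DomainEntry) -> int:
    """Score an entry: one point per metadata field present, plus manual votes."""
    score = 0
    if entry.title.strip():
        score += 1
    if entry.description.strip():
        score += 1
    if entry.social_links > 0:
        score += 1
    return score + entry.manual_votes


def filter_by_tag(entries: list[DomainEntry], tag: str) -> list[DomainEntry]:
    """Return only the entries carrying the given tag, in their original order."""
    return [e for e in entries if tag in e.tags]
```

With this weighting, a page that exposes a title, a description, and social links starts at 3 before any manual votes are applied; the real project may weight these signals quite differently.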

hobs 2 days ago | parent [-]

Well, if you are curating every link then it's a different story, and it looks more like a classic webring. I missed that part of the work; I thought it looked like a big set of crawler data that wasn't as manually curated.

egberts1 2 days ago | parent | prev [-]

Avoiding GIGO (Garbage In, Garbage Out).

This is why we have computer variants of Library Science, Archeology, Forensic Science, and a bunch of other advanced knowledge (not AI, mind you).

hobs 2 days ago | parent [-]

I don't see how this applies, as it's aggregating a bunch of stuff from random crawlers. If you want to crawl a list of actual domains, the zone files are generally considered the list of things that could resolve, so they seem like a good starting place.