Tiberium 11 hours ago

You won't be able to scan most websites this way, because most servers expect you to also pass a valid hostname (in the Host header and, for HTTPS, in TLS SNI). However, you can use domain lists as an initial seed, for example https://purecrawl.com/en/download/domains (or https://domains-monitor.com/, which is paid but has more domains). That shouldn't be too bad, but you'll have to ingest terabytes of spammy, low-quality content.
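
To illustrate why a bare IP scan falls short, here is a rough Python sketch of fetching a page from a known IP while still telling the server which site is wanted. The function names `build_request` and `fetch_by_ip` are made up for this example, and it assumes a plain ASCII hostname and a server that closes the connection after responding.

```python
import socket
import ssl


def build_request(hostname: str, path: str = "/") -> bytes:
    """Build a minimal HTTP/1.1 GET request that names the site in the Host header."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {hostname}",  # virtual hosts are selected by this header
        "Connection: close",
        "",
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch_by_ip(ip: str, hostname: str, port: int = 443, timeout: float = 5.0) -> bytes:
    """Connect to an IP over TLS, sending the hostname via SNI and the Host header."""
    context = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as raw:
        # server_hostname sets SNI and is also used for certificate verification
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            tls.sendall(build_request(hostname))
            chunks = []
            while True:
                data = tls.recv(4096)
                if not data:
                    break
                chunks.append(data)
    return b"".join(chunks)
```

Without the hostname, a server hosting many sites on one IP has no way to pick the right one, which is why a domain list is needed as a seed in the first place.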

Brybry 11 hours ago | parent [-]

Wouldn't certificate transparency logs be a good way to collect most active domains?
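
A rough sketch of what the cleanup step might look like, assuming you have already pulled the DNS names out of CT log entries by some other means (fetching and parsing the logs is not shown). `normalize_ct_names` is a hypothetical helper, not part of any CT tooling.

```python
def normalize_ct_names(names):
    """Turn raw certificate names into a deduplicated list of crawlable hostnames."""
    seen = set()
    domains = []
    for name in names:
        host = name.strip().lower().rstrip(".")
        # Wildcard entries such as *.example.com only tell us the parent domain
        if host.startswith("*."):
            host = host[2:]
        # Skip empty strings, single-label names, and anything already seen
        if not host or "." not in host or host in seen:
            continue
        seen.add(host)
        domains.append(host)
    return domains
```

For example, `normalize_ct_names(["*.Example.com", "example.com", "www.example.org."])` returns `["example.com", "www.example.org"]`.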

Tiberium 9 hours ago | parent [-]

Yeah, I forgot about these :)