scotty79 7 days ago
How do you even build a search index today when websites barely link to each other? Nowadays the bulk of linking goes to ecommerce sites (Amazon) from content farms (Reddit), and all those sites are submitted directly to Google. I don't think a crawlable internet exists anymore.
NitpickLawyer 7 days ago
> How do you even build a search index today

You can start with seeds like Common Crawl and go from there. You can also get DNS records from various providers. Then there are SSL certificate logs that you can crawl. There are plenty of sources if you have funding (search by itself, without ads sponsoring it, might be a net loss, except for some niche uses like Kagi?).
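To make the seeding idea concrete, here is a rough sketch of merging hostnames from several sources into one deduplicated crawl frontier. The function names and sample hostnames are made up for illustration, and it assumes you have already exported plain hostname lists from each source; it does not talk to Common Crawl, any DNS provider, or any certificate log.

```python
from typing import Iterable, Optional


def normalize_host(raw: str) -> Optional[str]:
    """Return a cleaned hostname, or None if the entry is unusable."""
    # DNS zone data often carries a trailing dot; case is not significant.
    host = raw.strip().lower().rstrip(".")
    # Certificate entries may list wildcard names such as "*.example.org".
    if host.startswith("*."):
        host = host[2:]
    # Drop anything that does not look like a dotted hostname.
    if not host or "." not in host or " " in host:
        return None
    return host


def build_seed_frontier(*sources: Iterable[str]) -> list[str]:
    """Merge hostname lists into a deduplicated, order-preserving seed list."""
    seen: set[str] = set()
    frontier: list[str] = []
    for source in sources:
        for raw in source:
            host = normalize_host(raw)
            if host is not None and host not in seen:
                seen.add(host)
                frontier.append(f"https://{host}/")
    return frontier


# Hypothetical sample data standing in for real exports.
crawl_hosts = ["Example.com", "blog.example.org"]
dns_hosts = ["example.com.", "shop.example.net."]
cert_hosts = ["*.example.org", "blog.example.org", "bad entry"]

seeds = build_seed_frontier(crawl_hosts, dns_hosts, cert_hosts)
```

Earlier sources win on ordering, so you would list the one you trust most first. A real crawler would still need robots.txt handling, rate limiting, and a check that each host actually serves HTTP.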
loa_in_ 7 days ago
It isn't impossible nowadays to enumerate domain names using DNS data and score them based on the content they serve. Isn't that what we really want as users? Scoring based not on proxies for relevance like referral count, but on viewable content?
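As a toy illustration of scoring on content rather than on link counts, the sketch below ranks pages by a simple TF-IDF-style match against a query. The scoring formula, function names, and sample pages are assumptions chosen for brevity, not how any real search engine works.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def score_pages(pages: dict[str, str], query: str) -> list[tuple[str, float]]:
    """Rank domains by how well their served text matches the query."""
    docs = {domain: Counter(tokenize(text)) for domain, text in pages.items()}
    n_docs = len(docs)
    query_terms = set(tokenize(query))
    scores: dict[str, float] = {}
    for domain, counts in docs.items():
        total = sum(counts.values()) or 1  # avoid dividing by zero on empty pages
        score = 0.0
        for term in query_terms:
            # Terms that appear on fewer pages get a higher weight.
            doc_freq = sum(1 for c in docs.values() if term in c)
            idf = math.log((1 + n_docs) / (1 + doc_freq)) + 1.0
            score += (counts[term] / total) * idf
        scores[domain] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical pages keyed by domain.
pages = {
    "a.example": "sourdough bread recipe with sourdough starter",
    "b.example": "buy cheap shoes online",
    "c.example": "bread news",
}
ranked = score_pages(pages, "sourdough bread")
```

No inbound links are consulted here; the ranking depends only on the text each domain serves. The obvious weakness is that pure content scoring is easy to game with keyword stuffing, which is part of why link-based signals were used in the first place.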
gostsamo 7 days ago
You are making a broad generalization, and even that rests on the assumption that the page-ranking algorithm is the only possible way to do it.