horsawlarway 2 days ago
Nah - it generalizes fine. They're doing exactly what I said - adding PoW (anubis - as you point out - being one solution) to gate access. That's hardly different from things like CAPTCHAs, which were a big thing even before LLMs and also required JavaScript.

Frankly - I'd much rather have people put Anubis in front of the site than Cloudflare, as an aside.

If the site really was static before, and no JS was needed - LLM scraping taking it down means it was incredibly misconfigured (a Raspberry Pi can do thousands of reqs/s for static content, and caching is your friend).

---

Another great solution? Just ask users to login (no js needed). I'll stand pretty firmly behind "If you aren't willing to make an account - you don't actually care about the site".

My take is that search engines and sites generating revenue through ads are the most impacted. I just don't have all that much sympathy for either.

Functionally - I think trying to draw a distinction between accessing a site directly and using a tool like an LLM to access a site is a mistake.

Like - this was literally the mission statement of the semantic web: "unleash the computer on your behalf to interact with other computers". It just turns out we got there by letting computers deal with unstructured data, instead of making all the data structured.
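
To make the PoW gating discussed above concrete, here is a minimal sketch of a hashcash-style challenge, assuming a generic SHA-256 scheme. It is not Anubis's actual implementation; the names `solve` and `verify` and the `difficulty` parameter are hypothetical, chosen only for illustration.

```python
import hashlib
import itertools


def _digest(challenge: str, nonce: int) -> str:
    # Hash the challenge string concatenated with the candidate nonce.
    return hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()


def solve(challenge: str, difficulty: int) -> int:
    """Client side: brute-force the smallest nonce whose digest starts
    with `difficulty` zero hex digits (hypothetical difficulty scheme)."""
    prefix = "0" * difficulty
    for nonce in itertools.count():
        if _digest(challenge, nonce).startswith(prefix):
            return nonce


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Server side: a single hash is enough to check the client's work."""
    return _digest(challenge, nonce).startswith("0" * difficulty)


if __name__ == "__main__":
    # Assumption: a real deployment would issue a random per-visitor challenge.
    challenge = "example-challenge"
    nonce = solve(challenge, 4)
    print(nonce, verify(challenge, nonce, 4))
```

The asymmetry is the point: the client needs on the order of 16^difficulty hashes on average, while the server checks the answer with one. That cost is negligible for a single page view but adds up for a scraper requesting every link.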
krupan 2 days ago
"this was literally the mission statement of the semantic web" which most everyone either ignored or outright rejected, but thanks for forcing it on us anyway?

shiomiru 2 days ago
> If the site really was static before, and no JS was needed

One does not imply the other. This forum is one example. (Or rather, hn.js is entirely optional.)

> Another great solution? Just ask users to login (no js needed). I'll stand pretty firmly behind "If you aren't willing to make an account - you don't actually care about the site".

Accounts don't make sense for all websites. Self-hosted git repositories are one common case where I now have to wait seconds for my phone to burn through enough sha256 to see a readme - but surely you don't want to gate that behind a login either...

> My take is that search engines and sites generating revenue through ads are the most impacted. I just don't have all that much sympathy for either.

...and hobbyist services. If we're sticking with Anubis as an example, consider the author's motivation for developing it:

> A majority of the AI scrapers are not well-behaved, and they will ignore your robots.txt, ignore your User-Agent blocks, and ignore your X-Robots-Tag headers. They will scrape your site until it falls over, and then they will scrape it some more. They will click every link on every link on every link viewing the same pages over and over and over and over. Some of them will even click on the same link multiple times in the same second. It's madness and unsustainable.

https://xeiaso.net/blog/2025/anubis/

> Functionally - I think trying to draw a distinction between accessing a site directly and using a tool like an LLM to access a site is a mistake.

This isn't "a tool" though, it's cloud-hosted scrapers of VC-funded startups taking down small websites in their quest to develop their "tool". It is possible to develop a scraper that doesn't do this, but these companies consciously chose to ignore the pre-existing standards for that. Which is why I think the candy analogy fits perfectly, in fact.
account42 2 days ago
> They're doing exactly what I said - adding PoW (anubis - as you point out - being one solution) to gate access.

Which is a shit solution where everyone suffers.

> Another great solution? Just ask users to login (no js needed). I'll stand pretty firmly behind "If you aren't willing to make an account - you don't actually care about the site".

No, I won't create an account to check if a search result has what I'm looking for. Nor will I sign up to a forum before I know what the culture is like. We already had this shit with communities moving to Discord, we don't need to fuck up the remaining web as well.