taikahessu 2 days ago
Our non-profit website had its bandwidth drained and was temporarily shut down (!!) under our hosting plan because Amazon's bot was aggressively crawling URLs like ?page=21454 ... etc. Thankfully, SiteGround restored our site without any repercussions, as it was not our fault. We added the Amazon bot to robots.txt after that. I don't like how things are right now. Is a tarpit the solution? Or better laws? Would they stop the Chinese bots? Should they even? I don't know.
jsheard 2 days ago
For the "good" bots, which at least respect robots.txt, you can use this list to get ahead of them before they pummel your site:
https://github.com/ai-robots-txt/ai.robots.txt
There's no easy solution, though, for bad bots that ignore robots.txt and spoof their UA.
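To sanity-check a robots.txt rule before relying on it, here is a minimal sketch using Python's standard `urllib.robotparser`. The rules in `ROBOTS_TXT`, the `is_allowed` helper, and the example.org URLs are assumptions made up for illustration; they are not taken from the linked list. This only shows what a well-behaved crawler should conclude from the file, and does nothing against bots that ignore it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named crawler, allow everyone else.
ROBOTS_TXT = """\
User-agent: Amazonbot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a compliant crawler with this UA may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("Amazonbot", "Mozilla/5.0"):
        verdict = is_allowed(ROBOTS_TXT, agent, "https://example.org/?page=21454")
        print(f"{agent}: {'allowed' if verdict else 'blocked'}")
```

The same helper can be pointed at the text of your real robots.txt to confirm that a newly added entry blocks the agent you meant it to.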
bee_rider 12 hours ago
It is too bad we don’t already have a convention like this for the internet:
User/crawler: I’d like the site.
Server: OK, that’ll be $.02 for me to generate it, and you’ll have to pay $.01 in bandwidth costs, plus whatever your provider charges you.
User: What? Obviously, as a human, I don’t consume websites so fast that $.03 will matter to me. Sure, add it to my cable bill.
Crawler: Oh no, I’m out of money (business model collapse).
mrweasel 20 hours ago
> We had our non-profit website drained out of bandwidth
There are a number of sites having issues with scrapers (AI and others) generating so much traffic that their transit providers are informing them that fees will go up at the next contract renewal if the traffic is not reduced. It's very hard for individual sites to do much about it, as most of the traffic stems from AWS, GCP, or Azure IP ranges. It is a problem, and the AI companies do not care.
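As a rough illustration of the "cloud IP ranges" point, here is a minimal sketch of tagging a client address by provider with Python's standard `ipaddress` module. The CIDR blocks are placeholders taken from the RFC 5737 documentation ranges, and the provider labels and the `classify_ip` helper are hypothetical; a real setup would load the ranges each provider publishes. It only helps measure where traffic comes from, it does not by itself tell scrapers apart from legitimate cloud-hosted clients.

```python
import ipaddress
from typing import Optional

# Placeholder CIDR blocks (RFC 5737 documentation ranges), NOT real cloud ranges.
CLOUD_RANGES = {
    "cloud-a": ["192.0.2.0/24"],
    "cloud-b": ["198.51.100.0/24", "203.0.113.0/25"],
}


def classify_ip(ip: str, ranges: dict = CLOUD_RANGES) -> Optional[str]:
    """Return the provider label whose range contains ip, or None if no match."""
    addr = ipaddress.ip_address(ip)
    for provider, cidrs in ranges.items():
        for cidr in cidrs:
            # Membership test is False when IP versions differ (v4 vs v6).
            if addr in ipaddress.ip_network(cidr):
                return provider
    return None


if __name__ == "__main__":
    for ip in ("192.0.2.17", "203.0.113.5", "203.0.113.200"):
        print(ip, "->", classify_ip(ip))
```

Run over an access log, a tally of these labels gives a first estimate of how much traffic originates from hosted infrastructure rather than residential connections.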
nosioptar 7 hours ago
I want better laws. The bot operator should have to pay you damages for taking down your site. If acting like inconsiderate tools starts costing them money, they may stop.