| ▲ | binaryturtle 6 days ago |
| I'm on an older system here, and both Cloudflare and Anubis lock me out of sites entirely. Once you start blocking actual users from your sites, it has simply gone too far. At least provide an alternative way into your site (e.g. via login) that's not hampered by erroneous human checks. Same for the captchas where you help train AIs by picking from a set of tiny, noisy pictures; I often struggle for 5 to 10 minutes to get past that nonsense. I hear bots have less trouble. Basically we're already past the point where the web is made for actual humans; now it's made for bots. |
|
| ▲ | inejge 6 days ago | parent | next [-] |
> Once you start blocking actual users out of your sites, it simply has gone too far.

It has; scrapers are out of control. Anubis and its ilk are a desperate measure, and some fallout is expected. And you don't get to dictate how a non-commercial site tries to avoid throttling and/or bandwidth overage bills.
| |
| ▲ | account42 6 days ago | parent [-] |
No, they are a lazy measure. Most websites that slap on these kinds of checks don't even bother with more human-friendly measures first.
| ▲ | mschuster91 6 days ago | parent [-] |
Because I don't have the fucking time to deal with AI scraper bots. I went harder: anything that looks even suspiciously close to a scraper, isn't one of Google's published crawlers [1], and doesn't have wget in its user agent gets its entire /24 hard-banned for a month, with an email address to contact for unbanning. That seems to be a pretty effective way, for now, to keep scrapers, spammers and other abusive behavior away. Normal users don't perform certain site actions at the speed scraper bots do, there's no other practically relevant search engine than Google, I've never ever seen an abusive bot disguise itself as wget (they all try to look like a human-operated web browser), and no AI agent yet is smart enough to interpret the message "Your ISP's network appears to have been used by bot activity. Please write an email to xxx@yyy.zzz with <ABC> as the subject line (or click on this pre-filled link) and you will automatically get unblocked". [1] https://developers.google.com/search/docs/crawling-indexing/...
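The ban heuristic described above could be sketched roughly like this. All names, thresholds, and the example Googlebot range here are assumptions for illustration, not the commenter's actual code; a real deployment would verify crawlers against Google's published IP ranges rather than a hardcoded network.

```python
import ipaddress
import time

# Example Googlebot range for illustration only; real code would load
# Google's published crawler IP list from the URL in footnote [1].
GOOGLE_CRAWLER_NETS = [ipaddress.ip_network("66.249.64.0/19")]
BAN_SECONDS = 30 * 24 * 3600  # "hard banned for a month"

banned_until = {}  # /24 network -> unban timestamp


def looks_like_scraper(requests_per_minute: int) -> bool:
    # Placeholder heuristic: the comment mentions "certain site actions"
    # done at inhuman speed; a raw request rate stands in for that here.
    return requests_per_minute > 120


def should_ban(ip: str, user_agent: str, requests_per_minute: int) -> bool:
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in GOOGLE_CRAWLER_NETS):
        return False  # known search crawler ranges are exempt
    if "wget" in user_agent.lower():
        return False  # abusive bots reportedly never admit to being wget
    return looks_like_scraper(requests_per_minute)


def ban(ip: str) -> None:
    # Ban the whole /24, as described, not just the single address.
    net = ipaddress.ip_network(f"{ip}/24", strict=False)
    banned_until[net] = time.time() + BAN_SECONDS
```

A request handler would first check the client's /24 against `banned_until` and, if banned, serve the "write an email to get unblocked" message instead of content.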
| ▲ | account42 6 days ago | parent [-] |
> Normal users don't do certain site actions at the speed that scraper bots do

How would you know, when you have already banned them?
| ▲ | mschuster91 6 days ago | parent [-] |
Simple: a honeypot link in a menu three levels deep, which no ordinary human would care about and which, thanks to a JS animation, takes a human at least half a second to click. Any bot that clicks it in under half a second gets the banhammer. No need for invasive tracking, third-party integrations, whatever.
| ▲ | account42 6 days ago | parent [-] |
That does sound like a much more human-friendly approach than Anubis. I agree that tarpits and honeypots are a good stopgap until the legal system catches up to the rampant abuse by these "AI" companies. It's when your solutions start affecting real human users just because they're not "normal" in some way that I stop being sympathetic.
|
| ▲ | alperakgun 6 days ago | parent | prev | next [-] |
| I gave up on a lot of websites because of the aggressive blocking. |
|
| ▲ | johnklos 6 days ago | parent | prev [-] |
| FYI: you can communicate with the author of Anubis, who has already said she's working on ways to make sure that all browsers (links, lynx, dillo, midori, et cetera) work. Unless you're paying Cloudflare a LOT of money, you won't get to talk to anyone who can or will do anything about issues; they know about their issues and simply don't care. If you don't mind taking a few minutes, perhaps put some details about your setup in a bug report? |