The problem with this approach is that modern scrapers use hordes of residential proxies and quickly rotate through IP addresses which belong to ASes you get a lot of real traffic from. There's nothing you can do if the ISP won't take any action against the customer.

▲

lucb1e 2 hours ago | parent | next [-]

I know. All the more reason to do it, right? If an ISP can't keep its network clean, then allowing them to send traffic onto the web is just asking for the problem to continue

Show people a useful error, such as "You are using [ISP name] which sends large volumes of abusive traffic (think of spam and DDoS). They allow the attackers to hop around points across their entire network so we cannot block the abusers more selectively. Despite our attempts to contact them, the abuse continues in volumes which we do not see from other ISPs. To access our corner of the internet, use a different ISP. You could try mobile data instead of Wi-Fi or vice versa.", and they can make their own choices about staying with this ISP if more and more websites show this sort of error

If everyone tries to identify people piecemeal, we all need to implement ~200 different identification systems (assuming each country has a central system that everyone is signed up to in the first place), or rely on algorithms to tell who is a bot (I'm currently being misidentified on a daily basis and I'm, eh, not a bot. Trying to buy public transport tickets is currently difficult, for example, because the monopolist in my country blocks me after a few route queries when using a Google browser, and 0 queries from Firefox)

▲

tangledhelix 6 hours ago | parent | prev [-]

Worse than that - even if they would take action, you can't possibly orchestrate filing all of the complaints. It's a drown-in-quicksand problem, you can't fight quicksand one grain at a time.

▲

lucb1e 2 hours ago | parent [-]

> you can't possibly orchestrate filing all of the complaints

To the ISPs? Each IP range has an abuse email address registered and this is specifically exempt from rate limiting at RIPE's WHOIS server. Not sure how it is in other RIRs but I just happen to know of this policy

You can automate the whole thing, provided that you have a reliable way of identifying the undesired traffic which you need anyway for being able to block it by any means. The trouble is in user identification (they'll just use a new IP address from that ISP or hosting provider if you don't tell the provider about the problematic user)

▲

tangledhelix an hour ago | parent [-]

See what I wrote above (and let me say I am talking about Project Gutenberg and Distributed Proofreaders here, I am one of the admins on both). A large amount of the hassle traffic we've seen is as I wrote above, the IPs come from everywhere and in many cases, each IP makes a single request and doesn't come back. They change user-agent dynamically, etc, to masquerade as regular traffic. They come from residential, cloud/hyperscale, corporate, educational, government, all the networks, on every continent. This is many thousands of "open a ticket with someone" events per hour territory. It's as difficult to fight as DDoS itself for the same reasons (presumably the harvesting parties know that and that's exactly why this approach is used).

Others online have been writing about their own experience with the same stuff; it's not unique to PG at all, it's everywhere. Talk to anyone that runs a web server and they'll have these stories...

	▲	lucb1e an hour ago \| parent [-]
		I'm aware, I also host various websites that see an IP do a single request to the most unlikely of deep pages. Usually not hard to correlate with similar surprising requests from the same ISP, though, and that's exactly why it would be useful to talk to them: they know who used that IP address at the given timestamp. If they get a hundred complaints from different websites, the ISP is in the unique position to correlate that and find the subscriber(s) that are problematic You also don't have to send out 1k support requests per hour. Could trial it with some hosting provider that you expect is responsive and see how it works out edit: like, I just don't see another solution short of banning being anonymous online. Each site would have to know who you are. Someone has to be able to track it back to a person that is doing the abuse or there can't be any rules that we can apply. Imo it's better if that's the ISP (or VPN provider, say) who already has this information anyway