| ▲ | jcalvinowens 4 hours ago |
| I had to block Meta's ASN on my personal cgit server a few weeks ago because they were ignoring robots.txt and torching it. Like hundreds of megabytes of access logs just from them, spread across different network blocks, clearly to try to defeat IP-based rate limiting. I couldn't believe it. |
|
| ▲ | bflesch 3 hours ago | parent | next [-] |
| IMO ASN-based blocking should be much more common, but unfortunately it is not supported as a first-class configuration option in many common tools. |
| |
| ▲ | jcalvinowens 3 hours ago | parent | next [-] | | Yeah, I don't know how anybody stays sane without it. I have a list of over a thousand ASNs I blackhole at this point... Mine is a daily bash cronjob that fetches a text-based database and uses grep to build an nftables-apply script with all the IPs for the blocked ASNs. I keep meaning to share it, but it's embarrassingly messy and I haven't had time to clean it up... | | |
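For readers who want to try the approach described above, here is a minimal Python sketch of the same idea. It is not the commenter's script. It assumes a text database with one `PREFIX ASN` pair per line, and it assumes an existing nftables table `inet filter` with interval sets named `blocked_v4` and `blocked_v6`; all of those names and the file format are hypothetical and need adapting to your own setup.

```python
import ipaddress


def parse_asn_db(lines):
    """Yield (network, asn) pairs from 'PREFIX ASN' lines (assumed format)."""
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        prefix, asn = line.split()[:2]
        if asn.upper().startswith("AS"):
            asn = asn[2:]  # accept both 'AS64500' and '64500'
        yield ipaddress.ip_network(prefix, strict=False), int(asn)


def blocked_networks(lines, blocked_asns):
    """Return (v4, v6) lists of merged networks announced by the blocked ASNs."""
    v4, v6 = [], []
    for net, asn in parse_asn_db(lines):
        if asn in blocked_asns:
            (v4 if net.version == 4 else v6).append(net)
    # Merge adjacent and nested prefixes so the interval sets get no overlaps.
    return (list(ipaddress.collapse_addresses(v4)),
            list(ipaddress.collapse_addresses(v6)))


def render_nft(v4, v6, table="inet filter", set4="blocked_v4", set6="blocked_v6"):
    """Render nft commands that refill two pre-existing sets (hypothetical names)."""
    out = []
    for name, nets in ((set4, v4), (set6, v6)):
        out.append(f"flush set {table} {name}")
        if nets:
            elements = ", ".join(str(n) for n in nets)
            out.append(f"add element {table} {name} {{ {elements} }}")
    return "\n".join(out) + "\n"
```

The rendered text could be written to a file and loaded with `nft -f`, which applies the whole file as one transaction. The sets have to be declared with `flags interval` for CIDR elements to be accepted.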
| ▲ | noxvilleza 29 minutes ago | parent [-] | | It's been a real game of cat and mouse over the last few years. I used to do daily iptables updates to block repeat scrapers on the small niche stats site I run. About 5-6 years ago it became more common to see broader ranges - so I started blocking ASNs, which worked great (especially for the regulars like Alibaba, Tencent, compromised DigitalOcean/OVH, ...). In the last 2-3 years though the overall bot traffic has skyrocketed - it's easy to spot bot activity after the fact (no requests to the CDN for static assets, user agent changes from one request to the next, predictable ID enumeration, etc.) but not in real time. They're also often using residential proxies, and Cloudflare bot detection has become pretty bad. |
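The after-the-fact signals listed above (no static asset requests, changing user agents, predictable ID enumeration) can be sketched as a small log pass in Python. This is an illustration only: the `(ip, path, user_agent)` record shape, the `/static/` prefix, the trailing-number URL pattern, and the thresholds are all assumptions, and real logs would need their own parsing.

```python
import re
from collections import defaultdict

# Assumed URL shape: the record ID is the trailing number, e.g. /item/42
ID_PATH = re.compile(r"/(\d+)$")


def score_clients(records, min_requests=5, static_prefix="/static/"):
    """Return {ip: [reasons]} for clients matching the heuristics.

    records is an iterable of (ip, path, user_agent) tuples in request order.
    """
    by_ip = defaultdict(list)
    for ip, path, ua in records:
        by_ip[ip].append((path, ua))

    flagged = {}
    for ip, reqs in by_ip.items():
        if len(reqs) < min_requests:
            continue  # too little traffic to judge
        paths = [p for p, _ in reqs]
        reasons = []
        # Signal 1: pages fetched, but never any static assets.
        if not any(p.startswith(static_prefix) for p in paths):
            reasons.append("no_static_assets")
        # Signal 2: more than one user agent from the same address.
        if len({ua for _, ua in reqs}) > 1:
            reasons.append("user_agent_changes")
        # Signal 3: IDs requested in strictly sequential order.
        ids = [int(m.group(1)) for p in paths if (m := ID_PATH.search(p))]
        steps = [b - a for a, b in zip(ids, ids[1:])]
        if len(steps) >= 3 and all(s == 1 for s in steps):
            reasons.append("sequential_ids")
        if reasons:
            flagged[ip] = reasons
    return flagged
```

Grouping by IP is exactly what residential proxies defeat, which matches the point above that this works on historical logs far better than in real time.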
| |
| ▲ | walrus01 3 hours ago | parent | prev [-] | | It's a real pain in the ass because in the absence of ASN-based blocking, you often have to give something a long list of IP ranges in CIDR notation, and be certain you don't "miss" even one IPv4 /23 or /24, or a crawler will get through. |
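One way to catch a missed range is to compare the prefixes an ASN announces against the blocklist and report anything left uncovered. The Python sketch below assumes you already have both lists as CIDR strings; the function name `uncovered` is made up for illustration. It only checks whether each announced prefix sits inside a single blocklist entry, so a prefix covered only by the union of several smaller entries would be reported as missing.

```python
import ipaddress


def uncovered(announced, blocklist):
    """Return announced prefixes that no single blocklist entry contains."""
    blocks = [ipaddress.ip_network(b) for b in blocklist]
    missing = []
    for prefix in announced:
        net = ipaddress.ip_network(prefix)
        # subnet_of() raises TypeError across IP versions, so compare versions first.
        covered = any(
            net.version == b.version and net.subnet_of(b) for b in blocks
        )
        if not covered:
            missing.append(str(net))
    return missing
```

Running this after each blocklist refresh gives a quick list of prefixes a crawler could still come in from.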
|
|
| ▲ | websap 3 hours ago | parent | prev [-] |
| [flagged] |
| |