| ▲ | vivzkestrel 6 hours ago |
| rate limit per IP that progressively keeps decreasing the allowed req/min every few minutes? |
|
| ▲ | prmoustache 4 hours ago | parent | next [-] |
| What if the scrapers' IPs are millions of smartphones? If I were as evil as an AI scraper company that ignores robots.txt, I would totally build/buy thousands of small mobile games/apps and use them as jumphosts to scrape the web. This is probably happening already. |
| ▲ | vivzkestrel 2 hours ago | parent [-] | | In my case my application does not use pagination; it uses infinite scroll. Even if you had a million devices running Google Chrome, they would all load page 1, and if that progressively decreasing req/minute limit is implemented, once they start scrolling endlessly they will all hit the rate limits sooner or later. The thing is, a human is not going to scroll down 100 pages, but a bot will. Once this difference has been factored in, it won't matter how many unique devices they bring into the battle. |
|
|
| ▲ | chii 4 hours ago | parent | prev [-] |
| so why not just do that for these scrapers, rather than complicate it by encrypting and decrypting, which is just obfuscation as the private key is clearly available to the end-user? |
| ▲ | vivzkestrel 2 hours ago | parent [-] | | Tbh I did not add the encryption/decryption for the AI scrapers at all. A lot of people were previously downloading data directly from the API my frontend uses, and that pissed me off a bit. So I added encryption/decryption to the mix and will release the newer version. As I mentioned earlier, can someone sit through and decrypt it? Yes. Will 99% of them do it? No! That's where I win. |
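A toy sketch of the trade-off described above: the scheme and names are my own, not the commenter's actual code. Because the key has to ship inside the frontend bundle, this is obfuscation rather than security, which is exactly the point being made; it deters the 99% who won't bother reversing it:

```python
import base64
import hashlib
import json

# Hypothetical key; in the real scenario it ships inside the frontend
# bundle, so a determined scraper can always recover it.
KEY = b"key-embedded-in-frontend-bundle"

def _keystream(length: int) -> bytes:
    """Derive a deterministic keystream of `length` bytes from KEY."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(KEY + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]

def obfuscate(payload: dict) -> str:
    """XOR the JSON payload with the keystream and base64-encode it."""
    raw = json.dumps(payload).encode()
    ks = _keystream(len(raw))
    return base64.b64encode(bytes(a ^ b for a, b in zip(raw, ks))).decode()

def deobfuscate(blob: str) -> dict:
    """Reverse of obfuscate(); this is what the frontend would do."""
    data = base64.b64decode(blob)
    ks = _keystream(len(data))
    return json.loads(bytes(a ^ b for a, b in zip(data, ks)))
```

The chii comment above is right that this is reversible by anyone with the bundle; the bet is purely that most casual API scrapers won't invest the effort.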
|