hollowturtle 7 hours ago

Ehm, what would stop AI scrapers from using a browser like a normal user would? Googlebot already does: it can execute JS and read client-side generated SPA content, which proves it can be done at scale, and I'm pretty sure some AI scrapers already do the same.

dirkc an hour ago | parent | next [-]

If you decrypt the content on the client side using an expensive decryption algorithm, the scraper has to spend the computing resources to decrypt it.
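To make the cost argument concrete, here is a minimal sketch, not anyone's actual implementation. It assumes the "expensive" part is a key derivation with a tunable iteration count (PBKDF2 from Python's standard library). The XOR keystream and the names `derive_key`, `keystream`, and `xor_cipher` are hypothetical and only for illustration; a real deployment would use a vetted cipher.

```python
import hashlib


def derive_key(secret: bytes, salt: bytes, iterations: int) -> bytes:
    # The deliberately slow step: cost grows linearly with `iterations`.
    return hashlib.pbkdf2_hmac("sha256", secret, salt, iterations, dklen=32)


def keystream(key: bytes, length: int) -> bytes:
    # Toy keystream (SHA-256 in counter mode). Illustration only, not vetted crypto.
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(data: bytes, secret: bytes, salt: bytes, iterations: int) -> bytes:
    # XOR is symmetric, so the same call both encrypts and decrypts.
    key = derive_key(secret, salt, iterations)
    return bytes(a ^ b for a, b in zip(data, keystream(key, len(data))))


if __name__ == "__main__":
    # Assumed values: the secret ships with the client, so this is obfuscation plus cost.
    secret, salt = b"shipped-with-the-frontend", b"per-response-salt"
    blob = xor_cipher(b'{"items": [1, 2, 3]}', secret, salt, iterations=200_000)
    print(xor_cipher(blob, secret, salt, iterations=200_000))
```

Every client, human or scraper, pays the derivation cost once per response, so the iteration count trades real users' latency against scraper throughput.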

vivzkestrel 6 hours ago | parent | prev [-]

Rate limit per IP, with the allowed req/min progressively decreasing every few minutes?

prmoustache 4 hours ago | parent | next [-]

What if the scrapers' IPs are millions of smartphones? If I were as evil as an AI scraper company that doesn't obey robots.txt, I would totally build/buy thousands of small mobile games/apps and use them as jump hosts to scrape the web. This is probably happening already.

vivzkestrel 2 hours ago | parent [-]

In my case my application does not use pagination, it uses infinite scroll. Even if you had a million devices running Google Chrome, they would all load page 1, and if that progressively decreasing req/min thing is implemented, once they start scrolling endlessly they would all hit the rate limits sooner or later. The thing is, a human is not going to scroll down 100 pages but a bot will. Once this difference has been factored in, it won't matter how many unique devices they bring into the battle.
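A rough sketch of what such a progressively decreasing limit could look like, under stated assumptions: the limit is tracked per client key (IP or device), it halves every few minutes of continued activity, and all names and defaults here (`DecayingRateLimiter`, `base_limit`, `decay_every`, and so on) are made up for illustration rather than taken from any real application.

```python
from dataclasses import dataclass


@dataclass
class ClientState:
    first_seen: float    # when this client was first observed
    window_start: float  # start of the current counting window
    count: int = 0       # requests accepted in the current window


class DecayingRateLimiter:
    def __init__(self, base_limit=60, decay_every=300.0, decay_factor=0.5,
                 min_limit=1, window=60.0):
        self.base_limit = base_limit
        self.decay_every = decay_every
        self.decay_factor = decay_factor
        self.min_limit = min_limit
        self.window = window
        self.clients = {}

    def current_limit(self, state, now):
        # Shrink the per-window allowance once per elapsed decay step.
        steps = int((now - state.first_seen) // self.decay_every)
        return max(self.min_limit, int(self.base_limit * self.decay_factor ** steps))

    def allow(self, client_id, now):
        state = self.clients.get(client_id)
        if state is None:
            state = ClientState(first_seen=now, window_start=now)
            self.clients[client_id] = state
        if now - state.window_start >= self.window:
            # A new counting window begins; the decayed limit still applies.
            state.window_start = now
            state.count = 0
        if state.count >= self.current_limit(state, now):
            return False
        state.count += 1
        return True
```

A caller would pass the current time explicitly, for example `limiter.allow(ip, time.monotonic())`. The sketch never forgets a client, so a real version would need some expiry for idle ones, and it only slows each device individually rather than detecting that many devices belong to one scraper.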

chii 4 hours ago | parent | prev [-]

So why not just do that for these scrapers, rather than complicating it with encryption and decryption, which is just obfuscation since the private key is clearly available to the end user?

vivzkestrel 2 hours ago | parent [-]

Tbh I did not add the encryption/decryption for the AI scrapers at all. A lot of people were previously trying to download data directly from the API that my frontend uses, and this kinda pissed me off a bit. So I added encryption/decryption to the mix and will release the newer version. As I mentioned earlier as well: can someone sit down and decrypt it? Yes. Will 99% of them do it? No! That's where I win.