misterchocolat a day ago

hey! thanks for that read suggestion, that's indeed a pretty funny captcha strat. Yup, the links show up if you use the Lynx web browser. As for AI scrapers impersonating googlebot I feel like yes they'd definitely start doing that, unless the risk of getting sued by google is too high? If google could even sue them for doing that?

Not an internet litigation expert but seems like it could be debatable

kuylar 20 hours ago | parent | next [-]

> As for AI scrapers impersonating googlebot I feel like yes they'd definitely start doing that, unless the risk of getting sued by google is too high?

Google releases the Googlebot IP ranges[0], so you can make sure that it's the real Googlebot and not just someone else pretending to be it.

[0] https://developers.google.com/crawling/docs/crawlers-fetcher...
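
For illustration, a minimal sketch of that check in Python. It assumes the published list is JSON with a "prefixes" array of "ipv4Prefix" / "ipv6Prefix" entries; the prefixes in SAMPLE_RANGES and the helper names load_networks and is_in_ranges are placeholders made up for this example, not the real list, which you'd fetch from the page linked above.

```python
import ipaddress

# Hypothetical sample data standing in for the published list.
# Assumption: the real file has this shape; these prefixes are placeholders.
SAMPLE_RANGES = {
    "prefixes": [
        {"ipv4Prefix": "66.249.64.0/27"},
        {"ipv6Prefix": "2001:4860:4801:10::/64"},
    ]
}


def load_networks(data):
    """Turn the prefix entries into ip_network objects."""
    networks = []
    for entry in data["prefixes"]:
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def is_in_ranges(ip, networks):
    """Return True if the client IP falls inside any listed network."""
    addr = ipaddress.ip_address(ip)
    # An IPv4 address is never "in" an IPv6 network (and vice versa),
    # so mixed lists are safe to scan in one pass.
    return any(addr in net for net in networks)


networks = load_networks(SAMPLE_RANGES)
claims_googlebot = True  # e.g. the User-Agent header contained "Googlebot"
client_ip = "203.0.113.7"  # documentation-range address, not a real crawler

if claims_googlebot and not is_in_ranges(client_ip, networks):
    print("User agent says Googlebot, but the IP is outside the listed ranges")
```

The point is that the check uses the connecting IP address, which a scraper can't forge as easily as a header.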

n1xis10t 20 hours ago | parent [-]

Oh good idea!

n1xis10t a day ago | parent | prev [-]

Yeah I guess I don’t know if you can sue someone for using your headers, would be interesting to see how that goes.

throawayonthe 20 hours ago | parent [-]

i think making the case of "you are acting (sending web requests) while knowingly identifying as another legal entity (and criminally/libelously/etc)" shouldn't be toooo hard

n1xis10t 20 hours ago | parent [-]

Seems like it, but there are tons of things that forge request headers all the time, and I don’t think I’ve heard of anyone getting in legal trouble for it. Now I think most of these are scrapers pretending to be browsers, so it might be different, I don’t know.

owl57 15 hours ago | parent [-]

And most of them are pretending to be Chrome. If Google had a good case against someone reusing their user agent, maybe they would already have sued?

Or maybe not. Got some random bot from my server logs. Yeah, it's pretending to be Chrome, but more exactly:

"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/131.0.0.0 Safari/537.36"

I guess Google might not be eager to open this can of worms.