onion2k 10 hours ago

So fuzzycanary also checks user agents and won't show the links to legitimate search engines, so Google and Bing won't see them.

Unscrupulous AI scrapers won't be using a genuine UA string. They'll be claiming to be Googlebot. You'll need to do a reverse DNS check instead - https://developers.google.com/crawling/docs/crawlers-fetcher...
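
For anyone wondering what a reverse DNS check looks like in practice, here's a rough Python sketch, not anything fuzzycanary actually ships. The function name `is_verified_googlebot`, the injectable `reverse_lookup` / `forward_lookup` parameters, and the suffix list are my own assumptions for illustration; check Google's docs for the current list of hostnames. The idea is: reverse-resolve the IP, check the hostname is under a Google domain, then forward-resolve that hostname and confirm it maps back to the same IP.

```python
import socket

# Assumed hostname suffixes for illustration; verify against Google's documentation.
GOOGLE_SUFFIXES = (".googlebot.com", ".google.com")


def is_verified_googlebot(ip, reverse_lookup=None, forward_lookup=None):
    """Forward-confirmed reverse DNS check for an IP claiming to be Googlebot.

    The lookup functions are injectable (hypothetical parameters) so the
    logic can be exercised without real DNS queries.
    """
    # Default resolvers use the standard library; gethostbyname_ex is IPv4-only.
    reverse_lookup = reverse_lookup or (lambda addr: socket.gethostbyaddr(addr)[0])
    forward_lookup = forward_lookup or (lambda host: socket.gethostbyname_ex(host)[2])
    try:
        # Step 1: reverse lookup (IP -> hostname).
        host = reverse_lookup(ip).rstrip(".").lower()
        # Step 2: the hostname must sit under a Google-owned domain.
        if not host.endswith(GOOGLE_SUFFIXES):
            return False
        # Step 3: forward lookup (hostname -> IPs) must include the original IP,
        # otherwise anyone controlling their own PTR record could fake step 2.
        return ip in forward_lookup(host)
    except OSError:
        # socket.herror and socket.gaierror are OSError subclasses:
        # treat any failed lookup as "not verified".
        return False
```

You'd call it as `is_verified_googlebot(request_ip)` only when the UA claims to be Googlebot, and probably cache the result, since two DNS lookups per request gets expensive.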

bakugo 10 hours ago | parent [-]

Most AI scrapers use normal browser user agents (usually random outdated Chrome versions, in my experience). They generally don't fake the UAs of legitimate bots like Googlebot, because Googlebot requests coming from non-Google IP ranges would be way too easy to block.
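
To show why that's easy to block, here's a minimal sketch using Python's `ipaddress` module. The `should_block` function and the single sample range are assumptions for illustration only; a real setup would load whatever IP ranges Google currently publishes rather than hardcoding one.

```python
import ipaddress

# Illustrative sample range only (assumption); load the published list in practice.
SAMPLE_GOOGLE_RANGES = [ipaddress.ip_network("66.249.64.0/19")]


def should_block(user_agent, ip, ranges=SAMPLE_GOOGLE_RANGES):
    """Block requests whose UA claims Googlebot but whose IP is outside the given ranges."""
    claims_googlebot = "googlebot" in user_agent.lower()
    addr = ipaddress.ip_address(ip)
    in_google_range = any(addr in net for net in ranges)
    # Only spoofed Googlebot UAs are blocked; ordinary browser UAs pass through,
    # which is exactly why scrapers prefer them.
    return claims_googlebot and not in_google_range
```

Note that this does nothing against a scraper sending an old Chrome UA, which is the case that actually matters.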