jsheard 2 days ago
For the "good" bots, which at least respect robots.txt, you can use this list to get ahead of them before they pummel your site: https://github.com/ai-robots-txt/ai.robots.txt There's no easy solution, though, for bad bots that ignore robots.txt and spoof their UA.
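To illustrate why robots.txt only helps against cooperative crawlers, here is a minimal sketch using Python's standard `urllib.robotparser`. The two-bot excerpt and the `is_allowed` helper are assumptions made up for illustration; the list in the linked repo is much longer, and this is not how any particular crawler is implemented.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt in the style of an AI-bot blocklist (assumption:
# the real list in the linked repo names many more user agents).
ROBOTS_TXT = """\
User-agent: GPTBot
User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if a crawler that honors robots.txt may fetch `url`."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/page"
    # A bot that announces itself honestly is told to stay out.
    print("GPTBot:", is_allowed("GPTBot", page))
    # The same bot with a spoofed browser-like UA matches only the wildcard rule.
    print("Mozilla/5.0:", is_allowed("Mozilla/5.0", page))
```

The check runs on the crawler's side, so a bot that skips it or sends a different user agent is unaffected. Stopping those requires server-side measures.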
breakingcups a day ago
Such as OpenAI, who apparently will ignore robots.txt and change their user agent to evade blocks [1].
[1] https://www.reddit.com/r/selfhosted/comments/1i154h7/openai_...
zcase a day ago
For those looking, this is the best I've found: https://blog.cloudflare.com/declaring-your-aindependence-blo...
taikahessu 2 days ago
Thanks, will look into that!