JohnTHaller 17 hours ago

The Chinese AI scrapers/bots are killing quite a bit of the regular web now. YisouSpider absolutely pummeled my open source project's hosting for weeks. Like all Chinese AI scrapers, it ignores robots.txt, so forget about it respecting a Crawl-delay. When I blocked the user agent, it would calm down for a bit, then come back using a generic browser user agent from the same IP addresses. It does this across tens of thousands of IPs.
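
Since the crawler reportedly returns from the same addresses under a different user agent, one possible response is to key the block on the IP rather than on the user agent. The following is only a minimal sketch of that idea, not what was actually deployed here: the `IpBlocklist` class, the `should_block` method, and the token list are all hypothetical names made up for illustration.

```python
# Hypothetical sketch: remember every IP that ever sent a blocked user agent,
# and keep refusing it even after it switches to a browser-like user agent.

# Lowercase substrings to match against the User-Agent header (illustrative).
BLOCKED_AGENT_TOKENS = ("yisouspider",)


class IpBlocklist:
    def __init__(self):
        # IPs that have been seen with a blocked user agent at least once.
        self.flagged_ips = set()

    def should_block(self, ip: str, user_agent: str) -> bool:
        ua = user_agent.lower()
        if any(token in ua for token in BLOCKED_AGENT_TOKENS):
            # Flag the address so a later user-agent change does not help.
            self.flagged_ips.add(ip)
            return True
        # Unknown user agent: block only if the IP was flagged earlier.
        return ip in self.flagged_ips
```

In practice the flagged set would probably need an expiry time, since addresses get reassigned, and it does nothing against requests that arrive from fresh IPs.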

mono442 12 hours ago

Just block all of China, India, and similar countries.

kevin_thibedeau 16 hours ago

Start blocking /16s.
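
As a rough illustration of what blocking /16s could look like, the sketch below collapses a list of offending IPv4 addresses into their enclosing /16 networks and keeps only the networks that show up often enough. The function name `offending_slash16s` and the `min_hits` threshold are assumptions made for this example; only the standard-library `ipaddress` and `collections` modules are used.

```python
import ipaddress
from collections import Counter


def offending_slash16s(ips, min_hits=3):
    """Return the /16 networks containing at least `min_hits` distinct IPs."""
    counts = Counter()
    for ip in set(ips):  # count each address once
        addr = ipaddress.ip_address(ip)
        if addr.version != 4:
            continue  # /16 aggregation here only makes sense for IPv4
        # strict=False masks off the host bits, yielding the enclosing /16.
        net = ipaddress.ip_network(f"{addr}/16", strict=False)
        counts[net] += 1
    return sorted(net for net, n in counts.items() if n >= min_hits)


if __name__ == "__main__":
    # Documentation-range style addresses, used purely as sample input.
    sample = ["203.0.113.5", "203.0.44.9", "203.0.200.1", "198.51.100.7"]
    for net in offending_slash16s(sample):
        print(net)
```

The resulting CIDR strings could then be fed to whatever firewall or web server deny list is in use. A /16 covers 65,536 addresses, so a threshold like `min_hits` is one way to avoid blocking a whole range over a single stray request.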