Remix.run Logo
szmarczak 4 hours ago

Those countermeasures don't really have an effect in terms of scraping. Anyone skilled can overcome any protection within a week or two. By officially blocking IA, IA can't archive those websites in a legal way, while all major AI companies use copyrighted content without permission.

zeagle 4 hours ago | parent [-]

For sure. There are many billions and brilliant engineers propping up AI so they will win any cat and mouse game of blocking. It would be ideal if sites gave their data to IA and IA protected it exactly from what you say. But as someone that intentionally uses AI tools almost daily (mainly open evidence) IMO blame the abuser not the victim that it has come to this.

szmarczak 4 hours ago | parent [-]

I'm not blaming the victim, but don't play the 'look what you made me do' game. Making content accessible to anyone (even behind a paywall) is a risk they need to take nevertheless. It's impossible to know upfront if the content is used for consumption or to create derived products (e.g. write an article in NYT style etc.). If this was a newspaper, this would be equivalent to scanning paper and then training AI. You can't prevent scanning, as the process is based on exactly the same phenomenon what makes your eyes see, iow information being sent and received. The game was lost before it even started.