Remix clone Hacker News

new | show | ask | jobs Github

	▲	wongarsu 2 hours ago
		If using Common Crawl or Anna's Archive in your training data is legal, then surely the same is true for using conversations with Claude. I don't see a reasonable framework where training AI on copyrighted data is ok if and only if that data is not generated by AI (granted, only meta got caught using Anna's Archive, but it seems safe to assume it's common practice. And even if it wasn't, the websites in Common Crawl are still covered by copyright)