CqtGLRGcukpy 3 hours ago
The AI companies won't just scrape IA once, they keep coming back to the same pages and scraping them over and over. Even if nothing has changed. This is from my experience having a personal website. AI companies keep coming back even if everything is the same.
zmmmmm 6 minutes ago
Yeah, they should really have a think about how their behavior is harming their future prospects here. Just because you have infinite money to spend on training doesn't mean you should saturate the internet with bots looking for content with no constraints - even if that is a rounding error of your cost. We just put heavy constraints on our public sites blocking AI access. Not because we mind AI having access - but because we can't accept the abusive way they execute that access.
giancarlostoro 2 hours ago
Weird. Considering IA has most of its content in a form you could rehost, idk why nobody's just hosting an IA carbon copy that AI companies can hit endlessly, and then cutting IA a nice little check in the process. But I guess some of the wealthiest AI startups are very frugal about training data? This also goes back to something I said long ago: AI companies are relearning software engineering poorly. I can think of so many ways to speed up AI crawlers, I'm surprised someone being paid 5x my salary cannot.
iririririr 2 hours ago
> The AI companies won't just scrape IA once, they keep coming back to the same pages and scraping them over and over. Even if nothing has changed.
Maybe they vibecoded the crawlers. I wish I were joking.