| ▲ | deaux 8 hours ago |
| Indeed, that's why Anthropic, OpenAI and other LLM providers are known to adhere to published APIs to gather the world's data, obeying licensing and ROBOTS.txt. It's truly disgusting. |
|
| ▲ | skybrian 8 hours ago | parent [-] |
| I was under the impression that they do obey robots.txt now? There are clearly a lot of dumb agents that don’t, but didn’t think it was the major AI labs. |
| |
| ▲ | deaux 8 hours ago | parent [-] |
| After 3 years of pirating and scraping the entire world by doing the above, I guess they have everything that they now need or want. So then it's better to start obeying ROBOTS.txt as a ladder pull through a "nicely behaved" image advantage. |
|
| ▲ | skybrian 7 hours ago | parent [-] |
| Obeying robots.txt (now) is still better than not obeying it, regardless of what they did before. The alternative is to say that bugs shouldn’t be fixed because it’s a ladder pull or something. But that’s crazy. What’s the point of complaining if not to get people to fix things? |
|
|