amenhotep 3 hours ago
When you buy, or pirate, a book, you didn't enter into a business relationship with the author specifically forbidding you from using the text to train models. When you get tokens from one of these providers, you sort of did. I think it's a pretty weak distinction: by separating the concerns, with one company that collects a corpus and then "illegally" sells it for training, you can pretty much exactly reproduce the acquire-books-and-train-on-them scenario. But in the simplest case, the EULA does actually make it slightly different. Like, if a publisher pays an author to write a book, with the contract specifically saying the publisher isn't allowed to train on that text, and then the publisher trains on it anyway, that's clearly worse than someone just buying a book and training on it, right?
BeetleB 3 hours ago
> When you buy, or pirate, a book, you didn't enter into a business relationship with the author specifically forbidding you from using the text to train models.

Nice phrasing, using "pirate". Violating the TOS of an LLM is the equivalent of pirating a book.