friendzis 4 hours ago
No, those are separate issues. The pipeline is something like: download material -> store material -> train models on material -> store models trained on material -> serve output generated from models. These questions focus on the inputs to model training; the question I raised focuses on the outputs of the model. If [certain] outputs are considered derivative works of the input material, then we have a cascade of questions about which parts of the pipeline are covered by the license requirements. Even if any of the upstream parts of this simplified pipeline are considered legal, that does not imply that the rest of the pipeline is compliant.
superxpro12 2 hours ago
Consider the net effect and the answer is clear. When these models are properly "trained", are people going to look for the book or a derivative of it, with proper attribution? Or is the LLM going to regurgitate the same content with zero attribution, and shift all the traffic away from the original work? When viewed in this frame, it is obvious that the work is derivative and then some.