No, those are separate issues.

The pipeline is something like: download material -> store material -> train models on material -> store models trained on material -> serve output generated from models.

Those questions focus on the inputs to model training; the question I have raised focuses on the outputs of the model. If [certain] outputs are considered derivative works of the input material, then we have a cascade of questions about which parts of the pipeline are covered by the license requirements. Even if the upstream parts of this simplified pipeline are considered legal, that does not imply that the rest of the pipeline is compliant.

superxpro12 2 hours ago | parent [-]

Consider the net effect and the answer is clear. When these models are properly "trained", are people going to look for the book or a derivative of it, with proper attribution?

Or is the LLM going to regurgitate the same content with zero attribution, and shift all the traffic away from the original work?

When viewed in this frame, it is obvious that the work is derivative and then some.

limagnolia 23 minutes ago | parent [-]

That is your opinion, but the judge disagreed with you. The decision may yet be overturned on appeal, but as it stands, in that courtroom, the training was fair use.

	▲	integralid 3 minutes ago | parent | prev [-]
		This is also, unfortunately, the only way this can be settled. Making LLM output legally a derivative work would murder the AI gold rush, and nobody wants that.