CuriouslyC 7 months ago

The output of a model can be a copyright violation. In fact, even if the model was never trained on copyrighted content, if I provided copyrighted text and then told the model to regurgitate it verbatim, that would be a violation.

That does not make the model itself a copyright violation.

throw646577 7 months ago | parent [-]

This is sort of like the argument against a blank tape levy or a tape copier tax, which is a reasonable argument in the context of the hardware.

But an LLM doesn't just enable direct duplication; it (well, its model) contains the copies.

If software had a meaningful distribution cost or per-unit sale cost, a blank tape tax would be very appropriate for LLM sales.

But instead OpenAI is operating a for-pay duplication service where authors don't get a share of the proceeds -- it is doing the very thing that copyright laws were designed to dissuade by giving authors a time-limited right to control the profits from reproducing copies of their work.