fc417fc802 2 days ago
Yes, that's what the law currently says. I'm asking whether it ought to say that in this specific scenario. Previously there was no way for a machine to do large swaths of things that have recently become possible. Thus a law predicated on the assumption that a machine can't do certain things might need to be revisited.
martin-t 14 hours ago
This is the first technology in human history that only works if you use an exorbitant amount of other people's work (without their consent, often without even their knowledge) to automate their work. There have been previous tech revolutions, but they were based on independent innovation.

> Thus a law predicated on the assumption that a machine can't do certain things might need to be revisited.

Perhaps using copyright law for software and other engineering work might have been a mistake, but it worked to a degree that I and most devs were OK with.

Sidenote: there is no _one_ copyright law. IANAL, but reportedly datasets, for example, are treated differently in the US vs the EU, with greater protection in the EU for the work that went into creating a database. And of course, China does what is best for China at a given moment.

There are two approaches:

1) We follow the current law. Its spirit, and (again, IANAL) probably its letter, says that mechanical transformation preserves copyright. Therefore the LLMs and their output must be licensed under the same license as the training data (if all the training data use compatible licenses) or are illegal (if incompatible licenses were mixed). The consequences, very roughly: a) most proprietary code cannot be used for training, b) using only permissive code gives you a permissively licensed model and output, and c) permissive and copyleft code can be combined, as long as the resulting model and output are copyleft. This still completely ignores attribution, but it's a compromise I would at least consider being OK with. (Though if I don't get even credit for 10 years of my free work being used to build this innovation, then there should be a limit on how much the people building the training algorithms get out of it as well.)

2) We design a new law. Legality and morality are, sadly, different and separate concepts.
Now, call me a naive sucker, but I think legality should try to approximate morality as closely as possible, deviating only due to the real-world limitations of provability. (E.g. some people deserve to die, but the state shouldn't have the right to kill them, because the chance of error is unacceptably high.) In practice, the law is determined by what the people writing it can get away with before the people forced to follow it revolt. I don't want a revolution, but I think that, for example, a bloody revolution is preferable to slavery.

Either way, there are established processes for handling both violations of laws and the writing of new laws. This should not be decided by private for-profit corporations seeing whether they can get away with it scot-free, or trembling that they might have to pay a fine which is a near-zero fraction of their revenue, with almost universally no repercussions for their owners.
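The license arithmetic in approach 1 above can be sketched in code. This is a hypothetical illustration of the rules described there, not a real licensing tool: the `License` enum, the ordering of licenses by strictness, and `combined_license` are all made up for this example.

```python
from enum import IntEnum

class License(IntEnum):
    """Ordered by strictness: a higher value dominates when mixed."""
    PERMISSIVE = 1   # e.g. MIT/BSD-style
    COPYLEFT = 2     # e.g. GPL-style
    PROPRIETARY = 3  # all rights reserved

def combined_license(inputs):
    """Return the license a trained model would inherit from its
    training data under approach 1, or None if the mix is
    incompatible (i.e. the model would be illegal)."""
    if not inputs:
        return None
    strictest = max(inputs)
    if strictest == License.PROPRIETARY:
        return None  # rule (a): proprietary code cannot be used for training
    # rule (b): all-permissive inputs stay permissive;
    # rule (c): any copyleft input makes the result copyleft
    return strictest
```

For example, `combined_license([License.PERMISSIVE, License.COPYLEFT])` yields `License.COPYLEFT` (rule c), while any mix containing `License.PROPRIETARY` yields `None` (rule a). Note this sketch deliberately ignores attribution, just as the comment points out approach 1 does.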