visarga 5 days ago

> What ML does is not in any way akin to human learning in methodology. When we call it learning that's an analogy.

Of course it's different, but if we look closely, it is not copying. The model itself is much smaller, sometimes 1000x smaller, than its training set. Since it is trained on billions of examples, the impact of any one of them is very small (de minimis).

If you try to replicate something closely with AI, it fails. If regurgitation were a huge problem, we'd see lots of lawsuits over outputs, but most suits target the input side (training). That suggests authors can't identify cases of infringement in the outputs.

delusional 5 days ago | parent [-]

I'm not sure I agree that most suits are about input. Most of the ones I've read have been related to the output of a model. Early worries about Copilot were about its tendency to regurgitate code verbatim. The WaPo suit was about verbatim output of segments of their articles.