gf000 4 hours ago
I guess it depends on whether the source data set is part of the training data or not (if it's open source, it likely is). A lawyer could easily argue that the model itself stores a representation of the original, and thus it can never truly work from a "fresh context". And to be perfectly honest, LLMs can quote a lot of text verbatim.