Dylan16807 3 hours ago
Why do you want me to pick a number so badly? There are very, very long examples that are clearly memorization. Like, if a model were trained on all the code in the world except that specific example, the chance of it producing that snippet would be less than a billionth of a billionth of a percent. But that snippet got fed in so many times that it gets treated like a standard idiom and memorized in full. Is that a clear enough threshold for you? I don't know where the exact line is, but I know it's somewhere inside this big ballpark, and there are examples that go past the entire ballpark. I don't care where specifically the bound is.
kelseyfrog 3 hours ago
Ok, 1 it is then.