Remix.run Logo
Dylan16807 3 hours ago

Why do you want me to pick a number so bad?

There are very very long examples that are clearly memorization.

Like, if a model was trained on all the code in the world except that specific example, the chance of it producing that snippet is less than a billionth of a billionth of a percent. But that snippet got fed in so many times it gets treated like a standard idiom and memorized in full.

Is that a clear enough threshold for you?

I don't know where the exact line is, but I know it's somewhere inside this big ballpark, and there are examples that go past the entire ballpark.

I don't care where specifically the bound is.

kelseyfrog 3 hours ago | parent [-]

Ok, 1 it is then.

Dylan16807 2 hours ago | parent [-]

That is not good faith, my dude.

kelseyfrog an hour ago | parent [-]

It sounds like you do care where it is then, no?