energy123 2 hours ago

I am not sure how to interpret the first paper's results.

If we use a random number generator as the sampler, it will converge to 100% correct answers under pass@n as n grows.

Whenever top-p is less than 1, a random number generator will eventually match or outperform every model for large enough n. The other models will most likely have some bias that makes certain correct CoTs mathematically impossible, because the required tokens are too improbable and get filtered out by top-p. Those models therefore asymptote below 100%, while the RNG reaches 100% almost surely.
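To make the asymptotics concrete, here is a minimal sketch. For n independent samples with per-sample success probability p, pass@n = 1 - (1 - p)^n, so any sampler with p > 0 tends to 100% while a model whose top-p filter zeroes out every correct CoT stays stuck. The probabilities P_RNG and P_TOP_P below are invented placeholders, not numbers from the paper:

    # Toy illustration of the pass@n asymptotics described above.
    # P_RNG and P_TOP_P are made-up per-sample success probabilities.
    P_RNG = 1e-4    # uniform random sampler: tiny but nonzero chance of a correct answer
    P_TOP_P = 0.0   # model whose top-p filter removes every correct CoT

    def pass_at_n(p, n):
        # Probability that at least one of n independent samples is correct.
        return 1.0 - (1.0 - p) ** n

    for n in (10**3, 10**5, 10**7):
        print(n, pass_at_n(P_RNG, n), pass_at_n(P_TOP_P, n))
        # The RNG climbs toward 1.0 as n grows; the truncated model never leaves 0.0.

A real truncated model would of course have nonzero success probability on most problems; the point is only that any problem whose correct CoTs are all filtered out caps its asymptote below 100%.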

Under this paper's logic, doesn't that mean the random number generator is a superior reasoner?