▲ | ipaddr 3 days ago | ||||||||||||||||||||||||||||
"This is science at its worst, where you start at an inflammatory conclusion and work backwards" Science starts with a guess and you run experiments to test. | |||||||||||||||||||||||||||||
▲ | hodgehog11 3 days ago | parent [-] | ||||||||||||||||||||||||||||
True, but the experiments are engineered to give results they want. It's a mathematical certainty that the performance will drop off here, but is not an accurate assessment of what is going on at scale. If you present an appropriately large and well-trained model with in-context patterns, it often does a decent job, even when it isn't trained on them. By nerfing the model (4 layers), the conclusion is foregone. I honestly wish this paper actually showed what it claims, since it is a significant open problem to understand CoT reasoning relative to the underlying training set. | |||||||||||||||||||||||||||||
|