Remix.run Logo
naasking 7 days ago

> Because reasoning tasks require choosing between several different options. “A B C D [M1] -> B C D E” isn’t reasoning, it’s computation, because it has no mechanism for thinking “oh, I went down the wrong track, let me try something else”. That’s why the most important token in AI reasoning models is “Wait”. In fact, you can control how long a reasoning model thinks by arbitrarily appending “Wait” to the chain-of-thought. Actual reasoning models change direction all the time, but this paper’s toy example is structurally incapable of it.

I think this is the most important critique that undercuts the paper's claims. I'm less convinced by the other point. I think backtracking and/or parallel search is something future papers should definitely look at in smaller models.

The article is definitely also correct on the overreaching, broad philosophical claims that seems common when discussing AI and reasoning.