malcontented 3 days ago

Agreed, regarding the computational simplicity of CoT LLMs, and that this solution certainly has much more flexibility. But is there a reason to believe that this architecture (and training method) is as applicable to the development of generally capable models as it is to the solution of individual puzzles?

Don't get me wrong, this is a cool development, and I would love to see how this architecture behaves on a constraint-based problem that's not easily tractable via a traditional algorithm.
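
(For contrast, sudoku is exactly the kind of constraint problem a traditional algorithm handles easily. A minimal sketch, plain backtracking in Python, illustrative only and nothing to do with the paper's method:

    # Sudoku via classic backtracking: grid is a 9x9 list of lists, 0 = empty.
    def valid(grid, r, c, v):
        # Row and column constraints.
        if any(grid[r][j] == v for j in range(9)):
            return False
        if any(grid[i][c] == v for i in range(9)):
            return False
        # 3x3 box constraint.
        br, bc = 3 * (r // 3), 3 * (c // 3)
        return all(grid[br + i][bc + j] != v
                   for i in range(3) for j in range(3))

    def solve(grid):
        for r in range(9):
            for c in range(9):
                if grid[r][c] == 0:
                    for v in range(1, 10):
                        if valid(grid, r, c, v):
                            grid[r][c] = v
                            if solve(grid):
                                return True
                            grid[r][c] = 0
                    return False  # dead end: undo and backtrack
        return True  # no empty cells left: solved

A few dozen lines solve almost any 9x9 instance quickly, which is why strong sudoku results alone wouldn't be surprising.)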

bubblyworld 3 days ago | parent

The ARC-1 problem set that they benchmark on is an example of such a problem, I believe. It's still more or less completely unsolved. They don't solve it either, mind, but they achieve very competitive results with their tiny (27M-parameter) model. Competitive with architectures that use extensive pretraining and hundreds of billions of parameters!

That's one of the things that sticks out for me about the paper. Having tried very hard myself to solve ARC, I find it pretty insane what they're claiming to have done here.

(I think a lot of the sceptics in this thread are unaware of just how difficult ARC-1 is, and are focusing on the sudoku part, which I agree is much simpler and much less surprising to do well on.)
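
For anyone unfamiliar with the benchmark: each ARC-1 task gives a few input/output grid pairs plus a test input, and the solver has to induce the hidden transformation from just those demonstrations. A rough sketch of the task format in Python (the tiny example task is made up here for illustration; real tasks are distributed as JSON files in the ARC repo):

    import json

    # Hypothetical ARC-style task: the hidden rule is a horizontal flip.
    task = json.loads("""
    {
      "train": [
        {"input": [[0, 1], [1, 0]], "output": [[1, 0], [0, 1]]},
        {"input": [[2, 0], [0, 2]], "output": [[0, 2], [2, 0]]}
      ],
      "test": [
        {"input": [[3, 0], [0, 3]]}
      ]
    }
    """)

    # Grids are small 2D arrays of colours 0-9; every task has a
    # different hidden rule, and only these few pairs reveal it.
    for pair in task["train"]:
        print(pair["input"], "->", pair["output"])

A solver gets no per-task training signal beyond those pairs, which is what makes the benchmark so resistant to both hand-written algorithms and large pretrained models.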