modeless | 5 days ago
Spatial reasoning is weak, but even so I frequently see models arrive at the right answer in their reasoning steps, only to make the wrong move on the following turn because they've forgotten what they just worked out. For models with hidden reasoning it's often not even possible to retain the reasoning tokens in context across steps; even if you could, context windows are big but not big enough to hold all the past reasoning for every step across hundreds of steps; and even if they were, retrieval from context is terrible for abstract concepts (as opposed to verbatim copying). Text is too lossy and inefficient a medium. The models need to be able to internally store and retrieve a more compact, abstract, non-verbal representation of facts and procedures.
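
To make the failure mode concrete, here's a minimal sketch of the agent loop being described. All names (`call_model`, `summarize`, the memory format) are hypothetical stand-ins, not a real API; the point is that unless something like the `state.memory.append(...)` line exists, each turn's reasoning vanishes, and even when it does exist, the "memory" is just truncated text, which is exactly the lossy workaround being criticized:

    # Sketch only, assuming a hypothetical model interface.
    from dataclasses import dataclass, field

    @dataclass
    class AgentState:
        observation: str
        memory: list[str] = field(default_factory=list)  # compact notes carried across turns

    def call_model(prompt: str) -> tuple[str, str]:
        """Hypothetical model call returning (reasoning, action).
        With hidden reasoning, the reasoning string may not be available at all."""
        reasoning = f"worked out a plan for: {prompt[:40]}..."
        action = "move_left"
        return reasoning, action

    def summarize(reasoning: str) -> str:
        """Stand-in for compressing reasoning into a retrievable note.
        Here it's just truncated text -- lossy, as the comment argues."""
        return reasoning[:80]

    def step(state: AgentState) -> str:
        # Naive retrieval: the last few notes, verbatim. Retrieving abstract
        # concepts from text like this is the weak link described above.
        notes = "\n".join(state.memory[-5:])
        prompt = f"notes:\n{notes}\n\nobservation: {state.observation}"
        reasoning, action = call_model(prompt)
        state.memory.append(summarize(reasoning))  # drop this line and every turn forgets
        return action

    state = AgentState(observation="grid position (3, 4)")
    for _ in range(3):
        print(step(state))

Even in this toy version the costs are visible: the notes grow linearly with the number of steps, and the model only ever gets back degraded text rather than whatever internal representation produced the original reasoning.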