charcircuit 6 hours ago

I think it was largely the introduction of tool calling that let models mitigate the problem of errors compounding exponentially, since it lets the model check whether what it generated is correct or has issues it needs to address. This compensates for a missing or low-quality world model, because the model can reference the current state of the world.
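
A rough sketch of that feedback loop, in case it helps make it concrete. Everything here is hypothetical: `fake_model` stands in for a real LLM call, and `check_with_tool` is a toy "tool" that only checks whether the generated Python compiles. The point is just that the tool's report gets fed back into the next attempt instead of the model guessing blind.

```python
from typing import Callable, Optional


def check_with_tool(code: str) -> Optional[str]:
    """Toy tool (assumption): return an error message, or None if the code compiles."""
    try:
        compile(code, "<generated>", "exec")
        return None
    except SyntaxError as exc:
        return f"SyntaxError: {exc.msg} (line {exc.lineno})"


def generate_with_feedback(
    model: Callable[[str], str], task: str, max_rounds: int = 3
) -> str:
    """Ask the model, verify with the tool, and feed any error back in."""
    prompt = task
    for _ in range(max_rounds):
        draft = model(prompt)
        error = check_with_tool(draft)
        if error is None:
            return draft  # the tool confirmed the draft, so stop here
        # Ground the next attempt in what the tool actually observed.
        prompt = (
            f"{task}\nPrevious attempt:\n{draft}\n"
            f"Tool reported: {error}\nFix it."
        )
    raise RuntimeError("no valid draft within the round limit")


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM: wrong at first, right once it sees feedback."""
    if "Tool reported" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b)\n    return a + b\n"  # missing colon


result = generate_with_feedback(fake_model, "Write add(a, b).")
```

Without the check, the broken first draft would have been carried forward into whatever came next.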

ravenstine 5 hours ago | parent [-]

I've definitely noticed this phenomenon after a few occasions of erroneously trying to rely purely on instructions to get an LLM to do a thing or take on a role, especially without persistent cloud-based sessions that have internal checklists and other opaque guidance. They're essentially poor at self-managing, but can do really well when they are limited in scope/context and are worked into a sort of state machine that guarantees they perform certain tasks predictably. They won't always do those tasks the exact way you expect them to, but at least they actually do them, and because of that they are more likely to have the correct prior context to perform the next task better. Left to themselves, they are so prone to selectively ignoring directions that they can quickly end up on an incorrect path that compounds on itself.
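
To illustrate what I mean by a state machine, here is a minimal sketch. The step names, the instruction strings, and `fake_model` are all made up for the example; swap in a real LLM call for the model. What matters is that ordinary code, not the model, decides which step runs next, and each step only sees its own instruction plus the previous step's output.

```python
from enum import Enum, auto
from typing import Callable, Dict, List


class Step(Enum):
    PLAN = auto()
    IMPLEMENT = auto()
    REVIEW = auto()
    DONE = auto()


# Fixed transition table: the model cannot skip or reorder steps.
NEXT_STEP = {
    Step.PLAN: Step.IMPLEMENT,
    Step.IMPLEMENT: Step.REVIEW,
    Step.REVIEW: Step.DONE,
}

# One narrow instruction per step (hypothetical wording).
INSTRUCTIONS = {
    Step.PLAN: "List the changes needed. Do not write code.",
    Step.IMPLEMENT: "Write code for the plan given as input.",
    Step.REVIEW: "Point out bugs in the code given as input.",
}


def run_pipeline(model: Callable[[str], str], task: str) -> Dict[Step, str]:
    """Drive the model through every step in order, one small prompt at a time."""
    outputs: Dict[Step, str] = {}
    step = Step.PLAN
    previous = task
    while step is not Step.DONE:
        # Limited context: only this step's instruction and the prior output.
        prompt = f"{INSTRUCTIONS[step]}\n\nInput:\n{previous}"
        previous = model(prompt)
        outputs[step] = previous
        step = NEXT_STEP[step]
    return outputs


calls: List[str] = []


def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for an LLM call; records each prompt it receives."""
    calls.append(prompt)
    return f"output {len(calls)}"


outputs = run_pipeline(fake_model, "Add a retry to the HTTP client.")
```

The model may still do a step badly, but it cannot decide to skip the review, and the review step always gets the implementation as its input.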