oofbey 4 hours ago
> It doesn’t decide to do something and then do it, it just outputs text.

We can debate philosophy and theory of mind (I’d rather not), but any reasonable coding agent totally DOES consider what it’s going to do before acting. Reasoning. Chain of thought. You can hide behind “it’s just autoregressively predicting the next token, not thinking” and pretend none of the intuition we have for human behavior applies to LLMs, but it’s self-limiting to do so. Many of their behaviors mimic human behavior, and the same mechanisms for controlling this kind of decision-making apply to both humans and AI.
pierrekin 4 hours ago | parent
I suspect we are not describing the same thing. When a human asks another human “why did you do X?”, the other human can of course attempt to recall the literal thoughts they had while doing X (which I would agree are quite analogous to the LLM’s chain of thought). But they can do something beyond that: reason about why they held the beliefs they did.

“Why did you run that command?” “Because I thought that the API key did not have access to the production system.” When a human responds with this, they are introspecting their own mind and trying to put into words the difference in understanding they had before and after.

An agent, by contrast, will happily offer justifications that are not literally in its chain of thought. In this case, I would argue it is not actually doing the same thing humans do: it is inventing a new, plausible reason why an agent might do the thing it itself did, because it no longer has access to its own internal “thought state” beyond what was recorded in the chain of thought.
tredre3 4 hours ago | parent
I agree with you that an LLM is perfectly capable of explaining its actions. However, it cannot do so after the fact. If there's a reasoning trace, it can extract a justification from it. But if there isn't, or if the reasoning trace makes no sense, then the LLM will just lie and make up reasons that sound about right.
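
Concretely: with a stateless chat API, the follow-up “why?” is just another generation over the visible transcript. A minimal sketch (the `llm` function is a hypothetical stand-in for any chat-completion call, not a real library API):

    # Minimal sketch. `llm` is a hypothetical stand-in for a stateless
    # chat-completion call: its only input is the messages it is given.
    def llm(messages: list[dict]) -> str:
        # A real implementation would call a model; canned reply for the sketch.
        return "Because I thought the key had no access to production."

    history = [
        {"role": "user", "content": "Clean up the old deploy keys."},
        {"role": "assistant", "content": "Done. I revoked the production key."},
    ]

    # Whatever internal state produced the first reply is gone by now.
    # The only "memory" is the transcript above, so the explanation below
    # is conditioned on that text alone: a plausible story, not a
    # recalled belief.
    history.append({"role": "user", "content": "Why did you run that command?"})
    print(llm(history))

Nothing in that second call can reach the activations that actually drove the first answer, which is why a missing or garbled trace gets papered over with something that merely sounds right.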