What the LLM cannot do is explain why it said what it said, when cross-examined. It simply hallucinates the best account of why someone would have said such a thing as it said, same as it can give a probable account of why someone else said something different. The question 'But why did you say this not that ...?' does not lead it to make explicit its grounds for what it said, but just to make a new more complicated statement.

▲

U4E4 4 hours ago | parent | next [-]

This is true in the naive case.

There are however LLM context building techniques that anchor completions in data structures that persist the structure of claims that support the conclusion contained in a completion. Lots of different patterns exist —organizing logic in language is a rich domain— but the one I’ve liked the most is something called a Claim Dependency Graph that models the relationships between atomic claims as graph edges.

There’s a whole suite of operations you can perform on these structures, and “reconstruct how you came to this conclusion” is absolutely one of them.

	▲	mdlman 4 hours ago \| parent [-]
		I’d love to read more about these type of patterns. Do you have any recommendations?

▲

xattt 5 hours ago | parent | prev | next [-]

A human has a motive that exists that frames the thought being expressed. An LLM is going to be creating a “de novo” thought in response to a line of questioning.

▲

ashdksnndck 5 hours ago | parent | prev | next [-]

Same is probably true of humans. In a conversation, we often respond from instinct, then work backwards to a rationalization only when asked. For more considered thoughts, if we’re lucky, we can remember our “reasoning traces” but that’s as deep as our introspection goes. Unless we’re neuroscientists, we don’t even know how many neurons we have, let alone have any understanding of how they generate our thoughts. Motivated reasoning impairs our introspection further, and then dishonesty and communication errors prevent us from relaying the limited remaining information to each other.

Model interpretability work has advanced a lot. Arguably we already can explain AI decision-making better than human brains.

	▲	applicative 5 hours ago \| parent \| next [-]
		No, it happens in the immediate context, where e.g. we say 'No I meant Meredith Jones, not Meredith Smith'- and the possibility of this elaboration is actually part of ordinary communication. I did mean Meredith Jones, not Meredith Smith - thus the use of the past tense The LLM will just give the best answer for what one might have meant, completely reopening calculation. The point is familiar but there are good illustrations in the Atlantic article by a book editor. At first it seems abstract AI hate, but then she gets to the details. AI text cannot be edited. https://www.theatlantic.com/technology/2026/05/how-to-tell-a... or https://archive.ph/YJsGK
	▲	BDPW 4 hours ago \| parent \| prev [-]
		Nonsense, some of my friends are lawyers and they're able to give you consistent interpretations on why they think about a certain aspect of a law a certain way. The whole thing is that they work with this the entire time, so they have a really consistent 'head model' of how things work and why and how considerations should be weighted/ordered/whatever. LLMs just do not have this, there's no consistent underlying reasoning (the 'reasoning' traces in LLMs are really inconsistent)

▲

j45 5 hours ago | parent | prev [-]

LLMs hallucinate, because humans hallucinate.

Asking the LLM in a way where it annotates its sources, it can greatly increase the pattern matching to closely simulate logic, just like in humans.

I understand the question of why did you say this, not that, I have seen other ways of asking that which do not seem to trigger the LLMs over-response in the other direction.

	▲	latentsea 5 hours ago \| parent \| next [-]
		Humans hallucinate because they take shrooms or have schizophrenia.
	▲	applicative 5 hours ago \| parent \| prev [-]
		No, the hallucination of its reasons follows immediately from the technique of probabilistic inference. You can see this in real time, just ask 'why did you use this word, not that word?' It is in the position of a desperate liar. All its responses are essentially 'rationalizations'