ian_j_butler a day ago
It's well known that a reasoning model's output is not necessarily faithful to the content of its thinking scratch pad anyway, even if you had it unsummarized and available verbatim. Setting aside coding agents, we really need this information to even pretend to evaluate the claims of stuff like mathematical breakthroughs, which is exactly why we will never see it. Very embarrassing to get the right answer for the wrong reason. But to give the models some credit, you could argue that even paying too much attention to the thinking is misunderstanding how CoT works. The argument would be that thinking in LLMs isn't really thinking, that it's self-reinforcement and circling to encourage stability around beneficial attractors instead of degenerate ones. Can't have it both ways though: either the thinking is thinking, and so it should be correct, or the thinking is NOT thinking, it's NOT real justification for the outcome, and these systems are even more hopelessly opaque than we usually assume.
handoflixue 16 hours ago
> we really need this information to even pretend to evaluate the claims of stuff like mathematical breakthroughs

Why? Either the proof is correct, or it isn't, right? And it either produces them reliably or not, right? Like, even if its reasoning is completely wrong, and it's only producing correct answers 10% of the time, that's still an astounding amount above baseline and a useful tool. Humans have inaccurate thinking all the time, and are also pretty hopelessly opaque. "It came to me in a dream" is a major plot point in the history of math. I'd still trust Ramanujan more than most mathematicians, since he got the right answer.
anuramat a day ago
> NOT real justification

I thought it was widely accepted that it's not; e.g. https://www.anthropic.com/research/natural-language-autoenco...