LPisGood a day ago
The point of this post isn’t that the “reasoning” phase of LLM thinking isn’t the same as what humans consider reasoning; it’s that Anthropic is intentionally hiding Claude’s “reasoning output” to make the model harder to distill.
0o_MrPatrick_o0 a day ago
Reading these comments is so harrowing. You are correct about my intentions with this post generally. I want to highlight: I want to measure the performance of LLMs over time, which includes assessing the quality of their outputs. I don’t perceive the reasoning output to be anything other than a measurable signal of possible drift in model performance. Except it isn’t, because I’m only getting a low-value summary of the thinking. It’s like asking your buddy how fast he thought that last pitch was when radar guns are behind the plate. Yeah, it’s a description related to what happened, but it’s not the thing I want to measure.