CamperBob2 2 hours ago

Interestingly, it doesn't always condition the final output. When playing with DeepSeek, for example, it's common to see the CoT arrive at a correct answer that the final answer doesn't reflect, and even vice versa, where a chain of faulty reasoning somehow yields the right final answer.

It almost seems that the purpose of the CoT tokens in a transformer network is to act as a computational substrate of sorts. The exact choice of tokens may not be as important as it looks, but it's important that they are present.
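To make the mismatch described above concrete, here is a minimal sketch (not from the comment itself) of one crude way to check whether a DeepSeek-R1-style completion's chain of thought and final answer agree. It assumes the CoT arrives inside `<think>...</think>` tags and uses "last number mentioned" as a stand-in for "the answer this span reached"; both the regexes and that heuristic are illustrative assumptions.

```python
import re

def split_cot_and_answer(output: str) -> tuple[str, str]:
    """Split a DeepSeek-R1-style completion into its chain of thought
    (the <think>...</think> block) and the final answer that follows."""
    match = re.search(r"<think>(.*?)</think>(.*)", output, re.DOTALL)
    if match:
        return match.group(1).strip(), match.group(2).strip()
    return "", output.strip()

def last_number(text: str) -> str | None:
    """Crude proxy for 'the answer this span arrived at' on numeric problems:
    just take the last number that appears."""
    numbers = re.findall(r"-?\d+(?:\.\d+)?", text)
    return numbers[-1] if numbers else None

# Toy completion where the CoT reaches 408 but the final answer says 418.
completion = (
    "<think>17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408</think>"
    "The answer is 418."
)
cot, answer = split_cot_and_answer(completion)
print(last_number(cot), last_number(answer))  # 408 418 -> mismatch
```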

nowittyusername an hour ago | parent | next [-]

That phenomenon, among others, is what made it obvious that CoT is not the model's "thinking". I think CoT is a process by which the LLM expands its processing boundary, in that it lets the model sample over a larger space of possibilities. So it kind of acts like a "trigger" of sorts that allows the model to explore in more ways than it could without CoT. The first time I saw this was when I witnessed the "wait" phenomenon: simply inducing the model to say "wait" in its response improved the accuracy of its results, since the model then double-checked its "work". Funny enough, it also sometimes led the model to produce a wrong answer where it otherwise should have stuck to its guns, but overall that little "wait" had a net positive effect. That's when I knew CoT is not the same as human thinking: we don't rely on trigger words or anything like that, and our thinking requires zero language (though it does benefit from language); it's a deeper process. That's why I became interested in latent-reasoning models and forays in that direction.
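A minimal sketch of the "wait" trigger described above, assuming a Hugging Face transformers setup: generate a first answer, append "Wait," to it, and let the model continue decoding so it re-examines its own work. The model name is just a placeholder (any small local instruct or reasoning model would do), and the prompt/trigger wording is illustrative.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder model; swap in whatever local model you are experimenting with.
model_name = "Qwen/Qwen2.5-0.5B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompt = "Q: What is 17 * 24? Think step by step.\nA:"

def generate(text: str, max_new_tokens: int = 128) -> str:
    """Greedy-decode a continuation of `text` and return the full string."""
    inputs = tokenizer(text, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)

# First pass: let the model answer normally.
first_pass = generate(prompt)

# Second pass: append the "wait" trigger and continue decoding,
# nudging the model into double-checking the answer it just gave.
second_pass = generate(first_pass + "\nWait,")
print(second_pass)
```

In practice you would compare extracted answers from the two passes; the point of the sketch is only that the extra "Wait," token changes what the model samples next, which is the trigger effect the comment describes.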

Workaccount2 an hour ago | parent | prev [-]

IIRC Anthropic has research finding CoT can sometimes be uncorrelated with the final output.