svnt 3 days ago
You’re absolutely wrong! This is not how reasoning models work. Chain-of-thought did not produce reasoning models.
Dylan16807 2 days ago
How do they work then? Because I thought chain of thought is what made for reasoning. And the first google result for 'chain of thought versus reasoning models' says it does: https://medium.com/@mayadakhatib/the-era-of-reasoning-models... Give me a better source.
lucb1e 3 days ago
Then I can't explain why it's producing the results that it does. If you have more information to share, I'm happy to update my knowledge... Doing a web search on the topic just turns up marketing materials. Even Wikipedia's "Reasoning language model" article is mostly a list of release dates and model names; the only relevant-sounding remark about how these models are different is: "[LLMs] can be fine-tuned on a dataset of reasoning tasks paired with example solutions and step-by-step (reasoning) traces. The fine-tuned model can then produce its own reasoning traces for new problems."

It sounds like just another dataset: more examples, more training, in particular on worked examples where this "think step by step" method is demonstrated with known-good steps and values. I don't see how that fundamentally changes how the model works. Are you saying such models no longer predict the most likely token for a given context, that there is some fundamentally different reasoning process going on somewhere?
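To make concrete what I mean by "just another dataset", here's a rough sketch of what one such fine-tuning example might look like. The field names and the <problem>/<think>/<answer> tags are made up for illustration, not any vendor's actual format; the point is that the training objective would still be ordinary next-token prediction, with the reasoning steps simply being more text in the sequence the model learns to continue.

    # Toy example of a reasoning-trace fine-tuning record.
    # Field names and tags are hypothetical, for illustration only.
    example = {
        "problem": "A train travels 60 km in 45 minutes. What is its speed in km/h?",
        "reasoning_trace": "45 minutes is 0.75 hours. Speed = distance / time = 60 / 0.75 = 80 km/h.",
        "answer": "80 km/h",
    }

    # The record is flattened into a single token sequence; training still uses
    # ordinary next-token prediction over that sequence, reasoning steps included.
    text = (
        f"<problem>{example['problem']}</problem>"
        f"<think>{example['reasoning_trace']}</think>"
        f"<answer>{example['answer']}</answer>"
    )

    # Toy whitespace "tokenizer", just to show the (context -> next token) pairs
    # the loss would be computed over; a real tokenizer produces subword ids.
    tokens = text.split()
    pairs = [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]

    for context, target in pairs[:3]:
        print(" ".join(context), "->", target)

If that's roughly what's happening, then at inference time the "thinking" span is still sampled token by token like any other text, which is what I was getting at.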