HarHarVeryFunny | 6 days ago
It can work when:

a) The "reasoning" is regurgitated (in the LLM sense) from the training set rather than novel, OR

b) As a slight variation of the above, the model has been RL-trained for reasoning such that its potential outputs are narrowed and biased towards generating reasoning steps that worked (i.e. led to verified correct conclusions) on the reasoning samples it was trained on. In domains like math, where similar sequences of reasoning steps can be applied to similar problems, this works well (a toy sketch of the idea follows below).

I don't think most people expect LLMs to be good at reasoning in the general case - it's more a matter of "if the only tool you have is a hammer, then every problem is a nail". Today's best general-purpose AI (if not AGI) is LLMs, so people try to use LLMs for reasoning - trying to find ways of squeezing all the reasoning juice out of the training data, using an LLM as the juicer.
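To make (b) concrete, here is a minimal toy sketch of the "keep only the traces whose conclusion verified" idea, in a rejection-sampling flavour: sample many candidate reasoning traces per problem, check the final answers against a verifier, and keep the survivors as the data you'd bias the model towards. Everything here (sample_traces, verify, collect_verified, and the fake arithmetic "model") is a made-up stand-in for illustration, not any particular lab's training pipeline.

    # Toy, self-contained illustration of outcome-verified filtering:
    # sample candidate (trace, answer) pairs, keep only the ones whose
    # final answer checks out, and use those as training targets.
    import random

    def sample_traces(problem: str, n: int) -> list[tuple[str, int]]:
        """Stand-in for sampling n (reasoning_trace, final_answer) pairs from a model."""
        a, b = map(int, problem.split("+"))
        traces = []
        for _ in range(n):
            guess = a + b + random.choice([0, 0, 1, -1])  # sometimes wrong on purpose
            traces.append((f"{a} plus {b} is {guess}", guess))
        return traces

    def verify(problem: str, answer: int) -> bool:
        """Outcome verifier: in math-like domains the final answer can be checked exactly."""
        a, b = map(int, problem.split("+"))
        return answer == a + b

    def collect_verified(problems: list[str], samples_per_problem: int = 8):
        """Keep only traces whose conclusion verified; these would become the data
        the model is fine-tuned / reinforced on, narrowing it towards step
        sequences that led to correct conclusions."""
        kept = []
        for p in problems:
            for trace, ans in sample_traces(p, samples_per_problem):
                if verify(p, ans):
                    kept.append((p, trace))
        return kept

    if __name__ == "__main__":
        data = collect_verified(["2+3", "10+7"])
        print(f"kept {len(data)} verified traces to train on")

The point of the sketch is only the shape of the loop: the verifier supplies the signal, so the approach works exactly where verification is cheap (math, code with tests) and degrades where it isn't.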