safety1st | 6 days ago
I'm pretty much a layperson in this field, but I don't understand why we're trying to teach a stochastic text transformer to reason. Why would anyone expect that approach to work? I would have thought the more obvious approach would be to couple it to some kind of symbolic logic engine: the LLM transforms plain-language statements into fragments conforming to a syntax that the engine can then parse deterministically.

This is the Platonic ideal of reasoning that the author of the post pooh-poohs, I guess, but it seems to me to be the whole point of reasoning; reasoning is the application of logic in evaluating a proposition. The LLM might be trained to generate elements of the proposition, but it's too random to apply the logic itself.
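Roughly the split I have in mind, as a toy sketch (the names and the mini "translation" are invented for illustration, not any real system): the model's only job would be the translation step, and everything after that is deterministic.

    from itertools import product

    def is_tautology(formula, variables):
        # Brute-force propositional validity: true under every truth assignment.
        for values in product([False, True], repeat=len(variables)):
            env = dict(zip(variables, values))
            if not formula(env):
                return False
        return True

    # Suppose the LLM translated "if it rains the ground gets wet; it rains;
    # therefore the ground is wet" into ((p -> q) and p) -> q:
    def modus_ponens(env):
        implies = (not env["p"]) or env["q"]              # p -> q
        return (not (implies and env["p"])) or env["q"]   # ((p -> q) and p) -> q

    print(is_tautology(modus_ponens, ["p", "q"]))  # True, verified deterministically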
_diyar | 6 days ago
We expect this approach to work because it's currently the best working approach; nothing else comes close. Using symbolic language is a good idea in theory, but in practice it doesn't scale as well as auto-regression + RL.

DeepMind's IMO results illustrate this well: in 2024 they solved the problems using AlphaProof and AlphaGeometry, with the Lean language as a formal symbolic logic [1]. In 2025 they performed better and faster by just using a fancy version of Gemini, working entirely in natural language [2].

[1] https://deepmind.google/discover/blog/ai-solves-imo-problems...
[2] https://deepmind.google/discover/blog/advanced-version-of-ge...

Note: I agree with the parent comment's notion that letting the models reason in latent space might make sense, but that's where I'm out of my depth.
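Just to make the contrast concrete, a formal Lean statement looks like the toy theorem below (something trivial I made up, not anything from AlphaProof's actual output); the 2025 system instead wrote its proofs entirely in prose.

    -- Toy Lean 4 example: a machine-checkable statement that every natural
    -- number is at most its successor, with its (one-step) formal proof.
    theorem le_succ_self (n : Nat) : n ≤ n + 1 :=
      Nat.le_succ n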
| ||||||||||||||||||||||||||
gmadsen | 6 days ago
Because what can be embedded in billions of parameters is highly unintuitive to common sense and an active area of research. We do it because it works.

One other point: the Platonic ideal of reasoning is not even an approximation of human reasoning. The idea that you take away emotion and end up with Spock is a fantasy. Neuroscience and psychology research point to a necessary, strong coupling of actions and thoughts with emotions; you don't get a functional system from logical deduction alone. At a very basic level, it is not functional.
bubblyworld | 6 days ago
I think that focusing on systems of truth (like formal logics) might be missing the forest for the trees a bit. There are lots of other things we use reasoning for, like decision making and navigating uncertainty, that are arguably just as valuable as establishing truthiness.

Mathematicians are very careful to use words like "implication" and "satisfaction" (as opposed to words like "reasoning") to describe their logics, because the philosophers may otherwise lay siege to their department. A model that is mathematically incorrect (i.e. has some shaky assumptions and inference issues) but nevertheless makes good decisions (like "which part of this codebase do I need to change?") would still be very valuable, no? I think this is part of the value proposition of tools like Claude Code or Codex.

Of course, current agentic tools seem to struggle with both unless you provide a lot of guidance, but a man can dream =P
Night_Thastus | 6 days ago
Congratulations, you've said the quiet part out loud. Yes, the idea is fundamentally flawed. But there's so much hype and so many dollars to be made selling such services that everyone is either genuinely fooled or sticking their fingers in their ears and pretending not to notice.
| ||||||||||||||||||||||||||
dcre | 6 days ago
It is sort of amazing that it works, and no one knows why, but empirically speaking it is undeniable that it does work. The IMO result was achieved without any tool calls to a formal proof system. I agree the formal-logic approach sounds much more plausible on its face.

https://arstechnica.com/ai/2025/07/google-deepmind-earns-gol...
| ||||||||||||||||||||||||||
HarHarVeryFunny | 6 days ago
It can work when:

a) The "reasoning" is regurgitated (in the LLM sense) from the training set rather than novel, OR

b) As a slight variation of the above, the model has been RL-trained for reasoning, such that its potential outputs are narrowed and biased towards generating reasoning steps that worked (i.e. led to verified correct conclusions) on the reasoning samples it was trained on. In domains like math, where similar sequences of reasoning steps can be applied to similar problems, this works well (see the toy sketch below).

I don't think most people expect LLMs to be good at reasoning in the general case - it's more a matter of "if the only tool you have is a hammer, then every problem is a nail". Today's best general-purpose AI (if not AGI) is LLMs, so people try to use LLMs for reasoning - try to find ways of squeezing all the reasoning juice out of the training data, using an LLM as the juicer.
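A toy sketch of the narrowing in (b) - this is a made-up softmax bandit, not any lab's actual training setup: sample a "reasoning strategy", check the answer with a verifier, and shift probability mass toward strategies that verified.

    import math, random

    # Hypothetical strategies for answering "what is a + b?"; only one is sound.
    STRATEGIES = {
        "add":      lambda a, b: a + b,
        "multiply": lambda a, b: a * b,
        "guess":    lambda a, b: random.randint(0, 20),
    }
    logits = {name: 0.0 for name in STRATEGIES}   # the "policy" being trained

    def probs():
        z = sum(math.exp(v) for v in logits.values())
        return {n: math.exp(v) / z for n, v in logits.items()}

    LR, BASELINE = 0.5, 0.3
    for _ in range(2000):
        a, b = random.randint(0, 9), random.randint(0, 9)
        p = probs()
        chosen = random.choices(list(p), weights=list(p.values()), k=1)[0]
        reward = 1.0 if STRATEGIES[chosen](a, b) == a + b else 0.0   # verifier
        # REINFORCE-style update on the softmax logits.
        for name in logits:
            grad = (1.0 if name == chosen else 0.0) - p[name]
            logits[name] += LR * (reward - BASELINE) * grad

    print(probs())  # "add" ends up with most of the probability mass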
shannifin | 6 days ago
Problem is, even with symbolic logic, reasoning is not a fully mechanical procedure: whether a given proposition can be derived from a given set of axioms is, in general, undecidable.
| ||||||||||||||||||||||||||
wonnage | 6 days ago
My impression of LLM "reasoning" is that it works more like guardrails. The space of possible responses to the initial prompt is huge and doesn't exactly match any learned information; the text generated during reasoning then carries a lot of weight, so placing it in the context should hopefully guide answer generation towards something reasonable. It's the same idea as manually listing a bunch of possibly-useful facts in the prompt, except the LLM generates the plausible-sounding text itself.

I feel like this relates to why LLM answers tend to be verbose, too: the model needs to put the words out there in order to stay coherent.
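Schematically, something like this (generate here is a placeholder for whatever completion call you use, not a real API):

    # Sketch of "reasoning as guardrails": the model's own intermediate text is
    # put back into the context before the final answer is sampled, playing the
    # same role as manually listing useful facts in the prompt.
    def generate(prompt: str) -> str:
        raise NotImplementedError  # placeholder for an LLM completion call

    def answer_with_reasoning(question: str) -> str:
        # Step 1: have the model write out the facts/steps it thinks are relevant.
        scratchpad = generate("List the facts and steps needed to answer:\n" + question)
        # Step 2: condition the final answer on that self-generated context,
        # which narrows the space of plausible continuations.
        return generate(question + "\n\nRelevant reasoning:\n" + scratchpad + "\n\nFinal answer:")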
horizion2025 | 5 days ago
I think you should drop the "stochastic text transformer" label you have probably heard applied, and instead think of them as the neural networks they are. The term says absolutely zero about capabilities but creates a subjective 'reduction'; it's just a thought-terminating cliché.

Let's assume, for the sake of argument, that current LLMs are a mirage, but that in the future some new technology emerges that offers true intelligence and true reasoning. At the end of the day, such a system will also take in text and put out text, and the output will probably be piecemeal, as it is with current LLMs (and humans). So voilà: they are also "stochastic text transformers".

Yes, LLMs were trained to predict the next token. But clearly they are not just a small statistical table or whatever. Rather, it turns out that to be good at predicting the next token, past some point you need a lot of extra capabilities, which is why they emerge during training. "Next-token prediction" is an abstract name that erases what is actually going on. A child learning to write, or filling in math lessons, is also doing 'next-token prediction' from this vantage point. It says nothing about what goes on inside the brain of the child, or indeed inside the LLM.

It is a confusion between interface and implementation. Behind the interface getNextToken(String prefix) may be hiding a simple table, a 700-billion-parameter neural network, or a 100-billion-neuron human brain.
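To make the interface-vs-implementation point concrete (toy code, all names made up): the caller only ever sees next_token, and nothing about that interface tells you what is hiding behind it.

    import random
    from typing import Protocol

    class NextTokenModel(Protocol):
        def next_token(self, prefix: str) -> str: ...

    class TinyTable:
        # A literal "statistical table": a handful of canned continuations.
        TABLE = {"the cat": "sat", "2 + 2 =": "4"}
        def next_token(self, prefix: str) -> str:
            return self.TABLE.get(prefix, "<unk>")

    class HugeNeuralNet:
        # Stand-in for a 700B-parameter network; here it just picks at random.
        def next_token(self, prefix: str) -> str:
            return random.choice(["sat", "ran", "4", "reasoned"])

    def complete(model: NextTokenModel, prefix: str, steps: int = 3) -> str:
        # The caller sees only the interface; it cannot tell what sits behind it.
        for _ in range(steps):
            prefix = prefix + " " + model.next_token(prefix)
        return prefix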