Large Language Model Reasoning Failures (arxiv.org)
15 points by T-A 4 hours ago | 8 comments
chrisjj 3 hours ago

The only reasoning failures here are in the minds of humans gulled into expecting chatbot reasoning ability.

Lapel2742 15 minutes ago

> These models fail significantly in understanding real-world social norms (Rezaei et al., 2025), aligning with human moral judgments (Garcia et al., 2024; Takemoto, 2024), and adapting to cultural differences (Jiang et al., 2025b). Without consistent and reliable moral reasoning, LLMs are not fully ready for real-world decision-making involving ethical considerations.

LOL. Finally the Techbro-CEOs succeeded in creating an AI in their own image.

donperignon 33 minutes ago

An LLM will never reason on its own; what looks like reasoning is an emergent behavior of these systems that is poorly understood. Neurosymbolic systems, combined with LLMs, will define the future of AI.

sergiomattei 2 hours ago

Papers like these are a much-needed bucket of ice water. We anthropomorphize these systems too much.

Skimming the conclusions and results, the authors find that LLMs exhibit failures across many of the axes we'd consider demonstrative of AGI: moral reasoning, simple things like counting that a toddler can do, and so on. They're just not human, and you can reasonably hypothesize that most of these failures stem from their nature as next-token predictors that happen to usually do what you want (see the toy sketch at the end of this comment).

So, if you've got OpenClaw running and think you've got Jarvis from Iron Man, this is probably a good read to ground yourself.

Note there's a GitHub repo from the authors compiling these failures: https://github.com/Peiyang-Song/Awesome-LLM-Reasoning-Failur...
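
Since the next-token-predictor framing does a lot of work in that hypothesis, here's a minimal toy sketch of the generation loop (my own illustration, not from the paper; the probability table is made up): pick a likely continuation, append it, condition on your own output, repeat. Nothing in the loop itself is a dedicated reasoning step.

    import random

    # Hypothetical toy "model": a lookup table of next-token probabilities.
    # A real LLM computes this distribution with a neural network instead.
    toy_model = {
        ("the",): {"cat": 0.6, "dog": 0.4},
        ("the", "cat"): {"sat": 0.7, "ran": 0.3},
        ("the", "cat", "sat"): {".": 1.0},
    }

    def generate(prompt, max_tokens=5):
        tokens = list(prompt)
        for _ in range(max_tokens):
            dist = toy_model.get(tuple(tokens))
            if not dist:
                break
            # Sample the next token, append it, and condition on our own output.
            nxt = random.choices(list(dist), weights=list(dist.values()))[0]
            tokens.append(nxt)
            if nxt == ".":
                break
        return " ".join(tokens)

    print(generate(["the"]))  # e.g. "the cat sat ."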

lostmsu 6 minutes ago

https://en.wikipedia.org/wiki/List_of_cognitive_biases

Specifically, the idea that LLMs fail to solve some tasks correctly due to fundamental limitations, when humans also periodically fail at those same tasks, may well be an instance of the fundamental attribution error.

vagrantstreet an hour ago

Isn't it strange that we expect them to act like humans even though, once a model is trained, it remains static? How is that supposed to be even close to "human-like" anyway?

mettamage 10 minutes ago

> Isn't it strange that we expect them to act like humans even though, once a model is trained, it remains static?

Interacting with an LLM is more akin to talking with a quirky human who has anterograde amnesia: it can't form new long-term memories anymore; it can only follow you within a longish conversation.
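
To make the amnesia analogy concrete, here's a minimal sketch using generic, hypothetical function names rather than any particular vendor's API: the weights are frozen after training, so the only "memory" is the conversation history the client resends on every turn. Clear that history and the model retains no trace of the exchange.

    # call_model is a stand-in for a real inference call; the weights are frozen,
    # so all context has to travel with the request every single time.
    def call_model(messages):
        return f"(reply conditioned on {len(messages)} prior messages)"

    history = []

    def chat(user_text):
        history.append({"role": "user", "content": user_text})
        reply = call_model(history)  # the entire history is resent each turn
        history.append({"role": "assistant", "content": reply})
        return reply

    print(chat("My name is Ada."))
    print(chat("What's my name?"))   # answerable only because the history was resent
    history.clear()                  # "amnesia": nothing persists past the context
    print(chat("What's my name?"))   # no trace of the earlier exchange remains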

LiamPowell 32 minutes ago

If we could reset a human to a prior state after a conversation, then would conversations with them not still be "human-like"?

I'm not arguing that LLMs are human here, just that your reasoning doesn't make sense.