johnisgood 6 hours ago

"Five" is not merely "plausible". It is the uniquely correct answer, and it is what the model produces because the training corpus overwhelmingly associates "2 + 3" with "5" in truthful contexts.

And the stochastic parrot framing has a real problem here: if the mechanism reliably produces correct outputs for a class of problems, dismissing it as "just plausibility" rather than computation becomes a philosophical stance rather than a technical critique. The model learned patterns that encode the mathematical relationship. Whether you call that "understanding" or "statistical correlation" is a definitional argument, not an empirical one.
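The mechanism being described can be sketched as a toy frequency table (the corpus, counts, and function names here are invented purely for illustration; a real LLM generalizes far beyond exact-prompt lookup):

```python
from collections import Counter

# Hypothetical mini-corpus: prompt/continuation pairs, with "5" dominating
# for "2 + 3 =" the way the training data overwhelmingly would.
corpus = [
    ("2 + 3 =", "5"), ("2 + 3 =", "5"), ("2 + 3 =", "5"),
    ("2 + 3 =", "23"),  # occasional noisy or joke example
    ("1 + 1 =", "2"), ("1 + 1 =", "2"),
]

counts: dict[str, Counter] = {}
for prompt, answer in corpus:
    counts.setdefault(prompt, Counter())[answer] += 1

def most_plausible(prompt: str):
    """Return the highest-frequency continuation seen in the corpus."""
    c = counts.get(prompt)
    return c.most_common(1)[0][0] if c else None

print(most_plausible("2 + 3 ="))      # statistics alone pick out "5"
print(most_plausible("847 + 312 ="))  # unseen prompt: pure lookup has nothing
```

The point the sketch makes: when the statistics of the corpus encode the true relationship, "most plausible" and "correct" coincide, and the distinction the parrot framing leans on does no empirical work. Where it breaks (unseen prompts, here) is exactly where the critique regains traction.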

The legal citation example sounds about right. It is a genuine failure mode. But arithmetic is precisely where LLMs tend to succeed (at least for small operands) because the training signal is unambiguous.