crystal_revenge 5 days ago

People also tend not to understand the absurdity of assuming that we can make LLMs stop hallucinating. It would imply not only that truth is absolutely objective, but that it exists on some smooth manifold which language can be mapped to.

That means there would be some high dimensional surface representing "all true things". Any fact could be trivially resolved as "true" or "false" simply by checking whether or not it was represented on this surface. Whether or not "My social security number is 123-45-6789" is true could be determined simply by checking whether or not that statement was mappable to the truth manifold. Likewise, you could wander around that truth manifold and generate a stream of nothing but true statements.

If such a thing existed it would make even the wildest fantasies about AGI seem tame.

edit: To simplify it further, this would imply you could have an 'is_true(statement: string): bool' function for any arbitrary statement in English.
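
To make the shape of that claim concrete, here's a hypothetical sketch (not a real API, and not something I think can be built) of the oracle such a guarantee would imply:

    // Hypothetical: the total decision procedure over arbitrary English
    // statements that "LLMs never hallucinate" would require. The body is a
    // placeholder; no finite model or corpus gets you this.
    function isTrue(statement: string): boolean {
      throw new Error(`cannot decide: ${statement}`);
    }

    // Statements such an oracle would have to settle:
    //   isTrue("My social security number is 123-45-6789")  // a private fact
    //   isTrue("There are infinitely many twin primes")      // an open conjecture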

jdietrich 5 days ago | parent | next [-]

>People also tend not to understand the absurdity of assuming that we can make LLMs stop hallucinating. It would imply not only that truth is absolutely objective, but that it exists on some smooth manifold which language can be mapped to.

Frankly, this is a silly line of argument. There is a vast spectrum between regularly inventing non-existent citations and total omniscience. "We can't define objective truth" isn't a gotcha, it's just irrelevant.

Nobody in the field is talking about or working on completely eliminating hallucinations in some grand philosophical sense, they're just grinding away at making the error rate go down, because that makes models more useful. As shown in this article, relatively simple changes can have a huge effect and meaningful progress is being made very rapidly.

We've been here before, with scepticism about Wikipedia. A generation of teachers taught their students "you can't trust Wikipedia, because anyone can edit it". Two decades and a raft of studies later, it became clear that Wikipedia is at least as factually accurate as traditional encyclopedias and textbooks. The contemporary debate about the reliability of Wikipedia is now fundamentally the same as arguments about the reliability of any carefully-edited resource, revolving around subtle and insidious biases rather than blatant falsehoods.

Large neural networks do not have to be omniscient to be demonstrably more reliable than all other sources of knowledge; they just need to keep improving at their current rate for a few more years. Theoretical nitpicking is missing the forest for the trees: what we can empirically observe about the progress in AI development should have us bracing ourselves for radical social and economic transformation.

apsurd 4 days ago | parent | next [-]

You're not being charitable with that take. It seems like you just swapped "objective truth" for your own flavor: "error rate".

What is an error? How does the LLM "know"?

The Wikipedia example is a good one. I'd say its "truth" is based on human-curated consensus, and everyone gets that. What I don't get is: what's the LLM analog? As you state, it's just about making the error rate go down. OK, so what is an error? Does it require a human in the loop?

skydhash 4 days ago | parent | prev | next [-]

The thing is, for a lot of tasks, a formal method (either algorithmic or simulation-based) can be much more efficient to create and run, with more reliable results. And in a lot of cases, a simpler and smaller model built with other ML techniques can be as good as or better than an LLM.

There's still no justification for the whole investment craze in LLMs.

mqus 5 days ago | parent | prev | next [-]

Well, no. The article pretty much says that any arbitrary statement can be mapped to {true, false, I don't know}. That's still not 100% accurate, but it's at least something that seems reachable. The model just needs to be able to flag unknowns, not verify every single fact.
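
A rough sketch of that weaker target, just to show its shape: the model's answer plus some calibrated confidence, with abstention below a threshold. The names here (modelAnswer, modelConfidence) are illustrative stand-ins, not any particular API.

    // Three-valued verdict with abstention: a sketch of "flag unknowns"
    // rather than "verify every fact".
    type Verdict = "true" | "false" | "unknown";

    function verdict(modelAnswer: boolean, modelConfidence: number, threshold = 0.75): Verdict {
      if (modelConfidence < threshold) return "unknown"; // prefer "I don't know"
      return modelAnswer ? "true" : "false";
    }

    // verdict(true, 0.9)  -> "true"
    // verdict(true, 0.4)  -> "unknown"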

gary_0 5 days ago | parent [-]

Determining a statement's truth (or if it's outside the system's knowledge) is an old problem in machine intelligence, with whole subfields like knowledge graphs and such, and it's NOT a problem LLMs were originally meant to address at all.

LLMs are text generators that are very good at writing a book report based on a prompt and the patterns learned from the training corpus, but it's an entirely separate problem to go through that book report statement by statement and determine if each one is true/false/unknown. And that problem is one that the AI field has already spent 60 years on, so there's a lot of hubris in assuming you can just solve that and bolt it onto the side of GPT-5 by next quarter.

red75prime 4 days ago | parent [-]

> And that problem is one that the AI field has already spent 60 years on

I hope you don't think the solution will be a closed-form expression. The solution should involve exploration and learning. The things that LLMs are instrumental in, you know.

sirwhinesalot 4 days ago | parent | next [-]

Not the same person, but I think the "structure" of what the ML model is learning can have a substantial impact, especially if it then builds on that to produce further output.

Learning to guess the next token is very different from learning to map text to a hypervector representing a graph of concepts. This can be witnessed in image classification tasks involving overlapping objects where the output must describe their relative positioning. Vector-symbolic models perform substantially better than more "brute-force" neural nets of equivalent size.

But this is still different from hardcoding a knowledge graph or using closed-form expressions.

Human intelligence relies on very similar neural structures to those we use for movement. Reference frames are both how we navigate the world and also how we think. There's no reason to limit ourselves to next token prediction. It works great because it's easy to set up with the training data we have, but it's otherwise a very "dumb" way to go about it.

red75prime 2 days ago | parent [-]

I mostly agree. But next-token prediction is the pretraining phase of an LLM, not all there is to LLMs.

gary_0 4 days ago | parent | prev [-]

Of course not; expert systems were abandoned decades ago for good reason. But LLMs are only one kind of ANN. Unfortunately, when all you have is a hammer...

thisoneisreal 5 days ago | parent | prev | next [-]

A great book in this vein is "Language vs. Reality." The main thesis of the book is that language evolved to support approximate, ad hoc collaboration, and is woefully inadequate for doing the kind of work that e.g. scientists do, which requires incredible specificity and precision (hence the amount of effort devoted to definitions and quantification).

BobbyTables2 5 days ago | parent | prev | next [-]

Agree. I deeply suspect the problem of asking an LLM to not hallucinate is equivalent to the classic Halting Problem.

beeflet 4 days ago | parent | prev [-]

Maybe if a language model were absolutely massive, it could <think> enough to simulate the entire universe and determine your social security number.

riwsky 4 days ago | parent [-]

42