gobdovan 3 days ago

OpenAI recently took a systematic look into why models hallucinate [0][1].

The article you shared raises an interesting point by comparing human memory with LLMs, but I think the analogy can only go so far. They're too distinct to explain hallucinations simply as a lack of meta-cognition or meta-memory. These systems are more like alien minds, and allegories risk introducing imprecision when we're trying to debug and understand their behavior.

OpenAI's paper instead identifies hallucinations as a bug in training objectives and benchmarks, grounding the explanation in first principles and the mechanics of ML.
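
To make the incentive concrete, here's a toy sketch (my numbers, not the paper's) of the core argument as I read it: under a benchmark that grades only right/wrong, guessing strictly dominates abstaining whenever the model has any nonzero chance of being right.

    # Toy expected-score comparison under binary (right/wrong) grading.
    # p is the model's probability of producing the correct answer.
    def expected_score(p, guess):
        # Guessing earns 1 with probability p, 0 otherwise;
        # abstaining ("I don't know") always earns 0.
        return p if guess else 0.0

    for p in (0.1, 0.3, 0.5):
        print(p, expected_score(p, guess=True), expected_score(p, guess=False))
    # Any p > 0 makes guessing the score-maximizing policy, so an objective
    # or leaderboard built on this metric quietly rewards confident fabrication.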

Metaphors are useful for creativity, but less so when it comes to debugging and understanding, especially now that the systematic view is this advanced.

[0] https://openai.com/index/why-language-models-hallucinate/?ut... [1] https://cdn.openai.com/pdf/d04913be-3f6f-4d2b-b283-ff432ef4a...

K0balt 2 days ago | parent | next [-]

Hallucinations are actually not a malfunction or any other process outside of the normal functioning of the model. They are merely an output that we find unhelpful but that is in all other ways optimal given the training data, context, and the model precision and parameters being used.

I honestly have no idea why OAI felt that they needed to publish a “paper” about this, since it is blazingly obvious to anyone who understands the fundamentals of transformer inference, but here we are.

The confusion on this topic comes from calling these suboptimal outputs “hallucinations” which drags anthropomorphic fallacies into the room by their neck even though they were peacefully minding their own business down the corridor on the left.

“Hallucination” implies a fundamentally fixable error in inference, a malfunction of thought caused by a pathology or broken algorithm.

LLMs that are “hallucinating” are working precisely as implemented; we simply don’t feel that the output usefully matches the task parameters from a human perspective. It’s just an unhelpful result from the algorithm, like any other failure of training, compression, alignment, or optimisation.
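
To put that at the level of mechanism, here's a minimal sketch (made-up numbers, not from any real model) of what "working as implemented" means: the decoder always emits some token, whether or not the underlying distribution reflects knowledge.

    import random

    # Toy next-token distributions (hypothetical numbers, not from a real model).
    confident = {"Paris": 0.92, "Lyon": 0.05, "Nice": 0.03}             # it "knows"
    diffuse = {"1987": 0.28, "1992": 0.26, "1990": 0.24, "1985": 0.22}  # it doesn't

    def sample(dist):
        # Plain categorical sampling: some token is always returned; there is
        # no built-in "refuse because the distribution is flat" branch.
        tokens, weights = zip(*dist.items())
        return random.choices(tokens, weights=weights, k=1)[0]

    print(sample(confident))  # almost always "Paris"
    print(sample(diffuse))    # a fluent-looking guess either way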

pegasus 2 days ago | parent [-]

Did you read TFA? It gives concrete advice on how to change the training and eval of these models in order to decrease the error rate. Sure, these being stochastic models, the rate will never reach zero, but given that they are useful today, decreasing the error rate is a worthy cause. All this complaining about semantics is just noise to me. It stems from being fixated on some airy-fairy ideas of AGI/ASI, as if nothing else matters. Does saying that a model "replied" to a query mean we are unwittingly anthropomorphizing it? It's all just words; we can extend their use as we see fit. I think "confabulation" would be a more fitting term, but beyond that, I'm not seeing the problem.
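
For example, one of the suggestions (the threshold form below is my paraphrase; see the paper for the exact wording) is to grade with an explicit confidence target, so that a wrong answer costs more than saying "I don't know":

    # Sketch of a grading rule with an explicit confidence target t:
    # correct = +1, "I don't know" = 0, wrong = -t/(1-t).
    # (Paraphrase of the paper's suggestion; the exact constants may differ.)
    def expected_score(p, t, guess):
        penalty = t / (1 - t)
        return p - (1 - p) * penalty if guess else 0.0

    t = 0.75  # only answer if at least 75% confident
    for p in (0.5, 0.75, 0.9):
        print(p, round(expected_score(p, t, guess=True), 2))
    # The expected score of guessing is positive only when p > t, so below the
    # threshold abstaining becomes the rational move instead of a scored loss.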

K0balt a day ago | parent [-]

We can call it whatever, and yes, the answer is training, just like all things regarding the quality of LLM output per parameter count. The problem is that many people understand “hallucination” as a gross malfunction of an otherwise correctly functioning system, i.e. a defect that can/must be categorically “fixed”, not understanding that it is merely a function of trained weights, inference parameters, and prompt context that they can:

A: probably work around by prompting and properly structuring tasks (see the sketch after this list)

B: never completely rule out

C: not avoid at all in certain classes of data transformations where it will creep in in subtle ways and corrupt the data

D: not intrinsically detect, since it lacks the human characteristic of “woah, this is trippy, I feel like maybe I’m hallucinating”
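
For A, a rough sketch of the kind of structuring I mean (ask_llm here is a hypothetical stand-in for whatever client you actually use, and the agreement check is just one crude heuristic, not a guarantee):

    # Hypothetical helper: ask_llm(prompt) -> str stands in for any LLM client.
    def checked_answer(ask_llm, question, n=3):
        prompt = (
            "Answer the question below. If you are not confident, "
            "reply exactly 'I don't know'.\n\n" + question
        )
        answers = [ask_llm(prompt) for _ in range(n)]
        # Crude external check: only trust an answer given consistently and not
        # an abstention. This works around, but never rules out, a fabricated
        # answer (points B, C and D still apply).
        if len(set(answers)) == 1 and answers[0] != "I don't know":
            return answers[0]
        return None  # escalate to a human or a retrieval step instead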

These misconceptions stem from the fact that, in LLM parlance, “hallucination” is often conflated with the same-named, relatable human condition, which is generally considered entirely separate from normal conscious thought processes.

Words and their meanings matter, and the failure to properly label things often is at the root of significant wastes of time and effort. Semantics are the point of language.

aszen 3 days ago | parent | prev [-]

I haven't read the full paper yet, but my intuition is that hallucinations are a byproduct of models having too much information that needs to be compressed in order to generalize.

We already know that larger models hallucinate less since they can store more information. Are there any smaller models that hallucinate less?

gobdovan 2 days ago | parent | next [-]

I'd recommend checking out the full conclusions section. What I can tell you is that with LLMs, it's never a linear correlation. There's always some balance you have to strike, as they really do operate on a changing-anything-changes-everything basis.

Excerpt:

Claim: Avoiding hallucinations requires a degree of intelligence which is exclusively achievable with larger models.

Finding: It can be easier for a small model to know its limits. For example, when asked to answer a Māori question, a small model which knows no Māori can simply say “I don’t know” whereas a model that knows some Māori has to determine its confidence. As discussed in the paper, being “calibrated” requires much less computation than being accurate.
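
To unpack the "calibrated is cheaper than accurate" point, here's a toy illustration (numbers mine, not the paper's): a model that always abstains has nothing to be miscalibrated about, while a model that sometimes answers has to get its stated confidence to match its actual hit rate, which is the harder part.

    # Toy calibration check: stated confidence should match empirical accuracy.
    def calibration_gap(predictions):
        # predictions: list of (stated_confidence, was_correct); None = abstained
        answered = [(c, ok) for c, ok in predictions if c is not None]
        if not answered:
            return 0.0  # never answers -> trivially calibrated
        avg_conf = sum(c for c, _ in answered) / len(answered)
        accuracy = sum(ok for _, ok in answered) / len(answered)
        return abs(avg_conf - accuracy)

    knows_no_maori = [(None, False)] * 5   # always "I don't know"
    knows_a_little = [(0.9, True), (0.9, False), (0.8, False), (0.7, True), (0.9, False)]
    print(calibration_gap(knows_no_maori))  # 0.0
    print(calibration_gap(knows_a_little))  # ~0.44: overconfident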

euroderf 2 days ago | parent | prev [-]

One robot's "hallucination" is another robot's "connecting the dots" or "closing the circle".