herval 5 days ago

How does one “fix hallucinations” on an LLM? Isn’t hallucinating pretty much all it does?

kasey_junk 5 days ago

Coding agents have shown how. You filter the output against something that can tell the LLM when it’s hallucinating.

The hard part is identifying those filter functions outside of the code domain.
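
A minimal sketch of that filter loop, assuming Python as the target language. `fake_llm` is a hypothetical stand-in for a real model call, and the built-in `compile()` plays the role of the checker. It only catches syntax errors, which is far weaker than a type checker or a test suite, but the loop has the same shape.

```python
from typing import Callable, Optional


def check_syntax(source: str) -> Optional[str]:
    """Return None if the source parses, else an error message to feed back."""
    try:
        compile(source, "<candidate>", "exec")
        return None
    except SyntaxError as exc:
        return f"SyntaxError on line {exc.lineno}: {exc.msg}"


def generate_with_filter(
    llm: Callable[[str], str], prompt: str, max_attempts: int = 3
) -> Optional[str]:
    """Ask the model, run the checker, and retry with the error as feedback."""
    feedback = ""
    for _ in range(max_attempts):
        candidate = llm(prompt + feedback)
        error = check_syntax(candidate)
        if error is None:
            return candidate
        feedback = f"\nYour previous attempt failed: {error}"
    return None  # the checker rejected every attempt


def fake_llm(prompt: str) -> str:
    # Hypothetical model: emits broken code first, fixed code after feedback.
    if "failed" in prompt:
        return "def add(a, b):\n    return a + b\n"
    return "def add(a, b)\n    return a + b\n"


result = generate_with_filter(fake_llm, "Write add(a, b).")
```

The checker here is cheap and mechanical. Outside of code, the open question is what would take the place of `check_syntax`.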

dotancohen 5 days ago

It's called RAG (retrieval-augmented generation), and it's getting very well developed for some niche use cases such as legal, medical, etc. I've been personally working on one for mental health, and please don't let anybody tell you that they're using an LLM as a mental health counselor. I've been working on it for a year and a half, and if we get it production-ready in the next year and a half I will be surprised. In keeping up with the field, I don't think anybody else is any closer than we are.

tptacek 2 days ago

Wait, can you say more about how RAG solves this problem? What Kasey is referring to is things like compiling statically-typed code: there's a ground truth an agent is connected to there --- it can at least confidently assert "this code actually compiles" (and thus can't be using an entirely-hallucinated API). I don't see how RAG accomplishes something similar, but I don't think much about RAG.
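
A loose Python analogue of that ground truth, as a sketch: resolve each dotted name the model used against the real, installed module. `api_exists` is a hypothetical helper, not part of any library. It only shows that a name exists, not that it is called correctly, and importing a module can run its top-level code.

```python
import importlib


def api_exists(dotted_name: str) -> bool:
    """Return True if a name like 'json.loads' resolves to a real object."""
    parts = dotted_name.split(".")
    try:
        obj = importlib.import_module(parts[0])
    except ImportError:
        return False  # the top-level module itself does not exist
    for i, part in enumerate(parts[1:], start=2):
        if not hasattr(obj, part):
            # It may be a submodule that has not been imported yet.
            try:
                importlib.import_module(".".join(parts[:i]))
            except ImportError:
                return False
        obj = getattr(obj, part)
    return True


real = api_exists("json.loads")
invented = api_exists("json.parse_string")
```

A compiler gives the same kind of answer for a statically-typed language, and additionally checks argument and return types.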

dingdingdang 4 days ago

No no, not at all, see: https://openai.com/index/why-language-models-hallucinate/ which was recently featured on the front page. It's an excellent, clean take on how to fix the issue (they already got a long way with gpt-5-thinking-mini). I liked this bit for its clear outline of the issue:

"Think about it like a multiple-choice test. If you do not know the answer but take a wild guess, you might get lucky and be right. Leaving it blank guarantees a zero. In the same way, when models are graded only on accuracy, the percentage of questions they get exactly right, they are encouraged to guess rather than say “I don’t know.”

"As another example, suppose a language model is asked for someone’s birthday but doesn’t know. If it guesses “September 10,” it has a 1-in-365 chance of being right. Saying “I don’t know” guarantees zero points. Over thousands of test questions, the guessing model ends up looking better on scoreboards than a careful model that admits uncertainty."
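
The arithmetic behind the birthday example can be sketched as follows. The accuracy-only grader (1 point for a correct answer, 0 otherwise) comes from the quoted passage. The penalized grader and its penalty of 1 point per wrong answer are assumptions added here for illustration.

```python
def expected_score(p_correct: float, wrong_penalty: float = 0.0) -> float:
    """Expected points for guessing: +1 if right, -wrong_penalty if wrong."""
    return p_correct * 1.0 - (1.0 - p_correct) * wrong_penalty


p = 1 / 365      # chance a blind birthday guess is right
abstain = 0.0    # "I don't know" scores zero under both graders

# Accuracy-only grading: guessing has a small positive expected score.
guess_accuracy_only = expected_score(p)

# Assumed penalized grading: a wrong answer costs one point.
guess_penalized = expected_score(p, wrong_penalty=1.0)

guessing_wins_accuracy_only = guess_accuracy_only > abstain
guessing_wins_penalized = guess_penalized > abstain
```

Under the accuracy-only grader, guessing never scores below abstaining, so a model tuned against that scoreboard has no reason to say "I don't know". With any positive penalty, a guess only pays off when the model's chance of being right exceeds `wrong_penalty / (1 + wrong_penalty)`.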