| ▲ | Seb-C 6 days ago |
| Hallucinations are not a bug or an exception, but a feature. Everything an LLM outputs is 100% made up, with a heavy bias towards what it was trained on (human-written content). The fundamental reason this cannot be fixed is that the model does not "know" anything about reality; there is simply no such concept here. To make a "probability cutoff" you first need a probability about what the reality/facts/truth is, and we have no such reliable and absolute data (and probably never will). |
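| To make this concrete, here is a toy sketch (hypothetical numbers, not from any real model): the softmax gives a probability over tokens, so a cutoff on it filters out unpopular tokens, not false ones. |

    import math

    # Toy next-token logits for the prompt "The capital of Australia is".
    # Hypothetical values for illustration only, not taken from a real model.
    logits = {"Sydney": 3.2, "Canberra": 2.9, "Melbourne": 1.1}

    # Softmax turns logits into a probability distribution over *tokens*.
    total = sum(math.exp(v) for v in logits.values())
    probs = {tok: math.exp(v) / total for tok, v in logits.items()}

    # A naive "probability cutoff" keeps only high-confidence tokens...
    CUTOFF = 0.4
    confident = {tok: p for tok, p in probs.items() if p >= CUTOFF}

    # ...but the factually wrong answer survives: the number measures how
    # plausible a token is given the training data, not whether it is true.
    print(probs)      # roughly {'Sydney': 0.54, 'Canberra': 0.40, 'Melbourne': 0.07}
    print(confident)  # roughly {'Sydney': 0.54}, i.e. confidently wrong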
|
| ▲ | simianwords 6 days ago | parent | next [-] |
| > To make a "probability cutoff" you first need a probability about what the reality/facts/truth is, and we have no such reliable and absolute data (and probably never will). |
| Can a human give a probability estimate for their predictions? |

| ▲ | Jensson 6 days ago | parent [-] |
| Humans can explain how they arrived at a conclusion. An LLM fundamentally cannot do that: it does not remember why it picked the tokens it did, so it just makes up an explanation based on explanations it has seen before. |
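| A toy sketch of that point (the model function here is a hypothetical stand-in, not a real LLM): in autoregressive sampling only the chosen token is appended to the context; the distribution it was drawn from is thrown away, so any later "explanation" has to be generated from scratch. |

    import random

    def next_token_distribution(context):
        # Stand-in for a real model's softmax output (hypothetical values).
        return {"cat": 0.6, "dog": 0.3, "pigeon": 0.1}

    context = ["the"]
    for _ in range(3):
        dist = next_token_distribution(context)
        token = random.choices(list(dist), weights=list(dist.values()))[0]
        context.append(token)  # only the sampled token survives this step;
        # the distribution (the closest thing to a "why") is discarded here.

    print(context)  # nothing in the context records why each token was picked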
|
|
| ▲ | tmnvdb 6 days ago | parent | prev | next [-] |
| You use a lot of anthropomorphisms: the model doesn't "know" anything (does your hard drive know things? Is that relevant?), and "making things up" is even more strongly tied to conscious intent. Unless you believe LLMs are sentient, this is a strange choice of words. |

| ▲ | Seb-C 6 days ago | parent [-] |
| I originally put quotes around "know" and somehow lost them in an edit. I'm precisely trying to criticize the claims of AGI and intelligence. English is not my native language, so the nuances might be off. I used "made up" in the sense of "built" or "constructed" and did not mean to imply any intelligence there. |
|
|
| ▲ | nikolayasdf123 6 days ago | parent | prev [-] |
| Have you seen the Iris flower dataset? It is fairly simple to find cutoffs that classify the flowers. Or are you claiming, in a philosophical sense, that there is no objective truth in reality at all? You can go down that more philosophical side of the road, or you can get pragmatic: things just work, regardless of how we talk about them. |
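| For instance, a shallow decision tree learns exactly such cutoffs. A minimal scikit-learn sketch (the printed thresholds can vary slightly across library versions): |

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier, export_text

    iris = load_iris()

    # A depth-2 tree is literally a handful of numeric cutoffs on the features.
    clf = DecisionTreeClassifier(max_depth=2, random_state=0)
    clf.fit(iris.data, iris.target)

    # Print the cutoffs the tree found, e.g. "petal width (cm) <= 0.80".
    print(export_text(clf, feature_names=list(iris.feature_names)))
    print("training accuracy:", clf.score(iris.data, iris.target))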

| ▲ | Seb-C 6 days ago | parent [-] |
| I don't mean it in a philosophical sense, but in a rigorous scientific one. Yes, we do have reliable datasets, as in your example, but those cover specific topics and are not based on natural language. What I would call "classical" machine learning is already a useful technology where it is applied. |
| Jumping from separate datasets focused on specific topics to a single dataset describing "everything" at once is not something we are even close to doing, if it is possible at all. Hence the claim of a single AI able to answer anything is unreasonable. |
| The second issue is that even if we had such a hypothetical dataset, to get a formal response from it you would ultimately need a formal question and a formal language (probably something between maths and programming?) at every step of the workflow. LLMs are only statistical models of natural language, so they are the antithesis of this very idea. Achieving that would require a completely different technology, one that has yet to even be theorized. |
|