lisper 5 days ago

It's not really about "steps"; it's about getting the architecture right. LLMs by themselves are missing two crucial ingredients: embodiment and feedback. The reason they hallucinate is that they have no idea what the words they are saying mean. They are like children mimicking other people. They need to be able to associate the words with some kind of external reality. This could be either the real world or a virtual world, but they need something that establishes an objective reality. And then they need to be able to interact with that world, poke at it and see how it behaves, and get feedback on whether their actions were appropriate.

If I were doing this work, I'd look at a rich virtual environment like Minecraft or SimCity or something like that. But it could also be Coq or a code development environment.
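To make the loop concrete, here's a toy sketch (everything in it — ToyWorld, propose_action, the number-guessing game — is a made-up placeholder, not a real system): an agent proposes actions, a world with its own rules responds, and the next proposal is conditioned on that feedback. In a real setup the world would be something like Minecraft, Coq, or a code sandbox, and propose_action would be an LLM call instead of a stub.

    import random

    class ToyWorld:
        """A trivially simple 'objective reality': a hidden number to find."""
        def __init__(self):
            self.target = random.randint(1, 100)

        def step(self, guess):
            # Feedback comes from the world's actual state, not from text statistics.
            if guess < self.target:
                return "too low"
            if guess > self.target:
                return "too high"
            return "correct"

    def propose_action(history):
        # Stub where an LLM call would go; here, binary search over the feedback so far.
        lo, hi = 1, 100
        for guess, feedback in history:
            if feedback == "too low":
                lo = max(lo, guess + 1)
            elif feedback == "too high":
                hi = min(hi, guess - 1)
        return (lo + hi) // 2

    world = ToyWorld()
    history = []
    for _ in range(20):
        guess = propose_action(history)
        feedback = world.step(guess)
        history.append((guess, feedback))
        if feedback == "correct":
            break

    print(history)  # e.g. [(50, 'too low'), (75, 'too high'), ..., (n, 'correct')]

The point of the sketch is only the shape of the loop: the agent's outputs are judged against the world's state, and that grounded signal, rather than more text, is what it learns from.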

bryanrasmussen 5 days ago | parent

If they were able to associate words with some sort of external reality, would that prevent hallucination, or just being wrong? Humans hallucinate and humans are wrong; perhaps having intelligence without those qualities is the impossibility.

lisper 5 days ago | parent

It's certainly possible that computers will suffer from all the same foibles that humans do, but we have a lot of evolutionary baggage that computers don't, so I don't see any fundamental reason why AGIs could not transcend those limitations. The only way to know is to do the experiment.