GuB-42 a day ago

Human brains are animal brains, and their primary function is to keep their owner alive and healthy and to pass on their genes. For that, they developed abilities to recognize danger and react to it, among many other things. Language came later.

For an LLM, language is its whole world. It has no body to care for, just stories about people with bodies to care for. For LLMs, as opposed to us, language is first class and the rest is second class.

There is also a difference in scale. LLMs have been fed essentially the entirety of human knowledge. Their "database" is so big for the limited task of text generation that there is not much room left for creativity. We, on the other hand, are much more limited in knowledge, so we face more "unknowns" and need more creativity.

johnb231 a day ago

The latest models are natively multimodal. Audio, video, images, and text are all tokenised and interpreted in the same model.