mort96 2 hours ago

And it does this thinking without producing tokens?

AntiUSAbah an hour ago | parent

Yes.

Btw, just because you have to do something to the LLM to trigger the flow of information through the model doesn't mean it can't think. It only means we have to build an architecture around the model, or build it into the model's base architecture, to enable more thinking.

We do not know how the brain's architecture is set up for this. We could have sub-agents, or we could be a Mixture of Experts type of 'model'.
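The Mixture of Experts idea mentioned above can be sketched in a few lines: a router scores the input, the top-k experts run, and their outputs are mixed. This is a toy NumPy illustration of the routing concept, not any production MoE implementation; all names (`moe_forward`, `router_w`) are made up for the example.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def moe_forward(x, experts, router_w, k=2):
    """Route input x to the top-k experts and mix their outputs.

    `experts` is a list of callables, `router_w` a (d, n_experts) matrix;
    both are toy stand-ins for learned components.
    """
    scores = softmax(x @ router_w)             # gating probabilities per expert
    top = np.argsort(scores)[-k:]              # indices of the top-k experts
    weights = scores[top] / scores[top].sum()  # renormalize over chosen experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# toy usage: each "expert" is just a fixed linear map
rng = np.random.default_rng(0)
d, n = 4, 3
mats = [rng.standard_normal((d, d)) for _ in range(n)]
experts = [lambda x, M=M: x @ M for M in mats]
router_w = rng.standard_normal((d, n))
y = moe_forward(rng.standard_normal(d), experts, router_w, k=2)
print(y.shape)  # (4,)
```

The point is only that most experts stay idle per token, which is one way an architecture can specialize internally.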

There is also ongoing work on combining multimodal inputs, and on diffusion models, which look completely different from an output point of view.

If you look at how an LLM does math: Anthropic showed in a blog article that they found structures for estimating numbers similar to how a brain does it.

Another experiment by one person was to clone layers and add each copy beneath the original layer. This improved certain tasks. My assumption here is that it lengthens and strengthens a kind of thinking structure.
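The layer-cloning trick described above (sometimes called depth upscaling) amounts to interleaving each layer with a copy of itself. A minimal sketch, with `clone_layers` as a hypothetical helper and strings standing in for real transformer blocks; a real framework would also need weight copies and config updates:

```python
import copy

def clone_layers(layers, copy_fn):
    """Interleave each layer with a copy of itself (toy 'depth upscaling').

    `layers` is any list of layer objects; `copy_fn` duplicates one layer.
    Each clone is inserted directly beneath its original.
    """
    out = []
    for layer in layers:
        out.append(layer)
        out.append(copy_fn(layer))  # clone sits right after the original
    return out

# toy usage: strings stand in for real transformer layers
layers = ["attn_block_0", "attn_block_1"]
widened = clone_layers(layers, copy.deepcopy)
print(widened)
```

Because each clone starts as an exact copy, the stacked model computes a deeper version of the same function and can then be fine-tuned from there.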

But because LLMs are still so good and still yield relevant improvements, I think a whole field of thinking in this regard is still quite unexplored.