sirwhinesalot 4 days ago

Not the same person, but I think the "structure" of what the ML model learns can have a substantial impact, especially if it then builds on that structure to produce further output.

Learning to guess the next token is very different from learning to map text to a hypervector representing a graph of concepts. This can be seen in image-classification tasks involving overlapping objects, where the output must describe their relative positioning: vector-symbolic models perform substantially better than more "brute-force" neural nets of equivalent size.
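To make the "graph of concepts" idea concrete, here is a minimal vector-symbolic sketch (in the MAP/Hyperdimensional-Computing style with bipolar vectors). All names are illustrative, not from any specific library. It shows why relative positioning is easy to represent: binding roles to fillers makes "circle left of square" nearly orthogonal to "square left of circle", even though both scenes contain the same objects.

```python
import numpy as np

rng = np.random.default_rng(0)
D = 10_000  # hypervector dimensionality

def hv():
    """Random bipolar hypervector; random pairs are near-orthogonal."""
    return rng.choice([-1, 1], size=D)

def cos(a, b):
    """Cosine-like similarity for bipolar vectors."""
    return float(a @ b) / D

# Atomic symbols (objects) and role vectors (positions).
circle, square = hv(), hv()
left, right = hv(), hv()

# Bind role * filler (elementwise multiply), bundle by summing + sign.
scene1 = np.sign(left * circle + right * square)  # "circle left of square"
scene2 = np.sign(left * square + right * circle)  # "square left of circle"

# The two scenes are nearly orthogonal despite identical objects.
print(abs(cos(scene1, scene2)) < 0.1)  # True

# Unbinding: multiplying by a role vector recovers a noisy filler.
probe = scene1 * left
print(cos(probe, circle) > cos(probe, square))  # True: "circle" was on the left
```

The key property is that binding (elementwise multiplication) is its own inverse for bipolar vectors, so structured queries like "what was on the left?" reduce to one multiply and a nearest-neighbor lookup.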

But this is still different from hardcoding a knowledge graph or using closed-form expressions.

Human intelligence relies on neural structures very similar to those we use for movement. Reference frames are both how we navigate the world and how we think. There's no reason to limit ourselves to next-token prediction. It works great because it's easy to set up with the training data we have, but it's otherwise a very "dumb" way to go about it.
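For contrast, the next-token objective itself is simple enough to show in a toy: a bigram count model trained on a tiny corpus. Real LLMs replace the count table with a neural network and a cross-entropy loss, but the task is the same shape: estimate P(next token | context). This example is purely illustrative.

```python
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate".split()

# Count which token follows which.
counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def predict(prev):
    """Most likely next token after `prev` under the bigram counts."""
    return counts[prev].most_common(1)[0][0]

print(predict("the"))  # "cat" ("the cat" occurs twice, "the mat" once)
```

Everything the model "knows" is whatever statistical structure makes that prediction better, which is exactly the point of contention: the objective doesn't require an explicit graph of concepts, even if one emerges.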

red75prime 2 days ago | parent [-]

I mostly agree. But next-token prediction is only the pretraining phase of an LLM, not all there is to LLMs.