visarga 2 days ago:
> The current generation of LLMs have convinced me that we already have the compute and the data needed for AGI, we just likely need a new architecture.

I think this is one of the greatest fallacies surrounding LLMs, along with the other one: just scale compute. The models are plenty fine. What they need is not better models or more compute; they need better data, or better feedback, so they can keep iterating until they reach the solution.

Take AlphaZero for example. It was a simple convolutional network, nothing special compared to LLMs and small relative to recent models, and yet it beat the best of us at our own game. Why? Because it had unlimited environment access to play games against other variants of itself. The same goes for the whole Alpha* family: AlphaStar, AlphaTensor, AlphaCode, AlphaGeometry and so on, trained with copious amounts of interactive feedback, could reach top human level or surpass humans in specific domains.

What AI needs is feedback, environments, tools, real-world interaction that exposes the limitations of the model and provides immediate help to overcome them. Not unlike human engineers and scientists: take their labs and experiments away and they can't discover shit. It's also called the ideation-validation loop (sketched below). AI can ideate; it needs validation from outside. That is why I insist the models are not the bottleneck.
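A minimal sketch of what that ideation-validation loop looks like in code, assuming a hypothetical model and environment interface (propose_solution, evaluate are placeholder names, not any real API): the model ideates, an external environment validates, and feedback flows back into the next attempt.

```python
def ideation_validation_loop(task, model, environment, max_iters=10):
    """Hypothetical sketch: model proposes, environment validates, repeat."""
    feedback = None
    for _ in range(max_iters):
        # Ideation: the model proposes a candidate given the task and prior feedback.
        candidate = model.propose_solution(task, feedback)

        # Validation: an external environment (game engine, test suite,
        # proof checker, lab experiment) judges the candidate and returns
        # grounded feedback the model could not produce on its own.
        ok, feedback = environment.evaluate(candidate)
        if ok:
            return candidate  # validated solution

    return None  # no validated solution within the budget
```

The point of the loop is that the quality of the validator, not the size of the model, bounds how far the iteration can go.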
geon 16 hours ago (in reply):
For AlphaZero, the "better data" was trivial to generate: the environment of board games is extremely simple. It just can't be compared to language models. The problem with language is that there is no known correct answer. Everything is vague, ambiguous and open ended. How would we even implement feedback for that? So yes, we do need new models.