Enginerrrd 2 days ago
The current generation of LLMs have convinced me that we already have the compute and the data needed for AGI; we just likely need a new architecture. But I really think such an architecture could be right around the corner. It looks to me like the building blocks are already there; it would just take someone with the right luck and genius to make it happen.
visarga 2 days ago | parent
> The current generation of LLMs have convinced me that we already have the compute and the data needed for AGI; we just likely need a new architecture.

I think this is one of the greatest fallacies surrounding LLMs. This one, and the other one: just scale compute. The models are plenty fine. What they need is not better models or more compute; they need better data, or better feedback, so they can keep iterating until they reach the solution.

Take AlphaZero: it was a simple convolutional network, not great compared to LLMs and small relative to recent models, and yet it beat the best of us at our own game. Why? Because it had unlimited environment access to play games against other variants of itself. Same for the whole Alpha* family (AlphaStar, AlphaTensor, AlphaCode, AlphaGeometry and so on): trained with copious amounts of interactive feedback, they could reach or surpass top human level in specific domains.

What AI needs is feedback, environments, tools, real-world interaction that exposes the limitations in the model and provides immediate help to overcome them. Not unlike human engineers and scientists: take their labs and experiments away and they can't discover shit. It's also called the ideation-validation loop. AI can ideate; it needs validation from outside. That is why I insist the models are not the bottleneck.
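To make the ideation-validation loop concrete, here is a minimal sketch in Python (all names and numbers are hypothetical toys, not from any Alpha* system): a proposer ideates candidate solutions, an external environment validates them, and only that feedback steers the next round.

    # Minimal ideation-validation loop (hypothetical toy example):
    # a proposer ideates candidates, the environment validates them,
    # and only the environment's score decides what survives.

    import random

    def environment_score(candidate: float) -> float:
        """Stand-in for outside validation: the proposer never sees this
        function, it only observes the score it returns."""
        hidden_target = 3.7
        return -abs(candidate - hidden_target)

    def ideate(best_guess: float, step: float) -> float:
        """Ideation: propose a variation on the current best idea."""
        return best_guess + random.uniform(-step, step)

    def ideation_validation_loop(rounds: int = 200) -> float:
        best_guess, best_score = 0.0, environment_score(0.0)
        for _ in range(rounds):
            candidate = ideate(best_guess, step=1.0)
            score = environment_score(candidate)   # validation from outside
            if score > best_score:                 # feedback drives improvement
                best_guess, best_score = candidate, score
        return best_guess

    if __name__ == "__main__":
        print(f"converged on {ideation_validation_loop():.3f}")  # approaches 3.7

The point of the sketch is that the proposer never grades itself; every improvement comes from the environment's feedback.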
| ||||||||||||||||||||||||||||||||
netdevphoenix 2 days ago | parent
> The current generation of LLMs have convinced me that we already have the compute and the data needed for AGI; we just likely need a new architecture.

This is likely true, but not for the reasons you think. It was arguably true 10 years ago too. A human brain runs on roughly 20 watts, and unlike most models out there, the brain is ALWAYS in training mode. It has an estimated 2 petabytes of storage. In terms of raw capabilities, we have been there for a very long time.

The real challenge is finding the point where we can build something AGI-level with the stuff we have. Right now we might have the compute and data needed for AGI, yet lack the tools needed to build a system that efficient.

It's like a little dog trying to get into a fenced house: the shortest path between the dog and the house may not be accessible to that dog, given its current capabilities (short legs, no ability to jump high or push through the fence standing in between), so a longer path may actually be the quickest way it can reach the house. In case it's not obvious: AGI is the house, we are the little dog, and the fence represents the current obstacles to building AGI.
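A rough back-of-envelope comparison of the raw-capability numbers above (the brain figures are the commonly cited estimates from the comment; the accelerator and cluster numbers are purely illustrative assumptions, not measurements):

    # Back-of-envelope comparison: brain budget vs. a hypothetical GPU cluster.
    # Accelerator power and cluster size are assumptions for illustration only.

    BRAIN_POWER_W = 20        # ~20 W, running (and "training") continuously
    BRAIN_STORAGE_PB = 2      # rough order-of-magnitude estimate

    GPU_POWER_W = 700         # assumed per-accelerator draw
    CLUSTER_GPUS = 10_000     # assumed size of a large training cluster

    cluster_power_w = GPU_POWER_W * CLUSTER_GPUS
    print(f"cluster draws ~{cluster_power_w / BRAIN_POWER_W:,.0f}x one brain's power")
    # -> cluster draws ~350,000x one brain's power

    brain_energy_per_day_kwh = BRAIN_POWER_W * 24 / 1000
    print(f"one brain uses ~{brain_energy_per_day_kwh:.2f} kWh per day")
    # -> one brain uses ~0.48 kWh per day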
| ||||||||||||||||||||||||||||||||