tomp 7 hours ago
Did we read the same article?

They clearly mention, take into account and extrapolate this: LLMs first scaled via data, now it's test-time compute, but recent developments (R1) clearly show this is not exhausted yet (i.e. RL on synthetically (in silico) generated CoT), which implies scaling with compute. The authors then outline further potential (research) developments that could continue this dynamic, literally things that have already been discovered, just not yet incorporated into cutting-edge models.

Real-world data confirms their thesis. There have been a lot of sceptics about AI scaling, somewhat justified ("foom", a.k.a. fast take-off, hasn't happened - yet), but the sceptics' fundamental thesis has been wrong: "real-world data has been exhausted, next algorithmic breakthroughs will be hard and unpredictable". The reality is, while data has been exhausted, incremental research efforts have resulted in better and better models (o1, R1, o3, and now Gemini 2.5, which is a huge jump! [1]). This is similar to how Moore's Law works - it's not a given that CPUs get exponentially better, it still requires effort, maybe with diminishing returns, but nevertheless the law holds...

If we ever get to models that are able to usefully contribute to research, either on the implementation side or on the research-ideas side (which they CANNOT yet, at least Gemini 2.5 Pro (public SOTA), unless my prompting is REALLY bad), it's about to get super-exponential.

Edit: then once you get to actual general intelligence (let alone super-intelligence), the real-world impact will quickly follow.
Jianghong94 7 hours ago
Well, based on what I'm reading, the OP's point is that not all validation (hence 'fully'), and perhaps not even most of it, can be done in silico. I think we all agree on that, and it's the major bottleneck to making agents useful: you have to have a human in the loop to closely guardrail the whole process. Of course you can get a lot of mileage from synthetically generated CoT, but whether that leads to LLMs speeding up LLM development is a big IF.