aoeusnth1 9 hours ago

> Wobbly assumption that increasing the size of these models yields better performance.

I'm assuming you disagree that larger models are better? Can you expand on what indicates that AI will hit a wall in scaling given the evidence of the last 9 years of scaling transformers (or other models)? Where on the plot does the line go from exponential to flat?

monodeldiablo 9 hours ago | parent | next [-]

Leaks from within OpenAI have made it pretty clear that they've been struggling to achieve significant improvements lately by simply scaling up parameter count. Experts like LeCun have also been vocal that blindly scaling up is a dead end.

(Incidentally, the line of skill improvement isn't "exponential". Improvements per generation have been incremental, but generations have been coming thick and fast of late, and parameter counts have grown exponentially since 2017.)

Speaking more broadly, LLMs don't have to "hit a wall" in scaling to become uneconomical. If incremental improvement continues to come at exponential cost, the fundamental value argument falls apart.

Setting all that aside, even presuming that model performance scales linearly with dimensionality, there are fundamental limits to the size of the training corpora. Quality training data is not unbounded. Given a fixed-size corpus of training data, there's a hard theoretical limit to how much meaning and inference a model can wring out of it.

And then there are other issues with the whole business model. For one thing, it's predicated on continual full-scale retraining to achieve even modest gains in skill and relevancy. Topical and targeted learning requires a full retraining. Et cetera.

I think that the next generation of AI will lean more heavily on RL to be useful beyond a few months. I also think that the energy requirements of a particular technology are a good proxy for whether it's got a realistic future.

emp17344 9 hours ago | parent | prev | next [-]

Why do you believe progress is currently exponential? There’s one dubious chart showing “exponential growth” in a single narrow domain, and otherwise zero evidence to suggest exponential improvement.

danaris 7 hours ago | parent | prev | next [-]

The evidence is the last 9 years of scaling.

The curve flattened out years ago. The exponential was going from GPT-2 to GPT-4 (or thereabouts). After that, it was painfully obvious to anyone observing without a vested interest in believing otherwise that the progress had slowed.

Now, it's not just that progress has slowed: it's that the exponential has reversed. In order to get marginal gains, they have to throw exponentially more hardware at the training.

functional_dev 3 hours ago | parent [-]

Even if training is hitting a wall, I think they are shifting more toward the reasoning phase to get better results... and that is inference compute scaling.

righthand 9 hours ago | parent | prev [-]

In my experience, the models haven't gotten any better, just the hype.

heavyset_go 9 hours ago | parent [-]

And companies know this, hence the heavy astroturfing. If their new product has minimal improvements, they'll just gaslight you into thinking otherwise.