danaris 7 hours ago

The evidence is the last 9 years of scaling.

The curve flattened out years ago. The exponential phase ran roughly from GPT-2 to GPT-4. After that, it was painfully obvious to anyone observing without a vested interest in believing otherwise that progress had slowed.

Now, it's not just that progress has slowed: it's that the exponential has reversed. To get marginal gains, they have to throw exponentially more hardware at training.
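A rough sketch of the claim above (my illustration, not from the comment): if loss falls as a power law in training compute, Chinchilla-style, then each equal absolute drop in loss requires a growing *multiple* of the previous compute. The constants `a` and `alpha` here are arbitrary placeholders, not fitted values.

```python
def compute_for_loss(L, a=1.0, alpha=0.05):
    """Invert a hypothetical power law L = a * C**(-alpha)
    to get the compute C needed to reach loss L."""
    return (a / L) ** (1 / alpha)

prev = None
for L in [0.50, 0.45, 0.40, 0.35]:
    C = compute_for_loss(L)
    # Each fixed 0.05 drop in loss costs a larger compute multiplier
    # than the one before it.
    if prev is not None:
        print(f"loss={L:.2f}  compute x{C / prev:.1f} vs previous step")
    prev = C
```

Under these toy constants, the compute multiplier per step itself keeps growing, which is the "exponentially more hardware for marginal gains" shape, whether or not the real frontier curves have actually flattened that hard.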

functional_dev 3 hours ago | parent [-]

Even if training is hitting a wall, I think they are shifting more to the reasoning phase to get better results... and that is inference-compute scaling.