scellus 6 days ago

Even with pretraining, there's no hard limit or wall in raw performance, just diminishing returns for current applications, plus a business rationale to serve lighter models given current infrastructure, pricing, and use cases. Algorithmic efficiency of inference at a given performance level has also improved by a couple of orders of magnitude since 2022 (a major part of that surely comes from model architecture and training methods).
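To make the scale concrete, here's a rough back-of-envelope sketch; the dollar figure is purely hypothetical, the point is just that two orders of magnitude means roughly a 100x cost reduction at a fixed capability level:

```python
# Hypothetical illustration of "a couple of OOMs" in inference efficiency.
cost_per_mtok_2022 = 60.0   # assumed $/1M tokens at some fixed capability level (made up)
ooms_gained = 2             # "a couple of orders of magnitude"
cost_now = cost_per_mtok_2022 / 10 ** ooms_gained

print(f"${cost_per_mtok_2022:.2f} -> ${cost_now:.2f} per 1M tokens "
      f"(~{10 ** ooms_gained}x cheaper for the same performance)")
```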

And research itself seems to be bottlenecked by compute.