Remix.run Logo
bdamm 3 hours ago

Is this a known or quantifiable thing? I thought that the limit had already been determined i.e. the existing models top out and at some point it doesn't matter how much time or energy you let the model consume, it won't improve the result. And with regards to training parameters, I thought we were equally limited there, e.g. the existing models can't benefit from a larger parameter space.

I was under the impression that improvements are arriving via how the models are trained and how model prompting context is constructed, rather than just by how much data or how much energy is spent searching over the model space for a particular prompt.

Is there some evidence that we have not reached a pleateau with just resource consumption on existing models?