somenameforme a day ago
In every domain that uses neural networks, there always comes a point of sharply diminishing returns. You 100x the compute and get a 5% performance boost. Then at some point you 1000x the compute and your performance actually declines due to overfitting. And I think we can already see this. The gains in LLMs are increasingly marginal. There was a huge jump going from glorified Markov chains to something able to consistently produce viable output, but since then each generation of updates has been less and less distinguishable from the last, to the point that if somebody had to use an LLM for an hour and guess its 'recency'/version, I suspect the results would be scarcely better than random. That's not to say that newer systems aren't improving - they obviously are - but it's harder and harder to recognize those changes without having the immediate predecessor on hand to compare against.
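A back-of-the-envelope version of the diminishing-returns point, assuming a toy power-law loss curve - the constants (a = 10, exponent b = 0.05) are made up purely for illustration, not fit to any real model:

    # Toy sketch: loss as a hypothetical power law in training compute,
    # L(C) = a * C**(-b). Constants are illustrative only.
    def loss(compute: float, a: float = 10.0, b: float = 0.05) -> float:
        return a * compute ** (-b)

    base = 1.0
    for factor in (1, 10, 100, 1000):
        improvement = 1 - loss(base * factor) / loss(base)
        print(f"{factor:>5}x compute -> {improvement:.1%} lower loss")

    # Illustrative output: 10x compute -> ~11% lower loss, 100x -> ~21%,
    # 1000x -> ~29%. Each additional 10x shaves a smaller absolute amount
    # off the loss than the previous 10x did.

Under that (assumed) curve, each extra order of magnitude of compute buys a smaller absolute improvement than the last, which is roughly why successive model versions feel harder and harder to tell apart.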