zwnow 2 days ago:
You are thankfully wrong. I watch lots of talks on the topic from actual experts. New models are just old models with more tooling. Training data is exhausted, and it's a real issue.
TeMPOraL a day ago (replying to zwnow):
Well, my experts disagree with your experts :). Sure, the supply of fresh data is running out, but at the same time there's far more data than needed; most of it is low-quality noise anyway. New models aren't just old models with more tooling: the entire training pipeline has been evolving, as researchers and model vendors focus on making better use of the data they already have and on refining the training datasets themselves. There are more stages to LLM training than just pre-training (supervised fine-tuning, preference tuning like RLHF, etc.) :).
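To make the "refining training datasets" point concrete, here's a toy sketch of the kind of heuristic quality filter applied to raw web corpora before pre-training. The thresholds and names are invented for illustration, not any vendor's actual pipeline:

    # Toy heuristic corpus filter, the kind of step applied before any
    # training happens. Thresholds and names are made up for illustration.

    def looks_low_quality(doc: str) -> bool:
        """Flag documents that are probably noise rather than useful prose."""
        words = doc.split()
        if len(words) < 50:
            return True                  # too short to carry much signal
        if len(set(words)) / len(words) < 0.3:
            return True                  # heavy repetition: spam, boilerplate
        if sum(c.isalpha() for c in doc) / len(doc) < 0.6:
            return True                  # mostly markup/symbols, not text
        return False

    raw_corpus = ["some scraped document ...", "buy buy buy buy buy ..."]
    kept = [d for d in raw_corpus if not looks_low_quality(d)]

Real curation pipelines (the ones behind corpora like C4 or FineWeb) layer deduplication, language ID, and classifier-based quality scoring on top of simple heuristics like these; that's a big part of the "making better use of the data they have" story.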
GrumpyGoblin 2 days ago (replying to zwnow):
Not saying it's not a problem, I actually don't know, but new CPUs are just old models with more improvements/tooling. Same with TVs. And cars. And clothes. Everything is. That's how improving things works. Running out of raw data doesn't mean running out of room for improvement. The data has been much the same for the last 20 years and AI isn't new, yet things keep improving anyway.