resoluteteeth 2 days ago
It's pointless to write a whole article claiming that model collapse is actually happening and isn't just a theoretical concern while presenting no evidence that model collapse is actually happening.
janalsncm 2 days ago | parent
It isn’t pointless. The author cited research demonstrating that model collapse can happen on a small scale, and also cited sources indicating that a larger and larger portion of the web will be written by language models. There are already studies showing that LLM-generated text is less diverse than human-generated text:

https://techxplore.com/news/2026-03-llms-creativity-ai-respo...

https://arxiv.org/html/2501.19361

These studies don’t show that the lack of creativity in LLMs is caused by model collapse, or that the problem is getting worse. But 1) we know LLM output is less diverse, and 2) we know that training on synthetic data can cause model collapse.
locknitpicker 2 days ago | parent
> It's pointless to write a whole article about how model collapse is actually happening and isn't just a theoretical concern with no evidence that model collapse is actually happening.

Except perhaps the link to the article on the peer-reviewed paper that describes the problem in detail:

https://www.cs.ox.ac.uk/news/2356-full.html

> Researchers at Oxford and Cambridge published work on this back in 2023, showing how iterative training on synthetic data leads to progressive degradation.
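The degradation dynamic that work describes can be illustrated with a toy simulation (my own sketch, not code from the paper): "train" a trivial model by fitting a Gaussian to the data, sample a fresh synthetic dataset from the fit, and repeat, so each generation only ever sees the previous generation's output. The fitted variance shrinks across generations and the distribution's tails vanish:

```python
import random
import statistics

def fit_and_resample(data):
    # "Train" a toy model: fit a Gaussian (sample mean and population
    # std dev), then emit a fresh synthetic dataset drawn from that fit.
    mu = statistics.fmean(data)
    sigma = statistics.pstdev(data)
    return [random.gauss(mu, sigma) for _ in data]

random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(20)]  # generation 0: "human" data
stds = [statistics.pstdev(data)]
for _ in range(1000):
    # Each generation is trained only on the previous generation's output.
    data = fit_and_resample(data)
    stds.append(statistics.pstdev(data))

print(f"std dev: generation 0 = {stds[0]:.4f}, generation 1000 = {stds[-1]:.4g}")
```

With a small sample size the estimated standard deviation is a noisy, slightly biased copy of the true one, so iterating the loop drives the variance toward zero: the tails of the original distribution are lost first, then diversity altogether. Real LLM training is vastly more complex, but this is the same qualitative mechanism the 2023 paper analyzes.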