qsera · 4 hours ago
No one is saying that it cannot do what you say now. What I am saying is that once the high-quality training data runs out, its capabilities will drop pretty fast. That is why I compare it to a perpetual-motion-machine scam. A perpetual motion machine appears as if it will run indefinitely, and that is analogous to the impression you have now: you feel that this will go on forever, and that is the scam you are falling for.
|
WarmWash · 3 hours ago
> What I am saying is that once the high quality training data runs out, it will drop in its capabilities pretty fast.

That's more a misunderstood study that, over time, turned into a confidently stated fact. Yes, the model collapses if you naively loop its output back into its input. But no, that's not how it's done. The reality is that all the labs are already using synthetic training data, and have been for at least a year now. It turned out to be a non-issue if you have robust monitoring and curation in place for the generated data.
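The collapse-vs-curation distinction can be sketched with a toy resampling experiment (a hypothetical illustration only, not how any lab actually trains). Repeatedly "training" on your own samples makes the dataset's diversity shrink generation by generation, while mixing in even a modest slice of fresh real data keeps it stable. The vocabulary size, generation count, and 20% fresh fraction below are all made-up parameters for the demo:

```python
import random

random.seed(0)
TRUE_VOCAB = range(50)  # stand-in for the "real" data distribution

def generation(data, fresh_fraction=0.0):
    """One train-on-own-output step: resample from the current dataset,
    optionally mixing in a slice of fresh real data (a crude proxy for
    curation / continued collection of human data)."""
    n_fresh = int(len(data) * fresh_fraction)
    resampled = [random.choice(data) for _ in range(len(data) - n_fresh)]
    fresh = [random.choice(TRUE_VOCAB) for _ in range(n_fresh)]
    return resampled + fresh

naive = list(TRUE_VOCAB)    # pure output -> input loop
curated = list(TRUE_VOCAB)  # same loop, but 20% fresh data each round
for _ in range(200):
    naive = generation(naive)
    curated = generation(curated, fresh_fraction=0.2)

# Diversity (count of distinct symbols) collapses only in the naive loop.
print(len(set(naive)), len(set(curated)))
```

The naive loop is a neutral-drift process whose support can only shrink, so it fixates on a handful of symbols; the curated loop reaches a stable equilibrium well away from collapse.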
qsera · 3 hours ago

> using synthetic training data

Yeah, look up how it is done. It is exactly how a perpetual-motion-machine scam would project an appearance of working: a generator driving a motor, and the motor driving the generator, with something obscuring the fact that energy is being lost along the way.
WarmWash · 2 hours ago

I'm confused by the point you are trying to make, because they are using synthetic data and the models are getting stronger. There is no "conservation of fallacy" law (bad data must conserve its level of badness), so I'm struggling to connect the dots on the analogy, unless I ignore the fact that training on synthetic data works, is being used, and the models are getting better.
qsera · an hour ago

If training that did not use synthetic data failed to capture some aspect of the information contained in it, then data synthesized from the original data could help the model capture that aspect, and so the models could get better. But that is only because the synthetic data helped the model capture what was already there in the training data. Once all such information has been extracted, it is not possible to use synthetic data, or anything else derived from the original data, to create "new" information for training.
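The argument above is essentially the data-processing inequality from information theory: for any function g applied to data Y alone, I(X; g(Y)) ≤ I(X; Y), so post-processing can surface information but never add it. A small exact computation on a toy discrete channel illustrates this (the distributions and the "synthesizer" g are made up for the demo):

```python
import math
from itertools import product

def mutual_info(joint):
    """Exact I(A;B) in bits from a joint pmf given as {(a, b): p}."""
    pa, pb = {}, {}
    for (a, b), p in joint.items():
        pa[a] = pa.get(a, 0.0) + p
        pb[b] = pb.get(b, 0.0) + p
    return sum(p * math.log2(p / (pa[a] * pb[b]))
               for (a, b), p in joint.items() if p > 0)

# X uniform over 4 symbols; Y is a noisy copy of X.
noise = [0.7, 0.1, 0.1, 0.1]  # P(Y = (X + k) mod 4)
joint_xy = {(x, (x + k) % 4): 0.25 * noise[k]
            for x, k in product(range(4), range(4))}

# Z = g(Y): any further processing of Y alone (a stand-in for
# synthesizing training data from the original data).
joint_xz = {}
for (x, y), p in joint_xy.items():
    z = y % 2
    joint_xz[(x, z)] = joint_xz.get((x, z), 0.0) + p

i_xy = mutual_info(joint_xy)
i_xz = mutual_info(joint_xz)
print(f"I(X;Y) = {i_xy:.3f} bits, I(X;g(Y)) = {i_xz:.3f} bits")
assert i_xz <= i_xy + 1e-12  # data-processing inequality holds
```

This supports the narrow claim that derived data cannot exceed the information in the original; it doesn't settle the empirical question of how much unextracted information current training leaves on the table.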
dgb23 · an hour ago

Better by which metrics?
|
|
|
|
_aavaa_ · 4 hours ago
Why would the capabilities drop instead of stagnating?
qsera · 4 hours ago

Because technologies, programming languages, and best practices won't stay frozen. If LLMs cannot keep up with them, I think that can be considered a drop in capability. No?
coldtea · 3 hours ago

Close, but no. What will happen is that "technologies, programming languages, best practices" will stay frozen, because human innovation will drop and the whole field will stagnate.
californical · 2 hours ago

This is the biggest fear! I don't see an easy fix. Will the developer of a new programming language be able to hand model companies a huge amount of training data, ensuring that the models are good at that new language? I don't think a small team can write enough code; the models already struggle with medium-popularity languages that have years of history. They sometimes hallucinate Lua functionality, for example, even though I'm sure there is a lot of Lua code out there.

So if most people use coding agents, we're stuck with the current most popular languages: no new language will get past the barrier of having enough code for models to write it well, so nobody adopts the new language, and so on. Same thing with libraries and frameworks. Technical decisions are already being made based on "is this popular enough that the agents can use it well?" rather than on a newer library that meets our needs perfectly but isn't in the training data.
|
|
|