Remix.run Logo
qsera 2 hours ago

That is like saying we can get unlimited data compression by feeding the output of a data compressing program into its own input..

geremiiah 2 hours ago | parent [-]

Not it's not like saying that at all.

qsera an hour ago | parent [-]

Yes, that is exactly like it.

Generating new training data from existing data will only generate patterns that already exist in the training data. It might help LLMs to capture it if has not already. But it can never generate novel patterns.

For example, Imagine some kind of neural network architecture that can do OCR. You might be able to generate variations of the letters it already know using some technique and use it to better train the recognition of already known letters.

But it would never be able to generate letters that it does not know.