the_af 7 hours ago

> Data sharing agreements permitting, today's inference runs can be tomorrow's training data. Presumably the models are good enough at labeling promising chains of thought already.

Wouldn't this lead to model collapse?

littlestymaar 7 hours ago | parent [-]

Not necessarily, as demonstrated by the massive success of synthetic data.
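A toy sketch of why filtered synthetic data need not collapse (hypothetical names throughout, not any real training pipeline): if a verifier checks each generation before it re-enters the training set, the kept data can be cleaner than the model that produced it.

```python
import random

def toy_model(a, b):
    """A noisy 'model' that sometimes gets addition wrong."""
    noise = random.choice([0, 0, 0, 1, -1])  # wrong roughly 40% of the time
    return a + b + noise

def verifier(a, b, answer):
    """Ground-truth check -- the filter that keeps synthetic data clean."""
    return answer == a + b

random.seed(0)
raw, kept = [], []
for _ in range(1000):
    a, b = random.randint(0, 99), random.randint(0, 99)
    ans = toy_model(a, b)
    raw.append((a, b, ans))
    if verifier(a, b, ans):
        kept.append((a, b, ans))  # only verified samples become training data

print(f"generated: {len(raw)}, kept after filtering: {len(kept)}")
# every surviving sample is correct, even though the generator often wasn't
assert all(a + b == ans for a, b, ans in kept)
```

The collapse scenario assumes errors compound across generations; a selection step (a verifier, human rating, or a reward model labeling "promising chains of thought") breaks that loop by discarding the bad samples before retraining.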

the_af 4 hours ago | parent [-]

Could you elaborate?

nhecker 3 hours ago | parent [-]

EDIT: probably not relevant, after re-re-reading the comment in question.

Presumably littlestymaar is talking about all the LLM-generated output that's publicly available on the Internet (of varying quality but significant quantity) and there for the scraping.