sejje a day ago

I bet they'll only train on the internet snapshot from now, before LLMs.

Additional non-internet training material will probably be human created, or curated at least.

pc86 a day ago | parent | next [-]

This only makes sense if the rate of LLM hallucinations is much higher than the rate at which things written online are flat-out wrong (it's definitely not).

sosodev a day ago | parent | prev [-]

Nope. Pretraining runs have been moving forward with internet snapshots that include plenty of LLM content.

sejje a day ago | parent [-]

Sure, but not all of them are stupid enough to keep doing that while watching the model degrade — if it indeed does.