| ▲ | sejje a day ago | |||||||
I bet that from now on they'll only train on internet snapshots from before LLMs. Additional non-internet training material will probably be human-created, or at least human-curated. | ||||||||
| ▲ | pc86 a day ago | |||||||
This only makes sense if the percentage of LLM output that is hallucinated is much higher than the percentage of things written online that are flat wrong (it's definitely not). | ||||||||
| ▲ | sosodev a day ago | |||||||
Nope. Pretraining runs have been moving forward with internet snapshots that include plenty of LLM content. | ||||||||