robryan 2 days ago

They will either pay for it to be generated or get good enough at producing synthetic data that actually improves LLM quality.

croes 2 days ago | parent [-]

So either even higher costs, or the hope that a big problem of LLMs gets solved somehow.

Given how much data they need, that will be pretty expensive, I mean really, really expensive. How many people can write good training data, and how much can each of them produce per day?

Doesn’t sound sustainable.