notarealllama 14 hours ago

My understanding is the opposite; see the papers on "synthetic" data training. They use a small bit of real data to generate lots of synthetic data and get usable results.
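A toy sketch of the idea, not taken from any particular paper: here the "generator" is assumed to be nothing more than a Gaussian fitted to a handful of real measurements, and `make_synthetic` and the sample values are made up for illustration. Real pipelines use far richer generators, but the shape is the same: small real set in, large synthetic set out.

```python
import random
import statistics

def make_synthetic(real_samples, n_synthetic, seed=0):
    """Fit a Gaussian to a small real sample and draw many synthetic points.

    Hypothetical helper for illustration only; the Gaussian generator is an
    assumption, not the method of any specific paper.
    """
    rng = random.Random(seed)
    mu = statistics.mean(real_samples)      # centre of the real data
    sigma = statistics.stdev(real_samples)  # spread of the real data
    return [rng.gauss(mu, sigma) for _ in range(n_synthetic)]

# Five made-up "real" measurements stand in for the small real dataset.
real = [9.8, 10.1, 10.4, 9.9, 10.3]
synthetic = make_synthetic(real, 10_000)

print(len(synthetic), round(statistics.mean(synthetic), 2))
```

Everything the synthetic set "knows" comes from those five real points, so it cannot cover cases the real data never saw.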

The bias leans towards overfitting the data, which is fine in some use cases, such as missile or drone design, where the model doesn't need broad comparisons like 747s or artillery to complete its training.

Kind of like neural net backpropagation, but in terms of model/weights.