lastdong 7 days ago

In my opinion, we need more models trained on fully traceable and clean data instead of closed models that we later find out were trained on Reddit and Facebook discussion threads.

johntash 4 days ago

I want to see something trained _only_ on stuff like encyclopedias, programming books, etc. I'm interested in how different it would be compared to something with a lot of social media in it.

ekianjo 4 days ago

Better to do a fine-tune or a LoRA than a full retraining from scratch.