Balinares 3 days ago

I'm getting so annoyed with the omnipresent trend among mainstream models of cramming in more and more data and advertising that as an improvement.

One, that's got to be a recipe for All Overfit All The Time, or at least I don't understand how you avoid overfitting when the expected output is a reconstruction of atomic, individual facts. And two, this mass of embedded parameters has got to make the models costlier and less efficient to run, as well as plain less useful, than if they were backed by e.g. knowledge graphs (ideally annotated with sources of truth) and optimized toward querying such graphs robustly, as opposed to trying and necessarily failing to remember the contents in exhaustive detail.
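
To make the second point a bit more concrete, here is a toy Python sketch of the kind of thing I mean: facts live in a small graph, each one carries its source, and the answering layer either cites what it found or says it doesn't know. Everything here (`Fact`, `KnowledgeGraph`, `answer`, the `placeholder:atlas` source string) is made up for illustration; in a real system the model's job would be turning a question into the query, which this sketch skips entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    predicate: str
    obj: str
    source: str  # hypothetical provenance label, not a real citation


class KnowledgeGraph:
    """Minimal in-memory store of sourced facts, indexed by (subject, predicate)."""

    def __init__(self):
        self._index = {}

    def add(self, fact):
        self._index.setdefault((fact.subject, fact.predicate), []).append(fact)

    def query(self, subject, predicate):
        # Return a copy so callers cannot mutate the index.
        return list(self._index.get((subject, predicate), []))


def answer(graph, subject, predicate):
    """Answer from the graph with sources, or admit ignorance instead of guessing."""
    facts = graph.query(subject, predicate)
    if not facts:
        return f"I don't know the {predicate} of {subject}."
    return "; ".join(f"{f.obj} [source: {f.source}]" for f in facts)


graph = KnowledgeGraph()
graph.add(Fact("France", "capital", "Paris", "placeholder:atlas"))

print(answer(graph, "France", "capital"))
print(answer(graph, "Atlantis", "capital"))
```

The point of the design is the second call: when the graph has nothing, the answer is an explicit "don't know" rather than a fact-shaped guess.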

Model weights are a terrible way to store data. Surely I can't be the only nerd out there who feels that a model should not try to be an encyclopedia and should certainly never pretend to be one?

I suppose it boils down to marketing. Models are sold as "smart", and what smart is supposed to look like in Western culture is confidently spouting fact-shaped sentences about any topic. So that's what we're getting. What a waste.