Remix.run Logo
CuriouslyC 6 hours ago

Parameter size gets you world knowledge and better persistence of behavior as context grows. Both of those things can be engineered around to a large degree, and the latest Qwen models show that small models can be quite smart in narrow domains and short time windows.

alfiedotwtf an hour ago | parent [-]

… maybe we should just teach models how to get their world knowledge from a local Postgres connection! Then the model can be tiny, and it can query to its little heart desires AND run on commodity hardware TODAY!