kybernetikos 3 hours ago

GPT-3.5, as used in the first commercially available ChatGPT, is believed to have hundreds of billions of parameters. There are now models I can run on my phone that feel like they have similar levels of capability.

Phones are never going to run the largest models locally because they simply don't have the memory, but capability at small sizes keeps improving: you can run a model on your phone today that matches what required hundreds of billions of parameters less than six years ago.
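
For a concrete sense of what "run a model on your phone" looks like, here's a minimal sketch using llama-cpp-python with a small quantized model; the model file name is a placeholder, not a recommendation:

    from llama_cpp import Llama

    # Load a small quantized model (a few billion parameters, roughly
    # 2 GB on disk). The file path is hypothetical; any GGUF model works.
    llm = Llama(model_path="models/small-chat-q4.gguf", n_ctx=2048)

    out = llm("Q: What does GPT stand for? A:", max_tokens=32, stop=["Q:"])
    print(out["choices"][0]["text"])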

johnsmith1840 3 minutes ago | parent | next

Sure, but the moment you can use that small model locally, its capabilities are no longer differentiated or valuable, no?

I suppose the future will look exactly like now: some mixture of local and non-local.

I guess my argument is that a market dominated by local models doesn't seem right; I think the balance will look similar to what it is right now.

onion2k 2 hours ago | parent | prev

The G in GPT stands for Generalized. You don't need that for specialist models, so the size can be much smaller. Even coding models are quite general, as they don't focus on a single language or domain. I imagine a model specifically for something like React could be very effective with a couple of billion parameters, especially if it were a distillation of a more general model.
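
For what it's worth, a minimal sketch of the standard distillation loss in PyTorch; the temperature and the toy shapes are illustrative, not tuned:

    import torch
    import torch.nn.functional as F

    def distill_loss(student_logits, teacher_logits, T=2.0):
        # Soften both distributions with temperature T, then push the
        # student toward the teacher with KL divergence. The T*T factor
        # keeps gradient magnitudes comparable across temperatures.
        s = F.log_softmax(student_logits / T, dim=-1)
        t = F.softmax(teacher_logits / T, dim=-1)
        return F.kl_div(s, t, reduction="batchmean") * (T * T)

    # Toy usage: batch of 4 examples over a 32-token vocabulary.
    loss = distill_loss(torch.randn(4, 32), torch.randn(4, 32))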

christkv 30 minutes ago | parent | next

That's what I want: an orchestrator model that operates with a small context, and then very specialized small models for React etc.
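
A toy sketch of that shape, with plain keyword matching standing in for the orchestrator; in practice the router would itself be a small model, and all model names here are made up:

    # Map task keywords to hypothetical small specialist models.
    SPECIALISTS = {
        "react": "react-coder-2b",
        "sql": "sql-coder-2b",
    }

    def route(prompt: str) -> str:
        # A real orchestrator would classify the request with a small
        # model; keyword matching just illustrates the dispatch step.
        p = prompt.lower()
        for keyword, model in SPECIALISTS.items():
            if keyword in p:
                return model
        return "general-4b"  # fall back to a small generalist

    print(route("Write a React hook that debounces input"))  # react-coder-2b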

MzxgckZtNqX5i 2 hours ago | parent | prev

I'll be that guy: the "G" in GPT stands for "Generative".