mustyoshi | 3 days ago
Yeah, this is the thing people often miss. 7B and 32B models work perfectly fine for a lot of things, and they run on previously high-end consumer hardware. But we're still in the hype phase; people will come to their senses once large-model performance starts to plateau.
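Back-of-the-envelope on why that hardware claim holds: a quantized model's footprint is roughly parameter count times bytes per weight. A minimal sketch in Python, assuming 4-bit quantization and a ~20% overhead factor for KV cache and activations (my assumptions, not figures from the thread):

    # Rough VRAM estimate: params * bytes-per-weight, plus ~20%
    # overhead for KV cache and activations (assumed, not measured).
    def vram_gb(params_billion: float, bits_per_weight: int = 4,
                overhead: float = 1.2) -> float:
        bytes_total = params_billion * 1e9 * (bits_per_weight / 8)
        return bytes_total * overhead / 1e9

    for size in (7, 32):
        print(f"{size}B @ 4-bit: ~{vram_gb(size):.1f} GB")
    # 7B @ 4-bit: ~4.2 GB, 32B @ 4-bit: ~19.2 GB

Under those assumptions a 7B model fits in 8 GB of VRAM and a 32B model in a 24 GB card like an RTX 3090/4090, i.e. last-generation high-end consumer hardware.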
_heimdall | 3 days ago
I expect people to come to their senses when LLM companies stop subsidizing costs and start charging customers what it actually costs to train and run these models.
zamadatix | 3 days ago
People don't want to guess which size of model is right for a task, and current systems are neither good nor efficient at estimating that automatically. As performance plateaus, I see only power users tweaking more and more; the average user will only change models when it happens automatically.
bakugo | 3 days ago
> 7B and 32B models work perfectly fine for a lot of things

Like what? People always talk about how amazing it is that they can run models on their own devices, but they rarely mention what they actually use them for. For most use cases, small local models will always perform significantly worse than even the most inexpensive cloud models like Gemini Flash.