bakugo 3 days ago
> 7-32B models work perfectly fine for a lot of things

Like what? People always talk about how amazing it is that they can run models on their own devices, but they rarely mention what they actually use them for. For most use cases, small local models will always perform significantly worse than even the cheapest cloud models like Gemini Flash.
totaa 3 days ago | parent
Gemma 3n E4B has been crazy good for me: a fine-tune running on Google Cloud Run via Ollama, completely avoiding token-based pricing at the cost of throughput limitations.
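For anyone curious, a setup like the one described above can be queried with plain HTTP once Ollama is running behind Cloud Run, since Ollama exposes a `/api/generate` endpoint. This is just a sketch: the service URL and model tag below are placeholders, not the commenter's actual deployment.

```python
import json
import urllib.request

# Hypothetical Cloud Run service URL; in practice this is whatever URL
# Cloud Run assigns to the deployed Ollama container.
OLLAMA_URL = "https://ollama-example-uc.a.run.app"


def build_generate_request(model: str, prompt: str) -> dict:
    # Payload shape for Ollama's /api/generate endpoint. stream=False
    # asks for the whole completion in a single JSON response instead
    # of a stream of partial chunks.
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    # POST the request and pull the completion text out of the
    # "response" field of Ollama's JSON reply.
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

Because billing is per container-second rather than per token, cost scales with how long the instance is up, which is the trade-off mentioned above: cheap sustained use, but throughput capped by the instance's hardware.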