datadrivenangel 8 hours ago
In my experience, once you get to ~30 GB of RAM for a model like Gemma4, the rest of the 128 GB of memory is simply nice to have. The speed and cost are what make it tough, though: it's slower and more expensive than the same model served on a big accelerator card, and it's going to be worse than a frontier model.
digitaltrees 7 hours ago | parent
I wonder if it really needs to be worse. I'm playing with the idea of fine-tuning a model on my exact stack and coding patterns. I suspect I could get better performance by training "taste" into a model rather than breadth.
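As a rough illustration of that idea, here is a minimal sketch of a LoRA fine-tune on snippets pulled from your own repositories, using Hugging Face transformers and peft. The base checkpoint, dataset file, and hyperparameters are placeholder assumptions, not anything the commenter specified.

    # Minimal LoRA fine-tune on a personal code-snippet dataset (sketch).
    # Assumptions: base model name, "my_repo_snippets.jsonl" (one {"text": ...}
    # record per function/file from your codebase), and all hyperparameters.
    from datasets import load_dataset
    from peft import LoraConfig, get_peft_model
    from transformers import (AutoModelForCausalLM, AutoTokenizer, Trainer,
                              TrainingArguments, DataCollatorForLanguageModeling)

    base = "google/gemma-2-9b"  # assumed base checkpoint
    tok = AutoTokenizer.from_pretrained(base)
    model = AutoModelForCausalLM.from_pretrained(base, device_map="auto")

    # Low-rank adapters keep the trainable parameter count small enough to
    # train alongside the frozen base weights on a single workstation.
    model = get_peft_model(model, LoraConfig(
        r=16, lora_alpha=32, lora_dropout=0.05, task_type="CAUSAL_LM"))

    data = load_dataset("json", data_files="my_repo_snippets.jsonl")["train"]
    data = data.map(lambda ex: tok(ex["text"], truncation=True, max_length=1024),
                    remove_columns=data.column_names)

    Trainer(
        model=model,
        args=TrainingArguments(output_dir="stack-lora", num_train_epochs=2,
                               per_device_train_batch_size=1,
                               gradient_accumulation_steps=8,
                               learning_rate=2e-4),
        train_dataset=data,
        data_collator=DataCollatorForLanguageModeling(tok, mlm=False),
    ).train()

The resulting adapter would then be loaded on top of the same base model at inference time; whether that actually buys "taste" over breadth is the open question the comment raises.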