Aurornis | 3 days ago
FYI there are a number of Strix Halo boards and computers on the market already. The Framework version looks to be high quality and well supported, but it’s not the only option in this space. Also take a good hard look at the token output speeds before investing. If you’re expecting quality, context windows, and output speeds similar to the hosted providers, you’re probably going to be disappointed. There are a lot of tradeoffs with a local machine.
jchw | 3 days ago
> Also take a good hard look at the token output speeds before investing. If you’re expecting quality, context windows, and output speeds similar to the hosted providers you’re probably going to be disappointed. There are a lot of tradeoffs with a local machine.

I don't really expect to see performance on par with the SOTA hosted models, but I'm mainly curious what you could do with local models that wouldn't otherwise be doable with hosted models (or at least, that you wouldn't want to do with them for other reasons, like privacy).

One thing I've realized lately is that Gemini, and even Gemma, are really, really good at transcribing images: much better and more versatile than OCR models, since they can also describe the images. With the realization that Gemma, a model you can self-host, is good enough to be useful, I have been tempted to play around with doing this sort of task locally. But again, $2,000 tempted? Not really. I'd need to find other good uses for the machine than just dicking around.

In theory, Gemma 3 27B BF16 would fit very easily in system RAM on my primary desktop workstation, but I haven't given it a go to see how slow it is. I think you mainly get memory-bandwidth constrained on these CPUs, but I wouldn't be surprised if the full BF16 or a relatively light quantization gives tolerable t/s.

Then again, right now AI Studio gives you better t/s than you could hope to get locally, with a generous amount of free usage. So maybe it would make sense to wait until the free lunch ends, but I don't want to build anything interesting that relies on the cloud, because I dislike the privacy implications, even though everything I'm interested in doing is fully within the ToS.
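For anyone curious what that image-transcription idea looks like against a self-hosted model, here is a minimal sketch in Python. It assumes a local Gemma 3 27B served behind an OpenAI-compatible endpoint (e.g. Ollama or llama.cpp's server on localhost:11434); the model tag, port, and file name are placeholder assumptions, not details from the comment above.

    import base64
    from openai import OpenAI

    # Assumed: a local OpenAI-compatible server (e.g. Ollama) on port 11434
    # serving a vision-capable Gemma 3 build; adjust the model tag and port.
    client = OpenAI(base_url="http://localhost:11434/v1", api_key="unused")

    with open("scan.png", "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode()

    resp = client.chat.completions.create(
        model="gemma3:27b",  # hypothetical local model tag
        messages=[{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Transcribe all text in this image, then briefly describe it."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/png;base64,{image_b64}"}},
            ],
        }],
    )
    print(resp.choices[0].message.content)

The same request shape works against hosted providers, so it's easy to prototype against a free tier first and point it at the local box later.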
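And a rough back-of-envelope for the memory-bandwidth point: during decoding, each generated token has to stream roughly the whole weight set from memory, so peak bandwidth divided by model size gives an optimistic upper bound on tokens/s. The figures below (54 GB of BF16 weights for a 27B model, ~96 GB/s for a dual-channel DDR5-6000 desktop, ~256 GB/s for Strix Halo's 256-bit LPDDR5X-8000) are approximations; real-world rates will be lower.

    # Optimistic decode bound: tokens/s <= memory bandwidth / bytes read per token.
    # For a dense model, bytes per token is roughly the model size in memory.
    weights_gb = 27e9 * 2 / 1e9  # 27B params at BF16 (2 bytes each) ~= 54 GB

    systems = [("dual-channel DDR5-6000 desktop", 96),
               ("Strix Halo, 256-bit LPDDR5X-8000", 256)]
    formats = [("BF16", 1.0), ("~4-bit quant", 0.25)]

    for name, bw_gbps in systems:
        for label, scale in formats:
            bound = bw_gbps / (weights_gb * scale)
            print(f"{name:34s} {label:12s} ~{bound:4.1f} tok/s upper bound")

That puts BF16 on a typical desktop under 2 tok/s and Strix Halo under 5, which is why a light quantization (or AI Studio's free tier) looks much more attractive for interactive use.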
walterbell | 3 days ago
HP Z2 Mini G1a with 128GB and Strix Halo is ~$5K, https://www.notebookcheck.net/Z2-Mini-G1a-HP-reveals-compara...