1. /r/localllama unanimously doesn't like the Spark for running models

2. and for CUDA dev it's not worth the crazy price when you can dev on a cheap RTX and then rent a GH or GB server for a couple of days if you need to adjust compatibility and scaling.


BadBadJellyBean 2 hours ago

I am not on reddit. What are they saying?

	mapontosevenths 13 minutes ago
		It isn't for "running models." If inference is the goal, a Mac Studio is faster, because Apple has faster memory. These devices are for AI R&D. If you need to build models or fine-tune them locally, they're great. That said, I run GPT-OSS 120B on mine and it's 'fine'. I spend some time waiting on it, but the fact that I can run such a large model locally at a "reasonable" speed is still kind of impressive to me. It's REALLY fast for diffusion as well. If you're into image/video generation it's kind of awesome. All that compute really shines for workloads that aren't bound by memory speed.