alecco 4 hours ago

IMHO the DGX Spark at $4,000 is a bad deal: only 273 GB/s of memory bandwidth, and compute capacity somewhere between a 5070 and a 5070 Ti. And with PCIe 5.0 already moving 64 GB/s, the gap to a discrete GPU over the bus isn't that big.
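To see why that 273 GB/s number matters for LLM use, a common back-of-envelope estimate is that decode speed is bandwidth divided by the bytes of weights streamed per token. A rough sketch with illustrative, not measured, numbers (the 60 GB and 28 GB weight sizes are assumptions for the sake of the comparison):

```python
# Back-of-envelope: memory-bandwidth-bound decode speed.
# Each generated token must stream roughly the model's active weights
# from memory, so tokens/sec is bounded by bandwidth / weight bytes.
# All numbers below are illustrative assumptions, not benchmarks.

def decode_tokens_per_sec(bandwidth_gb_s: float, weight_gb: float) -> float:
    """Upper bound on tokens/sec for a bandwidth-bound decoder."""
    return bandwidth_gb_s / weight_gb

spark = decode_tokens_per_sec(273, 60)     # Spark: 273 GB/s, ~60 GB of weights (assumed)
rtx_5090 = decode_tokens_per_sec(1792, 28) # 5090: ~1.8 TB/s, model must fit in 32 GB

print(f"Spark upper bound: {spark:.1f} tok/s")   # ~4.6 tok/s
print(f"5090 upper bound:  {rtx_5090:.1f} tok/s")
```

The point of the arithmetic: a smaller model on a fast card can decode far faster than a big model sitting behind 273 GB/s.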

And the 2x 200 Gbit/s QSFP ports... why would you stack a bunch of these? Does anybody actually use them in day-to-day work/research?

I liked the idea until the final specs came out.

BadBadJellyBean 2 hours ago

I think the selling point is the 128 GB of unified system memory. With that you can run some interesting models. The 5090 maxes out at 32 GB, and those cost about $3,000 and up at the moment.
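The capacity argument comes down to simple arithmetic: weight size is roughly parameter count times bytes per parameter, plus some margin for KV cache and activations. A minimal sketch (the 120B / 4-bit figures are illustrative assumptions):

```python
# Rough fit check: do a model's weights fit in a given memory pool?
# bytes ~= parameter count x bytes per parameter (e.g. 0.5 for 4-bit quant).
# KV cache and activations add overhead on top of this floor.

def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight size in GB for a model of the given size."""
    return params_billions * bytes_per_param

# A hypothetical 120B-parameter model at 4-bit quantization:
size = weights_gb(120, 0.5)  # ~60 GB of weights
print(f"~{size:.0f} GB: fits in 128 GB unified memory, not in a 32 GB 5090")
```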

alecco 2 hours ago

1. /r/LocalLLaMA is pretty much unanimous in disliking the Spark for running models

2. and for CUDA dev it's not worth the steep price when you can develop on a cheap RTX card and then rent a GH or GB server for a couple of days if you need to sort out compatibility and scaling.

BadBadJellyBean an hour ago

I am not on Reddit. What are they saying?

mapontosevenths 11 minutes ago

It isn't for "running models." If that's the goal, inference workloads like that are faster on a Mac Studio, since Apple ships faster memory.

These devices are for AI R&D. If you need to build models or fine tune them locally they're great.

That said, I run GPT-OSS 120B on mine and it's 'fine'. I spend some time waiting on it, but the fact that I can run such a large model locally at a "reasonable" speed is still kind of impressive to me.

It's REALLY fast for diffusion as well. If you're into image/video generation it's kind of awesome. All that compute really shines for workloads that aren't memory-bandwidth bound.
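That split between bandwidth-bound LLM decode and compute-bound diffusion is essentially a roofline argument: a workload is compute-bound when its arithmetic intensity (FLOPs per byte moved) exceeds the machine's compute-to-bandwidth ratio. A sketch with assumed, illustrative numbers (the 100 TFLOPS figure and the per-workload intensities are not measured):

```python
# Roofline-style check: a workload is compute-bound when its arithmetic
# intensity (FLOPs per byte of memory traffic) exceeds the machine's
# balance point (peak FLOPs / bandwidth). All numbers are illustrative.

def bound_by(flops_per_byte: float, peak_tflops: float, bandwidth_gb_s: float) -> str:
    """Return which resource limits the workload on this machine."""
    machine_balance = (peak_tflops * 1e12) / (bandwidth_gb_s * 1e9)  # FLOPs/byte
    return "compute" if flops_per_byte > machine_balance else "bandwidth"

# Assume ~100 TFLOPS of usable compute against 273 GB/s (balance ~366 FLOPs/byte):
print(bound_by(2, 100, 273))     # LLM decode, low intensity -> "bandwidth"
print(bound_by(5000, 100, 273))  # diffusion-style large matmuls -> "compute"
```

Low-intensity decode saturates the 273 GB/s long before the compute is busy, while dense diffusion matmuls do the opposite, which matches the experience above.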