zeusk a day ago

Get the DGX Spark computers? They’re exactly what you’re trying to build.

Gracana a day ago | parent [-]

They’re very slow.

geerlingguy a day ago | parent [-]

They're okay, generally, but slow for the price. You're paying more for the ConnectX-7 networking than for inference performance.

Gracana a day ago | parent [-]

Yeah, I wouldn’t complain if one dropped in my lap, but they’re not at the top of my list for inference hardware.

Although... Is it possible to pair a fast GPU with one? Right now my inference setup for large MoE LLMs has shared experts in system memory, with KV cache and dense parts on a GPU, and a Spark would do a better job of handling the experts than my PC, if only it could talk to a fast GPU.
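For anyone curious, that split (experts on CPU/system RAM, dense layers and KV cache on GPU) is roughly what llama.cpp's tensor overrides do. A sketch, with the caveat that the tensor-name regex is model-dependent and assumed here:

```shell
# Offload "all" layers to GPU, then override the MoE expert tensors back
# to CPU so only the dense parts and KV cache occupy VRAM.
# The "ffn_*_exps" pattern is an assumption -- check your model's tensor names.
./llama-server -m model.gguf \
  --n-gpu-layers 99 \
  --override-tensor "\.ffn_.*_exps\.=CPU"
```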

[edit] Oof, I forgot these have only 128GB of RAM. I take it all back, I still don’t find them compelling.
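For scale, a back-of-envelope check shows why 128 GB is tight for the largest MoE models (the parameter count here is illustrative, roughly DeepSeek-V3-class):

```python
def model_gib(params_billions: float, bits_per_weight: float) -> float:
    """Approximate weight footprint in GiB at a given quantization level."""
    return params_billions * 1e9 * bits_per_weight / 8 / 2**30

# ~671B-parameter MoE at 4-bit quantization (illustrative figure)
weights = model_gib(671, 4)
print(f"{weights:.0f} GiB")  # ~312 GiB -- well past 128 GB, before KV cache
```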