rynn 4 hours ago

> Please do give that a try and report back the prefill and decode speed.

M4 Max here w/ 128GB RAM. Can confirm this is the bottleneck.

https://pastebin.com/2wJvWDEH

I considered a DGX Spark but thought the M4 would be competitive with equal RAM. Not so much.

cmrdporcupine 4 hours ago | parent [-]

I think the DGX Spark will likely underperform the M4 from what I've read.

However, it will be better for training and fine-tuning workflows.

rynn 3 hours ago | parent [-]

> I think the DGX Spark will likely underperform the M4 from what I've read.

In the DGX benchmarks I found, the Spark was mostly beating the M4, though it wasn't cut and dried.

coder543 3 hours ago | parent [-]

The Spark has more compute, so it should be faster for prefill (prompt processing).

The M4 Max has double the memory bandwidth, so it should be faster for decode (token generation).
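
A rough back-of-envelope sketch of why that split happens, assuming decode is purely memory-bound (every generated token streams the active weights once) and prefill is purely compute-bound (about 2 FLOPs per active parameter per prompt token). The bandwidth figures are the spec-sheet numbers as I remember them (546 GB/s for the top M4 Max, 273 GB/s for the Spark), so treat them as assumptions. The 10 GB model size and the function names are made up for illustration, and real numbers will land below these upper bounds.

```python
def decode_tokens_per_sec(bandwidth_gb_s: float, active_weights_gb: float) -> float:
    """Upper bound on decode speed: each token reads all active weights once."""
    return bandwidth_gb_s / active_weights_gb


def prefill_tokens_per_sec(compute_tflops: float, active_params_billion: float) -> float:
    """Upper bound on prefill speed: ~2 FLOPs per active parameter per prompt token."""
    flops_per_token = 2 * active_params_billion * 1e9
    return compute_tflops * 1e12 / flops_per_token


# Assumed spec-sheet bandwidths (GB/s); verify against vendor docs.
M4_MAX_BW = 546.0
SPARK_BW = 273.0

# Hypothetical model: 10 GB of weights touched per token.
WEIGHTS_GB = 10.0

if __name__ == "__main__":
    m4 = decode_tokens_per_sec(M4_MAX_BW, WEIGHTS_GB)
    spark = decode_tokens_per_sec(SPARK_BW, WEIGHTS_GB)
    print(f"decode ceiling, M4 Max: {m4:.1f} tok/s")
    print(f"decode ceiling, Spark:  {spark:.1f} tok/s")
    print(f"ratio: {m4 / spark:.1f}x")
```

The decode ratio depends only on bandwidth, so the model size cancels out. For prefill, plug whatever usable TFLOPS you trust for each machine into `prefill_tokens_per_sec`; I have left those out because the effective numbers vary a lot with precision and kernel support.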