| ▲ | Thaxll 13 hours ago | |
M3 max tflops is tiny compared to the 12k box. It's not even comparable. | ||
| ▲ | davej 8 hours ago | parent | next [-] | |
It is very comparable if you work out the $/tok/s on inference. I did some napkin math and it looks like you’re getting roughly 3x the performance for 3x the cost. Red v2 vs Mac Studio M3 Ultra 96GB. If you compare tokens/kWh efficiency then my math has Mac Studio being about 1.5x more efficient. | ||
| ▲ | zozbot234 13 hours ago | parent | prev [-] | |
M3 has tolerable decode performance for the price, and that's what people would care about most of the time. they underperform severely wrt. prefill, but that's a fraction of the workload. AI, even agentic AI, spends most of its time outputing tokens, not processing context in bulk. | ||