jmyeet 2 hours ago

Here's a pretty detailed breakdown of this [1]:

> tl;dr - For software development, Qwen3.6 27B, 5090 gives you ~3x speed over M5 Max, letting you plow through code, while M5 Max gives you ~4x memory, letting you use higher quantization and bigger context. Which would you choose and why?

From what I've read, the consensus seems to be that yes, you can run a larger model and/or have more context with a 128GB+ Mac, but the performance gap is still massive, and with current hardware we're still at inference rates where that gap matters. By this I mean there's a big difference between 10 tok/s and 30 tok/s. Once we get to a point where it's 100 vs 300, it won't be as big of a deal, a bit like FPS in games.
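
To put rough numbers on the 10 vs 30 point, here's a back-of-the-envelope sketch. The 1,500-token response length is my own assumption (not a figure from either thread), and it ignores prompt processing time entirely:

```python
# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 1500

def wait_seconds(tokens: int, tok_per_s: float) -> float:
    """Time to generate `tokens` at a steady decode rate (prompt processing ignored)."""
    return tokens / tok_per_s

# Same 3x ratio in both pairs, very different absolute wait.
for slow, fast in [(10, 30), (100, 300)]:
    t_slow = wait_seconds(RESPONSE_TOKENS, slow)
    t_fast = wait_seconds(RESPONSE_TOKENS, fast)
    print(f"{slow} vs {fast} tok/s: {t_slow:.0f}s vs {t_fast:.0f}s, saves {t_slow - t_fast:.0f}s")
```

Under that assumption the 3x gap is 150s vs 50s at today's rates, but only 15s vs 5s at 100 vs 300, which is why the ratio stops mattering as much once the absolute numbers are high enough.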

Oh, and there are similar concerns with the DGX Spark [2].

[1]: https://www.reddit.com/r/LocalLLaMA/comments/1t5v2gr/need_ad...

[2]: https://www.reddit.com/r/LocalLLaMA/comments/1sqk333/dgx_spa...