	▲	jmyeet 2 hours ago
		Here's a pretty detailed breakdown of this [1]:

		> tl;dr - For software development, Qwen3.6 27B, 5090 gives you ~3x speed over M5 Max, letting you plow through code, while M5 Max gives you ~4x memory, letting you use higher quantization and bigger context. Which would you choose and why?

		I've read a number of things, and the consensus seems to be that yes, you can run a larger model and/or have more context with a 128GB+ Mac, but the performance gap is still massive, and with current hardware we're still at inference rates where that gap matters. By this I mean there's a big difference between 10 tok/s and 30 tok/s. Once we get to a point where it's 100 vs 300, it won't be as big of a deal, a bit like FPS in games.

		Oh, and there are similar concerns with the DGX Spark [2].

		[1]: https://www.reddit.com/r/LocalLLaMA/comments/1t5v2gr/need_ad...
		[2]: https://www.reddit.com/r/LocalLLaMA/comments/1sqk333/dgx_spa...
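
To put rough numbers on the 10 vs 30 and 100 vs 300 tok/s comparison above, here's a quick back-of-the-envelope sketch. The 3000-token response length is an assumption picked purely for illustration, and the helper names are made up; it also ignores prompt processing time and treats the decode rate as constant, which real hardware won't do.

```python
def generation_seconds(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock seconds to generate `tokens` at a steady decode rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return tokens / tokens_per_second


# Hypothetical response length, chosen only for illustration.
RESPONSE_TOKENS = 3000


def seconds_saved(slow_rate: float, fast_rate: float,
                  tokens: int = RESPONSE_TOKENS) -> float:
    """Absolute time saved by the faster rate for the same output length."""
    return generation_seconds(tokens, slow_rate) - generation_seconds(tokens, fast_rate)


if __name__ == "__main__":
    # Both pairs have the same 3x ratio; only the absolute wait differs.
    for slow, fast in [(10, 30), (100, 300)]:
        t_slow = generation_seconds(RESPONSE_TOKENS, slow)
        t_fast = generation_seconds(RESPONSE_TOKENS, fast)
        print(f"{slow} vs {fast} tok/s: {t_slow:.0f}s vs {t_fast:.0f}s "
              f"(saves {seconds_saved(slow, fast):.0f}s)")
```

Under those assumptions the ratio is 3x in both cases, but the wait drops from minutes to seconds, which is roughly why the gap should feel less important at higher absolute rates.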