cpburns2009 8 hours ago
Qwen3-Coder-Next works well on my 128GB Framework Desktop. It seems better at coding Python than Qwen3.5 35B-A3B, and it's not much slower (43 t/s token generation compared to 55 t/s at Q4). The 27B is supposed to be really good, but it's so slow that I gave up on it (11-12 t/s at Q4).
vlowther 3 hours ago
The 8-bit MLX Unsloth quant of Qwen3-Coder-Next seems to be a local best on an MBP M5 Max with 128GB memory. With oMLX doing prompt caching, I can run two instances in parallel on different tasks pretty reasonably. I found that lower quants tend to lose the plot after about 170k tokens in context.
UncleOxidant 5 hours ago
Agreed. Qwen3-Coder-Next seems like the sweet-spot model on my 128GB Framework Desktop. I seem to get better coding results from it than from the 27B, and it runs faster too.