sosodev an hour ago:
I was testing the 4-bit Qwen3 Coder Next on my 395+ board last night. IIRC it was maintaining around 30 tokens a second even with a large context window. I haven't tried Minimax M2.5 yet. How do its capabilities compare to Qwen3 Coder Next in your testing?

I'm working on getting a good agentic coding workflow going with OpenCode, but I've had some issues with the Qwen model getting stuck in a tool-calling loop.
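For anyone else wiring this up: the relevant part of my opencode.json looks roughly like the below. The provider key, port, and model id are just my local choices, so treat it as a sketch of the OpenAI-compatible provider setup from the OpenCode docs rather than a canonical config.

    {
      "$schema": "https://opencode.ai/config.json",
      "provider": {
        "local": {
          "npm": "@ai-sdk/openai-compatible",
          "name": "llama.cpp (local)",
          "options": {
            "baseURL": "http://localhost:8080/v1"
          },
          "models": {
            "qwen3-coder-next": {
              "name": "Qwen3 Coder Next"
            }
          }
        }
      }
    }

With that in place, the model should show up in OpenCode's picker as local/qwen3-coder-next.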
lambda 21 minutes ago:
I've literally just gotten Minimax M2.5 set up; the only test I've done is the "car wash" test that has been popular recently: https://mastodon.world/@knowmadd/116072773118828295 Minimax passed this test, which even some SOTA models don't. But I haven't tried any agentic coding yet.

I wasn't able to allocate the full context length for Minimax with my current setup, so I'm going to try quantizing the KV cache to see if I can fit the full context length into the RAM I've allocated to the GPU. Even at a 3-bit quant, MiniMax is pretty heavy, and without a big enough context window it'll be less useful for agentic coding. With Qwen3 Coder Next, I can use the full context window.

Yeah, I've also seen the occasional tool-call looping in Qwen3 Coder Next; that seems to be an easy failure mode for that model to hit.
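For the KV cache experiment, the knob I'm planning to try is llama.cpp's cache-type flags. Roughly this invocation, with the filename and context size as placeholders for whatever I end up fitting:

    # q8_0 K/V entries are about half the size of f16, so the cache for the
    # same context length should roughly halve. Quantizing the V cache needs
    # flash attention enabled (plain -fa on older llama.cpp builds).
    llama-server -m MiniMax-M2.5-Q3_K_M.gguf \
        -ngl 999 -c 131072 -fa on \
        --cache-type-k q8_0 --cache-type-v q8_0

If q8_0 still doesn't fit, I can drop the cache to q4_0, though I'd expect quality to degrade faster from that than from the weight quant.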