camelmel 6 hours ago
Huh, according to that model card this is a 137B total parameter model. Performance doesn't seem that good:

- MAI-Code-1-Flash (137B-A5B) = 51% on SWE-bench Pro
- Qwen3.6-35B-A3B = 49.5% on SWE-bench Pro (https://huggingface.co/Qwen/Qwen3.6-35B-A3B)

They benchmark against Claude Haiku, but Haiku is not good: it's worse than tiny open models you can run locally or via API at 10% of the cost.
giancarlostoro 6 hours ago
The takeaway is that this is a smaller model that competes with Haiku. I would hope they come out with a "Sonnet"-competing model, then an Opus-competing one. I have been wondering why Microsoft is kind of "sleeping" on offering models they themselves have made in Copilot; maybe it was part of their deal with OpenAI? Not sure.
6 hours ago
[deleted]
kristjansson 5 hours ago
> 137B-A5B

Yeah, not a 5B param model as the earlier title implied!
wetpaws 6 hours ago
[dead]