tristor 8 hours ago
I'm disappointed they didn't release a 27B dense model. I've been working with Qwen3.5-27B and Qwen3.5-35B-A3B locally, both in their native weights and in the versions the community distilled from Opus 4.6 (Qwopus), and I generally get higher quality outputs from the 27B dense model than from the 35B-A3B MoE model. My basic conclusion was that the MoE approach may be more memory efficient, but it requires a fairly large set of active parameters to match similarly sized dense models: I saw comparable or better results from Qwen3.5-122B-A10B than from Qwen3.5-27B, though at a slower generation speed. I'm sure that for frontier providers with massive compute, MoE represents a meaningful efficiency gain at similar quality, but for running models locally I still prefer medium-sized dense models. I'll give this a try, but I'd be surprised if it outperforms Qwen3.5-27B.
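The tradeoff described above can be sketched with rough arithmetic: weight memory scales with *total* parameters, while per-token compute (and thus local generation speed) scales with *active* parameters. This is only an illustrative back-of-envelope sketch assuming fp16 weights; it ignores KV cache, activations, and quantization, and the model names/parameter counts are taken from the comment.

```python
# Back-of-envelope: dense vs MoE cost for local inference.
# Memory footprint ~ total params; per-token compute ~ active params.
# Assumes fp16 (2 bytes/param); ignores KV cache and quantization.

BYTES_PER_PARAM = 2

models = {
    # name: (total params in billions, active params in billions)
    "Qwen3.5-27B (dense)":     (27, 27),
    "Qwen3.5-35B-A3B (MoE)":   (35, 3),
    "Qwen3.5-122B-A10B (MoE)": (122, 10),
}

def weight_gb(total_billions: float) -> float:
    """Approximate weight memory in GB at fp16."""
    return total_billions * 1e9 * BYTES_PER_PARAM / 1e9

for name, (total, active) in models.items():
    print(f"{name}: ~{weight_gb(total):.0f} GB of weights, "
          f"~{active}B params active per token")
```

The sketch shows why the comment's experience is plausible: the 35B-A3B MoE needs *more* memory than the 27B dense model while spending far less compute per token, so it is faster locally but can lag in quality until the active-parameter count grows (as with the 122B-A10B).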
ilaksh 4 hours ago | parent
It's a given that dense models of comparable size are better. I also confirmed that in my use case with those two Qwen 3.5 models. The benchmarks show 3.6 is a bit better than 3.5. I should retry my task, though I don't have a lot of confidence. But it does sound like they worked on the right thing, which is getting closer to the 27B dense model's performance.
adrian_b 8 hours ago | parent
You are right, but this is just the first open-weights model of this family. They said they will release several open-weights models, though there was an implication that they might not release the biggest ones.