jug 4 days ago
> The Qwen3-Next-80B-A3B-Instruct performs comparably to our flagship model Qwen3-235B-A22B-Instruct-2507

I'm skeptical about these claims. How can this be? Wouldn't there be massive loss of world knowledge? I'm particularly skeptical because a recent trend in Q2 2025 has been benchmaxxing.
dragonwriter 4 days ago
> I'm skeptical about these claims. How can this be?

More efficient architecture.

> Wouldn't there be massive loss of world knowledge?

If you assume equally efficient architecture and no other salient differences, yes, that’s what you’d expect from a smaller model.
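
For a sense of scale behind the "smaller model" point, here is a rough sketch of the size gap implied by the two model names. It assumes the naming convention in which `80B-A3B` means roughly 80B total parameters with roughly 3B activated per token. The helper `parse_moe_name` is hypothetical, made up for illustration, and the ratios say nothing about benchmark quality.

```python
import re


def parse_moe_name(name: str) -> tuple[float, float]:
    """Return (total_B, active_B) parsed from a name like 'X-80B-A3B-Y'.

    Hypothetical helper: relies on the assumed '<total>B-A<active>B' convention.
    """
    m = re.search(r"-(\d+(?:\.\d+)?)B-A(\d+(?:\.\d+)?)B", name)
    if m is None:
        raise ValueError(f"no '<total>B-A<active>B' pattern in {name!r}")
    return float(m.group(1)), float(m.group(2))


small_total, small_active = parse_moe_name("Qwen3-Next-80B-A3B-Instruct")
large_total, large_active = parse_moe_name("Qwen3-235B-A22B-Instruct-2507")

# Fraction of the flagship's parameters that the smaller model has.
total_ratio = small_total / large_total      # capacity available for storing knowledge
active_ratio = small_active / large_active   # compute spent per token

print(f"total params:  {total_ratio:.0%} of the flagship")
print(f"active params: {active_ratio:.0%} of the flagship")
```

By this rough count the smaller model has about a third of the flagship's total parameters and under a sixth of its active ones, which is the gap a more efficient architecture would have to make up.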