deepdarkforest 14 hours ago
The Chinese are doing what they have been doing to the manufacturing industry as well: take the core technology and just optimize, optimize, optimize for 10x the cost/efficiency. As simple as that. Super impressive. These models might be benchmaxxed, but as another comment said, I see so many that it might as well be the most impressive benchmaxxing today, if not just a genuinely SOTA open-source model. They even released a closed-source 1-trillion-parameter model today that is sitting at no. 3 (!) on LM Arena. Even their 80B model is 17th; gpt-oss-120b is 52nd. https://qwen.ai/blog?id=241398b9cd6353de490b0f82806c7848c5d2...
jychang 12 hours ago
They still suck at explaining which model they serve is which, though. They also released Qwen3-VL Plus [1] today alongside Qwen3-VL 235B [2], and they don't tell us which one is better. Note that Qwen3-VL-Plus is a very different model compared to Qwen-VL-Plus. Also, qwen-plus-2025-09-11 [3] vs. qwen3-235b-a22b-instruct-2507 [4]. What's the difference? Which one is better? Who knows. You know it's bad when OpenAI has a clearer naming scheme.

[1] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...

[2] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...

[3] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...

[4] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...
spaceman_2020 3 hours ago
Interestingly, I've found that models like Kimi K2 spit out more organic, natural-sounding text than American models. It fails on the benchmarks compared to other SOTA models, but the real-world experience is different.
nl 11 hours ago
> Take the core technology and just optimize, optimize, optimize for 10x the cost/efficiency. As simple as that. Super impressive.

This "just" is incorrect. The Qwen team invented things like DeepStack: https://arxiv.org/abs/2406.04334

(Also, I hate this "The Chinese" thing. Do we say "The British" if it came from a DeepMind team in the UK? Or what if there are Chinese-born US citizens working in Paris for Mistral? Give credit to the Qwen team rather than a whole country. China has both great labs and mediocre labs, just like the rest of the world.)