deepdarkforest 14 hours ago

The Chinese are doing what they have been doing to the manufacturing industry as well. Take the core technology and just optimize, optimize, optimize for 10x the cost/efficiency. As simple as that. Super impressive. These models might be benchmaxxed but, as another comment said, I see so many that it might as well be the most impressive benchmaxxing today, if not just a genuinely SOTA open source model. They even released a closed source 1 trillion parameter model today as well that is sitting at No. 3(!) on LM Arena. Even their 80GB model is 17th; gpt-oss 120b is 52nd. https://qwen.ai/blog?id=241398b9cd6353de490b0f82806c7848c5d2...

jychang 12 hours ago | parent | next [-]

They still suck at explaining which model they serve is which, though.

They also released Qwen3-VL Plus [1] today alongside Qwen3-VL 235B [2], and they don't tell us which one is better. Note that Qwen3-VL-Plus is a very different model compared to Qwen-VL-Plus.

Also, qwen-plus-2025-09-11 [3] vs qwen3-235b-a22b-instruct-2507 [4]. What's the difference? Which one is better? Who knows.

You know it's bad when OpenAI has a clearer naming scheme.

[1] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...

[2] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...

[3] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...

[4] https://modelstudio.console.alibabacloud.com/?tab=doc#/doc/?...

jwr 2 hours ago | parent | next [-]

> They still suck at explaining which model they serve is which, though.

"they" in this sentence probably applies to all "AI" companies.

Even the naming/versioning of OpenAI models is ridiculous, and then you can never find out which is actually better for your needs. Every AI company writes several paragraphs of fluffy text with lots of hand waving, saying how this model is better for complex tasks while this other one is better for difficult tasks.

viraptor 44 minutes ago | parent [-]

Both DeepSeek and Claude are exceptions. They use simple version numbers, and Sonnet is overall worse but faster than Opus at the same version.

deepdarkforest 12 hours ago | parent | prev | next [-]

Eh, I mean, often innovation comes just from letting a lot of fragmented, small teams of cracked nerds try stuff out. It's way too early in the game. I mean, Qwen's release statements have anime etc. IBM, Bell, Google, Dell, many did it similarly, letting small focused teams have many attempts at cracking the same problem. All modern quant firms are doing basically the same as well. Anthropic is actually an exception, more like Apple.

11 hours ago | parent | prev [-]
[deleted]
spaceman_2020 3 hours ago | parent | prev | next [-]

Interestingly, I've found that models like Kimi K2 spit out more organic, natural-sounding text than American models

It falls short on the benchmarks compared to other SOTA models, but the real-world experience is different.

nl 11 hours ago | parent | prev [-]

> Take the core technology and just optimize, optimize, optimize for 10x the cost/efficiency. As simple as that. Super impressive.

This "just" is incorrect.

The Qwen team invented things like DeepStack https://arxiv.org/abs/2406.04334

(Also, I hate this "The Chinese" thing. Do we say "The British" if it came from a DeepMind team in the UK? Or what if there are Chinese-born US citizens working in Paris for Mistral?

Give credit to the Qwen team rather than a whole country. China has both great labs and mediocre labs, just like the rest of the world.)

viraptor 40 minutes ago | parent | next [-]

The naming makes some sense here. It's backed by the very Chinese Alibaba and the government directly as well. It's almost a national project.

Mashimo 4 hours ago | parent | prev | next [-]

> Do we say "The British"

Yes.

taneq 10 hours ago | parent | prev | next [-]

The Americans do that all the time. :P

mamami 11 hours ago | parent | prev | next [-]

Yeah it's just weird Orientalism all over again

riku_iki 7 hours ago | parent | prev [-]

> Also I hate this "The Chinese" thing

To me it was a positive assessment; I adore their craftsmanship and persistence in moving forward over a long period of time.

mrtesthah 6 hours ago | parent [-]

It erases the individuals doing the actual research by viewing Chinese people as a monolith.