Their Chinese announcement says that, based on internal employee testing, it is not as good as Opus 4.6 Thinking, but is slightly better than Opus 4.6 without Thinking enabled.

mchusma 2 days ago | parent | next [-]

I appreciate this; it makes me trust it more than benchmarks.

ibic 2 days ago | parent | prev | next [-]

In case people wonder where the announcement is (you can easily translate it via browser if you don't read Chinese): https://mp.weixin.qq.com/s/8bxXqS2R8Fx5-1TLDBiEDg

It's still a "preview" version atm.

deaux 2 days ago | parent | prev | next [-]

That's super interesting. Isn't DeepSeek in China banned from using Anthropic models? Yet here they're comparing it in terms of internal employee testing.

	computably 2 days ago | parent | next [-]
		> That's super interesting. Isn't DeepSeek in China banned from using Anthropic models? Yet here they're comparing it in terms of internal employee testing.

		I don't see why DeepSeek would care to respect Anthropic's ToS, even if just to pretend. It's not like Anthropic could file and win a lawsuit in China, nor would the US likely ban DeepSeek. And even if the US government would have considered it, Anthropic is on their shitlist.
	renticulous 2 days ago | parent | prev [-]
		They use a VPN for access. Even Google DeepMind uses Anthropic. There was a fight within Google over why only DeepMind is allowed to use Claude while the rest of Google can't.

anentropic 2 days ago | parent | prev [-]

Who uses Opus without thinking though...?