Remix.run Logo
stymaar 2 hours ago

> Lastly, there is a massive difference in capabilities, determinism, and error handling between 5T SOTA models like Opus

What's your source for Opus being a 5T model?

> and tiny distillations from DeepSeek that perform well only in benchmarks.

I don't think you know what you're talking about. Local models aren't “distillations from Deepseek”.

And they don't perform well “only in benchmarks”, Qwen 3.6 is a very decent model (obviously it's not Opus, but it's also much faster and speed is a quality of its own).

layer8 2 hours ago | parent | next [-]

> What's your source for Opus being a 5T model?

Probably Elon Musk: https://eu.36kr.com/en/p/3760679047267075

stymaar 2 hours ago | parent [-]

Sigh, it's year 2026 and there are still people believing something Musk…

gpugreg 2 hours ago | parent | prev [-]

> What's your source for Opus being a 5T model?

Elon Musk tweeted that Grok is 0.5T or 1/10th the size of Opus. https://xcancel.com/elonmusk/status/2042123561666855235#m

While this source's reliability is certainly debatable, the size matches the results of this paper, in which researchers estimated the parameter count from model knowledge. https://01.me/research/ikp/

stymaar an hour ago | parent [-]

> While this source's reliability is certainly debatable

Massive understatement. Nowadays it has become hard to find a single Musk statement that doesn't contain at least one lie.

> the size matches the results of this paper, in which researchers estimated the parameter count from model knowledge. https://01.me/research/ikp/

Thanks for the pointer. This estimation has Grok 6 times bigger than Musk claims it is, so maybe that's where the lie is.

(I'm quite skeptical about that number though, it would be quite disappointing for the US tech if their flagship models had to be that much larger than the Chinese ones for such a small edge in performance. Because I don't think US labs are incompetent, I'd bet that US flagships aren't more than 2/3 times faster than Chinese flagship. Otherwise it really doesn't bode well.)

striking an hour ago | parent [-]

In tiny gray text right above the table is written "90% PI ≈ ±3.00× either side." Is GPT-5.5-Pro 3.4T or 30.8T in size, or somewhere in between? We just don't know.