stymaar 2 hours ago
> Lastly, there is a massive difference in capabilities, determinism, and error handling between 5T SOTA models like Opus

What's your source for Opus being a 5T model?

> and tiny distillations from DeepSeek that perform well only in benchmarks.

I don't think you know what you're talking about. Local models aren't "distillations from DeepSeek". And they don't perform well "only in benchmarks": Qwen 3.6 is a very decent model (obviously it's not Opus, but it's also much faster, and speed is a quality of its own).
layer8 2 hours ago
> What's your source for Opus being a 5T model?

Probably Elon Musk: https://eu.36kr.com/en/p/3760679047267075
gpugreg 2 hours ago
> What's your source for Opus being a 5T model?

Elon Musk tweeted that Grok is 0.5T, or 1/10th the size of Opus. https://xcancel.com/elonmusk/status/2042123561666855235#m

While this source's reliability is certainly debatable, the size matches the results of this paper, in which researchers estimated the parameter count from model knowledge. https://01.me/research/ikp/