locknitpicker 2 hours ago

> I am saying this probably is "silly behavior by a government" and it is a milestone that points towards what the future may look like. Why can't it be both?

Here is why it's unlikely this is anything other than "silly behavior by a government":

- some benchmarks show GPT-5.5, Gemini 3.1, and even Claude Opus outperforming Claude Fable, and yet it's Fable which is restricted.

- some benchmarks still show the likes of Kimi 2.5 outperforming any Claude model, and DeepSeek is getting equivalent scores (a few tenths of a percent difference)

> Do you think that Chinese labs will continue to release open models forever (...)

That's immaterial to the discussion. Even if China forced Chinese labs to restrict access to all models, the truth of the matter is that the Trump administration's move to restrict access to US-based models does not prevent others from having access to models that are as capable or even better.

So what exactly is the point of this?

rileyphone 2 hours ago | parent | next [-]

All that says is some benchmarks aren’t worth the tokens it takes to evaluate them. Mythos is clearly capable of finding zero days other models can’t, and Fable is close enough to be lumped with it.

mullingitover 34 minutes ago | parent [-]

> Mythos is clearly capable of finding zero days other models can’t

I'm unconvinced that this is anything more than proof of work and a marginal improvement that other models will catch up with, perhaps as early as next week. Lots of other current-gen models will find vulns that can be chained together if you're willing to burn enough tokens on the task, and Fable is an absolute token incinerator.

solumunus 2 hours ago | parent | prev | next [-]

You’re completely overrating these benchmarks and it’s landing you at a nonsense opinion. Just actually use the models and you will see that the gap is significant.

kolinko an hour ago | parent | prev [-]

Did you use the models yourself?