crimsoneer 7 hours ago

If someone is using these models, they probably can't or won't use the existing SOTA models, so not sure how useful those comparisons actually are. "Here is a benchmark that makes us look bad from a model you can't use on a task you won't be undertaking" isn't actually helpful (and definitely not in a press release).

constantcrying 7 hours ago

Completely agree that there are legitimate reasons to prefer comparison to e.g. DeepSeek models. But that doesn't change my point: we both agree that the comparisons would be extremely unfavorable.

Lapel2742 7 hours ago

> that the comparisons would be extremely unfavorable.

Why should they compare apples to oranges? Ministral3 Large costs ~1/10th of Sonnet 4.5. They clearly target different users. If you want a coding assistant, you probably wouldn't choose this model, for various reasons. There is room for more than just the benchmark king.

constantcrying 6 hours ago

Come on. Do you just not read posts at all?

esafak 6 hours ago

Which lightweight models do these compare unfavorably with?