| ▲ | crimsoneer 7 hours ago | |||||||||||||||||||||||||
If someone is using these models, they probably can't or won't use the existing SOTA models, so not sure how useful those comparisons actually are. "Here is a benchmark that makes us look bad from a model you can't use on a task you won't be undertaking" isn't actually helpful (and definitely not in a press release). | ||||||||||||||||||||||||||
| ▲ | constantcrying 7 hours ago | parent [-] | |||||||||||||||||||||||||
Completely agree that there are legitimate reasons to prefer comparison to e.g. DeepSeek models. But that doesn't change my point: we both agree that the comparisons would be extremely unfavorable. | ||||||||||||||||||||||||||