jampekka 7 hours ago
1491 vs 1418 Elo (a 73-point gap) means the stronger model wins about 60% of the time.
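For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the leaderboard uses the standard Elo logistic curve on a 400-point scale, which the thread does not confirm. The function name `elo_expected_score` is made up for illustration. In Elo terms the result is an expected score, so it only equals a win rate if ties are ignored or counted as half a win.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the standard Elo formula.

    Assumption: logistic curve with a 400-point scale; a given
    leaderboard may fit or scale its ratings differently.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


# Ratings quoted in the comment above.
p_stronger = elo_expected_score(1491, 1418)
p_weaker = elo_expected_score(1418, 1491)

print(f"stronger model: {p_stronger:.2f}")  # roughly 0.60
print(f"weaker model:   {p_weaker:.2f}")    # roughly 0.40
```

The two expected scores always sum to 1, so a 60% figure for one side implies 40% for the other under this model.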
supermatt 6 hours ago
Probably naive questions: does that also mean that Gemini-3 (the top-ranked model) loses to Mistral 3 40% of the time? Does that make Gemini 1.5x better, or Mistral two-thirds as good as Gemini, or can we not quantify the difference like that?