| ▲ | raincole 3 hours ago |
| Yes, and the article author is fully aware of that. Thank you for pointing out this small mistake though. |
|
| ▲ | mkagenius 2 hours ago | parent [-] |
| It looks like the author is specifically avoiding the model's name, because the results are really weird: Opus 4.8/4.7 scored 28%, while Opus 4.6 scored 37%. So the author thought, let's not get into that and just write Claude. |
| |
| ▲ | happycube an hour ago | parent | next [-] | | Not weird at all, given the variance in Opus' quality over the last few months. Wild guess: I wouldn't be surprised if Opus 4.6 was run quantized for a while, and 4.7/4.8 have QAT for that nerfed size. | |
| ▲ | andriy_koval 2 hours ago | parent | prev [-] | | Many people think Opus 4.6 was the best. |
|