| ▲ | randomtoast 3 hours ago | |
Except for a few models, the selected ones were non-reasoning models. Naturally, without reasoning enabled, reasoning performance will be poor. This is not a surprising result. I asked GPT-5.2 10 times with thinking enabled and it got it right every time. | ||
| ▲ | felix089 2 hours ago | |
Thinking or extended thinking? | ||