randomtoast 3 hours ago

Except for a few models, the selected ones were non-reasoning models. Naturally, without reasoning enabled, the reasoning performance will be poor. This is not a surprising result.

I asked GPT-5.2 10 times with thinking enabled, and it got it right every time.

felix089 2 hours ago

Thinking or extended thinking?