| ▲ | HardCodedBias 7 hours ago | |
What? The 4.5 and 5.1 columns in Google's report aren't the thinking versions? That's a scandal, IMO. Given that Gemini-3 seems to do "fine" against the thinking versions, why didn't they post those results? I get that PMs like to make a splash, but that's shockingly dishonest. | ||
| ▲ | iosjunkie 4 hours ago | parent | next [-] | |
Is that true? > For Claude Sonnet 4.5, and GPT-5.1 we default to reporting high reasoning results, but when reported results are not available we use best available reasoning results. https://storage.googleapis.com/deepmind-media/gemini/gemini_... | ||
| ▲ | mountainriver 5 hours ago | parent | prev [-] | |
Every single time | ||