Interesting that they didn’t post any benchmark results (LMArena, Artificial Analysis, etc.). I would’ve thought they’d be testing it behind the scenes the same way they did with Gemini 3.