achierius 3 days ago

How is this a notable release? It's strictly worse than Gemini 2.5 on coding &c, and only an iterative improvement over their own models. The only thing that struck me as particularly interesting was the native visual reasoning.

og_kalu 3 days ago

It's not worse on coding. SWE-bench, Aider, and LiveBench coding all show noticeably better results.