Redster 2 hours ago

So Gemini 3 Pro dropped today, which happens to be the day I'm proofreading a historical timeline I'm assisting a PhD with. I do one pass and then realize I should try Gemini 3 Pro on it. I give the exact same prompt to 3 Pro and to Claude 4.5 Sonnet. 3 Pro finds 25 real errors, no hallucinations. Claude finds 7 errors, but only 2 of those are unique to Claude. (Claude was better at "wait, that reference doesn't match the content! It should be $corrected_citation!".) Gemini's visual understanding was top notch. Its biggest flaw was that it saw words that wrapped across lines as having extra spaces. But it also correctly caught a typo where a wrapped word was misspelled, so something about it seemed to fixate on those line breaks, I think.

Redster 2 hours ago | parent

A better test would be 2.5 Pro vs 3 Pro. Google has just been doing better at vision for a while.