iamjackg 6 hours ago

Curious how this will fare when playing Pokemon Red.

minimaxir 5 hours ago | parent | next [-]

Gemini 3 Pro has been playing Pokemon Crystal (which is significantly harder than Red) in a race against Gemini 2.5 Pro: https://www.twitch.tv/gemini_plays_pokemon

Gemini 3 Pro has been making steady progress (12/16 badges) while Gemini 2.5 Pro is stuck (3/16 badges) despite using double the turns and tokens.

theLiminator 3 hours ago | parent [-]

I think what would be interesting is if it could play the game with vision-only inputs. That would represent a massive leap in multimodal understanding.

danso 2 hours ago | parent | prev | next [-]

> 3. Turning long videos into action: Gemini 3 Pro bridges the gap between video and code. It can extract knowledge from long-form content and immediately translate it into functioning apps or structured code

I'm curious how close these models are to achieving that long-ago, much-mocked claim (by Microsoft, I think?) that AIs could view gameplay video of long-lost games and produce the code to emulate them.

euvin 6 hours ago | parent | prev [-]

Yeah, the "High frame rate understanding" feature caught my eye; actual real-time analysis of live video feeds seems really cool. Also wondering what they mean by "video reasoning/thinking"?

skybrian 5 hours ago | parent [-]

I don’t think it’s real time? The videos were likely taken previously.