hodder 5 hours ago

"Gemini 3 Pro represents a generational leap from simple recognition to true visual and spatial reasoning."

Prompt: "wine glass full to the brim"

Image generated: 2/3 full wine glass.

True visual and spatial reasoning denied.

minimaxir 5 hours ago

Gemini 3 Pro is not Nano Banana Pro, and the image generation model that decodes the generated image tokens may not be as robust.

The thinking step of Nano Banana Pro can refine some lateral steps (e.g. the errors in the homework correction and where they are spatially in the image), but it isn't perfect and can encounter some of the typical pitfalls. It's a lot better than Nano Banana base, though.

hodder 5 hours ago

As a consumer, I typed this into "Gemini". The behind-the-scenes model selection just adds confusion.

If trust in "AI" is the big barrier to widespread adoption of these products, Alphabet soup isn't the solution (pun intended).

JacobAsmuth 34 minutes ago

It works fine for me. https://imgur.com/a/MKNufm1

iknowstuff 5 hours ago

Nano Banana generates images.

This article is about understanding images.

Your task is unrelated to the article.

spchampion2 5 hours ago

I actually tried this prompt and found that it worked with a single nudge in a follow-up prompt. My first shot got me a wine glass that was almost full but not quite. I told it I wanted it full to the top - another drop would overflow. The second shot was perfectly full.

RyJones 5 hours ago

That's the kind of correction I'd expect to give an intern, not a junior person.

ugh123 5 hours ago

Did it return the exact same glass and surrounding imagery, just with more wine?