cubefox 7 hours ago

In your example, z-image and Nano Banana Pro look basically equally photorealistic to me. Perhaps the NBP image looks a bit more real because it resembles an unstaged smartphone shot with wide angle. Anyway, the difference is very small. I agree the lighting in Flux.2 Pro looks a bit off.

But anyway, realistic environments like a street cafe are not suited to test for photorealism. You have to use somewhat more fantastical environments.

I don't have access to z-image, but here are two examples with Nano Banana Pro:

"A person in the streets of Atlantis, portrait shot." https://i.ibb.co/DgMXzbxk/Gemini-Generated-Image-7agf9b7agf9...

"A person in the streets of Atlantis, portrait shot (photorealistic)" https://i.ibb.co/nN7cTzLk/Gemini-Generated-Image-l1fm5al1fm5...

These are terribly unrealistic. Far more so than the Flux.2 Pro image above.

> Also Imagen 4 and Nano Banana Pro are very different models.

No, Imagen 4 is a pure diffusion model. Nano Banana Pro is a Gemini scaffold which uses Imagen to generate an initial image, then Gemini 3 Pro writes prompts to edit the image for much better prompt alignment. The prompts above are very simple, so there is little for Gemini to alter, and the results look basically identical to plain Imagen 4. Both pictures (especially the first) have the signature AI look of Imagen 4, which is different from other models like Imagen 3.

By the way, here is GPT Image 1.5 with the same prompts:

"A person in the streets of Atlantis, portrait shot." https://i.ibb.co/Df8nDHFL/Chat-GPT-Image-10-Feb-2026-14-17-1...

"A person in the streets of Atlantis, portrait shot (photorealistic)" https://i.ibb.co/Nns4pdGX/Chat-GPT-Image-10-Feb-2026-14-17-2...

The first is very fake and the second is a strong improvement, though still far from the excellent cafe shots above (fake studio lighting, unrealistic colors etc).

GaggiX 6 hours ago

>In your example, z-image and Nano Banana Pro look basically equally photorealistic to me

I disagree, the Nano Banana Pro result is in a completely different league compared to Flux.2 and z-image.

>But anyway, realistic environments like a street cafe are not suited to test for photorealism

Why? It's the perfect setting in my opinion.

Btw, I don't think you are using Nano Banana Pro; it's probably standard Nano Banana. I'm getting this from your prompt: https://i.ibb.co/wZHx0jS9/unnamed-1.jpg

>Nano Banana Pro is a Gemini scaffold which uses Imagen to generate an initial image, then Gemini 3 Pro writes prompts to edit the image for much better prompt alignment.

First of all, how would you know the architecture details of gemini-3-pro-image? Second, how can the model modify the image if Gemini itself is just rewriting the prompt (like the old ChatGPT + DALL-E)? Imagen 4 is just a text-to-image model, not an editing one, so it doesn't make sense. Nano Banana Pro can edit images (like ones you provide).

cubefox 6 hours ago

> I disagree, the Nano Banana Pro result is in a completely different league.

I strongly disagree. But even if you are right, the difference between the cafe shots and the Atlantis shots is clearly much, much larger than the difference between the different cafe shots. The Atlantis shots are super unrealistic. They look far worse than the cafe shots of Flux.2 Pro.

> Why? It's the perfect setting in my opinion

Because it's too easy obviously. We don't need an AI to make fake realistic photos of realistic environments when we can easily photograph those ourselves. Unrealistic environments are more discriminative because they are much more likely to produce garbage that doesn't look photorealistic.

> Btw I don't think you are using nano banana pro, I'm getting this from your prompt: https://i.ibb.co/wZHx0jS9/unnamed-1.jpg

I'm definitely using Nano Banana Pro, and your picture has the same strong AI look to it that is typical of NBP / Imagen 4.

> First of all, how would you know the architecture details of gemini-3-pro-image? Second, how can the model modify the image if Gemini itself is just rewriting the prompt (like the old ChatGPT + DALL-E)? Imagen 4 is just a text-to-image model, not an editing one, so it doesn't make sense. Nano Banana Pro can edit images (like ones you provide).

There were discussions about it previously on HN. Clearly NBP is using Gemini reasoning, and clearly the style of NBP strongly resembles Imagen 4 specifically. There is probably also a special editing model involved, just like in Qwen-Image-2.0.

GaggiX 5 hours ago

>Because it's too easy obviously.

Still, the vast majority of models fail at delivering an image that looks real. I want realism for a realistic setting; if a model can't do that, then what's the point? Of course you can always pay for people and equipment to make the perfect photo for you ahah

If the z-image turbo image looks as good as the Nano Banana Pro one to you, you are probably so used to slop that any model that doesn't produce obvious artifacts like super shiny skin is immediately indistinguishable from a real image (like the Nano Banana Pro one, which to me looks as real as a real photo). And yes, I'm ignoring the fact that in the z-image-turbo image the cup is too large and the bag is inside the chair. Z-image is good (in particular given its size) but not as good.

cubefox 5 hours ago

It seems you are ignoring the fact that the NBP Atlantis pictures look much, much worse than the z-image picture of the cafe. They look far more like AI slop. (Perhaps the Atlantis prompt would look even worse with z-image, I don't know.)

GaggiX 5 hours ago

I have generated my own using your prompt and posted it in the previous comment. You haven't posted a z-image one of Atlantis. I'm not at home to try, but I have done LoRA training for z-image (it's a relatively lightweight model). I know the model, and it's not as good as Nano Banana Pro. Use what you prefer.

cubefox 4 hours ago

> I have generated my own using your prompt and posted it in the previous comment.

Yes, and it has a very unrealistic AI look to it. That was my point.

> You haven't posted a z-image one of Atlantis.

Yes, I don't doubt that it might well be just as unrealistic or even worse. I also just tried the Atlantis prompts in Grok (no idea what image model they use internally) and they look somewhat more realistic, though not on cafe level.