This thing's ability to produce entire infographics from a short prompt is really impressive, especially since it can run extra Google searches first.

I tried this prompt:

  Infographic explaining how the Datasette open source project works

Here's the result: https://simonwillison.net/2025/Nov/20/nano-banana-pro/#creat...

▲ JLO64 4 hours ago | parent | next [-]

This is legitimately game changing for a feature in my SaaS where customers can generate event flyers. Up until now I had Nano Banana generate just a decorative border and had the actual text be rendered via Pillow controlled by an LLM. The result worked, but didn’t look good.

That said, I wonder if text is only good in small chunks (less than a sentence) or if it can properly render full sentences.

▲ skybrian 9 hours ago | parent | prev | next [-]

It didn’t do so well at finding middle C on a piano keyboard:

https://gemini.google.com/share/c9af8de05628

I did manage to get one image of a piano keyboard where the black keys were correct, but not consistently.

	▲	vunderba 8 hours ago | parent | next [-]
		I've tried similar stuff such as: "Show a piano with an outstretched hand playing an Emaj triad on the E, G#, and B keys". https://imgur.com/ogPnHcO Even generating a standard piano with 7 full octaves that are consistent is pretty hard. If you ask it to invert the colors of the naturals and sharps/flats you'll completely break them.
	▲	gowld 7 hours ago | parent | prev [-]
		Fooled me because it was locally correct!

▲ pseudosavant 8 hours ago | parent | prev | next [-]

It even worked really well at creating an infographic for one of my quirkier projects which doesn't have that much information online (other than its repo).

"An infographic explaining how player.html works (from the player.html project on Github). https://github.com/pseudosavant/player.html"

And then it made one formatted for social: "Change it to be an infographic formatted to fit on Instagram as a 1:1 square image."

▲ bn-l 10 hours ago | parent | prev | next [-]

Is the infographic accurate in terms of the way Datasette works?

▲ simonw 9 hours ago | parent | next [-]

Almost entirely. I called out the one discrepancy in my post:

> “Data Ingestion (Read-Only)” is a bit off.

▲ OtherShrezzing 10 hours ago | parent | prev | next [-]

It’s subtly incorrect. R/w permissions for example are described incorrectly on some nodes.

▲ mikepurvis 9 hours ago | parent [-]

Then the question becomes, can it incorporate targeted feedback, or is it a oneshot-or-bust affair?

My experience is that ChatGPT is very good at iterating on text (prose, code) but fairly bad at iterating on images. It struggles to integrate small changes, choosing instead to start over from scratch, with wildly different results. Thinking especially here of architectural stuff, where it does a great job laying out furniture in a room, but when I ask it to keep everything the same but change the colour of one piece, it goes completely off the rails.

▲ simonw 9 hours ago | parent | next [-]

Nano Banana is really good at iterating on images, as shown by the pancake skull example I borrowed from Max Woolf: https://simonwillison.net/2025/Nov/20/nano-banana-pro/#tryin...

I've tried iterating on slides with text on them a bit and it seems to be competent at that too.

▲ spike021 9 hours ago | parent | prev | next [-]

I would assume it depends on how it generates the images.

I've used Claude to generate fairly simple icons and launch images for an iOS game and I make sure to have it start with SVG files since those can be defined as code first. This way it's easier to iterate on specific elements of the image (certain shapes need to be moved to a different position, color needs to be changed, text needs an update, etc.).

FWIW not sure how Nano Banana Pro works though.
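
To illustrate why SVG is easier to iterate on: because every shape is a named element, a single attribute can be changed without regenerating anything else. The sketch below uses only the Python standard library; the icon markup, the element ids, and the `edit_svg` helper are made up for the example and are not from any real project.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep output free of ns0: prefixes

# Hypothetical icon, standing in for something a model generated.
ICON = """<svg xmlns="http://www.w3.org/2000/svg" width="64" height="64">
  <rect id="background" width="64" height="64" fill="#202040"/>
  <circle id="coin" cx="32" cy="32" r="12" fill="#ffcc00"/>
  <text id="label" x="32" y="60" text-anchor="middle">v1</text>
</svg>"""


def edit_svg(svg_text: str, element_id: str, text=None, **attrs) -> str:
    """Return a copy of the SVG with one element's attributes or text changed."""
    root = ET.fromstring(svg_text)
    for el in root.iter():
        if el.get("id") == element_id:
            for name, value in attrs.items():
                el.set(name, str(value))
            if text is not None:
                el.text = text
            break
    else:
        raise KeyError(f"no element with id {element_id!r}")
    return ET.tostring(root, encoding="unicode")


# Move and recolor one shape, then update the label; nothing else changes.
updated = edit_svg(ICON, "coin", cx=20, fill="#ff0000")
updated = edit_svg(updated, "label", text="v2")
```

The same targeted change on a raster image means regenerating pixels, which is where unrelated parts of the picture can drift.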

	▲	fzysingularity 7 hours ago | parent [-]
		Claude does image generation in surprising ways - we did a small evaluation [1] of different frontier models for image generation and understanding, and Claude is by far the most surprising in results. [1] https://chat.vlm.run/showdown [2] https://news.ycombinator.com/item?id=45996392

▲ vunderba 8 hours ago | parent | prev [-]

You can use targeted feedback - but it's on the user to verify whether the edits were completely localized. In my experience NB mostly tends to make relatively surgical edits but if you're not careful it'll introduce other minute changes.

At that point you can either start over or just feather/mask with the original in any Photoshop-type application.
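
The verify-then-mask workflow can also be scripted. Below is a rough Pillow sketch under some assumptions: both images have the same size, the intended edit fits in a rectangle you pick by hand, and the helper names `changed_region` and `keep_only_region` are invented here. The tiny synthetic images only stand in for a real original and a model-edited version.

```python
from PIL import Image, ImageChops, ImageDraw, ImageFilter


def changed_region(original: Image.Image, edited: Image.Image):
    """Bounding box (left, upper, right, lower) of differing pixels, or None."""
    diff = ImageChops.difference(original.convert("RGB"), edited.convert("RGB"))
    return diff.getbbox()


def keep_only_region(original, edited, box, feather=4):
    """Take `edited` inside `box` and `original` elsewhere, with a soft edge."""
    mask = Image.new("L", original.size, 0)
    ImageDraw.Draw(mask).rectangle(box, fill=255)
    mask = mask.filter(ImageFilter.GaussianBlur(feather))  # feather the edge
    return Image.composite(edited.convert("RGB"), original.convert("RGB"), mask)


# Synthetic stand-ins: a white original, and an "edited" copy containing the
# intended change (a red square) plus one stray pixel far away from it.
original = Image.new("RGB", (100, 100), "white")
edited = original.copy()
ImageDraw.Draw(edited).rectangle((10, 10, 29, 29), fill="red")
edited.putpixel((90, 90), (0, 0, 0))

# A box much larger than the intended edit suggests the change was not local.
box_before = changed_region(original, edited)

# Keep only the area around the intended edit; the stray pixel is discarded.
fixed = keep_only_region(original, edited, (5, 5, 35, 35), feather=2)
box_after = changed_region(original, fixed)
```

A plain pixel diff like this is sensitive to any re-encoding noise, so on real model output a small tolerance threshold would probably be needed before calling `getbbox()`.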

▲ gpmcadam 9 hours ago | parent | prev [-]

None of it was accurate.

But boy was it beautiful.

	▲	Kiro 6 hours ago | parent [-]
		Funny thing to say considering the author of Datasette himself says it's accurate.

▲ fudged71 9 hours ago | parent | prev | next [-]

I’ve been really excited for infographic generation. Previous models from Google and OpenAI had very low detail/resolution for these things.

I’ve found in general that the first generation may not be accurate but a few rolls of the dice and you should have enough to pick a style and format that works, which you can iterate on.

▲ nrhrjrjrjtntbt 5 hours ago | parent | prev | next [-]

Game changer for architecture diagrams.

	▲	energy123 an hour ago | parent [-]
		I'm finding it bad at instruction following for architectural specs (physical not software), where you tell it what goes where, and it ignores you and does some average-ish thing it's seen before. It looks visually appealing though.

▲ ndkap 8 hours ago | parent | prev | next [-]

Did you check if the SynthID works when you edit the photos with filters like GrayScale?
