CWuestefeld 3 hours ago

What they've chosen as examples to illustrate the strength of the new model surprises me.

The "cubism" example seems like a closer fit to something like stained glass. I don't think the thing really understands what cubism was all about. Cubist painters were trying to free themselves from the confines of a single integral plane of perspective by allowing themselves to show various parts of the image from different viewpoints, different times, different styles, etc.

The division of the image into geometric shapes is just a by-product of that quest, whereas the examples here have made it the sum total of the whole piece.

This feels to me like an example of how LLMs still don't "understand" what the art means, and are just aping its facade.

kevinsync 3 hours ago

I had a similar thought at first, but I'm pretty sure what they were demonstrating wasn't art style but adherence to the correct physical dimensions and construction of the buildings referenced, which was then expressed in an art style (or a reasonable facsimile thereof). The "before" prompts would just conjure a random building out of thin air; the "after" prompts searched the web for reference material and then used that in image generation.

And actually, the link I saw a bit ago was this [0], which is more in-depth and has a lot more examples and prompts.

[0] - https://deepmind.google/models/gemini-image/flash/