minimaxir 4 hours ago
So during my Nano Banana Pro experiments I wrote a very fun prompt that tests the ability of these image generation models to follow heuristics, while still requiring domain knowledge and/or use of the search tool:
The NBP result is here; it got the numbers, corresponding Pokemon, and styles correct, with the main points of contention being that the style application is lazy and that the images may be plagiarized: https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:oxaerni...

Running that same prompt through gpt-2-image high gave an...interesting contrast: https://cdn.bsky.app/img/feed_fullsize/plain/did:plc:oxaerni...

It used more inventive styles for images that appear to be original, but:

- The style logic is applied by row, not by raw number, and is therefore wrong
- Several of the Pokemon are flat-out wrong
- The number font is wrong
- The bottom isn't square for some reason

Odd results.
dvt 2 hours ago
This is an amazing test and it's kinda funny how terrible gpt-2-image is. I'd take "plagiarized" images (e.g. Google search & copy-paste) any day over how awful the OpenAI result is. It doesn't even seem like they have a sanity checker or post-processing "did I follow the instructions correctly?" step, because the digit-style constraint violation should be easy to catch. It's also expensive as shit just to get an image that's essentially unusable.
rrr_oh_man 2 hours ago
Why would you consider this a good prompt?