cubefox 3 hours ago
While I don't doubt this was one influence, there was also an infamous problem with Dall-E 2, which was perfectly able to generate an astronaut riding a horse but completely unable to generate a horse riding an astronaut. The problem is infamous because, unlike other early problems such as drawing the wrong number of fingers, it persisted in much more capable models, and the Qwen Image people are certainly very aware of this difficult test. Even Imagen 4 Ultra, which might be the most advanced pure diffusion model without an editing loop, fails at it. And obviously an astronaut is similar to a man, which connects this benchmark to the Chinese meme.