retinaros 7 days ago
Did you see the generated pic Demis posted on X? It looks like slop from 2 years ago. https://x.com/demishassabis/status/1960355658059891018
raincole 7 days ago
I've tested it on Google AI Studio since it's available to me (I've only had it for a few hours, so take this with a grain of salt). The prompt comprehension is uncannily good. My test is to go to https://unsplash.com/s/photos/random, pick two random images, and send them both with "integrate the subject from the second image into the first image" as the prompt. I think Gemini 2.5 is doing far better than ChatGPT (admittedly ChatGPT was the trailblazer on this path). FluxKontext seems unable to do that at all. Not sure if I was using it wrong, but it always only considers one image at a time for me.

Edit: Honestly, it might not be the "GPT-4 moment." It's better at combining multiple images, but now I don't think it's better than ChatGPT at understanding elaborate text prompts.