| ▲ | encroach 6 days ago |
This outperforms Gemini 3 Pro Image (Nano Banana Pro) on Text-to-Image Arena and Image Edit Arena. I'm surprised they didn't mention this leaderboard in the blog post. I like this benchmark because it's based on user votes, so overfitting is not as easy (after all, if users prefer your result, you've won).
| ▲ | ygouzerh 5 days ago |
The scores are really, really close, which might be why.
| ▲ | nycdatasci 6 days ago |
The arena concept doesn’t work for image models due to watermarks.