Generate an SVG of a pelican riding a bicycle: https://codepen.io/chdskndyq11546/pen/yyaWGJx

Generate an SVG of a dragon eating a hotdog while driving a car: https://codepen.io/chdskndyq11546/pen/xbENmgK

Far from perfect, but it really shows how powerful these models can get

tln 2 hours ago | parent | next [-]

The dragon image has issues like one eye, weird tail etc, but the pelican is imo perfect -- the best I've seen!

▲

vunderba 35 minutes ago | parent [-]

Yeah the dragon one is just a complete mess. The car is sideways but the WHEEL is oriented in a first-person perspective. Seems like a case of overfitting with regard to the thousands of pelican bike SVG samples on the internet already.

▲

yrds96 2 hours ago | parent | prev [-]

I wonder if this has become such a well-known "benchmark" that models are already trained on it.

▲

HotHotLava an hour ago | parent | next [-]

Given that the pelican looks way better than the dragon, it almost seems like a certainty.

▲

sietsietnoac an hour ago | parent | prev | next [-]

Given the likeness of the sky between the 2 examples, the overall similarities and the fact that the pelican is so well done, there is zero doubt that the benchmark is in the training data of these models by now.

That doesn't make it any less of an achievement given the model size or the time it took to get the results

If anything, it shows there's still much to discover in this field and things to improve upon, which is really interesting to watch unfold

▲

Marciplan 2 hours ago | parent | prev [-]

Every model release, Simon comes with his pelican and then this comment follows.

Can we stop both? It's so boring.

▲

refulgentis 43 minutes ago | parent [-]

I really appreciate you speaking up. Happened yesterday on GPT Image 2, bit my tongue b/c people would see it as fun policing, and same thing today. And it happens on every. single. LLM. release. thread.

It's disruptive to the commons, doesn't add anything to knowledge of a model at this point, and it's way out of hand when people are not only engaging with the original and creating screenfuls to wade through before on-topic content, but now people are creating the thread before it exists to pattern-match on the engagement they see for the real thing. So now we have 2x.

▲

jszymborski 34 minutes ago | parent [-]

No more disruptive than this comment. If you don't like it, downvote and move on. It's on topic and doesn't contradict the rules. The reason you see Simon's comment on the top is because people like it and upvote it.

▲

refulgentis 30 minutes ago | parent [-]

Our comments are no more disruptive, so we shouldn't write them. The other comments are at most as disruptive & fine. Something seems off when I combine those premises. You also make a key observation here: the root comment is fine and on-topic. The replies spin off into something that has nothing to do with the headline, only with the example in the comment. Makes it really hard to critique without coming across as fun police. Also, worth noting there's a distinction here, we're not in simonw's thread: we're in a brand new account's imitation of it.