culi 7 hours ago

It's also one of the few models that seem capable of drawing an SVG clock

https://clocks.brianmoore.com/

SwellJoe 7 hours ago

Interesting that the best performers are all Chinese-made models (DeepSeek and Qwen also perform consistently well). I wonder if there's more focus on vision and illustration in their training, or if something else is leading to their clear lead on this one test.

sigmoid10 7 hours ago

Is it? In your link it definitely failed to draw the clock.

squarefoot 6 hours ago

It redraws it every minute, and some models give quite different results although the prompt is exactly the same.

quesera 4 hours ago

This reads like satire, but I've been feeling that a lot lately.

dryarzeg 7 hours ago

I'm not really sure how this works, but I stayed on the page for a while, and then it reloaded and all the clocks changed. I guess there's either a collection of different clocks generated by the models, or maybe they're somehow generated in real time, but the fact is that what you see is not necessarily what I see.

culi 3 hours ago

It reruns the same prompt every minute against all the included models. Everyone is gonna see something different, but I've spent too long on it and there's a consistent pattern of Qwen and Kimi outperforming the others.

This site was made months ago, and it seems it's only been updated with the latest model from a couple of the providers, so keep in mind that many of the Chinese models haven't been updated.

sigmoid10 6 hours ago

Seems like it regenerates them to reflect the current time. Funny to see how some models (like Kimi and DeepSeek) sometimes get it right and other times fail miserably, on the level of ancient models like GPT-3.5.

gunalx 6 hours ago

It reruns the prompt every minute.
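
For anyone wondering what "getting it right" takes, here is a rough sketch of the geometry a correct analog clock needs, assuming the task is to draw a clock face showing the current time. This is not the site's prompt or code; the function names (`hand_angles`, `hand_line`, `clock_svg`) and the 200x200 layout are made up for illustration.

```python
import math

def hand_angles(hour, minute, second=0):
    """Return (hour, minute, second) hand angles in degrees, clockwise from 12."""
    second_angle = second * 6.0                    # 360 / 60 seconds
    minute_angle = minute * 6.0 + second * 0.1     # minute hand creeps with seconds
    hour_angle = (hour % 12) * 30.0 + minute * 0.5 + second / 120.0
    return hour_angle, minute_angle, second_angle

def hand_line(angle_deg, length, width, cx=100, cy=100):
    """Build one SVG <line> for a hand; SVG's y axis points down, hence cy - cos."""
    rad = math.radians(angle_deg)
    x = cx + length * math.sin(rad)
    y = cy - length * math.cos(rad)
    return (f'<line x1="{cx}" y1="{cy}" x2="{x:.1f}" y2="{y:.1f}" '
            f'stroke="black" stroke-width="{width}"/>')

def clock_svg(hour, minute, second=0):
    """Return a complete SVG document for an analog clock at the given time."""
    h, m, s = hand_angles(hour, minute, second)
    parts = [
        '<svg xmlns="http://www.w3.org/2000/svg" viewBox="0 0 200 200">',
        '<circle cx="100" cy="100" r="95" fill="white" stroke="black" stroke-width="3"/>',
    ]
    # Twelve hour ticks, one every 30 degrees, drawn near the rim.
    for i in range(12):
        rad = math.radians(i * 30)
        x1, y1 = 100 + 85 * math.sin(rad), 100 - 85 * math.cos(rad)
        x2, y2 = 100 + 93 * math.sin(rad), 100 - 93 * math.cos(rad)
        parts.append(f'<line x1="{x1:.1f}" y1="{y1:.1f}" x2="{x2:.1f}" y2="{y2:.1f}" '
                     'stroke="black" stroke-width="2"/>')
    parts.append(hand_line(h, 50, 5))   # hour hand: short and thick
    parts.append(hand_line(m, 75, 3))   # minute hand: longer and thinner
    parts.append(hand_line(s, 85, 1))   # second hand: longest and thinnest
    parts.append('</svg>')
    return "\n".join(parts)

if __name__ == "__main__":
    from datetime import datetime
    now = datetime.now()
    print(clock_svg(now.hour, now.minute, now.second))
```

The easy things to get wrong are the flipped y axis and the hour hand, which has to move with the minutes instead of snapping to the hour.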