senko 4 hours ago
My fav coding benchmark for frontier models is to build a simple RTS game in one file (js/html/css). Claude Code with Opus 4.8 in ultracode mode nailed it, the best result so far: https://bsky.app/profile/senko.net/post/3mmwnrkwboc2v

The prompt was:

Create a simple but functional real time strategy (RTS) game similar to old WarCraft, StarCraft or Command & Conquer games. The player should be able to build buildings, create units, gather resources and should uncover the whole map. No AI or multiplayer needed. Use simple but nice-looking graphics. No sound. Implement everything in HTML/CSS/JS, everything in a single file (you can use 3rd-party js or css libraries/frameworks via CDN).

H3X_K1TT3N 2 minutes ago
Thanks for also sharing the prompt. I've been testing Claude by asking it to make similar things, so it's useful to see what other people are doing. I do find it interesting that the visual style is pretty similar to things it's produced for me.

apitman 3 hours ago
I like that benchmark. You should throw the results up on GitHub Pages so people can try out the games.

jclay 3 hours ago
It almost appears as if the code was minified. The variable names are short and the formatting looks like it was written to minimize whitespace. Did it write it in this compact format all on its own?

digdugdirk 2 hours ago
Do you have a collection of these benchmark apps saved anywhere? I'd be particularly interested in seeing the relative cost differences between different models in a use case like this.

elAhmo 3 hours ago
What is ultracode mode?

jryan49 2 hours ago
Kinda buggy, but impressive nonetheless. How long did it take?

l3x4ur1n 3 hours ago
Played it to the end. Pretty neat!