▲ bgirard 4 hours ago

> Using the develop web game skill and preselected, generic follow-up prompts like "fix the bug" or "improve the game", GPT‑5.3-Codex iterated on the games autonomously over millions of tokens.

I wish they would share the full conversation, token counts, and more. I'd like a better sense of how they normalize these comparisons across versions. Is this a 3-prompt, 10M-token game? A 30-prompt, 100M-token game? Are both models using similar prompts and token counts? I vibe coded a small Factorio web clone [1] that got pretty far using the models from last summer. I'd love to compare it against this.
  ▲ veb 4 hours ago | parent

  I just wanted to say that's a pretty cool demo! I hadn't realised people were using it for things like this.