simonw 4 hours ago

How hard have you tried?

I've been finding that the Opus 4.5/4.6 and GPT-5.2/5.3 models really have represented a step-change in how good they are at running long tasks.

I can one-shot prompt all sorts of useful coding challenges now that previously I would have expected to need multiple follow-ups to fix mistakes the agents made.

I got all of this from a single prompt, for example: https://github.com/simonw/research/tree/main/cysqlite-wasm-w... - including this demo page: https://simonw.github.io/research/cysqlite-wasm-wheel/demo.h... - using this single prompt: https://github.com/simonw/research/pull/79

aeyes 4 hours ago

What do you mean? The generated script just downloads the sources and runs pyodide: https://github.com/simonw/research/blob/main/cysqlite-wasm-w...

There are maybe 5 relevant lines in the script, and nothing complex at all that would require it to run for days.

simonw 3 hours ago

No, not for days - but it churned away on that one for about ten minutes.

I don't think I've got any examples of multi-hour or multi-day sessions that ran completely uninterrupted - this one back in December took 4.5 hours but I had to prompt it to keep going a few times along the way: https://simonwillison.net/2025/Dec/15/porting-justhtml/

andai 3 hours ago

Maybe so, but I did once spend 12 hours straight debugging an Emscripten C++ compiler bug! (After spending the first day of the jam setting up Emscripten, and the second day getting Raylib to compile in it. Had like an hour left to make the actual game, hahah.)

I am a bit thick with such things, but just wanted to provide the context that Emscripten can be a fickle beast :)

I sure am glad I can now deploy Infinite Mechanized Autistic Persistence to such soul-crushing tasks, and go make a sandwich or something.

(The bug turned out to be that if I included a boolean in a class member, the whole game crashed, but only the Emscripten version. Sad. Ended up switching back to JS, which you basically need anyway for most serious web game dev.)

basilgohar 4 hours ago

Can you share any examples of these one-shot prompts? I've not gotten to the point where I can get those kinds of results yet.

simonw 3 hours ago

If you look through the commit logs on simonw/research and simonw/tools on GitHub, most commits should either list the prompt, link to a PR with the prompt, or link to a session transcript.