stared 2 hours ago
To be honest, it surprised us too. I used GPT 5.2 Codex in Cursor to decompile an old game and it worked (way better than Claude Code with Opus 4.5). We tested Opus 4.6, but we are waiting for the public API to test GPT 5.3 Codex. At the same time, tasks differ, and not everything that works best end-to-end is the same as what works well in a typical, interactive workflow. We used the Terminus 2 agent, as it is the default used by Harbor (https://harborframework.com/), because we wanted to be unbiased. Other frameworks would very likely change the results.