nvanlandschoot a day ago

Method: I used OpenAI’s published SWE-Bench Pro chart points and matched GPT-5.3-Codex-Spark to the baseline model by reasoning effort at comparable accuracy levels. At similar accuracy, the effective speedup is about 1.37×, not 15×.
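
For anyone who wants to redo this kind of comparison, here is a rough sketch of accuracy-matched speedup in Python. It is one possible way to do the matching, not necessarily the exact procedure above. The data points, the 2-point accuracy tolerance, and the function name `matched_speedups` are all made up for illustration; swap in the real chart values to get a meaningful number.

```python
from statistics import geometric_mean

# Hypothetical placeholder points: (reasoning effort, accuracy %, seconds per task).
# These are NOT OpenAI's published numbers; replace them with the chart values.
baseline = [("low", 40.0, 100.0), ("medium", 50.0, 200.0), ("high", 55.0, 400.0)]
spark = [("low", 35.0, 20.0), ("medium", 41.0, 80.0), ("high", 49.0, 150.0)]


def matched_speedups(baseline, candidate, tolerance=2.0):
    """Pair each candidate point with the baseline point of nearest accuracy.

    Pairs whose accuracy gap exceeds `tolerance` (an assumed cutoff, in
    percentage points) are dropped, since they are not a like-for-like match.
    Returns (candidate effort, baseline effort, baseline time / candidate time).
    """
    pairs = []
    for c_effort, c_acc, c_time in candidate:
        b_effort, b_acc, b_time = min(baseline, key=lambda p: abs(p[1] - c_acc))
        if abs(b_acc - c_acc) <= tolerance:
            pairs.append((c_effort, b_effort, b_time / c_time))
    return pairs


pairs = matched_speedups(baseline, spark)
# Geometric mean is the usual way to average ratios such as speedups.
effective = geometric_mean([speedup for _, _, speedup in pairs])

for c_effort, b_effort, speedup in pairs:
    print(f"candidate {c_effort} vs baseline {b_effort}: {speedup:.2f}x")
print(f"effective speedup at matched accuracy: {effective:.2f}x")
```

The point of matching on accuracy first is that a faster model at a lower accuracy is not the same product; comparing times only between points that reach roughly the same accuracy is what shrinks a headline multiplier to a much smaller effective one.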