If Pro is the same model (hard to tell, and I'm not sure), it gets a token budget for thinking (test-time scaling) that is huge compared to what the Codex endpoint gets.