cyberclimb 21 hours ago

Note that these results are specific to GPT-4o, so it's unclear how much they generalize.

They note at the end that they're also testing "GPT o3, and Claude", but no empirical results are included.