XCSme 7 hours ago

Funny how they didn't include Gemini 3.0 Pro in the bar chart comparison, considering that it seems to do the best in the table view.

jychang 7 hours ago | parent | next [-]

Also, funny how they included GPT-5.0 and 5.1 but not 5.2... I'm pretty sure they ran the benchmarks for 5.0, then 5.1 came out, so they ran the benchmarks for 5.1... and then 5.2 came out and they threw their hands up in the air and said "fuck it".

rynn 4 hours ago | parent | next [-]

gpt-5.2 codex isn't available in the API yet.

If you want to be picky, they could've compared it against gpt-5 pro, gpt-5.2, gpt-5.1, gpt-5.1-codex-max, or gpt-5.2 pro, all depending on when they ran benchmarks (unless, of course, they are simply copying OAI's marketing).

At some point it's enough to give OAI a fair shot and let OAI come out with their own PR, which they doubtlessly will.

XCSme 7 hours ago | parent | prev | next [-]

I didn't even notice that, I assumed it was the latest GPT version.

amelius 5 hours ago | parent | prev [-]

after or before running the benchmarks?

7 hours ago | parent | prev | next [-]
[deleted]
guluarte 6 hours ago | parent | prev [-]

Gemini is garbage and does its own thing most of the time, ignoring the instructions.