Taek 5 days ago

Yeah, I spent a ton of time yesterday comparing o3, 4.5, 5, 5 thinking, and 5 pro, and... 5 seems to underperform across the board? o3 is better than 5 thinking, o3 pro is better than 5 pro, 4.5 is better than 5, and overall 5 just seems underwhelming.

When I think back to the delta between 3 and 3.5, and the delta between 3.5 and 4, and the delta between 4 and 4.5... this makes it seem like the wall is real and OpenAI has topped out.