SkyPuncher 2 hours ago
> What's your basis for thinking that codex is best for planning, but opus is best for implementing?

I for one work on an agentic product where we use all 3 of the major frontier models. The models absolutely have preferences and "personality" that lead to different characteristics. In my eyes:

* Gemini - consistently the best at pure reasoning and tunability. Flash models are particularly good at latency-sensitive, small-scale reasoning. The tradeoff is that they struggle with some basic behavior, like tool calling.

* Claude - consistently good at long-running sessions. Opus may or may not be the best model, but it was the first model that crossed the "holy shit" threshold. I understand its quirks/nuances and it's consistently solid. It's the best for me because I've learned how to be incredibly effective with it.

* ChatGPT - probably really good, but probably not worth switching from Claude. Last time I used their frontier model, it was a bit random. It would have moments of brilliance immediately followed by falling flat on its face.