rtp4me 11 hours ago

I never trust the opinion of a single LLM anymore - especially for more complex projects. I have seen Claude guarantee something is correct and then immediately apologize when I feed it a critical review from Codex or Gemini. And, many times, the issues are not minor but significant, critical oversights by Claude.

My habit now: always get a 2nd or 3rd opinion before assuming one LLM is correct.

kaydub 8 hours ago | parent | next [-]

Happy to see someone else doing this.

All code written by an LLM is reviewed by an additional LLM. Then I verify that review and get one of the agents to iterate on everything.
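A minimal sketch of that write-then-review loop, not anyone's actual setup. `Model` is a hypothetical stand-in for whatever client you wrap (prompt in, text out), and the "LGTM" convention for a clean review is an assumption made up for this example. The manual "verify that review" step is left out; it would sit between the review and the revision.

```python
from typing import Callable

# Hypothetical interface: any function that takes a prompt and returns text.
Model = Callable[[str], str]


def write_and_review(task: str, coder: Model, reviewer: Model,
                     max_rounds: int = 3) -> str:
    """Have one model write code and a second model review it, iterating
    until the reviewer approves or max_rounds is reached."""
    code = coder(task)
    for _ in range(max_rounds):
        review = reviewer(f"Review this code for bugs:\n{code}")
        # Assumed convention: the reviewer answers "LGTM" when it finds nothing.
        if review.strip().upper().startswith("LGTM"):
            break
        code = coder(
            f"Task: {task}\n"
            f"Revise the code to address this review:\n{review}\n\n"
            f"Code:\n{code}"
        )
    return code
```

Capping the rounds matters because two models can otherwise keep trading nitpicks indefinitely.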

rtp4me 8 hours ago | parent [-]

Agreed. From my experience, Claude is the top-level coder, Gemini is the architect, and Codex is really good at finding bugs and logic errors. In fact, Codex seems to perform better deep analysis than the other two.

kaydub 7 hours ago | parent [-]

I just round-robin them until I hit the limit on whatever subscription level I'm on. I only use the Claude API, so I pay per token there... I consider using Claude as "bringing out the big guns" because I also think it's the top-level coder.
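Roughly what that rotation could look like in code, as a sketch only. `QuotaExhausted` is a hypothetical exception invented here; real clients signal rate or quota limits in their own ways, so the wrapper functions would need to translate those.

```python
from collections import deque
from typing import Callable


class QuotaExhausted(Exception):
    """Hypothetical signal that a provider's subscription limit was hit."""


def round_robin(models: dict[str, Callable[[str], str]],
                prompts: list[str]) -> list[tuple[str, str]]:
    """Send each prompt to the next provider in rotation, dropping any
    provider that reports it is out of quota."""
    queue = deque(models.items())
    results = []
    for prompt in prompts:
        while queue:
            name, call = queue[0]
            try:
                reply = call(prompt)
            except QuotaExhausted:
                queue.popleft()  # this provider is done; try the next one
                continue
            queue.rotate(-1)  # the next prompt goes to the next provider
            results.append((name, reply))
            break
        else:
            # The loop ended without a break: no providers are left.
            raise RuntimeError("every provider is out of quota")
    return results
```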

ozten 10 hours ago | parent | prev [-]

It doesn’t have to be different foundation models. As long as the temperature is up, ask the same model 100 times.
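A sketch of that repeated-sampling idea using a simple majority vote. `ask` is a hypothetical wrapper around a single model called with a nonzero temperature; exact string matching is an assumption that only works for short, canonical answers, and longer outputs would need some normalization or a judge step.

```python
from collections import Counter
from typing import Callable


def majority_answer(ask: Callable[[str], str], prompt: str,
                    n: int = 100) -> tuple[str, float]:
    """Ask the same model n times and return the most common answer
    together with the fraction of samples that agreed with it."""
    if n <= 0:
        raise ValueError("n must be positive")
    votes = Counter(ask(prompt).strip() for _ in range(n))
    answer, count = votes.most_common(1)[0]
    return answer, count / n
```

A low agreement fraction is itself useful: it flags prompts where the model is unsure and a second opinion is worth getting.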