I never trust the opinion of a single LLM anymore, especially for more complex projects. I have seen Claude guarantee something is correct and then immediately apologize when I feed it a critical review from Codex or Gemini. And many times the issues are not minor; they are significant, critical oversights by Claude.

My habit now: always get a 2nd or 3rd opinion before assuming one LLM is correct.

kaydub 8 hours ago | parent | next [-]

Happy to see someone else doing this.

All code written by an LLM is reviewed by an additional LLM. Then I verify that review and get one of the agents to iterate on everything.
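
A minimal sketch of that loop, assuming each model is wrapped in a plain Python callable. `coder`, `reviewer`, and `human_accepts` are hypothetical stand-ins for illustration (not any vendor's API), and the reviewer is assumed to return `None` when it has no objections:

```python
from typing import Callable, Optional


def write_review_iterate(
    task: str,
    coder: Callable[[str], str],                # hypothetical: prompt -> code
    reviewer: Callable[[str], Optional[str]],   # hypothetical: code -> review, or None if clean
    human_accepts: Callable[[str], bool],       # hypothetical: my own check of the review
    max_rounds: int = 3,
) -> str:
    """One model writes, a second reviews, a human vets the review, the first iterates."""
    code = coder(task)
    for _ in range(max_rounds):
        review = reviewer(code)
        # Stop when the reviewer is satisfied, or when the human decides
        # the review itself is wrong and not worth acting on.
        if review is None or not human_accepts(review):
            break
        code = coder(
            f"{task}\n\nCurrent code:\n{code}\n\nReview to address:\n{review}"
        )
    return code
```

The `max_rounds` cap is my own addition so two models that keep disagreeing cannot loop forever.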

rtp4me 8 hours ago | parent [-]

Agreed. In my experience, Claude is the top-level coder, Gemini is the architect, and Codex is really good at finding bugs and logic errors. In fact, Codex seems to do deeper analysis than the other two.

kaydub 7 hours ago | parent [-]

I just round-robin them until I run out on whatever subscription level I'm on. I only use the Claude API, so I pay per token there... I consider using Claude as "bringing out the big guns" because I also think it's the top-level coder.
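
Roughly what the rotation could look like in code. This is a sketch under assumptions: each model is a plain callable, and `QuotaExhausted` is a hypothetical exception that your own wrapper would raise when a subscription limit is hit (real clients signal this in different ways):

```python
from typing import Callable, Dict, Tuple


class QuotaExhausted(Exception):
    """Hypothetical: raised by a model wrapper when its subscription limit is hit."""


class RoundRobin:
    """Rotate through models, dropping any that run out of quota."""

    def __init__(self, models: Dict[str, Callable[[str], str]]):
        self._models = dict(models)   # name -> callable taking a prompt
        self._order = list(models)    # rotation order, shrinks as quotas run out
        self._next = 0

    def ask(self, prompt: str) -> Tuple[str, str]:
        """Return (model_name, reply) from the next model that still has quota."""
        while self._order:
            idx = self._next % len(self._order)
            name = self._order[idx]
            try:
                reply = self._models[name](prompt)
            except QuotaExhausted:
                self._order.pop(idx)  # out for the rest of the session
                self._next = idx      # the following model slid into this slot
                continue
            self._next = idx + 1
            return name, reply
        raise RuntimeError("every model is out of quota")
```

A pay-per-token model can simply be left out of the rotation and called directly when the "big guns" are needed.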

ozten 10 hours ago | parent | prev [-]

It doesn’t have to be different foundation models. As long as the temperature is up, ask the same model 100 times.
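
One way to use those repeated samples is to tally them and see how strongly the model agrees with itself. That combining step is my assumption, not something the approach requires. `ask` is a hypothetical callable wrapping a model call made with a non-zero temperature:

```python
from collections import Counter
from typing import Callable, Tuple


def majority_answer(
    ask: Callable[[str], str],  # hypothetical: prompt -> one sampled answer
    prompt: str,
    n: int = 100,
) -> Tuple[str, float]:
    """Ask the same model n times; return the most common answer and its share."""
    if n <= 0:
        raise ValueError("n must be positive")
    answers = [ask(prompt).strip() for _ in range(n)]
    winner, count = Counter(answers).most_common(1)[0]
    return winner, count / n
```

A low share for the winning answer is a hint that the model is unsure, which is about when a second opinion is worth getting. Exact string matching only works for short answers (yes/no, a number); free-form reviews would need some other way to compare.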