Remix clone Hacker News

	▲	sheepscreek 3 hours ago
		It seems their tests rely on Claude alone. It’s not safe to assume that Codex or Gemini will behave the same way as Claude. I use all three and each has its own idiosyncrasies.
	▲	verdverm 2 hours ago
		I've done very similar things with my custom agent that uses Gemini and have gotten very similar results. I'm working on the evals to back that claim up.