Remix clone Hacker News

	▲	arvindh-manian 6 days ago
		Interesting link. Worth noting that the pull requests were judged by o3-mini. Further, I'm not sure that 55% vs 45% is a huge difference.
	▲	marsh_mellow 6 days ago
		Good point. They said they validated the results by testing with other models (including Claude), as well as with manual sanity checks. 55% to 45% definitely isn't a blowout, but it is meaningful: in terms of Elo it equates to roughly a 35-point difference. So not in a different league, but definitely a clear edge.
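For anyone who wants to check the Elo conversion, here is a minimal sketch. It assumes the 55% figure can be read as a head-to-head win rate with no ties, and it uses the standard logistic Elo model; the function names are just illustrative. Under those assumptions the gap comes out at about 35 points, in the same ballpark as the figure quoted above.

```python
import math


def elo_gap_from_win_rate(win_rate: float) -> float:
    """Elo rating gap implied by an expected score, per the standard logistic Elo model."""
    if not 0.0 < win_rate < 1.0:
        raise ValueError("win_rate must be strictly between 0 and 1")
    # Expected score E = 1 / (1 + 10^(-d/400))  =>  d = 400 * log10(E / (1 - E))
    return 400.0 * math.log10(win_rate / (1.0 - win_rate))


def win_rate_from_elo_gap(gap: float) -> float:
    """Inverse mapping: expected score of the higher-rated side for a given Elo gap."""
    return 1.0 / (1.0 + 10.0 ** (-gap / 400.0))


if __name__ == "__main__":
    gap = elo_gap_from_win_rate(0.55)  # assumption: 55% is a pairwise win rate
    print(f"55% vs 45% -> {gap:.1f} Elo points")
    print(f"round trip -> {win_rate_from_elo_gap(gap):.2%}")
```

A gap this size means the stronger side is still expected to lose roughly 45 of every 100 comparisons, which fits the "clear edge, not a different league" reading.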
	▲	servercobra 5 days ago
		Maybe not as much to us, but for people building these tools, 4.1 being significantly cheaper than Claude 3.7 is a huge difference.
	▲	elAhmo 6 days ago
		I first read it as 55% better, which sounds significantly higher than the ~22% they report here. Sounds misleading.