Remix clone Hacker News

	lancebeet 2 hours ago
		If benchmarks are fishy, it seems the bias would be toward better-than-expected scores for proprietary models, since their vendors have more incentive to game the benchmarks.