Remix clone Hacker News


	▲	embedding-shape 3 days ago
		Just remember to benchmark it yourself first with your private task collection, so you can actually measure the models against each other. Pretty much any public benchmark is unreliable at this moment, and making model choices based on others' benchmarks is bound to leave you disappointed.
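A minimal sketch of what such a private comparison could look like, assuming each model is wrapped in a plain function that takes a prompt and returns a string. The task list, the exact-match scoring, and the `model_a` / `model_b` stand-ins are all hypothetical placeholders, not any real provider's API; in practice you would swap in your own tasks and client calls.

```python
import time
from typing import Callable, Dict, List, Tuple

# Hypothetical private task collection: (prompt, expected answer) pairs.
TASKS: List[Tuple[str, str]] = [
    ("2 + 2", "4"),
    ("capital of France", "Paris"),
    ("reverse 'abc'", "cba"),
]


def evaluate(model: Callable[[str], str], tasks: List[Tuple[str, str]]) -> Dict[str, float]:
    """Run every task through one model; report exact-match accuracy and wall time."""
    correct = 0
    start = time.perf_counter()
    for prompt, expected in tasks:
        if model(prompt).strip() == expected:
            correct += 1
    elapsed = time.perf_counter() - start
    return {"accuracy": correct / len(tasks), "seconds": elapsed}


# Stand-in "models" (assumptions for illustration only).
def model_a(prompt: str) -> str:
    answers = {"2 + 2": "4", "capital of France": "Paris", "reverse 'abc'": "cba"}
    return answers.get(prompt, "")


def model_b(prompt: str) -> str:
    answers = {"2 + 2": "4"}
    return answers.get(prompt, "")


# Score every candidate on the same tasks so the numbers are comparable.
candidates = {"model_a": model_a, "model_b": model_b}
results = {name: evaluate(fn, TASKS) for name, fn in candidates.items()}

for name, score in results.items():
    print(f"{name}: accuracy={score['accuracy']:.2f} time={score['seconds']:.4f}s")
```

Exact-match scoring is the simplest possible choice here; for open-ended tasks you would likely need a task-specific checker instead.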
	▲	MaxikCZ 3 days ago
		This. The latest benchmarks of DSv3.2spe hinted at it beating basically everything, yet in my testing even Sonnet is miles ahead in terms of both speed and accuracy.