zozbot234 11 hours ago

> These models are dumber and slower than API SoTA models and will always be.

Sure, but the per-token costs on the SoTA models are roughly an order of magnitude higher than third-party inference on the locally available models. Once you account for per-token cost, the math skews the other way.
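To make the arithmetic concrete, here's a minimal sketch with entirely hypothetical per-million-token prices (the ~10x gap is the comment's claim; the dollar figures and token volume are made up for illustration):

```python
# Hypothetical prices, $ per 1M tokens -- not real quotes from any provider.
sota_price_per_mtok = 10.00   # assumed SoTA API price
local_price_per_mtok = 1.00   # assumed third-party inference price (~10x cheaper)

tokens = 50_000_000  # assumed monthly usage: 50M tokens

sota_cost = tokens / 1_000_000 * sota_price_per_mtok
local_cost = tokens / 1_000_000 * local_price_per_mtok

print(f"SoTA API cost:        ${sota_cost:,.2f}")   # $500.00
print(f"Local-model API cost: ${local_cost:,.2f}")  # $50.00

# Even if the weaker model needs extra tokens (retries, longer outputs),
# it stays cheaper until it burns ~10x the tokens per task.
breakeven_token_multiplier = sota_price_per_mtok / local_price_per_mtok
print(f"Breakeven multiplier: {breakeven_token_multiplier:.0f}x")  # 10x
```

So under these assumed numbers, the cheaper model wins on cost as long as it doesn't need roughly 10x as many tokens to get the job done.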