Remix clone Hacker News


	▲	simianwords 6 hours ago
		It might be a good idea to be more explicit about this; a cost analysis benchmark would be a nice accompaniment. This kind of thing keeps popping up each time a new model is released, and I don't think people are aware that token efficiency can change.
	▲	tedsanders 6 hours ago
		Agreed. It would be great if everyone started reporting cost per task alongside eval scores, especially in a world where you can spend arbitrary test-time compute. This is one thing I like about the Artificial Analysis website: they include cost to run alongside their eval scores: https://artificialanalysis.ai/
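
To make the cost-per-task point concrete, here is a minimal sketch of the arithmetic. Everything in it is an assumption for illustration: the `RunStats` type, the `cost_per_task` helper, the token counts, and the prices are hypothetical and do not describe any real model or the method of any benchmark site.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    """Aggregate usage for one model on one benchmark (hypothetical numbers)."""
    tasks: int
    input_tokens: int
    output_tokens: int
    usd_per_m_input: float   # assumed price per million input tokens
    usd_per_m_output: float  # assumed price per million output tokens


def cost_per_task(s: RunStats) -> float:
    """Total spend in USD divided by the number of tasks attempted."""
    total_usd = (
        s.input_tokens * s.usd_per_m_input
        + s.output_tokens * s.usd_per_m_output
    ) / 1_000_000
    return total_usd / s.tasks


# Hypothetical "old" model: pricier per token, fewer output tokens.
old = RunStats(tasks=100, input_tokens=2_000_000, output_tokens=1_000_000,
               usd_per_m_input=2.0, usd_per_m_output=8.0)

# Hypothetical "new" model: half the per-token price, 3x the output tokens.
new = RunStats(tasks=100, input_tokens=2_000_000, output_tokens=3_000_000,
               usd_per_m_input=1.0, usd_per_m_output=4.0)

print(f"old: ${cost_per_task(old):.2f} per task")
print(f"new: ${cost_per_task(new):.2f} per task")
```

With these made-up numbers the model that is cheaper per token ends up more expensive per task, which is why a per-task figure reported next to the eval score says more than the price list alone.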