Remix clone Hacker News

	▲	XCSme 7 months ago
		It's weird that Opus4 is the worst at one-shot; on average it requires two attempts to generate a valid query. If a model is really that much smarter, shouldn't that lead to better first-attempt performance? It still "thinks" beforehand, right?
	▲	riwsky 7 months ago
		Don’t talk to Opus before it’s had its coffee. Classic high-performer failure mode.