Remix clone Hacker News

	▲	hhh 3 hours ago
		These things can shift simply because of infrastructure changes, rather than some mysterious A/B testing.
	▲	jumploops an hour ago
		I don't disagree; we've seen performance shift with capacity changes in the past. That said, I doubt OpenAI would choose to publish a single coding benchmark for a new model that exactly matches their previous model's score (88.8%).