Remix clone Hacker News

	▲	bsamuels a year ago
		As soon as you publish a benchmark like this, it becomes worthless, because it can be included in the training corpus.
	▲	rbjorklin a year ago
		While I agree with you in principle, give Claude 4 a try on something like https://open.kattis.com/problems/low . I would expect this problem, as well as solutions found on GitHub, to have been included in the training material. I've tried providing the problem description and asking Claude Sonnet 4 to solve it, and so far it hasn't been successful.