Remix clone Hacker News

	▲	smithclay 7 hours ago
		We need more rigorous benchmarks for SRE tasks, which is much easier said than done. The only other benchmark I've come across is https://sreben.ch/ ... certainly there must be others by now?
	▲	nyellin 4 hours ago
		We publish the benchmarks for HolmesGPT (CNCF sandbox project) at https://holmesgpt.dev/development/evaluations/