Remix clone Hacker News

	amelius 2 hours ago
		There should be a way to turn the questions we ask LLMs into benchmarks. That way, we can have a benchmark that is always up to date.