Remix clone Hacker News


	CephalopodMD 3 hours ago
		What I'm getting from this thread is that people have their own private benchmarks. It's almost a cottage industry. Maybe someone should crowdsource those benchmarks, keep them completely secret, and create a new public benchmark out of people's private AGI tests. All they should release for a given model is the final average score.
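
A minimal sketch of the aggregation step described above, assuming each benchmark owner privately reports one score per model on a shared 0-100 scale. The function name, owner labels, and numbers are hypothetical and only illustrate the idea; the comment does not specify how scores would be collected or normalized.

```python
from statistics import mean


def public_score(private_scores: dict[str, float]) -> float:
    """Collapse privately submitted scores into the single published number."""
    if not private_scores:
        raise ValueError("need at least one private benchmark score")
    # Only the average leaves this function; per-owner scores stay secret.
    return round(mean(private_scores.values()), 1)


# Hypothetical scores for one model, one entry per private benchmark owner.
submitted = {"owner_a": 62.0, "owner_b": 48.5, "owner_c": 71.0}
published = public_score(submitted)
print(published)
```

Publishing only the average keeps the individual tests and their results hidden, at the cost of not showing which kinds of tasks a model passes or fails.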