Remix clone Hacker News

	▲	Banditoz 6 hours ago
		If the benchmarks are private, how do we reproduce the results? I looked up Humanity's Last Exam (https://agi.safe.ai/), which this model uses, and I can't seem to access it.
	▲	johndough 5 hours ago
		You can request access here: https://huggingface.co/datasets/cais/hle. The test data is purposely difficult to access to reduce the chance of leaking it into the training dataset.
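		Once access is approved, loading it could look roughly like the sketch below. It assumes the `datasets` library is installed, that an access token for the approved account is in the `HF_TOKEN` environment variable, and that the split is named `test` (not confirmed in this thread). `load_hle` is a hypothetical helper name, not part of any library.

```python
import os


def load_hle(split="test", token=None):
    """Load the gated cais/hle dataset (hypothetical helper).

    Assumes the account behind the token has already been granted access.
    """
    # Fall back to the HF_TOKEN environment variable if no token is passed.
    token = token or os.environ.get("HF_TOKEN")
    if not token:
        raise RuntimeError(
            "No token found: set HF_TOKEN to a token from an account "
            "that has been granted access to cais/hle"
        )

    # Imported lazily so the missing-token check works without `datasets`.
    from datasets import load_dataset

    # The split name "test" is an assumption; check the dataset card.
    return load_dataset("cais/hle", split=split, token=token)


if __name__ == "__main__":
    dataset = load_hle()
    print(dataset)
```

		Keeping the token in an environment variable rather than in the script also avoids accidentally publishing it alongside any reproduction code.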