Remix clone Hacker News

	▲	wyre 3 hours ago
		ARC-AGI is testing raw intelligence, like the raw power of a Formula 1 engine. The rest of the car is the harness.
	▲	gchamonlive 2 hours ago
		Maybe there is a complex relationship between harness, model, and the emergent perceived intelligence that we just can't access by isolating the model alone to evaluate "raw intelligence". I don't think it's absurd to imagine a model that wouldn't be that impressive by itself but would outperform other models given the right harness. It's also not absurd to think of a model that has incredible raw intelligence but would not scale much with different harnesses. Model performance in different scenarios depends a LOT on the dataset and training strategies, so we need to account for these complex relationships; otherwise measuring "raw intelligence" would just be the next AI benchmark that is purely for show.