Remix clone Hacker News

new | show | ask | jobs Github

	▲	mmaunder 5 hours ago
		We're calling agents harnesses now?
	▲	fritzo 4 hours ago \| parent \| next [-]
		ELI5 what is a harness? EDIT from https://arcprize.org/media/ARC_AGI_3_Technical_Report.pdf: > We seek to fight two forms of overfitting that would muddy public sensefinding: > Task-specific overfitting. This includes any agent that is created with knowledge of public ARC-AGI-3 environments, subsequently being evaluated on the same environments. It could be either directly trained on these environments, or using a harness that is handcrafted or specifically configured by someone with knowledge of the public environments.
	▲	boxed 3 hours ago \| parent \| prev \| next [-]
		The point of this test is to check if an AI system can figure out the game. This isn't what happened here. A human figured out the game, wrote in their prompts exactly how the game works and THEN put the AI on the problem. This is 100% cheating and imo quite stupid.
	▲	lwansbrough 4 hours ago \| parent \| prev [-]
		I think generally people regard a harness as the system instructions + tools made available to the LLM (and probably the thing that runs the LLM conversation in a loop.) An agent is collectively, the LLM plus the harness.