| ▲ | mmaunder 5 hours ago | |
We're calling agents harnesses now? | ||
| ▲ | fritzo 4 hours ago | parent | next [-] | |
ELI5 what is a harness? EDIT from https://arcprize.org/media/ARC_AGI_3_Technical_Report.pdf: > We seek to fight two forms of overfitting that would muddy public sensefinding: > Task-specific overfitting. This includes any agent that is created with knowledge of public ARC-AGI-3 environments, subsequently being evaluated on the same environments. It could be either directly trained on these environments, or using a harness that is handcrafted or specifically configured by someone with knowledge of the public environments. | ||
| ▲ | boxed 3 hours ago | parent | prev | next [-] | |
The point of this test is to check if an AI system can figure out the game. This isn't what happened here. A human figured out the game, wrote in their prompts exactly how the game works and THEN put the AI on the problem. This is 100% cheating and imo quite stupid. | ||
| ▲ | lwansbrough 4 hours ago | parent | prev [-] | |
I think generally people regard a harness as the system instructions + tools made available to the LLM (and probably the thing that runs the LLM conversation in a loop.) An agent is collectively, the LLM plus the harness. | ||