falcor84 7 hours ago

I for one think that harness development is perhaps the most interesting part at the moment and would love to have an alternative leaderboard with harnesses.

sanxiyn 7 hours ago

There is one. The official leaderboard is without harnesses, and the community leaderboard is with harnesses. See the ARC-AGI-3 Technical Paper for details.

falcor84 7 hours ago

I went through the technical paper again, and while they explain why they decided against the harness, I disagree with them - my take is that if harnesses are overfitting, then they should be penalized on the hidden test set.

Anyway, after searching ARC-AGI's paper and website, and Kaggle directly, I failed to find a with-harness leaderboard; can you please give the link?

sanxiyn 7 hours ago

Here it is: https://arcprize.org/leaderboard/community

steve_adams_86 6 hours ago

I'm so into harness development right now. Once it clicked that harnesses can bring more safety and determinism to LLMs, I started to wonder where I'd need that and why (vs MCP or just throwing Claude Code at everything), and my brain gears have been turning endlessly since then. I'd love to see more of what people do with them. My use cases are admittedly lame and boring, but it's such a fun paradigm to think and develop around.

j_bum 4 hours ago

Could you point me to some resources to learn about harnesses? I’d love to hear an example of a use case you’re thinking of.