Claude Code completes the first level of several ARC AGI 3 games (arc-agi-runs.web.app)
2 points by dextersjab 13 hours ago | 2 comments
dextersjab 13 hours ago
I put this together after a playgroup.org.uk session. This obviously isn't a valid prize submission, but I wanted to test what was possible with a SOTA harness and model (CC + Opus 4.7) before trying smaller models. It's great to see that the constraints introduced appear to have worked well. I'd welcome critiques, reports of any leakage that could still be hiding, and proposals for what a cleaner eval might look like.