| ▲ | skybrian 3 hours ago | |
Sounds interesting, but I'm not quite getting the relevance for people writing code with an agent. Should I be doing evals? | ||
| ▲ | ssk42 35 minutes ago | |
Well, I mean, yes. I think people ought to be aware of how the harnesses compare for their stacks. But clean room applies to this RGR situation too. | ||