rriley 7 hours ago

The biggest gap in this paper is the condition they didn't test: Skills built through human-AI collaboration. They found fully self-generated Skills are useless (-1.3pp) and human-curated ones help a lot (+16.2pp), but that's a false dichotomy. In practice, especially in tools like OpenClaw, skills will emerge iteratively: the AI drafts procedural knowledge while solving a real problem, the human refines it with domain expertise. Neither produces the same artifact alone. The +16.2pp from curated Skills is likely the floor for this approach, not the ceiling. Would love to see a fourth condition.