| ▲ | gobdovan 2 hours ago |
I distilled the Meta-Harness workflow into a skill [0]. Unlike the original Islo POC, which demonstrates an automated runtime loop converging from trace-rich evaluations [1], my test only evaluates whether the distilled skill improves a lead agent's prompt-repair discipline and audit trail [2].

It took a few tries to figure out what to test in the first place, since it isn't obvious what the workflow should improve (the prompt? the guided agent's ability?). The only meaningful test I ended up with was to give easy tasks with a deliberately misleading/incomplete prompt, then check whether persisting deltas and observations between successive prompts meaningfully improves a meta-agent's ability to correct the imprecise prompt; that is what I mean by "prompt-repair discipline and audit trail" [2]. A rough sketch of the loop is below the links.

From a couple more experiments (summarized in [2]), I found that the meta-agent has no real effect on how well the guided agents perform; it just gets better at repairing imprecise prompts. My conclusion is that this method works for improving bad prompts, but I didn't demonstrate that it improves guided agent capabilities. That said, I think it's better to work on your prompts before giving them to agents than to start with bad prompts and iterate on them with a meta-agent.

[0]: https://github.com/ouatu-ro/skill-distillery/blob/main/skill...

[1]: https://github.com/zozo123/meta-harness-on-islo

[2]: https://github.com/ouatu-ro/skill-distillery/blob/main/repor...
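To make that concrete, here's a minimal sketch of the loop I tested, not the actual skill code from [0]; `call_llm`, the history file, and the record schema are placeholders I made up for illustration:

    # Prompt-repair loop sketch: persist the observations and prompt deltas
    # from each attempt, then ask a meta-agent to revise the imprecise prompt.
    # `call_llm` and the record fields are hypothetical stand-ins, not the
    # actual Meta-Harness API.
    import difflib
    import json
    from pathlib import Path

    HISTORY = Path("prompt_repair_history.jsonl")  # audit trail, one record per iteration

    def call_llm(system: str, user: str) -> str:
        """Placeholder for whatever model client you use."""
        raise NotImplementedError

    def repair_prompt(task_input: str, prompt: str, expected: str, max_iters: int = 5) -> str:
        for i in range(max_iters):
            # 1. Run a guided agent on the easy task with the current (imprecise) prompt.
            output = call_llm(system=prompt, user=task_input)
            if output.strip() == expected.strip():
                return prompt  # prompt is now precise enough

            # 2. Persist the observation plus the delta against the previous prompt.
            prior = (json.loads(HISTORY.read_text().splitlines()[-1])["prompt"]
                     if HISTORY.exists() else prompt)
            record = {
                "iteration": i,
                "prompt": prompt,
                "delta": "\n".join(difflib.unified_diff(
                    prior.splitlines(), prompt.splitlines(), lineterm="")),
                "observation": {"input": task_input, "output": output, "expected": expected},
            }
            with HISTORY.open("a") as f:
                f.write(json.dumps(record) + "\n")

            # 3. The meta-agent sees the whole audit trail, not just the last
            #    failure, and proposes a corrected prompt.
            prompt = call_llm(
                system="You repair imprecise prompts. Return only the revised prompt.",
                user=HISTORY.read_text(),
            )
        return prompt

The point of the persisted history is step 3: the meta-agent conditions on every prior prompt revision and its observed failure, rather than only the latest attempt.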
| ▲ | zozo123-IB 2 hours ago |
Love that! We're hiring, so please DM me on LinkedIn if you'd like: https://www.linkedin.com/in/yossi-eliaz/