esafak 11 hours ago
Does anyone find that agents just don't use them without being asked?
libraryofbabel 11 hours ago
This has been a problem for us too. Sometimes they reach for skills, sometimes they don’t and just try to do the thing on their own. It’s annoying. I think this is (mostly) a solvable problem. The current generation of SotA models wasn’t RLVR-trained on skills (they didn’t exist at that time) and probably gets slightly confused by the way the little descriptions are all packed into the same tool call schema. (At least that’s how it works with Claude Code.) The next generation will have likely been RLVRed on a lot of tasks where skills are available, and will use them much more reliably. Basically, wait until the next Opus release and you should hopefully see major improvements. (Of course, all this stuff is non-deterministic blah blah, but I think it’s reasonable to expect going from “misses the skill 30% of the time” to “misses it 2% of the time”.) | ||||||||||||||
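For context on how those "little descriptions" are surfaced: in Claude Code, a skill is a SKILL.md file whose YAML frontmatter `description` is what the model sees when deciding whether to load the full skill. A minimal sketch (the skill name and body here are made up for illustration):

```markdown
---
name: pdf-extraction
description: Extract text and tables from PDF files. Use this when the user
  asks to read, parse, or summarize a PDF.
---

# PDF extraction

1. Convert the PDF to text first; fall back to OCR for scanned pages.
2. Preserve table structure as markdown tables.
```

Only the `description` lines are packed into the model's context up front; the body below the frontmatter is loaded on demand, which is why a vague description can cause the model to miss the skill entirely.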
| ||||||||||||||
modernerd 11 hours ago
That's also what Vercel found:

> In 56% of eval cases, the skill was never invoked. The agent had access to the documentation but didn't use it. Adding the skill produced no improvement over baseline.

> …

> Skills aren't useless. The AGENTS.md approach provides broad, horizontal improvements to how agents work with Next.js across all tasks. Skills work better for vertical, action-specific workflows that users explicitly trigger.

https://vercel.com/blog/agents-md-outperforms-skills-in-our-...
jillesvangurp 11 hours ago
Depends on what you use, perhaps. I use Codex and it seems to mostly stick to the instructions I give. I use an AGENTS.md that explicitly points to the repository's skill directory. I mostly keep instructions in there for obvious things like how to build, how to test, what to do before declaring a thing done, etc.

I don't tend to have a lot of skills in there either. Probably the more skills you have, the more confused it gets: the more potentially conflicting instructions you give, the harder it is for an LLM to figure out what you actually want to happen. If I catch it going off script, I often interrupt it, tell it what to do, and update the relevant skill. Seems to work pretty well. Keeping things simple works.
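A minimal sketch of the kind of AGENTS.md described above; the directory name and commands are hypothetical, not from any particular repo:

```markdown
# AGENTS.md

## Skills
Reusable skills live in `.skills/` at the repository root. Check that
directory for a relevant skill before starting a task.

## Workflow
- Build: `make build`
- Test: `make test`
- Before declaring a task done: run the full test suite and fix any failures.
```

The point is that AGENTS.md is read unconditionally at the start of a session, so a pointer placed there doesn't depend on the model deciding to invoke anything.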
rco8786 11 hours ago
Yep. I have an incredibly hard time getting them to use skills at all, even when asked. I saw someone's analysis a few days ago, and they found that their agents were more accurate when just dumping the skill context directly into AGENTS.md.
troupo 11 hours ago
Because "skills" are just .md files that the lossy, compressing statistical output machine may or may not find, and that may or may not be retained in its tiny context window.
| ||||||||||||||
shmoogy 11 hours ago
I often find they aren't triggered when I would expect, so I use a keyword and explicitly trigger them.
tobyhinloopen 11 hours ago
Same! If I put the skill's instructions in the general AGENTS.md, it works just fine. | ||||||||||||||