iainmerrick | 9 hours ago
I can't quite tell what's being compared there -- it just looks like several different LLMs? To be clear, I'm suggesting that any specific format for "skills.md" is a red herring, and all you need to do is provide the LLM with good, clear documentation.

A useful comparison would be between: a) making a carefully organised .skills/ folder, b) putting the same info anywhere and just linking to it from your top-level doc, c) just dumping everything directly into the top-level doc.

My guess is that it's probably a good idea to break stuff out into separate sections, to avoid polluting the context with stuff you don't need; but the specific way you do that very likely isn't important at all. So (a) and (b) would perform about the same.
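Concretely, the three layouts might look something like this (file names are just invented examples):

    (a) .skills/
          deploy.md
          testing.md

    (b) AGENTS.md      <- short index: "deploys: see docs/deploy.md"
        docs/
          deploy.md
          testing.md

    (c) AGENTS.md      <- everything inlined in one big doc

Same information in all three; the only variable is how it's carved up and discovered.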
postalcoder | 9 hours ago
Your skepticism is valid. Vercel ran a study concluding that Skills underperform a plain docs index in AGENTS.md [0]. My guess is that the standardization will make its way into how the models are trained, and Skills will eventually pull ahead.

[0] https://vercel.com/blog/agents-md-outperforms-skills-in-our-...
anupamchugh | 7 hours ago
If you want a clean comparison, I'd test three conditions under equal context budgets: (A) monolithic AGENTS.md, (B) README index that links to docs, (C) skills with progressive disclosure. Measure task success, latency, and doc-fetch count across 10–20 repo tasks. My hunch: (B) ≈ (C) on quality, but (C) wins on token efficiency when the index is strong. Also, format alone isn't magic: skills that reference real tools/assets via the backing MCP are qualitatively different from docs-only skills, so I'd separate those in the comparison. Have you seen any benchmarks that control for discovery overhead?
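A minimal sketch of the harness I have in mind, assuming your agent framework can be wrapped as a callable -- every name here is hypothetical:

    import time
    from dataclasses import dataclass
    from statistics import mean

    @dataclass
    class TrialResult:
        success: bool
        latency_s: float
        doc_fetches: int

    def run_trial(agent, task, condition, context_budget_tokens=8000):
        # Run one repo task under one docs condition (A, B, or C),
        # counting every doc fetch the agent makes along the way.
        fetches = 0

        def fetch_doc(path):
            nonlocal fetches
            fetches += 1
            with open(path) as f:
                return f.read()

        start = time.monotonic()
        ok = agent(task, condition=condition, fetch_doc=fetch_doc,
                   budget=context_budget_tokens)
        return TrialResult(bool(ok), time.monotonic() - start, fetches)

    def compare(agent, tasks, conditions=("A", "B", "C")):
        # Same tasks, same budget; only the docs layout varies.
        for cond in conditions:
            results = [run_trial(agent, t, cond) for t in tasks]
            print(cond,
                  f"success={mean(r.success for r in results):.0%}",
                  f"latency={mean(r.latency_s for r in results):.1f}s",
                  f"fetches={mean(r.doc_fetches for r in results):.1f}")

The point is that task success alone probably won't separate (B) from (C); the doc-fetch count and token spend are where the difference should show up.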