colonCapitalDee 8 hours ago
I have a custom skill-creator skill that contains this:

> A common pitfall is for Claude to create skills and fill them up with generated information about how to complete a task. The problem with this is that the generated content is all content that's already inside Claude's probability space. Claude is effectively telling itself information that it already knows!
>
> Instead, Claude should strive to document in SKILL.md only information that:
>
> 1. Is outside of Claude's training data (information that Claude had to learn through research, experimentation, or experience)
> 2. Is context specific (something that Claude knows now, but won't know later after its context window is cleared)
> 3. Aligns future Claude with current Claude (information that will guide future Claude in acting how we want it to act)
>
> Claude should also avoid recording derived data. Lead a horse to water, don't teach it how to drink. If there's an easily available source that will tell Claude all it needs to know, point Claude at that source. If the information Claude needs can be trivially derived from information Claude already knows or has already been provided, don't provide the derived data.

For those interested, the full skill is here: https://github.com/j-r-beckett/SpeedReader/blob/main/.claude...
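To make the distinction concrete, here's a hypothetical SKILL.md fragment in that spirit (the project details, paths, and env vars are invented for illustration):

```markdown
## Working on the ingest pipeline

<!-- Keep: learned by experimentation, not in training data -->
The integration tests fail silently if `TEST_DB_URL` is unset; the
runner still exits 0. Export it before running anything.

<!-- Keep: a pointer to the source, not a copy of derived data -->
Endpoint schemas live in `docs/api.md`. Read that file instead of
reconstructing request shapes from memory.

<!-- Omit: generic knowledge already in the model, e.g. -->
<!-- "Use `git commit -m <message>` to commit your changes." -->
```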
dimitri-vs 7 hours ago
I don't think LLMs are very good at introspection on what they know or don't know, but otherwise this is gold. Thanks for sharing.
lkoczorowski 5 hours ago
Does this not assume that Claude can pick out the best of what it knows? Claude's training data is the internet, and the internet is full of Express tutorials that use app.use(cors()) with no origin restriction, Stack Overflow answers that store JWTs in localStorage, and so on. Claude's probability space isn't a clean hierarchy of "best to worst." It's a weighted distribution shaped by frequency in the training data. So even though it "knows" stuff, it doesn't necessarily know what you want, or what a professional in a production environment would do. Unless I'm missing something?
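For instance, a minimal Express sketch of the two patterns (the allowlisted origin is a placeholder):

```typescript
import express from "express";
import cors from "cors";

const app = express();

// Tutorial-frequency pattern: any origin may call the API.
// app.use(cors());

// Production-leaning pattern: an explicit origin allowlist.
// "https://app.example.com" is a placeholder for this sketch.
app.use(
  cors({
    origin: ["https://app.example.com"],
    credentials: true,
  })
);

app.get("/me", (_req, res) => res.json({ ok: true }));

app.listen(3000);
```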
nmilo 6 hours ago
This is really good! I like how it reads like a blog post; it feels like I'm learning a skill on how to write good skills. Maybe that's another heuristic: a skill should read like an interesting blog post, highlighting non-obvious information.
j45 8 hours ago
Sincerely, perhaps you should publish this on arXiv before a researcher reads it, runs it, and writes a study. It's fairly common to see threads like this, where one thing is being postulated and then there are comments upon comments from doers showing what they have done.