RyanShook 5 hours ago

So far my experience with skills is that they slow down or confuse agents unless you as the user understand what the skill actually contains and how it works. In general I would rather install a CLI tool and explain to the agent how I want it used vs. trying to get the agent to use a folder of instructions that I don't really understand what's inside.

giancarlostoro 5 hours ago

> So far my experience with skills is that they slow down or confuse agents unless you as the user understand what the skill actually contains and how it works. In general I would rather install a CLI tool and explain to the agent how I want it used vs. trying to get the agent to use a folder of instructions that I don't really understand what's inside.

For Claude Code I add the tooling into either CLAUDE.md or .claude/INSTRUCTIONS.md, which Claude reads when you start a new instance. If you update the file mid-session, you MUST ask Claude to reread it so it knows the full instructions.

airstrike 5 hours ago

Most LLM "harnessing" seems very lazy and bolted on. You can build much more robustly by leveraging a more complex application layer where you can manage state, but I guess people struggle building that.

TeMPOraL an hour ago

A common failure mode I've observed is people building a stateful harness for the LLM and then forgetting to tell the LLM about it. This leads to funny/disturbing results whenever the two "desync" in some way.

Example: a plan/act division, with the harness keeping state of which mode is active, and while in "plan mode", removing/disabling tools that can write data. Cue a mishandled timeout or a UI bug that prevents switching to "act mode", and suddenly the agent is spinning for 10 minutes questioning the nature of its reality, as the basic tools it needs to write code inexplicably ceased to exist, then opting for empirical experimentation and eventually figuring out a way to reimplement "search/replace" using shell calls or Python or whatever alternative wasn't properly sandboxed by the harness writers...

Part of this is just bugs in code, but what irks me is watching the LLM getting gaslighted or plain confused by rules of reality changing underneath it, all because the harness state wasn't made observable to the agent, or someone couldn't be arsed to have their error messages and security policies provide feedback to the LLM and not just the user.
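To make the point concrete, here is a minimal sketch of a plan/act harness that keeps its state observable. Everything in it is hypothetical (the `Harness` class, the tool names, the message wording); it is not how any particular agent framework does it. The idea is only that a blocked tool stays visible and returns an explanation to the model, and that the current mode can be included in every turn.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical names of tools that can write data.
WRITE_TOOLS = {"write_file", "search_replace"}


@dataclass
class Harness:
    mode: str = "plan"  # either "plan" or "act"
    tools: dict[str, Callable[..., str]] = field(default_factory=dict)

    def describe_state(self) -> str:
        """Status line meant to be shown to the model on every turn."""
        blocked = []
        if self.mode == "plan":
            blocked = [name for name in sorted(self.tools) if name in WRITE_TOOLS]
        return f"mode={self.mode}; blocked_tools={blocked}"

    def call(self, name: str, **kwargs) -> str:
        """Run a tool, or return an error the model can read and act on."""
        if name not in self.tools:
            return f"error: unknown tool '{name}'"
        if self.mode == "plan" and name in WRITE_TOOLS:
            # The tool is not hidden; the model is told why it is refused.
            return (
                f"error: '{name}' is disabled while mode=plan. "
                "The tool still exists; ask the user to switch to act mode."
            )
        return self.tools[name](**kwargs)
```

With something like this, a stuck mode switch still leaves the agent blocked, but it knows why and can say so instead of hunting for an unsandboxed workaround.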

selridge 4 hours ago

I mean, yes. You should do exactly that: instruct an agent on how to do something you understand in terms you can explain.

Putting that in a `.md` file just means you don’t need to do it twice.