sheept 3 hours ago

This feels completely speculative: there's no measure of whether this approach is actually effective.

Personally, I'm skeptical:

- Having the agent look up the JSON schemas and skills to use the CLI still dumps a lot of tokens into its context.

- Designing for AI agents over humans doesn't seem very future-proof. Much of the world is still designed for humans, so the developers of agents are incentivized to make agents increasingly tolerant of human-oriented design.

- This design is novel and may be fairly unfamiliar in the LLM's training data, so I'd imagine the agent would spend more tokens figuring this CLI out compared to a more traditional, human-centered CLI.

gck1 3 hours ago | parent | next [-]

Yeah, people seem to forget one of the L's in LLM stands for Language, and human language is likely the largest chunk in training data.

A CLI that is well designed for humans is well designed for agents too. The only difference is that you shouldn't dump pages of content that can pollute context needlessly. But then again, you probably shouldn't be dumping pages of content for humans either.

Smaug123 2 hours ago | parent | next [-]

It's not obvious that human language is or should be the largest amount of training data. It's much easier to generate training data from computers than from humans, and having more training data is very valuable. In particular, for example, one could imagine creating a vast number of debugging problems, with logs and associated command outputs, and training on them.

rkagerer an hour ago | parent | prev [-]

I also feel like it's just a matter of time until someone cracks the nut of making agents better understand GUIs and more adept at using them.

Is there progress happening in that trajectory?

magospietato 3 hours ago | parent | prev [-]

Surely the skill for a CLI tool is a couple of lines describing common usage, and a description of the help system?
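
For illustration, such a skill file could be as small as this sketch (the skill name, description, and commands here are hypothetical, not taken from the post):

```markdown
---
name: gws-basics
description: Use the gws CLI for workspace tasks. Invoke when the user mentions gws.
---

Run `gws --help` to list subcommands, and `gws <subcommand> --help`
for per-command flags. Prefer asking the help system over guessing.
```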

sheept 2 hours ago | parent [-]

Sure, but the post itself brags,

> gws ships 100+ SKILL.md files

which altogether must amount to hundreds of lines of YAML frontmatter polluting your context.

danw1979 2 hours ago | parent | next [-]

Claude Code, at least, will only load a SKILL.md file into context when it's invoked by the user or the LLM itself, i.e. on demand.

sheept an hour ago | parent [-]

Claude will load the name and description of each enabled skill into context at startup[0]; the LLM needs to know what it can invoke, after all. It's negligible for a few skills, but a hundred skills will likely have some impact, e.g. deemphasizing other skills by adding noise.

[0]: https://platform.claude.com/docs/en/agents-and-tools/agent-s...
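
To make the scaling concern concrete, here's a back-of-envelope sketch. The per-skill figure is an illustrative assumption (a name plus a one-line description), not a measured number:

```python
# Rough context cost of advertising skills at startup: Claude loads each
# enabled skill's name and description so the model knows what it can invoke.
TOKENS_PER_SKILL = 25  # assumed average for name + short description

def startup_overhead(num_skills: int) -> int:
    """Approximate tokens spent listing skills before any real work begins."""
    return num_skills * TOKENS_PER_SKILL

print(startup_overhead(3))    # a handful of skills: negligible
print(startup_overhead(100))  # 100+ skills: thousands of tokens of noise
```

The point isn't the exact numbers, just that the overhead grows linearly with the number of enabled skills, whether or not any of them is ever invoked.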

justinwp 2 hours ago | parent | prev [-]

You don't need to install all of them.