irrationalfab 18 hours ago
There's a pattern I keep seeing: LLMs used to replace things we already know how to do deterministically. Parsing a known HTML structure, transforming a table, running a financial simulation. It works, but it's like using a helicopter to cross the street: expensive, slow, and not guaranteed to land exactly where you intended.

The real opportunity with Agent Skills isn't just packaging prompts. It's providing a mechanism that enables a clean split: the LLM as the control plane (planning, choosing tools, handling ambiguous steps) and code or sub-agents as the data/execution plane (fetching, parsing, transforming, simulating, or executing NL steps in a separate context). This requires well-defined input/output contracts and a composition model (see the sketch below). I opened a discussion on whether Agent Skills should support this kind of composability.
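To make the split concrete, here's a minimal sketch (all names and the dict-based contract are made up for illustration, not any framework's real API): the planner only picks the next step name, while each step is plain deterministic code behind a declared input/output contract.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Data/execution plane: deterministic steps with an explicit
# input/output contract (kept as dict -> dict for the sketch).
@dataclass
class Step:
    name: str
    description: str             # what the planning LLM reads
    run: Callable[[dict], dict]  # the deterministic implementation

def parse_rows(state: dict) -> dict:
    # Deterministic parsing of a known structure; no model call needed.
    rows = [line.split(",") for line in state["raw"].splitlines()]
    return {**state, "rows": rows}

def total_second_column(state: dict) -> dict:
    total = sum(float(r[1]) for r in state["rows"])
    return {**state, "total": total}

STEPS: Dict[str, Step] = {s.name: s for s in [
    Step("parse_rows", "Split raw CSV text into rows", parse_rows),
    Step("total_second_column", "Sum the numeric second column", total_second_column),
]}

def run_plan(choose_next: Callable[[dict], str], state: dict) -> dict:
    # Control plane: `choose_next` stands in for the LLM, which picks a
    # step name (or "done") from the step descriptions and current state.
    while (choice := choose_next(state)) != "done":
        state = STEPS[choice].run(state)
    return state

# A canned "planner" so the sketch runs without a model; the point is
# that the execution plane never needs one.
plan = iter(["parse_rows", "total_second_column", "done"])
result = run_plan(lambda state: next(plan), {"raw": "a,1\nb,2"})
print(result["total"])  # 3.0
```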
esafak 2 hours ago
Skills are about empowering LLMs with tools, so the heavy lifting can still be done deterministically. Moreover, pipelines built with LLMs are simpler and less brittle, since handling variation is the essence of machine learning.
basch 17 hours ago
The same applies to context vs. a database. If a reasoning model makes a decision about something, that decision should be set off to the side and stored as a value/variable/entry somewhere. Instead of using pages and pages of context, it makes sense for some tasks to "press" decisions so they become more permanent parts of the conversation. You can somewhat accomplish that with NotebookLM, by turning results into notes and notes into sources, but NotebookLM is insular and doesn't have the research and imaging features of Gemini.

Also, writing strictly from top to bottom has its disadvantages. It makes sense to emulate the human writing process and work in passes, fleshing out and, conversely, summarizing the text. Current LLMs can brute-force these things through emulation/observation/mimicry, but they aren't as good as doing it the right way.

Not only would I like to see "skills" but also "processes," where you define a well-ordered sequence in which tasks are accomplished. Repeatable templates. These would essentially include variables in the templates, set for replacement; a sketch follows.
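Here's a minimal sketch of what I mean by a "process," using only the standard library (the pass names and templates are made up):

```python
from string import Template

# One hypothetical "process": an ordered list of prompt templates with
# variables set for replacement. Each pass "presses" its result into the
# shared values dict instead of leaving it buried in conversation context.
PROCESS = [
    ("outline", Template("Write a bullet outline about $topic.")),
    ("draft",   Template("Flesh out this outline into prose:\n$outline")),
    ("summary", Template("Condense this draft to three sentences:\n$draft")),
]

def run_process(llm, values: dict) -> dict:
    for name, template in PROCESS:
        # Earlier decisions are stored as plain values, not re-derived
        # from pages of context on every turn.
        values[name] = llm(template.substitute(values))
    return values

# A stub model so the sketch runs on its own:
fake_llm = lambda prompt: f"<model output for: {prompt.splitlines()[0]}>"
notes = run_process(fake_llm, {"topic": "agent skills"})
print(notes["summary"])
```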
gradus_ad 15 hours ago
I've recently been doing some work with Autodesk. It would be great for an LLM to be as comfortable with the "vocabulary" of these applications as it is with code. Maybe part of this involves creating a language for CAD design in the first place. But the principle that we need to build out vocabularies and then generate and expose "sentences" (workflows) for LLMs to train on seems like a promising direction; a rough sketch of what that could look like is below. Of course, this requires substantial buy-in from application owners, who would create the vocabulary, and from users, who would agree to expose and share the sentences they generate, but the results would be worth it.
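Purely as a hypothetical illustration (none of this is Autodesk's API, and all names are invented), a vocabulary could be a set of typed operations, with a sentence being an ordered list of them:

```python
from dataclasses import dataclass
from typing import Union

# A hypothetical "vocabulary": each CAD operation is a typed word.
@dataclass
class Sketch:
    plane: str            # e.g. "XY"

@dataclass
class Extrude:
    profile: str          # name of a sketch profile
    distance_mm: float

@dataclass
class Fillet:
    edge_group: str
    radius_mm: float

Word = Union[Sketch, Extrude, Fillet]

# A "sentence" (workflow) is an ordered list of words. Sentences like
# this, exported from real sessions, are what a model could train on.
bracket: list[Word] = [
    Sketch(plane="XY"),
    Extrude(profile="base_profile", distance_mm=5.0),
    Fillet(edge_group="outer_edges", radius_mm=1.5),
]
```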
ugh123 18 hours ago
100%. Additionally, I can't even get Claude or Codex to reliably use the prompt and simple rules ("use this command to compile") in an AGENTS.md or whatever markdown file is required. Why would I assume they will reliably handle skills prompts spread around a codebase?

I've even seen tool usage deteriorate while a model is thinking and self-commanding through its output to, say, read code from a file. Sometimes it uses tail, while other times it gets confused by the output and then writes a basic Python program to parse lines and strings from the same file, effectively reproducing the same output as before. How bizarre!
itissid 10 hours ago
Isn't at least part of that GH issue something that BAML (https://docs.boundaryml.com/guide/introduction/what-is-baml) is also trying to solve? LLM calls should be functions with defined inputs and outputs; that was their starting point. IIUC, their most recent arc focuses on prompt optimization [0], where you can optimize, using DSPy and the GEPA optimization algorithm [1], against relative weights on things like errors, token usage, and complexity. A rough sketch of the typed-function idea is below.

[0] https://docs.boundaryml.com/guide/baml-advanced/prompt-optim...

[1] https://github.com/gepa-ai/gepa?tab=readme-ov-file
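The core idea, restated as a plain-Python sketch with Pydantic rather than BAML's own DSL (Invoice and extract_invoice are invented for illustration): an LLM call is a function with a declared output type, and the raw completion is validated against that type before anything downstream uses it.

```python
from pydantic import BaseModel, ValidationError

class Invoice(BaseModel):
    vendor: str
    total: float

def extract_invoice(llm, text: str) -> Invoice:
    raw = llm(f"Return JSON with vendor and total for:\n{text}")
    try:
        # The contract boundary: untyped completion in, typed value out.
        return Invoice.model_validate_json(raw)
    except ValidationError:
        # A real system would retry or repair here; BAML layers its own
        # schema-aware parsing on top of this basic contract.
        raise

# A stub model so the sketch runs on its own:
fake_llm = lambda prompt: '{"vendor": "Acme", "total": 12.5}'
print(extract_invoice(fake_llm, "Acme invoice, $12.50"))
```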
_the_inflator 6 hours ago
I agree partly. Skills essentially boil down to distributed parts of a main prompt. If you consider a state model, you can see the pattern: the task is the state, and combining the task's specific skills defines the current prompt augmentation. When the task changes, another prompt emerges (a small sketch below). In the end, the clear guidance given to the agent is the deciding factor.
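A minimal sketch of that state model, with made-up skill and task names:

```python
BASE_PROMPT = "You are a careful coding agent."

# Skills: distributed parts of the main prompt.
SKILLS = {
    "review":  "Flag correctness issues before style issues.",
    "migrate": "Preserve behavior; change one module at a time.",
}

# Each task (the state) selects its own combination of skills.
TASK_SKILLS = {
    "code_review":  ["review"],
    "db_migration": ["migrate", "review"],
}

def prompt_for(task: str) -> str:
    # When the task changes, a different prompt emerges.
    augmentation = "\n".join(SKILLS[s] for s in TASK_SKILLS[task])
    return f"{BASE_PROMPT}\n{augmentation}"

print(prompt_for("db_migration"))
```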
hintymad 17 hours ago
> Parsing a known HTML structure, transforming a table, running a financial simulation.

Transforming an arbitrary table is still hard, especially a table on a webpage or in a document. Sometimes I even struggle to find the right library, and the effort doesn't seem worth it for a one-off transformation anyway. An LLM can be a great tool for such tasks.