| ▲ | walmsles 5 hours ago | |
I built this while working on a coding agent that kept starting cold every session. The deeper problem was that agent frameworks tell you what a tool does and how to call it, but give no structured answer to when: when should a tool fire autonomously, and when should it stay silent? That judgement is always implicit, scattered across system prompts and tool descriptions. Tendril is a reference implementation of what I'm calling the Agent Capability pattern. It starts with three bootstrap tools and builds everything else itself. The key constraint: there's no direct code execution. The agent can only run registered capabilities, so every task forces it to write a tool, define its invocation conditions, and register it for future sessions. The registry accumulates across sessions. I also ran the self-extending loop against five local models: Qwen3-8B, Gemma 4, Mistral Small 3.1, Devstral Small 2, and Salesforce xLAM-2. None passed. The failure modes were distinct enough to be worth writing up separately: https://serverlessdna.com/strands/ai-agents/agents-know-what... Stack: AWS Strands TypeScript SDK, Bedrock (Claude Sonnet), Deno sandbox, Tauri + React desktop shell. | ||
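To make the "register before you run" constraint concrete, here is a minimal sketch of a capability registry. It is not Tendril's actual code: `Capability`, `CapabilityRegistry`, `invokeWhen`, `suppressWhen`, and the `word_count` tool are all hypothetical names chosen for illustration, and persistence is reduced to producing a manifest that some other layer would save between sessions.

```typescript
// Hypothetical shapes for illustration; none of these names come from Tendril.
interface CapabilityMeta {
  name: string;
  description: string;    // what the tool does
  invokeWhen: string[];   // conditions under which it should fire autonomously
  suppressWhen: string[]; // conditions under which it should stay silent
}

interface Capability extends CapabilityMeta {
  run: (input: string) => string;
}

class CapabilityRegistry {
  private capabilities = new Map<string, Capability>();

  // Registration is refused unless the "when" is stated explicitly.
  register(cap: Capability): void {
    if (cap.invokeWhen.length === 0) {
      throw new Error(`capability "${cap.name}" must declare invocation conditions`);
    }
    this.capabilities.set(cap.name, cap);
  }

  // The only execution path: names that were never registered are rejected.
  execute(name: string, input: string): string {
    const cap = this.capabilities.get(name);
    if (cap === undefined) {
      throw new Error(`no registered capability named "${name}"`);
    }
    return cap.run(input);
  }

  // Metadata only; this is what a persistence layer could store across sessions.
  manifest(): CapabilityMeta[] {
    return Array.from(this.capabilities.values()).map((c) => ({
      name: c.name,
      description: c.description,
      invokeWhen: c.invokeWhen,
      suppressWhen: c.suppressWhen,
    }));
  }
}

const registry = new CapabilityRegistry();
registry.register({
  name: "word_count",
  description: "Counts whitespace-separated words in a string.",
  invokeWhen: ["the user asks how long a piece of text is"],
  suppressWhen: ["the text is source code"],
  run: (input) => String(input.trim().split(/\s+/).filter(Boolean).length),
});
```

Calling `registry.execute("word_count", "a b c")` goes through the registry, while any name that was never registered throws, which is the property the no-direct-execution constraint relies on.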
| ▲ | dd8601fn 3 hours ago | parent | next [-] | |
I did something that sounds similar for my home assistant. The agent never executes anything. It has like four tools: search, request execute, request build, and request update. The tool service runs vector search against the tools catalog. The build step generalizes the requested function and runs authoring with review steps, declaring the credentials and network access the tool needs. The adversarial reviewer can send a tool back to authoring up to three times. After passing review, the tool is registered and its embeddings are generated for search. It’s then live for future use. Credentials are stored encrypted, and only get injected by the tools catalog service during tool execution. The network resources are declared so tool function execution can be better sandboxed (it’s not, yet). The agent never has access to credentials and cannot do anything without going through vetted functions in the tool service. Agent, author process, reviewer, embedding… all can be different models running local or remote. Event bus, agent, tool service… all separate containers. I have a URL if you want to read a bit about what I did: https://dcd.fyi/agent It’s really just meant for me, but if you’re interested in more details on anything let me know. There’s nothing super special in it. | ||
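The author/reviewer loop described above might look roughly like the sketch below. Everything here is assumed for illustration: `ToolDraft`, `Author`, `Reviewer`, and `buildTool` are hypothetical names, the author and reviewer are plain functions standing in for model calls, and treating a fourth rejection as a failed build is a guess at what happens once the three send-backs are used up. Embedding and vector search are left out.

```typescript
// Hypothetical types; the real system's shapes are not described in the thread.
interface ToolDraft {
  name: string;
  source: string;
  credentials: string[];  // names of secrets the tool declares it needs
  networkHosts: string[]; // hosts the tool declares it will contact
}

type Review = { approved: true } | { approved: false; reason: string };
type Author = (request: string, feedback: string | null) => ToolDraft;
type Reviewer = (draft: ToolDraft) => Review;

const MAX_SEND_BACKS = 3; // reviewer may return a draft to authoring this many times

function buildTool(
  request: string,
  author: Author,
  reviewer: Reviewer,
  catalog: Map<string, ToolDraft>,
): ToolDraft | null {
  let feedback: string | null = null;
  // One initial attempt plus up to MAX_SEND_BACKS revisions.
  for (let attempt = 0; attempt <= MAX_SEND_BACKS; attempt++) {
    const draft = author(request, feedback);
    const verdict = reviewer(draft);
    if (verdict.approved) {
      catalog.set(draft.name, draft); // registered: available to future searches
      return draft;
    }
    feedback = verdict.reason; // the next authoring pass sees the objection
  }
  return null; // assumption: the build fails once the send-backs are exhausted
}

// Stub author: only declares its network host after being told to.
let authorCalls = 0;
const stubAuthor: Author = (request, feedback) => {
  authorCalls++;
  return {
    name: "get_weather",
    source: `// generated for: ${request}`,
    credentials: ["WEATHER_API_KEY"],
    networkHosts: feedback === null ? [] : ["api.example.com"],
  };
};

// Stub reviewer: rejects any draft that does not declare network access.
const stubReviewer: Reviewer = (draft) =>
  draft.networkHosts.length > 0
    ? { approved: true }
    : { approved: false, reason: "network access is not declared" };

const catalog = new Map<string, ToolDraft>();
const built = buildTool("fetch today's weather", stubAuthor, stubReviewer, catalog);
```

In this run the first draft is rejected for not declaring its host, the second passes, and only then does the tool land in the catalog. The draft carries only the names of the credentials it needs, which fits the idea that a separate service injects the actual secrets at execution time.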
| ▲ | esafak 4 hours ago | parent | prev [-] | |
You can list the uses of the available tools in the AGENTS. I keep my agents on a tight leash, and self-extension runs counter to this. I would not want my agent to spontaneously develop the ability to tap my bank account, for example. | ||