losvedir 4 days ago:
Can someone confirm my understanding of how tool use works behind the scenes? Claude, ChatGPT, etc. offer "tools" through the API and return responses that ask for tool invocations, which you then perform and send the result back. However, the underlying model is a strictly text-based medium, so I'm wondering how exactly the model APIs turn the model's output into these different sorts of API responses. I'm assuming there's been a fine-tuning step with lots of examples that put the desired tool invocations into some sort of delineated block or something, which the Claude/ChatGPT server then understands? Is there any documentation about how this works exactly, and what those internal delineation tokens are? And how do they ensure that user text can't mess with it and inject "semantic" markers like that?
jedimastert 4 days ago:
Here are some docs from Anthropic about their implementation: https://docs.anthropic.com/en/docs/agents-and-tools/tool-use... The disconnect here is that models aren't really "text"-based but token-based, much like how compilers don't operate on the source text itself but on a series of tokens that can include keywords, brackets, and other things. The output can include words, but also metadata.
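To make the token-vs-text point concrete, here's roughly what it looks like with OpenAI's open-source tiktoken tokenizer. This only illustrates the mechanism; the actual chat/tool-call special tokens the hosted models use aren't public, and the cl100k_base encoding only exposes generic ones like <|endoftext|>:

    # pip install tiktoken
    import tiktoken

    enc = tiktoken.get_encoding("cl100k_base")

    # Ordinary text becomes ordinary token IDs:
    print(enc.encode("call the weather tool"))

    # By default, encode() refuses special-token text found in user input;
    # with disallowed_special=() it is encoded as plain characters instead,
    # never as the reserved special-token ID:
    print(enc.encode("<|endoftext|>", disallowed_special=()))

    # Only when the caller explicitly allows it does the same string become
    # the single reserved token ID:
    print(enc.encode("<|endoftext|>", allowed_special="all"))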
libraryofbabel 4 days ago:
You have the right picture of what's going on. Roughly:

* The only true interface with an LLM is tokens. (There is no separation between control and data channels.)

* The model API layer injects instructions on tool calling, along with a list of the available tools and documentation on what they do, into the base prompt.

* Tool calling is delineated by special tokens. When a model wants to call a tool, it adds a special block to the response containing the magic token(s) along with the name of the tool and any params. The API layer extracts this and forms a structured JSON response in some tool_calls parameter (or similar) that is sent back to the user in the API response. The tool result coming back from the user through the tool-calling API is then encoded with special tokens and injected into the context. (There's a rough sketch of that round trip below.)

* Presumably, the API layer prevents the user from injecting such tokens themselves.

* State-of-the-art models are good at tool calls because they have been heavily fine-tuned on them, with all sorts of tasks that involve tool calls, like bash invocations. The fine-tuning both makes them good at tool calls in general and probably also covers specific tools the model provider wants them to be good at, such as Claude Sonnet being fine-tuned on the specific tools Claude Code uses.

Sometimes it amazes me that this all works so well, but it does. You are right to put your finger on the fine-tuning, as it's critical for making tool calling work well. Tool calling works without fine-tuning, but it's going to be more hit-or-miss.
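Here's a minimal sketch of that round trip in Python. The marker strings, the get_weather tool, and the JSON shapes are all made up for illustration; real providers use their own (largely undocumented) special tokens and trained formats:

    import json
    import re

    # Hypothetical tool definition the API layer renders into the prompt.
    TOOLS = [{
        "name": "get_weather",
        "description": "Look up current weather.",
        "parameters": {"location": "string"},
    }]

    def build_prompt(user_message):
        # Tool docs get injected into the text the model actually sees.
        tool_docs = json.dumps(TOOLS, indent=2)
        return (
            "You can call tools by emitting <tool_call>{...}</tool_call>.\n"
            f"Available tools:\n{tool_docs}\n\n"
            f"User: {user_message}\nAssistant:"
        )

    def parse_completion(raw_text):
        # If the model emitted a tool-call block, turn it into structured
        # JSON for the API response; otherwise return plain assistant text.
        m = re.search(r"<tool_call>(.*?)</tool_call>", raw_text, re.DOTALL)
        if m:
            call = json.loads(m.group(1))
            return {
                "role": "assistant",
                "content": None,
                "tool_calls": [{"name": call["name"],
                                "arguments": call["arguments"]}],
            }
        return {"role": "assistant", "content": raw_text}

    # Pretend this came back from the model:
    raw = ('I will check. <tool_call>{"name": "get_weather", '
           '"arguments": {"location": "Tokyo"}}</tool_call>')
    print(parse_completion(raw))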
the_mitsuhiko 4 days ago:
> I'm assuming there's been a fine-tuning step with lots of examples which put desired tool invocations into some sort of delineated block or something, which the Claude/ChatGPT server understand?

As far as I know that's what's happening. The models are trained to return tool calls when they are unsure about the answer or are instructed to use a tool. There is generic tool training for just following the response format, and then probably some tool-specific training on top. For instance, gpt-oss loves to use the search tool, even if it's not mentioned anywhere. Anthropic lists well-known tools in their documentation (e.g. text_editor, bash); those have likely been trained specifically to follow some deeper semantics, beyond generic tool usage.

The whole thing is pretty brittle, and tool invocations just take place via in-band signalling, delineated by special tokens or token sequences.
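To picture the in-band part: from the model's point of view the conversation, the tool call, and the spliced-in tool result are all one continuous token stream. Something like the following, with the markers invented purely for illustration (not any provider's real format):

    # Illustrative only: the <|...|> markers and the get_weather call are
    # made up. The point is that everything lives in a single sequence.
    raw_context = (
        "<|user|>What's the weather in Tokyo?<|end|>"
        "<|assistant|><|tool_call|>get_weather({\"location\": \"Tokyo\"})<|end|>"
        # The serving layer has the client run the tool, then splices the
        # result back into the same stream before the model continues:
        "<|tool_result|>{\"temp_c\": 18, \"sky\": \"cloudy\"}<|end|>"
        "<|assistant|>It's about 18 C and cloudy in Tokyo.<|end|>"
    )
    print(raw_context)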