fennecfoxy 5 days ago

You've replied multiple times referring to toolchains without explaining what they are.

I've seen that for models that don't support tool defs via the API, those tool defs are provided in the context (though the model is still trained for tool use, outputting special python_call/x-style tokens to indicate a tool call in its output).

I can see, for example, that MCP's own example using Anthropic uses their API/SDK's tools section, as outlined here: https://docs.anthropic.com/en/api/messages#body-tools. What the example does is shove the tool definition in there - including the tool's full name, description, etc.
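For reference, a tool entry in that tools section is just a JSON-Schema-shaped blob. This is a sketch based on the linked docs page; the `get_stock_price` tool and its fields here are made up for illustration, not copied from their example:

```python
# Sketch of one entry in the Anthropic Messages API `tools` list:
# a name, a description, and a JSON Schema for the tool's input.
tools = [
    {
        "name": "get_stock_price",
        "description": "Get the current price for a given stock ticker symbol.",
        "input_schema": {
            "type": "object",
            "properties": {
                "ticker": {
                    "type": "string",
                    "description": "Ticker symbol, e.g. ^GSPC for the S&P 500.",
                },
            },
            "required": ["ticker"],
        },
    },
]

# The whole list is passed as the `tools` parameter alongside the normal
# `messages` list; nothing in it is anything more than structured text.
```

The point being: every field in there is plain text with a schema around it.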

Quoting them: "And then asked the model 'What's the S&P 500 at today?', the model might produce tool_use content blocks in the response" - so I imagine that behind the scenes they're _smashing it into the context_, as I already suggested; the only reason it's separate in the API is so they can type/validate it.
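By "smashing it into the context" I mean something like the sketch below. To be clear, Anthropic's actual server-side prompt template isn't public, so this is purely speculative; it just illustrates that a typed `tools` field can reduce to ordinary prompt tokens:

```python
import json

def render_tools_into_context(tools: list[dict]) -> str:
    """Speculative sketch: serialize typed tool definitions into plain
    text that becomes part of the model's prompt. The real template (if
    one exists) is unknown; only the idea matters here."""
    lines = ["You have access to the following tools:"]
    for tool in tools:
        lines.append(f"- {tool['name']}: {tool['description']}")
        lines.append(f"  input schema: {json.dumps(tool['input_schema'])}")
    return "\n".join(lines)

# Hypothetical single-tool example, just to show the rendered form.
demo_tools = [
    {
        "name": "get_stock_price",
        "description": "Get the current price for a ticker symbol.",
        "input_schema": {"type": "object", "properties": {"ticker": {"type": "string"}}},
    },
]
rendered = render_tools_into_context(demo_tools)
```

Under that reading, the API-level separation buys validation and consistent formatting, not a different mechanism.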

I don't know what this magical tool chain is but the LLM is the thing providing output based on the not so new magical concept of attention and statistics; I don't see how some separate "toolchain" piece takes the input string and somehow does a better job at selecting a tool than the model itself; unless the toolchain is itself a smaller LLM trained specifically for tool use outside of your larger multi-purpose/"knowledgable" LLM.