▲ | creddit 8 days ago
> But here’s the important part: LLMs don’t know how to use tools. They don’t have native tool calling support. They just generate text that represents a function call.

This terrifies me. This whole time I was writing bash commands into my terminal, I thought I knew how to use the tools. Now, I’ve just learned that I had no idea how to use tools at all! I just knew how to write text that /represented/ tool use.
▲ | nlawalker 8 days ago
> writing bash commands into my terminal

This is what the author means by "knowing how to use the tool". The LLM alone is effectively a function that outputs text; it has no other capabilities, and it cannot "connect to" or "use" anything by itself. The closest it can come is outputting an unambiguous, structured text request that can be interpreted by the application code that wraps it, which then does something on its behalf.

The author's point hinges on the architectural distinction between the LLM itself and that application code, a distinction that is increasingly irrelevant and invisible to most people (even developers) because the application code that knows how to do things like call MCP servers is already baked into most LLM-driven products and services. No one is "talking directly to" an LLM; it's all mediated by multiple layers, including layers that perform tool calling.
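A minimal sketch of that wrapper layer, with made-up names (handle, TOOLS) rather than any particular framework: the model only ever contributes a string, and the surrounding application code decides whether that string is a structured request worth acting on.

    import json

    # The wrapper owns the real capabilities; the model can only emit text
    # that this code may or may not interpret as a request to use them.
    TOOLS = {
        "read_file": lambda path: open(path).read(),
    }

    def handle(model_output: str) -> str:
        """Act on the model's text only if it parses as a structured tool request."""
        try:
            call = json.loads(model_output)  # e.g. {"tool": "read_file", "args": {"path": "notes.txt"}}
            return str(TOOLS[call["tool"]](**call["args"]))
        except (json.JSONDecodeError, KeyError, TypeError):
            return model_output              # ordinary prose: nothing to execute

handle('please list the files') just hands the text back; handle('{"tool": "read_file", "args": {"path": "notes.txt"}}') makes the wrapper, not the model, open the file.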
▲ | jerf 8 days ago
A lot of people resist the idea that programming is intrinsically mathematical, but this is one of the places it pops out. The power of programming lies precisely in the way it brings together text that "represents" something with text that "does" something; that is, at its core, the source of its power.

You can still draw the distinction philosophically, as you just did, but at the same time there is a profound way in which there is in fact no difference between "using" computers and "representing" your use of computers.
▲ | fennecfoxy 8 days ago
I think what your quote is trying to say essentially boils down to this: LLMs can be given facts and tool definitions in the context, and we _hope_ the statistical model picks up on that information and makes the right tool calls, but it isn't _guaranteed_. Unlike human beings such as yourself (presumably), LLMs do not have agency, and they do not have conscious or active thought. All they do is predict the next token.

I've thought about this a lot. These models are certainly capable, but they do not in any form or fashion emulate the consciousness that we have. Not yet.
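To make the "not guaranteed" part concrete, here is a small sketch (parse_tool_call is a made-up helper, not a library function): the harness has to check whether the predicted tokens even form a usable call, because nothing forces them to.

    import json

    def parse_tool_call(text):
        """Return (tool, args) if the text is a well-formed call, else None."""
        try:
            call = json.loads(text)
            if isinstance(call, dict) and "tool" in call and isinstance(call.get("args"), dict):
                return call["tool"], call["args"]
        except json.JSONDecodeError:
            pass
        return None

    # The model only predicted tokens; here it stopped one brace short.
    output = '{"tool": "search", "args": {"query": "weather"}'
    assert parse_tool_call(output) is None  # caller must decide: retry, re-prompt, or give up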
▲ | johnmaguire 8 days ago
I think you might be missing the point of this quote, which is that you don't have to introduce additional code into the model to support MCP. MCP happens at a different layer: you have to run the MCP commands yourself, or use a client that does it for you.

> But the LLM will never know you are using MCP, unless you are letting it know in the system prompt of tool definitions. You, the developer, is responsible for calling the tools. The LLM only generates a snippet of what tool(s) to call with which input parameters.

The article is describing how MCP works, not making an argument about what it means to "understand" something.
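A rough sketch of the loop that quote describes, with made-up names (chat, TOOL_DEFS, get_weather) standing in for a real model API and a real MCP server: the tool definitions go into the prompt, the model replies with a snippet naming a tool, and the developer's code performs the call and feeds the result back.

    import json

    TOOL_DEFS = [{"name": "get_weather",
                  "description": "Look up current weather",
                  "parameters": {"city": "string"}}]

    def chat(messages):
        # Stand-in for the model API: it only ever returns text. Canned here so
        # the sketch runs; a real model would generate this token by token.
        return '{"tool": "get_weather", "args": {"city": "Oslo"}}'

    def get_weather(city):
        # Stand-in for forwarding the request to an MCP server.
        return f"Sunny in {city}"

    messages = [
        {"role": "system",
         "content": "You may request a tool by replying with JSON. Tools: " + json.dumps(TOOL_DEFS)},
        {"role": "user", "content": "What's the weather in Oslo?"},
    ]

    call = json.loads(chat(messages))     # the LLM only emits a snippet naming the tool
    result = get_weather(**call["args"])  # the developer's code actually makes the call
    messages.append({"role": "user", "content": "Tool result: " + result})
    # A final chat(messages) turn would let the model phrase the answer for the user.

Swap the canned chat() for a real model call and the shape stays the same: the model never finds out whether get_weather is an MCP server, a shell command, or a stub.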