| ▲ | dougbright 8 days ago |
| The question I’m wrestling with is: will anybody care about MCP? I’m working on my own MCP proxy to manage security, auditing, and server management, and the more deeply I think about the actual use cases, the more I wonder if I’m wasting my time.
Can anyone think of a world where MCP is relevant if generic chatbots (ChatGPT, Claude Desktop) don’t become the primary human-AI interface? If LLMs are still wrapped in application wrappers, isn’t ̶a̶n̶ ̶a̶p̶p̶r̶o̶a̶c̶h̶ ̶l̶i̶k̶e̶ ̶L̶a̶n̶g̶C̶h̶a̶i̶n̶ a more traditional agentic approach going to make more sense? |
|
| ▲ | electric_muse 8 days ago | parent | next [-] |
| I think MCP has legs well beyond just the LLM/agent world, just like USB went from "how I connect my mouse" to "how I charge my beard trimmer." In fact, I imagine it's going to go full duplex with all our systems, becoming a more standard way for systems to communicate with each other.

Under the hood, MCP is just JSON-RPC, which is a fine format for communicating between systems. MCP layers on some useful things like authentication and discovery. Both are critical to any kind of communication between systems built by different authors (e.g. various apps and services). Discovery, especially, is the fascinating part: rather than hoping an OpenAPI spec exists and hoping it's right, MCP has this exchange of capabilities baked in.

I spent the last 9 years building integration technology, and from that perspective, the discovery-documentation-implementation problem is the core issue. Right now, LLMs basically "solve" the integration problem because they can do the mapping between external tools/resources/formats and internal ones. But nothing strictly "requires" an LLM to be involved at all; that's just the primary reason MCP was developed. You could just as well use this as a way of integrating systems, making some bets on interface stability (and using LLMs only for the cases where your prior expectations no longer hold and you need a new mapping).

The comparison is perhaps imperfect and overused, but I feel like we're witnessing the birth of a new USB-like standard. There's something right now that it was designed to do, but it's a decent enough standard that it can actually handle many things. I wouldn't be surprised if, in time, we see enterprise apps shift from REST to MCP for bi-directional integrations.

For the OP: I'm not sure if you're working on an MCP proxy (A) as a commercial offering, (B) as something for your team to use, closed source, or (C) as something open source for fun. But we just built and started selling an MCP proxy/gateway. It handles identities for humans & bots, tool allowlists, and policy setting for an org. If you don't want to build something on your own because of option B above, get in touch. |
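For readers unfamiliar with the wire format: the capability exchange described above is plain JSON-RPC 2.0. A minimal sketch follows; the `tools/list` method and the name/description/inputSchema fields mirror the MCP spec, but the stock-quote tool itself is invented for illustration:

```python
import json

# A JSON-RPC 2.0 request an MCP client sends to discover tools.
# The "tools/list" method comes from the MCP spec; the id is arbitrary.
request = {"jsonrpc": "2.0", "id": 1, "method": "tools/list"}

# A server's reply: each tool ships with a name, a human-readable
# description, and a JSON Schema for its arguments, so discovery is
# baked into the protocol rather than hoped-for documentation.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_stock_quote",  # made-up example tool
                "description": "Return the latest price for a ticker symbol.",
                "inputSchema": {
                    "type": "object",
                    "properties": {"symbol": {"type": "string"}},
                    "required": ["symbol"],
                },
            }
        ]
    },
}

# Everything round-trips as ordinary JSON on the wire.
wire = json.dumps(response)
tools = json.loads(wire)["result"]["tools"]
print([t["name"] for t in tools])  # ['get_stock_quote']
```

Nothing here requires an LLM on either end, which is the point: any two systems that speak JSON-RPC can run this exchange.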
| |
| ▲ | justusthane 8 days ago | parent | next [-] | | Maybe you've already seen it, but your comment reminded me of this recent article about MCP as a universal protocol (not just for AI): https://worksonmymachine.ai/p/mcp-an-accidentally-universal-... (discussion: https://news.ycombinator.com/item?id=44404905) | |
| ▲ | j45 5 days ago | parent | prev | next [-] | | It could, except LLMs are non-deterministic and the rest of the tech world largely isn't. Aligning those two, and keeping them aligned through every model tweak, model change, and prompt change, can be a lot of babysitting. No doubt there are lots of smart people working on it; I've just been around applications of tech in B2B long enough to know that reliability is usually where the conversation starts. | |
| ▲ | nlawalker 8 days ago | parent | prev [-] | | A concise, no-nonsense list of every endpoint your service offers, with a simple text description and a JSON schema, is all that many developers ever wanted and needed, but no one bothered producing them until we invented a machine that could automatically make use of them. | | |
| ▲ | zaphirplane 8 days ago | parent | next [-] | | That’s not true at all; OpenAPI has it | |
| ▲ | Charon77 8 days ago | parent | prev [-] | | Oh, like OpenAPI? Or, god forbid, HATEOAS? MCP itself still runs on JSON-RPC, which has been a thing for a long time now |
|
|
|
| ▲ | fennecfoxy 8 days ago | parent | prev | next [-] |
| In my opinion, I get the desire to create some sort of specification for an LLM to interface with [everything else], but I don't really see the point of doing it at the inference level by smashing JSON into the context. These models are usually very decent at parsing out stuff like that anyway; we don't need the MCP spec, everyone can just specify the available tools in natural language and then we can expect large-param models to just "figure it out". It would have been more interesting if MCP had been a specification for _training_ models to support tool use at an architectural level, not just training them to ask to use a tool with a special token as they do now. It's an interesting topic because it's the exact same as the boundary between humans (sloppy, organic, analog messes) and traditional programs (rigid types, structures, formats). To be fair, if we can build tool use in architecturally and solve the boundary between these two areas, then it also works for things like objective facts. LLMs are just statistical machines, and data in the context doesn't really mean all that much; we just hope it is statistically relevant given some input, and it is often enough that it works, but not guaranteed. |
| |
| ▲ | dragonwriter 8 days ago | parent | next [-] | | > These models are usually very decent at parsing out stuff like that anyway; we don't need the MCP spec, everyone can just specify the available tools in natural language and then we can expect large param models to just "figure it out". This is mostly the kind of misunderstanding of MCP that the article seems directed at, and much of this response is focused on things that are key points in the article, but: MCP isn't for the models, it is for the toolchains supporting them. The information models actually need about tools and resources is accessed from the server by the toolchain using the information that is in the MCP, and the structure that models use varies by the model, but it is consistently completely different information than what is in the MCP; the tool and resource (but probably not prompt) names from the MCP will probably also be given to the model, but that's pretty much the only direct overlap. MCP can also define prompts for the toolchain, but information about those is more likely presented directly to the user than to the model itself. The toolchain also needs to know how the model is trained to get tool information in its prompt, just like it needs to know other aspects of the model's preferred prompt template, but that is a separate concern from MCP. > If MCP had been a specification for _training_ models to support tool use on an architectural level, not just training it to ask to use a tool with a special token as they do now. MCP isn't a specification for training anything. MCP is a specification for providing information about tools external to the toolchain running the LLM to the toolchain. Tools internal to the toolchain don't ever use MCP because, again, MCP isn't for the model, it's for the toolchain. | |
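A sketch of that division of labor, with assumed shapes throughout: the toolchain consumes MCP tool metadata and renders whatever prompt format its particular model expects. The XML-ish block below is invented; real chat templates differ per model family.

```python
# Sketch: the toolchain, not the model, consumes MCP. It takes tool
# metadata fetched from an MCP server and renders whatever prompt
# format this particular model was trained on.

def render_tools_for_model(mcp_tools: list[dict]) -> str:
    """Convert MCP-style tool entries into a model-specific prompt section."""
    lines = ["<tools>"]
    for tool in mcp_tools:
        lines.append(f'  <tool name="{tool["name"]}">{tool["description"]}</tool>')
    lines.append("</tools>")
    return "\n".join(lines)

# Assumed MCP metadata (name/description fields mirror the spec).
mcp_tools = [
    {"name": "get_weather", "description": "Current weather for a city."},
]
print(render_tools_for_model(mcp_tools))
```

Swapping the model only means swapping `render_tools_for_model`; the MCP side of the exchange stays untouched.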
| ▲ | fennecfoxy 5 days ago | parent [-] | | You've replied multiple times specifying toolchains without explaining what they are. I've seen that for models that don't support tool defs via API, those tool defs are provided in the context (though the model is still trained for tool use, outputting the special python_call/x tokens to indicate a tool call in output). I can see, for example, that MCP's own example using Anthropic uses their API/SDK's tools section as outlined here: https://docs.anthropic.com/en/api/messages#body-tools. What the example does is shove the tool definition in here - this includes the full name, description, etc. of the tool. Quoting them: "And then asked the model "What's the S&P 500 at today?", the model might produce tool_use content blocks in the response", so I imagine that behind the scenes they're _smashing it into the context_ as I already suggested; the only reason it's separate in the API is so they can type/validate it. I don't know what this magical toolchain is, but the LLM is the thing providing output based on the not-so-new magical concepts of attention and statistics; I don't see how some separate "toolchain" piece takes the input string and somehow does a better job at selecting a tool than the model itself, unless the toolchain is itself a smaller LLM trained specifically for tool use outside of your larger multi-purpose/"knowledgeable" LLM. |
| |
| ▲ | zozbot234 8 days ago | parent | prev [-] | | As I mentioned in a sibling thread, you can use that JSON structured input to constrain the LLM's output during inference so that it will only contain valid tool calls, in addition to smashing it into the context. This is valuable since it's going to be far more robust than assuming that the LLM can "figure everything out" from a natural language description. |
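A hedged sketch of the weaker, portable variant of this idea: proper constrained decoding happens inside the inference engine (grammar-guided sampling), but a toolchain can at least validate a proposed tool call against the tool's `inputSchema` after the fact. The schema and checker below are illustrative, not from any MCP SDK:

```python
import json

# Illustrative schema in the shape of an MCP tool's inputSchema.
schema = {
    "type": "object",
    "properties": {"symbol": {"type": "string"}},
    "required": ["symbol"],
}

def valid_call(raw_json: str, schema: dict) -> bool:
    """Minimal check: parses as an object, has all required keys,
    and declared string properties really are strings. A real
    implementation would use a full JSON Schema validator."""
    try:
        args = json.loads(raw_json)
    except json.JSONDecodeError:
        return False
    if not isinstance(args, dict):
        return False
    if any(key not in args for key in schema.get("required", [])):
        return False
    for key, spec in schema.get("properties", {}).items():
        if spec.get("type") == "string" and key in args and not isinstance(args[key], str):
            return False
    return True

print(valid_call('{"symbol": "SPX"}', schema))  # True
print(valid_call('{"ticker": "SPX"}', schema))  # False
```

Rejecting malformed calls before execution is strictly more robust than hoping the model "figured out" the natural-language tool description.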
|
|
| ▲ | dragonwriter 8 days ago | parent | prev | next [-] |
| MCP is a means of communicating information about externally-defined tools to the “application wrapper” (and your examples of “generic chatbots” are also application wrappers). Well, between the application wrapper and servers; “application wrappers” for LLMs are pretty much the motivating (but not sole) case of MCP Clients. Without something like MCP, each application wrapper is left to do its own ad hoc wrappers for external tools (tools internal to the wrapper don’t use MCP). With MCP, it just integrates an MCP client library, and then it can use any tool, resource, or prompt provided by any MCP server available to it. |
|
| ▲ | blitzar 8 days ago | parent | prev | next [-] |
| Personally I find LLMs functionally useless without any external data besides that which I write in the prompt. One MCP that I use is as simple as today's date and time - how else would LLMs know what day of the week it is? |
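A date/time tool's handler really is about this small: the model has no clock, so a tool fetches one. This is a sketch; the function name and result shape are made up, not from any particular MCP server:

```python
# The model has no clock, so a tool supplies one.
from datetime import datetime, timezone

def get_current_datetime() -> dict:
    """Return the current UTC timestamp and weekday as a tool result."""
    now = datetime.now(timezone.utc)
    return {"iso": now.isoformat(), "weekday": now.strftime("%A")}

result = get_current_datetime()
print(result["weekday"])
```

The value of MCP here isn't the handler body but the discovery: any client that speaks the protocol can find and call this without a bespoke integration.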
| |
| ▲ | fennecfoxy 8 days ago | parent [-] | | `${context} ${extra_data} ${user_query}`. That's all MCP is. Concatenating JSON to the context. | | |
| ▲ | dragonwriter 8 days ago | parent | next [-] | | MCP is not concatenating JSON to the context. MCP is providing JSON to the toolchain; except for the names of tools and resources, most of the information in MCP doesn't go to the model at all. The toolchain uses it to connect to tool (resource, etc.) providers, and from there it gets information that it can use either in the context for the LLM or in the UI for the user; the shape in which that information goes into the context depends on the model and has nothing to do with MCP. MCP is just a way for the toolchain to get information about and communicate with external services; the model doesn't (and if this sounds like the title of the article, there is a reason) need to know about it. | |
| ▲ | blitzar 8 days ago | parent | prev | next [-] | | Yeah, but I don't have to type all that context in - not to mention that if I had all that context in hand, I wouldn't need to enter it into an LLM to find out what it says. | |
| ▲ | 8 days ago | parent | prev [-] | | [deleted] |
|
|
|
| ▲ | whalesalad 8 days ago | parent | prev | next [-] |
| MCP sucks because it has to be connected to a desktop client. I'd love to build some MCP-like integrations, but no one on my team can use them. We use LLMs - as you noted - via other means: via Notion, via web UI, via our own API integrations. Until there is a more central way to connect these things - yeah, they won't reach mass adoption. |
| |
| ▲ | dragonwriter 8 days ago | parent | next [-] | | > MCP sucks because it has to be connected to a desktop client. No, it doesn't need to be connected to a desktop client. It is true that the original use was connecting local tools over stdio to a desktop client, and it is currently better supported in desktop clients than elsewhere, but it now includes remote support; ChatGPT Deep Research, for example, supports remote MCP, though only for servers with a very specific shape. | |
| ▲ | prpl 8 days ago | parent | prev | next [-] | | A comprehensive solution for: 1. a user interacting with multiple MCP servers behind a gateway (with MCP client support), which gets authentication from the user to those servers in some way (OAuth/OIDC, usually with PKCE, sometimes token exchange), allowing out-of-band auth; and 2. the same, but built on identity for service accounts/native identity or something, for automation - would enable this. There are a few SEPs open now around this. | |
| ▲ | pjmlp 8 days ago | parent | prev [-] | | No it doesn't, there are ongoing efforts to orchestrate MCPs just like any other kind of Web API. Example, https://www.sitecore.com/products/sitecore-stream |
|
|
| ▲ | j45 8 days ago | parent | prev | next [-] |
| Something like containerized apps is going to be important for security with MCP, or with whatever it becomes, comes from it, or comes afterwards. Getting in reps on thinking through these kinds of problems is valuable, since LLMs are a new type of software and existing software axioms don't always fit. |
|
| ▲ | ivape 8 days ago | parent | prev | next [-] |
| What about LangChain makes more sense? It’s one of the most prematurely complex libs I’ve seen. I’m calling it right now: LangChain is going to run a mind fuck on everyone and convince people that that’s actually how complicated orchestrating LLM control flow should be. The community needs to fight this framework off. That’s beside the point, though. MCP servers let you discover function interfaces that you’ll have to implement yourself (in which case, yeah, what’s the point of this? I want the whole function body). |
| |
| ▲ | fennecfoxy 8 days ago | parent | next [-] | | Yup exactly. It's all just state machines. Really nothing more than that. It's like all these lang* frameworks are pretending that they can solve core deficiencies in the model, whereas most stuff is just workarounds. We do have to glue model stuff together _somehow_ but there's no reason that it needs to be as complex as most of these frameworks are setting out to be. | |
| ▲ | diggan 8 days ago | parent | prev | next [-] | | > The community needs to fight this framework off. Why? The people who have been around for a while already avoid it, because they've either tried it before or poked around in the source and ran away quickly. If people start using stuff without even the slightest amount of thinking beforehand, then that's their prerogative; why would it be up to the community hive-mind to "choose" what tools others should use? | | |
| ▲ | lyu07282 8 days ago | parent [-] | | Agreed except we end up with a lot of junior people in the space who learned and used only langchain, who we then have to unlearn all the langchain nonsense when we hire them. Or we grep -v langchain cvs/ |
| |
| ▲ | dougbright 8 days ago | parent | prev [-] | | My bad. I shouldn’t have mentioned LangChain here because it’s a little beside my point. What I mean is: MCP seems designed for a world where users talk to an LLM, and the LLM calls software tools. For the foreseeable future, especially in a business context, isn’t it more likely that users will still interact with structured software applications, and the applications will call the LLM? In that case, where does MCP fit into that flow? | | |
| ▲ | anthonypasq 8 days ago | parent | next [-] | | It separates FE and BE for agent teams, just like we did with web apps. The team building your agent framework might not know the business domain of every piece of your data/API space that your agent will need to interact with. In that case, it makes sense for your different backend teams to also own the MCP server that your company's agent team will utilize. | | | |
| ▲ | ivape 8 days ago | parent | prev | next [-] | | Yeah, I don’t know. Let’s say an org wants to do discovery of what functions are available for an app across the org. Okay, that’s interesting. But each team could just as well import a big file called all_functions.txt. A Swagger API is already kind of like an MCP, as is really any existing REST API (even better, because you don’t have to implement the interface). If I wanted to give my LLM brand new functionality, all I’d have to do is define out tool use for <random_api>, with zero implementation. I could also just point it to a local file and say here are the functions locally available. Remember, the big hairy secret is that all of these things just plop out a blob of text that you paste back into the LLM prompt (populating context history). That’s all these things do. Someone is going to have to unconfuse me. | |
| ▲ | 8 days ago | parent | next [-] | | [deleted] | |
| ▲ | anthonypasq 8 days ago | parent | prev [-] | | It separates FE and BE for agent teams, just like we did with web apps. The team building your agent framework might not know the business domain of every piece of your data/API space that your agent will need to interact with. In that case, it makes sense for your different backend teams to also own the MCP server that your company's agent team will utilize. | | |
| ▲ | ivape 8 days ago | parent [-] | | Why don’t they just own a REST or RPC server? This is the part of the MCP motivation I’m not totally getting. In fact, you can prove to yourself that your LLM can hook into almost any existing REST API in a few minutes, which gives it more existing options and functionality than just about anything else as it stands now. Things like Swagger or GraphQL already provide discovery. | | |
| ▲ | dragonwriter 8 days ago | parent [-] | | > This is the part of the MCP motivation I’m not totally getting Would it help you to know that the original use case of MCP was communicating information about and facilitating communication with servers that the LLM frontend would run locally and communicate with over stdio, and that remains an important use case? |
|
|
| |
| ▲ | tomhallett 8 days ago | parent | prev [-] | | Total beginner question: if the “structured software application” gives the LLM the prompt “plan out what I need to do for my upcoming vacation to NYC”, will an LLM with a weather tool know “I need to ask for weather so I can make a better packing list”, while an LLM without a weather tool would either make the list without actual weather info OR your application would need to support the LLM asking “tell me what the weather is”, parse that, and then spit the answer back in a chained response? If so, it seems like tools are helpful in letting the LLM drive a bit more, right? | |
| ▲ | Eisenstein 8 days ago | parent [-] | | If you have a weather tool available it will be in a list of available tools, and the LLM may or may not ask to use it; it is not certain that it will, but if it is a 'reasoning' model it probably will. You need to be careful creating a ton of tools and displaying a list of all of them to the model since it can overwhelm them and they can go down rabbit holes of using a bunch of tools to do things that aren't particularly helpful. Hopefully you would have specific prompts and tools that handle certain types of tasks instead of winging it and hoping for the best. |
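The "specific prompts and tools for certain types of tasks" idea can be as simple as a per-task allowlist applied before the tool list ever reaches the model. A sketch with invented task names and tool shapes:

```python
# Hypothetical per-task allowlists: expose only the tools relevant
# to the task at hand instead of overwhelming the model with all of them.
TASK_TOOLS = {
    "travel_planning": {"get_weather", "search_flights"},
    "coding": {"run_tests", "read_file"},
}

def tools_for_task(task: str, all_tools: list[dict]) -> list[dict]:
    """Filter the full tool list down to this task's allowlist."""
    allowed = TASK_TOOLS.get(task, set())
    return [t for t in all_tools if t["name"] in allowed]

all_tools = [{"name": n} for n in ("get_weather", "run_tests", "search_flights", "send_email")]
print([t["name"] for t in tools_for_task("travel_planning", all_tools)])
```

A smaller, curated tool list both reduces the rabbit-hole behavior described above and shrinks the context spent on tool descriptions.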
|
|
|
|
| ▲ | pjmlp 8 days ago | parent | prev | next [-] |
| I see them as the future SOA/WebServices/REST/GraphQL/.... endpoints in many cloud services. And as replacements for AppleScript, COM Automation, and friends on desktop systems. |
|
| ▲ | cyanydeez 8 days ago | parent | prev | next [-] |
| You are wasting your time. Write a REST API, add a description field. Done. |
|
| ▲ | CodeNest 8 days ago | parent | prev [-] |
| [dead] |