Show HN: Marmot, context layer for agents and humans (marmotdata.io)
15 points by bschaatsbergen 6 hours ago | 7 comments

Hi HN, Bruno here, one of the two people building this.

For years we got away with poor context because people fill gaps. If you didn't know what a column meant or where a database lived, you asked someone. You knew who to go to. That informal layer, who to ask and what things mean, carried us for years. Agents can't do that. An agent only knows what you hand it. It doesn't ask, it guesses. So what people carried in their heads now has to live somewhere a machine can reach: what your data is, what it means, who owns it, what it connects to.

Concretely, Marmot is a catalog. It catalogs your services, APIs, queues, topics, databases, pipelines, and more, then exposes that over a built-in MCP server for agents and a UI/API for people. You populate it from Terraform, Kubernetes, Pulumi, the API or the CLI.

MIT licensed. Self-host for free.

kerlenton 3 hours ago | parent | next [-]

The catalog approach is a good fit for MCP as well. One thing I would be interested in: once all of your services/APIs/DBs are exposed via one MCP server, the next choke point becomes the model selecting the correct tool. Past the first few dozen tools, agents select the wrong tool (or none at all) more often than you would expect.

How does Marmot cope with that? Are all of the tools exposed in a flat way, or is there a scoping/search step that lets an agent choose between only a few tools out of the catalog?

charlie-haley 3 hours ago | parent [-]

Hey, good question. Marmot is designed to be as generic as possible. An "Asset", whether it's a database, glossary term, topic, API or anything else, has the exact same schema, API endpoint and MCP tool. The MCP server exposes 3 tools: discover_data, find_ownership and lookup_term. Scoping happens as filters and arguments to discover_data, and it's summary-first, so a broad query returns counts or provider breakdowns rather than dumping every asset into context, and the agent narrows from there. The search step is really the primary interface here.
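For illustration only, here is a minimal Python sketch of the summary-first idea described above. It is not Marmot's code or API: the `Asset` fields, the `asset_type`/`provider` filter names, the `SUMMARY_THRESHOLD` cutoff and the shape of the returned dict are all assumptions made up for this example. Only the tool name discover_data and the "counts and provider breakdowns first, then narrow" behavior come from the comment.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Asset:
    # Hypothetical fields; the real Marmot asset schema is not shown in this thread.
    name: str
    asset_type: str  # e.g. "database", "topic", "api"
    provider: str    # e.g. "postgres", "kafka"
    owner: str


# Assumed cutoff: above this many matches, return a summary instead of assets.
SUMMARY_THRESHOLD = 3


def discover_data(catalog, query="", asset_type=None, provider=None):
    """Hypothetical stand-in for a summary-first discovery tool."""
    matches = [
        a for a in catalog
        if query.lower() in a.name.lower()
        and (asset_type is None or a.asset_type == asset_type)
        and (provider is None or a.provider == provider)
    ]
    if len(matches) > SUMMARY_THRESHOLD:
        # Broad query: return counts and breakdowns, not every asset.
        return {
            "mode": "summary",
            "total": len(matches),
            "by_type": dict(Counter(a.asset_type for a in matches)),
            "by_provider": dict(Counter(a.provider for a in matches)),
        }
    # Narrow query: few enough matches to list them directly.
    return {
        "mode": "assets",
        "total": len(matches),
        "assets": [a.name for a in matches],
    }


CATALOG = [
    Asset("orders_db", "database", "postgres", "payments"),
    Asset("users_db", "database", "postgres", "identity"),
    Asset("orders.created", "topic", "kafka", "payments"),
    Asset("orders.shipped", "topic", "kafka", "logistics"),
    Asset("orders-api", "api", "openapi", "payments"),
]

broad = discover_data(CATALOG)  # no filters: summary only
narrow = discover_data(CATALOG, query="orders", asset_type="topic")
print(broad)
print(narrow)
```

In this sketch the unfiltered call returns only totals and breakdowns, and the agent is expected to follow up with filters (here `query="orders"`, `asset_type="topic"`) to get actual asset names into context.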

kerlenton 2 hours ago | parent [-]

Makes sense. Three generic tools plus summary-first is a nice way to sidestep the too-many-tools problem. It does look like it moves the choke point rather than eliminating it, though: instead of "choose the proper tool out of many", the problem becomes "formulate the appropriate discover_data query from the summary".

But in real applications, does the model reliably drill down from the general summary, or does it often just hang around at the level of the summary?

bonigv 6 hours ago | parent | prev [-]

Bruno, if we are operating the agent (assuming a coding agent) in an environment where the build and test tools as well as the source code live, why would we need an additional step to supply context? Wouldn't even the most basic of agents be able to operate on those tools and build the discovery themselves?
