DonHopkins 2 hours ago

There's a fundamental architectural difference being missed here: MCP operates BETWEEN LLM complete calls, while skills operate DURING them. Every MCP tool call requires a full round-trip: generation stops, the client waits for the external tool, and a new complete call starts with the result. N tool calls = N round-trips. Skills work differently. Once loaded into context, the LLM can iterate, recurse, compose, and run multiple agents all within a single generation. No stopping. No serialization.

Skills can be MASSIVELY more efficient and powerful than MCP, if designed and used right.

Leela MOOLLM Demo Transcript: https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...

  2. Architecture: Skills as Knowledge Units

  A skill is a modular unit of knowledge that an LLM can load, understand, and apply. 
  Skills self-describe their capabilities, advertise when to use them, and compose with other skills.

  Why Skills, Not Just MCP Tool Calls?
  MCP (Model Context Protocol) tool calls are powerful, but each call requires a full round-trip:

  MCP Tool Call Overhead (per call):
  ┌─────────────────────────────────────────────────────────┐
  │ 1. Tokenize prompt                                      │
  │ 2. LLM complete → generates tool call                   │
  │ 3. Stop generation, universe destroyed                  │
  │ 4. Async wait for tool execution                        │
  │ 5. Tool returns result                                  │
  │ 6. New LLM complete call with result                    │
  │ 7. Detokenize response                                  │
  └─────────────────────────────────────────────────────────┘
  × N calls = N round-trips = latency, cost, context churn

  Skills operate differently. Once loaded into context, skills can:

  Iterate:
      MCP: One call per iteration
      Skills: Loop within single context
  Recurse:
      MCP: Stack of tool calls
      Skills: Recursive reasoning in-context
  Compose:
      MCP: Chain of separate calls
      Skills: Compose within single generation
  Parallel characters:
      MCP: Separate sessions
      Skills: Multiple characters in one call
  Replicate:
      MCP: N calls for N instances
      Skills: Grid of instances in one pass
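
The round-trip accounting in the excerpt above can be sketched as a toy simulation. This is a minimal sketch, not MOOLLM or MCP code: `fake_complete`, `run_tool`, and `count_round_trips` are hypothetical stand-ins, and the stubbed model simply requests a tool until it has received `n_tool_calls` results.

```python
from dataclasses import dataclass, field


@dataclass
class Transcript:
    """Conversation state carried between completion calls."""
    messages: list = field(default_factory=list)
    completions: int = 0  # how many times the model was invoked


def fake_complete(transcript: Transcript, n_tool_calls: int) -> dict:
    """Hypothetical stand-in for one LLM completion call.

    Requests a tool until n_tool_calls results are in the transcript,
    then returns a final answer.
    """
    transcript.completions += 1
    results_so_far = sum(1 for m in transcript.messages if m["role"] == "tool")
    if results_so_far < n_tool_calls:
        return {"type": "tool_call", "name": "step", "args": {"i": results_so_far}}
    return {"type": "final", "text": f"done after {results_so_far} steps"}


def run_tool(call: dict) -> str:
    """Hypothetical external tool; generation has stopped while this runs."""
    return f"result of step {call['args']['i']}"


def count_round_trips(n_tool_calls: int) -> tuple[int, int]:
    """Drive the tool loop and return (round_trips, completion_calls)."""
    transcript = Transcript()
    round_trips = 0
    while True:
        reply = fake_complete(transcript, n_tool_calls)
        if reply["type"] == "final":
            return round_trips, transcript.completions
        # Generation stopped; the result re-enters via a new completion call.
        transcript.messages.append({"role": "tool", "content": run_tool(reply)})
        round_trips += 1
```

With this stub, `count_round_trips(5)` makes 5 round-trips and 6 completion calls, while work done entirely in-context corresponds to `count_round_trips(0)`: one completion call and no round-trips.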

I call this "speed of light" as opposed to "carrier pigeon". In my experiments I ran 33 game turns with 10 characters playing Fluxx (dialogue, game mechanics, emotional reactions) in a single context window and completion call. Try that with MCP and you're making hundreds of round-trips, each suffering from token quantization, noise, and cost. Skills can compose and iterate at the speed of light without any detokenization/tokenization cost and distortion, while MCP forces serialization and waiting for carrier pigeons.

speed-of-light skill: https://github.com/SimHacker/moollm/tree/main/skills/speed-o...
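
As a rough back-of-the-envelope check on "hundreds of round-trips", the sketch below assumes one tool round-trip per character per turn. That granularity and the per-round-trip latency are illustrative assumptions, not measurements from the experiment.

```python
def round_trips(turns: int, characters: int, calls_per_character_turn: int = 1) -> int:
    """Round-trips needed if every character action is a separate tool call (assumed)."""
    return turns * characters * calls_per_character_turn


def added_latency_seconds(trips: int, seconds_per_round_trip: float) -> float:
    """Extra wall-clock time spent on round-trips alone, for an assumed latency."""
    return trips * seconds_per_round_trip


trips = round_trips(turns=33, characters=10)
print(trips)  # 330
print(added_latency_seconds(trips, seconds_per_round_trip=2.0))  # 660.0 (2.0 s is an assumed figure)
```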

Skills also compose. MOOLLM's cursor-mirror skill introspects Cursor's internals via a sister Python script that reads Cursor's chat history and SQLite databases: tool calls, context assembly, thinking blocks, chat history. Everything is there for all time. Even after Cursor's chat has summarized and forgotten, it's all still there and searchable!

cursor-mirror skill: https://github.com/SimHacker/moollm/tree/main/skills/cursor-...
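
The idea of searching stored chat history can be sketched with Python's standard `sqlite3` module. This is not cursor-mirror's code, and the schema is an assumption: the `messages` table and its columns are hypothetical, built in memory here so the example runs anywhere. Cursor's actual database layout is not shown.

```python
import sqlite3


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with a hypothetical chat-history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, kind TEXT, body TEXT)")
    conn.executemany(
        "INSERT INTO messages (kind, body) VALUES (?, ?)",
        [
            ("chat", "please refactor the parser"),
            ("tool_call", "read_file parser.py"),
            ("thinking", "the parser has two entry points"),
            ("chat", "now add tests"),
        ],
    )
    return conn


def search_history(conn: sqlite3.Connection, needle: str) -> list[tuple[int, str, str]]:
    """Return (id, kind, body) rows whose body contains the search text."""
    pattern = f"%{needle}%"
    cursor = conn.execute(
        "SELECT id, kind, body FROM messages WHERE body LIKE ? ORDER BY id",
        (pattern,),
    )
    return cursor.fetchall()


conn = build_demo_db()
for row in search_history(conn, "parser"):
    print(row)  # rows of every kind that mention "parser"
```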

MOOLLM's skill-snitch skill composes with cursor-mirror for security monitoring of untrusted skills, and for performance testing and optimization of trusted ones. Like Little Snitch watches your network, skill-snitch watches skill behavior, comparing declared tools and documentation against observed runtime behavior.

skill-snitch skill: https://github.com/SimHacker/moollm/tree/main/skills/skill-s...
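
The declared-versus-observed comparison can be sketched as a set difference. This is a minimal illustration of the idea, not skill-snitch's implementation: `SnitchReport`, `compare_behavior`, and the tool names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnitchReport:
    undeclared: frozenset  # used at runtime but never declared
    unused: frozenset      # declared but never observed in use

    @property
    def clean(self) -> bool:
        """True when nothing was used beyond what the skill declared."""
        return not self.undeclared


def compare_behavior(declared_tools, observed_calls) -> SnitchReport:
    """Compare a skill's declared tools with the tool calls seen at runtime."""
    declared = frozenset(declared_tools)
    observed = frozenset(observed_calls)
    return SnitchReport(undeclared=observed - declared, unused=declared - observed)


report = compare_behavior(
    declared_tools=["read_file", "grep"],
    observed_calls=["read_file", "read_file", "shell"],
)
print(sorted(report.undeclared))  # ['shell']
print(sorted(report.unused))      # ['grep']
print(report.clean)               # False
```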

You can even use skill-snitch like a virus scanner to review and monitor untrusted skills. I have more than 100 skills and had skill-snitch review each one including itself -- you can find them in the skill-snitch-report.md file of each skill in MOOLLM. Here is skill-snitch analyzing and reporting on itself, for example:

skill-snitch's skill-snitch-report.md: https://github.com/SimHacker/moollm/blob/main/skills/skill-s...

MOOLLM's thoughtful-commitment skill also composes with cursor-mirror to trace the reasoning behind git commits.

thoughtful-commitment skill: https://github.com/SimHacker/moollm/tree/main/skills/thought...

MCP is still valuable for connecting to external systems. But for reasoning, simulation, and skills calling skills? In-context beats tool-call round-trips by orders of magnitude.