MCP: the Model Context Protocol in depth
"MCP is what USB was for peripherals: a single connector that lets any tool talk to any model"
We touched on MCP in Chapter 66. This chapter is the deep dive: the protocol’s design, the primitives it exposes, the transport options, the server/client model, and what makes it the emerging standard for connecting LLMs to tools and data sources.
By the end you’ll know how MCP works at the protocol level, how to write an MCP server, how to integrate MCP into an agent framework, and why MCP matters for the LLM ecosystem.
Outline:
- The integration problem MCP solves.
- The protocol’s design.
- The three primitives: tools, resources, prompts.
- Transports: stdio and HTTP.
- Servers and clients.
- Discovery and lifecycle.
- The MCP server ecosystem.
- Writing your own MCP server.
- MCP in production.
- The standardization story.
69.1 The integration problem
Before MCP, every LLM application had to implement its own tool integrations. If you wanted your chatbot to access GitHub, you wrote GitHub API code in your chatbot. If you wanted it to access Slack, you wrote Slack API code. If you wanted it to access a database, you wrote database connection code. Every integration was its own one-off.
Worse, the integration was tightly coupled to the chatbot. If you wanted to use the same GitHub integration in a different chatbot (say, you switch from OpenAI to Claude), you had to rewrite it.
The result: massive duplication of work. Every LLM application reimplemented the same integrations. There was no way to share.
MCP (Model Context Protocol, Anthropic, late 2024) is the standardized solution. The pitch:
- Tool providers implement an MCP server that exposes their tool over a standard protocol.
- LLM applications act as MCP clients that can connect to any MCP server.
- The connection is bidirectional: the client can call the server’s tools, and the server can send requests back to the client (for example, sampling requests that ask the client’s LLM for a completion).
The result: once a tool has an MCP server, any MCP-compatible client can use it. Write the GitHub integration once; use it in Claude Desktop, in Cursor, in your custom agent, in anything.
This is the same value proposition as USB or HTTP: a standard that lets components interoperate without one-off code per pairing. The pitch is compelling, and adoption has been fast since the late-2024 launch.
69.2 The protocol’s design
MCP is built on JSON-RPC 2.0. Both client and server exchange JSON messages over a chosen transport. Each message is either:
- A request (with an id, a method, and parameters).
- A response (with an id matching a request, and a result or error).
- A notification (a message that doesn’t expect a response).
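As a rough illustration (not the official SDK), the three message shapes can be told apart with a few lines of Python:

```python
import json

def classify(raw: str) -> str:
    """Classify a JSON-RPC 2.0 message as request, response, or notification."""
    msg = json.loads(raw)
    assert msg.get("jsonrpc") == "2.0"
    if "method" in msg and "id" in msg:
        return "request"        # expects a response with the same id
    if "method" in msg:
        return "notification"   # no id, so no response expected
    if "result" in msg or "error" in msg:
        return "response"       # matched to a prior request by id
    raise ValueError("not a valid JSON-RPC 2.0 message")

print(classify('{"jsonrpc": "2.0", "id": 1, "method": "tools/list"}'))  # prints "request"
```

Every MCP message, on either transport, falls into one of these three buckets.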
The protocol is bidirectional: both the client and the server can send requests to each other. The client initiates with an initialize request; the server responds with its capabilities, i.e. the tools, resources, and prompts it offers.
The lifecycle:
- Client connects to server (via stdio or HTTP).
- Client sends initialize with its capabilities.
- Server responds with its capabilities (which tools, resources, prompts it offers).
- Client and server exchange messages over the connection.
- Either side can close the connection.
The protocol is intentionally simple. The complexity is in the primitives the protocol exposes (next section), not in the transport.
69.3 The three primitives: tools, resources, prompts
MCP defines three main primitives that a server can expose to a client:
Tools
Functions the model can call. This is the same as the tool calling we covered in Chapter 66. A tool has:
- Name (e.g., get_weather).
- Description (what it does).
- Input schema (JSON Schema for the arguments).
The client lists the available tools at startup. When the LLM decides to call one, the client sends a tools/call request to the server with the tool name and arguments. The server executes the tool and returns the result.
// Client → Server
{
"jsonrpc": "2.0",
"id": 1,
"method": "tools/call",
"params": {
"name": "get_weather",
"arguments": {"location": "Tokyo"}
}
}
// Server → Client
{
"jsonrpc": "2.0",
"id": 1,
"result": {
"content": [{
"type": "text",
"text": "Tokyo: 22°C, sunny"
}]
}
}
Tools are the most-used MCP primitive. They map directly onto the LLM tool calling pattern.
Resources
Files or data the model can read. Resources are read-only data sources — files, database query results, API responses cached as documents, etc. The server exposes resources as URIs (e.g., file:///path/to/document.txt, postgres://table/row).
The client can:
- List the available resources.
- Read a specific resource by URI.
- Subscribe to changes in a resource (the server notifies on changes).
Resources are different from tools because they’re passive: the server doesn’t do work for each access; it just returns data. The client (or the LLM) decides what to do with the data.
Use cases for resources:
- Exposing files in a directory.
- Exposing rows in a database.
- Exposing entries in a knowledge base.
- Anything where the model needs to read but not modify.
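The passive, read-only nature of resources can be sketched in plain Python (a hypothetical illustration, not the official SDK; the URIs are made up):

```python
# A resource server is essentially a read-only map from URIs to data.
# The server does no work beyond returning the data; the client decides
# what to do with it.
RESOURCES = {
    "file:///notes/todo.txt": lambda: "1. Write MCP server\n2. Test it",
    "db://users/count": lambda: "42",
}

def list_resources() -> list[str]:
    """Analogue of resources/list: advertise the available URIs."""
    return sorted(RESOURCES)

def read_resource(uri: str) -> str:
    """Analogue of resources/read: just return the data for a URI."""
    if uri not in RESOURCES:
        raise KeyError(f"unknown resource: {uri}")
    return RESOURCES[uri]()
```

Contrast this with a tool, which runs arbitrary logic per call: a resource handler only fetches and returns.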
Prompts
Pre-built prompt templates that the server provides. The client can list the available prompts and inject them into the LLM’s context as needed.
A prompt is a named template with parameters. For example, a code review server might offer a “review-pull-request” prompt that takes a PR URL and produces a structured review prompt for the LLM.
// Client → Server
{
"method": "prompts/get",
"params": {
"name": "review-pull-request",
"arguments": {"pr_url": "https://github.com/owner/repo/pull/123"}
}
}
// Server → Client
{
"result": {
"messages": [
{"role": "system", "content": "You are a code reviewer..."},
{"role": "user", "content": "Review this PR: ..."}
]
}
}
Prompts are useful for canned workflows — common LLM interactions that benefit from a curated prompt template.
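A prompts/get handler boils down to expanding a named template into messages. A minimal sketch (hypothetical, not the official SDK):

```python
# Each prompt is a named template that takes arguments and returns
# ready-to-inject chat messages, as in the prompts/get exchange above.
PROMPTS = {
    "review-pull-request": lambda args: [
        {"role": "system", "content": "You are a code reviewer. Be specific."},
        {"role": "user", "content": f"Review this PR: {args['pr_url']}"},
    ],
}

def get_prompt(name: str, arguments: dict) -> list[dict]:
    """Analogue of prompts/get: expand a template into messages."""
    if name not in PROMPTS:
        raise KeyError(f"unknown prompt: {name}")
    return PROMPTS[name](arguments)
```

The client drops the returned messages straight into the LLM's context; the server owns the curation of the template.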
The three primitives together — tools, resources, prompts — give MCP its flexibility. A single MCP server can expose tools (actions), resources (data), and prompts (workflows) simultaneously.
69.4 Transports: stdio and HTTP
MCP supports two main transports:
stdio (standard input/output)
The simplest transport. The client launches the server as a child process and communicates over the process’s stdin and stdout. JSON messages are sent line-by-line.
This is the standard transport for local tools. The MCP server runs on the same machine as the client. There’s no network involved; the messages flow through OS pipes.
stdio is the right transport for:
- Tools that access local resources (files, local databases, local APIs).
- Tools that should run in the user’s security context.
- Development and testing.
Most MCP servers in the open ecosystem use stdio.
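The line-by-line framing over stdio is simple enough to sketch directly (an illustration of the framing idea, assuming newline-delimited JSON; the real SDKs handle this for you):

```python
import json
from io import StringIO

def write_message(stream, msg: dict) -> None:
    """Frame a message for the stdio transport: one JSON object per line."""
    stream.write(json.dumps(msg) + "\n")

def read_message(stream) -> dict:
    """Read and parse the next newline-delimited JSON message."""
    return json.loads(stream.readline())

# Simulate the OS pipe with an in-memory buffer.
pipe = StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list"})
pipe.seek(0)
msg = read_message(pipe)
```

In a real deployment the client writes to the child process's stdin and reads from its stdout; the framing is the same.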
HTTP
The other transport: the server runs as an HTTP service, and the client connects over the network. JSON messages are exchanged via HTTP requests (with optional Server-Sent Events for streaming).
HTTP is the right transport for:
- Remote services (a database, a SaaS API, a corporate intranet tool).
- Multi-tenant deployments (one server, many clients).
- Production deployments where the server needs to be a real service.
HTTP-based MCP servers can have authentication, rate limiting, monitoring, and all the other things you’d want for a production service.
The choice of transport is independent of the protocol — the same MCP server logic can be exposed over either stdio or HTTP. Most production deployments use HTTP for shared services and stdio for local user tools.
69.5 Servers and clients
The MCP ecosystem has two roles:
MCP servers
The components that expose tools, resources, and prompts. Servers are written in any language (Python, TypeScript, Go, Rust, etc. — there are SDKs for all of them). They implement the JSON-RPC interface and respond to client requests.
Examples of MCP servers:
- GitHub MCP server: exposes GitHub’s API as MCP tools.
- Filesystem MCP server: exposes local files as resources and read/write tools.
- Postgres MCP server: exposes a database as resources and query tools.
- Slack MCP server: exposes Slack channels and messages.
- Web search MCP server: exposes search engines as tools.
- Memory MCP server: exposes a persistent key-value memory.
The Anthropic team and the community have built dozens of reference MCP servers for common tools. They’re available on GitHub.
MCP clients
The components that use MCP servers. Clients are typically LLM applications: chatbots, IDEs, agent frameworks. The client maintains a connection to one or more servers and exposes their tools/resources/prompts to the LLM.
Examples of MCP clients:
- Claude Desktop: Anthropic’s desktop app. The original MCP client.
- Cursor: the AI-first code editor. Uses MCP to integrate with tools.
- Zed: an editor with MCP support.
- Continue: an open-source IDE assistant with MCP support.
- Custom agents: any agent framework can implement MCP client support.
The client lists tools from connected servers and presents them to the LLM. When the LLM calls a tool, the client routes the call to the right server.
sequenceDiagram
participant Client as MCP Client
participant Server as MCP Server
Client->>Server: initialize(capabilities)
Server-->>Client: capabilities (tools, resources, prompts)
Client->>Server: tools/list
Server-->>Client: [{name, description, inputSchema}, …]
Note over Client: LLM decides to call get_weather
Client->>Server: tools/call {name:"get_weather", arguments:{…}}
Server-->>Client: {content:[{type:"text", text:"22°C, sunny"}]}
Client->>Server: (connection close)
The MCP lifecycle is initialize → discover → call → close; the protocol is symmetric JSON-RPC so both sides can send requests.
69.6 Discovery and lifecycle
The MCP lifecycle:
Initialization:
- Client launches/connects to server.
- Client sends initialize with its capabilities (e.g., “I support roots and sampling”).
- Server responds with its capabilities (which tools, resources, prompts).
- Client sends the initialized notification.
Discovery:
- Client sends tools/list to get the available tools.
- Client sends resources/list to get the available resources.
- Client sends prompts/list to get the available prompts.
Use:
- Client sends tools/call to invoke a tool.
- Client sends resources/read to read a resource.
- Client sends prompts/get to fetch a prompt.
Notifications:
- Server sends notifications/tools/list_changed if its tool list changes.
- Server sends notifications/resources/updated if a subscribed resource changes.
Shutdown:
- Client closes the connection.
The protocol supports dynamic tool lists — a server can add or remove tools at runtime, and notify the client. This is useful for tools that depend on the user’s current context (e.g., a “current open file” tool).
69.7 The MCP server ecosystem
As of late 2025, there are hundreds of MCP servers available in the open ecosystem. The major categories:
File and filesystem:
- mcp-server-filesystem: read and write local files.
- mcp-server-git: git operations.
Code and development:
- mcp-server-github: GitHub API.
- mcp-server-gitlab: GitLab API.
- mcp-server-puppeteer: browser automation.
Data:
- mcp-server-postgres: Postgres queries.
- mcp-server-sqlite: SQLite queries.
- mcp-server-google-drive: Google Drive files.
Communication:
- mcp-server-slack: Slack integration.
- mcp-server-gmail: Gmail.
- mcp-server-everart: image generation.
Search and knowledge:
- mcp-server-brave-search: Brave Search API.
- mcp-server-fetch: HTTP fetching.
- mcp-server-memory: persistent memory.
Time and dates:
- mcp-server-time: timezone-aware date/time tools.
Custom:
- Many companies are building MCP servers for their internal tools (Linear, Jira, ServiceNow, etc.).
The list grows weekly. The MCP repository on GitHub maintains an awesome-list of community servers.
The pattern: for almost any tool you’d want an LLM to access, someone has probably already written an MCP server. Check before building your own.
69.8 Writing your own MCP server
Writing an MCP server is straightforward. The Python SDK example:
from mcp.server import Server
from mcp.server.stdio import stdio_server
from mcp.types import Tool, TextContent

server = Server("my-server")

@server.list_tools()
async def list_tools() -> list[Tool]:
    return [
        Tool(
            name="echo",
            description="Echo back the input",
            inputSchema={
                "type": "object",
                "properties": {
                    "message": {"type": "string"}
                },
                "required": ["message"]
            }
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict) -> list[TextContent]:
    if name == "echo":
        return [TextContent(type="text", text=arguments["message"])]
    raise ValueError(f"Unknown tool: {name}")

async def main():
    async with stdio_server() as (read_stream, write_stream):
        await server.run(read_stream, write_stream, server.create_initialization_options())

if __name__ == "__main__":
    import asyncio
    asyncio.run(main())
That’s a complete MCP server with one tool. Run it as a subprocess from any MCP client and it just works.
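To try the server from Claude Desktop, register it in the app’s claude_desktop_config.json under the mcpServers key (the file’s location varies by OS; the server name and path below are placeholders for your own):

```json
{
  "mcpServers": {
    "my-server": {
      "command": "python",
      "args": ["/path/to/my_server.py"]
    }
  }
}
```

On restart, the client launches the server as a subprocess over stdio and the echo tool appears in the tool list.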
For HTTP transport, replace stdio_server with the HTTP transport. The rest is the same.
The TypeScript SDK is similarly clean. SDKs exist for Go, Rust, Java, C#, and several other languages.
69.9 MCP in production
Operational considerations for MCP in production:
(1) Security. MCP servers can do anything they’re programmed to do. Connecting a server grants your application whatever privileges that server holds. Be careful what you connect.
(2) Authentication. HTTP-based MCP servers should authenticate clients. The protocol supports auth headers; use them.
(3) Rate limiting. A misbehaving client can hammer an MCP server with requests. Apply rate limits at the server.
(4) Versioning. MCP itself is versioned. So are individual servers. Pin versions in production to avoid surprises.
(5) Monitoring. Log every tool call (with arguments and results) for audit. This is important for debugging and for compliance.
(6) Sandboxing. For untrusted MCP servers, run them in isolated environments (containers, VMs). Don’t give them access to anything they don’t need.
(7) Performance. MCP calls add latency to agent loops. For performance-critical tools, consider implementing them directly in the application instead of going through MCP.
(8) Tenant isolation. Multi-tenant MCP servers need to enforce per-tenant access controls.
These are standard concerns for any service interface. MCP doesn’t make them go away; it just gives you a standard format to work with.
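Points (3) and (5) can be applied at a single choke point: wrap the tool handler before it reaches the protocol layer. A minimal sketch (hypothetical names; a production version would persist the audit log and use per-client limits):

```python
import time
from collections import deque

class ToolGuard:
    """Wrap a tool handler with audit logging and a sliding-window rate limit."""

    def __init__(self, handler, max_calls: int = 10, window_s: float = 60.0):
        self.handler = handler
        self.max_calls = max_calls
        self.window_s = window_s
        self.calls: deque = deque()     # timestamps of recent calls
        self.audit_log: list = []       # one record per successful call

    def __call__(self, name: str, arguments: dict):
        now = time.monotonic()
        # Drop timestamps that have aged out of the window.
        while self.calls and now - self.calls[0] > self.window_s:
            self.calls.popleft()
        if len(self.calls) >= self.max_calls:
            raise RuntimeError("rate limit exceeded")
        self.calls.append(now)
        result = self.handler(name, arguments)
        self.audit_log.append({"tool": name, "args": arguments, "result": result})
        return result

# Example: guard a trivial handler, allowing two calls per minute.
guard = ToolGuard(lambda name, args: f"{name} ok", max_calls=2)
```

The same wrapper shape works for sandboxed execution or per-tenant checks: intercept the call, enforce the policy, then delegate.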
69.10 The standardization story
MCP is the most successful protocol standardization in the LLM ecosystem so far. The reasons:
- Anthropic backed it. A major lab promoting an open standard gives it momentum.
- The protocol is simple. JSON-RPC over stdio/HTTP. No proprietary parts.
- The SDKs are good. Easy to write servers in any language.
- The reference clients (Claude Desktop) are popular. Many users are already exposed to MCP.
- The ecosystem grew fast. Within months, hundreds of servers were available.
As of 2025, MCP is becoming the standard way to extend LLMs with tools and data. Major IDEs (Cursor, Zed, Continue) have adopted it. Many companies are building MCP servers for their products. The ecosystem is converging.
The competing approaches — proprietary plugin systems (OpenAI’s Plugins, Anthropic’s earlier tool patterns), framework-specific tool registries (LangChain tools, CrewAI tools) — are losing ground to MCP. The standardization is the key advantage.
For new agent applications in 2025, MCP is the right tool integration choice. It’s the standard, it’s open, the ecosystem is growing, and you don’t lock into any single framework.
69.11 The mental model
Eight points to take into Chapter 70:
- MCP is a standard protocol for connecting LLMs to tools and data sources.
- Three primitives: tools (callable functions), resources (readable data), prompts (templates).
- Two transports: stdio (local) and HTTP (remote).
- Servers expose tools; clients (LLM apps) consume them.
- Hundreds of community-built servers are available for common tools.
- Writing a server is ~30 lines of code with the SDK.
- MCP is the emerging standard in the LLM ecosystem.
- Production concerns: security, auth, rate limiting, sandboxing, monitoring.
In Chapter 70 we look at the related but distinct question: when to use a workflow engine vs an agent loop.
Read it yourself
- The MCP specification at modelcontextprotocol.io.
- The MCP GitHub organization (github.com/modelcontextprotocol) for the SDKs and reference servers.
- The Anthropic blog post introducing MCP (November 2024).
- The Cursor / Zed / Claude Desktop documentation on MCP integration.
- The awesome-mcp-servers list on GitHub.
Practice
- Read the MCP specification’s “primitives” section. Can you implement each in an MCP server?
- Write an MCP server with one tool that returns the current date and time.
- Why does MCP support both stdio and HTTP transports? When would you use each?
- Connect an existing MCP server (e.g., the filesystem server) to Claude Desktop and use it.
- Why has MCP succeeded as a standard where earlier tool protocols failed? List three reasons.
- Identify a tool you’d want an LLM to access and design the MCP server interface for it.
- Stretch: Write an MCP server in Python that wraps a real API (e.g., the OpenWeatherMap API). Connect it to Claude Desktop or Cursor and verify it works.