Lesson 9 of 46 ~25 min

MCP Architecture Deep Dive

Understand the Model Context Protocol from the ground up — resources, tools, prompts, sampling, and the transport layer that connects Claude to the outside world.

The Model Context Protocol (MCP) is an open standard that defines how AI models communicate with external systems. It replaces ad-hoc tool integrations with a structured, composable interface. Opus 4.6’s improved tool-calling reliability — fewer hallucinated parameters, better parallel tool execution — makes MCP significantly more powerful than with previous models.

MCP vs. Raw Tool Use

Before MCP, every integration was custom:

Traditional:  App → custom glue code → API → custom parsing → model
MCP:          App → MCP client → MCP server → any resource

MCP standardizes the protocol so you write the server once and every MCP-compatible client can use it.

The Four Primitives

MCP defines four core primitives. Every interaction with an MCP server uses one or more of these:

graph TB
    subgraph "MCP Server"
        R[Resources<br/>Read-only data]
        T[Tools<br/>Actions with side effects]
        P[Prompts<br/>Reusable instructions]
        S[Sampling<br/>Model invocation]
    end
    subgraph "MCP Client"
        C[Claude / Host Application]
    end
    C -->|"list, read"| R
    C -->|"call"| T
    C -->|"get"| P
    S -->|"request completion"| C

1. Resources — Data Access

Resources expose read-only data to the model. They are identified by URIs and return content with a MIME type.

# Resource examples
"file:///Users/dev/project/src/main.py"      # Local file
"postgres://localhost/mydb/users"              # Database table
"github://owner/repo/issues"                   # GitHub issues
"config://app/settings"                        # Application config

Resources are application-controlled: the host application decides which resources to attach to the conversation, for example by letting the user pick them or by selecting them automatically from context. This contrasts with tools, which the model itself chooses to invoke.
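The URI examples above all share the same shape: a scheme, an optional authority, and a path. As a minimal sketch of how a server might dispatch on those parts, the snippet below uses only the standard library. The `describe_resource` and `default_mime_type` helpers and the scheme-to-MIME mapping are hypothetical and are not part of MCP or any SDK.

```python
from urllib.parse import urlsplit


def describe_resource(uri: str) -> dict:
    """Split a resource URI into the parts a server might dispatch on."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,     # e.g. "file", "postgres", "config"
        "authority": parts.netloc,  # host or namespace; empty for file:///
        "path": parts.path,
    }


# Hypothetical fallback MIME types per scheme (an assumption for illustration).
DEFAULT_MIME = {
    "file": "text/plain",
    "postgres": "application/json",
    "github": "application/json",
    "config": "application/json",
}


def default_mime_type(uri: str) -> str:
    """Pick a fallback MIME type from the URI scheme."""
    return DEFAULT_MIME.get(urlsplit(uri).scheme, "application/octet-stream")
```

A server with many resource types could use the scheme to choose a handler and the path to locate the item, rather than comparing full URI strings one by one.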

from mcp.server import Server
from mcp.types import Resource, TextContent

server = Server("demo")

@server.list_resources()
async def list_resources():
    return [
        Resource(
            uri="config://app/database",
            name="Database Configuration",
            mimeType="application/json",
            description="Current database connection settings"
        )
    ]

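@server.read_resource()
async def read_resource(uri):
    # The SDK passes the URI as a URL object, so compare its string form.
    if str(uri) == "config://app/database":
        # Return the raw contents (str or bytes); the SDK wraps them
        # in the protocol response.
        return '{"host": "localhost", "port": 5432, "db": "myapp"}'
    # Fail loudly instead of silently returning None for unknown URIs.
    raise ValueError(f"Unknown resource: {uri}")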

2. Tools — Actions

Tools perform actions that may have side effects. They accept structured input (JSON Schema) and return results. This is the primitive Claude calls when it needs to do something.

from mcp.types import Tool

@server.list_tools()
async def list_tools():
    # Return typed Tool objects rather than plain dicts.
    return [
        Tool(
            name="run_query",
            description="Execute a read-only SQL query against the database",
            inputSchema={
                "type": "object",
                "properties": {
                    "sql": {
                        "type": "string",
                        "description": "SQL SELECT query to execute"
                    },
                    "limit": {
                        "type": "integer",
                        "description": "Maximum rows to return",
                        "default": 100
                    }
                },
                "required": ["sql"]
            }
        )
    ]

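import json

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name != "run_query":
        raise ValueError(f"Unknown tool: {name}")
    sql = arguments["sql"]
    # JSON Schema defaults are not applied automatically, so fall back here.
    limit = arguments.get("limit", 100)
    # `db` stands in for an async database client defined elsewhere.
    results = await db.execute(sql, limit=limit)
    # Tool results are returned as content blocks, so serialize the rows to text.
    payload = json.dumps({"rows": results, "count": len(results)})
    return [TextContent(type="text", text=payload)]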
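Because the model produces the tool arguments, a handler is safer if it checks them against the declared schema before acting. The following is a minimal sketch that covers only `required`, basic `type` checks, and `default` values. It is not a full JSON Schema validator, and `validate_arguments` is a hypothetical helper rather than part of the MCP SDK.

```python
RUN_QUERY_SCHEMA = {
    "type": "object",
    "properties": {
        "sql": {"type": "string"},
        "limit": {"type": "integer", "default": 100},
    },
    "required": ["sql"],
}

# Mapping from JSON Schema type names to Python types.
TYPE_MAP = {
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "object": dict,
    "array": list,
}


def validate_arguments(schema: dict, arguments: dict) -> dict:
    """Check required keys and basic types, then fill in schema defaults."""
    properties = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in arguments:
            raise ValueError(f"Missing required argument: {key}")
    validated = {}
    for key, value in arguments.items():
        spec = properties.get(key)
        if spec is None:
            # Stricter than JSON Schema's default, which allows extra keys.
            raise ValueError(f"Unexpected argument: {key}")
        type_name = spec.get("type")
        expected = TYPE_MAP.get(type_name)
        # bool is a subclass of int in Python, so reject it for numeric types.
        bool_as_number = isinstance(value, bool) and type_name in ("integer", "number")
        if expected and (bool_as_number or not isinstance(value, expected)):
            raise ValueError(f"Argument {key!r} should be of type {type_name}")
        validated[key] = value
    # Apply defaults for optional arguments the caller left out.
    for key, spec in properties.items():
        if key not in validated and "default" in spec:
            validated[key] = spec["default"]
    return validated
```

Calling `validate_arguments(RUN_QUERY_SCHEMA, arguments)` at the top of a tool handler turns malformed calls into clear errors that can be reported back to the model.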

3. Prompts — Reusable Instructions

Prompts are server-defined templates that guide model behavior. They let the server author encode domain expertise into reusable instruction sets.

from mcp.types import Prompt, PromptArgument

@server.list_prompts()
async def list_prompts():
    # Return typed Prompt objects rather than plain dicts.
    return [
        Prompt(
            name="analyze_table",
            description="Analyze a database table for schema issues",
            arguments=[
                PromptArgument(
                    name="table_name",
                    description="Name of the table to analyze",
                    required=True
                )
            ]
        )
    ]

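from mcp.types import GetPromptResult, PromptMessage

@server.get_prompt()
async def get_prompt(name: str, arguments: dict):
    if name != "analyze_table":
        raise ValueError(f"Unknown prompt: {name}")
    table = arguments["table_name"]
    text = f"""Analyze the '{table}' table. Check for:
1. Missing indexes on foreign keys
2. Columns that should have NOT NULL constraints
3. Potential normalization issues
4. Data type mismatches

Use the run_query tool to inspect the schema and sample data."""
    # Message content is a typed content block, not a bare string.
    return GetPromptResult(
        messages=[
            PromptMessage(
                role="user",
                content=TextContent(type="text", text=text)
            )
        ]
    )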

4. Sampling — Reverse Invocation

Sampling is unique — it lets the server request a model completion from the client. This enables agentic loops where the server drives multi-step reasoning.

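# Server-side: request a completion from the host model.
# Assumes the Python SDK, where sampling goes through the active client
# session and must be called from inside a request handler.
from mcp.types import SamplingMessage, TextContent

session = server.request_context.session
result = await session.create_message(
    messages=[
        SamplingMessage(
            role="user",
            content=TextContent(type="text", text=f"Summarize this error log:\n{error_log}")
        )
    ],
    max_tokens=500
)
summary = result.content.text  # Assumes the client returned text content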

Sampling requires explicit user approval in most host applications. It is the most powerful primitive and the most restricted.

Transport Layer

MCP supports two transport mechanisms:

graph LR
    subgraph "stdio Transport"
        A1[MCP Client] -->|stdin| B1[MCP Server Process]
        B1 -->|stdout| A1
    end
    subgraph "HTTP + SSE Transport"
        A2[MCP Client] -->|"HTTP POST /message"| B2[MCP Server]
        B2 -->|"SSE stream"| A2
    end

stdio — Local Processes

The server runs as a child process. Communication happens over stdin/stdout using JSON-RPC 2.0 messages.

// Client → Server (stdin)
{"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}}

// Server → Client (stdout)
{"jsonrpc": "2.0", "id": 1, "result": {"tools": [...]}}

When to use: Local development, desktop applications like Claude Desktop, single-user setups.
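On the stdio transport, each JSON-RPC message is written as a single line of JSON terminated by a newline. The sketch below shows that framing with an in-memory stream standing in for the child process's stdin. The `write_message` and `read_messages` helpers are hypothetical names, not SDK functions.

```python
import io
import json


def write_message(stream, message: dict) -> None:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    # json.dumps escapes embedded newlines, so each message stays on one line.
    stream.write(json.dumps(message) + "\n")
    stream.flush()


def read_messages(stream):
    """Yield one parsed JSON-RPC message per non-empty line."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# An in-memory buffer stands in for the server process's stdin.
pipe = io.StringIO()
write_message(pipe, {"jsonrpc": "2.0", "id": 1, "method": "tools/list", "params": {}})
write_message(pipe, {"jsonrpc": "2.0", "id": 2, "method": "resources/list", "params": {}})
pipe.seek(0)
received = list(read_messages(pipe))
```

Because stdout carries protocol messages, a stdio server should send its own logging to stderr instead.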

HTTP + SSE — Remote Servers

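The client sends requests via HTTP POST. The server pushes responses and notifications through Server-Sent Events. Revision 2025-03-26 of the MCP specification replaced this transport with Streamable HTTP, which uses a single endpoint and can still stream responses over SSE; the HTTP + SSE form shown here is kept for backward compatibility with older clients and servers.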

POST /message HTTP/1.1
Content-Type: application/json

{"jsonrpc": "2.0", "id": 1, "method": "tools/call", "params": {...}}

---

SSE Response:
event: message
data: {"jsonrpc": "2.0", "id": 1, "result": {...}}

When to use: Multi-user deployments, remote servers, cloud infrastructure, production environments.
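On the receiving side, the client has to split the SSE stream into events before it can decode the JSON-RPC payloads. The following is a minimal sketch of that parsing, handling only the `event:` and `data:` fields and comment lines; a production client would normally rely on an SSE library. `parse_sse` is a hypothetical helper.

```python
import json


def _field_value(line: str, prefix: str) -> str:
    """Return the text after 'prefix', dropping one optional leading space."""
    value = line[len(prefix):]
    return value[1:] if value.startswith(" ") else value


def parse_sse(stream_text: str) -> list:
    """Split SSE text into events; a blank line terminates each event."""
    events = []
    event_type, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            if data_lines:
                events.append({"event": event_type, "data": "\n".join(data_lines)})
            event_type, data_lines = "message", []
        elif line.startswith(":"):
            continue  # Comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_type = _field_value(line, "event:")
        elif line.startswith("data:"):
            data_lines.append(_field_value(line, "data:"))
    return events


sample = (
    "event: message\n"
    'data: {"jsonrpc": "2.0", "id": 1, "result": {"ok": true}}\n'
    "\n"
    ": keep-alive\n"
    "\n"
)
events = parse_sse(sample)
responses = [json.loads(e["data"]) for e in events if e["event"] == "message"]
```

The client then matches each response to its pending request by the JSON-RPC `id`.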

Message Lifecycle

Every MCP interaction follows this lifecycle:

sequenceDiagram
    participant C as MCP Client
    participant S as MCP Server

    C->>S: initialize (protocol version, capabilities)
    S-->>C: initialize response (server capabilities)
    C->>S: initialized (acknowledgment)

    Note over C,S: Connection established

    C->>S: tools/list
    S-->>C: Available tools with schemas

    C->>S: tools/call (name, arguments)
    S-->>C: Tool result (content)

    Note over C,S: Multiple tool calls can happen in parallel

    Note over C,S: To shut down, the client closes the transport (MCP defines no shutdown message)
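The client side of this lifecycle can be written out as plain JSON-RPC messages. In the sketch below, requests carry an `id` and expect a response, while the `initialized` step is a notification and carries none. The protocol version string, the client name, and the `request` and `notification` helpers are illustrative assumptions.

```python
import itertools

_ids = itertools.count(1)


def request(method: str, params: dict) -> dict:
    """Build a JSON-RPC request; the id lets the client match the response."""
    return {"jsonrpc": "2.0", "id": next(_ids), "method": method, "params": params}


def notification(method: str) -> dict:
    """Build a JSON-RPC notification; it has no id and gets no response."""
    return {"jsonrpc": "2.0", "method": method}


# Messages the client sends, in order. Server responses are not shown.
client_messages = [
    request("initialize", {
        "protocolVersion": "2025-03-26",  # Example value; use the version you target
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    }),
    notification("notifications/initialized"),
    request("tools/list", {}),
    request("tools/call", {"name": "run_query", "arguments": {"sql": "SELECT 1"}}),
]
```

Since responses are matched by `id` rather than by arrival order, the client can have several `tools/call` requests in flight at once.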

Capability Negotiation

During initialization, client and server declare what they support:

# Server capabilities example
{
    "capabilities": {
        "tools": {"listChanged": True},     # Server can notify when tools change
        "resources": {"subscribe": True},    # Client can subscribe to resource updates
        "prompts": {"listChanged": True}     # Server can notify when prompts change
    }
}

# Client capabilities example
{
    "capabilities": {
        "sampling": {}                       # Client can serve completion requests from the server
    }
}

This means a simple server can start with just tools and add resources and prompts as needed, and it can use sampling only when the connected client declares support for it.
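Either side can use the declared capabilities to decide which requests are safe to send. The sketch below shows one way to gate optional features on the negotiated capabilities. `supports` is a hypothetical helper, and the dictionary mirrors the example structure rather than a real server's response.

```python
from typing import Optional


def supports(capabilities: dict, feature: str, option: Optional[str] = None) -> bool:
    """Return True if a feature (and optionally one of its flags) was declared."""
    if feature not in capabilities:
        return False
    if option is None:
        return True
    return bool(capabilities[feature].get(option, False))


# Capabilities as they might arrive in the server's initialize response.
server_capabilities = {
    "tools": {"listChanged": True},
    "resources": {"subscribe": True},
    "prompts": {"listChanged": True},
}

can_list_tools = supports(server_capabilities, "tools")
can_subscribe = supports(server_capabilities, "resources", "subscribe")
can_log = supports(server_capabilities, "logging")
```

A client would check `can_subscribe` before sending a `resources/subscribe` request, and skip the call entirely when the flag is absent.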

Why Opus 4.6 Makes MCP Better

Three specific improvements in Opus 4.6 make MCP significantly more reliable:

| Improvement | Impact on MCP |
| --- | --- |
| Reduced hallucinated parameters | Tool calls match the JSON Schema more consistently, with fewer runtime errors |
| Better parallel tool execution | Model can call 3–5 tools simultaneously, reducing round trips |
| Improved tool result reasoning | Model correctly interprets complex nested JSON responses |

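# Opus 4.6 can handle parallel tool calls like this:
import anthropic

client = anthropic.Anthropic()  # Reads ANTHROPIC_API_KEY from the environment
response = client.messages.create(
    model="claude-opus-4-6-20260205",
    max_tokens=4096,
    tools=mcp_tools,  # Tool definitions from the MCP server, converted to the API's format
    messages=[{
        "role": "user",
        "content": "Get the user count, list recent errors, and check disk usage"
    }]
)
# The model can request all three tools in a single turn as separate tool_use blocks.
# After the application runs them and sends back the tool_result blocks,
# the model synthesizes the results into a coherent response.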
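MCP tool definitions use the key `inputSchema`, while the Anthropic Messages API expects `input_schema`, so the definitions need a small conversion before they are passed as `tools`. The sketch below shows that conversion, plus a way to return several tool results in one follow-up message. Both helper names are hypothetical, and the tool-use ids in the usage line are made up.

```python
def to_anthropic_tool(mcp_tool: dict) -> dict:
    """Convert an MCP tool definition into the Messages API tool format."""
    return {
        "name": mcp_tool["name"],
        "description": mcp_tool.get("description", ""),
        "input_schema": mcp_tool["inputSchema"],  # Renamed key
    }


def tool_results_message(results: dict) -> dict:
    """Build one user message that answers every tool_use id from a turn."""
    return {
        "role": "user",
        "content": [
            {"type": "tool_result", "tool_use_id": tool_use_id, "content": text}
            for tool_use_id, text in results.items()
        ],
    }


mcp_tool = {
    "name": "run_query",
    "description": "Execute a read-only SQL query against the database",
    "inputSchema": {
        "type": "object",
        "properties": {"sql": {"type": "string"}},
        "required": ["sql"],
    },
}
converted = to_anthropic_tool(mcp_tool)
follow_up = tool_results_message({"toolu_01": "42 users", "toolu_02": "no recent errors"})
```

When the model issues parallel tool calls, returning all results in a single user message keeps each `tool_use` paired with its `tool_result`.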

Architecture Decision: One Server or Many?

| Pattern | When to Use | Example |
| --- | --- | --- |
| Monolithic server | Small team, tightly coupled tools | All database ops in one server |
| Server per domain | Clear boundaries, independent scaling | DB server + GitHub server + Slack server |
| Server per security tier | Different trust levels | Read-only analytics vs. write-capable admin |

The most common production pattern is server per domain — each server owns one external system, and the MCP client composes them.
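When a client composes several servers, it needs a way to send each tool call to the server that owns the tool, and to avoid name clashes between servers. One possible approach, sketched below, is to prefix tool names with the server name. The `ToolRouter` class and the `server__tool` naming scheme are assumptions for illustration, not something MCP prescribes.

```python
class ToolRouter:
    """Map namespaced tool names to the server that owns each tool."""

    SEPARATOR = "__"

    def __init__(self):
        self._routes = {}

    def register(self, server_name: str, tool_names: list) -> None:
        """Record every tool a server exposes under a prefixed name."""
        for tool in tool_names:
            qualified = f"{server_name}{self.SEPARATOR}{tool}"
            self._routes[qualified] = (server_name, tool)

    def names(self) -> list:
        """Return the qualified tool names to present to the model."""
        return sorted(self._routes)

    def resolve(self, qualified_name: str) -> tuple:
        """Return (server_name, tool_name) for a qualified tool name."""
        if qualified_name not in self._routes:
            raise ValueError(f"Unknown tool: {qualified_name}")
        return self._routes[qualified_name]


router = ToolRouter()
router.register("db", ["run_query"])
router.register("github", ["list_issues", "create_issue"])
server_name, tool_name = router.resolve("github__list_issues")
```

The client would then forward a `tools/call` request for `tool_name` to the connection it holds for `server_name`.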

In the next lesson, you will build a production-grade MCP server from scratch in both Python and TypeScript.