
What Is an MCP Server? A Plain-English Guide for 2026

An MCP server gives AI tools a standard way to call your data and APIs. Here's what it is, how it works, and what hosting it in production demands.

Naman Kabra · June 25, 2026 · 8 min
MCP · Model Context Protocol · AI agents · Infrastructure

An MCP server is a program that exposes your data, tools, and APIs to AI applications through one standard interface, so an assistant like Claude, ChatGPT, or Cursor can call them without custom integration code. MCP stands for Model Context Protocol, which Anthropic introduced on November 25, 2024 as "an open standard that enables developers to build secure, two-way connections between their data sources and AI-powered tools." The official documentation defines an MCP server as "a program that provides context to MCP clients." In practice, a server advertises a set of tools, receives requests from AI clients, runs the requested action, and returns a structured result.

The part most definitions skip, and the part that decides whether your server works in production, is that an MCP server is a stateful, long-running process, not a stateless function you can drop onto any serverless host. That single fact drives every hosting decision below.

Why this matters: MCP became the default in one year

MCP went from one company's open-source release to a widely adopted standard in about a year. By 2025 it was supported across major AI tools including Claude, ChatGPT, Cursor, GitHub Copilot, and Windsurf, and adoption broadened well beyond the company that created it.

For a builder, the takeaway is simple: writing an MCP server is no longer a side experiment. If you expose a useful tool, the major AI clients can already discover and call it. The hard part is not the protocol. It is hosting the server so it survives real agent traffic.

How an MCP server actually works

MCP uses a client-server architecture. An MCP host (the AI application: Claude Desktop, VS Code, a custom agent) spins up one MCP client per server, and each client holds a dedicated connection to its server. The protocol has two layers, defined in the architecture overview:

  • Data layer: a JSON-RPC 2.0 protocol that handles connection lifecycle and the core primitives.
  • Transport layer: the set of channels that move messages and handle authorization.
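To make the data layer concrete, here is a minimal sketch of the JSON-RPC 2.0 message shapes it is built on. The helper names (make_request, make_notification, and so on) are made up for illustration and are not part of any MCP SDK. Only the envelope fields come from JSON-RPC 2.0 itself.

```python
from itertools import count

_ids = count(1)  # simple increasing request ids, an illustrative choice

METHOD_NOT_FOUND = -32601  # standard JSON-RPC 2.0 error code


def make_request(method, params=None):
    """A request carries an id so the reply can be matched to it."""
    msg = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def make_notification(method, params=None):
    """A notification has the same shape but no id, so no reply is expected."""
    msg = {"jsonrpc": "2.0", "method": method}
    if params is not None:
        msg["params"] = params
    return msg


def make_result(request_id, result):
    """A successful response echoes the id of the request it answers."""
    return {"jsonrpc": "2.0", "id": request_id, "result": result}


def make_error(request_id, code, message):
    """A failed response carries an error object instead of a result."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "error": {"code": code, "message": message},
    }


def is_notification(msg):
    """Notifications are the messages that name a method but carry no id."""
    return "method" in msg and "id" not in msg
```

The split between requests and notifications is what allows either side to send fire-and-forget updates without waiting for a reply.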

A server exposes three primitives:

Primitive | What it is                               | Example
Tools     | Executable functions the AI can invoke   | query_database, send_email
Resources | Read-only context data                   | a file's contents, a DB schema
Prompts   | Reusable interaction templates           | a few-shot example set

A client first calls tools/list to discover what is available, then calls tools/call to run one. When the toolset changes, the server can push a notifications/tools/list_changed message so the AI stays current. This discover-then-call loop is why one MCP server can serve many different AI clients with no per-client code.
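The discover-then-call loop can be sketched as a toy dispatcher. This is an in-memory illustration, not an SDK-based server: the add tool and the handle function are hypothetical, and a real server would normally let an official SDK do this routing. The result shapes (a tools array with name, description, and inputSchema, and a content array on tool results) follow the protocol's documented structure.

```python
def add(a, b):
    """Hypothetical example tool."""
    return a + b


# Registry mapping tool names to a handler plus the metadata clients discover.
TOOLS = {
    "add": {
        "handler": add,
        "description": "Add two numbers.",
        "inputSchema": {
            "type": "object",
            "properties": {"a": {"type": "number"}, "b": {"type": "number"}},
            "required": ["a", "b"],
        },
    }
}


def handle(request):
    """Route one JSON-RPC request to tools/list or tools/call."""
    rid = request.get("id")
    method = request.get("method")
    params = request.get("params") or {}

    if method == "tools/list":
        tools = [
            {
                "name": name,
                "description": tool["description"],
                "inputSchema": tool["inputSchema"],
            }
            for name, tool in TOOLS.items()
        ]
        return {"jsonrpc": "2.0", "id": rid, "result": {"tools": tools}}

    if method == "tools/call":
        tool = TOOLS.get(params.get("name"))
        if tool is None:
            error = {"code": -32602, "message": "Unknown tool"}
            return {"jsonrpc": "2.0", "id": rid, "error": error}
        value = tool["handler"](**params.get("arguments", {}))
        result = {"content": [{"type": "text", "text": str(value)}], "isError": False}
        return {"jsonrpc": "2.0", "id": rid, "result": result}

    error = {"code": -32601, "message": "Method not found"}
    return {"jsonrpc": "2.0", "id": rid, "error": error}
```

Because the client learns tool names and input schemas at runtime, nothing in the dispatcher is specific to any one AI client.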

The detail that changes everything: stdio vs Streamable HTTP

Here is where generic explainers stop and production reality begins. The protocol supports two transports, and your choice determines where and how you host:

  • Stdio transport uses standard input/output streams for "direct process communication between local processes on the same machine." A local server like the filesystem reference server runs on your laptop and serves a single client. No network, no hosting decision.
  • Streamable HTTP transport uses HTTP POST with optional Server-Sent Events for streaming. Per the spec, remote servers using it "will typically serve many MCP clients." This is the transport you deploy when you want a server other people's agents can reach.

If you are building something real that a team or the public calls, you are on Streamable HTTP, and you are now running a remote web service that needs a stable URL, uptime, and authentication. That is a hosting problem, not a protocol problem.

MCP is a stateful protocol

The official architecture overview is explicit: "MCP is a stateful protocol that requires lifecycle management." Every connection negotiates capabilities through an initialize handshake, then maintains session state until it closes. A subset of MCP can be made stateless over Streamable HTTP, but the default model assumes a server that holds state across a session.
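The lifecycle described above can be sketched as a small session manager. This is a simplified illustration under several assumptions: the SessionManager class, the example-server name, and the advertised capabilities are hypothetical, and the version strings are examples of published protocol revisions. The session id plays the role that a session header plays over Streamable HTTP.

```python
import secrets

# Illustrative list of protocol revisions this sketch claims to support.
SUPPORTED_VERSIONS = ("2025-06-18", "2025-03-26")


class SessionManager:
    """Tracks per-connection state from the initialize handshake to close."""

    def __init__(self):
        self.sessions = {}

    def initialize(self, params):
        """Handle an initialize request and open a new session."""
        requested = params.get("protocolVersion")
        # Echo the client's version if supported, else offer our latest.
        version = requested if requested in SUPPORTED_VERSIONS else SUPPORTED_VERSIONS[0]
        session_id = secrets.token_hex(16)
        self.sessions[session_id] = {
            "client": params.get("clientInfo", {}),
            "capabilities": params.get("capabilities", {}),
            "ready": False,
        }
        result = {
            "protocolVersion": version,
            "capabilities": {"tools": {"listChanged": True}},
            "serverInfo": {"name": "example-server", "version": "0.1.0"},
        }
        return session_id, result

    def mark_initialized(self, session_id):
        """Called when the client sends notifications/initialized."""
        self.sessions[session_id]["ready"] = True

    def require_ready(self, session_id):
        """Reject normal requests until the handshake has completed."""
        session = self.sessions.get(session_id)
        if session is None:
            raise KeyError("unknown session")
        if not session["ready"]:
            raise RuntimeError("session has not finished initializing")
        return session

    def close(self, session_id):
        """Drop all state for a finished session."""
        self.sessions.pop(session_id, None)
```

Everything in self.sessions lives in process memory. That is exactly the state that has to survive between requests for the rest of the session to work.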

This is the line that rules out a lot of cheap hosting. A stateless serverless function that cold-starts and forgets everything between invocations fights the protocol's design. Session state, connection pools, and any in-memory context evaporate on each cold start. For anything beyond a toy, your MCP server wants persistent compute and a real database, which is exactly why the hosting question deserves its own guide.
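A small simulation makes the cold-start problem visible. It is only a sketch: SQLite from the Python standard library stands in for a managed database such as Postgres, and the store classes and the simulate_cold_start function are hypothetical names. The cold start is modeled by throwing away the objects and building fresh ones.

```python
import json
import os
import sqlite3
import tempfile


class InMemoryStore:
    """Session state held in process memory; gone when the process dies."""

    def __init__(self):
        self._data = {}

    def put(self, session_id, state):
        self._data[session_id] = state

    def get(self, session_id):
        return self._data.get(session_id)


class SqliteStore:
    """Session state in a database file; a stand-in for a real database."""

    def __init__(self, path):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS sessions (id TEXT PRIMARY KEY, state TEXT NOT NULL)"
        )
        self._db.commit()

    def put(self, session_id, state):
        self._db.execute(
            "INSERT OR REPLACE INTO sessions (id, state) VALUES (?, ?)",
            (session_id, json.dumps(state)),
        )
        self._db.commit()

    def get(self, session_id):
        row = self._db.execute(
            "SELECT state FROM sessions WHERE id = ?", (session_id,)
        ).fetchone()
        return json.loads(row[0]) if row else None

    def close(self):
        self._db.close()


def simulate_cold_start():
    """Write state, 'restart' by rebuilding both stores, then read it back."""
    path = os.path.join(tempfile.mkdtemp(), "sessions.db")
    memory, durable = InMemoryStore(), SqliteStore(path)
    memory.put("s1", {"cursor": 42})
    durable.put("s1", {"cursor": 42})
    durable.close()

    # Cold start: a new process would begin with fresh objects like these.
    memory, durable = InMemoryStore(), SqliteStore(path)
    result = (memory.get("s1"), durable.get("s1"))
    durable.close()
    return result
```

After the simulated restart, the in-memory store returns nothing for the session while the database-backed store still returns the saved state. That is the difference the paragraph above describes.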

Authentication is OAuth 2.1, and it is not optional in practice

For HTTP-based servers, the MCP authorization spec is built on OAuth 2.1. A protected MCP server "acts as an OAuth 2.1 resource server." Authorization is formally OPTIONAL, but the spec is pointed about transports: STDIO implementations "SHOULD NOT" follow it and pull credentials from the environment instead, while HTTP implementations "SHOULD" conform. Every request carries an Authorization: Bearer <access-token> header, and the server "MUST validate that access tokens were issued specifically for them as the intended audience." Translation: the moment your server is remote and useful, you owe it real auth: token validation, audience binding, PKCE, and HTTPS-only endpoints.
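The audience-binding requirement can be sketched as two small checks. This sketch makes some loud assumptions. It assumes the access token is a JWT whose signature has already been verified by a vetted library, which the code does not do. It assumes headers arrive as a plain dict with an "Authorization" key. The names AuthError, extract_bearer_token, and check_claims are illustrative, and https://mcp.example.com is a placeholder URL.

```python
import time


class AuthError(Exception):
    """Raised when a request must be rejected as unauthorized."""


def extract_bearer_token(headers):
    """Pull the token out of an 'Authorization: Bearer <access-token>' header."""
    value = headers.get("Authorization", "")
    scheme, _, token = value.partition(" ")
    if scheme.lower() != "bearer" or not token.strip():
        raise AuthError("missing or malformed bearer token")
    return token.strip()


def check_claims(claims, expected_audience, now=None):
    """Validate already-verified token claims.

    NOTE: signature verification is deliberately out of scope here and
    must happen before this function is called.
    """
    now = time.time() if now is None else now

    # The 'aud' claim may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = [aud] if isinstance(aud, str) else list(aud or [])
    if expected_audience not in audiences:
        raise AuthError("token was not issued for this server")

    exp = claims.get("exp")
    if exp is None or exp <= now:
        raise AuthError("token is expired or has no expiry")
```

The audience check is what stops a token minted for some other service from being replayed against your server. A valid signature alone does not guarantee that.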

What production-grade MCP hosting requires

Putting the spec together, a serious server needs four things a generic host does not give you out of the box:

  1. Persistent compute: warm processes that hold session state, not cold-starting functions.
  2. A real database: Postgres or a Redis-compatible store for session and tool state.
  3. A stable HTTPS URL, so AI clients can register and reach the server reliably.
  4. OAuth 2.1 auth with bearer-token validation and correct audience binding.

This is the gap CreateOS was built to close. CreateOS is the application layer of NodeOps, and it offers native MCP server hosting with auto-discovery via mcp-tool.json, persistent compute for long-running agents, and managed PostgreSQL and Valkey on a $0 free tier. You push your code, you get a stable createos.sh URL with warm pods and no cold-start state loss, and you can wire in OpenAI, Anthropic, Stripe, and other services without separate infrastructure. See MCP server hosting on CreateOS for the agent-focused path, or read where to host an MCP server for free for the full platform comparison.

The point is not the brand. It is the checklist. Whatever you host on, hold it to those four requirements, because the protocol's own design demands them.

Common questions

What does MCP stand for?

MCP stands for Model Context Protocol. It is an open standard, introduced by Anthropic in November 2024, that gives AI applications a uniform way to connect to external data sources, tools, and workflows. In December 2025 it moved to vendor-neutral governance under the Linux Foundation's Agentic AI Foundation.

What is the difference between an MCP server, an MCP client, and an MCP host?

The MCP host is the AI application, such as Claude Desktop or VS Code. The host creates one MCP client per connection, and each client maintains a dedicated link to one MCP server. The MCP server is the program that exposes tools, resources, and prompts. One host can run many clients, each talking to a different server.

Do I need to know how to code to build an MCP server?

Yes, building an MCP server requires coding, but the official SDKs handle most of the protocol. SDKs exist for Python, TypeScript, and other languages, so you write your tool logic and the SDK manages JSON-RPC, lifecycle, and capability negotiation. The harder part is hosting the server reliably once it is built.

What is the difference between a local and a remote MCP server?

A local MCP server runs on the same machine as the AI application and uses the stdio transport, typically serving one client. A remote MCP server runs elsewhere, uses the Streamable HTTP transport, and can serve many clients at once. Remote servers need a stable URL, uptime, and authentication.

Is an MCP server stateful or stateless?

MCP is a stateful protocol that requires lifecycle management, so the server holds session state across a connection by default. A subset of MCP can be made stateless over the Streamable HTTP transport, but the standard model assumes state is kept for the life of the session. That is why cold-starting serverless functions, which forget everything between invocations, fight the protocol's design.

How do AI clients discover the tools an MCP server offers?

The client sends a tools/list request and the server returns a structured list, each tool carrying a name, description, and JSON Schema for its inputs. The client then calls a specific tool with tools/call. If the server's tools change, it can send a notifications/tools/list_changed notification so the client refreshes its registry.

How does an MCP server handle authentication?

For HTTP-based servers, MCP authorization is built on OAuth 2.1, with the server acting as a resource server. Each request includes an Authorization: Bearer <access-token> header, and the server must validate that the token was issued specifically for it. Stdio servers skip this and read credentials from the environment instead.

What does an MCP server need to run reliably in production?

Four things: persistent compute that holds session state, a managed database such as Postgres or a Redis-compatible store, a stable HTTPS URL for clients to register against, and OAuth 2.1 authentication with audience-bound tokens. CreateOS provides all four with native auto-discovery and a $0 free tier. Self-hosted VPS or serverless setups can work for simpler cases but require more configuration.

About the author

Naman Kabra is the founder of CreateOS, the unified execution layer for AI that coordinates infrastructure, compute, LLM orchestration, agent deployment, and monetization in one place. CreateOS is the application layer of NodeOps and offers native MCP server hosting with auto-discovery via mcp-tool.json, persistent compute, managed databases, and a $0 free tier. Naman has been building in Web3 since 2017.

Next step

Building an MCP server and need it to survive real agent traffic? Start free on MCP server hosting on CreateOS: persistent compute, managed databases, and native MCP auto-discovery on a $0 tier, no credit card required.
