The Surprising Origins of the Model Context Protocol

Photo by NASA Hubble Space Telescope / Unsplash

Have you ever wondered where MCP came from, what inspired it, and why it's built the way it is? Today, we will step into a time machine and travel all the way back to 2024, when an Anthropic engineer was getting annoyed at having to constantly switch between Claude Desktop and his code editor.

💡
This is an excerpt from Chapter 2 of AI Agents with MCP, my book coming soon from O'Reilly Media. If you subscribe to their learning platform, you can access an updated Chapter 1 and the full current draft of Chapter 2 here.

The Genesis of MCP

MCP was born out of an internal project at Anthropic. David Soria Parra, a software engineer at Anthropic, was using Claude Desktop to assist him with his day-to-day work: writing developer tools. But he was frustrated by having to constantly copy and paste code between Claude Desktop and his code editor. If you were doing AI-assisted coding before the current wave of LLM-powered coding tools, you’ve probably experienced the same sort of frustration. He had also been working on a Language Server Protocol (LSP) project, which inspired him to take a similar approach to this problem.

The LSP is an open protocol that uses JSON-RPC to enable communication between a client and a server in order to provide advanced language knowledge to an IDE. If you’ve ever used a non-LLM autocomplete, hovered over a variable or a function call to get its definition, or renamed all instances of a variable from your IDE, you’ve likely used LSP and one or more language servers. The protocol was developed by Microsoft for VS Code and standardized in 2016, rapidly becoming the de facto standard for providing advanced language-specific capabilities to IDEs.
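To make the JSON-RPC part concrete, here is a minimal sketch of what a client might send to a language server when you hover over a symbol. It assumes the `textDocument/hover` method and the `Content-Length` framing that LSP uses; the file URI, the position, and the helper name `frame_lsp_message` are hypothetical and chosen only for illustration.

```python
import json


def frame_lsp_message(payload: dict) -> bytes:
    """Serialize a JSON-RPC payload and prepend the Content-Length header.

    Hypothetical helper for illustration; real clients also handle
    responses, notifications, and errors.
    """
    body = json.dumps(payload).encode("utf-8")
    header = f"Content-Length: {len(body)}\r\n\r\n".encode("ascii")
    return header + body


# A JSON-RPC 2.0 request asking the server for hover information.
hover_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "textDocument/hover",
    "params": {
        # Hypothetical file and cursor position.
        "textDocument": {"uri": "file:///example/project/main.py"},
        "position": {"line": 10, "character": 4},
    },
}

framed = frame_lsp_message(hover_request)
print(framed.decode("utf-8"))
```

The editor does not need to know anything about the language being edited: it only needs to speak this message format, and the language server supplies the language-specific answer.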

While stewing on his specific problem, David was thinking more broadly about how to make it easier for developers to create integrations with Claude Desktop specifically and LLMs in general. He noticed that writing integrations for LLMs, especially for more than one LLM, created the MxN problem you learned about in the previous chapter: if you want each of M LLMs to support N integrations, you need to write MxN connectors. You can see this below.

The MxN problem, showing a proliferation of connectors for each model to support each tool.

With the LSP project still in the back of his mind, it dawned on David that an LSP-like protocol could be the perfect solution to the MxN problem. A shared protocol reduces the MxN problem to an M+N problem (see the figure below). Integrations no longer need a unique connector for each LLM; the integrations and the LLM(s) only need to be able to communicate over the shared protocol, which gives each side a common interface to implement once. Each connector targets the protocol rather than every combination of LLM and integration.

An MCP server providing a universal bus connecting models with tools.
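The difference between MxN and M+N is easy to see with a quick count. The sketch below is only an illustration of the arithmetic; the model and tool names are hypothetical placeholders, not a real list of supported systems.

```python
from itertools import product

# Hypothetical models (M = 3) and tools (N = 4).
models = ["model_a", "model_b", "model_c"]
tools = ["git", "database", "search", "calendar"]

# Without a shared protocol: one bespoke connector per (model, tool) pair.
pairwise_connectors = list(product(models, tools))

# With a shared protocol: each model implements the client side once,
# and each tool implements the server side once.
protocol_implementations = (
    [(model, "client") for model in models]
    + [(tool, "server") for tool in tools]
)

print(len(pairwise_connectors))       # 12 connectors (M x N)
print(len(protocol_implementations))  # 7 implementations (M + N)
```

With only three models and four tools the saving is modest, but adding a fifth tool costs three more connectors in the first approach and just one more implementation in the second.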

David brought his idea to Justin Spahr-Summers, and the two began laying the foundations for what would become MCP. Later that year, before MCP’s initial release, Anthropic had an internal hackathon. Several teams adopted MCP right away and began building all kinds of tools and other integrations for Claude Desktop using MCP. This rapid, organic adoption of MCP was, in hindsight, a sign of things to come only a month later, when MCP was first released to the public.


I hope you enjoyed what you read, and I welcome any feedback! Keep following along for more updates and excerpts from the book.