mcp is not dead, long live mcp
Unless you’ve been living under a rock for the past year or so, chances are you’ve heard about MCP. For those who are living under a rock, here’s a small refresher:
MCP: Model Context Protocol
The name for MCP is pretty self-explanatory: it’s a protocol for providing context to a model.
However, if you are not into AI, this might sound like gibberish, just as the acronym itself does. Let’s break it down further.
To this day, when we talk about AI we generally mean LLMs, short for Large Language Models. Those are huge (one might say large) matrices of weights that condense the characteristics of human language into one simplified model, which can then generate sentences that sound and feel just like human-generated ones.
The M in MCP refers to this specific model.
When you ask a model something, it will produce results based on its enormous training set. However, this is sometimes not enough. If I ask an LLM whether next Sunday would be a good day for a picnic, the model has no information about what day it is, let alone what the weather will be, to make that prediction. But the user can change that: when you are prompting, you are effectively introducing new knowledge into the model. So if you say something like
today is 3/12/2026, here’s a link to check the weather, can you tell me if Sunday is a good day for a picnic?
the model will be able to take this new info (someone might call it context) and include it in its “knowledge base” to then answer your question.
As hinted by the bolding in my previous sentence, this is the C in MCP.
Finally, the P: protocol. A protocol is nothing more than a contract between two parties that allows them to communicate effectively and without ambiguity. An example of a protocol is the HyperText Transfer Protocol, more commonly known as HTTP, which allows two parties (your HTTP server and, generally, a browser) to communicate over the network.
The same is true for MCP. Developers can build MCP servers (like the Svelte, Context7 or Storybook servers just to name a few) and MCP clients (like Claude Code, Opencode, VS Code, Codex, Gemini CLI etc).
This system allows developers of MCP servers to expose a variety of primitives:
- Tools: functions that the LLM can “call” with specific arguments to get more context or act on behalf of the user.
- Resources: files the server can expose to the user to include their content as context for the LLM before sending the prompt.
- Prompts: pre-packed prompts with optional arguments that populate the user input box.
MCP servers also have an official way to authenticate the user, which allows all of these primitives to be user-specific (a tool that lists your Jira tickets, a prompt specifically tailored to your role, etc.).
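To make these primitives concrete, here’s roughly what they look like on the wire. MCP speaks JSON-RPC 2.0, and a client discovers a server’s tools with a `tools/list` request; the server answers with each tool’s name, description and a JSON Schema for its arguments. A sketch in Python, where the `get_weather` tool is a made-up example:

```python
import json

# A hypothetical server's answer to a "tools/list" request.
# The overall shape (jsonrpc envelope; tools with name, description,
# inputSchema) follows the MCP spec; the tool itself is invented.
tools_list_response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "tools": [
            {
                "name": "get_weather",  # made-up example tool
                "description": "Get the weather forecast for a city and date.",
                "inputSchema": {
                    "type": "object",
                    "properties": {
                        "city": {"type": "string"},
                        "date": {"type": "string"},
                    },
                    "required": ["city", "date"],
                },
            }
        ]
    },
}

# Everything in "tools" gets serialized and handed to the model,
# which is where the context cost of an MCP server comes from.
wire = json.dumps(tools_list_response)
print(len(wire) > 0)
```

Keep that `inputSchema` blob in mind: every tool a server exposes ships one of these into the model’s context.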
But…why?
“Thanks for the explanation nerd… but why do we need all of this?”
First things first: rude.
Second, when models were not that great and people mostly copy-pasted their entire codebase into ChatGPT back and forth, a few visionaries started experimenting with AI over APIs, giving specific instructions to the LLM through the system prompt.
You have the following tools available, if you want to execute one of them just write `function_name(arg1, arg2)` as a message.

```
read_file(name: string)
write_file(name: string, content: string)
```

They would then check for this string in the returned messages and invoke the specific function. A few abstractions later, and tools like the ai sdk were born!
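A minimal sketch of that pre-MCP pattern might look like this. The `function_name(arg1, arg2)` convention mirrors the system prompt above; the two file tools are stand-ins, not a real library:

```python
import re

def read_file(name):
    # stand-in implementation for the sketch
    return f"contents of {name}"

def write_file(name, content):
    return f"wrote {len(content)} bytes to {name}"

TOOLS = {"read_file": read_file, "write_file": write_file}

# The fragile part: scan the model's reply for something that looks
# like `function_name(arg1, arg2)` and invoke the matching function.
CALL_RE = re.compile(r"^(\w+)\((.*)\)$")

def dispatch(message):
    match = CALL_RE.match(message.strip())
    if not match or match.group(1) not in TOOLS:
        return None  # not a tool call, just a normal reply
    name, raw_args = match.groups()
    args = [a.strip().strip('"') for a in raw_args.split(",")] if raw_args else []
    return TOOLS[name](*args)

print(dispatch('read_file("notes.txt")'))   # → contents of notes.txt
print(dispatch("Sure, here is my answer.")) # → None
```

Every team rolled their own variant of this loop with slightly different conventions, which is exactly the fragmentation MCP set out to fix.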
You could declare your own tools, and the SDK did all the rest by invoking your functions when the LLM requested them. People built all sorts of workflows with these systems but something felt missing: there was no standard way to do things. Everything was re-built from scratch every time and in a slightly different way. LLMs (from now on, Agents) struggled to keep up because, being next-token predictors, they really like it when things have a certain structure.
So Anthropic decided to take things into their own hands and, on the 25th of November 2024, announced the Model Context Protocol to the world. The interesting bit is that models were specifically trained to interact with it (via tool calls).
The protocol was an immediate success… more and more companies started to contribute to the protocol itself, introducing a proper way to do authentication, a way to host the server as an HTTP server (initially the only available transport was STDIO, where the client launches a subprocess and communicates with it over standard in and standard out), and adding more and more functionalities and capabilities (like the newly released MCP Apps extension, which allows you to show a fully fledged UI within your client).
Everybody and their mother was building an MCP server (both because it was objectively very useful to allow agents to interact with your product AND because it was the easiest way to answer the question “What’s our company doing with AI?”).
But this led to…
The problem
Let’s imagine we are at the beginning of the ’90s. HTTP starts spreading and everybody starts building their own HTTP server. One might be tempted to compare those servers with the ones we build nowadays, but the two couldn’t be more distant. The meat is the same, but the care and attention to detail that modern frameworks put into making servers secure, efficient and reliable is unmatched.
Even the shape of the API was all over the place, as conventions like REST would not arrive until the 2000s.
In one word: those HTTP servers were just bad.
The same is true for the early MCP servers. Every company rushed to build one, and every user rushed to install as many as possible. This led to badly architected servers piling up in the user configuration, and filling the limited context with a lot of non-useful or potentially misleading information. And context is the most precious of resources for an agent.
People started talking about how MCP bloats the context because, for every MCP server you add, all the tool definitions and descriptions are included (and the most popular MCP servers had hundreds of tools with page-long descriptions).
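You can get a feel for the problem with back-of-the-envelope math. Using the common rough heuristic of ~4 characters per token, a few verbose servers eat a surprising slice of the window (the numbers below are illustrative, not measurements of any real server):

```python
def estimated_tokens(tool_definitions):
    """Rough token cost of injecting tool definitions into the
    context, using the ~4 characters-per-token rule of thumb."""
    total_chars = sum(len(d) for d in tool_definitions)
    return total_chars // 4

# Illustrative: 3 servers x 40 tools, each with roughly 800
# characters of description + schema. Hypothetical numbers.
definitions = ["x" * 800] * (3 * 40)
print(estimated_tokens(definitions))  # → 24000
```

On a 200k-token context window, that’s more than a tenth of your budget gone before you’ve typed a single word.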
And I’m not gonna lie, this was and IS a problem.
As usual, engineers did what engineers love to do: they found solutions to the problem. Both Cloudflare and Anthropic realized that instead of giving the agent all the tool definitions and descriptions, they could generate a TypeScript interface from the MCP server and ask the agent to write a script against it. This reduced the context bloat and kept intermediate results from filling up the context even more (if you had to fetch a list and select its first element, the code could do that without having to load the whole list into the context).
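The intuition behind that “write a script” approach can be sketched like this. Instead of round-tripping a whole list through the model’s context, the agent’s script does the selection and only the final answer comes back (the `list_issues` stub is a made-up stand-in for a generated tool binding, not a real API):

```python
# Stand-in for a generated binding to an MCP tool. In the real
# approach this would be a TypeScript interface generated from the
# server's tool schemas; here it's a made-up stub.
def list_issues():
    return [{"id": i, "title": f"Issue {i}"} for i in range(1, 501)]

# Tool-calling style: all 500 issues would be serialized into the
# model's context just so it can pick the first one.
# Code style: the script does the selection locally and only the
# single result re-enters the context.
first = list_issues()[0]
print(first["title"])  # → Issue 1
```

Five hundred issues stay out of the context; one title goes in.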
Then Anthropic released skills, which at first look seem completely different from MCP servers. There’s no server to host or program to write. It’s just a markdown file and (optionally) a bunch of references and scripts you want the Agent to execute. But the end goal is the same: injecting context into the LLM. Skills are not loaded immediately; just the name and a short description are provided to the LLM, which then proceeds to read the file autonomously if it thinks it’s needed.
All of this led to the discourse that prompted me to write this article…
Necrologies
The immediate reaction after people started to solve the context problem with MCP was:
“Alright, Anthropic said MCP has a problem…therefore MCP is dead!”
And tbf I can see why one would think that: the conclusion all of them reached is that Agents are already very good at using CLIs. There’s a lot of training data out there that shows how to invoke the gh cli for example, so why would I bloat the context with the GitHub MCP server that has a bunch of tools I will never use? Why should I add the Jira MCP server if Agents can just do jira issue view and, most importantly, I can also use the same CLI to view my issues - no clanker needed?!
“Wait a second… you seem to agree with them!”
Yes… that’s the reason why I decided to write this blog post: the points most people make in those blog posts (at this point I’ve read at least 20, and they all explain the same thing) actually make sense. The problem is that they fail to zoom out. They focus on some very specific functionality (or, even worse, some specific functionality of some specific MCP server) and try to make a general point… that’s where my disagreement starts.
Let’s zoom out
So let’s start to zoom out a bit and see where this “MCP is dead” argument starts to fail… beginning with the examples they give.
You can just use the gh/jira CLI
That’s absolutely true… those CLIs have thousands of usage examples in the training data. BUT what about your-random-project-cli? Is your CLI as present in the training corpus? Spoiler: no it isn’t.
“But you can solve this with a --help flag”
That’s true. But remember what we said before? MCP was born as a way to standardize things so that Agents (which, as a reminder, are just next-token predictors) could actually be trained to use the tool. If everybody writes their own CLI with its own API, your Agent will have to re-learn how to use each CLI every time, wasting precious context and maybe even getting it wrong.
MCP bloats your context
Yeah, if you add thousands of badly designed MCP servers, your context will mostly consist of MCP tool descriptions. But that’s not how you should use MCP servers!
An MCP server is just like a dependency in your code. You shouldn’t blindly add every single one you find. You should inspect the interface, verify that it’s not malicious, check that the descriptions are not bloated and that the tools are coherent (and not just generated from the official OpenAPI spec).
Once you find an MCP server, don’t add it globally unless you really use it every single time. Add it locally, only in the projects using that specific technology (there’s no point in adding the Svelte MCP server to your TS library project). If your client supports it, keep your MCP servers disabled and enable them only when you need them.
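For example, with a client that reads a project-level config, a Svelte-specific server can live only inside the Svelte project. The snippet below assumes Claude Code’s `.mcp.json` layout (other clients use their own files), and `some-svelte-mcp-server` is a placeholder package name, not a real one:

```json
{
  "mcpServers": {
    "svelte": {
      "command": "npx",
      "args": ["-y", "some-svelte-mcp-server"]
    }
  }
}
```

Checked into the Svelte repo, this server exists for that project and nowhere else; your TS library project never pays its context cost.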
Also, you know what else bloats your context? Thousands of SKILLs! A badly constructed SKILL can have the same bloated description as a badly constructed MCP server. And if a SKILL that gets loaded has an even bigger body and doesn’t split out its references, it doesn’t matter that there’s no tool definition… it will bloat your context in exactly the same way.
Distribution
If you’ve never dealt with a package manager in your life, I envy you. npm, despite being commonly mocked, is actually decently good. But publishing a CLI forces you to deal with it. You have to make sure your users are on the latest version, and keep older versions of your API around just in case someone installed your CLI globally and is still using that old endpoint. You need to instruct your users on installing the right package manager, or provide them with a curl ... | sh command they can run to download and update your CLI.
MCP doesn’t have the most straightforward installation method (each client is a bit different, you sometimes have to touch configuration files, etc.) but once you’ve installed an MCP server (especially one using Streamable HTTP) your users will never need to touch it again. You push changes to your server and, next time around, like magic, your users have the new version.
This also brings me to…
Security
As I’ve said before you need to vet your MCP servers and make sure they can’t inadvertently prompt inject you. And yes, if you’re running stdio MCP servers locally, the risk is the same as CLIs: both are processes on your machine, both need to be vetted, and both can do whatever they want to your system.
However, if you’re running a remote MCP server, it’s not running on your machine and so it can’t do anything to it directly. Sure, prompt injection may still be a problem, but the risk is much lower, especially when we consider that LLMs are more resistant these days and your harness likely has some protections in place. Maybe it could try to trick the LLM into exfiltrating your data, but that has to happen visibly through the conversation. You’d see something like “Great, I’ll now call https://maliciouswebsite.com/leak-env” before anything happens. A compromised CLI doesn’t need to ask.
Capabilities
The argument in favor of CLIs also naively assumes the only thing your MCP server is doing is providing tools. That couldn’t be further from reality. While it’s true that a few of the capabilities in the MCP spec are not for everyone (looking at you, roots), most of the primitives are there for a reason. Resources and Prompts are a very powerful way for the user to steer the model deterministically. Elicitation allows a tool to ask the user more questions while being executed (LLMs reeeeeally struggle with interactive CLIs, so no, that’s not an option). And Sampling allows your server to USE THE HOST LLM TO DO INFERENCE TASKS FOR IT.
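Sampling, for instance, is just another JSON-RPC request, this time flowing from server to client: the server asks the host to run a completion on its behalf. A rough sketch of the `sampling/createMessage` shape from the spec, with illustrative field values:

```python
# A server-to-client request asking the host LLM to run inference on
# the server's behalf. The method name and overall shape follow the
# MCP spec's sampling/createMessage; the message text is invented.
sampling_request = {
    "jsonrpc": "2.0",
    "id": 7,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {
                "role": "user",
                "content": {
                    "type": "text",
                    "text": "Summarize this changelog in one sentence.",
                },
            }
        ],
        "maxTokens": 100,
    },
}
print(sampling_request["method"])  # → sampling/createMessage
```

Try doing that with a CLI: there’s simply no channel for a subprocess to borrow the host’s model.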
Conclusions
Does this article claim that you should never build a CLI and always build an MCP server? Heck no (in fact I have a PR open to add a CLI transport to tmcp to convert your MCP server into a CLI with automatic --help and arguments validation 👀)! CLIs are a powerful tool and, when it makes sense, you should absolutely use them. But please, don’t declare MCP dead if you haven’t even explored its full potential.
Skills, MCP, and CLIs all have their place in our agentic workflows, and recognizing the value of each is what a good engineer does.