2025-03-11
•Sam Reed
The Wise Recluse
LLMs & the Model Context Protocol
 
 If you’re in search of a good mental model for understanding how software developers are building with large language models, consider the wise recluse.
The wise recluse lives in a creaky 19th century home in a Northeastern town straight out of a Stephen King novel. He hasn’t been seen in many years—in fact, many townspeople thought that he was either dead or gone until the courageous neighborhood kid dropped her letter through his mail slot a few years back.
What happened next was extraordinary—not only was the recluse still there, but he had been waiting. He responded almost instantly with a letter of his own, one that explained his eagerness for interaction. What can I help with? Ask anything… the wise recluse wrote back.
Word of the event spread rapidly. Soon, everyone sought out the wise recluse. The wise recluse remained indefatigable, thriving in the face of increased demand for his consultation. Some people tapped his inexhaustibly rich knowledge of history, asking for facts about great wars and crumbled empires. Some used the wise recluse as a foil; his letters were the whetstones that sharpened their most audacious ideas. Others found a friend in the wise recluse: to them, the wise recluse was a faceless therapist, a nameless penpal.
The business sector took notice. Savvy businesspeople saw the wise recluse as an enabler of new processes, a new point on the spectrum between their computers and their employees. Though the savvy businesspeople understood the raw power of the wise recluse, their ability to harness his power was severely limited by his stubborn insistence on correspondence through letters. The wise recluse had no phone number. The wise recluse could not join a Zoom call. The wise recluse would not sit in a cubicle. The wise recluse would receive your letter and respond with one of his own, and that was it.
One day, the hardware store tried something new. You see, the townspeople loved to buy local and so the hardware store got lots of voicemail with product orders and questions, but the hardware store only had two employees so it struggled to keep up. Despite their value to the community, the hardware store operated with very thin margins, so hiring another employee was out of the question. But what about the wise recluse? the owners thought.
The hardware store owners came up with a system. Twice a day, every day, they would print out all of the voice messages (they used iPhones, so voicemail was automatically transcribed) and walk them over to the wise recluse (with the nice side effect of additional cardiovascular activity). However, they wouldn’t just give the wise recluse the raw voicemails—before printing, they would copy and paste these instructions for the wise recluse at the top of each voicemail:
You are helping manage a hardware store’s voicemail system. You will see the transcript of a customer’s voicemail below. If the customer is asking for a refund, your response letter should adhere to the following format: Refund: [order number, amount]. If the customer is asking to purchase a new product, your response should adhere to the following format: Purchase: [customer address, product SKU, quantity]. We know everyone in town, so we’ll handle billing separately. If any information is missing from the voicemail, or if you don’t know how to respond, just write us back about the issue.
In doing this, the hardware store had discovered a way for the wise recluse, a man who hadn’t left his home in years, to control the external world without stepping a foot out the door. They weren’t just dropping letters in his mail slot and getting free-form responses back. They were giving him letters with special instructions and getting precise commands for how to update their records back. The wise recluse was turning raw, unstructured material into real-world actions, all from the comfort of his home.
The hardware store got really good at this. Soon, the bank, the accountant, and the law office wanted in. Then the grocery store took interest. Then the school.
Eventually, the hardware store realized that they could help. First, they wrote an ebook that described their process for instructing the wise recluse on how to command the real world. This was helpful, but there was room for more. After more consideration, they decided to codify their system, creating a standard that any business could use for integrating with the wise recluse.
They called their standard the “Recluse Context Protocol.”
One in the chamber
Wow, that got weird. But so did this past week in AI world on X (formerly Twitter).
The source of the weirdness was the “Model Context Protocol” which, though first made publicly available in November of 2024, has exploded in popularity as of late among users of Cursor, the popular AI programming tool (I’ve written about Cursor before!). The discovery of MCP tools caused yet another one of these reverberating next big thing moments that are increasingly common in our online echo chambers.
As with many things in the applied AI space, “Model Context Protocol” is a simple (and useful!) idea at its core, but the combination of its technical-sounding name and the magic of seeing MCP servers in use has me feeling like a little bit of explaining might be in order.
Applying the metaphor
Anyone that feels mystified by today’s AI landscape should keep the wise recluse in mind (don’t be shy to share this article with someone who needs it!). It’s pretty easy to apply the metaphor to tools like ChatGPT—when you work with ChatGPT (the wise recluse), you send it text (your letters) and it responds with text (its letters). Sometimes, when you ask ChatGPT a question outside of its knowledge base, it will flash “Searching the web” back at you, seemingly breaking down the metaphor (has the recluse stepped outside?), but don’t fret—this is just a smooth user interface design, not an instance of the underlying model conducting a web search. As with the hardware store, the AI model probably is the thing that identifies that an internet search is necessary, but it isn’t actually hitting Google on its own.
 
  This process of getting instructions for initiating things like internet searches is at the core of the Model Context Protocol. A (slightly) deeper look at why this works in the first place will help us understand how we got here.
Training day
In previous essays, I’ve described how large language models are basically big mathematical functions that take text as input and then return text as output (this isn’t actually true—they take tokens, which are numerical representations of text). This naturally leads us to another question: of all the words that a model could spit out in response to a question, how does the model decide what text to return?
This is where the idea of training a model comes into play. I can’t speak too deeply about the nuances of training an enormous, multimodal transformer model like OpenAI’s GPT-4o or Anthropic’s Claude Sonnet 3.5, but the general idea is that:
- 
A model starts out as a massive rules system (a giant set of “weights,” which are a series of matrix multiplications that change the input to an output) 
- 
Data is fed into that rules system 
- 
The data that actually comes out of that rules system is compared with what should have come out of that rules system (i.e. results are checked for correctness) 
- 
The rules system gets adjusted in retrospect to make sure that it produces what it is expected to produce. 
Do this enough times, with a large enough rules system and a large enough set of data, and the function will start to show the nuance that we get when interacting with Claude and ChatGPT.
So why is this idea of training important to keep in mind in the context of MCP?
It’s important as a reminder that the purpose of training is to make adjustments to the model’s underlying weights until you’re confident that the model will produce sensible responses, no matter the input. This is why you hear the term “Prediction” thrown around sometimes when people talk about how LLMs work: during training, a model “Predicts” the correct output and then gets adjusted so that next time around it will predict an output that is closer to the correct output. The goal is obviously to get to a place where it’s able to respond to any question with what feels like a satisfactory response. If you ask it for weeknight pasta recipes, it shouldn’t respond with a salad. If you ask it for a business plan, it shouldn’t respond with a sales pitch. If you ask it for a haiku, you shouldn’t get free verse.
An interesting side effect (maybe people always had this in mind) of this training process is that models become good at things beyond just sensibly answering questions, such as handling well-defined logical systems. This is intuitive: the model is still returning the most likely sequence of words that a user would expect, whether those words are representative of a Spaghetti alla puttanesca recipe or the structured inputs to a downstream system (tell me the difference, I’ll wait).
Think back to the wise recluse. The hardware store owner described the store’s system to the wise recluse, so that he could respond in a way that could be plugged right into the system. This holds for building a software system with large language models as well—if I provide a large language model with a detailed description of the system that it is a part of, I’ve essentially given it the keys to control the external world (my program), even though it never leaves its home.
One beautiful thing about this is that it enables what I’m seeing referred to as “Soft requirements.” Normally, writing a computer program is a pretty inflexible task. For example, if I write a program to manage wedding invitations that expects you to upload an Excel spreadsheet with the columns “Name” and “Address,” but you accidentally upload a sheet with the columns “Name” and “Mailing Address,” chances are that it won’t work (we’ve all experienced this at one point or another), even though any human would be able to handle this situation with ease. Speaking of Excel, this is the same type of problem that you get with those dreaded #NAME errors—if your Excel formula isn’t perfectly typed, Excel can’t make sense of it. Large language models are a great way to solve “Hard requirements” problems like these—they can take the frustratingly unstructured and imperfect data that we find out in the real world and transform it into stuff that is guaranteed to work with the rest of our system. These use cases are sometimes negatively called “GPT Wrappers,” but there’s room for them everywhere.
MCP
Anyways, the “Model Context Protocol” that we’ve been hinting at this whole time is a standardization of this clever technique for letting models take in conversational data and turn it into structures that control the external world. It was created by Anthropic and as far as I can tell only works with Anthropic’s models. The reason why a company like Anthropic would want to do this is to make sure that it’s as easy as possible for their models to control external tools—in their words, the MCP standard is a “USB-C port” for AI applications. They’ve also created libraries in popular programming languages that make it even easier for developers to build “MCP servers” for their LLMs. This was a brilliant move! Bravo!
As with all things that people use without fully understanding, it is not without danger (I learned this on a dirt bike in Vermont once). I’ve seen clueless people bragging online about doing stupid things like letting AI agents download code from online repositories (hackers must be VERY excited these days), I’ve seen people talk about attempting to integrate with bank accounts, and I’ve seen people talking about managing Firebase authentication via MCP (so much AI coding these days. Just so, so, so much). None of this is a good idea!
With that said, a widely adopted standard for giving our wise recluses ways to access the real world is a very interesting thing, something that is almost certain to open doors when combined with LLM host applications like Cursor.
I talked about this a bit in my post on OpenAI’s Deep Research Agent, but it seems like the industry has quickly rounded the corner from training bigger and better models towards a core focus on connecting models to the real world. We might be here folks, it might be happening.
See you next week!