If you've spent any time around IT, you probably own that dusty box of legacy cables – a tangle of odd connectors kept just in case you ever need one again. Before a common standard like USB came along, connecting devices meant a messy puzzle of dozens of different plugs and ports. USB (and later USB-C) changed that by giving us one simple, reversible connector that handles everything – power, data, and video – making it easy for devices to work together.
Artificial intelligence is becoming increasingly sophisticated, moving beyond simply providing information to actually performing tasks in the real world. This is especially true of autonomous "Agentic AI" that needs to interact with the outside world, and the AI ecosystem now faces a challenge similar to that old tangle of connectors. The Model Context Protocol (MCP), introduced in late 2024 by Anthropic, is designed to serve as a kind of USB for AI applications.
It's an open standard aiming to simplify how AI models, like Large Language Models (LLMs), get information and use tools from the internet and other systems. However, just as a single, widely used port can become a primary target for someone trying to tamper with a system, MCP's role as AI's new universal connector also makes it an interesting point of focus for security concerns, which we'll explore in this article.
Why Do We Research MCP?
Our focus on MCP comes from a simple observation: the adoption of new AI technologies consistently outpaces the development of robust security practices and guardrails. MCP was designed for interoperability and functionality, not with security as a primary, built-in concern. This creates a potentially dangerous situation.
At Bitdefender, a significant part of our work involves closely observing and researching emerging threats, well before they become mainstream concerns. Our Bitdefender Labs is home to hundreds of security researchers and scientists who dedicate considerable effort to understanding the complex vulnerabilities of new technologies – with research expanding beyond endpoint security to cover attacks on quantum computers and the takeover of large-scale solar grids. This commitment to deep, foundational research helps us not only anticipate future risks but also develop innovative solutions like GravityZone PHASR.
While we typically share these insights within academic circles, we are now also focused on making this research more accessible to the broader cybersecurity community.
We anticipate a long and potentially painful learning curve for businesses as they discover how to secure these new AI interactions, much like the industry has experienced with other foundational internet protocols that delegated security responsibilities to individual developers.
By actively researching MCP's security implications now, we aim to identify these risks early and contribute to the development of necessary defenses, helping to ensure that the Model Context Protocol (MCP) doesn't inadvertently become the very MCP that we are trying to avoid.
How Does MCP Work?
The MCP uses a client–server model, where a single MCP Client connects to a single MCP Server. The LLM never talks directly to the MCP Server or to external websites; instead, the MCP Host mediates between the LLM and the Client. The MCP Client manages the one-to-one connection with the MCP Server.
The MCP Client first discovers what the MCP Server can do (its ‘capabilities’) through initialization. The Host then exposes this list of available tools to the LLM. When the Host receives a user’s prompt and the LLM identifies a need for external interaction, the LLM acts as a ‘reasoner,’ analyzing the input and selecting the appropriate tool and arguments. The Host uses this selection to instruct the Client to construct and send a structured JSON-RPC call to its MCP Server.
Upon receiving the machine-readable output from the Server, the Client returns it to the Host, which passes it back to the LLM. The LLM may then perform further reasoning, synthesis, and natural-language generation, which is finally presented to the user.
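To make this flow concrete, here is a minimal sketch of what the JSON-RPC exchange between an MCP Client and MCP Server might look like. The "get_weather" tool and its arguments are hypothetical; the message shape follows the protocol's tools/call method.

```python
import json

# Hypothetical example: the LLM has selected a "get_weather" tool with a city argument.
# The MCP Client wraps that selection into a JSON-RPC 2.0 "tools/call" request.
request = {
    "jsonrpc": "2.0",
    "id": 42,
    "method": "tools/call",
    "params": {
        "name": "get_weather",                # tool chosen by the LLM
        "arguments": {"city": "Bucharest"},   # arguments extracted from the user's prompt
    },
}

# A typical response: machine-readable content the Client hands back to the Host,
# which passes it to the LLM for further reasoning and summarization.
response = {
    "jsonrpc": "2.0",
    "id": 42,
    "result": {
        "content": [{"type": "text", "text": "14 C, light rain"}],
        "isError": False,
    },
}

print(json.dumps(request, indent=2))
print(json.dumps(response, indent=2))
```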
What’s the difference between MCP Host and MCP Client?
An MCP Host is the AI application a user interacts with, like "Claude Desktop" or IDEs. Within this Host, the MCP Client is a software instance that exclusively manages a one-to-one connection with an MCP Server. An MCP Host can host multiple MCP Clients. Understanding this historical terminology difference helps when navigating the MCP ecosystem. For a curated list of hosts (labeled "clients" due to early naming confusion), see: https://github.com/punkpeye/awesome-mcp-clients.
MCP Hosts are not always local; they can also operate in cloud environments. Many large language model platforms now offer "connectors" (also called “tools,” “extensions,” or “functions”) that integrate with external services. While some of these platforms use proprietary integrations (like Zapier), many are adopting cloud-based MCP Hosts. This means the end user can connect to a web chat service like ChatGPT or Gemini and interact with MCP seamlessly in the background.
Single-machine deployments are perfect for developers to experiment and tinker. However, we expect the protocol to quickly evolve from a local, desktop-centric model to a cloud-based one, which is key to supporting mobile and other distributed use cases.
Where is the MCP Server running?
Another area that can cause confusion is the location of the MCP Server itself. An MCP Server can run either locally on the same machine as the MCP Host and Client, or remotely. For early adopters, local servers are common due to setup simplicity, typically using stdio (standard input/output) for communication. However, for real-world production deployments, we anticipate that remote MCP Servers will become far more prevalent. Remote communication commonly relies on Streamable HTTP transport.
While these are the primary underlying protocols, MCP is designed to be transport-flexible; theoretically, other inter-process communication (IPC) methods locally or various network protocols remotely could be used.
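For illustration, the local stdio transport is essentially newline-delimited JSON-RPC exchanged over a child process's standard input and output. The sketch below assumes a hypothetical local server binary called `my-mcp-server`; the initialize handshake shown mirrors the shape described in the specification.

```python
import json
import subprocess

# Launch a hypothetical local MCP Server as a child process (stdio transport).
server = subprocess.Popen(
    ["my-mcp-server"],            # hypothetical server binary
    stdin=subprocess.PIPE,
    stdout=subprocess.PIPE,
    text=True,
)

# The Client opens the session with an "initialize" request;
# each message is a single line of JSON terminated by a newline.
initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-03-26",   # protocol revision string; use what your server supports
        "capabilities": {},
        "clientInfo": {"name": "example-client", "version": "0.1.0"},
    },
}
server.stdin.write(json.dumps(initialize) + "\n")
server.stdin.flush()

# The server's reply advertises its capabilities (tools, resources, prompts, ...).
print(server.stdout.readline().strip())
```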
It's important to note that not all MCP Hosts inherently support connecting to remote MCP Servers. In such cases, proxies like mcp-remote are often deployed to bridge this gap. While these workarounds solve a connectivity problem, they inherently broaden the overall attack surface of the MCP system. This expanded exposure is not just theoretical; we've already seen critical vulnerabilities emerge from these configurations, like CVE-2025-6514 (CVSS 9.6).
Security Risks of Agentic AI
With a better understanding of how MCP works, we can now look at the critical security challenges. While the protocol's flexibility is its strength, it also introduces significant points of friction for traditional security models. In our research, we've identified five key risk areas that CISOs and security teams must consider as they begin their journey with agentic AI.
1. The “Opt-In” Security Ecosystem
The rapid growth of the Agentic AI space means that security tools and practices are struggling to keep up. We're seeing a fragmented and inconsistent ecosystem with a lack of standardized security baselines. For example, some implementations may use robust authentication for a remote MCP Server, while others, especially internal or local ones, are often left with minimal to no authentication, relying on a false sense of perimeter security.
Authentication Model Overview
It's helpful to break down the authentication points within the MCP chain. Each link in this chain has its own set of authentication challenges:
- MCP Host (Client) <-> LLM - Authentication here is dictated by the communication protocol used (e.g., API keys for OpenAI). This is often the most secure part of the chain.
- MCP Client <-> MCP Server - This is where a major gap exists. The protocol offers optional authentication and publishes security recommendations, but it does not enforce them by default. Whatever the transport (Streamable HTTP, SSE, or stdio), authentication such as OAuth is optional and must be implemented by the developer.
- MCP Server <-> Infrastructure - This depends on the specific implementation. The security of the MCP Server itself is dictated by the infrastructure on which it runs, which may or may not have robust security controls in place.
It's surprising to see such a new core protocol where security isn't "secure by default." While the current specification for Streamable HTTP rightly uses an Mcp-Session-Id header for session identity, earlier SSE-based implementations sometimes placed the session ID directly in the URL query string. This is a significant lapse, as URLs can leak sensitive information through browser history, logs, and proxies. This oversight serves as a warning sign that even basic web security best practices were not consistently applied from the beginning.
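To show what "opt-in" means in practice, here is a minimal sketch of the kind of check a remote MCP Server could apply before processing a Streamable HTTP request: validate a bearer token and take the session identity from the Mcp-Session-Id header rather than the URL. The token store and values are hypothetical placeholders, not a production OAuth implementation.

```python
from http import HTTPStatus

# Placeholder token store; a real server would validate OAuth access tokens properly.
VALID_TOKENS = {"example-token"}

def authorize_request(headers: dict) -> tuple[int, str]:
    """Reject MCP requests that arrive without credentials or a session header."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer ") or auth.removeprefix("Bearer ") not in VALID_TOKENS:
        return HTTPStatus.UNAUTHORIZED, "missing or invalid bearer token"

    # Session identity belongs in the Mcp-Session-Id header, never in the URL,
    # where it would leak through logs, proxies, and browser history.
    session_id = headers.get("Mcp-Session-Id")
    if not session_id:
        return HTTPStatus.BAD_REQUEST, "missing Mcp-Session-Id header"

    return HTTPStatus.OK, session_id
```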
From a technical perspective, this immaturity is a classic "first-mover" problem. In the early days of any new technology, the focus is on innovation and speed, and AI has become a Formula 1 car while security is still on a bicycle. The pressure to push new features and models to market is so intense that security is often left behind, creating a widening gap between what's possible and what's truly safe. As a result, critical security controls that are standard in other areas—like role-based access control (RBAC) or secure-by-default configurations—are an afterthought.
This creates fertile ground for attackers for many years to come.
2. The Supply Chain of Context and Tools
Just like a software supply chain, the flow of data and tools into an AI agent can be weaponized. The MCP Server acts as an intermediary, pulling data from various external resources or executing functions based on user-provided content such as a PDF document. The problem is that a malicious or compromised tool in this chain can feed an agent tainted context, leading it to perform harmful actions.
This could be as simple as a data source returning hidden instructions in invisible Unicode characters, or as complex as a seemingly benign tool being updated with a malicious function, a kind of "rug pull" similar to what we've seen with browser extensions. Without a formal approval process and code verification for new MCP servers, organizations risk unknowingly inheriting security flaws.
This is fundamentally a trust problem. When an AI agent executes a tool, it's essentially trusting that the tool is legitimate, the data it returns is accurate, and its behavior hasn't been maliciously altered.
An attacker can exploit this trust by poisoning a data source, compromising a third-party tool, or registering a new tool with a malicious name that a developer or organization might mistakenly deploy. This kind of attack is difficult to detect because the malicious code or data is injected at a point where a human is not in the loop.
A new threat known as slopsquatting highlights this risk. Researchers found that code-generating AI models often "hallucinate" incorrect or fabricated package names. Attackers can register these nonexistent packages and inject malware, which a developer could unknowingly install, compromising the entire software supply chain.
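One pragmatic mitigation, sketched below under the assumption that your Host can retrieve a server's tool list as JSON, is to pin a hash of each approved tool definition and flag any tool whose name, description, or schema has silently changed since it was vetted, the "rug pull" scenario described above. The approved fingerprint values shown are hypothetical.

```python
import hashlib
import json

def fingerprint_tool(tool: dict) -> str:
    """Hash the parts of a tool definition an attacker would tamper with."""
    canonical = json.dumps(
        {k: tool.get(k) for k in ("name", "description", "inputSchema")},
        sort_keys=True,
    )
    return hashlib.sha256(canonical.encode()).hexdigest()

# Fingerprints recorded when the server was vetted and approved (hypothetical values).
APPROVED = {"get_weather": "3f1a..."}

def verify_tools(tools: list[dict]) -> list[str]:
    """Return the names of tools whose definitions no longer match the approved pin."""
    return [
        t["name"]
        for t in tools
        if APPROVED.get(t["name"]) != fingerprint_tool(t)
    ]
```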
3. Over-Permissioned Context and Confused Deputy Problem
In traditional IT, the principle of least privilege is fundamental. With MCP, this becomes far more complex. The tokens or permissions provided to an MCP Server can be over-permissioned, long-lived, and unscoped, giving the agent far more access than it needs.
This is compounded by the "confused deputy problem," where a server with high privileges executes an action on behalf of a lower-privileged user. Since the MCP protocol doesn't inherently carry user context from the Host to the Server, the server has no way to differentiate between users and may grant the same access to everyone, leading to a significant risk of data exfiltration or unauthorized actions.
This is a classic privilege escalation scenario. The attacker doesn't need to break into the system directly. Instead, they trick a trusted "deputy" (the AI agent with its over-permissioned tokens) into doing the dirty work for them. The AI agent, lacking a complete understanding of the user's intent or permissions, acts on the attacker's behalf and performs an action the attacker could not have carried out on their own.
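A minimal sketch of one mitigation, assuming the Host can attach an end-user identity to each tool call and the server maintains its own per-user permission map (both hypothetical here): the server checks the effective user's scope, not just its own powerful service credentials, before acting.

```python
# Hypothetical per-user permissions kept by the MCP Server; the server's own
# credentials may be broad, but each end user is scoped individually.
USER_PERMISSIONS = {
    "alice@example.com": {"read_reports"},
    "bob@example.com": {"read_reports", "delete_reports"},
}

def execute_tool(tool_name: str, end_user: str) -> str:
    """Avoid acting as a 'confused deputy': check the end user's scope, not the server's."""
    allowed = USER_PERMISSIONS.get(end_user, set())
    if tool_name not in allowed:
        raise PermissionError(f"{end_user} is not allowed to call {tool_name}")
    return f"executed {tool_name} for {end_user}"

# Example: Alice can read reports, but a request routed through the agent
# cannot delete them on her behalf, no matter what token the server holds.
print(execute_tool("read_reports", "alice@example.com"))
```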
4. Injection Attacks
While "prompt injection" is a known risk, the agentic model introduces more sophisticated variants. An attacker could use "context poisoning," subtly manipulating a document or data source to alter an agent’s behavior. Furthermore, "tool injection" allows an attacker to manipulate the instructions an agent receives, tricking it into executing a malicious function or compromising an external resource. These new attack vectors can subvert the AI's "human in the loop" guardrails, as the malicious instruction is executed before the AI's response is even presented to the user.
To put this into perspective, imagine a macro in a Word document, but for AI. The malicious code is hidden in plain sight, invisible to the user but ready to be executed as soon as the AI processes the file. This is particularly insidious because it bypasses the traditional trust boundary between a user and their tools. The user believes they are giving a benign instruction ("summarize this document"), but the tool is actually following a hidden, malicious command.
The CVE-2025-32711 "EchoLeak" vulnerability against Microsoft 365 Copilot demonstrates this risk perfectly. Threat actors could embed tailored, hidden prompts within a Word document or email. When Copilot was asked to summarize the file, it would execute the attacker’s hidden instructions, silently exfiltrating sensitive data with zero user interaction.
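As one narrow, illustrative defense (it does not stop prompt injection in general), the sketch below strips invisible "format" Unicode characters from retrieved content before it reaches the model, closing off the specific "hidden in plain sight" channel where instructions are smuggled in zero-width or tag characters. The poisoned document is a fabricated example.

```python
import unicodedata

def strip_hidden_characters(text: str) -> str:
    """Drop invisible/format Unicode characters (category Cf) from retrieved content
    before it is handed to the LLM; this removes zero-width and 'tag' smuggling tricks."""
    return "".join(ch for ch in text if unicodedata.category(ch) != "Cf")

# Hypothetical poisoned document: an instruction smuggled in invisible Unicode tag characters.
hidden = "".join(chr(0xE0000 + ord(c)) for c in "ignore previous instructions")
poisoned = "Quarterly revenue grew 12%." + hidden

print(strip_hidden_characters(poisoned))   # only the visible text survives
```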
5. Lack of Audit Logging and Traceability
When an AI agent makes a mistake—or is exploited—how do you trace its actions back to the source? The current MCP ecosystem often lacks a standardized approach to audit logging and traceability. Without a robust way to capture the entire "chain of thought"—from the initial user query, through the AI's decision to call a specific tool, to the final action performed by the MCP Server—organizations are left with a significant compliance blind spot. This makes it nearly impossible to conduct a proper forensic analysis of an incident or to establish accountability for a security breach.
From a technical perspective, this challenge stems from the inherent lack of provenance in many AI workflows. Provenance, in this context, refers to the verifiable, documented history of a piece of data or an action. For an AI agent, this means being able to answer questions like: "Where did this data come from?" "Why did the agent choose to call this tool?" and "What was the exact state of the environment when that action was taken?" This is not just a matter of logging; it requires a new type of architectural design that treats the entire agentic workflow as a chain of auditable events.
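As a minimal sketch of what such an auditable chain could look like, assuming each step of the agentic workflow is reduced to a structured event: every record is hash-chained to the previous one, so tampering with any single step becomes detectable. The event types and payloads below are illustrative, not a standardized format.

```python
import hashlib
import json
import time

class AuditTrail:
    """Append-only, hash-chained log of agentic events (user query, tool choice, tool call, result)."""

    def __init__(self) -> None:
        self.events: list[dict] = []
        self._last_hash = "0" * 64

    def record(self, event_type: str, payload: dict) -> dict:
        event = {
            "ts": time.time(),
            "type": event_type,
            "payload": payload,
            "prev_hash": self._last_hash,   # links this event to the one before it
        }
        event["hash"] = hashlib.sha256(json.dumps(event, sort_keys=True).encode()).hexdigest()
        self._last_hash = event["hash"]
        self.events.append(event)
        return event

trail = AuditTrail()
trail.record("user_query", {"text": "Summarize last quarter's incidents"})
trail.record("tool_call", {"tool": "search_tickets", "arguments": {"quarter": "Q2"}})
trail.record("tool_result", {"items_returned": 17})
```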
Adding to this complexity is the question of compliance and accountability. Because the AI's decision-making process is not always transparent or predictable, a new legal and regulatory landscape is emerging to address this. The responsibility for an agent's actions often falls on the organization that deployed it, much like an employer is responsible for the actions of its human employees. From a compliance standpoint, this lack of traceability creates significant hurdles for adhering to regulations like GDPR or HIPAA, which require clear audit trails for data access and processing.
Organizations must establish clear policies, not only for how agents are deployed but also for who is accountable for any errors or breaches they cause. Without this clarity, the legal and financial repercussions of an incident can be severe.
Conclusion and Recommendations
All companies, from large financial institutions to small and mid-sized businesses, feel immense pressure to implement AI and agentic AI to remain competitive. While large enterprises may have the resources and dedicated expertise to ensure their implementations are secure, we should be particularly concerned about smaller organizations. These companies often lack the specialized knowledge and might be more inclined to take shortcuts, relying on insecure default configurations and public repositories without proper vetting, which exponentially increases their risk.
We highly recommend getting familiar with the official MCP security best practices, which can be found here: https://modelcontextprotocol.io/specification/draft/basic/security_best_practices.
For a practical checklist on hardening your agentic AI security, another great resource is the MCP Security Checklist: https://github.com/slowmist/MCP-Security-Checklist. You should also require your supply chain to meet these standards, and be cautious of vendors that simply offer an opaque solution and tell you to "just believe us."
Based on the attack vectors discussed, we recommend the following measures to build a more resilient and secure agentic AI environment.
Proactive Security Measures
- Establish a formal approval process for adding new MCP servers to your environment or connecting to third-party servers.
- Consider creating an internal repository of vetted MCP servers instead of installing from public sources.
- Implement robust schema checks and input validation to prevent context and prompt injection attacks (see the sketch after this list).
- Limit API permissions, ensuring only necessary actions can be performed.
- Review permissions before installing or running any MCP server.
- Push for secure-by-default implementations from vendors and partners.
- Verify that any tokens used are not over-permissioned, long-lived, or unscoped.
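For the schema-check recommendation above, here is a sketch of the idea using the third-party `jsonschema` package: validate the arguments the LLM proposes against the tool's declared input schema before the call ever leaves the Client. The "get_weather" tool schema is hypothetical.

```python
from jsonschema import ValidationError, validate  # pip install jsonschema

# Hypothetical input schema declared by a "get_weather" tool.
TOOL_SCHEMA = {
    "type": "object",
    "properties": {"city": {"type": "string", "maxLength": 64}},
    "required": ["city"],
    "additionalProperties": False,
}

def validate_tool_arguments(arguments: dict) -> bool:
    """Reject LLM-proposed arguments that don't match the tool's declared schema."""
    try:
        validate(instance=arguments, schema=TOOL_SCHEMA)
        return True
    except ValidationError:
        return False

print(validate_tool_arguments({"city": "Bucharest"}))              # True
print(validate_tool_arguments({"city": "x", "cmd": "rm -rf /"}))   # False: unexpected property
```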
Continuous Monitoring and Governance
- Keep humans in the loop for all important or high-risk operations.
- Monitor for unexpected behavior that could signal a security event.
- Include MCP in your threat modeling and penetration testing scopes.
- Establish a robust audit logging and traceability framework to ensure you can analyze an incident from the initial user query to the final action.