When building AI agents, the shift from read-only LLM queries to autonomous execution introduces a massive new attack surface: tool use. An agent that can execute shell commands, query production databases, or mutate data via APIs is incredibly powerful—and inherently dangerous if compromised.
Traditional perimeter security controls assume a human actor making deterministic API calls. Agents, however, generate their own execution plans dynamically, chaining tool calls together in unpredictable ways. This creates a critical visibility and control gap. If an agent falls victim to prompt injection or context poisoning, it can immediately pivot to abusing its granted tools.
The Tool-Use Problem: When Agents Have Too Many Permissions
The core vulnerability in most agentic deployments is over-privileged tools. Developers often equip agents with a broad suite of capabilities to maximize their utility. An agent designed to summarize financial reports might be given read access to a database, but also inadvertently retain the ability to execute arbitrary SQL or write to a cloud storage bucket.
What happens when an agent can call any tool? The blast radius of a successful compromise expands from data exfiltration to full system compromise.
Consider the recent CVE-2025-59944 vulnerability found in Cursor, a popular agentic IDE. This vulnerability demonstrated a critical pattern: if an attacker can feed malicious context into an agent that possesses code execution capabilities, they can achieve Remote Code Execution (RCE) without any direct interaction from the user.
This reveals a fundamental truth about tool trust in AI systems. The model itself cannot be trusted to securely evaluate the consequences of a tool call when acting on untrusted inputs. You cannot rely on the LLM to self-police its own actions.
Understanding MCP (Model Context Protocol) Security
The Model Context Protocol (MCP) standardizes how agents discover and interact with external tools and data sources. MCP servers advertise their capabilities, and agents connect to these servers to perform actions. While this standardization accelerates development, it also formalizes a new attack vector.
The primary security gap in the current MCP landscape is the assumption of trust between the agent and the tool provider, and more importantly, the lack of fine-grained authorization during tool invocation.
A critical threat here is tool poisoning. If an attacker compromises an MCP server or manipulates the tool description metadata, they can trick the agent into executing malicious actions under the guise of legitimate operations. Because the agent relies on the LLM to interpret the tool description, crafting a deceptive description can easily bypass the agent's expected behavior.
This enables zero-click exploit patterns. Imagine an agent summarizing a Google Doc that contains hidden text with malicious instructions. The agent reads the doc, ingests the prompt injection, and then leverages its connected MCP tools—like an overly permissive code execution tool—to silently run a reverse shell on the host machine.
Least Privilege for AI Agents: A Framework
To secure agentic systems, we must adapt the principle of least privilege to autonomous actors.
- Define Agent Roles and Required Capabilities: Do not grant sweeping permissions. An agent performing data analysis does not need access to the user management API. Scope tool access strictly to the agent's defined role.
- Scope Tool Access Per Agent Role: Implement strict access control lists (ACLs) that dictate which agent identities can invoke which specific tools.
- Time-Bounded Permissions (Ephemeral Tool Access): For high-risk actions, grant access only for the duration of a specific session or task.
- Human-in-the-Loop for High-Risk Actions: Tools that mutate production data or execute code should require out-of-band human authorization before execution.
Gateway-Level Tool Call Interception
Implementing least privilege within the agent code itself is fragile. It relies on developers correctly handling every edge case and requires continuous updates to the agent logic.
GuardionAI provides a robust alternative: Gateway-Level Tool Call Interception. As an AI Agent and MCP Security Gateway, GuardionAI sits directly in the execution path between your AI agents and the LLM providers. It acts as a network-level security proxy, intercepting and inspecting all traffic without requiring SDKs or code changes.
When an agent decides to invoke a tool, it sends that request (e.g., via OpenAI's function calling format). Guardion intercepts this request before it reaches the LLM or the MCP server.
This allows Guardion to enforce policy-based authorization. We can apply allow/deny lists per agent identity. More importantly, we perform deep argument validation. For instance, if an agent calls a database querying tool, Guardion can inspect the arguments to prevent SQL injection or block queries that access unauthorized tables, all at the gateway level. We also enforce rate limiting on tool invocations to mitigate denial-of-service or automated data scraping attempts by a compromised agent.
Monitoring and Auditing Agent Actions
Visibility is the prerequisite for security. You cannot protect what you cannot observe. GuardionAI's Agent Action Tracing captures every tool call, data access, and autonomous decision in real-time, eliminating the black box of agent execution.
This telemetry powers real-time tool-use dashboards and anomaly detection. If an agent that typically reads documents suddenly begins attempting to execute shell commands or access secure key vaults, Guardion detects this unusual pattern immediately.
Furthermore, comprehensive audit trails are mandatory for compliance (SOC 2, GDPR, HIPAA). Guardion provides SIEM-exportable logs detailing exactly what tools were called, with what arguments, and by which agent, enabling robust post-incident forensics and meeting regulatory requirements. We also provide alerting mechanisms tailored for high-risk tool combinations, notifying security teams the moment an agent exhibits suspicious behavior.
Implementation: Securing MCP Servers With Guardion
Deploying GuardionAI to secure your MCP servers is straightforward, typically completed in under 30 minutes. The architecture places Guardion as a drop-in proxy. Your agent routes its traffic through the Guardion gateway, which then communicates with your LLMs and your MCP servers.
Here is an example of how you might configure a Guardion policy to scope tool access for a specific agent role, ensuring it can only read data and not execute dangerous commands:
# guardion-policy.yaml
version: "1.0"
policies:
- name: "data-analyst-agent-policy"
description: "Restricts the analyst agent to safe data retrieval tools."
target:
agent_id: "analyst-agent-prod"
rules:
- action: "allow"
tool_name: "mcp-sql-query"
# Argument validation: Block any query containing DROP, DELETE, or UPDATE
conditions:
- field: "arguments.query"
operator: "not_matches"
value: "(?i)(DROP|DELETE|UPDATE|INSERT|ALTER|TRUNCATE)"
- action: "allow"
tool_name: "mcp-read-file"
conditions:
- field: "arguments.path"
operator: "starts_with"
value: "/var/data/reports/"
# Deny all other tools, particularly shell execution
- action: "deny"
tool_name: "*"
By enforcing these guardrails at the network layer, GuardionAI ensures that even if the agent's logic is bypassed via prompt injection, the unauthorized tool calls will be intercepted and blocked before they ever execute. Unified security for AI Agents and MCPs requires moving beyond trust and implementing strict, verifiable enforcement at runtime.
References & Research
- Anthropic MCP Specification
- Lakera — Zero-Click RCE: Exploiting MCP & Agentic IDEs
- Lakera — Agentic AI Threats Part 2: Over-Privileged Tools
- CVE-2025-59944 (Cursor vulnerability)
- OWASP Top 10 for Agentic Applications — Excessive Agency
- Lakera — What the New MCP Specification Means
- OpenAI function calling security
- LangChain tool security docs
- NIST SP 800-53 — Access Control (AC)

