Agentjacking: Attack Exploits Public Sentry DSNs to Hijack AI Coding Agents
Agentjacking: Attack Exploits Public Sentry DSNs to Hijack AI Coding Agents
Summary
Cybersecurity researchers from Tenet Security have disclosed “Agentjacking,” a novel attack vector targeting AI coding agents (such as Claude Code, Cursor, or Copilot). By injecting malicious error payloads into public, write-only Sentry Data Source Names (DSNs), attackers can construct fake error logs containing hidden instructions. When developers prompt their coding agents to analyze and resolve these errors, the agents interpret the attacker-controlled markdown as legitimate diagnostic or remediation steps. Since these agents operate with developer-level terminal access on the local host, they execute the malicious payloads directly on the developer’s machine.
What happened?
Tenet Security identified that Sentry DSNs are designed to be public and write-only, allowing anyone to post arbitrary error payloads to an organization’s Sentry projects. When developers leverage Sentry Model Context Protocol (MCP) servers or direct integrations to feed logs to their AI agents, the lack of data sanitization allows the agent to ingest embedded markdown instructions. The AI agent processes this untrusted data as command instructions (indirect prompt injection), executing unauthorized shell commands. Tenet Security verified that over 2,388 organizations were exposed, achieving an 85% success rate in their tests. Sentry noted that the issue is a conceptual challenge difficult to entirely prevent but has implemented a temporary global payload filter.
Why it matters
This attack highlights a systemic security vulnerability in the rapid integration of AI agents into software development pipelines. Coding agents are granted extensive write and execution privileges on local development environments. When these agents ingest untrusted third-party data (such as error logs, support tickets, or database entries) without isolation or verification, they are vulnerable to indirect prompt injection. Developers must realize that feeding external logs to a local agent is equivalent to running untrusted, arbitrary code directly on their machine.
Evidence
The security flaw and its exploitability have been documented across multiple platforms:
- Tenet Security’s technical research details the attack mechanism and the 85% success rate during testing.
- Reports by The Hacker News and Infosecurity Magazine highlight the scope of the vulnerability and Sentry’s response.
- Snyk’s developer security training features lessons on Agent Goal Hijacking and the risks of indirect prompt injections.
Analysis
The Agentjacking attack exposes three core flaws in contemporary agent security design:
- Instruction-Data Confusion: LLMs lack a strict architectural boundary to separate developer instructions (prompts) from external data payload context (the log being analyzed). This enables data-driven prompt injection.
- Excessive Privileges: Local coding agents typically run with the developer’s full terminal privileges, lacking sandboxing or virtualized execution contexts.
- Implicit Trust in MCP: The Model Context Protocol streamlines data ingestion but relies entirely on client/server interfaces to sanitize input, a step that is frequently overlooked.
Practical Takeaways
Developers and security teams should adopt the following defensive practices:
- Do Not Feed Raw Logs to AI Agents: Avoid using Sentry MCP integrations or copying raw external error logs directly into coding agent prompts.
- Use Sandboxed Environments: Run local coding agents (like Claude Code or Cursor terminals) inside isolated Docker containers or virtual machines.
- Enforce Command Confirmation: Always configure agents in interactive mode, ensuring they require manual user confirmation before executing any terminal command.
- Implement Sanitization on MCP Servers: MCP server developers must strip markdown format structures, control characters, or instructions from data payloads before exposing them to the LLM.
Open Questions
- How effective will Sentry’s temporary global payload filters be against obfuscated or multi-stage prompt injection payloads?
- What architectural security models will future versions of the Model Context Protocol (MCP) introduce to handle untrusted third-party data?
- When will coding agent frameworks adopt sandboxing as a default, out-of-the-box configuration?