TIPS #37: When AI Agents Become Insider Threats
Shane Shook
March 19, 2026
- Blog Post
- TIPS
Issue: Autonomous AI agents are performing business operations with minimal human oversight, expanding the enterprise attack surface at machine speed.
Autonomous AI agents are shifting from enterprise experiments to reality. As companies increasingly embed AI agents in production environments, they execute business-critical functions including business operations, security monitoring, customer engagement, DevOps automation, and financial workflows. Agents are now inheriting real authority, becoming execution engines that move data, modify infrastructure, call APIs, read and write, approve transactions, and orchestrate downstream workflows.
This trend is accelerating. Gartner predicts that by 2028, 33% of enterprise software applications will include agentic AI and 15% of day-to-day work decisions will be made autonomously through agentic AI (up from 0% in 2024).
AI identity risks, weak human-in-the-loop controls, and new layers of exposure
However, many legacy systems lack the secure identity management and human-in-the-loop controls necessary for secure and effective agentic AI integration. Gartner notes that over 40% of agentic AI projects will be canceled by 2027 due to insufficient risk controls and unclear business value.
Adding to the complexity, Autonomous AI agents are now introducing a new layer of exposure rooted in system design. Modern computing systems follow the von Neumann architecture, in which instructions and data share the same memory space. Historically, this design has produced classes of vulnerabilities such as code injections, command execution attacks, and buffer overflows because data can be interpreted as executable instructions.
Large language models (LLMs) and autonomous agents recreate a version of this problem at the semantic layer. In AI systems:
- Instructions and data coexist within the same prompt context
- External content is processed alongside system guidance
- Reasoning, retrieval, and execution are often tightly coupled
In other words, autonomous agents amplify architectural risk because they both interpret content and act on it. If an agent can ingest arbitrary content and then call APIs, retrieve data, or execute tasks based on what it reads, malicious input can influence downstream system behavior. When reasoning authority, data ingestion, and execution privileges are fused together, the attack surface expands from infrastructure into architecture.
Adding to the challenge, attackers may be able to use their own AI agents to exploit a company’s AI systems autonomously. For example, in February 2026, red-team cybersecurity startup CodeWall announced that its offensive AI agent autonomously targeted McKinsey’s internal AI chatbot Lilli and breached the platform in just two hours. The offensive AI agent autonomously exploited a SQL injection vulnerability to obtain write access to Lilli’s database, ultimately gaining access to millions of messages, hundreds of thousands of sensitive files, tens of thousands of user accounts, and nearly 100 system prompts governing Lilli’s behavior. While the exact details are contested- McKinsey has since patched the vulnerabilities and stated that there was no unauthorized client data access– it’s a clear example of how agentic AI is complicating security.
Impact: Compromised and under-secured AI agents are non-human insiders that can facilitate cascading failures, data loss, and regulatory violations.
When a traditional user account is compromised, an attacker must manually explore and execute actions step-by-step. When an autonomous agent is compromised or behaves unexpectedly, the attacker inherits something far more dangerous: a trusted, machine-speed operator that can automatically chain actions across systems.
Since agents operate continuously and at scale, failures propagate faster than most manual detection and response processes are designed to contain. The consequences can be immediate and compounding:
- Cascading system failures triggered by automated workflows
- Data loss or unintended propagation across integrated platforms
- Regulatory violations executed without explicit human review
- Financial damage driven by automated approvals or transactions
The following case studies illustrate how architectural exposure and autonomous connectivity can expand the attack surface:
Clinejection CLI Supply Chain Attack (2026)
In February 2026, threat actors exploited a prompt injection vulnerability in an AI-powered GitHub issue triage bot used by open-source AI coding agent Cline. Attackers embedded a malicious instruction within a GitHub issue title, which the Cline system inserted directly into its AI triage bot prompt without sanitization. The bot then downloaded a payload from a typosquatted repository.
This initiated a chain of events that poisoned the Cline CI/CD cache and stole the project’s highly privileged npm release token. Attackers used this token to publish a compromised version of the tool that silently installed OpenClaw, a separate autonomous AI agent with full system access, onto approximately 4,000 developer machines.
This incident illustrates the semantic von Neumann flaw for AI. Untrusted external data (a natural language issue title) was ingested, interpreted as an executable instruction by the AI, and run with CI environment privileges, leading to a recursive supply chain compromise.
Salesforce OAuth and Drift Supply Chain Breach (2025)
In August 2025, Salesloft experienced a supply chain breach via its Drift chatbot integration, impacting more than 700 organizations. Threat actors gained access to the company’s GitHub environment and then Drift’s AWS environment, where they stole OAuth authentication tokens. They then used the tokens to access data, move laterally into customers’ third-party platforms including Salesforce and Google Workspace, and query customer environments.
This activity bypassed traditional MFA and monitoring because it originated from a trusted application (Salesloft Drift). Drift’s AI-driven integration expanded the attack surface by introducing additional OAuth trust relationships and system connectivity, showing how AI integrations can quickly add structural risk.
Microsoft 365 Copilot EchoLeak Vulnerability (2025)
EchoLeak (CVE-2025-32711) is a patched critical vulnerability in Microsoft 365 Copilot that enabled a zero-click attack. The exploit could have allowed attackers to automatically exfiltrate sensitive data by crafting external email content to influence Copilot’s processing flow. Since the attack did not require users to click a malicious link, the payload executed invisibly. If a targeted user asked Copilot a question about the email’s topic, Copilot would have executed hidden instructions to collect information from anything in Copilot’s access scope- including chat logs, OneDrive files, and Teams messages. This would bypass prompt injection controls, redaction mechanisms, and content security policies.
This vulnerability shows how AI can be weaponized against itself to leak sensitive data when models process malicious instructions alongside normal contextual data.
Action: Secure autonomous AI agents as operational actors by implementing Agent Identity Governance (AIG).
1) Adopt an Agent Identity Governance control model
Agentic autonomy is accelerating, and governance must accelerate with it. Organizations need to treat AI agents like operational actors.
The Agent Identity Governance (AIG) framework establishes a control model built on five core principles which impose architectural discipline on autonomous agents:
- Least Privilege: Grant agents only the minimum permissions required for narrowly defined tasks.
- Human-in-the-Loop Controls: Require explicit human approval for high-impact agent decisions including administrative changes, production deployments, financial transfers, and sensitive data exports.
- Mandatory Decision Logging: Capture detailed records of agent inputs, decisions, actions, and resulting system changes.
- Explicit Kill Switch Mechanisms: Develop and maintain the capabilities to immediately suspend or isolate agent activity across integrated systems.
- Separation of Control and Data Planes: Architect systems such that reasoning authority, data access, and execution privileges are not fused into a single layer. A failure in one layer should not automatically cascade across others.
2) Operationalizing AIG
Implementing AIG principles in practice requires a coordinated technology and security stack that prioritizes agentic AI visibility, runtime controls, accountability, and testing.
a) Gain visibility and control over AI access: You can’t govern an AI agent if you don’t know what it’s connected to or that it exists in your environment. Nudge Security provides visibility into SaaS and AI integrations, enabling security teams to control access from OAuth grants, API permissions, service accounts, and corresponding delegated AI capabilities.
“OAuth is supposed to simplify security, not bypass it. In today’s environment of over-trusted SaaS and AI integrations, it’s a growing challenge for IT security teams to maintain visibility and control of who (and what) has access to corporate data and how. Effective AI governance starts with the visibility to see every cloud asset created by your workforce and the control to manage every integration, OAuth grant, and API permission.”
Jaime Blasco CTO and Co-founder, Nudge Security
b) Deploy accountable agents: When building or buying agents for high-stakes operations, prioritize auditable architectures to build governance into the foundation of autonomous AI. Maisa empowers enterprises to create accountable digital AI workers capable of automating complex business processes, with governance, operational traceability, and hallucination resistance at the core to maximize trust.
“Most AI models today are probabilistic with unpredictable outputs. This is unacceptable for precise business processes. Enterprises need AI agents that not only integrate seamlessly with existing systems but offer stable, fully auditable outputs.”
David Villalon CEO and Co-Founder, Maisa
c) Ensure runtime protection: Runtime observability is essential to enforce an architectural separation of control and data planes, in addition to blocking malicious prompts. WitnessAI helps companies observe, protect, and control AI systems and agents with runtime visibility and policy enforcement.
d) Conduct continuous testing: Continuous security testing is paramount for AI due to its tightly coupled reasoning and execution. The complexity and mercurial nature of AI introduces evolving risks around user interactions, model behavior, and guardrail gaps. Bishop Fox applies adversarial testing and exposure management expertise to identify systemic weaknesses across modern AI-integrated environments before attackers can exploit them.
Featured in this edition: Bishop Fox, Maisa, Nudge Security, and WitnessAI