Blog 2.1 Agentic Risks: Overview

Before Agentic AI entered the chat, GenAI security considerations centered on a bounded model. A user entered a prompt, the system produced a response, and risks lived in that exchange. Concerns were real, poisoned data, bias, hallucinations, inadvertent data leaks, but largely contained.

That boundary has shifted because agentic AI systems act. Agents can retrieve documents, query data sources, send emails, or even operate IoT devices. In this post we provide a broad overview of agentic AI risks. And in the next blogs in this series we’ll explore agentic risks through the lens of OWASP’s guidance on threats and mitigations and structured threat modeling approaches.

The shift from “prompt in, response out” to “read, decide, act” is significant. Instead of being a passive participant in a workflow, the model becomes an actor in the workflow itself. That means risks no longer stop at the edge of the output window and what a human does with that output. They extend into the systems an agent can touch, the data it can access, and the actions it can trigger. If we once worried about a single flawed answer in a chat session, we now need to consider the impact of an agent sending that answer via email to every employee at the company or deleting a critical production database. 

Expanding Attack Surfaces

The old saying goes that there are millions of ways into a system, but attackers only need to find one. While we can’t map millions of attack points in this post, we can distill out five that illustrate how agentic systems broaden the attack surface. In the next post we’ll do a deep dive into the OWASP Agentic Threats taxonomy.

Cross-domain tool use. By design, agentic AI is meant to span systems. A single instruction might lead an agent to pull information from a document store, draft a message, push code into a repository, and call an API to deploy a new build. The efficiency is appealing, but each added tool is another potential point of failure or abuse and an expansion of the attack surface.

Indirect prompt injection or zero-click inputs. Traditional attacks often depend on a user taking an action such as clicking a link or opening a file. Agents change that dynamic. They can interpret instructions hidden in documents, calendar events, or web pages automatically. The risk shifts from “what a user clicks” to “what the system consumes.”

Action and exfiltration channels. Agents are not limited to generating on-screen output. They can send messages, move files, or control devices. Those capabilities, if misused, provide new ways for data to leave an organization or for unwanted changes to occur.

Identity and trust boundaries. Perhaps one of the most fundamental challenges is the identity attack surface. Agents act on behalf of users, yet there is still no widely accepted way to establish or manage their identity. Questions as basic as “Which user is this agent representing?” or “How do we distinguish between multiple agents?” are often unanswered. Without identity, accountability blurs and delegation across domains becomes a potential attack vector.

Connectors. MCP (model context protocol) is a bridge into enterprise systems such as Gmail, Drive, or GitHub. In an MCP environment, tools and resources are accessed via an MCP server that is hosted locally, while the agent or LLM acts as the MCP client. MCP comes with several security considerations. One is the “confused deputy” problem, where the system could carry out actions with its own higher-level access instead of just using the user’s permissions. Because MCP servers are applications, there’s also a risk that someone could slip in bad code or tamper with updates, much like downloading any other untrustworthy app. Another concern is tricking the system through prompts or disguised tools, where something looks safe on the surface but secretly steals data or performs hidden tasks. Even normal features like letting an AI “complete” requests can be risky if users can’t see or approve what’s happening.

Real World Examples

These risks aren’t just theoretical. Researchers have already shown how agentic AI attacks work in the real world.  One example used indirect prompt injection to show how malicious instructions could be hidden in seemingly innocuous calendar invites. When Google Gemini summarized the user’s schedule, it also acted on those hidden instructions, raising shades and turning lights on and off in a smart home.

Another attack, dubbed AgentFlayer, was presented at Black Hat 2025, and exploited ChatGPT Connectors. Researchers hid prompts instructing the AI to search for API keys stored in connected drives and then exfiltrate that information back to them into what looked like normal business email attachments. If an unsuspecting user uploaded the attachment to ChatGPT for summarization the AI would have executed the hidden prompts and potentially leaked very sensitive data.

Both cases build on familiar principles, untrusted input, hidden instructions, and data exfiltration, but apply them in ways unique to agentic systems. 

This overview is only the starting point. In the next post, we will walk through agentic risks using the OWASP framework and examine how to approach agentic threat modeling. Agents represent an important evolution in how AI is used. By treating their connectors, identity, and interfaces as security boundaries, and by applying established security discipline to their operation, organizations can benefit from their capabilities without introducing unnecessary risks.