OpenAI ChatGPT Agent: A CISO’s guide to the security implications and how to mitigate them

Published: Jul 21, 2025 · 6 min. read

On July 17th OpenAI announced significant updates to ChatGPT that fundamentally shifts its capabilities from passive text generation to active autonomous agentic decision making and task execution on behalf of its users. As OpenAI states in their announcement, “ChatGPT now thinks and acts, proactively choosing from a toolbox of agentic skills to complete tasks for you using its own computer.” This announcement represents more than just a feature update, it reflects the dramatic transformation of AI from a helpful assistant driven by the user, to an AI agent that is capable of making independent decisions and performing actions across interconnected business systems.

We’ve evaluated the capabilities and security implications of the ChatGPT agent from the perspective of the enterprise CISO’s cybersecurity team. Here’s everything you need to know about the security risks and how you can mitigate them.

What is the OpenAI ChatGPT agent?

OpenAI released a new agentic mode for its state-of-the-art LLM, called “Agentic mode,” essentially combining functionalities that existed separately into a comprehensive autonomous agent. These are:

Operator: Solving tasks through autonomous web browsing and interaction.
Deep Research: conducts multi-step research on the internet for complex tasks.
Code Interpreter: Execute code for data analysis and visualization.
Integrations with external data sources and connectors such as Google Drive or GitHub

This new “Agent mode” has the ability to transform business operations by automating multi-step workflows that can, for example) help sales teams automatically generate customized pricing proposals, or enable HR teams to streamline onboarding processes by automatically scheduling meetings on their behalf. While this all sounds incredibly promising and exciting for a business user, what is required under the hood means providing the agent access to sensitive data, user access permissions, company networks and internal systems that contain highly sensitive information for it to be useful. If you are a CISO or have any vested interest in keeping your organization safe and secure, this should be ringing alarm bells. While these autonomous capabilities promise unprecedented productivity gains and competitive advantages for organizations ready and willing to embrace this new era of AI, they simultaneously introduce a new category of security and governance risks that organizations must consider.

The New Agentic AI Risk Landscape

As we previously stated in our blog, AI agent risk originates from three critical risk factors, which have now increased significantly with this new mode:

Autonomy: Agents can now use any connectors or credentials provided by users when browsing the web, enabling them to act with the user’s identity and perform irreversible actions across multiple platforms.
No differential access control between AI and the user it impersonates: There’s no efficient way to restrict agent capabilities since they use the exact same permissions as the user, essentially giving AI the full scope of human access rights.
Limited auditability: Actions performed by agents using user credentials cannot be easily distinguished from user actions in audit logs, creating accountability gaps

Ready to protect your organization from AI agent risks? See how Noma Security enables safe agent adoption.

ChatGPT Agent Risk Exposure

While this release is a great step into Agentic AI, there is a long way to go before organizations and users are able to fully adopt AI agents in a safe and secure way. While OpenAI has implemented built-in guardrails where ChatGPT proactively seeks user approval before executing real-world actions, operates under active supervision (“Watch Mode”) for sensitive tasks like sending emails, and automatically declines high-stakes actions such as financial transactions to prevent misuse, they have not publicly shown any security benchmark results, only performance related ones. There is little information about the effectiveness of the security controls OpenAI has put in place.

At Noma we have taken a deeper look into this new AI agent capability within ChatGPT to help organizations better understand the potential risk exposure of adopting this new capability. In summary, the ChatGPT agent is potentially exposed to prompt injections and malicious instructions which can leak data provided through connectors or authenticated browsing sessions, and potentially perform destructive actions across Google Drive, Gmail, GitHub, and other connected platforms.

Detailed ChatGPT Agent Risk Scenarios:

Indirect prompt injection/data poisoning (LLM01): Research shows that content consumed by AI can alter its behavior. Malicious websites processed by the agent may contain hidden instructions that redirect the AI’s actions, potentially resulting in sensitive data leakage or destructive actions on connected platforms.
Excessive Autonomy (LLM06): While OpenAI asserts that ChatGPT agents are trained to prevent destructive actions without explicit user approval, there is limited publicly available information validating the effectiveness of these safety measures. Because the agent operates with identical access privileges as the user it represents, enforcing differentiated security controls between the AI agent and human users becomes exceedingly challenging. Failures in these safeguards could lead to unauthorized, irreversible actions executed by the AI, potentially resulting in significant harm.
AI Hallucination Risk: AI-generated hallucinations occur when the agent incorrectly interprets instructions, fabricates information, or pursues unintended tasks. Even in such scenarios, the AI continues to operate using the permissions and credentials granted by the user. These hallucinations could unintentionally lead the AI to perform actions that differ significantly from the intended task, potentially causing unauthorized actions or compromising data across connected systems..
Lack of auditability: Since agents operate using user credentials, it becomes nearly impossible to differentiate between human and AI actions in system logs, compromising forensic capabilities and compliance tracking.

Recommendations for Security Teams

OpenAI has enabled its AI to perform significant, irreversible actions on behalf of the user and if the guardrails fail, the impact could be devastating. We believe blocking AI will only drive shadow, ungoverned AI adoption within your organization, creating greater security risks and compliance gaps. Therefore, security teams should proactively establish governance frameworks that enable controlled deployment of ChatGPT’s agentic capabilities while implementing robust monitoring, access controls, and data protection measures.

Govern allowed connectors in the organization: Workspace admins should control which connectors can be used organizationally, noting that most connectors cannot be configured with least-privilege access. For example, Google Drive’s minimal permissions include edit and delete rights, while GitHub connectors require “Act on your behalf” permissions.
Monitor usage using the ChatGPT Compliance API: Enterprise users should leverage ChatGPT’s Compliance API to monitor and respond when agents are exposed to sensitive data or malicious instructions.
Credential Management: Unless specifically removed, OpenAI stores all cookies, including ones that can be used to automatically sign you in. This means that future interactions will be able to use previously logged-in websites, even if it is not related to the task at hand. Make sure your employees are mindful to their logged-in sessions and log out of risky platforms such as bank, engineering’s production environment.
Active Supervision: Educate your employees to tightly examine what the AI is doing using “Watch Mode” and by inspecting its activity, and intervene before any unauthorized action is taken.
Set custom instructions: Use ChatGPT’s custom instructions capability to align the ChatGPT Agents with your desired restrictions. For example, add “Avoid editing or deleting any data that is provided by the connectors” or “Never use Web Browsing capabilities with internal data”. Note: These instructions are user-managed, not centrally applied or enforced across the workspace meaning education and internal employee processes must be updated.

What does secure Agentic AI look like in the future?

This ChatGPT update represents just the first step toward a future where autonomous AI agents become deeply embedded in organizational workflows, evolving from simple task executors to sophisticated decision makers that operate with increasing independence and decision making power with little human oversight. As these systems mature, we anticipate a fundamental shift toward truly autonomous agents that can spawn sub-agents, modify their own objectives, and make complex multi-step decisions across interconnected business systems creating scenarios where organizations may find themselves running AI agents they cannot fully control or predict.

Organizations must prepare for a future where their most valuable AI agents operate in a black box of autonomous decision-making, requiring entirely new frameworks for governance, accountability, and risk management that can maintain oversight without stifling the very autonomy that makes these systems valuable. The key is providing secure pathways for legitimate business use while maintaining visibility and control over AI agent activities across your enterprise.

By taking proactive steps today, organizations can build a foundation for secure, effective use of agentic AI that drives innovation while protecting critical assets and data. To learn more about how Noma Security can help you implement security for ChatGPT agent and other agentic AI risk, and help your organization embrace this new technology with confidence, schedule a demo for you and your team.