A quiet shift is occurring in enterprise security that most non-technical leadership teams have yet to quantify. It usually starts with a prompt or instruction from a regular, well-meaning employee to a local agent. However, when standard employees interact with autonomous agents, they frequently provide incomplete or context-deficient instructions. Deprived of guardrails and clarity, autonomous agents often misinterpret commands, resulting not only in standard LLM hallucinations but also in unpredictable, destructive system behaviors.

Recent destructive incidents involving tools like Claude Code and other agentic frameworks reveal an additional specific type of destructive risk: the Destructive Loop. Unlike a human making a manual error, an autonomous agent can interpret a failed command as a prompt to fix the environment by deleting it.

Terraform Destroy Incident

Logic is one of the double-edged swords of AI. Alexey Grigorev recently shared a perfect example on Substack. Alexey’s Claude agent interpreted an “unacceptable state” as a clear mandate for destruction. Instead of flagging the issue for its human operator or waiting for additional instructions, the agent’s internal logic prioritized a clean state over a running one and hit delete. This incident exposes the fundamental gap in current agentic flows: they are excellent at solving problems, but terrible at understanding the cost of the solution. 

 

Credit: Alexey Grigorev

Common Agentic Scope Confusion Sequences:

  1. Blind trust: Agents are overpermissioned (in this case, with destructive terraform commands) and don’t require human-in-the-loop approval or confirmation.  
  2. Context confusion: Agents cannot distinguish between reference context and live systems. In this case, it opened an old backup as a reference, while the agent treated it as working material.  
  3. Scope creep: The agent starts with a safe plan, but abandons it at some point without user verification or validation. In this scenario, it falsely reported its progress as “deleting duplicates” while it wiped production. 
  4. Misuse of user intent or approval: Agents use a “Yes” given in a safe situation and continue applying it after the situation, and potential results have completely changed. The commands and the approval are the same, but the impact is entirely different. 
  5. No safety net: No deletion protection, safeguard window, and no independently stored backups. When agents can wipe production, having no independently stored or managed backup is an even more dangerous working scenario. 

“The agent deleted the production database because it acted on its internal logic to reach a clean state. It bypassed the cautious hesitation a human engineer would have felt and acted on.” As Alexey reported in his Substack post, his agent nuked his production database and justified the actions with: “Terraform believed nothing existed… Your production database is GONE.”

Emergent Destructive Agent Patterns

Reports surfacing on social media suggest these are not isolated events and are now occurring more frequently. The speed of autonomous AI deployment is far outstripping the implementation of the effective guardrails needed to protect those deployments.

Some recent examples:

 

Credit: X user lifeof_jer

 

  • Rule Bypass: A series of developer reports on GitHub are reporting that coding agents fail to follow established processes or rules, despite repeated corrections.

 

Credit: Github user jondavidreeves

 

  • API Exhaustion: Recent user posts on Reddit and GitHub highlight agentic loops in which local agents, left running, made thousands of unauthorized API calls, costing hundreds to tens of thousands of dollars because they got stuck in ping-pong-type loops.

 

Credit: Reddit user multiplicitor r/SideProject

The Sentiment Online:

  • Blaming an AI model for messing up and deleting data like this would be the same as blaming your six-year-old nephew with “PhD level “intelligence for deleting your data. ” — LinkedIn, Justin Woodring
  • “The “confession” the agent wrote after the fact is the perfect illustration of why prompts aren’t safety mechanisms. It knew it violated the rules – in hindsight. The prompt told it not to do destructive things. It did them anyway, and then it could perfectly articulate why that was wrong. LLMs can recognize violations after they happen. That doesn’t prevent them from happening.” — Tahpot r/ArtificialIntelligence
  • “The agent didn’t just fail safety. It explained, in writing, exactly which safety rules it ignored.” — Jeremy Crane, PocketOS Founder

Avoid an Agentic Doomsday

To prevent a “YOLO” deployment from becoming a business-ending event, security teams must quickly move beyond traditional SaaS monitoring and security tooling.

  1. Map the Agent Blast Radius: Use discovery tools to identify all agents running on local endpoints. You cannot protect what you cannot see.
  2. Proactively Red-Team: Static security checks fail against dynamic agents. Use adversarial agents to probe your own systems. Proactive Red Teaming involves simulating goal-hijacking and “Developer Mode” bypasses to validate whether bad actors can trick your models into executing unauthorized transactions or exfiltrating data.  
  3. Tiered Write Permissions: Treat destructive operations (DELETE, DROP, DESTROY) as a separate tier. These should require an out-of-band human approval that the agent cannot self-authorize.
  4. Decouple Backups: Ensure backups are stored in a location that is both physically and logically independent of the production environment the agent has access to or manages. 
  5. Re-Validate at Plan Boundaries: When the agent changes method (CLI → Terraform, scoped → bulk, read → write), it must stop and re-request approval. A “yes” to CLI deletions should never carry over to Terraform destroy. 
  6. Enforce Runtime Guardrails: Pre-flight checks can’t see what an action actually does once it runs. Each operation needs inspection at execution time, including what it touches, how many resources it uses, which environment it runs in, and whether it matches the approved plan. You also need the ability to halt or escalate when the actual impact diverges from the intended one. The same command can be routine one moment, but catastrophic the next; only runtime inspection can catch the difference. 

Secure Your Agentic Future with Noma

The era of autonomous agents requires a new security standard. Noma Security provides the industry’s first comprehensive platform built specifically to govern Agentic Blast Radius and offers the industry’s leading Agentic Discovery & Shadow IT Detection, Advanced AI Red Teaming, and AI-DR Runtime Protection solutions. 

5 min read

Category:

Table of Contents

Share this: