I’ve had a number of people ask me “Why did you agree to co-chair the Cloud Security Alliance (CSA) Autonomous Action Runtime Management (AARM) initiative?” Well, partly, because I was asked! Who would pass on the opportunity to collaborate with Herman Errico, the author of AARM, as well as Akul Loomba and Chris Hughes?
But of course, the real answer is a lot more nuanced than that.
There’s no shortage of AI security frameworks, governance models, standards efforts, and working groups right now. Sometimes it feels like every week brings another benchmark, another AI security startup, another safety paper, another acronym. People are trying to sort through which efforts are durable, which are premature, and which are actually addressing the operational security problems we’re going to face over the next several years.
For me, this one felt important for a very specific reason.
The Runtime Problem
The more time I spend threat modeling agentic AI systems, the more convinced I become that runtime governance is going to be one of the defining cybersecurity challenges of this next era. And I believe we need open, interoperable approaches to that problem before the market fragments into disconnected proprietary architectures that create years of unnecessary operational complexity for all of us.
We’ve seen this pattern before.
OAuth and OpenID Connect transformed identity federation. MITRE ATT&CK gave defenders a shared framework for adversary behaviors and techniques. STIX/TAXII improved threat intelligence sharing. Open Policy Agent helped normalize policy-as-code thinking. CSA itself played an important role in establishing a common language around cloud governance and trust boundaries with CAIQ and STAR during the early evolution of cloud computing.
None of those efforts “solved” security. But they all helped the industry mature faster than it would have under proprietary fragmentation. That kind of maturation matters right now because we are still very early in the evolution of agentic systems.
Upstream Controls Are Necessary. They’re Not Sufficient
A lot of today’s AI security conversation remains focused upstream, on the systems and data before anything reaches the LLM itself. We’re working through prompt injection defenses, model alignment, retrieval security, safety tuning, orchestration hardening, memory protections, and guardrails. Those controls absolutely matter, and I don’t want anything here interpreted as dismissing that work. We genuinely need all of it.
And we need those controls to improve dramatically. If your retrieval pipeline is poisoned, your memory layer is corrupted, your orchestration is insecure, or your identity model is weak, runtime controls alone may not save you. Upstream security still matters because agents can only make decisions based on the context and tooling they’re given. Runtime governance isn’t a replacement for secure model development, orchestration security, identity controls, or data governance. It’s an additional control layer focused on what happens at the point of action, where security consequences become operational reality.
But upstream controls aren’t enough. Large language models are probabilistic systems. Once information enters the context window, trusted and untrusted inputs all influence the same reasoning process. The model cannot, architecturally, reliably preserve trust boundaries between those inputs the way traditional deterministic systems preserve execution boundaries.
One important question is what the model generated, and why. The higher-consequence question is what actions the system took as a result.
Why AARM Resonated
Herman Errico understood the problem, and instead of simply talking about it, he dove headfirst into creating AARM, an open system specification for securing AI-driven actions at runtime. It specifies a set of runtime governance capabilities: intercept agent actions before execution, accumulate session context, evaluate actions against policy and intent alignment, enforce authorization decisions, and produce tamper-evident audit receipts. It’s a specification, not a product. It describes the components, behaviors, and conformance requirements. It doesn’t tell you how to build it.
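To make that loop concrete, here’s a minimal sketch of the flow the spec describes: intercept, accumulate context, evaluate, enforce, and emit a receipt. Every name in it (ToolCall, Session, govern) is my own illustration, not an identifier from the AARM spec.

```python
# Illustrative sketch of the runtime governance loop AARM describes.
# Names and shapes are hypothetical; the spec defines capabilities and
# conformance requirements, not this particular API.
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    agent_id: str
    tool: str      # e.g. "crm.read_customer", "email.send"
    params: dict

@dataclass
class Session:
    actions: list = field(default_factory=list)  # accumulated session context

def govern(call: ToolCall, session: Session, policy, approvals, receipts) -> bool:
    """Intercept an agent action before execution and decide its fate."""
    session.actions.append(call)               # 1. accumulate session context
    decision = policy.evaluate(call, session)  # 2. evaluate policy + intent alignment
    if decision == "escalate":
        decision = approvals.request(call)     # 3. human-in-the-loop approval
    receipts.record(call, decision)            # 4. tamper-evident audit receipt
    return decision == "allow"                 # 5. enforce before the tool runs
```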
AARM draws the security boundary at the action boundary: the moment an agent tries to do something in the real world.
Reading customer data may be allowed. Sending email may be allowed. Accessing internal systems may be allowed. But combining those actions in the wrong context, under manipulated intent, or across dynamically chained workflows can create entirely different risk outcomes: exfiltration, privilege misuse, fraud, unintended operational impact. Traditional enterprise authorization systems were not designed for this kind of dynamic, contextual, machine-speed decision-making across autonomous agent workflows.
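A toy example makes the gap visible: each action below passes a static allowlist on its own, and only a session-aware check sees the exfiltration pattern. The tool names and domain rule are hypothetical.

```python
# Each action is individually allowed, but reading sensitive data and then
# emailing an external domain in the same session is a classic exfiltration
# pattern. Tool names and the domain check are illustrative.
ALLOWED_TOOLS = {"crm.read_customer", "email.send"}

def risky_sequence(session_actions: list[dict]) -> bool:
    read_sensitive = any(a["tool"] == "crm.read_customer" for a in session_actions)
    external_send = any(
        a["tool"] == "email.send"
        and not a["params"]["to"].endswith("@example.com")  # outside our domain
        for a in session_actions
    )
    return read_sensitive and external_send

session = [
    {"tool": "crm.read_customer", "params": {"id": "42"}},          # fine alone
    {"tool": "email.send", "params": {"to": "attacker@evil.test"}}, # fine alone
]
assert all(a["tool"] in ALLOWED_TOOLS for a in session)  # static check passes
assert risky_sequence(session)  # contextual check catches the combination
```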
Most existing authorization models assume more deterministic software behavior than what we’re seeing in agentic systems. Agents chain actions together, invoke tools dynamically, delegate to sub-agents, adapt behavior mid-session, and continue acting under manipulated or ambiguous intent while staying within formally authorized permission boundaries. AARM’s threat model explicitly covers the real operational risks we’re going to be managing: prompt injection, confused deputy attacks, data exfiltration, and intent drift.
That’s why runtime governance matters so much: it’s where real-world impact occurs.
Where We Are Now
One of the things I appreciate most about this initiative is that it acknowledges how early we still are. The ecosystem is evolving fast. None of us have all the answers yet. The responsible move at this stage is not pretending the architecture is settled. It’s creating an open framework the community can shape together as the technology matures, operational patterns emerge, and we learn more about how these systems behave at scale.
That openness mattered to me. A lot.
I’ve spent enough years in cybersecurity to know what happens when foundational control layers fragment too early. Customers carry the burden for years: incompatible policy models, duplicated integrations, and limited interoperability that becomes increasingly difficult and expensive to operationalize coherently across environments. I’ve seen that film before, and I didn’t like the ending.
We still have an opportunity to avoid at least some of that here. Vanta’s decision to contribute AARM to the CSAI Foundation under vendor-neutral CSA governance is the kind of move that gives an open specification a real chance to become durable. The working group now includes security leaders from Elastic, Darktrace, Truist, Gusto, Ballistic Ventures, and IEEE, alongside practitioners, enterprise operators, and agent framework developers. That diversity of perspective is exactly what this space needs because none of these problems exist purely inside one discipline.
What the Spec Defines
It’s worth pausing on what AARM concretely is, because “open specification” can mean a lot of things.
AARM defines six system components that work together to secure agent actions at runtime. An action mediation layer intercepts tool invocations before they execute and normalizes them to a canonical schema. A context accumulator maintains session state so that policy decisions account for what the agent has already done, not just what it’s attempting now.
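Normalization is what makes the rest tractable: different agent frameworks emit tool calls in different shapes, and the mediation layer maps them into one canonical form before policy ever sees them. A rough sketch of the idea, where the field names and input shapes are my simplifications, not the spec’s schema:

```python
# Hypothetical normalization from framework-specific tool-call shapes to a
# single canonical action; the actual AARM schema will differ.
from dataclasses import dataclass

@dataclass(frozen=True)
class CanonicalAction:
    agent_id: str
    tool: str
    params: dict
    session_id: str

def from_function_call_style(call: dict, agent_id: str, session_id: str) -> CanonicalAction:
    # Simplified shape: {"function": {"name": "email.send", "arguments": {...}}}
    fn = call["function"]
    return CanonicalAction(agent_id, fn["name"], fn["arguments"], session_id)

def from_mcp_style(call: dict, agent_id: str, session_id: str) -> CanonicalAction:
    # Simplified shape: {"name": "email.send", "arguments": {...}}
    return CanonicalAction(agent_id, call["name"], call["arguments"], session_id)
```

Once every framework’s calls land in the same shape, a single policy engine and a single audit trail can cover all of them.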
A policy engine evaluates each action against organizational rules covering identity, parameters, and accumulated context, then enforces a decision: allow, deny, modify, escalate to human approval, or defer. That last one matters. Deferral means the system temporarily suspends execution when available context is insufficient or ambiguous, rather than forcing a binary allow-or-deny decision under uncertainty. For autonomous systems operating at machine speed, that’s a meaningful addition to the security model. An approval service handles the human-in-the-loop step when actions are escalated, routing high-risk decisions to the right people before execution proceeds. A receipt generator produces tamper-evident audit records binding the action, the decision, the identity chain, and the outcome. And a telemetry exporter feeds those events into your SIEM/SOAR.
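Concretely, that five-outcome decision model might look something like the toy below. Everything in it (the Decision enum, the classification lookup) is an assumption for illustration, not conformance language from the spec:

```python
# The five decision outcomes described above; "defer" suspends execution
# rather than forcing a binary call under uncertainty. Illustrative only.
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    DENY = "deny"
    MODIFY = "modify"      # e.g. redact a parameter, then proceed
    ESCALATE = "escalate"  # route to human approval before execution
    DEFER = "defer"        # suspend: context is too ambiguous to decide

def evaluate(action: dict, data_labels: dict) -> Decision:
    """Toy policy: the ruling depends on how the touched data is classified."""
    label = data_labels.get(action.get("resource"))
    if label is None:
        return Decision.DEFER      # unclassified data: hold rather than guess
    if label == "restricted":
        return Decision.ESCALATE   # a human approves high-risk access
    return Decision.ALLOW
```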
If that sounds like defense in depth applied to the action boundary, that’s because it is. The architecture maps to operational controls security teams already understand: mediation is inline enforcement, context accumulation is session-aware authorization, policy evaluation is contextual access control, approval workflows are step-up authentication for sensitive operations, and signed receipts support auditability and investigation.
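On the receipts point: the spec requires tamper evidence but, being a specification rather than a product, doesn’t prescribe the mechanism. One common construction is an HMAC-signed, hash-chained record, sketched here purely as an illustration:

```python
# One possible construction: each receipt is HMAC-signed and chained to the
# previous receipt's signature, so edits or deletions break the chain.
import hashlib
import hmac
import json

def make_receipt(prev_sig: str, action: dict, decision: str, key: bytes) -> dict:
    body = {"prev": prev_sig, "action": action, "decision": decision}
    payload = json.dumps(body, sort_keys=True).encode()
    body["sig"] = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return body

def verify_chain(receipts: list[dict], key: bytes) -> bool:
    prev = "genesis"
    for r in receipts:
        body = {k: r[k] for k in ("prev", "action", "decision")}
        payload = json.dumps(body, sort_keys=True).encode()
        expected = hmac.new(key, payload, hashlib.sha256).hexdigest()
        if r["prev"] != prev or not hmac.compare_digest(r["sig"], expected):
            return False  # the record was altered, reordered, or removed
        prev = r["sig"]
    return True
```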
The spec draws the trust boundary at the right place: not the orchestration layer, which is software you can harden and secure like any other application code, but the context window itself, where trusted and untrusted inputs become indistinguishable tokens in a shared reasoning process. Prompt injection is an architectural property of current LLM systems, not a bug to be patched. Runtime enforcement has to sit outside that boundary.
AARM’s threat model covers eleven categories of runtime risk we’re going to be managing as agentic deployments scale, including confused deputy attacks, data exfiltration, goal hijacking, intent drift, memory poisoning, and cross-agent propagation.
The spec also defines four implementation architectures based on what you actually control in your environment: network (protocol gateway), code (SDK instrumentation), host (kernel-level enforcement), or policy only (vendor integration hooks for SaaS agents you don’t control). Each has different trade-offs in bypass resistance, context richness, and enforcement guarantees, and they can be layered together. That matters because enterprise environments aren’t uniform. Your security posture for a self-hosted agent framework is going to look very different from your posture for a third-party SaaS agent, and the spec accounts for that.
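The trade-offs are easier to see side by side. The ratings and prerequisites below are my own reading of those four architectures, not language from the spec:

```python
# Rough characterization of the four deployment architectures, plus a toy
# rule for layering them. All ratings are my interpretation, not the spec's.
ARCHITECTURES = {
    "network":     {"bypass_resistance": "high",   "context": "medium"},
    "code":        {"bypass_resistance": "medium", "context": "high"},
    "host":        {"bypass_resistance": "high",   "context": "low"},
    "policy_only": {"bypass_resistance": "low",    "context": "vendor-dependent"},
}

# What each architecture requires you to control in your environment.
PREREQS = {
    "network": "protocol_path",    # you sit on the wire (protocol gateway)
    "code": "source",              # you can instrument the agent (SDK)
    "host": "host",                # you own the runtime host (kernel-level)
    "policy_only": "vendor_hooks", # the SaaS vendor exposes integration hooks
}

def viable_layers(controls: set[str]) -> list[str]:
    """Pick every architecture whose prerequisite you actually control."""
    return [arch for arch, need in PREREQS.items() if need in controls]

# Self-hosted agent framework: you control nearly everything, so layer freely.
print(viable_layers({"protocol_path", "source", "host"}))  # network, code, host
# Third-party SaaS agent: policy-only hooks may be all you get.
print(viable_layers({"vendor_hooks"}))                     # policy_only
```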
Before Fragmentation Sets In
Most security leaders I talk with are trying to enable AI adoption without introducing excessive business risk. Without a shared standard for runtime governance, every vendor defines what “secured at runtime” means on their own terms. And customers have no common baseline to evaluate against.
Think about how long the industry spent normalizing security telemetry across platforms before OCSF. Every SIEM integration was a custom connector. Every vendor had its own schema. Security teams spent more time on data plumbing than on actual detection and response. Runtime governance for agentic AI could be heading toward the same black hole right now, and the longer it takes to establish a shared baseline, the harder it becomes to retrofit one.
An open specification establishes a shared definition of what runtime security for agents actually requires before the market fills the void with a dozen proprietary answers that aren’t interoperable, auditable against a common baseline, or portable across agent frameworks. It means conformance is measurable against real requirements, not marketing language.
AARM is under vendor-neutral governance, so no single company controls the roadmap. It’s attracting contributors from across the ecosystem: security vendors, enterprise operators, and agent framework builders. And it treats the action boundary as the critical security boundary, which I believe is architecturally correct for agentic systems.
Runtime action governance is going to become one of the foundational control layers for agentic AI. The window to shape it together is open. It won’t stay open.


