Beyond Authorization: Intent-Aware Detection for AI Agents

Identity tells us an agent is allowed to act; intent tells us why it is acting. In an agentic world, only one of those questions actually predicts whether your environment is about to break.

The Question Identity Was Never Designed to Answer

For three decades, enterprise security has been built around a deceptively simple question: is this principal authorized to perform this action? Whether the label is Identity and Access Management (IAM), role-based access control, least privilege, or zero trust, every modern security architecture eventually distills into that same yes-or-no decision. It is a remarkable system when the principal is a human filling out a timecard or a service account running a nightly batch job. Their behavior is bounded, their goals are predictable, and the worst they can do is what their entitlements permit.

Agentic AI breaks that comfortable assumption, which is precisely why the industry is wrestling with how to secure agents. An agent with read access to a CRM and the ability to send email is, on paper, doing nothing more than what its entitlements allow. But the same set of permissions can support an analyst summarizing pipeline health, an attacker exfiltrating customer records, or a confused model emailing a contract to the wrong recipient. Authorization is identical across all three; intent is not.

This is the gap that pushed Gartner to name Zenity the Company to Beat in AI Agent Governance in its 2026 AI Vendor Race report, citing Zenity’s “purpose-built agentic-centric architecture” and “intent-aware detection” as differentiators. The recognition matters less for what it says about any single vendor and more for what it signals about where the category is heading, which is away from coarse permissioning and toward continuous reasoning about agent purpose.

Why IAM-Era Thinking Doesn’t Map to Agents

IAM was designed for principals whose behavior could be modeled in advance. Humans authenticate, accept terms, and act within a job description; service accounts execute a fixed script. Agents do neither. They reason, plan, chain tools, accept context from other agents, and improvise paths to goals that may not even have been defined when the entitlement was granted.

I made a related point in “The Agentic AI Governance Blind Spot,” arguing that the most-cited governance frameworks such as NIST AI RMF, the EU AI Act, and ISO 42001 still treat AI as a model rather than as an actor. The same blind spot exists one layer down in technical controls. Identity governance tells you a Salesforce-connected agent could pull a contact list. It does not tell you whether the agent is pulling that list to draft a quarterly business review or because an indirect prompt injection from a calendar invite told it to forward the data to an external address.

Permissions are necessary but profoundly insufficient. The unsettling implication is that almost every successful agent attack we have seen so far, such as the Cursor incident, the Microsoft Copilot prompt-injection demonstrations, and the various MCP-based escapes, was carried out by an agent that was perfectly within its rights. The system did exactly what it was told; the problem was who, or what, did the telling, and toward what end.

What “Intent Analysis” Actually Means for Agents

Intent, in the agentic AI context, is the operative purpose behind an action. It is not the same as a prompt, a system message, or an authorization grant. It is the inferred answer to a different question: what is this agent ultimately trying to accomplish, and is that aligned with the goal the user or business sanctioned?

That question has two useful halves. The first is input intent: when a user or another agent sends a message, what are they actually asking for? “Summarize this customer email” and “extract all financial figures from any document this agent can see” may produce similar tool calls but represent radically different intentions. The second is execution intent: when an agent invokes tools, traverses memory, and chains operations, what objective do those actions cumulatively serve? Execution intent is what allows us to notice when an agent that was asked to draft an email starts touching the HR system three steps later.

This framing is consistent with how the broader research community is starting to model the problem. A useful primer is the Substack essay “How to evaluate intent without using LLMs recursively,” which surveys the leading technical approaches and which I found helpful on the topic.

It is worth reading not because any one of those approaches is sufficient on its own (we will see in a moment that several have important production limitations), but because it captures the shift in how practitioners now think about agent risk: from filtering inputs to reasoning about purpose, which is a much more complex challenge to solve.

The Research Landscape: Five Ways to Evaluate Intent

If you survey the literature, vendor blogs, and academic publications from the past year, as I’ve been doing, you see roughly five families of approaches to inferring intent. Each has merit, but each also has a limit that becomes obvious only once you try to run it in production.

I’m far from an expert on every one of these methods, but it has been helpful to learn about each of them and the role they can play in the discussion around intent as it relates to securing agents. That said, we have folks on the team with deep expertise in this area who are more than happy to chat, such as Raz Tel-Vered, Keren Katz and Jenny Abramov, who I enjoyed diving into this topic with.

Taxonomies and threat models like MITRE’s ATLAS and OWASP’s Agentic Top 10 are the natural starting point. As a Distinguished Reviewer on the OWASP Top 10 for Agentic Applications and a contributor to the Coalition for Secure AI (CoSAI), I lean on these heavily. They give organizations a shared vocabulary for talking about goal hijack, tool misuse, and memory poisoning. But they are intentionally framework-level rather than prescriptive. Teams running detection in production quickly find them too coarse to drive a classifier. Zenity’s detection team, for example, found that public taxonomies were partial and not specific enough to power their models, and ended up developing a purpose-built taxonomy derived from real attack telemetry rather than from open-source datasets alone.

Encoder-only classifiers and fine-tuned small language models (SLMs) are the workhorses of production intent detection. Instead of asking a large general-purpose model “is this malicious?” on every request, you train a smaller, narrower model on a carefully curated dataset of attack patterns and hard negatives (benign inputs that look adversarial). Done well, these models outperform open-source detectors and frontier LLMs on real-world traffic because they are tuned to actual attacker behavior rather than to academic benchmarks. One nuance is that the model is only as good as the taxonomy and training data behind it, which is why the work to define real attack categories matters as much as the modeling itself.

LLM internals classification detects malicious intent by reading a model's internal state rather than analyzing text inputs and outputs. We pass the input through a small LLM and read the activations it produces internally, which lets us detect maliciousness directly, since the model internally "knows" when an input is trying to manipulate it, extract secrets, or make a harmful request, even when it wouldn't say so in its response.

Finite state machines and policy graphs are conceptually elegant: they model every legitimate path an agent can take, then alert when execution leaves the graph. In practice, FSM-style approaches buckle under production-scale usage and the combinatorial explosion of multi-agent, multi-tool environments. They remain a useful design pattern for narrow, high-assurance scenarios, but they rarely serve as the primary detection mechanism. Enumerating paths is impractical in large, complex enterprise multi-agent architectures where the number of potential agent paths grows exponentially, and agents select among those paths stochastically rather than deterministically.
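To make the policy-graph idea concrete, here is a minimal sketch, not a production design. The agent, the tool names, and the `ALLOWED_TRANSITIONS` table are all hypothetical and invented for illustration. It treats the last tool called as the current state and reports the first call that leaves the graph.

```python
from typing import Dict, List, Optional, Set

# Hypothetical policy graph for a refund-processing agent. Each key is a state
# (the last tool called) and each value is the set of tools allowed next.
ALLOWED_TRANSITIONS: Dict[str, Set[str]] = {
    "START": {"read_ticket"},
    "read_ticket": {"lookup_order", "reply_to_customer"},
    "lookup_order": {"issue_refund", "reply_to_customer"},
    "issue_refund": {"reply_to_customer"},
    "reply_to_customer": set(),
}


def first_violation(tool_calls: List[str]) -> Optional[int]:
    """Return the index of the first tool call that leaves the graph, or None."""
    state = "START"
    for index, tool in enumerate(tool_calls):
        # Unknown states have no allowed successors, so any call from them fails.
        if tool not in ALLOWED_TRANSITIONS.get(state, set()):
            return index
        state = tool
    return None
```

Calling `first_violation(["read_ticket", "query_hr_system"])` flags index 1, because `query_hr_system` is not a permitted successor of `read_ticket`. The sketch also shows the scaling problem: every new tool or collaborating agent adds rows and edges that someone has to enumerate by hand.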

Sandboxing and behavioral profiling, meaning running agents through dry runs in isolated environments to build behavioral baselines, has more value in pre-production evaluation than in continuous runtime monitoring. Zenity does this for the agents we build ourselves as well. It is excellent for red-teaming and acceptance testing, but the cost and latency of full sandboxed replay rarely fit a production budget, and it is also easy to see how shadow usage and shadow deployments, which bypass the sandbox entirely, would often undermine the value of this technique. That said, sandboxing as a general security primitive has become a core part of the conversation around securing agents, with both open source and commercial offerings in the market and with the largest players, such as Anthropic and NVIDIA, championing it.

Anomaly detection on tool-call patterns is the most promising of the runtime techniques and the closest cousin to AI-native UEBA. The idea is to learn what “normal” looks like for a given agent (its tool sequences, data access patterns, and memory updates) and surface deviations. POCs in scoped scenarios work well, but scaling that approach to every agent in an enterprise without drowning analysts in false positives is the ongoing challenge, one that reminds me of earlier areas of cyber such as EDR.
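As a rough illustration of baselining tool sequences, the sketch below counts tool-to-tool transitions (bigrams) across historical sessions and scores a new session by the share of transitions never seen before. This is a deliberately simple stand-in for whatever statistical or learned model a real system would use, and the function names and tool names are assumptions made up for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Bigram = Tuple[str, str]


def bigrams(tool_calls: List[str]) -> List[Bigram]:
    """Pair each tool call with the call that followed it."""
    return list(zip(tool_calls, tool_calls[1:]))


def build_baseline(sessions: Iterable[List[str]]) -> Counter:
    """Count how often each transition appears in known-good sessions."""
    baseline: Counter = Counter()
    for session in sessions:
        baseline.update(bigrams(session))
    return baseline


def novelty_score(baseline: Counter, session: List[str]) -> float:
    """Fraction of transitions in the session that the baseline never saw."""
    transitions = bigrams(session)
    if not transitions:
        return 0.0
    # Counter returns 0 for missing keys, so unseen transitions need no special case.
    unseen = sum(1 for pair in transitions if baseline[pair] == 0)
    return unseen / len(transitions)
```

A session that follows familiar paths scores 0.0, while one that detours through an unfamiliar tool scores higher. Where to set the alerting threshold is exactly the false-positive calibration problem described above.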

The takeaway from the research is not that one approach “wins.” It is that intent must be observed from multiple angles (input, execution, baseline, and taxonomy) and reconciled in real time. Vendors that pretend otherwise are usually overselling a single technique. Given how early we are in the era of agentic AI, it’s much more helpful to be honest and transparent about the state of the landscape than to pitch silver bullets.

From Theory to Practice: Four Dimensions of Agent Intent

Operationalizing intent in a security platform means decomposing it into things you can actually detect. Drawing on how leading teams, including my team behind Zenity’s AI Detection and Response (AIDR), are thinking about this today, four dimensions stand out.

Malicious intent (adversarial AI): This is the category most people imagine when they hear “AI security,” and it includes vectors such as prompt injection, jailbreaks, weaponized documents, and indirect injection via emails or web pages. The detection job is to recognize that an input, wherever it originated, is attempting to coerce the agent into hostile behavior. Encoder-only classifiers and dual-model pipelines (one tuned for recall, one for low false positives) are the tools of choice here. This is also where threat-specific taxonomies pay off, because “malicious” is not one thing but many. As we know from industry-recognized mental models such as Simon Willison’s “Lethal Trifecta,” all input should be treated as untrusted.

Intent drift: Even without an adversary, agents wander in ways that we as users don’t, and can’t always, anticipate. A request to “summarize Q3 revenue” can fan out into tool calls that touch payroll, customer health scores, and a partner portal. Some of that fan-out is legitimate (agents need flexibility) and some is dangerous. Intent drift detection compares input intent against the actual chain of executions and flags meaningful divergence. The hard part is calibrating “meaningful” so the signal is useful rather than noisy.
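One simplified way to picture that comparison is sketched below, under loud assumptions: the intent labels, the `EXPECTED_SCOPES` mapping, the `scope` attached to each tool call, and the 0.3 threshold are all hypothetical, and a real system would infer them rather than hard-code them. The sketch measures what share of a run's tool calls fall outside the resource scopes the input intent plausibly needs.

```python
from dataclasses import dataclass
from typing import Dict, List, Set

# Hypothetical mapping from an input-intent label to the resource scopes
# that intent plausibly needs. Real deployments would learn or curate this.
EXPECTED_SCOPES: Dict[str, Set[str]] = {
    "summarize_revenue": {"finance", "crm"},
    "draft_email": {"email", "crm"},
}


@dataclass
class ToolCall:
    tool: str
    scope: str  # resource domain the call touched, e.g. "finance" or "hr"


def drift_ratio(input_intent: str, calls: List[ToolCall]) -> float:
    """Share of tool calls that fall outside the scopes expected for the intent."""
    if not calls:
        return 0.0
    expected = EXPECTED_SCOPES.get(input_intent, set())
    outside = [call for call in calls if call.scope not in expected]
    return len(outside) / len(calls)


def is_meaningful_drift(input_intent: str, calls: List[ToolCall],
                        threshold: float = 0.3) -> bool:
    """Flag a run when the out-of-scope share exceeds an (assumed) threshold."""
    return drift_ratio(input_intent, calls) > threshold
```

The `threshold` parameter is where the calibration problem lives: set it too low and legitimate fan-out becomes noise, too high and the payroll detour goes unnoticed.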

Goal hijack: Every agent has, or should have, a sanctioned objective, with examples such as “process refund requests,” “summarize meeting notes,” or “triage support tickets.” Goal hijack is what happens when an attacker convinces an agent that its true purpose is something else entirely. Defending against it starts with the unglamorous work of inventorying every agent in the environment and capturing a single-sentence summary of what each one is supposed to do. From there, runtime detection can compare observed trajectories against the sanctioned goal. That said, as we’ve discussed, reconciling behavior against goals is complex, and requires truly being able to tie the appropriateness of activities back to original goals or objectives.

Input intent classification: Finally, intent is also a useful primitive for identity and data security boundaries. If every prompt could be enriched with a label such as “attempting to access financial data,” “summarizing an email,” or “modifying a configuration,” then policies could be written in terms a CISO actually cares about. For example, “a user from R&D should never trigger an agent action whose intent is to access HR compensation data,” regardless of what entitlements technically permit. This is intent as the bridge between IAM and data security, and it is where the most interesting category-defining work is happening right now. It is easy to see how this approach lets organizational policy move from paperwork no one reads to something enforceable.
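The R&D example above can be sketched as a small deny-rule check. This assumes an upstream classifier has already attached an intent label to the request; the `Request` and `DenyRule` types, the label strings, and the `evaluate` function are hypothetical names invented for illustration, not any product's API.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Request:
    user_department: str
    intent_label: str  # assumed to come from an upstream intent classifier


@dataclass(frozen=True)
class DenyRule:
    department: str
    intent_label: str
    reason: str


# Hypothetical rule set expressing the policy from the text.
RULES: Tuple[DenyRule, ...] = (
    DenyRule(
        department="R&D",
        intent_label="access_hr_compensation_data",
        reason="R&D users may not trigger HR compensation access",
    ),
)


def evaluate(request: Request, rules: Tuple[DenyRule, ...] = RULES) -> str:
    """Return "deny: <reason>" if any rule matches, otherwise "allow"."""
    for rule in rules:
        if (rule.department == request.user_department
                and rule.intent_label == request.intent_label):
            return f"deny: {rule.reason}"
    return "allow"
```

Note that the decision never consults entitlements: a request can be fully authorized at the IAM layer and still be denied because of what it is trying to do.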

Why Intent-Aware Detection Earns the “Company to Beat” Label

Gartner’s methodology for naming a “Company to Beat” looks at technical capabilities, customer implementations, partnerships, and ecosystem influence. Intent-aware detection cuts across all of those because it is fundamentally a different kind of capability than what model-level guardrail vendors offer. Three things in particular tend to separate serious implementations from window-dressing.

First, custom taxonomies grounded in real attack telemetry, not just open-source datasets, as I discussed above. Public datasets skew toward known patterns and miss the long tail of attacks that actually show up in production. Second, paired modeling (a high-recall model with an LLM post-filter for triage, plus a low-false-positive model trained on hard negatives for inline blocking) lets a platform make different trade-offs in different contexts instead of forcing a single threshold. Third, stateful, context-aware analysis across the full interaction chain means that multi-turn attacks, slow data exfiltration, and gradual goal hijack become visible in ways that stateless prompt filters will probably miss.
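The paired-modeling trade-off can be sketched as a routing function over two scores. Everything here is an assumption for illustration: the scores are taken as probabilities in [0, 1] produced by two separately trained models, and the threshold values are placeholders rather than recommended settings.

```python
from enum import Enum


class Action(Enum):
    ALLOW = "allow"
    TRIAGE = "triage"  # queue for asynchronous review, e.g. an LLM post-filter
    BLOCK = "block"    # stop the request inline


def route(precise_score: float, recall_score: float,
          block_threshold: float = 0.9,
          triage_threshold: float = 0.3) -> Action:
    """Combine a low-false-positive model and a high-recall model.

    precise_score: output of the model trained on hard negatives (assumed).
    recall_score: output of the high-recall model (assumed).
    """
    # Only the low-false-positive model is trusted to block inline.
    if precise_score >= block_threshold:
        return Action.BLOCK
    # The high-recall model casts a wide net, so its hits go to triage.
    if recall_score >= triage_threshold:
        return Action.TRIAGE
    return Action.ALLOW
```

Keeping the two thresholds separate is the point of the design: inline blocking can stay conservative while triage stays sensitive, instead of one cutoff having to serve both purposes.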

Combined with active contributions to OWASP’s Agentic Top 10, CoSAI, and the emerging body of MCP security guidance, this is what “intent-aware” looks like as a discipline and capability rather than as a marketing phrase.

What Practitioners Should Do Now

If you are responsible for AI security in your enterprise, three moves will pay outsized dividends in the next 12 months.

Start an agent inventory and capture intended purpose - You cannot detect goal hijack without a sanctioned goal to compare against. A one-sentence purpose statement per agent is more valuable than another spreadsheet of entitlements. This gives you both a record of which agents you have and a statement of what they are supposed to be used for.

Stop equating prompt filtering with AI security - Input scanning is necessary but does nothing for intent drift, goal hijack, or multi-step attacks. Ask vendors how they reason about execution, not just inputs.

Push vendors to be specific - Anyone can claim intent detection. Ask them which approaches they use, where their training labels come from, how they handle false positives in inline blocking, and what their taxonomy actually looks like. The answers will separate the platforms from the pretenders.
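For the first of these moves, a starting inventory can be as small as the sketch below. The `AgentRecord` fields and the `missing_purpose` helper are hypothetical and only illustrate the idea of pairing each agent with a one-sentence sanctioned purpose; a real inventory would likely live in a CMDB or governance platform rather than in code.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class AgentRecord:
    agent_id: str
    owner: str
    purpose: str  # one-sentence sanctioned goal for this agent
    tools: List[str] = field(default_factory=list)


def missing_purpose(inventory: Dict[str, AgentRecord]) -> List[str]:
    """List agents that have no usable purpose statement, sorted by id."""
    return sorted(
        agent_id
        for agent_id, record in inventory.items()
        if not record.purpose.strip()
    )
```

Agents returned by `missing_purpose` are the ones for which goal hijack cannot be detected at all, since there is no sanctioned goal to compare observed behavior against.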

The Authorization Era Is Ending

The IAM control plane will not disappear; IAM and A&A are still foundational. Agents still need identities, entitlements, and audit trails. But the center of gravity of enterprise security is shifting toward a question identity was never designed to answer: not “can this agent do this?” but “why is it doing this, and is that aligned with what we sanctioned?”

Intent-aware detection is how that question becomes operational. It is also why governance frameworks, detection platforms, and the broader market are converging on the same conclusion at the same time. The work is early, the taxonomies are still maturing, and the models will keep getting better, but the direction is unmistakable. Organizations that start treating intent as a first-class security primitive today will be dramatically better positioned when the next class of agent attacks lands, because they will be able to answer not just whether their agents had permission, but whether they had an approved purpose.
