
Key Takeaways
- Single-signal security solutions for agentic AI, whether identity-only, data-only, or model behavior-only, each leave predictable blind spots that sophisticated attacks already exploit.
- Zenity’s five-signal framework assembles identity (NHI), data (DSPM), model behavior, agent posture, and environment signals at runtime to answer whether agent activity was appropriate, not just permitted.
- The PleaseFix vulnerability family, documented by Zenity Labs, illustrates how an attacker can stay entirely within authorized boundaries while conducting a material breach, invisible to any single-signal monitor. Five-signal coverage changes that outcome.
- Step mutation, the ability to intercept and rewrite a specific anomalous action within a multi-step workflow, is a critical capability that blunt block-or-allow architectures can’t provide. It occupies one position in a broader response spectrum that mature programs must operate across.
- Download Beyond Identity: The CISO’s Guide to Securing Agentic AI for the complete breakdown of each signal domain and what threats remain invisible without it.
The security industry hasn’t been wrong about agentic AI risk. It’s been incomplete. There’s no shortage of single-signal solutions for the problem: tools that analyze prompts for malicious content, platforms that monitor data access patterns, capabilities that assess model behavior for signs of manipulation. Each captures something real. None is sufficient on its own.
That insufficiency isn’t a product gap. It’s an architectural one. An agent’s behavior is the product of multiple interacting factors: the identity it’s operating under, the data it’s accessing, the instructions it has received, the tools it’s invoking, and the environment it’s operating in. A change in any of these factors can produce behavior that looks normal through one lens and anomalous through another. Identifying whether an activity is appropriate requires assembling all of these signals together, at runtime.
What Partial Coverage Actually Costs
The argument for assembling all five signals isn’t theoretical. Here’s what happens when coverage is incomplete.
Identity only
An organization with heavy NHI governance investment knows which credentials its agents hold and can detect anomalous credential use in isolation. What it can’t see: an agent that stays entirely within its authorized scope while being manipulated through prompt injection to access data in a pattern inconsistent with its task. The identity signal is clean throughout. The exfiltration is gradual, session-by-session, well within per-session data volume thresholds, and only visible in the data signal accumulated across time. Without that signal, the breach completes.
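To make the pacing problem concrete, here is a minimal sketch of why a per-session threshold misses gradual exfiltration while a running total per agent catches it. The thresholds, the agent name, and the `find_alerts` helper are all hypothetical and chosen only for illustration; this is not a description of how any particular product computes its data signal.

```python
from collections import defaultdict

# Hypothetical thresholds, chosen only for illustration.
PER_SESSION_LIMIT_MB = 100
CUMULATIVE_LIMIT_MB = 500


def find_alerts(sessions):
    """Return (per_session_alerts, cumulative_alerts) as lists of session indexes.

    `sessions` is an ordered list of (agent_id, mb_read) pairs.
    """
    per_session_alerts = []
    cumulative_alerts = []
    totals = defaultdict(int)  # running total of MB read per agent
    for index, (agent_id, mb_read) in enumerate(sessions):
        # Event-level view: each session is judged in isolation.
        if mb_read > PER_SESSION_LIMIT_MB:
            per_session_alerts.append(index)
        # Accumulated view: the same reads are judged against history.
        totals[agent_id] += mb_read
        if totals[agent_id] > CUMULATIVE_LIMIT_MB:
            cumulative_alerts.append(index)
    return per_session_alerts, cumulative_alerts


# Ten sessions of 80 MB each: every session stays under the per-session limit.
sessions = [("hr-agent", 80)] * 10
per_session, cumulative = find_alerts(sessions)
print(per_session)  # no session trips the per-session check
print(cumulative)   # the running total crosses the limit partway through
```

In this toy run the per-session check never fires, while the accumulated total crosses the limit on the seventh session. A real data signal would need time windows and task context rather than a single fixed cap.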
Data only
A DSPM platform observing that sensitive data was accessed tells the security team that something happened. It doesn’t tell them whether it was an agent or a human user, which agent, through which tool invocation, or whether the access was within the agent’s intended scope. Gradual exfiltration paced to stay below volume thresholds is invisible. The attack completes.
Model behavior only
Prompt and output monitoring can detect many injection attempts at the model layer. What it can’t detect: a prompt injection crafted to produce a response that passes content filters while still redirecting tool invocations downstream. A compromised MCP server that manipulates agent behavior through tool responses rather than prompt inputs bypasses model-layer monitoring entirely. Without data and posture signals correlating what the agent actually did after the model responded, the attack succeeds silently.
Agent posture only
Posture management catches misconfigured agents before or between deployments. It doesn’t catch an agent that was correctly configured at scan time and then manipulated at runtime. Without model behavior and data signals operating continuously during execution, a clean posture report provides no protection against runtime attacks.
The Five Signals That Actually Tell You What’s Happening
Zenity’s five-signal framework addresses each coverage gap. Here’s what each domain evaluates at runtime:
Identity (NHI layer)
Which identity layer is active? Is it consistent with the agent’s declared purpose? Has it drifted from expected parameters? An agent operating under an identity that has acquired permissions beyond its configured scope, or that is minting tokens with unusual attributes, warrants investigation regardless of whether downstream activities look normal.
Data (DSPM)
What did the agent actually touch? Did data access patterns match the scope of the assigned task? Was data volume proportionate to task requirements? Did the agent access data it had never accessed in similar contexts? Data access signals are particularly important in scenarios where individual actions are authorized but collectively anomalous.
Model behavior
Is there evidence of prompt injection, jailbreaking, or mid-execution manipulation? This signal captures changes in reasoning patterns, unexpected goal shifts, or outputs inconsistent with the agent’s configuration. Note: while the underlying model is where reasoning occurs, the primary attack surface is almost always the agent’s context, memory, and tool interfaces, not the model itself. Organizations focusing only on model-level protections leave the broader attack surface unaddressed.
Agent posture
Has the agent’s configuration changed? Have tool dependencies been updated without review? Has behavior shifted from established baselines? An agent operating outside a known-good state is in a higher-risk condition regardless of whether its current activity looks normal.
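One simple way to answer "has the configuration changed?" is to fingerprint a known-good configuration and compare it with the current one. The sketch below assumes the agent's configuration can be represented as a JSON-serializable dictionary; the field names, model name, and tool versions are invented for illustration.

```python
import hashlib
import json


def fingerprint(config: dict) -> str:
    """Return a SHA-256 digest of the config that ignores key order."""
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def drifted_keys(baseline: dict, current: dict) -> list:
    """List the top-level keys whose values differ between two configs."""
    keys = set(baseline) | set(current)
    return sorted(k for k in keys if baseline.get(k) != current.get(k))


# Hypothetical known-good state versus the state observed now.
baseline = {"model": "example-model", "tools": {"calendar": "1.2.0", "mail": "3.1.4"}}
current = {"model": "example-model", "tools": {"calendar": "1.3.0", "mail": "3.1.4"}}

if fingerprint(baseline) != fingerprint(current):
    print("posture drift in:", drifted_keys(baseline, current))
```

A check like this only says that something changed and where. Whether the change was reviewed, and whether runtime behavior has shifted from baseline, are separate questions the fingerprint can't answer.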
Environment
Are there infrastructure conditions that change the risk calculus? An agent accessing systems from unexpected locations, or running in an environment that has recently experienced configuration changes, may warrant additional scrutiny.
All Five Assembled: A Different Outcome
With all five domains active and correlated, the same attack sequence that defeats partial coverage produces a materially different outcome. The model behavior signal detects the mid-session reasoning shift caused by prompt injection. The data signal confirms the agent began accessing records outside the scope of the originating task. The identity signal verifies the session token is consistent but flags that the data accessed doesn’t match the expected permission scope for this task type. The posture signal confirms the agent’s configuration was clean at deployment, narrowing the incident hypothesis to manipulation rather than error. The environment signal is unremarkable.
Together: a high-confidence, fully contextualized detection that drives an immediate, well-scoped response.
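The difference between "permitted" and "appropriate" can be sketched in a few lines. The example below is a deliberately simplified illustration, not Zenity's implementation: the `SignalSnapshot` fields, the verdict strings, and the correlation rule are all assumptions made for the sake of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignalSnapshot:
    """One boolean per signal domain; True means nothing unusual was observed."""
    identity_in_scope: bool   # credentials used within authorized scope
    data_matches_task: bool   # data touched fits the originating task
    reasoning_stable: bool    # no mid-session goal or reasoning shift
    posture_clean: bool       # configuration matched the known-good state
    environment_normal: bool  # no unusual infrastructure conditions


def identity_only_verdict(s: SignalSnapshot) -> str:
    """What a single-signal monitor sees: was the action permitted?"""
    return "permitted" if s.identity_in_scope else "alert"


def five_signal_verdict(s: SignalSnapshot) -> str:
    """Correlate all five signals into a coarse incident hypothesis."""
    anomalies = [name for name, ok in (
        ("identity", s.identity_in_scope),
        ("data", s.data_matches_task),
        ("model behavior", s.reasoning_stable),
        ("posture", s.posture_clean),
        ("environment", s.environment_normal),
    ) if not ok]
    if not anomalies:
        return "appropriate"
    # Clean deployment posture plus a reasoning shift and off-task data access
    # points toward manipulation at runtime rather than misconfiguration.
    if {"data", "model behavior"} <= set(anomalies) and s.posture_clean:
        return "likely runtime manipulation"
    return "review: " + ", ".join(anomalies)


# The attack sequence described above: clean identity, posture, and environment.
attack = SignalSnapshot(True, False, False, True, True)
print(identity_only_verdict(attack))
print(five_signal_verdict(attack))
```

For this snapshot the identity-only view reports the activity as permitted, while the correlated view reports likely runtime manipulation. Real signals are scores and event streams rather than booleans, so treat this as a shape, not a design.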
The PleaseFix vulnerability family, documented by Zenity Labs, is a useful lens for understanding what this difference means in practice. In one documented variant, a malicious calendar invite caused a compromised agent to take over a user’s credential vault with zero clicks required from the victim. The agent was authorized. The identity chain was clean. With only identity-layer monitoring in place, the attack completes without triggering an alert. With all five signals assembled, the combination of a mid-session behavioral shift and goal state inconsistency in the model behavior signal, anomalous credential access in the data signal, and posture drift confirming runtime manipulation surfaces the attack before the credential vault is compromised. Five-signal coverage doesn’t just detect better. It detects things that partial coverage structurally cannot see.
Step Mutation: Beyond Block or Allow
One capability enabled by five-signal assembly that single-signal architectures can’t provide: step mutation. But understanding it requires understanding where it fits in a broader response spectrum.
Most detection-oriented security architectures operate on a binary: either the agent’s action proceeds, or the entire workflow is terminated. Mature programs need a fuller range of responses: logging, elevated scrutiny, human approval queuing, single-step blocking, step rewriting, workflow suspension, and a full kill switch. Step mutation occupies a specific position in that spectrum. It’s not a replacement for the other instruments; it’s the right tool when a specific step is inappropriate but a safe equivalent exists and the workflow can continue legitimately.
"Step mutation is the right tool when a specific step is inappropriate but a safe equivalent exists. It’s not a replacement for hard stops; it’s what makes precision possible where precision is warranted."
Consider a concrete example of what happens without it. An HR agent is mid-workflow processing a legitimate employee record request. It encounters a prompt-injected instruction embedded in a document it’s reviewing. A block-or-allow architecture has two options: let the compromised step execute, or terminate the entire workflow. Termination protects against the attack, but it also disrupts a legitimate business process, generates a false-positive friction event for the HR team, and erodes trust in the security program over time. Repeated terminations in low-confidence scenarios create pressure to loosen controls entirely.
When Zenity’s platform identifies the specific step within a multi-step workflow that is inappropriate, and where a safe equivalent exists, it can intercept that step and rewrite it, allowing the rest of the workflow to continue. The HR interaction completes normally. The security intervention is invisible to the end user and to the business process that depends on the agent.
Critically, step mutation has hard limits. It is not appropriate for irreversible actions, confirmed exfiltration attempts, or cases where the platform cannot infer the agent’s intended downstream logic with high confidence. In those cases, the platform escalates to a full block or workflow termination. The value of step mutation is precision where precision is possible; it is not a replacement for hard stops where they’re warranted.
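Those limits translate naturally into a guard that runs before any rewrite is attempted. The sketch below is one possible reading of the rules just described, not the platform's actual logic: the `Step` fields, the `Decision` values, and the 0.9 confidence cut-off are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Decision(Enum):
    ALLOW = "allow"
    MUTATE = "mutate"  # rewrite this one step; the workflow continues
    BLOCK = "block"    # hard stop: full block or workflow termination


@dataclass(frozen=True)
class Step:
    appropriate: bool               # did the step pass the runtime assessment?
    reversible: bool                # can the action be undone?
    confirmed_exfiltration: bool    # is this a confirmed exfiltration attempt?
    intent_confidence: float        # 0.0-1.0 confidence in the inferred downstream intent
    safe_equivalent: Optional[str]  # rewritten action, if one exists


MIN_INTENT_CONFIDENCE = 0.9  # hypothetical cut-off for "high confidence"


def decide(step: Step) -> Decision:
    """Mutate only when every hard limit is satisfied; otherwise hard stop."""
    if step.appropriate:
        return Decision.ALLOW
    hard_limit_hit = (
        not step.reversible
        or step.confirmed_exfiltration
        or step.intent_confidence < MIN_INTENT_CONFIDENCE
    )
    if hard_limit_hit or step.safe_equivalent is None:
        return Decision.BLOCK
    return Decision.MUTATE


# The HR example: an injected step that is reversible and has a safe rewrite.
hr_step = Step(False, True, False, 0.95, "read_requested_record_only")
print(decide(hr_step))
```

Note that the default is the hard stop: mutation has to be earned by passing every check, which matches the framing of step mutation as precision where precision is possible.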
This capability requires the continuous, stateful awareness of the agent’s full execution trajectory that conventional detection architectures can’t provide. The distinction matters: event-level detection can identify that something happened. Trajectory-level monitoring, maintaining a running model of the agent’s intent across the full workflow, can evaluate whether what happened was consistent with what should have happened. That’s the architectural foundation step mutation depends on, and it’s only available when all five signal domains are active and assembled at runtime.
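A toy contrast between the two modes of detection: in the sketch below, every action is individually permitted, but the sequence is not consistent with the workflow. The action names and the single rule the monitor tracks are invented for illustration; a real trajectory model would track far richer state than one flag.

```python
# Hypothetical actions this agent is authorized to perform.
PERMITTED_ACTIONS = {"read_record", "read_vault", "send_external"}


def event_level_ok(action: str) -> bool:
    """Event-level check: is this single action permitted on its own?"""
    return action in PERMITTED_ACTIONS


class TrajectoryMonitor:
    """Stateful check: is this action consistent with what came before?"""

    def __init__(self) -> None:
        self.touched_sensitive = False

    def observe(self, action: str) -> bool:
        if action == "read_vault":
            self.touched_sensitive = True
        # Sending data out after touching the vault is inconsistent with
        # the workflow, even though each action is permitted in isolation.
        if action == "send_external" and self.touched_sensitive:
            return False
        return True


workflow = ["read_record", "read_vault", "send_external"]
monitor = TrajectoryMonitor()
event_results = [event_level_ok(a) for a in workflow]
trajectory_results = [monitor.observe(a) for a in workflow]
print(event_results)
print(trajectory_results)
```

The event-level check passes all three steps. The trajectory monitor passes the first two and flags the third, because it remembers what the agent touched earlier in the same workflow.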
The difference between partial and full coverage isn’t whether you are attacked. It’s whether the attack completes undetected or is stopped before it becomes material, with precision that preserves the business processes that depend on agents working correctly.
Download Beyond Identity: The CISO’s Guide to Securing Agentic AI for the full breakdown of each signal domain, real attack examples, and the framework for building five-signal coverage in your environment.