Closing the Guardrail Gap: Runtime Protection for OpenAI AgentKit

OpenAI’s AgentKit has democratized AI agent development in a big way. Tools like Agent Builder, ChatKit, and the Connector Registry make it possible for teams to spin up autonomous agents without writing custom code. That kind of accessibility changes everything, including the AI agent security threat model.
The easier it becomes to build agents, the harder it gets to secure them. Each new connector and workflow introduces potential attack paths, and guardrails that work in theory often fail when real attackers start probing their limits.
That is why Zenity, the leading AI agent security and governance platform, is introducing runtime protection for OpenAI AgentKit. This is not about flagging suspicious activity after the fact. It is about blocking data leakage, secret exposure, and unsafe agent behavior in real time before any risky response reaches a user or downstream system.
Why AgentKit Expanded the Threat Landscape
AgentKit extends citizen agent development to ChatGPT Enterprise, the world’s most widely used AI platform with hundreds of millions of weekly active users. That accessibility empowers creators everywhere, but it also expands the agentic threat vector. Every new agent, connector, and workflow increases the potential for things to go wrong at scale.
Zenity Labs recently tested AgentKit's native guardrails and found critical gaps. Prompt injections buried in multi-turn conversations. Credentials leaking through poorly scoped OAuth tokens. Jailbreaks via prompt injections that evade detection by using other languages or encoded text formats. These are not edge cases. They are real-world attack patterns that soft, probabilistic guardrails miss because they are designed to classify risk, not enforce policy.
Deterministic Enforcement Where Guardrails End
Zenity’s runtime protection capabilities work differently. The platform inspects every interaction between users and AgentKit-based agents, not only the text exchanged but also the underlying intent behind each action.
By focusing on agentic intent rather than simple prompt-and-response analysis, Zenity helps security teams understand what the agent is trying to do, not just what it says. This makes it possible to detect manipulative behavior that hides behind safe-sounding language, prevent data leakage or unsafe tool use before it occurs, and maintain full control without interrupting legitimate workflows.
The goal is simple: enforce policy before risk turns into exposure.
Runtime Protection in Action
- Data Leakage Detection: Identifies and blocks attempts to exfiltrate sensitive or regulated information in responses or tool outputs.
- Secrets Exposure Prevention: Detects API keys, credentials, or access tokens embedded in agent outputs and stops them before they reach users.
- Unsafe Response Blocking: Prevents actions that violate compliance standards or organizational policies from executing.
Unlike model-driven guardrails, Zenity uses rule-based, deterministic enforcement with predictable and auditable outcomes. These hard boundaries ensure unsafe actions are stopped immediately while productive, compliant activity continues as intended. Security teams can feel confident that harmful behavior is blocked without standing in the way of innovation or positive outcomes.
Even OpenAI’s AgentKit documentation notes that “even with mitigations in place, agents can still make mistakes or be tricked.” Zenity’s runtime protection fills that gap by analyzing intent and behavior, not just text, to stop unsafe actions before they happen.
“AgentKit accelerates how AI agents are built and scaled, but it also expands the attack surface overnight,” said Michael Bargury, CTO and Co-Founder of Zenity. “Our research shows that native guardrails can miss critical risks, from subtle prompt injections to hidden data leakage. Zenity’s runtime protection closes that gap by inspecting every response, understanding intent, and enforcing security policies.”
Built for How Enterprises Actually Use AI
As organizations deploy AgentKit for both internal workflows and customer-facing apps, they need enterprise-grade AI agent security, not developer-grade safeguards.
Zenity provides that through endpoint protection, policy-based enforcement aligned with compliance frameworks, and unified visibility across SaaS, cloud, and endpoint agent environments.
This runtime protection extends Zenity’s full-lifecycle approach to AI agent security, including discovery, posture management, inline detection, and active prevention. It gives security teams control over how agents behave, what they access, and which tools they invoke.
Zenity is the first agent-centric security and governance platform built for enterprises that need visibility and control wherever AI agents operate. It provides real-time defense in depth, trust and compliance at scale, and the confidence to innovate without compromise.
Drop us a line to learn more! hello@zenity.io
Related blog posts

Preventing AI Agents from Going Rogue: Zenity Collaborates with Microsoft Copilot Studio to Deliver Inline Protection Against Malicious Behavior
AI agents are autonomous, powerful, and deeply embedded in how modern businesses operate. From rerouting customer...

Securing the AI Agent Era: One Control Panel Across SaaS, Endpoint, and Cloud
The companies winning with AI aren’t just deploying agents faster - they’re operationalizing them responsibly....

Zenity and Microsoft Copilot Studio Extend AI Agent Security from Buildtime to Runtime
As enterprises race to adopt AI Agents to drive productivity and innovation. We are excited to announce that Zenity...
Secure Your Agents
We’d love to chat with you about how your team can secure and govern AI Agents everywhere.
Get a Demo