The League Assembled: Reflections from the AI Agent Security Summit

Lana Salameh

At the AI Agent Security Summit in San Francisco, some of the brightest minds in AI security and top industry leaders gathered to tackle one of the most challenging problems in tech today: how do we secure super smart systems that change at runtime and are designed to think, adapt, and compete? As someone who spends every day turning AI security challenges into tangible solutions, I left the summit both inspired by the innovation on display and concerned by the magnitude of what’s still ahead.

The event consisted of two main tracks, “Ferry” and “Cable” (very San Francisco vibes), with keynotes, lightning talks, sessions, and panel discussions. Topics ranged from real-world risks and how attackers can exploit AI agents, to deep dives into adversarial tactics and prompt-manipulation methods that turn trusted internal systems into serious threats, and finally to the various approaches for tackling the problem and building operational defenses that monitor agent behavior and detect malicious agents.

After a day packed with deep technical discussions and spirited debate, a few themes stood out to me.

LLMs are not neutral - they’re trained to beat us

LLMs are the core of agents: they are the “brains” that analyze, think, plan, and decide. The neural networks (NNs) that power them are trained by solving puzzles, playing games, and competing. In other words, we’re building agents that behave like competitors, willing to do ANYTHING to win.

Understanding this helps us recognize some of the main problems with AI and puts us in the right mindset of “everything can go wrong”. Agents’ hallucinations, for example, are a direct result of this training. LLMs never give up: if they don’t know the answer, they simply guess the highest-confidence answer, even if that answer is “wrong” or “not real”.

Recognizing this nature early helps us design security controls with the right mindset: assume the agent will take unexpected paths to achieve its goal, and plan for it.

Prompt injection isn’t a vulnerability - it’s an ongoing reality to manage

It has been more than two years since we first heard about “prompt injection” and “indirect prompt injection” and witnessed the very first attacks that led to data exfiltration, RAG and memory poisoning, and other harmful consequences.

Since then, vendors of agentic AI platforms have been working to block and filter malicious prompts and to put guardrails and content filtering in place to mitigate the issue. Enterprises are deploying firewalls and implementing other solutions to protect their organizations. Yet attackers continue to find their way in.

It’s an endless chase because agents themselves are adaptive: they’ll follow the logic of their instructions, even when it leads them astray.

To make real progress here, we have to shift our perspective and rethink how we approach the problem: treat prompt injection as something to manage, not a bug to patch.

Michael Bargury, our Co-founder and CTO, talked about applying hard boundaries, enforced in software at runtime, to prevent harmful actions and shrink the attack surface - for example, enforcing explicit human approval for new tools and MCP servers, and limiting markdown rendering.
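To make that concrete, here is a minimal sketch of what such a hard boundary could look like in code. Everything in it - the allow-list, the ToolCall shape, and the console approval prompt - is my own illustration under assumed names, not Zenity’s or any vendor’s actual implementation:

```python
# Minimal sketch of a runtime "hard boundary": tool calls to MCP servers that
# are not on an explicit allow-list are blocked until a human approves them.
# The allow-list, the ToolCall shape, and the approval callback are hypothetical.

from dataclasses import dataclass
from typing import Callable

APPROVED_MCP_SERVERS = {"mcp://internal-search", "mcp://ticketing"}

@dataclass
class ToolCall:
    server: str      # MCP server the agent wants to reach
    tool: str        # tool name on that server
    arguments: dict  # arguments the agent supplied

def enforce_boundary(call: ToolCall, ask_human: Callable[[ToolCall], bool]) -> bool:
    """Return True if the call may proceed, False if it must be dropped."""
    if call.server in APPROVED_MCP_SERVERS:
        return True
    # New or unapproved server: require explicit human sign-off before use.
    if ask_human(call):
        APPROVED_MCP_SERVERS.add(call.server)  # remember the approval
        return True
    return False

if __name__ == "__main__":
    # A console prompt stands in for a real approval workflow.
    call = ToolCall(server="mcp://unknown-vendor", tool="export_data", arguments={})
    allowed = enforce_boundary(call, lambda c: input(f"Approve {c.server}? [y/N] ") == "y")
    print("allowed" if allowed else "blocked")
```

In practice the approval would flow through a review or ticketing workflow rather than a console prompt, but the principle is the same: the boundary lives in software outside the model, not in the prompt.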

AI agents - the new insider threats we need to watch for

One of my favorite sessions was from Steve Wilson, Chief AI & Product Officer at Exabeam, who introduced the concept of “AI agent insiders”. He presented numbers showing a significant increase in insider threats proportional to AI adoption across organizations, and explained why organizations need to worry both about external hijacking of agents and about the agents themselves acting as insiders.

The takeaway for me is that we don’t need to reinvent security for AI - we need to extend proven concepts, strategies, and frameworks with an agent-centric lens. For example, for years we’ve used User and Entity Behavior Analytics (UEBA) to detect anomalies and identify threats in human or application activity. We can apply the same approach to agents - a technique I call AI Agent Behavior Analytics (ABA) - to flag and track the riskiest agents. Of course, building a baseline for an AI agent is tricky; these systems are nondeterministic by nature. But we can start with clear indicators, like use of new or unapproved MCP servers, repetitive destructive actions, or irregular data access patterns, and fold those into each agent’s dynamic risk score.
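To illustrate the ABA idea, here is a rough sketch of folding a few indicators into a per-agent risk score. The indicator names, weights, and threshold are assumptions I picked for the example, not an established scoring model:

```python
# Rough sketch of AI Agent Behavior Analytics (ABA): fold observed indicators
# into a per-agent risk score. Weights and threshold are illustrative only.

from dataclasses import dataclass, field

INDICATOR_WEIGHTS = {
    "unapproved_mcp_server": 40,   # agent used a tool server not in its baseline
    "destructive_action": 25,      # e.g. bulk delete, permission change
    "irregular_data_access": 20,   # access outside the agent's usual datasets
}

@dataclass
class AgentRiskProfile:
    agent_id: str
    score: int = 0
    events: list = field(default_factory=list)

    def record(self, indicator: str) -> None:
        """Add an observed indicator and update the running risk score."""
        weight = INDICATOR_WEIGHTS.get(indicator, 5)  # small default for unknown signals
        self.score += weight
        self.events.append(indicator)

    def is_high_risk(self, threshold: int = 60) -> bool:
        return self.score >= threshold

# Example: two risky behaviors push this agent over the review threshold.
profile = AgentRiskProfile(agent_id="invoice-triage-agent")
profile.record("unapproved_mcp_server")
profile.record("destructive_action")
print(profile.agent_id, profile.score, profile.is_high_risk())
```

A real implementation would also decay scores over time and compare each agent against its own learned baseline, but even this simple version shows how familiar UEBA mechanics carry over to agents.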

That brings up a critical question: what is agentic identity? Unlike human or application identity, which are easier to define, agentic identity is complex because AI agents are dynamic applications that change at runtime: new tools and agents are discovered at runtime and become available for the agent to use, agent code and versions may vary, and so on. Defining this is the foundation of how we’ll secure, monitor, and ultimately trust agents.
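To make the question concrete, here is a purely illustrative sketch of the kind of fields an agent identity record might need to capture - none of these fields come from a standard; they are my own assumptions, meant to show why the identity drifts at runtime rather than staying static:

```python
# Illustrative sketch of an "agentic identity" record and why it is dynamic.
# Field names and values are assumptions for the example, not a standard.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentIdentity:
    agent_id: str          # stable identifier, similar to a service account
    owner: str             # human or team accountable for the agent
    model: str             # underlying LLM and version
    code_version: str      # version of the agent's own code and configuration
    approved_tools: set = field(default_factory=set)    # tools it may call
    discovered_tools: set = field(default_factory=set)  # tools it found at runtime
    last_updated: datetime = field(default_factory=lambda: datetime.now(timezone.utc))

    def register_discovered_tool(self, tool: str) -> None:
        """Record a tool the agent discovered at runtime; the identity just changed."""
        self.discovered_tools.add(tool)
        self.last_updated = datetime.now(timezone.utc)

# Example: the identity drifts the moment a new MCP tool shows up at runtime.
identity = AgentIdentity(
    agent_id="support-copilot", owner="it-ops",
    model="example-llm-v1", code_version="1.3.2",
)
identity.register_discovered_tool("mcp://crm/export_contacts")
print(identity.agent_id, sorted(identity.discovered_tools), identity.last_updated)
```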

Looking forward to the next summit

The day was full of insights and a lot of information that didn’t make it into this post, since I’m still processing it all.

The vibes were WOW: everyone was eager to learn, listen, and share. You could feel the excitement not only in the halls and aisles while moving from one session to another, but also in the conversations on the rooftop overlooking the amazing bay view.

For me, the biggest takeaway wasn’t a single tactic, it was the sense that progress in AI security will come from collaboration, not competition. We need to share failures as openly as successes, and build together toward frameworks that make AI safer for everyone. The league really is assembling, and I’m proud Zenity is part of that conversation.

Eager to know more? Make sure to sign up here for early access to the on-demand content.

