Bridging AI Safety and AI Security: Reflections from the NYC AI Safety Meetup

Kayla Underkoffler

The regularly occurring NYC AI Safety Meetups cover a variety of topics, with this latest session focusing on the convergence of AI Safety and AI Security. I had the fantastic opportunity to contribute to the conversation; it’s one that has been budding for some time, but this was my first direct exposure.

For me, this meetup was all about give and take. Within the opening minutes, I knew I was surrounded by people speaking a language I don’t yet understand but wish I did. When AI first appeared on the scene, the conversation was framed around AI Safety: building systems that are trustworthy, aligned, and as free of bias as possible. From my perspective, that focus on creating better products with fewer risks for end users has always been appealing. So my goal for the event was to soak up as much about AI Safety as possible from brilliant minds who are not only discussing the risks of models and large language systems but also building them. Conversely, I wanted to share my perspective on AI Security, and I was pleased to realize, as the conversations continued, that there was genuine interest from the event participants.

The lineup of scheduled talks was inspiring:

  • Patricia Paskov from RAND opened with fascinating research on Human Baselines in Model Evaluations. Every point she made had me mentally connecting dots to security testing and evaluation.
  • Emile Delcourt, a prominent AI security researcher, followed with his research on AI Agent Security, brilliantly bridging safety and security throughout his presentation.
  • Finally, Changlin Li, from the AI Safety Awareness Project, challenged us to envision the AI future we want, the benefits and the risks, and to act rather than simply watch.

During the Q&A, questions ranged from emerging compliance programs to whether a six-month moratorium on AI development could help us catch up on security, and whether truly secure-by-design AI is even possible.

That last question, “can we ever achieve secure by design for AI?”, cuts to the heart of why we need these communities to collide for good. Throughout the event I kept thinking about the traditional security and development mantra of “shift left”, which means finding and fixing risks earlier in the development lifecycle. Shift left has not gone well for traditional software development. There are many causes behind the failure to make security a core capability of development, but one of the leading ones is the divide between the security and software developer communities. By the end of the evening, it was clear to me that we have a unique chance to do this right with AI. If we bring together the safety community, focused on the transparent building of AI, and the security community, focused on ensuring its secure operation, we can leapfrog the slow evolution that secure by design has endured in software.

Security’s Parallel Path

From the security side, when it comes to how we deal with AI, many of the risk management principles feel familiar:

  • Pre-discovery work and risk assessments before production
  • Continuous monitoring and detection
  • Rapid reporting and response when something looks off

Yet AI changes the game. The technology, and therefore the threat landscape, is fundamentally different. That difference is exactly where AI Safety and AI Security must come together. When you add the autonomy of AI Agents into the mix, the guardrails that security teams are most used to become critical. These guardrails are typically established to answer the “what”: what was the agent doing in the environment? They track the activity path to determine whether there is a security concern. The underlying mechanics that try to answer the “why” behind an agent’s decision, however, fall squarely within the established domain of AI Safety. This is just one example of where the two sides should come together to truly build and operate safe and secure AI.

A Call to Collaboration

The takeaway was crystal clear: AI Safety and AI Security must come together. Safety without security leaves systems open to attack. Security without safety ignores unintended behaviors and societal impact.

If we bridge this gap today, AI development could become secure by design, something traditional software is still striving to achieve even after decades. That's why we're so excited that Zenity Labs is hosting the AI Agent Security Summit in San Francisco on October 8th - a place for security professionals and researchers to come together to discuss solutions and best practices for this critical need.
