This project started as a way for us to make sense of our own research. How would we mitigate our own attacks? Microsoft’s first reaction was to deny-list our prompts, but that’s not really helpful, finding others is easy. Was there a better solution?
Learning from EDRs, we should stop over-relying on static signatures and refocus on behavior. What are the behavior patterns of a copilot, an agent and the human interaction with those, when an attack is underway? If we can capture these patterns in a meaningful way, we can guide mitigation for builders, defense for defenders and detection for hunters.
GenAI Attacks Matrix is focused GenAI-based applications like copilots and agents. It’s inspired by MITRE ATT&CK and Atlas frameworks (intended as the sincerest form of flattery). One important distinction from ATT&CK is that we’re documenting security research, not observed adversary behavior. We believe its important to get on top of these threats well before they are observed, given how fast AI is being adopted. We’re considering three distinct threat models:
We’re collaborating with others across the industry to enrich this framework. We want YOUR contributions. It’s easy to get started. There are other awesome frameworks for AI security, and you can read our thoughts about the relation to each in the Q&A section. We’re also working on contributing back into these.
This is still very much a work-in-progress. We’ve decided to build in public to get others involved early on. Please let us know if you find this useful or have a suggestion on how we (you included) could make it better!
All PostsA case study and 8 techniques were added to MITRE ATLAS from the Gen AI Attacks Matrix
New Attack Vectors Discovered for Initial Access and Post-Compromise
Links, source code, tools and slides for BlackHat USA 2024
10 free, open-source tools to help security teams to identify and understand immediate risks
Assess Your Risk