A year of incidents, shadow tools, and ungoverned deployments has compressed the AI security problem into a single deadline. This is the report of what we found in the lead-up to it.
The report builds in order if you have the time. Otherwise, jump to whichever chapter answers the question your board is asking on Monday.
The first wave of generative AI was about prompts. Then employees brought their own tools. Then vendors shipped agents. Each wave landed before the controls for the previous one had finished being written. The 2025 IBM data is the cleanest read of where that has left enterprises: 13% have already had a breach involving an AI model or application, and 97% of those organisations had no AI access controls in place when it happened.
The first wave was conversational. ChatGPT inside the browser, employees pasting in code or contracts, security teams writing acceptable-use memos that almost nobody read. The data leakage problem is real — 27% of organisations report that more than 30% of the data flowing through AI tools contains private information — but the controls for this layer are at least understood: DLP, CASB, browser policy.
The second wave was vendor AI — Copilot in Microsoft 365, Gemini in Workspace, AI features inside every SaaS your business already paid for. Different problem. The model is reading your real data. If anyone can write instructions into that data, they can write instructions for your AI. EchoLeak is the documented case; we walk through it in Chapter 03.
The third wave is agents. Tools, MCP servers, autonomous task execution. The category did not exist as an enterprise concern eighteen months ago. It is now in production at most large organisations — with, in most cases, no inventory, no logging, and no policy specific to it.
A representative read from the IBM cohort, with our own field observations.
Four featured incidents below — chosen because together they describe the four shapes the AI security problem actually takes today: shadow tooling, supply chain trust, model-context manipulation, and OAuth overreach. The full tracker, with every incident we have documented this year, lives on the breach tracker page.
| Incident | Shape | Reach | Severity | Period |
|---|---|---|---|---|
| UNC6395 — Drift / Salesforce OAuthStolen OAuth tokens from a trusted AI integration. The attacker did not exploit a vulnerability; they used legitimate third-party access. | OAuth overreach | 700+ organisations | Critical | Aug 2025 |
| EchoLeak — Microsoft 365 CopilotCVE-2025-32711. Hidden prompt in a document instructs Copilot to forward mail. Privately patched before public release — mechanism is general. | Indirect prompt injection | Vendor-wide | Critical | Disclosed 2025 |
| Cursor IDE RCE chainTwo vulnerabilities in the AI coding assistant. Source code, API keys, and cloud credentials exposed on developer machines. | Supply chain | Per-developer | Critical | Aug 2025 |
| Shadow AI breach (composite)The IBM cohort case. Employees feeding sensitive data to personal ChatGPT or Copilot accounts. 20% of breached organisations report this shape. | Shadow AI | 1 in 5 organisations | High | Year-round, 2025 |
The pattern is that the perimeter did not disappear. It mattered less. Most of these breaches went through trust relationships your organisation had already approved — an OAuth grant, a SaaS integration, a document the AI was authorised to read. Endpoint security and user-focused monitoring were not the right tools for any of them.
Of the four featured incidents, EchoLeak is the one with a published CVE and a documented mechanism. Microsoft caught and patched it through coordinated disclosure, so there is no record of in-the-wild exploitation. The mechanism is still general: any AI tool that reads attachments, emails, or shared documents is reachable through the same shape. What follows is the breach in six steps, four seconds total.
EchoLeak worked because Copilot had permission to read mail and permission to send it. The permissions were granted at deployment and never re-examined. That is the structural problem this chapter is about — not specific to agents, not specific to MCP, just the gap between how IAM teams think about access and how AI deployment teams think about it. Below: the same employee, side by side with the AI assistant deployed to support her.
An agent with tool access and stored credentials is not the same risk profile as a chatbot. Most AI acceptable-use policies do not make that distinction. The result is that procurement, legal and security all signed off on "the same product" — except the product quietly became a different one.
The security stack most enterprises run was built around an attacker who had to escalate to act. You log the escalations — that is most of what your SIEM does for a living. The agent does not escalate. The permissions were granted at deployment. The actions it takes are the actions a service account was approved to take, and your SIEM correctly does not alert on those.
So "lateral movement" has collapsed into a single tool call. The motion your defenders were trained to detect simply no longer happens. The agent does not move across systems. It already lives in all of them.
None of these are exotic. They are standard hygiene, applied to a layer most teams have not yet added to their threat model.
From that date, Articles 9 through 17 apply to providers of high-risk AI systems and Article 26 applies to deployers. Article 50 transparency rules apply too. Failure to comply carries fines up to €35M or 7% of global annual turnover. The data from the year before says most organisations are not ready, and the obligation that is most consistently missing is the one every other obligation rests on — an inventory of what AI you actually have.
Inventory is the prerequisite for every other obligation in the Act. Solve it first. Risk management, oversight, logging, incident reporting — none of them function until you can produce a complete picture of every AI system in your organisation. It is also the highest-leverage compliance work available in the time you have left.
Is the exposure expensive enough to act on, and are we behind our peers? The table answers the first. The grid answers the second. A note on the risk-reduction column: percentages reflect FireTail's review of incidents where the control was present versus absent. They are floors, not promises. The point is the gap between vectors.
| Threat vector | Avg breach cost | Primary control | Effort | Risk reduction |
|---|---|---|---|---|
| Prompt injection (indirect) | $4.88M | Input sanitisation layer + prompt shield before LLM ingestion | Low · 2–4 wks | ~82% |
| MCP supply chain poisoning | $4.63M | Internal MCP registry + version pinning + automated trust scoring | Med · 4–8 wks | ~74% |
| API misconfiguration / exposure | $3.86M | AI API gateway with auth enforcement, rate limiting, anomaly detection | Low · 1–3 wks | ~91% |
| Shadow AI / OAuth overreach | $4.63M | AI asset discovery + OAuth audit + agent-specific acceptable use policy | Med · 3–6 wks | ~67% |
| Training data poisoning | $5.20M | Data lineage tracking + pipeline integrity + behaviour monitoring | High · 8–16 wks | ~58% |
| Multi-agent cascade failure | $6.10M+ | Agent isolation boundaries + inter-agent message validation + kill-switch | High · 12+ wks | ~70% |
The economic argument for AI runtime monitoring is, frankly, the easiest part of this report. IBM puts the saving at $1.9M per breach, with containment 40% faster. The harder part is that you cannot monitor what you have not yet inventoried. The order of operations matters more than the budget.
If your score is below your sector median, you are carrying above-average risk relative to your competitive set. That is the comparison your cyber insurer is making, and increasingly the one your regulator is too. The number you walk into the audit committee with is not your absolute score — it is the gap.
Source · FireTail Research, getaiactready.eu cohort · n=412
Twelve actions across three horizons. Horizon 1 is what the security team can start without procurement. Horizon 2 needs some new tooling. Horizon 3 is architecture work that takes a fiscal year. Run them in order — the first item tends to reveal the rest.
The questions are the ones FireTail uses in initial assessments — the same ones most security teams quietly skip when they review their own posture. Two minutes. No login. Run it again in ninety days and watch the trajectory.
This is the artefact CISOs walk into the audit committee with. The numbers on the right are from the IBM 2025 cohort — the canonical reference. Read out the questions, point at the deadline, leave the briefing with the directors. The detailed case is in the eight chapters above.
The argument of this report is that inventory is the lever — the prerequisite for every other control, the foundation under every other Article. FireTail builds inventories. The platform discovers every AI endpoint, agent and integration across your stack — authenticated or not, documented or not — and produces the map every other control depends on. Fifteen minutes from connect to coverage. The rest of what we do is built on top of that.