Offense Scales with Compute. Defense Scales with Committees.
Why AI is widening the attacker-defender gap faster than anything we've built to close it — and what that actually means for the next decade of security.
There's a Windows XP machine running right now, at an airport you've probably flown through in the last six months, because nobody has updated the middleware it talks to and the middleware has a dependency on something older than the iPhone. That machine isn't a curiosity. It's the entire story of modern cyber defense in a single blinking beige box.
Meanwhile, somewhere other than Kansas, an operator is running an AI-orchestrated pipeline that can generate, test, and deploy weaponized n-day exploits across tens of thousands of targets while they go get lunch. The model doesn't need a change advisory board. It doesn't need a pen-test budget approved next fiscal cycle. It doesn't need to fill out a vendor risk assessment for itself.
This is the asymmetry. It was already there — the defender's dilemma is not new — but AI has taken the knob that used to go to eleven and turned it to seven hundred. The uncomfortable part, the part people in my corner of the industry say to each other but still mostly swallow in public, is that defense isn't going to catch up. Not on the current trajectory. Not with the current toolkit. Not with the current policy posture. I want to walk through why, and then what actually moves.
The top five turtles in a stack of thirty
The thing that bothers me most about the current AI-for-defense conversation is how narrow it is. Every vendor deck at RSA, every "we're using AI to transform the SOC" LinkedIn post — they're all playing in the same two-inch-deep puddle. App sec. Vuln triage. SIEM enrichment. Things you can fix with a pull request.
Meanwhile Salt Typhoon, and Volt Typhoon before it, are not messing around with anything that looks like a pull request. They're living in the places we stopped looking a decade ago. Firmware on routers nobody remembers procuring. Middleware whose vendor went out of business in 2011. The operational plumbing under the operational plumbing.
The way I've been phrasing it: we are obsessively polishing the top five turtles in a stack of thirty. China is already living in the bottom ten. And "living" is doing work in that sentence, because the Canadian telecom folks, when they feel like they can talk freely, describe their own networks as fully infested with no real plan to evict. Then everyone goes to work the next day and acts like everything is fine. The problem isn't at the counter; the whole supply chain is compromised, and we're still out front selling like nothing happened.
You cannot patch your way out of that. You cannot AI-SIEM your way out of that. The compromise is resident, and the attack surface is older than most of the people hired to defend it.
Offense scales with compute. Defense scales with committees.
Here is the core asymmetry, as plainly as I know how to put it:
The attacker's velocity of adoption is gated only by compute and creativity. The cost of failure is trying again. The defender's velocity of adoption is gated by enterprise policy, change control, production stability, compliance review, procurement, vendor consolidation, board appetite, insurance posture, and twelve people in four time zones agreeing on one Jira ticket. The cost of failure is loss of availability and revenue.
The attacker's cost of failure is low and getting lower. If an AI-generated exploit pipeline fires a thousand payloads and 997 fail, the three that land are still a win, and the 997 failures are somebody else's logs to sift. The defender's cost of failure is structurally civilization-scale — one bad patch takes down payroll, one over-aggressive detection rule takes down the call center, one too-clever containment action takes down the hospital. Defenders move at the speed of "don't break prod," attackers move at the speed of "let's see what happens."
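The economics in that paragraph can be made concrete with a toy model. Every number below is a purely illustrative assumption (hit rate, payoff, compute cost are made up for the sketch, not measurements):

```python
# Toy model of the cost asymmetry: attacker failures are nearly free,
# so even a 0.3% hit rate stays profitable. All numbers are illustrative.

def attacker_expected_value(payloads, hit_rate, cost_per_payload, value_per_hit):
    """Expected attacker profit: hits pay off, failures only cost compute."""
    hits = payloads * hit_rate
    return hits * value_per_hit - payloads * cost_per_payload

# 1,000 payloads, 3 land, ~$0.50 of compute per attempt, $50k per foothold.
ev = attacker_expected_value(1000, 0.003, 0.50, 50_000)
print(f"attacker EV: ${ev:,.0f}")  # roughly $149,500 on ~$500 spent
```

The point is not the specific numbers; it's that the attacker's curve stays positive as long as compute is cheap, while the defender has no equivalent curve, because a single defensive failure can cost availability outright.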
That was always true. What AI did is give the attacker a force multiplier with no symmetric equivalent on the defender side. You cannot deploy an autonomous agent into a Fortune 500 network to aggressively rewrite firewall rules in real time. You can deploy that exact agent as an attacker into the same network and nobody will review your change control.
The "you must be this tall to ride" bar on the offensive side is collapsing. The cohort of people who can run a credible AI-assisted intrusion and persistence campaign is broader than ever. The cohort of organizations that can defend against one is not growing at the same rate. It might not be growing at all.
The sovereign cyber reality check
If you only read American security press, you would be forgiven for thinking the US is automatically dominant in cyber. Cyber Command, the agencies, the vendors, the budget. That is not how the rest of the world is acting.
Every security policy conversation I have outside the US is about bulking up. Letters-of-marque-style legislation is floating in multiple jurisdictions. Partners who used to assume they could lean on the US for forward-leaning cyber response are quietly building sovereign capacity — their own capability, doctrine, and tooling — because they can no longer rely on Cyber Command being as predictable as it was a decade ago. "Sovereign cyber capacity" used to be a flavor of nationalist grandstanding. In 2026 it means something more specific: assume the cavalry is not coming, and build accordingly.
On the adversary side, the tactical shift around 2019–2020 — when Chinese operations stopped prioritizing stealth and started prioritizing opportunistic, broad-spectrum, refresh-on-detection persistence — is now the baseline. The posture is "don't worry about getting caught." In a Western cyber-espionage framing that is not an acceptable trade. In theirs, it is the entire doctrine. Asymmetry on top of asymmetry.
"Defend forward" and the wartime patching question
Which brings us to the US policy debate, which is — to put it gently — arguing about the wrong problem with the wrong urgency on the wrong timeline.
The most durable position out of the White House right now is "defend forward": we're out of time to defend domestically, so we push the fight outward and delegate domestic defense to the states and the private sector. I don't like it very much as an exclusive position, but I do understand why it wins the policy battle — it's the one framing that attracts defense-budget dollars, and defense is the only budget line that reliably moves in the current climate. If you want to get something done on cyber in Washington, dressing it up as forward defense is currently the primary lever.
The more interesting debate, and the one almost nobody outside a small circle wants to discuss in public, is non-consensual patching under wartime conditions. If there's an imminent, unfolding cyber risk against critical infrastructure, can the government reach out and patch systems without the owner's consent? Fix the CI/CD pipeline. Remediate the middleware dependency that every operator except the actual asset owner knows is there. Close a hole in the bottom ten turtles. That is wartime stuff, and it is being actively worked on. The reason it makes people uncomfortable is not that it would make the Internet more chaotic — it's that the legal and operational framework for it is nowhere close to ready, and the people arguing loudest against it are often the same people whose infrastructure most needs it.
Meanwhile CISA is looking at a proposed ~$700M cut, the federal cyber workforce is in the middle of its own chaos, and the political stability of any ten-year defensive posture is not something you can build a roadmap on. You cannot run a decade-scale defense program on a two-year funding cycle whose second year is a coin flip. Every CISO in a regulated industry already knows this. They just don't say it on panels.
Why "more tools" is not the answer
If you've been in security longer than five minutes you know tool fatigue is the status quo. The average SOC runs forty-plus tools. The average analyst is drowning. The average vendor pitch is "our AI will sit on top of your existing stack and make sense of it."
Adding an AI layer on top of forty tools that don't talk to each other does not fix the problem. It adds a forty-first tool that now also needs to be tuned, trusted, governed, audited, and renewed. We never lacked a dashboard. The underlying architecture — enterprise IT as it actually exists in the wild — was never designed to be defended at the speed attackers can now operate. If you're a vendor and your pitch is "we use AI to reduce alert fatigue," I'm not mad at you. I'm just telling you you're solving a 2019 problem while the 2026 problem is walking around in your customer's firmware.
And now, Mythos
While I was drafting this, Anthropic dropped Claude Mythos Preview and Project Glasswing. Case study for everything above.
Mythos scored 83.1% on CyberGym's vulnerability reproduction benchmark — Opus 4.6 scored 66.6%. It found a 27-year-old OpenBSD vulnerability and a 16-year-old FFmpeg flaw that five million fuzzer executions had missed. Thousands of zero-days across every major OS and browser, sitting there for a decade-plus while we polished the top five turtles in a stack of thirty. Engineers with no security background pointed it at code overnight and woke up to working exploits.
Props to Anthropic for taking this on and staying front and center. They didn't release it — access is gated to twelve launch partners (AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks, Anthropic) plus ~40 critical infrastructure organizations and a forthcoming Cyber Verification Program. First time a major AI lab has withheld a model specifically because its offensive capabilities were too dangerous.
The discourse is splitting predictably into "this changes everything" and "it's overhyped." It's both.
Opus (and, for the record, other US frontier models, as well as many Chinese open-weight models) was already good at finding primitives given codebase access. Vulnerability discovery and full exploit fit-and-finish are different disciplines — one looks like spicy QA, the other like adversarial engineering. Mythos advances both, but the triage problem it solves is the one that desperately needs solving.
The genie is also directionally out of the bottle. Open-model replication already shows smaller models with decent scaffolding recovering similar scoped analysis — 8 out of 8 models found the flagship FreeBSD zero-day, including a 3B parameter model. The AI cybersecurity frontier is super jagged: capability reshuffles completely by task.
Glasswing as a defensive coalition is genuinely new — the first credible counterexample to "force multiplier with no symmetric equivalent." Not a complete one. Twelve companies and forty organizations is not the entire attack surface. And here's the part that should keep you up at night: even assuming everything gets fixed within the 90+30-day window, the moment patches drop they get diffed and weaponized. Integration into actual production becomes the battleground. Lag is real. Bureaucracy is real. Supply chains are real.
For VR and pentesting: the center of gravity shifts. Not disappears — capability development and vulnerability reproduction are discrete disciplines the "AI replaces pentesters" crowd is glossing over. But the question moves from "can you find the bug" to "can you understand what it means in context and help the org actually fix it." That was always the harder part. Mythos just made it impossible to pretend otherwise. To the bounty hunter community: this will be new to you. This is how we did stuff in the old days before "attacking the cloud" became the default route to PoC.
Practical takeaways for the #vulnpocalypse:
- Prepare for a lot of patching and regression testing in the next few months.
- Map your runtime exposure and reachability. Know your products and your environment.
- Test your IR, recovery, and rotation protocols now — ahead of when this gets spicier.
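The exposure-mapping bullet can start smaller than a tooling purchase. A minimal sketch, assuming a hypothetical component inventory with per-item last-patch dates (the field names and sample data are made up for illustration; adapt to whatever SBOM or CMDB export you actually have):

```python
# Flag the "bottom turtles": components with no vendor patch since a cutoff.
# Inventory format here is a hypothetical example, not a standard.
from datetime import date

inventory = [
    {"name": "edge-router-fw",     "version": "2.1", "last_patch": date(2012, 3, 1)},
    {"name": "payroll-middleware", "version": "4.0", "last_patch": date(2009, 7, 15)},
    {"name": "web-app",            "version": "8.3", "last_patch": date(2025, 11, 2)},
]

def stale_components(inv, cutoff):
    """Return components not patched since `cutoff`, oldest first."""
    stale = [c for c in inv if c["last_patch"] < cutoff]
    return sorted(stale, key=lambda c: c["last_patch"])

for c in stale_components(inventory, date(2020, 1, 1)):
    print(f'{c["name"]} {c["version"]}: last patched {c["last_patch"]}')
```

The sort order matters: the oldest unpatched component is usually the one a diffed-out n-day will reach first, so it goes to the top of the regression-testing queue.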
I'm pulling the trigger on that "Threat Asymmetry Enthusiast" t-shirt for Hacker Summer Camp.
The pincer: what actually moves
So what does work?
What I've seen work, in my own career — around vulnerability disclosure, CFAA reform, the long slow normalization of bug bounties — is a pincer move. You don't win these fights from inside policy alone, and you don't win them from inside industry alone. You win them by:
- Bottom-up precedent. Companies, open-source projects, missions, experiments that go do the thing and prove it's possible — viral, loud, and real enough to become the reference implementation everyone points at. This is how norms get rewritten while the policy people are still drafting.
- Top-down pressure. Podcasts, writing, policy advisors, the handful of people who can actually get a meeting. Not to pass the thing, but to make sure the thing exists in the room when the room is deciding.
- Middle capture. Public discourse shifts the middle. The policy layer cracks when top and bottom are both pressing and the middle has nowhere clean to stand.
This is slow in peacetime. It'll have to be faster now. And it specifically doesn't require you to be in the White House — some of the most durable wins I've seen came from people deliberately not in government, because they couldn't be co-opted by whichever administration was in the chair that year.
What this means if you are not running a nation-state
Most of us don't control Cyber Command's budget. We're defending a mid-market company, a research lab, a nonprofit, a client engagement. So what does the asymmetry thesis tell you to do on Monday morning?
- Stop optimizing for prevention you can't achieve. Start optimizing for resilience you can. "How do we keep them out" is now secondary. The primary questions are how fast do we know, how fast do we contain, how fast do we recover. If your IR tabletop hasn't been updated in twelve months, it's an artifact of a different threat model.
- Inventory the bottom ten turtles. What firmware are you running with no update path? What middleware does a critical business flow depend on that hasn't shipped a patch since your youngest engineer graduated? That's the real attack surface.
- Assume compromise where you can't instrument. If you can't see it, assume it's resident. Build blast radius accordingly.
- Pick a pincer lane. Industry: push bottom-up precedent. Policy-adjacent: push top-down pressure. Neither: push the middle — write, speak, normalize the conversation.
- Stop waiting for the ten-year plan. Nobody is coming with one, and no ten-year plan survives a two-year funding cycle. Build what you can build now, assuming the rules, budget, vendor, and threat will all move under you. That's not cynicism. It's the operating environment.
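The resilience questions in the first bullet (how fast do we know, how fast do we contain, how fast do we recover) only mean something if you measure them. A minimal sketch, assuming hypothetical incident records with four timestamps; the field names and sample data are illustrative, mapped from whatever your IR ticketing actually captures:

```python
# Reduce "know / contain / recover" to three measurable intervals per incident.
from datetime import datetime
from statistics import median

incidents = [
    {"intrusion": datetime(2026, 1, 3, 2, 0),  "detected":  datetime(2026, 1, 9, 14, 0),
     "contained": datetime(2026, 1, 10, 9, 0), "recovered": datetime(2026, 1, 12, 17, 0)},
    {"intrusion": datetime(2026, 2, 1, 8, 0),  "detected":  datetime(2026, 2, 1, 20, 0),
     "contained": datetime(2026, 2, 2, 6, 0),  "recovered": datetime(2026, 2, 3, 12, 0)},
]

def median_hours(incidents, start, end):
    """Median gap in hours between two incident timestamps."""
    gaps = [(i[end] - i[start]).total_seconds() / 3600 for i in incidents]
    return median(gaps)

print("time to know:   ", median_hours(incidents, "intrusion", "detected"), "h")
print("time to contain:", median_hours(incidents, "detected", "contained"), "h")
print("time to recover:", median_hours(incidents, "contained", "recovered"), "h")
```

Three numbers, tracked quarter over quarter, tell you more about your actual posture than most of the forty-tool stack does.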
What comes next
If I'm right — I've been in this long enough to trust the pattern — defense in the classical sense is not going to catch up until things really start going bump in the night. Not with more tools, more compliance frameworks, a better SIEM, or an AI layer on top of a stack that was already failing before the attackers got a force multiplier.
That doesn't mean we're doomed. It means the word "security" has to stretch to cover a different set of practices than the one we built the industry around. Prevention becomes one discipline among several, and not the most important one. Resilience, recovery, containment, attribution, consequence management, legal response, insurance posture, public communication — the things we used to call "what happens after security fails" — are security now.
The XP machine at the airport is not going to update itself. The middleware vendor is not coming back from the dead. The telecom backbone is not going to de-infest. The budget is not going to stabilize. The attackers are not going to slow down. We get to decide what we build next, knowing all of that.
Offense scales with compute. Defense scales with committees. The committees aren't going away — that's not what I'm arguing for. But the committees need to know what they're actually voting on, and the people in the rooms need to stop solving 2019 problems while 2026 walks around in the firmware. The pincer works. It has worked before, in smaller fights, with less at stake. It can work here — but only if the people who can see the asymmetry stop pretending we're still in a fair fight.