What Does Claude Mythos Mean for Enterprise Cybersecurity?

Before April 2025, no AI model could complete a single expert-level cybersecurity challenge. Twelve months later, Claude Mythos Preview solves 73% of them.

That number deserves a moment. Not because it is alarming in isolation: benchmark performance in a controlled environment is not the same as an autonomous attack on your network. It matters because it represents the steepest capability jump the UK’s AI Security Institute has recorded since it began tracking AI cyber performance in 2023. And because, in the same twelve-month window, Claude was used in a live attack against critical infrastructure.

So the real question is what specifically changes about how your security team needs to defend, and what the evidence actually shows, as opposed to the speculation that has surrounded this story since Anthropic announced Project Glasswing this past April.

Claude attacks

Between December 2025 and February 2026, an unknown threat group used Claude to attack a municipal water utility in Monterrey, Mexico, as part of a broader campaign targeting nine government agencies.

The attackers had no prior knowledge of the utility’s industrial setup. They used Claude to map the enterprise network, identify a vNode industrial gateway as the primary bridge between IT and the operational systems controlling the water supply, generate targeted credential lists combining default vendor passwords with environment-specific naming conventions, and advise on a password-spraying attack against the SCADA interface. Work that traditionally took advanced threat groups weeks of custom malware development was compressed into hours.

The core industrial systems were not breached. More significant is what the attempt demonstrated. Jay Deen, associate principal adversary hunter at Dragos, told Cybersecurity Dive: “In this case, the AI rapidly interpreted an unfamiliar environment, identified OT infrastructure and began developing plausible access paths without prior ICS/OT specific context.”

Threat actors no longer need specialized operational technology knowledge to target OT environments. The expertise requirement for sophisticated attacks has dropped and it will keep dropping.

What the AISI found when they actually tested Claude Mythos

The UK’s AI Security Institute evaluated Mythos Preview before its public announcement. Their findings are the most rigorous independent assessment of its offensive capabilities available.

On expert-level CTF tasks, Mythos succeeded on 73% of attempts, up from zero for any model before April 2025. Two years ago, the best models could barely complete beginner tasks. The trajectory is not slowing.

The more telling test was “The Last Ones,” a 32-step attack simulation modeled on corporate networks, spanning initial reconnaissance through full network takeover, and estimated to take human experts 20 hours. Mythos is the first model to complete it end-to-end, doing so in 3 of 10 attempts. In the other attempts it averaged 24 of 32 steps. Every previous model averaged fewer than 16.

The difference between 16 steps and 24 is not incremental. It is the difference between an intrusion that stalls partway through lateral movement and one that reaches the point of data exfiltration and persistence establishment. The AISI was direct: Mythos can exploit systems with weak security posture, more models with equivalent capabilities are coming, and evaluation environments without active defenses will soon not be hard enough to discriminate between the most capable models.

What do you need to know about Project Glasswing?

Anthropic’s response to what Mythos can do is Project Glasswing, a controlled program giving a small number of security partners including Google and Microsoft access to Mythos Preview for defensive vulnerability research.

In the weeks before the public announcement, Mythos Preview identified thousands of zero-day vulnerabilities across every major operating system and every major web browser, including flaws that had survived undetected for decades. Microsoft’s statement on joining the program is worth reading carefully: “This is not only a game changer for finding previously hidden vulnerabilities, but it also signals a dangerous shift where attackers can soon find even more zero-day vulnerabilities and develop exploits faster than ever before.”

Claude Code Security, launched February 20 as a separate enterprise product built on Claude Opus 4.6, brings a version of this capability to organizations outside Glasswing. It scans codebases for vulnerabilities and recommends patches for human review, identifying over 500 previously unknown high-severity vulnerabilities in open-source codebases during internal testing. Its announcement triggered a cybersecurity sector selloff erasing approximately $830 billion in market value over six trading days, a reaction CSIS analysts described as markets conflating build-time code scanning with runtime defense.

AI tools are changing your attack surface. But how?

While Claude Code Security identifies vulnerabilities in your code, the tool has its own documented flaws that enterprise security teams need to assess before deployment.

Check Point Research found three vulnerabilities in Claude Code in early 2026. CVE-2025-59356 allows an attacker to introduce malicious Hook commands in a project repository’s configuration file. When a developer opens the project, those commands execute automatically and silently, giving the attacker remote terminal access with full developer privileges. CVE-2025-59536 affects Claude Code’s MCP settings, allowing malicious commands to execute before any user warning appears. CVE-2026-21852 is the broadest: it allowed harvesting a developer’s API key with no user interaction, by intercepting communications between Claude Code and Anthropic’s servers.

All three are patched. The structural point is not. AI coding tools operate with elevated access to source code, local files, and credentials. When the tool is compromised, the attacker gets the tool’s access level. The AI layer of the software pipeline is now an attack surface in its own right, and most security teams have not yet built detection coverage for it.
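One way to start building that coverage is to review a repository's AI tool configuration before a developer opens it. The sketch below is illustrative only. It assumes that project-level settings live in files such as `.claude/settings.json` and `.mcp.json`, and that keys named `hooks`, `mcpServers`, and `ANTHROPIC_BASE_URL` are the ones worth flagging. Both lists are assumptions to adapt to your own tooling and versions, not a complete detection rule.

```python
import json
from pathlib import Path

# Assumption: project-level config files worth reviewing. Adjust for your tools.
CONFIG_FILES = (".claude/settings.json", ".claude/settings.local.json", ".mcp.json")
# Assumption: keys that can trigger command execution or redirect API traffic.
RISKY_KEYS = {"hooks", "mcpServers", "ANTHROPIC_BASE_URL"}


def find_risky_keys(node, path=""):
    """Recursively yield dotted paths of risky keys found in parsed JSON."""
    if isinstance(node, dict):
        for key, value in node.items():
            here = f"{path}.{key}" if path else key
            if key in RISKY_KEYS:
                yield here
            yield from find_risky_keys(value, here)
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from find_risky_keys(item, f"{path}[{index}]")


def audit_repo(repo_root):
    """Return {relative_file: [findings]} for config files present in repo_root."""
    report = {}
    for rel in CONFIG_FILES:
        file_path = Path(repo_root) / rel
        if not file_path.is_file():
            continue
        try:
            data = json.loads(file_path.read_text(encoding="utf-8"))
        except (json.JSONDecodeError, UnicodeDecodeError):
            # A file we cannot parse should be reviewed by a human, not ignored.
            report[rel] = ["<unparseable: review manually>"]
            continue
        findings = sorted(find_risky_keys(data))
        if findings:
            report[rel] = findings
    return report
```

Running `audit_repo` on a freshly cloned repository, before any AI coding tool is launched in it, gives a reviewer a short list of settings to inspect. A finding is a prompt for review, not proof of compromise, since hooks and MCP servers also have legitimate uses.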

So what actually changes?

87% of global organizations have already experienced an AI-powered cyberattack in the past year according to SoSafe’s Cybercrime Trends 2025 report. Mythos is the most capable public example of this trend, but OpenAI’s GPT-5.4-Cyber and Google’s Big Sleep already possess comparable capabilities. Containing any specific model is not the right frame. Defending against a sustained era of AI-enabled attacks is.

The CSA, SANS, and OWASP joint assessment is blunt: organizations are “likely to be overwhelmed” by threat actors using AI to find and exploit vulnerabilities faster than defenders can patch them. The asymmetry is not about capability, since defenders can use the same tools. It is about institutional speed. Attackers do not need legal review, procurement cycles, or change management approval before acting on a discovered vulnerability. Defenders do.

Network segmentation and least-privilege access limit blast radius when AI-assisted attackers achieve initial access. Anomaly detection and comprehensive logging surface activity that looks legitimate but behaves abnormally. Rapid patching closes the windows that AI-powered vulnerability discovery is specifically designed to find.
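As an illustration of the anomaly-detection point, the following minimal sketch flags the password-spraying pattern: one source failing logins against many distinct accounts in a short period. The event tuple layout, the ten-minute window, and the five-account threshold are all assumptions chosen for the example. Real authentication logs and sensible thresholds will differ by environment.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def detect_password_spray(events, window=timedelta(minutes=10), min_accounts=5):
    """Return source IPs that failed logins against many distinct accounts in a window.

    events: iterable of (timestamp: datetime, source_ip: str, username: str, success: bool)
    The window and min_accounts defaults are illustrative assumptions.
    """
    failures = defaultdict(list)
    for timestamp, source_ip, username, success in events:
        if not success:
            failures[source_ip].append((timestamp, username))

    flagged = set()
    for source_ip, attempts in failures.items():
        attempts.sort()  # order by timestamp
        start = 0
        for end in range(len(attempts)):
            # Advance the left edge until the span fits inside `window`.
            while attempts[end][0] - attempts[start][0] > window:
                start += 1
            distinct_accounts = {user for _, user in attempts[start:end + 1]}
            if len(distinct_accounts) >= min_accounts:
                flagged.add(source_ip)
                break
    return flagged
```

Counting distinct accounts rather than raw failures is the design choice that separates spraying from a single user mistyping a password. Per-account lockout counters tend to miss spraying because each account sees only one or two failures.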

Bain’s 2025 Cybersecurity Survey finds most organizations plan budget increases of around 10% annually, while their analysis suggests many large organizations need to double current security spending to build the depth of defense AI-enabled attacks now require. The organizations that are ahead of this are not those with the largest budgets. They are those that have built continuous external visibility into what is exposed before an attacker finds it first.

What CISOs need to put on the board agenda this quarter

Most of what is being written about Claude Mythos focuses on the threat actor side. This section is for the people responsible for the defensive side, and specifically for what needs to land in board-level conversations in the next 90 days.

The expertise barrier for sophisticated attacks has structurally collapsed. The Mexico water utility attack was not conducted by a state-sponsored APT with years of OT knowledge. It was conducted by an unknown group using a commercial AI model that had never seen the target environment before. CISOs who have built their threat models around the assumption that OT attacks, advanced persistent intrusions, or zero-day exploitation require sophisticated, well-resourced adversaries need to revise that assumption. The capability is now commoditized. The question your board should be asking is not “are we a high-value enough target for a sophisticated attacker?” It is “what happens when any attacker has access to sophisticated tools?”

Your AI tools are part of your attack surface now, and most governance frameworks do not reflect this yet. Industry surveys suggest formal governance frameworks for AI security tools remain the exception, with many CISOs not anticipating this capability arriving this early in 2026. Claude Code, GitHub Copilot, Amazon CodeWhisperer, and their equivalents operate with developer-level access to source code, credentials, and production pipelines. The three Claude Code vulnerabilities Check Point found are patched, but the class of risk they represent is not. If your security team is not monitoring AI tool behavior as part of your detection coverage, you have a blind spot that did not exist eighteen months ago.

The patching window is closing faster than your patch cycle. Mythos Preview found thousands of zero-day vulnerabilities in production software, some of which had existed for decades, in a matter of weeks. As equivalent capabilities proliferate beyond controlled research programs, the window between a vulnerability existing in your environment and an adversary finding it will compress toward days or hours rather than months. If your current patch prioritization is based on CVSS score and vendor advisories alone, it is not moving at the speed the threat now requires. CISA’s Known Exploited Vulnerabilities (KEV) catalog is the minimum baseline. KEV-listed vulnerabilities have confirmed active exploitation and should be treated as emergency remediation items regardless of your standard cycle.
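The KEV-first rule can be expressed as a simple sort. The sketch below is a hedged example: the `Finding` type and the `prioritize` function are hypothetical names, the CVE identifiers in the usage note are placeholders rather than real entries, and the set of KEV IDs is assumed to have been loaded separately from CISA's published catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Finding:
    """One open vulnerability on one asset (hypothetical structure)."""
    cve_id: str
    asset: str
    cvss: float


def prioritize(findings, kev_ids):
    """Order findings: KEV-listed first, then by descending CVSS score.

    kev_ids: set of CVE IDs assumed to come from the KEV catalog.
    """
    return sorted(
        findings,
        # False sorts before True, so KEV-listed findings come first.
        key=lambda f: (f.cve_id not in kev_ids, -f.cvss, f.cve_id),
    )
```

For example, with placeholder data:

```python
findings = [
    Finding("CVE-0000-0001", "build-server", 9.8),
    Finding("CVE-0000-0002", "vpn-gateway", 6.5),
    Finding("CVE-0000-0003", "intranet-wiki", 7.2),
]
queue = prioritize(findings, kev_ids={"CVE-0000-0002"})
```

Here the 6.5-scored finding on `vpn-gateway` moves to the front of the queue because it is KEV-listed, ahead of the 9.8-scored finding that has no confirmed exploitation.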

You are likely underinvesting and the gap is widening. Bain’s analysis suggests many large organizations need to double their current security spending to build the depth of defense AI-enabled attacks now require, while most plan increases of only 10% annually. The board conversation that is harder to have, but more important, is not about budget increases. It is about what your current investment actually covers versus what the threat now demands. A CISO who can map their current controls against the specific attack patterns Mythos and its equivalents use, including lateral movement through poorly segmented networks, exploitation of AI pipeline access, and rapid OT reconnaissance, is in a much stronger position to make that case than one presenting general threat landscape slides.

External visibility is no longer optional. The dwell time between a vulnerability existing and an attacker discovering it is collapsing. The organizations that find out about their exposure from CybelAngel are in a fundamentally better position than the organizations that find out from an incident report. Your internal monitoring tells you what is happening inside your perimeter. It does not tell you what is visible from outside it: which credentials are circulating, which infrastructure is exposed, which AI exploitation tooling is being developed that names your sector or your organization as a target. That visibility is what closes the gap between Mythos-speed discovery and your ability to respond.

CybelAngel's attack surface management identifies exposed credentials, vulnerable infrastructure, and external-facing risks before they become the entry point for an AI-assisted intrusion. Our dark web monitoring tracks AI exploitation tooling and compromised credentials circulating in the markets feeding this new class of attack.

Frequently asked questions

Claude Mythos is an unreleased frontier AI model from Anthropic capable of autonomously executing multi-stage cyberattacks on vulnerable networks. In independent AISI evaluations it completed 73% of expert-level cybersecurity challenges and solved a 32-step corporate network attack simulation end-to-end, milestones no previous AI model had reached.

Yes. Between December 2025 and February 2026, attackers used Claude to assist an attack on a water utility in Mexico, mapping the network, identifying critical SCADA infrastructure without prior OT knowledge, and generating targeted credentials for a password-spraying attempt. The incident was documented by Dragos and Gambit Security. The core systems were not breached.

Project Glasswing is Anthropic’s controlled vulnerability research program giving select security partners including Google and Microsoft access to Claude Mythos Preview for defensive purposes. In the weeks before its announcement, Mythos Preview identified thousands of zero-day vulnerabilities across every major operating system and web browser, including flaws that had existed for decades.

Check Point Research identified three vulnerabilities in Claude Code in early 2026 (CVE-2025-59356, CVE-2025-59536, and CVE-2026-21852), enabling remote code execution, silent command execution, and API credential theft respectively. All are patched. The broader risk is structural: AI coding tools operate with elevated access to source code and credentials, making the AI pipeline layer an attack surface in its own right.

Four things: revise threat models to account for AI-lowered expertise barriers; extend security governance to cover AI tool behavior in development pipelines; accelerate patch cycles toward KEV-catalog speed rather than standard quarterly cycles; and build continuous external attack surface visibility that operates independently of internal monitoring. The fundamentals are unchanged. The speed at which they must be applied has changed.

No. Anthropic has not released Mythos and has restricted access to Project Glasswing partners for defensive research. Claude Code Security, built on Claude Opus 4.6, is available as a limited research preview to Enterprise and Team customers for vulnerability scanning.
