Anthropic Accidentally Leaked Its Most Powerful AI Ever — And It’s Called Claude Mythos
By Maya Chen · March 28, 2026

A single CMS misconfiguration at Anthropic exposed approximately 3,000 unpublished internal assets — including the existence of “Claude Mythos,” a model the company internally describes as “by far the most powerful we’ve ever developed.” Rather than a typical version bump, Mythos represents a true tier change above Claude Opus, with Anthropic deliberately restricting access to cyber defenders first due to its unprecedented offensive cyber capabilities.
Table of Contents
- How 3,000 Unpublished Assets Spilled Into the Open
- Claude Mythos (a.k.a. Capybara): What the Leaked Docs Actually Say
- Why Cyber Defenders Get It First — And the Public Waits
- If Your Most Capable AI Is a Cyber Risk, How Do You Govern It?
- What a “Step Change” Means for the AI Capability Map in 2026
- Claude Mythos vs. Claude Opus 4.6: What We Know
- Frequently Asked Questions
- Related Reading
How 3,000 Unpublished Assets Spilled Into the Open
The breach wasn’t a hack. It was human error — a CMS misconfiguration at Anthropic that left a large swath of unpublished internal content sitting in a publicly searchable cache. Discovered independently by Roy Paz at LayerX Security and Alexandre Pauwels at the University of Cambridge, the exposed cache contained roughly 3,000 assets: draft blog posts, internal memos, model capability documents, and a collection of promotional materials never intended for public eyes.

Neither Paz nor Pauwels exploited the data — both followed responsible disclosure practices, notifying Anthropic before publicizing their findings. The configuration flaw has since been patched, but the window during which the cache was publicly accessible remains unclear. The fact that two independent researchers stumbled across it suggests it was open for a meaningful period. This kind of CMS misconfiguration is a well-known risk vector in enterprise content pipelines, but it carries especially high stakes when the content involves unreleased AI model specifications.
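To illustrate the general risk class (this is not a detail from the leak itself, and assumes nothing about Anthropic's actual stack), a CMS response whose `Cache-Control` header omits `private` or `no-store` may be retained by shared caches and CDNs, leaving draft content retrievable even after it is unpublished at the origin. A minimal sketch of checking for that condition:

```python
# Illustrative only: flag HTTP Cache-Control values that permit
# shared caches (CDNs, proxies) to store a response. A CMS serving
# unpublished drafts under a permissive policy like this is exposed
# in roughly the way described above.

def is_publicly_cacheable(cache_control: str) -> bool:
    """Return True if a shared cache is allowed to store the response."""
    directives = {
        part.strip().split("=")[0].lower()
        for part in cache_control.split(",")
        if part.strip()
    }
    # Either 'private' or 'no-store' forbids shared-cache storage.
    return not directives & {"private", "no-store"}

# A draft endpoint served like this could be retained by any CDN in front of it:
print(is_publicly_cacheable("public, max-age=86400"))  # True: risky for drafts
print(is_publicly_cacheable("private, no-cache"))      # False: shared caches must not store
```

The same check can be run against response headers from any crawler or audit tool; the point is that a single default in a content pipeline decides whether unpublished material is one cache lookup away from the public.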
Among the leaked assets, one stood out immediately: internal documentation referencing a model codenamed “Claude Capybara” — publicly branded as “Claude Mythos” — and described in Anthropic’s own language as a genuine step change in AI capability.
Claude Mythos (a.k.a. Capybara): What the Leaked Docs Actually Say
The leaked materials describe Claude Mythos — internally referred to as Claude Capybara — using language that Anthropic almost never deploys publicly. Specifically, the documents call it “by far the most powerful AI model we’ve ever developed” and characterize it as a “step change,” distinguishing it categorically from incremental model improvements. In Anthropic’s product hierarchy, Mythos sits above Claude Opus 4.6, occupying what appears to be an entirely new capability tier.

On coding benchmarks, academic reasoning tasks, and — most critically — cybersecurity evaluations, Mythos reportedly “dramatically outperforms” all previous Claude models. Anthropic’s internal language goes further, asserting that the model is “currently far ahead of any other AI model in cyber capabilities.” That claim, if even partially accurate, has enormous implications. It would mean Anthropic is sitting on a model whose offensive cyber potential exceeds anything else on the market — including models from OpenAI, Google DeepMind, and Meta.
The “step change” framing is important. In the industry, that phrase signals a qualitative leap — not just higher benchmark scores, but fundamentally different behavior: longer reasoning chains, better tool use, autonomous task completion, and in this case, a level of cybersecurity reasoning that Anthropic considers genuinely dangerous in the wrong hands.
Why Cyber Defenders Get It First — And the Public Waits
Anthropic’s decision to restrict Claude Mythos to cyber defenders before any broader rollout reflects a deliberate dual-use calculus. The capabilities that make a model powerful at offensive cyber tasks (autonomous vulnerability discovery, exploit reasoning, lateral-movement planning) make it equally powerful as a defensive tool. By seeding the model first with defenders (security operations teams, threat intelligence analysts, red teamers at major enterprises), Anthropic hopes to build institutional knowledge about how to use Mythos responsibly before it reaches general availability.
This mirrors the restricted-access approach Anthropic has applied to biosecurity and nuclear contexts in previous frontier model releases, but extends it meaningfully. Cybersecurity isn’t a niche domain: it touches every enterprise, every government, every critical infrastructure operator. Giving defenders a running start is both ethically sound and strategically smart for Anthropic, which has positioned “responsible scaling” as a core brand differentiator against OpenAI and Google.
The leak also briefly surfaced details of an invite-only CEO retreat held at an 18th-century English countryside manor — a rare glimpse into how Anthropic’s executive leadership operates at the frontier. That detail has understandably received less attention than the model reveal, but it speaks to the company’s distinctive culture: simultaneously building the most dangerous AI ever made and discussing it over countryside weekends with select invitees.
If Your Most Capable AI Is a Cyber Risk, How Do You Govern It?
Claude Mythos crystallizes a problem the AI industry has been circling for two years: at a certain capability level, a general-purpose model becomes a dual-use weapon. The question of how to govern that transition doesn’t have a clean answer. Anthropic’s approach — restricted access, defender-first rollout, internal capability thresholds — is arguably the most rigorous in the industry. But “most rigorous” still means making unilateral decisions about who gets access to technology that could, in adversarial hands, compromise critical infrastructure.
The leak itself underscores the governance paradox. Anthropic is simultaneously the company most vocal about AI risk and the company that just had 3,000 sensitive internal documents exposed through a basic infrastructure error. If you can’t reliably protect your CMS configuration, what does that say about your ability to govern access to a model you’ve described as the most powerful you’ve ever built?
The regulatory implications are real. AI safety frameworks in the EU and UK are beginning to incorporate capability thresholds that could trigger mandatory disclosure or access controls. Claude Mythos — assuming the leaked description is accurate — would almost certainly qualify under the EU AI Act’s “general purpose AI with systemic risk” classification. That means Anthropic’s controlled rollout isn’t just a strategic choice; it may soon become a legal obligation.
What a “Step Change” Means for the AI Capability Map in 2026
Every major AI lab has been racing toward what insiders call “the next tier” — a capability jump significant enough to change what AI can actually do, not just how fast it does it. Anthropic’s description of Claude Mythos as a “step change” is the clearest public signal yet that one lab believes it has crossed that threshold. Whether Mythos represents genuine AGI-adjacent capability or extraordinary performance on specific task categories (coding, cyber, academic reasoning) is the critical question — and the leaked documents don’t fully answer it.
What the leak does confirm is that Anthropic’s internal capability taxonomy has shifted. Claude Opus 4.6 was previously the company’s most capable model. Mythos doesn’t replace it; it sits above it in a new tier, suggesting Anthropic is moving toward a more stratified model portfolio similar to what OpenAI has been building with GPT-4o, o3, and its research models. The implication: frontier AI is no longer a single ladder of models, but a branching tree whose branches serve radically different use cases at radically different access levels.
For the industry, the Mythos leak is a wake-up call. If Anthropic has a model this far ahead on cyber capabilities, competitors are likely in a comparable position. The arms race isn’t slowing down — it’s accelerating into territory that basic CMS security can’t protect.
Claude Mythos vs. Claude Opus 4.6: What We Know
| Attribute | Claude Opus 4.6 | Claude Mythos (Capybara) |
|---|---|---|
| Tier | Flagship (current) | New tier above Opus |
| Coding | State-of-the-art (until now) | Dramatically outperforms Opus |
| Cybersecurity | Strong defensive capability | “Far ahead of any other AI model” |
| Academic Reasoning | Top-tier benchmark scores | Dramatically outperforms Opus |
| Public Access | Generally available | Restricted — cyber defenders first |
| Internal Codename | Opus | Capybara |
Frequently Asked Questions
What is Claude Mythos?
Claude Mythos (internally codenamed Claude Capybara) is Anthropic’s next-generation AI model, described as sitting above Claude Opus 4.6 in a new capability tier. Its existence was revealed through an accidental CMS misconfiguration that exposed approximately 3,000 internal Anthropic assets. Anthropic’s own documents call it “by far the most powerful AI model we’ve ever developed.”
When will Claude Mythos be publicly available?
Anthropic has not announced a public release date. The current rollout strategy restricts access to cyber defenders and vetted security professionals first. Given Anthropic’s cautious scaling policy, a broader release will likely follow a staged evaluation period that could run from several months to more than a year.
How did the Claude Mythos leak happen?
A CMS (content management system) misconfiguration at Anthropic left approximately 3,000 unpublished internal assets in a publicly accessible cache. Security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) discovered the exposure independently and followed responsible disclosure procedures, notifying Anthropic before publishing their findings.
Is Claude Mythos safe to use?
Anthropic’s position is that Mythos is safe within a controlled access framework, which is why it’s being released to cyber defenders first rather than the general public. The company’s own documentation acknowledges the model is “currently far ahead of any other AI model in cyber capabilities” — a statement that inherently frames Mythos as a potential risk requiring careful governance. Public availability will hinge on how the defender-first phase evaluates risk and efficacy.
Related Reading
- Why AI Agents Are the Real Test of AGI — Not Chatbots →
Chatbots aren’t the measure of AGI. AI agents that plan, execute multi-step tasks, and self-correct are the true test.
- The Weekly Brief #001: AGI Is Here, Your Router Is a Liability →
AGI declared, OpenClaw hits 97M users, and what it all means for enterprise AI in 2026.
- The Anthropic Distillation Scandal Is Bigger Than IP Theft →
Anthropic vs. DeepSeek, Moonshot AI, and MiniMax — a national security case that rewrites AI competition.