AI Models · May 24, 2026

Anthropic's Claude Mythos Preview: The Most Capable AI Model Ever Evaluated — Locked Behind Project Glasswing

Mythos autonomously finds and exploits zero-days across every major OS and browser, scoring 73% on expert CTF tasks. Anthropic classified it under heightened RSP safeguards and restricted access to 12 founding partners.

Anthropic released the Mythos Preview on April 7, 2026 without a public announcement. No blog post. No social media. Twelve founding organizations received access through a sealed program called Project Glasswing, with roughly 40 additional vetted critical-infrastructure operators admitted since. The wider world found out through partner disclosures and a formal capability evaluation published by the UK’s AI Security Institute (AISI).

What Mythos Can Do

The cybersecurity benchmarks are the headline. The UK AISI’s independent evaluation found Mythos achieving a 73% success rate on expert-level capture-the-flag (CTF) challenges, tasks that routinely defeat professional red teams. In internal benchmarks, the model autonomously developed 181 working Firefox engine exploits and, with no human guidance, built a 20-gadget ROP chain targeting a 17-year-old FreeBSD NFS vulnerability (CVE-2026-4747).

On standard coding and reasoning benchmarks: 93.9% on SWE-bench Verified, 77.8% on SWE-bench Pro, 82.0% on Terminal-Bench 2.0, and 97.6% on USAMO 2026.

Why It Is Not Public

During internal evaluation, Mythos autonomously discovered previously unknown vulnerabilities across every major operating system and browser and produced working exploits for them. Anthropic concluded that the model’s capabilities exceed what current containment and monitoring measures could handle in a public deployment.

The company has not officially published the RSP tier assigned — secondary sources cite protocols consistent with ASL-3 or ASL-4 safeguards — but the operational posture is clear: no public access, no API, no waitlist.

For Glasswing partners, pricing is $25 per million input tokens and $125 per million output tokens — roughly 5× Opus 4.7 rates. Usage is logged at the session level, with Anthropic retaining audit rights over all interactions.
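
To put those rates in concrete terms, here is a minimal Python sketch of the arithmetic. The two per-million-token prices are the ones quoted above; the helper name `session_cost_usd` and the example token counts are hypothetical, chosen only for illustration, and nothing here reflects an actual Anthropic billing API.

```python
# Per-million-token prices quoted in this article for Glasswing partners (USD).
INPUT_USD_PER_MTOK = 25.0
OUTPUT_USD_PER_MTOK = 125.0


def session_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cost of one session from raw token counts.

    Hypothetical helper for illustration only; not part of any real API.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    input_cost = (input_tokens / 1_000_000) * INPUT_USD_PER_MTOK
    output_cost = (output_tokens / 1_000_000) * OUTPUT_USD_PER_MTOK
    return input_cost + output_cost


# Illustrative session (assumed numbers): 2M input tokens, 500k output tokens.
example = session_cost_usd(2_000_000, 500_000)
print(f"${example:,.2f}")  # $112.50
```

At these prices, output tokens dominate quickly: the 500,000 output tokens in the example cost more than the 2 million input tokens.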

The Glasswing Access Model

Project Glasswing is Anthropic’s answer to a new kind of problem: a model capable enough to cause catastrophic harm if widely accessible, but too useful in specific high-stakes domains — national infrastructure security, advanced vulnerability research, classified defense applications — to simply withhold.

The 12 founding organizations include national laboratories, critical-infrastructure operators, and at least two allied-government entities. None have been publicly named. Applications for additional Glasswing slots typically take 60–90 days to review, according to partners who have spoken on the record.

What This Signals for the Industry

Mythos marks the first time a frontier lab has formally acknowledged deploying a model it does not consider safe for general use. Nor is the restriction a temporary state pending a safety patch: Anthropic stated that there are currently no known techniques to make Mythos safe for public deployment.

The gap between frontier capability and safely deployable capability is now visible, formally documented, and growing. Bishop Fox’s analysis calls it “the AI cybersecurity inflection point”: a model that reverses the offense-defense asymmetry in vulnerability research, making autonomous zero-day discovery cheaper than human red-teaming for the first time.

For security professionals, the implications cut both ways. The same capability that makes Mythos valuable for critical-infrastructure defense makes it dangerous if a comparable model — or a leaked version — reaches adversarial actors. The Glasswing vetting process is, functionally, the only thing standing between that capability and open access.
