Mythos Preview

Mythos Preview is Anthropic's preview of a cyber frontier model: a frontier-tier LLM specialised for offensive security work (vulnerability discovery, exploit reasoning, proof generation). The first canonical wiki disclosure comes from Cloudflare's 2026-05-18 writeup "Project Glasswing: what Mythos showed us", published after Cloudflare ran the model against "more than fifty of our own repositories" spanning runtime, edge data path, protocol stack, control plane, and OSS dependencies.

Capability disclosures

Cloudflare names two capabilities that defined the jump from previous general-purpose frontier models:

  • Exploit chain construction: "Mythos Preview can take several of these primitives and reason about how to combine them into a working proof. The reasoning it shows along the way looks like the work of a senior researcher rather than the output of an automated scanner." The consequence that matters for the wiki: previously low-severity bugs that "would traditionally sit invisible in a backlog" can now be chained into a single, more severe exploit, which reclassifies that backlog as actual security risk.
  • Proof generation: the model writes code that would trigger a suspected bug, compiles it in a scratch environment, runs it, reads the failure, adjusts its hypothesis, and tries again. "A suspected flaw without a working proof is speculation, and Mythos Preview closes that gap on its own."
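
The proof-generation loop described above (write a probe, run it in a scratch environment, read the failure, try again) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Cloudflare's harness or anything Mythos Preview emits: the toy `first_byte` target, the helper names `write_probe`, `run_probe`, and `prove`, and the fixed list of hypotheses are all hypothetical. In particular, the fixed list stands in for the step where a model would read the output and revise its hypothesis.

```python
import subprocess
import sys
import tempfile
from pathlib import Path
from typing import Optional

# Toy target with a deliberate bug (hypothetical): it fails on empty input.
TARGET_SOURCE = '''
def first_byte(data: bytes) -> int:
    return data[0]  # IndexError when data is empty
'''


def write_probe(scratch: Path, payload: bytes) -> Path:
    """Write a standalone script that feeds one payload to the target."""
    probe = scratch / "probe.py"
    probe.write_text(TARGET_SOURCE + f"\nfirst_byte({payload!r})\n")
    return probe


def run_probe(probe: Path) -> subprocess.CompletedProcess:
    """Run the probe in a child process and capture its output."""
    return subprocess.run(
        [sys.executable, str(probe)],
        capture_output=True,
        text=True,
        timeout=30,
    )


def prove(hypotheses: list[bytes]) -> Optional[dict]:
    """Try each hypothesis in turn; return the first that triggers a failure."""
    with tempfile.TemporaryDirectory() as tmp:
        scratch = Path(tmp)
        for attempt, payload in enumerate(hypotheses, start=1):
            result = run_probe(write_probe(scratch, payload))
            if result.returncode != 0:
                # Failure observed: the suspicion now has a reproducer.
                last_line = result.stderr.strip().splitlines()[-1]
                return {"attempt": attempt, "payload": payload, "error": last_line}
            # No failure: a real agent would revise its hypothesis here.
    return None
```

Calling `prove([b"ab", b"a", b""])` runs three probes and returns a record for the third, since only the empty payload makes the toy target fail. Running each probe in a child process keeps a crash in the target from taking down the loop itself.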

Cloudflare observed that other frontier models found the same underlying bugs but stopped short of stitching them into a working chain: "a model would identify an interesting bug, write a thoughtful description of why it mattered, and then stop, leaving the actual chain unfinished and the question of exploitability open."

Safeguards posture (preview)

Mythos Preview was provided to Cloudflare via Project Glasswing without the additional safeguards present in generally available models like Opus 4.7 or GPT-5.5. Despite this lower-baseline guardrail set, the model exhibits emergent organic refusals on certain legitimate-research requests, and those refusals vary with framing: "the same task, framed differently or presented in a different context, could produce completely different outcomes". Cloudflare's stated position is that the organic refusals are real but not consistent enough to serve as a complete safety boundary on their own: "any capable cyber frontier model made generally available in the future must include additional safeguards on top of this baseline behavior — making it appropriate for broader use outside of a controlled research context like Project Glasswing."

The wiki canonicalises this observation as concepts/model-organic-refusal-inconsistency.

Comparison reference points

The post explicitly distinguishes Mythos Preview from GA-tier models:

"The Mythos Preview model provided by Anthropic, as part of Project Glasswing, did not have the additional safeguards that are present in generally available models (like Opus 4.7 or GPT-5.5)."

For the wiki, this is the first disclosure in which a post-Sonnet-4 Anthropic frontier capability tier is set against GPT-5.x as a reference point (on safeguards posture, not on benchmark results), and the first cyber-specialised frontier-tier model named.

Operational use shape

Mythos Preview is the model behind the hunter agents in Cloudflare's vulnerability discovery harness: the agents that compile PoC code in per-task scratch directories and chain primitives into proofs. Crucially, Cloudflare also used Mythos Preview to refine the harness itself: "We used Mythos Preview to build on, tailor, and improve our original harnesses to suit its strengths." This is dogfooding at the tooling level: the model that runs the pipeline also helped shape parts of that pipeline.
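
The per-task scratch directories mentioned above can be illustrated with a small sketch. The post does not describe how Cloudflare implements them, so everything here is an assumption for illustration: the `run_task` and `run_all` helpers, the `hunt-` directory prefix, and the `poc.txt` artifact name are hypothetical. The point is only that each task gets its own temporary directory, so concurrent tasks can write identically named files without colliding.

```python
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def run_task(task_id: str) -> dict:
    """Run one hypothetical hunter task inside its own scratch directory."""
    with tempfile.TemporaryDirectory(prefix=f"hunt-{task_id}-") as tmp:
        scratch = Path(tmp)
        # Every task writes the same filename; isolation prevents collisions.
        artifact = scratch / "poc.txt"
        artifact.write_text(f"notes for {task_id}\n")
        return {
            "task": task_id,
            "scratch": str(scratch),
            "content": artifact.read_text(),
        }
    # The directory and its contents are removed when the block exits.


def run_all(task_ids: list[str]) -> list[dict]:
    """Run tasks concurrently; results come back in input order."""
    with ThreadPoolExecutor(max_workers=4) as pool:
        return list(pool.map(run_task, task_ids))
```

For example, `run_all(["a", "b", "c"])` returns three records with three distinct scratch paths, none of which exists on disk afterwards. Tying cleanup to the context manager means a task that raises still leaves no residue behind.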

Open questions / what's not disclosed

  • Architecture, parameter count, training-data shape: not disclosed; the post is an external use-case retrospective.
  • General availability timeline: not stated; Cloudflare's argument that GA cyber frontier models "must include additional safeguards on top of this baseline behavior" reads as a precondition, not a release date.
  • Quantitative results vs Opus 4.7 / GPT-5.5: Cloudflare declines to publish a benchmark. It describes "a clean apples-to-apples comparison to earlier models" as difficult and concludes: "So rather than trying to benchmark Mythos Preview against general-purpose frontier models, it's more useful to describe what it can actually do."

