Mythos: Pandora’s Box or Marketing Stunt?
Too powerful, too autonomous, too dangerous: barely had Anthropic unveiled Claude Mythos Preview in early April before it tried to close the lid again — no public release, but restricted access via Project Glasswing. It was not unveiling a new high-performance AI model in cybersecurity; it was staging a technological breakthrough, immediately sparking a worldwide buzz. A crime too perfect to be true?
On substance, however, something has changed. With a 100% success rate, Claude Mythos saturates the Cybench benchmarks, the academic reference in the field. Better yet, on exercises from the UK AI Security Institute (AISI), more complex and graded from ‘beginner’ to ‘expert’, it ‘has a 73% success rate’ at expert level, the AISI notes. This is not a spectacular score in itself, except when you recall that these tasks were out of reach for models just a year ago. And these are still capture-the-flag exercises: the AI must find a vulnerability and retrieve a flag, a proof of success — isolated cyber exercises.
If, still according to the AISI, Mythos ‘represents progress over previous state-of-the-art models’, it is because of its success in The Last Ones scenario, a complete 32-step computer attack simulation. Where a human takes around twenty hours, Mythos is the first model to have completed it end-to-end, with three successes out of 10 and an average of 22 steps completed, compared to 16 for Claude Opus 4.6, its closest rival. If Mythos is so frightening, it is not because it got a good grade on a standardized exam, but because the examiners had to throw away the questionnaire and ask it to burgle a fake company.
Mythos Eliminates Friction
Anthropic pushes the narrative even further. In its own tests, the company claims the model can identify and exploit critical vulnerabilities without human intervention, going so far as to describe complete exploits produced overnight. ‘Mythos Preview was able to find and exploit zero-day vulnerabilities in every major operating system and every major web browser.’ A striking example: a 17-year-old FreeBSD vulnerability, found and exploited immediately.
Behind the sensationalist rhetoric lies a ground-level reality, summarized by Andres Mendoza, CTO of Zoho Corporation B.V., at the INCYBER Forum 2026: ‘attackers use a simple tool you use yourself every day: ChatGPT’, adding that it ‘considerably reduces the time needed to create something.’ In other words, Mythos is part of a trend already underway: the compression of time, the disappearance of friction.
And it is this friction that partly protected companies. Exploiting a misconfiguration, an old component, poorly designed authentication, or the necessary chain of bugs costs time, talent, and nerve. Where an attack once required days of preparation and tedious tasks, it now takes only a few minutes. The novelty of Mythos is not that it invented this movement, but that it pushed it to a level where the agent no longer merely assists: from discovering the flaw to transforming it into a functional exploit, it chains everything together — upending the balance of power between attackers and defenders.
The Buzz Worth 800 Billion
This explains the brutality of Anthropic’s reaction: the model is placed under control via the Glasswing program, backed by $100 million in credits. It is accessible only to a select circle of handpicked players — 12 founding members, giants such as GAFAM, the Linux Foundation, and Nvidia. The remaining 40 critical infrastructure operators and trusted partners remain in the shadows, making the model a strategic asset all the more valuable because most companies lack it — creating a potentially major competitive distortion. This issue also plays out at state level, since the NSA (National Security Agency) has admitted using it. For purely defensive purposes? China, Russia, Iran, or North Korea may find out soon.
While Anthropic cites caution to explain this choice, it is also set against a very particular backdrop. The company is in the midst of a major financial build-up, with valuations mentioned in the hundreds of billions of dollars. It hopes to raise $400 to $500 billion in an IPO planned for next October. Some investors even value the company at $800 billion! Anthropic has also massively invested in computing infrastructure. In this context, claiming that a model is ‘too dangerous to release’ produces an immediate effect: a rare, expensive, sensitive, coveted model is worth more than a model that is simply ‘good.’ The storytelling is devastatingly effective, right when investors need to be impressed. To put even more pressure on them, British and American financial authorities have each met separately to assess the risks posed by Mythos.
David vs. Goliath
Scarcity is all the more profitable because Mythos seems expensive to run usefully. Anthropic does not publicly explain the cost of a genuine end-to-end vulnerability research campaign, but the model is certainly heavy to run. On the Glasswing page, some Mythos benchmarks are listed with budgets of one million tokens per task, multiple attempts, and extended time windows. At $25 per million input tokens and $125 per million output tokens after the allocated credit phase, Mythos is not within everyone’s budget.
Critics were quick to appear. Researchers and players such as AISLE showed that smaller models could rediscover some of the vulnerabilities highlighted by Mythos, or re-exploit certain demonstrations. VentureBeat summarizes one such counter-argument: eight out of eight models detected the FreeBSD bug highlighted by Anthropic, with a small 3.6-billion-parameter model priced at 11 cents per million tokens — a fraction of the resources and cost required by Anthropic’s model. This risks breaking the spell: no, Mythos is not necessarily a ‘secret super-weapon’ in the sense that only it can see the invisible.
Yet this does not destroy Anthropic’s central argument, because small models perform well when guided toward the right target, the right bug family. Mythos’s claimed advantage is not merely knowing how to pick a lock, but wandering alone through the building until it finds the back door.
Mythos the Liar
An autonomy that is both the key to Mythos’s success… and arguably its greatest danger. Anthropic proudly states that Mythos is ‘significantly more capable and used in a more autonomous and proactive manner than any previous model.’ Why? First, while you sleep, Mythos reviews the day’s actions to correct its errors and continues working — but above all, it dives gleefully into a grey zone.
The company’s risk report notes that the model may adopt problematic behaviors to achieve its objective, delicately referring to its ‘willingness to perform non-compliant actions.’ It is capable of lying about its identity or escaping a secured environment and autonomously publishing its exploit. In an early version of the model, Mythos was able to erase its traces after a prohibited action or claim it was following the rules while breaking them. Anthropic assures us that the latest versions partially correct these behaviors. One can only hope so, because that is the heart of the problem. From the moment a model itself chooses how to overcome an obstacle, the boundary between a useful initiative and a dangerous one becomes all the thinner given that no one is truly able to monitor when the yellow line is crossed.
Here again, Mythos merely accelerates an already ongoing process. At the INCYBER Forum 2026, Pierre Meganck, cybersecurity consultant at LINKT, already described systems where AI no longer merely executes: ‘The AI agent will decide on its own that it must attack those three companies because they are easier to attack.’
Anthropic’s Leak Parade
AI does not change the nature of the threat, but its speed and depth. Attacks become faster, more numerous, more personalized; they better exploit data and better target their victims. Mythos automates and autonomizes this process, widening the gap between the speed of attack and the speed of defense. Yet all cybersecurity rests on this speed differential. ‘There is an imbalance taking shape, a troubling form of asymmetry,’ Colonel Herve Petry, commander of the national cyber unit of the Gendarmerie, warned at the INCYBER Forum 2026, with Pascal Le Digol, France country manager at Watchguard Technologies, adding at the same roundtable: ‘It is no longer just a small lead — they [the hackers] have a real technological reserve. […] That creates a colossal catch-up gap.’
Even if the alarmist discourse serves its interests, Anthropic is right to highlight the capability leap its model represents. But its white-knight armor cracks somewhat when you note that it allowed nearly 3,000 documents to leak, most related to Mythos development. Worse, a few days later, more than 500,000 lines of Mythos code and 1,900 files ended up in the wild. And in trying to erase the traces of this blunder, the company accidentally removed 8,100 GitHub repositories, most of which were perfectly legitimate.
‘An Insufficient Level of Rigor’
Finally, it emerged that ‘a small group of unauthorized individuals’ had accessed Mythos on the very day Anthropic announced limiting its distribution to the tech happy few. In short, the paradox is almost perfect: the company that claims to have created a tool capable of detecting and exploiting complex vulnerabilities finds itself weakened by relatively classic operational errors — such as failing internal processes, poor management of its partners or access rights… It always comes back to basics: cybersecurity does not depend solely on the power of AI, but begins with mastering the factor located between the chair and the keyboard. And there, Anthropic performs no miracles — far from it.
As the risk report published by the company about Mythos notes: ‘During the development of Mythos Preview, we identified errors in our training, monitoring, evaluation, and security processes. We do not believe these errors present significant safety risks for a model at this performance level, but they reflect an insufficient level of rigor for more capable future models.’ Setting aside the fact that it exonerates itself from responsibility for controlling the current model, this is probably the most honest sentence in the whole affair. It underlines that while Mythos is impressive enough to cause serious concern, Anthropic seems to better master the art of buzz than the world it is opening up.