AI Models Gain Hacking Power as States Scramble

Advanced artificial intelligence systems designed to identify cybersecurity vulnerabilities are demonstrating hacking capabilities that far exceed expert expectations, raising urgent questions about whether democratic governments and critical infrastructure operators can secure their networks before adversaries weaponize the technology.

The United Kingdom's AI Security Institute found that Anthropic's Mythos can fully take over a corporate network in six out of 10 attempts, while OpenAI's GPT-5.5 succeeded in three out of 10 tries—performance levels that have alarmed security researchers, government officials, and infrastructure operators who depend on robust defenses to protect hospitals, water systems, and telecommunications networks.

"Cyber capabilities in leading AI systems are advancing much faster than expected," British AI Minister Kanishka Narayan stated, signaling deep concern among policymakers about the speed at which these tools are evolving.

The Scale of the Threat

Nine of the nation's top cyber researchers and technology leaders who tested the models in controlled settings unanimously concluded that the tools are advancing much faster than anticipated and will fundamentally change the digital security landscape. The findings underscore a critical governance challenge: powerful technologies are being developed and deployed faster than institutions can regulate them.

Anthropic revealed that Mythos had "already found thousands of high-severity vulnerabilities, including some in every major operating system and web browser," and warned that the consequences of releasing such technology could be "severe" for global economies, public safety and national security. Broadcom, which tested Mythos against its own software code, described its findings as "jolting" in a report published last month, stating: "We are learning things that appear unlikely to ever have been uncovered by human researchers alone."

Lee Klarich, chief product and technology officer at Palo Alto Networks, said testing Mythos was transformative. "It was very clear to me that this was going to be a game-changer," he said, adding: "I would actually say if you asked me today, it's more [powerful] than I thought it was going to be then."

Isaac Evans, CEO of cybersecurity company Semgrep, said Mythos "exceeded our expectations," noting that "the model's not superhuman across all dimensions, but at least in some narrow cases, it's really demonstrating an uncanny ability around exploit generation." Evans warned that some described Mythos as capable of generating "a SolarWinds every quarter," a reference to the Russian government's 2020 breach of U.S. federal agencies. That attack, carried out through compromised software, affected more than 18,000 organizations worldwide and is widely regarded as one of the worst hacks in history.

The Governance Gap

The rapid advancement of these capabilities has exposed a significant gap between technological development and institutional oversight. Government agencies, congressional committees, banks and regulators have been clamoring for access in recent weeks so they can secure critical networks before adversaries obtain the technology to launch devastating cyberattacks.

Both Anthropic and OpenAI have limited testing of their frontier AI models to small groups of trusted organizations because of the technologies' advanced cyber capabilities, which have outpaced publicly available digital tools and even the most skilled human minds. This gatekeeping approach, while cautious, raises questions about whether democratic institutions have adequate access to understand and defend against threats posed by privately developed systems.

Concerns are rising that China and other adversaries could soon develop their own advanced AI tools. China has launched what officials describe as an industrial-scale campaign to copy American AI technology through so-called distillation attacks—a development that underscores how quickly technological advantages can erode in the absence of international cooperation and robust regulatory frameworks.

Policy Stalled at Critical Moment

The Trump administration is described as acutely aware of these dangers and scrambling to work with tech companies, government agencies and critical infrastructure groups to figure out how to deploy these tools quickly and safely. However, the policy process has stalled at a critical juncture.

Earlier this week, President Donald Trump abruptly postponed signing an executive order that would have established a voluntary process for tech companies to submit certain AI models to the federal government for testing. Former AI czar David Sacks raised concerns with Trump at the last minute that the executive order would stifle innovation, creating uncertainty about when, or whether, a formal testing framework will be established.

Trump told POLITICO on Friday that he had "many" concerns about the draft executive order and worried it was "inhibiting the industry." It is unclear when the executive order will be signed, leaving a regulatory vacuum at a moment when officials and infrastructure operators say they urgently need structured access to test and prepare defenses.

Competing Visions of Defense

Security experts are divided on how to leverage these tools responsibly. Klarich suggested that defenders could try to use the strengths demonstrated by various AI models, including Mythos and GPT-5.5, to create a "multi-model architecture" to secure their networks. "There's a future state where we will actually be producing more secure products, more secure code as opposed to having to remediate things that are already released," he said.

However, Evans offered a more cautious assessment. "These model developments mainly are advantages for attackers rather than defenders," he said, highlighting the asymmetry in how offensive and defensive capabilities are likely to evolve.

Jonathan Trull, chief information security officer of IT security company Qualys, which is testing GPT-5.5, said the model "can basically do what your most advanced app security engineer can do." Cloudflare Chief Security Officer Grant Bourzikas said in a blog post published this week that Mythos can both identify vulnerabilities and write code to exploit them, marking a "real step forward" for this type of advanced AI technology.

Why This Matters

The rapid advancement of AI hacking capabilities presents a fundamental challenge to democratic governance: powerful technologies developed by private companies are outpacing the institutional capacity of governments and critical infrastructure operators to understand, regulate, and defend against them. The delay in establishing a formal testing framework means that public institutions lack structured access to assess risks and prepare defenses at the moment when such preparation is most urgent. The asymmetry between offensive and defensive applications suggests that without robust public oversight and international cooperation, these tools may increase rather than decrease cybersecurity inequality—leaving smaller institutions and developing nations more vulnerable while well-resourced actors gain disproportionate advantage. The policy stall reflects a broader tension between those prioritizing innovation speed and those emphasizing precaution, a debate that directly affects whether democratic institutions can maintain control over critical infrastructure security.
