
Google Just Caught the First AI-Built Zero-Day Used in the Wild — It Was a 2FA Bypass

Google Threat Intelligence Group disclosed the first known AI-developed zero-day used in the wild — a Python 2FA bypass intended for mass exploitation. Google identified the LLM fingerprint and coordinated a patch before the campaign could launch.

Google Threat Intelligence Group disclosed on May 11, 2026, with analyst writeups resurfacing the story on May 17 and 18, the first known instance of a threat actor using an AI system to develop a working zero-day exploit deployed in the wild. The exploit was implemented as a Python script designed to bypass two-factor authentication on a popular open-source, web-based system-administration tool, and was developed by an unnamed cybercrime cluster planning a "mass vulnerability exploitation operation." Google identified the AI fingerprint through telltale LLM-output artifacts (over-documented code, textbook-style formatting, and a fabricated CVSS severity score) and coordinated a patch with the upstream vendor before the planned mass campaign launched.

MOUNTAIN VIEW, CALIFORNIA - On May 11, 2026, Google Threat Intelligence Group (GTIG) disclosed what it characterized as the first known instance of a threat actor using an AI system to develop a working zero-day exploit that was deployed in the wild. The disclosure resurfaced across the security analyst community through follow-on writeups on May 17 and 18. The exploit was implemented as a Python script designed to bypass two-factor authentication on a popular open-source, web-based system-administration tool. It was developed by an unnamed cybercrime cluster that GTIG says was planning a "mass vulnerability exploitation operation": a pre-built campaign that, once launched, targets every reachable instance of the vulnerable software. Google identified the AI fingerprint through three telltale LLM-output artifacts: code that was over-documented relative to its functional complexity, textbook-style formatting that mirrored common LLM output conventions, and a fabricated CVSS severity score that did not correspond to any published advisory. The 2FA bypass required valid user credentials but defeated the second factor, the layer most commonly relied on to protect cryptocurrency wallets and enterprise system-administration consoles. Google coordinated with the upstream vendor to land a patch before the planned mass campaign could launch, disrupting the operation before deployment.

Disclosure Overview
Field	Details
Disclosure	Google Threat Intelligence Group (GTIG), May 11, 2026; follow-on analyst coverage May 17-18
Significance	First publicly confirmed AI-developed zero-day exploit used in the wild
Exploit Type	Python script bypassing two-factor authentication on a popular open-source web-based system-administration tool
Threat Actor	Unnamed cybercrime cluster planning a "mass vulnerability exploitation operation"
AI Fingerprint	Over-documented code, textbook-style formatting, fabricated CVSS severity score
Credentials Requirement	Valid user credentials required; the second factor was defeated
Outcome	Google coordinated with the upstream vendor to land a patch before the planned mass campaign could deploy

What Happened

The Exploit

The exploit GTIG analyzed was a Python script: a single, self-contained piece of code designed to be embedded inside a larger mass-exploitation framework. The target was a popular open-source, web-based system-administration tool, which Google did not publicly name in order to give the upstream vendor time to land the patch. The script bypassed the tool's two-factor authentication layer, the second of its two credential checks. The exploit was useful only against accounts for which the attacker already held a valid first-factor credential (username and password), but defeating the second factor was the one operational gate the threat actor still needed to clear.

The AI Fingerprint

GTIG identified the AI origin of the exploit through three artifacts that distinguish LLM-generated security tooling from human-authored security tooling. First, the code was over-documented: comment density and explanation depth exceeded what a human exploit author would typically include in a payload. Second, the formatting followed textbook conventions that match common LLM output patterns rather than the looser, more idiosyncratic conventions that mark hand-written exploit code. Third — and the most decisive artifact — the script included a fabricated CVSS severity score that did not correspond to any published advisory for the underlying vulnerability. LLMs commonly fabricate CVSS scores when asked to characterize a vulnerability they have analyzed; human exploit authors typically either omit the score or copy it from a published advisory.
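Two of the three artifacts described above lend themselves to a rough automated check. The following is a minimal sketch, not GTIG's method: the function names, the regular expression, and the 0.5 density threshold are all assumptions made for illustration. It measures how many lines of a Python script carry comments relative to lines carrying code, and extracts any CVSS score the script claims so an analyst can compare it against published advisories.

```python
import io
import re
import tokenize

# Hypothetical pattern: the word "CVSS" followed on the same line by a score like 9.8.
CVSS_RE = re.compile(r"\bCVSS\b[^\n]{0,40}?\b(\d{1,2}\.\d)\b", re.IGNORECASE)

# Token types that carry neither code nor comments.
_LAYOUT = {tokenize.NL, tokenize.NEWLINE, tokenize.INDENT,
           tokenize.DEDENT, tokenize.ENDMARKER}


def comment_density(source: str) -> float:
    """Ratio of lines containing a comment to lines containing code.

    Docstrings count as code here; malformed source may raise a tokenize error.
    """
    comment_lines, code_lines = set(), set()
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type == tokenize.COMMENT:
            comment_lines.add(tok.start[0])
        elif tok.type not in _LAYOUT:
            code_lines.add(tok.start[0])
    return len(comment_lines) / max(len(code_lines), 1)


def claimed_cvss_scores(source: str) -> list[float]:
    """Return every CVSS-style score mentioned in the source text."""
    return [float(m.group(1)) for m in CVSS_RE.finditer(source)]


def fingerprint_flags(source: str, advisory_scores: set[float],
                      density_threshold: float = 0.5) -> dict[str, bool]:
    """Flag heavy commenting and CVSS scores absent from known advisories.

    density_threshold is an arbitrary illustrative cutoff, not a calibrated value.
    """
    scores = claimed_cvss_scores(source)
    return {
        "over_documented": comment_density(source) > density_threshold,
        "unverified_cvss": any(s not in advisory_scores for s in scores),
    }
```

A flag from a heuristic like this is a prompt for human review, not proof of AI authorship: heavily commented human code and legitimately cited scores both exist.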

The Disrupted Mass Campaign

The exploit was developed by an unnamed cybercrime cluster planning a "mass vulnerability exploitation operation." Mass-exploitation campaigns rely on a working exploit that runs unattended against every reachable instance of the vulnerable software, harvesting credentials, deploying secondary payloads, or pivoting into adjacent systems at scale. The 2FA bypass would have been the critical chokepoint inside that campaign: the gate that determines how many of the harvested first-factor credentials actually produce successful logins. Google coordinated with the upstream vendor and landed a patch before the campaign could launch, so the disruption came before deployment rather than after it.

Broader GTIG Report Findings
Actor / Cluster	AI-Adjacent Activity
Unnamed Cybercrime Cluster	Python 2FA bypass exploit — first AI-developed zero-day used in the wild
PRC-Linked Groups	Vulnerability discovery and persona-driven jailbreaks of commercial AI models
DPRK-Linked Groups	Bulk exploit validation — using AI to triage exploit candidates at volume
Multi-Actor	Self-morphing malware family with Gemini-powered backdoors documented in the same report

Scope and Impact

The 2FA bypass GTIG disrupted is operationally important on its own terms — a successful mass exploitation against a popular open-source administration tool would have produced thousands of compromised endpoints, each with the credentials of an administrator. The broader significance is what the disclosure says about the attacker-side use of AI. The defender-side AI vulnerability discovery cluster has been the central story of the spring 2026 cycle: Microsoft's MDASH and Palo Alto's Mythos shipped within weeks of each other; OpenAI launched Daybreak into the same competitive space; XBOW has been operating in the bug-bounty layer alongside human researchers. The GTIG disclosure is the moment the attacker-side analog of that cluster arrives operationally.

The cycle of human-discovered and AI-discovered vulnerabilities is also accelerating. XBOW raced human researchers to exploit the Dead.Letter Exim RCE earlier this cycle. NGINX Rift — an 18-year-old rewrite-module RCE — surfaced through AI-assisted retrospective analysis of long-shipped open-source code. Kimsuky's PebbleDash malware was LLM-developed for use against the South Korean defense sector. Germany's intelligence services warned of a China-linked AI Superhacker capability in February. And the Mandia, Stamos, and Adamski RSAC panel called the next 18 to 24 months "the most consequential" in the history of cyber on those exact grounds. The asymmetry the industry has been theorizing about is no longer theoretical.

Response and Attribution

Google has not named the upstream vendor whose tool was targeted, nor the cybercrime cluster that authored the AI-generated exploit. The patch coordination occurred in private, before public disclosure, which is the standard responsible-disclosure flow for a vulnerability that would have enabled mass exploitation. GTIG's report describes the cluster only as a cybercrime actor — not a nation-state — though the report's broader findings document PRC-linked vulnerability discovery and persona-driven jailbreak operations, DPRK-linked bulk exploit validation, and a self-morphing malware family with Gemini-powered backdoors. The actor attribution for the 2FA bypass remains scoped to "an unnamed cybercrime cluster."

For defenders, the operational guidance is concentrated. First, patch the affected open-source administration tool as soon as the vendor's advisory is public (Google has not yet named the tool), and assume that other operators reading the same GTIG report will be re-developing equivalent capability. Second, evaluate whether two-factor authentication is being treated as a hard gate or a soft signal across the enterprise; with AI speeding up exploit development, 2FA bypass logic is likely to be commoditized faster than the industry can deploy stronger controls. Third, add LLM-output fingerprints (over-documented code, fabricated CVSS scores, textbook formatting) to internal exploit-analysis playbooks. The same artifacts GTIG used to detect the AI origin of this exploit will surface in others.
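To make the "hard gate" idea in the second point concrete, here is a minimal sketch of a server-side session state machine in which protected actions are refused until both factors have been verified, in order. GTIG has not described how the actual bypass worked, so this does not model the flaw; the class and function names are hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional


class AuthState(Enum):
    ANONYMOUS = auto()
    PASSWORD_OK = auto()          # first factor verified, second still pending
    FULLY_AUTHENTICATED = auto()  # both factors verified


@dataclass
class Session:
    user: Optional[str] = None
    state: AuthState = AuthState.ANONYMOUS


def complete_first_factor(session: Session, user: str, password_ok: bool) -> None:
    """Record a successful password check; the session is still not trusted."""
    if not password_ok:
        raise PermissionError("first factor failed")
    session.user = user
    session.state = AuthState.PASSWORD_OK


def complete_second_factor(session: Session, code_ok: bool) -> None:
    """Promote the session only if the first factor already succeeded."""
    if session.state is not AuthState.PASSWORD_OK:
        raise PermissionError("second factor attempted out of order")
    if not code_ok:
        raise PermissionError("second factor failed")
    session.state = AuthState.FULLY_AUTHENTICATED


def require_full_auth(session: Session) -> str:
    """Gate for every protected action: anything short of both factors is refused."""
    if session.state is not AuthState.FULLY_AUTHENTICATED:
        raise PermissionError("both factors are required")
    return session.user
```

The design choice worth auditing is that `require_full_auth` runs on the server for every protected request. A deployment that checks the second factor only in the login page flow, or treats it as one input to a risk score, is using it as a soft signal.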

The CyberSignal Analysis

Signal 01 — The Attacker-Side AI Discovery Cluster Has Arrived

CyberSignal's spring 2026 coverage has tracked the defender-side AI vulnerability discovery cluster (Microsoft's MDASH, Palo Alto's Mythos, OpenAI's Daybreak, and XBOW) as one of the cycle's defining structural changes. The GTIG disclosure is the first publicly confirmed in-the-wild deployment of the attacker-side analog. The industry no longer has to estimate when AI-developed exploits will become operational; the answer is now. The implication is that any defender-side velocity gain from AI vulnerability discovery has to be measured against an attacker-side velocity gain arriving on the same time horizon. The RSAC panel of Mandia, Stamos, and Adamski anticipated exactly this: the next 18 to 24 months will be defined by which side absorbs AI capability faster.

Signal 02 — 2FA Was Not Actually the Hard Gate

The exploit GTIG analyzed required valid first-factor credentials and bypassed the second factor. That is the credential architecture the industry has spent the last decade migrating customers to: a username and password backed by a second factor such as a time-based one-time password, a fingerprint, or a push notification. The 2FA layer is supposed to be the load-bearing control that absorbs credential compromise. When 2FA bypass logic is commoditized through AI-assisted exploit development, the industry's de facto credential architecture has a softer floor than the marketing language suggests. CISOs should be re-evaluating which credential architectures are genuinely phishing-resistant, such as hardware-backed FIDO2 and attested device flows, and accelerating the migration off TOTP-style 2FA within the next exposure cycle. The Kimsuky LLM-developed PebbleDash track and the German AI Superhacker warning both point to the same conclusion from the nation-state side.
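For readers less familiar with why TOTP is usually classed as not phishing-resistant, the sketch below implements the standard TOTP calculation from RFC 6238 (HMAC-SHA1 variant) using only the Python standard library. It is illustrative background and says nothing about how the exploit GTIG analyzed worked. The function name and defaults are choices made here; the algorithm itself follows the RFC.

```python
import hashlib
import hmac
import struct


def totp(secret: bytes, unix_time: int, step: int = 30, digits: int = 6) -> str:
    """Compute an RFC 6238 TOTP code (HMAC-SHA1) for the given time."""
    counter = unix_time // step                      # index of the current time window
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                          # dynamic truncation (RFC 4226)
    value = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(value % 10 ** digits).zfill(digits)


# RFC 6238 Appendix B test vector: ASCII secret, T = 59 s, 8 digits.
print(totp(b"12345678901234567890", 59, digits=8))  # → 94287082
```

The code depends only on a shared secret and the clock, and it is the same for every second inside a 30-second window. Nothing binds it to the site the user is looking at, so a code typed into a look-alike page can be relayed to the real service while it is still valid. FIDO2 credentials, by contrast, sign a challenge that is bound to the requesting origin.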

Signal 03 — Pre-Deployment Disruption Is the Defender's Best New Tool

The most operationally interesting part of the GTIG disclosure is not the AI fingerprint or the 2FA bypass but the timing. Google identified the exploit, identified its AI origin, identified the planned mass-exploitation operation, and coordinated a patch before the campaign deployed. That is pre-deployment disruption rather than post-deployment forensics. Defender-side AI discovery infrastructure produces this kind of intelligence at a scale that lets vendors and law enforcement coordinate patches before deployment. The CrowdStrike Falcon AIDR Kubernetes expansion and the broader AI runtime security market are pieces of the same machinery. Defenders should expect the next two years to produce more pre-deployment disruptions and fewer post-deployment cleanups, provided defender-side AI investment continues to keep pace.

Sources

Type	Source
Primary	Google Threat Intelligence Group — May 2026 Report
Reporting	The Hacker News (Ravie Lakshmanan) — Hackers Used AI to Develop First Known Zero-Day
Reporting	Tom's Hardware — Google Finds First AI-Developed Zero-Day That Bypasses 2FA
Reporting	Cybernews — First AI-Assisted Zero-Day Exploit
Reporting	CryptoBriefing — Google AI Zero-Day Exploit 2FA Bypass
Reporting	CNBC — Google Thwarts Hacker Group's AI Mass Exploitation Effort
