A Poisoned Notification Can Hijack Google Gemini's Voice Assistant on Android

SafeBreach researchers found that a single poisoned notification — from WhatsApp, Slack or SMS — could hijack Google Gemini's voice assistant on Android with no malicious app installed, reaching smart-home controls and poisoning the assistant's long-term memory. Google has patched it.

Key Takeaways

SafeBreach researchers found a prompt-injection flaw in Google Gemini's voice assistant on Android: an attacker could hide instructions inside an ordinary notification from apps like WhatsApp, Slack or SMS and have the assistant act on them, with no malicious app installed on the phone.
Demonstrated actions included controlling Google Home smart-home devices, starting a Zoom call, crafting messages that appear to come from trusted contacts, and poisoning the assistant's long-term memory for persistent influence.
Google has patched the issue with content-classifier updates following responsible disclosure. SafeBreach lists no CVE, and there is no evidence the technique was used in the wild. Even so, it is the second 'AI assistant trusts hostile input' failure The CyberSignal has covered in a week.

The attack surface was the notification itself: Gemini treated hostile, attacker-controlled text arriving in a banner as a trusted instruction — the same class of failure as a recent Meta-AI hijack, but reaching into smart-home control and persistent memory.

SUNNYVALE, CALIFORNIA — SafeBreach Labs has disclosed a prompt-injection vulnerability in Google Gemini's voice assistant on Android that could let an attacker hide malicious commands inside an ordinary notification — from apps such as WhatsApp, Slack and SMS — and have the assistant act on them. Per The Hacker News, SecurityWeek and Dark Reading (reporting dated June 3 and 4, 2026), no malicious app needed to be installed on the phone: a single poisoned notification was enough to push the assistant toward dangerous actions, including controlling Google Home smart-home devices, starting a Zoom video call, crafting messages that appear to come from trusted contacts, and quietly poisoning Gemini's long-term memory. Google has since patched the issue, SafeBreach lists no CVE for it, and there is no evidence the technique was ever used in the wild.

The defining feature of the attack is where the hostile input came from. The malicious instructions rode in on a notification — content the assistant did not generate and the user did not author — and Gemini processed that attacker-controlled text as though it were a trusted command. That is the same 'AI assistant as attack surface' failure mode The CyberSignal documented days earlier in the Meta-AI support-bot hijack, but with consequences that reach into the physical home and persist across sessions.

Research Overview
Field	Details
Discovered By	SafeBreach Labs
Target	Google Gemini voice assistant on Android
Class	Indirect prompt injection via notification content
Delivery Apps	Notifications from messaging apps including WhatsApp, Slack and SMS
Precondition	No malicious app required on the device
Demonstrated Actions	Control Google Home smart-home devices, start a Zoom call, craft messages appearing to come from trusted contacts, poison the assistant's long-term memory
Bypass Technique	'Fake Context Alignment' — seeding a question into conversation history so a routine 'Yes' authorizes a hidden action
Status	Patched by Google via content-classifier updates after responsible disclosure; no CVE assigned; no known in-the-wild use

What Happened

Per SafeBreach's research, the attack exploits the way Gemini's voice assistant ingests context on Android. Notifications from messaging apps surface text into the assistant's reach, and the researchers found that attacker-controlled text in a notification — from WhatsApp, Slack, SMS and similar apps — could be treated as an instruction rather than as untrusted data. Because the assistant is wired into the device's broader capabilities, that misplaced trust translated into real actions: the researchers demonstrated controlling Google Home smart-home devices, initiating a Zoom call, generating messages that appear to come from a trusted contact, and writing into the assistant's long-term memory to influence its future behavior. Crucially, none of this required installing a malicious app; the notification was the entire delivery mechanism.
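
One way to picture the missing boundary is to tag every piece of context with where it came from and to render untrusted text strictly as quoted data. The Python sketch below is illustrative only: `Source`, `ContextItem`, `may_issue_commands` and `build_prompt` are hypothetical names, not part of Gemini, Android or SafeBreach's research, and labeling text this way does not by itself guarantee that a model will refuse to follow it.

```python
from dataclasses import dataclass
from enum import Enum


class Source(Enum):
    USER = "user"                  # spoken or typed by the device owner
    NOTIFICATION = "notification"  # text surfaced from another app


@dataclass(frozen=True)
class ContextItem:
    source: Source
    text: str


def may_issue_commands(item: ContextItem) -> bool:
    """Only content authored by the user is eligible to be read as a command."""
    return item.source is Source.USER


def build_prompt(items: list[ContextItem]) -> str:
    """Render context so untrusted text is quoted as data, never as instructions."""
    lines = []
    for item in items:
        if may_issue_commands(item):
            lines.append(f"USER INSTRUCTION: {item.text}")
        else:
            # repr() quotes the text and escapes newlines, so attacker text
            # cannot start a new line that imitates a user instruction.
            lines.append(f"UNTRUSTED DATA ({item.source.value}): {item.text!r}")
    return "\n".join(lines)
```

The design point is that provenance travels with the text from the moment it is ingested, so any later decision about running a tool can ask where the request originated instead of inferring it from wording.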

To defeat Google's earlier safeguards, the researchers developed a technique they call 'Fake Context Alignment.' In the demonstrated example, Gemini speaks a question in Chinese, 你想打开窗户吗? ("Do you want to open the window?"), immediately followed by the English phrase "Is that all you needed me to do?" Because the Chinese question is now part of the conversation history, the backend safety mechanism aligns the user's natural "Yes" with the planted instruction to open the window and authorizes the tool execution. It is a manipulation of the assistant's own context window, turning an innocuous affirmation into consent for an action the user never knowingly approved. Following responsible disclosure, Google rolled out content-classifier updates to mitigate the issue; SafeBreach lists no CVE and reported no evidence of real-world abuse.

The 'Fake Context Alignment' trick is worth dwelling on because it generalizes beyond Gemini. Conversational assistants make decisions based on their conversation history, and they often treat a user's short affirmation — a "Yes," a "Sure" — as consent for whatever was most recently proposed. By injecting its own proposal into that history and then prompting a benign-seeming confirmation, an attacker can borrow the user's consent for an action the user never understood. This is the assistant-specific version of a 'confused deputy' problem, and it is the same underlying weakness The CyberSignal described in the Meta AI support-bot hijack, where high-profile Instagram accounts were taken over by asking the bot nicely. The model is not being broken; it is being persuaded, using the trust it places in its own context.
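
The consent-borrowing problem described above can be sketched as a check that binds a short affirmation to one specific, user-originated proposal. This is a minimal Python illustration under stated assumptions, not Google's mitigation (which the reporting describes only as content-classifier updates): `Proposal`, `resolve_affirmation` and the `origin` labels are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

AFFIRMATIONS = {"yes", "sure", "ok", "okay"}


@dataclass(frozen=True)
class Proposal:
    action: str         # e.g. "open_window"
    origin: str         # "user" or "external" (notification, message, document)
    shown_to_user: str  # the exact question that proposed the action


def resolve_affirmation(reply: str, last_question: str,
                        pending: Optional[Proposal]) -> Optional[str]:
    """Return the action a short 'Yes' authorizes, or None if it authorizes nothing."""
    if reply.strip().lower().rstrip(".!") not in AFFIRMATIONS:
        return None
    if pending is None:
        return None
    # Consent must trace back to a request the user made, not to injected text.
    if pending.origin != "user":
        return None
    # The 'Yes' must answer the question that proposed this action, not a
    # later, unrelated question appended after it.
    if pending.shown_to_user != last_question:
        return None
    return pending.action
```

Under this sketch, the demonstrated sequence fails twice: the window proposal originates from external content, and the user's "Yes" answers "Is that all you needed me to do?" rather than the question that proposed the action.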

Why Smart-Home Control and Memory Poisoning Change the Stakes

Most prompt-injection demonstrations end in a one-shot action — a leaked secret, a single unwanted message. This one reaches further in two directions that make the consequences durable. First, by controlling Google Home devices, the attack crosses from the screen into the physical environment, where the affected objects are locks, lights, cameras and appliances. Second, by writing into the assistant's long-term memory, the attack establishes persistence: a poisoned memory entry can shape the assistant's behavior in future sessions, long after the original notification is gone. That combination — physical reach plus persistence — is what elevates this from a roundup line to a standalone concern, and it fits the broader arc The CyberSignal has tracked in how AI is being used in cyberattacks.

The Pattern: Assistants Keep Trusting Hostile Input

Within a single week, two major AI assistants, Meta's support bot and now Gemini, were shown to act on instructions that arrived from outside the user. The recurring failure is foundational rather than incidental: assistants are being granted real capabilities (account actions, smart-home control, messaging) while still treating external content as trustworthy by default. Vendors are responding: Google shipped classifier updates here, and the industry has been building defensive tooling such as Google's own AI Threat Defense, which pairs Gemini with Wiz and CodeMender. But the structural mismatch between capability and input-trust remains. Until assistants reliably distinguish 'data to read' from 'instructions to follow,' each new integration expands the attack surface, a dynamic also visible in adversarial misuse of consumer AI models in influence operations.

Scope and Impact

The scope of this specific research is bounded by two facts that should temper alarm: Google has patched the issue with content-classifier updates, and SafeBreach found no evidence it was ever used in the wild. This was responsible-disclosure research, not an observed campaign. That said, the population it applies to is enormous: Gemini's voice assistant is integrated across a vast Android install base and tied to widely used messaging apps and the Google Home ecosystem. The value of the research therefore lies in the class of risk it exposes rather than in a count of victims, of which none are reported.

The structural risk is the breadth of what a hijacked assistant can touch. An assistant connected to smart-home devices, calendars, messaging and persistent memory has a large action surface, and any flaw that lets external content steer it inherits that entire surface. The memory-poisoning element is the most concerning for durability: a one-time exploit that leaves a persistent instruction behind can influence behavior indefinitely, which is a qualitatively different problem from a transient injection that ends when the session does.
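
To make the durability point concrete, the sketch below models memory entries that record their origin, so entries written while external content was driving the assistant can be listed and removed later. It is a hypothetical Python illustration: the source does not describe how Gemini stores long-term memory, and `MemoryEntry`, `audit_memory` and `purge` are invented for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class MemoryEntry:
    text: str
    written_at: datetime
    origin: str  # "user" or "external" (assumed label recorded at write time)


def audit_memory(entries: list[MemoryEntry]) -> list[MemoryEntry]:
    """Return entries that did not originate from the user, oldest first."""
    suspicious = [e for e in entries if e.origin != "user"]
    return sorted(suspicious, key=lambda e: e.written_at)


def purge(entries: list[MemoryEntry],
          flagged: list[MemoryEntry]) -> list[MemoryEntry]:
    """Return the memory with the flagged entries removed."""
    return [e for e in entries if e not in flagged]
```

Without an origin recorded at write time, a poisoned entry is indistinguishable from a legitimate preference, which is why the review has to be manual when no such metadata exists.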

Specifics worth confirming against Google's and SafeBreach's statements include the exact Gemini and Android versions affected, whether the content-classifier updates fully close the demonstrated techniques or only raise the bar against them, and the complete list of notification sources the researchers validated. Delivery apps named beyond WhatsApp, Slack and SMS, such as Signal, Instagram and Messenger, should be treated as illustrative of the notification class until pinned to the primary research; the mechanism is the notification surface itself rather than any one app.

Response and Attribution

For individuals and higher-risk users, the practical steps are about limiting what the assistant can reach. Review and minimize Gemini's connections — smart-home, calendar and messaging integrations — keeping only what you actively use, and confirm you are running the patched version. Consider disabling assistant access from the lock screen or other unattended states where a sensitive action could be triggered without you present, and periodically review the assistant's memory and personalization settings, clearing any entries you do not recognize. The memory-poisoning angle makes that last step more than housekeeping: it is the way to detect and undo a persistent manipulation.

For enterprise mobility and MDM teams, the action is inventory and policy. Identify managed Android fleets where the Gemini assistant is exposed, push the patched version, and treat AI-assistant integrations as a privileged attack surface in mobile threat models. The working assumption should be that external content — notifications, messages, shared documents — can become an instruction to a connected assistant until a vendor proves otherwise. That assumption should drive which integrations are permitted on managed devices and how aggressively assistant capabilities are constrained in sensitive environments.
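
The inventory-and-policy step can be sketched as a comparison between what each managed device has enabled and what policy permits. This Python example is an assumption-laden illustration: the inventory format, the integration names and `policy_violations` are hypothetical and do not correspond to any real MDM product's API.

```python
def policy_violations(device_integrations: dict[str, set[str]],
                      allowed: set[str]) -> dict[str, set[str]]:
    """Map each non-compliant device to the assistant integrations it must drop."""
    violations = {}
    for device, enabled in device_integrations.items():
        extra = enabled - allowed  # integrations enabled but not permitted
        if extra:
            violations[device] = extra
    return violations
```

For example, with an allowlist of `{"calendar"}`, a device that also has smart-home and messaging integrations enabled would be reported with those two integrations, while a calendar-only device would not appear at all.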

On attribution, there is none and there should be none: this is published research with no associated threat actor and no in-the-wild exploitation. The CyberSignal frames it as a defensive disclosure that exposes a class of weakness, not as an incident. The honest takeaway is forward-looking — the assistant-as-attack-surface pattern is recurring often enough that organizations should plan for it as a category, rather than reacting to each individual assistant flaw as a surprise.

The CyberSignal Analysis

Signal 01 — The Notification Is the New Untrusted Input

The conceptual shift this research forces is to treat notifications as untrusted input to AI assistants, not as benign UI. A banner from a messaging app is attacker-controllable content, and any assistant that can read it must treat it as data rather than instruction. The Gemini flaw is a clean demonstration that the boundary between 'content the assistant displays' and 'commands the assistant follows' was too thin. Defenders and vendors alike should assume that every channel feeding text into an assistant is a potential injection vector until the assistant provably separates the two.

Signal 02 — Persistence Is the Real Escalation

What distinguishes this from a routine prompt-injection demo is memory poisoning. A transient injection ends with the session; a poisoned memory entry survives it, shaping the assistant's future behavior without any further attacker action. That moves AI-assistant attacks from the category of one-shot tricks into the category of persistent footholds, which is a far more serious posture. The defensive implication is that reviewing and clearing assistant memory needs to become a standard hygiene step, the same way clearing persistence mechanisms is standard in endpoint incident response.

Signal 03 — Capability Is Outrunning Input-Trust

The deeper signal across the week's two assistant hijacks is a mismatch: vendors are granting assistants real-world capability — account actions, smart-home control, messaging — faster than they are solving the problem of distinguishing trusted from hostile input. Every new integration widens the blast radius of an injection. Until that input-trust problem is solved at the platform level, the responsible posture is to grant assistants the minimum capability necessary, to keep sensitive actions behind explicit human confirmation, and to treat each new 'assistant can now do X' feature as an expansion of the attack surface rather than a pure convenience.
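
The posture described above, minimum capability plus explicit confirmation for sensitive actions, reduces to a small authorization gate. The Python sketch below is illustrative only; the action names, the `SENSITIVE` set and `authorize` are hypothetical and not drawn from any vendor's implementation.

```python
# Hypothetical set of actions that always need a fresh, explicit confirmation.
SENSITIVE = {"unlock_door", "open_window", "send_message", "write_memory"}


def authorize(action: str, granted: set[str], confirmed_by_user: bool) -> str:
    """Return 'deny', 'ask' or 'allow' for a requested assistant action."""
    if action not in granted:
        return "deny"   # capability was never granted: least privilege
    if action in SENSITIVE and not confirmed_by_user:
        return "ask"    # pause and request explicit human confirmation
    return "allow"
```

Each new 'assistant can now do X' feature then shows up as a new entry in `granted`, and the question of whether it also belongs in `SENSITIVE` has to be answered before it ships.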

Sources

Type	Source
Primary	SafeBreach Labs — Exploiting Gemini via Prompt Injection (original research)
Reporting	The Hacker News — WhatsApp, Slack Notifications Could Hijack Google Gemini on Android
Reporting	SecurityWeek — Gemini Voice Assistant Hijacked via Messaging Notifications
Reporting	Dark Reading — Malicious Notifications Could Trick Google Gemini Users
Related	The CyberSignal — Hackers Hijacked High-Profile Instagram Accounts by Asking Meta's AI Support Bot Nicely
Related	The CyberSignal — Google Cloud Launches AI Threat Defense, Pairing Gemini With Wiz and CodeMender
