Researchers Disclose 282 iOS AI Apps Leaking API Keys in Network Traffic

Wake Forest researchers found that 282 of 444 AI-powered iOS apps exposed usable LLM credentials in their own network traffic, and most remained exposed months after their developers were notified.


Key Takeaways

  • Researchers at Wake Forest University disclosed findings that 282 of 444 analyzed AI-powered iOS apps — roughly two-thirds — exposed usable large language model (LLM) API credentials or open backend proxy access in their runtime network traffic, spanning at least ten AI providers and 13 App Store categories.
  • The leakage took three forms — JWT-based token leakage (48% of cases), unauthenticated backend proxy access (33%), and plaintext API key transmission (19%) — with OpenAI credentials the most commonly exposed, appearing in 42 apps, and Google Gemini, Anthropic, DeepSeek and others also affected.
  • After the team privately notified all 282 developers and waited roughly three months, only about 28% had clearly fixed the problem, leaving the majority still exploitable. Rather than relying on individual developers to patch, the researchers called on AI providers and Apple to add safeguards of their own.


WINSTON-SALEM, NORTH CAROLINA — Researchers at Wake Forest University on June 30, 2026 drew fresh attention to a systemic weakness in mobile AI software after disclosing that 282 of 444 AI-powered iOS apps they analyzed — close to two-thirds — exposed usable large language model (LLM) API credentials or open backend proxy access in their own network traffic. The study, titled "Mind your key," used a custom dynamic-analysis tool the team built called LLMKeyLens to intercept each app's outbound HTTPS traffic at runtime, extract provider-specific keys and tokens, and confirm that the captured credentials still worked.

This is a research disclosure, not a single breach. The finding is that a common developer pattern, embedding or proxying paid AI access in a way that any observer of the traffic can lift, is widespread across the App Store rather than confined to one bad actor. That makes it a developer-accountability and supply-chain story for any organization whose staff rely on AI iOS apps, alongside other recent work on how AI assistants and the software supply chain can quietly erode trust.

At a Glance
What: Research disclosure of LLM API credential leakage in iOS AI apps
Apps affected: 282 of 444 analyzed AI-powered iOS apps (~64%)
Exposure: Plaintext keys, replayable JWT tokens, and unauthenticated backend proxies visible in network traffic
Keys exposed: OpenAI (42 apps), Google Gemini (7), plus Anthropic, OpenRouter, DeepSeek, Mistral, Baidu ERNIE and others; at least 10 providers
Disclosed by: Wake Forest University researchers (tool: LLMKeyLens)
App Store action: Not specified; researchers recommend that Apple screen for this at review
Status: After a ~90-day disclosure window, only ~28% of apps clearly remediated

What the Research Disclosed

The Wake Forest team set out to answer a question prior research had largely left untouched for Apple's platform: how often do AI-powered iOS apps hand an observer of their network traffic the credentials needed to use the paid AI service behind them? Starting from a large pool of App Store listings, the researchers curated a dataset of 444 free iOS apps with confirmed, exercisable LLM features, then ran each through LLMKeyLens, a man-in-the-middle proxy framework that watches an app's outbound HTTPS traffic, applies provider-specific fingerprinting rules to pull out candidate credentials, and actively validates each one against the relevant AI provider's API to confirm it is live.
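The extraction step of that pipeline can be sketched in a few lines. The following is a minimal illustration, not LLMKeyLens itself: the three regular expressions are assumptions based on commonly seen key prefixes, the function name find_candidate_keys is hypothetical, and the live validation against each provider's API is deliberately left out.

```python
import re

# Illustrative prefix patterns only. These are assumptions for the sketch,
# not the fingerprinting rules used by LLMKeyLens.
PROVIDER_PATTERNS = {
    "anthropic": re.compile(r"\bsk-ant-[A-Za-z0-9_\-]{20,}"),
    "openai": re.compile(r"\bsk-(?!ant-)[A-Za-z0-9_\-]{20,}"),
    "google": re.compile(r"\bAIza[A-Za-z0-9_\-]{35}"),
}


def find_candidate_keys(captured_request: str) -> list:
    """Return (provider, candidate) pairs found in one captured request."""
    hits = []
    for provider, pattern in PROVIDER_PATTERNS.items():
        for match in pattern.finditer(captured_request):
            hits.append((provider, match.group(0)))
    return hits


if __name__ == "__main__":
    # A made-up request with a placeholder key, not a real credential.
    request = "POST /v1/chat/completions\nAuthorization: Bearer sk-" + "a" * 40
    for provider, candidate in find_candidate_keys(request):
        print(provider, candidate[:8] + "...")
```

A pattern match only yields a candidate. As the study describes, a finding counts only once the credential is confirmed to work against the provider, which is why validation is a separate step.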

The headline figure is that 282 of those 444 apps — roughly 64% — exposed exploitable LLM API credentials or open backend access in their runtime traffic. The leakage fell into three patterns. JWT-based token leakage was the most common at 48% of cases, in which apps handed out temporary access tokens that nonetheless leaked in the same traffic and were usually still valid when captured. Unauthenticated backend proxy access accounted for 33%, where an app routed requests through a server that answered anyone with no check on who was asking — effectively an open relay to a paid AI account. Plaintext API key transmission made up the remaining 19%, in which a raw key could be read from a single captured request.
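To make the JWT pattern concrete, here is a small sketch of how an observer could check whether a captured token is still within its lifetime. It assumes a standard three-part JWT carrying the registered exp claim; the function names are hypothetical, and the signature is not verified because only the expiry time is being read.

```python
import base64
import json
import time
from typing import Optional


def jwt_payload(token: str) -> dict:
    """Decode the payload segment of a JWT without verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a three-part JWT")
    segment = parts[1]
    padded = segment + "=" * (-len(segment) % 4)  # restore stripped padding
    return json.loads(base64.urlsafe_b64decode(padded))


def is_unexpired(token: str, now: Optional[float] = None) -> bool:
    """True if the token has no exp claim or its exp is still in the future."""
    exp = jwt_payload(token).get("exp")
    current = time.time() if now is None else now
    return exp is None or exp > current
```

An unexpired token is only a hint that replay might succeed. The study went further and confirmed validity against the live API, which this sketch does not attempt.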

On the provider side, OpenAI credentials were the most frequently exposed, appearing in 42 of the vulnerable apps, with Google's Gemini identified in 7. The researchers also reported exposed credentials tied to Anthropic, OpenRouter, DeepSeek, Mistral, Baidu's ERNIE and several others, spanning at least ten providers in total. The affected apps reportedly ranged across at least 13 App Store categories — productivity, entertainment, lifestyle, education, utilities, and health and fitness among the most represented — which underscores that this is a pattern in how AI features get wired into consumer apps generally, not a quirk of one niche. It echoes a broader theme in recent Apple platform and WebKit AI security work that the rapid bolting-on of AI to mobile software keeps outrunning the security review around it.

Defender Posture for Organizations Using AI iOS Apps

For a security team, the practical lesson is less about any single named app and more about a class of risk that now sits inside the mobile fleet. When an AI iOS app leaks a provider key or exposes an open proxy, the most direct consequence is financial and reputational damage to the developer, whose paid AI quota can be drained by anyone watching the traffic. But the exposure is not purely the developer's problem. An open proxy that accepts arbitrary prompts is, in effect, a free and somewhat anonymous gateway to a capable model, and credentials that travel in clear or replayable form in app traffic are exactly the kind of artifact that careless logging, shared networks, or man-in-the-middle positions can capture at scale.

Defenders should treat this disclosure as a prompt to inventory which AI-powered mobile apps employees actually use for work, and to fold that inventory into existing mobile-device-management and acceptable-use processes rather than treating consumer AI apps as out of scope. The questions worth asking are familiar ones applied to a newer category: which apps are touching sensitive prompts or data, what backends do they talk to, and would the organization know if an app it depends on were quietly relaying through an unauthenticated server. Where a vendor cannot explain how it protects the credentials behind its AI features, that uncertainty is itself a data point for a risk decision.

The data-privacy dimension is just as important as the cost one. Prompts sent to these apps can contain personal, commercial, or regulated information, and an app that mishandles its own API credentials is unlikely to be handling the content of those prompts with great care either. For teams building a defensible posture, the durable control is to assume that anything typed into a third-party AI app may be observable or replayable, to steer staff toward sanctioned tooling with clear data-handling commitments, and to keep the threat model current as more of the mobile estate sprouts AI features.

Vendor-Accountability Tracking

What gives this study weight beyond its raw numbers is its remediation finding. The researchers did not simply publish a count and move on; they privately notified all 282 affected developers and then went back roughly three months later to re-test the same apps. Only about 28% had clearly fixed the problem. A further share remained wide open, with the leaked access still working when re-checked. In other words, responsible disclosure to the parties best positioned to fix the issue produced a fix in fewer than one in three cases.

That gap is the heart of the accountability story. It suggests that leaving remediation entirely to individual app developers — many of them small teams shipping AI features quickly — is not, on this evidence, a reliable path to closing the exposure. The researchers' own recommendations point upstream rather than at the developers alone: they want AI providers to label client-side keys as unsafe in their documentation and to flag keys that suddenly start being used by thousands of devices, behavior that is a strong signal a key has escaped into the wild. They also call on Apple to screen for this class of leakage during App Store review, which would move detection to the gate rather than relying on after-the-fact research.
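The provider-side signal the researchers describe, a key suddenly used by thousands of devices, can be approximated by counting distinct devices per key. The sketch below is one possible shape for such a check; the (key_id, device_id) event format, the function name, and the threshold are all hypothetical and not drawn from any provider's actual tooling.

```python
from collections import defaultdict


def flag_widely_shared_keys(events, max_devices=1000):
    """Return key IDs seen from more than max_devices distinct devices.

    events is an iterable of (key_id, device_id) pairs. Both fields and the
    default threshold are hypothetical, chosen only for illustration.
    """
    devices_by_key = defaultdict(set)
    for key_id, device_id in events:
        devices_by_key[key_id].add(device_id)
    return sorted(k for k, d in devices_by_key.items() if len(d) > max_devices)
```

A production check would likely also weigh how quickly the device count grows and what a given customer's normal usage looks like, since a legitimate server-side key may sit behind a single address while serving many users.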

For organizations, the takeaway is to track vendor behavior as a first-class signal. An app that fixed the issue promptly after notification is telling you something different about its engineering discipline than one that left credentials exposed for months. Building that kind of responsiveness into vendor assessment — and revisiting it over time rather than treating a one-time check as permanent — is how the accountability lesson translates into a concrete control rather than a headline.

Parallel Exposure for Android AI Apps

iOS is not where this problem starts. The researchers note that their work is, to their knowledge, the first systematic empirical study of LLM API credential leakage specifically on Apple's platform, but credential leakage in mobile AI apps had already been documented on Android. Earlier work — including a 2025 study sometimes referred to as LM-Scout — found the same insecure AI wiring across Android apps and was able to automatically obtain working access to a large set of them, on the order of 120, by the same broad mechanism of credentials and proxies exposed through app behavior.

The significance for defenders is that this is a cross-platform pattern, not an iOS-specific defect. The same architectural shortcut — putting a paid AI provider behind a thin mobile client without a properly authenticated backend in between — produces the same exposure regardless of operating system. An organization that standardizes on one mobile platform should not assume the other is safe, and a bring-your-own-device environment spanning both should expect the risk on each.
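What a properly authenticated backend adds can be shown with a minimal sketch. It assumes the provider key lives only on the server, inside a hypothetical call_provider function, and that each user receives a signed session token after logging in. The names, the secret, and the token format are illustrative, and a real service would also need token expiry and rate limiting.

```python
import hashlib
import hmac

# Hypothetical server-side secret; it never ships inside the mobile app.
SESSION_SECRET = b"replace-with-server-side-secret"


def sign_session(user_id: str) -> str:
    """Issue a per-user session token once the user has logged in."""
    mac = hmac.new(SESSION_SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{mac}"


def authorize(session_token: str):
    """Return the user ID if the token's signature checks out, else None."""
    user_id, _, mac = session_token.rpartition(".")
    expected = hmac.new(SESSION_SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    if user_id and hmac.compare_digest(mac.encode(), expected.encode()):
        return user_id
    return None


def handle_prompt(session_token: str, prompt: str, call_provider):
    """Forward a prompt to the AI provider only for authenticated callers."""
    if authorize(session_token) is None:
        return 401, "unauthenticated"
    # call_provider holds the provider API key on the server side.
    return 200, call_provider(prompt)
```

The unauthenticated proxies in the study correspond to a handle_prompt with no authorize check at all. The sketch also shows why token leakage still matters: a captured session token works until it is revoked or expires, so short lifetimes and per-user limits remain necessary.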

It also frames the iOS findings as confirmation of a known failure mode reaching a newer ecosystem rather than a one-off surprise. That continuity is useful for planning: the controls that help on one platform — inventorying AI apps, scrutinizing how vendors handle credentials, assuming prompt data may be observable — carry over directly, and the research community's move from Android to iOS suggests the measurement tooling to find these issues is maturing on both sides.

Open Questions

Several points remain to be watched. The most consequential is whether the platform-level and provider-level safeguards the researchers recommend actually materialize. As reported, there was no specific App Store enforcement action tied to this disclosure; the calls for Apple to screen for credential leakage at review and for AI providers to flag anomalous key usage are recommendations rather than confirmed changes. Whether any of those parties acts on them, and how quickly, will determine whether this study nudges the ecosystem or simply documents a problem that persists.

The downstream impact has also not been quantified and should not be overstated. The study confirms that usable credentials and open proxies were exposed and, in many cases, still live when captured, but an exposed credential is not the same as an abused account, and the public reporting at this stage rests largely on the research itself rather than on independent corroboration. That single-source character is worth flagging even though the core methodology (intercept, extract, validate against the live API) is concrete and the headline figures are specific. The prudent reading tracks what has been demonstrated and keeps an eye on follow-on work, much as defenders do with emerging research into developer-credential exposure in AI tooling more broadly.

What is established is enough to act on now: nearly two-thirds of a large, methodically assembled sample of AI iOS apps exposed usable AI credentials or open proxy access in their own traffic, the leakage spanned the major providers, and most affected developers had not clearly fixed it months after being told. For organizations relying on AI mobile apps, that combination argues for treating the category as a managed risk surface (inventoried, scrutinized, and assumed leaky until a vendor proves otherwise) rather than waiting for a platform or provider fix that may or may not arrive.


Sources

Primary: Wake Forest University researchers, "Mind your key: An Empirical Study of LLM API Credential Leakage in iOS Apps" (arXiv)
Reporting: The Hacker News
Related: The CyberSignal, "Trapdoor: AI-Assistant Supply-Chain Poisoning"
Related: The CyberSignal, "Apple Ships 30+ iOS, macOS and Safari Patches"