Security
Make the Wire Boring: How Mirror Security Defeats Side-Channel Attacks on LLM Streaming
There's a new vulnerability in AI systems. It's called Whisper Leak, and it's been hiding in plain sight.
Recent research from Microsoft demonstrates that an attacker monitoring your LLM streaming sessions can identify what you're discussing with near-perfect accuracy just by watching packet sizes and timing patterns. No decryption needed. No sophisticated hacking. Just passive observation of metadata that TLS inherently exposes.
The numbers are sobering: of 28 popular LLMs from major providers, 17 let researchers classify conversation topics from encrypted traffic with better than 98% AUPRC. For 17 of the models, attackers could identify sensitive conversations with 100% precision while recovering 5-20% of target topics, even in realistic scenarios with 10,000 irrelevant conversations mixed in.
At Mirror Security, we've developed a defense that effectively resolves this problem.
Here's what the research found, why it matters, and how we're making the wire boring.
The Threat: What Whisper Leak Actually Exploits
LLM services stream responses token-by-token to create that responsive, natural experience we've come to expect. Each token gets sent as it's generated, wrapped in TLS encryption over HTTPS. The content is protected, but the metadata isn't.
Here's the problem: TLS stream ciphers preserve the size relationship between plaintext and ciphertext. When a token is generated and sent, the encrypted packet size directly reveals the token's length, plus a small constant overhead. And since tokens arrive at different intervals depending on the topic being discussed, the timing between packets also creates patterns.
The researchers trained simple machine learning classifiers like LightGBM, LSTM, and BERT on these packet sizes and timing sequences. The models learned to recognize topic signatures from encrypted traffic alone. No content access required.
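To make that concrete, here is a minimal sketch of the attack pipeline, assuming you already have captured sessions as (packet size, inter-arrival time) sequences labeled by topic; the fixed-length feature layout and scikit-learn's gradient boosting stand in for the paper's LightGBM/LSTM/BERT setups and are illustrative only.

```python
# Illustrative sketch of a Whisper Leak-style classifier, not the paper's exact pipeline.
# Each session is a list of (packet_size_bytes, inter_arrival_seconds) pairs, labeled 1
# if the conversation was about the sensitive target topic and 0 otherwise.
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import average_precision_score
from sklearn.model_selection import train_test_split

MAX_EVENTS = 200  # pad/truncate every session to a fixed number of network events

def session_to_features(session):
    """Flatten one session's (size, gap) sequence into a fixed-length vector."""
    sizes = [float(s) for s, _ in session][:MAX_EVENTS]
    gaps = [float(g) for _, g in session][:MAX_EVENTS]
    sizes += [0.0] * (MAX_EVENTS - len(sizes))
    gaps += [0.0] * (MAX_EVENTS - len(gaps))
    return np.array(sizes + gaps, dtype=np.float32)

def train_topic_classifier(sessions, labels):
    """Train a classifier that flags the target topic from traffic metadata alone."""
    X = np.stack([session_to_features(s) for s in sessions])
    y = np.array(labels)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, stratify=y, random_state=0)
    clf = GradientBoostingClassifier().fit(X_tr, y_tr)
    scores = clf.predict_proba(X_te)[:, 1]
    # AUPRC is the metric the researchers report; it stays meaningful under heavy class imbalance.
    print(f"AUPRC: {average_precision_score(y_te, scores):.3f}")
    return clf
```

The point is how little the attacker needs: no payloads, just the lengths and timestamps any on-path observer can record.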
The Results:
17 of 28 models tested achieved >98% AUPRC (Area Under Precision-Recall Curve)
Several models reached >99.9% classification accuracy
At extreme class imbalance (10,000:1 noise-to-target): 17 models enabled 100% precision at 5-20% recall
Attack works across all major providers: OpenAI, Microsoft, DeepSeek, Mistral, X.AI, Alibaba, Google, Amazon, Anthropic
The researchers used a simple proof-of-concept: training classifiers to identify conversations about "money laundering" among thousands of random questions from the Quora dataset. But the methodology works for any sensitive topic—political dissent, medical conditions, confidential business discussions, legal consultations.
If your users are in healthcare, finance, government, or any regulated industry, this side channel is leaking information you thought was protected.
The Research Tested Three Common Defenses—None Worked Completely
The researchers evaluated standard mitigation approaches that some providers have already deployed:
Random Padding: Adding random-length data to each streamed token to obscure the token length. CloudFlare, OpenAI, and Microsoft have implemented this approach.
Result: AUPRC dropped from 97.5% to 92.9% (−4.6 percentage points)
Still vulnerable: Timing patterns and cumulative distributions persist
Token Batching (N=5): Grouping 5 tokens before transmission to reduce network events and obscure individual token characteristics.
Result: AUPRC dropped from 98.2% to 94.8% (−3.4 percentage points)
Still vulnerable: Works better for some models than others; against OpenAI gpt-4o-mini, the attack remained surprisingly effective even with batching enabled
Packet Injection: Injecting synthetic "noise packets" at random intervals to obfuscate size and timing patterns.
Result: AUPRC dropped from 98.1% to 93.3% (−4.8 percentage points)
Still vulnerable: Moderate mitigation with variability across models; incurs 2-3× bandwidth overhead
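For intuition on why each of these falls short, here is a minimal sketch of the three mitigations applied to a stream of token lengths; the padding range, batch size, and injection rate are illustrative placeholders, not any provider's actual parameters.

```python
import random

def random_padding(token_lengths, max_pad=32):
    """Pad each streamed token with random filler: sizes blur, but the number,
    order, and timing of network events are unchanged."""
    return [n + random.randint(1, max_pad) for n in token_lengths]

def token_batching(token_lengths, batch_size=5):
    """Send tokens in groups of N: fewer events, but batch sizes still track content."""
    return [sum(token_lengths[i:i + batch_size])
            for i in range(0, len(token_lengths), batch_size)]

def packet_injection(token_lengths, inject_prob=0.3, noise_size=64):
    """Interleave synthetic noise packets: extra bandwidth, yet every real packet
    still appears on the wire with its original size."""
    shaped = []
    for n in token_lengths:
        shaped.append(n)
        if random.random() < inject_prob:
            shaped.append(noise_size)
    return shaped
```

In each case some topic-correlated structure survives (event counts, cumulative sizes, timing), which is why measured AUPRC drops by a few points rather than collapsing.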
The Bottom Line: Each approach provides meaningful but incomplete protection. None eliminates the vulnerability. Residual attack success rates remain in the 90%+ range for many models.
The Data Plane: Vectax Makes Every Stream Look the Same
Vectax has one job: make every stream on the wire look identical.
Here's what that means in practice:
Prompt Secrecy: We use client-side envelope encryption, so prompts hit the gateway already opaque. No plaintext anywhere.
Metadata Flattening: Constant-rate framing with session normalization and an end "tail" removes the timing patterns. We add uplink padding to hide prompt length variations. Basically, the metadata stops telling stories.
Network Fit: Vectax sits at your gateway or network plane and works with all your models. Your dev team keeps the streaming UX they want, but the wire loses its fingerprint.
Think of it this way: Vectax turns side-channel music into white noise.
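To illustrate the idea (this is a sketch of constant-rate framing in general, not Vectax's actual implementation), a constant-rate framer emits fixed-size frames on a fixed clock, padding when there is too little data and appending a padded tail so even the end of generation isn't observable:

```python
import time

FRAME_BYTES = 1024     # every frame on the wire is exactly this size (illustrative)
FRAME_INTERVAL = 0.05  # one frame every 50 ms, regardless of token arrival (illustrative)
TAIL_FRAMES = 8        # extra padded frames emitted after the response actually ends

def constant_rate_frames(token_chunks):
    """Yield fixed-size frames on a fixed clock, padding with zero bytes as needed."""
    buffer = b""
    chunks = iter(token_chunks)
    exhausted = False
    tail_left = TAIL_FRAMES
    while not exhausted or buffer or tail_left:
        if not exhausted:
            try:
                buffer += next(chunks)  # whatever the model produced since the last tick
            except StopIteration:
                exhausted = True
        frame, buffer = buffer[:FRAME_BYTES], buffer[FRAME_BYTES:]
        if exhausted and not frame:
            tail_left -= 1              # keep emitting pure padding past the real end
        yield frame.ljust(FRAME_BYTES, b"\0")
        time.sleep(FRAME_INTERVAL)
```

Every session then presents the same frame size and cadence on the wire; the main thing left for an observer to learn is roughly how long the connection stayed open, and the tail blurs even that.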
The Control Plane: AgentIQ Turns Privacy Into Policy
Okay, so Vectax handles the technical defense. But how do you know it's actually working? How do you prove to an auditor, or to yourself, that privacy protections are in place?
That's AgentIQ's job: turn privacy from a promise into something you can measure and verify.
Policy & Routing: You decide when traffic shaping kicks in and which privacy tier applies. One posture across all your providers.
Attestation or Fail-Closed: Every shaped session gets a signed attestation with the policy ID and enforced parameters baked in. No attestation? The system falls back to non-streaming. No exceptions.
Risk Telemetry: We compute a leakscore from observed features so you can spot drift or regressions. Privacy shows up as a metric on your dashboard, not as a checkbox you hope is working.
AgentIQ issues the cryptographic receipt proving Vectax did exactly what your policy said it should.
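As an illustration of what such a receipt could contain, here is a minimal sketch using an HMAC over the policy ID and enforced shaping parameters; the field names and the shared-secret signature are assumptions for the example, not AgentIQ's actual record format.

```python
import hashlib, hmac, json, time

def attest_session(secret_key: bytes, session_id: str, policy_id: str, params: dict) -> dict:
    """Produce a signed record binding a shaped session to the policy that governed it."""
    record = {
        "session_id": session_id,
        "policy_id": policy_id,
        "enforced_params": params,  # e.g. frame size, cadence, tail length
        "timestamp": int(time.time()),
    }
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return record

def verify_attestation(secret_key: bytes, record: dict) -> bool:
    """Check the signature; a session without a verifiable record must not stream."""
    body = {k: v for k, v in record.items() if k != "signature"}
    payload = json.dumps(body, sort_keys=True).encode()
    expected = hmac.new(secret_key, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(record.get("signature", ""), expected)
```

The fail-closed rule then stays simple to enforce: if a session can't present a record that verifies, it doesn't get to stream.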
Why the Two-Plane Model Works
Defense in Depth: We're protecting content secrecy and metadata at the same time, with independent verification sitting on top.
Operational Clarity: Vectax lives at the gateway, doing the hard work. AgentIQ governs the overall posture, routes traffic, and generates the audit trail.
Vendor Consistency: One control posture, multiple LLM backends. The stream always looks boring, no matter which provider you're using behind the scenes.
The Numbers: Does It Actually Work?
We ran tests using the same prompts and model family. Let me show you what happened:
Encryption Only (no shaping):
AUC ≈ 0.58, AUPRC ≈ 0.66 — the attack still works
Median events ≈ 140; mean chunk size ≈ 1.28 KB
Encryption + Constant-Rate Shaping:
AUC ≈ 0.33, AUPRC ≈ 0.49 — much better
Median events ≈ 60; mean chunk size ≈ 1.28 KB
What this means: Just encrypting content doesn't cut it. When you add constant-rate framing and session normalization, you remove the features attackers rely on to classify your traffic. The patterns disappear.
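One way to see why the numbers move is to compute the kind of per-session summary a classifier feeds on and compare it before and after shaping; the feature choice and toy captures below are illustrative.

```python
import statistics

def session_fingerprint(events):
    """Summarize one session: (event count, mean size, size spread, timing spread)."""
    sizes = [s for s, _ in events]
    gaps = [g for _, g in events]
    return (len(events), statistics.mean(sizes),
            statistics.pstdev(sizes), statistics.pstdev(gaps))

# Toy captures: unshaped per-token packets vary with the content being generated,
# while constant-rate shaping makes every event identical.
unshaped = [(412, 0.031), (187, 0.094), (903, 0.022), (256, 0.143)]
shaped = [(1280, 0.05)] * 60

print("unshaped:", session_fingerprint(unshaped))
print("shaped:  ", session_fingerprint(shaped))  # zero spread in both size and timing
```

With shaping on, size spread and timing spread go to zero and the mean chunk size is a constant, so there is nothing topic-dependent left for a model to learn.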
Staying Ahead of What Comes Next
Let's be honest about something: attacks evolve. Someone will find a new technique, exploit a different feature set, or develop a workaround that targets a specific defense.
But here's why we think our approach holds up: defense-in-depth means we're not betting everything on one technique. Vectax and AgentIQ work together, pairing data plane protection with control plane verification, with continuous monitoring and attestation built in. When attacks evolve (and they will), we're not starting from scratch; the Mirror Platform adapts and responds at every layer of your AI stack.
What the Research Shows
The Whisper Leak research paints a pretty clear picture of the problem:
17 out of 28 models hit >98% AUPRC, and several reach >99.9%
Precision at extreme noise ratios: 17 models achieve 100% precision at 5–20% recall, even at 10,000:1 noise-to-target ratios
Some models are tougher nuts to crack: Google Gemini (81.9–84.0% AUPRC), Amazon Nova (71.2–77.5%)
Standard mitigations on OpenAI gpt-4o-mini (AUPRC, both features):
Packet injection: 98.1% → 93.3% (−4.8 pp)
Token batching (N=5): 98.2% → 94.8% (−3.4 pp)
Random padding: 97.5% → 92.9% (−4.6 pp)
Bottom line: standard mitigations help a little, but they don't solve the core problem.
Wrapping Up
Whisper Leak isn't theoretical. It's practical, it works today, and it affects 28 popular LLMs across every major provider. The researchers demonstrated that encrypting your content is necessary but not sufficient; packet sizes and timing patterns leak sensitive information despite TLS protection.
The standard defenses providers have deployed (random padding, token batching, and packet injection) reduce attack effectiveness by only 3.4-4.8 percentage points, leaving residual attack success rates above 90% for many models.
Mirror Security's defense goes further. We silence the signal entirely through constant-rate shaping, session normalization, uplink padding, and cryptographic attestation. Add policy-driven routing and risk telemetry, and you get defense-in-depth that's measurable, auditable, and ready to deploy today.
Mirror = Vectax (data plane) + AgentIQ (control plane): Encrypt the content. Erase the signal. Prove it happened.
Want to see how Mirror Security can protect your LLM deployments? Get in touch to learn more about Vectax and AgentIQ.