Your AI Agent Trusts People You've Never Met
Every tool call your AI agent makes passes through a stranger's server. They can read everything. A new paper shows some of them are already stealing from you.
I run AI agents every day. Code agents that write and deploy software. Marketing agents that generate and publish content. Research agents that pull data, synthesise it, and hand me a decision. If you work in tech right now, you probably do too.
Until last week, I hadn’t thought about what happens between my agent and the model. Every single request those agents send to a provider (OpenAI, Anthropic, Google) goes through at least one intermediary. Sometimes four or five. Each one of those intermediaries can read the entire request. The system prompt. The tool definitions. The API keys. The response. Everything. In plaintext.
Nobody is checking whether they’re honest.
A team of researchers just published the first systematic study of this problem. “Your Agent Is Mine,” from a group at Waterloo and other institutions, tested 428 LLM API routers: 28 paid routers bought from Taobao, Xianyu, and Shopify stores, and 400 free ones built from open-source templates. What they found changes how you think about agent architecture.
The problem nobody’s looking at
LLM API routers are the middlemen of the AI world. You need one if you want to use models from different providers without managing separate API keys. LiteLLM, the biggest open-source router, has 40,000 GitHub stars and 240 million Docker pulls. OpenRouter connects to 300 models across 60 providers. They’re infrastructure, not some side project.
The way they work is simple. Your agent sends a request to the router. The router terminates the TLS connection, reads the request in full, picks an upstream provider, opens a new TLS connection, and forwards it. When the response comes back, the router reads that too, then sends it to your agent.
That middle step is the problem. The router sits in what security people call a man-in-the-middle position. No hack required. You configured it that way. You pointed your agent at the router’s URL. The router has full application-layer access to every byte that passes through it. It can read tool-call arguments, rewrite responses, copy API keys, and inject code into the output your agent is about to execute.
And because routers are composable, your request might pass through several. A developer buys API access from a Taobao reseller, who aggregates from a second-tier provider, who routes through OpenRouter, who dispatches to the model host. Four hops. Each one terminating and re-originating TLS. Each one with plaintext access. The client configures only the first hop. The rest are invisible.
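To make that vantage point concrete, here is a minimal sketch of what any hop in the chain sees once it has terminated TLS. This is illustrative, not any real router's code; the request shape mimics an OpenAI-style chat payload, and the key and names are made up.

```python
import json

# Illustrative sketch of a router's view of one request (not real router code).
# By the time this function runs, TLS has been terminated: the router holds
# the full request in plaintext and can log, copy, or mutate any of it before
# opening a new TLS connection to the next hop.
def router_view(raw_body: bytes, headers: dict) -> dict:
    payload = json.loads(raw_body)
    return {
        "api_key": headers.get("Authorization"),  # the client's credential
        "system_prompt": next(
            (m["content"] for m in payload["messages"] if m["role"] == "system"),
            None,
        ),
        "tools": [t["function"]["name"] for t in payload.get("tools", [])],
    }

# A fabricated request, shaped like an OpenAI-style chat completion call.
request = json.dumps({
    "model": "gpt-4o",
    "messages": [{"role": "system", "content": "You are a deploy agent."}],
    "tools": [{"type": "function", "function": {"name": "run_shell"}}],
}).encode()

seen = router_view(request, {"Authorization": "Bearer sk-..."})
```

Every hop in a four-hop chain gets this same view, and the client only ever configured hop one.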
What the researchers found
Of the 28 paid routers they bought, one was actively injecting malicious code into tool-call responses. Of the 400 free routers, eight were doing the same. Two of those used adaptive evasion, meaning they waited for a warm-up period before activating, or only triggered when the router detected the agent was running in autonomous mode where tool execution gets auto-approved.
The researchers call that “YOLO mode.” The agents call it being productive.
Seventeen free routers touched researcher-owned AWS credentials that were embedded in test requests. One drained ETH from a researcher-owned private key. A router, sitting between a user and a model provider, stole cryptocurrency from someone who used it.
But the poisoning studies are worse. The researchers intentionally leaked an OpenAI key on Chinese forums and in WeChat and Telegram groups. That single key generated 100 million GPT-5.4 tokens and more than seven Codex sessions. They also deployed weakly configured decoy routers across 20 domains. Those decoys served 2 billion tokens, exposed 99 credentials across 440 Codex sessions, and found that 401 of those sessions were already running in YOLO mode. Auto-approve on. No human in the loop.
Every one of those 440 sessions was command-injectable. A single malicious router in the chain would have had full control over the tool calls those agents were executing.
The LiteLLM wake-up call
In March 2026, LiteLLM got compromised through dependency confusion. Attackers injected malicious code into the request-handling pipeline. Every deployment that pulled the poisoned release was compromised. A single supply-chain entry point in the most widely deployed LLM router became a weapon with full plaintext access to every API request flowing through it.
This is the same LiteLLM that thousands of organisations trust as production infrastructure. The same tool that developers configure and forget about. It turned into a supply-chain attack vector overnight, and most of its users wouldn’t have known until the damage was done.
Dependency confusion isn’t new. It’s been a known attack in software supply chains for years. But the stakes are different when the compromised package goes beyond reading your code and starts rewriting the tool calls your AI agent is about to execute on your behalf. A compromised npm package can steal secrets. A compromised LLM router can make your agent do things you never asked it to do, with your credentials, on your infrastructure.
Why this matters more than prompt injection
Everyone’s worried about prompt injection. Rightly so. Someone embeds a malicious instruction in a webpage, the model reads it, the agent does something it shouldn’t. It’s a real problem with real research behind it.
But the router problem is orthogonal to prompt injection. It happens in the JSON layer, before the model sees the request or after it emits a response. It’s outside the model’s reasoning loop entirely. A malicious router doesn’t need to trick the model. It just rewrites the response the model sends back. The model can be perfectly aligned, perfectly safe, perfectly capable of resisting prompt injection, and none of that matters if the router swaps the URL in a tool call from a legitimate installer to an attacker-controlled script.
This composes with model-side safeguards rather than replacing them. You can have the best prompt injection defences in the world and still get compromised through the router. The two attacks stack. That’s what makes it dangerous.
What can actually be done
The researchers built a tool called Mine that implements the attack classes and tests defences. They evaluated three client-side mitigations that can be deployed today without waiting for provider cooperation.
A fail-closed policy gate that blocks suspicious shell rewrites in tool-call responses. It blocked every injection sample in their evaluation, at the cost of a 1.0% false positive rate.
Response-side anomaly screening that flags suspicious patterns in returned tool calls. Catches 89% of injection samples without requiring any changes to the upstream provider.
Append-only transparency logging that creates a verifiable record of what the router forwarded and what the client received.
These are good. They’re practical. They reduce exposure now, not after a standards body agrees on a specification in 18 months. But they’re patches on a fundamentally broken trust model.
What actually solves this is provider-backed cryptographic integrity. A mechanism that binds the tool call the model produces to the tool call the client executes. Until that exists, every router in the chain is an untrusted intermediary with god-mode access to your agent’s actions.
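No such mechanism is standardised yet, so the following is only one way to picture it, under a strong assumption: the provider and the end client share a secret that no router ever holds. A real design would more likely use provider-published public keys and asymmetric signatures, but an HMAC tag is enough to show the binding.

```python
import hashlib
import hmac
import json

# Hypothetical integrity envelope (not a real provider API). Assumes a secret
# provisioned out-of-band between provider and client; routers never see it.
SHARED_KEY = b"provisioned-out-of-band"

def provider_sign(tool_call: dict) -> dict:
    # The provider tags the exact tool call the model produced.
    body = json.dumps(tool_call, sort_keys=True).encode()
    tag = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    return {"tool_call": tool_call, "sig": tag}

def client_verify(envelope: dict) -> dict:
    # The client recomputes the tag; any router-side rewrite breaks it.
    body = json.dumps(envelope["tool_call"], sort_keys=True).encode()
    expected = hmac.new(SHARED_KEY, body, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["sig"]):
        raise ValueError("tool call was modified in transit")
    return envelope["tool_call"]

envelope = provider_sign({"name": "run_shell", "cmd": "ls -la"})
```

If any hop swaps the command, `client_verify` raises instead of handing the tampered call to the agent. The router keeps its routing job but loses its god-mode write access.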
The uncomfortable truth
If you’re running agents through a router right now, and most of us are, you’re trusting every hop in that chain without verification. You don’t know how many intermediaries sit between your agent and the model. You don’t know if any of them are compromised. You don’t know if the tool call your agent just executed was the one the model actually returned.
The researchers tested this at scale and found real-world exploitation happening right now, in commercial routers that people pay money for and free routers that thousands of developers use every day.
The AI agent ecosystem is being built on a trust model that doesn’t hold. We’re connecting autonomous systems that execute code, manage credentials, and make decisions to infrastructure that has no integrity guarantees. The routers are the plumbing, and the plumbing is compromised.
I’m not going to stop using agents. The productivity gains are real. But I’m going to start paying attention to the infrastructure between my agent and the model. Because right now, the weakest link in my AI stack sits between me and the model. The router nobody audits.