Hey 👋
I spent most of this week building a threat model for MCP deployments — mapping 10 MCP-specific risks to OWASP categories, with lab-confirmed attack chains for each. Midway through, I kept hitting the same wall: most of the risks I was documenting don't require a sophisticated attack. They require a credential. A shared API key sitting in a .env file, with no scope, no expiry, and no per-agent identity.
Then the Grantex audit landed on HN. 30 popular open-source agent projects, 500k+ GitHub stars combined. 28 out of 30 rely exclusively on environment-variable API keys. Zero have per-agent cryptographic identity. Zero have per-agent revocation. It's not a niche problem — it's the default configuration across the entire ecosystem.
That's the thread through everything this week.
From the lab this week
MCP Security Top 10: my own threat model, lab-confirmed
I published a 30-minute deep dive mapping 10 MCP-specific security risks to the OWASP LLM Top 10 categories they belong to. The framing I kept coming back to: in a standard LLM app, the developer controls what context the model receives. In MCP, a third party does — and the user approved them once, weeks ago, and forgot about it.
The three things the article establishes: (1) MCP's trust model is fundamentally different from a standard API integration. (2) The flat tool namespace — where every connected server's descriptions coexist in one LLM context — is a novel attack surface with no direct OWASP equivalent. (3) The rug pull lives in a runtime API response (tools/list), which bypasses every static analysis tool that catches a malicious package. Every attack in the article is lab-confirmed and reproducible from the mcp-attack-labs repo.
Something I didn't expect to write this week: the EU AI Act angle
On March 13, the Council of the EU published the delegated regulation under Article 92 of the AI Act — the enforcement mechanism for GPAI model evaluation. It takes effect August 2, the same date GPAI obligations become enforceable. While working through the compliance implications for MCP deployments, I realized most engineering teams haven't processed a specific fact: if your agent calls Claude, GPT, or Gemini through an API or MCP server, you are a downstream deployer under Article 26. The provider's compliance obligations don't substitute for yours — they're parallel, and the regulatory chain runs through your architecture. The MCP layer makes this substantially harder than a standard API integration because you don't fully control what instructions third-party server authors send to the model.
I wrote it up as a 17-minute practical mapping — which obligations apply at which position in the GPAI supply chain, and what your architecture decisions look like from a compliance standpoint before August 2.
Also published: the RAG defenses companion post
The practical follow-up to the RAG poisoning research from last week. Three pipeline-specific defenses (embedding anomaly detection, access-controlled retrieval, prompt structure hardening) measured against a live attack suite. Short, dense, ends with a results table. If you ran the lab from Issue #4, this is what to implement next.
This week in AI security
93% of agent projects use unscoped API keys — 0% have per-agent identity
Grantex Research audited 30 popular open-source agent projects across six authorization dimensions. 93% rely exclusively on env-var API keys. 0% assign cryptographically verifiable per-agent identity. 100% have no per-agent revocation — rotating one agent's access requires rotating all of them. In frameworks like CrewAI and AutoGen, when one agent delegates to another, the child either inherits the parent's full credentials or gets its own independent key. No project implements scope narrowing for delegated access.
CVE-2026-2256: prompt injection → OS command execution in ms-agent. CVSS 9.8. No patch.
CERT/CC advisory VU#431821. ModelScope's ms-agent Shell tool calls subprocess.run(shell=True) on input that can be influenced by external content — a document the agent is summarizing, code it's analyzing, a web page it fetched. The denylist defense (check_safe()) is bypassable through command obfuscation. Successful exploitation runs with the process's full OS privileges: data exfiltration, file modification, persistence, lateral movement. No vendor statement, no patch at time of writing.
Fine-tuning on interaction logs? 2% data poisoning embeds backdoors that all current defenses miss
arXiv:2510.05159 (Malice in Agentland) is getting renewed attention as more teams fine-tune agents on production interaction data. The finding: poisoning 2% of collected interaction traces embeds a trigger-based backdoor with over 80% success rate. Llama-Firewall, Granite Guardian, and weight-based Watch the Weights detection all failed to catch it. The backdoor doesn't degrade normal task performance, so it passes standard evals. Three threat models tested: direct data poisoning, environmental poisoning through scraped content, and a pre-backdoored base model fine-tuned on clean data. All three work.
Tooling worth knowing
Kvlar — open-source proxy enforcing YAML policies on every MCP tool call, fails closed by default. Most immediately useful for deny-by-default on shell and database tools — directly addresses the CVE-2026-2256 class. github →
mcp-scan — Invariant Labs' scanner for malicious tool descriptions. Catches direct poisoning and rug-pull variants at server load time. Add it to CI and treat any authentication finding as P0. github →
OWASP NHI Top 10 — Non-Human Identity framework. If the Grantex findings describe your stack, this is the right starting framework for fixing it — more directly applicable than the LLM Top 10 for credential and identity issues. owasp.org →
One thing to check this week
For each agent in your stack: what credential does it use? Is it shared with any other agent? What breaks if you rotate it? If the answers are "org API key," "yes," and "everything" — you have the Grantex finding in your own environment. The Nango guide published this week has the clearest breakdown of which auth pattern applies per integration type (OAuth vs bot token vs scoped API key). nango.dev →
What I'm watching
CVE-2026-2256 weaponization timeline — CVSS 9.8, no patch, CERT/CC advisory public. This gets picked up by exploit kits fast
Fine-tuning pipeline security — Malice in Agentland shows all current defenses fail; this is an open research problem, not a solved one
EU AI Act August 2 enforcement — the March 13 GPAI delegated regulation made abstract obligations concrete; most MCP deployers haven't mapped their compliance position yet
Agent identity standards — the Grantex findings will push OWASP Agentic Top 10 and NIST CAISI toward concrete per-agent credential guidance
If this was useful, forward it to the engineer on your team who last deployed an agent. Odds are they used an env file and moved on.
Questions, pushback, topics — reply directly, I read everything.
Cheers,
Amine
