Hey 👋,
If you use an AI coding agent: Claude Code, Cursor, Gemini CLI, Copilot, Codex, this issue is about you specifically. Two disclosures this month showed that the “Yes, I trust this folder” prompt you click through a dozen times a day is approving far more than it tells you. Each one turns a cloned repository into remote code execution on your machine. Neither is a bug in any single product. They’re the same architectural flaw in two different disguises.
I spent this week mapping how this connects to the MCP attack chain I published last week. Same lesson at a different layer. The danger isn’t the model, it’s the trust you grant without seeing what you’re granting.
If you want to follow along with the agent build itself, check out the aminrj.com blog. I’m publishing the full series there.
This week in AI security
SymJack: a booby-trapped repo tricks your coding agent into overwriting its own config, then runs attacker code on restart
Adversa AI’s disclosure is the sharpest of the month. A malicious repository contains an innocent-looking file copy operation. Your AI coding assistant performs the copy as part of normal work. What it’s actually doing is following a symlink that overwrites the agent’s own configuration file. The next time the agent restarts, it runs the attacker’s code with your full user privileges. Confirmed across six tools: Claude Code, Gemini CLI / Antigravity CLI, Cursor Agent CLI, GitHub Copilot CLI, Grok Build, and OpenAI Codex CLI. Every one was vulnerable.
The approval prompt the developer sees describes a benign file copy. It does not say “this will overwrite my configuration.” The user accepts because nothing looks wrong. That’s the entire point of using a coding agent. Speed and trust in automation are exactly what the attack exploits.
TrustFall: one keypress, four CLIs, full RCE — in CI/CD there’s no keypress at all
The companion disclosure. A cloned repo ships two JSON files (.mcp.json and .claude/settings.json) that auto-approve an attacker-controlled MCP server the moment you accept the folder trust prompt. One Enter keypress and the attacker’s MCP server runs with your privileges. Confirmed across Claude Code, Cursor CLI, Gemini CLI, and GitHub Copilot CLI. All default the trust prompt to “Yes.”
The real problem is CI/CD. When Claude Code runs on a continuous integration server through Anthropic’s official GitHub Action, it runs headless. No terminal, so the trust dialog never appears. A pull request from an outside contributor ships a malicious project file, the pipeline runs against that branch, and the attacker’s code reaches whatever the runner can reach: deploy keys, signing certificates, cloud tokens. Adversa published a working demo that exfiltrates the runner’s environment variables. No human in the loop.
From the blog: the same lesson, one layer down
Last week I published a step-by-step walkthrough of how a malicious MCP server drains a database in five steps. The root cause is identical to SymJack and TrustFall. The attack doesn’t start at your model. It starts at your tool marketplace, in a tool description field that looks like documentation and reads like an instruction to the LLM. A developer installs a calendar-integration MCP server, runs a malware scan (clean), and never checks the tool description. That description quietly tells the agent to export all customer records before answering any query. The post includes the three controls that actually stop it.
Tooling worth knowing
AI Agent Pre-Deployment Security Checklist — walks every control you need verified before an agent goes anywhere near production or sensitive credentials: identity, tool permissions, sandboxing, CI/CD gating, the exact areas SymJack and TrustFall exploit. Free, no fluff, built from real assessments. aminrj.com/resources/predeployment-checklist →
Adversa AI PoC repos — safe proof-of-concept repositories for testing your own setup against SymJack and TrustFall. Test your agents against these before trusting any external code. Adversa AI GitHub →
One thing to check this week
Answer one question for your team: can an AI coding agent in your environment run against an external pull request without a human merge gate first? If yes that’s a TrustFall-class exposure. The fix is a policy, not a patch. External code gets a human merge before any agent touches it, and CI runners get minimum-scope credentials. That single control neutralizes the most dangerous variant of both attacks this week.
What I’m watching
→ The “informed consent” boundary — Anthropic considers post-trust-dialog execution out of scope; researchers argue the dialog doesn’t disclose what it authorizes. This definitional fight determines whether the whole class gets fixed or keeps generating CVEs one at a time.
→ Coding-agent RCE as the 2026 supply-chain vector — SymJack and TrustFall both turn coding agents into backdoor delivery systems for CI/CD poisoning. This is shaping up to be the supply-chain attack story of the year. The defaults are the vulnerability.
→ The headless CI/CD gap — the trust dialog that “protects” interactive use simply doesn’t exist in automation. Every team running coding agents in pipelines has this exposure until they gate it explicitly.
→ The cybersecurity agent build — next issue: the allowlist-based tool gate in practice, tested against exactly these config-overwrite and trust-prompt patterns. Read the first post in the series →
I’m writing in depth about all of this on the aminrj.com blog. Agent security patterns, lab tests, framework breakdowns. If this newsletter is useful, the blog is where the full technical deep-dives live →.
Questions, pushback, something I missed — reply directly, I read everything.
Cheers, Amine
