Microsoft Threat Intelligence discovered that Anthropic’s Claude Code GitHub Action could expose CI/CD workflow secrets when AI agents process untrusted GitHub content, including issue bodies, pull request descriptions, and comments. We found that while Claude Code Action supported environment scrubbing for subprocess execution paths such as Bash, the Read tool was not subject to the same sandboxing model. It was eventually authorized to access /proc/self/environ, reading the workflow’s ANTHROPIC_API_KEY and potentially other credentials available to the runner.
Following our responsible disclosure, Anthropic mitigated this issue in Claude Code version 2.1.128 by blocking access to sensitive /proc files. Defenders should treat AI workflows that process untrusted GitHub content as high-risk when they also have access to secrets, file-read tools, or external communication channels.
We began this research after observing prompt injection attempts in public repositories using AI-assisted GitHub workflows across multiple vendors, where attacker-controlled issue or PR content is processed by the AI agent and could influence its tool use. For example:
The injection payload was placed inside an HTML comment (<!– –>), making it invisible when the issue is rendered in the browser but still visible to the AI model which reads the raw markdown:

The target repository – fork of a major open-source documentation project – used a highly permissive GitHub Actions workflow to automate issue resolution. We believe the actor is using a fork to test which payloads work before disclosing or exploiting them.
Whenever a user opened a new issue, an AI bot interpreted the request and was granted robust operational tools to resolve it:
This tool chain, operating without external oversight, provided an unauthorized user with the exact high-level primitives needed to plant malware without directly possessing write access.
Disguising the attack as a legitimate feature request for “diagnostic telemetry”, the payload provided the AI with a precise sequence of commands rather than a standard conversational prompt. It instructed the bot to search for a specific markdown heading, read the target file’s contents, append an exact block of malicious HTML, and immediately invoke the pull request tool to commit the newly poisoned file, effectively steering the AI step-by-step through a supply-chain compromise.
The attack vector successfully coerced the bot into locating the target documentation file and appending an invisible XSS image tag:
Had this PR been merged by a maintainer or by automated CI/CD automation, rendering the documentation site would execute JavaScript on visitors’ machines to silently exfiltrate their session tokens to the attacker’s endpoint.
This same trust boundary is what makes the Read tool vulnerability exploitable: once an attacker can influence the agent, they might be able to steer it toward sensitive files available inside the CI runner environment.
To understand the vulnerability described in this blog, it helps to first understand the environment in which they operate. GitHub Actions workflows were designed for deterministic automation—running tests, deploying builds, and enforcing policy. But as AI-powered tools like Claude Code Action have entered that environment, they’ve brought up a fundamentally different execution model: one where natural language can be treated as instruction. The sections below walk through how that model works, where the security boundaries are drawn, and critically, why those boundaries fail.
GitHub Actions is GitHub’s native automation and CI/CD platform. A workflow is a YAML configuration file that defines jobs to run when repository events occur, such as pull_request, issue_comment, scheduled runs, or manual dispatch.
When a workflow is triggered, GitHub executes its jobs on a runner: an ephemeral virtual machine, or in some cases a self-hosted environment. That runner is not just executing code in isolation. Depending on the workflow configuration, it may receive repository contents, issue and pull request metadata, environment variables, the GITHUB_TOKEN, cloud credentials, package publishing tokens, and third-party API keys.
GitHub workflows were built for deterministic automation: run tests, build artifacts, deploy code, label issues, or enforce repository policy. AI-powered workflows change that model. Instead of only executing predefined logic, they ingest repository context, interpret natural-language input, and decide which actions to take next.
A common example is AI-based pull request review. Tools such as Anthropic’s Claude Code GitHub Action can trigger on pull requests, read the diff, title, description, and comments, then post review feedback or security findings. In more advanced configurations, the same agent can modify files, create commits, or open follow-up pull requests from inside the CI runner.
Despite differences between vendors and implementations, the security pattern is consistent:
These integrations are not necessarily careless. Most include system prompts, filters, and policy logic intended to separate user content from control instructions. But when those boundaries fail, the workflow is no longer just automation. It becomes an AI agent embedded inside the repository, and its prompt construction, tool permissions, and runtime isolation become part of the security perimeter.
Claude Code Action is a GitHub action that runs Claude inside your CI runner. Under the hood, it’s a wrapper around the Claude Agent SDK (software development kit). The Claude Code Action handles GitHub-specific concerns (parsing the event, fetching issue/PR context, building the prompt, wiring up MCP (Model Context Protocol) servers, managing tracking comments) and then calls the SDK’s query function to drive Claude. Tool permissions, model selection, and most other runtime behavior are SDK options that the action is responsible for setting.

When Anthropic designed Claude Code Actions, they knew the risks. For the Bash tool, they support Bubblewrap (namespace-based Linux sandbox) with a scrubbed environment (enforced by CLAUDE_CODE_SUBPROCESS_ENV_SCRUB , auto enabled for actions that can be triggered by non-write users).
This is a solid defense. However, a gap exists: the Read tool is not subject to the same isolation.
Rather than routing Read operations through the same secure isolation boundary as Bash, these operations represent direct, in-process calls. They inherently bypass the Bubblewrap sandbox, operating with full access to the process’s environment variables.
To confirm the exploitability of this gap, we constructed a prompt injection payload. We tested this in a lab environment, specifically a non-write user enabled, which forces the CLAUDE_CODE_SUBPROCESS_ENV_SCRUB mitigation active.
We then injected this malicious prompt, the kind that naturally flows through issue bodies, PR comments, or other input:

This prompt defeats two distinct layers of defense:

In figure 4, the prompt injection succeeds; Claude confidently invokes the Read tool directly against /proc/self/environ (taken from the GitHub’s action logs).
The returned environ blob contains the unscrubbed ANTHROPIC_API_KEY. If Read ran inside the same Bubblewrap subprocess that Bash uses, it would not contain this key in the process’s environment variable.

From there, the attacker has their pick of exfiltration channels based on the target workflow configuration (which is publicly visible, since it’s stored in the repository under . github/workflows/). They can use an adversary-controlled domain via WebFetch or Bash, post it in an issue comment using GitHub MCP, or echo it to the Action log (if show_full_output is enabled in the target workflow). The attacker can then prepend “sk-ant-“ to the leaked string to reconstruct the full Anthropic API key.
May 5, 2026: Anthropic mitigated this issue in Claude Code 2.1.128. The mitigation strengthened the Read tool by unconditionally rejecting a number of files in /proc/ in order to protect those files from exfiltration.
April 29, 2026: reported to Anthropic via HackerOne.
The good news for defenders: controls already exist. Below is an actionable hardening guide:
Resource Development
Execution
Defense Evasion
Credential Access
Exfiltration
To conduct AI-driven black-box research on Claude Code Action, we built a GitHub workflow configured with the Bash tool and a system prompt designed to initiate a reverse shell. To bypass Sonnet’s refusal safety mechanisms, we obscured the shell payload behind a response from our controlled domain. We also enabled the workflow to be triggered by users with no “write” permissions to ensure Anthropic’s environment variables scrub mitigations were active during our tests.

Gaining an interactive foothold on the runner, we initially deployed a frontier AI model for automated, black-box research. When an hour of automated analysis produced no actionable findings, we pivoted.

We adopted a white-box approach, feeding the AI model the Claude Code Actions codebase and the obfuscated @anthropic-ai/claude-agent-sdk. Through this human-AI collaboration, where we actively directed the model, analyzed its findings, and tested variations, we uncovered the necessary exploit chains and responsibly disclosed them to Anthropic.
The integration of AI into GitHub Actions isn’t just a productivity improvement, it is a fundamental rewrite of the CI/CD security model. Right now, development is moving faster than defense.
Even when AI agents are deployed with safety prompts, permission scopes, and platform-level defenses (such as the secret scanner we reviewed), a determined attacker can potentially bypass these controls. We are entering an era where natural language is executable code, and untrusted inputs like GitHub issues must be treated as hostile by default. A single, carefully crafted comment combined with a misunderstood trust boundary is all it takes to walk away with production credentials.
We encourage maintainers to stay alert, keep up with the latest security updates, and implement the safeguards outlined in our mitigation guide to protect their repositories against this emerging class of attack.
For the latest security research from the Microsoft Threat Intelligence community, check out the Microsoft Threat Intelligence Blog.
To get notified about new publications and to join discussions on social media, follow us on LinkedIn, X (formerly Twitter), and Bluesky.
To hear stories and insights from the Microsoft Threat Intelligence community about the ever-evolving threat landscape, listen to the Microsoft Threat Intelligence podcast.
Review our documentation to learn more about our real-time protection capabilities and see how to enable them within your organization.