AI coding tools exposed by credential-stealing exploits in 2025

In early 2025, security teams sounded alarms as multiple widely used AI coding assistants became prime targets for credential theft. The incidents—spanning code repositories, command execution, and sandbox bypasses—highlighted systemic vulnerabilities in how these tools handle authentication and authorization. While organizations raced to integrate AI coding agents into development pipelines, threat actors exploited weaknesses in credential storage, command parsing, and access control to infiltrate production systems.

The credential theft pattern across AI coding tools

Researchers observed a consistent attack vector: AI coding agents frequently held credentials with excessive privileges, often without adequate safeguards. These credentials were used to authenticate to production systems without human oversight, enabling attackers to bypass traditional security controls. The pattern emerged in exploits against major platforms, including Microsoft Copilot, OpenAI’s Codex, Anthropic’s Claude Code, and Google’s Vertex AI.

The discovery of this vulnerability cycle began at Black Hat USA 2025, when Michael Bargury, CTO of Zenity, demonstrated zero-click attacks against ChatGPT, Microsoft Copilot Studio, Google Gemini, Salesforce Einstein, and Cursor with Jira MCP. His presentation underscored how attackers could hijack these tools by targeting the credentials they wielded rather than the models themselves.

Merritt Baer, CSO at Enkrypt AI and former Deputy CISO at AWS, emphasized the root cause in an interview: “Enterprises often approve AI vendors based on interface-level reviews, not the underlying systems. The credentials embedded beneath these interfaces are what attackers are after.” This mismatch between perceived security and actual exposure has left many organizations vulnerable.

OpenAI Codex: GitHub tokens exfiltrated via malicious branch names

In March 2026, BeyondTrust researchers Tyler Jespersen, Fletcher Davis, and Simon Stewart uncovered a critical flaw in OpenAI’s Codex. By crafting a GitHub branch name with unsanitized input, they discovered that unsuspecting developers could inadvertently expose their OAuth tokens during repository cloning.

The exploit hinged on a command injection vulnerability. A malicious branch name containing a semicolon and backtick subshell triggered unintended shell execution during the cloning process. The payload exfiltrated the GitHub OAuth token embedded in the repository URL, sending it to a remote server without any visible indicators.

Stewart added a layer of stealth by appending 94 Ideographic Space characters (Unicode U+3000) after “main,” making the malicious branch appear identical to the default branch in Codex’s interface. OpenAI classified the vulnerability as Critical P1 and deployed a full remediation by February 5, 2026.

Anthropic’s Claude Code: Sandbox bypasses and permission escalation

Anthropic’s Claude Code faced three critical vulnerabilities, each exploiting flaws in command execution and permission handling. The first, CVE-2026-25723, allowed attackers to escape file-write restrictions using piped commands like sed and echo. Command chaining was not properly validated, enabling code execution outside the intended sandbox.

A second vulnerability, CVE-2026-33068, targeted permission resolution in Claude Code. The tool read permission modes from .claude/settings.json before displaying the workspace trust dialog. By setting permissions.defaultMode to bypassPermissions, attackers could bypass the trust prompt entirely, gaining unauthorized access to sensitive files.

The third issue, discovered by Adversa, involved a 50-subcommand threshold. Claude Code silently ignored deny-rule enforcement when command complexity exceeded 50 subcommands—a trade-off between performance and security. Anthropic patched this flaw in version 2.1.90, but the incident underscored how performance optimizations can inadvertently weaken security posture.

Carter Rees, VP of AI and Machine Learning at Reputation and a member of the Utah AI Commission, noted: “Broken access control remains a critical vulnerability in enterprise AI. When an LLM’s flat authorization plane fails to respect user permissions, the system’s security collapses.”

Microsoft Copilot: Pull requests and GitHub issues as attack vectors

Johann Rehberger, in collaboration with Markus Vervier of Persistent Security, uncovered CVE-2025-53773 in GitHub Copilot. Hidden instructions in pull request descriptions could manipulate Copilot into disabling auto-approve mode in .vscode/settings.json. This change granted unrestricted shell execution across multiple operating systems.

Microsoft addressed the vulnerability in the August 2025 Patch Tuesday release, but attackers quickly adapted. Orca Security later demonstrated how hidden instructions in a GitHub issue could manipulate Copilot within GitHub Codespaces. By exploiting a symbolic link to /workspaces/.codespaces/shared/user-secrets-envs.json, attackers exfiltrated the privileged GITHUB_TOKEN via a crafted JSON schema URL, achieving full repository takeover without user interaction.

Mike Riemer, CTO at Ivanti, warned about the speed of exploitation: “Threat actors reverse engineer patches within 72 hours. If organizations don’t apply updates within that window, they remain exposed.”

Google Vertex AI: Excessive default permissions enable supply chain attacks

Unit 42 researcher Ofir Shaty identified a critical flaw in Google’s Vertex AI. The default Google service identity attached to Vertex AI agents granted excessive permissions, including unrestricted read access to all Cloud Storage buckets in a project.

This compromised identity, termed a “double agent” by Shaty, could access restricted Google-owned Artifact Registry repositories central to Vertex AI’s Reasoning Engine. The vulnerability exposed both user data and internal Google infrastructure, raising concerns about supply chain risks in AI-powered workflows.

Mitigating risks in AI-driven development pipelines

The recurring pattern of credential theft across AI coding tools suggests that organizations must rethink how they integrate these tools into development environments. Key defensive strategies include:

Least-privilege credential management: Restrict OAuth tokens and service identities to the minimum permissions required.
Input validation and sanitization: Ensure all user-provided inputs, including branch names and PR descriptions, are rigorously validated to prevent command injection.
Sandbox hardening: Implement strict command chaining validation and enforce deny-rule enforcement regardless of command complexity.
Real-time monitoring: Deploy tools to detect anomalous credential usage and unauthorized access attempts.
Rapid patching cycles: Prioritize updates within 72 hours of release to stay ahead of threat actors.

As AI coding assistants become integral to software development, the security of the underlying credentials and access controls must take precedence. The 2025 wave of exploits serves as a stark reminder that these tools, while powerful, introduce new attack surfaces that demand proactive defense strategies.

AI summary

GitHub tokenlarından Gmail'e kadar tüm hassas verilerinizi hedef alan AI kod asistanı güvenlik açıkları hakkında bilmeniz gerekenler. Hangi hizmetler risk altında ve nasıl korunabilirsiniz?

AI coding tools exposed by credential-stealing exploits in 2025

The credential theft pattern across AI coding tools

OpenAI Codex: GitHub tokens exfiltrated via malicious branch names

Anthropic’s Claude Code: Sandbox bypasses and permission escalation

Microsoft Copilot: Pull requests and GitHub issues as attack vectors

Google Vertex AI: Excessive default permissions enable supply chain attacks

Mitigating risks in AI-driven development pipelines

Comments

RunPod Flash slashes AI development time by removing Docker dependencies

Why OpenAI banned goblins in GPT-5.5—and what it reveals about AI training

Event-based AI agents launch from Writer to cut human workflow delays