
Claude Code Source Leak: What We Learned About Building Better AI Harnesses

April 1, 2026

by SolaScript

The source code for Claude Code — Anthropic’s acclaimed AI coding assistant — was accidentally leaked, and the developer community has been dissecting it ever since. Matthew Berman’s video provides an excellent walkthrough of what was revealed and why it matters for anyone building AI-powered development tools.

In this post, we’ll break down the key architectural decisions that make Claude Code special, the practical lessons for developers, and what this means for the broader AI tooling ecosystem.

How the Leak Happened

The source code was exposed through a source map file shipped in one of Anthropic’s npm packages — a classic accidental disclosure. Within hours, the leak had garnered over 22 million views on X alone. Someone even converted the entire TypeScript codebase to Python, which (regardless of how you feel about it) makes that version legally distinct from the original.

For Anthropic, this isn’t catastrophic. No customer data was exposed, no API keys were compromised, and there weren’t any major internal secrets beyond the harness architecture itself. But it does provide an unprecedented look at how one of the best AI coding assistants actually works under the hood.

The Claude.md File: Your First Optimization

Perhaps the most immediately actionable revelation: the claude.md file gets loaded into every single prompt. Every. Single. One.

This file is your direct line to Claude’s behavior. You get 40,000 characters to define:

  • Your codebase’s architecture and conventions
  • Coding standards your team follows
  • File locations that matter most
  • Best practices and patterns you want enforced

If you’re using Claude Code and barely touching this file, you’re leaving significant performance on the table. Think of it as your persistent system prompt that shapes every interaction. The quality of your claude.md directly correlates with how well Claude Code understands and works within your specific codebase.
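For concreteness, here is what a minimal claude.md fragment might look like. The project details below are invented for illustration; adapt them to your own codebase:

```markdown
# Project conventions

## Architecture
- TypeScript monorepo; shared utilities live in `packages/core`.
- API handlers go in `services/api/src/routes`, one file per resource.

## Coding standards
- Prefer named exports; no default exports.
- Every new module gets a colocated `*.test.ts` file.

## Key files
- `packages/core/src/config.ts`: central configuration loader.
```

Treat this as living documentation: update it whenever a convention changes, since it rides along with every prompt.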

Parallelism is Built In

Claude Code isn’t designed for single-threaded workflows. The architecture supports multiple agents running simultaneously, and critically, sub-agents share prompt caches. This means spinning up five or ten sub-agents doesn’t multiply your token costs proportionally — you’re essentially getting parallelism for free.

The source code reveals three distinct execution models for sub-agents:

  1. Fork: Inherits parent context, cache-optimized
  2. Teammate: Separate pane in tmux or iTerm, communicates via file-based mailbox
  3. Worktree: Gets its own git worktree with an isolated branch per agent

Boris Cherny, the creator of Claude Code, has mentioned he routinely runs multiple agents simultaneously. Git worktrees prevent agents from conflicting with one another in your working branch. If you’re still doing everything with a single agent, you’re not using the tool as designed.
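The worktree model maps onto plain git commands. As a sketch of the general pattern (these are standard git invocations, not Claude Code’s own), each agent gets its own directory and branch:

```shell
# Scratch repo to demonstrate the per-agent worktree pattern
tmp=$(mktemp -d) && cd "$tmp"
git init -q repo && cd repo
git -c user.name=dev -c user.email=dev@example.com \
    commit -q --allow-empty -m "init"

# One worktree and one branch per agent, so their edits never collide
git worktree add -q ../agent-1 -b agent-1
git worktree add -q ../agent-2 -b agent-2
git worktree list
```

Because each worktree is a full checkout on its own branch, merging an agent’s output back is an ordinary `git merge` or pull request.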

The Permission System You’re Probably Fighting

Every time Claude Code asks permission for something, that’s a configuration failure — not a feature. The system is meant to be pre-configured so you rarely see those prompts.

There’s a settings.json file that lets you define exactly which commands and operations are allowed by default. The three permission modes are:

  • Bypass: No permission checks (fast but dangerous)
  • Allow Edits: Auto-allows file edits in your working directory
  • Auto: Runs an LLM classifier on each action to predict what you’d approve

Auto mode is the sweet spot for most users. It intelligently predicts which actions you’d approve and handles them automatically while still blocking genuinely risky operations. The old “dangerously skip permissions” flag is essentially deprecated in favor of this smarter system.
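As a sketch of what pre-configuring permissions might look like (the structure and field names below are assumptions for illustration, not the actual settings.json schema), the idea is an explicit allow-list plus a default mode:

```json
{
  "permissionMode": "auto",
  "allow": [
    "Edit(src/**)",
    "Bash(npm test)",
    "Bash(git status)"
  ],
  "deny": [
    "Bash(rm -rf *)"
  ]
}
```

The point is that every prompt you see at runtime is a rule you could have written down here instead.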

Compaction: What to Forget Matters More Than What to Remember

There’s a saying in AI development: what the model forgets is more important than what it remembers. Selective forgetting lets you maintain higher fidelity on the things that actually matter.

Claude Code implements five compaction strategies:

  1. Micro Compact: Time-based clearing of old tool results
  2. Context Collapse: Summarizes spans of conversation (lossy compression)
  3. Session Memory: Extracts key context to a file
  4. Full Compact: Summarizes entire history
  5. PTL Truncation: Drops the oldest message groups

The practical advice here: use /compact proactively. Don’t wait for the system to auto-compact and potentially lose context you care about. If you know what you want to remember (and especially what you want to forget), trigger compaction manually.

The default context window is 200,000 tokens, with an option for a million. Quality does degrade past 200k tokens, but it’s still better than most alternatives. Think of /compact like saving your game — do it intentionally at good checkpoints.
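The drop-the-oldest strategy (item 5 in the list above) is simple to sketch. This is a generic illustration of the technique, not the leaked implementation:

```python
def truncate_oldest(messages, budget, count_tokens=lambda m: len(m) // 4):
    """Drop whole messages from the front until the rest fits the budget.

    `messages` is oldest-first; `count_tokens` is a rough estimator
    (about 4 characters per token here).
    """
    kept = list(messages)
    total = sum(count_tokens(m) for m in kept)
    while kept and total > budget:
        total -= count_tokens(kept.pop(0))  # forget the oldest first
    return kept

history = ["a" * 400, "b" * 400, "c" * 400]  # ~100 tokens each
print(truncate_oldest(history, budget=250))  # keeps the two newest
```

The more interesting strategies (context collapse, session memory) replace the dropped span with a summary instead of discarding it outright, trading tokens for fidelity.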

Sessions Are Persistent (Stop Starting Fresh)

Every Claude Code conversation is saved as JSONL at a specific path. You can:

  • Use -c to continue your last session
  • Resume any previous session by ID
  • Fork sessions to branch your work

Starting fresh means no context. Claude has to relearn your codebase from scratch. If there’s any continuity between tasks — even if you’re working on different parts of the codebase — resuming the same session preserves momentum and accumulated understanding.

Large tool results get stored to disk with only an 8KB preview sent to the model. If you paste a massive file, Claude may only see a fraction. Keep your inputs focused.
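The spill-to-disk pattern can be sketched in a few lines. This is a generic illustration, not the actual implementation; only the 8KB figure comes from the post:

```python
import tempfile
from pathlib import Path

PREVIEW_BYTES = 8 * 1024  # only this much reaches the model

def store_tool_result(content: str, out_dir: Path) -> dict:
    """Persist a large tool result to disk; return a preview for the prompt."""
    path = out_dir / f"result-{abs(hash(content))}.txt"
    path.write_text(content)
    return {
        "preview": content[:PREVIEW_BYTES],
        "truncated": len(content) > PREVIEW_BYTES,
        "full_path": str(path),  # the agent can re-read slices on demand
    }

big = "x" * 50_000
info = store_tool_result(big, Path(tempfile.mkdtemp()))
print(info["truncated"], len(info["preview"]))  # True 8192
```

The practical upshot is the same either way: point Claude at a path or a specific function rather than pasting a whole file into the prompt.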

Hooks: The Power User Feature

The hooks system is apparently underutilized, but it’s where significant automation potential lives. Available hook points include:

  • Pre-tool use
  • Post-tool use
  • User prompt submit
  • Session start
  • Session end

Hook types include command, prompt, agent, HTTP, and function hooks.

One practical application: automatically updating documentation when code changes. Instead of manually reminding Claude to update docs (which gets tedious), you can hook into the commit process and trigger documentation updates based on which parts of the codebase were modified.
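A hypothetical hook for that documentation workflow might look like the following (the structure and field names are invented for illustration, not the leaked schema):

```json
{
  "hooks": {
    "postToolUse": [
      {
        "match": "Edit(src/**)",
        "type": "prompt",
        "action": "Update docs/ to reflect any changed public APIs."
      }
    ]
  }
}
```

The general shape is the same regardless of schema: a trigger point, a matcher for which actions it fires on, and an action of one of the five hook types.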

The 66 Built-in Tools

Claude Code ships with 66 built-in tools, partitioned into two categories:

  • Concurrent tools: Read-only operations that can run in parallel
  • Serialized tools: Mutating operations (edits, writes, bash commands) that run one at a time

This means if Claude needs to read ten different parts of your codebase via sub-agents, it can do so in parallel. Write operations are queued to prevent conflicts.

Interruption is Cheap

If you notice Claude Code going in the wrong direction — maybe it misunderstood your prompt or is implementing something incorrectly — stop it immediately. The streaming architecture means you’re not losing tokens by cutting it off. Letting it continue out of sunk-cost reasoning just wastes more resources. Cut your losses, clarify, and continue.

What This Means for the Ecosystem

The leak is already influencing open-source projects. Tools like Open Code can integrate these architectural insights directly. Developers building their own harnesses now have a reference implementation showing:

  • How to structure effective system prompts
  • Optimal permission handling patterns
  • Memory management and compaction strategies
  • Multi-agent orchestration approaches

The Claude Code harness is specifically optimized for Claude models — plugging in GPT or Gemini won’t yield the same results. But the architectural patterns are transferable.

For Anthropic, this isn’t ideal optics, but the actual damage is limited. For the developer community, it’s a rare look at production-grade AI tooling architecture from a frontier lab.

Key Takeaways

  1. Invest in your claude.md file — it shapes every interaction
  2. Use multiple agents — the architecture supports parallelism with shared caches
  3. Configure permissions upfront — don’t fight the prompts, configure them away
  4. Compact proactively — control what gets remembered and forgotten
  5. Stop starting fresh — session continuity preserves valuable context
  6. Explore hooks — automate the repetitive parts of your workflow
  7. Interrupt early — don’t let sunk cost fallacy waste tokens

Whether you’re using Claude Code directly or building your own AI-assisted development tools, these patterns represent battle-tested approaches to making language models more effective in coding contexts.


The original video by Matthew Berman provides additional context and visual walkthroughs of the leaked source code structure.


Published by

Sola Fide Technologies - SolaScript

This blog post was crafted by AI agents, leveraging advanced language models to provide clear and insightful information on the dynamic world of technology and business innovation. Sola Fide Technologies is a leading IT consulting firm specializing in innovative and strategic solutions for businesses navigating the complexities of modern technology.
