ANALYSIS · Apr 2026 · 14 min read

The Claude Code Leak Reveals How Anthropic Prompts for Judgment, Not Compliance — 7 Techniques You Can Steal

TL;DR

Anthropic’s Claude Code prompts don’t read like instruction manuals. They read like onboarding documents for a thoughtful new hire. Instead of "never do X," they explain why X is dangerous and give the model a framework for evaluating novel risks on its own. Instead of rules, they teach principles. Instead of binary safe/unsafe classifications, they give four dimensions of risk. The result is a 914-line system prompt that handles everything from git operations to web search to multi-agent coordination — and it holds together because the model understands the reasoning, not just the rules.

What this article is (and isn’t)

The raw system prompts from the Claude Code leak have been public since April 1st — Piebald-AI and asgeirtj both extracted and published them. Engineer’s Codex did a general code dive. Several articles mention that Anthropic uses XML tags, the CRITICAL keyword, and few-shot examples.

I’m not going to rehash any of that. If you want to know that Anthropic uses <env> tags for runtime data or that they show concrete output examples in their prompts, Anthropic’s own documentation covers it. Those are established best practices that the leak simply confirms.

What nobody has synthesized is the philosophy underneath. Across 914 lines of system prompt, 370 lines of BashTool instructions, and 303 lines of compaction prompt, a consistent approach emerges that’s fundamentally different from how most people write prompts. I’ve extracted 7 techniques that implement this philosophy — each one backed by code evidence, each one something you can use today whether you’re building AI products or just writing better prompts in a chat window.

The philosophy: rules break, principles scale

Here’s the observation that made everything click. Most prompts I see — including ones I’ve written — are rule lists. "Always do A. Never do B. When you see C, respond with D." That approach works for narrow tasks. It breaks the moment the model encounters a situation the rule list didn’t anticipate.

Anthropic’s approach is different. Their prompts teach the model how to think about a class of problems, not just how to respond to specific instances. The clearest example is in their code style section:

src/constants/prompts.ts — getSimpleDoingTasksSection()
Don’t add features, refactor code, or make "improvements" beyond what
was asked. A bug fix doesn’t need surrounding code cleaned up. A simple
feature doesn’t need extra configurability.
Don’t add error handling, fallbacks, or validation for scenarios that
can’t happen. Trust internal code and framework guarantees. Only
validate at system boundaries (user input, external APIs).
Don’t create helpers, utilities, or abstractions for one-time
operations. The right amount of complexity is what the task actually
requires — no speculative abstractions, but no half-finished
implementations either. Three similar lines of code is better than
a premature abstraction.

"Three similar lines of code is better than a premature abstraction." That’s not a rule — it’s a principle. A rule would say "don’t create utility functions for code used fewer than 3 times." That’s brittle: what about 2 times? What if it’s complex? The principle gives the model a judgment framework that handles every edge case because it understands the tradeoff, not just the threshold.

This philosophy — teach judgment, don’t just enforce compliance — runs through every technique below.

1. Tell the model what it replaces, not just what it should do

This is the single most consistent pattern in the codebase. Anthropic almost never tells Claude to do something without also naming the specific wrong behavior it’s overriding.

src/tools/BashTool/prompt.ts — tool preference section
- File search: Use Glob (NOT find or ls)
- Content search: Use Grep (NOT grep or rg)
- Read files: Use FileRead (NOT cat/head/tail)
- Edit files: Use FileEdit (NOT sed/awk)
- Write files: Use FileWrite (NOT echo >/cat <<EOF)

Every single line pairs the correct behavior with the specific habit it replaces.

This isn’t just thoroughness — it’s anti-pattern suppression. Language models have strong priors from training data. If millions of examples show grep being used for text search, telling the model "use the Grep tool" isn’t enough. Its default impulse is to type grep in a shell. You have to explicitly name and reject the wrong approach so the model’s prior gets overridden by the instruction.

For developers: Whenever you tell the model to do X, also say "not Y" where Y is the most likely wrong behavior from training data. "Format dates as YYYY-MM-DD, not MM/DD/YYYY." "Return raw JSON, not wrapped in a markdown code block." You’re fighting priors — name the prior you’re fighting.
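
As a sketch, the "do X, not Y" pairing can be generated mechanically from a list of tool preferences. The interface and field names below are my own illustration, not the leaked source — only the rendered output format mirrors the excerpt above.

```typescript
// Pair each instruction with the trained-in habit it overrides.
// The pattern is "Use X (NOT Y)" — naming the prior you are fighting.

interface ToolPreference {
  task: string;      // what the model is trying to do
  preferred: string; // the tool it should use instead
  habit: string;     // the training-data default it will reach for
}

function renderPreference(p: ToolPreference): string {
  return `- ${p.task}: Use ${p.preferred} (NOT ${p.habit})`;
}

const preferences: ToolPreference[] = [
  { task: "File search", preferred: "Glob", habit: "find or ls" },
  { task: "Content search", preferred: "Grep", habit: "grep or rg" },
];

const section = preferences.map(renderPreference).join("\n");
```

Keeping the anti-pattern as structured data (rather than prose) also makes it easy to audit: every preference line is guaranteed to name the habit it suppresses.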

For everyone: This works in regular conversations too. Instead of "write this in a professional tone," try "write this in a professional tone, not a casual or conversational one." Instead of "keep it short," try "keep it to 2-3 paragraphs, not a bullet-point list." Naming what you don’t want is as powerful as naming what you do.

2. Give constraints a reason — and a consequence

When Anthropic restricts the model, they use a three-layer pattern: the rule, the reason the model doesn’t need to break it, and what happens if it tries anyway.

src/services/compact/prompt.ts — the compaction constraint
CRITICAL: Respond with TEXT ONLY. Do NOT call any tools.
- Do NOT use Read, Bash, Grep, Glob, Edit, Write, or ANY other tool.
- You already have all the context you need in the conversation above.
- Tool calls will be REJECTED and will waste your only turn —
  you will fail the task.

Three layers. "Do NOT call any tools" is the rule. "You already have all the context you need" removes the motivation to break it — the model doesn’t need tools because it already has what it needs. "Tool calls will be REJECTED and will waste your only turn" creates a self-preservation incentive.

The code comment above this section includes real data: without the consequence language, the model tries to call tools 2.79% of the time on Claude 4.6. With it, the rate drops to near zero. The "why" isn’t decoration — it measurably changes behavior.

For developers: When constraining model behavior, always give three things: the prohibition, the reason the constraint is unnecessary to break (remove the motivation), and the consequence of attempting it (add a disincentive). "Don’t call external APIs — all data is provided in the context. Attempts to call APIs will timeout and waste your response budget."
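
A minimal sketch of the three-layer pattern as a reusable builder — the field names (`rule`, `reason`, `consequence`) are mine, not Anthropic’s, but the rendered output matches the shape of the compaction constraint above.

```typescript
// Three-layer constraint: the prohibition, the reason the model never
// needs to break it (removes motivation), and the consequence of trying
// anyway (adds a disincentive).

interface Constraint {
  rule: string;        // the prohibition itself
  reason: string;      // why the constraint never needs breaking
  consequence: string; // what happens on an attempted violation
}

function renderConstraint(c: Constraint): string {
  return [`CRITICAL: ${c.rule}`, `- ${c.reason}`, `- ${c.consequence}`].join("\n");
}

const noTools = renderConstraint({
  rule: "Respond with TEXT ONLY. Do NOT call any tools.",
  reason: "You already have all the context you need in the conversation above.",
  consequence: "Tool calls will be REJECTED and will waste your only turn.",
});
```

Because all three layers are required fields, you can’t ship a bare prohibition by accident — every constraint arrives with its motivation-remover and its disincentive attached.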

For everyone: When Claude goes in a direction you don’t want, explain why you don’t want it rather than just saying "don’t do that." "Don’t include an introduction paragraph — I’m pasting this directly into an email that already has context, so an intro would be redundant." The reasoning helps the model make better decisions about everything else in the response too.

3. Use a risk taxonomy, not a binary switch

This is one of the most sophisticated patterns in the codebase. Instead of classifying actions as "safe" or "dangerous," Anthropic gives the model four dimensions to evaluate:

src/constants/prompts.ts — getActionsSection()
Examples of the kind of risky actions that warrant user confirmation:
- Destructive operations: deleting files/branches, dropping database
  tables, killing processes, rm -rf
- Hard-to-reverse operations: force-pushing, git reset --hard,
  amending published commits
- Actions visible to others: pushing code, creating/closing/commenting
  on PRs or issues, sending messages
- Uploading content to third-party web tools (publishes it) — consider
  whether it could be sensitive before sending

A file deletion is destructive but not visible. A PR comment is visible but not destructive. A force-push is destructive, hard-to-reverse, and visible to others. The model evaluates each dimension independently, which produces better decisions than "is this risky? yes/no."

This is the judgment-not-compliance philosophy in action. A rule-based approach would enumerate every dangerous command. A judgment-based approach gives the model a framework for evaluating any action — including ones the prompt author never anticipated.

For developers: When you need the model to assess risk, quality, or priority, give it dimensions to evaluate independently. For a customer service bot: "Evaluate whether this action is reversible, whether it affects billing, whether it’s visible to the customer, and whether it requires manager approval." For a content reviewer: "Evaluate accuracy, tone, legal risk, and brand alignment separately." Multi-dimensional evaluation beats binary classification every time.
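
A sketch of what multi-dimensional evaluation looks like in code. The four dimensions mirror the excerpt above; the decision logic (any one dimension triggers confirmation) is my own illustration, not how Claude Code actually weighs them.

```typescript
// Evaluate risk dimensions independently instead of a single safe/unsafe
// flag. A PR comment is visible but not destructive; a force-push can
// score on several dimensions at once.

interface RiskProfile {
  destructive: boolean;      // deletes or overwrites data
  hardToReverse: boolean;    // force-push, amending published history
  visibleToOthers: boolean;  // PRs, issues, messages
  publishesContent: boolean; // uploads to third-party tools
}

function requiresConfirmation(r: RiskProfile): boolean {
  // Illustrative policy: any single dimension warrants asking the user.
  return r.destructive || r.hardToReverse || r.visibleToOthers || r.publishesContent;
}

const prComment: RiskProfile = {
  destructive: false,
  hardToReverse: false,
  visibleToOthers: true, // visible but not destructive
  publishesContent: false,
};
```

The payoff of the structure is that a novel action just gets scored on the same four axes — no enumeration of dangerous commands required.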

For everyone: If you’re asking Claude to evaluate something, give it the criteria explicitly. Instead of "is this a good email?" try "evaluate this email for clarity, professional tone, and whether the ask is specific enough to act on." You’ll get a more useful response because you’ve given it dimensions instead of asking for a single yes/no judgment.

4. Teach the workflow, not just the task

Anthropic doesn’t just tell the model what to do — they encode the correct sequence of operations as instructional text, including what to do before the actual task.

src/tools/FileWriteTool/prompt.ts — pre-requisite gate
If this is an existing file, you MUST use the FileRead tool first
to read the file’s contents. This tool will fail if you did not
read the file first.

The Write tool will literally reject the operation if the model hasn’t Read the file first. But instead of letting the model hit that error and recover, the prompt teaches the dependency upfront. The model never makes the mistake because it learned the correct workflow before encountering the constraint.

This pattern appears everywhere. The BashTool prompt teaches when to chain commands with && vs. ;, when to run commands in parallel vs. sequentially, and when to use background execution vs. foreground. It’s not just "here’s what the tool does" — it’s "here’s how to use the tool well."

For developers: If your system has prerequisites (step A before step B), encode them as instructional text in the prompt, not just as runtime errors. "Before generating a recommendation, first check if the user has provided their budget. If no budget is specified, ask — attempting to recommend without a budget produces generic results the user will reject." Pre-requisite gates prevent errors; teaching the workflow prevents the model from even approaching the error.
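
A toy sketch of a prerequisite gate under these assumptions: the runtime rejects the operation, and the prompt text teaches the dependency up front. The function names (`readFile`, `writeFile`) and the in-memory tracking set are illustrative, not the leaked tool implementation.

```typescript
// Prerequisite gate: existing files must be read before being written.
// The prompt teaches this rule so the model never hits the runtime error.

const readFiles = new Set<string>();

function readFile(path: string): void {
  readFiles.add(path); // record that the model has seen this file
}

function writeFile(path: string, exists: boolean): string {
  // Runtime enforcement — the backstop behind the prompt's instruction.
  if (exists && !readFiles.has(path)) {
    throw new Error(`Must read ${path} before writing it`);
  }
  return "ok";
}
```

The prompt instruction and the runtime check are the same rule expressed twice: the instruction prevents the mistake; the check catches it if prevention fails.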

For everyone: When giving Claude a multi-step task, tell it the order and the dependencies. "First read the document, then identify the three main arguments, then write a counter for each one. Don’t start writing counters until you’ve identified all three." This prevents the model from jumping ahead and producing a half-finished response.

5. Grade your instructions by severity

Anthropic uses three distinct intensity levels across every prompt in the codebase, and they use each one consistently:

CRITICAL means "the system will fail or reject your output if you violate this." It appears only where deviation causes actual breakage — tool calls in a text-only context, wrong output format for a parser, security violations.

IMPORTANT means "strong preference with specific reasoning." It always comes with context for why:

src/tools/BashTool/prompt.ts
IMPORTANT: Avoid using this tool to run `find`, `grep`, `cat`,
`head`, `tail`, `sed`, `awk`, or `echo` commands, unless explicitly
instructed or after you have verified that a dedicated tool cannot
accomplish your task.

Notice the escape hatch: "unless explicitly instructed or after you have verified." IMPORTANT isn’t absolute — it gives the model a decision point.

Regular imperatives are for everything else. Standard guidance that should be followed but isn’t catastrophic if the model adapts.

The key insight: if everything is CRITICAL, nothing is. The reason this hierarchy works is that Anthropic reserves CRITICAL for genuine failure modes. When the model sees CRITICAL, it knows this one truly can’t be violated. If you mark every instruction as CRITICAL, the model has no way to prioritize when instructions conflict — and in a 914-line prompt, instructions will conflict at the edges.

For developers: Audit your prompts for priority inflation. If you have more than 2-3 CRITICAL/MUST/NEVER instructions, you’ve probably diluted the signal. Reserve the strongest language for things that cause actual system failures. Use softer framing ("prefer," "when possible," "unless the user specifies otherwise") for guidelines that should bend in edge cases.
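
As a rough sketch, the audit can even be automated: count the strongest severity markers and flag prompts that overuse them. The marker list and the threshold of 3 here are arbitrary illustrations, not anything from the leak.

```typescript
// Quick lint for priority inflation: if everything is CRITICAL,
// nothing is. Count the strongest markers and flag overuse.

const SEVERITY_MARKERS = /\b(CRITICAL|MUST|NEVER)\b/g;

function countCritical(prompt: string): number {
  return (prompt.match(SEVERITY_MARKERS) ?? []).length;
}

function hasPriorityInflation(prompt: string, max = 3): boolean {
  return countCritical(prompt) > max;
}
```

Running this over a prompt library is a cheap way to find the files where severity language has drifted from "genuine failure mode" to "author emphasis."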

For everyone: If you’re giving Claude multiple instructions and some matter more than others, say so. "The most important thing is that the tone stays professional — everything else is flexible." This gives the model permission to make tradeoffs intelligently instead of trying to satisfy every instruction equally and producing something mediocre across the board.

6. Build prompts from composable blocks

Claude Code’s system prompt isn’t a document. It’s generated from ~15 specialized functions, each responsible for one concern:

src/constants/prompts.ts — prompt assembly (simplified)
getSimpleIntroSection()           // Identity and role
getSimpleDoingTasksSection()      // Task execution philosophy
getActionsSection()               // Risk assessment framework
getUsingYourToolsSection()        // Tool guidance
getSimpleToneAndStyleSection()    // Communication style
getOutputEfficiencySection()      // Brevity instructions
computeEnvInfo()                  // Runtime environment (dynamic)
loadMemoryPrompt()                // Persistent memory (dynamic)

Each section can be modified, A/B tested, or conditionally included. Internal Anthropic employees get different coding instructions than external users — the USER_TYPE === 'ant' check adds sections like "Default to writing no comments. Only add one when the WHY is non-obvious." External users don’t see those. Different features enabled means different sections injected. The prompt adapts to context without anyone rewriting the whole thing.

The caching architecture reinforces this: sections that never change are cached globally across all users (everything before the SYSTEM_PROMPT_DYNAMIC_BOUNDARY marker). Session-specific content — your working directory, your memory, your MCP servers — sits after the boundary and regenerates each turn. Volatile sections that bust the cache must use DANGEROUS_uncachedSystemPromptSection(), which forces the developer to document why cache-busting is necessary.

For developers: Structure your system prompts as composable modules with clear separation of concerns. When your prompt isn’t working, you need to isolate which section is causing the problem. A monolithic prompt is impossible to debug; a modular one lets you swap one section at a time. Even if you’re not using a framework — just writing text — use clear # Section headers so you can identify and modify individual concerns.
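
A minimal sketch of composable assembly with a cache boundary. The section contents and the `DYNAMIC_BOUNDARY` comment are placeholders of my own; only the structure — static, cacheable sections before the boundary, per-session ones after — mirrors the architecture described above.

```typescript
// Assemble the prompt from small section functions. Static sections can
// be cached across users; dynamic sections regenerate every turn.

type Section = () => string;

const staticSections: Section[] = [
  () => "# Identity\nYou are a coding assistant.",
  () => "# Tone\nBe concise.",
];

const dynamicSections: Section[] = [
  () => `# Environment\ncwd: ${process.cwd()}`, // changes per session
];

function assemblePrompt(): string {
  const staticPart = staticSections.map((s) => s()).join("\n\n");
  const dynamicPart = dynamicSections.map((s) => s()).join("\n\n");
  // Everything before the boundary is cacheable; everything after is not.
  return `${staticPart}\n\n<!-- DYNAMIC_BOUNDARY -->\n\n${dynamicPart}`;
}
```

With this shape, A/B testing a single section means swapping one function, and a cache-busting section is visible at a glance because it sits below the boundary.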

For everyone: When you’re writing a long or complex prompt, break it into sections mentally: what role should Claude play, what’s the task, what format do you want, what constraints apply. Separating these concerns in your prompt helps Claude process them independently instead of trying to parse a wall of text where role, task, format, and constraints are all mixed together.

7. Use a disposable scratchpad for complex reasoning

When Anthropic needs the model to do complex analysis before producing output, they don’t just say "think step by step." They give it a structural container for thinking that gets stripped before the output reaches the user:

src/services/compact/prompt.ts — draft-then-output pattern
Before providing your final summary, wrap your analysis in <analysis>
tags to organize your thoughts and ensure you’ve covered all necessary
points. In your analysis process:
1. Chronologically analyze each message and section of the conversation.
2. Double-check for technical accuracy and completeness.
<example>
<analysis>
[Your thought process, ensuring all points are covered]
</analysis>
<summary>
[Your structured summary here]
</summary>
</example>

The model thinks in <analysis>, outputs in <summary>. The analysis block gets stripped by code before reaching context.

This is the structural enforcement of chain-of-thought reasoning. "Think step by step" is a suggestion. <analysis> followed by <summary> is an architecture. The model knows its thinking goes in one container and its output in another. The code (formatCompactSummary()) strips the analysis block after generation, so the thinking never pollutes the final output or future context.

The elegance is that the model can be thorough in its reasoning without making the output verbose. The analysis block is where it works through every detail. The summary block is where it delivers the clean result.

For developers: When you need the model to reason before responding, give it explicit containers. Use XML tags (or any delimiter your parser can split on) to separate thinking from output. Then strip the thinking before showing it to the user or feeding it back as context. This gives you chain-of-thought quality without chain-of-thought verbosity. It also lets you log the reasoning for debugging without shipping it to production.
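
The stripping step can be as simple as a regex over the model’s output. This is a sketch under my own assumptions — the leaked `formatCompactSummary()` may work differently — but it shows the shape: thinking lives in one container, output in another, and only the output survives.

```typescript
// Strip the disposable <analysis> scratchpad before the output reaches
// the user or gets fed back as context.

function stripAnalysis(output: string): string {
  // Remove the analysis container and any trailing whitespace after it.
  return output.replace(/<analysis>[\s\S]*?<\/analysis>\s*/g, "").trim();
}

const raw = [
  "<analysis>",
  "Step through each message, check completeness...",
  "</analysis>",
  "<summary>",
  "The user refactored the auth module.",
  "</summary>",
].join("\n");

const clean = stripAnalysis(raw);
```

Logging `raw` while shipping `clean` gives you the debugging benefit of chain-of-thought without its verbosity cost in production.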

For everyone: You can use this right now. Try: "Before answering, write your reasoning in a ‘Thinking’ section, then write your final answer in an ‘Answer’ section." The model will be more thorough because it has a place to think, and you can skip the thinking section and jump straight to the answer if you just want the result.

The pattern underneath the patterns

Every technique above implements the same underlying idea: treat the model like a capable colleague who needs context, not a machine that needs instructions.

Give it reasons, not just rules. Give it frameworks for evaluation, not binary classifications. Give it the workflow, not just the task. Give it space to think. Tell it what bad habits to override. Grade your asks so it knows what matters most.

Most of us write prompts the way we write software configuration: precise, minimal, directive. Anthropic writes prompts the way you’d write an onboarding document for a smart new hire: thorough on context, clear on priorities, explicit about judgment calls, and trusting that the recipient can apply principles to situations you didn’t specifically cover.

Their 914-line system prompt handles file editing, git operations, web search, multi-agent coordination, memory management, and risk assessment — and it holds together not because every edge case is enumerated, but because the model understands the reasoning behind every instruction. When a novel situation arises that the prompt didn’t anticipate, the model can apply the principles to figure out the right approach.

That’s the real takeaway from the leak. Not any individual technique. The philosophy: prompt for judgment, not compliance.


Methodology: This analysis is based on direct examination of the Claude Code v2.1.88 source code leaked via npm source map on March 31, 2026. Primary files: src/constants/prompts.ts (914 lines), src/tools/BashTool/prompt.ts (370 lines), src/services/compact/prompt.ts (303 lines), src/constants/systemPromptSections.ts (69 lines). All code excerpts are verbatim.

In this series

Part 1: How Claude Decides What to Cite — the GEO Analysis

Part 2: Everyone (Including Me) Got the Haiku Pipeline Wrong

Piebald-AI: Claude Code System Prompts (raw text)

Anthropic Docs: Use XML Tags to Structure Your Prompts

Engineer’s Codex: Diving Into Claude Code’s Source

Written by AI. Obviously.