The Forge · 11 min mission
Claude Code vs Codex: An Honest Comparison
Know which agent fits which job — and how to run both in one repo.
Pick a forum thread about agentic coding tools and you will find someone insisting Claude Code is obviously better, someone equally sure Codex is, and almost nobody explaining why either claim might be true. This guide is the version that does not pick a jersey. Both are excellent. They share the same core idea — describe an outcome, the agent reads your repo, edits files, runs commands, and iterates — and they diverge in ways that genuinely matter once you live in one daily.
The honest summary up front: there is no universal winner. There is the tool that fits your model preference, your billing situation, and the surfaces your team already works in. What follows is the comparison I wish existed when I was choosing — concrete, specific, and fair to both.
The shared shape
Before the differences, respect the similarity, because it is the most important fact here. Both tools are agents, not autocomplete. Both run the same read-act-observe loop: gather context, take an action, observe the result, repeat until done. Both install as a terminal CLI, ship an IDE extension, run in the cloud for long jobs, and read a plain-markdown instructions file from your repo. If you are fluent in one, you are most of the way to fluent in the other. The skills you build — writing tight instructions, scoping tasks, reviewing diffs — transfer almost entirely. Switching tools is closer to switching editors than switching languages.
Instruction files: CLAUDE.md vs AGENTS.md
Both tools solve the same problem — an agent starts every session knowing nothing about your repo — with the same mechanism: a plain-markdown file you commit to the repo that the agent reads before you type a word. Claude Code reads CLAUDE.md; Codex reads AGENTS.md. The content advice is identical for both: list the build/test/lint commands, the non-obvious setup, the hard constraints, and what "done" means — and cut anything the agent could infer by reading the code.
The meaningful difference is ownership of the format. CLAUDE.md is Anthropic's file for Anthropic's tool. AGENTS.md is an open standard — stewarded under the Linux Foundation and read by Codex plus a long list of other agents (Jules, Cursor, Aider, GitHub Copilot, and more), used across tens of thousands of open-source projects. So if your team runs a mix of tools, one AGENTS.md guides all of them, while CLAUDE.md guides Claude Code specifically.
Both also support nested files: a per-directory instructions file deeper in the tree that applies when the agent works there. The merge rule is the same intuition in both — the more specific, more local file wins. Codex documents this precisely: it builds an instruction chain from the git root downward, concatenating files so the closest one lands last and takes precedence, and it also reads a global file at ~/.codex/AGENTS.md for machine-wide defaults. Claude Code walks the directory tree the same way and adds a user-level ~/.claude/CLAUDE.md.
| Dimension | Claude Code | Codex |
|---|---|---|
| File name | CLAUDE.md | AGENTS.md |
| Format | Anthropic convention, Claude Code-specific | Open standard, read by many agents |
| Global / user-level file | ~/.claude/CLAUDE.md | ~/.codex/AGENTS.md |
| Nested files | Walks the directory tree; most-specific wins | Chain from git root down; closest file wins |
| Size discipline | Target under ~200 lines; bloat reduces adherence | Capped by project_doc_max_bytes (32 KiB default) |
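The "closest file wins" rule can be sketched in a few lines of Python. This is a simplified model of the behavior described above, not Codex's or Claude Code's actual implementation: the function names `instruction_chain` and `merged_instructions` are hypothetical, and the sketch ignores the global user-level file and the size cap.

```python
from pathlib import Path


def instruction_chain(git_root: Path, cwd: Path, name: str = "AGENTS.md") -> list[Path]:
    """Return instruction files from git_root down to cwd, closest file last."""
    git_root, cwd = git_root.resolve(), cwd.resolve()
    rel = cwd.relative_to(git_root)  # raises ValueError if cwd is outside the repo
    dirs = [git_root]
    for part in rel.parts:
        dirs.append(dirs[-1] / part)
    # Keep only directories that actually contain an instructions file.
    return [d / name for d in dirs if (d / name).is_file()]


def merged_instructions(git_root: Path, cwd: Path, name: str = "AGENTS.md") -> str:
    """Concatenate the chain in order, so the most local file lands last."""
    chain = instruction_chain(git_root, cwd, name)
    return "\n\n".join(p.read_text(encoding="utf-8") for p in chain)
```

Because the most local file is appended last, an instruction in a nested directory's file reads as the final word on any topic the root file also covers.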
Extensibility: where the two philosophies show
This is where the tools stop rhyming and start differing in flavor. Claude Code exposes a broad, named toolkit of extension points, each with its own file convention. Skills are reusable procedures: a SKILL.md file under .claude/skills/ whose body loads only when invoked (via /skill-name) or when Claude decides it is relevant — they follow the open Agent Skills standard. Hooks are deterministic shell commands, HTTP calls, or prompts wired to lifecycle events like PreToolUse, PostToolUse, and Stop, configured in settings.json; they enforce behavior the way instructions cannot — block a write, run lint before a commit, refuse a dangerous command. Subagents are specialized workers defined as markdown-with-frontmatter under .claude/agents/, each running in its own context window with its own tool access and even its own model, so you can route cheap tasks to Haiku and keep your main context clean.
Codex covers much of the same ground, and its surface has grown; the "Codex is the leaner one" line you may have read in older comparisons is no longer accurate. It speaks MCP (the Model Context Protocol) for connecting external tools and data, exactly as Claude Code does. That is the one extension mechanism both share natively and identically. For automation, Codex leans on codex exec, a non-interactive mode that runs a single task without the conversational loop, which fills the niche that Claude Code's hooks and headless -p mode occupy. But Codex now also ships its own subagents (a primary agent spawns child agents, bounded by max_threads and max_depth under [agents] in config), an official SDK for TypeScript (@openai/codex-sdk) and Python (openai-codex) so you can drive the agent from your own process, hosted cloud tasks that run in isolated containers and return a pull request, and an official GitHub Action (openai/codex-action@v1) for CI. The practical read in 2026: both tools now field a broad extension surface, and they differ in shape. Claude Code names more distinct primitives (skills, hooks, subagents) as repo files, while Codex concentrates its surface around codex exec, the SDK, and the cloud. Neither is the "small core" anymore.
| Extension point | Claude Code | Codex |
|---|---|---|
| Reusable procedures | Skills (SKILL.md under .claude/skills/) | Prompt files + codex exec recipes |
| Lifecycle enforcement | Hooks at PreToolUse / PostToolUse / Stop | codex exec in a wrapper script or the GitHub Action |
| Parallel subagents | Subagents under .claude/agents/, own context + model | Subagents via [agents] (max_threads 6, max_depth 1) |
| Programmatic SDK | Headless -p / TS Agent SDK | @openai/codex-sdk (TS) + openai-codex (Python) |
| Cloud / background tasks | Claude Code on the web, long-running jobs | Cloud tasks in isolated containers → a PR |
| CI integration | GitHub Action + headless CLI | Official openai/codex-action@v1 GitHub Action |
| External tools / data | MCP (native) | MCP (native, identical protocol) |
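To make "hooks enforce behavior the way instructions cannot" concrete, here is a minimal sketch of the decision logic a PreToolUse hook script might carry. It assumes the hook receives a JSON event on stdin with `tool_name` and `tool_input` fields, and that exit code 2 signals "block this call"; treat both as assumptions to confirm against the hooks reference. The blocked patterns are a hypothetical policy, not a recommendation.

```python
import json
import sys

# Hypothetical policy: substrings that should never appear in a shell command.
BLOCKED_SUBSTRINGS = ("rm -rf /", "git push --force")


def decide(event: dict) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event; 2 means block (assumed)."""
    if event.get("tool_name") != "Bash":
        return 0, ""  # only shell commands are inspected in this sketch
    command = event.get("tool_input", {}).get("command", "")
    for needle in BLOCKED_SUBSTRINGS:
        if needle in command:
            return 2, f"Blocked by policy: command contains {needle!r}"
    return 0, ""


def main() -> int:
    """Read one event from stdin, print any message to stderr, return the code."""
    code, message = decide(json.load(sys.stdin))
    if message:
        print(message, file=sys.stderr)
    return code

# A real hook script would end with: sys.exit(main())
```

Keeping the policy in a pure `decide` function means it can be unit-tested without an agent in the loop, which matters for a rule you are relying on as a guardrail.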
Claude Code vs Codex, feature by feature
| Feature | Claude Code | Codex |
|---|---|---|
| Instructions file | CLAUDE.md at the repo root (plus ~/.claude/CLAUDE.md and auto memory) | AGENTS.md in the repo, with config.toml for settings |
| Reusable extensions | Plugins, skills and marketplaces, sharable across a team | Custom prompts and config profiles |
| Subagents | Markdown agents in .claude/agents/, each with its own context, tools and model | Supported for delegating focused subtasks |
| Hooks | Shell hooks on lifecycle events, wired up in settings.json | Hooks for workflow automation |
| MCP | MCP servers; tool definitions deferred until first used | MCP servers for third-party tools and context |
| Surfaces | Terminal, VS Code + JetBrains, Desktop, Web, GitHub Actions, Agent SDK | CLI, IDE extension, Cloud/Web, GitHub Action, Codex SDK, Chrome |
| Models | Opus 4.8, Sonnet 4.6, Haiku 4.5, Fable 5 | GPT-5.5, GPT-5.4, GPT-5.4 mini |
| Plans and billing | Claude Pro / Max / Team / Enterprise, or Anthropic API tokens | ChatGPT Plus / Pro / Business / Edu / Enterprise, or API tokens |
| Best for | Deep, multi-surface workflows tuned through CLAUDE.md, skills and hooks | Teams already living in ChatGPT and the OpenAI stack |
Verified against code.claude.com/docs and developers.openai.com/codex — current as of 2026. Both tools ship fast; check the docs for the latest.
Autonomy surfaces: how each tool lets you off the leash
Both tools let you dial how much the agent can do without asking — and both default to a careful, ask-first posture. The difference is the shape of the controls.
Codex exposes two explicit, orthogonal knobs in ~/.codex/config.toml. sandbox_mode decides what the agent is capable of touching (read-only, workspace-write, or danger-full-access), and approval_policy decides when it must stop and ask (untrusted, on-request, or never). The sane local default is workspace-write plus on-request: edit inside the repo freely, pause before anything riskier. It is a clean mental model — capability and permission are separate dials you set independently.
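The two-dial model is small enough to capture in a dataclass. The sketch below validates a pair of settings against the values listed above and renders the two lines you would place in config.toml. The `Autonomy` class and its `to_toml` method are hypothetical helpers for illustration; only the key names and allowed values come from the text.

```python
from dataclasses import dataclass

SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")
APPROVAL_POLICIES = ("untrusted", "on-request", "never")


@dataclass(frozen=True)
class Autonomy:
    """One setting per dial: what the agent can touch, and when it must ask."""
    sandbox_mode: str = "workspace-write"
    approval_policy: str = "on-request"

    def __post_init__(self) -> None:
        if self.sandbox_mode not in SANDBOX_MODES:
            raise ValueError(f"unknown sandbox_mode: {self.sandbox_mode!r}")
        if self.approval_policy not in APPROVAL_POLICIES:
            raise ValueError(f"unknown approval_policy: {self.approval_policy!r}")

    def to_toml(self) -> str:
        """Render the two keys as TOML lines."""
        return (
            f'sandbox_mode = "{self.sandbox_mode}"\n'
            f'approval_policy = "{self.approval_policy}"\n'
        )
```

Calling `Autonomy().to_toml()` yields the workspace-write plus on-request default. Because the dials are independent, every one of the nine combinations is a valid configuration, which is what "orthogonal" means in practice.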
Claude Code expresses autonomy through permission rules (allow/ask/deny lists for tools and commands) layered with hooks that can hard-block actions programmatically, plus interactive permission prompts and a plan-first mode. The philosophy is the same — least privilege by default, widen deliberately — but the surface is rules-and-hooks rather than two enum keys. If you like a single config file with two obvious dials, Codex's model is satisfying. If you want fine-grained, per-command rules and the ability to enforce policy with a script, Claude Code's model gives you more reach.
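A rules-based model needs a precedence order when several lists match the same command. The sketch below assumes deny is checked before ask, and ask before allow, with "ask" as the fallback when nothing matches. The shell-style glob patterns are illustrative only and are not Claude Code's real rule syntax; `evaluate` is a hypothetical name.

```python
from fnmatch import fnmatchcase


def evaluate(command: str, deny: list[str], ask: list[str], allow: list[str]) -> str:
    """Return 'deny', 'ask', or 'allow' for a command; the first matching list wins."""
    for verdict, patterns in (("deny", deny), ("ask", ask), ("allow", allow)):
        if any(fnmatchcase(command, pattern) for pattern in patterns):
            return verdict
    return "ask"  # least privilege: unknown commands require confirmation
```

With `deny=["git push*"]` and `allow=["git *"]`, a `git status` call is allowed while `git push origin main` is denied even though the broader allow pattern also matches it. That is the fine-grained reach the two-enum model does not offer.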
Autonomy controls, side by side
Codex
Two orthogonal enum keys in ~/.codex/config.toml:
sandbox_mode = read-only | workspace-write | danger-full-access — what it can touch.
approval_policy = untrusted | on-request | never — when it must ask.
Sane local default: workspace-write + on-request. Clean, two-dial mental model.
Claude Code
Permission rules (allow / ask / deny lists for tools and commands) in settings.json, layered with hooks that can programmatically block an action at PreToolUse.
Plus interactive prompts and a plan-first mode.
More moving parts, but finer-grained — you can deny one exact command or enforce policy with a script.
The models, and why they are the real choice
Strip away the tooling and the decision often comes down to: which model do you want writing your code? Claude Code runs Anthropic's lineup — Fable 5 for the hardest work, Opus 4.8 and Sonnet 4.6 as strong all-rounders, Haiku 4.5 for fast, cheap tasks. Codex runs OpenAI's — GPT-5.5 as the recommended default for complex coding, with gpt-5.4-mini as the faster, cheaper option for lighter tasks or subagents. Both tools let you switch models mid-session (/model) and pin a default in config.
I will not tell you one model family is universally better, because that is genuinely workload-dependent and it changes with every release. The honest advice is empirical: if you have a paid plan for both, run the same real task through each — a bug you understand, a refactor you can verify — and judge the diffs, not the marketing. Most teams find each model has a personality, and the right question is which personality fits your codebase and your reviewing style, not which benchmark leads this month.
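One way to run that experiment is to drive both non-interactive modes from a small script. The sketch below builds the two invocations mentioned in this guide, `claude -p` and `codex exec`, with no extra flags. The helper names are hypothetical, and the assumption that each CLI prints its result to stdout should be checked against your installed versions.

```python
import subprocess


def commands_for(prompt: str) -> dict[str, list[str]]:
    """Build the non-interactive invocation for each tool (no extra flags assumed)."""
    return {
        "claude": ["claude", "-p", prompt],
        "codex": ["codex", "exec", prompt],
    }


def run_tool(name: str, prompt: str, cwd: str) -> str:
    """Run one tool in its own checkout and return whatever it printed to stdout."""
    argv = commands_for(prompt)[name]
    proc = subprocess.run(argv, cwd=cwd, capture_output=True, text=True, check=False)
    return proc.stdout
```

Give each tool its own checkout of the same commit (for example, two git worktrees) so neither sees the other's edits, then compare the resulting diffs side by side.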
Pricing philosophy
The two tools share a billing shape but differ in the details. Both bundle the agent into the company's paid subscription, and both offer an API-key path that bills per token for programmatic and CI use. Codex is included across ChatGPT plans (Plus, Pro, Business, Edu, Enterprise, and lighter tiers), metered with rolling usage windows; Claude Code runs on a Claude subscription or an Anthropic API key. The decision rule is refreshingly simple: if your team already pays for ChatGPT, Codex adds no new bill; if you already pay for Claude, Claude Code adds none. For automated, headless, or high-volume use, both point you to usage-based API-key billing rather than a flat seat.
You do not actually have to choose
Framing this as a war is mostly marketing. The two files, CLAUDE.md and AGENTS.md, do not conflict, and nothing stops a repo from carrying both. Plenty of teams keep both agents installed and reach for whichever model suits the task, or run one as a second opinion on the other's diff. Here is the concrete setup.
Run both in one repo
Write AGENTS.md as the shared source of truth
Put the build/test/lint commands, setup gotchas, and hard constraints in AGENTS.md at the repo root. Because it is the open standard, Codex and many other agents read it directly, so it is the broadest-reach file to maintain.
Point CLAUDE.md at it
Keep a short CLAUDE.md that imports the shared file with @AGENTS.md so Claude Code reads the same instructions, then add only the few Claude-specific notes on top (a skill to prefer, a hook to respect). One source of truth, no drift between the two files.
Set each tool's autonomy to the same posture
Give Codex sandbox_mode = "workspace-write" and approval_policy = "on-request" in ~/.codex/config.toml; give Claude Code an equivalent ask-before-risky permission setup. Matching postures means the agents behave consistently no matter which one you grab.
Use them for what each does best
Run the model you trust for the task at hand, and use the other as a reviewer: have one implement and the other critique the diff. Two independent agents catch different mistakes; the second opinion is the whole point of keeping both.
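The first two steps are easy to let drift, so a small check can guard them in CI or a pre-commit script. This sketch only verifies that both files exist at the repo root and that CLAUDE.md mentions the `@AGENTS.md` import described above. The function name is hypothetical, and the substring test is deliberately naive.

```python
from pathlib import Path


def shared_instruction_problems(repo: Path) -> list[str]:
    """Return a list of problems with the shared-instructions setup (empty if fine)."""
    problems = []
    agents = repo / "AGENTS.md"
    claude = repo / "CLAUDE.md"
    if not agents.is_file():
        problems.append("AGENTS.md is missing")
    if not claude.is_file():
        problems.append("CLAUDE.md is missing")
    elif "@AGENTS.md" not in claude.read_text(encoding="utf-8"):
        # Naive check: any occurrence of the import token counts.
        problems.append("CLAUDE.md does not import @AGENTS.md")
    return problems
```

An empty list means the single-source-of-truth arrangement is intact; anything else is a message you can print and fail the build on.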
They call each other now
The "second opinion" pattern is no longer a manual copy-paste chore: each agent can now invoke the other as a subagent and hand back the result. Claude Code can delegate a task to Codex and read its diff; Codex can do the reverse. That turns "run both" from a discipline you maintain into a single command. The two tandem guides walk through the exact wiring in both directions: calling Codex from Claude Code and calling Claude Code from Codex. The setup is symmetric: pick whichever agent you live in as the driver, and reach for the other when you want an independent pass on the same problem.
Knowledge check
You want one set of project instructions that both Codex and Claude Code follow, with no second copy to keep in sync. What is the cleanest setup?