The Bridge · 12 min mission

The MCP Bridge: Codex as a Tool Inside Claude

Wire Codex into a Claude session over MCP, keep stateful threads, and build a dual-agent pipeline.

Tags: tandem, mcp, sdk · Fact-checked 2026-06-13

You already know the two CLIs as rivals: Claude Code in one terminal, Codex in another, you playing referee. This guide deletes the referee. We wire Codex into a Claude session as a callable tool, so Claude can hand a sub-task to GPT-5-class reasoning, read the answer, and keep going — one loop, one context, no copy-paste.

The trick is that there is no trick. MCP is the neutral protocol that makes this work, and it is the exact same standard covered in the Claude Code MCP guide (claude mcp add, scopes, .mcp.json, tool search). Read that first if "MCP server" still feels fuzzy — here we assume it. The one new idea: the server you are connecting is Codex itself.

We build in three stages. Stage 1 registers Codex as an MCP server so codex() shows up as a tool in your Claude session. Stage 2 keeps that Codex thread stateful across multiple turns with codex-reply(). Stage 3 drops the interactive CLI entirely and drives both agents from code with the Claude Agent SDK, where "Claude calls Codex" becomes a dual-agent pipeline you can put in CI.

Stage 1 — Codex as an MCP server

Codex can run as an MCP server over stdio. The command is codex mcp-server [V]. It launches a long-lived process that speaks MCP on standard input/output and exposes exactly two tools:

codex [V] — starts a new Codex session against a prompt.
codex-reply [V] — continues an existing Codex session by its thread id.

You register it like any other stdio MCP server. The fastest path is the CLI:

claude mcp add codex -- codex mcp-server

That writes a server named codex into your config (local scope by default, exactly as the MCP guide describes). Once connected, /mcp lists it and the two tools surface to Claude as mcp__codex__codex and mcp__codex__codex-reply — the standard mcp__<server>__<tool> naming. From here Claude can decide, on its own, to call Codex when a task suits it.
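The naming rule is mechanical, so it can be computed rather than typed by hand. A minimal sketch follows; `mcpToolName` is a hypothetical helper for illustration, not part of any SDK.

```typescript
// Builds the name Claude uses for an MCP tool: mcp__<server>__<tool>.
// `mcpToolName` is a hypothetical helper, not an SDK function.
function mcpToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// The server above was registered under the name "codex",
// so its two tools resolve to these fully-qualified names.
const codexTools: string[] = ["codex", "codex-reply"].map((tool) =>
  mcpToolName("codex", tool),
);
```

If you register the server under a different name, the middle segment changes with it, which is why the name you pass to claude mcp add matters later when you list tools explicitly.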

Two ways to register the same server

.mcp.json (project scope — commit to git)

Drop this at your repo root so the whole team gets the bridge on checkout. Project-scoped servers from .mcp.json are approval-gated — Claude Code prompts before first use.

{
  "mcpServers": {
    "codex": {
      "type": "stdio",
      "command": "codex",
      "args": ["mcp-server"]
    }
  }
}
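If the repo already has a .mcp.json, you want to add the Codex entry without clobbering the servers that are there. The sketch below shows one way to do that merge. `withCodexServer` and the `docs` server are made up for illustration, and the types model only stdio entries of the shape shown above, which is a simplification.

```typescript
// One stdio entry in .mcp.json, matching the shape shown above.
interface StdioServer {
  type: "stdio";
  command: string;
  args: string[];
}

interface McpConfig {
  mcpServers: Record<string, StdioServer>;
}

// Hypothetical helper: returns a copy of `config` with the Codex server added.
// Existing servers are preserved and the input object is not mutated.
function withCodexServer(config: McpConfig): McpConfig {
  return {
    mcpServers: {
      ...config.mcpServers,
      codex: { type: "stdio", command: "codex", args: ["mcp-server"] },
    },
  };
}

// Example input: a repo that already registers another (made-up) server.
const existing: McpConfig = {
  mcpServers: {
    docs: { type: "stdio", command: "docs-server", args: [] },
  },
};

// Text you could write back to .mcp.json at the repo root.
const mcpJson: string = JSON.stringify(withCodexServer(existing), null, 2);
```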

~/.claude.json projects map (local scope)

claude mcp add codex -- codex mcp-server records the same server into the per-project entry of ~/.claude.json. The stored shape is identical — an mcpServers object keyed by server name:

{
  "projects": {
    "/path/to/your/project": {
      "mcpServers": {
        "codex": { "type": "stdio", "command": "codex", "args": ["mcp-server"] }
      }
    }
  }
}

After `claude mcp add codex -- codex mcp-server`, the codex tool is just there. Claude calls it like any other tool — no terminal switching.

The codex() parameters that matter

codex() is not a black box — it takes the same safety and routing knobs you set on the Codex CLI, passed as tool arguments. The four you will reach for constantly [V]:

approval-policy — when Codex pauses to ask permission. Accepted values: untrusted, on-request, never.
sandbox — the filesystem/network blast radius. Accepted values: read-only, workspace-write, danger-full-access.
model — override the model for this call (e.g. a smaller, faster model for a triage pass).
cwd — the working directory Codex operates in, resolved relative to the server process.

The sandbox value is the single most important argument in the whole bridge. read-only is the correct default when you want Codex as a reviewer — it can read the tree and reason, but cannot touch a file. workspace-write lets it edit inside the workspace (the implementer role). danger-full-access removes the sandbox entirely; reserve it for throwaway environments. The tool also accepts prompt (required), plus base-instructions, profile, include-plan-tool, and a freeform config object that overrides individual config.toml settings [V].

| `sandbox` value | Codex can… | Use it for | Risk |
| --- | --- | --- | --- |
| `read-only` | Read the tree, reason, report; no writes | Review, audit, triage, second opinion | Low: safe default |
| `workspace-write` | Read and edit files inside the workspace | The implementer role in a pipeline | Medium: scoped to the workspace |
| `danger-full-access` | Anything: sandbox disabled, network open | Throwaway containers, ephemeral CI runners | High: no guardrails |

codex() sandbox values [V]. Pick read-only for review, workspace-write for implementation, and avoid danger-full-access outside disposable sandboxes.
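One way to keep these choices from drifting is to type the arguments and derive the sandbox from the role. The sketch below uses the argument names and accepted values listed above. The `Role` type, `codexArgsFor`, and the default approval policy of "never" are illustrative assumptions, not part of Codex.

```typescript
// Accepted values as listed in this guide.
type ApprovalPolicy = "untrusted" | "on-request" | "never";
type Sandbox = "read-only" | "workspace-write" | "danger-full-access";

// The subset of codex() arguments discussed above.
interface CodexArgs {
  prompt: string;
  "approval-policy"?: ApprovalPolicy;
  sandbox?: Sandbox;
  model?: string;
  cwd?: string;
}

// Hypothetical roles: a reviewer may only read, an implementer may edit the workspace.
type Role = "reviewer" | "implementer";

// Hypothetical helper: picks the sandbox from the role.
// danger-full-access is deliberately unreachable through this function.
function codexArgsFor(
  role: Role,
  prompt: string,
  cwd: string,
  approvalPolicy: ApprovalPolicy = "never", // assumption: hands-off default
): CodexArgs {
  return {
    prompt,
    cwd,
    sandbox: role === "reviewer" ? "read-only" : "workspace-write",
    "approval-policy": approvalPolicy,
  };
}
```

Because the function never returns danger-full-access, anyone who wants it has to build the arguments by hand, which makes the risky choice visible in review.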

Stage 2 — stateful threads with threadId

A single codex() call is one-shot: Claude asks, Codex answers, done. But real delegation is a conversation — "review this," then "now apply the fix you suggested," then "re-check." For that, Codex sessions are stateful, and the state handle is a thread id.

When codex() returns, the response carries structuredContent.threadId [V] — a stable identifier for the Codex session it just spun up. To continue that exact session, Claude calls codex-reply, which requires threadId and a new prompt [V]. The same Codex context — files it read, reasoning it did — is still live, so the follow-up is cheap and coherent instead of starting cold.

The mental model: codex() is "open a new thread and return its id"; codex-reply({ threadId, prompt }) is "speak again into that thread." It is the same shape as the Claude Agent SDK's own session resume, just on the Codex side of the bridge.
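In code, the handoff comes down to pulling one field out of the first result and passing it to the second call. The sketch below models only the structuredContent.threadId field named above. `CodexToolResult`, `extractThreadId`, and `codexReplyArgs` are hypothetical names, and the rest of the tool result is left out on purpose.

```typescript
// Minimal structural view of a codex() tool result:
// only the field this guide relies on is modelled.
interface CodexToolResult {
  structuredContent?: { threadId?: unknown };
}

// Hypothetical helper: returns the thread id or fails loudly,
// so a missing handle is caught before the follow-up call.
function extractThreadId(result: CodexToolResult): string {
  const threadId = result.structuredContent?.threadId;
  if (typeof threadId !== "string" || threadId.length === 0) {
    throw new Error("codex() result carried no structuredContent.threadId");
  }
  return threadId;
}

// Hypothetical helper: builds the arguments for codex-reply,
// which requires both threadId and a new prompt.
function codexReplyArgs(
  result: CodexToolResult,
  prompt: string,
): { threadId: string; prompt: string } {
  return { threadId: extractThreadId(result), prompt };
}
```

Failing early on a missing id is a design choice: a follow-up sent without a valid handle would at best start a cold session and lose the context you were trying to keep.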

codex() returns structuredContent.threadId; codex-reply() reuses it so Codex keeps full context across turns.

The community wrapper: codex-mcp-server

OpenAI's codex mcp-server is the canonical bridge, but the community has built a richer wrapper worth knowing: codex-mcp-server by tuannvm [P]. It installs with one line and exposes more than the two stock tools:

claude mcp add codex-cli -- npx -y codex-mcp-server

Its codex tool takes a slightly different, more ergonomic parameter set [P]: sessionId (its name for the continuation handle), model (e.g. "o3"), reasoningEffort (e.g. "high"), fullAuto (a boolean for hands-off runs), and sandbox (e.g. "workspace-write"). It adds tools beyond codex too — review, websearch, listSessions, ping, and help. The hard requirement: it wraps the Codex CLI, so you need Codex CLI v0.75.0+ installed (npm i -g @openai/codex or brew install codex) [P].

Use the official codex mcp-server when you want the supported, minimal surface. Reach for codex-mcp-server when you want reasoningEffort/fullAuto knobs and the extra review/websearch tools without writing them yourself.

| | Official `codex mcp-server` | Community `codex-mcp-server` |
| --- | --- | --- |
| Launch | `codex mcp-server` | `npx -y codex-mcp-server` |
| Continuation handle | `threadId` (via `structuredContent`) | `sessionId` |
| Tools | `codex`, `codex-reply` | `codex`, `review`, `websearch`, `listSessions`, `ping`, `help` |
| Notable params | `approval-policy`, `sandbox`, `model`, `cwd` | `model`, `reasoningEffort`, `fullAuto`, `sandbox` |
| Requires | Codex CLI installed | Codex CLI v0.75.0+ |
| Support | First-party [V] | Community [P] |

The official server [V] vs the community wrapper [P]. Same bridge concept, different surface area.
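The practical difference shows up when you continue a session: the tool name and the handle's key both change. The sketch below encodes that difference using the server names from the two claude mcp add commands in this guide (codex and codex-cli). `continueSession` is a hypothetical helper, and it assumes the community wrapper's codex tool takes a `prompt` argument alongside `sessionId`.

```typescript
type Bridge = "official" | "community";

interface ToolCall {
  tool: string; // fully-qualified mcp__<server>__<tool> name
  args: Record<string, string>;
}

// Hypothetical helper: builds the follow-up call for either bridge.
function continueSession(bridge: Bridge, handle: string, prompt: string): ToolCall {
  if (bridge === "official") {
    // Official server: a dedicated codex-reply tool keyed by threadId.
    return { tool: "mcp__codex__codex-reply", args: { threadId: handle, prompt } };
  }
  // Community wrapper: the same codex tool, keyed by sessionId.
  // Assumption: it also accepts `prompt`.
  return { tool: "mcp__codex-cli__codex", args: { sessionId: handle, prompt } };
}
```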

Scope the bridge to one subagent

Here is the trap. The moment Codex is connected, codex and codex-reply live in your main session's tool list for every project that loads the server. That is tool clutter you rarely want bleeding into ordinary work — and it puts a second autonomous agent one tool-call away from your primary context.

The clean fix is a Claude Code idiom: scope the server to a single subagent via the mcpServers frontmatter field [V]. A subagent is a Markdown file with YAML frontmatter under .claude/agents/, and its mcpServers field gives that subagent access to MCP servers that need not exist in the main conversation — inline servers are connected when the subagent starts and disconnected when it finishes [V].

---
name: codex-reviewer
description: Delegates a focused, read-only code review to Codex and returns only the findings.
tools: Read, Glob, Grep
mcpServers:
  - codex:
      type: stdio
      command: codex
      args: ["mcp-server"]
---
 
You are a review router. For the task you are given, call the Codex tool with
sandbox "read-only" and approval-policy "never", then return a tight bullet list
of findings. Do not edit files yourself.

Now the Codex bridge exists only inside codex-reviewer. Your main session stays lean, Codex's tools never bloat the primary context, and the second agent is firewalled to one explicit, named delegate.

Stage 3 — the dual-agent pipeline in code

Interactive sessions are stage one of trust. The real payoff is a code-driven pipeline: a script where Claude is the orchestrator and Codex is a tool it calls, end to end, with no human in the loop. The Claude Agent SDK (@anthropic-ai/claude-agent-sdk / claude-agent-sdk) is the harness — the same agent loop and context management that power Claude Code, as a library.

You connect Codex exactly the way the SDK docs connect any MCP server: through the mcpServers option [V]. Then you pre-approve the two Codex tools by their fully-qualified names, mcp__codex__codex and mcp__codex__codex-reply, in allowedTools [V], alongside whichever built-in tools Claude itself needs for the job. Now Claude can call Codex autonomously, and you capture both sides' state: the Claude session_id from the init system message, and Codex's threadId from structuredContent for any codex-reply.

import { query } from "@anthropic-ai/claude-agent-sdk";
 
let sessionId: string | undefined;
 
for await (const message of query({
  prompt:
    "Implement the failing test in tests/checkout.test.ts. " +
    "Before you commit, call Codex (sandbox read-only) for a second-opinion review, " +
    "then address its findings.",
  options: {
    // Codex registered as a plain stdio MCP server inside the SDK
    mcpServers: {
      codex: { command: "codex", args: ["mcp-server"] },
    },
    // Only the two Codex tools are pre-approved, plus the editors Claude needs
    allowedTools: [
      "Read", "Edit", "Bash",
      "mcp__codex__codex",
      "mcp__codex__codex-reply",
    ],
  },
})) {
  if (message.type === "system" && message.subtype === "init") {
    sessionId = message.session_id; // resume Claude later with options.resume
  }
  if ("result" in message) console.log(message.result);
}

This is the whole dual-agent loop: Claude reads, edits, then delegates the review to Codex, reads Codex's findings back through the tool result, and fixes — all inside one query(). Swap the prompt and you have a generate-then-critique, a planner-then-implementer, or a two-model consensus check.
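Since only the prompt and Claude's own tool list change between those variants, the Codex wiring can be factored into one place. A minimal sketch follows, using the mcpServers and allowedTools option names from the example above. `pipelineOptions` and the `PipelineOptions` type are hypothetical and cover only these two fields.

```typescript
// Only the two option fields this guide uses; the real options object has more.
interface PipelineOptions {
  mcpServers: Record<string, { command: string; args: string[] }>;
  allowedTools: string[];
}

// The two Codex tools, by their fully-qualified names.
const CODEX_TOOLS: string[] = ["mcp__codex__codex", "mcp__codex__codex-reply"];

// Hypothetical helper: same Codex bridge every time,
// with a caller-chosen set of built-in tools for Claude.
function pipelineOptions(claudeTools: string[]): PipelineOptions {
  return {
    mcpServers: { codex: { command: "codex", args: ["mcp-server"] } },
    allowedTools: [...claudeTools, ...CODEX_TOOLS],
  };
}

// A review-only pipeline: Claude can read and search, nothing is pre-approved for edits.
const reviewOnly = pipelineOptions(["Read", "Glob", "Grep"]);

// An implement-then-review pipeline, matching the example above.
const implementAndReview = pipelineOptions(["Read", "Edit", "Bash"]);
```

Either object can be passed as the options argument of query(); the prompt then decides what the two agents actually do.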

Standing up the pipeline

1. Install both runtimes. Install the Agent SDK (npm install @anthropic-ai/claude-agent-sdk or pip install claude-agent-sdk) and the Codex CLI (npm i -g @openai/codex). The TypeScript SDK bundles the Claude Code binary, so you do not install Claude Code separately [V].
2. Register Codex as an MCP server. Add mcpServers: { codex: { command: "codex", args: ["mcp-server"] } } to your ClaudeAgentOptions / query options [V]. In Python the key is mcp_servers; in TypeScript it is mcpServers.
3. Allow exactly the Codex tools. Put mcp__codex__codex and mcp__codex__codex-reply in allowedTools (TS) / allowed_tools (Python) [V] so Claude can call Codex without an approval prompt, while everything you omit stays gated.
4. Capture both state handles. Read Claude's session_id from the init system message [V] to resume the orchestrator. Read Codex's structuredContent.threadId [V] from the codex tool result to continue the Codex thread with codex-reply.
5. Track spend. The SDK's terminal ResultMessage reports total_cost_usd [V] for the run. Log it per pipeline invocation. Treat that figure as the Claude side of the bill: a dual-agent loop also burns Codex tokens, which are metered on the Codex side, so watch that budget separately.
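For the spend step, a small accumulator over collected messages is enough to get a per-invocation number into your logs. The sketch below assumes result messages carry type "result" and a numeric total_cost_usd, as named above. `ResultLike` and `claudeCostUsd` are hypothetical, and the sum covers only what the SDK reports.

```typescript
// Structural view of the messages we care about; other fields are ignored.
interface ResultLike {
  type: string;
  total_cost_usd?: number;
}

// Hypothetical helper: sums total_cost_usd across result messages,
// for example from several query() runs in one pipeline invocation.
function claudeCostUsd(messages: ResultLike[]): number {
  let total = 0;
  for (const message of messages) {
    if (message.type === "result" && typeof message.total_cost_usd === "number") {
      total += message.total_cost_usd;
    }
  }
  return total;
}
```

In a real run you would push each message from the for await loop into an array, or add to a running total inside the loop, and log the figure next to the pipeline's run id.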

Where this leaves you

You now have the full ladder. Interactive: claude mcp add codex -- codex mcp-server, and Claude can reach for Codex on its own. Stateful: capture structuredContent.threadId, continue with codex-reply. Scoped: hide the bridge inside a .claude/agents/ subagent via mcpServers frontmatter so it never clutters your main context. Programmatic: the Agent SDK turns "Claude calls Codex" into a dual-agent pipeline with mcpServers, allowedTools, captured session_id + threadId, and total_cost_usd you actually watch.

The mental shift is the whole point: once Codex is just an MCP server, the rivalry dissolves. It is a reasoning tool Claude can call — review, plan, second opinion, consensus — composable in exactly the same socket as a database or a browser. The next guides in this section build pipelines on top of this socket; this is the bridge they all cross.
