The Observatory · 9 min mission

Choose Your Coding Agent: A Cross-Tool Decision Guide

Route each task to the coding agent whose cost and review model actually fit — across CLI and IDE tools.

Tags: decision-guide, cli-agents, ide-agents, tool-comparison, observatory · Fact-checked 2026-06-15

Eight coding agents, one decision: which surface to run a given task on. This guide gives the install string, surface, cost model, and pick/avoid criteria for each, plus a wizard that walks the six decision axes to a recommendation. All facts are current as of 2026-06-15.

The decision: pick by loop length

Every tool here runs the same agentic loop — describe an outcome, the agent reads the repo, edits files, runs commands, reads errors, iterates. Choosing is not about capability; it is about loop length:

Short, exploratory, review-heavy work where you watch each diff land → an IDE agent (inner loop).
Long, well-specified, parallel, or unattended work where you review a finished diff or PR → a CLI or cloud agent (outer loop).

Most setups keep one of each. The wizard below branches on six axes in priority order.

The six decision axes (the wizard's branch order)

Loop length / review posture
Watch every hunk land (IDE) vs review a finished diff or PR (CLI/cloud). The single biggest signal.
Where you already live
Daily in VS Code → Copilot or Cline drop in as extensions, no new editor. Want an agent-first workspace → Cursor or Devin Desktop. Live in the terminal → a CLI (Claude Code, Codex CLI).
Who you already pay
Pay for ChatGPT → Codex CLI adds no bill. Pay for Claude → Claude Code adds none. Want zero markup + full provider choice → Cline (BYO key). Want truly free → Copilot Free or Cline + a cheap/local model.
Unattended / parallel needs
PR opened from an issue while you sleep → Copilot cloud agent or Cursor Cloud Agents. Headless CI / fan-out across worktrees → claude -p, codex exec, or Cursor parallel agents.
Openness / data control
Open-source + BYO key + local models → Cline or Zoo Code. Enterprise governance (SSO, pooled credits, audit) → Copilot Business/Enterprise, Cursor Teams/Enterprise, Devin Enterprise, or Claude Code enterprise.
OS constraint
Native Windows: Claude Code's sandbox requires WSL2 (no native-Windows sandbox). IDE agents and Codex CLI run on Windows directly.
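
The priority order above can be read as a first-match routing rule. The sketch below is a simplified illustration under stated assumptions, not the wizard's actual logic: the `Task` fields and the `recommend` function are hypothetical names, and only three of the six axes (loop length, who you pay, unattended needs) are modeled.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """Hypothetical answers to three of the axes above."""
    watch_each_diff: bool      # axis 1: loop length / review posture
    pays_for: str = ""         # axis 3: "chatgpt", "claude", or "" for neither
    unattended: bool = False   # axis 4: PR opened from an issue while you sleep


def recommend(task: Task) -> str:
    """Check the axes in priority order and return the first match."""
    # Axis 1: watching every hunk land points at the inner loop.
    if task.watch_each_diff:
        return "IDE agent"
    # Axis 3: an existing subscription decides which CLI adds no bill.
    if task.pays_for == "chatgpt":
        return "Codex CLI"
    if task.pays_for == "claude":
        return "Claude Code"
    # Axis 4: unattended issue-to-PR work points at a cloud agent.
    if task.unattended:
        return "cloud agent"
    # Fallback for outer-loop work with no other signal.
    return "CLI agent"
```

Because the checks run top to bottom, an earlier axis always wins over a later one, which mirrors the "priority order" framing. A fuller version would also branch on editor, openness, and OS.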

Find your tool (or your pair)

Which coding agent fits you?

Answer six honest questions — surface, autonomy, budget, who you already pay, openness, and models — and land on a specific recommendation with a runner-up. Facts as of 2026-06-15.

Walk the six axes — loop length, where you live, who you pay, unattended needs, openness, OS — and land on a recommended tool or pair with a one-line rationale and the matching cost note. It surfaces the "most people run two" outcome, and flags the stale paths (Gemini → Antigravity, Windsurf → Devin, Roo → Zoo) the moment you pick one.

Tool	Surface	Vendor / license	Current state
Claude Code	CLI + IDE + desktop + web + Slack + Actions	Anthropic (proprietary)	Rolling auto-update; no pinned version
OpenAI Codex CLI	CLI + IDE + cloud + ACP	OpenAI (Apache-2.0)	v0.139.0 (npm `latest`); sandboxed by default
Gemini CLI	CLI	Google (Apache-2.0)	v0.46.0; free individual tier ends 2026-06-18 → Antigravity CLI
Cursor	IDE (VS Code fork, agent-first)	Anysphere (proprietary)	v3.7; standalone app since 3.0 (not an extension)
GitHub Copilot	IDE + cloud agent (Actions)	GitHub / Microsoft (proprietary)	Agent mode + cloud agent GA; usage-based AI Credits
Windsurf → Devin Desktop	IDE (VS Code fork) + Devin Cloud + ACP	Cognition (proprietary)	Rebranded 2026-06-02; Cascade EOL 2026-07-01
Cline	VS Code/JetBrains ext + CLI + Kanban	Cline Bot Inc. (Apache-2.0)	Ext v3.89.2; CLI v3.0.24; ~4.3M installs; BYO key
Roo Code → Zoo Code	VS Code ext	community	Roo archived 2026-05-15; Zoo Code v3.60.0 is the live fork

The eight tools in scope, current state on 2026-06-15. Claude Code auto-updates (no single pinned version); "→" marks a tool mid-rebrand or retired.
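
The three "→" rows can be kept as a small lookup so that scripts or onboarding docs warn on a stale tool name. This is a minimal sketch: the dates and successor names come from the table above, while `STALE_PATHS` and `stale_notice` are hypothetical names introduced for illustration.

```python
from datetime import date
from typing import Optional

# Old name -> (successor, cutover date), taken from the table above.
STALE_PATHS = {
    "Gemini CLI": ("Antigravity CLI", date(2026, 6, 18)),  # free individual tier ends
    "Windsurf": ("Devin Desktop", date(2026, 6, 2)),       # rebrand date
    "Roo Code": ("Zoo Code", date(2026, 5, 15)),           # Roo archived
}


def stale_notice(tool: str, today: date) -> Optional[str]:
    """Return a one-line notice if `tool` is on a stale path, else None."""
    entry = STALE_PATHS.get(tool)
    if entry is None:
        return None
    successor, cutover = entry
    status = "past" if today >= cutover else "upcoming"
    return f"{tool}: cutover {cutover.isoformat()} ({status}), successor is {successor}"
```

Note that the Gemini CLI date applies to the free individual tier only; paid license and API-key holders are described elsewhere in this guide as retaining access.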

Install / entry per tool (verified strings)

# CLI agents
curl -fsSL https://claude.ai/install.sh | bash   # Claude Code (rolling auto-update)
npm install -g @openai/codex                      # Codex CLI v0.139.0 (npm latest)
npm install -g @google/gemini-cli                 # Gemini CLI v0.46.0 (free tier ends 2026-06-18)
 
# IDE agents
#   Cursor          download from cursor.com          (standalone app since 3.0)
#   GitHub Copilot  VS Code / JetBrains extension
#   Devin Desktop   download from devin.ai/desktop    (ex-Windsurf)
#   Cline           VS Code Marketplace: saoudrizwan.claude-dev   (historical ID, NOT an Anthropic product)
#   Zoo Code        VS Code Marketplace (community fork of Roo Code)

IDE agents vs CLI / cloud agents — the loop-length split

IDE agents (inner loop)

Cursor, GitHub Copilot, Devin Desktop, Cline, Zoo Code.

Watch each diff land, Tab-complete, ask a prebuilt index "where is auth handled?" Best for short, exploratory, review-heavy work in your current editor.

Unmetered surface = autocomplete: Cursor Tab and Copilot completions are unlimited on paid plans. If most of your value is autocomplete, an IDE subscription is the floor.

CLI / cloud agents (outer loop)

Claude Code, Codex CLI, Gemini CLI; plus cloud agents.

Hand off a goal, review a finished diff or PR. Best for long, well-specified, parallel, or unattended work — headless CI (claude -p, codex exec), fan-out across worktrees, a PR opened from an issue.

Expensive surface = the agent loop: it re-reads context every turn and is token-metered, so every turn carries a real cost, whereas a Tab completion on a paid plan adds $0.

Tool	Free tier	Cheapest entry	Headless / control	Instruction file
Claude Code	None	Claude Pro $20 / Max 5× $100 / Max 20× $200, or API	`claude -p --output-format json`; `--bare` for CI; per-cmd allow/ask/deny + `PreToolUse`/`PostToolUse`/`Stop` hooks	`CLAUDE.md`
Codex CLI	None	ChatGPT Plus $20 (Codex included on Plus/Pro/Business/Edu/Enterprise), or API	`codex exec --json` / `--output-schema`; `openai/codex-action@v1`; `@openai/codex-sdk`	`AGENTS.md`
Gemini CLI	Yes — ends 2026-06-18 for individuals	After cutover: paid Gemini API key or Code Assist Std/Ent	Opt-in `--sandbox` (Docker/Podman/Seatbelt/gVisor/LXC); config in `~/.gemini/settings.json`	`GEMINI.md`

CLI agents — free tier, entry cost, control surface, and the exact flags/config keys. Pick by who you already pay and how unattended you go.
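
The headless entry points in the table can be wrapped in a small helper for CI scripts. The sketch below only builds the argument list and does not run anything. The flags are the ones listed in the table; passing the prompt as the final positional argument is an assumption, and `headless_argv` is a hypothetical helper name.

```python
from typing import List


def headless_argv(tool: str, prompt: str) -> List[str]:
    """Build (but do not run) a headless command line for a CLI agent."""
    if tool == "claude":
        # `claude -p --output-format json`, per the table above.
        return ["claude", "-p", "--output-format", "json", prompt]
    if tool == "codex":
        # `codex exec --json`, per the table above.
        return ["codex", "exec", "--json", prompt]
    # No headless invocation is listed for other tools in this guide.
    raise ValueError(f"no headless form listed for {tool!r}")
```

The returned list could be handed to `subprocess.run(argv, capture_output=True, text=True)`. Keeping it as a list rather than a shell string avoids quoting problems when the prompt contains spaces or quotes.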

Codex CLI's two-dial trust model

Codex CLI ships sandboxed by default (default = workspace-write + on-request). Two independent enum keys in ~/.codex/config.toml control it: sandbox_mode (what it can touch) and approval_policy (when it must ask).

Key	Values	Effect
`sandbox_mode`	`read-only` / `workspace-write` / `danger-full-access`	What the agent can touch on disk/network
`approval_policy`	`untrusted` / `on-request` / `never`	When the agent must pause and ask

Codex CLI sandbox + approval enums (`~/.codex/config.toml`). Default is `workspace-write` + `on-request`. `--full-auto` is deprecated (use `--sandbox workspace-write`); `--yolo` / `--dangerously-bypass-approvals-and-sandbox` removes all guardrails.
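
Because the two dials are independent enums, a config generator can validate each one separately. The sketch below checks both values against the table and returns the two TOML lines that would go in `~/.codex/config.toml`. The key names and values are from the table; `render_codex_config` is a hypothetical helper, and the function deliberately returns text instead of writing the file.

```python
SANDBOX_MODES = ("read-only", "workspace-write", "danger-full-access")
APPROVAL_POLICIES = ("untrusted", "on-request", "never")


def render_codex_config(sandbox_mode: str = "workspace-write",
                        approval_policy: str = "on-request") -> str:
    """Validate the two dials and return the matching TOML lines."""
    if sandbox_mode not in SANDBOX_MODES:
        raise ValueError(f"unknown sandbox_mode: {sandbox_mode!r}")
    if approval_policy not in APPROVAL_POLICIES:
        raise ValueError(f"unknown approval_policy: {approval_policy!r}")
    # The defaults above reproduce the documented default pairing.
    return (
        f'sandbox_mode = "{sandbox_mode}"\n'
        f'approval_policy = "{approval_policy}"\n'
    )
```

Calling `render_codex_config("read-only", "untrusted")` would give the most conservative pairing in the table, while the no-argument call reproduces the default.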

Tool	Free tier	Paid entry	Unmetered surface	Rules / config
Cursor	Hobby (free)	Pro $20, Pro+ $60, Ultra $200, Teams Std $40/user, Teams Premium $120/user	Tab (multi-line, cross-file, jump-in-file) — unlimited on paid	`.cursor/rules/*.mdc` + `AGENTS.md`; Max Mode bills at full API rates
GitHub Copilot	Free ($0)	Pro $10, Pro+ $39, Max $100, Business $19/seat, Enterprise $39/seat	Code completions — unlimited on paid plans	Custom instructions; AI Credits $0.01 (chat/agent/cloud all metered)
Devin Desktop	Free ($0)	Pro $20, Max $200, Teams $40/seat, Enterprise (ACUs)	Tab + inline edits — unlimited even on Free	`.devin/rules/*.md`; ACP runs Codex/Claude/Gemini as agents
Cline	Free (open source)	BYO key (pay provider directly); Enterprise custom	None (agent-only, no completion surface)	`.clinerules`; Plan/Act modes, per-mode models; MCP
Zoo Code	Free (open source)	BYO key; optional Zoo Gateway credits	None (agent-only)	`.roomodes` / `.roo/`; role modes + per-mode MCP allowlists

IDE agents — free tier, paid entry, unmetered surface, and the rules/config key each reads. Pick by where you live and your governance needs.

Per-tool pick / avoid

One row per tool: the conditions that make it the right pick, and the conditions that rule it out.

Tool	Pick when	Avoid when
Claude Code (Anthropic)	One agent across terminal/IDE/desktop/web/Slack/CI sharing one `CLAUDE.md`; named extension primitives (Skills at `.claude/skills/<name>/SKILL.md`, Subagents at `.claude/agents/`, Agent Teams); scriptable hooks that enforce policy; strong sandbox (Seatbelt on macOS, bubblewrap on Linux/WSL2, deny-by-default network proxy); you already pay Pro/Max or per-token.	You need a free tier (none); native Windows + sandbox (WSL2 required); whole value is Tab autocomplete; you want zero-markup multi-provider choice.
Codex CLI (OpenAI, Apache-2.0)	You already pay for ChatGPT; you want the clean two-dial trust model; safety-on-by-default with zero config; the portable `AGENTS.md`; headless/CI via `codex exec`, cloud tasks, `openai/codex-action@v1`, or the SDK.	You need a $0 tier (cheapest is ChatGPT Plus $20); you want the broadest named file-based extension surface (Claude Code's hooks + permission rules reach further than two enum keys).
Gemini CLI (Google, Apache-2.0)	You hold a paid Gemini Code Assist Standard/Enterprise license or paid Gemini API key (retain access past the cutover); or you specifically need the Gemini 3 family (1M-token input on `gemini-3.1-pro-preview`).	You are an individual on the free tier (ends 2026-06-18); you are starting fresh today — install Antigravity CLI instead.
Cursor (Anysphere)	Agent-first workspace; best-in-class Tab; Plan mode; parallel agents across worktrees/cloud/SSH; prebuilt semantic index for codebase Q&A; pointing one editor at any frontier model (Claude/GPT/Gemini/Grok/Kimi) or in-house Composer 2.5.	You want to stay in your current editor (standalone app since 3.0); you want open-source/BYO-key; you are cost-sensitive about agent loops (pool can exhaust; Max Mode bills at full API rates).
GitHub Copilot (GitHub/Microsoft)	You live in VS Code/JetBrains/Visual Studio; want a genuine $0 start; want issue → PR automation via the cloud (coding) agent (distinct from synchronous in-editor agent mode); a team/enterprise wanting pooled AI Credits + SSO; the cheapest paid agent tier (Pro $10/mo).	You expect the cloud agent to be unconstrained (hard 59-minute cap, one branch / one PR per task, cannot change multiple repos, ignores content exclusions); you assume agent usage is free because completions are.
Devin Desktop (Cognition, ex-Windsurf)	An Agent Command Center orchestrating local + cloud agents from one Kanban surface; built-in ACP (run Codex CLI, Claude Agent, Gemini CLI as first-class agents); Windsurf's real-time awareness + `.devin/rules/*.md`; unlimited Tab on Free ($0).	You are following any "Windsurf"/"Cascade" tutorial (brand gone, Cascade EOL 2026-07-01); you want stable docs/versions (mid-migration); you want open-source/BYO-key.
Cline (Cline Bot Inc., Apache-2.0)	Most-installed open-source agent (~4.3M). Total provider choice with no markup (Anthropic, OpenAI, Gemini, Bedrock, Vertex, Azure, OpenRouter's 200+ models, or local via Ollama/LM Studio — pay the provider directly); data control/residency; a minimal Plan/Act agent with different models per mode; MCP-first extensibility.	You want a flat all-in price (you pay per-token inference); you are a beginner who'd be surprised by a provider bill; you want role-based modes (that is Zoo's design); you want autocomplete (Cline is agent-only).
Zoo Code (community, ex-Roo Code)	Multi-persona role modes (Code / Ask / Architect / Debug / Orchestrator + Custom Modes with scoped tool/file permissions); Orchestrator delegating subtasks; open-source BYO-key MCP with per-mode MCP allowlists (v3.60.0).	Never install Roo Code — shut down 2026-05-15, repo read-only, Cloud/Router offline. Install Zoo Code (or Cline) instead.

Pick-when / avoid-when per tool. "Pick" = the conditions it is the right answer for; "Avoid" = the disqualifiers. Vendor/license in the first column.

Knowledge check

Your team lives in VS Code on GitHub and writes well-specified tasks. You want to assign a GitHub Issue and have a PR opened on a branch while you sleep. You also need a $0 starting point. Which fits best?

The two-tool setup, sharing one AGENTS.md

Inner-loop IDE agent + outer-loop CLI agent reading one open AGENTS.md, so neither surface starts blind.

Convergence: what is shared, what still differs

Shared, so switching is cheap:

Instruction file — AGENTS.md is read by Codex, Cursor, GitHub Copilot, and Gemini CLI. Copilot also reads CLAUDE.md/GEMINI.md; Claude Code can @AGENTS.md-import it.
Models — the same frontier models (Claude Opus 4.x / Fable 5, GPT-5.x, Gemini 3.x) back both IDE agents and CLIs; in-house models (Cursor Composer, Devin SWE-1.x) are the differentiator.
Protocol — the Agent Client Protocol (ACP) lets a CLI agent (Codex, Claude Agent, Gemini CLI) run inside an editor (Devin Desktop, JetBrains, Zed) as a first-class agent.
Cloud handoff — Cursor Cloud Agents and Copilot's cloud agent both hand an in-IDE task to an async worker that opens a PR.

Still differs sharply — cost models, review postures, sandbox philosophies, and extension surfaces. Those are the axes the wizard branches on. Keep instructions in AGENTS.md so switching tools is cheap, and re-evaluate when the perishable facts above expire.
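
One way to keep a single source of instructions is to make `CLAUDE.md` a one-line import of `AGENTS.md`, using the `@AGENTS.md` import mentioned above. The sketch below sets that up without overwriting existing files. `share_instructions` is a hypothetical helper, and the placeholder heading written to a missing `AGENTS.md` is an assumption.

```python
from pathlib import Path


def share_instructions(repo: Path) -> Path:
    """Create a CLAUDE.md that imports AGENTS.md, without overwriting either."""
    agents = repo / "AGENTS.md"
    if not agents.exists():
        # Placeholder content; real project instructions would go here.
        agents.write_text("# Project instructions\n", encoding="utf-8")
    claude = repo / "CLAUDE.md"
    if not claude.exists():
        # The @-import line points Claude Code at the shared file.
        claude.write_text("@AGENTS.md\n", encoding="utf-8")
    return claude
```

Tools that read `AGENTS.md` directly need no extra file, so only the import stub is added.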
