The Observatory · 9 min mission

Choose Your Coding Agent: A Cross-Tool Decision Guide

Route each task to the coding agent whose cost and review model actually fit — across CLI and IDE tools.

decision-guide · cli-agents · ide-agents · tool-comparison · observatory · Fact-checked 2026-06-15

Eight coding agents, one decision: which surface to run a given task on. This guide gives the install string, surface, cost model, and pick/avoid criteria for each, plus a wizard that walks the six decision axes to a recommendation. All facts are current as of 2026-06-15.

The decision: pick by loop length

Every tool here runs the same agentic loop — describe an outcome, the agent reads the repo, edits files, runs commands, reads errors, iterates. Choosing is not about capability; it is about loop length:

  • Short, exploratory, review-heavy work where you watch each diff land → an IDE agent (inner loop).
  • Long, well-specified, parallel, or unattended work where you review a finished diff or PR → a CLI or cloud agent (outer loop).

Most setups keep one of each. The wizard below branches on six axes in priority order.

The six decision axes (the wizard's branch order)

  1. Loop length / review posture

    Watch every hunk land (IDE) vs review a finished diff or PR (CLI/cloud). The single biggest signal.

  2. Where you already live

    Daily in VS Code → Copilot or Cline drop in as extensions, no new editor. Want an agent-first workspace → Cursor or Devin Desktop. Live in the terminal → a CLI (Claude Code, Codex CLI).

  3. Who you already pay

    Pay for ChatGPT → Codex CLI adds no bill. Pay for Claude → Claude Code adds none. Want zero markup + full provider choice → Cline (BYO key). Want truly free → Copilot Free or Cline + a cheap/local model.

  4. Unattended / parallel needs

    PR opened from an issue while you sleep → Copilot cloud agent or Cursor Cloud Agents. Headless CI / fan-out across worktrees → claude -p, codex exec, or Cursor parallel agents.

  5. Openness / data control

    Open-source + BYO key + local models → Cline or Zoo Code. Enterprise governance (SSO, pooled credits, audit) → Copilot Business/Enterprise, Cursor Teams/Enterprise, Devin Enterprise, or Claude Code enterprise.

  6. OS constraint

    Native Windows: Claude Code's sandbox requires WSL2 (no native-Windows sandbox). IDE agents and Codex CLI run on Windows directly.
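The first and third axes reduce to simple lookups, which can be sketched as two small shell functions. This is only an illustration of the branch order: the function names and the argument values (`watch`, `handoff`, `chatgpt`, `claude`, `byo-key`, `free`) are hypothetical and not part of any tool's CLI.

```bash
# Hypothetical helpers: names and argument values are illustrative only.

# Axis 1: review posture -> surface.
pick_surface() {
  case "$1" in
    watch)   echo "IDE agent" ;;           # you review every hunk as it lands
    handoff) echo "CLI or cloud agent" ;;  # you review a finished diff or PR
    *) echo "usage: pick_surface watch|handoff" >&2; return 2 ;;
  esac
}

# Axis 3: existing subscription -> the tool that adds no new bill.
pick_by_plan() {
  case "$1" in
    chatgpt) echo "Codex CLI" ;;
    claude)  echo "Claude Code" ;;
    byo-key) echo "Cline" ;;
    free)    echo "Copilot Free or Cline + a cheap/local model" ;;
    *) echo "usage: pick_by_plan chatgpt|claude|byo-key|free" >&2; return 2 ;;
  esac
}

pick_surface handoff   # prints: CLI or cloud agent
pick_by_plan chatgpt   # prints: Codex CLI
```

The remaining axes (where you live, unattended needs, openness, OS) would narrow the result further in the same way.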

Find your tool (or your pair)

Which coding agent fits you?

Answer six honest questions — surface, autonomy, budget, who you already pay, openness, and models — and land on a specific recommendation with a runner-up. Facts as of 2026-06-15.

  1. Surface
  2. Autonomy
  3. Budget
  4. Existing plan
  5. Openness
  6. Context + models


The wizard walks the six axes (loop length, where you live, who you pay, unattended needs, openness, OS) and lands on a recommended tool or pair, with a one-line rationale and the matching cost note. It surfaces the "most people run two" outcome and flags the stale paths (Gemini → Antigravity, Windsurf → Devin, Roo → Zoo) the moment you pick one.

| Tool | Surface | Vendor / license | Current state |
| --- | --- | --- | --- |
| Claude Code | CLI + IDE + desktop + web + Slack + Actions | Anthropic (proprietary) | Rolling auto-update; no pinned version |
| OpenAI Codex CLI | CLI + IDE + cloud + ACP | OpenAI (Apache-2.0) | v0.139.0 (npm latest); sandboxed by default |
| Gemini CLI | CLI | Google (Apache-2.0) | v0.46.0; free individual tier ends 2026-06-18 → Antigravity CLI |
| Cursor | IDE (VS Code fork, agent-first) | Anysphere (proprietary) | v3.7; standalone app since 3.0 (not an extension) |
| GitHub Copilot | IDE + cloud agent (Actions) | GitHub / Microsoft (proprietary) | Agent mode + cloud agent GA; usage-based AI Credits |
| Windsurf → Devin Desktop | IDE (VS Code fork) + Devin Cloud + ACP | Cognition (proprietary) | Rebranded 2026-06-02; Cascade EOL 2026-07-01 |
| Cline | VS Code/JetBrains ext + CLI + Kanban | Cline Bot Inc. (Apache-2.0) | Ext v3.89.2; CLI v3.0.24; ~4.3M installs; BYO key |
| Roo Code → Zoo Code | VS Code ext | community | Roo archived 2026-05-15; Zoo Code v3.60.0 is the live fork |

The eight tools in scope, current state on 2026-06-15. Claude Code auto-updates (no single pinned version); "→" marks a tool mid-rebrand or retired.

Install / entry per tool (verified strings)
```bash
# CLI agents
curl -fsSL https://claude.ai/install.sh | bash   # Claude Code (rolling auto-update)
npm install -g @openai/codex                      # Codex CLI v0.139.0 (npm latest)
npm install -g @google/gemini-cli                 # Gemini CLI v0.46.0 (free tier ends 2026-06-18)

# IDE agents
#   Cursor          download from cursor.com          (standalone app since 3.0)
#   GitHub Copilot  VS Code / JetBrains extension
#   Devin Desktop   download from devin.ai/desktop    (ex-Windsurf)
#   Cline           VS Code Marketplace: saoudrizwan.claude-dev   (historical ID, NOT an Anthropic product)
#   Zoo Code        VS Code Marketplace (community fork of Roo Code)
```
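
After installing, a quick check of which CLI agents are actually on your PATH can save a confusing first run. The sketch below assumes the three installers expose the binaries `claude`, `codex`, and `gemini`; `agent_status` is a hypothetical helper name.

```bash
# agent_status BIN: report whether BIN resolves on PATH.
# Hypothetical helper; it only looks the command up and never runs it.
agent_status() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

# Assumed binary names for Claude Code, Codex CLI, and Gemini CLI.
for bin in claude codex gemini; do
  agent_status "$bin"
done
```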

IDE agents vs CLI / cloud agents — the loop-length split

IDE agents (inner loop)

Cursor, GitHub Copilot, Devin Desktop, Cline, Zoo Code.

Watch each diff land, Tab-complete, ask a prebuilt index "where is auth handled?" Best for short, exploratory, review-heavy work in your current editor.

Unmetered surface = autocomplete: Cursor Tab and Copilot completions are unlimited on paid plans. If most of your value is autocomplete, an IDE subscription is the floor.

CLI / cloud agents (outer loop)

Claude Code, Codex CLI, Gemini CLI; plus cloud agents.

Hand off a goal, review a finished diff or PR. Best for long, well-specified, parallel, or unattended work — headless CI (claude -p, codex exec), fan-out across worktrees, a PR opened from an issue.

Expensive surface = the agent loop: it re-reads context every turn and is token-metered, so one agent turn can consume 10–50× the tokens of a Tab completion, and unlike Tab it is billed.

| Tool | Free tier | Cheapest entry | Headless / control | Instruction file |
| --- | --- | --- | --- | --- |
| Claude Code | None | Claude Pro $20 / Max 5× $100 / Max 20× $200, or API | claude -p --output-format json; --bare for CI; per-cmd allow/ask/deny + PreToolUse/PostToolUse/Stop hooks | CLAUDE.md |
| Codex CLI | None | ChatGPT Plus $20 (Codex included on Plus/Pro/Business/Edu/Enterprise), or API | codex exec --json / --output-schema; openai/codex-action@v1; @openai/codex-sdk | AGENTS.md |
| Gemini CLI | Yes; ends 2026-06-18 for individuals | After cutover: paid Gemini API key or Code Assist Std/Ent | Opt-in --sandbox (Docker/Podman/Seatbelt/gVisor/LXC); config in ~/.gemini/settings.json | GEMINI.md |

CLI agents: free tier, entry cost, control surface, and the exact flags/config keys. Pick by who you already pay and how unattended you go.
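
To make the headless column concrete, here is a dry-run sketch that prints the two invocations without executing them. It uses only flags named in the table (`claude -p --output-format json`, `codex exec --json`); `headless_cmd` and the sample prompt are hypothetical, and the quoting is for display, not for safely passing arbitrary prompts.

```bash
# headless_cmd TOOL PROMPT: print (do not run) a headless invocation.
# Hypothetical helper; a prompt containing double quotes would need escaping.
headless_cmd() {
  case "$1" in
    claude) printf 'claude -p "%s" --output-format json\n' "$2" ;;
    codex)  printf 'codex exec --json "%s"\n' "$2" ;;
    *) echo "unsupported tool: $1" >&2; return 2 ;;
  esac
}

headless_cmd claude "Fix the failing test"
headless_cmd codex "Fix the failing test"
```

In CI you would run the printed command directly and parse its JSON output; the dry run is only there to show the shape of each call.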

Codex CLI's two-dial trust model

Codex CLI ships sandboxed by default (default = workspace-write + on-request). Two independent enum keys in ~/.codex/config.toml control it: sandbox_mode (what it can touch) and approval_policy (when it must ask).

| Key | Values | Effect |
| --- | --- | --- |
| sandbox_mode | read-only / workspace-write / danger-full-access | What the agent can touch on disk/network |
| approval_policy | untrusted / on-request / never | When the agent must pause and ask |

Codex CLI sandbox + approval enums (`~/.codex/config.toml`). Default is `workspace-write` + `on-request`. `--full-auto` is deprecated (use `--sandbox workspace-write`); `--yolo` / `--dangerously-bypass-approvals-and-sandbox` removes all guardrails.

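
As a sketch of how the two dials sit side by side, the helper below validates both values against the enums listed above and writes them as top-level TOML keys. It writes to a scratch file rather than the real `~/.codex/config.toml`; `write_codex_config` is a hypothetical name, and the top-level placement of the keys is an assumption.

```bash
# write_codex_config FILE SANDBOX_MODE APPROVAL_POLICY
# Hypothetical helper: rejects values outside the two documented enums.
write_codex_config() {
  case "$2" in
    read-only|workspace-write|danger-full-access) ;;
    *) echo "bad sandbox_mode: $2" >&2; return 2 ;;
  esac
  case "$3" in
    untrusted|on-request|never) ;;
    *) echo "bad approval_policy: $3" >&2; return 2 ;;
  esac
  printf 'sandbox_mode = "%s"\napproval_policy = "%s"\n' "$2" "$3" > "$1"
}

scratch=$(mktemp -d)   # scratch directory, not your real ~/.codex
write_codex_config "$scratch/config.toml" workspace-write on-request
cat "$scratch/config.toml"
```

Because the keys are independent, tightening one (for example `read-only`) does not change when the agent asks for approval.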
| Tool | Free tier | Paid entry | Unmetered surface | Rules / config |
| --- | --- | --- | --- | --- |
| Cursor | Hobby (free) | Pro $20, Pro+ $60, Ultra $200, Teams Std $40/user, Teams Premium $120/user | Tab (multi-line, cross-file, jump-in-file), unlimited on paid | .cursor/rules/*.mdc + AGENTS.md; Max Mode bills at full API rates |
| GitHub Copilot | Free ($0) | Pro $10, Pro+ $39, Max $100, Business $19/seat, Enterprise $39/seat | Code completions, unlimited on paid plans | Custom instructions; AI Credits $0.01 (chat/agent/cloud all metered) |
| Devin Desktop | Free ($0) | Pro $20, Max $200, Teams $40/seat, Enterprise (ACUs) | Tab + inline edits, unlimited even on Free | .devin/rules/*.md; ACP runs Codex/Claude/Gemini as agents |
| Cline | Free (open source) | BYO key (pay provider directly); Enterprise custom | None (agent-only, no completion surface) | .clinerules; Plan/Act modes, per-mode models; MCP |
| Zoo Code | Free (open source) | BYO key; optional Zoo Gateway credits | None (agent-only) | .roomodes / .roo/; role modes + per-mode MCP allowlists |

IDE agents: free tier, paid entry, unmetered surface, and the rules/config key each reads. Pick by where you live and your governance needs.
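
Since Copilot's chat, agent, and cloud usage are all metered in AI Credits, a rough conversion helps when estimating a bill. The sketch below assumes one AI Credit is billed at $0.01, as the table lists; `credits_to_usd` is a hypothetical helper, and how many credits a given task consumes is not something this guide specifies.

```bash
# credits_to_usd CREDITS
# Assumes $0.01 per AI Credit, so CREDITS is also the cost in cents.
credits_to_usd() {
  printf '$%d.%02d\n' "$(( $1 / 100 ))" "$(( $1 % 100 ))"
}

credits_to_usd 1250   # prints: $12.50
```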

Per-tool pick / avoid

One row per tool: the conditions that make it the right pick, and the conditions that rule it out.

| Tool | Pick when | Avoid when |
| --- | --- | --- |
| Claude Code (Anthropic) | One agent across terminal/IDE/desktop/web/Slack/CI sharing one CLAUDE.md; named extension primitives (Skills at .claude/skills/SKILL.md, Subagents at .claude/agents/, Agent Teams); scriptable hooks that *enforce* policy; strong sandbox (Seatbelt on macOS, bubblewrap on Linux/WSL2, deny-by-default network proxy); you already pay Pro/Max or per-token. | You need a free tier (none); native Windows + sandbox (WSL2 required); whole value is Tab autocomplete; you want zero-markup multi-provider choice. |
| Codex CLI (OpenAI, Apache-2.0) | You already pay for ChatGPT; you want the clean two-dial trust model; safety-on-by-default with zero config; the portable AGENTS.md; headless/CI via codex exec, cloud tasks, openai/codex-action@v1, or the SDK. | You need a $0 tier (cheapest is ChatGPT Plus $20); you want the broadest named file-based extension surface (Claude Code's hooks + permission rules reach further than two enum keys). |
| Gemini CLI (Google, Apache-2.0) | You hold a paid Gemini Code Assist Standard/Enterprise license or paid Gemini API key (retain access past the cutover); or you specifically need the Gemini 3 family (1M-token input on gemini-3.1-pro-preview). | You are an individual on the free tier (ends 2026-06-18); you are starting fresh today: install Antigravity CLI instead. |
| Cursor (Anysphere) | Agent-first workspace; best-in-class Tab; Plan mode; parallel agents across worktrees/cloud/SSH; prebuilt semantic index for codebase Q&A; pointing one editor at any frontier model (Claude/GPT/Gemini/Grok/Kimi) or in-house Composer 2.5. | You want to stay in your current editor (standalone app since 3.0); you want open-source/BYO-key; you are cost-sensitive about agent loops (pool can exhaust; Max Mode bills at full API rates). |
| GitHub Copilot (GitHub/Microsoft) | You live in VS Code/JetBrains/Visual Studio; want a genuine $0 start; want issue → PR automation via the cloud (coding) agent (distinct from synchronous in-editor agent mode); a team/enterprise wanting pooled AI Credits + SSO; the cheapest paid agent tier (Pro $10/mo). | You expect the cloud agent to be unconstrained (hard 59-minute cap, one branch / one PR per task, cannot change multiple repos, ignores content exclusions); you assume agent usage is free because completions are. |
| Devin Desktop (Cognition, ex-Windsurf) | An Agent Command Center orchestrating local + cloud agents from one Kanban surface; built-in ACP (run Codex CLI, Claude Agent, Gemini CLI as first-class agents); Windsurf's real-time awareness + .devin/rules/*.md; unlimited Tab on Free ($0). | You are following any "Windsurf"/"Cascade" tutorial (brand gone, Cascade EOL 2026-07-01); you want stable docs/versions (mid-migration); you want open-source/BYO-key. |
| Cline (Cline Bot Inc., Apache-2.0) | Most-installed open-source agent (~4.3M). Total provider choice with no markup (Anthropic, OpenAI, Gemini, Bedrock, Vertex, Azure, OpenRouter's 200+ models, or local via Ollama/LM Studio; pay the provider directly); data control/residency; a minimal Plan/Act agent with different models per mode; MCP-first extensibility. | You want a flat all-in price (you pay per-token inference); you are a beginner who'd be surprised by a provider bill; you want role-based modes (that is Zoo's design); you want autocomplete (Cline is agent-only). |
| Zoo Code (community, ex-Roo Code) | Multi-persona role modes (Code / Ask / Architect / Debug / Orchestrator + Custom Modes with scoped tool/file permissions); Orchestrator delegating subtasks; open-source BYO-key MCP with per-mode MCP allowlists (v3.60.0). | Never install Roo Code: shut down 2026-05-15, repo read-only, Cloud/Router offline. Install Zoo Code (or Cline) instead. |

Pick-when / avoid-when per tool. "Pick" = the conditions it is the right answer for; "Avoid" = the disqualifiers. Vendor/license in the first column.

Knowledge check

Your team lives in VS Code on GitHub and writes well-specified tasks. You want to assign a GitHub Issue and have a PR opened on a branch while you sleep. You also need a $0 starting point. Which fits best?

The two-tool setup, sharing one AGENTS.md
Inner-loop IDE agent + outer-loop CLI agent reading one open AGENTS.md, so neither surface starts blind.

Convergence: what is shared, what still differs

Shared, so switching is cheap:

  • Instruction file: AGENTS.md is read by Codex, Cursor, GitHub Copilot, and Gemini CLI. Copilot also reads CLAUDE.md/GEMINI.md; Claude Code can @AGENTS.md-import it.
  • Models — the same frontier models (Claude Opus 4.x / Fable 5, GPT-5.x, Gemini 3.x) back both IDE agents and CLIs; in-house models (Cursor Composer, Devin SWE-1.x) are the differentiator.
  • Protocol — the Agent Client Protocol (ACP) lets a CLI agent (Codex, Claude Agent, Gemini CLI) run inside an editor (Devin Desktop, JetBrains, Zed) as a first-class agent.
  • Cloud handoff — Cursor Cloud Agents and Copilot's cloud agent both hand an in-IDE task to an async worker that opens a PR.

Still differs sharply — cost models, review postures, sandbox philosophies, and extension surfaces. Those are the axes the wizard branches on. Keep instructions in AGENTS.md so switching tools is cheap, and re-evaluate when the perishable facts above expire.
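
One way to keep a single source of truth is to put the rules in AGENTS.md and give Claude Code a CLAUDE.md that only imports it. The sketch below seeds a scratch directory with both files; `init_shared_instructions` and the placeholder rules are hypothetical, and the one-line `@AGENTS.md` import follows the import behavior described above.

```bash
# init_shared_instructions DIR
# Hypothetical helper: writes one AGENTS.md plus a CLAUDE.md that imports it,
# so both surfaces read the same rules.
init_shared_instructions() {
  cat > "$1/AGENTS.md" <<'EOF'
# Project instructions (placeholder content)
- Run the test suite before proposing a diff.
- Keep changes scoped to the task described.
EOF
  # CLAUDE.md holds nothing but the import line.
  printf '@AGENTS.md\n' > "$1/CLAUDE.md"
}

repo=$(mktemp -d)   # scratch directory standing in for a repository root
init_shared_instructions "$repo"
ls "$repo"
```

With this layout, edits go into AGENTS.md only, and tools that read it natively need no extra file.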
