The Observatory · 9 min mission
Choose Your Coding Agent: A Cross-Tool Decision Guide
Route each task to the coding agent whose cost and review model actually fit — across CLI and IDE tools.
Eight coding agents, one decision: which surface to run a given task on. This guide gives the install string, surface, cost model, and pick/avoid criteria for each, plus a wizard that walks the six decision axes to a recommendation. All facts are current as of 2026-06-15.
The decision: pick by loop length
Every tool here runs the same agentic loop — describe an outcome, the agent reads the repo, edits files, runs commands, reads errors, iterates. Choosing is not about capability; it is about loop length:
- Short, exploratory, review-heavy work where you watch each diff land → an IDE agent (inner loop).
- Long, well-specified, parallel, or unattended work where you review a finished diff or PR → a CLI or cloud agent (outer loop).
Most setups keep one of each. The wizard below branches on six axes in priority order.
The six decision axes (the wizard's branch order)
Loop length / review posture
Watch every hunk land (IDE) vs review a finished diff or PR (CLI/cloud). The single biggest signal.
Where you already live
Daily in VS Code → Copilot or Cline drop in as extensions, no new editor. Want an agent-first workspace → Cursor or Devin Desktop. Live in the terminal → a CLI (Claude Code, Codex CLI).
Who you already pay
Pay for ChatGPT → Codex CLI adds no bill. Pay for Claude → Claude Code adds none. Want zero markup + full provider choice → Cline (BYO key). Want truly free → Copilot Free or Cline + a cheap/local model.
Unattended / parallel needs
PR opened from an issue while you sleep → Copilot cloud agent or Cursor Cloud Agents. Headless CI / fan-out across worktrees → claude -p, codex exec, or Cursor parallel agents.
Openness / data control
Open-source + BYO key + local models → Cline or Zoo Code. Enterprise governance (SSO, pooled credits, audit) → Copilot Business/Enterprise, Cursor Teams/Enterprise, Devin Enterprise, or Claude Code enterprise.
OS constraint
Native Windows: Claude Code's sandbox requires WSL2 (no native-Windows sandbox). IDE agents and Codex CLI run on Windows directly.
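The priority-ordered branching can be sketched in Python as successive narrowing: each answered axis filters the candidate list, and an axis that would eliminate every candidate is skipped so that higher-priority answers win. The `AXES` table paraphrases the mappings listed above. The `narrow` function, the answer keys, and the skip-on-empty rule are illustrative assumptions, not the wizard's actual logic.

```python
# Hypothetical sketch: answer keys and the skip-on-empty rule are assumptions.
TOOLS = ["Claude Code", "Codex CLI", "Gemini CLI", "Cursor",
         "GitHub Copilot", "Devin Desktop", "Cline", "Zoo Code"]

# Axes in priority order; each answer maps to the tools the guide names for it.
AXES = [
    ("loop", {
        "inner": {"Cursor", "GitHub Copilot", "Devin Desktop", "Cline", "Zoo Code"},
        "outer": {"Claude Code", "Codex CLI", "Gemini CLI"},
    }),
    ("home", {
        "vscode": {"GitHub Copilot", "Cline"},
        "agent-first": {"Cursor", "Devin Desktop"},
        "terminal": {"Claude Code", "Codex CLI"},
    }),
    ("pays", {
        "chatgpt": {"Codex CLI"},
        "claude": {"Claude Code"},
        "byo-key": {"Cline"},
        "free": {"GitHub Copilot", "Cline"},
    }),
    ("unattended", {
        "issue-to-pr": {"GitHub Copilot", "Cursor"},
        "headless-ci": {"Claude Code", "Codex CLI", "Cursor"},
    }),
    ("openness", {
        "open-source": {"Cline", "Zoo Code"},
        "enterprise": {"GitHub Copilot", "Cursor", "Devin Desktop", "Claude Code"},
    }),
    ("os", {
        # Claude Code's sandbox needs WSL2, so native Windows drops it.
        "native-windows": set(TOOLS) - {"Claude Code"},
    }),
]

def narrow(answers):
    """Filter TOOLS axis by axis; unanswered axes are skipped."""
    candidates = list(TOOLS)
    for axis, options in AXES:
        choice = answers.get(axis)
        if choice is None:
            continue
        kept = [t for t in candidates if t in options[choice]]
        if kept:  # ignore an axis that would leave nothing
            candidates = kept
    return candidates

print(narrow({"loop": "inner", "home": "vscode", "pays": "free"}))
print(narrow({"loop": "outer", "pays": "chatgpt", "os": "native-windows"}))
```

A result with two names corresponds to the "one of each" or runner-up case; a real wizard would presumably add tie-breaking that this sketch leaves out.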
Find your tool (or your pair)
Which coding agent fits you?
Answer six honest questions — surface, autonomy, budget, who you already pay, openness, and models — and land on a specific recommendation with a runner-up. Facts as of 2026-06-15.
| Tool | Surface | Vendor / license | Current state |
|---|---|---|---|
| Claude Code | CLI + IDE + desktop + web + Slack + Actions | Anthropic (proprietary) | Rolling auto-update; no pinned version |
| OpenAI Codex CLI | CLI + IDE + cloud + ACP | OpenAI (Apache-2.0) | v0.139.0 (npm latest); sandboxed by default |
| Gemini CLI | CLI | Google (Apache-2.0) | v0.46.0; free individual tier ends 2026-06-18 → Antigravity CLI |
| Cursor | IDE (VS Code fork, agent-first) | Anysphere (proprietary) | v3.7; standalone app since 3.0 (not an extension) |
| GitHub Copilot | IDE + cloud agent (Actions) | GitHub / Microsoft (proprietary) | Agent mode + cloud agent GA; usage-based AI Credits |
| Windsurf → Devin Desktop | IDE (VS Code fork) + Devin Cloud + ACP | Cognition (proprietary) | Rebranded 2026-06-02; Cascade EOL 2026-07-01 |
| Cline | VS Code/JetBrains ext + CLI + Kanban | Cline Bot Inc. (Apache-2.0) | Ext v3.89.2; CLI v3.0.24; ~4.3M installs; BYO key |
| Roo Code → Zoo Code | VS Code ext | community | Roo archived 2026-05-15; Zoo Code v3.60.0 is the live fork |
# CLI agents
curl -fsSL https://claude.ai/install.sh | bash # Claude Code (rolling auto-update)
npm install -g @openai/codex # Codex CLI v0.139.0 (npm latest)
npm install -g @google/gemini-cli # Gemini CLI v0.46.0 (free tier ends 2026-06-18)
# IDE agents
# Cursor download from cursor.com (standalone app since 3.0)
# GitHub Copilot VS Code / JetBrains extension
# Devin Desktop download from devin.ai/desktop (ex-Windsurf)
# Cline VS Code Marketplace: saoudrizwan.claude-dev (historical ID, NOT an Anthropic product)
# Zoo Code VS Code Marketplace (community fork of Roo Code)
IDE agents vs CLI / cloud agents — the loop-length split
IDE agents (inner loop)
Cursor, GitHub Copilot, Devin Desktop, Cline, Zoo Code.
Watch each diff land, Tab-complete, ask a prebuilt index "where is auth handled?" Best for short, exploratory, review-heavy work in your current editor.
Unmetered surface = autocomplete: Cursor Tab and Copilot completions are unlimited on paid plans. If most of your value is autocomplete, an IDE subscription is the floor.
CLI / cloud agents (outer loop)
Claude Code, Codex CLI, Gemini CLI; plus cloud agents.
Hand off a goal, review a finished diff or PR. Best for long, well-specified, parallel, or unattended work — headless CI (claude -p, codex exec), fan-out across worktrees, a PR opened from an issue.
Expensive surface = the agent loop: it re-reads context every turn and is token-metered, so one turn can cost 10–50× a $0 Tab completion.
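A back-of-envelope model shows why re-reading context makes the loop the expensive surface. The sketch below assumes every turn resends the whole context and that the context grows by a fixed amount per turn. The token counts are hypothetical placeholders, not measurements of any tool.

```python
def loop_input_tokens(base_context: int, growth_per_turn: int, turns: int) -> int:
    """Total input tokens when every turn resends the full context.

    Turn k (0-based) sends base_context + k * growth_per_turn tokens.
    """
    return sum(base_context + k * growth_per_turn for k in range(turns))

# Hypothetical numbers: 20k-token starting context, 2k tokens added per turn.
one_turn = loop_input_tokens(20_000, 2_000, 1)
ten_turns = loop_input_tokens(20_000, 2_000, 10)
print(one_turn, ten_turns, ten_turns / one_turn)
```

Under these assumptions a ten-turn task bills 14.5 times the input of a single turn, which is why long loops deserve a budget check before they are handed off.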
| Tool | Free tier | Cheapest entry | Headless / control | Instruction file |
|---|---|---|---|---|
| Claude Code | None | Claude Pro $20 / Max 5× $100 / Max 20× $200, or API | claude -p --output-format json; --bare for CI; per-cmd allow/ask/deny + PreToolUse/PostToolUse/Stop hooks | CLAUDE.md |
| Codex CLI | None | ChatGPT Plus $20 (Codex included on Plus/Pro/Business/Edu/Enterprise), or API | codex exec --json / --output-schema; openai/codex-action@v1; @openai/codex-sdk | AGENTS.md |
| Gemini CLI | Yes — ends 2026-06-18 for individuals | After cutover: paid Gemini API key or Code Assist Std/Ent | Opt-in --sandbox (Docker/Podman/Seatbelt/gVisor/LXC); config in ~/.gemini/settings.json | GEMINI.md |
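To script the headless entry points in the table, one option is to build the argument list in Python and pass it to `subprocess.run`. The sketch below only constructs the commands, using the flags shown above (`claude -p --output-format json`, `codex exec --json`). The `headless_argv` helper is hypothetical, and flag behaviour should be checked against each CLI's `--help` for the installed version.

```python
import shlex

def headless_argv(tool: str, prompt: str) -> list[str]:
    """Build a non-interactive command line for a CLI agent (illustrative)."""
    if tool == "claude":
        return ["claude", "-p", prompt, "--output-format", "json"]
    if tool == "codex":
        return ["codex", "exec", "--json", prompt]
    raise ValueError(f"no headless recipe for {tool!r}")

argv = headless_argv("codex", "fix the failing test in utils")
print(shlex.join(argv))  # shell-quoted form, handy for logging
# To run it for real:
# subprocess.run(argv, capture_output=True, text=True, check=True)
```

Passing a list rather than a shell string avoids quoting problems when the prompt contains spaces or quotes.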
Codex CLI's two-dial trust model
Codex CLI ships sandboxed by default (default = workspace-write + on-request). Two independent enum keys in ~/.codex/config.toml control it: sandbox_mode (what it can touch) and approval_policy (when it must ask).
| Key | Values | Effect |
|---|---|---|
| sandbox_mode | read-only / workspace-write / danger-full-access | What the agent can touch on disk/network |
| approval_policy | untrusted / on-request / never | When the agent must pause and ask |
| Tool | Free tier | Paid entry | Unmetered surface | Rules / config |
|---|---|---|---|---|
| Cursor | Hobby (free) | Pro $20, Pro+ $60, Ultra $200, Teams Std $40/user, Teams Premium $120/user | Tab (multi-line, cross-file, jump-in-file) — unlimited on paid | .cursor/rules/*.mdc + AGENTS.md; Max Mode bills at full API rates |
| GitHub Copilot | Free ($0) | Pro $10, Pro+ $39, Max $100, Business $19/seat, Enterprise $39/seat | Code completions — unlimited on paid plans | Custom instructions; AI Credits $0.01 (chat/agent/cloud all metered) |
| Devin Desktop | Free ($0) | Pro $20, Max $200, Teams $40/seat, Enterprise (ACUs) | Tab + inline edits — unlimited even on Free | .devin/rules/*.md; ACP runs Codex/Claude/Gemini as agents |
| Cline | Free (open source) | BYO key (pay provider directly); Enterprise custom | None (agent-only, no completion surface) | .clinerules; Plan/Act modes, per-mode models; MCP |
| Zoo Code | Free (open source) | BYO key; optional Zoo Gateway credits | None (agent-only) | .roomodes / .roo/; role modes + per-mode MCP allowlists |
Per-tool pick / avoid
One row per tool: the conditions that make it the right pick, and the conditions that rule it out.
| Tool | Pick when | Avoid when |
|---|---|---|
| Claude Code (Anthropic) | One agent across terminal/IDE/desktop/web/Slack/CI sharing one CLAUDE.md; named extension primitives (Skills at .claude/skills/<name>/SKILL.md, Subagents at .claude/agents/, Agent Teams); scriptable hooks that *enforce* policy; strong sandbox (Seatbelt on macOS, bubblewrap on Linux/WSL2, deny-by-default network proxy); you already pay Pro/Max or per-token. | You need a free tier (none); native Windows + sandbox (WSL2 required); whole value is Tab autocomplete; you want zero-markup multi-provider choice. |
| Codex CLI (OpenAI, Apache-2.0) | You already pay for ChatGPT; you want the clean two-dial trust model; safety-on-by-default with zero config; the portable AGENTS.md; headless/CI via codex exec, cloud tasks, openai/codex-action@v1, or the SDK. | You need a $0 tier (cheapest is ChatGPT Plus $20); you want the broadest named file-based extension surface (Claude Code's hooks + permission rules reach further than two enum keys). |
| Gemini CLI (Google, Apache-2.0) | You hold a paid Gemini Code Assist Standard/Enterprise license or paid Gemini API key (retain access past the cutover); or you specifically need the Gemini 3 family (1M-token input on gemini-3.1-pro-preview). | You are an individual on the free tier (ends 2026-06-18); you are starting fresh today — install Antigravity CLI instead. |
| Cursor (Anysphere) | Agent-first workspace; best-in-class Tab; Plan mode; parallel agents across worktrees/cloud/SSH; prebuilt semantic index for codebase Q&A; pointing one editor at any frontier model (Claude/GPT/Gemini/Grok/Kimi) or in-house Composer 2.5. | You want to stay in your current editor (standalone app since 3.0); you want open-source/BYO-key; you are cost-sensitive about agent loops (pool can exhaust; Max Mode bills at full API rates). |
| GitHub Copilot (GitHub/Microsoft) | You live in VS Code/JetBrains/Visual Studio; want a genuine $0 start; want issue → PR automation via the cloud (coding) agent (distinct from synchronous in-editor agent mode); a team/enterprise wanting pooled AI Credits + SSO; the cheapest paid agent tier (Pro $10/mo). | You expect the cloud agent to be unconstrained (hard 59-minute cap, one branch / one PR per task, cannot change multiple repos, ignores content exclusions); you assume agent usage is free because completions are. |
| Devin Desktop (Cognition, ex-Windsurf) | An Agent Command Center orchestrating local + cloud agents from one Kanban surface; built-in ACP (run Codex CLI, Claude Agent, Gemini CLI as first-class agents); Windsurf's real-time awareness + .devin/rules/*.md; unlimited Tab on Free ($0). | You are following any "Windsurf"/"Cascade" tutorial (brand gone, Cascade EOL 2026-07-01); you want stable docs/versions (mid-migration); you want open-source/BYO-key. |
| Cline (Cline Bot Inc., Apache-2.0) | Most-installed open-source agent (~4.3M). Total provider choice with no markup (Anthropic, OpenAI, Gemini, Bedrock, Vertex, Azure, OpenRouter's 200+ models, or local via Ollama/LM Studio — pay the provider directly); data control/residency; a minimal Plan/Act agent with different models per mode; MCP-first extensibility. | You want a flat all-in price (you pay per-token inference); you are a beginner who'd be surprised by a provider bill; you want role-based modes (that is Zoo's design); you want autocomplete (Cline is agent-only). |
| Zoo Code (community, ex-Roo Code) | Multi-persona role modes (Code / Ask / Architect / Debug / Orchestrator + Custom Modes with scoped tool/file permissions); Orchestrator delegating subtasks; open-source BYO-key MCP with per-mode MCP allowlists (v3.60.0). | Never install Roo Code — shut down 2026-05-15, repo read-only, Cloud/Router offline. Install Zoo Code (or Cline) instead. |
Convergence: what is shared, what still differs
Shared, so switching is cheap:
- Instruction file — AGENTS.md is read by Codex, Cursor, GitHub Copilot, and Gemini CLI. Copilot also reads CLAUDE.md / GEMINI.md; Claude Code can import it with @AGENTS.md.
- Models — the same frontier models (Claude Opus 4.x / Fable 5, GPT-5.x, Gemini 3.x) back both IDE agents and CLIs; in-house models (Cursor Composer, Devin SWE-1.x) are the differentiator.
- Protocol — the Agent Client Protocol (ACP) lets a CLI agent (Codex, Claude Agent, Gemini CLI) run inside an editor (Devin Desktop, JetBrains, Zed) as a first-class agent.
- Cloud handoff — Cursor Cloud Agents and Copilot's cloud agent both hand an in-IDE task to an async worker that opens a PR.
Still differs sharply — cost models, review postures, sandbox philosophies, and extension surfaces. Those are the axes the wizard branches on. Keep instructions in AGENTS.md so switching tools is cheap, and re-evaluate when the perishable facts above expire.
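One way to act on that advice is to keep the rules in `AGENTS.md` and give Claude Code a one-line `CLAUDE.md` that imports it. The sketch below writes that stub only when it is missing. The `ensure_claude_stub` helper is hypothetical, and it assumes the `@AGENTS.md` import form mentioned above.

```python
from pathlib import Path

def ensure_claude_stub(repo: Path) -> bool:
    """Create CLAUDE.md importing AGENTS.md; return True if a file was written."""
    agents = repo / "AGENTS.md"
    claude = repo / "CLAUDE.md"
    if not agents.exists() or claude.exists():
        # Nothing to import, or an existing CLAUDE.md that must not be overwritten.
        return False
    claude.write_text("@AGENTS.md\n", encoding="utf-8")
    return True
```

Calling `ensure_claude_stub(Path("."))` from a repository root leaves a single source of truth in `AGENTS.md` while the tool-specific file stays a thin pointer.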