The Workshop · 11 min mission

IDE Agents vs Terminal CLI Agents: An Honest Decision Guide

Learn which surface fits each task — inline diffs and free Tab in the editor, or headless autonomy and parallel PRs from the terminal.

Tags: ide-agents, cli-agents, cursor, headless, cost-models, decision-guide · Fact-checked 2026-06-15

IDE agents and terminal/CLI agents are two surfaces for the same model loop, usually shipped by the same vendor. This guide gives you the exact commands, flags, file paths, and cost models for each surface, and a procedure to route any task to the one that fits. The deciding variable is loop length: short, attended, review-heavy work → IDE; long, well-specified, parallel, or unattended → CLI/cloud.

Property	IDE agent (inner loop)	CLI / cloud agent (outer loop)
Review point	Per hunk, inline, as edits are proposed	At the end — a finished diff or PR
Context retrieval	Prebuilt semantic index (find by meaning)	Agentic search: `grep` + read on demand, no persistent index
Speed surface	`Tab` completions (free/unlimited on paid plans)	Autonomy + parallelism (many agents, headless, CI)
Runs unattended	No — needs a human watching the cursor	Yes — `claude -p`, `codex exec`, cloud PR agents
Best when	Short, local, exploratory, review-heavy	Long-horizon, well-specified, parallel, batch-reviewed

What each surface is, mechanically. Most teams run both and route per task.

IDE-only affordances

Three capabilities depend on a human watching a cursor; a headless process cannot replicate them.

Inline diffs / hunk-level review — the agent proposes edits as diffs in the file; you accept or reject per hunk before anything lands. Cursor's Agent snapshots edits as Checkpoints so you can roll back without touching Git history.

Tab completion — Cursor Tab suggests code from your recent edits, surrounding code, and linter errors; it predicts multi-line edits, cross-file edits, and jump-in-file (press Tab again to navigate to the next likely edit). Tab is unlimited on all paid Cursor plans. GitHub Copilot code completions and next-edit suggestions are not billed and stay unlimited on paid plans — a different cost basis from agent/chat usage.

Prebuilt semantic index — Cursor splits files into chunks, embeds them as vector embeddings, and stores the embeddings remotely. Semantic search becomes available at ~80% completion and the index auto-syncs every ~5 minutes. Exclusions honor .gitignore and .cursorignore.

Search style	How it works	Where it lives
Semantic search	Query the prebuilt embedding index by meaning ("where is auth handled?")	IDE agent — instant on conceptual questions
Agentic search	Chain `grep` for exact symbols + semantic search for concepts	Cursor Agent (both); CLI agents lean almost entirely on grep + read

IDE semantic search vs CLI agentic search — why each surface feels the way it does.
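
To make the two rows above concrete, here is a toy sketch of the two retrieval shapes: an index built once and queried by vector similarity, versus an on-demand `grep`-style scan. Everything in it is an assumption for illustration. `embed` is a bag-of-words stand-in for a learned embedding model (so it matches shared words rather than meaning), and the chunk size, function names, and sample files are hypothetical. It is not Cursor's implementation.

```python
import math
import re
from collections import Counter

def chunk(text: str, lines_per_chunk: int = 2) -> list[str]:
    """Split a file into fixed-size line chunks (the size is an arbitrary choice)."""
    lines = text.splitlines()
    return ["\n".join(lines[i:i + lines_per_chunk])
            for i in range(0, len(lines), lines_per_chunk)]

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def build_index(files: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """IDE shape: pay the indexing cost once, up front."""
    return [(path, c, embed(c)) for path, text in files.items() for c in chunk(text)]

def semantic_search(index: list[tuple[str, str, Counter]], query: str) -> str:
    """Return the path of the chunk most similar to the query."""
    q = embed(query)
    return max(index, key=lambda entry: cosine(q, entry[2]))[0]

def grep_search(files: dict[str, str], symbol: str) -> list[str]:
    """CLI shape: no index; scan files on demand for an exact string."""
    return [path for path, text in files.items() if symbol in text]

# Hypothetical two-file repository.
FILES = {
    "auth.py": "def login(user, password):\n"
               "    # check the user password and start an auth session\n"
               "    return True",
    "billing.py": "def charge(card, amount):\n"
                  "    # charge the card for the invoice amount\n"
                  "    return amount",
}

index = build_index(FILES)
print(semantic_search(index, "where is the user auth session handled"))
print(grep_search(FILES, "def charge"))
```

The indexed path answers a loosely worded question with one lookup, while the scan needs an exact string but has no index to build or keep in sync.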

CLI / cloud-only affordances

These three capabilities require no human at the cursor: autonomy, parallelism, and unattended execution.

Headless, non-interactive runs are the CLI's signature. Claude Code runs with -p / --print; --output-format accepts text (default), json (includes total_cost_usd + session ID), and stream-json (newline-delimited events). --json-schema enforces structured output (returned in structured_output). --bare skips auto-discovery of hooks, skills, plugins, MCP, memory, and CLAUDE.md ("the recommended mode for scripted and SDK calls"). Piped stdin is capped at 10MB (as of v2.1.128).

OpenAI Codex CLI mirrors this with codex exec: progress streams to stderr and the final message goes to stdout. It must run inside a Git repo unless you pass --skip-git-repo-check. --json emits a JSON-Lines event stream, --output-schema <path> enforces a JSON Schema, --sandbox sets the permission level, and codex exec resume --last continues a prior run.

Cloud / async PR agents detach work from your machine. GitHub's Copilot coding agent runs in an ephemeral environment powered by GitHub Actions, edits on a branch, runs tests, and opens a PR, with a 59-minute execution limit per task. Cursor's Cloud Agents run in sandboxed cloud VMs from an ephemeral checkout. Devin, from Cognition (which has rebranded Windsurf as Devin Desktop), pairs an autonomous cloud agent with a desktop editor.

Capability	Claude Code	OpenAI Codex CLI
Non-interactive entry	`claude -p "<prompt>"` / `--print`	`codex exec "<prompt>"`
Structured output	`--output-format json` (`total_cost_usd`)	`--json` (JSON-Lines events)
Schema enforcement	`--json-schema` → `structured_output`	`--output-schema <path>`
Reproducible / minimal	`--bare` (skips hooks, MCP, `CLAUDE.md`)	`--sandbox <level>`
Resume a run	Session ID from `json` output	`codex exec resume --last`
Repo requirement	None	Must be in a Git repo (`--skip-git-repo-check`)

Headless flags by tool — the exact strings for unattended runs.
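
As a minimal sketch, the flag strings in the table can be assembled into argument lists for `subprocess`. The helper names and defaults are hypothetical, only flags named above are used, and placing options after the prompt is an assumption about how each CLI parses its arguments.

```python
import shlex
from typing import Optional

CLAUDE_OUTPUT_FORMATS = {"text", "json", "stream-json"}

def claude_headless_cmd(prompt: str, output_format: str = "json",
                        bare: bool = True) -> list[str]:
    """Build a non-interactive Claude Code invocation (hypothetical helper)."""
    if output_format not in CLAUDE_OUTPUT_FORMATS:
        raise ValueError(f"unsupported output format: {output_format}")
    cmd = ["claude", "-p", prompt, "--output-format", output_format]
    if bare:
        # Skips auto-discovery of hooks, skills, plugins, MCP, memory, CLAUDE.md.
        cmd.append("--bare")
    return cmd

def codex_headless_cmd(prompt: str, json_events: bool = True,
                       schema_path: Optional[str] = None,
                       skip_git_check: bool = False) -> list[str]:
    """Build a non-interactive Codex CLI invocation (hypothetical helper)."""
    cmd = ["codex", "exec", prompt]
    if json_events:
        cmd.append("--json")  # JSON-Lines event stream
    if schema_path is not None:
        cmd += ["--output-schema", schema_path]
    if skip_git_check:
        cmd.append("--skip-git-repo-check")  # allow running outside a Git repo
    return cmd

# Print shell-quoted forms for inspection; nothing is executed here.
print(shlex.join(claude_headless_cmd("Summarize the failing tests")))
print(shlex.join(codex_headless_cmd("Bump dependencies", schema_path="schema.json")))
```

Returning a list rather than a single string keeps the prompt as one argument, so no shell quoting is needed when the list is handed to `subprocess.run`.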

A CLI agent composes like any Unix tool — pipe input in, get structured output out, no editor in sight.
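
A hedged sketch of that composition from Python: pipe text to a headless run on stdin and parse the JSON result. The `run_headless` helper and the `fake_runner` stub are hypothetical, the stub's payload is made-up sample data, and reading "10MB" as 10 MiB is an assumption. The `session_id` key is also an assumption about how the session ID is named in the output; only `total_cost_usd` is named in the text above.

```python
import json
import subprocess

MAX_STDIN_BYTES = 10 * 1024 * 1024  # assumption: the 10MB cap read as 10 MiB

def run_headless(prompt: str, stdin_text: str, runner=subprocess.run) -> dict:
    """Pipe stdin_text into `claude -p` and return the parsed JSON result."""
    data = stdin_text.encode("utf-8")
    if len(data) > MAX_STDIN_BYTES:
        raise ValueError("stdin payload exceeds the 10MB cap")
    proc = runner(
        ["claude", "-p", prompt, "--output-format", "json"],
        input=data,            # piped stdin
        capture_output=True,   # collect stdout and stderr
        check=True,            # raise if the CLI exits non-zero
    )
    return json.loads(proc.stdout)

def fake_runner(cmd, **kwargs):
    """Stub standing in for the real CLI so the sketch runs anywhere."""
    payload = {"total_cost_usd": 0.0123, "session_id": "hypothetical-session"}
    return subprocess.CompletedProcess(cmd, 0, stdout=json.dumps(payload).encode(), stderr=b"")

result = run_headless("Triage these log lines", "ERROR db timeout\n", runner=fake_runner)
print(result["total_cost_usd"])
```

Dropping the `runner=fake_runner` argument makes the same call invoke the real CLI, assuming `claude` is installed and on the `PATH`.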

Instruction / "rules" files by surface

Each surface reads configuration from different paths. AGENTS.md is the cross-vendor convergence point — Cursor, Copilot, and others all read it.

Tool	Primary path	Notes
Cursor	`.cursor/rules/*.mdc`	Modes via frontmatter: `alwaysApply: true`, `description`, `globs`, or manual `@rule-name`. Order: Team → Project → User. Plain `.md` ignored except `AGENTS.md`
GitHub Copilot	`.github/copilot-instructions.md`	Path-specific `.github/instructions/*.instructions.md` with `applyTo` globs. Also reads `AGENTS.md` / `CLAUDE.md` / `GEMINI.md`. Code review reads only the first 4,000 characters
Claude Code	`CLAUDE.md`	Loads memory + hooks/skills/plugins/MCP by default — unless `--bare` skips all of it

Where each surface loads project rules. Use `AGENTS.md` for portability across all of them.
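
A small sketch for auditing a repository against the paths in the table. The function names are hypothetical, only the primary paths listed above are checked (assumed to sit at the repository root), and the overflow check applies the 4,000-character code-review limit to `.github/copilot-instructions.md` as an assumption about which file the limit covers.

```python
from pathlib import Path

COPILOT_REVIEW_CHAR_LIMIT = 4000  # from the table: code review reads only the first 4,000 chars

def find_instruction_files(repo: Path) -> dict[str, list[str]]:
    """Map each tool to the instruction files that exist under repo."""
    def rel(paths) -> list[str]:
        return sorted(p.relative_to(repo).as_posix() for p in paths if p.is_file())
    return {
        "Cursor": rel(repo.glob(".cursor/rules/*.mdc")),
        "GitHub Copilot": rel([repo / ".github" / "copilot-instructions.md"]),
        "Claude Code": rel([repo / "CLAUDE.md"]),
        "cross-vendor": rel([repo / "AGENTS.md"]),
    }

def copilot_review_overflow(repo: Path) -> int:
    """Count characters past the limit that code review would not read."""
    path = repo / ".github" / "copilot-instructions.md"
    if not path.is_file():
        return 0
    return max(0, len(path.read_text(encoding="utf-8")) - COPILOT_REVIEW_CHAR_LIMIT)

# Audit the current working directory.
print(find_instruction_files(Path(".")))
print(copilot_review_overflow(Path(".")))
```

An empty list for a tool means that surface will run without project rules unless it also picks up `AGENTS.md`.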

If the task is…	Reach for…	Because
Short, local, review-heavy	IDE agent	Inline diffs + Checkpoints let you approve each hunk
Dense incremental editing	IDE agent	`Tab` saves more keystrokes than a chat round-trip, and is free
Exploring an unfamiliar repo	IDE agent	Prebuilt semantic index answers conceptual questions without manual grepping
Long-horizon, well-specified	CLI / cloud agent	You review the finished diff or PR, not each step
Unattended (CI, scheduled, triage)	CLI headless	`claude -p` / `codex exec` slot into shells and pipelines
Parallel / fan-out	CLI / cloud agent	Many agents across worktrees; the editor is not the bottleneck
Provider control / no markup	BYOK CLI (Cline)	Connect your own key and pay model rates directly

The "which surface" routing matrix. Pick the row matching the task in front of you.
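
For the parallel / fan-out row, a minimal sketch of what "many agents across worktrees" could look like as a plan: one Git worktree and one headless run per task. The function name, the `agent/<slug>` branch naming, the `../worktrees` directory, and the sample tasks are all hypothetical. The plan is only built here, not executed.

```python
def fan_out_plan(tasks: dict[str, str], base_dir: str = "../worktrees") -> list[dict]:
    """Build one worktree command and one headless agent command per task."""
    plan = []
    for slug, prompt in tasks.items():
        path = f"{base_dir}/{slug}"
        plan.append({
            # Create an isolated checkout on a new branch for this task.
            "worktree": ["git", "worktree", "add", "-b", f"agent/{slug}", path],
            # Run the agent non-interactively inside that checkout.
            "agent": ["claude", "-p", prompt, "--output-format", "json"],
            "cwd": path,
        })
    return plan

# Hypothetical tasks keyed by a short slug.
TASKS = {
    "fix-login": "Fix the failing login test and explain the change",
    "bump-deps": "Bump dependencies and run the test suite",
}

for step in fan_out_plan(TASKS):
    print(step["cwd"], step["worktree"], step["agent"])
```

Each entry could be handed to `subprocess.run` (the agent command with `cwd=step["cwd"]`), and the entries are independent, so they can run concurrently without sharing a working tree.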

Surface router — IDE, CLI/headless, or cloud agent?

Which IDE agent?

Five quick questions about where you code, how much autonomy you want, how you'd rather pay, whether open source matters, and your main language — and you'll get a single agent to start with, the reasoning behind it, and a runner-up. All five contenders are in the legend below.

The five contenders

Cursor (closed source)

An AI-first editor (a VS Code fork) with a strong agent, fast Tab completion, and multi-file edits.

GitHub Copilot (closed source)

Drops into the editor you already use — VS Code, JetBrains, Visual Studio, Neovim — with chat, agent mode, and completions.

Windsurf (closed source; now Devin Desktop)

An AI-native editor whose "Cascade" agent keeps the whole task in flow — opinionated, polished, and beginner-friendly.

Cline (open source)

An open-source, fully autonomous coding agent that lives in VS Code. You bring your own model and pay the provider directly.

Roo Code (open source; repo archived May 15, 2026)

Open-source VS Code agent with role-based "modes" and deep auto-approve control — the tinkerer's pick. BYO model.

Answer the routing questions — task length, per-hunk vs final-PR review, attended vs unattended, parallel/fan-out, and cost basis — to land on an IDE agent, a headless CLI run (`claude -p` / `codex exec`), or a cloud PR agent. Mirrors the routing matrix above.

Cost models

Two cost classes: completions are cheap-to-free; agent loops are metered per token. A Tab completion or next-edit suggestion costs $0 on a paid plan; an agent loop that re-reads context every turn is billed per input, output, and cached token. The three subscription shapes below differ structurally — the shape determines which surface is cheap for a given workload.

Tool	Shape	Stays cheap / free	What meters
Cursor	Subscription + included usage pool (Pro $20, Pro+ $60, Ultra $200)	Tab unlimited on all paid plans	Agent usage past the included pool, at API rates
GitHub Copilot	Usage-based AI Credits (1 credit = $0.01); incl. Pro 1,500 / Pro+ 7,000 / Max 20,000	Completions + next-edit suggestions not billed	Chat, CLI, cloud agents — metered per token
Claude Code	Subscription pool + separate Agent SDK credit (from June 15, 2026)	Interactive terminal / IDE use draws the normal pool	Headless `-p`, SDK, GitHub Actions — full API rates, no rollover

Three cost shapes, current as of 2026-06-15.

Model	Input / 1M	Output / 1M
GPT-5.5	$5.00	$30.00
GPT-5.4	$2.50	$15.00
GPT-5.3-Codex	$1.75	$14.00
Claude Sonnet 4.x	$3.00	$15.00
Claude Opus 4.x	$5.00	$25.00
Gemini 2.5 Pro	$1.25	$10.00

GitHub Copilot per-1M-token rates (input / output) — why a context-re-reading agent turn costs orders of magnitude more than a $0 Tab completion.
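
A worked example of the arithmetic, using the rates in the table above and the 1 credit = $0.01 conversion from the cost-shape table. The workload (10 turns, 50,000 input tokens re-read per turn, 2,000 output tokens per turn) is a hypothetical illustration, and cached-token discounts are ignored as a simplifying assumption.

```python
# (input USD, output USD) per 1M tokens, copied from the table above.
RATES_PER_1M = {
    "GPT-5.5": (5.00, 30.00),
    "GPT-5.4": (2.50, 15.00),
    "GPT-5.3-Codex": (1.75, 14.00),
    "Claude Sonnet 4.x": (3.00, 15.00),
    "Claude Opus 4.x": (5.00, 25.00),
    "Gemini 2.5 Pro": (1.25, 10.00),
}
USD_PER_CREDIT = 0.01  # 1 AI Credit = $0.01

def loop_cost_usd(model: str, turns: int,
                  input_tokens_per_turn: int, output_tokens_per_turn: int) -> float:
    """Cost of an agent loop that re-reads its context on every turn."""
    rate_in, rate_out = RATES_PER_1M[model]
    total_in = turns * input_tokens_per_turn
    total_out = turns * output_tokens_per_turn
    return (total_in * rate_in + total_out * rate_out) / 1_000_000

cost = loop_cost_usd("Claude Sonnet 4.x", turns=10,
                     input_tokens_per_turn=50_000, output_tokens_per_turn=2_000)
print(f"${cost:.2f} = {cost / USD_PER_CREDIT:.0f} AI Credits")
# prints: $1.80 = 180 AI Credits
```

Input dominates here ($1.50 of the $1.80) because the loop re-sends 50,000 tokens of context on each of the 10 turns, which is why long loops are where metering bites.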

Danger zone: Three status changes that date older guides

Older blog posts will quietly mislead you:

Windsurf is now Devin Desktop. Cognition acquired Windsurf; windsurf.com/pricing 308-redirects to devin.ai/pricing, which states "Windsurf is now Devin Desktop." Headline model: SWE-1.6.
Roo Code is discontinued. The RooCodeInc/Roo-Code repo was archived May 15, 2026 (final release v3.54.0). Its shutdown notice points migrating users to a community fork (ZooCode) and Cline. Do not cite it as a live option.
Gemini CLI has a hard cutoff: June 18, 2026. On that date it stops serving AI Pro/Ultra subscribers and free individual users; Antigravity CLI (a Go rewrite carrying Agent Skills, Hooks, Subagents, and Extensions-as-plugins) replaces it. Only paid Gemini Code Assist enterprise licenses keep Gemini CLI access.

Route a task to the right surface

1. Estimate loop length. Do you want to see and shape each change (short, exploratory, uncertain spec), or hand off and review at the end (long, well-specified)? Short → IDE. Long → CLI / cloud. This one question decides most cases.
2. Check who is watching. Attended, steered continuously → IDE agent with inline diffs. Unattended (CI, scheduled maintenance, log triage, overnight refactors) → headless claude -p or codex exec, or a cloud PR agent. There is no one to accept hunks at 3am.
3. Decide if you need parallelism. One focused change → either surface. Many independent changes at once (several issues into PRs, a fan-out across worktrees) → CLI / cloud, so your editor is not the bottleneck.
4. Let cost break the tie. Mostly autocomplete → an IDE subscription (free Tab) is the floor. Heavy headless / CI volume meters at API rates; on Anthropic plans this draws on the separate Agent SDK credit since June 15, 2026. Pooled org credits (Copilot Business/Enterprise) favor teams; BYOK (Cline) favors cost control and data residency.
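
The first three questions of the procedure above can be sketched as a small decision function. The `Task` fields, the `route` name, and the order in which the checks are applied are assumptions made for illustration; the cost tie-breaker is left out because it depends on plan pricing rather than on the task itself.

```python
from dataclasses import dataclass

@dataclass
class Task:
    long_horizon: bool      # long and well-specified, reviewed at the end
    attended: bool          # a human is watching and steering
    parallel: bool = False  # many independent changes at once

def route(task: Task) -> str:
    """Route a task to a surface using loop length, attendance, and parallelism."""
    if not task.attended:
        # No one is there to accept hunks, so the run must be headless or in the cloud.
        return "headless CLI or cloud PR agent"
    if task.parallel:
        # Fan-out across worktrees keeps the editor from being the bottleneck.
        return "CLI / cloud agent"
    if task.long_horizon:
        return "CLI / cloud agent"
    return "IDE agent"

# Overnight dependency bump, reviewed in the morning.
print(route(Task(long_horizon=True, attended=False)))
# Short exploratory edit with a human steering.
print(route(Task(long_horizon=False, attended=True)))
```

Checking attendance first reflects that an unattended task cannot use per-hunk review at all, whatever its length.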

Knowledge check

You need to bump dependencies across a repo, run the test suite, and open a PR — overnight, with no one watching, reviewed in the morning. Which surface fits?
