Choosing Your Model: The Claude Lineup
Pick the right model for the task — capability, speed, and cost.
Claude Code does not run on one model — it runs on whichever one you point it at, and that choice quietly shapes every session. Pick a model that is too small and a refactor stalls in confusion; reach for the biggest model on a one-line rename and you have spent reasoning (and money) you did not need. The skill is matching the model to the job.
The good news: switching is a single command, the lineup is small, and the trade-offs are legible once you have seen them named. This guide gives you the current roster, what each model is genuinely good at, and a decision framework you can apply without re-reading the docs every time.
A family, not a ladder
It is tempting to picture the models as a straight ladder where "bigger is always better." That is the wrong mental model. They are a family, each tuned to a different point on the curve between raw capability, response speed, and cost per token. A larger model thinks harder and verifies more, but it also costs more and answers slower. A smaller model is quick and cheap, but it will reach for a guess where a larger one would have investigated.
So the right question is never "which model is best?" It is "which model is best for this task, right now?" Most developers settle into a default for everyday work and deliberately escalate or downshift when a specific task calls for it.
| Model | Best for | Context | API price (in / out, USD per million tokens) | Speed |
|---|---|---|---|---|
| Fable 5 (claude-fable-5) | Hardest, longest autonomous sessions; deep investigation | 1M tokens | $10 / $50 | Deliberate |
| Opus 4.8 (claude-opus-4-8) | Complex reasoning, long-horizon agentic coding | 1M tokens | $5 / $25 | Moderate |
| Sonnet 4.6 (claude-sonnet-4-6) | Everyday coding: the speed/intelligence sweet spot | 1M tokens | $3 / $15 | Fast |
| Haiku 4.5 (claude-haiku-4-5) | Simple, scoped, latency-sensitive tasks | 200K tokens | $1 / $5 | Fastest |
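To make the price column concrete, here is a minimal sketch that estimates what one session would cost on each model. It assumes the table's figures are USD per million tokens; the function name and the example token counts are hypothetical and only serve the illustration.

```python
# Prices copied from the lineup table as (input, output) in USD.
# Assumption: the figures are per million tokens.
PRICES_PER_MTOK = {
    "claude-fable-5": (10.0, 50.0),
    "claude-opus-4-8": (5.0, 25.0),
    "claude-sonnet-4-6": (3.0, 15.0),
    "claude-haiku-4-5": (1.0, 5.0),
}

TOKENS_PER_MTOK = 1_000_000


def session_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the API cost in USD for one session on the given model."""
    price_in, price_out = PRICES_PER_MTOK[model]
    input_cost = (input_tokens / TOKENS_PER_MTOK) * price_in
    output_cost = (output_tokens / TOKENS_PER_MTOK) * price_out
    return input_cost + output_cost


if __name__ == "__main__":
    # Hypothetical session: 2M input tokens and 500K output tokens.
    for model in PRICES_PER_MTOK:
        print(f"{model}: ${session_cost(model, 2_000_000, 500_000):.2f}")
```

Running the same token counts through all four rows shows how far apart the tiers sit for an identical workload, which is the comparison to weigh against the engineering time a stronger model might save.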
How the generations evolved
Read the version numbers and the shape of the family comes into focus. The tiers — Opus, Sonnet, Haiku — name a size class: Opus the heavyweight reasoner, Sonnet the balanced workhorse, Haiku the speed specialist. The number after the name is the generation, and each bump folds in better reasoning, coding, and agentic behavior at the same tier. So Sonnet 4.6 is a sharper Sonnet than 4.5 was, not a different kind of model.
Fable 5 is the newcomer to the family. Rather than being "Opus, but more," it is positioned as Anthropic's most capable widely released model, purpose-built for work that spills past a single sitting. It sustains long autonomous sessions, investigates before it acts, and verifies its own work more often than the smaller models, which means it needs fewer "remember to test this" reminders from you. That self-direction is exactly what you want on an ambiguous, multi-hour task and exactly what you are overpaying for on a quick edit.
One practical wrinkle worth knowing: Fable 5 runs with safety classifiers for cybersecurity and biology content. When a request trips one, Claude Code automatically re-runs it on the default Opus model and shows a notice in the transcript. For most coding work you will never see this; for offensive-security or biology-adjacent codebases, expect it routinely.
Switching models with /model
You change models without leaving your session. Run /model with no argument to open a picker, or pass an alias directly — /model sonnet, /model opus, /model fable, /model haiku. Aliases are the friendly way to select a model without memorizing version numbers: on the Claude API, opus resolves to Opus 4.8 and sonnet resolves to Sonnet 4.6. They always point at the recommended version for your account, so they keep working as new generations ship.
A few aliases are worth keeping in your back pocket. default clears any override and reverts to the recommended model for your account type. best uses Fable 5 where your organization has access, otherwise the latest Opus. And opusplan is a clever hybrid: it uses Opus while you are in plan mode — for architecture and reasoning — then automatically switches to Sonnet for execution, giving you heavyweight planning and efficient implementation in one setting.
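The alias behavior described above can be modeled as a small lookup. This is only an illustrative sketch, not Claude Code's own resolution logic: the opus and sonnet targets come from the text, while the fable and haiku targets are an assumption based on the IDs in the lineup table, and `resolve_alias` and `has_fable_access` are hypothetical names. The `default` alias is left out because its target depends on account type.

```python
# Alias targets. The opus and sonnet entries come from the text; the fable
# and haiku entries are an assumption based on the IDs in the lineup table.
ALIASES = {
    "fable": "claude-fable-5",
    "opus": "claude-opus-4-8",
    "sonnet": "claude-sonnet-4-6",
    "haiku": "claude-haiku-4-5",
}


def resolve_alias(alias: str, has_fable_access: bool = False) -> str:
    """Map a /model alias to a model ID (illustrative only)."""
    if alias == "best":
        # best: Fable 5 where the organization has access, else the latest Opus.
        return ALIASES["fable"] if has_fable_access else ALIASES["opus"]
    if alias in ALIASES:
        return ALIASES[alias]
    raise ValueError(f"unknown alias: {alias!r}")
```

Because the real aliases track the recommended version for your account, a hard-coded table like this one goes stale as new generations ship; it is useful for reasoning about the behavior, not for pinning versions.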
Your choice is sticky. In current versions, switching with /model saves that model as the default for new sessions by writing the model field in your user settings; pick "switch for this session only" in the picker if you want the change to be temporary. You can also launch with the model preset (claude --model opus) or pin one permanently in your settings file.
The decision framework
Three forces pull on every model choice: capability (how hard the model can think and how reliably it verifies), speed (how fast you get an answer), and cost (what you pay per token). You cannot maximize all three; picking a model is choosing where on that triangle to sit for the task in front of you.
A workable rule of thumb: default to Sonnet for the bulk of day-to-day coding — it is fast, capable, and the cheapest of the 1M-context models, which makes it the right home base. Escalate to Opus when a task needs genuinely complex reasoning or spans a long agentic loop where a wrong turn early is expensive. Reach for Fable 5 when the work is larger than a single sitting, deeply ambiguous, or the kind of root-cause investigation where its extra self-verification earns its keep. Drop to Haiku for short, well-scoped, latency-sensitive tasks where near-frontier intelligence is plenty and you would rather have the answer now and cheap.
The cost framing matters because tokens add up across a long session, but do not let it dominate. The expensive mistake is rarely the model's price — it is shipping the wrong fix because you under-powered a hard task, or burning an afternoon babysitting a model that was too small to investigate on its own.
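One way to internalize the rule of thumb is to write it down as a function. The sketch below is a hedged illustration: the `Task` fields are hypothetical labels invented here, and the order of the checks is one reasonable reading of the guidance rather than an official algorithm.

```python
from dataclasses import dataclass


@dataclass
class Task:
    # All four fields are hypothetical labels for this sketch.
    multi_session: bool = False      # larger than a single sitting
    deeply_ambiguous: bool = False   # open-ended or root-cause investigation
    complex_reasoning: bool = False  # hard reasoning or a long agentic loop
    small_and_scoped: bool = False   # short, well-scoped, latency-sensitive


def pick_model(task: Task) -> str:
    """Return a /model alias following the rule of thumb."""
    if task.multi_session or task.deeply_ambiguous:
        return "fable"
    if task.complex_reasoning:
        return "opus"
    if task.small_and_scoped:
        return "haiku"
    return "sonnet"  # the everyday default
```

The escalation checks come before the downshift on purpose: if a task looks both small and ambiguous, the sketch prefers the stronger model, since a wrong fix tends to cost more than the price difference.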
Under-powered vs. right-sized
Wrong model for the job
A flaky, intermittent production bug handed to Haiku to save a few cents.
It patches the first plausible symptom, the bug returns the next day, and you have spent more time re-opening the task than the larger model would have cost. Speed bought nothing because the answer was wrong.
Matched to the task
The same bug handed to Fable 5 or Opus 4.8.
It reproduces first, traces the real cause across the deploy boundary, fixes it, and verifies — one pass, no re-opens. The higher per-token cost is dwarfed by the engineering time it saved.
All four models
Fable 5: The most capable widely released model, built for the hardest reasoning and long-horizon agentic work.
Opus 4.8: The most capable Opus-tier model for complex reasoning and agentic coding.
Sonnet 4.6: The best combination of speed and intelligence; the everyday workhorse.
Haiku 4.5: The fastest model with near-frontier intelligence, for snappy, high-volume work.
Building the habit
The developers who get the most from Claude Code treat the model as a dial, not a permanent setting. Pick a sensible default — Sonnet for most people — and build the reflex of turning the dial when a task is clearly above or below it. Before a big, ambiguous task, type /model fable or /model opus; for a trivial scoped edit, /model haiku is faster and cheaper. The cost of switching is one command; the cost of using the wrong model is measured in re-opened tasks and wasted tokens.
And remember the numbers in this guide are a snapshot. Model names, version numbers, and prices move as Anthropic ships new generations — the stable principle is the family structure (Opus heavy, Sonnet balanced, Haiku fast, Fable for the hardest work) and the capability-versus-speed-versus-cost trade-off. When the specifics matter, check /model and the official models overview rather than trusting your memory of last quarter's lineup.
The 1M-token window and the [1m] suffix
A model's context window is the working memory for the session: every file you open, every tool result, and the whole back-and-forth share that space. The 1M-token window gives five times the headroom of the standard 200K, which is what lets a model hold a large codebase, or a long, sprawling session, without thrashing against the limit. Fable 5, Opus 4.6 and later, and Sonnet 4.6 all support it; Haiku 4.5 stays at 200K.
The catch is that 1M is not always on. On the Claude API, Fable 5, Opus 4.8, and Opus 4.7 always run with the 1M window, with nothing to configure. Sonnet on the API also has full access. Subscription plans are where it gets conditional. Max, Team, and Enterprise plans auto-upgrade Opus to 1M context with no configuration (this covers both Team Standard and Team Premium seats), but Sonnet's 1M window is never part of that automatic upgrade; it requires usage credits on every subscription plan, Max included. On Pro, both Opus and Sonnet 1M need usage credits.
When your account supports it, the 1M option shows up in the /model picker. You can also request it explicitly with the [1m] suffix on an alias or a full model name: /model opus[1m], /model sonnet[1m], or /model claude-opus-4-8[1m]. Pricing carries no premium for the tokens beyond 200K: it is standard model pricing, billed to your subscription where extended context is included, or to usage credits where it is not. To take the 1M variants off the table entirely (they will vanish from the picker), set CLAUDE_CODE_DISABLE_1M_CONTEXT=1.
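The suffix convention is simple enough to sketch as a parser. The code below is an illustration under stated assumptions: `parse_model_spec` and `wants_one_m` are hypothetical helpers, and treating a set CLAUDE_CODE_DISABLE_1M_CONTEXT=1 as overriding an explicit [1m] request is a guess at the behavior, since the text only says the 1M variants disappear from the picker.

```python
ONE_M_SUFFIX = "[1m]"


def parse_model_spec(spec: str) -> tuple[str, bool]:
    """Split a spec such as 'opus[1m]' into its base name and a 1M flag."""
    if spec.endswith(ONE_M_SUFFIX):
        return spec[: -len(ONE_M_SUFFIX)], True
    return spec, False


def wants_one_m(spec: str, env: dict[str, str]) -> bool:
    """True if the spec asks for 1M and the disable variable is not set to 1.

    Assumption: the environment variable wins over an explicit suffix.
    """
    _, requested = parse_model_spec(spec)
    disabled = env.get("CLAUDE_CODE_DISABLE_1M_CONTEXT") == "1"
    return requested and not disabled
```

The same parser handles aliases and full model names alike, because the suffix is appended to whatever name you would otherwise pass to /model.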
Treat the bigger window the same way you treat any context: a ceiling, not a target. A 1M session that is 80% irrelevant files still reasons worse than a tight 150K one. The window buys you room before you must /compact; it does not buy you the right to skip curating what is in it.
| Plan | Opus with 1M | Sonnet with 1M |
|---|---|---|
| Max, Team, Enterprise | Included with subscription (auto-upgrade) | Requires usage credits |
| Pro | Requires usage credits | Requires usage credits |
| API / pay-as-you-go | Full access (always on for Fable 5, Opus 4.8, Opus 4.7) | Full access |
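The plan table above reads naturally as a nested lookup. This is a minimal sketch of that table and nothing more: the string labels and the `one_m_access` helper are hypothetical, and the API row is simplified to "full access" for both tiers.

```python
# Access levels from the plan table. The string labels are hypothetical.
INCLUDED = "included"
CREDITS = "usage credits"
FULL = "full access"

ONE_M_ACCESS = {
    "max": {"opus": INCLUDED, "sonnet": CREDITS},
    "team": {"opus": INCLUDED, "sonnet": CREDITS},
    "enterprise": {"opus": INCLUDED, "sonnet": CREDITS},
    "pro": {"opus": CREDITS, "sonnet": CREDITS},
    "api": {"opus": FULL, "sonnet": FULL},
}


def one_m_access(plan: str, tier: str) -> str:
    """Look up how a plan gets the 1M window for the opus or sonnet tier."""
    plan, tier = plan.lower(), tier.lower()
    if tier == "haiku":
        raise ValueError("Haiku 4.5 stays at 200K; there is no 1M variant")
    return ONE_M_ACCESS[plan][tier]
```

For example, `one_m_access("Max", "sonnet")` returns the usage-credits label, which is the case people most often get wrong: the automatic upgrade covers Opus only.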
Adaptive thinking vs. a fixed budget
The newer models do not think a fixed amount on every step; they decide, per step, whether to reason and how deeply, based on how hard the task in front of them is. That is adaptive reasoning, and it is why a trivial prompt comes back fast while a knotty one gets visibly more deliberation. Opus 4.7 and later (which includes Opus 4.8) and Fable 5 always use adaptive reasoning. You cannot switch it off on them: the fixed-budget mode and CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING simply do not apply, and on Fable 5 thinking cannot be turned off at all.
The lever you do have on the adaptive models is the effort level (/effort, the --effort flag, or CLAUDE_CODE_EFFORT_LEVEL), which sets how deeply the model reasons: low, medium, high, xhigh, or max. Running /effort auto (or setting the env var to auto) resets to the model's default rather than disabling thinking. For a one-off deep pass without changing your session setting, drop the keyword ultrathink anywhere in a prompt.
The older fixed-budget behavior lives on only on Opus 4.6 and Sonnet 4.6. On those two, you can set CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING=1 to revert to a fixed thinking budget governed by MAX_THINKING_TOKENS, a flat allowance applied every step regardless of difficulty. On the adaptive models that knob does nothing.
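Which thinking mode applies to which model can be summarized in a few lines. The sketch below encodes only what the text states: the model groups use display names rather than API IDs, `thinking_mode` is a hypothetical helper, and any model outside the two groups is rejected because the text does not describe it.

```python
# Model groups as described in the text, by display name.
ALWAYS_ADAPTIVE = {"Fable 5", "Opus 4.8", "Opus 4.7"}
FIXED_BUDGET_CAPABLE = {"Opus 4.6", "Sonnet 4.6"}


def thinking_mode(model: str, env: dict[str, str]) -> str:
    """Return 'adaptive' or 'fixed' for a model under the given environment."""
    if model in ALWAYS_ADAPTIVE:
        # The disable variable has no effect on these models.
        return "adaptive"
    if model in FIXED_BUDGET_CAPABLE:
        if env.get("CLAUDE_CODE_DISABLE_ADAPTIVE_THINKING") == "1":
            return "fixed"  # budget then governed by MAX_THINKING_TOKENS
        return "adaptive"
    raise ValueError(f"thinking mode not described for {model!r}")
```

The point the sketch makes explicit is that the environment variable is checked only for the two older models; setting it while running Opus 4.8 or Fable 5 changes nothing.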
Two kinds of "auto": effort reset and opusplan routing
"Auto" shows up in two unrelated places, and it pays to keep them straight. The first is the effort auto reset above: /effort auto drops you back to the model's default reasoning level. The second is the automatic routing in opusplan, a mode you select like any other model: /model opusplan. It is a clever split rather than a single model. While you are in plan mode, it uses opus for the architecture-and-reasoning work; the moment you move to execution, it automatically switches to sonnet for code generation. You get heavyweight planning and efficient implementation from one setting, without touching /model at the boundary.
The plan-mode Opus phase inherits the same context window as the plain opus setting, so on an auto-upgrade tier (Max, Team, Enterprise) it gets 1M context in plan mode automatically. If you are not on an auto-upgrade tier and want 1M for both phases, select opusplan[1m]. opusplan is the natural home base for anyone who plans before they code: you stop having to remember to size down after the design is settled.
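The opusplan split and its context-window rules can be sketched as one function. This is an illustration limited to subscription plans; `opusplan_route` and its parameters are hypothetical names, and the assumption that the execution phase gets 1M only with the explicit suffix follows from the text's statement that Sonnet's 1M window is never auto-upgraded.

```python
SUBSCRIPTION_PLANS = {"pro", "max", "team", "enterprise"}
AUTO_UPGRADE_PLANS = {"max", "team", "enterprise"}


def opusplan_route(
    in_plan_mode: bool, plan: str, one_m_suffix: bool = False
) -> tuple[str, bool]:
    """Return (alias, uses_1m_context) for the current opusplan phase."""
    plan = plan.lower()
    if plan not in SUBSCRIPTION_PLANS:
        raise ValueError("this sketch only covers subscription plans")
    if in_plan_mode:
        # The planning phase inherits the plain opus context window.
        return "opus", one_m_suffix or plan in AUTO_UPGRADE_PLANS
    # Execution phase: assumed to get 1M only via opusplan[1m].
    return "sonnet", one_m_suffix
```

On a Max plan without the suffix, the sketch returns a 1M window for planning and a standard window for execution, which matches the asymmetry between the two tiers described earlier.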
Two fallbacks, easy to confuse
Claude Code has two distinct "fall back to another model" behaviors, and they fire for completely different reasons. The first is the fallback model chain, an availability safety net. When your primary model is overloaded, unavailable, or returns a non-retryable server error, Claude Code can switch to a backup instead of failing the turn. You configure it with the --fallback-model flag (a comma-separated list) for one session, or the fallbackModel array in settings to persist it. The chain is capped at three models after de-duplication, the switch lasts only the current turn, and auth, billing, rate-limit, and request-size errors never trigger it.
The second is the Fable 5 to Opus auto-fallback, a content safety net that is entirely separate from the chain above. Fable 5 runs with safety classifiers for cybersecurity and biology content; when a request trips one, Claude Code re-runs it on the default Opus model (Opus 4.8 on the Claude API) and shows a notice in the transcript, then continues the session on that Opus model until you run /model fable again. This can fire on the very first request, since that request carries your workspace context (CLAUDE.md, git status, directory names), so a security- or biology-adjacent repo can trip the classifier before you have typed anything unusual. On offensive-security or biology codebases, expect this routinely; for ordinary application work you will likely never see it.
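The rules for the availability chain (de-duplicate, cap at three, trigger only on availability failures) can be sketched as follows. This is a hedged illustration, not Claude Code's implementation: the error labels are hypothetical, and silently truncating extra entries is an assumption, since the text states the cap but not how excess entries are handled.

```python
MAX_CHAIN_LENGTH = 3

# Hypothetical labels for the failures that trigger the chain. Auth, billing,
# rate-limit, and request-size errors are deliberately absent from this set.
TRIGGERS = {"overloaded", "unavailable", "non_retryable_server_error"}


def normalize_chain(raw: str) -> list[str]:
    """Parse a comma-separated list, drop duplicates in order, cap at three."""
    chain: list[str] = []
    for name in (part.strip() for part in raw.split(",")):
        if name and name not in chain:
            chain.append(name)
    return chain[:MAX_CHAIN_LENGTH]


def should_fall_back(error_kind: str) -> bool:
    """True only for availability failures; every other error kind is ignored."""
    return error_kind in TRIGGERS
```

Passing the value you would give to --fallback-model, such as `normalize_chain("sonnet,haiku")`, yields the ordered list of backups. The content-based Fable 5 to Opus fallback is intentionally not modeled here, because it is built in and not configured through this chain.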
The two fallbacks side by side
Fallback model chain (availability)
Fires when the primary model is unavailable — overloaded or a non-retryable server error.
Configured by you via --fallback-model sonnet,haiku or the fallbackModel settings array. Capped at three models; lasts the current turn only; never triggered by auth, billing, rate-limit, or request-size errors.
Fable 5 → Opus auto-fallback (content)
Fires when a safety classifier flags a cybersecurity or biology request on Fable 5.
Not configured; it is built in. Re-runs on the default Opus model (Opus 4.8 on the Claude API), shows a transcript notice, and continues on Opus until you run /model fable. Can trip on the first request via workspace context.