The Cartographer · 11 min mission

What Is Gemini CLI: Google's Open-Source Terminal Agent

Understand what Gemini CLI actually is, how its agentic loop works, and the three doors to Google's Gemini models.

gemini-cliagentic-codinggooglefundamentalsgemini-3Fact-checked 2026-06-15

On this page

The agentic loop
Authenticating — three backends, same models
Models, defaults, and Auto routing

Gemini CLI is Google's open-source (Apache-2.0) AI agent for the terminal: you state a goal, and it runs an agent loop — calling a Gemini model, reading and editing files, running shell commands, and searching the web — pausing for your approval before any change. After this guide you can install it, start a session, authenticate against any of its three backends, select a model, and avoid the env-var and date traps below.

The agentic loop

The source is two packages. packages/cli is the terminal frontend you type into; packages/core is the backend that talks to the Gemini API, manages tools, and processes each request. A turn runs as a loop: core assembles context (history + tool definitions + discovered GEMINI.md files) and sends it to the model; the model replies with an answer or a tool call (read file, run command, search); core executes the tool and returns the result; the model decides the next step. This repeats until the task is done.

Two mechanisms keep the loop safe and durable. Mutating tools require confirmation — before a file write (write_file, replace) or a shell command (run_shell_command), the CLI shows the diff or exact command and waits; read-only tools (grep_search, read_file) run silently. And when a conversation nears the model's token limit, core auto-compresses history (documented as "lossless in terms of information conveyed") so long tasks do not run out of room.

one task, one loop

… scroll to run this session

Read-only tools (`grep_search`, `read_file`) run silently; the file write pauses for a y/N confirmation before applying the diff.

Install and start a session

Run it without installing
Run npx @google/gemini-cli to try it with no global install.
Or install globally
npm: npm install -g @google/gemini-cli. macOS/Linux Homebrew: brew install gemini-cli. macOS MacPorts: sudo port install gemini-cli.
Launch in your project
From the repo root, run gemini. It opens an interactive session scoped to the current directory.
Or run one-shot / headless
Non-interactive prompt: gemini -p "Explain the architecture of this codebase". Add --output-format json for structured output, or --output-format stream-json for newline-delimited JSON events. Add directories with gemini --include-directories ../lib,../docs.

Authenticating — three backends, same models

The model and the quota come from one of three backends, selected by env vars. The same gemini binary serves all three; only billing, limits, and compliance differ.

Backend	How you select it	Free tier / best for
Google sign-in (OAuth)	Browser login on first run; optional `export GOOGLE_CLOUD_PROJECT="…"`	60 req/min + 1,000 req/day; individual devs, no key management
Gemini API key (AI Studio)	`export GEMINI_API_KEY="…"` (key from `aistudio.google.com/apikey`)	1,000 req/day on Gemini 3; direct model control + paid tier
Vertex AI (Google Cloud)	`export GOOGLE_API_KEY="…"` and `export GOOGLE_GENAI_USE_VERTEXAI=true`	Enterprise/production: higher limits, compliance

The three backends and the exact env vars that select each. The Vertex flag is a frequent footgun — see the warning below.

Models, defaults, and Auto routing

Gemini CLI defaults to Gemini 3 (launched 2025-11-18) with the 1M-token context window. The current recommended Pro model ID is gemini-3.1-pro-preview — 1,048,576 input / 65,536 output tokens, knowledge cutoff January 2025, reasoning ("Thinking") supported. You rarely type a model ID: run /model and pick Auto (Gemini 3) (requires v0.21.1+), or launch one directly with -m, e.g. gemini -m gemini-3.1-pro-preview or gemini -m gemini-2.5-flash.

Auto routing classifies the prompt — simple prompts go to gemini-2.5-flash, complex ones to Gemini 3 Pro. There is also a rate-limit fallback: when the "pro" model is throttled, the session silently switches to "flash." A hard reasoning task can therefore land on a lighter model — if you need top-tier reasoning, select Pro via /model or pin it with -m.

Pick a model and see what routing selects

Which Claude model?

Three quick questions about your task, your tolerance for latency, and your budget — and you'll get a single model to reach for, with the reasoning behind it. All four current models are in the legend below.

0/3

Question 1 of 3

How hard is the task?

All four models

Claude Fable 5$10 / $50 per MTok

The most capable widely released model — built for the hardest reasoning and long-horizon agentic work.

Claude Opus 4.8$5 / $25 per MTok

The most capable Opus-tier model for complex reasoning and agentic coding.

Claude Sonnet 4.6$3 / $15 per MTok

The best combination of speed and intelligence — the everyday workhorse.

Claude Haiku 4.5$1 / $5 per MTok

The fastest model with near-frontier intelligence — for snappy, high-volume work.

Compare model tiers and routing outcomes. In Gemini CLI the equivalent control is `/model` → Auto (Gemini 3) or launching with `-m`.

Channel	Cadence	Install tag	Current version
Stable	Weekly — Tue 20:00 UTC	`@latest`	`v0.46.0` (2026-06-10)
Preview	Weekly — Tue 23:59 UTC	`@preview`	`v0.47.0-preview.0`
Nightly	Daily — 00:00 UTC	`@nightly`	`v0.48.0-nightly.*`

Release channels, install tags, and current versions verified 2026-06-15 against the GitHub Releases API. Stable is the default for real work.

Knowledge check

You ask Gemini CLI to "refactor this module and rerun the tests." It reads several files, then pauses showing you a diff before doing anything else. Why did it stop there?

Reach the end and this star joins your charted sky.