The Cartographer · 11 min mission
What Is Gemini CLI: Google's Open-Source Terminal Agent
Understand what Gemini CLI actually is, how its agentic loop works, and the three doors to Google's Gemini models.
On this page
Gemini CLI is Google's open-source (Apache-2.0) AI agent for the terminal: you state a goal, and it runs an agent loop — calling a Gemini model, reading and editing files, running shell commands, and searching the web — pausing for your approval before any change. After this guide you can install it, start a session, authenticate against any of its three backends, select a model, and avoid the env-var and date traps below.
The agentic loop
The source is two packages. packages/cli is the terminal frontend you type into; packages/core is the backend that talks to the Gemini API, manages tools, and processes each request. A turn runs as a loop: core assembles context (history + tool definitions + discovered GEMINI.md files) and sends it to the model; the model replies with an answer or a tool call (read file, run command, search); core executes the tool and returns the result; the model decides the next step. This repeats until the task is done.
Two mechanisms keep the loop safe and durable. Mutating tools require confirmation — before a file write (write_file, replace) or a shell command (run_shell_command), the CLI shows the diff or exact command and waits; read-only tools (grep_search, read_file) run silently. And when a conversation nears the model's token limit, core auto-compresses history (documented as "lossless in terms of information conveyed") so long tasks do not run out of room.
Install and start a session
Run it without installing
Run
npx @google/gemini-clito try it with no global install.Or install globally
npm:
npm install -g @google/gemini-cli. macOS/Linux Homebrew:brew install gemini-cli. macOS MacPorts:sudo port install gemini-cli.Launch in your project
From the repo root, run
gemini. It opens an interactive session scoped to the current directory.Or run one-shot / headless
Non-interactive prompt:
gemini -p "Explain the architecture of this codebase". Add--output-format jsonfor structured output, or--output-format stream-jsonfor newline-delimited JSON events. Add directories withgemini --include-directories ../lib,../docs.
Authenticating — three backends, same models
The model and the quota come from one of three backends, selected by env vars. The same gemini binary serves all three; only billing, limits, and compliance differ.
| Backend | How you select it | Free tier / best for |
|---|---|---|
| Google sign-in (OAuth) | Browser login on first run; optional export GOOGLE_CLOUD_PROJECT="…" | 60 req/min + 1,000 req/day; individual devs, no key management |
| Gemini API key (AI Studio) | export GEMINI_API_KEY="…" (key from aistudio.google.com/apikey) | 1,000 req/day on Gemini 3; direct model control + paid tier |
| Vertex AI (Google Cloud) | export GOOGLE_API_KEY="…" and export GOOGLE_GENAI_USE_VERTEXAI=true | Enterprise/production: higher limits, compliance |
Models, defaults, and Auto routing
Gemini CLI defaults to Gemini 3 (launched 2025-11-18) with the 1M-token context window. The current recommended Pro model ID is gemini-3.1-pro-preview — 1,048,576 input / 65,536 output tokens, knowledge cutoff January 2025, reasoning ("Thinking") supported. You rarely type a model ID: run /model and pick Auto (Gemini 3) (requires v0.21.1+), or launch one directly with -m, e.g. gemini -m gemini-3.1-pro-preview or gemini -m gemini-2.5-flash.
Auto routing classifies the prompt — simple prompts go to gemini-2.5-flash, complex ones to Gemini 3 Pro. There is also a rate-limit fallback: when the "pro" model is throttled, the session silently switches to "flash." A hard reasoning task can therefore land on a lighter model — if you need top-tier reasoning, select Pro via /model or pin it with -m.
Pick a model and see what routing selects
Which Claude model?
Three quick questions about your task, your tolerance for latency, and your budget — and you'll get a single model to reach for, with the reasoning behind it. All four current models are in the legend below.
All four models
The most capable widely released model — built for the hardest reasoning and long-horizon agentic work.
The most capable Opus-tier model for complex reasoning and agentic coding.
The best combination of speed and intelligence — the everyday workhorse.
The fastest model with near-frontier intelligence — for snappy, high-volume work.
| Channel | Cadence | Install tag | Current version |
|---|---|---|---|
| Stable | Weekly — Tue 20:00 UTC | @latest | v0.46.0 (2026-06-10) |
| Preview | Weekly — Tue 23:59 UTC | @preview | v0.47.0-preview.0 |
| Nightly | Daily — 00:00 UTC | @nightly | v0.48.0-nightly.* |
Knowledge check
You ask Gemini CLI to "refactor this module and rerun the tests." It reads several files, then pauses showing you a diff before doing anything else. Why did it stop there?
Reach the end and this star joins your charted sky.