The Cartographer · 11 min mission
Prompting Gemini CLI Well
Fill the 1M-token window, place your ask last, and turn a good prompt into a script.
Gemini CLI (@google/gemini-cli, current stable v0.46.0, released 2026-06-10) is a terminal agent over a Gemini model with a 1,000,000-token context window. This guide shows how to prompt it: how to place context for that window, when to inject files explicitly with @ vs. letting the agent explore, and how to use GEMINI.md, multimodal inputs, Plan Mode, and the headless surface that turns any prompt into a script.
| Alias | Resolves to | Use |
|---|---|---|
| auto | gemini-2.5-pro or gemini-3-pro-preview (default) | Routes per prompt: simple → Flash, complex → Pro |
| pro | gemini-2.5-pro or gemini-3-pro-preview | Always prefers the most capable model |
| flash | gemini-2.5-flash | Fast, balanced |
| flash-lite | gemini-2.5-flash-lite | Fastest, simple tasks |
Place context first, the ask last
Google's prompt-design guidance for large contexts is explicit: "When providing large amounts of context (e.g., documents, code), supply all the context first. Place your specific instructions or questions at the very end of the prompt," then bridge with a phrase like "Based on the information above…". Two more documented habits raise hit rate: give clear, specific instructions, and include few-shot examples with consistent formatting ("prompts without few-shot examples are likely to be less effective"; mismatched example formats produce mismatched output).
Question-first vs. context-first
Question first (degrades)
Why does the total round wrong? @src/checkout/ @src/pricing/
The ask sits ahead of tens of thousands of tokens the model reads afterward, which is the opposite of the order Google's guidance recommends for large contexts.
Context first, ask last (recommended)
@src/checkout/ @src/pricing/ @tests/checkout.test.ts
Based on the files above: the order total rounds up by one cent on multi-item carts. Find the rounding bug and fix it without changing the public pricing API.
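The placement rule is easy to encode once a prompt becomes a script. The following is a minimal Python sketch, not part of Gemini CLI: the helper name build_context_first_prompt and its default bridge phrase are illustrative assumptions, and only the @path syntax and the context-then-ask ordering come from the guidance above.

```python
def build_context_first_prompt(paths, ask, bridge="Based on the files above:"):
    """Return a prompt with all @-references first and the ask last.

    Hypothetical helper: only the @<path> syntax and the ordering rule
    come from the guide; the name and bridge default are assumptions.
    """
    if not paths:
        raise ValueError("expected at least one path to inject")
    # All injected context goes on the first line.
    context_line = " ".join(f"@{p}" for p in paths)
    # The bridge phrase and the ask close the prompt.
    return f"{context_line}\n{bridge} {ask.strip()}"


prompt = build_context_first_prompt(
    ["src/checkout/", "src/pricing/", "tests/checkout.test.ts"],
    "Find the rounding bug and fix it without changing the public pricing API.",
)
print(prompt)
```

Keeping the ordering inside one function means every scripted prompt follows the same layout, whichever files are injected.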
Upgrade a vague prompt
A vague ask makes the model guess. A directed prompt adds four parts: outcome, evidence, constraints, and acceptance criteria. The example below upgrades one vague prompt, then gives a one-line reason each part earns its place.
Before — vague
fix the login bug
The model has to guess the file, the scope, and what “done” means — so it often guesses wrong.
After — directed
- Outcome: Make sign-in succeed for valid credentials again on the /login page.
- Evidence: The regression is in src/auth/login.ts; a failing repro is in tests/auth/login.test.ts.
- Constraints: Touch only the auth module. Do not change the public API or the session-token shape.
- Acceptance criteria: npm test passes, including the previously failing login case, with no new lint warnings.
Outcome
Names the finished result, so the model optimizes for done — not for sounding busy.
Evidence
Points at the real files, data, or context to ground the work and stop it inventing facts.
Constraints
Draws the guardrails up front, so the model stays inside scope instead of gold-plating.
Acceptance criteria
Defines "passing" as a checkable test, so you both know when the task is actually finished.
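The four parts can also be treated as a small template, so a script refuses to send a prompt that is missing one. This Python sketch is illustrative only: the DirectedPrompt class and its field names are hypothetical and simply mirror the labels above.

```python
from dataclasses import dataclass


@dataclass
class DirectedPrompt:
    """The four parts named above; the class itself is a hypothetical helper."""

    outcome: str
    evidence: str
    constraints: str
    acceptance: str

    def render(self) -> str:
        parts = [
            ("Outcome", self.outcome),
            ("Evidence", self.evidence),
            ("Constraints", self.constraints),
            ("Acceptance criteria", self.acceptance),
        ]
        # Refuse to render a prompt with an empty part.
        missing = [label for label, text in parts if not text.strip()]
        if missing:
            raise ValueError(f"missing parts: {', '.join(missing)}")
        return "\n".join(f"{label}: {text.strip()}" for label, text in parts)


login_fix = DirectedPrompt(
    outcome="Make sign-in succeed for valid credentials again on the /login page.",
    evidence="The regression is in src/auth/login.ts; a failing repro is in tests/auth/login.test.ts.",
    constraints="Touch only the auth module. Do not change the public API or the session-token shape.",
    acceptance="npm test passes, including the previously failing login case, with no new lint warnings.",
)
print(login_fix.render())
```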
Explicit context: the @ command
@<path> injects file or directory content into the current prompt: it runs the read_many_files tool and inserts the content before sending your turn. Position is flexible (@README.md What is this? and What is this? @README.md both work), but for large injections put the instruction last per the placement rule. A bare directory like @src/ is recursive (it pulls that directory and all subdirectories). Git-aware filtering is on by default — node_modules/, dist/, .env, and .git/ are excluded (tunable via context.fileFiltering), and a .geminiignore file is also respected. A lone @ with no path passes the query through unchanged.
| You type | What the agent receives |
|---|---|
| @src/auth/session.ts Explain the token flow. | One file injected, then your question |
| @src/ Summarize this codebase. | Every non-ignored file in src/ and subdirectories |
| @src/components/UserProfile.tsx @src/types/User.ts Refactor to the new User type. | Two files chained for cross-file reasoning |
| @My\ Documents/spec.md Implement section 3. | A path with a space, escaped with a backslash |
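When prompts are generated by a script, the space-escaping shown in the last row is easy to forget. Here is a small Python sketch, with a hypothetical helper name, that covers only the space case from the table above and leaves any other special characters untouched.

```python
def at_reference(path: str) -> str:
    """Format a path as an @ reference, backslash-escaping spaces.

    Hypothetical helper: it handles only the space case shown in the
    table above; other special characters are left untouched.
    """
    return "@" + path.replace(" ", "\\ ")


print(at_reference("My Documents/spec.md") + " Implement section 3.")
```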
Multimodal: images, PDFs, sketches
The read_file tool supports text, images, audio, and PDF, so referencing a supported asset by path injects it as multimodal input the model can see. Google's README lists "Generate new apps from PDFs, images, or sketches using multimodal capabilities," and the official walkthrough has the agent visually analyze .png files and rename them by content (photo1.png → yellow_flowers.png). The mechanism is identical to text @: reference the asset by path, put the instruction last.
Explicit vs. exploratory steering
Explicit (you know where)
Name the files and constraints:
@src/api/auth.ts @src/api/middleware.ts Based on the above, add rate limiting to the login route. Keep the existing error-response shape.
Fast, cheap, predictable. Use when you can point at the work.
Exploratory (you know the goal)
State the goal and let it glob/grep/read:
Login occasionally rejects valid sessions under load. Investigate where this comes from and propose a fix.
Start in Plan Mode so it researches read-only first. Use when the cause is unknown.
Iterate with Plan Mode and mid-turn steering
Open Plan Mode for non-trivial work
Plan Mode (enabled by default) is read-only research and design: it reads and proposes, but it does not edit. Enter it with gemini --approval-mode=plan, the /plan [goal] command, cycling Shift+Tab through Default → Auto-Edit → Plan, or natural language ("start a plan for…", which calls the enter_plan_mode tool and is unavailable in YOLO mode). It produces a Markdown plan to review before any change.
Steer while it works
Model steering (experimental) lets you type hints while the agent is working to redirect it mid-turn, e.g. "don't forget to check packages/common/queues" or "use a Pub/Sub pattern instead of a queue." Be specific and steer early: redirecting during research is cheaper than rewriting after the draft.
Edit and approve the plan
Open the generated plan in your external editor with Ctrl+X, adjust it, then approve and let the agent execute, or press Esc to cancel.
Keep the session context clean
/compress replaces the entire chat history with a summary to reclaim tokens; /clear (Ctrl+L) clears the screen; /chat and /resume manage saved checkpoints. When you start an unrelated task, compress or clear so stale context does not crowd the window.
Standing context: GEMINI.md
GEMINI.md is the default context file the CLI concatenates and sends with every prompt. Load order: (1) global ~/.gemini/GEMINI.md; (2) workspace files in your configured directories and their parents; (3) just-in-time files discovered in a directory when a tool touches it. The footer shows how many context files are loaded. Use it for standing rules — build/test commands, conventions, hard "never do this" constraints — and modularize large files with @file.md imports. Override the filename(s) via context.fileName in settings.json (e.g. ["AGENTS.md", "CONTEXT.md", "GEMINI.md"]).
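The file-name override could be generated rather than typed by hand. This Python sketch assumes that the dotted name context.fileName corresponds to a nested "context" object with a "fileName" key in settings.json; that nesting is inferred from the name, and the helper name is hypothetical.

```python
import json


def context_file_settings(names):
    """Build the settings.json fragment that overrides the context file name(s).

    Assumption: the dotted name context.fileName from the text maps to a
    nested {"context": {"fileName": ...}} object in settings.json.
    """
    if not names:
        raise ValueError("expected at least one context file name")
    return {"context": {"fileName": list(names)}}


fragment = context_file_settings(["AGENTS.md", "CONTEXT.md", "GEMINI.md"])
print(json.dumps(fragment, indent=2))
```

Merge the printed fragment into an existing settings.json rather than overwriting the file, so other settings survive.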
The same prompt, headless
Gemini CLI enters headless mode automatically in a non-TTY environment, or whenever you pass a query with -p / --prompt. gemini -p "query" runs non-interactively and prints to stdout; cat file | gemini processes piped stdin. Crucially, -p appends to stdin rather than replacing it, so git diff | gemini -p "Write a commit message for these changes" feeds the diff and your instruction together. For machine-readable output, --output-format json (-o json) returns one { response, stats, error? } object — extract clean text with jq -r '.response'. stream-json emits newline-delimited events (init, message, tool_use, tool_result, error, result).
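For scripts that are not shell pipelines, the same extraction that jq performs can be done in a few lines of Python. This sketch assumes only the object shape stated above (response, stats, optional error) and treats the error value as opaque. The exception name and the sample payload are invented for illustration.

```python
import json


class HeadlessError(RuntimeError):
    """Raised when the JSON object carries an error field (hypothetical name)."""


def extract_response(raw: str) -> str:
    """Return the response text from --output-format json output.

    Assumes only the shape stated above: one object with response,
    stats, and an optional error. The error value is treated as opaque.
    """
    data = json.loads(raw)
    if data.get("error"):
        raise HeadlessError(f"gemini reported an error: {data['error']!r}")
    return data["response"]


# Stand-in for captured stdout; the payload here is invented.
sample = '{"response": "fix: round totals once per cart", "stats": {}}'
print(extract_response(sample))
```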
| Flag / signal | Effect |
|---|---|
| -p / --prompt | Non-interactive; appended to stdin if both are provided |
| -i / --prompt-interactive | Execute the query, then continue interactively |
| -o / --output-format | text (default) · json · stream-json |
| --approval-mode | default · auto_edit · yolo · plan (--yolo/-y is deprecated) |
| -r / --resume | Resume a session by "latest" or index/id, e.g. gemini -r "latest" "Check for type errors" |
| exit 0 / 1 / 42 / 53 | success / general or API error / input error / turn limit exceeded |
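The flags and exit codes in the table can be wrapped in a thin Python layer so a calling script reacts to failures by meaning rather than by number. This is a sketch under stated assumptions: the function names are hypothetical, only flags listed in the table are used, and run_headless is defined but never called here because it needs the gemini binary on PATH.

```python
import subprocess

# Exit codes as listed in the table above.
EXIT_MEANINGS = {
    0: "success",
    1: "general or API error",
    42: "input error",
    53: "turn limit exceeded",
}


def headless_argv(prompt, output_format="json", approval_mode=None):
    """Build the argument list for a non-interactive run."""
    argv = ["gemini", "-p", prompt, "--output-format", output_format]
    if approval_mode is not None:
        argv.append(f"--approval-mode={approval_mode}")
    return argv


def describe_exit(code):
    """Map an exit code to the table's meaning, with a fallback."""
    return EXIT_MEANINGS.get(code, f"unlisted exit code {code}")


def run_headless(prompt, stdin_text=None, **options):
    """Run gemini once; requires the CLI on PATH. Not called in this sketch."""
    proc = subprocess.run(
        headless_argv(prompt, **options),
        input=stdin_text,  # piped content, e.g. the text of a git diff
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


print(headless_argv("Write a commit message for these changes"))
print(describe_exit(53))
```

Passing the arguments as a list avoids shell quoting problems when the prompt itself contains quotes or spaces.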
Knowledge check
You inject a whole module plus a 14-page PDF spec, then ask one question. Where should the question go for the most accurate answer?