The Cartographer · 11 min mission

Prompting Gemini CLI Well

Fill the 1M-token window, place your ask last, and turn a good prompt into a script.

gemini-cli · prompting · context-window · multimodal · headless · Fact-checked 2026-06-15

On this page

Place context first, the ask last
Explicit context: the @ command
Multimodal: images, PDFs, sketches
Standing context: GEMINI.md
The same prompt, headless

Gemini CLI (@google/gemini-cli, current stable v0.46.0, released 2026-06-10) is a terminal agent over a Gemini model with a 1,000,000-token context window. This guide shows how to prompt it: how to place context for that window, when to inject files explicitly with @ vs let the agent explore, how to use GEMINI.md, multimodal inputs, Plan Mode, and the headless surface that turns any prompt into a script.

Alias	Resolves to	Use
`auto`	`gemini-2.5-pro` or `gemini-3-pro-preview` (default)	Routes per prompt: simple → Flash, complex → Pro
`pro`	`gemini-2.5-pro` or `gemini-3-pro-preview`	Always prefers the most capable model
`flash`	`gemini-2.5-flash`	Fast, balanced
`flash-lite`	`gemini-2.5-flash-lite`	Fastest, simple tasks

Select a model with `--model`/`-m`; the default is `auto`. The aliases are taken from cli-reference.md. The 1M context window is documented for "Gemini 3 models" collectively; it is not guaranteed per individual model id.

Place context first, the ask last

Google's prompt-design guidance for large contexts is explicit: "When providing large amounts of context (e.g., documents, code), supply all the context first. Place your specific instructions or questions at the very end of the prompt," then bridge with a phrase like "Based on the information above…". Two more documented habits raise hit rate: give clear, specific instructions, and include few-shot examples with consistent formatting ("prompts without few-shot examples are likely to be less effective"; mismatched example formats produce mismatched output).
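One way to keep few-shot examples consistent is to render every example through the same template. The following is a minimal sketch, not an official helper: the function name and the `Input:`/`Output:` labels are hypothetical, since the guidance asks only for a consistent format, not for these particular labels.

```python
def few_shot_prompt(context: str, examples: list[tuple[str, str]], query: str) -> str:
    """Render context first, then identically formatted examples, then the open query."""
    parts = [context.strip(), ""]
    for given, expected in examples:
        # Every example uses the same two-line shape so the model can mirror it.
        parts.append(f"Input: {given}")
        parts.append(f"Output: {expected}")
        parts.append("")
    # The unanswered query comes last and reuses the example format.
    parts.append("Based on the information above:")
    parts.append(f"Input: {query}")
    parts.append("Output:")
    return "\n".join(parts)
```

Because the examples and the final query share one template, a format mismatch can only come from the data, not from hand-typed variations.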

Question-first vs. context-first

Question first (degrades)

Why does the total round wrong? @src/checkout/ @src/pricing/

The ask sits ahead of tens of thousands of tokens the model reads afterward. With a large context this ordering tends to hurt the answer, which is why Google's guidance puts the question at the end.

Context first, ask last (recommended)

@src/checkout/ @src/pricing/ @tests/checkout.test.ts

Based on the files above: the order total rounds up by one cent on multi-item carts. Find the rounding bug and fix it without changing the public pricing API.
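If you assemble prompts in a script, the placement rule can be enforced mechanically. This is a small sketch under stated assumptions: `context_first_prompt` is a hypothetical helper, and the bridge phrase is adapted from the "Based on the information above…" wording quoted earlier.

```python
def context_first_prompt(paths: list[str], ask: str) -> str:
    """Put every @ reference first and the instruction last."""
    if not paths:
        raise ValueError("expected at least one path to inject")
    refs = " ".join(f"@{p}" for p in paths)
    # The bridge phrase ties the ask back to the injected context.
    return f"{refs}\n\nBased on the files above: {ask.strip()}"


prompt = context_first_prompt(
    ["src/checkout/", "src/pricing/", "tests/checkout.test.ts"],
    "Find the rounding bug and fix it without changing the public pricing API.",
)
```

The helper only builds the string; the CLI still resolves each `@` reference when the prompt is sent.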

Upgrade a vague prompt

Prompt upgrader

A vague ask makes the model guess. Pick a prompt below — or type your own — and watch it become a directed one: outcome, evidence, constraints, acceptance criteria. Each added part is highlighted, with a one-line reason it earns its place.

Before — vague

fix the login bug

The model has to guess the file, the scope, and what “done” means — so it often guesses wrong.

After — directed

Outcome: Make sign-in succeed for valid credentials again on the /login page.
Evidence: The regression is in src/auth/login.ts; a failing repro is in tests/auth/login.test.ts.
Constraints: Touch only the auth module. Do not change the public API or the session-token shape.
Acceptance criteria: npm test passes, including the previously failing login case, with no new lint warnings.

Outcome

Names the finished result, so the model optimizes for done — not for sounding busy.

Evidence

Points at the real files, data, or context to ground the work and stop it inventing facts.

Constraints

Draws the guardrails up front, so the model stays inside scope instead of gold-plating.

Acceptance criteria

Defines "passing" as a checkable test, so you both know when the task is actually finished.

Toggle the levers — outcome, the files/assets to inject, constraints, and the acceptance check — on a vague ask and watch it become a directed brief with the instruction placed last, each addition explained.
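The four parts of a directed brief can also be treated as a checklist in code. The sketch below is illustrative only: the `Brief` class and its methods are hypothetical names, not part of Gemini CLI, and the rendered layout is just one reasonable choice.

```python
from dataclasses import dataclass


@dataclass
class Brief:
    outcome: str
    evidence: str
    constraints: str
    acceptance: str

    def missing(self) -> list[str]:
        """Return the names of parts that are still blank."""
        return [name for name, value in vars(self).items() if not value.strip()]

    def render(self) -> str:
        """Emit the four parts in a fixed order, one labelled line each."""
        return "\n".join([
            f"Outcome: {self.outcome}",
            f"Evidence: {self.evidence}",
            f"Constraints: {self.constraints}",
            f"Acceptance criteria: {self.acceptance}",
        ])


brief = Brief(
    outcome="Make sign-in succeed for valid credentials again on the /login page.",
    evidence="The regression is in src/auth/login.ts.",
    constraints="Touch only the auth module.",
    acceptance="npm test passes, including the previously failing login case.",
)
```

Calling `brief.missing()` before sending is a cheap guard against falling back to a vague ask.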

Explicit context: the `@` command

@<path> injects file or directory content into the current prompt: it runs the read_many_files tool and inserts the content before sending your turn. Position is flexible (@README.md What is this? and What is this? @README.md both work), but for large injections put the instruction last per the placement rule. A bare directory like @src/ is recursive (it pulls that directory and all subdirectories). Git-aware filtering is on by default — node_modules/, dist/, .env, and .git/ are excluded (tunable via context.fileFiltering), and a .geminiignore file is also respected. A lone @ with no path passes the query through unchanged.

You type	What the agent receives
`@src/auth/session.ts Explain the token flow.`	One file injected, then your question
`@src/ Summarize this codebase.`	Every non-ignored file in `src/` and subdirectories
`@src/components/UserProfile.tsx @src/types/User.ts Refactor to the new User type.`	Two files chained for cross-file reasoning
`@My\ Documents/spec.md Implement section 3.`	A path with a space, escaped with a backslash

The @ command in practice. A directory reference is recursive; git-ignored paths are filtered by default; binary or very large files are skipped or truncated by read_many_files, which reports what it skipped.
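When a script generates `@` references, paths with spaces need the backslash escape shown in the table. A minimal sketch, assuming a hypothetical `at_ref` helper; it handles only spaces, because that is the only escaping case documented here, and other special characters are out of scope.

```python
def at_ref(path: str) -> str:
    """Build an @ reference, backslash-escaping spaces in the path."""
    if not path:
        # A lone @ passes the query through unchanged, so it injects nothing.
        raise ValueError("pass a non-empty path")
    return "@" + path.replace(" ", "\\ ")


prompt = f"{at_ref('My Documents/spec.md')} Implement section 3."
```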

Multimodal: images, PDFs, sketches

The read_file tool supports text, images, audio, and PDF, so referencing a supported asset by path injects it as multimodal input the model can see. Google's README lists "Generate new apps from PDFs, images, or sketches using multimodal capabilities," and the official walkthrough has the agent visually analyze .png files and rename them by content (photo1.png → yellow_flowers.png). The mechanism is identical to text @: reference the asset by path, put the instruction last.

prompting from a mockup and a spec


Two multimodal turns: a PNG mockup becomes a component, then one section of a PDF spec becomes an endpoint. The asset is referenced by path so the model sees it; the instruction is placed last.

Explicit vs. exploratory steering

Explicit (you know where)

Name the files and constraints:

@src/api/auth.ts @src/api/middleware.ts Based on the above, add rate limiting to the login route. Keep the existing error-response shape.

Fast, cheap, predictable. Use when you can point at the work.

Exploratory (you know the goal)

State the goal and let it glob/grep/read:

Login occasionally rejects valid sessions under load. Investigate where this comes from and propose a fix.

Start in Plan Mode so it researches read-only first. Use when the cause is unknown.

Iterate with Plan Mode and mid-turn steering

Open Plan Mode for non-trivial work
Plan Mode, which is enabled by default, is read-only research and design: it reads and proposes but does not edit. Enter it with gemini --approval-mode=plan, the /plan [goal] command, cycling Shift+Tab through Default → Auto-Edit → Plan, or natural language ("start a plan for…", which calls the enter_plan_mode tool and is unavailable in YOLO mode). It produces a Markdown plan to review before any change.
Steer while it works
Model steering (experimental) lets you type hints while the agent is working to redirect it mid-turn — e.g. "don't forget to check packages/common/queues" or "use a Pub/Sub pattern instead of a queue." Be specific and steer early: redirecting during research is cheaper than rewriting after the draft.
Edit and approve the plan
Open the generated plan in your external editor with Ctrl+X, adjust it, then approve and let the agent execute — or Esc to cancel.
Keep the session context clean
/compress replaces the entire chat history with a summary to reclaim tokens; /clear (Ctrl+L) clears the screen; /chat and /resume manage saved checkpoints. When you start an unrelated task, compress or clear so stale context does not crowd the window.

Standing context: `GEMINI.md`

GEMINI.md is the default context file the CLI concatenates and sends with every prompt. Load order: (1) global ~/.gemini/GEMINI.md; (2) workspace files in your configured directories and their parents; (3) just-in-time files discovered in a directory when a tool touches it. The footer shows how many context files are loaded. Use it for standing rules — build/test commands, conventions, hard "never do this" constraints — and modularize large files with @file.md imports. Override the filename(s) via context.fileName in settings.json (e.g. ["AGENTS.md", "CONTEXT.md", "GEMINI.md"]).
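The filename override can be generated rather than hand-edited. The sketch below rests on an assumption worth checking against your own settings.json: that the dotted key `context.fileName` corresponds to a nested `context` object with a `fileName` list. The helper name is hypothetical.

```python
import json


def context_settings(file_names: list[str]) -> str:
    """Serialize a settings fragment that overrides the context filename(s)."""
    # Assumption: the dotted key context.fileName maps to nested JSON objects.
    settings = {"context": {"fileName": file_names}}
    return json.dumps(settings, indent=2)


fragment = context_settings(["AGENTS.md", "CONTEXT.md", "GEMINI.md"])
```

The result is only a fragment; merge it into any existing settings rather than overwriting the file.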

The same prompt, headless

Gemini CLI enters headless mode automatically in a non-TTY environment, or whenever you pass a query with -p / --prompt. gemini -p "query" runs non-interactively and prints to stdout; cat file | gemini processes piped stdin. Crucially, -p appends to stdin rather than replacing it, so git diff | gemini -p "Write a commit message for these changes" feeds the diff and your instruction together. For machine-readable output, --output-format json (-o json) returns one { response, stats, error? } object — extract clean text with jq -r '.response'. stream-json emits newline-delimited events (init, message, tool_use, tool_result, error, result).
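The same pipeline can be driven from a script instead of a shell one-liner. This is a sketch under stated assumptions: `parse_headless_json` and `commit_message` are hypothetical helper names, the parser relies only on the `response` and optional `error` fields named above, and the internal shape of `error` is not assumed.

```python
import json
import subprocess


def parse_headless_json(raw: str) -> str:
    """Return the response text from one { response, stats, error? } object."""
    payload = json.loads(raw)
    if payload.get("error"):
        raise RuntimeError(f"gemini reported an error: {payload['error']}")
    return payload["response"]


def commit_message(diff: str) -> str:
    """Pipe a diff to gemini on stdin and append the instruction with -p."""
    proc = subprocess.run(
        ["gemini", "-p", "Write a commit message for these changes",
         "--output-format", "json"],
        input=diff,           # becomes stdin, like `git diff | gemini ...`
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit
    )
    return parse_headless_json(proc.stdout)
```

Keeping the parser separate from the subprocess call makes it testable without the CLI installed.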

Flag / signal	Effect
`-p` / `--prompt`	Non-interactive; appended to stdin if both are provided
`-i` / `--prompt-interactive`	Execute the query, then continue interactively
`-o` / `--output-format`	`text` (default) · `json` · `stream-json`
`--approval-mode`	`default` · `auto_edit` · `yolo` · `plan` (`--yolo`/`-y` is deprecated)
`-r` / `--resume`	Resume a session by `"latest"` or index/id, e.g. `gemini -r "latest" "Check for type errors"`
exit `0` / `1` / `42` / `53`	success / general or API error / input error / turn limit exceeded

Headless flags and exit codes for scripting (cli-reference.md, headless.md).
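In a script, the exit codes in the table are easier to act on as a lookup. A minimal sketch: the code-to-meaning pairs come from the table, while `describe_exit`, `should_retry`, and the retry policy itself are hypothetical choices rather than documented behavior.

```python
EXIT_MEANINGS = {
    0: "success",
    1: "general or API error",
    42: "input error",
    53: "turn limit exceeded",
}


def describe_exit(code: int) -> str:
    """Translate a gemini exit code into the meaning listed in the table."""
    return EXIT_MEANINGS.get(code, f"undocumented exit code {code}")


def should_retry(code: int) -> bool:
    """Hypothetical policy: retry only a general or API error."""
    # An input error (42) or turn limit (53) will not fix itself on a rerun.
    return code == 1
```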

interactive prompt becomes a pipeline


The same ask — write a commit message for this diff — first interactively (with a live `!` shell injection), then as a one-line headless pipeline: pipe the diff in, append the instruction with -p, emit JSON, extract text with jq.

Knowledge check

You inject a whole module plus a 14-page PDF spec, then ask one question. Where should the question go for the most accurate answer?
