The Navigator · 11 min mission

Autonomous Loops: Long-Running & Scheduled Agents

Let Claude work unattended — safely, with the right harness.

Tags: autonomy, automation, headless · Fact-checked 2026-06-13

Most of your time with Claude Code is a conversation: you type, it acts, you watch, you steer. Autonomy is what happens when you remove yourself from that loop — when Claude runs from a script, a cron schedule, or a CI job with no human in the chair to approve the next step.

That is a genuinely different mode of operation, and it is easy to get wrong. The same agent that is delightful when you are watching it can quietly delete the wrong files, loop forever, or burn through your budget when nobody is. This guide covers the machinery that makes unattended runs possible — and, just as importantly, the conditions that make them safe.

The shift from interactive to unattended

In an interactive session, the permission prompt is your safety net. Claude wants to run a command, you read it, you approve or deny. Unattended, that net is gone — there is no one to answer the prompt, so every decision about what Claude may do has to be made in advance, encoded as flags and rules rather than judgment in the moment.

So the mental model flips. Interactive work optimizes for flexibility: let Claude ask, decide as you go. Autonomous work optimizes for constraint: decide exactly what is allowed, make the task reversible, and define what "done" means before you start the run. The rest of this guide is the toolkit for doing that well.

Headless mode: claude -p

The foundation of every autonomous run is headless mode, triggered by the -p (or --print) flag. Instead of opening the interactive UI, claude -p "your prompt" runs the agent loop once, prints the result, and exits — exactly like any other command-line tool. It reads standard input, so you can pipe data in and redirect output out.

Because it behaves like a normal Unix command, headless mode composes with everything you already use: shell pipes, cron, Makefiles, package.json scripts, and CI runners. A one-line example: git diff main | claude -p "review this diff for typos and report filename:line for each" turns Claude into a project-specific linter you can wire into a pre-push hook.
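As a rough sketch of that pre-push idea, the Python wrapper below pipes a diff into claude -p over standard input and returns what it prints. The helper names and the injectable `runner` parameter are illustrative assumptions, not part of Claude Code; only the -p flag and the prompt come from the text above.

```python
import subprocess
from typing import Callable, List

REVIEW_PROMPT = "review this diff for typos and report filename:line for each"


def build_review_command(prompt: str = REVIEW_PROMPT) -> List[str]:
    """Return the argv for a single headless run (hypothetical helper)."""
    return ["claude", "-p", prompt]


def review_diff(diff_text: str, runner: Callable = subprocess.run) -> str:
    """Pipe a diff to `claude -p` on stdin and return whatever it prints.

    `runner` defaults to subprocess.run; it is injectable so the wrapper
    can be exercised without the CLI installed.
    """
    if not diff_text.strip():
        return ""  # nothing to review, so skip the call entirely
    completed = runner(
        build_review_command(),
        input=diff_text,
        capture_output=True,
        text=True,
        check=True,  # raise if the CLI exits non-zero
    )
    return completed.stdout.strip()
```

In a hook script, you might capture the output of git diff main with subprocess.run and pass its stdout to review_diff, then print the findings or fail the push if any come back.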

One subtlety worth knowing for CI: by default claude -p loads the same context an interactive session would — your hooks, skills, MCP servers, and CLAUDE.md from the working directory and ~/.claude. The --bare flag skips all of that auto-discovery so a run produces the same result on every machine, with only the flags you pass explicitly taking effect. The docs note --bare is the recommended mode for scripted and SDK calls.

Structured output is what makes it programmable

Plain text is fine when a human reads the result. The moment a script consumes it, you want structure — and that is what --output-format gives you. There are three modes, and choosing the right one is most of the battle.

text is the default: just the model's final answer, nothing else. json wraps the run in a single JSON object — the answer lands in a result field alongside metadata like session_id and total_cost_usd, so a scripted caller can track spend per invocation and resume the exact conversation later. stream-json emits newline-delimited JSON events as they happen, which is what you want when you are building a live UI or piping progress somewhere in real time; pair it with --verbose and --include-partial-messages to receive tokens as they are generated.

For machine-readable results that conform to a shape you control, combine --output-format json with --json-schema and a JSON Schema definition. Claude returns data matching your schema in a structured_output field — no brittle regex parsing of prose, just a typed object you can hand straight to the next stage of a pipeline.
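To make the envelope concrete, here is a minimal sketch of a caller that builds a JSON-mode command and reads the fields named above (result, session_id, total_cost_usd, structured_output). The schema, the dataclass, and the helper names are hypothetical, and the sketch assumes --json-schema accepts the schema as an inline JSON string; check the CLI reference before relying on that.

```python
import json
from dataclasses import dataclass
from typing import Any, List, Optional

# Hypothetical schema for illustration: a list of typo findings.
TYPO_SCHEMA = {
    "type": "object",
    "properties": {
        "findings": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "file": {"type": "string"},
                    "line": {"type": "integer"},
                },
                "required": ["file", "line"],
            },
        }
    },
    "required": ["findings"],
}


@dataclass
class RunResult:
    result: Optional[str]
    session_id: Optional[str]
    total_cost_usd: Optional[float]
    structured_output: Any


def build_command(prompt: str, schema: Optional[dict] = None) -> List[str]:
    """argv for a JSON-mode run; assumes --json-schema takes inline JSON."""
    cmd = ["claude", "-p", prompt, "--output-format", "json"]
    if schema is not None:
        cmd += ["--json-schema", json.dumps(schema)]
    return cmd


def parse_envelope(stdout: str) -> RunResult:
    """Parse the single JSON object printed by --output-format json."""
    data = json.loads(stdout)
    return RunResult(
        result=data.get("result"),
        session_id=data.get("session_id"),
        total_cost_usd=data.get("total_cost_usd"),
        structured_output=data.get("structured_output"),
    )
```

Using .get() rather than direct indexing keeps the parser tolerant of fields that are absent in a given run, such as structured_output when no schema was supplied.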

Plain text for humans, JSON for scripts, stream-json for live UIs. The same agent loop underneath — only the output envelope changes.
| Format | What you get | Best for |
| --- | --- | --- |
| text | Final answer as plain text | A human reads it, or you pipe one clean string onward |
| json | One object: result, session_id, total_cost_usd, usage | Scripts that need the answer plus metadata or want to --resume |
| stream-json | Newline-delimited events as they happen | Live UIs and progress bars (add --verbose --include-partial-messages) |
| json + --json-schema | Typed object in structured_output | Feeding the next pipeline stage a guaranteed shape |
The three --output-format modes and when to reach for each. All run the same agent loop; they differ only in how the result comes back.

Scheduled and background runs

Once a task runs from one command, you can run it on a clock. Headless mode is just a shell command, so anything that schedules shell commands schedules Claude: a cron entry that runs claude -p "triage new error reports and open an issue for each" every morning, a CI job on a timer, or a serverless function on a schedule. Nothing about Claude is special here — it slots into the same scheduler you already trust.

There is a second sense of autonomy worth distinguishing from scheduling: background work. Inside a single claude -p run, Claude can start a long-lived Bash task, such as a dev server or a watch build, to support its work. The docs are precise about lifecycle: such a background task is terminated about five seconds after Claude returns its final result and standard input closes; that short grace period lets a task that finishes right after the result still deliver its output. The practical lesson is that a headless run is bounded. It is not a daemon that lingers after the agent is done thinking.

For a scheduled run, the iron rule is idempotence and a stop condition. A job that re-opens the same issue every night because it cannot tell it already filed one is a bug factory. Give the task a way to recognize "already done," and use --max-turns so a confused run cannot loop forever on your schedule.
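One way to give a nightly job a sense of "already done" is a small state file of handled report IDs, sketched below. The file format and function names are assumptions for illustration; in practice the marker could just as well be a label or a search in your issue tracker.

```python
import json
from pathlib import Path
from typing import Iterable, List, Set


def load_seen(state_file: Path) -> Set[str]:
    """Read the IDs handled by earlier runs; empty on the first run."""
    if not state_file.exists():
        return set()
    return set(json.loads(state_file.read_text()))


def select_new(report_ids: Iterable[str], seen: Set[str]) -> List[str]:
    """Keep only reports that no previous run has filed, preserving order."""
    new: List[str] = []
    for rid in report_ids:
        if rid not in seen and rid not in new:
            new.append(rid)
    return new


def mark_done(state_file: Path, seen: Set[str], done: Iterable[str]) -> None:
    """Persist the union so the next run skips this run's work."""
    state_file.write_text(json.dumps(sorted(seen | set(done))))
```

The scheduled script would call select_new before invoking claude -p, pass only the new reports in the prompt, and call mark_done once the issues are filed, so a second run over the same input does nothing.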

The built-in scheduling commands

The cron-and-shell approach above is the do-it-yourself path. Claude Code also ships built-in commands for scheduling and backgrounding, and the one distinction that decides which you want is where the work runs.

/loop keeps a prompt running on your machine, in this session: it turns an interval you give into a cron-like schedule and re-runs the prompt while Claude Code is open and idle. Omit the interval and Claude self-paces the delay (a minute to an hour) between iterations; omit the prompt and it runs a built-in maintenance routine (or your .claude/loop.md). It is session-scoped: close the terminal and it stops, and recurring loops auto-expire after seven days.

/schedule is the cloud opposite: it creates a routine that runs on Anthropic's infrastructure, so it keeps working with your laptop closed. Create one conversationally or from a description (/schedule daily PR review at 9am), and manage them with /schedule list, /schedule update, and /schedule run.

Three more round out background work. /deep-research is a bundled workflow that fans web searches across a question, cross-checks the sources it finds, and returns a cited report in the background. /background (alias /bg) detaches the current session so it keeps running as a background agent — you monitor it from agent view, covered in Agent Teams & Orchestration. /tasks (alias /bashes) shows everything running in the background of this session. And for the cloud planning counterpart, /ultraplan is taught in Plan Mode & Workflows.

| Command | Runs where | When to reach for it |
| --- | --- | --- |
| /loop [interval] [prompt] *(alias /proactive)* | Your machine, session open | Babysit a deploy or poll CI while you keep working (1-minute minimum) |
| /schedule [description] *(routines)* | Anthropic's cloud | Durable automation that runs with your laptop closed (1-hour minimum) |
| /deep-research <question> | Background *(workflow)* | A cited, cross-checked answer to a research question |
| /background [prompt] *(alias /bg)* | Background agent | Detach a long task to free your terminal; watch it in agent view |
| /tasks *(alias /bashes)* | This session | See and manage everything running in the background now |
The built-in scheduling and background commands, by where the work runs. Verified against the official scheduled-tasks, routines, and workflows docs.

Build an autonomous iteration loop

  1. Make the goal machine-checkable

    An autonomous loop needs a verifier it can run without you: a test suite, a type-checker, a linter, a build. "Make it look nicer" cannot drive a loop; npm test exits 0 or it does not. The exit code is your stop condition.

  2. Run one bounded pass

    Invoke claude -p "run the test suite and fix any failures" --allowedTools "Bash,Read,Edit" --max-turns 15. The --max-turns cap guarantees the pass terminates instead of spiraling, even if the task is harder than expected.

  3. Check the verifier, decide in the harness

    After the pass, your script — not Claude — runs the verifier and reads its exit code. The loop logic lives outside the agent so the stop condition is yours to enforce, not something Claude can talk itself out of.

  4. Loop with a hard ceiling, then surface

    If the verifier still fails, run another pass with --continue to keep context; if it passes, stop. Cap the number of passes too (say, 5) so a genuinely stuck task escalates to a human instead of running all night. When you hit the ceiling, post the diff and the last failure for review.
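The four steps above can be sketched as a small Python harness. This is one possible shape under stated assumptions, not a canonical implementation: the function names are hypothetical, npm test stands in for whatever verifier your project uses, and the flags are the ones quoted in the steps.

```python
import subprocess
from typing import Callable, List

PROMPT = "run the test suite and fix any failures"


def pass_command(first: bool, max_turns: int = 15) -> List[str]:
    """argv for one bounded pass; later passes add --continue to keep context."""
    cmd = [
        "claude", "-p", PROMPT,
        "--allowedTools", "Bash,Read,Edit",
        "--max-turns", str(max_turns),
    ]
    if not first:
        cmd.append("--continue")
    return cmd


def tests_pass() -> bool:
    """Verifier: the harness, not the agent, runs the suite and reads the exit code."""
    return subprocess.run(["npm", "test"]).returncode == 0


def run_agent_pass(cmd: List[str]) -> None:
    """Run one pass; the agent's own exit status is not the stop condition."""
    subprocess.run(cmd)


def iterate(
    verify: Callable[[], bool] = tests_pass,
    run_pass: Callable[[List[str]], None] = run_agent_pass,
    max_passes: int = 5,
) -> bool:
    """Return True once the verifier passes, False after hitting the ceiling."""
    for attempt in range(max_passes):
        run_pass(pass_command(first=(attempt == 0)))
        if verify():
            return True
    return False  # stuck: escalate to a human with the diff and last failure
```

Because verify and run_pass are parameters, the stop logic can be tested with stand-ins before it is ever pointed at a real repository. A False return is the signal to post the diff and the last failure for review rather than to try again.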

Is this task safe to run unattended?

Should I let it run unattended?

Four honest questions decide whether a task is safe to hand to an autonomous agent. Walk the tree — each answer narrows the path to a verdict you can act on.

  1. Reversible
  2. Sandboxed + scoped
  3. Observable + logged
  4. Clear stop condition


Is the work reversible?

Can you undo every change with git, a redeploy, or a restore? Protected paths (.git, .claude, shell rc files) are never auto-approved outside bypass mode for exactly this reason.

Walk a real task down the tree — reversible? sandboxed? scoped? observable? bounded? — and see which guardrails it still needs before you let go of the wheel.

GitHub Actions: autonomy that lives in your repo

The most common production home for an autonomous Claude is GitHub Actions, via the official anthropics/claude-code-action@v1. The fastest setup is to run /install-github-app inside Claude Code, which walks you through installing the GitHub app and storing your ANTHROPIC_API_KEY as a repository secret (you must be a repo admin; the app requests read-and-write on Contents, Issues, and Pull requests).

The action runs in two shapes, and v1 auto-detects which. In interactive mode it watches for an @claude mention in an issue or PR comment, then analyzes the context and responds — implement a feature, fix a bug, answer a design question. In automation mode you supply a prompt directly and it runs immediately, which is how you wire up scheduled jobs: a schedule cron trigger plus prompt: "summarize yesterday's commits and open issues" gives you a daily report with no human in the loop.

CLI flags pass straight through the claude_args input, so everything you learned in headless mode carries over — --max-turns to bound a run, --model to pick Sonnet 4.6 or Opus 4.8, --allowedTools to scope permissions, --mcp-config to attach servers. The same discipline applies, doubly: this agent runs without you watching, so set --max-turns, a workflow timeout, and concurrency limits to keep a runaway job from draining both your Actions minutes and your token budget.
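A scheduled automation-mode workflow might look roughly like the fragment below. Treat it as a sketch: the cron time, job name, timeout, turn cap, and concurrency group are arbitrary choices, and the anthropic_api_key input name is an assumption to confirm against the action's README. The prompt and claude_args inputs are the ones described above.

```yaml
name: Daily summary
on:
  schedule:
    - cron: "0 9 * * *"        # once a day at 09:00 UTC (arbitrary)
concurrency:
  group: claude-daily-summary  # never run two copies of this job at once
  cancel-in-progress: true
jobs:
  summarize:
    runs-on: ubuntu-latest
    timeout-minutes: 15        # workflow-level ceiling on a runaway job
    steps:
      - uses: actions/checkout@v4
      - uses: anthropics/claude-code-action@v1
        with:
          # Assumed input name; the secret is the one /install-github-app stores.
          anthropic_api_key: ${{ secrets.ANTHROPIC_API_KEY }}
          prompt: "summarize yesterday's commits and open issues"
          claude_args: "--max-turns 10"
```

The three bounds are independent on purpose: --max-turns limits the agent loop, timeout-minutes limits wall-clock time, and the concurrency group keeps overlapping schedules from stacking up.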

The honest checklist for unattended autonomy

Here is the part the demos skip. Letting an agent run without you is not a toggle — it is a set of conditions that all need to hold. If a task fails any one of them, you are not ready to walk away from it yet. Five questions, in order:

Reversible. Can you cleanly undo whatever this run does? An agent working on a feature branch with every change in git is reversible — you can git reset and the blast radius is one branch. An agent running database migrations against production is not. Make the work reversible before you remove yourself, usually by confining it to a branch, a worktree, or a scratch directory.

Sandboxed. Is the run confined to a blast radius you have decided in advance? A container, a fresh CI runner, a throwaway checkout — somewhere a worst-case action cannot touch anything you care about. "It probably won't do that" is not a sandbox.

Scoped permissions. Have you granted the minimum set of tools the task needs and nothing more? A summarizer needs Read and Grep, not Bash. A test-fixer needs Bash, Read, and Edit, not network access. Every extra capability is attack surface and accident surface both.

Observable. When the run finishes — or misbehaves — can you tell what it did? Capture --output-format json (it carries total_cost_usd and the session id), log the full transcript, and make failures loud. An unobservable autonomous run is one you cannot trust, learn from, or debug.

Bounded with a clear stop condition. Does the run end? --max-turns caps the agent's internal loop; your harness caps the number of passes; a machine-checkable verifier (tests pass, build is green) defines success; a timeout catches the rest. A loop with no stop condition is not autonomy, it is a runaway.

Pass all five and you can genuinely let go. Fail one and the fix is not "supervise harder" — it is to add the missing guardrail until the answer is yes.

And graduate into autonomy rather than leaping. Run the task interactively first and watch the whole thing. Then run it headless while you watch the output. Then schedule it on something reversible and sandboxed with a tight --max-turns. Only after it has earned trust on small, bounded jobs do you point it at anything that matters. Autonomy is something an agent earns on a specific task — not a setting you flip on faith.
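The observable and bounded conditions lend themselves to a thin wrapper around each run. The sketch below is one way to do it, assuming the command uses --output-format json; the function name, the log layout, and the status labels are all hypothetical.

```python
import json
import subprocess
import time
from pathlib import Path
from typing import Callable, List


def run_logged(
    cmd: List[str],
    log_file: Path,
    timeout_s: float = 900,
    runner: Callable = subprocess.run,
) -> dict:
    """Run one headless command with a wall-clock timeout and log the outcome.

    Appends one JSON line per run so spend and failures can be audited later.
    """
    record = {"cmd": cmd, "started_at": time.time(), "status": "ok"}
    try:
        completed = runner(cmd, capture_output=True, text=True, timeout=timeout_s)
        record["exit_code"] = completed.returncode
        try:
            envelope = json.loads(completed.stdout)
            record["session_id"] = envelope.get("session_id")
            record["total_cost_usd"] = envelope.get("total_cost_usd")
        except (ValueError, TypeError, AttributeError):
            record["status"] = "unparseable"  # output was not a JSON object
        if completed.returncode != 0:
            record["status"] = "failed"
    except subprocess.TimeoutExpired:
        record["status"] = "timeout"  # the timeout catches what --max-turns misses
    with log_file.open("a") as fh:
        fh.write(json.dumps(record) + "\n")
    return record
```

Anything other than an "ok" status is a candidate for a loud alert, and summing total_cost_usd across the log gives a running view of what the schedule actually costs.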

Reckless vs. ready

Not ready to walk away

claude -p "clean up the codebase" on main, with --permission-mode acceptEdits and no turn cap, on your laptop, output discarded.

Irreversible (edits main directly), unsandboxed (your real machine), unscoped (vague goal, broad permissions), unobservable (no logs), unbounded (no stop condition). Five for five wrong.

Safe to run unattended

claude -p "make the test suite pass" --allowedTools "Bash,Read,Edit" --max-turns 15 --output-format json on a fresh branch in a CI runner, looped by a harness with a 5-pass ceiling.

Reversible (branch), sandboxed (runner), scoped (three tools), observable (JSON transcript + cost), bounded (turn cap, pass ceiling, green tests). Let it run.
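The contrast can be turned into a crude pre-flight check that a harness runs before launching anything. The sketch below only inspects the flags and the branch name, so it covers four of the five conditions; whether the run is sandboxed depends on where it executes and cannot be read off the command line. The function names and messages are hypothetical.

```python
from typing import List, Optional


def flag_value(argv: List[str], flag: str) -> Optional[str]:
    """Return the token following `flag`, or None if the flag is absent."""
    if flag in argv:
        i = argv.index(flag)
        if i + 1 < len(argv):
            return argv[i + 1]
    return None


def missing_guardrails(argv: List[str], branch: str) -> List[str]:
    """List the guardrails a planned run still lacks (a rough flag-level check)."""
    problems: List[str] = []
    if branch in ("main", "master"):
        problems.append("not reversible: runs directly on the default branch")
    if "--allowedTools" not in argv:
        problems.append("not scoped: no --allowedTools list")
    if "--max-turns" not in argv:
        problems.append("not bounded: no --max-turns cap")
    if flag_value(argv, "--output-format") not in ("json", "stream-json"):
        problems.append("not observable: output is not captured as JSON")
    return problems
```

Run against the two commands above, the first reports four problems on main and the second reports none on a fresh branch. An empty list is necessary, not sufficient: the harness still has to supply the sandbox and the pass ceiling.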

Knowledge check

You want a nightly GitHub Actions job that fixes failing tests on a bot branch with no human watching. Which single change most improves its safety?
