Running Claude Code and AI Agents at Scale with Warp Oz

What problem does this solve?

Running AI coding agents on individual developer laptops — the default today with tools like Cursor or local Claude Code sessions — creates a hard ceiling on engineering throughput. Warp Oz breaks that ceiling by moving agent execution into the cloud, letting teams orchestrate hundreds of agents in parallel while keeping governance, observability, and data-privacy controls intact. For enterprises specifically, the blocker isn’t capability — it’s trust: sensitive code can’t leave the perimeter. Oz addresses this by splitting the orchestrator (cloud-managed) from the execution environment (customer-managed infrastructure), so the two concerns never conflict.

Before you start

A Warp account with access to the Warp Oz orchestration layer
At least one CLI coding agent you want to scale: Claude Code, Codex, or OpenCode
Clarity on your execution boundary: cloud-only vs. hybrid (cloud orchestrator + on-prem runners)
For enterprise deployments: customer-managed infrastructure provisioned and reachable by the Oz orchestrator

Steps

Choose your execution model

Decide whether agents will run fully in Warp’s cloud or on your own infrastructure.

Cloud execution — fastest to start; Oz manages the full lifecycle.
Hybrid / on-prem execution — agents run inside your perimeter; the Oz orchestrator still handles lifecycle management and observability from the cloud. This is the recommended path for any team with data-residency requirements.

The Oz orchestrator and the agent runner are intentionally decoupled. You can start with cloud execution and migrate runners on-prem later without changing your workflow definitions.
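One way to picture this decoupling is to keep "what the agent does" and "where it runs" in separate records. The sketch below is illustrative only: `WorkflowDefinition`, `Deployment`, and `RunnerTarget` are hypothetical names invented for this example and do not reflect an actual Oz schema or configuration format.

```python
from dataclasses import dataclass, replace
from enum import Enum


class RunnerTarget(Enum):
    CLOUD = "cloud"      # agents run in the vendor-managed cloud
    ON_PREM = "on_prem"  # agents run on customer-managed infrastructure


@dataclass(frozen=True)
class WorkflowDefinition:
    """What the agent should do. Hypothetical shape, not an Oz schema."""
    name: str
    agent: str
    task: str


@dataclass(frozen=True)
class Deployment:
    """Where a workflow runs. Kept separate from the workflow itself."""
    workflow: WorkflowDefinition
    runner: RunnerTarget


refactor = WorkflowDefinition(
    name="rename-logger",
    agent="claude-code",
    task="Replace the legacy logger with the structured logger",
)

cloud_deployment = Deployment(workflow=refactor, runner=RunnerTarget.CLOUD)
# Migrating on-prem swaps only the runner; the workflow object is reused as-is.
on_prem_deployment = replace(cloud_deployment, runner=RunnerTarget.ON_PREM)
```

Because the runner is a field on the deployment rather than on the workflow, moving to on-prem execution in this model touches one value and leaves the task description untouched.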

Connect your CLI coding agents to Warp Oz

Claude Code, Codex, and OpenCode all run natively inside Warp Oz. You don’t replace these tools; you wrap them.

Oz adds a rich UI layer on top of each CLI agent session:

Capability	What it gives you
Rich input	Structured prompts, file attachments, context injection
Notifications	Real-time status updates per agent session
Code review	Inline diff review without leaving the orchestrator
Remote session control	Pause, resume, or redirect any running agent session

Connect an agent by selecting it from the Oz agent registry and pointing it at a target repo or task definition.

Scale to parallel agent execution

Once a single agent session is working, Oz makes horizontal scale a configuration decision, not an infrastructure project.

Define the task scope (e.g., “run this refactor across all 40 microservices”).
Set concurrency limits and resource quotas per agent.
Launch: Oz spawns and manages hundreds of agent instances, each with its own lifecycle, logs, and output artifacts.

Start with a small fan-out (5–10 agents) on a non-critical repo to validate your task definition before scaling to hundreds.
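The idea of a concurrency limit plus a small pilot fan-out can be sketched with plain `asyncio`. This is a generic illustration of bounded parallelism, not Oz code: `run_agent` is a hypothetical placeholder for a single agent run, and the service names are made up.

```python
import asyncio


async def run_agent(service: str) -> str:
    """Placeholder for one agent run; a real run would call the orchestrator."""
    await asyncio.sleep(0)  # stand-in for the actual work
    return f"{service}: done"


async def fan_out(services: list[str], max_concurrency: int) -> list[str]:
    """Run one agent per service, with at most max_concurrency in flight."""
    limiter = asyncio.Semaphore(max_concurrency)

    async def bounded(service: str) -> str:
        async with limiter:
            return await run_agent(service)

    # gather returns results in the same order as the input services
    return await asyncio.gather(*(bounded(s) for s in services))


services = [f"service-{i:02d}" for i in range(1, 41)]

# Pilot: validate the task definition on 5 services before the full run.
pilot = asyncio.run(fan_out(services[:5], max_concurrency=5))
```

Once the pilot output looks right, the same call can be repeated with the full `services` list and a higher `max_concurrency`, which mirrors treating scale as a configuration change.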

Compose agents for multi-step workflows

For complex engineering tasks, single-agent runs aren’t enough. Oz supports Agent Composition: agents that call other specialized agents as sub-tasks, creating multi-step workflows from reusable building blocks.

Example pattern:

Orchestrator agent
├── calls → Security scanner agent (per service)
├── calls → Test generator agent (per changed file)
└── calls → PR description agent (per branch)

Define composition in your workflow config by specifying which agent types a parent agent is allowed to invoke and under what conditions. Oz handles routing, result aggregation, and failure handling across the graph.
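To make the allow-list idea concrete, here is a minimal sketch of a parent agent that may only invoke approved child agent types, with results aggregated and failures recorded rather than aborting the whole graph. Everything here is an assumption for illustration: `ALLOWED_CHILDREN`, `AGENTS`, `invoke`, and `run_graph` are hypothetical names, and the lambdas stand in for real agents. The actual Oz workflow config format is not shown in this playbook.

```python
from typing import Callable

# Hypothetical allow-list: which child agent types each parent may invoke.
ALLOWED_CHILDREN = {
    "orchestrator": {"security-scanner", "test-generator", "pr-description"},
}

# Stand-ins for real agents; each takes a target and returns a result string.
AGENTS: dict[str, Callable[[str], str]] = {
    "security-scanner": lambda target: f"scanned {target}",
    "test-generator": lambda target: f"generated tests for {target}",
    "pr-description": lambda target: f"described {target}",
}


def invoke(parent: str, child: str, target: str) -> dict:
    """Route one sub-task, rejecting calls outside the allow-list."""
    if child not in ALLOWED_CHILDREN.get(parent, set()):
        raise PermissionError(f"{parent} may not invoke {child}")
    try:
        output = AGENTS[child](target)
        return {"agent": child, "target": target, "ok": True, "output": output}
    except Exception as exc:  # record the failure instead of aborting the graph
        return {"agent": child, "target": target, "ok": False, "output": str(exc)}


def run_graph(parent: str, calls: list[tuple[str, str]]) -> dict:
    """Run each (child, target) call and aggregate the results."""
    results = [invoke(parent, child, target) for child, target in calls]
    return {"results": results, "failed": [r for r in results if not r["ok"]]}


summary = run_graph("orchestrator", [
    ("security-scanner", "billing-service"),
    ("test-generator", "billing/api.py"),
    ("pr-description", "feature/structured-logging"),
])
```

Keeping the allow-list as data makes it easy to review which parent can call which child, and the `failed` list gives a single place to look when a mid-chain agent breaks.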

Monitor and govern from the Oz dashboard

Every agent session — whether running in Warp’s cloud or on your own infrastructure — surfaces lifecycle events, logs, and outputs back to the Oz orchestrator.

Review per-agent status in real time.
Inspect code diffs before merging any agent-generated output.
Use remote session control to intervene in a running session without killing it.

Common pitfalls

Skipping the execution-boundary decision early. If you start with cloud runners and later discover your security policy requires on-prem execution, migrating mid-project is painful. Lock down your execution model before writing any workflow definitions.

Over-composing agent graphs too soon. Agent Composition is powerful, but a deeply nested agent graph is hard to debug when a mid-chain agent fails. Build and validate each agent type independently before wiring them together.

Treating parallel agents as independent. Hundreds of agents writing to the same repo simultaneously will produce merge conflicts at scale. Design your task decomposition so each agent owns a non-overlapping scope (e.g., one agent per service, one agent per file).
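A non-overlapping scope can be checked before launch. The sketch below is one possible pre-flight check, assuming each agent's scope is expressed as a list of repo-relative paths. The function names, agent names, and paths are hypothetical and not part of Oz.

```python
from itertools import combinations
from pathlib import PurePosixPath


def overlaps(a: str, b: str) -> bool:
    """True if the two paths are equal or one contains the other."""
    pa, pb = PurePosixPath(a), PurePosixPath(b)
    return pa == pb or pa in pb.parents or pb in pa.parents


def find_conflicts(scopes: dict[str, list[str]]) -> list[tuple[str, str]]:
    """Return pairs of agents whose assigned paths overlap."""
    conflicts = []
    for (agent_a, paths_a), (agent_b, paths_b) in combinations(scopes.items(), 2):
        if any(overlaps(x, y) for x in paths_a for y in paths_b):
            conflicts.append((agent_a, agent_b))
    return conflicts


scopes = {
    "agent-1": ["services/billing"],
    "agent-2": ["services/auth"],
    "agent-3": ["services/billing/api"],  # nested inside agent-1's scope
}

# agent-1 and agent-3 are reported as a conflicting pair.
conflicts = find_conflicts(scopes)
```

Running a check like this against the task decomposition before a large fan-out surfaces overlapping assignments while they are still cheap to fix.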

What we learned

The laptop is the wrong unit of scale for AI agents. Running Claude Code locally is fine for individual tasks, but teams that need to apply AI-driven changes across dozens of services simultaneously need cloud orchestration — not more powerful laptops. Warp Oz reframes the agent as a fleet resource, not a personal tool.
Data privacy and cloud orchestration are not mutually exclusive. The architectural split between the Oz orchestrator (cloud) and the execution environment (customer-managed) directly unblocks enterprise adoption. Sensitive code never has to leave the perimeter for the team to get full observability and lifecycle management.
Agent Composition unlocks workflows that single-agent runs can’t handle. The ability for agents to call other specialized agents — security scanners, test generators, PR writers — means complex, multi-step engineering processes can be encoded as reusable, composable graphs rather than monolithic prompts.
