AI Agent Design

Use this when you are building an AI agent that takes multiple steps and calls tools, not a single LLM call. It covers the tier decision (should this even be an agent?), the tool surface, the loop and its termination, state and context management, and what happens when a step fails. If you only need one model call with structured output, you do not need this skill; use /ai-product-spec instead.

Related skills: Spec the feature first with /ai-product-spec. Evaluate the agent with /ai-eval-design and defend it with /ai-guardrails-design. Monitor it in production with /llm-observability-plan. Coordinate several agents with /multi-agent-orchestration, evaluate agent trajectories with /agent-eval-harness, and wire up its tools with /mcp-integration-plan.

The hard part most teams miss

The model is not the agent. The harness is. Most "the agent is broken" reports are harness failures, not model failures.

Most things called agents should be workflows. An agent is for tasks that are multi-step and hard to specify in advance. If you can write the steps down, write them down in code and call the model at each step. A workflow you control beats an agent you hope behaves, on cost, latency, and debuggability.
The loop fails in the harness, not the model. No termination condition, no max-step cap, no state between steps, no recovery from a tool error: these are the real failure modes, and none of them are the model's fault. The model emits tool calls; your harness decides what is allowed, what is safe, and when to stop.
Context is a budget you spend, not a free resource. A long-running agent accumulates tool outputs until quality drops and cost climbs. Deciding what to keep, summarize, or clear is a design decision, not an afterthought.
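
To make the first point concrete, here is a minimal sketch of a workflow: the code owns the order of the steps and the model only fills in each one. The ticket-triage task, the `triage_ticket` function, and the `fake_model` stub are all hypothetical; swap in whatever model client you actually use.

```python
from typing import Callable, Dict

# Hypothetical stand-in for a real model client: takes a prompt, returns text.
ModelFn = Callable[[str], str]


def triage_ticket(ticket: str, call_model: ModelFn) -> Dict[str, str]:
    """A fixed three-step workflow. The code decides what happens next, not the model."""
    category = call_model(
        f"Classify this ticket as 'billing' or 'technical':\n{ticket}"
    ).strip().lower()
    if category not in {"billing", "technical"}:
        category = "technical"  # deterministic fallback chosen by code

    summary = call_model(f"Summarize in one sentence:\n{ticket}").strip()
    reply = call_model(
        f"Draft a reply to a {category} ticket. Summary: {summary}"
    ).strip()
    return {"category": category, "summary": summary, "reply": reply}


def fake_model(prompt: str) -> str:
    """Canned responses so the sketch runs without any API."""
    if prompt.startswith("Classify"):
        return "Billing"
    if prompt.startswith("Summarize"):
        return "Customer was charged twice."
    return "Sorry about the double charge; a refund is on its way."


result = triage_ticket("I was charged twice this month.", fake_model)
print(result["category"])
```

There is no loop to terminate and no tool surface to gate here: every run makes exactly three model calls in a known order, which is what makes it cheap to debug.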

Process

Step 1: Gather inputs

Ask the user:

What does the agent do, end to end? (One or two sentences. The job, not the implementation.)
What does "done" look like? (A concrete, checkable success condition. If you cannot state it, the agent cannot reach it.)
What tools does it need? (Each capability the model will call: read, write, search, an external API, code execution.)
Who or what executes the tools? (Your harness, a sandbox, a hosted runtime. This decides what you can gate.)
What is the cost of a wrong action? (Reversible and cheap, or irreversible and expensive. This sets where you gate.)
What is the step and time budget? (Roughly how many tool calls, and how long, before the agent should stop and hand back.)

Step 2: Confirm it should be an agent

Run the four-part check. Build an agent only if all four hold:

Complexity: the task is multi-step and cannot be fully specified up front.
Value: the outcome justifies higher cost and latency than a single call.
Viability: the model is actually capable at this task type.
Cost of error: mistakes can be caught and recovered (tests, review, rollback).

If any answer is no, drop to a workflow (code-orchestrated steps with model calls) or a single call. Say so plainly; the cheaper tier is usually the right answer.
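
The four-part check can be written down as a small gate, which keeps the decision explicit in a design review. This is an illustrative sketch only; the `TierCheck` and `choose_tier` names are assumptions, and the four booleans still come from human judgment.

```python
from dataclasses import asdict, dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class TierCheck:
    complexity: bool     # multi-step and cannot be fully specified up front
    value: bool          # outcome justifies higher cost and latency
    viability: bool      # model is actually capable at this task type
    cost_of_error: bool  # mistakes can be caught and recovered


def choose_tier(check: TierCheck) -> Tuple[str, List[str]]:
    """Return the recommended tier and the names of any checks that failed."""
    failed = [name for name, ok in asdict(check).items() if not ok]
    tier = "agent" if not failed else "workflow-or-single-call"
    return tier, failed


# A task whose mistakes cannot be caught or rolled back fails the last check.
print(choose_tier(TierCheck(True, True, True, False)))
```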

Step 3: Design the tool surface

For each capability, decide its shape:

Dedicated tool vs general execution. Start with a broad tool (a shell or code runner) for reach. Promote an action to a typed, dedicated tool when you need to gate it, render it, audit it, or run it in parallel safely. A send_email tool can be gated and confirmed; a raw shell command cannot.
Gate the irreversible. Hard-to-reverse actions (external writes, sending messages, deleting data) sit behind a confirmation or an allowlist. Reversibility is the criterion.
Mark parallel-safe reads. Read-only tools can run concurrently; anything that writes must serialize.
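
One way to encode these three decisions is a small tool registry in the harness. The sketch below is a minimal illustration under assumptions: `Tool`, `ToolRegistry`, `search_docs`, and the `confirm` callback are hypothetical names, and a real `confirm` would be a human prompt or an allowlist lookup rather than a lambda.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Tool:
    name: str
    run: Callable[..., Any]
    irreversible: bool   # hard to undo: external write, send, delete
    parallel_safe: bool  # read-only, safe to run concurrently


class ToolRegistry:
    def __init__(self, confirm: Callable[[str, Dict[str, Any]], bool]) -> None:
        self._tools: Dict[str, Tool] = {}
        self._confirm = confirm

    def register(self, tool: Tool) -> None:
        self._tools[tool.name] = tool

    def call(self, name: str, **args: Any) -> Dict[str, Any]:
        tool = self._tools.get(name)
        if tool is None:
            return {"ok": False, "error": f"unknown tool: {name}"}
        # The gate lives in the harness, so it holds regardless of the prompt.
        if tool.irreversible and not self._confirm(name, args):
            return {"ok": False, "error": f"{name} requires confirmation"}
        return {"ok": True, "result": tool.run(**args)}

    def can_run_concurrently(self, names: List[str]) -> bool:
        """True only when every named tool is marked parallel-safe."""
        return all(self._tools[n].parallel_safe for n in names)


sent: List[Tuple[str, str]] = []
registry = ToolRegistry(confirm=lambda name, args: False)  # deny by default
registry.register(Tool("search_docs", lambda query: [f"doc about {query}"],
                       irreversible=False, parallel_safe=True))
registry.register(Tool("send_email", lambda to, body: sent.append((to, body)),
                       irreversible=True, parallel_safe=False))

search = registry.call("search_docs", query="refunds")
blocked = registry.call("send_email", to="a@example.com", body="hi")
print(search, blocked)
```

The refused call comes back as an ordinary result rather than an exception, so the model can see that the action was blocked and choose a different step.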

Step 4: Design the loop

Termination: define every way the loop ends: success condition met, max steps hit, budget exhausted, unrecoverable error, or explicit hand-back to a human. An agent with no hard cap is a runaway bill.
State: decide what persists across steps (a scratchpad, a task list, retrieved facts) and where it lives. Stateless steps re-derive everything and drift.
Context management: as the transcript grows, prune stale tool results, summarize completed sub-tasks, or persist to memory. Do not let the window fill with outputs no future step needs.
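
The three loop decisions above can be sketched in one small harness. Everything here is an assumption for illustration: `run_loop`, `AgentState`, and `StepResult` are hypothetical names, `step` stands in for "call the model and execute its tool call", and pruning is the simplest possible policy (keep the last few tool outputs, keep durable facts in a scratchpad).

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class AgentState:
    steps_taken: int = 0
    scratchpad: List[str] = field(default_factory=list)  # durable facts, never pruned
    transcript: List[str] = field(default_factory=list)  # bulky tool outputs, pruned


@dataclass
class StepResult:
    tool_output: str
    fact: Optional[str] = None  # small durable note worth keeping
    done: bool = False          # success condition met
    hand_back: bool = False     # explicit escalation to a human


def run_loop(step: Callable[[AgentState], StepResult], max_steps: int = 20,
             time_budget_s: float = 60.0, keep_last: int = 3) -> Tuple[str, AgentState]:
    """Run until one of the named termination conditions fires; return which one."""
    state = AgentState()
    deadline = time.monotonic() + time_budget_s
    for _ in range(max_steps):
        if time.monotonic() > deadline:
            return "budget_exhausted", state
        try:
            result = step(state)
        except Exception as exc:  # anything the step could not handle itself
            state.scratchpad.append(f"stopped on error: {exc}")
            return "unrecoverable_error", state
        state.steps_taken += 1
        state.transcript.append(result.tool_output)
        if result.fact:
            state.scratchpad.append(result.fact)
        state.transcript = state.transcript[-keep_last:]  # prune stale outputs
        if result.done:
            return "success", state
        if result.hand_back:
            return "handed_back", state
    return "max_steps", state


def fake_step(state: AgentState) -> StepResult:
    n = state.steps_taken + 1
    return StepResult(tool_output=f"output {n}", fact=f"fact {n}", done=(n == 5))


reason, final = run_loop(fake_step, max_steps=10, keep_last=3)
capped, capped_state = run_loop(lambda s: StepResult("noop"), max_steps=4)
print(reason, final.transcript, capped)
```

Note that every exit path returns a named reason along with the state, so a caller can always tell a success from a cap and can report partial progress.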

Step 5: Design failure recovery

Tool errors: return the error to the model as a result it can react to, not a crash. Decide retry-with-backoff versus give-up-and-report per tool.
Model loops: detect repeated identical tool calls or no-progress cycles; break them with a step cap and a forced summary.
Partial progress: on failure, hand back what was accomplished and what remains, not nothing.
Human-in-the-loop: name the conditions that escalate to a person rather than letting the agent guess.
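
A minimal sketch of the first three recovery mechanisms follows. The helper names (`call_tool_safely`, `is_stuck`, `partial_report`) and the retry defaults are assumptions, not a prescribed API; in particular, the per-tool choice between retrying and giving up is reduced here to a single `retries` parameter.

```python
import json
import time
from typing import Any, Callable, Dict, List, Tuple


def call_tool_safely(fn: Callable[..., Any], args: Dict[str, Any], retries: int = 2,
                     base_delay_s: float = 0.5) -> Dict[str, Any]:
    """Retry with exponential backoff, then return the error as a result, never raise."""
    last_error = ""
    for attempt in range(retries + 1):
        try:
            return {"ok": True, "result": fn(**args)}
        except Exception as exc:
            last_error = f"{type(exc).__name__}: {exc}"
            if attempt < retries:
                time.sleep(base_delay_s * (2 ** attempt))
    return {"ok": False, "error": last_error, "attempts": retries + 1}


def is_stuck(call_history: List[Tuple[str, Dict[str, Any]]], window: int = 3) -> bool:
    """True when the last `window` calls are identical (same tool, same arguments)."""
    if len(call_history) < window:
        return False
    recent = {json.dumps(call, sort_keys=True) for call in call_history[-window:]}
    return len(recent) == 1


def partial_report(done: List[str], remaining: List[str], reason: str) -> Dict[str, Any]:
    """What to hand back on failure: what was accomplished and what remains."""
    return {"stopped_because": reason, "completed": done, "remaining": remaining}


attempts = {"n": 0}


def flaky_fetch(url: str) -> str:
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise TimeoutError("upstream timed out")
    return f"contents of {url}"


def always_fails(path: str) -> str:
    raise FileNotFoundError(path)


recovered = call_tool_safely(flaky_fetch, {"url": "https://example.com"},
                             retries=2, base_delay_s=0.0)
failed = call_tool_safely(always_fails, {"path": "missing.txt"},
                          retries=1, base_delay_s=0.0)
print(recovered, failed)
```

The `failed` dictionary is what goes back to the model as the tool result. When `is_stuck` fires, the harness stops the loop and returns a `partial_report` instead of letting the agent retry the same call again.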

Step 6: Output the agent design

# Agent Design: (name)

**Job:** (one sentence)
**Done means:** (checkable success condition)
**Tier rationale:** (why an agent and not a workflow or single call)

## Tools
| Tool | Shape (dedicated/general) | Gated? | Parallel-safe? | Failure behavior |
|---|---|---|---|---|

## Loop
- Termination conditions: (list)
- Max steps / time budget: (values)
- State kept across steps: (what, where)
- Context management: (prune / summarize / memory)

## Failure recovery
- Tool error handling: (retry / report)
- Loop / no-progress breaker: (mechanism)
- Escalation to human: (conditions)

## Open questions
- (unresolved decisions)

Step 7: Review

Ask the user:

Could a workflow do this instead? (If yes, build that.)
What is the worst action this agent can take, and is it gated?
How does the loop end on a bad day, not just a good one?
Who pays attention when it runs, and how do they stop it?

Anti-patterns

| Anti-pattern | Why it fails | Do instead |
|---|---|---|
| Agent where a workflow fits | Pays agent cost and unpredictability for a task you could code | Drop to a code-orchestrated workflow with model calls |
| No termination cap | One bad loop runs up cost or spins forever | Hard max-step and time budget, always |
| Tool error crashes the loop | One failed call kills a run that could recover | Return the error to the model as a result |
| The model as the gate | Relies on the model to refuse a dangerous action, which a prompt cannot enforce | Gate irreversible actions in the harness, not the prompt |
| Unbounded context | Quality drops and cost climbs as the window fills | Prune, summarize, or persist to memory deliberately |
| No state between steps | The agent re-derives and drifts each step | Keep an explicit scratchpad or task list |

Output location

Present the agent design as formatted text in the conversation for the user to copy into their design doc.