Pattern: Review Loops

🔵 Optional

Use this when: You're working with an AI agent in any capacity and want to avoid the single most common mistake — accepting the first draft without critical review.

The problem

AI agents produce fluent, confident, comprehensive output. It looks good. It reads well. And that's exactly why it's dangerous.

The most common failure mode for PMs using AI agents isn't that the agent produces obviously bad output. It's that the agent produces plausible output that the PM accepts without scrutiny — and the team builds on a foundation of unchecked assumptions, vague criteria, and hidden gaps.

The agent's first draft is a starting point, not a finished product.

The review loop

Every interaction with an agent should follow this loop:

1. PROMPT    → Give clear, specific input
2. REVIEW    → Read the output as critically as you'd review a colleague's work
3. CHALLENGE → Ask: "What's wrong with this?"
4. REVISE    → Provide specific feedback; iterate
5. VALIDATE  → Check against primary sources, team knowledge, or real users

The loop isn't optional. Skipping steps 2-4 is how you ship mediocre work fast.
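
For readers who think in code, the loop above can be sketched as a small function. This is an illustrative sketch only: `review_loop`, `LoopResult`, and the three callables (`generate`, `review`, `validate`) are hypothetical names, not part of any agent framework. In practice the review and validate steps are you, not a function.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class LoopResult:
    draft: str
    rounds: int
    validated: bool
    history: List[str] = field(default_factory=list)


def review_loop(
    prompt: str,
    generate: Callable[[str, List[str]], str],  # hypothetical: calls the agent
    review: Callable[[str], List[str]],         # hypothetical: returns open issues
    validate: Callable[[str], bool],            # hypothetical: checks primary sources
    max_rounds: int = 3,
) -> LoopResult:
    feedback: List[str] = []
    history: List[str] = []
    draft = ""
    for round_number in range(1, max_rounds + 1):
        # 1. PROMPT on the first round, 4. REVISE on later rounds
        draft = generate(prompt, feedback)
        history.append(draft)
        # 2. REVIEW and 3. CHALLENGE: collect specific issues
        feedback = review(draft)
        if not feedback:
            # 5. VALIDATE only once the draft survives review
            return LoopResult(draft, round_number, validate(draft), history)
    # Ran out of rounds with open issues: never report this as validated
    return LoopResult(draft, max_rounds, False, history)


# Stub agent and reviewer, for illustration only
def stub_generate(prompt: str, feedback: List[str]) -> str:
    draft = "Story: user resets password."
    if "add error case" in feedback:
        draft += " Error case: invalid email shows a message."
    return draft


def stub_review(draft: str) -> List[str]:
    return [] if "Error case" in draft else ["add error case"]


result = review_loop(
    "Draft a password reset story", stub_generate, stub_review, lambda d: True
)
print(result.rounds, result.validated)  # 2 True
```

The point of the sketch is the control flow: the first draft never exits the loop without passing through review, and a draft that still has open issues is never marked as validated.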

Review checklist

Use this checklist on every agent-generated artifact before you share it with the team:

Accuracy

Are all facts correct? (Cross-check against primary sources)
Are all claims supported by evidence?
Is the scope right — not too broad, not too narrow?

Completeness

Are edge cases covered?
Is there anything obviously missing?
Are error states, empty states, and failure modes addressed?

Specificity

Is every statement concrete? (No "intuitive," "seamless," "robust," "ensure a great experience")
Could an engineer write a test from the acceptance criteria?
Could a designer build a screen from the description?

Honesty

Are assumptions labeled as assumptions, not stated as facts?
Are open questions acknowledged, not papered over?
Is the confidence level appropriate for the evidence?

Voice

Does it sound like you, not like a chatbot?
Is the tone right for the audience?
Would you say these words in a meeting?
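
Most of this checklist needs human judgment, but the first Specificity check can be partly mechanized. The sketch below scans a draft for the vague phrases named above. The function name `find_vague_terms` and the sample draft are made up for illustration, and the term list is only a starting point; a clean scan does not mean the draft is specific.

```python
import re
from typing import List, Tuple

# Phrases called out in the Specificity check; extend with your team's own
VAGUE_TERMS = ["intuitive", "seamless", "robust", "ensure a great experience"]


def find_vague_terms(text: str) -> List[Tuple[int, str]]:
    """Return (line_number, term) for each vague phrase found in the text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        lowered = line.lower()
        for term in VAGUE_TERMS:
            # Whole-word match, so "robustness" is not flagged as "robust"
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                hits.append((line_number, term))
    return hits


draft = (
    "The flow should be intuitive.\n"
    "Return HTTP 400 when the email is malformed.\n"
    "Sync must be robust and seamless."
)
print(find_vague_terms(draft))
# [(1, 'intuitive'), (3, 'seamless'), (3, 'robust')]
```

Each hit is a prompt for revision feedback: replace the adjective with a behavior an engineer could test.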

The challenge step

Step 3 is the most valuable and most frequently skipped. After the agent produces output, ask it to critique its own work:

Review what you just wrote and tell me:
1. What's the weakest part of this?
2. What assumptions are you making that might be wrong?
3. What would a skeptical engineer push back on?
4. What would a skeptical designer push back on?
5. What information is missing that would make this better?

Then evaluate its self-critique. The agent will usually identify real issues, though it may also flag trivial ones. Use your judgment about what matters.
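
If you run the challenge step often, it may help to keep the prompt as a reusable template. The sketch below rebuilds the prompt shown above and lets you swap in other skeptical reviewers. `build_challenge_prompt` and its `skeptic_roles` parameter are hypothetical names chosen for this example.

```python
from typing import Sequence


def build_challenge_prompt(skeptic_roles: Sequence[str] = ("engineer", "designer")) -> str:
    """Build the self-critique prompt, with one pushback question per role."""
    questions = [
        "What's the weakest part of this?",
        "What assumptions are you making that might be wrong?",
    ]
    questions += [f"What would a skeptical {role} push back on?" for role in skeptic_roles]
    questions.append("What information is missing that would make this better?")

    lines = ["Review what you just wrote and tell me:"]
    lines += [f"{number}. {question}" for number, question in enumerate(questions, start=1)]
    return "\n".join(lines)


print(build_challenge_prompt())
```

Called with no arguments, this reproduces the five questions above. Passing something like `skeptic_roles=("security reviewer",)` tailors the pushback to whoever is most likely to object to the artifact.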

Revision tactics

When providing feedback, be specific:

Bad feedback: "Make it better." "Add more detail." "This doesn't feel right."

Good feedback: "The acceptance criterion for the error case is missing. Add a scenario for when the user enters an invalid email format." "The scope is too broad. Cut features X and Y; they're Phase 2." "This summary assumes the reader knows what SCIM is. Add a one-sentence explanation."

Specific feedback produces specific improvement. Vague feedback produces a slightly rephrased version of the same output.

When to stop iterating

You're done when:

The output passes your review checklist
You'd be comfortable presenting it to the team
Further iteration would change wording, not substance

You've gone too far when:

You're polishing phrasing instead of checking substance
The agent is making changes you keep reverting
You've spent more time reviewing than you would have spent writing from scratch
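
The stopping rules above can be read as a small decision function. This is a rough sketch under stated assumptions: the `IterationState` fields, the `revert_limit` threshold, and the returned labels are all invented for illustration, and "comfortable presenting it to the team" remains a human call that the code only records as a boolean.

```python
from dataclasses import dataclass


@dataclass
class IterationState:
    checklist_passed: bool
    comfortable_presenting: bool
    last_change_was_substantive: bool
    reverted_changes: int
    minutes_reviewing: float
    minutes_to_write_from_scratch: float  # your own estimate


def next_action(state: IterationState, revert_limit: int = 2) -> str:
    # "Gone too far" signals are checked first so they cannot be masked
    if state.minutes_reviewing > state.minutes_to_write_from_scratch:
        return "stop: write it yourself"
    if state.reverted_changes >= revert_limit:
        return "stop: agent is churning"
    done = (
        state.checklist_passed
        and state.comfortable_presenting
        and not state.last_change_was_substantive
    )
    return "done" if done else "iterate"


state = IterationState(
    checklist_passed=True,
    comfortable_presenting=True,
    last_change_was_substantive=False,
    reverted_changes=0,
    minutes_reviewing=20,
    minutes_to_write_from_scratch=90,
)
print(next_action(state))  # done
```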

The 80/20 rule

The agent gets you 80% of the way in 20% of the time. The remaining 20% — your review, judgment, and refinement — is where the real value lives.

That 20% is your job. It's what makes you a PM, not a prompt engineer.

Red flags that you're not reviewing enough

Your stories consistently need rework after engineering starts
Your stakeholder updates get challenged on facts
Your team asks clarifying questions about every story
You can't explain why a specific acceptance criterion is worded the way it is
You find yourself saying "the agent wrote that part" when questioned

If any of these are happening, slow down. Better to ship 5 well-reviewed stories than 15 unchecked ones.

Estimation anti-pattern: automated story pointing

Lesson learned: Automated story pointing has been tried and abandoned by practitioner teams. Agents can prepare estimation context — pulling historical velocity, identifying similar past stories, flagging complexity signals — but they should not replace human judgment on sizing.

The core issue: estimation is a conversation, not a calculation. The value of a backlog refinement meeting isn't the number assigned to each story; it's the discussion that surfaces misunderstandings, missing requirements, and scope disagreements. Automating the number removes the conversation.

What works: Use agents to prepare for estimation (pull comparable stories, summarize technical dependencies, flag unknowns). Keep the human discussion for the actual sizing. Agents augment refinement meetings; they don't replace them.
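
As one illustration of "prepare, don't size," the sketch below pulls comparable past stories for a new one using simple word overlap on titles. Everything here is hypothetical: the `comparable_stories` function, the story fields, and the sample backlog are invented for the example, and a real setup would likely query your tracker and use a better similarity measure. The function returns context for the discussion and deliberately produces no estimate.

```python
from typing import Dict, List


def comparable_stories(new_title: str, past_stories: List[Dict], top_n: int = 2) -> List[Dict]:
    """Rank past stories by word overlap (Jaccard) with the new story title."""
    new_words = set(new_title.lower().split())
    scored = []
    for story in past_stories:
        words = set(story["title"].lower().split())
        union = new_words | words
        if not union:
            continue  # both titles empty; nothing to compare
        overlap = len(new_words & words) / len(union)
        scored.append((overlap, story))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Return only stories that share at least one word; no point value is computed
    return [story for overlap, story in scored[:top_n] if overlap > 0]


# Made-up backlog history for illustration
past = [
    {"title": "Add password reset email", "points": 3},
    {"title": "Export report as CSV", "points": 5},
    {"title": "Reset password via SMS", "points": 5},
]
matches = comparable_stories("Password reset via magic link", past)
print([story["title"] for story in matches])
# ['Reset password via SMS', 'Add password reset email']
```

The team sees how similar work was sized and what it involved, then argues about the new story themselves.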

From review loops to continuous alignment

Review loops are the human-in-the-loop discipline for agent output. When agentic systems move to production, this same discipline scales through automated evaluation — offline test suites, online monitoring, and inline self-checks. See Continuous Alignment Techniques for the full framework.

Connecting to workflows

This pattern applies to every workflow in this section:

workflows/standup-prep.md — Review status assessments against your knowledge
workflows/backlog-refinement.md — Validate priority recommendations
workflows/stakeholder-prep.md — Verify accuracy, adjust for political dynamics
workflows/research-synthesis.md — Cross-check patterns against raw data
workflows/sprint-reporting.md — Confirm what actually shipped
workflows/prd-drafting.md — Stress-test assumptions and scope

Review gates in multi-agent workflows

When a task spans multiple agents in a pipeline, the review loop becomes a gate — a checkpoint between workflow steps that prevents errors from cascading downstream.

The discipline is the same (PROMPT → REVIEW → CHALLENGE → REVISE → VALIDATE), but applied at step boundaries rather than at the end of a single session. At each gate, you decide:

Pass — output meets quality criteria; proceed to the next agent
Revise — output needs rework; send it back to the current agent with specific feedback
Redirect — the approach is wrong; adjust the workflow plan before continuing
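
The three gate decisions can be sketched as a small pipeline runner. This is a minimal illustration, assuming each step and gate is a plain function; `GateDecision`, `run_pipeline`, and the stub steps are hypothetical names, not an API from any orchestration tool. In a real workflow the gate is often a person.

```python
from enum import Enum
from typing import Callable, List, Tuple


class GateDecision(Enum):
    PASS = "pass"          # proceed to the next agent
    REVISE = "revise"      # send back to the current agent with feedback
    REDIRECT = "redirect"  # the approach is wrong; stop and replan


Step = Callable[[str, str], str]                  # (input, feedback) -> output
Gate = Callable[[str], Tuple[GateDecision, str]]  # output -> (decision, feedback)


def run_pipeline(
    initial_input: str, steps: List[Tuple[Step, Gate]], max_revisions: int = 2
) -> Tuple[str, str]:
    current = initial_input
    for index, (step, gate) in enumerate(steps):
        feedback = ""
        for _attempt in range(max_revisions + 1):
            output = step(current, feedback)
            decision, feedback = gate(output)
            if decision is GateDecision.PASS:
                break
            if decision is GateDecision.REDIRECT:
                return current, f"redirected at step {index}: {feedback}"
        else:
            # Revisions exhausted without a pass: do not let the output cascade
            return current, f"step {index} did not pass after {max_revisions} revisions"
        current = output
    return current, "completed"


# Stub steps and gates, for illustration only
def draft_step(text: str, feedback: str) -> str:
    return text + " | draft" + (" with error states" if feedback else "")


def draft_gate(output: str) -> Tuple[GateDecision, str]:
    if "error states" in output:
        return GateDecision.PASS, ""
    return GateDecision.REVISE, "add error states"


def summary_step(text: str, feedback: str) -> str:
    return text + " | summary"


def summary_gate(output: str) -> Tuple[GateDecision, str]:
    return GateDecision.PASS, ""


final, status = run_pipeline("brief", [(draft_step, draft_gate), (summary_step, summary_gate)])
print(status)  # completed
print(final)   # brief | draft with error states | summary
```

Note that a failed or redirected gate returns the last accepted input rather than the rejected output, which is the "prevent errors from cascading" property in code form.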

For high-volume pipelines, automated Continuous Alignment Testing (CAT) gates can handle structural validation at intermediate steps, reserving human review for high-risk transitions. See Continuous Alignment Techniques for the evaluation framework.

For the full catalog of orchestration patterns, context capsule design, and gate types, see Multi-Agent Orchestration Patterns.