Assumption Map - AI Agent Skill

Use this when a team has a feature idea, product concept, or strategic bet and needs to identify what they're assuming, how risky those assumptions are, and what to test first. Produces a prioritized assumption map with risk ratings and recommended validation methods. This is the starting point for focused discovery -- it tells you where to aim your research.

Related skills: Operationalizes the four-risk model from knowledge/pm-discovery-frameworks.md. Output feeds into /interview-plan, /usability-test-plan, or /experiment-design depending on the assumption type. Part of the discovery-sprint recipe.

Process

Step 1: Gather inputs

Ask the user to provide:

The idea or bet -- what is the team planning to build, launch, or invest in? (Feature, product, initiative, or strategic direction.)
Current evidence -- what do you already know? (Prior research, analytics, customer feedback, competitive intel.)
Stakeholder context -- who believes this will work, and why? (Understanding conviction sources helps surface hidden assumptions.)
Timeline pressure -- when does the team need to make a go/no-go decision? (This constrains how much validation is possible.)

Step 2: Surface assumptions

Guide the user through assumption extraction. For each risk category, ask: "What are we assuming is true that, if wrong, would make this fail?"

## Assumption Map -- (Idea/feature, date)

### Value assumptions (Will anyone want this?)
1. (Assumption -- e.g., "Users currently struggle with X and would switch to a better solution")
2. (Assumption -- e.g., "This problem is painful enough that users will pay to solve it")
3. (Assumption)

### Usability assumptions (Can people figure it out?)
1. (Assumption -- e.g., "Users will understand the new navigation without training")
2. (Assumption -- e.g., "The core workflow can be completed in under 3 minutes")
3. (Assumption)

### Viability assumptions (Should we build this?)
1. (Assumption -- e.g., "This feature won't cannibalize our existing product line")
2. (Assumption -- e.g., "We can acquire users for under $X CAC")
3. (Assumption)

### Feasibility assumptions (Can we build this?)
1. (Assumption -- e.g., "The API can handle 10x current load without infrastructure changes")
2. (Assumption -- e.g., "We can ship an MVP in 4 weeks with the current team")
3. (Assumption)

Extraction tips:

Ask "What would have to be true for this to succeed?" -- each answer is an assumption
Ask "What are we taking for granted?" -- especially for things that feel obvious
Ask "What would a skeptic challenge?" -- find the assumptions the team doesn't want to examine
Aim for 8-15 assumptions total. Fewer than 8 means you haven't dug deep enough; more than 20 means you need to consolidate overlapping ones.
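As a rough sketch of the count guidance above, the check below flags a map that is too thin or too crowded and lists risk categories with no entries yet. The `Assumption` class, `check_count`, and `missing_categories` are hypothetical names, and treating 16-20 assumptions as a soft warning is an assumption, since the tips only name the 8-15 target and the over-20 limit.

```python
from dataclasses import dataclass

# The four risk categories from the template above.
CATEGORIES = ("Value", "Usability", "Viability", "Feasibility")


@dataclass
class Assumption:
    text: str
    category: str  # expected to be one of CATEGORIES


def check_count(assumptions):
    """Return a short hint based on the 8-15 target range."""
    n = len(assumptions)
    if n < 8:
        return "dig deeper"
    if n > 20:
        return "consolidate"
    if n > 15:
        # Assumption: the tips do not cover 16-20, so treat it as a soft warning.
        return "above target"
    return "ok"


def missing_categories(assumptions):
    """List risk categories that have no assumptions yet."""
    used = {a.category for a in assumptions}
    return [c for c in CATEGORIES if c not in used]


draft = [Assumption(f"Assumption {i}", "Value") for i in range(5)]
print(check_count(draft))         # dig deeper
print(missing_categories(draft))  # ['Usability', 'Viability', 'Feasibility']
```

An empty category is not necessarily a problem, but it is worth asking whether the team skipped it or truly sees no risk there.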

Step 3: Map assumptions to Business Model Canvas zones

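After surfacing assumptions, map each one to its Business Model Canvas zone. Value and usability assumptions usually fall under Desirability; feasibility and viability assumptions map to the zones of the same name. This reveals where risk clusters: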

Desirability (Value Proposition, Customer Relationships, Channels, Customer Segments) -- Will anyone want this? Are we solving a real problem for real people?
Feasibility (Key Activities, Key Resources, Key Partners) -- Can we build and deliver this? Do we have the people, tech, and partners?
Viability (Revenue Streams, Cost Structure) -- Should we build this? Will the economics work?

If most assumptions cluster in one zone, that zone is the team's primary risk. If desirability assumptions dominate, the team has a value risk. If feasibility dominates, it is a delivery risk. If viability dominates, it is a business model risk. Name the cluster explicitly, because it focuses the validation plan.
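The cluster check can be sketched as a simple tally. This is a minimal illustration rather than part of the skill: `ZONE_RISK` and `diagnose_cluster` are hypothetical names, and picking the zone with the highest count (reporting ties instead of forcing a winner) is an assumed rule, since the text does not define a threshold.

```python
from collections import Counter

# Zone -> risk label, taken from the paragraph above.
ZONE_RISK = {
    "Desirability": "value risk",
    "Feasibility": "delivery risk",
    "Viability": "business model risk",
}


def diagnose_cluster(zones):
    """zones: one zone name per assumption. Returns a one-line diagnosis."""
    if not zones:
        raise ValueError("no assumptions to diagnose")
    counts = Counter(zones)
    top = max(counts.values())
    leaders = sorted(z for z, c in counts.items() if c == top)
    if len(leaders) > 1:
        # Assumption: a tie means no single cluster worth naming.
        return "no single cluster (tie: " + ", ".join(leaders) + ")"
    zone = leaders[0]
    return f"{zone}: {ZONE_RISK[zone]} ({top} of {len(zones)} assumptions)"


# Hypothetical zone tags for a nine-assumption map.
tags = ["Desirability"] * 5 + ["Viability"] * 2 + ["Feasibility"] * 2
print(diagnose_cluster(tags))  # Desirability: value risk (5 of 9 assumptions)
```

The count is only a prompt for discussion; one high-impact assumption in a small zone can still outweigh several minor ones elsewhere.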

Step 4: Prioritize on the risk matrix

Plot each assumption on two axes:

Impact if wrong: How badly does the idea fail if this assumption is false? (High / Medium / Low)
Evidence level: How much do we actually know? (Strong evidence / Some evidence / No evidence / Contradictory evidence)

### Risk matrix

| # | Assumption | Risk category | Impact if wrong | Evidence level | Priority |
|---|---|---|---|---|---|
| 1 | (Assumption) | Value | High | No evidence | **Test first** |
| 2 | (Assumption) | Usability | High | Some evidence | **Test soon** |
| 3 | (Assumption) | Viability | Medium | Some evidence | Monitor |
| 4 | (Assumption) | Feasibility | Low | Some evidence | Accept for now |

### Priority definitions
- **Test first:** High impact + no evidence or contradictory evidence. These are the riskiest assumptions. Validate before committing resources.
- **Test soon:** High impact + some evidence, or medium impact + no evidence. Validate within this iteration.
- **Monitor:** Medium impact with some evidence. Keep an eye on it, but don't block progress.
- **Accept for now:** Low impact or strong evidence. Revisit only if conditions change.
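One way to apply the priority definitions consistently is a small lookup function. The sketch below is illustrative only: `priority` is a hypothetical name, and treating both "No evidence" and "Contradictory evidence" as weak evidence is an interpretation, not something the definitions state for every case (medium impact with contradictory evidence is not covered explicitly).

```python
IMPACTS = ("High", "Medium", "Low")
EVIDENCE = ("Strong evidence", "Some evidence", "No evidence", "Contradictory evidence")


def priority(impact, evidence):
    """Map an (impact, evidence) pair to a priority label."""
    if impact not in IMPACTS or evidence not in EVIDENCE:
        raise ValueError(f"unknown rating: {impact!r}, {evidence!r}")
    # Accept for now: low impact or strong evidence.
    if impact == "Low" or evidence == "Strong evidence":
        return "Accept for now"
    # Assumption: contradictory evidence counts as weak, like no evidence.
    weak = evidence in ("No evidence", "Contradictory evidence")
    if impact == "High":
        return "Test first" if weak else "Test soon"
    # Remaining case: medium impact.
    return "Test soon" if weak else "Monitor"


print(priority("High", "No evidence"))      # Test first
print(priority("High", "Some evidence"))    # Test soon
print(priority("Medium", "Some evidence"))  # Monitor
print(priority("Low", "Some evidence"))     # Accept for now
```

Treat the result as a default. The team can still override a rating when timeline pressure or stakeholder context justifies it, as long as the reason is written down.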

Step 5: Recommend validation methods

For each "Test first" and "Test soon" assumption, recommend a validation approach:

### Validation plan

| Assumption | Recommended method | Why this method | Timeline | Effort |
|---|---|---|---|---|
| (Assumption 1) | (e.g., 5 user interviews) | (Value assumption -- need to hear from users whether this pain exists) | 1-2 weeks | Medium |
| (Assumption 2) | (e.g., Prototype usability test) | (Usability assumption -- need to see if users can navigate the flow) | 3-5 days | Low |
| (Assumption 3) | (e.g., Spreadsheet financial model) | (Viability assumption -- need to validate unit economics) | 1-2 days | Low |
| (Assumption 4) | (e.g., Technical spike) | (Feasibility assumption -- need to prove the architecture works) | 3-5 days | Medium |

Method selection guidance:

Value assumptions → User interviews, surveys, landing page tests, fake door tests
Usability assumptions → Prototype testing, hallway tests, Wizard of Oz tests
Viability assumptions → Financial models, market sizing, competitive analysis, sales conversations
Feasibility assumptions → Technical spikes, proof of concepts, architecture reviews
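The guidance above amounts to a lookup from risk category to candidate methods, applied only to assumptions marked "Test first" or "Test soon". A minimal sketch, assuming rows are simple tuples; `METHODS`, `TESTABLE`, and `candidate_methods` are hypothetical names and the sample rows are made up:

```python
# Category -> candidate methods, copied from the guidance above.
METHODS = {
    "Value": ["User interviews", "Surveys", "Landing page tests", "Fake door tests"],
    "Usability": ["Prototype testing", "Hallway tests", "Wizard of Oz"],
    "Viability": ["Financial models", "Market sizing", "Competitive analysis", "Sales conversations"],
    "Feasibility": ["Technical spikes", "Proof of concepts", "Architecture reviews"],
}

# Only these priorities get a validation plan entry.
TESTABLE = ("Test first", "Test soon")


def candidate_methods(rows):
    """rows: (assumption, category, priority) tuples from the risk matrix."""
    return {
        assumption: METHODS[category]
        for assumption, category, priority in rows
        if priority in TESTABLE
    }


# Hypothetical rows for illustration.
rows = [
    ("Users will pay to solve X", "Value", "Test first"),
    ("MVP ships in 4 weeks", "Feasibility", "Accept for now"),
]
print(candidate_methods(rows))
```

Here only the first row is returned, with the four value methods as options. Choosing among them still depends on timeline and effort, which the lookup does not model.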

Step 6: Review and validate

Ask the user:

Are we missing any assumptions? (The scariest ones are the ones nobody mentions.)
Do the priority ratings feel right? Is anything under-weighted because the team has false confidence?
Is the validation plan realistic given timeline and resources?
Who on the team should own each validation? (Assign owners now or it won't happen.)
When will you re-validate? Assumptions are dated the moment you write them. Market conditions, competitive landscape, and user needs shift. Schedule a review in 4-6 weeks. A validated assumption from last quarter may be invalid this quarter. Product-market fit is a moving target, not a destination.
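The 4-6 week review window is simple date arithmetic. A small sketch, assuming the map is dated on the day it is written; `review_window` is a hypothetical helper and the date is made up:

```python
from datetime import date, timedelta


def review_window(mapped_on):
    """Return the earliest and latest review dates, 4 and 6 weeks out."""
    return mapped_on + timedelta(weeks=4), mapped_on + timedelta(weeks=6)


earliest, latest = review_window(date(2025, 6, 2))  # hypothetical mapping date
print(earliest, latest)  # 2025-06-30 2025-07-14
```

If a go/no-go decision falls inside that window, schedule the review before it rather than after.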

Output location

Present the assumption map as formatted text in the conversation. The user copies it into their project documentation. The validation plan feeds directly into /interview-plan, /usability-test-plan, or /experiment-design depending on the assumption type.

Example Output

Input

The idea or bet: Add an AI-powered spend categorization feature to Ardent Finance's SMB expense management platform — the system would automatically tag transactions and flag policy violations without requiring manual review
Current evidence: NPS surveys show "manual categorization" is the #3 complaint; one enterprise pilot customer reduced reconciliation time by 40%; no data on SMB willingness to pay a premium; engineering has not evaluated ML infrastructure costs
Stakeholder context: VP of Product believes this closes deals against Expensify; CFO is skeptical because a similar initiative was deprioritized at a previous company due to low adoption
Timeline pressure: Board review in 10 weeks; team needs a go/no-go on roadmap commitment in 6 weeks

Output (abbreviated)

## Assumption Map -- Ardent Finance AI Spend Categorization, June 2025

### Value assumptions (Will anyone want this?)
1. SMB finance managers find manual categorization painful enough to change workflows for an automated solution
2. Users will trust AI-generated categories enough to act on them without reviewing each transaction
3. Policy violation flagging is a valued feature, not just a nice-to-have -- SMBs will pay to catch violations proactively
4. The 40% reconciliation time reduction from the enterprise pilot will generalize to SMB users with smaller transaction volumes

### Usability assumptions (Can people figure it out?)
1. Users will understand when and why the AI is uncertain, and know what to do when confidence is low
2. Overriding an AI-assigned category won't feel more effortful than manual entry from scratch
3. The policy violation alert UI won't generate false-positive fatigue that causes users to ignore all flags

### Viability assumptions (Should we build this?)
1. SMBs will pay a meaningful price premium (≥$15/seat/month) for AI categorization over the base tier
2. Categorization accuracy needs to stay above ~92% or support ticket volume will erode margin
3. This feature differentiates Ardent against Expensify and Ramp rather than prompting a feature match within 2 quarters

### Feasibility assumptions (Can we build this?)
1. Ardent's current transaction data is large enough and clean enough to train or fine-tune a categorization model
2. The team can ship an MVP accurate enough to demo in 6 weeks with current ML and backend capacity
3. Ongoing model retraining and monitoring can be absorbed without a dedicated MLOps hire

### Risk matrix

| # | Assumption | Risk category | Impact if wrong | Evidence level | Priority |
|---|---|---|---|---|---|
| 1 | SMBs find manual categorization painful enough to change workflows | Value | High | Some evidence (NPS data) | **Test first** |
| 2 | Users will trust AI categories without reviewing each transaction | Value | High | No evidence | **Test first** |
| 3 | SMBs will pay ≥$15/seat premium for this feature | Viability | High | No evidence | **Test first** |
| 4 | 40% time savings generalizes from enterprise to SMB segment | Value | High | Contradictory evidence | **Test first** |
| 5 | Override flow won't feel harder than manual entry | Usability | High | No evidence | **Test soon** |
| 6 | Users will understand low-confidence AI states | Usability | Medium | No evidence | **Test soon** |
| 7 | Transaction data is sufficient for model training | Feasibility | High | No evidence | **Test soon** |
| 8 | Policy flagging is valued, not just tolerated | Value | Medium | Some evidence | **Test soon** |
| 9 | Accuracy must stay above ~92% to protect margin | Viability | Medium | Some evidence | Monitor |
| 10 | Feature won't be matched by Expensify within 2 quarters | Viability | Medium | Some evidence | Monitor |
| 11 | MVP shippable in 6 weeks with current team | Feasibility | Medium | Some evidence | Monitor |
| 12 | MLOps absorbed without dedicated hire | Feasibility | Low | Some evidence | Accept for now |

Risk cluster diagnosis: Assumptions heavily cluster in Desirability — the team has strong internal conviction but is largely guessing at SMB user behavior and willingness to pay. This is a value risk, not a delivery risk. Validate demand before touching the tech stack.

### Validation plan

| Assumption | Recommended method | Why this method | Timeline | Effort |
|---|---|---|---|---|
| SMBs find categorization painful enough to change workflows | 8 problem interviews with SMB finance managers across 3 verticals | Need first-person accounts of the workflow, not survey proxies, to uncover severity and current workarounds | Week 1–2 | Medium |
| Users will trust AI categories without reviewing each transaction | Prototype concept test: show categorized transactions and observe review behavior | Usability + value hybrid; reveals trust calibration before any code is written | Week 2–3 | Low |
| SMBs will pay ≥$15/seat premium | Pricing page fake-door test + 5 sales discovery calls asking directly about budget | Combine stated and revealed willingness to pay; CFO skepticism makes this a must-answer before board review | Week 1–3 | Low–Medium |
| 40% time savings generalizes to SMBs | Diary study or time-on-task analysis with 3 SMB pilot users doing live reconciliation | Behavioral data beats extrapolating from a single enterprise customer | Week 2–4 | Medium |
| Override flow usability | 5-participant hallway usability test on Figma prototype with override scenario | Fast, cheap, answers the specific interaction risk before engineering speculates | Week 3 | Low |
| Transaction data sufficient for model training | Engineering data audit + 1-week feasibility spike | Unblock or kill the technical path early, since this gates everything downstream | Week 1 | Medium |

### Open questions before 6-week go/no-go
- Who owns the SMB interview recruiting -- does CS have a warm list, or do we need a panel?
- Should sales be included in the pricing fake-door test, or will that contaminate active pipeline?
- The CFO's prior experience with low adoption: what specifically failed? Interview that stakeholder before finalizing the assumption map.

Re-validate date: Schedule the assumption review for Week 5 (one week before the go/no-go decision) so findings are incorporated before the roadmap commitment is made.
