Concept Test Plan - AI Agent Skill

Use this when the team has an idea or concept that needs validation before committing to build. A concept test sits between generative research ("what problems exist?") and usability testing ("can people use it?") -- it answers "does this direction resonate?" The skill produces a complete concept test plan: hypothesis, evaluation criteria, stimulus design, discussion guide, and logistics.

Related skills: Upstream: /assumption-map identifies which assumptions need concept validation. Pairs with /screener-design for participant recruitment. Downstream: feeds into /research-synthesize for analysis. If the concept passes, move to /usability-test-plan for evaluative testing. Part of the discovery-sprint recipe.

Process

Step 1: Gather inputs

Ask the user to provide:

The concept -- what idea, feature, or direction are you testing? Describe it in plain language, not product jargon.
What you're trying to learn -- what decision does this test inform? (e.g., "Should we invest a sprint in building this?" or "Which of three directions should we pursue?")
Assumptions being tested -- what has to be true for this concept to succeed? (Link to /assumption-map output if available.)
User segment -- who should evaluate this concept? What context do they need to have?
Constraints -- timeline, number of participants, remote vs. in-person, what stimulus materials exist (or need to be created).

Step 2: Define hypothesis and evaluation criteria

Frame the test around a falsifiable hypothesis:

## Concept Test Plan -- (Concept name, date)

### Hypothesis
We believe that (target users) will (expected reaction) when presented with (concept) because (rationale). We'll know this is true when (measurable signal).

### Evaluation criteria

| Signal | Positive indicator | Negative indicator | How captured |
|---|---|---|---|
| Comprehension | Participant explains concept in own words accurately | Participant misunderstands purpose or confuses with existing solutions | Think-aloud during concept review |
| Relevance | "I would use this" or describes personal use case | "I don't see why I'd need this" or can't name a situation | Direct question + follow-up probe |
| Value | Ranks concept above current workaround | Prefers status quo or competing solution | Comparison question |
| Credibility | Believes the concept would work as described | Expresses doubt about feasibility or trustworthiness | Probe after initial reaction |
| Willingness to act | Would sign up, pay, or switch | "Maybe someday" or conditions that will never be met | Commitment question |

Evaluation criteria rules:

Include at least 3 signals. Comprehension, relevance, and value are non-negotiable. Add credibility, willingness to act, or domain-specific signals as needed.
Positive and negative indicators must be specific observable behaviors or quotes, not interpretations.
"Participants like it" is not a valid signal. What they say, do, or choose is.

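The rules above can be checked mechanically before a plan goes out for review. The following is a minimal Python sketch, not part of the skill itself: it assumes each criterion is a plain dict, and the keys `signal`, `positive`, and `negative` are hypothetical names chosen for illustration. Signal names are matched exactly (case-insensitively), so a renamed signal such as "Value vs. status quo" would need to be mapped to "value" first.

```python
# Signals the rules above call non-negotiable.
REQUIRED_SIGNALS = {"comprehension", "relevance", "value"}

def check_criteria(criteria):
    """Return a list of rule violations; an empty list means the rules pass."""
    problems = []
    names = {c["signal"].strip().lower() for c in criteria}
    if len(criteria) < 3:
        problems.append("fewer than 3 signals")
    missing = REQUIRED_SIGNALS - names
    if missing:
        problems.append("missing required signals: " + ", ".join(sorted(missing)))
    for c in criteria:
        # Each signal needs an observable indicator on both sides.
        if not c.get("positive") or not c.get("negative"):
            problems.append(f"{c['signal']}: needs both a positive and a negative indicator")
    return problems
```

A check like this only catches structural gaps. Whether an indicator is an observable behavior rather than an interpretation still needs a human read.
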
Step 3: Design the stimulus

Choose the right fidelity for the concept being tested:

### Stimulus design

**Format:** (One of the options below)

| Fidelity level | When to use | Examples | Pros | Cons |
|---|---|---|---|---|
| Verbal description | Very early concepts, abstract ideas | Elevator pitch, scenario narrative | Fast, no design needed | Easy to misunderstand; wording can lead participants |
| Storyboard | User journey concepts, multi-step flows | 4-6 panel comic, annotated wireflow | Shows context and sequence | Requires illustration effort |
| Static mockup | UI concepts, feature ideas | Figma screens, annotated screenshots | Concrete, testable | Can fixate on visual details |
| Interactive prototype | Near-build concepts, flow validation | Clickable Figma, coded prototype | Realistic interaction | Expensive to create, hard to change |
| Video / animation | Complex interactions, system concepts | Screen recording with narration, motion prototype | Shows dynamics | One-way, no exploration |
| Concept card | Comparing multiple concepts | Name + tagline + 3 bullet value props per concept | Easy comparison | Thin, may not convey enough |

**Stimulus description:**
- (What will be shown to participants)
- (What level of polish -- rough/medium/polished)
- (What context or framing will accompany it)
- (What will NOT be shown -- scope boundaries)

**Stimulus review checklist:**
- [ ] Does the stimulus convey the concept without the moderator explaining it?
- [ ] Is the fidelity appropriate? (Not so rough that participants can't react, not so polished that they think it's finished)
- [ ] Does it avoid leading the participant toward a specific reaction?
- [ ] If comparing concepts, are they presented at equal fidelity?

Step 4: Build the discussion guide

### Discussion guide

**Before stimulus (5-10 min)**
1. Welcome, consent, think-aloud instruction
2. Context questions about their current workflow:
   - "Tell me about the last time you (relevant activity)."
   - "What do you currently use for (problem area)?"
   - "What's the most frustrating part of (current process)?"

**Stimulus presentation (5-10 min)**
3. Present the concept. Say: "I'm going to show you something we're exploring. It's not finished -- we're looking for your honest reaction."
4. Pause. Let them react. Note their first words -- these are the most honest signal.
5. Comprehension check: "In your own words, what would this do for you?"

**Evaluation probes (15-20 min)**
6. Relevance: "When would you use this? Describe a specific situation."
7. Value: "How does this compare to what you do now? Better, worse, or different?"
8. Concerns: "What worries you about this?" (Not "do you have concerns" -- assume they do.)
9. Missing: "What's missing that would make this useful to you?"
10. Priority: "If you had to choose between this and (alternative/status quo), which would you pick? Why?"

**Commitment test (5 min)**
11. (Choose one or more):
    - "If this existed today, would you try it?" (then probe: "What would you try first?")
    - "Would you recommend this to a colleague? Who specifically?"
    - "What would you give up to get this?" (time, money, switching costs)

**Debrief (5 min)**
12. "What's the one thing you'd change about this concept?"
13. "Anything else we should know?"

Discussion guide rules:

Context questions come BEFORE showing the concept. Never anchor participants on your idea before understanding their world.
The first reaction after stimulus presentation is the most valuable data point. Do not talk over it or prompt too quickly.
"Would you use this?" is a weak question (most people say yes to be polite). "When would you use this?" and "What would you give up?" are stronger.
If testing multiple concepts, randomize presentation order across participants to control for order effects.
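
One way to randomize presentation order is to cycle participants through every possible ordering so that each order is used about equally often. The sketch below is an illustrative assumption, not a prescribed method: the function name `assign_orders`, the participant IDs, and the concept labels are all hypothetical.

```python
import itertools
import random

def assign_orders(participants, concepts, seed=0):
    """Assign each participant a presentation order, cycling through all permutations."""
    rng = random.Random(seed)  # fixed seed so the plan is reproducible
    orders = list(itertools.permutations(concepts))
    rng.shuffle(orders)
    shuffled = list(participants)
    rng.shuffle(shuffled)  # avoid tying order to recruitment sequence
    return {p: orders[i % len(orders)] for i, p in enumerate(shuffled)}

# Example: 8 participants, 2 concepts, so each order is seen 4 times.
plan = assign_orders([f"P{i}" for i in range(1, 9)], ["Concept A", "Concept B"])
```

With three concepts there are six possible orders, so six participants cover each order once. With more concepts the number of orders grows quickly, and full coverage may not be possible with 6-8 participants.
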

Step 5: Plan logistics

### Logistics

| Item | Detail |
|---|---|
| Participants | (6-8 typical for concept tests -- more than a typical usability test, fewer than a survey) |
| Session length | (35-50 min per session, matching the discussion guide timings) |
| Format | (1:1 interview with stimulus -- not focus group) |
| Setting | (Remote / in-person) |
| Recruitment | (Use /screener-design) |
| Stimulus prep | (Who creates it, by when) |
| Schedule | (Date range, sessions per day -- max 4 to avoid moderator fatigue) |
| Comparison design | (If testing multiple concepts: within-subject rotation or between-subjects) |
| Recording | (Video + audio + notes -- confirm consent) |

Step 6: Review and validate

Ask the user:

Is the hypothesis falsifiable? What result would kill this concept?
Is the stimulus fidelity right? Too rough and participants can't react; too polished and they think it's decided.
Are you testing the concept or the execution? (Concept tests should answer "is this worth building?" not "is this the right UI.")
If you're comparing concepts, are they genuinely different directions or minor variations? (Variations should be A/B tested, not concept tested.)
Who needs to see results, and what format? (Consider /research-readout for stakeholder presentation.)

Output location

Present the concept test plan as formatted text in the conversation. The user copies it into their research documentation or shares it with the team.

Example Output

Input

The concept: A "Pre-flight Checklist" feature inside Lattice (HR platform) that surfaces a personalized checklist for managers 2 weeks before each direct report's performance review cycle opens — pulling in flagged 1:1 notes, goal completion data, and peer feedback gaps so managers feel prepared before writing reviews, rather than scrambling at the deadline.
What you're trying to learn: Should we prioritize this feature in the Q3 roadmap over the competing "Review Analytics Dashboard" initiative? We need to know if managers experience the preparation problem acutely enough to change their behavior for a solution like this.
Assumptions being tested: (1) Managers feel underprepared when review cycles open. (2) They would engage with a nudge 2 weeks in advance, not just the day before. (3) Aggregating 1:1 notes + goal data in one place is the primary value, not reminders alone.
User segment: People managers at mid-market companies (200–2,000 employees) who run at least 4 direct reports through annual or semi-annual performance reviews. Must be current Lattice users.
Constraints: 3 weeks until roadmap decision meeting. Remote only. Concept exists as a rough 3-screen Figma mockup (not clickable). Budget for 8 participants max.

Output

Concept Test Plan — Lattice Pre-flight Checklist, June 2025

Hypothesis

We believe that people managers at mid-market companies will recognize the Pre-flight Checklist as solving a real, felt pain point when presented with the concept, because they currently lack a single place to consolidate review evidence and often feel caught off-guard when review cycles open. We'll know this is true when:

≥5 of 8 participants accurately describe the core value (aggregated preparation, not just reminders) in their own words
≥5 of 8 spontaneously describe a personal situation where this would have changed their behavior
≤2 of 8 say they'd prefer to continue their current ad hoc process
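
After the sessions, the three thresholds above can be checked against simple tallies. This is a minimal sketch that assumes all three conditions must hold for the hypothesis to count as supported. The tally keys are hypothetical labels, not terms from the plan.

```python
import operator

# Thresholds copied from the hypothesis above; the keys are hypothetical labels.
THRESHOLDS = {
    "described_core_value": (operator.ge, 5),      # at least 5 of 8
    "named_personal_situation": (operator.ge, 5),  # at least 5 of 8
    "prefers_current_process": (operator.le, 2),   # at most 2 of 8
}

def evaluate_hypothesis(tallies):
    """Map each signal to True/False depending on whether its threshold was met."""
    return {name: op(tallies[name], limit) for name, (op, limit) in THRESHOLDS.items()}

def hypothesis_supported(tallies):
    """True only when every threshold is met (an assumption of this sketch)."""
    return all(evaluate_hypothesis(tallies).values())
```

For example, tallies of 6, 5, and 2 would meet all three thresholds, while 6, 5, and 3 would fail on the status quo condition.
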

Evaluation Criteria

| Signal | Positive indicator | Negative indicator | How captured |
|---|---|---|---|
| Comprehension | Participant says "it pulls together everything I need before I write reviews" without prompting | Participant describes it as "just another reminder" or conflates it with the existing review form | Think-aloud during stimulus; comprehension check question |
| Relevance | Participant names a specific past review cycle where this would have changed what they wrote | Participant says "I already do this in a spreadsheet" or "I don't really prepare in advance" | Relevance probe + current workflow questions |
| Value vs. status quo | Participant says current process takes significant time/effort and positions checklist as meaningfully better | Participant rates current prep as "fine" or says the checklist adds steps rather than removes them | Direct comparison question |
| 2-week timing belief | Participant agrees they would engage 2 weeks out, unprompted or with minimal probing | Participant says "I'd look at this the night before, not two weeks ahead" | Timing probe question |
| Willingness to act | Participant says they would enable this feature today or asks "how do I turn this on?" | Participant says "maybe, if I remembered to check it" or adds conditions that are unlikely to be met | Commitment question |

Stimulus Design

Format: Static mockup (3 annotated Figma screens)

| Fidelity level | Rationale |
|---|---|
| Static mockup (medium polish) | The concept is UI-based and specific enough to require concrete screens, but the decision is whether to build it at all -- not whether the UI is right. Interactive prototype would overcommunicate confidence and take time we don't have. |

Stimulus description:

Screen 1: Manager home view showing "Review cycle opens in 14 days" banner with a "View your Pre-flight Checklist" CTA
Screen 2: The checklist itself — showing direct report cards, each with a completion ring and flags for: 1:1 notes logged (last 90 days), goal progress %, and peer feedback requested/not-requested
Screen 3: Expanded view for one direct report — showing 3 pulled 1:1 note snippets, goal status, and a "Start review draft" button

Polish level: Medium — grayscale wireframes with realistic copy and data (no lorem ipsum). Not branded, not pixel-perfect.

What will NOT be shown: The review writing experience itself, notification/email format, admin configuration settings.

Stimulus review checklist:

Does the stimulus convey the concept without the moderator explaining it?
Is the fidelity appropriate — not so rough that participants can't react, not so polished that they think it's decided?
Does it avoid leading the participant toward a specific reaction?
If comparing concepts, are they presented at equal fidelity? (N/A — single concept test)

Discussion Guide

Before stimulus — 8 min

Welcome, consent, recording confirmation, think-aloud instruction: "There are no right or wrong answers. The most helpful thing you can do is say exactly what you're thinking, even if it seems obvious."
Warm-up context questions:
- "Walk me through what happens the week a performance review cycle opens for your team. What's your first move?"
- "What do you use to pull together your thoughts before you actually write a review?"
- "Tell me about a time a review felt harder to write than you expected. What was missing?"
- "What's the most stressful part of the process for you personally?"

Goal: Establish whether the preparation problem exists in their world before anchoring them on the solution.

Stimulus presentation — 5 min

"I'm going to show you something we're exploring. It's early — not built yet. We want your honest reaction, not feedback on the visuals."
Share Figma link (view-only). Let participant look. Do not speak for the first 30–45 seconds. Note the first thing they say aloud — this is the most unfiltered signal.
Comprehension check: "In your own words, what would this do for you as a manager?"

Evaluation probes — 18 min

Relevance: "When in the last year would this have shown up at a useful moment? Describe the specific situation."
Timing: "The checklist appears 2 weeks before your cycle opens. Is that the right moment — too early, too late? When would you actually look at it?"
Value vs. status quo: "How does this compare to what you do today to prepare? Is it better, worse, or just different?"
Data sources: "Which of the three pieces of information here — 1:1 notes, goal progress, peer feedback gaps — would actually change what you write in a review? Which is noise?"
Concerns: "What worries you about this? What could go wrong?" (Not "do you have any concerns" — assume they do.)
Missing: "What's not here that would make this genuinely useful?"

Commitment test — 5 min

"If this was in your Lattice account when your next review cycle opened, would you use it — or work around it the way you do now? What would you actually do first?"
"Would you show this to another manager on your team and tell them to use it? Who comes to mind, and why?"

Debrief — 4 min

"If you could change one thing about this concept — not the design, the actual idea — what would it be?"
"Anything we didn't ask that you think we should know?"

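The section timings in this guide can be summed to confirm they fit the planned session. A small sketch of that arithmetic, using the minutes listed in the guide above and the 45-minute session length from the logistics table. The function name `remaining_buffer` is illustrative.

```python
# Section timings in minutes, copied from the discussion guide above.
SECTIONS = {
    "Before stimulus": 8,
    "Stimulus presentation": 5,
    "Evaluation probes": 18,
    "Commitment test": 5,
    "Debrief": 4,
}
SESSION_LENGTH = 45  # minutes, from the logistics table

def remaining_buffer(sections, session_length):
    """Minutes left over after all guide sections; negative means the guide overruns."""
    return session_length - sum(sections.values())

buffer_minutes = remaining_buffer(SECTIONS, SESSION_LENGTH)  # 45 - 40 = 5
```

The guide totals 40 minutes, which leaves a 5-minute buffer for late starts or longer answers.
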
Logistics

| Item | Detail |
|---|---|
| Participants | 8 (single concept; no rotation needed) |
| Session length | 45 minutes per session |
| Format | 1:1 remote moderated interview -- not a focus group |
| Setting | Zoom with screen share (participant views Figma link) |
| Recruitment | Lattice CRM -- filter: people managers, 4+ direct reports, mid-market account tier, completed ≥1 review cycle on platform. Use `/screener-design` for screener. |
| Stimulus prep | Design team delivers 3-screen Figma file (view |
