Use this when you're joining an existing product and need to quickly understand where the user experience falls short -- or when it's time for a periodic reassessment. Produces a prioritized gap map with severity ratings, evidence, and recommendations.
Related skills: For a full WCAG accessibility evaluation (perceivable, operable, understandable, robust), use /accessibility-audit -- this skill focuses on Nielsen's heuristics and includes only a baseline accessibility screen, not full standards compliance. For evaluative testing with real users, use /usability-test-plan.
Process
Step 1: Gather inputs
Ask the user to provide:
- Product description — what it does, who it's for, current state
- Raw observations — notes from walking through the product yourself (screenshots, flow descriptions, pain points noticed)
- User data — any available support tickets, NPS responses, app store reviews, analytics (drop-off rates, low-adoption features, time-on-task data)
- User segments — who are the primary users? Different users experience different gaps.
- Core flows to assess — which user flows matter most? (e.g., onboarding, core task, checkout, search)
The user can paste raw notes, screenshots, or data exports. If they haven't walked through the product yet, advise them to spend 2-4 hours using it before running the audit.
Step 2: Heuristic evaluation
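Analyze the observations against Nielsen's 10 usability heuristics, plus two supplementary checks: an accessibility quick screen (11) and, for multi-locale products, internationalization and cultural fit (12):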
## UX Gap Assessment — (Product name, date)
### Heuristic Evaluation
| # | Heuristic | Rating | Key Findings |
|---|-----------|--------|-------------|
| 1 | Visibility of system status | (Good / Issues found) | (Brief summary) |
| 2 | Match between system and real world | (Good / Issues found) | (Brief summary) |
| 3 | User control and freedom | (Good / Issues found) | (Brief summary) |
| 4 | Consistency and standards | (Good / Issues found) | (Brief summary) |
| 5 | Error prevention | (Good / Issues found) | (Brief summary) |
| 6 | Recognition rather than recall | (Good / Issues found) | (Brief summary) |
| 7 | Flexibility and efficiency | (Good / Issues found) | (Brief summary) |
| 8 | Aesthetic and minimalist design | (Good / Issues found) | (Brief summary) |
| 9 | Error recovery | (Good / Issues found) | (Brief summary) |
| 10 | Help and documentation | (Good / Issues found) | (Brief summary) |
| 11 | Accessibility (WCAG 2.2 AA baseline) | (Good / Issues found) | (Brief summary) |
| 12 | Internationalization & cultural fit (if multi-locale) | (Good / Issues found) | (Brief summary) |
Heuristic 11 -- Accessibility quick screen:
This is not a replacement for /accessibility-audit (which covers full WCAG 2.2 compliance). This is a baseline check that every UX audit should include:
- Perceivable -- do text and UI elements meet contrast ratios (4.5:1 text, 3:1 components)? Is information conveyed beyond color alone? Do images have alt text?
- Operable -- can all interactive elements be reached and activated via keyboard? Are focus indicators visible? Do touch targets meet minimum size (24x24 CSS px for WCAG 2.2 AA; 44x44 CSS px recommended)?
- Understandable -- are form inputs labeled? Do error messages identify the problem and suggest correction? Is language consistent?
- Robust -- does the heading hierarchy follow h1-h6 order? Are custom components using appropriate ARIA roles?
Flag issues found, note severity, and recommend /accessibility-audit for a comprehensive evaluation if significant issues surface.
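The contrast thresholds in the Perceivable check can be verified numerically instead of by eye. The following is a minimal sketch of the WCAG relative-luminance and contrast-ratio formulas. The function names (`contrast_ratio`, `meets_contrast`) are illustrative, not part of this skill or any library, and the sketch assumes opaque 6-digit hex colors.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to linear light.

    0.04045 is the sRGB transfer-function threshold; some WCAG texts print
    0.03928, which gives identical results for 8-bit colors.
    """
    c = channel / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of an opaque color given as '#rrggbb'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 (identical) to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_contrast(foreground: str, background: str, kind: str = "text") -> bool:
    """Apply the quick-screen thresholds: 4.5:1 for text, 3:1 for components."""
    threshold = 4.5 if kind == "text" else 3.0
    return contrast_ratio(foreground, background) >= threshold
```

For example, `meets_contrast("#777777", "#ffffff", "text")` is false while the same pair passes as a UI component, which is the kind of borderline case worth flagging in the Key Findings column.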
Heuristic 12 -- Internationalization & cultural fit (if multi-locale):
Skip this heuristic if the product serves a single locale. Include it when the product serves or plans to serve multiple languages or regions.
- Text truncation -- do translated strings overflow containers or get clipped? Check languages that tend to produce long strings (German, Finnish).
- Date/time/number formatting -- are formats correct for each locale, or hardcoded to one convention?
- RTL layout support -- if serving Arabic, Hebrew, Farsi, or Urdu: is the layout properly mirrored? Navigation, icons, progress indicators?
- Cultural appropriateness -- do imagery, icons, colors, or metaphors carry unintended meaning in target cultures?
- Text expansion handling -- do UI layouts accommodate 30-40% longer strings without breaking?
- Locale switcher -- is the language/region selector discoverable and functional? Does it use native language names (Deutsch, not German)?
- Input methods -- does the product support IME input for CJK languages, diacritics for European languages?
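The truncation and text-expansion checks above can be approximated before real translations exist by pseudo-localizing the UI strings. The sketch below pads each string by a configurable ratio (40% by default, the upper end of the range above) and swaps in accented vowels so untranslated or clipped text is easy to spot. The helper names are hypothetical, and character count is only a rough proxy for rendered pixel width.

```python
import math

# Map plain vowels to accented look-alikes so pseudo-localized text stands out.
ACCENTED = str.maketrans("aeiouAEIOU", "áéíöüÁÉÍÖÜ")


def pseudo_localize(text: str, expansion: float = 0.4) -> str:
    """Return an accented copy of `text`, padded and wrapped in brackets.

    The brackets make clipping visible: if the closing bracket is missing
    in the rendered UI, the string was truncated.
    """
    padding = "~" * math.ceil(len(text) * expansion)
    return f"[{text.translate(ACCENTED)}{padding}]"


def fits_after_expansion(text: str, max_chars: int, expansion: float = 0.4) -> bool:
    """Rough check: does the expanded string stay within a character budget?"""
    return math.ceil(len(text) * (1 + expansion)) <= max_chars
```

Running the product with pseudo-localized strings during the walkthrough surfaces overflow in buttons, tabs, and table headers without waiting for translated builds.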
For each issue found, detail:
- What's happening — describe the problem
- Severity — Critical / Major / Minor / Cosmetic
- Frequency — how often does a user encounter this?
- Evidence — observation + supporting data (if available)
Mobile and touch-specific heuristics
If the product has a mobile or responsive experience, add this evaluation:
### Mobile & Touch Evaluation
| Criterion | Rating | Key Findings |
|-----------|--------|-------------|
| Touch targets (minimum 44x44px, 48x48px preferred) | (Pass / Fail) | (Summary) |
| Thumb zone placement (primary actions in easy-reach zone) | (Good / Issues found) | (Summary) |
| Viewport adaptation (no horizontal scroll, readable without zoom) | (Good / Issues found) | (Summary) |
| Input method fit (appropriate keyboards, no hover-dependent UI) | (Good / Issues found) | (Summary) |
| Gesture discoverability (swipe, pull-to-refresh, long-press are discoverable) | (Good / Issues found) | (Summary) |
| Orientation handling (landscape/portrait transitions work) | (Good / Issues found) | (Summary) |
| Fat-finger tolerance (adequate spacing between tappable elements) | (Good / Issues found) | (Summary) |
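If element dimensions are available (for example, from browser dev tools or a design file), the touch-target row can be rated mechanically. This is a small sketch using the 44px minimum and 48px preferred sizes from the table above; the `Target` type and `rate_touch_target` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass
class Target:
    """A tappable element and its rendered size in pixels."""
    name: str
    width: float
    height: float


def rate_touch_target(target: Target, minimum: float = 44, preferred: float = 48) -> str:
    """Rate a target by its smaller dimension against the table's thresholds."""
    smallest_side = min(target.width, target.height)
    if smallest_side >= preferred:
        return "Pass (preferred)"
    if smallest_side >= minimum:
        return "Pass"
    return "Fail"
```

Rating by the smaller side matters because a wide but short element (such as a 120x30 text link) is still hard to tap accurately.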
Step 3: Cross-reference with existing research
Before layering in quantitative data, check if prior research exists that strengthens or challenges your heuristic findings. Ask the user:
- Do you have past interview transcripts, usability test results, or research syntheses for this product?
- Are there research nuggets or synthesis documents from /interview-synthesis or /research-synthesize?
If prior research is available, cross-reference:
### Prior Research Cross-Reference
| Heuristic finding | Prior research evidence | Effect on severity |
|---|---|---|
| (Gap from heuristic review) | (What interviews/tests revealed about this same area) | Increased / Confirmed / Decreased / No prior data |
This step matters because heuristic evaluation is expert opinion — prior research adds real user evidence. A heuristic issue that users also struggled with in testing is higher-confidence than one based on expert judgment alone. Conversely, an issue you flagged that users navigated easily may be lower-severity than your review suggested.
If no prior research exists, note this as a gap:
[NOTE: No prior user research available for this product. Heuristic findings are expert-judgment only. Consider follow-up usability testing to validate severity ratings.]
Step 4: Cross-reference with user data
If the user provided support data, analytics, or reviews, layer it against the heuristic findings:
### Data Cross-Reference
| Finding | Heuristic evidence | Data evidence | Confidence |
|---------|-------------------|---------------|------------|
| (Gap) | (What you observed) | (What the data shows) | High / Medium / Low |
Findings supported by both observation and data are high-confidence gaps.
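One way to keep the Confidence column consistent across findings is to encode the rule explicitly. Only the "High" rule comes from this skill; the Medium and Low rules in the sketch below are assumptions (observation plus unquantified data is Medium, a single kind of evidence is Low), and the names are hypothetical.

```python
from enum import Enum


class DataEvidence(Enum):
    NONE = "none"              # no supporting data was provided
    ANECDOTAL = "anecdotal"    # mentioned in tickets or reviews, not quantified
    QUANTIFIED = "quantified"  # backed by a number (rate, share, score)


def rate_confidence(observed: bool, data: DataEvidence) -> str:
    """Map the evidence behind a finding to High / Medium / Low."""
    if not observed and data is DataEvidence.NONE:
        raise ValueError("a finding needs at least one kind of evidence")
    if observed and data is DataEvidence.QUANTIFIED:
        return "High"    # rule stated in this skill: observation + data
    if observed and data is DataEvidence.ANECDOTAL:
        return "Medium"  # assumption: data exists but is not quantified
    return "Low"         # assumption: only one kind of evidence
```

Adjust the Medium and Low rules to match the team's own evidence standards; the point is to apply one rule to every row.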
Step 5: Gap map
Produce the prioritized gap map:
### Gap Map (prioritized)
| # | Gap | Severity | Frequency | User Segments | Evidence | Recommendation |
|---|-----|----------|-----------|---------------|----------|----------------|
| 1 | (What's missing or broken) | Critical | Daily | (Who) | (Sources) | (What to do) |
| 2 | ... | Major | Weekly | ... | ... | ... |
### Quick Wins (high value, low effort)
- (Improvement 1) — (why it's quick, what it unlocks)
- (Improvement 2)
### Deep Investigations Needed
- (Gap that needs user research to fully understand)
### What's Working Well
- (Strength 1 — protect this)
- (Strength 2)
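The ordering of the gap map can be sketched as a sort on severity first and frequency second. The severity scale is the one defined in Step 2; the frequency scale and the `Gap` type below are assumptions for illustration, since frequency is free text in the template.

```python
from dataclasses import dataclass

# Severity scale from Step 2, most urgent first.
SEVERITY_ORDER = {"Critical": 0, "Major": 1, "Minor": 2, "Cosmetic": 3}
# Assumed frequency scale; replace with whatever labels the audit uses.
FREQUENCY_ORDER = {"Daily": 0, "Weekly": 1, "Monthly": 2, "Rarely": 3}


@dataclass
class Gap:
    title: str
    severity: str
    frequency: str


def prioritize(gaps: list[Gap]) -> list[Gap]:
    """Sort gaps by severity, then by how often users hit them."""
    return sorted(
        gaps,
        key=lambda g: (SEVERITY_ORDER[g.severity], FREQUENCY_ORDER[g.frequency]),
    )


def to_markdown(gaps: list[Gap]) -> str:
    """Render the prioritized gaps as the first columns of the gap map table."""
    rows = [
        "| # | Gap | Severity | Frequency |",
        "|---|-----|----------|-----------|",
    ]
    for rank, gap in enumerate(prioritize(gaps), start=1):
        rows.append(f"| {rank} | {gap.title} | {gap.severity} | {gap.frequency} |")
    return "\n".join(rows)
```

A strict sort is only a starting point: segment impact and evidence strength may justify moving a gap up or down during Step 6.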
Step 6: Review and refine
Ask the user:
- Do the severity ratings match your experience with the product?
- Any gaps missing from the map?
- Are the quick wins actually quick? (You know the codebase and team capacity better than the agent.)
- Is "what's working well" accurate? (Don't let the audit be entirely negative.)
Adjust as needed.
Portfolio example: B Lab (2020-2021)
Kate audited B Lab's certification platform and SDG Action Manager. Key findings: 55% of SDG Action Manager users hadn't answered a single question, the abandonment rate was 90%, NPS was 26, and more than 50% of support tickets were confusion-related. She identified 9 platform-wide UX problems, including IA hidden behind hamburger menus, insider "B" language users didn't understand, too many distracting options post-registration, and no clear path to success. She created lo-fi dashboard mockups showing before/after: irrelevant nav items were stripped and new users were guided only to the actions that mattered most.
Output location
Present the assessment as formatted text in the conversation for the user to share with the team and convert into backlog items.
Example Output
Input
- Product description: Roofr, a SaaS platform for roofing contractors to generate instant roof measurement reports and send proposals to homeowners. Primary users are small roofing business owners and their estimators (30–200 employee shops). Currently in growth phase with ~8,000 active contractor accounts.
- Raw observations: Walked through onboarding, report ordering, and proposal builder flows. Noticed: no progress indicator during the 3-step report order process; proposal editor has 14 toolbar icons with no labels; "Upgrade" CTA appears 6 times on the dashboard before a user has completed their first report; error message on payment failure just reads "Something went wrong"; can't edit a sent proposal without duplicating it first (not obvious).
- User data: App store rating 3.6 stars (common complaint: "confusing editor"); Intercom support tickets show 34% tagged "how do I edit a proposal"; onboarding funnel analytics show 61% drop-off between "account created" and "first report ordered"; NPS of 31; top detractor comment theme: "I didn't know my proposal was sent until the client called me."
- User segments: (1) Owner-operators who do their own estimating, low tech fluency; (2) office estimators at mid-size shops, higher tech fluency but time-pressured; (3) new signups in 14-day trial, highest churn risk.
- Core flows to assess: Onboarding to first report, proposal builder (create → send), and post-send confirmation.
UX Gap Assessment — Roofr, June 2025
Heuristic Evaluation
| # | Heuristic | Rating | Key Findings |
|---|---|---|---|
| 1 | Visibility of system status | Issues found | No progress indicator in report order flow; users don't know if report is processing or failed; no confirmation screen post-send on proposals |
| 2 | Match between system and real world | Issues found | "Measurement report" vs. "slope report" terminology inconsistent; proposal editor uses design-tool metaphors (layer icons) unfamiliar to contractors |
| 3 | User control and freedom | Issues found | No undo in proposal editor; editing a sent proposal requires non-obvious duplication step; no draft autosave indicator |
| 4 | Consistency and standards | Issues found | Primary action buttons alternate between blue and green across flows; "Send" vs. "Deliver" vs. "Submit" used interchangeably for same action |
| 5 | Error prevention | Issues found | No confirmation dialog before sending a proposal; no validation warning if proposal total is $0 before send |
| 6 | Recognition rather than recall | Issues found | 14 unlabeled toolbar icons in proposal editor force memorization; recent reports not surfaced on dashboard |
| 7 | Flexibility and efficiency | Issues found | No proposal templates for returning estimators; no bulk re-order for repeat roof types; power users have no keyboard shortcuts |
| 8 | Aesthetic and minimalist design | Issues found | Dashboard shows 6 "Upgrade" prompts before user completes first report; sidebar has 11 nav items visible to trial users, most irrelevant |
| 9 | Error recovery | Critical issues | Payment failure message: "Something went wrong" — no reason given, no next step; failed report orders have no retry path visible |
| 10 | Help and documentation | Issues found | Help widget hidden under settings menu; no contextual tooltips in proposal editor; onboarding checklist disappears after day 1 |
| 11 | Accessibility (WCAG 2.2 AA baseline) | Issues found | Several toolbar icons fail 3:1 contrast on white background; no visible focus ring on proposal editor canvas; icon-only buttons lack aria-labels |
Accessibility note: Icon contrast failures and missing aria-labels are significant. Recommend running /accessibility-audit for full WCAG 2.2 AA evaluation before next major release.
Mobile & Touch Evaluation
| Criterion | Rating | Key Findings |
|---|---|---|
| Touch targets (min 44×44px) | Fail | Proposal editor toolbar icons measure ~28×28px on mobile; frequent mis-taps reported in reviews |
| Thumb zone placement | Issues found | Primary "Send Proposal" CTA sits top-right on mobile — outside easy thumb reach for right-handed users |
| Viewport adaptation | Issues found | Proposal preview table scrolls horizontally on viewports below 390px without user indication |
| Input method fit | Good | Numeric fields correctly trigger numeric keyboard; no hover-only interactions found |
| Gesture discoverability | Issues found | No pull-to-refresh on reports list; swipe-to-delete exists on line items but is undiscoverable |
| Orientation handling | Good | Layout reflows correctly in landscape; no broken states observed |
| Fat-finger tolerance | Issues found | "Duplicate" and "Delete" proposal actions are adjacent with <8px gap; accidental deletion risk |
Prior Research Cross-Reference
| Heuristic finding | Prior research evidence | Effect on severity |
|---|---|---|
| No post-send confirmation (H1) | Previous usability test (Q3 2024) showed 3/5 participants refreshed the page repeatedly after sending, unsure if action completed | Increased — confirmed user confusion, not just expert judgment |
| Unlabeled toolbar icons (H6) | No prior research on this specific issue | No prior data |
| Upgrade prompt overload on dashboard (H8) | Onboarding interview notes (Q1 2025): 4/6 new users described dashboard as "overwhelming" or "pushy" on first login | Increased — research-confirmed |
Data Cross-Reference
| Finding | Heuristic evidence | Data evidence | Confidence |
|---|---|---|---|
| Onboarding drop-off | No progress indicator; noisy dashboard with 11 nav items | 61% drop between signup and first report ordered | High |
| Proposal editor confusion | 14 unlabeled icons, no undo, inconsistent terminology | 34% of support tickets = "how do I edit a proposal"; 3.6 App Store rating with editor complaints | High |
| Post-send status anxiety | No confirmation screen or send receipt | "I didn't know my proposal was sent until the client called me" — top NPS detractor theme | High |
| Error recovery failure | "Something went wrong" payment error, no retry path | Volume not quantified, but payment failure tickets are second-largest support category | Medium |
Gap Map (prioritized)
| # | Gap | Severity | Frequency | User Segments | Evidence | Recommendation |
|---|---|---|---|---|---|---|
| 1 | No confirmation or status after proposal is sent | Critical | Every send action | All, especially owner-operators | NPS detractor quotes; Q3 usability test | Add post-send confirmation screen with: sent timestamp, recipient email, "View sent proposal" link, and optional follow-up reminder |
| 2 | "Something went wrong" error on payment failure | Critical | Every failed payment | All segments | Second-largest support ticket category | Replace with specific error (card declined / billing address mismatch) + clear retry CTA + link to update billing |
| 3 | 61% drop-off between signup and first report | Critical | First session | Trial users (highest churn risk) | Funnel analytics; onboarding interview notes | Reduce dashboard nav to 3–4 items for trial users; replace upgrade prompts with a single persistent trial banner; add linear onboarding checklist with progress indicator |
| 4 | Proposal editor icons are unlabeled and too small | Major | Every editing session | Owner-operators (low tech fluency), mobile users | 34% support tickets; App Store reviews; 28×28px touch targets | Add icon labels below toolbar; increase touch targets to 44×44px; introduce 3-item "most used" shortcut bar |
| 5 | No undo / no autosave signal in proposal editor | Major | Multiple times per session | Estimators, owner-operators | User review: "lost my whole proposal twice" | Implement Ctrl+Z undo stack; add autosave indicator ("Saved 2s ago") in editor header |
| 6 | Editing a sent proposal requires non-obvious duplication | Major | Any revision request | All segments | 34% of support tickets include this workflow | Add "Edit & resend" button directly on sent proposal view; show inline explainer on first |