Use this when you need to evaluate the health of an existing design system -- whether tokens cover what the product actually uses, whether components are specced and consistent, whether documentation exists and stays current. Produces a scored health report with prioritized gaps and a remediation roadmap.
Related skills: Pair with `/style-guide` for token foundations. Use `/component-spec` to spec components flagged as missing or incomplete. Reference `/interaction-pattern-library` for pattern consistency. Feed accessibility findings into `/accessibility-audit` for deeper evaluation.
Process
Step 1: Gather inputs
Ask the user to provide:
- Design system location -- Storybook URL, Figma library, documentation site, or code repo
- Component inventory -- list of components the system claims to offer (or where to find it)
- Token system -- does it exist? Where are tokens defined (Figma, CSS variables, JSON, Style Dictionary)?
- Product screens -- representative screenshots or flows showing how the system is actually used
- Known pain points -- what does the team already know is broken, missing, or drifting?
- Team context -- who maintains the system? How many products consume it?
If the user doesn't have a component inventory, note this as a finding -- a design system without a published inventory is already in trouble.
Step 2: Token coverage audit
Evaluate whether the token system covers foundational design decisions:
## Token Coverage
| Token category | Defined? | Count | Coverage assessment |
|---------------|----------|-------|---------------------|
| Color -- brand | Yes / No | (n) | (Are all brand colors tokenized? Any hardcoded hex in product?) |
| Color -- semantic | Yes / No | (n) | (Success, warning, error, info mapped? Dark mode variants?) |
| Color -- surface/background | Yes / No | (n) | (Elevation levels, card backgrounds, overlays) |
| Typography -- scale | Yes / No | (n) | (Heading levels, body, caption, overline) |
| Typography -- weight | Yes / No | (n) | (Regular, medium, semibold, bold at minimum) |
| Spacing | Yes / No | (n) | (Consistent scale? 4px/8px base? Covers padding + margin) |
| Border radius | Yes / No | (n) | (Consistent values? Component-specific overrides?) |
| Elevation/shadow | Yes / No | (n) | (Layering system? Consistent depth levels?) |
| Motion/duration | Yes / No | (n) | (Transition speeds, easing curves, prefers-reduced-motion?) |
| Breakpoints | Yes / No | (n) | (Responsive breakpoints defined as tokens?) |
| Z-index | Yes / No | (n) | (Managed scale or ad-hoc values?) |
### Token health indicators
- **Hardcoded values in product:** (High / Medium / Low -- are products using tokens or bypassing them?)
- **Naming convention:** (Consistent / Inconsistent / None)
- **Theming support:** (Yes / Partial / No -- can tokens swap for dark mode, white-label, etc.?)
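
The "hardcoded values in product" indicator can be estimated by scanning product stylesheets for raw hex colors that bypass the token layer. The following is a minimal sketch, assuming tokens are consumed as CSS custom properties through `var(--...)`; the function name `find_hardcoded_colors`, the sample CSS, and the color values are hypothetical, and the colon-splitting heuristic is not a full CSS parser.

```python
import re

# Matches 3-, 4-, 6-, or 8-digit hex colors such as #fff or #1a2b3c.
HEX_COLOR = re.compile(r"#(?:[0-9a-fA-F]{8}|[0-9a-fA-F]{6}|[0-9a-fA-F]{3,4})\b")


def find_hardcoded_colors(css: str) -> list[tuple[int, str]]:
    """Return (line_number, hex_value) for each raw hex color in a declaration value.

    Lines that define a custom property (start with "--") are skipped,
    because that is where token values are expected to live.
    """
    findings = []
    for line_number, line in enumerate(css.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("--"):
            continue  # token definition, not a bypass
        # Heuristic: only inspect text after the first colon, so ID
        # selectors such as "#fade {" are not mistaken for colors.
        _, separator, value = stripped.partition(":")
        if not separator:
            continue
        for match in HEX_COLOR.finditer(value):
            findings.append((line_number, match.group(0)))
    return findings


# Hypothetical stylesheet used only to exercise the function.
sample_css = """\
:root {
  --harbor-teal-500: #0f766e;
}
.alert-error {
  color: #d32f2f;
  background: var(--surface-primary);
  box-shadow: 0 1px 2px #0003;
}
"""

for line_number, value in find_hardcoded_colors(sample_css):
    print(f"line {line_number}: {value}")
```

The same approach could be extended to raw `px` spacing, `box-shadow`, or `z-index` values, though each would need its own pattern and allow-list.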
Step 3: Component completeness audit
Compare what the design system offers against what the product actually needs:
## Component Completeness
### Inventory vs. usage
| Component | In system? | In product? | Specced? | Status |
|-----------|-----------|-------------|----------|--------|
| Button | Yes / No | Yes / No | Full / Partial / None | (Complete / Gap / Orphaned / Drift) |
| Input | Yes / No | Yes / No | Full / Partial / None | ... |
| (repeat for all components) | | | | |
### Status definitions
- **Complete** -- in system, used in product, fully specced
- **Gap** -- used in product but missing from system (or in system but not specced)
- **Orphaned** -- in system but not used in any product
- **Drift** -- exists in both but product version has diverged from system version
### Coverage summary
- Components in system: (n)
- Components used in product: (n)
- Gaps (in product, not in system or unspecced): (n)
- Orphaned (in system, not in product): (n)
- Drift (diverged): (n)
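
The status definitions above can be applied mechanically once each inventory row is recorded. The sketch below is one possible encoding, not a prescribed rule set: the function name `classify`, the `diverged` flag, and the precedence (Drift is checked before spec completeness) are assumptions. The sample rows mirror entries from the example output later in this document.

```python
from collections import Counter


def classify(in_system: bool, in_product: bool, specced: str, diverged: bool = False) -> str:
    """Map one inventory row to Complete, Gap, Orphaned, or Drift.

    specced is "Full", "Partial", or "None". Assumed precedence:
    Orphaned and missing-from-system Gap first, then Drift, then spec level.
    """
    if not in_system and not in_product:
        raise ValueError("component appears in neither system nor product")
    if in_system and not in_product:
        return "Orphaned"
    if in_product and not in_system:
        return "Gap"
    if diverged:
        return "Drift"
    # In both places and not diverged: only a full spec counts as Complete.
    return "Complete" if specced == "Full" else "Gap"


# (name, in_system, in_product, specced, diverged)
rows = [
    ("Input (text)", True, True, "Full", False),
    ("Data table", True, True, "None", False),
    ("Date picker", False, True, "None", False),
    ("Avatar", True, False, "Full", False),
    ("Modal / dialog", True, True, "Partial", True),
]

summary = Counter(classify(*row[1:]) for row in rows)
for status in ("Complete", "Gap", "Orphaned", "Drift"):
    print(f"{status}: {summary[status]}")
```

Tallying with a `Counter` gives the numbers for the coverage summary directly, which helps keep the summary and the table from disagreeing.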
Step 4: Consistency check
Evaluate whether the system is internally consistent:
## Consistency Assessment
| Dimension | Rating | Findings |
|-----------|--------|----------|
| Naming conventions | Consistent / Mixed / No pattern | (e.g., "Button" vs "Btn" vs "ActionButton") |
| Variant patterns | Consistent / Mixed | (Do all components use the same variant naming? size=sm/md/lg?) |
| State coverage | Complete / Gaps | (Do all interactive components define hover, focus, active, disabled?) |
| Prop/API patterns | Consistent / Mixed | (Similar components use similar APIs? onChange vs onUpdate?) |
| Spacing application | Consistent / Ad-hoc | (Components use token spacing or hardcode values?) |
| Responsive behavior | Defined / Undefined | (Components have responsive rules or left to consumers?) |
| Error patterns | Consistent / Varies | (Error states follow a single pattern or each component differs?) |
| Empty states | Defined / Missing | (Components that display data have empty state designs?) |
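
The state coverage row lends itself to a simple set comparison. As a rough sketch, assuming each interactive component's documented states have been collected by hand into a dictionary (the component names and their states below are hypothetical), the missing states can be listed like this:

```python
# States every interactive component is expected to define, per the table above.
REQUIRED_STATES = ("hover", "focus", "active", "disabled")


def missing_states(components: dict[str, set[str]]) -> dict[str, list[str]]:
    """Return, for each component with gaps, the required states it does not define."""
    report = {}
    for name, defined in components.items():
        missing = [state for state in REQUIRED_STATES if state not in defined]
        if missing:
            report[name] = missing
    return report


# Hypothetical audit input: component name -> states found in its spec.
audited = {
    "Button": {"hover", "focus", "active", "disabled"},
    "IconButton": {"hover", "active"},
    "NavItem": {"hover", "focus", "disabled"},
}

for name, missing in missing_states(audited).items():
    print(f"{name}: missing {', '.join(missing)}")
```

The count of components in the report, over the total audited, gives the "N of M interactive components" figure for the Findings column.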
Step 5: Accessibility compliance
Evaluate the system's accessibility posture at the system level -- not individual page compliance, but whether the system makes it easy or hard to build accessible products:
## Accessibility at the System Level
| Criterion | Status | Finding |
|-----------|--------|---------|
| Color contrast tokens meet 4.5:1 (text) and 3:1 (UI) | Pass / Fail | (details) |
| Focus styles defined in the system (not left to consumers) | Yes / No | (details) |
| Keyboard interaction patterns documented per component | Yes / No / Partial | (details) |
| ARIA patterns included in component specs | Yes / No / Partial | (details) |
| Touch target sizes meet 24px minimum (44px recommended) | Yes / No | (details) |
| Motion respects prefers-reduced-motion | Yes / No / Not applicable | (details) |
| Screen reader announcements documented for dynamic components | Yes / No | (details) |
| High-contrast/forced-colors compatibility | Tested / Untested | (details) |
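
The contrast criterion can be checked numerically for each text/background token pair. This sketch implements the WCAG relative luminance and contrast ratio formulas for 6-digit hex colors; the function names are illustrative, and the sample colors are generic values rather than tokens from any particular system.

```python
def _channel_to_linear(value: int) -> float:
    """Convert one 8-bit sRGB channel to its linear-light value."""
    c = value / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a 6-digit hex color such as "#767676"."""
    digits = hex_color.lstrip("#")
    if len(digits) != 6:
        raise ValueError(f"expected a 6-digit hex color, got {hex_color!r}")
    r, g, b = (int(digits[i:i + 2], 16) for i in (0, 2, 4))
    return (
        0.2126 * _channel_to_linear(r)
        + 0.7152 * _channel_to_linear(g)
        + 0.0722 * _channel_to_linear(b)
    )


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors; argument order does not matter."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


ratio = contrast_ratio("#767676", "#ffffff")
print(f"{ratio:.2f}:1, text threshold met: {ratio >= 4.5}, UI threshold met: {ratio >= 3.0}")
```

Running every semantic text token against every surface token it may appear on gives a pass/fail matrix for the first row of the table.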
Step 6: Documentation quality
Evaluate whether the system is actually usable by the people who need to build with it:
## Documentation Assessment
| Dimension | Status | Notes |
|-----------|--------|-------|
| Component specs exist for all components | All / Most / Few / None | (percentage) |
| Usage guidelines (when to use, when not to) | Yes / Partial / No | (details) |
| Code examples (copy-paste ready) | Yes / Partial / No | (details) |
| Visual examples (rendered states, variants) | Yes / Partial / No | (details) |
| Do/Don't guidance | Yes / Partial / No | (details) |
| Figma-to-code parity documented | Yes / No | (details) |
| Contribution guidelines (how to add/modify) | Yes / No | (details) |
| Migration guides (for breaking changes) | Yes / No / Not applicable | (details) |
| Changelog maintained | Yes / No | (details) |
| Search/discovery (can people find what they need?) | Good / Poor | (details) |
Step 7: Health scorecard and recommendations
## Design System Health Scorecard
| Dimension | Score (1-5) | Summary |
|-----------|-------------|---------|
| Token coverage | (n) | (one-line summary) |
| Component completeness | (n) | (one-line summary) |
| Internal consistency | (n) | (one-line summary) |
| Accessibility posture | (n) | (one-line summary) |
| Documentation quality | (n) | (one-line summary) |
| **Overall health** | **(avg)** | **(one-line overall)** |
### Score definitions
- **5** -- Mature. Comprehensive coverage, consistent patterns, well-documented.
- **4** -- Strong. Minor gaps, mostly consistent, documentation covers the important parts.
- **3** -- Developing. Notable gaps, some inconsistencies, documentation is spotty.
- **2** -- Fragile. Significant gaps, inconsistent patterns, documentation is sparse or outdated.
- **1** -- Nascent. Major foundational work needed.
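
The overall health figure is the average of the five dimension scores. A minimal sketch follows, assuming an unweighted mean and assuming the overall label is taken from the nearest whole score; both are choices made here for illustration, and the scores shown are hypothetical.

```python
from statistics import mean

# Labels from the score definitions above.
LABELS = {5: "Mature", 4: "Strong", 3: "Developing", 2: "Fragile", 1: "Nascent"}


def overall_health(scores: dict[str, int]) -> tuple[float, str]:
    """Return (average rounded to one decimal, label of the nearest whole score)."""
    for dimension, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{dimension}: score {score} is outside 1-5")
    average = round(mean(scores.values()), 1)
    nearest = min(5, max(1, int(average + 0.5)))  # round half up, clamp to 1-5
    return average, LABELS[nearest]


# Hypothetical scores, one per scorecard dimension.
example_scores = {
    "Token coverage": 3,
    "Component completeness": 4,
    "Internal consistency": 2,
    "Accessibility posture": 2,
    "Documentation quality": 3,
}

average, label = overall_health(example_scores)
print(f"Overall health: {average} ({label})")
```

A team that treats one dimension as more critical could swap the plain mean for a weighted one; the scorecard itself does not specify weights.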
### Priority recommendations
| Priority | Recommendation | Dimension | Effort | Impact |
|----------|---------------|-----------|--------|--------|
| P0 | (Fix immediately -- blocking teams or causing defects) | (which) | S/M/L | High |
| P1 | (Fix this cycle -- significant friction or drift) | (which) | S/M/L | High/Medium |
| P2 | (Plan for next cycle -- improvement, not urgent) | (which) | S/M/L | Medium |
### Quick wins
- (Improvement that's small effort, visible impact)
### Systemic issues
- (Patterns that need structural changes, not patches)
### What's working well
- (Strengths to protect and build on)
Step 8: Review
Ask the user:
- Do the scores match your gut sense of system health?
- Are any priority recommendations wrong (too high, too low, missing)?
- What's the team's capacity for design system investment right now?
- Is there a specific dimension you want to dig deeper on?
Adjust scores and recommendations based on their input.
Output location
Present the audit as formatted text in the conversation. The health scorecard and recommendations can be exported for team review or backlog planning.
Example Output
Input
- Design system location: Figma component library ("Meridian DS v2") + internal documentation site at design.harborcredit.io; no Storybook; React component package published on internal npm as `@harbor/meridian`
- Component inventory: 34 components listed in Figma, documentation site covers ~20 of them; team-provided spreadsheet attached
- Token system: Defined in Figma Styles and partially exported to CSS custom properties via a manual Style Dictionary config; JSON source lives in a GitHub repo last updated 7 months ago
- Product screens: 6 flows provided — loan application, account dashboard, payment history, settings, onboarding, and mobile deposit
- Known pain points: Product teams are hardcoding colors for error states because "the red token looks wrong on dark backgrounds"; data table component is in Figma but no one knows how to implement it; three different button styles exist in production
- Team context: 2 designers + 1 front-end engineer maintain the system part-time; system feeds 4 internal web products and 1 React Native mobile app
Output (abbreviated)
Design System Audit — Harbor Credit / Meridian DS v2
Audit date: 2025-07-11
System: Meridian DS v2 · design.harborcredit.io · @harbor/meridian
Auditor note: No Storybook present. Documentation site covers ~59% of claimed components. These gaps are themselves findings.
## Token Coverage
| Token category | Defined? | Count | Coverage assessment |
|---|---|---|---|
| Color — brand | Yes | 12 | Core palette tokenized. harbor-teal-* scale well-defined. |
| Color — semantic | Partial | 4 | Success and error tokens exist but no dark-mode variants; teams hardcoding error reds — confirmed pain point |
| Color — surface/background | Partial | 3 | Only surface-primary and surface-secondary defined; elevation/overlay tokens absent |
| Typography — scale | Yes | 8 | h1–h4, body-lg, body-sm, caption, overline all present |
| Typography — weight | Partial | 2 | Only regular and semibold; medium weight used in product but not tokenized |
| Spacing | Yes | 10 | 4px base, clean 4–64px scale; consistent with product inspection |
| Border radius | Partial | 3 | sm/md/lg defined but component overrides hardcoded in CSS (buttons use 6px not radius-sm) |
| Elevation/shadow | No | 0 | Not tokenized; 4 distinct box-shadow values found hardcoded across product |
| Motion/duration | No | 0 | No duration or easing tokens; prefers-reduced-motion not addressed anywhere |
| Breakpoints | No | 0 | Mobile breakpoints referenced in component specs as raw pixel values only |
| Z-index | No | 0 | Ad-hoc; values of 9, 99, 999, and 9999 all found in product CSS |
### Token health indicators
- Hardcoded values in product: High — semantic colors, shadows, radii, and z-index are routinely bypassed
- Naming convention: Mixed -- Figma Styles use `Color/Brand/Teal-500`; CSS variables use `--harbor-teal-500`; JSON uses `color.brand.teal.500`; three formats, no single source of truth
- Theming support: No -- dark mode is partially shipped in the mobile app using hardcoded overrides, not a token swap
## Component Completeness
| Component | In system? | In product? | Specced? | Status |
|---|---|---|---|---|
| Button (primary/secondary/ghost) | Yes | Yes | Partial | Drift — 3 variants in product vs. 2 in system |
| Input (text) | Yes | Yes | Full | Complete |
| Input (select/dropdown) | Yes | Yes | Partial | Drift — mobile uses custom-built replacement |
| Form validation / inline error | Yes | Yes | None | Gap — no spec, no documented pattern |
| Data table | Yes | Yes | None | Gap — Figma frame exists, zero implementation guidance |
| Modal / dialog | Yes | Yes | Partial | Drift — footer button order inconsistent with spec |
| Toast / notification | Yes | Yes | Partial | Gap — only success variant specced; error/warning absent |
| Pagination | Yes | Yes | None | Gap |
| Date picker | No | Yes | None | Gap — product team built their own, no system alignment |
| Progress indicator (step) | No | Yes | None | Gap — onboarding flow has a 5-step stepper not in system |
| Skeleton loader | No | Yes | None | Gap |
| Badge / status chip | Yes | No | Partial | Orphaned |
| Stat card | Yes | Yes | Full | Complete |
| Avatar | Yes | No | Full | Orphaned |
### Coverage summary
- Components in system: 34
- Components used in product: 31
- Gaps (in product, not in system or unspecced): 9
- Orphaned (in system, not in product): 5
- Drift (diverged): 6
## Consistency Assessment
| Dimension | Rating | Findings |
|---|---|---|
| Naming conventions | Mixed | "DataTable" in Figma, data-table in CSS, DataGrid in React package — three names, one component |
| Variant patterns | Mixed | Buttons use variant=primary/secondary; form fields use type=filled/outline; no unified convention |
| State coverage | Gaps | Focus and disabled states missing on 8 of 14 interactive components; active state absent on all nav items |
| Prop/API patterns | Mixed | Some components use onChange, others onUpdate, one uses handleChange |
| Spacing application | Ad-hoc | Internal component padding uses token scale in ~60% of cases; rest hardcoded |
| Responsive behavior | Undefined | Only 3 of 34 components have documented responsive rules |
| Error patterns | Varies | 4 different error presentation approaches found across forms, modals, toasts, and inline alerts |
| Empty states | Missing | Data table, payment history, and account dashboard all have empty states in product with no system guidance |
## Accessibility at the System Level
| Criterion | Status | Finding |
|---|---|---|
| Color contrast tokens meet 4.5:1 (text) | Fail | text-secondary on surface-primary measures 3.8:1 — fails AA |
| Focus styles defined in system | No | Browser default focus rings are used; no system-level focus token or pattern |
| Keyboard interaction documented per component | Partial | Input and modal have notes; all others silent |
| ARIA patterns in specs | No | Absent across all 34 components |
| Touch target sizes meet 44px recommended | No | Icon buttons in mobile deposit flow measure 32×32px |
| Motion respects prefers-reduced-motion | No | No tokens, no documentation, no implementation |
| Screen reader announcements for dynamic components | No | Toast and form validation have no documented live region guidance |
| High-contrast / forced-colors compatibility | Untested | No evidence this has been evaluated |
## Documentation Assessment
| Dimension | Status | Notes |
|---|---|---|
| Component specs exist for all components | Few | ~20 of 34 have any page; ~8 are genuinely complete |
| Usage guidelines (when to use / when not to) | Partial | Present on 6 components (button, input, modal, stat card, badge, avatar) |
| Code examples | Partial | npm snippets on 9 components; none are copy-paste ready for mobile |
| Visual examples (rendered states, variants) | Partial | Figma embeds present but often out of sync with shipped code |
| Do/Don't guidance | No | Not present on any component page |
| Figma-to-code parity documented | No | No explicit mapping; engineers report guessing prop names |
| Contribution guidelines | No | No documented process; system team reports "ad-hoc Slack conversations" |
| Migration guides | No | v1→v2 migration was undocumented; teams still on v1 in 2 products |
| Changelog | No | No changelog; GitHub commit history is the only record |
| Search / discovery | Poor | Documentation site has no search; navigation is a flat alphabetical list |