Use this when you need to evaluate the health of an existing design system -- whether tokens cover what the product actually uses, whether components are specced and consistent, whether documentation exists and stays current. Produces a scored health report with prioritized gaps and a remediation roadmap.

Related skills: Pair with /style-guide for token foundations. Use /component-spec to spec components flagged as missing or incomplete. Reference /interaction-pattern-library for pattern consistency. Feed accessibility findings into /accessibility-audit for deeper evaluation.

Process

Step 1: Gather inputs

Ask the user to provide:

- **Design system location** -- Storybook URL, Figma library, documentation site, or code repo
- **Component inventory** -- list of components the system claims to offer (or where to find it)
- **Token system** -- does it exist? Where are tokens defined (Figma, CSS variables, JSON, Style Dictionary)?
- **Product screens** -- representative screenshots or flows showing how the system is actually used
- **Known pain points** -- what does the team already know is broken, missing, or drifting?
- **Team context** -- who maintains the system? How many products consume it?

If the user doesn't have a component inventory, note this as a finding -- a design system without a published inventory is already in trouble.

Step 2: Token coverage audit

Evaluate whether the token system covers foundational design decisions:

## Token Coverage

| Token category | Defined? | Count | Coverage assessment |
|---------------|----------|-------|---------------------|
| Color -- brand | Yes / No | (n) | (Are all brand colors tokenized? Any hardcoded hex in product?) |
| Color -- semantic | Yes / No | (n) | (Success, warning, error, info mapped? Dark mode variants?) |
| Color -- surface/background | Yes / No | (n) | (Elevation levels, card backgrounds, overlays) |
| Typography -- scale | Yes / No | (n) | (Heading levels, body, caption, overline) |
| Typography -- weight | Yes / No | (n) | (Regular, medium, semibold, bold at minimum) |
| Spacing | Yes / No | (n) | (Consistent scale? 4px/8px base? Covers padding + margin) |
| Border radius | Yes / No | (n) | (Consistent values? Component-specific overrides?) |
| Elevation/shadow | Yes / No | (n) | (Layering system? Consistent depth levels?) |
| Motion/duration | Yes / No | (n) | (Transition speeds, easing curves, prefers-reduced-motion?) |
| Breakpoints | Yes / No | (n) | (Responsive breakpoints defined as tokens?) |
| Z-index | Yes / No | (n) | (Managed scale or ad-hoc values?) |

### Token health indicators
- **Hardcoded values in product:** (High / Medium / Low -- are products using tokens or bypassing them?)
- **Naming convention:** (Consistent / Inconsistent / None)
- **Theming support:** (Yes / Partial / No -- can tokens swap for dark mode, white-label, etc.?)
- **Machine-readable export:** (Yes / Partial / No -- are tokens emitted as structured JSON or Style Dictionary output an AI agent can consume, or do they live only in Figma styles?)

Step 3: Component completeness audit

Compare what the design system offers against what the product actually needs:

## Component Completeness

### Inventory vs. usage

| Component | In system? | In product? | Specced? | Status |
|-----------|-----------|-------------|----------|--------|
| Button | Yes / No | Yes / No | Full / Partial / None | (Complete / Gap / Orphaned / Drift) |
| Input | Yes / No | Yes / No | Full / Partial / None | ... |
| (repeat for all components) | | | | |

### Status definitions
- **Complete** -- in system, used in product, fully specced
- **Gap** -- used in product but missing from system (or in system but not specced)
- **Orphaned** -- in system but not used in any product
- **Drift** -- exists in both but product version has diverged from system version

### Coverage summary
- Components in system: (n)
- Components used in product: (n)
- Gaps (in product, not in system): (n)
- Orphaned (in system, not in product): (n)
- Drift (diverged): (n)
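
The status definitions above can be applied mechanically once each inventory row is recorded. This is a minimal sketch under stated assumptions: the `Component` fields and the `diverged` flag are hypothetical names, divergence is an auditor's judgment rather than something the code detects, and letting Drift take precedence over Gap is a choice made here, not a rule from the definitions.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    in_system: bool
    in_product: bool
    specced: str            # "Full", "Partial", or "None"
    diverged: bool = False  # auditor's call: product version differs from system


def classify(c: Component) -> str:
    """Map one inventory row to a status from the definitions above."""
    if c.in_system and not c.in_product:
        return "Orphaned"
    if c.in_product and not c.in_system:
        return "Gap"
    if c.in_system and c.in_product:
        if c.diverged:
            return "Drift"  # assumption: Drift outranks an incomplete spec
        return "Complete" if c.specced == "Full" else "Gap"
    return "Unknown"  # listed in neither place; probably a data-entry error


def coverage_summary(components: list[Component]) -> Counter:
    """Count components per status for the coverage summary."""
    return Counter(classify(c) for c in components)
```

Running `coverage_summary` over the full inventory yields the counts for the coverage summary without hand-tallying the table.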

Step 4: Consistency check

Evaluate whether the system is internally consistent:

## Consistency Assessment

| Dimension | Rating | Findings |
|-----------|--------|----------|
| Naming conventions | Consistent / Mixed / No pattern | (e.g., "Button" vs "Btn" vs "ActionButton") |
| Variant patterns | Consistent / Mixed | (Do all components use the same variant naming? size=sm/md/lg?) |
| State coverage | Complete / Gaps | (Do all interactive components define hover, focus, active, disabled?) |
| Prop/API patterns | Consistent / Mixed | (Similar components use similar APIs? onChange vs onUpdate?) |
| Spacing application | Consistent / Ad-hoc | (Components use token spacing or hardcode values?) |
| Responsive behavior | Defined / Undefined | (Components have responsive rules or left to consumers?) |
| Error patterns | Consistent / Varies | (Error states follow a single pattern or each component differs?) |
| Empty states | Defined / Missing | (Components that display data have empty state designs?) |
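
The naming-conventions row can be given a first-pass rating by checking which casing style each name uses. The sketch below is an assumption-laden heuristic: the four style patterns and the `naming_report` helper are illustrative, and single lowercase words fall into an "other" bucket because their style is ambiguous. It detects mixed casing only. It cannot tell that two differently worded names refer to the same component.

```python
import re
from collections import Counter

# Illustrative casing patterns; extend or adjust for the system under audit.
STYLES = {
    "PascalCase": re.compile(r"^(?:[A-Z][a-z0-9]+)+$"),
    "camelCase": re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]+)+$"),
    "kebab-case": re.compile(r"^[a-z0-9]+(?:-[a-z0-9]+)+$"),
    "snake_case": re.compile(r"^[a-z0-9]+(?:_[a-z0-9]+)+$"),
}


def naming_style(name: str) -> str:
    """Return the first casing style whose pattern matches, else 'other'."""
    for style, pattern in STYLES.items():
        if pattern.match(name):
            return style
    return "other"


def naming_report(names: list[str]) -> tuple[str, dict[str, int]]:
    """Rate a set of names as 'Consistent' or 'Mixed' and count each style."""
    if not names:
        raise ValueError("no names to evaluate")
    styles = Counter(naming_style(n) for n in names)
    rating = "Consistent" if len(styles) == 1 else "Mixed"
    return rating, dict(styles)
```

Names should be compared within one layer (for example, all React exports) before comparing across layers, since Figma, CSS, and code often follow different casing rules by design.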

Step 5: Accessibility compliance

Evaluate the system's accessibility posture at the system level -- not individual page compliance, but whether the system makes it easy or hard to build accessible products:

## Accessibility at the System Level

| Criterion | Status | Finding |
|-----------|--------|---------|
| Color contrast tokens meet 4.5:1 (text) and 3:1 (UI) | Pass / Fail | (details) |
| Focus styles defined in the system (not left to consumers) | Yes / No | (details) |
| Keyboard interaction patterns documented per component | Yes / No / Partial | (details) |
| ARIA patterns included in component specs | Yes / No / Partial | (details) |
| Touch target sizes meet 24px minimum (44px recommended) | Yes / No | (details) |
| Motion respects prefers-reduced-motion | Yes / No / Not applicable | (details) |
| Screen reader announcements documented for dynamic components | Yes / No | (details) |
| High-contrast/forced-colors compatibility | Tested / Untested | (details) |
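
The contrast criterion in the first row can be checked directly against token pairs. The sketch below follows the WCAG 2.x relative-luminance and contrast-ratio definitions for 6-digit hex colors; the function names and the `kind` parameter are illustrative, and the 4.5:1 and 3:1 thresholds are the ones named in the table above.

```python
def _linear(channel_8bit: int) -> float:
    """Convert one 8-bit sRGB channel to its linear-light value."""
    c = channel_8bit / 255
    return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    """Relative luminance of a 6-digit hex color such as '#767676'."""
    h = hex_color.lstrip("#")
    r, g, b = (int(h[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linear(r) + 0.7152 * _linear(g) + 0.0722 * _linear(b)


def contrast_ratio(foreground: str, background: str) -> float:
    """Contrast ratio between two colors, from 1.0 up to 21.0."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)


def meets_threshold(foreground: str, background: str, kind: str = "text") -> bool:
    """Check a pair against 4.5:1 for text or 3:1 for UI components."""
    required = 4.5 if kind == "text" else 3.0
    return contrast_ratio(foreground, background) >= required
```

Run this over every text-on-surface token pair the system permits, not just the default pairing, since failures tend to hide in secondary text and dark-mode surfaces.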

Step 6: Documentation quality

Evaluate whether the system is actually usable by the people who need to build with it:

## Documentation Assessment

| Dimension | Status | Notes |
|-----------|--------|-------|
| Component specs exist for all components | All / Most / Few / None | (percentage) |
| Usage guidelines (when to use, when not to) | Yes / Partial / No | (details) |
| Code examples (copy-paste ready) | Yes / Partial / No | (details) |
| Visual examples (rendered states, variants) | Yes / Partial / No | (details) |
| Do/Don't guidance | Yes / Partial / No | (details) |
| Figma-to-code parity documented | Yes / No | (details) |
| Contribution guidelines (how to add/modify) | Yes / No | (details) |
| Migration guides (for breaking changes) | Yes / No / Not applicable | (details) |
| Changelog maintained | Yes / No | (details) |
| Search/discovery (can people find what they need?) | Good / Poor | (details) |

Step 7: AI codegen readiness

By 2026, the design system is the connective tissue that makes AI-generated code trustworthy. Tools like Figma Make, v0, and Lovable turn a prompt or a frame into working Tailwind and Radix components. An MCP server (Figma's Dev Mode MCP, or a codebase scanner) can feed an agent a structured rules file covering token definitions, component libraries, style hierarchies, and naming conventions, so the agent generates brand- and accessibility-aligned code without re-prompting. Only about a third of designers reportedly trust AI-generated code today, and the usual cause is missing context. A system that exposes its rules cleanly closes that gap. Audit how ready this system is to drive that loop.

## AI Codegen Readiness

| Criterion | Status | Finding |
|-----------|--------|---------|
| Tokens exported in a machine-readable format (JSON, Style Dictionary, W3C tokens) | Yes / Partial / No | (Can an agent read tokens, or are they trapped in Figma styles only?) |
| MCP server or equivalent rules file exposes the system to agents | Yes / Planned / No | (Figma Dev Mode MCP, codebase scanner, or hand-maintained rules file?) |
| Component-to-code mapping exists (Code Connect or similar) | Yes / Partial / No | (Does the agent know which design component maps to which coded component?) |
| Naming conventions are consistent enough for an agent to apply them | Yes / Mixed / No | (Inconsistent naming forces re-prompting and produces drift) |
| Usage rules are explicit, not tribal (when to use, when not to, in writing) | Yes / Partial / No | (Agents cannot read intent that lives only in a designer's head) |
| Accessibility rules encoded as constraints the agent can honor | Yes / Partial / No | (Contrast pairs, focus styles, target sizes available as rules, not just docs) |

### Human-in-the-loop gate (mandatory)
AI-generated output is a starting artifact, not a finished one. Every generated component or screen still needs a human pass before it ships:
- **Edge and empty states** -- generators default to the happy path and skip loading, error, and zero-data states.
- **Accessibility validation** -- run the WCAG checks from Step 5 against the generated output. Nielsen heuristics and WCAG remain the standard; codegen does not replace them.
- **Brand fidelity** -- confirm the agent used real tokens, not approximations it invented to fill a gap.
- **Token drift** -- watch for hardcoded values the agent introduced when a token was missing. Those are new gaps, not solutions.

### Readiness assessment
- **Codegen-ready:** (Yes / Partial / No -- could an agent reliably build with this system today?)
- **Highest-leverage gap:** (The one thing that, if fixed, most improves AI output quality. Usually machine-readable tokens or consistent naming.)
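
To make "machine-readable tokens" concrete, the sketch below flattens a nested token tree into dotted names and emits CSS custom properties from the same source, so code and agents read one definition. It assumes a plain nested dictionary with string leaves. That shape is a simplification for illustration, not the Style Dictionary or W3C tokens schema, and the sample values in the usage note are hypothetical.

```python
def flatten_tokens(tree: dict, prefix: tuple[str, ...] = ()) -> dict[str, str]:
    """Flatten a nested token tree into {'color.brand.teal.500': value}."""
    flat: dict[str, str] = {}
    for key, value in tree.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            flat.update(flatten_tokens(value, path))  # recurse into groups
        else:
            flat[".".join(path)] = str(value)
    return flat


def to_css_variables(flat: dict[str, str]) -> str:
    """Render flattened tokens as a :root block of CSS custom properties."""
    lines = [
        f"  --{name.replace('.', '-')}: {value};"
        for name, value in sorted(flat.items())
    ]
    return ":root {\n" + "\n".join(lines) + "\n}"
```

For example, `flatten_tokens({"spacing": {"md": "16px"}})` yields `{"spacing.md": "16px"}`, which `to_css_variables` renders as `--spacing-md: 16px;`. Deriving every format from one tree is what keeps names from diverging between Figma, CSS, and JSON.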

Step 8: Health scorecard and recommendations

## Design System Health Scorecard

| Dimension | Score (1-5) | Summary |
|-----------|-------------|---------|
| Token coverage | (n) | (one-line summary) |
| Component completeness | (n) | (one-line summary) |
| Internal consistency | (n) | (one-line summary) |
| Accessibility posture | (n) | (one-line summary) |
| Documentation quality | (n) | (one-line summary) |
| AI codegen readiness | (n) | (one-line summary) |
| **Overall health** | **(avg)** | **(one-line overall)** |

### Score definitions
- **5** -- Mature. Comprehensive coverage, consistent patterns, well-documented.
- **4** -- Strong. Minor gaps, mostly consistent, documentation covers the important parts.
- **3** -- Developing. Notable gaps, some inconsistencies, documentation is spotty.
- **2** -- Fragile. Significant gaps, inconsistent patterns, documentation is sparse or outdated.
- **1** -- Nascent. Major foundational work needed.

### Priority recommendations

| Priority | Recommendation | Dimension | Effort | Impact |
|----------|---------------|-----------|--------|--------|
| P0 | (Fix immediately -- blocking teams or causing defects) | (which) | S/M/L | High |
| P1 | (Fix this cycle -- significant friction or drift) | (which) | S/M/L | High/Medium |
| P2 | (Plan for next cycle -- improvement, not urgent) | (which) | S/M/L | Medium |

### Quick wins
- (Improvement that's small effort, visible impact)

### Systemic issues
- (Patterns that need structural changes, not patches)

### What's working well
- (Strengths to protect and build on)
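
The overall health row in the scorecard is an average of the six dimension scores. A minimal sketch of that arithmetic follows, assuming equal weighting and rounding to one decimal place; both are choices made here, and the dictionary keys simply mirror the scorecard rows.

```python
from statistics import mean

DIMENSIONS = [
    "Token coverage",
    "Component completeness",
    "Internal consistency",
    "Accessibility posture",
    "Documentation quality",
    "AI codegen readiness",
]


def overall_health(scores: dict[str, int]) -> float:
    """Average the six dimension scores (each 1-5), rounded to one decimal."""
    missing = [d for d in DIMENSIONS if d not in scores]
    if missing:
        raise ValueError(f"missing scores for: {', '.join(missing)}")
    for dimension in DIMENSIONS:
        if not 1 <= scores[dimension] <= 5:
            raise ValueError(f"{dimension} score must be between 1 and 5")
    return round(mean(scores[d] for d in DIMENSIONS), 1)
```

An unweighted average can hide a single failing dimension, so report the lowest score alongside the overall figure when the spread is wide.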

Step 9: Review

Ask the user:

- Do the scores match your gut sense of system health?
- Are any priority recommendations wrong (too high, too low, missing)?
- What's the team's capacity for design system investment right now?
- Is there a specific dimension you want to dig deeper on?

Adjust scores and recommendations based on their input.

Output location

Present the audit as formatted text in the conversation. The health scorecard and recommendations can be exported for team review or backlog planning.

Example Output

Input

- **Design system location:** Figma component library ("Meridian DS v2") + internal documentation site at design.harborcredit.io; no Storybook, React component package published on internal npm as @harbor/meridian
- **Component inventory:** 34 components listed in Figma, documentation site covers ~20 of them; team-provided spreadsheet attached
- **Token system:** Defined in Figma Styles and partially exported to CSS custom properties via a manual Style Dictionary config; JSON source lives in a GitHub repo last updated 7 months ago
- **Product screens:** 6 flows provided: loan application, account dashboard, payment history, settings, onboarding, and mobile deposit
- **Known pain points:** Product teams are hardcoding colors for error states because "the red token looks wrong on dark backgrounds"; data table component is in Figma but no one knows how to implement it; three different button styles exist in production
- **Team context:** 2 designers + 1 front-end engineer maintain the system part-time; system feeds 4 internal web products and 1 React Native mobile app

Output (abbreviated)

Design System Audit -- Harbor Credit / Meridian DS v2

Audit date: 2025-07-11
System: Meridian DS v2 · design.harborcredit.io · @harbor/meridian
Auditor note: No Storybook present. Documentation site covers ~59% of claimed components. These gaps are themselves findings.

## Token Coverage

| Token category | Defined? | Count | Coverage assessment |
|---------------|----------|-------|---------------------|
| Color -- brand | Yes | 12 | Core palette tokenized. `harbor-teal-*` scale well-defined. |
| Color -- semantic | Partial | 4 | Success and error tokens exist but no dark-mode variants; teams hardcoding error reds (confirmed pain point) |
| Color -- surface/background | Partial | 3 | Only `surface-primary` and `surface-secondary` defined; elevation/overlay tokens absent |
| Typography -- scale | Yes | 8 | h1–h4, body-lg, body-sm, caption, overline all present |
| Typography -- weight | Partial | 2 | Only regular and semibold; medium weight used in product but not tokenized |
| Spacing | Yes | 10 | 4px base, clean 4–64px scale; consistent with product inspection |
| Border radius | Partial | 3 | sm/md/lg defined but component overrides hardcoded in CSS (buttons use `6px` not `radius-sm`) |
| Elevation/shadow | No | 0 | Not tokenized; 4 distinct `box-shadow` values found hardcoded across product |
| Motion/duration | No | 0 | No duration or easing tokens; `prefers-reduced-motion` not addressed anywhere |
| Breakpoints | No | 0 | Mobile breakpoints referenced in component specs as raw pixel values only |
| Z-index | No | 0 | Ad-hoc; values of 9, 99, 999, and 9999 all found in product CSS |

### Token health indicators

- **Hardcoded values in product:** High -- semantic colors, shadows, radii, and z-index are routinely bypassed
- **Naming convention:** Mixed -- Figma Styles use `Color/Brand/Teal-500`; CSS variables use `--harbor-teal-500`; JSON uses `color.brand.teal.500`; three formats, no single source of truth
- **Theming support:** No -- dark mode is partially shipped in the mobile app using hardcoded overrides, not a token swap

## Component Completeness

| Component | In system? | In product? | Specced? | Status |
|-----------|-----------|-------------|----------|--------|
| Button (primary/secondary/ghost) | Yes | Yes | Partial | Drift -- 3 variants in product vs. 2 in system |
| Input (text) | Yes | Yes | Full | Complete |
| Input (select/dropdown) | Yes | Yes | Partial | Drift -- mobile uses custom-built replacement |
| Form validation / inline error | Yes | Yes | None | Gap -- no spec, no documented pattern |
| Data table | Yes | Yes | None | Gap -- Figma frame exists, zero implementation guidance |
| Modal / dialog | Yes | Yes | Partial | Drift -- footer button order inconsistent with spec |
| Toast / notification | Yes | Yes | Partial | Gap -- only success variant specced; error/warning absent |
| Pagination | Yes | Yes | None | Gap |
| Date picker | No | Yes | None | Gap -- product team built their own, no system alignment |
| Progress indicator (step) | No | Yes | None | Gap -- onboarding flow has a 5-step stepper not in system |
| Skeleton loader | No | Yes | None | Gap |
| Badge / status chip | Yes | No | Partial | Orphaned |
| Stat card | Yes | Yes | Full | Complete |
| Avatar | Yes | No | Full | Orphaned |

### Coverage summary

- Components in system: 34
- Components used in product: 31
- Gaps (in product, not in system or unspecced): 9
- Orphaned (in system, not in product): 5
- Drift (diverged): 6

## Consistency Assessment

| Dimension | Rating | Findings |
|-----------|--------|----------|
| Naming conventions | Mixed | "DataTable" in Figma, `data-table` in CSS, `DataGrid` in React package; three names, one component |
| Variant patterns | Mixed | Buttons use `variant=primary/secondary`; form fields use `type=filled/outline`; no unified convention |
| State coverage | Gaps | Focus and disabled states missing on 8 of 14 interactive components; active state absent on all nav items |
| Prop/API patterns | Mixed | Some components use `onChange`, others `onUpdate`, one uses `handleChange` |
| Spacing application | Ad-hoc | Internal component padding uses token scale in ~60% of cases; rest hardcoded |
| Responsive behavior | Undefined | Only 3 of 34 components have documented responsive rules |
| Error patterns | Varies | 4 different error presentation approaches found across forms, modals, toasts, and inline alerts |
| Empty states | Missing | Data table, payment history, and account dashboard all have empty states in product with no system guidance |

## Accessibility at the System Level

| Criterion | Status | Finding |
|-----------|--------|---------|
| Color contrast tokens meet 4.5:1 (text) | Fail | `text-secondary` on `surface-primary` measures 3.8:1, which fails AA |
| Focus styles defined in system | No | Browser default focus rings used; no system-level focus token or pattern |
| Keyboard interaction documented per component | Partial | Input and modal have notes; all others silent |
| ARIA patterns in specs | No | Absent across all 34 components |
| Touch target sizes meet 44px recommended | No | Icon buttons in mobile deposit flow measure 32×32px |
| Motion respects prefers-reduced-motion | No | No tokens, no documentation, no implementation |
| Screen reader announcements for dynamic components | No | Toast and form validation have no documented live region guidance |
| High-contrast / forced-colors compatibility | Untested | No evidence this has been evaluated |

## Documentation Assessment

| Dimension | Status | Notes |
|-----------|--------|-------|
| Component specs exist for all components | Few | ~20 of 34 have any page; ~8 are genuinely complete |
| Usage guidelines (when to use / when not to) | Partial | Present on 6 components (button, input, modal, stat card, badge, avatar) |
| Code examples | Partial | npm snippets on 9 components; none are copy-paste ready for mobile |
| Visual examples (rendered states, variants) | Partial | Figma embeds present but often out of sync with shipped code |
| Do/Don't guidance | No | Not present on any component page |
| Figma-to-code parity documented | No | No explicit mapping; engineers report guessing prop names |
| Contribution guidelines | No | No documented process; system team reports "ad-hoc Slack conversations" |
| Migration guides | No | v1→v2 migration was undocumented; teams still on v1 in 2 products |
| Changelog | No | No changelog; GitHub commit history is the only record |
| Search / discovery | Poor | Documentation site has no search; navigation is a flat alphabetical list |
