AI Maturity Assess

Run an individual AI maturity assessment.

Use this when someone wants to understand their personal AI maturity level across 6 core dimensions (plus Agent Operations when their team runs agents) and get specific growth recommendations. Works for PMs, Designers, and Engineers.

Process

Step 1: Identify the role

Ask: What is your role? (Product Manager, Designer, or Engineer)

This determines which behavioral indicators to use for each dimension.

Step 2: Assess each dimension

For each of the dimensions below, ask 2-3 behavioral questions tailored to their role. Ask about what they actually do, not what they know in theory.

The dimensions (six core, plus Agent Operations where agents are deployed):

  1. Prompt & Interaction Quality — How well they craft inputs and structure human-AI interaction
  2. Evaluation Discipline — How rigorously they review and validate AI outputs
  3. Workflow Integration — How deeply AI is embedded in their daily work
  4. Context & Knowledge Management — How well they structure context for AI tools
  5. Governance & Bounded Autonomy — How clearly they draw boundaries for AI autonomy vs. human review
  6. AI Foundations — Their understanding of core AI/ML concepts
  7. Agent Operations — How well they manage AI agents and autonomous workflows in production (monitoring, cost control, error recovery, deployment)

Scoring levels:

  • 1 = Not Yet Started (no engagement with AI tools)
  • 2 = Growing (experimenting, inconsistent results)
  • 3 = Meets Expectations (effective daily use with review discipline)
  • 4 = Exceeds Expectations (team multiplier, defines patterns for others)
  • 5 = Leading (shapes organizational culture, drives cross-team standards)

Key maturity signal -- the "experiment to infrastructure" transition: The clearest indicator of moving from level 2 to level 3 is when AI tools stop being experiments and start being treated as infrastructure. At level 2, people say "I tried using AI for..." At level 3, AI is assumed -- it's wired into workflows before anyone consciously chooses it. At level 4-5, teams redesign workflows around AI capabilities rather than bolting AI onto existing processes. Probe for this transition explicitly: "Do you experiment with AI, or is it already part of how work gets done?"

Tool-tier awareness signal: At level 2, people use whatever AI coding tool someone recommended. At level 3, they consciously choose between tool tiers -- using Cursor for codebase work but Lovable for quick internal-tool prototypes. At level 4-5, they match tool tiers to the job systematically: engineering amplifiers for production code, prompt-to-app builders for prototypes and internal tools, agent orchestration for complex multi-step workflows. Probe: "Do you use different AI tools for different types of work, or the same tool for everything?"

Platform selection maturity signal: Beyond AI coding tools, probe whether the person evaluates build-vs-buy-vs-no-code decisions deliberately:

  • Level 2: Uses whatever tool was recommended or is trending. No awareness of lock-in or data ownership trade-offs.
  • Level 3: Consciously chooses between code, no-code platforms (Bubble, Retool, FlutterFlow), and AI-led builders (Rork, Repaint) based on the job.
  • Level 4-5: Evaluates graduation paths proactively -- knows when a prototype should move from a no-code platform to owned infrastructure. Considers data ownership, vendor lock-in, and cost scaling as first-order selection criteria.

Probe: "When you need to build an internal tool or prototype, how do you decide whether to code it, use a no-code platform, or use an AI builder?"

Example questions by role:

PM:

  • "When your team reaches for a no-code tool, what's your process for evaluating whether it's the right choice vs. building with code?"
  • "When you use AI to draft a user story, what does your review process look like?"
  • "How do you provide context to AI tools about your current project?"
  • "What rules does your team have about when AI output needs human review?"

Designer:

  • "How do you use AI tools in your design workflow today?"
  • "When AI generates design suggestions, how do you evaluate them against brand and accessibility standards?"
  • "How do you structure context (personas, brand guidelines) for AI tools?"

Engineer:

  • "How do you use AI for writing tests or implementation code?"
  • "What's your review process for AI-generated code before it ships?"
  • "How do you set up context (codebase, conventions) for AI coding tools?"

Agent Operations (all roles -- skip if the team hasn't deployed agents yet):

  • "If an AI agent fails mid-workflow, what happens? Retry, fallback, human escalation?"
  • "How do you track agent costs separately from other AI usage?"
  • "What's your process for updating prompts or tools in a deployed agent?"
  • "How do you monitor whether agents are actually producing good results over time?"

Ask one dimension at a time. Listen to the answer before moving on.
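
The role-based selection above, including the rule to skip Agent Operations when no agents are deployed, could be sketched roughly as follows. This is only an illustration: the names `ROLE_QUESTIONS`, `AGENT_OPS_QUESTIONS`, and `build_question_plan` are hypothetical, and the question text is copied from the examples above rather than being a complete bank.

```python
# Hypothetical question bank; text copied from the example questions above.
ROLE_QUESTIONS = {
    "PM": [
        "When your team reaches for a no-code tool, what's your process for "
        "evaluating whether it's the right choice vs. building with code?",
        "When you use AI to draft a user story, what does your review process look like?",
        "How do you provide context to AI tools about your current project?",
        "What rules does your team have about when AI output needs human review?",
    ],
    "Designer": [
        "How do you use AI tools in your design workflow today?",
        "When AI generates design suggestions, how do you evaluate them "
        "against brand and accessibility standards?",
        "How do you structure context (personas, brand guidelines) for AI tools?",
    ],
    "Engineer": [
        "How do you use AI for writing tests or implementation code?",
        "What's your review process for AI-generated code before it ships?",
        "How do you set up context (codebase, conventions) for AI coding tools?",
    ],
}

# Asked for all roles, but only when the team has deployed agents.
AGENT_OPS_QUESTIONS = [
    "If an AI agent fails mid-workflow, what happens? Retry, fallback, human escalation?",
    "How do you track agent costs separately from other AI usage?",
    "What's your process for updating prompts or tools in a deployed agent?",
    "How do you monitor whether agents are actually producing good results over time?",
]


def build_question_plan(role: str, has_deployed_agents: bool) -> list[str]:
    """Return the ordered questions for one person, based on role and agent usage."""
    if role not in ROLE_QUESTIONS:
        raise ValueError(f"Unknown role: {role!r}. Expected one of {sorted(ROLE_QUESTIONS)}")
    questions = list(ROLE_QUESTIONS[role])  # copy so the bank is never mutated
    if has_deployed_agents:
        questions.extend(AGENT_OPS_QUESTIONS)
    return questions


if __name__ == "__main__":
    for question in build_question_plan("Engineer", has_deployed_agents=False):
        print(question)
```

Keeping the agent questions in a separate list mirrors the "skip if the team hasn't deployed agents yet" rule, so the skip is a single flag rather than a per-role special case.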

Step 3: Score and explain

After gathering answers for every assessed dimension, score each 1-5 based on their consistent behavior (not their best day).

Briefly explain the reasoning for each score.

Step 4: Generate the assessment

Output in this format:


AI Maturity Assessment: (name or role)

Date: (today's date)
Role: (PM / Designer / Engineer)

Dimension scores

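| Dimension | Score | Key observation |
| --- | --- | --- |
| Prompt & Interaction Quality | (1-5) | (one sentence) |
| Evaluation Discipline | (1-5) | (one sentence) |
| Workflow Integration | (1-5) | (one sentence) |
| Context & Knowledge Management | (1-5) | (one sentence) |
| Governance & Bounded Autonomy | (1-5) | (one sentence) |
| AI Foundations | (1-5) | (one sentence) |
| Agent Operations (omit if no agents are deployed) | (1-5) | (one sentence) |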

Overall maturity level: (lowest score)

The overall maturity level uses the weakest-link model — your overall level equals your lowest dimension score. This reflects that a gap in any dimension limits the effectiveness of all others.

Bottleneck dimension: (dimension name)

(Why this dimension matters and how it limits the others)

Growth recommendations

Priority 1: (specific, actionable recommendation for the weakest dimension)

Priority 2: (recommendation for the second weakest)

One practice to adopt this week

(A single, concrete weekly habit that addresses the bottleneck — something they can start doing Monday)
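
The weakest-link rule and the two growth priorities in the format above amount to a small calculation, sketched here in Python. The function name `summarize_assessment` and the returned dictionary keys are hypothetical, and the tie handling (report every dimension tied for the lowest score, order ties by input order) is an assumption rather than a stated rule. The sample scores are taken from the Example Output section.

```python
def summarize_assessment(scores: dict[str, int]) -> dict:
    """Apply the weakest-link model to a mapping of dimension name -> score (1-5)."""
    if not scores:
        raise ValueError("At least one dimension score is required")
    for name, score in scores.items():
        if not 1 <= score <= 5:
            raise ValueError(f"{name}: score {score} is outside the 1-5 scale")

    # Overall level equals the lowest dimension score.
    overall = min(scores.values())

    # Assumption: every dimension tied for the lowest score counts as a bottleneck.
    bottlenecks = [name for name, score in scores.items() if score == overall]

    # Priority 1 = weakest dimension, Priority 2 = second weakest.
    # sorted() is stable, so ties keep the order in which they were given.
    ranked = sorted(scores, key=scores.get)

    return {
        "overall": overall,
        "bottlenecks": bottlenecks,
        "priorities": ranked[:2],
    }


# Scores from the Example Output section (Priya Nair).
example_scores = {
    "Prompt & Interaction Quality": 2,
    "Evaluation Discipline": 2,
    "Workflow Integration": 3,
    "Context & Knowledge Management": 2,
    "Governance & Bounded Autonomy": 1,
    "AI Foundations": 3,
    "Agent Operations": 1,
}

if __name__ == "__main__":
    summary = summarize_assessment(example_scores)
    print("Overall maturity level:", summary["overall"])
    print("Bottleneck dimension(s):", ", ".join(summary["bottlenecks"]))
    print("Growth priorities:", ", ".join(summary["priorities"]))
```

Skipped dimensions (for example, Agent Operations when no agents are deployed) are simply left out of the mapping, so they cannot drag the overall level down.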


Step 5: Discuss

Ask:

  • Does this assessment feel accurate?
  • Any scores you'd adjust based on things I didn't ask about?
  • Want to explore specific practices to grow in your bottleneck dimension?

Output location

Present the assessment as formatted text in the conversation.

Example Output

Input

  • Role: Product Manager
  • Name/Context: Priya Nair, Senior PM at Meridian Health (mid-size digital health company, ~800 employees)
  • Self-described AI usage: "I use ChatGPT pretty regularly for drafting PRDs and user stories, and I've started experimenting with Notion AI. My team recently deployed a patient intake automation agent but I'm not super involved in monitoring it."
  • Sample answers to assessment questions:
    • Prompt quality: "I usually just describe what I need and paste in some context. I don't have a consistent template."
    • Evaluation: "I read the output and edit it — I'd say I catch obvious mistakes but I don't have a formal checklist."
    • Workflow integration: "AI is in my weekly workflow for docs, but I don't use it for roadmap prioritization or stakeholder comms yet."
    • Context management: "I paste in our product brief when I remember. No system for it."
    • Governance: "We don't really have team rules about when to review AI output vs. ship it directly. It's individual judgment."
    • AI foundations: "I know what LLMs are and understand hallucination risk, but I couldn't explain embeddings or fine-tuning."
    • Agent ops: "The intake agent is live but our eng team monitors it. I don't know what the failure mode is or what it costs per run."

Output (abbreviated)

AI Maturity Assessment: Priya Nair

Date: June 14, 2025
Role: Product Manager, Meridian Health


Dimension Scores

| Dimension | Score | Key Observation |
| --- | --- | --- |
| Prompt & Interaction Quality | 2 | Uses AI reactively with ad-hoc inputs; no prompt templates or structured interaction patterns. |
| Evaluation Discipline | 2 | Applies judgment informally but has no review checklist or consistent validation standard. |
| Workflow Integration | 3 | AI is reliably embedded in documentation work, but absent from higher-leverage PM tasks like prioritization. |
| Context & Knowledge Management | 2 | No persistent context system; project knowledge shared inconsistently depending on what's remembered. |
| Governance & Bounded Autonomy | 1 | No team-level rules about AI review thresholds; each person decides individually, including in a production agent context. |
| AI Foundations | 3 | Solid working knowledge of LLM behavior and risk; gap in technical depth limits evaluation of architectural trade-offs. |
| Agent Operations | 1 | Live agent in production with no PM visibility into cost, failure modes, or output quality monitoring. |

Overall Maturity Level: 1

The weakest-link model applies — Priya's genuine strengths in workflow integration and AI foundations are capped by critical gaps in governance and agent operations. A production patient intake agent running without PM-level oversight represents meaningful compliance and quality risk in a regulated industry.


Bottleneck Dimension: Governance & Bounded Autonomy (tied with Agent Operations)

Meridian Health operates under HIPAA constraints. Without explicit team-level rules defining when AI output requires human review before patient-facing use, every individual on the team is making implicit risk decisions. The live intake agent compounds this — it's infrastructure, not an experiment, but it's being treated like an experiment no one owns.


Growth Recommendations

Priority 1: Define a two-tier review policy with your team this sprint. Tier 1 = internal docs (user stories, PRDs) — spot-check only. Tier 2 = anything patient-facing or feeding a downstream system — mandatory human review before use. Write it as a one-pager and get team sign-off.

Priority 2: Schedule a 30-minute sync with the engineer running the intake agent. Leave with three answers: what triggers a failure alert, what it costs per 100 runs, and what the fallback is if it errors mid-intake. Document these as the agent's operational spec and own it as PM.


One Practice to Adopt This Week

Monday: Create a shared Notion doc titled "AI Context Pack — [Current Quarter]" and paste in your product vision, top 3 OKRs, and one active user persona. Link it in your team Slack channel. Use it as your default context block every time you open a new AI conversation this week. By Friday you'll have a reusable context system instead of starting from scratch each session.


Does this feel accurate? Any dimensions you'd score differently based on context I didn't capture? Happy to go deeper on governance practices for regulated environments specifically.