AI Maturity

Every organization is "doing AI" right now. Few can tell you how well.

After working with 30+ organizations on AI adoption, I've found the same pattern: pockets of experimentation, no shared vocabulary for maturity, and leadership making investment decisions based on vibes instead of data. This framework gives you a structured way to assess where you actually are and what to work on next.

The maturity model at a glance

AI maturity is rated on a five-level scale across six dimensions. Your overall maturity is determined by your weakest dimension - because a team with excellent prompting skills but no evaluation discipline will still produce unreliable outputs.

Five levels

Level	Name	What it looks like
1	Not Yet Started	No AI tool usage in workflows. Team hasn't engaged beyond curiosity.
2	Growing	Individual experimentation. Inconsistent results. Some people are enthusiastic, others skeptical. No shared practices.
3	Meets Expectations	AI is part of daily workflows with review discipline. Shared team practices documented and repeatable.
4	Exceeds Expectations	AI connected to team systems. Systematic evaluation. People are teaching others and defining patterns.
5	Leading	AI-first processes. Human-agent pairing feels natural. The team is pioneering new approaches and shaping organizational culture.

Most teams I assess land at Level 2 - lots of individual experimentation, no shared practices. The jump from 2 to 3 is the hardest and most valuable.

Six dimensions

Dimension	What it measures
Prompt and interaction quality	How well the team crafts inputs and structures conversations with AI tools
Evaluation discipline	How rigorously outputs are reviewed before becoming team artifacts or product decisions
Workflow integration	How deeply AI is embedded in day-to-day processes
Context and knowledge management	How well the team structures and maintains context for AI tools
Governance and bounded autonomy	How clearly the team draws boundaries for what AI can do without human review
AI foundations	Understanding of core concepts - models, tokens, context windows, RAG, agents

How to run an assessment

Step 1: Rate each dimension

For each of the six dimensions, ask the team to self-assess against the five levels. Do this individually first, then discuss as a group. The gaps between individual ratings are often more revealing than the ratings themselves.
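One way to surface those gaps is to compute, per dimension, the distance between the highest and lowest individual rating. The snippet below is a minimal sketch, not part of the framework itself: the ratings are made-up numbers, and `rating_spread` and `summarize` are hypothetical helper names chosen for illustration.

```python
from statistics import mean

# Hypothetical individual self-ratings (1-5) per dimension.
# The numbers are illustrative only, not data from a real assessment.
ratings = {
    "Prompt and interaction quality": [4, 4, 2, 1],
    "Evaluation discipline": [2, 2, 2, 2],
    "Workflow integration": [3, 2, 3, 2],
}


def rating_spread(scores):
    """Gap between the highest and lowest individual rating."""
    return max(scores) - min(scores)


def summarize(ratings):
    """Return (dimension, mean rating, spread) rows, widest spread first."""
    rows = [(dim, mean(s), rating_spread(s)) for dim, s in ratings.items()]
    return sorted(rows, key=lambda row: row[2], reverse=True)


for dim, avg, spread in summarize(ratings):
    print(f"{dim}: mean {avg:.1f}, spread {spread}")
```

A wide spread (here, prompting) suggests the group discussion should start there, even if the average looks acceptable.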

Step 2: Identify the weakest link

Your effective maturity is your lowest dimension. A team at Level 4 in prompting but Level 1 in evaluation isn't "advanced" - they're producing sophisticated outputs that nobody is checking. That's worse than Level 2 across the board.
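The weakest-link rule is just a minimum over the six dimension ratings. As a rough sketch, assuming the team has already agreed on one rating per dimension (the values below are hypothetical, and `effective_maturity` is an illustrative name, not part of any tool):

```python
# Hypothetical agreed team ratings (1-5), one per dimension.
team_levels = {
    "Prompt and interaction quality": 4,
    "Evaluation discipline": 1,
    "Workflow integration": 3,
    "Context and knowledge management": 2,
    "Governance and bounded autonomy": 2,
    "AI foundations": 3,
}

LEVEL_NAMES = {
    1: "Not Yet Started",
    2: "Growing",
    3: "Meets Expectations",
    4: "Exceeds Expectations",
    5: "Leading",
}


def effective_maturity(levels):
    """Weakest-link rule: return (weakest dimension, its level).

    If several dimensions tie for lowest, the first one listed is returned.
    """
    weakest = min(levels, key=levels.get)
    return weakest, levels[weakest]


dimension, level = effective_maturity(team_levels)
print(f"Effective maturity: Level {level} ({LEVEL_NAMES[level]}), limited by {dimension}")
```

Using the minimum rather than the average is deliberate: an average would report this team at 2.5 and hide the unchecked outputs entirely.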

Step 3: Pick one dimension to improve

Don't try to raise all six at once. Pick the weakest link and design a focused improvement plan. Moving one dimension from Level 2 to Level 3 typically takes 4-8 weeks of deliberate practice.

The adoption curve

Organizations move through four stages, which map to maturity Levels 2 through 5:

Exploration (Level 2): People are trying AI tools individually. There's excitement but no structure. The risk is that early frustrations kill momentum before the team finds real value.

Operationalization (Level 3): The team has shared workflows and review practices. AI is part of how work gets done, not a side experiment. This is where ROI starts becoming measurable.

Integration (Level 4): AI is connected to team systems - not just used in standalone tools. Context flows between human work and AI tools. The team has evaluation frameworks and uses them consistently.

Transformation (Level 5): AI-first processes. The team designs workflows around human-agent collaboration rather than retrofitting AI into human workflows. This stage is rare - most organizations aspire to Level 4.
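The stage-to-level mapping can be written down as a small lookup. This is a sketch under one assumption worth stating: the stages above start at Level 2, so Level 1 is treated here as having no named stage. `adoption_stage` is a hypothetical helper name.

```python
# Stage names as listed above; Level 1 has no named stage in this mapping.
STAGES = {
    2: "Exploration",
    3: "Operationalization",
    4: "Integration",
    5: "Transformation",
}


def adoption_stage(effective_level):
    """Map an effective maturity level (1-5) to its adoption stage, or None for Level 1."""
    if effective_level not in range(1, 6):
        raise ValueError("effective_level must be an integer from 1 to 5")
    return STAGES.get(effective_level)


print(adoption_stage(3))
```

Feed it the effective (lowest-dimension) level rather than the level a team feels it is at.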

Common patterns I see

The enthusiast gap. One or two people on the team are way ahead; everyone else hasn't started. The fix isn't mandating adoption - it's pairing the enthusiasts with skeptics on real work and letting results speak.

Tool-first thinking. "We bought Copilot" is not an AI strategy. The question isn't which tool - it's which workflow, measured by which outcomes, and with what review process.

Skipping evaluation. Teams that jump to Level 4 prompting without Level 3 evaluation are producing confident-sounding outputs that nobody checks. This creates a false sense of progress and, eventually, expensive mistakes.

Governance as blocker. Some organizations respond to AI anxiety by creating review committees that slow adoption to a crawl. Governance should enable bounded autonomy, not prevent experimentation.

A real example: the team that was Level 4 and Level 1 at once

A product team I assessed was certain it was advanced. Two engineers were doing genuinely sophisticated things with AI: custom prompts, multi-step workflows, real fluency. On the prompt-and-interaction dimension they were a Level 4. But when I asked how they checked AI outputs before those outputs became product decisions, the room went quiet. Evaluation discipline was Level 1. Nobody was reviewing anything; the team was trusting confident-sounding output because it sounded confident.

That is the weakest-link rule in the wild. Their effective maturity was not the Level 4 they felt; it was the Level 1 nobody was minding, which is more dangerous than being evenly mediocre. Sophisticated unchecked output gets trust it has not earned, so the mistakes it eventually produces are the expensive kind. We left the impressive prompting alone and spent the next six weeks on the boring dimension: a lightweight review step, a handful of golden examples, and an explicit bar for what "good enough to ship" meant. Maturity went up not by getting fancier, but by closing the gap the fanciness was hiding.

If your team feels advanced, distrust the feeling and check your lowest dimension first.

Try this today

Pick one team. Ask each person to rate themselves 1-5 on the six dimensions - or use the AI Maturity Assessment tool to structure the exercise. Collect the results. Look for two things: (1) what's the lowest dimension across the team, and (2) where do individual ratings diverge the most. Those two data points tell you where to focus and what conversations need to happen.
