A PM I was coaching came to me with a product idea. The pitch was good: use an LLM to classify customer support tickets, route them to the right team, and suggest a first response. The demo was impressive. The model was fast. The accuracy was high.
I asked one question: "What does the support agent do after they see the suggestion?"
The PM paused. They hadn't mapped the workflow past the AI output. The model classified tickets beautifully. No one had designed what happened next: how agents edited the suggestion, how confidence thresholds affected routing, what happened when the model was wrong, how the team would know if agents trusted it or ignored it.
The model worked. The product didn't exist yet.
Capability vs. job
I've written before about the Fruit Tool problem - the pattern where teams build one clever AI feature, wrap it in a UI, and call it a product. The diagnostic I use is simple: if you remove the AI, is there still a product? If not, you have a demo.
That diagnostic is the starting point for how I think about AI product strategy. The full framework has three layers.
Layer 1: Start with the job, not the model
Every AI product I've built started with the same question: what's the person trying to accomplish?
At FLUXX Health, we built a menopause education platform with AI-driven phase detection. The capability was impressive - quiz responses mapped to hormonal phases using a scoring engine. But the job wasn't "detect my phase." The job was "help me understand what's happening to my body and what to do about it." Phase detection was infrastructure. The product was clarity.
When I built assessment tools for my own practice, the same principle applied. The AI Maturity Assessment doesn't just produce a score. It surfaces the specific dimension where a team is weakest and recommends the first action for that week. The output is a decision, not a number.
If your AI product's primary value proposition is "look what the model can do," you are building a demo. If the value proposition is "here's what you should do next," you are building a product.
Layer 2: Design for the moment the model is wrong
Every model is wrong sometimes. The question isn't whether it will be wrong. The question is what happens when it is.
Most AI product teams spend 90% of their energy on the happy path - the model is right, the user is delighted. The best teams spend equal energy on the failure path. What does the user see when confidence is low? How do they correct the output? Does the correction make the system better?
At FLUXX, we built uncertainty handling into the phase detection logic. When the model couldn't confidently assign a phase, it said so. It didn't guess. That decision was hard - it meant some users got a less satisfying result on their first use. But it also meant users trusted the results they did get. Trust compounded. Engagement stayed high.
The failure path is where AI products earn or lose trust. Design it as carefully as the happy path.
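One way to make "it said so, it didn't guess" concrete is an abstention rule: return a label only when the top score is high enough and clearly ahead of the runner-up. The sketch below uses the support-ticket example from the opening. The names (`Decision`, `decide`) and both thresholds are hypothetical, chosen for illustration; this is not the FLUXX logic, and real cutoffs would have to come from your own calibration data.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoffs for illustration; tune these against real data.
MIN_CONFIDENCE = 0.60  # top score must reach this value
MIN_MARGIN = 0.15      # top score must beat the runner-up by this much


@dataclass
class Decision:
    label: Optional[str]  # None means the system abstained
    confidence: float
    message: str          # what the user actually sees


def decide(scores: dict[str, float]) -> Decision:
    """Return the top label only when it clearly beats the alternatives."""
    if not scores:
        return Decision(None, 0.0, "Not enough information to suggest a team.")

    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    top_label, top_score = ranked[0]
    runner_up = ranked[1][1] if len(ranked) > 1 else 0.0

    # Abstain when the score is low or the top two labels are too close.
    if top_score < MIN_CONFIDENCE or (top_score - runner_up) < MIN_MARGIN:
        return Decision(
            None,
            top_score,
            "No confident suggestion. Please route this ticket manually.",
        )

    return Decision(top_label, top_score, f"Suggested team: {top_label}")


confident = decide({"billing": 0.90, "technical": 0.05, "account": 0.05})
too_close = decide({"billing": 0.65, "technical": 0.60})
```

Here `confident.label` is `"billing"`, while `too_close.label` is `None` because the two top scores sit within the margin. The design choice worth noticing is that abstaining is a first-class result with its own user-facing message, not an error case bolted on later.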
Layer 3: Measure the outcome, not the output
AI products have a unique measurement trap: the model metrics can be excellent while the product metrics are terrible.
A classification model with 95% accuracy sounds great. But if agents override the suggestion 40% of the time, the product isn't working. If the time-to-resolution doesn't decrease, the classification isn't helping. If agents stop looking at the suggestions after the first week, accuracy doesn't matter.
The metrics that matter are always downstream of the model:
- Did the user accomplish their goal faster?
- Did the user trust the result enough to act on it?
- Did the product reduce the work the user had to do?
At FLUXX, we measured quiz completion rates and return visits - not just phase detection accuracy. A model that detected phases correctly but produced a quiz that 60% of people abandoned was a model that wasn't helping anyone. When we optimized the quiz for mobile and reduced abandonment by 38%, that was a product win. The model didn't change. The experience around it did.
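The gap between model metrics and product metrics is easy to see once both are computed from the same event log. The sketch below assumes a hypothetical log of suggestion events for the support-ticket example; the `SuggestionEvent` fields and the `summarize` helper are invented for illustration, and the sample data is synthetic, built only to reproduce the 95% accuracy and 40% override figures used above.

```python
from dataclasses import dataclass


@dataclass
class SuggestionEvent:
    model_correct: bool  # did the predicted label match the true label?
    viewed: bool         # did the agent open the suggestion?
    accepted: bool       # did the agent use it without overriding?


def summarize(events: list[SuggestionEvent]) -> dict[str, float]:
    """Report the model metric next to the downstream product metrics."""
    if not events:
        raise ValueError("need at least one event")

    total = len(events)
    viewed = [e for e in events if e.viewed]

    accuracy = sum(e.model_correct for e in events) / total
    view_rate = len(viewed) / total
    # Override rate only counts suggestions the agent actually looked at.
    override_rate = (
        sum(not e.accepted for e in viewed) / len(viewed) if viewed else 0.0
    )
    return {
        "accuracy": accuracy,
        "view_rate": view_rate,
        "override_rate": override_rate,
    }


# Synthetic log: 19 of 20 predictions correct, but 8 of 20 overridden.
events = (
    [SuggestionEvent(True, True, True)] * 12
    + [SuggestionEvent(True, True, False)] * 7
    + [SuggestionEvent(False, True, False)] * 1
)
report = summarize(events)
```

With this synthetic log the report shows accuracy of 0.95 alongside an override rate of 0.4. Tracking `view_rate` over time would also surface the case where agents stop looking at suggestions after the first week.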
The framework in practice
When I evaluate an AI product opportunity, I run three questions in order:
1. Can I describe the user's job without mentioning AI? If the pitch only works because "AI" is in the sentence, the job isn't real. "Help support agents resolve tickets faster" is a job. "Use AI to classify tickets" is a capability.
2. What happens when the model is wrong? If the answer is "it won't be" or "we'll improve accuracy," the failure path hasn't been designed. Every model will be wrong. The product needs to handle it gracefully.
3. What outcome am I measuring? If the primary metric is model accuracy, the team is optimizing the wrong thing. Find the user outcome that the model enables, and measure that.
The organizational problem
The hardest part of AI product strategy isn't the framework. It's the organizational pressure to ship something that demos well.
Executives who have seen ChatGPT want an AI feature. Boards want an AI story. Sales wants an AI pitch. The pressure to wrap a model in a UI and ship is enormous.
The best AI product leaders I know resist that pressure by reframing the conversation. They don't say "we shouldn't use AI." They say "here's the user problem AI can solve, and here's how we'll know it's working." They make the job the hero of the story, not the model.
That reframe is the entire strategy. Start with the job. Design for failure. Measure outcomes. Everything else is implementation.
I help teams build AI products that solve real problems, not demos that impress in a boardroom. If your team is figuring out where AI fits in your product strategy, let's talk.