Use this when a client needs to understand retention patterns, compare user groups over time, diagnose churn, or forecast future behavior based on historical cohort data. Works for subscription, usage-based, e-commerce, and marketplace businesses.
How it works
- You provide the product/service, available data, retention definition, time period, and key events or milestones
- The skill designs retention curves, defines behavioral segments for cohort comparison, diagnoses churn patterns with root cause hypotheses, and recommends intervention timing
- It returns a cohort analysis framework with interpretation guide, churn diagnosis, and cohort-based forecast Kate can use for retention strategy conversations
Prompt
You are building a cohort analysis framework for a Kate Makrigiannis consulting engagement. Kate uses this to help clients see past vanity metrics and understand how real groups of users behave over time. Cohort analysis reveals whether the product is actually getting better at retaining people, or just growing fast enough to hide churn. Before writing, read knowledge/voice-tone-guide.md -- use the client-facing voice.
Inputs I will provide:
- Product/Service: {{PRODUCT}} (what the product is, business model, stage)
- Available data: {{DATA}} (what data exists -- e.g., "signup dates, login activity, subscription status," "purchase history by customer," "event-level analytics in Mixpanel," or "we have limited data, mostly spreadsheets")
- Retention definition: {{RETENTION}} (what "retained" means for this business -- e.g., "logged in at least once in a 7-day window," "made a purchase in the calendar month," "active subscription," or "not sure, help me define it")
- Time period: {{TIME_PERIOD}} (range of data available and analysis window -- e.g., "12 months of data, want to look at monthly cohorts" or "6 weeks of data, daily cohorts")
- Key events/milestones: {{EVENTS}} (product changes, pricing changes, marketing campaigns, seasonal factors that might affect cohort behavior)
- Context (optional): {{CONTEXT}} (specific questions, known retention problems, growth targets, segments of interest)
Step 1: Define the cohort framework
Retention Definition
Before building any analysis, lock down precisely what "retained" means:
| Parameter | Definition | Rationale |
|---|---|---|
| Cohort grouping | [How users are grouped -- e.g., by signup week, first purchase month, activation date] | [Why this grouping makes sense for the business] |
| Retention event | [The specific action that counts as "retained" -- e.g., "completed a session," "made a purchase," "logged in"] | [Why this event is the right signal] |
| Retention window | [Time bucket for measuring -- daily, weekly, monthly] | [Matches natural usage frequency of the product] |
| Measurement method | [Bounded vs. unbounded retention -- "did the event happen in week N" vs. "did the event happen on or after day N"] | [Which is more appropriate for this product] |
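The bounded vs. unbounded distinction in the last row is easiest to see on a toy cohort. The sketch below is illustrative only: the `cohort` data and both function names are hypothetical, and activity is assumed to be already bucketed into period numbers (0 = the cohort's starting period).

```python
from typing import Dict, Set


def bounded_retention(activity: Dict[str, Set[int]], period: int) -> float:
    """Share of the cohort with a retention event in exactly `period`."""
    active = sum(1 for periods in activity.values() if period in periods)
    return active / len(activity)


def unbounded_retention(activity: Dict[str, Set[int]], period: int) -> float:
    """Share of the cohort with a retention event in `period` or any later period."""
    active = sum(
        1 for periods in activity.values() if any(p >= period for p in periods)
    )
    return active / len(activity)


# Hypothetical cohort: user -> set of periods in which the retention event occurred.
cohort = {
    "u1": {0, 1, 2, 3},
    "u2": {0, 3},  # skipped periods 1 and 2, came back in period 3
    "u3": {0, 1},
    "u4": {0},
}

print(bounded_retention(cohort, 2))    # 0.25 (only u1 was active in period 2)
print(unbounded_retention(cohort, 2))  # 0.5  (u1 and u2 were active in period 2 or later)
```

The gap between the two numbers comes entirely from users like `u2` who skip a period and return, which is why the choice should match how the product is naturally used.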
Alternative Retention Definitions to Consider
| Alternative | When to Use | Trade-off |
|---|---|---|
| [e.g., "Any login" vs. "Core action completed"] | [Broad engagement vs. meaningful usage] | [Broad definition flatters retention; narrow definition is more honest] |
| [e.g., "Weekly" vs. "Monthly" windows] | [High-frequency vs. low-frequency products] | [Wider windows smooth noise but hide early drop-off] |
Step 2: Retention curve design
Standard Retention Table Template
Design the cohort retention table structure:
| Cohort | Size | Period 0 | Period 1 | Period 2 | Period 3 | Period 4 | Period 5 | Period 6 |
|---|---|---|---|---|---|---|---|---|
| [Month/Week 1] | [N users] | 100% | [X%] | [X%] | [X%] | [X%] | [X%] | [X%] |
| [Month/Week 2] | [N users] | 100% | [X%] | [X%] | [X%] | [X%] | [X%] | -- |
| [Month/Week 3] | [N users] | 100% | [X%] | [X%] | [X%] | [X%] | -- | -- |
If the client provides actual data, populate the table. If designing the framework, explain what goes in each cell and how to compute it.
Show the math: "Period 1 retention = Users who performed [retention event] in Period 1 / Total users in cohort x 100"
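If the client can export signup dates and dated retention events, the table can be computed along these lines. This is a minimal sketch under stated assumptions: monthly cohorts, bounded retention, and hypothetical input shapes (`signups` as a user-to-date mapping, `events` as user/date pairs). The function names are illustrative, not part of any tool the client uses.

```python
from collections import defaultdict
from datetime import date


def month_index(d):
    """Map a date to a running month number so months can be subtracted."""
    return d.year * 12 + d.month


def month_label(index):
    """Turn a running month number back into a YYYY-MM label."""
    return f"{(index - 1) // 12}-{(index - 1) % 12 + 1:02d}"


def retention_table(signups, events):
    """Return {cohort label: (cohort size, [retention % for period 0, 1, ...])}."""
    cohort_users = defaultdict(set)
    for user, signup in signups.items():
        cohort_users[month_index(signup)].add(user)

    active = defaultdict(set)  # (cohort, period) -> users with a retention event
    last_month = max(cohort_users)
    for user, day in events:
        cohort = month_index(signups[user])
        period = month_index(day) - cohort
        if period >= 0:
            active[(cohort, period)].add(user)
        last_month = max(last_month, month_index(day))

    table = {}
    for cohort in sorted(cohort_users):
        size = len(cohort_users[cohort])
        row = [100.0]  # period 0 is 100% by definition
        # Only emit periods the cohort has actually reached (no censored cells).
        for period in range(1, last_month - cohort + 1):
            row.append(round(100 * len(active[(cohort, period)]) / size, 1))
        table[month_label(cohort)] = (size, row)
    return table


# Hypothetical data: two January signups, two February signups.
signups = {
    "a": date(2024, 1, 5), "b": date(2024, 1, 20),
    "c": date(2024, 2, 3), "d": date(2024, 2, 10),
}
events = [
    ("a", date(2024, 2, 2)), ("a", date(2024, 3, 15)),
    ("b", date(2024, 2, 11)), ("c", date(2024, 3, 1)),
]

table = retention_table(signups, events)
print(table["2024-01"])  # (2, [100.0, 100.0, 50.0])
print(table["2024-02"])  # (2, [100.0, 50.0])
```

Rows get shorter for newer cohorts because later periods have not happened yet, which reproduces the triangular shape of the template above.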
Retention Curve Interpretation Guide
| Curve Shape | What It Means | Typical Cause | Action |
|---|---|---|---|
| Steep early drop, then flattens | Product has a core retained audience but loses most users quickly | Onboarding friction, wrong audience, unclear value prop | Focus on activation and time-to-value |
| Gradual steady decline | Users slowly disengage over time, no stable floor | Product lacks habit loops or ongoing value | Build engagement hooks, recurring value triggers |
| Flat high retention | Most users stick around | Strong product-market fit for this segment | Focus on acquisition, the product retains well |
| Flat low retention | Almost everyone leaves quickly | Fundamental product or audience problem | Revisit product-market fit before optimizing retention |
| Improving over time (newer cohorts retain better) | Product is getting better at retaining | Product improvements, better onboarding, better targeting | Keep iterating, quantify what changed |
| Declining over time (newer cohorts retain worse) | Product or audience quality is degrading | Channel mix shift, market saturation, product neglect | Diagnose urgently -- this compounds fast |
Step 3: Behavioral segmentation for cohort comparison
Define segments that are worth comparing as separate cohorts:
Segmentation Criteria
| Segment Dimension | Segments to Compare | Hypothesis |
|---|---|---|
| Acquisition channel | [e.g., Organic vs. Paid vs. Referral] | [Organic users may retain better because higher intent] |
| Activation behavior | [e.g., Completed onboarding vs. Did not] | [Users who hit the aha moment retain at higher rates] |
| Plan/Tier | [e.g., Free vs. Paid vs. Enterprise] | [Paid users have sunk cost, likely higher retention] |
| Geography | [e.g., US vs. International] | [Product-market fit may vary by region] |
| Use case | [e.g., Primary use case A vs. B] | [One use case may have stronger retention loops] |
| Time-based | [e.g., Pre-launch vs. Post-launch of feature X] | [Feature X was supposed to improve retention -- did it?] |
Priority segments to analyze first:
- [Segment comparison] -- because [reason this is the highest-value comparison]
- [Segment comparison] -- because [reason]
- [Segment comparison] -- because [reason]
Step 4: Churn pattern identification
Churn Timing Analysis
| Churn Window | % of Total Churn | Cumulative | Pattern |
|---|---|---|---|
| Period 0-1 | [X%] | [X%] | [Early churn -- activation problem] |
| Period 1-3 | [X%] | [X%] | [Short-term churn -- value realization gap] |
| Period 3-6 | [X%] | [X%] | [Medium-term churn -- engagement decay] |
| Period 6-12 | [X%] | [X%] | [Long-term churn -- competitive displacement or needs change] |
| Period 12+ | [X%] | [X%] | [Mature churn -- natural lifecycle] |
If actual data is provided, compute these. If designing the framework, explain how to calculate each and what to look for.
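One way to compute the churn timing table, assuming each churned user has already been assigned the period in which they churned. The `churned` list is hypothetical. The windows are treated as half-open (start included, end excluded) so a user on a boundary is counted once; that convention is an assumption, since the labels above share boundary values.

```python
def churn_timing(churn_periods, windows):
    """Return (label, % of total churn, cumulative %) for each window.

    churn_periods: period number in which each churned user left.
    windows: (label, start, end) with start inclusive and end exclusive;
             end=None means the window is open-ended.
    """
    total = len(churn_periods)
    if total == 0:
        raise ValueError("no churned users to analyze")
    rows, cumulative = [], 0.0
    for label, start, end in windows:
        count = sum(
            1 for p in churn_periods if p >= start and (end is None or p < end)
        )
        share = 100 * count / total
        cumulative += share
        rows.append((label, round(share, 1), round(cumulative, 1)))
    return rows


windows = [
    ("Period 0-1", 0, 1),
    ("Period 1-3", 1, 3),
    ("Period 3-6", 3, 6),
    ("Period 6-12", 6, 12),
    ("Period 12+", 12, None),
]
churned = [0, 0, 0, 0, 1, 2, 2, 4, 7, 13]  # hypothetical churn periods

rows = churn_timing(churned, windows)
for row in rows:
    print(row)
# First row: ('Period 0-1', 40.0, 40.0), so 40% of all churn happens before period 1.
```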
Root Cause Hypothesis Matrix
For each significant churn window, generate testable hypotheses:
| Churn Window | Hypothesis | Supporting Signal | How to Validate | Confidence |
|---|---|---|---|---|
| Period 0-1 | [e.g., Users do not understand value in first session] | [e.g., <30% complete onboarding] | [Onboarding completion funnel analysis] | [High / Medium / Low] |
| Period 0-1 | [e.g., Wrong audience from paid acquisition] | [e.g., Paid cohorts churn 2x vs. organic] | [Compare cohorts by acquisition channel] | [Confidence] |
| Period 1-3 | [e.g., No habit loop after initial use] | [e.g., Usage drops 80% after week 1] | [Session frequency analysis by cohort week] | [Confidence] |
| Period 3-6 | [e.g., Users hit a capability ceiling] | [e.g., Power users upgrade, others leave] | [Feature usage correlation with retention] | [Confidence] |
Step 4b: Statistical rigor for cohort comparisons
When comparing retention between segments or testing whether a cohort difference is real vs. noise:
Statistical Validation
| Comparison | Method | When to use |
|---|---|---|
| Two cohort retention rates at a single time point | Z-test for proportions or chi-square | "Is the January cohort's Month 3 retention different from February's?" |
| Full retention curves between two groups | Log-rank test | "Does the entire retention trajectory differ between organic and paid users?" |
| Retention with covariates | Cox proportional hazards regression | "After controlling for plan type and geography, does acquisition channel affect retention?" |
| Time-to-event (time to churn) | Kaplan-Meier estimator | "What's the median time to churn, accounting for users who haven't churned yet?" |
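For the first row of the table, a pooled two-proportion z-test can be sketched with the standard library alone. The counts below are hypothetical, and the function name is illustrative. The normal approximation assumes reasonably large cohorts; for very small ones an exact test is usually the safer choice.

```python
import math


def two_proportion_z_test(retained_a, n_a, retained_b, n_b):
    """Pooled z-test for a difference between two retention rates.

    Returns (z statistic, two-sided p-value).
    """
    p_a, p_b = retained_a / n_a, retained_b / n_b
    pooled = (retained_a + retained_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Hypothetical: 120 of 200 retained in one cohort vs. 100 of 200 in another.
z, p_value = two_proportion_z_test(120, 200, 100, 200)
print(f"z = {z:.2f}, p = {p_value:.3f}")
```

With these made-up counts the 10-point gap is borderline at the conventional 0.05 threshold, which is a useful reminder that a difference that looks large on a chart may not be decisive at cohort sizes of a few hundred.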
Censoring: the most common cohort analysis mistake. Users who signed up recently haven't had the opportunity to churn at later periods. This isn't "100% retention at Month 6" -- it's missing data. Kaplan-Meier curves handle this correctly by adjusting the denominator as users are "censored" (their observation window hasn't reached that period yet). Standard retention tables handle this by only showing cells where the cohort has had enough time.
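To make the denominator adjustment concrete, here is a hand-rolled Kaplan-Meier estimator. It is a teaching sketch with hypothetical data, not a replacement for a statistics library. `durations` is how many periods each user was observed, and `churned` marks whether churn was actually seen (`True`) or the user was still active when observation ended (`False`, i.e. censored).

```python
def kaplan_meier(durations, churned):
    """Return [(time, survival probability)] at each time with at least one churn."""
    events = sorted(zip(durations, churned))
    at_risk = len(events)
    survival = 1.0
    curve = []
    i = 0
    while i < len(events):
        t = events[i][0]
        leaving = sum(1 for d, _ in events if d == t)       # churned + censored at t
        churns = sum(1 for d, c in events if d == t and c)  # churned at t
        if churns:
            survival *= 1 - churns / at_risk
            curve.append((t, survival))
        # Censored users leave the risk set without counting as churn.
        at_risk -= leaving
        i += leaving
    return curve


# Hypothetical: six users; three are censored (still active when data ends).
durations = [1, 2, 2, 3, 4, 4]
churned = [True, True, False, True, False, False]

curve = kaplan_meier(durations, churned)
for t, s in curve:
    print(t, round(s, 3))
# 1 0.833
# 2 0.667
# 3 0.444
```

At time 3 only three users are still at risk, so one churn there reduces survival by a third rather than by a sixth. That is the adjustment the paragraph above describes.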
Confidence intervals on retention rates: Report retention rates with confidence intervals, especially for small cohorts. A cohort of 50 users showing 60% Month 1 retention has a 95% CI of roughly [45%, 74%] -- that's a wide range. A cohort of 5,000 at 60% has a CI of [58.6%, 61.4%]. The sample size determines whether the difference you see is signal or noise.
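The figures above can be checked with a normal-approximation (Wald) interval, sketched below. The function name is illustrative. Other interval methods give slightly different bounds for the 50-user case, which is why the text quotes it only roughly.

```python
import math


def wald_interval(retained, n, z=1.96):
    """Approximate 95% confidence interval for a retention rate (normal approximation)."""
    p = retained / n
    margin = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - margin), min(1.0, p + margin)


for retained, n in [(30, 50), (3000, 5000)]:
    low, high = wald_interval(retained, n)
    print(f"n={n}: {low:.1%} to {high:.1%}")
# n=50: 46.4% to 73.6%
# n=5000: 58.6% to 61.4%
```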
Related skills: For choosing the right statistical test, use /statistical-test-selector. For understanding whether a retention intervention caused the improvement, use /causal-inference-guide.
Step 5: Intervention timing recommendations
Intervention Map
| Churn Risk Window | Intervention | Trigger | Channel | Expected Impact |
|---|---|---|---|---|
| Day 0-3 | [Onboarding email sequence] | [Signup without completing core action] | [Email + in-app] | [Increase activation by X%] |
| Day 7-14 | [Re-engagement nudge] | [No activity for 5+ days] | [Push / Email] | [Recover X% of dormant users] |
| Day 30 | [Value check-in] | [End of first month] | [Email / in-app survey] | [Identify at-risk users early] |
| Day 60-90 | [Feature education] | [Users not using key features] | [In-app walkthrough] | [Expand usage depth] |
| Day 180+ | [Win-back campaign] | [Churned for 30+ days] | [Email with incentive] | [Recover X% at lower LTV] |
Step 6: Cohort-based forecasting
Retention Forecast Model
If the client wants to project future revenue or user counts:
| Input | Value | Source |
|---|---|---|
| Monthly new users | [X] | [Current acquisition rate or target] |
| Retention curve (mature cohort) | [X% at Month 1, X% at Month 3, X% at Month 6, X% at Month 12] | [Historical cohort data or benchmark] |
| Revenue per retained user | [$X/month] | [ARPU or subscription price] |
Forward Projection
| Month | New Users | Retained from Prior Cohorts | Total Active Users | Monthly Revenue |
|---|---|---|---|---|
| Month 1 | [X] | [0] | [X] | [$X] |
| Month 2 | [X] | [X from M1 x M1 retention %] | [X] | [$X] |
| Month 3 | [X] | [Sum of retained from all prior cohorts] | [X] | [$X] |
| ... | ... | ... | ... | ... |
| Month 12 | [X] | [X] | [X] | [$X] |
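Show the math for at least Month 3: "Month 3 active = New M3 users + (M1 cohort x Month 2 retention %) + (M2 cohort x Month 1 retention %) = X + X + X = X users". By Month 3 the M1 cohort is two periods past its start and the M2 cohort is one period past its start, so each cohort is multiplied by the retention rate for its own age.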
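The projection table can be generated by summing, for each calendar month, every cohort acquired so far multiplied by the retention rate for its age. This is a sketch under simplifying assumptions: constant monthly acquisition, one retention curve shared by all cohorts, and a curve that plateaus at its last known value. The numbers and the function name are hypothetical.

```python
def project_active_users(new_users, retention, months):
    """Total active users for months 1..months.

    new_users: new users acquired each month (assumed constant).
    retention: retention[k] is the share of a cohort still active k months
               after joining, with retention[0] == 1.0.
    """
    totals = []
    for month in range(1, months + 1):
        active = 0.0
        for cohort_start in range(1, month + 1):
            age = month - cohort_start
            # Assumption: beyond the known curve, retention holds at its last value.
            rate = retention[min(age, len(retention) - 1)]
            active += new_users * rate
        totals.append(active)
    return totals


retention = [1.0, 0.60, 0.50, 0.45]  # hypothetical mature-cohort curve
arpu = 25.0                          # hypothetical revenue per retained user per month

totals = project_active_users(1000, retention, 3)
for month, active in enumerate(totals, start=1):
    print(f"Month {month}: {active:.0f} active, ${active * arpu:,.0f} revenue")
# Month 3 = 1000 new + 600 (cohort aged 1 month) + 500 (cohort aged 2 months) = 2100
```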
Scenario Comparison
| Scenario | Retention Change | Impact on Month 12 Active Users | Revenue Impact |
|---|---|---|---|
| Baseline | Current retention curve | [X users] | [$X/month] |
| +5 percentage-point retention improvement | [Adjusted curve] | [X users (+Y%)] | [$X/month (+$Z)] |
| +10 percentage-point retention improvement | [Adjusted curve] | [X users (+Y%)] | [$X/month (+$Z)] |
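A scenario comparison can be sketched by shifting the retention curve and recomputing Month 12. Everything here is hypothetical: the baseline curve, the acquisition rate, the revenue figure, and in particular the assumption that the uplift applies as a flat percentage-point shift to every period after period 0. A real engagement may instead model an uplift that decays or that applies only to the first period.

```python
def month_n_active(new_users, retention, months):
    """Active users in the final month: one cohort per month, aged 0..months-1."""
    last = len(retention) - 1  # assume the curve plateaus at its last known value
    return sum(new_users * retention[min(age, last)] for age in range(months))


def shifted(retention, pp):
    """Raise every post-period-0 retention rate by `pp` percentage points, capped at 100%."""
    return [retention[0]] + [min(1.0, r + pp / 100) for r in retention[1:]]


baseline = [1.0, 0.60, 0.50, 0.45, 0.42, 0.40]  # hypothetical curve
arpu = 25.0                                     # hypothetical $/retained user/month

base_users = month_n_active(1000, baseline, 12)
for label, pp in [("Baseline", 0), ("+5 pp", 5), ("+10 pp", 10)]:
    users = month_n_active(1000, shifted(baseline, pp), 12)
    print(f"{label}: {users:.0f} users, ${users * arpu:,.0f}/month "
          f"({users / base_users - 1:+.1%} vs. baseline)")
```

Under these made-up inputs, a 5-point shift adds 550 active users by Month 12, because eleven older cohorts each keep an extra 5% of their 1,000 users.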
"A 5 percentage-point improvement in Month 1 retention compounds to [X] additional active users by Month 12, worth approximately $[X] in additional monthly revenue."
Kate's Talking Points
- "Your retention curve shows [shape]. This tells us [interpretation]. The biggest opportunity is [specific window and action]."
- "Newer cohorts are retaining [better/worse/the same] as older cohorts. This means [the product is improving / something is degrading / retention is stable]."
- "If we improve [specific retention window] by [X] percentage points, that compounds to [X additional users and $X revenue] over 12 months."
Related skills: Feeds into /funnel-analysis for understanding where in the funnel retention breaks down. Pairs with /growth-model-builder for growth projections using cohort retention inputs. Uses /audience-segmentation for defining behavioral segments to compare.
Example Output
Input
- Product/Service: Brightpath — a B2B SaaS platform for mid-market HR teams to manage employee learning and development (LMS + skills tracking). $299/seat/year subscription. 3 years old, ~$4.2M ARR, Series A raised 14 months ago. Growing ~40% YoY but board is asking hard questions about net revenue retention.
- Available data: Salesforce (contract start/end dates, ARR by account), Mixpanel (event-level usage: logins, course completions, skills assessments, admin actions), and a spreadsheet of monthly active account counts by signup quarter going back 28 months. No individual seat-level data — retention is tracked at the account level.
- Retention definition: Not fully defined. Currently reporting "accounts with at least one login in the calendar month" — but leadership suspects this is masking low-depth engagement. Want help defining something more meaningful.
- Time period: 28 months of data; want monthly cohort analysis (cohorts defined by contract start month). Most mature cohort is 28 months old; most recent is 2 months old.
- Key events/milestones: New onboarding flow launched Month 10; pricing increased ~18% at Month 16; mobile app released Month 20; a large enterprise-focused content library added Month 24. Sales shifted toward mid-market (50–500 employees) starting around Month 12 after early growth came from smaller SMB accounts.
- Context: NRR is reportedly 91% — below the 100%+ benchmark investors expect. Leadership suspects early churn is the main problem but hasn't validated this. Three cohorts from the SMB era (Months 1–6) are dragging down the averages. Want to know whether the product is actually getting better at retaining mid-market accounts or just looks that way because SMB cohorts are aging out.
Output (abbreviated)
Cohort Analysis Framework: Brightpath
Prepared for Kate Makrigiannis | Engagement Reference: Brightpath NRR Diagnostic
Step 1: Retention Definition
Locked Retention Definition
| Parameter | Definition | Rationale |
|---|---|---|
| Cohort grouping | Month of contract start date (from Salesforce) | Aligns retention measurement to the contractual relationship, not login behavior |
| Retention event | Account completed ≥1 course OR ran ≥1 skills assessment in the calendar month | Distinguishes genuine platform usage from admin check-ins; ties retention to Brightpath's core value |
| Retention window | Monthly (calendar month) | Matches natural L&D workflow cycles — most HR teams plan training monthly or quarterly |
| Measurement method | Bounded retention: "did the account perform the retention event in Month N?" | Appropriate for subscription businesses; tells us whether accounts are active in a given month, not merely whether they ever come back |
Why the current definition is misleading: "Any login" captures HR admins logging in to pull a report or check a setting — not actual learning activity. An account logging in once per month with zero courses completed is technically "retained" under the current definition but is a churn risk. This inflated headline number is likely masking a depth-of-engagement problem.
Alternative Retention Definitions to Consider
| Alternative | When to Use | Trade-off |
|---|---|---|
| "Any login" (current method) | Useful for early-stage products where any engagement signal matters | Flatters retention numbers; an account with one admin login looks identical to one with 200 completions |
| "≥X% of licensed seats active monthly" | Better for seat-based contracts with large accounts | Penalizes accounts that are active but using fewer seats than purchased — may not reflect renewal risk accurately |
| "Course completion + skills assessment" (recommended) | When the product's core value is demonstrated learning outcomes | Narrower definition will show lower retention initially — that's honest, and it's where the real work is |
Step 2: Retention Curve Design
Standard Retention Table Template (Framework — Populate from Mixpanel + Salesforce)
| Cohort | Accounts | M0 | M1 | M3 | M6 | M12 | M18 | M24 |
|---|---|---|---|---|---|---|---|---|
| Q1 SMB (M1–3) | ~85 | 100% | 74% | 58% | 41% | 28% | 19% | 14% |
| Q2 SMB (M4–6) | ~110 | 100% | 71% | 54% | 38% | 25% | 17% | -- |
| Q3 Transition (M7–9) | ~95 | 100% | 76% | 61% | 47% | 33% | -- | -- |
| Q4 Mid-market (M10–12) | ~130 | 100% | 81% | 67% | 54% | 38% | -- | -- |
| Q5 Mid-market (M13–15) | ~160 | 100% | 83% | 70% | 57% | -- | -- | -- |
| Q6 Post-pricing (M16–18) | ~145 | 100% | 80% | 66% | -- | -- | -- | -- |
| Q7 Mobile era (M19–21) | ~175 | 100% | 85% | 71% | -- | -- | -- | -- |
| Q8 Recent (M22–24) | ~190 | 100% | 84% | -- | -- | -- | -- | -- |
Cells marked -- are censored: those cohorts haven't reached that period yet. Do not report these as 100% retention. Leave them blank in client-facing materials.
The math:
M3 retention for Q4 Mid-market cohort = Accounts that completed ≥1 course OR skills assessment in Month 3 ÷ 130 total accounts in cohort × 100 = 87 ÷ 130 × 100 ≈ 67%
Retention Curve Interpretation Guide — Brightpath Context
| Curve Shape | Match to Brightpath? | Interpretation |
|---|---|---|
| Steep early drop, then flattens | ✅ Yes, especially SMB cohorts | SMB cohorts lose ~25–30% of accounts in Month 1 (mid-market cohorts lose ~15–20%), after which the decline slows. This is an activation problem, not a long-term product failure. |
| Improving over time (newer cohorts retain better) | ✅ Likely — Q4 onward shows improvement | The mid-market shift + new onboarding (Month 10) appear to be working. This is the story the board needs to hear — with proof. |
| Declining over time | ⚠️ Watch for post-pricing cohorts (M16+) | If M16–18 cohorts flatten below M13–15, the 18% price increase may have filtered for lower-commitment accounts. |
Step 3: Behavioral Segmentation
Priority Cohort Comparisons
| Segment Dimension | Segments to Compare | Hypothesis |
|---|---|---|
| Company size (ICP shift) | SMB (<50 employees) vs. Mid-market (50–500) | Mid-market accounts have dedicated L&D budgets and stronger internal champions; should retain meaningfully better |
| Onboarding completion | Accounts that completed new onboarding flow (Month 10+) vs. those that did not | Structured onboarding likely reduces early churn by accelerating time-to-first-course-completion |
| Activation depth in Month 1 | Accounts with ≥10 course completions in M1 vs. <10 | Early depth of usage is almost always the strongest leading indicator of long-term retention |
| Pricing tier | Pre-price-increase vs. post-price-increase cohorts | Higher price point may attract higher-intent buyers or may be filtering out marginal accounts |
| Mobile adoption | Accounts with ≥20% of sessions on mobile vs. desktop-only | Mobile app (Month 20) may have unlocked a new usage pattern that correlates with retention |
Priority ranking:
- SMB vs. Mid-market — this is the core diagnostic question. If mid-market cohorts retain at 110%+ NRR, the SMB drag is a legacy problem that will naturally age out. That's a very different conversation with the board than "we have a retention problem."
- **Activation depth in Month