Use this when you need to design, audit, or improve product-level observability — measuring what users do, how they experience the product, and whether they accomplish their goals. This covers event tracking, user journeys, task completion, funnel analysis, and product performance from the user's perspective. If you're looking to measure system health, uptime, or deployment reliability, use /instrumentation-plan instead.
The distinction: Observability answers "Are users successful?" Instrumentation answers "Is the system healthy?" Both are needed. Start here if you can't answer basic questions about how users interact with your product.
For AI-powered features: This skill covers product-level observability. If you need to monitor LLM-specific concerns (prompt quality, token costs, hallucination drift, model regression), use /llm-observability-plan. It covers the AI-specific layer between product analytics and system instrumentation.
Process
Step 1: Gather context
Ask the user to provide:
- Product description — what does this product do? Who are the users? What are the primary use cases?
- Current analytics — what's already tracked? (existing analytics tools, dashboards, event logs)
- Key user journeys — the 3-5 most important paths a user takes (e.g., signup → first value, search → purchase, onboard → habit)
- Business goals — what does success look like? (activation rate, retention, revenue, engagement)
- Known blind spots — what questions about user behavior can't be answered today?
- Analytics stack — tools in use or under consideration (Amplitude, Mixpanel, PostHog, GA4, custom, etc.)
If the user doesn't have all of this, work with what's available. Flag gaps as assumptions.
Step 2: Define the event taxonomy
A clean event taxonomy is the foundation of product observability. Design it before implementing anything.
Event naming convention:
Use a consistent Object Action or object_action pattern:
| Pattern | Example | Anti-pattern |
|---|---|---|
| Object Action | Button Clicked, Page Viewed, Form Submitted | click, pageview, submit |
| Namespace prefix | Onboarding Step Completed, Search Query Executed | step_done, searched |
| Past tense for completed actions | Account Created, Item Purchased | creating_account, buying |
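A naming convention is easier to keep when it is checked automatically. The sketch below is one possible lint for the Object Action pattern; the regular expression, the `check_event_name` helper, and the "ends in -ing" tense heuristic are all assumptions for illustration, not part of any analytics tool.

```python
import re

# Assumed convention: two or more capitalized words separated by single
# spaces, e.g. "Button Clicked". Hyphens inside a word are allowed.
OBJECT_ACTION = re.compile(r"^[A-Z][A-Za-z0-9-]*(?: [A-Z][A-Za-z0-9-]*)+$")


def check_event_name(name: str) -> list[str]:
    """Return convention violations; an empty list means the name passes."""
    problems = []
    if not OBJECT_ACTION.match(name):
        problems.append("expected 'Object Action' in Title Case, two or more words")
    # Heuristic only: a final word ending in "ing" suggests an ongoing action.
    last_word = re.split(r"[ _]", name)[-1].lower()
    if last_word.endswith("ing"):
        problems.append("final word looks like present tense; prefer past tense")
    return problems


for candidate in ["Button Clicked", "click", "buying"]:
    print(candidate, "->", check_event_name(candidate) or "ok")
```

A check like this can run in CI against the taxonomy table so that new events are reviewed before they ship.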
Event categories:
| Category | Purpose | Examples |
|---|---|---|
| Lifecycle | Track user progression | Account Created, Onboarding Completed, Subscription Started, Account Churned |
| Engagement | Track feature usage | Feature Used, Content Viewed, Search Executed, Export Generated |
| Conversion | Track goal completion | Trial Started, Purchase Completed, Upgrade Initiated |
| Navigation | Track movement patterns | Page Viewed, Tab Switched, Navigation Clicked |
| Error / Friction | Track failure points | Error Displayed, Form Validation Failed, Timeout Experienced |
| Security / Anomaly | Track security-relevant behavior for baseline building | Permission Elevated, Unusual Access Pattern, Data Export Volume Exceeded, Off-Hours Activity, New Device Login |
Event properties (payload structure):
Every event should include:
| Property | Type | Purpose | Example |
|---|---|---|---|
| event_name | string | What happened | Button Clicked |
| timestamp | ISO 8601 | When it happened | 2026-03-05T14:30:00Z |
| user_id | string | Who did it (anonymized if needed) | usr_abc123 |
| session_id | string | Session grouping | sess_xyz789 |
| page / screen | string | Where it happened | /dashboard |
| properties | object | Context-specific details | { button_name: "Export", format: "CSV" } |
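The payload structure above can be expressed as a small typed envelope. This is a minimal sketch: the field names come from the table, while the `Event` class, the `utc_now_iso` helper, and the required-field check are hypothetical and would be adapted to whichever analytics SDK is in use.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any


def utc_now_iso() -> str:
    """Current UTC time as an ISO 8601 string with a trailing Z."""
    return datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


@dataclass
class Event:
    # Field names mirror the property table; the class itself is illustrative.
    event_name: str
    user_id: str
    session_id: str
    page: str
    properties: dict[str, Any] = field(default_factory=dict)
    timestamp: str = field(default_factory=utc_now_iso)

    def __post_init__(self) -> None:
        # Reject events that cannot answer who, what, and where.
        for name in ("event_name", "user_id", "session_id", "page"):
            if not getattr(self, name):
                raise ValueError(f"{name} is required on every event")


event = Event(
    event_name="Button Clicked",
    user_id="usr_abc123",
    session_id="sess_xyz789",
    page="/dashboard",
    properties={"button_name": "Export", "format": "CSV"},
)
payload = asdict(event)  # plain dict, ready to serialize
```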
Taxonomy design rules:
- Decide on a naming convention before your first event — retrofitting is expensive
- Every event must answer: Who did what, where, when, and with what context?
- Limit property cardinality — a property with 10,000 unique values is hard to analyze
- Version your taxonomy — when you rename or restructure events, document the change
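The cardinality rule can be audited from a sample of logged events. The sketch below assumes events are plain dicts shaped like the payload table; the `limit` default of 50 is an arbitrary illustrative threshold, not a recommendation from any tool.

```python
from collections import defaultdict


def property_cardinality(events):
    """Count distinct values seen for each (event_name, property) pair."""
    seen = defaultdict(set)
    for event in events:
        for key, value in event.get("properties", {}).items():
            seen[(event["event_name"], key)].add(str(value))
    return {pair: len(values) for pair, values in seen.items()}


def high_cardinality(events, limit=50):
    """Return the pairs whose distinct-value count exceeds the assumed limit."""
    counts = property_cardinality(events)
    return sorted(pair for pair, n in counts.items() if n > limit)


sample = [
    {"event_name": "Search Executed", "properties": {"query": q, "format": "CSV"}}
    for q in ("invoices", "payroll", "benefits")
]
flagged = high_cardinality(sample, limit=2)  # free-text query is flagged
```

Free-text fields such as search queries are the usual offenders; bucketing them (for example by query length or category) keeps the property analyzable.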
Present the taxonomy as a table:
| Event Name | Category | Trigger | Key Properties | Priority |
|---|---|---|---|---|
| (Page Viewed) | Navigation | Any page load | page_path, referrer, load_time_ms | P0 |
| (Feature X Used) | Engagement | User completes action X | feature_name, input_type, result_count | P0 |
| (Signup Completed) | Lifecycle | Registration finishes | signup_method, referral_source | P0 |
Step 3: Map user journeys and funnels
For each key user journey, define the funnel:
Funnel template:
| Step | Event | Success Criteria | Expected Drop-off | Alert If |
|---|---|---|---|---|
| 1. (Entry) | Page Viewed (landing) | User arrives | — | Traffic < (threshold) |
| 2. (Engagement) | Feature Explored | User interacts | 40-60% drop expected | Drop > 70% |
| 3. (Activation) | First Value Achieved | User gets value | 20-40% drop expected | Drop > 50% |
| 4. (Conversion) | Goal Completed | User converts | 10-30% drop expected | Drop > 40% |
For each funnel:
- Name the journey — e.g., "New user → first value" or "Search → purchase"
- Define the steps — specific events that mark progression
- Set baseline expectations — what's a healthy drop-off rate at each step?
- Define alert thresholds — when does drop-off signal a problem vs. normal behavior?
- Identify branch points — where do users take alternate paths? Are those paths tracked?
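Step-to-step drop-off and the alert thresholds from the funnel template can be computed from sets of users per event. This is a simplified sketch: `funnel_report` is a hypothetical helper, and it ignores event ordering in time, which a production funnel would also enforce.

```python
def funnel_report(steps, users_by_event, alert_if_drop_above):
    """Compute step-to-step drop-off for an ordered funnel.

    steps: ordered event names.
    users_by_event: event name -> set of user_ids that fired it.
    alert_if_drop_above: event name -> highest tolerated drop (0.0 to 1.0).
    """
    report = []
    reached = None          # users who completed every step so far
    previous_count = None
    for step in steps:
        users = users_by_event.get(step, set())
        reached = set(users) if reached is None else reached & users
        count = len(reached)
        drop = None
        if previous_count:  # skips the entry step and empty previous steps
            drop = (previous_count - count) / previous_count
        threshold = alert_if_drop_above.get(step)
        alert = drop is not None and threshold is not None and drop > threshold
        report.append({"step": step, "users": count, "drop": drop, "alert": alert})
        previous_count = count
    return report


users = {
    "Page Viewed": {f"u{i}" for i in range(10)},
    "Feature Explored": {"u0", "u1"},
    "First Value Achieved": {"u0"},
}
report = funnel_report(
    ["Page Viewed", "Feature Explored", "First Value Achieved"],
    users,
    {"Feature Explored": 0.70, "First Value Achieved": 0.50},
)
```

In this sample the second step loses 8 of 10 users, which exceeds the 70% threshold and raises an alert; the third step loses exactly half and does not.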
Step 4: Task completion and timing
Beyond funnels, measure whether users accomplish what they came to do:
Task completion framework:
| Task | Start Event | End Event | Success Criteria | Time Target | Measure |
|---|---|---|---|---|---|
| (Complete onboarding) | Onboarding Started | Onboarding Completed | All steps finished | < 5 minutes | Completion rate, median time |
| (Find and use a feature) | Search Executed | Feature Used | User finds what they need | < 30 seconds | Success rate, time to result |
| (Submit a form) | Form Opened | Form Submitted | Valid submission | < 2 minutes | Completion rate, error rate, abandonment point |
Timing metrics to capture:
- Time to first value — how long from signup/entry to the first meaningful outcome?
- Time on task — how long does a specific workflow take?
- Time between sessions — how frequently do users return?
- Perceived performance: Core Web Vitals (LCP, INP, CLS) as user-facing performance signals. INP replaced FID as a Core Web Vital in March 2024, so track FID only for historical comparison.
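Completion rate and median time on task fall out of pairing each user's start event with their next end event. The sketch below assumes events are dicts with `user_id`, `event_name`, and an ISO 8601 `timestamp`; `task_stats` and `parse_ts` are illustrative names, and only each user's first attempt is counted.

```python
from datetime import datetime
from statistics import median


def parse_ts(ts: str) -> datetime:
    """Parse an ISO 8601 timestamp, accepting a trailing Z for UTC."""
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


def task_stats(events, start_event, end_event):
    """Completion rate and median duration for each user's first attempt."""
    starts, durations = {}, {}
    for e in sorted(events, key=lambda e: parse_ts(e["timestamp"])):
        user = e["user_id"]
        if e["event_name"] == start_event and user not in starts:
            starts[user] = parse_ts(e["timestamp"])
        elif e["event_name"] == end_event and user in starts and user not in durations:
            durations[user] = (parse_ts(e["timestamp"]) - starts[user]).total_seconds()
    return {
        "started": len(starts),
        "completed": len(durations),
        "completion_rate": len(durations) / len(starts) if starts else 0.0,
        "median_seconds": median(durations.values()) if durations else None,
    }


log = [
    {"user_id": "u1", "event_name": "Onboarding Started", "timestamp": "2026-03-05T14:00:00Z"},
    {"user_id": "u2", "event_name": "Onboarding Started", "timestamp": "2026-03-05T14:00:00Z"},
    {"user_id": "u3", "event_name": "Onboarding Started", "timestamp": "2026-03-05T14:01:00Z"},
    {"user_id": "u2", "event_name": "Onboarding Completed", "timestamp": "2026-03-05T14:02:00Z"},
    {"user_id": "u1", "event_name": "Onboarding Completed", "timestamp": "2026-03-05T14:04:00Z"},
]
stats = task_stats(log, "Onboarding Started", "Onboarding Completed")
```

Median is used rather than mean because a few abandoned-then-resumed sessions can otherwise dominate the timing figure.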
Behavioral baseline signals (optional; include when the product feeds security or anomaly detection):
If the organization uses AI-powered security tools (UEBA, anomaly detection), product observability events serve double duty: they are both product analytics and security intelligence. Consider tracking these behavioral patterns as part of the event taxonomy:
| Signal | What it establishes | Anomaly example |
|---|---|---|
| Access frequency per user | Normal usage cadence | Sudden spike or off-pattern access |
| Typical session duration | Expected engagement length | Unusually long or short sessions |
| Normal data access volume | Baseline download/export behavior | Bulk data export outside normal range |
| Geographic consistency | Expected access locations | Login from new region or impossible travel |
| Feature access patterns | Which features a user typically uses | Sudden access to admin or sensitive features |
These signals feed /telemetry-readiness-audit assessments and enable AI security tools to build meaningful behavioral baselines from the same instrumentation effort.
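To make "baseline" concrete, the sketch below flags a value that sits far from a user's own history using a z-score. It is a deliberately simple stand-in for what a UEBA tool does internally; `is_anomalous` and the threshold of 3 standard deviations are assumptions for illustration.

```python
from statistics import mean, stdev


def is_anomalous(history, today, z_threshold=3.0):
    """Flag today's value if it is far from the user's historical mean."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today != mu  # any change from a perfectly flat baseline
    return abs(today - mu) / sigma > z_threshold


daily_exports = [10, 12, 11, 9, 10, 13, 11]  # records exported per day
normal_day = is_anomalous(daily_exports, 12)
bulk_export = is_anomalous(daily_exports, 500)
```

The point for instrumentation is that the baseline only exists if the underlying events (exports, logins, feature access) carry a consistent `user_id` and volume property.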
Step 5: Feature adoption and retention signals
Track whether features are actually used and whether usage sticks:
Adoption metrics:
| Metric | Formula | What it tells you |
|---|---|---|
| Feature adoption rate | Users who used feature / total active users | Is the feature discoverable? |
| Activation rate | Users who completed key action / users who signed up | Are users getting value? |
| Breadth of use | Number of distinct features used per user per session | Are users exploring or stuck? |
| Depth of use | Frequency of feature use per user per week | Is usage habitual or one-time? |
| Retention (D1/D7/D30) | Users returning on day N / users who started on day 0 | Does the product stick? |
| Stickiness (DAU/MAU) | Daily active users / monthly active users | How often do users come back? |
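Three of the formulas in the table, sketched over in-memory sets. The function names are illustrative; averaging DAU across the period before dividing by MAU, and counting a user as retained only if active exactly on day N, are assumptions that teams define differently.

```python
from datetime import date, timedelta


def adoption_rate(feature_users: set, active_users: set) -> float:
    """Users who used the feature / total active users."""
    if not active_users:
        return 0.0
    return len(feature_users & active_users) / len(active_users)


def stickiness(daily_active_sets: list, monthly_active: set) -> float:
    """Average DAU / MAU (averaging DAU over the period is an assumption)."""
    if not daily_active_sets or not monthly_active:
        return 0.0
    average_dau = sum(len(day) for day in daily_active_sets) / len(daily_active_sets)
    return average_dau / len(monthly_active)


def day_n_retention(first_seen: dict, active_days: dict, n: int) -> float:
    """Users active exactly on day N / users who started on day 0."""
    if not first_seen:
        return 0.0
    returned = sum(
        1
        for user, start in first_seen.items()
        if start + timedelta(days=n) in active_days.get(user, set())
    )
    return returned / len(first_seen)


first_seen = {"a": date(2026, 3, 2), "b": date(2026, 3, 2)}
active_days = {"a": {date(2026, 3, 2), date(2026, 3, 9)}, "b": {date(2026, 3, 2)}}
d7 = day_n_retention(first_seen, active_days, 7)
```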
Cohort analysis guidance:
- Always segment by acquisition cohort (week or month of first use)
- Compare feature adoption across cohorts to detect trends
- Separate new users from power users in adoption metrics — they have different baselines
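Segmenting by acquisition cohort amounts to grouping users by the week of first use before computing any rate. A minimal sketch, assuming ISO weeks as the cohort unit; `cohort_key` and `adoption_by_cohort` are hypothetical helpers.

```python
from collections import defaultdict
from datetime import date


def cohort_key(first_use: date) -> str:
    """Label a user by the ISO year and week of their first use."""
    iso = first_use.isocalendar()
    return f"{iso[0]}-W{iso[1]:02d}"


def adoption_by_cohort(first_use_by_user: dict, feature_users: set) -> dict:
    """Feature adoption rate per weekly acquisition cohort."""
    totals, adopters = defaultdict(int), defaultdict(int)
    for user, first_use in first_use_by_user.items():
        key = cohort_key(first_use)
        totals[key] += 1
        if user in feature_users:
            adopters[key] += 1
    return {key: adopters[key] / totals[key] for key in sorted(totals)}


first_use = {"a": date(2026, 3, 2), "b": date(2026, 3, 4), "c": date(2026, 3, 10)}
by_cohort = adoption_by_cohort(first_use, feature_users={"a", "c"})
```

Comparing the resulting rates week over week shows whether a change in adoption comes from the product or from a shift in who is signing up.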
Step 6: Experiment instrumentation
If the team runs A/B tests or experiments, ensure the observability layer supports them:
Experiment tracking requirements:
- Every user session tagged with active experiment variants
- Experiment assignment logged as an event (Experiment Assigned with experiment_name, variant, user_id)
- Primary and secondary metrics defined before the experiment starts
- Sample size and duration calculated before launch (not after)
- Guardrail metrics defined — metrics that must not degrade (e.g., page load time, error rate)
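For the sample-size requirement, a rough pre-launch estimate for a conversion-rate metric can use the standard normal-approximation formula for comparing two proportions. This is a sketch under stated assumptions (two-sided test, equal group sizes, unpooled variance); the function name and defaults are illustrative, and a dedicated power calculator may give slightly different numbers.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline, mde_abs, alpha=0.05, power=0.80):
    """Approximate users needed per variant to detect an absolute lift.

    baseline: current conversion rate (e.g. 0.20).
    mde_abs: minimum detectable effect in absolute terms (e.g. 0.05).
    """
    p1, p2 = baseline, baseline + mde_abs
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


n_large_effect = sample_size_per_variant(0.20, 0.05)
n_small_effect = sample_size_per_variant(0.20, 0.02)
```

Dividing the required sample by expected daily traffic per variant gives the minimum run time, which is why this calculation belongs before launch.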
Experiment event structure:
| Event | Properties | When |
|---|---|---|
| Experiment Assigned | experiment_name, variant, assignment_method | User enters experiment |
| Experiment Exposed | experiment_name, variant, exposure_context | User sees the variant |
| Experiment Goal Reached | experiment_name, variant, goal_name, goal_value | User hits primary metric |
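One common way to keep assignment stable across sessions is to hash the user and experiment name into a bucket. The sketch below is an assumption about how assignment might be done, not a description of any particular experimentation platform; `assign_variant`, `experiment_assigned_event`, and the `sha256_hash` label are hypothetical.

```python
import hashlib


def assign_variant(user_id, experiment_name, variants=("control", "treatment")):
    """Deterministically map a user to a variant for one experiment."""
    key = f"{experiment_name}:{user_id}".encode("utf-8")
    bucket = int(hashlib.sha256(key).hexdigest(), 16)
    return variants[bucket % len(variants)]


def experiment_assigned_event(user_id, experiment_name):
    """Build an Experiment Assigned event using the properties in the table."""
    return {
        "event_name": "Experiment Assigned",
        "user_id": user_id,
        "properties": {
            "experiment_name": experiment_name,
            "variant": assign_variant(user_id, experiment_name),
            "assignment_method": "sha256_hash",  # illustrative label
        },
    }


assigned = experiment_assigned_event("usr_abc123", "new_onboarding_flow")
```

Including the experiment name in the hash input keeps assignments independent across experiments, so a user in the treatment group of one test is not systematically in the treatment group of the next.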
Step 7: Generate the observability plan
Compile everything into a single document:
Observability Plan — (Project name)
Generated: (date)
Product: (brief description)
Current state: (summary of what's tracked today)
Event Taxonomy
(Table from Step 2 — event names, categories, triggers, properties, priority)
User Journey Funnels
(Funnel definitions from Step 3 — one per key journey)
Task Completion Metrics
(Table from Step 4 — tasks, events, targets, timing)
Feature Adoption & Retention
(Metrics from Step 5 — adoption rate, activation, retention cohorts)
Experiment Instrumentation
(Structure from Step 6 — if applicable; omit if team doesn't run experiments yet)
Implementation Checklist
Priority-ordered list of what to implement next:
- (P0) (Most critical gap — e.g., "No event tracking exists; implement page views and core action events")
- (P0) (Second critical gap — e.g., "Signup funnel has no step-level tracking")
- (P1) (Important but not urgent — e.g., "Add timing instrumentation to onboarding flow")
- (P1) (Next important item)
- (P2) (Nice to have — e.g., "Implement breadth-of-use metric across feature set")
Data Governance Notes
- PII handling: (what user data is collected, how it's anonymized or consented)
- Retention policy: (how long event data is kept)
- Access: (who can see raw events vs. aggregated dashboards)
Open Questions
(Anything that couldn't be resolved without more information)
Step 8: Review and refine
Ask the user:
- Does the event taxonomy cover the questions you need to answer about user behavior?
- Are the funnels measuring the right steps? Any steps missing or too granular?
- Are the task completion targets realistic based on what you know about user behavior?
- Is the implementation checklist ordered correctly for your current priorities?
- Are there privacy or compliance constraints that affect what can be tracked?
- Do you need experiment instrumentation now, or is that a future concern?
Adjust based on feedback.
Related skills
- /instrumentation-plan: plan SRE instrumentation for system health, uptime, and deployment reliability
- /experiment-design: design a specific experiment to validate a product hypothesis
- /research-synthesize: synthesize qualitative research alongside quantitative observability data
Output location
Present the plan as formatted text in the conversation. The user can copy it into their product wiki, analytics documentation, or team shared drive.
Example Output
Input
- Product description: Capsule is a B2B SaaS product that helps HR teams build and manage employee onboarding programs — creating task checklists, sending automated welcome sequences, and tracking new hire progress through their first 90 days
- Current analytics: Google Analytics 4 (page views only), no event tracking beyond that; no funnel visibility; onboarding completion rates estimated manually from Salesforce notes
- Key user journeys: (1) HR admin signup → first program published, (2) New hire receives invite → completes onboarding checklist, (3) HR admin reviews progress dashboard → takes action on at-risk employee
- Business goals: Improve activation rate (HR admin publishes first program within 7 days of signup), improve new hire completion rate (target 80%), reduce churn among accounts where no program is published within 14 days
- Known blind spots: No visibility into where HR admins abandon the program builder, no data on which checklist steps new hires skip most, no session-level timing, can't tell if progress dashboard is actually used
- Analytics stack: Migrating to PostHog (self-hosted); Salesforce for CRM; Intercom for in-app messaging; no data warehouse yet
Output (abbreviated)
Observability Plan — Capsule HR Onboarding
Generated: 2025-07-14
Product: B2B SaaS for HR teams that builds and tracks employee onboarding programs through 90-day new hire journeys
Current state: GA4 page view tracking only. No event-level visibility. Activation, funnel, and task completion data are estimated manually. The PostHog migration is in progress, and this plan targets PostHog for implementation.
Event Taxonomy
| Event Name | Category | Trigger | Key Properties | Priority |
|---|---|---|---|---|
| Page Viewed | Navigation | Any page load | page_path, referrer, load_time_ms, user_role | P0 |
| Account Created | Lifecycle | HR admin completes registration | signup_method, company_size, referral_source | P0 |
| Program Builder Opened | Engagement | Admin clicks "Create Program" | entry_point, template_used | P0 |
| Program Step Added | Engagement | Admin adds a task to program | step_type, step_index, program_id | P0 |
| Program Published | Conversion | Admin clicks "Publish" | program_id, step_count, time_to_publish_days, template_used | P0 |
| New Hire Invited | Lifecycle | Admin sends onboarding invite | program_id, invite_method, days_before_start_date | P0 |
| Onboarding Checklist Opened | Engagement | New hire opens their checklist | program_id, device_type, hours_since_invite | P0 |
| Checklist Step Completed | Engagement | New hire marks a step done | step_id, step_type, step_index, program_id, completion_method | P0 |
| Checklist Step Skipped | Error / Friction | New hire skips or bypasses a step | step_id, step_type, step_index, skip_reason | P0 |
| Onboarding Completed | Lifecycle | All required steps finished | program_id, total_steps, days_to_complete, skip_count | P0 |
| Progress Dashboard Viewed | Engagement | Admin opens new hire progress view | new_hire_count, at_risk_count, view_depth_seconds | P1 |
| At-Risk Employee Actioned | Conversion | Admin sends nudge or reassigns step | action_type, days_since_last_hire_activity, program_id | P1 |
| Program Builder Abandoned | Error / Friction | Admin exits builder without publishing (session ends) | last_step_reached, steps_added, time_in_builder_minutes | P1 |
| Form Validation Failed | Error / Friction | Inline error shown to user | form_name, field_name, error_type, user_role | P1 |
| Experiment Assigned | Lifecycle | User enters A/B test | experiment_name, variant, user_role | P1 |
| Account Churned | Lifecycle | Subscription cancelled or not renewed | tenure_days, programs_published, last_active_date | P1 |
| Bulk Export Generated | Security / Anomaly | Admin exports new hire data | record_count, export_format, time_of_day | P2 |
| Permission Elevated | Security / Anomaly | User role changed to admin | changed_by, previous_role, account_id | P2 |
User Journey Funnels
Journey 1: HR Admin Signup → First Program Published (Activation)
| Step | Event | Success Criteria | Expected Drop-off | Alert If |
|---|---|---|---|---|
| 1. Signup | Account Created | Admin registers | — | Volume more than 20% below 7-day avg |
| 2. Builder Entry | Program Builder Opened | Admin starts building within 7 days | 20–30% drop | Drop > 45% |
| 3. Content Added | Program Step Added (3+ events) | Admin adds at least 3 steps | 20–30% drop | Drop > 40% |
| 4. Published | Program Published | Admin publishes first program | 25–35% drop | Drop > 50% |
Target activation rate: ≥ 55% of signups publish a program within 7 days
Critical blind spot addressed: the Program Builder Abandoned event reveals where admins stall; step count and time in builder pinpoint the friction.
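The per-step ranges in the Journey 1 table compound, so it is worth checking them against the activation target. A quick sketch, assuming each expected drop-off is measured against the previous step; `end_to_end_conversion` is an illustrative helper.

```python
from math import prod


def end_to_end_conversion(step_drop_offs):
    """Multiply per-step retention (1 - drop) to get overall conversion."""
    return prod(1 - drop for drop in step_drop_offs)


# Steps 2 to 4 of Journey 1: low end and high end of each expected range.
best_case = end_to_end_conversion([0.20, 0.20, 0.25])
worst_case = end_to_end_conversion([0.30, 0.30, 0.35])
print(f"expected activation band: {worst_case:.1%} to {best_case:.1%}")
```

Under that assumption the expected band is roughly 32% to 48% end to end, so the ≥ 55% target means beating the expected drop-off at one or more steps. Running the same check on every funnel keeps step-level expectations and journey-level targets consistent.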
Journey 2: New Hire → Onboarding Completed
| Step | Event | Success Criteria | Expected Drop-off | Alert If |
|---|---|---|---|---|
| 1. Invited | New Hire Invited | Invite delivered | — | Delivery failure rate > 5% |
| 2. Checklist Opened | Onboarding Checklist Opened | New hire opens within 48 hrs | 10–20% drop | Drop > 35% |
| 3. First Step Completed | Checklist Step Completed (step_index = 1) | Any first action taken | 15–25% drop | Drop > 40% |
| 4. Halfway | Checklist Step Completed (step_index = 50% of total) | Sustained progress | 15–25% drop | Drop > 35% |
| 5. Completed | Onboarding Completed | All required steps done | 10–20% drop | Completion rate < 70% |
Target completion rate: ≥ 80% of invited new hires
Note: Checklist Step Skipped by step_index will reveal which specific tasks block completion — this is Capsule's most actionable unknown today.
Journey 3: HR Admin → Progress Dashboard → Action Taken
| Step | Event | Success Criteria | Expected Drop-off | Alert If |
|---|---|---|---|---|
| 1. Dashboard Opened | Progress Dashboard Viewed | Admin views dashboard | — | Less than 40% of active accounts/week |
| 2. At-Risk Identified | Dashboard view with at_risk_count > 0 | Admin sees a flagged hire | Varies | — |
| 3. Action Taken | At-Risk Employee Actioned | Admin responds within 48 hrs | 40–60% drop | Action rate < 25% on at-risk accounts |
Task Completion Metrics
| Task | Start Event | End Event | Success Criteria | Time Target | Measure |
|---|---|---|---|---|---|
| Publish first program | Program Builder Opened | Program Published | Program has ≥ 3 steps | < 20 minutes | Completion rate, median time, abandonment step |
| New hire completes onboarding | Onboarding Checklist Opened | Onboarding Completed | All required steps done | < 30 days | Completion rate, skip |