DX Assessment - AI Agent Skill

Use this when engineering leadership wants to understand what's slowing developers down, where the team's biggest friction points are, and what investments would have the highest impact on productivity and retention. This produces a structured DX assessment covering tooling, workflow, knowledge access, cognitive load, and team health signals.

Related skills: DX friction often surfaces as a symptom in /delivery-diagnose. Tooling and infrastructure debt feeds into /tech-debt-assessment. On-call and incident burden connects to /incident-review patterns.

Process

Step 1: Gather inputs

Ask the user to provide:

Team scope -- which team or teams are being assessed? How many engineers?
Known frustrations -- what do engineers complain about most? What comes up in retros?
Recent changes -- any new tooling, process changes, re-orgs, or platform migrations in the last 3-6 months?
Attrition signals -- has anyone left recently citing engineering culture or tooling? Are people disengaged?
Available data -- do you have survey results, CI metrics, PR cycle times, on-call logs, or DORA metrics to reference?

Step 2: Assess across DX dimensions

Evaluate the developer experience across six dimensions. For each, rate as Strong / Adequate / Weak based on the evidence gathered:

2a. Local development environment

Signal	What to check	Red flag
Setup time	How long for a new engineer to go from clone to running app?	> 1 day
Reliability	Does the local env break frequently? Do engineers waste time fixing it?	Weekly "it works on my machine" issues
Parity	Does local env match staging/prod?	Significant drift causing surprises at deploy
Documentation	Is setup documented and current?	Engineers rely on asking teammates

2b. CI/CD and deployment

Signal	What to check	Red flag
Build time	End-to-end CI pipeline duration	> 15 minutes
Flaky tests	Percentage of test runs that fail then pass on retry	> 2%
Deploy frequency	How often the team deploys to production	< 1/week for an active team
Deploy confidence	Do engineers deploy without anxiety?	Deploys require "deploy captain" or are batched weekly
Rollback ease	Can a bad deploy be reverted quickly?	Manual rollback or no rollback at all
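
As a rough illustration of the flaky-test signal, the sketch below computes the share of runs that fail and then pass on retry, and compares it with the 2% red flag from the table. The `CiRun` record and its field names are assumptions made for illustration; a real CI export will be shaped differently.

```python
from dataclasses import dataclass


@dataclass
class CiRun:
    """One CI test run. Field names are hypothetical, not from any CI tool."""
    first_attempt_passed: bool
    passed_after_retry: bool


def flaky_rate(runs: list[CiRun]) -> float:
    """Share of runs that failed on the first attempt and passed on retry."""
    if not runs:
        return 0.0
    flaky = sum(
        1 for r in runs if not r.first_attempt_passed and r.passed_after_retry
    )
    return flaky / len(runs)


FLAKY_RED_FLAG = 0.02  # "> 2%" from the CI/CD table

# Made-up sample: 93 clean passes, 5 flaky runs, 2 genuine failures.
runs = (
    [CiRun(True, False)] * 93
    + [CiRun(False, True)] * 5
    + [CiRun(False, False)] * 2
)

rate = flaky_rate(runs)
is_red_flag = rate > FLAKY_RED_FLAG
print(f"flaky rate: {rate:.1%}, red flag: {is_red_flag}")
```

Genuine failures stay in the denominator here, so the rate is a share of all runs rather than a share of failed runs.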

2c. Code review and collaboration

Signal	What to check	Red flag
Time to first review	Median time from PR open to first review	> 24 hours
PR cycle time	Median time from PR open to merge	> 2 business days
Review bottleneck	Do PRs wait for specific people?	> 30% of PRs blocked on one person
Review quality	Are reviews substantive or rubber-stamps?	Mostly "LGTM" with no comments
Pairing culture	Does the team pair or mob?	Knowledge stuck in silos
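
The first-review and bottleneck signals can be approximated from PR timestamps. The following is a minimal sketch over hypothetical records (the tuple layout and reviewer names are invented), and it measures wall-clock hours rather than business hours, which is a simplification.

```python
from collections import Counter
from datetime import datetime
from statistics import median

# Hypothetical PR records: (opened_at, first_review_at, reviewer the PR waited on)
prs = [
    (datetime(2025, 7, 1, 9), datetime(2025, 7, 1, 15), "alice"),   # 6 h
    (datetime(2025, 7, 1, 10), datetime(2025, 7, 3, 10), "bob"),    # 48 h
    (datetime(2025, 7, 2, 9), datetime(2025, 7, 3, 15), "alice"),   # 30 h
    (datetime(2025, 7, 3, 9), datetime(2025, 7, 3, 11), "carol"),   # 2 h
    (datetime(2025, 7, 4, 9), datetime(2025, 7, 5, 13), "alice"),   # 28 h
]

# Median time from PR open to first review, in hours.
hours_to_first_review = [
    (reviewed - opened).total_seconds() / 3600 for opened, reviewed, _ in prs
]
median_hours = median(hours_to_first_review)

# Share of PRs that waited on the single most-requested reviewer.
waiting_on = Counter(reviewer for _, _, reviewer in prs)
top_reviewer, top_count = waiting_on.most_common(1)[0]
top_share = top_count / len(prs)

slow_first_review = median_hours > 24  # "> 24 hours" red flag
review_bottleneck = top_share > 0.30   # "> 30% of PRs blocked on one person"

print(f"median hours to first review: {median_hours}")
print(f"{top_reviewer} is the reviewer on {top_share:.0%} of PRs")
```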

2d. Knowledge access and onboarding

Signal	What to check	Red flag
Onboarding time	How long until a new engineer ships their first meaningful PR?	> 4 weeks
Documentation	Are architecture decisions, runbooks, and domain context documented?	"Ask Sarah, she knows"
Bus factor	Are there critical systems only one person understands?	Bus factor = 1 for any production system
Search and discovery	Can engineers find answers without asking someone?	Tribal knowledge dominates
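
One simple way to make the bus-factor signal concrete is to list, per system, who can operate it without help and count the names. The sketch below assumes such a map has been collected by hand; the system and engineer names are hypothetical.

```python
# Hypothetical map: production system -> engineers who can operate it alone.
knowledge = {
    "auth-service": {"dana"},
    "ci-pipelines": {"sam", "lee", "ana"},
    "deploy-tooling": {"sam", "lee"},
}


def bus_factor(system_owners: dict[str, set[str]]) -> dict[str, int]:
    """Number of people who understand each system."""
    return {system: len(people) for system, people in system_owners.items()}


factors = bus_factor(knowledge)

# Red flag from the table: bus factor = 1 (or lower) for any production system.
at_risk = sorted(system for system, n in factors.items() if n <= 1)
print(at_risk)
```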

2e. On-call and operational burden

Signal	What to check	Red flag
On-call rotation	Is on-call distributed fairly?	Same 2-3 people always on-call
Alert noise	Signal-to-noise ratio of production alerts	> 50% alerts are non-actionable
Incident frequency	How often does the team get paged?	> 2 incidents/week requiring human response
Toil	Time spent on repetitive operational work	> 20% of engineering time on toil

2f. Cognitive load and flow

Signal	What to check	Red flag
Context switching	How often are engineers interrupted by meetings, Slack, or incidents?	> 3 context switches per focused work block
Meeting load	Percentage of the week in meetings	> 30% for individual contributors
Scope of ownership	How many systems does each engineer own?	Ownership so broad that nothing gets deep attention
Decision autonomy	Can engineers make technical decisions without escalation?	Every decision requires manager or architect approval
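
Where numeric data exists, the red-flag thresholds in the tables above can be checked mechanically before a rating is chosen. The sketch below covers only a few quantitative signals, and the mapping from red-flag count to Strong / Adequate / Weak is an assumption; the skill itself leaves that judgment to the assessor. Metric names are hypothetical.

```python
import operator

# (dimension, metric name, comparison that means "red flag", threshold)
# Thresholds come from the tables above; metric names are made up.
RED_FLAGS = [
    ("CI/CD and deployment", "ci_minutes", operator.gt, 15),
    ("CI/CD and deployment", "flaky_rate", operator.gt, 0.02),
    ("CI/CD and deployment", "deploys_per_week", operator.lt, 1),
    ("On-call and operational burden", "non_actionable_alert_share", operator.gt, 0.50),
    ("On-call and operational burden", "incidents_per_week", operator.gt, 2),
    ("On-call and operational burden", "toil_share", operator.gt, 0.20),
]


def rating_for(flag_count: int) -> str:
    """Assumed mapping, not defined by the skill: 0 flags, 1 flag, 2 or more."""
    if flag_count == 0:
        return "Strong"
    if flag_count == 1:
        return "Adequate"
    return "Weak"


def rate_dimensions(metrics: dict[str, float]) -> dict[str, str]:
    counts: dict[str, int] = {}
    for dimension, signal, is_red, threshold in RED_FLAGS:
        counts.setdefault(dimension, 0)
        # A signal with no data is skipped, which is not the same as passing.
        if signal in metrics and is_red(metrics[signal], threshold):
            counts[dimension] += 1
    return {dimension: rating_for(n) for dimension, n in counts.items()}


metrics = {
    "ci_minutes": 22,
    "flaky_rate": 0.01,
    "deploys_per_week": 5,
    "non_actionable_alert_share": 0.6,
    "incidents_per_week": 4.2,
}
ratings = rate_dimensions(metrics)
print(ratings)
```

Qualitative signals such as review quality or decision autonomy do not reduce to a threshold, so a mechanical pass like this is at most a starting point for the rating.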

Step 3: Identify the top friction points

From the assessment, identify the 3-5 highest-impact friction points. For each:

What it is -- the specific friction, with evidence
Who it affects -- all engineers, a subset, or specific roles?
Impact type -- slows delivery, hurts quality, causes attrition, blocks scaling, or all of the above?
Root cause -- is this a tooling problem, a process problem, a people problem, or a structural problem?
Trend -- getting worse, stable, or already improving?

Step 4: Generate recommendations

For each friction point, recommend an intervention:

Field	Description
Recommendation	Specific, actionable change
Effort	S / M / L (to implement)
Expected impact	What improves and by how much?
Leading indicator	How will you know it's working within 2-4 weeks?
Owner	Who drives this?

Sequence recommendations: quick wins first (high impact, low effort), then structural improvements.
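
The sequencing rule can be sketched as a sort. The example below assumes a hypothetical 1-3 impact score (the skill does not define a scale) and treats "quick win" as small effort with impact of 2 or higher; both are illustrative choices, and the recommendation names are invented.

```python
from dataclasses import dataclass

EFFORT_ORDER = {"S": 0, "M": 1, "L": 2}


@dataclass
class Recommendation:
    action: str
    effort: str  # "S", "M", or "L", as in the table above
    impact: int  # hypothetical 1-3 score; not defined by the skill


def is_quick_win(rec: Recommendation) -> bool:
    # Assumed definition of "high impact, low effort".
    return rec.effort == "S" and rec.impact >= 2


def sequence(recs: list[Recommendation]) -> list[Recommendation]:
    """Quick wins first, then the rest by impact (high first) and effort (low first)."""
    return sorted(
        recs,
        key=lambda r: (not is_quick_win(r), -r.impact, EFFORT_ORDER[r.effort]),
    )


recs = [
    Recommendation("Complete CI migration", "M", 3),
    Recommendation("Quarantine flaky tests", "S", 3),
    Recommendation("Rewrite deploy tooling", "L", 2),
    Recommendation("Alert audit", "S", 2),
    Recommendation("Rename repositories", "S", 1),
]

ordered = sequence(recs)
for position, rec in enumerate(ordered, start=1):
    print(position, rec.action, rec.effort, rec.impact)
```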

Step 5: Produce the assessment

Output in this format:

Developer Experience Assessment: {{team-or-org}}

Date: {{date}} | Team size: {{count}} | Assessed by: {{who}}

DX scorecard

Dimension	Rating	Key signal
Local development	{{Strong/Adequate/Weak}}	{{most telling signal}}
CI/CD and deployment	{{Strong/Adequate/Weak}}	{{most telling signal}}
Code review and collaboration	{{Strong/Adequate/Weak}}	{{most telling signal}}
Knowledge access and onboarding	{{Strong/Adequate/Weak}}	{{most telling signal}}
On-call and operational burden	{{Strong/Adequate/Weak}}	{{most telling signal}}
Cognitive load and flow	{{Strong/Adequate/Weak}}	{{most telling signal}}

Top friction points (ranked by impact)

1. {{friction point}}

Evidence: {{specific data or observation}}
Affects: {{who}}
Impact: {{delivery / quality / attrition / scaling}}
Root cause: {{tooling / process / people / structural}}
Trend: {{worse / stable / improving}}

2. {{friction point}}

(Same structure)

Recommendations

#	Recommendation	Effort	Expected impact	Leading indicator	Owner
1	{{action}}	{{S/M/L}}	{{what improves}}	{{early signal}}	{{who}}

What's working well

{{Things the team should keep doing}}

Suggested reassessment date

{{date, plus the metrics to re-measure and their targets}}

Step 6: Discuss

Ask the user:

Does the scorecard match your intuition? Any ratings that surprise you?
Are there friction points I missed?
Which recommendations resonate most with the team?
Are there organizational constraints that would block any of these?
Should we run this as a team survey to validate with the broader engineering group?

Output location

Present the assessment as formatted text in the conversation, or write it to a file if the user specifies a path.

Example Output

Input

Team scope: Platform Engineering team at Meridian Health Systems, 14 engineers (8 senior, 4 mid-level, 2 junior) building internal developer tooling and shared infrastructure for 6 product teams
Known frustrations: Engineers constantly complain about flaky integration tests, slow CI pipelines, and getting pulled into Slack threads to answer questions other teams should be able to self-serve; retros repeatedly surface "I can't find deep focus time"
Recent changes: Migrated from Jenkins to GitHub Actions 4 months ago (incomplete — 3 legacy pipelines still on Jenkins); onboarded 4 new engineers in Q3; one senior engineer (Priya) left 6 weeks ago citing burnout and "being the only one who understood the auth service"
Attrition signals: Priya's exit interview explicitly mentioned on-call burden and knowledge silo stress; two engineers flagged similar concerns in the last engagement survey (eNPS dropped from +32 to +11 over two quarters)
Available data: GitHub Actions dashboards showing median CI time of 22 minutes, flaky test rate of ~6.5%, PR cycle time averaging 3.4 business days; PagerDuty logs show 4.2 incidents/week with 60% classified as non-actionable noise; no formal onboarding tracking

Output (abbreviated)

Developer Experience Assessment: Meridian Health Systems — Platform Engineering

Date: 2025-07-14 | Team size: 14 engineers | Assessed by: Engineering Leadership + DX Assessment

DX scorecard

Dimension	Rating	Key signal
Local development	Adequate	No data on setup time; parity issues suspected given dual CI system
CI/CD and deployment	Weak	Median CI time 22 min; 6.5% flaky test rate; 3 pipelines still on Jenkins
Code review and collaboration	Weak	PR cycle time 3.4 business days; knowledge concentrated after Priya's departure
Knowledge access and onboarding	Weak	No onboarding tracking; auth service now bus factor = 1 (or zero); "ask in Slack" is the de facto docs
On-call and operational burden	Weak	4.2 incidents/week; 60% non-actionable alerts; eNPS drop signals burnout accumulation
Cognitive load and flow	Weak	Platform team fielding cross-team Slack interruptions continuously; 4 new engineers adding ramp burden

Top friction points (ranked by impact)

1. Flaky and slow CI pipeline is the team's biggest daily tax

Evidence: 22-minute median CI time (red flag threshold: 15 min); 6.5% flaky test rate (threshold: 2%); hybrid Jenkins/GitHub Actions environment adds configuration complexity and cognitive overhead
Affects: All 14 engineers, plus the 6 downstream product teams depending on shared pipelines
Impact: Delivery slowdown, quality erosion (engineers learn to distrust test results), scaling blocker as product teams grow
Root cause: Tooling — incomplete migration left technical debt mid-stream; no dedicated effort to quarantine or fix flaky tests
Trend: Worse — migration is stalled and flaky test count has grown since Q3 onboarding

2. Auth service knowledge void — single point of organizational failure

Evidence: Priya's departure 6 weeks ago left auth service effectively undocumented; exit interview cited isolation and burnout as direct causes; no runbooks confirmed in knowledge audit
Affects: All engineers on rotation, all product teams integrating auth, future incident responders
Impact: Attrition risk (same conditions remain for the next senior engineer), delivery risk (any auth incident is now a crisis), onboarding blocker
Root cause: Structural — no knowledge-sharing practice, no documentation culture, rotation didn't distribute ownership
Trend: Worse — the gap is widening as time passes without intervention

3. Alert noise consuming on-call engineers and eroding trust in monitoring

Evidence: 4.2 incidents/week with 60% non-actionable (PagerDuty logs); eNPS dropped from +32 to +11 over two quarters, and both the exit interview and engagement survey comments name on-call burden as a contributor
Affects: Whoever is on rotation; disproportionately senior engineers who inherited legacy alerting configs
Impact: Attrition risk, cognitive load, quality (engineers who are sleep-deprived or interrupted make worse decisions)
Root cause: Process + tooling — no alert review cadence, no ownership assigned to reducing noise, thresholds never tuned post-Jenkins migration
Trend: Stable at a bad level; no active effort to improve

4. Platform team acting as human API for other teams' questions

Evidence: Engineers report continuous Slack interruptions; retros have flagged this for multiple quarters; no self-service documentation portal or searchable runbook library exists
Affects: All engineers, but senior engineers disproportionately; 4 new engineers also impacted as they can't self-onboard
Impact: Cognitive load, flow destruction, scaling ceiling (team can't grow its impact if it's also customer support)
Root cause: Structural — the team has never invested in productizing its knowledge; documentation is treated as optional
Trend: Worse — each new product team onboarded amplifies the interruption surface

Recommendations

#	Recommendation	Effort	Expected impact	Leading indicator	Owner
1	Quarantine all known flaky tests into a separate suite; block merge only on stable suite	S	Restore CI signal reliability within 2 weeks; developer trust in green builds returns	Flaky test rate in primary suite drops to <1%	Platform TL
2	Complete Jenkins → GitHub Actions migration with a hard cutoff date (30 days); assign one engineer as DRI	M	Eliminate dual-system cognitive overhead; unblock pipeline optimization	All pipelines running in GHA; Jenkins decommissioned	EM + Platform TL
3	Run an auth service documentation sprint — 2 engineers pair on Priya's domain for 2 sprints, producing runbooks + architecture doc	M	Bus factor rises from ~0 to 3+; on-call confidence improves	Runbook published; 2 engineers can independently handle auth incidents	Senior engineer + EM
4	Alert audit: review all PagerDuty rules, silence or raise thresholds on non-actionable alerts, assign ownership per service	S	Reduce incident volume by ~40%; reduce on-call burnout	Non-actionable alert rate drops below 25% within 4 weeks	On-call rotation lead
5	Create a Platform team "office hours" model (2x/week, 30 min) + internal docs site (Notion or Backstage) to deflect async questions	M	Reduce ad-hoc Slack interruptions by 50%+; unblock product teams to self-serve	Measurable drop in #platform-help thread volume; new engineer time-to-first-PR improves	EM + one mid-level engineer as DX champion
6	Instrument onboarding: track time-to-first-meaningful-PR for all new engineers; set 3-week target	S	Creates accountability; identifies where new engineers get stuck	First data point visible after next hire	EM
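
As a sanity check on recommendation 4, the arithmetic below relates the 25% noise target to the expected drop in incident volume. It assumes the number of actionable incidents stays constant and only the noise is removed; that assumption is not stated in the assessment.

```python
incidents_per_week = 4.2   # from the PagerDuty logs
noise_share = 0.60         # non-actionable share today
target_noise_share = 0.25  # leading-indicator target

# Actionable incidents are assumed to stay the same after the audit.
actionable = incidents_per_week * (1 - noise_share)

# If noise is 25% of the new total, actionable incidents are the other 75%.
new_total = actionable / (1 - target_noise_share)
reduction = 1 - new_total / incidents_per_week

print(f"actionable per week: {actionable:.2f}")
print(f"total per week at target: {new_total:.2f}")
print(f"volume reduction: {reduction:.0%}")
```

Under that assumption the volume falls by a little under half, so the "~40%" estimate is on the conservative side. The resulting total is still slightly above the 2 incidents/week red flag, which suggests the audit alone may not be enough to clear that threshold.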

What's working well

GitHub Actions adoption shows the team is willing to invest in tooling improvement — the migration intent was right, execution just stalled
Apart from Priya's departure, the team has held together despite high friction, suggesting psychological safety and team cohesion are intact, a real asset to protect
Engagement survey data and exit interview candor indicate a feedback culture exists; signals are visible and honest, which makes this assessment actionable

Suggested reassessment date

October 14, 2025 (3 months)

Metrics to re-measure at that point:

Median CI time (target: ≤12 minutes)
Flaky test rate in primary suite (target: <1%)
Non-actionable alert rate (target: <25%)
PR cycle time (target: <2 business days)
eNPS (target: return to +25 or above)
Auth service bus factor (target: ≥3 engineers)
Time-to-first-meaningful-PR for new engineers (target: ≤3 weeks)
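
At reassessment, the targets above can be compared against fresh measurements in one pass. The sketch below uses invented re-measured values and hypothetical metric names; a metric with no new measurement is reported as unmet rather than silently passed.

```python
import operator

# Targets from the list above: (comparison that means "target met", value).
TARGETS = {
    "median_ci_minutes": (operator.le, 12),
    "flaky_rate_primary_suite": (operator.lt, 0.01),
    "non_actionable_alert_share": (operator.lt, 0.25),
    "pr_cycle_business_days": (operator.lt, 2),
    "enps": (operator.ge, 25),
    "auth_bus_factor": (operator.ge, 3),
    "weeks_to_first_meaningful_pr": (operator.le, 3),
}


def unmet_targets(measured: dict[str, float]) -> list[str]:
    """Names of targets that were missed or not measured at all."""
    return sorted(
        name
        for name, (is_met, target) in TARGETS.items()
        if name not in measured or not is_met(measured[name], target)
    )


# Invented re-measurement, for illustration only.
remeasured = {
    "median_ci_minutes": 11,
    "flaky_rate_primary_suite": 0.008,
    "non_actionable_alert_share": 0.30,
    "pr_cycle_business_days": 1.8,
    "enps": 19,
    "auth_bus_factor": 3,
}

still_open = unmet_targets(remeasured)
print(still_open)
```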
