ADR Generate - AI Agent Skill

Use this when the team has made (or needs to make) an architecture decision and wants to document it in a durable, findable format. ADRs capture the context, alternatives considered, rationale, and consequences so future engineers understand why the system is built the way it is -- not just how.

Related skills: Use after /architecture-discovery to document decisions that emerge from SNAP analysis. Complements /architecture-context-reviewer, which retrieves existing ADRs -- this skill creates new ones. Reference /system-diagram to provide visual context for the decision.

Process

Step 1: Gather decision context

Ask the user to provide:

What decision needs to be recorded? -- a technology choice, an architecture pattern, a boundary definition, a data model change, an integration approach.
What triggered this decision? -- new feature requirement, scaling issue, incident, technology end-of-life, team growth, compliance requirement, performance bottleneck.
Who are the deciders? -- who made or will make this decision? (Names and roles.)
What's the deadline? -- is this already decided, or does the team need to decide by a certain date?
Any prior discussion? -- links to Slack threads, meeting notes, RFCs, design docs, or PR comments where this was debated.

Step 2: Capture current state and constraints

Document what exists today and what constrains the decision:

Current architecture -- what does the system look like now in the area this decision affects?
Technical constraints -- language, framework, infrastructure, or platform limitations.
Business constraints -- timeline, budget, compliance, team capacity.
Non-negotiables -- requirements that any option must satisfy (e.g., "must support 10x current traffic," "must be HIPAA-compliant," "must not require downtime").

Step 3: Enumerate alternatives

Document at least 2 alternatives (3 is ideal). For each:

Dimension	What to capture
Description	What is this option, concretely?
Pros	What does it do well?
Cons	What are the downsides?
Effort	How much work to implement? (T-shirt size: S/M/L/XL)
Risk	What could go wrong?
Reversibility	How hard is it to undo this choice later? (Easy / Hard / Irreversible)

If an alternative was already rejected before this ADR, still document it with the reason -- this prevents future engineers from re-proposing the same idea.
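The dimensions in the table above map onto a small record type. The following is a minimal sketch, assuming Python dataclasses; the names `Alternative`, `Effort`, `Reversibility`, and `validate_alternatives` are hypothetical and not part of the skill itself.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Effort(Enum):
    S = "S"
    M = "M"
    L = "L"
    XL = "XL"


class Reversibility(Enum):
    EASY = "Easy"
    HARD = "Hard"
    IRREVERSIBLE = "Irreversible"


@dataclass
class Alternative:
    name: str
    description: str
    pros: List[str]
    cons: List[str]
    effort: Effort
    risk: str
    reversibility: Reversibility
    # Set when the option was rejected before this ADR was written,
    # so the reason is kept alongside the option.
    rejected_reason: Optional[str] = None


def validate_alternatives(alternatives: List[Alternative]) -> List[str]:
    """Return a list of problems; an empty list means the set is usable."""
    problems = []
    if len(alternatives) < 2:
        problems.append("document at least 2 alternatives (3 is ideal)")
    for alt in alternatives:
        if not alt.pros or not alt.cons:
            problems.append(f"{alt.name}: list at least one pro and one con")
    return problems
```

Using enums for effort and reversibility keeps the values restricted to the scales named in the table, which makes ADRs easier to compare later.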

Step 4: Document the decision

State the chosen option with rationale that ties directly to the trade-off analysis:

Which decision drivers did the chosen option satisfy best?
What trade-offs were accepted?
What was the deciding factor between the top contenders?

The rationale should be specific enough that someone reading this ADR in 2 years can understand why this option was chosen over the alternatives.

Step 5: Define consequences and follow-up

Categorize the consequences of this decision:

Positive -- what improves or becomes possible?
Negative -- what trade-offs are accepted? What becomes harder?
Neutral -- what changes without being clearly better or worse?

Then define:

Follow-up actions -- concrete tasks that need to happen as a result of this decision (with owners and deadlines).
Revisit triggers -- conditions that should prompt re-evaluation of this decision (e.g., "if traffic exceeds 10K RPS," "if the team grows beyond 8 engineers," "when the vendor contract renews in Q3 2027").
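Revisit triggers are easier to act on when they are phrased as conditions that can be checked against real metrics. The sketch below is one possible way to express that, assuming the metrics arrive as a plain dictionary; `RevisitTrigger`, `due_for_review`, and the metric keys `rps` and `engineers` are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RevisitTrigger:
    description: str
    # Hypothetical predicate: receives current metrics and returns True
    # when the decision should be re-evaluated.
    fired: Callable[[Dict[str, float]], bool]


def due_for_review(triggers: List[RevisitTrigger], metrics: Dict[str, float]) -> List[str]:
    """Return the descriptions of all triggers whose condition currently holds."""
    return [t.description for t in triggers if t.fired(metrics)]


triggers = [
    RevisitTrigger("traffic exceeds 10K RPS", lambda m: m.get("rps", 0) > 10_000),
    RevisitTrigger("team grows beyond 8 engineers", lambda m: m.get("engineers", 0) > 8),
]

print(due_for_review(triggers, {"rps": 12_500, "engineers": 6}))
# prints ['traffic exceeds 10K RPS']
```

Date-based triggers such as a contract renewal do not fit a metrics check and are probably better tracked as calendar reminders.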

Step 6: Generate the ADR

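Output using a template based on Michael Nygard's ADR format (context, decision, consequences), a widely used convention, extended here with decision drivers, considered alternatives, follow-up actions, and revisit triggers: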

ADR-{{NNN}}: {{decision-title}}

Date: {{date}}
Deciders: {{names-and-roles}}
Context source: {{links-to-prior-discussion}}

Context

{{What is the issue that motivates this decision? What forces are at play -- technical, business, team, timeline? Be specific about the situation, not generic.}}

Decision drivers

{{Driver 1 -- e.g., "Must handle 10x current traffic without re-architecting"}}
{{Driver 2 -- e.g., "Team has deep experience with PostgreSQL but not Cassandra"}}
{{Driver 3 -- e.g., "Compliance requires data residency in EU region"}}

Considered alternatives

Option A: {{name}}

Pros: {{specific advantages}}
Cons: {{specific disadvantages}}
Effort: {{S/M/L/XL}}
Risk: {{what could go wrong}}
Reversibility: {{Easy / Hard / Irreversible}}

Option B: {{name}}

Pros: {{specific advantages}}
Cons: {{specific disadvantages}}
Effort: {{S/M/L/XL}}
Risk: {{what could go wrong}}
Reversibility: {{Easy / Hard / Irreversible}}

Option C: {{name}} (if applicable)

Pros: / Cons: / Effort: / Risk: / Reversibility:

Decision

We will use {{chosen option}} because {{rationale tied directly to decision drivers and trade-off analysis}}.

Consequences

Positive:

{{what improves}}

Negative:

{{what trade-offs are accepted}}

Neutral:

{{what changes without clear valence}}

Follow-up actions

Action	Owner	Deadline
{{specific task}}	{{person-or-role}}	{{date}}

Revisit triggers

{{Condition that should trigger re-evaluation of this decision}}
{{e.g., "If latency exceeds 200ms p99 under the new architecture"}}
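If the gathered answers are held in a structured form, the template can be filled mechanically. The following is a rough sketch under that assumption; `render_adr` and the dictionary keys are hypothetical, and for brevity it renders only the header, context, drivers, decision, and consequences sections.

```python
def render_adr(adr: dict) -> str:
    """Render a plain-text ADR in the section order of the template."""
    lines = [
        f"ADR-{adr['number']:03d}: {adr['title']}",
        "",
        f"Date: {adr['date']}",
        f"Deciders: {', '.join(adr['deciders'])}",
        "",
        "Context",
        "",
        adr["context"],
        "",
        "Decision drivers",
        "",
        *adr["drivers"],
        "",
        "Decision",
        "",
        f"We will use {adr['chosen']} because {adr['rationale']}.",
        "",
        "Consequences",
    ]
    # Each consequence category gets its own labeled list, even if empty.
    for label in ("Positive", "Negative", "Neutral"):
        lines += ["", f"{label}:", "", *adr["consequences"].get(label.lower(), [])]
    return "\n".join(lines)


draft = {
    "number": 7,
    "title": "Use PostgreSQL",
    "date": "2025-01-15",
    "deciders": ["A. Example (Staff Engineer)"],
    "context": "Placeholder context for illustration.",
    "drivers": ["Team has deep experience with PostgreSQL"],
    "chosen": "PostgreSQL",
    "rationale": "the team knows it",
    "consequences": {"positive": ["Shorter ramp time"]},
}
print(render_adr(draft))
```

Keeping the decision sentence in the fixed "We will use ... because ..." form forces the rationale to be stated explicitly rather than implied.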

Step 7: Discuss

Ask the user:

Does the context section capture the full picture?
Are there alternatives I should add or remove?
Is the rationale clear to someone who wasn't in the room?
What ADR number should this be? (Check existing ADR numbering convention.)
Where should this ADR be stored? (Repo docs folder, wiki, Notion, etc.)

Output location

By default, present the ADR as formatted text in the conversation. If the user specifies a path and naming convention, write it to a file there.
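When writing to a file, the next ADR number can usually be derived from the files that already exist. This is a small sketch that assumes a naming convention of `ADR-NNN-title-slug.md`; that convention, and the helper names `next_adr_number` and `adr_filename`, are assumptions for illustration, so adapt them to whatever the team already uses.

```python
import re

# Assumed convention: file names start with "ADR-" followed by digits.
ADR_FILE = re.compile(r"^ADR-(\d+)", re.IGNORECASE)


def next_adr_number(existing_names) -> int:
    """Return one more than the highest ADR number found, or 1 if none exist."""
    numbers = []
    for name in existing_names:
        match = ADR_FILE.match(name)
        if match:
            numbers.append(int(match.group(1)))
    return max(numbers, default=0) + 1


def adr_filename(number: int, title: str) -> str:
    """Build a file name such as ADR-019-some-title.md from a number and title."""
    slug = re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")
    return f"ADR-{number:03d}-{slug}.md"


existing = ["ADR-001-use-kafka.md", "ADR-018-cache-layer.md", "README.md"]
number = next_adr_number(existing)
print(adr_filename(number, "Real-Time Order Tracking via WebSockets"))
# prints ADR-019-real-time-order-tracking-via-websockets.md
```

In practice the list of names would come from the directory the user specified, for example via `pathlib.Path(...).iterdir()`.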

Example Output

Input

Decision to record: Switch from REST polling to WebSockets for real-time order status updates in the customer-facing order tracking page
Company / team: Dispatch Tiger, a last-mile delivery SaaS — Platform Engineering team (6 engineers)
Trigger: Customer complaints about stale tracking data; current 10-second polling is generating 40% of all API traffic and causing p95 latency spikes during peak lunch/dinner windows
Deciders: Priya Nair (Staff Engineer), Marcus Delgado (Engineering Manager), Sofia Reyes (Product Lead)
Prior discussion: Slack thread #eng-platform-realtime (Nov 4), RFC doc in Notion titled "Order Tracking Latency RFC v2" (Nov 7)

Output (abbreviated)

ADR-019: Real-Time Order Tracking via WebSockets

Date: 2024-11-12
Deciders: Priya Nair (Staff Engineer), Marcus Delgado (Engineering Manager), Sofia Reyes (Product Lead)
Context source: Slack #eng-platform-realtime (Nov 4, 2024); Order Tracking Latency RFC v2 (Nov 7, 2024)

Context

The order tracking page at Dispatch Tiger currently fetches order status via REST polling on a 10-second interval. As the customer base has grown to ~85K concurrent sessions during peak delivery windows (12–1 pm and 6–8 pm), this polling pattern now accounts for 40% of total inbound API traffic. Load testing shows p95 latency degrading to 1,800 ms during these windows — well above our 400 ms SLA. Customers are filing support tickets about seeing stale statuses ("Out for delivery" when the package has already arrived).

The backend already emits order state change events on an internal Kafka topic (order.state.changed). The gap is in how those events reach the browser.

Decision drivers

Must reduce polling-generated API traffic by at least 50% without degrading perceived update freshness
Must push status changes to the browser within 2 seconds of the Kafka event
Must work within existing Node.js / Express backend and React frontend — no full-stack rewrite
Must gracefully degrade for customers on flaky mobile connections (no silent data loss)
Must not require a browser extension or native app change (web-only scope)

Considered alternatives

Option A: WebSockets (via Socket.IO)

Pros: Persistent bidirectional connection eliminates polling; sub-500ms push latency achievable; Socket.IO handles reconnection and fallback to long-polling automatically; strong team familiarity
Cons: Stateful connections increase infrastructure complexity; horizontal scaling requires sticky sessions (for the long-polling fallback) and a shared adapter (Redis pub/sub) to broadcast across nodes; connection count limits need capacity planning
Effort: M
Risk: Redis adapter becomes a single point of failure if misconfigured; connection storms on deploy restarts
Reversibility: Hard — client code must be refactored back to REST polling if reversed

Option B: Server-Sent Events (SSE)

Pros: Unidirectional (server→client), which matches the use case; simpler than WebSockets; each stream is a single long-lived HTTP response, so no sticky session requirement; HTTP/2 multiplexing avoids the browser's per-domain connection limit; works through most corporate proxies
Cons: The browser's EventSource reconnects automatically but has no built-in backoff (must implement manually); some older mobile browsers have poor SSE support; limited to text/UTF-8 payloads
Effort: S
Risk: Proxy and load balancer timeouts silently dropping streams in customer enterprise environments; lower engineering familiarity
Reversibility: Hard

Option C: Continue REST polling with adaptive interval

Pros: Zero infrastructure change; well-understood failure modes
Cons: Does not solve the root traffic problem — adaptive intervals only reduce load ~15% in simulations; latency improvement minimal; rejected as insufficient
Effort: S
Risk: Traffic problem resurfaces within one product cycle as user growth continues
Reversibility: Easy (pre-rejected)

Decision

We will use Option A: WebSockets via Socket.IO because it satisfies the sub-2-second push latency driver and the graceful-degradation requirement simultaneously. Socket.IO's automatic fallback to long-polling ensures customers on flaky mobile connections do not silently miss updates — a gap that SSE (Option B) would require significant custom code to close. The team's existing Socket.IO experience (used in the driver dispatch console) reduces implementation risk and shortens ramp time. The Redis adapter complexity is accepted as a known, manageable trade-off given that Redis is already in the Dispatch Tiger stack.

SSE was the close second; we will revisit it if Socket.IO's stateful connection model proves operationally burdensome at the next traffic tier.

Consequences

Positive:

Polling traffic eliminated for active order tracking sessions; projected 35–40% reduction in total API request volume
Customer-visible update latency drops from up to ~10 seconds (polling worst case; ~5 seconds on average) to < 1 second (p95 target)
Backend Kafka consumer already exists; WebSocket layer is a thin fan-out bridge

Negative:

Infrastructure now requires Redis pub/sub adapter for Socket.IO — adds an operational dependency
Load balancer must be configured for sticky sessions, which the long-polling fallback needs across multiple nodes; the adapter must also be validated under multi-node failover
Connection count capacity must be modeled and monitored; currently untracked metric

Neutral:

REST /orders/:id/status endpoint remains available for non-browser clients (mobile apps, third-party integrations) — no change to those consumers
Frontend polling logic removed; net reduction in client-side code complexity

Follow-up actions

Action	Owner	Deadline
Provision Redis pub/sub adapter in staging and validate multi-node failover	Priya Nair	Nov 22, 2024
Update load balancer config for WebSocket upgrades + sticky session validation	Marcus Delgado (infra coordination)	Nov 22, 2024
Implement Socket.IO server bridge from `order.state.changed` Kafka topic	Backend squad (lead: Dario Kim)	Nov 29, 2024
Update React order tracking component to Socket.IO client; remove polling	Frontend squad (lead: Anika Wren)	Nov 29, 2024
Define and instrument connection-count dashboard in Datadog	Priya Nair	Dec 6, 2024
Load test WebSocket layer at 100K simulated concurrent connections	QA (lead: Tomás Vera)	Dec 10, 2024

Revisit triggers

If concurrent WebSocket connections exceed 150K (current Redis adapter capacity ceiling) — evaluate Socket.IO cluster sharding or migration to SSE under HTTP/2
If Redis adapter availability falls below 99.9% over any 30-day window — evaluate decoupling the fan-out layer
If team grows beyond 10 engineers and a dedicated infrastructure team forms — re-evaluate operational ownership of the Redis dependency
When AWS ElastiCache contract renews (Q2 2026) — reassess managed vs. self-hosted Redis trade-off
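The figures in the example Context can be sanity-checked with a few lines of arithmetic. This is a back-of-the-envelope sketch only: it assumes every peak session polls continuously at the 10-second interval and that the 40% share applies at peak, neither of which the example states outright. The function name `polling_load` is hypothetical.

```python
def polling_load(concurrent_sessions: int, poll_interval_s: float, polling_share: float):
    """Estimate polling request rate, implied total rate, and average staleness."""
    # Each session issues one request per interval.
    polling_rps = concurrent_sessions / poll_interval_s
    # If polling is a given share of all traffic, the total follows from it.
    implied_total_rps = polling_rps / polling_share
    # A status change lands at a uniformly random point within an interval,
    # so the average wait until the next poll is half the interval.
    avg_staleness_s = poll_interval_s / 2
    return polling_rps, implied_total_rps, avg_staleness_s


polling_rps, total_rps, staleness = polling_load(85_000, 10, 0.40)
print(round(polling_rps), round(total_rps), staleness)
# prints 8500 21250 5.0
```

Under these assumptions, peak polling is roughly 8,500 requests per second, and removing it entirely would cut total request volume by up to the full 40% share, which is consistent with the projected 35–40% reduction.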
