Spec-Driven Feature

Use this when you are building a feature you intend to hand to a coding agent and the work is durable: it integrates with existing systems, ships to production, touches a regulated or audited path, or more than one developer will live with it. The spec becomes the contract the agent codes against and the thing you validate the result back to. If you are exploring, prototyping, or building something you expect to throw away, do not spec it first. Vibe-code to learn what you actually need, then come back here once the requirements stop moving.

Related skills: Frame the why with /prd-draft and slice the work with /k8-user-story. Load the agent's working context via /context-engineering-setup. For an AI-feature spec specifically, use /ai-product-spec.

The hard part most teams miss

The defining shift of 2025-2026 AI-assisted engineering is from prompting a wish to handing the agent a contract (per Thoughtworks, spec-driven development and context engineering are the two practices that define this era). Most "the agent built the wrong thing" complaints are spec failures, not model failures.

A spec is a contract, not a wish. The only thing separating spec-driven development from vibe-coding is whether the spec is version-controlled and authoritative. A paragraph you pasted into a chat is a wish: it is gone next session, no one reviewed it, and nothing holds the code to it. A spec in the repo, reviewed like code, that the implementation must satisfy, is a contract. Same words, completely different leverage.
Acceptance criteria must be checkable or the agent optimizes the wrong thing. "The export should be fast" tells the agent nothing it can verify, so it guesses, and it guesses in whatever direction is easiest to write. "Exporting 10,000 rows returns in under 2 seconds" is a behavior you can run. If a criterion cannot be turned into a test or a manual check with a yes/no answer, it is not a criterion; it is a hope. Write behaviors, not adjectives.
When the implementation and the spec disagree, fix the spec first. The instinct is to patch the code and move on. But the spec is the authority, so a spec that has silently drifted out of date is worse than no spec at all: it lies to the next reader and the next agent run. Every divergence is a fork in the road: either the code is wrong (fix the code) or the spec was wrong (fix the spec, then the code). Never leave them disagreeing.
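
To make the difference between an adjective and a behavior concrete, the export criterion quoted above can be turned into something you can run. This is a minimal sketch, not the real feature: `export_rows` is a hypothetical stand-in that serializes rows to CSV, and the row shape is invented for illustration. Only the 10,000-row and 2-second figures come from the text.

```python
import csv
import io
import time


def export_rows(rows):
    """Hypothetical stand-in for the real export: serialize rows to CSV text."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def check_export_budget(row_count=10_000, budget_seconds=2.0):
    """Check 'exporting row_count rows returns in under budget_seconds'.

    Returns (passed, elapsed_seconds) so the answer is a plain yes/no
    plus the measurement that produced it.
    """
    rows = [(i, f"name-{i}", i * 1.5) for i in range(row_count)]
    start = time.perf_counter()
    output = export_rows(rows)
    elapsed = time.perf_counter() - start
    # The budget only counts if the export actually produced every row.
    produced_all_rows = len(output.splitlines()) == row_count
    return produced_all_rows and elapsed < budget_seconds, elapsed
```

"Fast" has no equivalent check. The criterion that names a row count and a time budget does, which is what makes it a criterion.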

Process

Step 1: Gather inputs

Ask the user:

What is the feature, in one or two sentences? {{feature_intent}} (The outcome, not the implementation.)
Is this durable or disposable? Production, integrated, regulated, or multi-developer means spec it. Discovery, demo, or throwaway means do not. If disposable, stop here and tell them to vibe-code.
What already exists that this must fit? {{existing_systems}} (Interfaces, data models, conventions the agent must respect.)
What does "correct" look like, concretely? {{success_signals}} (The behaviors that, if true, mean it works.)
What is explicitly out of scope? {{non_goals}} (The things people will assume are included and are not.)
What constraints are non-negotiable? {{constraints}} (Performance, security, compliance, compatibility, dependencies.)
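
One way to keep the process from starting on a half-answered questionnaire is to check the collected answers before moving on. The sketch below assumes the answers arrive as a plain dictionary keyed by the placeholder names used in the questions above. The function name `missing_inputs` is hypothetical, and the durable-or-disposable question is assumed to be handled separately because it has no placeholder.

```python
from typing import Dict, List

# Keys mirror the placeholders in the questions above.
REQUIRED_INPUTS = [
    "feature_intent",
    "existing_systems",
    "success_signals",
    "non_goals",
    "constraints",
]


def missing_inputs(answers: Dict[str, str]) -> List[str]:
    """Return the required inputs that are absent or blank, in question order."""
    return [key for key in REQUIRED_INPUTS if not answers.get(key, "").strip()]
```

An empty result means every question has an answer. Anything else is the list of questions to go back to the user with.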

Step 2: Decide spec-first or vibe-first

The living-spec workflow is not "always spec." It is "vibe-code to discover, then formalize before production":

Requirements still moving? Vibe-code a throwaway pass to find out what you actually need. You cannot spec what you have not learned yet.
Requirements settled? Formalize what you learned into the authoritative spec below, then hand that to the agent for the real build.
Requirements never in doubt? Spec directly.

Say plainly which mode you are in. The mistake is shipping the discovery code as if it were the real thing, with no spec ever written.
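
The decision above is small enough to write down as a function, which also forces you to say which mode you are in. This is an illustrative sketch only: the enum, the function name, and the three boolean inputs are assumptions layered on the rules in Steps 1 and 2, not part of any tool.

```python
from enum import Enum


class Mode(Enum):
    VIBE_ONLY = "vibe-code, no spec"
    DISCOVER_FIRST = "vibe-code a throwaway pass, then formalize"
    FORMALIZED_AFTER_VIBE = "formalized-after-vibe"
    SPEC_FIRST = "spec-first"


def choose_mode(durable: bool, requirements_moving: bool, did_discovery_pass: bool) -> Mode:
    """Map the Step 1 and Step 2 answers to a working mode."""
    if not durable:
        # Discovery, demo, or throwaway work: do not spec it.
        return Mode.VIBE_ONLY
    if requirements_moving:
        # You cannot spec what you have not learned yet.
        return Mode.DISCOVER_FIRST
    if did_discovery_pass:
        # Requirements settled after a throwaway pass: formalize before the real build.
        return Mode.FORMALIZED_AFTER_VIBE
    # Requirements were never in doubt.
    return Mode.SPEC_FIRST
```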

Step 3: Pin intent and non-goals

Write the intent as the single sentence the whole feature serves. Then write the non-goals, the adjacent things this feature is not doing. Non-goals do more work than goals: they are where the agent (and the human reviewer) would otherwise over-build. Be specific. "Not handling bulk import" beats "keeping it simple."

Step 4: Define constraints and contracts

Constraints: the hard limits the implementation may not violate, such as performance budgets, security and compliance rules, browser or platform support, and allowed and forbidden dependencies.
Interfaces and contracts: the exact shapes the feature exposes and consumes, such as function signatures, API request and response schemas, event payloads, data models, and error shapes. This is what lets the agent integrate instead of inventing. Name the types. An agent given a contract fills it in; an agent given prose guesses at it.
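
As an illustration of what "name the types" can look like, here is a sketch of a contract for an export feature written as Python types. Every name in it (`ExportRequest`, `ExportResult`, `ExportError`, `Exporter`, `MAX_ROWS`, the error codes) is hypothetical and chosen for the example. A real spec would use the shapes of the systems it has to fit.

```python
from dataclasses import dataclass
from typing import Literal, Protocol, Union

# Constraint stated as a value the implementation may not exceed (assumed figure).
MAX_ROWS = 10_000


@dataclass(frozen=True)
class ExportRequest:
    account_id: str
    format: Literal["csv", "json"]
    max_rows: int


@dataclass(frozen=True)
class ExportResult:
    row_count: int
    download_url: str


@dataclass(frozen=True)
class ExportError:
    # Error shape: a closed set of codes plus a human-readable message.
    code: Literal["TOO_MANY_ROWS", "UNKNOWN_ACCOUNT"]
    message: str


class Exporter(Protocol):
    """The function signature the agent must implement against."""

    def export(self, request: ExportRequest) -> Union[ExportResult, ExportError]:
        ...
```

With the request, result, and error shapes pinned, the agent has nothing left to invent at the boundary. It only fills in the body of `export`.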

Step 5: Write acceptance criteria as checkable behaviors

Convert every success signal into a behavior with a yes/no answer. Use the form "When (situation), the system (observable result)." Each one must be something you could hand to a test or a reviewer and get back pass or fail. If you cannot phrase it that way, the requirement is still too vague: go back to the user.
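
A criterion in the "When (situation), the system (observable result)" form maps almost one-to-one onto a test function. The sketch below assumes a hypothetical `export_csv` function and a two-column header, and shows two criteria written that way. The test names carry the situation and the result so a failure reads like the criterion it breaks.

```python
import csv
import io

HEADER = ["id", "name"]  # assumed column names, for illustration only


def export_csv(rows):
    """Hypothetical feature under test: header row followed by data rows."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(HEADER)
    writer.writerows(rows)
    return buffer.getvalue()


def test_when_account_has_no_rows_then_file_contains_only_header():
    # When the account has no rows, the system returns a file with only the header row.
    assert export_csv([]).splitlines() == ["id,name"]


def test_when_name_contains_comma_then_field_is_quoted():
    # When a name contains a comma, the system quotes the field so the row keeps two columns.
    lines = export_csv([(1, "Doe, Jane")]).splitlines()
    assert lines[1] == '1,"Doe, Jane"'
```

Both functions follow the naming convention that test runners such as pytest collect, and each one returns an unambiguous pass or fail.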

Step 6: Write test scenarios

Name the scenarios that prove the criteria, including the unhappy ones: the happy path, empty and boundary cases, failure and error handling, and the integration points where this feature meets the existing systems from Step 1. These are the agent's targets and your validation checklist in one.
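
Scenarios can be kept as data so the same table serves as the agent's target list and your checklist. The following is a sketch under stated assumptions: `parse_row_limit` and the 10,000-row ceiling are hypothetical, and integration scenarios are left out because they need the real systems from Step 1.

```python
from dataclasses import dataclass
from typing import List, Optional, Type

MAX_ROWS = 10_000  # assumed limit, for illustration only


def parse_row_limit(raw: str) -> int:
    """Hypothetical function under test: parse a requested row limit."""
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 0 or value > MAX_ROWS:
        raise ValueError(f"row limit must be between 0 and {MAX_ROWS}")
    return value


@dataclass(frozen=True)
class Scenario:
    kind: str  # "happy", "boundary", or "failure"
    raw: str
    expected: Optional[int] = None
    raises: Optional[Type[Exception]] = None


SCENARIOS: List[Scenario] = [
    Scenario("happy", "250", expected=250),
    Scenario("boundary", "0", expected=0),
    Scenario("boundary", "10000", expected=10_000),
    Scenario("failure", "10001", raises=ValueError),
    Scenario("failure", "many", raises=ValueError),
]


def run_scenario(scenario: Scenario) -> bool:
    """Return True when the observed behavior matches the scenario."""
    try:
        result = parse_row_limit(scenario.raw)
    except Exception as exc:
        # An exception passes only if the scenario expected that exception type.
        return scenario.raises is not None and isinstance(exc, scenario.raises)
    return scenario.raises is None and result == scenario.expected
```

Adding an unhappy case is then a one-line change to `SCENARIOS`, which keeps the boundary and failure rows as visible as the happy path.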

Step 7: Output the executable spec

# Spec: (feature name)

**Mode:** spec-first | formalized-after-vibe
**Owner:** (who arbitrates spec-vs-code disagreements)

## Intent
(One sentence: the outcome this feature serves.)

## Non-goals
- (Adjacent thing this feature deliberately does NOT do)
- (Another, especially ones people will assume are included)

## Constraints
- (Hard limit: performance budget, security rule, compliance, platform, dependency)

## Interfaces / contracts
(Exact shapes the feature exposes and consumes. Name the types.)
- (Function signature / API request+response schema / event payload / data model / error shape)

## Acceptance criteria (checkable behaviors)
- [ ] When (situation), the system (observable, yes/no result).
- [ ] When (situation), the system (observable, yes/no result).

## Test scenarios
- Happy path: (input -> expected outcome)
- Boundary / empty: (edge input -> expected outcome)
- Failure: (error condition -> expected handling)
- Integration: (meets existing system X -> expected behavior)

## Open questions
- (Unresolved decision blocking implementation)

Step 8: Hand off and run the validation loop

Hand the spec to the coding agent as the authority, not as a hint. Then run the loop; do not stop at "it ran":

Implement against the spec.
Verify the result against each acceptance criterion and test scenario, by running them, not by reading the diff.
On divergence, decide: is the code wrong (fix the code) or was the spec wrong (fix the spec first, then the code)? Never leave them disagreeing.
Refine the spec, not just the code, so the next reader and the next agent run inherit the truth. Re-run until every criterion passes.
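
The loop above can be sketched as a small driver. This is an assumption-heavy illustration rather than a prescribed harness. `Criterion`, `failing`, and `validation_loop` are hypothetical names, and `resolve` stands in for the human-plus-agent step of deciding whether the code or the spec was wrong and fixing it accordingly.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Criterion:
    text: str                  # the "When ..., the system ..." sentence
    check: Callable[[], bool]  # runs the behavior and returns pass or fail


def failing(criteria: List[Criterion]) -> List[str]:
    """Verify by running every check, not by reading the diff."""
    return [criterion.text for criterion in criteria if not criterion.check()]


def validation_loop(
    criteria: List[Criterion],
    resolve: Callable[[List[str]], None],
    max_rounds: int = 5,
) -> int:
    """Re-run until every criterion passes; return the number of fix rounds used."""
    for rounds_used in range(max_rounds + 1):
        still_failing = failing(criteria)
        if not still_failing:
            return rounds_used
        if rounds_used < max_rounds:
            # Divergence: fix the code, or fix the spec first and then the code.
            resolve(still_failing)
    raise RuntimeError(
        f"criteria still failing after {max_rounds} rounds: {still_failing}"
    )
```

The cap on rounds is a design choice in this sketch. It turns a loop that never converges into an explicit failure that someone has to look at, instead of an agent retrying indefinitely.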

Step 9: Review