Study guide: Part 2. Why Specification-Driven Development

Lesson 2 of 5 in module «Part 2. Why Specification-Driven Development»


Topic: Part 2. Why Specification-Driven Development

Difficulty level: Medium

Estimated study time: 4-6 hours (theory + practice)

Prerequisites: Basic understanding of working with AI assistants (Qwen, ChatGPT, Claude, etc.)

Experience working with version control systems (Git)

Basic knowledge of web development (HTML, CSS, JavaScript/TypeScript)

Understanding of basic software architecture concepts

Experience working with command line and npm scripts

Learning objectives: Explain five typical vibe coding failures and the mechanism for eliminating them through SDD (Specification-Driven Development)

Formulate answers to seven checkpoint questions for verifying whether a feature specification is ready for implementation

Compose a complete mini-specification for a feature in the requirements/plan/validation format for a Qwen Code project

Draw a distinction between SDD at the feature level and the classic waterfall design model

Apply the practical rule for determining whether a specification is needed for a particular code change

Overview: This module covers the transition from chaotic vibe coding to disciplined specification-driven development (SDD) when working with AI agents, particularly Qwen Code. Vibe coding is an approach in which the developer gives the AI a sequence of impulsive commands, losing architectural context and accumulating technical debt. SDD addresses this by creating versioned, verifiable, agent-readable documents that capture intentions, boundaries, decisions made, and acceptance criteria. The module covers the typical pathologies of vibe coding, the structure of a sustainable cycle of work with Qwen Code, the seven quality questions for specifications, criteria for when SDD is redundant, and the practice of writing mini-specifications.

Key concepts: Vibe coding: An approach to development with AI assistants based on continuous improvisation: the developer gives short commands in chat, the agent immediately executes, context exists only within an ephemeral session. Useful for prototyping and exploration, but generates five critical failures: intention drift, context decay, unverifiable result, hidden decisions, reviewer fatigue.

SDD (Specification-Driven Development): A development methodology in which the AI agent receives not just a "do it" command but structured context in the form of versioned files: project mission, technical stack, roadmap, and specifications for individual features. It moves knowledge from the ephemeral chat into the repository, so that errors surface early and stay small and verifiable.

Intention drift: A typical vibe coding failure: the agent interprets "simple form" as a reason to add complex state management, a new library, an alternative interface style — everything the developer didn't ask for, but the agent deemed reasonable.

Context decay: Loss of architectural decisions between sessions with the agent. After several iterations, the agent no longer knows why the project doesn't use an ORM, why the API renders HTML on the server, or why a particular stack was chosen, and it makes new decisions that contradict the earlier ones.

Sustainable cycle working with Qwen Code: An alternative to one long improvisational session: three separate phases, each starting with a forced context reset (/clear). Phase 1: the agent reads the base specifications and creates a feature plan, asking clarifying questions first. Phase 2: the agent implements a strictly limited subset of tasks from that plan. Phase 3: the agent validates the result against pre-established criteria and lists discrepancies before any fixes are made.
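As a rough illustration, the three phases can be written down as data and rendered into the text typed at the start of each session. This is a minimal Python sketch, not part of Qwen Code: the Session class, the render function, and the specs/feature/ paths are hypothetical names chosen for the example; only /clear and @-references to files come from this guide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """One phase of the cycle: what the agent reads and what it is told."""
    name: str
    reads: tuple          # spec files referenced with @ (hypothetical paths)
    instruction: str


def render(session: Session) -> str:
    """Build the text for a fresh session: /clear, then @-references, then the instruction."""
    refs = " ".join(f"@{path}" for path in session.reads)
    return f"/clear\n{refs}\n{session.instruction}"


# The three phases, kept deliberately separate so context never carries over.
CYCLE = (
    Session(
        "planning",
        ("specs/mission.md", "specs/tech-stack.md", "specs/roadmap.md"),
        "Create a feature specification. Ask clarifying questions first. Don't write code.",
    ),
    Session(
        "implementation",
        ("specs/feature/plan.md",),
        "Implement only task group 1 of the plan. Don't change unrelated files.",
    ),
    Session(
        "validation",
        ("specs/feature/validation.md",),
        "Compare the implementation with validation.md. List discrepancies before fixes.",
    ),
)

if __name__ == "__main__":
    for session in CYCLE:
        print(f"--- {session.name} ---")
        print(render(session))
```

Keeping the phases as explicit data makes it harder to slip back into one long session: each phase names its own inputs and its own single instruction.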

Feature mini-specification: A compact document (not 80 pages of waterfall), oriented toward one phase, one branch, one merge. Consists of three files: requirements.md (requirements and boundaries), plan.md (implementation plan), validation.md (verification criteria). Characteristics: small, verifiable, linked to the roadmap, understandable to a new agent without previous chat, strict enough to exclude guessing.
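The three-file layout can be scaffolded with a few lines of Python. This is a sketch under assumptions: the function name scaffold_spec, the template contents, and the checkbox line in validation.md are illustrative choices, not a format prescribed by the course. The section names in requirements.md mirror the ones used in the admin login exercise later in this guide.

```python
import tempfile
from pathlib import Path

# Hypothetical starter templates; adjust the sections to your own project.
TEMPLATES = {
    "requirements.md": (
        "# Requirements: {title}\n\n## Boundaries\n\n## Out of scope\n\n"
        "## Decisions\n\n## Verification\n"
    ),
    "plan.md": "# Plan: {title}\n\n## Task group 1\n",
    "validation.md": "# Validation: {title}\n\n- [ ] (add a specific, checkable criterion)\n",
}


def scaffold_spec(root: Path, slug: str, title: str) -> list:
    """Create root/slug/ with the three spec files; return the files actually created."""
    spec_dir = root / slug
    spec_dir.mkdir(parents=True, exist_ok=True)
    created = []
    for name, template in TEMPLATES.items():
        target = spec_dir / name
        if not target.exists():  # never overwrite a spec someone already edited
            target.write_text(template.format(title=title), encoding="utf-8")
            created.append(target)
    return created


if __name__ == "__main__":
    # Demo in a temporary directory so nothing is written to a real repository.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold_spec(Path(tmp), "2026-05-08-reviews-display", "Reviews display"):
            print(path.name)
```

Refusing to overwrite existing files keeps the script safe to re-run on a branch where the specification has already been filled in.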

Seven specification questions: A checklist for substantive readiness: (1) business intention behind the feature, (2) user and scenario, (3) boundaries of work, (4) decisions made and open, (5) non-negotiable constraints, (6) what should not break, (7) how to prove result correctness. Absence of an answer to any question means unreadiness for implementation.
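The readiness rule (any unanswered question means the specification is not ready) is simple enough to express as a check. The sketch below is illustrative only: the key names such as must_not_break are hypothetical labels for the seven questions, and an answer counts as present if it is non-empty text.

```python
# Hypothetical short keys for the seven questions, in the order listed above.
SEVEN_QUESTIONS = (
    "intention",          # (1) business intention behind the feature
    "user_and_scenario",  # (2) user and scenario
    "boundaries",         # (3) boundaries of work
    "decisions",          # (4) decisions made and open
    "constraints",        # (5) non-negotiable constraints
    "must_not_break",     # (6) what should not break
    "proof",              # (7) how to prove result correctness
)


def missing_answers(answers: dict) -> list:
    """Return the questions that have no answer or only whitespace."""
    return [q for q in SEVEN_QUESTIONS if not str(answers.get(q, "")).strip()]


def is_ready(answers: dict) -> bool:
    """A specification is ready only when all seven questions are answered."""
    return not missing_answers(answers)


if __name__ == "__main__":
    draft = {
        "intention": "Restrict access to the admin panel",
        "user_and_scenario": "Single administrator logs in",
        "boundaries": "One login page, cookie session",
    }
    print("ready:", is_ready(draft))
    print("missing:", missing_answers(draft))
```

A check like this only detects missing answers; whether an answer is precise enough to exclude guessing still needs a human or a separate review session.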

Negative requirements: Documentation of existing behavior that the feature must not affect. Critically important for preventing regressions when working with agents prone to broad changes. Covered in more detail in Part 7 of the course.

Practical redundancy rule: A heuristic for deciding whether a specification is needed. If a change can be understood and verified in 5 minutes, a regular request is sufficient. If it touches architecture, data, security, public behavior, or multiple files, a specification is required. If the agent works autonomously for more than a few minutes, a specification almost always pays off.
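The heuristic can be sketched as a small decision function. Several details below are assumptions rather than part of the rule itself: the order in which the conditions are checked, the 5-minute stand-in for "more than a few minutes", and the fallback for changes that fit none of the three cases. The Change and Verdict names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    PLAIN_REQUEST = "regular request is sufficient"
    SPEC_REQUIRED = "specification required"
    SPEC_RECOMMENDED = "specification almost always pays off"


@dataclass(frozen=True)
class Change:
    minutes_to_understand_and_verify: float
    touches_sensitive_area: bool   # architecture, data, security, public behavior
    files_touched: int
    autonomous_agent_minutes: float


def verdict(change: Change, autonomy_threshold_minutes: float = 5.0) -> Verdict:
    """Apply the rule; the strictest condition is checked first (an assumption)."""
    if change.touches_sensitive_area or change.files_touched > 1:
        return Verdict.SPEC_REQUIRED
    if change.autonomous_agent_minutes > autonomy_threshold_minutes:
        return Verdict.SPEC_RECOMMENDED
    if change.minutes_to_understand_and_verify <= 5:
        return Verdict.PLAIN_REQUEST
    # Not covered by the rule as stated: slow to verify but otherwise harmless.
    # Erring toward a specification is this sketch's choice.
    return Verdict.SPEC_RECOMMENDED


if __name__ == "__main__":
    rename_button = Change(1, False, 1, 0)
    new_auth_flow = Change(60, True, 12, 30)
    print(verdict(rename_button).value)
    print(verdict(new_auth_flow).value)
```

Real changes often sit on the border between categories, so the output is best read as a prompt for judgment, not a verdict to follow blindly.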

In scope and out of scope: Explicitly listed inclusions and explicit exclusions in the feature specification. Protect against scope creep and prevent the agent from "imagining" functionality that the developer didn't order.

Practice exercises: Name: Diagnosing a Vibe Coding Failure

Problem: Read the following scenario and identify which of the five typical vibe coding failures occurred: A developer asks the agent to "add reviews to products". The agent creates a full-fledged rating system with moderation, notifications, analytics, and integration with an external service. The developer only wanted text review display under the product card. In the next session, the agent suggests replacing existing SQLite with PostgreSQL for "scalability". The developer spends 2 hours rolling back changes.

Solution: Step 1: Identify the first failure. The agent added unrequested functionality (ratings, moderation, notifications, analytics, external service) instead of the requested minimum. This is classic intention drift: interpreting "reviews" as a reason to build an ecosystem.

Step 2: Identify the second failure. In a new session, the agent doesn't remember or doesn't account for the decision to use SQLite and suggests PostgreSQL. This is context decay: the architectural decision about the storage stack is lost.

Step 3: Identify the third failure. The result takes 2 hours to roll back, meaning the changes were unverifiable in scope and unpredictable in consequences (unverifiable result + reviewer fatigue).

Step 4: Propose an SDD solution: create specs/2026-05-08-reviews-display/requirements.md with explicit boundaries ("display only", "no ratings", "no moderation in this phase"), fix the decision about SQLite in tech-stack.md, and create validation.md with a check for the absence of new dependencies.
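The "absence of new dependencies" check can be automated by comparing package.json against an allowlist. The following is a minimal sketch: the function name new_dependencies and the package names in the demo are illustrative, and it assumes dependencies are declared in the standard dependencies and devDependencies fields of package.json.

```python
import json


def new_dependencies(package_json_text: str, allowed: set) -> set:
    """Return declared packages that are not on the allowlist from the specification."""
    manifest = json.loads(package_json_text)
    declared = set(manifest.get("dependencies", {})) | set(manifest.get("devDependencies", {}))
    return declared - allowed


if __name__ == "__main__":
    # Illustrative manifest: the agent quietly added a PostgreSQL client.
    manifest_text = json.dumps({
        "dependencies": {"express": "^4.19.0", "better-sqlite3": "^11.0.0", "pg": "^8.11.0"},
        "devDependencies": {"typescript": "^5.4.0"},
    })
    allowed = {"express", "better-sqlite3", "typescript"}
    print(sorted(new_dependencies(manifest_text, allowed)))
```

Running such a check in the validation session turns "no new dependencies" from a wish into a pass/fail criterion.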

Complexity: intermediate

Name: Composing a Mini-Specification for Admin Login

Problem: Take the feature "Add login" from the course example. Transform the impulsive request into a complete mini-specification with three files: requirements.md, plan.md, validation.md. Use the seven questions as a content check. Then formulate three clarifying questions you would ask the agent before starting implementation.

Solution: Step 1: requirements.md. Structure: # Requirements: Admin Login; ## Boundaries (one login page, cookie session, no self-registration, no password reset in this phase); ## Out of scope (OAuth, JWT, user registration, password reset, multi-factor authentication); ## Decisions (single administrator in SQLite, httpOnly cookie, protection only for /dashboard); ## Verification (test scenarios).

Step 2: plan.md. Decomposition into task groups: (1) data schema and user model, (2) routes /login and /dashboard with middleware, (3) form and validation, (4) tests.

Step 3: validation.md. Specific checks: unauthenticated GET /dashboard → 302 to /login; incorrect password → generic error message without leaking user existence; correct password → httpOnly cookie + 302 to /dashboard; npm test and npm run typecheck pass; no new dependencies except bcrypt and cookie-parser.

Step 4: Check against the seven questions: (1) intention: restricted access to the admin panel, (2) user: single administrator, (3) boundaries: explicitly listed, (4) decisions: SQLite, cookie, bcrypt fixed, (5) constraints: httpOnly, protection only for /dashboard, (6) negative requirements: don't break public pages, (7) proof: 4 test scenarios + CI checks.

Step 5: Clarifying questions to the agent: "Confirm you understand: no password reset in this phase means no /forgot-password route and no integration with email service"; "Suggest an alternative to bcrypt with justification if you consider it suboptimal for this project"; "How will you verify that the cookie is actually httpOnly and not accessible via document.cookie?"
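Two of the validation checks (the 302 redirect and the httpOnly cookie) can be expressed as small predicates over an HTTP response. This sketch uses Python's standard http.cookies module to parse a Set-Cookie header; the function names and the cookie name "session" are hypothetical, and obtaining the actual response from the running app is left out.

```python
from http.cookies import SimpleCookie
from typing import Optional


def is_http_only(set_cookie_header: str, cookie_name: str) -> bool:
    """True if the named cookie is present and carries the HttpOnly attribute."""
    jar = SimpleCookie()
    jar.load(set_cookie_header)
    morsel = jar.get(cookie_name)
    # An absent flag is stored as an empty string, a present one as True.
    return morsel is not None and bool(morsel["httponly"])


def redirects_to(status: int, location: Optional[str], expected_path: str) -> bool:
    """True if the response is a 302 pointing at the expected path."""
    return status == 302 and location == expected_path


if __name__ == "__main__":
    # Illustrative header values, as a login handler might send them.
    print(is_http_only("session=abc123; Path=/; HttpOnly", "session"))
    print(is_http_only("session=abc123; Path=/", "session"))
    print(redirects_to(302, "/login", "/login"))
```

Checking the header directly answers the third clarifying question: a cookie marked HttpOnly in Set-Cookie is not exposed to document.cookie by the browser.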

Complexity: intermediate

Name: Designing a Sustainable Cycle for a Feature

Problem: You have a Next.js blog project with a roadmap: phase 1 — basic posts, phase 2 — comments, phase 3 — tags and search. You want to implement phase 2 (comments). Design three Qwen Code sessions using /clear, indicate which specification files the agent reads in each session, and what specific instructions it receives. Justify the division into three sessions.

Solution: Step 1: Session "Research and Planning". Run /clear, then have the agent read @specs/mission.md (project goal: minimalist blog with focus on readability), @specs/tech-stack.md (Next.js 14 App Router, Prisma + PostgreSQL, Tailwind, React Server Components by default), and @specs/roadmap.md (phase 1 completed, phase 2: comments, phase 3: tags). Instruction: "Create a feature specification for phase 2 of the roadmap. First ask me questions about ambiguities. Don't write code." Justification: the agent must understand the project context without mixing it with previous sessions, and formulate ambiguities before implementation.

Step 2: Session "Implementation". Run /clear, then have the agent read @specs/mission.md, @specs/tech-stack.md, and @specs/2026-05-15-comments/plan.md (created in session 1). Instruction: "Implement only task groups 1 and 2 of the plan: Prisma schema for comments and API route POST /api/posts/[id]/comments. Don't change unrelated files. Don't add UI in this session." Justification: limiting the scope of work prevents intention drift, and clear boundaries protect against scope creep.

Step 3: Session "Validation". Run /clear, then have the agent read @specs/2026-05-15-comments/validation.md. Instruction: "Compare current implementation with validation.md. Before fixes, list discrepancies. Then propose minimal corrections." Justification: a separate verification session prevents impulsive fixes, forces explicit acknowledgment of problems, and protects against reviewer fatigue through a structured protocol.
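The instruction "Don't change unrelated files" can be verified after the implementation session by comparing the list of changed files with the paths the plan allows. The sketch below is one possible approach, not something the course prescribes: the function name, the allowed patterns, and the file paths are illustrative, and the list of changed files is assumed to come from a command such as git diff --name-only.

```python
from fnmatch import fnmatchcase


def out_of_scope(changed_files: list, allowed_patterns: list) -> list:
    """Return changed files that match none of the allowed glob patterns."""
    return [
        path for path in changed_files
        if not any(fnmatchcase(path, pattern) for pattern in allowed_patterns)
    ]


if __name__ == "__main__":
    # Illustrative output of `git diff --name-only` after the implementation session.
    changed = [
        "prisma/schema.prisma",
        "app/api/posts/[id]/comments/route.ts",
        "app/components/CommentList.tsx",   # UI was excluded from this session
    ]
    # Patterns avoid literal [id]: fnmatch treats square brackets as a character
    # class, so a trailing * is used to cover the dynamic route segment instead.
    allowed = ["prisma/*", "app/api/posts/*"]
    print(out_of_scope(changed, allowed))
```

A non-empty result is a discrepancy to list in the validation session before any fixes are proposed.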

Complexity: advanced

Name: Applying the Practical Redundancy Rule

Problem: Evaluate using the practical rule whether a full specification (requirements/plan/validation) is needed for each of the following changes: (A) Fixing a typo in the homepage heading; (B) Adding a "biography" field to the user profile with 500-character length validation; (C) Migrating from SQLite to PostgreSQL to support concurrent writes; (D) Creating a one-off data migration script for internal use; (E) Implementing Stripe payment integration for subscriptions.

Solution: Step 1: (A) Typo fix: a one-file change, understandable in 30 seconds, verifiable visually. Practical rule: under 5 minutes, one-off, doesn't touch architecture. Decision: regular request, no specification.

Step 2: (B) Biography field: touches the data schema, validation, form, API, and possibly tests, so by the letter of the rule (data, multiple files) it leans toward a specification. However, the change is localized and template-based, the agent works for under 5 minutes, and the boundaries are obvious. Decision: borderline case. A light request with context in chat may be sufficient and the full cycle can be skipped, but a minimal requirements.md improves reliability.

Step 3: (C) SQLite → PostgreSQL migration: an architectural decision that touches data, configuration, possibly ORM queries, tests, and CI/CD. The agent will work for a long time and errors are costly. Decision: full specification with explicit decisions, negative requirements (what must not break), a migration plan, and validation.

Step 4: (D) One-off migration script: internal, one-off, verifiable by its result. But if the agent works autonomously and touches production data, the risk is high. Decision: if the script is simple and the data is not critical, a light request; if the logic is complex or production data is involved, a minimal specification with verification on a copy of the data.

Step 5: (E) Stripe payment integration: security, public behavior, an external service, users' money, legal consequences of errors. Decision: mandatory full specification with explicit boundaries (which payments, which currencies, which webhooks), fixed decisions (Stripe SDK vs API, error handling), non-negotiable constraints (PCI compliance, idempotency keys), negative requirements (don't store CVV, don't process payments synchronously), and detailed validation.

Complexity: intermediate

Case studies: Name: Startup Migration from Vibe Coding to SDD: Saving a Project Through Specifications

Scenario: An EdTech startup was developing an online course platform using Claude and GitHub Copilot in vibe coding mode. Over 4 months, a team of 2 developers created a working MVP: authorization, video player, learning progress, payments. The project grew impulsively: each feature was added through a "do X" request, and architectural decisions were made by the agent on the fly without documentation.

Challenge: In month 5, the team faced critical problems: (1) Intention drift: the payment system contained 3 different approaches (Stripe Checkout, Stripe Elements, custom form) because agents chose different solutions in different sessions; (2) Context decay: a new developer couldn't understand why the video player used a custom progress storage format instead of a standard one; (3) Unverifiable result: the release contained 47 changed files, review took 6 hours, and errors leaked into production; (4) Hidden decisions: a critical decision about database sharding survived only in the chat history of a developer who had left; (5) Reviewer fatigue: the CTO stopped night releases because quality could not be guaranteed.

Solution: The team implemented SDD in 3 weeks: (1) Retrospective: extracting hidden decisions from chat histories and recording them in specs/mission.md, specs/tech-stack.md, and specs/architecture-decisions/; (2) Roadmap: decomposing the remaining features into phases with explicit boundaries; (3) Mini-specifications: each feature received requirements/plan/validation files, verified against the seven questions; (4) Sustainable cycle: running /clear between sessions and separating planning, implementation, and validation; (5) Automated verification: npm scripts and GitHub Actions that require the checks in validation.md to pass before merge.
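One way such a merge gate could work is to treat validation.md as a checklist and fail the build while any item is unchecked. The case does not say how the team implemented it, so the following is only a sketch under an assumption: criteria are written as Markdown task-list items ("- [ ]" and "- [x]"). The function name is hypothetical.

```python
import re

# Matches Markdown task-list items such as "- [ ] text" or "* [x] text".
CHECKBOX = re.compile(r"^\s*[-*]\s+\[(?P<mark>[ xX])\]\s+(?P<text>.+)$")


def unchecked_items(markdown: str) -> list:
    """Return the text of every task-list item that is still unchecked."""
    items = []
    for line in markdown.splitlines():
        match = CHECKBOX.match(line)
        if match and match.group("mark") == " ":
            items.append(match.group("text").strip())
    return items


if __name__ == "__main__":
    # Illustrative validation.md content.
    sample = """# Validation: Admin Login

- [x] Unauthenticated GET /dashboard returns 302 to /login
- [ ] Correct password sets an httpOnly cookie
- [x] npm test passes
"""
    remaining = unchecked_items(sample)
    for item in remaining:
        print("UNCHECKED:", item)
    # A CI step could exit non-zero here to block the merge.
    print("exit code:", 1 if remaining else 0)
```

A checklist gate only confirms that someone ticked each box; pairing it with automated tests for the same criteria gives stronger evidence.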

Result: After 2 months: review time fell from 6 hours to 45 minutes; release bugs dropped by 70%; a new developer onboarded in 3 days instead of 3 weeks; the team removed 2 of the 3 payment systems without regressions thanks to negative requirements. The CTO resumed night releases with confidence. The project attracted Series A funding.

Lessons learned: Vibe coding has hidden costs: every minute saved on specification turns into hours of review and rollback at the 3-6 month horizon

Seven specification questions work as an early ambiguity detector: if the agent asks unexpected questions when checking requirements.md, the specification is not yet ready

Session separation with /clear is critically important for preventing context mixing: even a "smart" agent confuses priorities when planning, implementation, and bug fixing are combined in one session

Negative requirements are the most underestimated component: explicitly stating "what we don't do" protects against scope creep better than listing "what we do"

SDD pays off not instantly, but through accumulation of versioned context: the benefit grows with team size and project age

Related concepts: Vibe Coding

Seven Specification Questions

Sustainable Cycle Working with Qwen Code

Negative Requirements

Practical Redundancy Rule

Name: OAuth Integration Failure: When "Add Login" Costs More Than a Specification

Scenario: An indie developer was building a SaaS for freelancers — a time tracker with invoicing. When preparing for public launch, authorization was needed. The developer gave Claude an impulsive request: "Add login with Google".

Challenge: The agent implemented a full OAuth 2.0 integration with Google, including: new user registration via Google, linking existing accounts, token refresh, access to Google Calendar for importing events, profile with avatar from Google. Problems: (1) The developer only wanted authentication, not authorization to access Calendar data; (2) Account linking created a vulnerability: user A could link their Google account to user B's email if user B had previously logged in with a password; (3) Token refresh required a cron job that was not documented in the infrastructure; (4) Registration via Google bypassed the mandatory "timezone" field, which is critical for invoicing; (5) Rollback took 8 hours, including database recovery from backup.

Solution: After the incident, the developer applied an SDD approach: created specs/auth/requirements.md with boundaries (authentication only, no access to Google data, no self-registration in this phase), fixed the decision to prioritize email-password over OAuth in the first release, described negative requirements (don't break existing users, don't require timezone during OAuth login). The plan contained 2 task groups instead of 8, validation — 5 specific checks including verification of absence of new scopes in Google Console.

Result: Re-implementation took 2 hours, compared with 8 hours of rollback plus an undefined amount of time for fixes. The specification prevented 3 of 5 problems before any code was written (Calendar access, account linking, timezone requirement). The 2 remaining problems were caught at validation in 15 minutes. The developer estimated that writing the specification took 25 minutes, roughly 19 times less than the 8 hours (480 minutes) spent on rollback.

Lessons learned: An impulsive "Add X" request is only cheap in the moment: the full cost includes rollback, fixes, data recovery, and reputational losses

Agents are prone to "helpful" extensions that the developer didn't order: OAuth naturally pulls in profiles, integrations, synchronizations

25 minutes on a specification is an investment with ROI > 1000% at the first serious failure

Negative requirements in the form of "no access to Google data" are more concrete and verifiable than "authentication only"

Validation through comparing implementation with a pre-established list protects against "seems fine" when the reviewer is tired

Related concepts: Intention Drift

In Scope and Out of Scope

Practical Redundancy Rule

Seven Specification Questions

Feature Mini-Specification

Study tips: Go through the material sequentially: first recognize vibe coding problems through your own experience or cases, then study SDD as a systemic solution, not as bureaucracy

Keep a "vibe coding diary" — record your impulsive AI requests for a week, then analyze: which led to rollback? which could have been formalized in 10 minutes?

Practice the seven questions on real or imaginary features: take any feature from your current project and check if you can answer all seven questions in 5 minutes. If not — you need a specification

Use the "red team" technique: after writing a specification, ask an agent or colleague to find ambiguities through which implementation could go sideways. This trains critical thinking

Create templates for the three specification files in your editor or snippets: requirements.md, plan.md, validation.md. Automating creation lowers the barrier to SDD

Practice session separation literally: close the chat, open a new one with /clear, even if "continuation" seems more efficient. Measure the difference in result quality

Study negative requirements separately: this is the most counterintuitive part, requiring the habit of thinking "what do I NOT want" instead of "what do I want"

Actively compare SDD with waterfall: waterfall tries to predict the future, SDD fixes sufficient context for the next step. These are different logics, not different scales

Measure metrics: review time, number of bugs in release, time to onboard a new developer. SDD is justified by metrics, not intuition

Start with one feature: don't try to cover the entire project with specifications at once. Choose the next non-trivial feature, go through the full cycle, measure the result, then scale

Additional resources: Original SDD course: Course materials from which this part is extracted — foundation for immersion in context-dependent development with AI agents

Qwen Code documentation: Official materials on working with context, /clear commands, @-references to files — technical foundation of the sustainable cycle

"Writing Great Specifications" by Kamil Nicieja: A book about specifications in the classical sense, applicable to AI agents with scale adaptation

ADR (Architecture Decision Records): A format for fixing architectural decisions — complements SDD at the project level, protects against context decay between features

"The Checklist Manifesto" by Atul Gawande: A book about the power of checklists in complex systems — seven specification questions as a quality checklist

Open-source project specification examples: Studying how teams fix feature boundaries in RFCs (Request for Comments) — parallel with SDD mini-specifications

API Design Patterns course: Helps formulate non-negotiable constraints in specifications, especially for features with external interfaces

Summary: Specification-Driven Development (SDD) is the answer to the fundamental limitation of vibe coding: context ephemerality. AI agents write code quickly, but without persistent memory of intentions, boundaries, and decisions made. SDD transfers this context into versioned files, creating a "project map" for the agent. Key practices: a sustainable cycle with /clear and separate planning, implementation, and validation sessions; feature mini-specifications instead of monolithic design; seven questions as a readiness check; explicit boundaries and negative requirements to protect against scope creep; the practical redundancy rule to avoid wasted effort. SDD doesn't make the agent infallible. It makes errors early, small, and verifiable. Implementation requires discipline, but the payoff grows as the project and team grow.