Topic: Part 22. Practical Exam
Difficulty level: Medium
Estimated study time: 12-16 hours (theory: 4 hours, practice: 8-12 hours)
Prerequisites: Knowledge of SDD methodology from parts 1-21 of the course
Experience working with Qwen Code CLI
Basic Git and branching skills
Understanding of the specs/ directory structure
Familiarity with markdown specifications
Learning objectives: Conduct a full SDD development cycle for a feature from intent to merge, creating specifications (requirements.md, plan.md, validation.md), implementing code, and passing verification with a score of at least 21 points out of 25
Identify at least 8 critical issues in an incorrect specification and rewrite it in SDD format with concrete boundaries, solutions, and verifiable acceptance criteria
Provide written answers to 22 control questions about SDD principles, demonstrating understanding of the source of truth, differences between QWEN.md and tech-stack.md, reasons for writing specifications before implementation, contents of validation.md, and secret security rules
Perform role-based "author-reviewer" interaction in a paired exam, demonstrating ability to give and receive constructive feedback across four review layers: code, specifications, facts, process
Overview: The practical exam is the culmination of the SDD (Specification-Driven Development) course: it replaces a passive test with a real skill check. The student must prove the ability to carry a feature through the full cycle, from intent to verification, using Qwen Code CLI. The exam consists of four blocks: quick questions (theoretical foundation), problem analysis in specifications (critical thinking), rewriting specifications in SDD format (documentation practice), and the final "feedback form" project (full development cycle). A distinctive feature of the exam is support for a paired variant, where participants alternate between author and reviewer roles, modeling real team work and dispelling the illusion that review is passive. The 25-point rubric covers five dimensions, 5 points each: specifications, implementation, verification, process, and team readiness and security. A result of 21 or more points indicates readiness for industrial SDD application.
Key concepts: Source of truth in SDD: Versioned specification in the repository, not agent memory, verbal agreements, or external documents. This enables replacing agents and IDEs while maintaining process continuity. Any discrepancy between specification and code is a defect to be fixed.
QWEN.md vs tech-stack.md: QWEN.md defines agent behavior rules (how to work); tech-stack.md defines product technical decisions (what to use). The two must not overlap: the agent constitution is kept separate from the product architecture. tech-stack.md contains long-term decisions (language, framework, DBMS); a feature's requirements.md contains decisions at the level of a single phase (routes, fields, error messages).
Specification before implementation: The agent must not guess boundaries and acceptance criteria. The specification fixes human intent before the agent begins generating code. This prevents agent "hallucinations" and provides a basis for objective verification.
validation.md: Contains four elements: commands for automated checks (npm test, npm run typecheck), manual checks with concrete steps, discrepancy checks (comparing code with specification), and a readiness definition (a clear phase completion criterion). Without validation.md, it is impossible to state objectively whether a feature is complete.
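The four elements can be checked mechanically before a specification is committed. The sketch below is only an illustration: the heading names in `REQUIRED_SECTIONS` and the function `missingValidationSections` are assumptions made for this example, not a format prescribed by the course.

```typescript
// Hypothetical heading names: the course lists four elements of validation.md,
// but the exact heading text used here is an assumption for this sketch.
const REQUIRED_SECTIONS = [
  "Automated checks",
  "Manual checks",
  "Discrepancy checks",
  "Readiness",
] as const;

// Returns the required sections that have no matching markdown heading.
function missingValidationSections(markdown: string): string[] {
  const headings = markdown
    .split("\n")
    .filter((line) => /^#{1,6}\s/.test(line))
    .map((line) => line.replace(/^#{1,6}\s+/, "").trim().toLowerCase());

  return REQUIRED_SECTIONS.filter(
    (section) => !headings.some((h) => h.indexOf(section.toLowerCase()) === 0),
  );
}

// A draft that still lacks the discrepancy check and the readiness definition.
const draft = [
  "# Validation",
  "## Automated checks",
  "- npm test",
  "- npm run typecheck",
  "## Manual checks",
  "- Open /feedback and submit the form",
].join("\n");

console.log(missingValidationSections(draft));
```

A non-empty result is a signal that the feature cannot yet be declared complete in an objective way.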
Replanning: Performed between features when new knowledge must update the constitution (QWEN.md), roadmap (roadmap.md), or process. Specification changes are moved to a separate branch when they affect the stack, roadmap, constitution, agent rules, or multiple future features.
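The rule for moving a change to a separate replanning branch can be written down as a small decision function. This is a minimal sketch: the type `ChangeArea`, the shape `SpecChange`, and the reading of "multiple future features" as "more than one" are assumptions for illustration.

```typescript
// Areas a specification change can touch. Everything except "feature" is
// project-level according to the rule described above.
type ChangeArea = "stack" | "roadmap" | "constitution" | "agent-rules" | "feature";

// Hypothetical description of a proposed specification change.
interface SpecChange {
  areas: ChangeArea[];
  affectedFutureFeatures: number;
}

// True when the change should go to a separate replanning branch.
function needsReplanningBranch(change: SpecChange): boolean {
  const touchesProjectLevel = change.areas.some((area) => area !== "feature");
  // Assumption: "multiple future features" is read as more than one.
  return touchesProjectLevel || change.affectedFutureFeatures > 1;
}

console.log(needsReplanningBranch({ areas: ["feature"], affectedFutureFeatures: 0 }));
console.log(needsReplanningBranch({ areas: ["feature", "stack"], affectedFutureFeatures: 0 }));
```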
/clear command between phases: Checks whether sufficient context is recorded in files. Without /clear, the agent may continue relying on stale context from the previous phase, leading to "leakage" of intents and uncontrolled decisions.
Three questions before specification: Boundaries (what is included in the work), decisions (what technical choices are made), context (what constraints and preconditions exist). Answers to these questions form the structure of requirements.md.
Secret security: API keys, access tokens, passwords, private keys, user personal data, internal URLs and infrastructure identifiers must not be stored in QWEN.md, specifications, or agent memory. Specifications are committed and read by the agent; secrets must live in environment variables or a secrets store.
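A quick scan of specification text before commit can catch the most obvious violations of this rule. The sketch below is an assumption-heavy illustration: the two patterns in `SECRET_PATTERNS` are deliberately small examples, and a real secret scanner would use a much larger rule set.

```typescript
// Illustrative patterns only; both rules are assumptions made for this sketch.
const SECRET_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: "private key block", pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  {
    name: "key=value credential",
    pattern: /\b(api[_-]?key|token|password|secret)\b\s*[:=]\s*['"]?[A-Za-z0-9_-]{12,}/i,
  },
];

interface Finding {
  line: number; // 1-based line number in the scanned text
  rule: string; // name of the pattern that matched
}

// Reports every line of the text that matches one of the patterns.
function findSecrets(text: string): Finding[] {
  const findings: Finding[] = [];
  text.split("\n").forEach((content, index) => {
    for (const { name, pattern } of SECRET_PATTERNS) {
      if (pattern.test(content)) findings.push({ line: index + 1, rule: name });
    }
  });
  return findings;
}

// Line 2 only names an environment variable; line 3 embeds a made-up value.
const specText = [
  "## Decisions",
  "- Read the API key from process.env.API_KEY",
  "- API_KEY=abc123def456ghi789",
].join("\n");

console.log(findSecrets(specText));
```

Referring to an environment variable by name is allowed; only the embedded value on the last line is reported.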
Project skill vs personal skill: Project skill is shared by the team and versioned with the repository. Personal skill is tied to one user and not inherited. Project skill supports agent replaceability because the process lives in repository artifacts rather than in one person's setup.
Agent replaceability: The ability to switch agents or IDEs while maintaining process through repository artifacts. Achieved by having all essential information in versioned files rather than in agent memory or specific developer habits.
Review in SDD project: The reviewer checks not only code but also requirements, plan, verification facts, changes outside feature boundaries, and implementation-to-specification correspondence. Four review layers: code, specifications, facts, process.
Three types of changes in PR besides code: Changes to tech-stack.md or roadmap.md without discussion; new hooks, MCP servers, or dependencies; discrepancy between validation.md and facts in PR.
Green dots in MVP phase: Tagging commits that provide a working increment. Provides a rollback point without losing all phase work and makes explicit the signal "next group breaks everything".
Defensive hook vs logging hook: Defensive hook blocks an operation by rule (PreToolUse, non-zero exit code); logging hook only writes an event and does not affect workflow. Separation of responsibilities is critical for security.
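The difference between the two hook kinds comes down to what they return. The sketch below models each hook as a pure function that yields an exit code. The `ToolEvent` shape and the deny-list are hypothetical; the actual payload that Qwen Code passes to hooks may look different.

```typescript
// Hypothetical event shape: the real hook payload of Qwen Code may differ.
interface ToolEvent {
  tool: string;
  command: string;
}

// Assumed deny-list, for illustration only.
const BLOCKED_COMMANDS = [/\brm\s+-rf\s+\//, /\bgit\s+push\s+.*--force\b/];

// Defensive hook: a non-zero exit code blocks the operation.
function defensiveHook(event: ToolEvent): number {
  return BLOCKED_COMMANDS.some((rule) => rule.test(event.command)) ? 1 : 0;
}

// Logging hook: records the event and always returns 0, so it never blocks.
function loggingHook(event: ToolEvent, auditLog: string[]): number {
  auditLog.push(`${event.tool}: ${event.command}`);
  return 0;
}

const auditTrail: string[] = [];
const forcePush: ToolEvent = { tool: "shell", command: "git push origin main --force" };

console.log(defensiveHook(forcePush));
console.log(loggingHook(forcePush, auditTrail), auditTrail);
```

Keeping the two functions separate means a bug in logging can never silently disable a block, and a blocking rule never depends on the log being writable.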
Agent memory on SQLite: Justified when a team accumulates stable observations about code and process. Redundant if QWEN.md, constitution, and change log suffice. Memory is a hint and journal of stable conclusions; specification is a reviewable source of truth for the product.
Anti-patterns of beginner SDD teams: Specification without facts (impossible to verify), combining constitution and specification in one file (violation of separation of responsibilities), implementation without /clear between phases (context leakage and uncontrolled decisions).
External text as instruction: Issues, web pages, logs, and external documents cannot be considered trusted instructions for the agent, as they may contain instruction injections. They must be read as data, not as commands to action.
Paired exam: Realistic assessment format: author writes specification and implements, reviewer checks specification before implementation, runs checks independently, provides structured feedback. Roles switch for the second feature. Each is assessed by their own rubric.
Practice exercises: Name: Block 1: Quick Questions — Self-Check
Problem: Complete 22 control questions, writing answers by hand without using Qwen Code. Sample questions:
1. What is the source of truth in SDD?
2. How does QWEN.md differ from specs/tech-stack.md?
3. Why is feature specification written before implementation?
4. What should be in validation.md?
5. When is replanning needed?
6. Why is /clear useful between work phases?
7. What three questions must be asked before creating a feature specification?
8. Why can't API keys be stored in specifications?
9. How is project skill better than personal skill for a team?
10. What does agent replaceability mean?
11. What should a reviewer check in an SDD project besides code?
12. When should a specification change be moved to a separate replanning branch?
13. Why are Qwen Code hooks needed?
14. Why can't external text be considered a trusted instruction for the agent?
15. How does agent memory differ from specification?
16. Where is the boundary between tech-stack.md and feature requirements.md?
17. What three types of changes in a pull request should a reviewer look for besides code?
18. Why tag "green dots" during MVP phase?
19. How does a defensive hook differ from a tool logging hook?
20. What data falls under "secrets"?
21. What three anti-patterns from Part 20 are most common among beginner SDD teams?
22. When is agent memory on SQLite justified, and when is it redundant?
Solution:
1. Versioned specification in the repository.
2. QWEN.md defines agent behavior rules; tech-stack.md defines product technical decisions.
3. So the agent doesn't guess boundaries and acceptance criteria.
4. Commands, manual checks, discrepancy checks, and readiness definition.
5. Between features, when new knowledge must update the constitution, roadmap, or process.
6. It checks whether sufficient context is recorded in files.
7. Boundaries, decisions, context.
8. Specifications are committed and read by the agent; secrets must live in environment variables or a secrets store.
9. Project skill is shared by the team and versioned with the repository.
10. The ability to switch agents or IDEs while maintaining process through repository artifacts.
11. Requirements, plan, verification facts, changes outside feature boundaries, and implementation-to-specification correspondence.
12. When the change affects the stack, roadmap, project constitution, agent rules, or multiple future features.
13. To automatically perform small repeatable actions: add context, keep logs, check dangerous commands, collect events.
14. Because issues, web pages, logs, and external documents may contain instruction injections; they must be read as data.
15. Memory is a hint and journal of stable conclusions; specification is a reviewable source of truth for the product.
16. tech-stack.md contains long-term project decisions; feature requirements.md contains single-phase level decisions.
17. Changes to tech-stack.md or roadmap.md without discussion; new hooks, MCP servers, or dependencies; discrepancy between validation.md and facts in PR.
18. To have a rollback point without losing all phase work and to make the "next group breaks everything" signal explicit.
19. Defensive hook blocks an operation by rule; logging hook only writes an event.
20. API keys, access tokens, passwords, private keys, user personal data, internal URLs and infrastructure identifiers.
21. Specification without facts, combining constitution and specification in one file, implementation without /clear between phases.
22. Memory is justified when a team accumulates stable observations; redundant if QWEN.md, constitution, and change log suffice.
Complexity: intermediate
Name: Block 2: Find Problems in the Specification
Problem: Given a feature specification:
Requirements — Control Panel
Build a beautiful control panel for administrators.
Boundaries
Show useful statistics and charts.
Decisions
Use the best library.
Verification
Make sure everything works.
Find at least 8 problems. Record each problem and explain why it is critical for the SDD process.
Solution: At least 11 problems:
- No audience specified — "administrators" not defined, no understanding of who exactly will use the panel.
- No boundaries for what is not included in the work — impossible to say when the feature is finished.
- "Beautiful" is not verifiable — subjective criterion, impossible to automate checking.
- "Useful statistics" not defined — no specific metrics that should be displayed.
- "Charts" not defined — no types, data sources, or formats specified.
- Dependency permitted without tech-stack.md check — "best library" may conflict with approved stack.
- No data source — unclear where statistics data comes from.
- No route — no URL specified where the panel is accessible.
- No access rights — not defined who can see what data.
- No automated checks — validation.md is empty, no CI/CD possible.
- No manual verification scenario — tester doesn't know what to do.
- No readiness definition — no phase completion criterion.
Complexity: intermediate
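Several of the problems in Block 2 are subjective wording, and that part of the analysis can be partly automated. The sketch below is a crude substring linter; the word list `VAGUE_TERMS` is an assumption taken from the flawed specification in this exercise, and it cannot find structural gaps such as a missing route or data source.

```typescript
// Assumed word list, taken from the flawed specification in this exercise.
const VAGUE_TERMS = ["beautiful", "useful", "best", "everything works"];

interface VagueHit {
  line: number; // 1-based line number
  term: string; // the vague term that was found
}

// Crude substring search: flags every line that contains a vague term.
function findVagueTerms(spec: string): VagueHit[] {
  const hits: VagueHit[] = [];
  spec.split("\n").forEach((text, index) => {
    const lower = text.toLowerCase();
    for (const term of VAGUE_TERMS) {
      if (lower.indexOf(term) !== -1) hits.push({ line: index + 1, term });
    }
  });
  return hits;
}

const flawedSpec = [
  "Requirements: Control Panel",
  "Build a beautiful control panel for administrators.",
  "Boundaries",
  "Show useful statistics and charts.",
  "Decisions",
  "Use the best library.",
  "Verification",
  "Make sure everything works.",
].join("\n");

console.log(findVagueTerms(flawedSpec));
```

The linter only covers the wording problems; the remaining items on the checklist still need a human reviewer.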
Name: Block 3: Rewrite the Specification in SDD Format
Problem: Rewrite the control panel specification in SDD format with constraints: HTML rendered on server; SQLite; no authentication in this phase; route /dashboard; show count of agents, ailments, therapies, and appointment bookings; do not add chart library yet; must pass npm test and npm run typecheck.
Required structure:
Requirements — Administrator Panel
Boundaries
Beyond Boundaries
Decisions
Context
Quick Verification
Solution:
```markdown
# Requirements — Administrator Panel

## Boundaries
- Server-side HTML rendering on route `/dashboard`
- Display of four counters: agents, ailments, therapies, appointment bookings
- Data from existing SQLite database
- Basic layout without external CSS frameworks

## Beyond Boundaries
- Authentication and authorization (phase without auth)
- Chart libraries (Chart.js, D3, etc.)
- Filtering, sorting, searching data
- Report export
- Real-time WebSocket updates

## Decisions
- SQL queries `SELECT COUNT(*)` for each entity
- Project's server-side templating engine (specify specific one per tech-stack.md)
- Route added to existing router
- Styles within project's existing CSS approach

## Context
- Phase without authentication: panel accessible to all via direct URL
- SQLite already used in project, schema exists
- MVP approach: charts deferred until need is confirmed

## Quick Verification
Automated:
- `npm run typecheck` passes without errors
- `npm test` passes, including new route tests
- Route `/dashboard` returns HTTP 200

Manual:
- Open `/dashboard`, see 4 numbers
- Verify numbers correspond to actual database data
- Check display on mobile (320px) and desktop (1280px)
- Verify panel link is absent from navigation (no auth)

Readiness: all automated checks pass, manual checks completed, no secrets in code.
```
Complexity: intermediate
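The rendering part of the Block 3 specification is small enough to sketch without any framework. The example below assumes the four counts have already been read with `SELECT COUNT(*)` queries; the `DashboardCounts` shape and the function `renderDashboard` are hypothetical names, and the route handler and database access are left out because they depend on the project's tech-stack.md.

```typescript
// Hypothetical shape for the four counters named in the boundaries.
interface DashboardCounts {
  agents: number;
  ailments: number;
  therapies: number;
  bookings: number;
}

// Renders the counters as plain server-side HTML, with no chart library.
function renderDashboard(counts: DashboardCounts): string {
  const rows: [string, number][] = [
    ["Agents", counts.agents],
    ["Ailments", counts.ailments],
    ["Therapies", counts.therapies],
    ["Appointment bookings", counts.bookings],
  ];
  const items = rows
    .map(([label, value]) => `    <li>${label}: ${value}</li>`)
    .join("\n");
  return ["<main>", "  <h1>Dashboard</h1>", "  <ul>", items, "  </ul>", "</main>"].join("\n");
}

// Made-up numbers standing in for real query results.
const sampleCounts: DashboardCounts = { agents: 3, ailments: 12, therapies: 7, bookings: 25 };
console.log(renderDashboard(sampleCounts));
```

Because the function is pure, the "see 4 numbers" manual check has a direct automated counterpart: a test can assert that exactly four list items are produced.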
Name: Block 4: Final Project — Feedback Form
Problem: Complete on your AgentClinic or another small project. Add the "feedback form" feature: page `/feedback`; form with `name` and `message`; POST route; SQLite storage; list of recent feedback entries; basic validation; tests; change log. Follow the process: clean main → feature branch → specification directory specs/YYYY-MM-DD-feedback-form/ with requirements.md, plan.md, validation.md → commit specification before implementation → implementation by groups → verification in separate Qwen Code session → update roadmap and change log → pull request per SDD template → security check → merge after checks. Use recommended Qwen Code scenarios for each phase.
Solution: Full step-by-step process:
**Preparation:**
    git checkout main
    git pull
    git checkout -b feature/feedback-form
    mkdir -p specs/2024-01-15-feedback-form
**Phase 1: Specification (Qwen Code session):**
    /clear
    Use the feature specification skill to start the next roadmap phase: feedback form. Before writing files, ask me about boundaries, decisions, and context. Do not implement code yet.
Create requirements.md, plan.md, validation.md, commit.
**Phase 2: Implementation (new Qwen Code session):**
    /clear
    Read @QWEN.md, @specs/mission.md, @specs/tech-stack.md, @specs/2024-01-15-feedback-form/requirements.md, @specs/2024-01-15-feedback-form/plan.md, and @specs/2024-01-15-feedback-form/validation.md.
    Implement only group 1. Stop after the list of changed files.
Repeat for each plan group, tag green dots.
**Phase 3: Verification (new Qwen Code session):**
    /clear
    Compare current branch with @specs/2024-01-15-feedback-form/validation.md. Show what passed, what failed, and where gaps exist. Do not modify files.
Fix found issues in a separate session.
**Phase 4: Completion:**
Use the change log skill and update @CHANGELOG.md for the feedback form branch.
Update roadmap.md, prepare PR per template, check for absence of secrets, perform merge.
**Self-assessment out of 25 points:**
- Specifications (5): presence of three files, concrete boundaries, tech-stack.md correspondence, plan grouping, automated and manual checks
- Implementation (5): specification correspondence, absence of unnecessary refactorings, clear migration, project conventions, error handling
- Verification (5): typecheck, tests, manual walkthrough, invalid input, responsiveness
- Process (5): correct branch, specification before code, updated roadmap, updated changelog, correspondence check
- Team readiness and security (5): PR relationship description, absence of out-of-boundary changes, absence of secrets, hook review, explicit weak facts
Complexity: intermediate
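The "basic validation" item of the feedback form is the kind of detail a specification should pin down. The sketch below is one possible reading: the limits `MAX_NAME_LENGTH` and `MAX_MESSAGE_LENGTH`, the error texts, and both function names are assumptions for illustration, and the real values belong in requirements.md. Storage is left out; with SQLite, the values would be passed as bound parameters rather than concatenated into SQL.

```typescript
// Assumed limits: the exercise only asks for "basic validation".
const MAX_NAME_LENGTH = 80;
const MAX_MESSAGE_LENGTH = 2000;

interface FeedbackInput {
  name: string;
  message: string;
}

type ValidationResult =
  | { ok: true; value: FeedbackInput }
  | { ok: false; errors: string[] };

// Trims both fields and rejects empty or oversized values.
function validateFeedback(input: FeedbackInput): ValidationResult {
  const name = input.name.trim();
  const message = input.message.trim();
  const errors: string[] = [];

  if (name.length === 0) errors.push("name is required");
  if (name.length > MAX_NAME_LENGTH) errors.push("name is too long");
  if (message.length === 0) errors.push("message is required");
  if (message.length > MAX_MESSAGE_LENGTH) errors.push("message is too long");

  return errors.length > 0 ? { ok: false, errors } : { ok: true, value: { name, message } };
}

// Escapes user text before it is placed into server-rendered HTML,
// for example in the list of recent feedback entries.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

console.log(validateFeedback({ name: "  ", message: "" }));
console.log(escapeHtml("<script>alert('x')</script>"));
```

Each branch of `validateFeedback` maps to one line of the "invalid input" check in the verification rubric, which makes the manual walkthrough easy to mirror in tests.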
Case studies:
Name: Case: AgentClinic — Implementing Feedback Form Through Full SDD Cycle
Scenario: A small team of two developers uses AgentClinic (simplified clinic management system) to learn SDD methodology. Product runs on Node.js + SQLite with server-side rendering. Users complain there is no way to report problems without email. The team decides to add a feedback form through the full SDD cycle as a practice feature for the practical exam.
Challenge: Both developers previously worked on a classic agile model with informal requirements. Problems: (1) habit of writing code immediately after discussion, without documentation; (2) not understanding why validation.md is needed if "we'll check everything anyway"; (3) fear that /clear "will erase useful context"; (4) unwillingness to review specifications, not just code; (5) storing a test API key in QWEN.md for convenience.
Solution: The team applies the paired exam: Developer A is the author, Developer B is the reviewer. Author starts with /clear and creates specification through Qwen Code skill, answering three questions (boundaries, decisions, context). Reviewer reads requirements.md before implementation and finds a critical gap: no message length limit specified (unbounded text invites oversized payloads and storage abuse; a length limit by itself does not stop SQL injection, which requires parameterized queries). Specification is refined before commit. Implementation proceeds by groups with /clear between phases. Verification is performed on the reviewer's machine; the reviewer discovers that tests pass but manual verification shows the form accepts an empty name (specification required "basic validation" but didn't specify details). This is recorded as a "weak fact" for retrospective. Before merge, reviewer checks for absence of secrets, finds the test API key in QWEN.md, removes it, and adds a rule to .gitignore and the project constitution.
Result: Feature merged with score 23/25. Lost points: (1) no explicit XSS test for message in validation.md (manual check found it but wasn't formalized), (2) roadmap.md updated but without priority for next feature. Retrospective showed: "What the agent had to deduce itself" — 2 items (specific length limit, error display format), which is acceptable. Team established rule: if more than 3 items — next specification is written in more detail. Project skill "feedback form specification" added to repository and used as template for new features.
Lessons learned:
Specification review before implementation saves 3-5 hours of code rework; reviewer-found gap with length limit prevented a production incident
/clear between phases revealed that first version validation.md was incomplete — agent couldn't use it as context, proving need for refinement
Paired format creates healthy tension: author strives for specificity, reviewer trains skill of giving constructive feedback across four layers
Verification on reviewer's machine, not on author's word, found Node.js version discrepancy that typecheck didn't catch — confirming importance of independent verification
Found API key in QWEN.md led to creating a defensive PreToolUse hook blocking commits with key patterns — security investment paid off immediately
Related concepts:
Source of truth in SDD
Paired exam and author-reviewer roles
validation.md and four verification elements
Secret security and defensive hooks
/clear between phases
SDD retrospective and 3-item threshold
Project skill vs personal skill
Name: Case: Rewriting Administrator Panel Specification — From Anti-Pattern to Example
Scenario: A student developer receives assignment to rewrite the control panel specification from course Block 2. Original specification is typical of "agile rush": beautiful words, subjective criteria, absence of boundaries. Student must apply SDD approach considering real constraints of the AgentClinic project.
Challenge: Student struggles to abandon habitual phrasing like "beautiful panel" and "useful statistics". Psychological resistance: it feels like specificity "kills creativity". Technical difficulty: need to account for absence of authentication in current phase, creating data leakage risk. Another problem: student wants to add Chart.js "just in case", though explicitly prohibited.
Solution: Student goes through checklist of 11 problems in original specification. For each problem formulates concrete replacement: "beautiful" → "basic layout within project's existing CSS approach", "useful statistics" → "four counters: agents, ailments, therapies, appointment bookings". Special attention to "Beyond Boundaries" section: explicit "no authentication" protects against temptation to add login "while hands are in code". "Decisions" section is cross-checked with project tech-stack.md — confirmed use of existing templating engine, no new library added. Validation.md includes mobile check, which student initially considered "unnecessary".
Result: Specification accepted by reviewer with one remark: no limit on query execution time specified for large data volumes (COUNT(*) may be slow on millions of records). Note added to "Context" about current database size and review condition upon growth. Student realizes: specificity doesn't kill creativity, but transfers it to project-level architectural decisions rather than "make up on the fly" level.
Lessons learned:
Every vague formulation is a potential conflict between author and implementer; specificity is an investment in predictability
"Beyond Boundaries" section is often more important than "Boundaries": it protects against scope creep and gives psychological permission to "not do now"
Mobile check, seeming "unnecessary" for admin panel, found real problem: table with 4 counters broke at 320px due to flex-wrap
Cross-check with tech-stack.md before finalizing decisions prevented adding Chart.js "just in case" — saving 2-3 days on unnecessary dependency integration
Related concepts:
Boundaries and beyond boundaries in specification
tech-stack.md as decision constraint
validation.md with manual and automated checks
Replanning and deferred facts
Specificity vs subjectivity in requirements
Study tips:
Complete blocks sequentially, without skipping: theory (block 1) → critical analysis (block 2) → construction (block 3) → full cycle (block 4). Each block prepares skills for the next
For block 1: write answers by hand or in a text editor without peeking, then verify. Rote memorization is useless — understanding the logic is important to adapt to new situations
For block 2: don't stop at 8 problems. The more you find, the deeper the understanding. Compare your findings with the course checklist — if you missed something important, return to the corresponding course part
For block 3: rewrite the specification twice: first independently, second — after reading the example. Compare: where is your version more precise, where is it redundant
For block 4: be sure to use /clear between phases. This is not a ritual but a functional check: if after /clear the agent cannot continue without your explanations — the specification is insufficient
Find a partner for paired exam even if studying alone. The reviewer role can be "played" yourself through a time gap: write specification, set aside for a day, then read as reviewer
Keep a personal SDD error journal: record what problems you missed, what anti-patterns you repeated. This will become your "agent memory" — but personal, project-based, versioned
For retrospective: be honest in the "What the agent had to deduce itself" section. If more than 3 items — this is not failure but a signal for next iteration. Hiding problems is costlier than acknowledging them
Practice explaining SDD approach to a colleague unfamiliar with the methodology. If you can't explain in 5 minutes — you don't fully understand it yourself
Use a timer: 30 minutes on specification, 60 on implementation, 30 on verification. Hard time limits simulate real pressure and teach prioritizing specificity
Before final exam: complete a trial run on an imaginary feature (e.g., "newsletter subscription"), without merging. This reduces anxiety and reveals weak points in the process
Additional resources:
Parts 1-21 of SDD course: Foundational material referenced by the practical exam. Especially important: part 6 (tech-stack.md and requirements.md boundaries), part 7 (clarification), part 12 (MVP and green dots), part 16 (four review layers), part 17 (hooks), part 18 (secrets), part 19 (agent memory), part 20 (anti-patterns)
QWEN.md (agent constitution template): Example file defining agent behavior rules. Use as basis for your project, but don't copy without adaptation
specs/tech-stack.md (example technical stack): Sample document with long-term project decisions. Note the structure and level of detail
CHANGELOG.md (change log template): Example format for recording changes by feature. Important for process score in exam
Git Flow / GitHub Flow: Branching workflow materials. SDD doesn't dictate a specific model, but requires feature isolation and specification commit before implementation
OWASP Cheat Sheet Series: Practical security recommendations. Especially the sections on input validation and secret storage, which relate directly to validation.md and exam security requirements
Cognitive load theory in software development: Understanding cognitive load helps realize why /clear and phase separation are needed — key SDD practices
Specification examples from open SDD projects: Real specs/ directories with requirements.md, plan.md, validation.md for comparison with your level of detail
Summary: The practical exam is not a rote memorization test but a demonstration of ability to carry a real feature through the full SDD cycle. Key skills: write concrete specifications with measurable criteria, use Qwen Code CLI as a tool with clear phases and /clear between them, perform four-layer review (code, specifications, facts, process), ensure security through secret isolation and defensive hooks. Successful completion (21+/25) indicates readiness for industrial application; result 16-20 requires improvement of verification and team loop; below 16 indicates need to reduce phase size and detail specifications. Paired exam format models real team dynamics and teaches both sides of SDD dialogue. Final retrospective with 3-item "self-deduced" threshold creates a continuous improvement mechanism. Main principle: SDD is not about more documentation, but about right documentation at the right time, so the agent doesn't guess but executes verifiable intents.
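The scoring thresholds from the summary reduce to simple arithmetic. The sketch below is an illustration under assumptions: the field names in `RubricScores` and both function names are hypothetical, and the per-dimension split of the 23-point example is invented to match a total, not taken from the case study.

```typescript
// Five rubric dimensions, each scored from 0 to 5.
interface RubricScores {
  specifications: number;
  implementation: number;
  verification: number;
  process: number;
  teamReadinessAndSecurity: number;
}

// Sums the five dimensions and rejects values outside 0..5.
function totalScore(scores: RubricScores): number {
  const parts = [
    scores.specifications,
    scores.implementation,
    scores.verification,
    scores.process,
    scores.teamReadinessAndSecurity,
  ];
  let total = 0;
  for (const part of parts) {
    if (part < 0 || part > 5) throw new RangeError("each dimension is scored from 0 to 5");
    total += part;
  }
  return total;
}

// Maps a total to the three outcomes described in the summary.
function interpretScore(total: number): string {
  if (total >= 21) return "ready for industrial SDD application";
  if (total >= 16) return "improve verification and the team loop";
  return "reduce phase size and detail the specifications";
}

// Hypothetical split that adds up to 23 out of 25.
const exampleScores: RubricScores = {
  specifications: 4,
  implementation: 5,
  verification: 5,
  process: 4,
  teamReadinessAndSecurity: 5,
};

const exampleTotal = totalScore(exampleScores);
console.log(exampleTotal, interpretScore(exampleTotal));
```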