Topic: Part 21. Conclusion and Working System
Difficulty level: Medium
Estimated study time: 4-6 hours (theory: 2 hours, practice: 2-4 hours)
Prerequisites: Basics of working with Git and branches
Understanding of basic CI/CD
Familiarity with LLM agents (Qwen Code, Claude Code, or equivalents)
Completion of parts 1-20 of the SDD course (preferred but not required)
Basic experience with Markdown and structuring documentation
Learning objectives: Form a complete SDD repository structure and explain the purpose of each file
Execute the operational cycle 'specification → implementation → verification → merge' using /clear between roles
Determine the SDD maturity level of your team on a scale of 0-4 and create a plan for transitioning to the next level
Create a minimal working SDD process for a real project from six files
Conduct a retrospective after 2-3 features and identify observations suitable for codification into agent rules
Overview: This concluding part of the SDD course turns theory into a working system. SDD is not a set of files for the sake of files, but a control infrastructure that helps humans keep sovereignty when an agent can change in minutes what used to take hours. The main skill is not writing maximally long prompts, but drawing the right boundaries between intent (specification), planning, implementation, and verification. This study guide walks you through the final repository structure, the operational cycle, decision boundaries, typical errors, the maturity scale, and practical adoption in a real team. The goal is to bring the team to level 2 using the didactic AgentClinic project and to provide tools for further growth.
Key concepts: Working SDD system: SDD is not documentation for the sake of documentation, but a living process that works only with discipline. Key formula: the specification holds long-term intent, facts decide whether a branch can be merged, the agent executes quickly, and the human is responsible for judgment. Or even shorter: specifications guide, facts permit merging.
Final repository structure: The complete structure includes: root files (AGENTS.md, QWEN.md, README.md, CHANGELOG.md, package.json, tsconfig.json); .qwen/ directory with settings, hooks, memory, and skills; specs/ directory with mission.md, tech-stack.md, roadmap.md, and feature specifications following the YYYY-MM-DD-feature-name/ template (requirements.md, plan.md, validation.md); source code in src/, tests in tests/, PR template in .github/pull_request_template.md.
Operational cycle: A clear sequence of actions: preparation (git checkout main, git pull, git status, npm test, npm run typecheck) → specification with feature-spec skill and /clear → commit specification → implementation with clear constraint 'only task group 1' and /clear → verification per validation.md listing confirmed, failed, missing, and ambiguous facts → update CHANGELOG with changelog skill → final checks and merge → replanning before the next feature.
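The cycle above can be sketched as an ordered list of stages. This is a minimal illustration, not part of Qwen Code or any other tool: the stage names, `nextStage`, and `isValidSequence` are hypothetical, and treating verification as a stage that starts with /clear is an assumption based on the rule "/clear between roles".

```typescript
// Stages of the operational cycle, in the order the text lists them.
// The names are illustrative labels, not commands of any tool.
const CYCLE = [
  "preparation",
  "specification",
  "commit-specification",
  "implementation",
  "verification",
  "changelog",
  "merge",
  "replanning",
] as const;

type Stage = (typeof CYCLE)[number];

// Stages that begin a new agent role and therefore start with /clear.
// Assumption: verification is a separate role, so it is included here.
const STARTS_WITH_CLEAR: ReadonlySet<Stage> = new Set<Stage>([
  "specification",
  "implementation",
  "verification",
]);

// Returns the stage that follows `current`. Replanning wraps around to
// preparation because the cycle restarts with the next feature.
function nextStage(current: Stage): Stage {
  const index = CYCLE.indexOf(current);
  return CYCLE[(index + 1) % CYCLE.length];
}

// True when the completed stages follow the cycle from the start without skips.
function isValidSequence(done: Stage[]): boolean {
  return done.every((stage, i) => stage === CYCLE[i]);
}
```

For example, `isValidSequence(["preparation", "implementation"])` is false, because the specification and its commit were skipped.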
Decision boundaries: Each file stores its type of decisions: mission.md — audience, product meaning, tone, success definition; tech-stack.md — language, environment, framework, database, testing, deployment constraints, forbidden technologies; roadmap.md — phase order, statuses, deliverables, next step; feature specification — boundaries, out of scope, task groups, validation.md facts, verification commands, fact statuses, manual checks; QWEN.md/AGENTS.md — behavior rules, verification commands, prohibition on speculative refactorings, requirement for specification before code.
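One way to make these boundaries checkable is a lookup from decision topic to owning file. The sketch below only transcribes the list above; the lowercase topic strings and the `ownersOf` helper are hypothetical names chosen for illustration.

```typescript
// Decision boundaries transcribed from the text: each file owns a set of topics.
// The lowercase topic strings are hypothetical keys chosen for this sketch.
const DECISION_OWNERS: Record<string, string[]> = {
  "specs/mission.md": ["audience", "product meaning", "tone", "success definition"],
  "specs/tech-stack.md": [
    "language", "environment", "framework", "database",
    "testing", "deployment constraints", "forbidden technologies",
  ],
  "specs/roadmap.md": ["phase order", "statuses", "deliverables", "next step"],
  "feature specification": [
    "boundaries", "out of scope", "task groups", "validation facts",
    "verification commands", "fact statuses", "manual checks",
  ],
  "QWEN.md / AGENTS.md": [
    "behavior rules", "verification commands",
    "no speculative refactoring", "specification before code",
  ],
};

// Returns every file that owns the given topic.
// An empty array means the topic is not covered by any listed file.
function ownersOf(topic: string): string[] {
  const wanted = topic.trim().toLowerCase();
  return Object.entries(DECISION_OWNERS)
    .filter(([, topics]) => topics.includes(wanted))
    .map(([file]) => file);
}
```

Here `ownersOf("database")` yields only `specs/tech-stack.md`, while `ownersOf("verification commands")` yields two owners, because the text lists verification commands under both the feature specification and the agent rules files.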
SDD maturity scale: A five-level scale from 0 to 4. Level 0: vibe coding (one long chat, decisions live in session history). Level 1: specifications are optional, validation.md reads as a wish list, /clear is forgotten. Level 2: SDD is the default standard, with executable facts, /clear as a habit, and replanning between features. Level 3: a codified process with skills, guard hooks, four-layer review, and anti-pattern recognition. Level 4: a learnable process in which observations become rules, agent memory is consciously managed, and agent replaceability is verified. The study guide's goal is level 2.
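The scale can be turned into a rough self-check. The sketch below assumes that a level counts as reached only when all of its indicators hold and every lower level is reached too; that cumulative rule, the `TeamPractices` field names, and `maturityLevel` are assumptions of this example, not definitions from the course.

```typescript
// Indicators paraphrased from the maturity scale; field names are hypothetical.
interface TeamPractices {
  someSpecsExist: boolean;              // level 1: specifications are written at least sometimes
  specsAreDefault: boolean;             // level 2
  validationFactsExecutable: boolean;   // level 2
  clearIsHabit: boolean;                // level 2
  replanBetweenFeatures: boolean;       // level 2
  skillsAndGuardHooks: boolean;         // level 3
  fourLayerReview: boolean;             // level 3
  observationsBecomeRules: boolean;     // level 4
  agentReplaceabilityVerified: boolean; // level 4
}

type MaturityLevel = 0 | 1 | 2 | 3 | 4;

// Assumption: levels are cumulative, so the first unmet level stops the climb.
function maturityLevel(p: TeamPractices): MaturityLevel {
  if (!p.someSpecsExist) return 0;
  const level2 =
    p.specsAreDefault && p.validationFactsExecutable &&
    p.clearIsHabit && p.replanBetweenFeatures;
  if (!level2) return 1;
  const level3 = p.skillsAndGuardHooks && p.fourLayerReview;
  if (!level3) return 2;
  const level4 = p.observationsBecomeRules && p.agentReplaceabilityVerified;
  return level4 ? 4 : 3;
}
```

A team with guard hooks but without /clear as a habit still scores 1 under this rule, which mirrors the warning against automating before the basics are stable.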
Minimal SDD template: To start, six files are enough: specs/mission.md, specs/tech-stack.md, specs/roadmap.md, and specs/YYYY-MM-DD-feature/ with requirements.md, plan.md, and validation.md. Don't wait for the perfect framework: start with the minimum and grow based on observations.
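The template paths can be generated rather than typed by hand. This is a small sketch under stated assumptions: `minimalTemplate` and `missingFiles` are hypothetical helpers, and the list of existing paths is passed in as plain strings so the example stays independent of any file-system API.

```typescript
// Project-level files and per-feature files named in the minimal template.
const PROJECT_FILES = ["specs/mission.md", "specs/tech-stack.md", "specs/roadmap.md"];
const FEATURE_FILES = ["requirements.md", "plan.md", "validation.md"];

// Builds the full list of template paths for one feature.
// `date` must follow the YYYY-MM-DD convention used for feature directories.
function minimalTemplate(date: string, feature: string): string[] {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(date)) {
    throw new Error(`date must be YYYY-MM-DD, got "${date}"`);
  }
  const dir = `specs/${date}-${feature}`;
  return [...PROJECT_FILES, ...FEATURE_FILES.map((file) => `${dir}/${file}`)];
}

// Returns the template paths that are absent from `existing`
// (a list of repository-relative paths gathered by whatever means you prefer).
function missingFiles(existing: string[], date: string, feature: string): string[] {
  const present = new Set(existing);
  return minimalTemplate(date, feature).filter((path) => !present.has(path));
}
```

Calling `minimalTemplate("2026-01-15", "auth-module")` produces the paths used in Exercise 1, including `specs/2026-01-15-auth-module/validation.md`.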
Codification of observations: The key mechanism for growth from level 2 to 3: when a recurring agent error is identified, someone calmly proposes 'let's add a rule to QWEN.md', and that rule actually appears. A retrospective after 2-3 features analyzes: where the agent guessed, which specifications were useless, which checks caught real errors, which items were wishes rather than facts, what should be automated.
Agent replaceability: A principle verified at level 4: the team changed tools (Qwen → Claude → another) and didn't lose the process. Achieved through clear documentation of intents and facts, independent of any specific agent.
Practice exercises: Name: Exercise 1: Creating a Minimal SDD Skeleton
Problem: Take your current project (or create a new repository). Create a minimal SDD template from six files: specs/mission.md, specs/tech-stack.md, specs/roadmap.md, specs/2026-01-15-auth-module/requirements.md, specs/2026-01-15-auth-module/plan.md, specs/2026-01-15-auth-module/validation.md. Fill in each file according to the decision boundary rules. Then execute the operational cycle through the specification stage (inclusive): branch preparation, /clear, request to the agent with the feature-spec skill, specification commit.
Solution: Step 1: Create a specs/ directory in the project root.
Step 2: In mission.md write the audience (e.g., 'internal developers at a medical clinic'), product meaning ('patient appointment automation'), tone ('business-like, no emojis'), and success definition ('appointment time reduced from 10 to 2 minutes').
Step 3: In tech-stack.md lock in TypeScript, Node.js 20, Hono, SQLite, Vitest, and Docker, and mark MongoDB as forbidden.
Step 4: In roadmap.md describe 3 phases: 1) infrastructure (status: done), 2) authentication (status: in progress), 3) scheduling (status: planned).
Step 5: For the auth-module feature create requirements.md with boundaries ('includes: JWT login, refresh tokens; excludes: OAuth, 2FA'), plan.md with 3 task groups, and validation.md with 5 facts and their verification commands.
Step 6: Run git checkout -b feat/auth-spec, git add specs/, git commit -m 'Add auth-module spec'.
Verification: the repository structure matches the minimal SDD template.
Complexity: beginner
Name: Exercise 2: Diagnosing Team Maturity Level
Problem: Analyze the current development process in your team (or in a hypothetical team of 3 developers using Cursor). On the 0-4 scale, determine the current SDD maturity level. For each transition indicator between levels (0→1, 1→2, 2→3, 3→4), specify concrete facts from your practice that confirm or refute achieving the level. Create a plan of 3 concrete steps for transitioning to the next level with an implementation time estimate.
Solution: Step 1: Create a table with 5 columns: Level, Indicators, Facts from Practice, Confirmed?, Evidence.
Step 2: Fill in the rows. Level 0: 'One long chat with agent' → 'Developer Ivan keeps a 200-message chat in Cursor, decisions undocumented' → Confirmed → screenshot of history. Level 1: 'Specifications optional' → 'For large features we write READMEs, for small ones nothing; validation.md is absent' → Partially confirmed. Level 2: 'SDD as default standard' → 'No, new features start with a request to "write code"' → Not confirmed.
Step 3: Current level: 1 (the level 2 indicators are not yet met).
Step 4: Transition plan to level 2: (a) Introduce the rule 'No multi-file feature without requirements.md, plan.md, validation.md' (1 week); (b) Add /clear between roles to the prompt template (3 days); (c) Conduct the first retrospective after 2 features (2 weeks). Total: about 4 weeks to a stable level 2.
Complexity: intermediate
Name: Exercise 3: Verification Simulation with Rejection
Problem: An agent implemented the 'hello-hono' feature (Hono API with GET /health endpoint). Tests pass, but verification per validation.md reveals discrepancies: (1) specification stated port 3000, agent used 8080; (2) specification required X-Request-ID header, agent didn't add it; (3) agent tests only check 200 OK, while validation.md requires checking response structure {status: 'ok', timestamp: ISO8601}. Compose a verification report in the format 'confirmed facts, failed facts, missing facts, ambiguous items'. Determine whether the branch can be merged, and describe next steps.
Solution: Verification report.
Confirmed facts: (a) server starts without errors (npm run dev executes); (b) GET /health endpoint exists; (c) HTTP 200 is returned.
Failed facts: (a) server port is 8080 instead of 3000 (validation.md requirement §3.1); (b) X-Request-ID header is missing (requirement §4.2); (c) response structure does not match {status: 'ok', timestamp: ISO8601}, the endpoint returns plain 'OK' (requirement §5.1).
Missing facts: (a) graceful shutdown not verified (§6.1, test absent from implementation); (b) SIGTERM handling not checked (§6.2).
Ambiguous items: (a) 'server should be production-ready': criterion undefined; (b) 'endpoint documentation': unclear whether Swagger or README is meant.
Merge decision: The branch MUST NOT be merged. Failed facts §3.1, §4.2, and §5.1 are mandatory (status: required in validation.md).
Next steps: (1) Return the branch to the agent with the report and a requirement to fix the failed facts; (2) Clarify the ambiguous items in the specification; (3) After the fixes, re-verify only the failed and missing facts; (4) Update QWEN.md with the rule: 'Always take server port from specs/tech-stack.md or feature specification, never use framework default values'.
Complexity: intermediate
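The report format from Exercise 3 can be modeled as data, which makes the merge decision mechanical. The sketch below is illustrative only: the `Fact` shape, `groupByStatus`, and `canMerge` are hypothetical, and the merge rule (every required fact must be confirmed) is an assumption consistent with the exercise's decision. The exercise does not say whether §6.1 is required, so the `false` value there is a placeholder.

```typescript
type FactStatus = "confirmed" | "failed" | "missing" | "ambiguous";

interface Fact {
  id: string;          // section reference in validation.md, e.g. "§3.1"
  description: string;
  required: boolean;
  status: FactStatus;
}

// Groups facts into the four report sections.
function groupByStatus(facts: Fact[]): Record<FactStatus, Fact[]> {
  const report: Record<FactStatus, Fact[]> = {
    confirmed: [], failed: [], missing: [], ambiguous: [],
  };
  for (const fact of facts) report[fact.status].push(fact);
  return report;
}

// Assumption for this sketch: merging is allowed only when every required fact is confirmed.
function canMerge(facts: Fact[]): boolean {
  return facts.every((fact) => !fact.required || fact.status === "confirmed");
}

// Facts taken from the hello-hono exercise.
const helloHonoFacts: Fact[] = [
  { id: "§3.1", description: "server listens on port 3000", required: true, status: "failed" },
  { id: "§4.2", description: "X-Request-ID header is present", required: true, status: "failed" },
  { id: "§5.1", description: "response matches {status, timestamp}", required: true, status: "failed" },
  // Placeholder: the exercise does not state whether §6.1 is required.
  { id: "§6.1", description: "graceful shutdown is verified", required: false, status: "missing" },
];
```

With this data `canMerge(helloHonoFacts)` is false, and `groupByStatus(helloHonoFacts).failed` holds the three facts that go back to the agent.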
Name: Exercise 4: Codification After Retrospective
Problem: After 3 features in the AgentClinic team, the following observations were collected: (a) agent did 'speculative refactoring' 2 out of 3 times — changed directory structure without request; (b) validation.md for the 'agents-ailments' feature was completely copied from previous feature and didn't match real requirements; (c) team forgot /clear between specification and implementation, agent confused contexts; (d) feature-spec skill was created before the team understood their process, and contains outdated instructions. Turn these observations into 4 concrete rules for QWEN.md and describe the process for updating the feature-spec skill.
Solution: Rules for QWEN.md:
(1) 'FORBIDDEN: changing directory structure, file names, or dependencies without an explicit specification. If you see a need for refactoring, stop, report it in the summary, and wait for confirmation.'
(2) 'Before using validation.md, verify that it matches the current feature: dates, entity names, and verification commands must be specific, not copied.'
(3) 'Mandatory sequence: /clear before every role change (specifier → developer → verifier). Chat context does not transfer between roles.'
(4) 'Skills are living artifacts. When you find an outdated instruction, report it immediately and do not follow it blindly.'
Updating the feature-spec skill:
Step 1: Create the task 'Update feature-spec after retrospective 2026-05-15'.
Step 2: Edit .qwen/skills/feature-spec/SKILL.md: add an 'Anti-patterns' section with 3 examples from practice, update the validation.md template with a reminder about specificity, and add a 'Before creating specification' checklist.
Step 3: Test the updated skill on one small feature.
Step 4: If the agent asks fewer clarifying questions, the skill has improved; record the version in CHANGELOG.
Complexity: advanced
Case studies: Name: Case: MedFlow Startup Migration from Vibe Coding to SDD Level 2
Scenario: MedFlow is a telemedicine startup with 4 developers who actively use Cursor with Composer. Over 6 months the team accumulated 15k chat messages, production broke 3 times because of agent 'optimizations', and documentation was outdated the moment it was written. The CTO decided to implement SDD after one more incident: while 'optimizing' the schema, an agent deleted the appointments table because a different structure had been discussed 200 messages earlier in the chat.
Challenge: (1) Developers resisted: 'This will slow us down, we already move fast'. (2) There was no unified product understanding: each developer had their own vision. (3) Work with agents relied on long chats in which the agent 'guessed' intentions. (4) The team had to preserve feature delivery speed while increasing reliability.
Solution: Week 1: The team introduced a single hard rule: 'No multi-file feature without requirements.md, plan.md, validation.md'. It created specs/mission.md (unified vision: 'platform for asynchronous consultations, not synchronous calls') and specs/tech-stack.md (Firebase forbidden, PostgreSQL locked in).
Weeks 2-3: The team added a short AGENTS.md with 5 rules, including a prohibition on speculative refactorings. The feature-spec skill was created only after 2 specifications had been written manually, not before.
Week 4: /clear between roles became mandatory; developers check each other on this during pair programming.
Week 6: The first retrospective showed that 40% of validation.md items were wishes rather than verifiable facts. The criteria were clarified.
Week 8: Replaceability was verified: one developer switched from Cursor to Claude Code for one feature, and the process worked without losses.
Result: After 3 months, time from idea to production dropped from 14 to 9 days (counterintuitive, but explained by fewer rollbacks and bugs), agent-caused incidents dropped from 3 per month to 0, and a new developer was onboarded in 3 days instead of 2 weeks by reading specs/ rather than chat history. The team reached level 2 and plans to move to level 3 through hook automation.
Lessons learned: Start not with a perfect process, but with one unbreakable rule — this is the only way to overcome resistance
Agent skills cannot be created before understanding the process: first 2 specifications must be manual to see patterns
Retrospective after 2-3 features is critical: without it validation.md remains a wish list, and the process — a formality
Agent replaceability is not an abstract goal but a practical check: try another tool for one feature and see documentation gaps
Related concepts: Minimal SDD template
SDD maturity scale
Operational cycle
Codification of observations
Agent replaceability
Name: Case: BigTech Corporate Team Stuck at Level 3 Due to Premature Automation
Scenario: A team of 12 developers on a large e-commerce project moved quickly from level 1 to level 3: it created 15 skills, 8 guard hooks, and automatic review. After 4 months the process choked: hooks falsely rejected 60% of commits, skills contradicted each other, and developers bypassed the system through 'vibe coding in personal branches'.
Challenge: (1) Automation outpaced understanding: hooks encoded assumptions, not verified patterns. (2) Skills multiplied without centralized ownership: each developer added 'their' skill. (3) The process became the end goal: 'maturity level' metrics mattered more than real efficiency. (4) The retrospective was forgotten: the team did not analyze which rules worked and which did not.
Solution: Radical simplification. The team disabled all hooks except 2, deleted 10 of the 15 skills, and kept feature-spec and changelog as the core. It introduced a monthly audit: every rule in QWEN.md must reference a concrete incident it prevented. Manual checks returned for 20% of features, chosen as a random sample to validate the automation. The retrospective became mandatory before every sprint planning, not just between features.
Result: After 2 months, hook false positives dropped from 60% to 8%, the 'specification → merge' cycle time fell by 35%, and developers returned to the main process. The team deliberately returned to level 2 with elements of level 3 and plans to return to level 3 after 6 months of observations.
Lessons learned: Levels 3 and 4 should not be the end goal: premature automation is worse than no automation
Every rule in QWEN.md must have a 'pedigree' — a concrete incident, otherwise it's speculation
Random manual check of 20% of automated processes — a necessary quality control
Maturity scale — a diagnostic tool, not a KPI metric: use it for understanding, not for reporting
Related concepts: SDD maturity scale
Codification of observations
Most common errors
Decision boundaries
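The monthly audit from the BigTech case ("every rule must reference a concrete incident") can be sketched as a filter over the rule list. The `AgentRule` shape and `rulesWithoutPedigree` are hypothetical; the sketch assumes rules are kept in a structured form alongside QWEN.md, which the case itself does not specify.

```typescript
// A rule as it might be tracked for the audit; the shape is an assumption of this sketch.
interface AgentRule {
  text: string;
  incident?: string; // reference to the concrete incident that motivated the rule
}

// Returns the rules that have no incident reference, or only a blank one.
// These are the candidates for removal or justification at the audit.
function rulesWithoutPedigree(rules: AgentRule[]): AgentRule[] {
  return rules.filter((rule) => !rule.incident || rule.incident.trim() === "");
}
```

A rule such as `{ text: "Prefer tabs" }` would be flagged, while a rule carrying an incident reference passes the audit.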
Study tips: Practice /clear physically: open Qwen Code, execute the command, and verify that the chat context is actually reset. It is easy to forget this step, and then the agent confuses roles.
Create an SDD 'template repository' on GitHub and clone it for each new project: this lowers the starting threshold and prevents 'blank page paralysis'.
Conduct a 'boundary audit' once a week: open mission.md, tech-stack.md, roadmap.md and check if they contradict each other. Inconsistency among these three files is the main cause of useless specifications.
For visual learners: draw the operational cycle on A3 paper and hang it next to your monitor. SDD works only when the sequence is followed literally, without skips.
Practice 'red team': ask a colleague to intentionally break your specification — skip validation.md, ignore /clear, do refactoring outside the plan. This reveals blind spots in QWEN.md.
Keep an 'agent incident journal': record date, prompt, unexpected behavior, added rule. After 10 entries you'll see patterns suitable for codification.
For group study: discuss the MedFlow case with your team, then run a role-play with one 'agent' (who cannot see the source documents), one 'specifier', and one 'verifier'. Feel firsthand where the process breaks down.
Don't read this part 'in one sitting'. Stop after each section and do micro-practice: create one file, verify one hypothesis, give the agent one test prompt.
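The incident journal suggested in the tips above lends itself to a simple frequency count. The sketch below is one possible shape: the `Incident` fields follow the tip (date, prompt, unexpected behavior, added rule), while `recurringBehaviors` and its default threshold of 2 are assumptions made for illustration.

```typescript
// One journal entry; fields follow the study tip, names are hypothetical.
interface Incident {
  date: string;       // e.g. "2026-05-15"
  prompt: string;
  behavior: string;   // short label for the unexpected behavior
  ruleAdded?: string; // rule added to QWEN.md in response, if any
}

// Returns [behavior, count] pairs for behaviors seen at least `threshold` times,
// most frequent first. Labels are compared case-insensitively after trimming.
function recurringBehaviors(journal: Incident[], threshold = 2): Array<[string, number]> {
  const counts = new Map<string, number>();
  for (const entry of journal) {
    const key = entry.behavior.trim().toLowerCase();
    counts.set(key, (counts.get(key) ?? 0) + 1);
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count >= threshold)
    .sort((a, b) => b[1] - a[1]);
}
```

Behaviors returned by this function are the candidates for codification; one-off entries stay in the journal until they repeat.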
Additional resources: AgentClinic repository (structure example): Study the final structure from course materials as a benchmark for your projects
Minimal SDD template: specs/mission.md + specs/tech-stack.md + specs/roadmap.md + specs/YYYY-MM-DD-feature/(requirements.md, plan.md, validation.md)
Qwen Code skills documentation: https://github.com/QwenLM/Qwen2.5-Coder (official examples of .qwen/skills/ structure)
SDD course part 16 (four-layer review): Course materials, for preparation for level 3
SDD course part 20 (anti-patterns): Course materials, for recognizing typical errors
SDD course part 10 (codification): Course materials, for the mechanism of turning observations into rules
SDD course part 15 (agent replaceability): Course materials, for verifying independence from a specific tool
Book 'Working Effectively with Legacy Code' (Michael Feathers): A classic about change boundaries, applicable to decision boundaries in SDD
Course 'Prompt Engineering for Developers' (DeepLearning.AI): For understanding the role of /clear and context separation
Summary: SDD is a working control system, not bureaucracy. Its core is four boundaries: specifications hold intent, facts permit merging, the agent executes quickly, and the human is responsible for judgment. Start with the minimum: six files and one unbreakable rule. Reach level 2 through /clear discipline, executable facts in validation.md, and replanning between features. Levels 3 and 4 will follow if you regularly conduct retrospectives and codify observations, but don't make them the end goal. Verify your path with a final test: ask the agent to act without chat context, only from files. If it asks good questions, you're on the right path.