Reading: Applied Volume. Production SDD for Qwen Code CLI

This directory is the second, applied volume of the textbook. The first volume in book/ teaches the basic SDD cycle on AgentClinic: constitution, feature specification, plan, verifiable facts, implementation, review, and replanning. The second volume transfers the same cycle into production scenarios: legacy traces, validators, multi-agent checks, Spec CI, metrics, model budgets, and limited auto-remediation.

Version: v1.0 — verified 2026-05-20. See CHANGELOG.md for revision history.

This material is not a first introduction to SDD. Before reading, you need to understand requirements.md, plan.md, validation.md, QWEN.md, feature boundaries, negative requirements, and fact-based verification. If these terms are not yet part of your working vocabulary, work through the first volume first.

The main rule of the second volume: the first pass should leave one small verifiable trace, not introduce all production terminology at once. In each chapter, first close the educational minimum: one artifact, one command, or one blocker for capstone/. The main graded case is high_memory_usage; rules for transferring local cases into the main one are described in Part 0.

Quick Start

1. Open Part 0 and pick the main case, high_memory_usage.
2. Create an empty capstone/.
3. In chapters 1–3, fill in genealogy.md, a poisoned/fixed pair, and constitution.md.
4. In chapters 4–11, complete only the "Minimal Educational Scenario" section and the runnable commands from [examples/](examples/). If a chapter uses a different case (autoscale_200pct, node_not_ready, appointment_latency / appointment_latency_spike, cdn_error_budget_burn), write one transfer line: which principle from that case protects the main high_memory_usage.
5. In chapter 12, check the package against antipatterns.
6. In chapter 13, assemble the final capstone/README.md and verify that it can be understood without chat history.

Minimal check of examples, including expected blockers:

bash book2/examples/smoke_all.sh

How to Read the Chapters

Chapters 1–12 should be read at the same pace. At the beginning of each chapter, first find the short "Before Reading" block: it answers what the chapter takes from the first volume, which local case it launches, what gets transferred to capstone/, and what belongs to the full track.

Then keep five questions in mind:

Foundation from the first volume. Which AgentClinic idea is being extended.
Minimal educational scenario. What to do by hand or run locally.
Control fact. What proves the chapter has been completed.
How this gets into capstone/. Which line or file remains after the chapter.
Full track. What will only be needed when deploying to a real production repository.

If a chapter feels dense, don't read it linearly. First complete the minimal scenario, then return to "Key Ideas," and only after that look at calibrations, [project script], and [conceptual interface]. A term that doesn't help fill the current capstone/ file can be skipped until the second pass.

The editorial rule of the second volume: on the first pass, a new chapter should add no more than one new mandatory term to your working vocabulary. If you encounter five more names, but they aren't needed for the current capstone/ file, treat them as reference material and return to them after the minimal scenario.

Practical test for a chapter: after the minimal scenario, the reader should be able to write one line in one capstone/ file. If two new mechanisms need to be understood at once for this, one of them belongs to the second pass or the full track.

Status Labels and Commands

Chapters use the same confidence levels as the first volume:

Standard — fixed behavior of a tool or established practice.
Recommendation — practice that works in most cases but allows adaptation.
Frontier — approach is in use, but the form depends on the team, models, and infrastructure.

Command blocks are divided into three types:

[runnable] — works locally in [book2/examples/](examples/) without external dependencies.
[project script] — interface of a script that needs to be implemented in your own project.
[conceptual interface] — form of a future orchestrator, policy gate, MCP layer, or CI integration.

For educational completion, only [runnable] blocks and manual artifacts are needed. Everything else belongs to the full track.

End-to-End Route

| Chapters | What to do on first pass | What to defer |
| --- | --- | --- |
| 0 | understand AgentClinic-production, choose `high_memory_usage`, create empty `capstone/` | adaptation to your own production domain |
| 1–3 | recover one requirement, show one defect, formalize `constitution.md` | automatic proof normalizers and rule referendums |
| 4–5 | get a counterexample and smoke result from stress mutator | permanent duel and mutation factory in CI |
| 6–7 | accept/reject shadow candidate, run Spec CI | full scorebook, scope-gate, and PR reports |
| 8–9 | assemble `judgment.md`, simulate cheap tier refusal | separate budget service and arbitration orchestrator |
| 10–11 | check guard metrics, readiness and dry-run for `high_memory_usage` | GitOps deploy and automatic remediation without manual confirmation |
| 12 | record three risks `blocker / owner / next_check` | turning each antipattern into a CI policy |
| 13 | assemble final proof package | production-ready implementation of the entire process |

Mandatory Artifacts for First Pass

Track only these files. Other terms can be read later, once the main package already reads as a single case.

genealogy.md — where the requirement came from.
poisoned-spec.md / fixed-spec.md — which defect was found and how it was fixed.
constitution.md — which actions are forbidden to the agent or permitted with limitations.
validation.md — which facts were actually verified.
judgment.md — what verdict was rendered and on what evidence.
budget-note.md — what happens when the cheap tier refuses.
goodhart-note.md — which metric may start lying and which guard metric constrains it.
readiness.md — why the contour is admitted, blocked, or sent to semi-manual mode.
antipattern-audit.md — three risks in the form blocker / owner / next_check after completing chapter 12.
capstone/README.md — final assembly of the package for one case.

Chapter 6 adds a short Shadow notes block to capstone/README.md (or to QWEN.md, if your educational repository uses it). This is not a separate file in the main list.

Other names (scorebook, metric_network, decision_hash, precedents.md) belong to the full track unless they directly help fill one of the files above.

Each chapter must provide a minimal final fragment for one of these files. If after a chapter you have only general understanding but no line, command, or blocker for capstone/, the chapter is not yet closed at the educational level.
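The closure rule above can be checked mechanically for at least one artifact. The sketch below validates antipattern-audit.md, which should hold three risks in the form blocker / owner / next_check. The single-line, slash-separated layout and the names `Risk`, `parse_audit`, and `audit_is_closed` are assumptions made for illustration; the volume does not prescribe this exact file syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Risk:
    blocker: str
    owner: str
    next_check: str


def parse_audit(text: str) -> list[Risk]:
    """Parse lines of the ASSUMED form 'blocker / owner / next_check'."""
    risks = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and markdown headings
        # Drop an optional list marker, then split on the ' / ' separator.
        parts = [p.strip() for p in line.lstrip("-* ").split(" / ")]
        if len(parts) != 3 or not all(parts):
            raise ValueError(
                f"expected 'blocker / owner / next_check', got: {line!r}"
            )
        risks.append(Risk(*parts))
    return risks


def audit_is_closed(text: str) -> bool:
    """The first pass asks for three risks; this sketch requires exactly three."""
    return len(parse_audit(text)) == 3
```

Calling `audit_is_closed(Path("capstone/antipattern-audit.md").read_text())` (with `pathlib.Path` imported) would then answer whether chapter 12 left a usable fragment. A line with a missing field raises `ValueError` instead of being silently accepted.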

Cross-map of "which chapter writes which capstone/ file":

| `capstone/` file | Chapter that opens it | Chapters that supplement it |
| --- | --- | --- |
| `genealogy.md` | 1 | 13 (final assembly) |
| `poisoned-spec.md` / `fixed-spec.md` | 2 | 13 |
| `constitution.md` | 3 | 12 (mutable-rule antipatterns), 13 |
| `validation.md`: happy/negative + counterexample | 4 | 5 (mutants), 7 (Spec CI), 13 |
| `validation.md`: mutation immunity | 5 | 13 |
| `Shadow notes` block in `capstone/README.md` | 6 | 13 |
| `validation.md`: Spec CI line | 7 | 13 |
| `judgment.md` | 8 | 12 (arbitration antipatterns), 13 |
| `budget-note.md` | 9 | 13 |
| `goodhart-note.md` | 10 | 13 |
| `readiness.md` | 11 | 13 |
| `antipattern-audit.md` | 12 | 13 |
| `capstone/README.md`: assembly | 13 | none |
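The cross-map can also be read as data: given the last chapter you closed, it tells you which files should already exist in capstone/. The following is a minimal sketch under that reading. The function names and the choice to count capstone/README.md only from chapter 13 are assumptions (chapter 6 may touch it earlier through the Shadow notes block); this is not one of the scripts shipped in examples/.

```python
from pathlib import Path

# File name -> chapter that opens it, copied from the cross-map above.
# README.md is counted from chapter 13 (final assembly); this is an
# assumption, since the chapter 6 Shadow notes may live in QWEN.md instead.
OPENED_IN = {
    "genealogy.md": 1,
    "poisoned-spec.md": 2,
    "fixed-spec.md": 2,
    "constitution.md": 3,
    "validation.md": 4,
    "judgment.md": 8,
    "budget-note.md": 9,
    "goodhart-note.md": 10,
    "readiness.md": 11,
    "antipattern-audit.md": 12,
    "README.md": 13,
}


def expected_files(last_closed_chapter: int) -> list[str]:
    """Files that should have been opened by the given chapter."""
    return sorted(
        name for name, chapter in OPENED_IN.items()
        if chapter <= last_closed_chapter
    )


def missing_files(capstone: Path, last_closed_chapter: int) -> list[str]:
    """Expected files that are not present in the capstone directory."""
    return [
        name for name in expected_files(last_closed_chapter)
        if not (capstone / name).is_file()
    ]
```

For example, `missing_files(Path("capstone"), 3)` lists whichever of genealogy.md, the poisoned/fixed pair, and constitution.md are still absent after chapters 1–3. The check only confirms that a file exists; whether it contains a usable line is still a manual judgment.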

Before self-grading, open [examples/templates/capstone-dossier.md](examples/templates/capstone-dossier.md). This is a completed benchmark of the minimal package for high_memory_usage: it shows how short a good first pass can be.

Chapter Map

| Chapter | Foundation from first volume | Minimal output |
| --- | --- | --- |
| 0. AgentClinic-production Lab | final project structure and practical exam | chosen case, empty `capstone/`, smoke command |
| 1. Recovering Specifications from Legacy | supporting an existing project | one entry in `genealogy.md` |
| 2. Diagnosing Specification Defects | negative requirements and facts | poisoned/fixed pair |
| 3. Project Constitution | `mission.md`, `tech-stack.md`, `roadmap.md`, `QWEN.md` | two immutable rules and one mutable rule |
| 4. LLM Duel | separate verification session | one counterexample or `next_guard` |
| 5. Mutation Testing of Specifications | negative path and counterexamples | stress mutator result |
| 6. Shadow Specification Selection | project memory and few-shot | one accepted and one rejected candidate |
| 7. Specification CI | link `requirements.md → plan.md → validation.md` | Spec CI line with PASS/BLOCK |
| 8. File Arbitration of Disputed Change | independent review | `judgment.md` with `evidence_ref` |
| 9. Tiered Budgets and Token Budgets | choosing model by task risk | budget risk and `token_health` |
| 10. Protecting Metrics from Goodhart | facts instead of persuasive prose | KPI and guard metric |
| 11. Production API | feature boundaries, rollback, manual check | readiness and dry-run |
| 12. Production SDD Antipatterns | SDD antipatterns | three diagnostic risks |
| 13. Practical Exam | full SDD cycle | final `capstone/` package |

The full AgentClinic domain map is in Appendix A. Qwen Code command compatibility is described in Appendix B. Checklists are collected in Appendix C.

Why the Case Changes from Chapter to Chapter

The main graded case is high_memory_usage. But chapters 1–10 use different incidents, because no single case demonstrates every mechanism equally well: a priority conflict may be easier to see in another domain, and some chapters need a mutation history that high_memory_usage doesn't have. Using one case for the entire volume would turn every template into a formality.

The transfer rule is simple: after the chapter, write one line stating which principle from that case protects your high_memory_usage.

| Chapter | Chapter case | What transfers to `high_memory_usage` |
| --- | --- | --- |
| 1 | `node_not_ready` | technique for recovering a requirement from post-mortem and provenance |
| 2 | `appointment_latency` | one controlled priority conflict and reverse run |
| 3 | `node_not_ready` | immutable principle and one mutable rule with `ttl` and `rollback_condition` |
| 4 | `autoscale_200pct` | minimal counterexample and `next_guard` for violated Then |
| 5 | `payment_latency_spike` | smoke mutator result and validator immunity vector |
| 6 | `shadow.p0.voice_handoff` | one accepted and one rejected shadow candidate |
| 7 | `incident payload` | Spec CI line with PASS on coverage and BLOCK on schema |
| 8 | `autoscale_200pct` | `judgment.md` with `verdict`, `evidence_ref`, and Safety role |
| 9 | `autoscale_200pct` | budget risk, `token_health`, and cheap tier refusal scenario |
| 10 | `cdn_error_budget_burn` | paired anti-Goodhart metric to remediation KPI |
| 11 | `high_memory_usage` | readiness 23/25 and dry-run for main case |
| 12 | any package from chapters 8–11 | three lines `blocker / owner / next_check` |
| 13 | `high_memory_usage` | assembly of all artifacts into unified `capstone/` |

If a chapter case doesn't transfer in one line, the chapter has been read but not closed.
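The transfer rule lends itself to a small self-check. The sketch below takes the chapter cases from the table above and reports chapters that use a non-main case but have no single-line transfer note. How you store the notes is up to you; the `transfer_lines` dictionary and the function name `unclosed_chapters` are hypothetical, and chapter 12 is left out because its case is "any package from chapters 8–11".

```python
MAIN_CASE = "high_memory_usage"

# Chapter -> local case, copied from the table above (chapter 12 omitted).
CHAPTER_CASE = {
    1: "node_not_ready",
    2: "appointment_latency",
    3: "node_not_ready",
    4: "autoscale_200pct",
    5: "payment_latency_spike",
    6: "shadow.p0.voice_handoff",
    7: "incident payload",
    8: "autoscale_200pct",
    9: "autoscale_200pct",
    10: "cdn_error_budget_burn",
    11: MAIN_CASE,
    13: MAIN_CASE,
}


def unclosed_chapters(transfer_lines: dict[int, str]) -> list[int]:
    """Chapters with a non-main case that lack a one-line transfer note."""
    unclosed = []
    for chapter, case in sorted(CHAPTER_CASE.items()):
        if case == MAIN_CASE:
            continue  # nothing to transfer: the chapter already uses the main case
        line = transfer_lines.get(chapter, "").strip()
        # A transfer must exist and fit on one line.
        if not line or "\n" in line:
            unclosed.append(chapter)
    return unclosed
```

With no notes at all, `unclosed_chapters({})` returns chapters 1 through 10. The sketch checks only that a one-line note exists; it cannot judge whether the principle named in it really protects high_memory_usage.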

Accompanying Documents

Applied Volume Glossary — definitions of second volume terms.
Applied Volume Changelog — revision history of the text.
Instructor Note — workshop formats and typical errors.
Bridges to First Volume — prerequisites and AgentClinic domain map.
Qwen Code Compatibility — built-in commands, custom commands, and project scripts.
Applied SDD Checklists — checks for Spec CI, arbitration, metrics, and production readiness.
Threshold Calibration — "Low / Default / High" tables, threshold shift exercises, and review signals for chapters 5, 6, 9, 10, 11. Not needed on first pass.
Runnable Examples — local smoke runs and templates.

What Counts as Success

By the end of the applied volume, the result should be a reproducible contour rather than a set of elegant rules:

disputed requirements have provenance and uncertainty level;
dangerous automations are constrained by constitution, guardrails, and rollback conditions;
validation.md checks happy path, negative path, counterexamples, drift, and Goodhart traps;
CI or its runnable analog blocks uncovered requirements and weak payload contracts;
agent decisions leave evidence suitable for review by another human or another model;
the final capstone/ shows one path from legacy trace to production-ready solution, with explicit blockers and a fix plan.