Study guide: Applied Part 2. Specification Defect Diagnostics

Lesson 3 of 5 in module «Applied Part 2. Specification Defect Diagnostics»


Topic: Applied part 2. Specification defect diagnostics

Difficulty level: Medium

Estimated study time: 3-5 hours

Prerequisites: Familiarity with the concept of negative requirements (Part 7 of Volume 1)

Understanding of SDD antipatterns (Part 20 of Volume 1)

Basic skills with YAML and JSON Schema

General understanding of the software development lifecycle and the concept of mutation testing

Learning objectives: Be able to deliberately introduce exactly one controlled defect into a specification to test the system (mutation testing).

Identify and classify the main types of defects: cycles, priority conflicts, and hidden out-of-scope violations.

Recognize the symptoms of an AI agent or system getting stuck using the metrics ask_storm, stage_regress, and phase_context_loss.

Formalize conflict resolutions using override rules and validate them through JSON Schema.

Perform a reverse run of the full SDD loop (Specify → Plan → Tasks → Implement) to confirm defect elimination.

Overview: This topic covers an engineering technique for working with specifications called "controlled defective specification" (mutation testing of requirements). One strictly defined defect is deliberately introduced into the specification to test how the system (or an AI agent, such as Qwen Code) handles diagnosing it. The goal is to learn how to trigger a failure in a controlled way, read its traces, and fix the root cause in the requirements so that the conflict does not recur. The approach requires strict discipline: one mutation, one expected stuck symptom, and one clear recovery criterion, recorded in Given/When/Then and JSON Schema format.

Key concepts: Controlled defective specification (poisoned spec): Deliberate injection of a known defect into the requirements (specification) to test the resilience of the system, AI agent, or triage process. The main rule: only one defect is introduced per iteration.

Defect classes (mutations): The main types of injected errors: 'cycle' (a cyclic dependency between states), 'priority_conflict' (two rules with the same priority leading to different actions), and 'hidden_out_of_scope' (an action that forces a violation of given constraints).

Stuck metrics (diagnostic signs): Heuristics for localizing AI agent behavior issues: 'ask_storm' (repeated questions without new data), 'stage_regress' (rollback to previous stages without reason), 'phase_context_loss' (loss of context of the current phase).
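As a rough illustration of how two of these signs could be counted, the sketch below scans an agent trace. The trace format (tuples of `("stage", name)`, `("ask", question)`, `("data", info)`) and the function names are assumptions made for this example, not part of any real tool. For simplicity it counts every backward stage transition as a regress, whereas a real check would first exclude rollbacks that have a recorded reason.

```python
# Stage order follows the SDD loop used in this lesson.
STAGES = ["specify", "plan", "tasks", "implement"]


def ask_storm(trace):
    """Count questions repeated since the last piece of new data arrived."""
    seen, repeats = set(), 0
    for kind, value in trace:
        if kind == "data":
            seen.clear()          # new data makes a repeated question legitimate
        elif kind == "ask":
            if value in seen:
                repeats += 1
            seen.add(value)
    return repeats


def stage_regress(trace):
    """Count transitions that move to an earlier stage of the loop."""
    regressions, last = 0, -1
    for kind, value in trace:
        if kind == "stage":
            idx = STAGES.index(value)
            if idx < last:
                regressions += 1
            last = idx
    return regressions


# Hypothetical trace of an agent stuck on the approval question.
trace = [
    ("stage", "specify"),
    ("stage", "plan"),
    ("ask", "Who approves the P0 escalation?"),
    ("ask", "Who approves the P0 escalation?"),
    ("stage", "specify"),
    ("stage", "plan"),
    ("ask", "Who approves the P0 escalation?"),
]

print("ask_storm =", ask_storm(trace))
print("stage_regress =", stage_regress(trace))
```

Counting by hand first and then comparing with a script like this is one way to check that the expected symptom was recorded before the run rather than after it.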

Requirement formalization (YAML + JSON Schema): An approach in which disputed requirements are recorded in an executable format (YAML with priorities), and the boundaries of acceptable behavior are strictly described in JSON Schema, eliminating the ambiguity of natural language.

Reverse run of the SDD loop: Verification of the fixed specification (fixed-spec) by re-running the full Specify → Plan → Tasks → Implement cycle from the start. The run is considered successful if the original conflict no longer reproduces in the tasks or in the implementation.

Practice exercises: Name: Creating a poisoned/fixed pair for an incident

Problem: Training case appointment_latency. It is necessary to create a specification where the requirement 'escalate P0 within 30 seconds' conflicts with 'any escalation requires manual confirmation'. Create the files poisoned-spec.md and fixed-spec.md.

Solution: 1. In poisoned-spec.md, create two rules with priority=100 that block each other when the owner is unavailable. 2. Record the expected symptom (e.g., stage_regress when trying to create a plan). 3. In fixed-spec.md, add the rule p0_time_critical_override with a priority higher than manual confirmation, and set the flag human_audit_required=true for post-factum verification.
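The difference between the two files can be sketched as rule data plus a small resolver. Everything here is illustrative: the rule ids other than p0_time_critical_override, the action names, the `resolve` helper, and the priority value 200 are assumptions chosen for the example. The sketch covers only the owner-unavailable path that the exercise targets.

```python
def resolve(rules, context):
    """Return the single highest-priority applicable rule.

    Raises ValueError when several rules with different actions tie
    at the top priority (the priority_conflict defect).
    """
    applicable = [r for r in rules if r["when"](context)]
    if not applicable:
        return None
    top = max(r["priority"] for r in applicable)
    winners = [r for r in applicable if r["priority"] == top]
    if len({r["action"] for r in winners}) > 1:
        raise ValueError("priority_conflict: " + ", ".join(r["id"] for r in winners))
    return winners[0]


# poisoned-spec: two rules with priority=100 that demand different actions.
POISONED = [
    {"id": "escalate_p0_within_30s", "priority": 100,
     "when": lambda c: c["severity"] == "P0", "action": "escalate"},
    {"id": "manual_confirmation", "priority": 100,
     "when": lambda c: True, "action": "wait_approval"},
]

# fixed-spec: the override outranks manual confirmation (200 is an assumed value)
# and keeps the human check as an after-the-fact audit.
FIXED = POISONED + [
    {"id": "p0_time_critical_override", "priority": 200,
     "when": lambda c: c["severity"] == "P0" and c["owner_unresponsive"],
     "action": "auto_escalate", "human_audit_required": True},
]

context = {"severity": "P0", "owner_unresponsive": True}

try:
    resolve(POISONED, context)
except ValueError as err:
    print("poisoned:", err)

print("fixed:", resolve(FIXED, context)["id"])
```

The point of the design is that the fix changes which rule wins, instead of adding a comment that explains how the tie should be read.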

Complexity: intermediate

Name: Recording the recovery line in validation.md

Problem: It is required to formalize the criterion for a successful resolution of the escalation priority conflict for recording in the validation.md file.

Solution: Add to validation.md the line: priority_conflict=false && escalation_path_resolved=P0 && audit_required=true. This will provide a machine-readable verification that the conflict is resolved, the escalation path is defined, and the audit is preserved.
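One possible way to make the recovery line checkable by a script is sketched below. The parsing rules (clauses joined by `&&`, each of the form `key=value`) are inferred from the line itself, and the helper names and the shape of the `observed` dictionary are assumptions for this example.

```python
RECOVERY_LINE = (
    "priority_conflict=false && escalation_path_resolved=P0 && audit_required=true"
)


def parse_recovery_line(line):
    """Split 'a=x && b=y' into {'a': 'x', 'b': 'y'}."""
    expected = {}
    for clause in line.split("&&"):
        key, sep, value = clause.strip().partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed clause: {clause!r}")
        expected[key.strip()] = value.strip()
    return expected


def _as_text(value):
    # The line writes booleans in lowercase, so compare them as 'true'/'false'.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def failed_clauses(line, observed):
    """Return the keys whose observed value differs from the recovery line."""
    expected = parse_recovery_line(line)
    return [k for k, v in expected.items() if _as_text(observed.get(k)) != v]


# Hypothetical state collected after re-running the loop on fixed-spec.md.
observed = {
    "priority_conflict": False,
    "escalation_path_resolved": "P0",
    "audit_required": True,
}

print(failed_clauses(RECOVERY_LINE, observed))
```

An empty list means every clause of the recovery line holds; any returned key names the clause that still fails.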

Complexity: beginner

Name: Writing a JSON Schema for an override rule

Problem: It is necessary to prohibit the AI agent from returning to hidden approval for critical incidents. Describe a JSON Schema that requires auto-escalation for P0 when the owner is unavailable.

Solution: Use the if/then construct. In the if block, specify the conditions: severity=P0 and owner_unresponsive=true. In the then block, specify the required fields: auto_escalation_channel=critical_phone, human_audit_required=true, and reason_code=time_critical_override.
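A minimal sketch of such a schema follows, written as a Python dictionary so that it can be exercised without extra dependencies. The `required` list inside the `if` block is an addition made here: without it, a document that omits severity or owner_unresponsive would match the `if` condition vacuously. The `matches` and `is_valid` helpers are a hypothetical stand-in that understands only the keywords used in this schema (`if`, `then`, `required`, `properties`, `const`); in practice a full JSON Schema validator would be used instead.

```python
OVERRIDE_SCHEMA = {
    "$schema": "https://json-schema.org/draft/2020-12/schema",
    "type": "object",
    "if": {
        "properties": {
            "severity": {"const": "P0"},
            "owner_unresponsive": {"const": True},
        },
        "required": ["severity", "owner_unresponsive"],
    },
    "then": {
        "properties": {
            "auto_escalation_channel": {"const": "critical_phone"},
            "human_audit_required": {"const": True},
            "reason_code": {"const": "time_critical_override"},
        },
        "required": [
            "auto_escalation_channel",
            "human_audit_required",
            "reason_code",
        ],
    },
}


def matches(subschema, doc):
    """Check `required` and `properties`/`const` only (not a full validator)."""
    if any(key not in doc for key in subschema.get("required", [])):
        return False
    for name, rule in subschema.get("properties", {}).items():
        if name in doc and "const" in rule:
            expected = rule["const"]
            # Compare types too, so that 1 is not accepted in place of True.
            if type(doc[name]) is not type(expected) or doc[name] != expected:
                return False
    return True


def is_valid(schema, doc):
    """Apply `then` only when the `if` subschema matches."""
    if "if" in schema and matches(schema["if"], doc):
        return matches(schema.get("then", {}), doc)
    return True


decision = {
    "severity": "P0",
    "owner_unresponsive": True,
    "auto_escalation_channel": "critical_phone",
    "human_audit_required": True,
    "reason_code": "time_critical_override",
}
print(is_valid(OVERRIDE_SCHEMA, decision))
```

A decision that matches the `if` block but still waits for approval, or drops human_audit_required, fails the `then` block and is rejected.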

Complexity: advanced

Case studies: Name: Diagnostics of appointments-api latency growth

Scenario: Latency on the appointments-api route rises sharply in a production environment. The incident triage system must handle the P0 incident automatically, but the specification contains conflicting requirements that surface under stress load.

Challenge: The specification contains two rules with the same maximum priority: 'escalate P0 within 30 seconds' and 'wait for manual confirmation before any escalation'. If the responsible person is unavailable, the AI agent (Qwen Code) falls into an infinite loop (ESCALATE_EVENT → WAIT_APPROVAL → VALIDATE_ESCALATION), which shows up as the stage_regress symptom and makes the incident impossible to resolve.
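The loop described in the challenge can be found mechanically once the states are written down as a graph. In the sketch below, the edge from VALIDATE_ESCALATION back to ESCALATE_EVENT is an assumption that closes the loop, and the AUTO_ESCALATE and HUMAN_AUDIT states in the fixed graph are hypothetical names for the override path.

```python
def find_cycle(graph, start):
    """Return the first cycle reachable from `start` as a list of states, or None."""
    path, on_path, done = [], set(), set()

    def visit(node):
        if node in on_path:
            # Close the loop: slice from the first occurrence of the node.
            return path[path.index(node):] + [node]
        if node in done:
            return None
        path.append(node)
        on_path.add(node)
        for nxt in graph.get(node, []):
            cycle = visit(nxt)
            if cycle:
                return cycle
        path.pop()
        on_path.discard(node)
        done.add(node)
        return None

    return visit(start)


# Poisoned spec: approval never arrives, so validation sends the agent back.
POISONED_GRAPH = {
    "ESCALATE_EVENT": ["WAIT_APPROVAL"],
    "WAIT_APPROVAL": ["VALIDATE_ESCALATION"],
    "VALIDATE_ESCALATION": ["ESCALATE_EVENT"],
}

# Fixed spec (hypothetical state names): the override skips the approval wait.
FIXED_GRAPH = {
    "ESCALATE_EVENT": ["AUTO_ESCALATE"],
    "AUTO_ESCALATE": ["HUMAN_AUDIT"],
    "HUMAN_AUDIT": [],
}

print(find_cycle(POISONED_GRAPH, "ESCALATE_EVENT"))
print(find_cycle(FIXED_GRAPH, "ESCALATE_EVENT"))
```

Running the same check on both graphs gives a before/after comparison that does not depend on reading the agent's trace.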

Solution: The controlled defective specification method was applied. The 'priority_conflict' defect was formalized in YAML. The fix consisted of introducing the override rule 'p0_time_critical_override', which is activated when severity=P0 and owner_unresponsive=true. Manual verification was moved to a post-factum audit (human_audit_required=true). For validation, a JSON Schema was written that strictly defines the acceptable behavior corridor.

Result: When the Specify → Plan → Tasks → Implement loop was rerun, the cycle was broken. Escalation was no longer blocked by waiting for approval, and the audit trail was preserved. The stage_regress metric dropped to 0, and the recovery line in validation.md passed verification.

Lessons learned: Specification defects must be explicit (in code and priorities), not hidden in comments.

Conflict resolution must change the executable rule (requirement) itself, not just the textual explanation.

Any fix must be verified by a full reverse run of the entire SDD cycle.

Related concepts: Priority conflict

Stuck metrics (stage_regress)

JSON Schema validation

Study tips: Start with a minimal blast radius: introduce only one type of defect per run. Combining defects (for example, cycle + priority_conflict) makes it impossible to attribute a symptom in the trace to a single cause.

Record the 'ask_storm' and 'stage_regress' metrics manually in a notebook or validation.md at the early stages to develop intuition for AI agent behavior.

Always record the expected symptom before running the analysis, rather than adjusting the expectation to fit the results afterwards.

Translate disputed areas from natural language into Given/When/Then format — this will instantly highlight missing logic branches.

When transferring the solution to the main project (capstone), take with you only the defect class, the patch itself, and the recovery line, so as not to clutter the repository.

Additional resources: GitHub Spec Kit quickstart: https://github.github.io/spec-kit/quickstart.html (description of the Specify → Plan → Tasks → Implement phases).

Local catalog of spec ci examples: examples/spec-ci/README.md — runnable analogs of basic specification gateways.

Part 7. Negative requirements: Basic concepts of system behavior constraints from Volume 1 of the materials.

Part 20. SDD antipatterns: Catalog of classic specification errors on which defect injections are based.

Summary: The use of controlled defective specification is a powerful method for protecting requirements from ambiguity. The technique makes it possible to turn random AI agent failures into manageable laboratory mutations. Diagnostic success is built on four pillars: one defect per iteration, precise measurement of stuck symptoms (via ask_storm and stage_regress), formal conflict resolution (JSON Schema + override), and a mandatory reverse run of the full SDD loop.