Study guide: Applied Part 7. Specification CI: specification as an executable artifact

Lesson 3 of 5 in module «Applied Part 7. Specification CI: specification as an executable artifact»

Topic: Applied Part 7. Specification CI: specification as an executable artifact

Difficulty level: Medium

Estimated study time: 3-4 hours

Prerequisites: Understanding of CI/CD principles (e.g., GitHub Actions)

Experience with Markdown and JSON

Basic knowledge of Python for modifying validation scripts

Familiarity with the concepts of requirements and plans in development

Learning objectives: Set up a CI pipeline for automatic validation of specifications (requirements.md, plan.md).

Implement a Coverage Check for linking requirements (REQ-*) with plan tasks.

Create JSON Schema validation for fixtures extracted from validation.md, including negative test verification.

Generate clear diagnostic error messages from CI indicating the file, line, and action to fix.

Apply Spec CI to validate atomic plans of agent systems (AoT gate).

Overview: This study guide focuses on turning static documentation (specifications) into executable artifacts verified within CI/CD. We will explore how to use GitHub Actions and Python scripts to automate the validation of specification quality. The core idea is to create a 'Spec Gate' that blocks the merging of a Pull Request when there are violations: uncovered requirements, domain scope violations, or errors in JSON fixtures. The approach is based on the principle: a specification is only useful when it can be automatically rejected.
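As a rough illustration of the 'Spec Gate' idea, the sketch below runs several checks and converts their findings into a process exit code, which is what lets a CI step block a merge. The check functions and their messages are hypothetical stand-ins, not the course's actual scripts.

```python
from typing import Callable, Dict, List

# Each check is a hypothetical callable returning a list of error strings;
# an empty list means the check passed.
Check = Callable[[], List[str]]


def run_spec_gate(checks: Dict[str, Check]) -> int:
    """Run every check, print its findings, and return a process exit code."""
    failed = False
    for name, check in checks.items():
        errors = check()
        if errors:
            failed = True
            for error in errors:
                print(f"[{name}] FAIL: {error}")
        else:
            print(f"[{name}] ok")
    # A non-zero exit code is what makes the CI step (and the merge) fail.
    return 1 if failed else 0


# Stand-in checks for illustration only.
def coverage_ok() -> List[str]:
    return []


def scope_violation() -> List[str]:
    return ["plan.md:48 action 'force_resolve' is not allowed by the domain model"]


exit_code = run_spec_gate({"coverage": coverage_ok, "scope": scope_violation})
```

In a real entry point the result would typically be passed to `sys.exit(exit_code)`, so that the CI runner sees the failure.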

Key concepts: Spec Gate: A mandatory step in CI that checks the specification as strictly as regular tests check code. It blocks the merging of a PR when semantic or structural errors are found in documentation.

Coverage Check: The process of validating the link graph between requirements.md and plan.md. Each requirement (REQ-) must have an implementing task in the plan (implements: [REQ-]), and each task must be linked to a requirement (no 'rogue tasks').

Scope Check: Validation that actions in plan.md correspond to the allowed operations of the domain model (e.g., incident-response.yaml). Blocks 'extraneous' scenarios such as unauthorized automatic incident closure.

Schema Check: Extraction of JSON examples (fixtures) from validation.md and their validation against JSON Schema. Includes verification of both positive (must pass) and negative (must fail predictably) scenarios.

AoT Gate (Atom of Thought gate): A specific check for plans generated by AI agents. The plan is represented as a graph of atomic actions, which is validated for unknown tools, dependency cycles, and domain scope violations before execution begins.
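One of the AoT Gate checks named above, dependency cycle detection, can be sketched with a depth-first search. The atom shape used here (an `id` plus a `depends_on` list) and the tool names are assumptions made for the example; the actual plan format in the course may differ.

```python
from typing import Dict, List, Optional


def find_cycle(atoms: List[dict]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of atom ids, or None if acyclic."""
    deps: Dict[str, List[str]] = {a["id"]: a.get("depends_on", []) for a in atoms}
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    color = {atom_id: WHITE for atom_id in deps}
    path: List[str] = []

    def visit(atom_id: str) -> Optional[List[str]]:
        color[atom_id] = GRAY
        path.append(atom_id)
        for dep in deps.get(atom_id, []):
            if dep not in color:
                continue  # references to unknown ids belong to a separate check
            if color[dep] == GRAY:
                # The dependency is already on the current path: a cycle.
                return path[path.index(dep):] + [dep]
            if color[dep] == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[atom_id] = BLACK
        return None

    for atom_id in deps:
        if color[atom_id] == WHITE:
            cycle = visit(atom_id)
            if cycle:
                return cycle
    return None


# Hypothetical plans: the first is acyclic, the second depends on itself in a loop.
linear_plan = [
    {"id": "a1", "name": "fetch_incident", "depends_on": []},
    {"id": "a2", "name": "ack_incident", "depends_on": ["a1"]},
]
cyclic_plan = [
    {"id": "a1", "name": "fetch_incident", "depends_on": ["a2"]},
    {"id": "a2", "name": "ack_incident", "depends_on": ["a1"]},
]
```

Returning the cycle itself, rather than a boolean, gives the diagnostic message something concrete to show the author of the plan.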

Practice exercises: Name: Local requirements coverage check

Problem: You have requirements.md and plan.md files. You need to make sure that every requirement in requirements.md has a corresponding task in plan.md, and that there are no 'orphaned' (rogue) tasks. Run the coverage check script locally.

Solution: 1. Navigate to the example directory: cd book2/examples/spec-ci. 2. Run the script: python3 scripts/check_coverage.py --requirements requirements.md --plan plan.md. 3. Make sure the script returned code 0 (success) and printed a message that all REQ-* identifiers are successfully covered.

Complexity: beginner
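To make the coverage exercise more concrete, here is a minimal sketch of what a coverage check might do internally. It is not the course's check_coverage.py: the numeric `REQ-001` id shape, the `TASK-N` task ids, and the one-task-per-line layout are all assumptions for illustration.

```python
import re
from typing import Dict, List, Set

REQ_ID = re.compile(r"\bREQ-\d+\b")
# Assumed task line shape: "- TASK-1: description implements: [REQ-001, REQ-002]"
TASK_LINE = re.compile(r"^\s*-\s*(?P<task>TASK-\d+):(?P<rest>.*)$")
IMPLEMENTS = re.compile(r"implements:\s*\[(?P<ids>[^\]]*)\]")


def check_coverage(requirements_md: str, plan_md: str) -> Dict[str, List[str]]:
    """Compare requirement ids against the ids referenced by plan tasks."""
    required: Set[str] = set(REQ_ID.findall(requirements_md))
    covered: Set[str] = set()
    rogue_tasks: List[str] = []
    for line in plan_md.splitlines():
        task = TASK_LINE.match(line)
        if not task:
            continue
        link = IMPLEMENTS.search(task.group("rest"))
        ids = set(REQ_ID.findall(link.group("ids"))) if link else set()
        if not ids:
            # A task with no implements: [...] link is a 'rogue task'.
            rogue_tasks.append(task.group("task"))
        covered |= ids
    return {
        "uncovered": sorted(required - covered),
        "rogue_tasks": rogue_tasks,
        "unknown_refs": sorted(covered - required),
    }


requirements = "## REQ-001 Acknowledge incident\n## REQ-002 Notify on-call\n"
plan = "- TASK-1: add ack endpoint implements: [REQ-001]\n- TASK-2: refactor logging\n"
result = check_coverage(requirements, plan)
```

With this sample input, REQ-002 is reported as uncovered and TASK-2 as a rogue task; a gate would exit with a non-zero code whenever any of the three lists is non-empty.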

Name: JSON Schema validation and negative tests

Problem: You need to validate incident JSON fixtures against the incident_payload.schema.json schema. One of the fixtures is intentionally invalid (missing incident_id). You need to make sure the script correctly rejects it.

Solution: 1. While in book2/examples/spec-ci, run: python3 scripts/validate_schema.py --schema schemas/incident_payload.schema.json --fixtures fixtures. 2. Check the output: the valid fixture should pass, and for the invalid one an error message should be displayed (e.g., 'missing required property incident_id'). 3. Make sure the process finishes cleanly and that its output and exit status clearly signal the discrepancy that was found.

Complexity: intermediate
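The positive/negative logic of this exercise can be sketched as follows. The validator here is a deliberately tiny stand-in that understands only the `required` keyword and simple property `type` values; a real gate would use a full JSON Schema validator. The schema content and fixture names are assumptions, and the real incident_payload.schema.json may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical, trimmed-down schema for illustration.
SCHEMA = {
    "type": "object",
    "required": ["incident_id", "source"],
    "properties": {"incident_id": {"type": "string"}, "source": {"type": "string"}},
}

JSON_TYPES = {"string": str, "object": dict, "array": list, "boolean": bool}


def validate(payload: dict, schema: dict) -> List[str]:
    """Stand-in validator: checks only `required` and simple `type` keywords."""
    errors = [
        f"missing required property {name}"
        for name in schema.get("required", [])
        if name not in payload
    ]
    for name, rule in schema.get("properties", {}).items():
        expected = JSON_TYPES.get(rule.get("type"))
        if name in payload and expected and not isinstance(payload[name], expected):
            errors.append(f"property {name} must be of type {rule['type']}")
    return errors


def run_fixtures(fixtures: Dict[str, Tuple[dict, bool]], schema: dict) -> List[str]:
    """Return gate failures: valid fixtures that fail, invalid ones that pass."""
    failures = []
    for name, (payload, should_pass) in fixtures.items():
        errors = validate(payload, schema)
        if should_pass and errors:
            failures.append(f"{name}: expected to pass, got: {'; '.join(errors)}")
        elif not should_pass and not errors:
            failures.append(f"{name}: expected to be rejected, but it passed")
    return failures


# Each fixture carries its expectation: True = must pass, False = must be rejected.
fixtures = {
    "valid_incident": ({"incident_id": "INC-1", "source": "grafana"}, True),
    "invalid_missing_id": ({"source": "grafana"}, False),
}
```

Under this design the gate fails in two situations: a positive fixture is rejected, or a negative fixture slips through. The second case is the signal that the schema is too loose.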

Name: Formatting a CI diagnostic message

Problem: The check_scope.py script detected that plan.md uses the action 'force_resolve', which is not allowed by the domain model. Generate a JSON error object that meets the requirements of good diagnostics: a clear reason, a file reference, a rule identifier, and an action to fix.

Solution: The JSON response should look something like this: { "status": "failed", "check": "scope", "file": "plan.md", "line": 48, "rule": "IR-SCOPE-007", "reason": "Autonomous force resolve is outside the incident-response domain model", "action": "Replace with POST /incidents/{id}/ack or add an approved requirement and domain rule" }

Complexity: intermediate
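A diagnostic with these fields could be modelled as a small data class. This is only a sketch: the class name and methods are hypothetical, and the field names simply mirror the JSON object in the solution above. The annotation method uses the GitHub Actions `::error` workflow command, which attaches a message to a file and line in the PR view.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class Diagnostic:
    check: str
    file: str
    line: int
    rule: str
    reason: str
    action: str
    status: str = "failed"

    def to_json(self) -> str:
        """Machine-readable form, e.g. for a PR comment bot."""
        return json.dumps(asdict(self), indent=2)

    def to_line(self) -> str:
        """One-line form for CI logs: file:line [rule] reason. Fix: action."""
        return f"{self.file}:{self.line} [{self.rule}] {self.reason}. Fix: {self.action}"

    def to_github_annotation(self) -> str:
        """GitHub Actions workflow command that annotates the offending line."""
        return (
            f"::error file={self.file},line={self.line},title={self.rule}"
            f"::{self.reason}. Fix: {self.action}"
        )


diagnostic = Diagnostic(
    check="scope",
    file="plan.md",
    line=48,
    rule="IR-SCOPE-007",
    reason="Autonomous force resolve is outside the incident-response domain model",
    action="Replace with POST /incidents/{id}/ack or add an approved requirement and domain rule",
)
print(diagnostic.to_line())
```

Keeping `action` as a mandatory field means a check cannot report a failure without also saying how to fix it.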

Name: Agent plan validation (AoT)

Problem: An agent generated a plan in JSON format containing an array of atoms. Create a rule (or script concept) to reject the plan if an atom references a non-existent tool 'delete_database'.

Solution: The script should read the 'name' field of each atom and compare it against the list of allowed tools (whitelist) from the domain model. If 'delete_database' is not in the whitelist, the script should fail the check and emit JSON diagnostics such as: {"reason": "Unknown tool", "action": "Remove atom or update domain model"}.

Complexity: advanced
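The rule described in this solution might look like the sketch below. The plan layout (a top-level `atoms` array whose items have a `name` field) follows the exercise text; the allowed tool names and the extra `check` and `atom` keys in the diagnostic are assumptions added for the example.

```python
from typing import Iterable, List


def check_tools(atoms: List[dict], allowed_tools: Iterable[str]) -> List[dict]:
    """Return one diagnostic per atom whose tool is not in the allowed list."""
    allowed = set(allowed_tools)
    diagnostics = []
    for index, atom in enumerate(atoms):
        tool = atom.get("name")
        if tool not in allowed:
            diagnostics.append({
                "status": "failed",
                "check": "aot",
                "atom": index,  # position of the offending atom in the plan
                "reason": f"Unknown tool '{tool}'",
                "action": "Remove atom or update domain model",
            })
    return diagnostics


# Hypothetical agent plan and whitelist.
plan = {"atoms": [{"name": "ack_incident"}, {"name": "delete_database"}]}
allowed = {"ack_incident", "notify_oncall"}
diagnostics = check_tools(plan["atoms"], allowed)
```

Because the check runs before execution begins, the plan is rejected as a whole; no atom is executed if any diagnostic is returned.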

Case studies: Name: Blocking unauthorized incident closure

Scenario: A team is developing incident management automation. A developer opens a Pull Request that adds a new step to plan.md: automatically resolving an incident (force_resolve) without confirmation from the on-call engineer, in order to 'speed up' the process.

Challenge: Text review can miss this change if it is formally tied to a requirement. However, such an action violates business logic and process safety (it can hide a critical failure).

Solution: A Spec Gate with a Scope Check was implemented. The check_scope.py script matched the 'force_resolve' action against the incident-response.yaml domain model and found that this operation is not allowed for autonomous systems.

Result: GitHub Actions blocked the merge of the PR. The developer received an automatic comment indicating the file (plan.md), the line, and the rule prohibiting autonomous closure. Dangerous logic did not get into production.

Lessons learned: Checking the text of a requirement is not enough; the semantics of actions must be checked.

Automating specification reviews reduces the load on the team and lowers the risk of human error.

The gate should check not only the presence of a reference (coverage), but also the content (scope).

Related concepts: Scope Check

Domain Model

Pull Request Protection
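The Scope Check from this case study can be approximated by the sketch below. It is not the course's check_scope.py: the `action: <name>` convention in plan.md is an assumption, and the allowed actions are passed in as a plain set rather than loaded from incident-response.yaml, to keep the example free of third-party dependencies.

```python
import re
from typing import Dict, List, Set

# Assumed plan.md convention: each step declares "action: <name>" on its line.
ACTION = re.compile(r"\baction:\s*(?P<name>[A-Za-z_][\w-]*)")


def check_scope(plan_md: str, allowed_actions: Set[str]) -> List[Dict[str, object]]:
    """Report every action in the plan that the domain model does not allow."""
    violations = []
    for line_no, line in enumerate(plan_md.splitlines(), start=1):
        for match in ACTION.finditer(line):
            name = match.group("name")
            if name not in allowed_actions:
                violations.append(
                    {"file": "plan.md", "line": line_no, "action_name": name}
                )
    return violations


# Hypothetical plan excerpt and allowed operations.
plan = (
    "- TASK-1 action: ack_incident implements: [REQ-001]\n"
    "- TASK-2 action: force_resolve implements: [REQ-002]\n"
)
allowed = {"ack_incident", "escalate"}
violations = check_scope(plan, allowed)
```

Note that TASK-2 carries a valid `implements` link, so a coverage check alone would accept it; only the comparison against the allowed operations catches the problem.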

Name: Integration with Grafana: protecting the payload contract

Scenario: A project integrates with a monitoring system (e.g., Grafana) via webhooks. validation.md stores examples of JSON payloads that are used to test the integration.

Challenge: Grafana's API contract was updated: the format of the 'source' field changed, or 'incident_id' was removed. The developer updated the code but forgot to update the documentation and test examples. This could lead to a 'silent' regression, where an alert arrives but is not processed.

Solution: Use of the Schema Check. The extract_fixtures.py script extracted JSON blocks from validation.md, and validate_schema.py checked them against the current incident_payload.schema.json schema.

Result: CI failed at the Spec Gate stage. The team saw the error: 'validation.md:72 missing required property incident_id'. Documentation and fixtures were updated before merging the code, which prevented an integration failure in production.

Lessons learned: A specification must be 'executable' — automatically verifiable with every change.

Negative examples (counterexamples) are just as important as positive ones for verifying schema strictness.

Integration contracts should not be an 'agreement' — they must be validated by a machine.

Related concepts: JSON Schema Validation

Fixtures

Negative Testing
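An error such as 'validation.md:72 missing required property incident_id' requires the extraction step to remember where each fixture started. The sketch below shows one way to do that. It is not the course's extract_fixtures.py, it assumes fixtures are stored in fenced blocks tagged `json`, and it checks only a list of required property names rather than a full schema.

```python
import json
from typing import List, Tuple

FENCE = "`" * 3  # Markdown code fence marker


def extract_json_fixtures(markdown: str) -> List[Tuple[int, str]]:
    """Return (first_line_number, raw_json) for each fenced json block."""
    fixtures = []
    start = None
    buffer: List[str] = []
    for line_no, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if start is None and stripped == FENCE + "json":
            start, buffer = line_no + 1, []
        elif start is not None and stripped == FENCE:
            fixtures.append((start, "\n".join(buffer)))
            start = None
        elif start is not None:
            buffer.append(line)
    return fixtures


def check_fixtures(
    markdown: str, required: List[str], filename: str = "validation.md"
) -> List[str]:
    """Produce 'file:line message' errors for broken or incomplete fixtures."""
    errors = []
    for line_no, raw in extract_json_fixtures(markdown):
        try:
            payload = json.loads(raw)
        except json.JSONDecodeError as exc:
            # exc.lineno is relative to the fixture, so shift it into file space.
            errors.append(f"{filename}:{line_no + exc.lineno - 1} invalid JSON: {exc.msg}")
            continue
        if not isinstance(payload, dict):
            errors.append(f"{filename}:{line_no} fixture must be a JSON object")
            continue
        for name in required:
            if name not in payload:
                errors.append(f"{filename}:{line_no} missing required property {name}")
    return errors


sample = "\n".join(["# Validation", "", FENCE + "json", '{"source": "grafana"}', FENCE])
errors = check_fixtures(sample, ["incident_id"])
```

Reporting the line inside validation.md, rather than an index into an extracted list, lets the author jump straight to the fixture that needs updating.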

Study tips: Start with a local run: Before setting up a complex GitHub Actions workflow, make sure the check_coverage.py and validate_schema.py scripts work correctly in your local environment (cd book2/examples/spec-ci).

Focus on diagnostics: Pay special attention to the format of error output. The main value of Spec CI is not in 'pointing out an error', but in suggesting how to fix it (file, line, rule, action).

Use negative fixtures: Practice creating schemas that strictly reject invalid data. If an invalid payload passes the schema, the schema is too loose.

Link documentation layers: Remember that requirements, plan, and validation are linked layers. A change in requirements.md should trigger a check of whether it is covered in plan.md.

Additional resources: GitHub Spec Kit: A GitHub repository demonstrating the SDD (Specification-Driven Development) approach, where requirements and plans become verifiable layers.

JSON Schema documentation: The official JSON Schema documentation, required for understanding how to build strict contracts for validating fixtures.

Course materials, Parts 9 and 16: The original parts of the course (Feature Validation and Team Code Review) describing the manual processes that are automated in this chapter.

Summary: Specification CI (Spec CI) turns documentation into an executable artifact that is checked as strictly as code. Implementing an automatic gate (Spec Gate) in CI/CD blocks the merging of changes when there are no links between requirements and the plan, when the domain model is violated, or when the JSON payload does not match the schema. This shifts the focus from subjective text review to machine validation of structure and semantics, ensuring the reproducibility and safety of the incident pipeline.


Course: Using SDD in Development for Qwen Code CLI. Applied Course
