Topic: Applied Part 7. Specification CI: Specification as an Executable Artifact
Difficulty level: Medium
Estimated study time: 4-6 hours (theory + practice with runnable examples)
Prerequisites: Basic understanding of Git and GitHub Actions
Experience working with Markdown requirement documents (requirements.md, plan.md)
Familiarity with JSON Schema
Completion of Part 9 of Volume 1 (fact validation) and Part 16 (team review)
Basic Python for running validation scripts
Learning objectives: Configure a local specification gateway (Spec CI) that blocks merge on coverage, scope, or schema violations
Build a reproducible requirements → plan traceability graph with REQ-* identifiers and the implements field
Formulate CI diagnostic messages containing file, line, rule identifier, and specific fix action
Integrate spec_gate into GitHub branch protection so that green unit tests cannot bypass semantic verification
Create negative fixtures for JSON Schema and explain why their predictable failure protects the contract
Overview: This chapter turns the specification from static text into an executable arbiter of the repository. The Specification Gateway (Spec CI) is a GitHub Actions pipeline that, on every push and pull_request, checks three layers: requirement-to-plan coverage, plan-to-domain-model scope, and the correctness of data examples (schema). Key principle: any dispute about specification quality reduces to a specific line, rule, and action. The team gets a reproducible blocking mechanism, and the reviewer evaluates the meaning of a change instead of digging through CI logs. The chapter starts with a local runnable example on an incident payload, then scales to a production pipeline with branch protection.
Key concepts: Spec CI (specification gateway): A continuous integration pipeline that checks requirements.md, plan.md, validation.md, and API contracts as executable artifacts. Blocks merge on three violation classes: requirements without an implementing task, out-of-scope actions, and JSON Schema errors. This is the main term for the first pass; other terms are introduced as they appear in scripts.
Gate (gateway): A mandatory CI check that must pass for merge to be possible. Unlike ordinary tests, spec_gate checks semantic integrity, not just code correctness. Must be marked as required in branch protection settings.
Coverage-check (coverage verification): Building a requirements → plan traceability graph through stable REQ-* identifiers and the implements field. Every user story must have an implementing task, and every task must reference a requirement. Errors: orphan requirement (a requirement without a task) and rogue task (a task without a requirement).
Scope-check (scope verification): A detector for domain model boundary violations. Checks that actions in plan.md are permitted in the incident-response domain: acknowledge, escalate, annotate, rollback, notify_on_call. Considers actor, endpoint, and trigger condition. Blocks autonomous operations like force_resolve_without_operator.
Schema-check (schema verification): Validation of JSON fixtures from validation.md against JSON Schema. Two directions: valid examples pass, negative ones fail predictably. If a negative fixture passes, the schema is too permissive.
Fixture: An input data example extracted from validation.md for contract verification. Valid fixtures confirm correct operation, negative ones protect against schema regressions.
Spec gate (final CI job): The concluding job in a workflow, spec_gate, which requires successful completion of coverage, scope, and schema. This specific job must be marked as required in branch protection; otherwise, green unit tests will bypass semantic verification.
JSON-diagnostics: A CI rejection format designed for quick fixes without investigation. Four mandatory elements: clear reason, file and line reference, rule identifier, specific action.
Runnable-example: A local educational case on an incident payload that can be run without GitHub Actions. Contains check_coverage.py and validate_schema.py to demonstrate gateway principles before full CI adoption.
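The relationship between the three checks and the final spec_gate job can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the chapter's workflow: CheckResult and spec_gate are hypothetical names, and in a real pipeline each result would come from a separate CI job rather than from an in-process list.

```python
from dataclasses import dataclass

# Hypothetical names for illustration; the chapter's real gate is a CI job.
REQUIRED_CHECKS = ("coverage", "scope", "schema")


@dataclass(frozen=True)
class CheckResult:
    name: str        # "coverage", "scope", or "schema"
    passed: bool
    detail: str = ""


def spec_gate(results):
    """Return (ok, problems). The gate is green only if every required
    check reported a result and every reported result passed."""
    by_name = {result.name: result for result in results}
    problems = []
    for name in REQUIRED_CHECKS:
        result = by_name.get(name)
        if result is None:
            # A check that never ran must not count as a pass.
            problems.append(f"{name}: no result reported")
        elif not result.passed:
            problems.append(f"{name}: {result.detail or 'failed'}")
    return (not problems, problems)


if __name__ == "__main__":
    ok, problems = spec_gate([
        CheckResult("coverage", True),
        CheckResult("scope", False, "plan.md:48 uses force_resolve without domain permission"),
        CheckResult("schema", True),
    ])
    print("spec_gate:", "passed" if ok else "failed")
    for problem in problems:
        print(" -", problem)
```

Treating a missing result as a failure mirrors the point about independence: a green status in one layer, or a layer that silently did not run, does not compensate for another.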
Practice exercises: Name: Local Gateway Coverage Run
Problem: Navigate to the book2/examples/spec-ci directory and run the coverage verification script. Then deliberately break the link: add a story REQ-999 to requirements.md without an implementing task in plan.md. Run the script again and analyze the diagnostics.
Solution: 1. cd book2/examples/spec-ci
2. python3 scripts/check_coverage.py --requirements requirements.md --plan plan.md → expected exit code 0, message coverage ok
3. Add to requirements.md: '## REQ-999: As an admin, I want auto-deletion of incidents'
4. Run the script again → expected fail with message: 'requirements.md:42: REQ-999 has no link in plan.md. Add a task with implements: REQ-999 to plan.md or remove the requirement.'
5. Fix by adding a task with implements: [REQ-999] to plan.md, or remove REQ-999
Complexity: beginner
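The logic exercised in this run can be approximated as follows. This is a simplified sketch, not the contents of scripts/check_coverage.py: it assumes requirement headings of the form '## REQ-999: ...' and plan lines of the form 'implements: [REQ-014, REQ-015]', as the exercise text suggests, and the function name check_coverage is chosen for illustration. It flags orphan requirements and plan references to unknown requirements; detecting tasks that have no implements field at all would require knowing the task format, which is not assumed here.

```python
import re

# Assumed formats, based on the exercise text: requirement headings such as
# "## REQ-999: ..." and plan lines such as "implements: [REQ-014, REQ-015]".
REQ_HEADING = re.compile(r"^##\s+(REQ-[A-Z0-9-]+):")
IMPLEMENTS = re.compile(r"implements:\s*\[([^\]]*)\]")


def check_coverage(requirements_text, plan_text):
    """Return a list of diagnostics; an empty list means coverage is ok."""
    declared = {}  # REQ id -> line number in requirements.md
    for lineno, line in enumerate(requirements_text.splitlines(), start=1):
        match = REQ_HEADING.match(line)
        if match:
            declared[match.group(1)] = lineno

    referenced = {}  # REQ id -> first line number in plan.md that mentions it
    for lineno, line in enumerate(plan_text.splitlines(), start=1):
        match = IMPLEMENTS.search(line)
        if not match:
            continue
        for req_id in (part.strip() for part in match.group(1).split(",")):
            if req_id:
                referenced.setdefault(req_id, lineno)

    diagnostics = []
    for req_id, lineno in declared.items():
        if req_id not in referenced:  # orphan requirement
            diagnostics.append(
                f"requirements.md:{lineno}: {req_id} has no link in plan.md."
            )
    for req_id, lineno in referenced.items():
        if req_id not in declared:  # plan points at an unknown requirement
            diagnostics.append(
                f"plan.md:{lineno}: implements {req_id}, which is not in requirements.md."
            )
    return diagnostics
```

Matching on identifiers rather than on free text is what makes the graph reproducible: renaming a story's wording does not break the link, while deleting the implements entry does.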
Name: Negative Fixture and Schema Protection
Problem: In the book2/examples/spec-ci directory, run validate_schema.py. Then create a new negative fixture that is well-formed but has severity: 'P0' and no backup_verified. Verify that the schema rejects it; if it passes, the schema is too permissive.
Solution: 1. python3 scripts/validate_schema.py --schema schemas/incident_payload.schema.json --fixtures fixtures → exit code 0, valid-incident.json: valid, invalid-missing-incident-id.json: expected invalid
2. Create fixtures/invalid-p0-no-backup.json with _expected_invalid: true, incident_id: 'INC-003', severity: 'P0', but without backup_verified
3. Run validate_schema.py again
4. If the output is 'invalid-p0-no-backup.json: expected invalid, rejected: missing required property backup_verified' → the schema is correct
5. If the fixture passes validation → modify the schema by adding an if/then rule (if severity is 'P0', then backup_verified is required) and repeat
Complexity: intermediate
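The behaviour this exercise checks can be sketched without external dependencies. The schema fragment below uses the standard JSON Schema if/then keywords; the validate function is a deliberately tiny stand-in that understands only the keywords used here (required, properties with const, if/then) and is not a general validator. The function names, and the choice to strip the _expected_invalid marker before validating, are assumptions for illustration rather than a description of validate_schema.py.

```python
# Schema fragment as a Python dict. The if/then keywords are standard JSON
# Schema; the field names come from the exercise, the layout is assumed.
SCHEMA = {
    "required": ["incident_id", "severity"],
    "if": {
        "properties": {"severity": {"const": "P0"}},
        "required": ["severity"],
    },
    "then": {"required": ["backup_verified"]},
}


def validate(instance, schema):
    """Tiny stand-in for a JSON Schema validator. Supports only the keywords
    used above: required, properties/const, if/then. Returns error strings."""
    errors = []
    for key in schema.get("required", []):
        if key not in instance:
            errors.append(f"missing required property {key}")
    for key, rule in schema.get("properties", {}).items():
        if key in instance and "const" in rule and instance[key] != rule["const"]:
            errors.append(f"{key} must equal {rule['const']!r}")
    # "then" applies only when the instance satisfies the "if" subschema.
    if "if" in schema and not validate(instance, schema["if"]):
        errors.extend(validate(instance, schema.get("then", {})))
    return errors


def check_fixture(name, fixture, schema):
    """Compare the validation outcome with the fixture's declared expectation.
    Returns (ok, message); ok is False when the outcome contradicts the marker."""
    payload = {k: v for k, v in fixture.items() if k != "_expected_invalid"}
    errors = validate(payload, schema)
    if fixture.get("_expected_invalid", False):
        if errors:
            return True, f"{name}: expected invalid, rejected: {errors[0]}"
        return False, f"{name}: negative fixture passed, schema is too permissive"
    if errors:
        return False, f"{name}: expected valid, rejected: {errors[0]}"
    return True, f"{name}: valid"
```

Removing the "if"/"then" pair from SCHEMA makes check_fixture return ok = False for the P0 fixture, which is exactly the regression signal a negative fixture exists to raise.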
Name: Porting Spec CI to Capstone for high_memory_usage
Problem: In the capstone project, create a Spec CI line for the high_memory_usage scenario. Requirement: REQ-HM-01 "do not restart pod without confirmed RSS > 90% for 5 minutes". Prove the link to plan.md and create a negative fixture without incident_id.
Solution: 1. Ensure requirements.md contains REQ-HM-01 with a clear condition (RSS > 90%, 5 minutes)
2. Add a task with implements: [REQ-HM-01] to plan.md
3. Run: python3 scripts/check_coverage.py --requirements requirements.md --plan plan.md → expected coverage ok
4. Create a fixture with a high_memory_usage payload but without incident_id, and mark it _expected_invalid: true
5. Run validate_schema.py → expected rejection for missing required property incident_id
6. Record in capstone/validation.md:
| Spec CI | check_coverage.py | REQ-HM-01 linked to plan.md | PASS |
| Schema negative | validate_schema.py | missing incident_id blocked | PASS |
Complexity: intermediate
Name: Scope Verification: Blocking a Rogue Operation
Problem: In the educational plan.md, add a step 'TASK-ROGUE: autonomous incident closure via POST /incidents/{id}/force-resolve without operator confirmation, with implements: [REQ-014]'. Verify that scope-check blocks this operation, even though coverage remains green.
Solution: 1. Add TASK-ROGUE to plan.md with implements: [REQ-014] and endpoint POST /incidents/{id}/force-resolve
2. Run check_coverage.py → green (formal link exists)
3. Run check_scope.py (or manual analysis against the incident-response domain model)
4. Expected fail: 'plan.md:48 uses force_resolve without domain permission'
5. Replace with POST /incidents/{id}/ack or add a separate requirement and domain rule
6. Repeat until green status
Complexity: advanced
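A scope check of this kind can be sketched as a lookup against an explicit domain model. The permitted action names below are taken from the chapter; everything else is an assumption for illustration: the actor names, the per-action actor sets, the task dictionary layout, and the function name check_scope. A real check would parse these fields out of plan.md.

```python
# Illustrative domain model: permitted actions and who may perform them.
# The action names come from the chapter; the actor names are assumptions.
DOMAIN_MODEL = {
    "acknowledge": {"on_call_engineer"},
    "escalate": {"on_call_engineer", "automation"},
    "annotate": {"on_call_engineer"},
    "rollback": {"on_call_engineer"},
    "notify_on_call": {"automation"},
}


def check_scope(tasks):
    """Return diagnostics for plan tasks that leave the domain model.

    Each task is a dict with the hypothetical keys: task, line, action, actor.
    """
    diagnostics = []
    for task in tasks:
        allowed_actors = DOMAIN_MODEL.get(task["action"])
        if allowed_actors is None:
            # The verb itself is not part of the domain.
            diagnostics.append(
                f"plan.md:{task['line']}: {task['task']} uses {task['action']} "
                "without domain permission"
            )
        elif task["actor"] not in allowed_actors:
            # The verb is known, but this actor may not perform it.
            diagnostics.append(
                f"plan.md:{task['line']}: {task['task']} lets {task['actor']} "
                f"perform {task['action']}, which the domain model does not permit"
            )
    return diagnostics
```

Because the check never looks at the implements field, a task can be fully linked for coverage and still be rejected here, which is the situation the exercise reproduces.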
Name: Formatting Actionable Diagnostics
Problem: You receive a 'bad' error message: 'Coverage failed: missing REQ'. Rewrite it as a 'good' one that contains four elements: reason, file/line, rule identifier, action. Record the result in JSON format.
Solution: Original (bad): 'Coverage failed: missing REQ'
Transformed (good): { "status": "failed", "check": "coverage", "file": "requirements.md", "line": 42, "rule": "REQ-COV-014", "reason": "REQ-014 has no implementing task in the plan", "action": "Add a task with implements: [REQ-014] to plan.md or remove the requirement from requirements.md" }
Verification: the message is enough to make the fix without searching through files or reading CI logs
Complexity: intermediate
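One way to keep every diagnostic in this shape is to build it from a small record type that refuses to be created without the four elements. The sketch below is illustrative: the Diagnostic class and its methods are hypothetical names, and the field names follow the JSON example in this exercise.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Diagnostic:
    check: str    # which layer failed: coverage, scope, or schema
    file: str
    line: int
    rule: str     # stable rule identifier, e.g. REQ-COV-014
    reason: str
    action: str   # the specific fix the author should apply
    status: str = "failed"

    def __post_init__(self):
        # Reject diagnostics that omit any of the mandatory elements.
        for name in ("file", "rule", "reason", "action"):
            if not getattr(self, name).strip():
                raise ValueError(f"diagnostic is missing {name}")
        if self.line < 1:
            raise ValueError("diagnostic needs a positive line number")

    def to_json(self):
        """Machine-readable form for CI annotations or log processing."""
        return json.dumps(asdict(self), ensure_ascii=False)

    def to_text(self):
        """Single-line form for terminal output."""
        return f"{self.file}:{self.line}: {self.rule}: {self.reason}. {self.action}."


if __name__ == "__main__":
    diagnostic = Diagnostic(
        check="coverage",
        file="requirements.md",
        line=42,
        rule="REQ-COV-014",
        reason="REQ-014 has no implementing task in the plan",
        action="Add a task with implements: [REQ-014] to plan.md "
               "or remove the requirement from requirements.md",
    )
    print(diagnostic.to_json())
    print(diagnostic.to_text())
```

Validating at construction time means a vague message such as 'Coverage failed' cannot be emitted by accident: the check author is forced to supply a rule and an action.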
Case studies: Name: Incident Pipeline: How a Negative Fixture Prevented a False P0 Escalation
Scenario: An SRE team at a fintech company automated processing of Grafana → PagerDuty alerts. validation.md contained webhook examples with fields incident_id, severity, source. The incident_payload.schema.json required backup_verified for severity: 'P0'. The specification was connected to Spec CI with three layers: coverage, scope, schema.
Challenge: A developer 'optimized' the schema by removing the backup_verified condition for P0, assuming that 'there is always a backup'. A week later, another developer added an automatic escalation step to plan.md without changing requirements. Coverage remained green (implements was present), scope too (escalation was in-domain). But the negative fixture invalid-p0-no-backup.json suddenly passed validation — the schema had become too permissive.
Solution: At the schema-check level, Spec CI detected that a negative fixture (marked _expected_invalid: true) had passed validation. The spec_gate status turned red and the merge was blocked. The diagnostics read: 'validation.md:72 schema mismatch: negative fixture invalid-p0-no-backup.json passed, expected reject by missing backup_verified. Action: restore if-then requirement in incident_payload.schema.json: severity P0 requires backup_verified'. The team restored schema strictness and added an explicit requirement REQ-BACKUP-01 to requirements.md.
Result: Prevented a situation where a severity P0 incident without a confirmed backup would automatically escalate to the on-call engineer at 3 AM, even though recovery was impossible. Schema regression reaction time: minutes instead of hours or days. The team established a rule: any schema change requires a pair of fixtures — valid and negative.
Lessons learned: Negative fixtures are regression tests for the schema itself; their passage is an alarm signal, not a success
Changing the schema without changing the fixture set is a classic anti-pattern that Spec CI must block
The three verification layers (coverage, scope, schema) are independent: green status in one does not compensate for red in another
Related concepts: schema-check
fixture
spec_gate
JSON-diagnostics
Name: Rogue Task and Autonomous Closure: Why Coverage Is Insufficient
Scenario: In an incident project, the team implemented Spec CI with coverage verification. Every REQ-* had implements, every task referenced a requirement. In review, this looked like perfect traceability.
Challenge: A product manager asked to 'accelerate' incident resolution. A developer added a task TASK-AUTO-RESOLVE to plan.md with implements: [REQ-014] (a general requirement 'on-call receives escalation confirmation') and endpoint POST /incidents/{id}/force-resolve. The formal link exists, but the content — autonomous closure without an operator — exceeds the incident-response domain model. Coverage-check passed green.
Solution: Scope-check triggered at the domain model level: the force_resolve action is not among permitted operations (acknowledge, escalate, annotate, rollback, notify_on_call). The check considered not just the verb but also the actor (autonomous agent vs on-call engineer) and endpoint. Diagnostics: 'plan.md:48: IR-SCOPE-007 — Autonomous force resolve is outside the incident-response domain model. Action: Replace with POST /incidents/{id}/ack or add an approved requirement and domain rule'.
Result: Pull request blocked. The team discussed and rejected autonomous closure as risky. Instead, they added an explicit requirement REQ-MANUAL-ACK and a task with human confirmation. The domain model remained unchanged, trust in remediation preserved.
Lessons learned: An identifier-based coverage graph is stronger than word search, but insufficient without content verification
Scope-check catches semantic drift: when formal structure is preserved but meaning is changed
The domain model is a contract between the team and the system; its violation is more dangerous than missing tests
Related concepts: scope-check
coverage-check
rogue task
domain model
Study tips: Start with the local runnable example in book2/examples/spec-ci, not with GitHub Actions. First achieve a green gateway locally, then port to CI
Use a pre-commit hook only for changed files; leave full runs for CI — this saves time and makes the gateway a familiar part of the cycle
Practice on 'bad' diagnostics: take messages like 'Coverage failed' and rewrite them into the format with file, line, rule, and action
Create negative fixtures in parallel with positive ones — this protects against 'soft' schema regressions
Keep a 'violation log': record which errors spec_gate caught, and use it for team training
For visual style: draw the requirements → plan → domain → schema graph on paper, mark where each gate triggers
For auditory style: explain aloud to a colleague why a push trigger to main is as important as pull_request (direct updates to service files)
For kinesthetic style: physically delete the implements line from plan.md, run the script, feel the block, then restore
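The pre-commit tip above (run locally only what the changed files require, and leave full runs to CI) can be sketched as a small selector. This is an assumption-laden illustration: the CHECKS_BY_FILE mapping and both function names are hypothetical, and the only external command used is git diff --cached --name-only, which lists staged paths.

```python
import subprocess

# Hypothetical mapping from spec files to the checks they affect.
CHECKS_BY_FILE = {
    "requirements.md": {"coverage"},
    "plan.md": {"coverage", "scope"},
    "validation.md": {"schema"},
}


def staged_files():
    """List staged file paths using `git diff --cached --name-only`."""
    completed = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in completed.stdout.splitlines() if line]


def checks_for(paths):
    """Select the checks worth running locally for the given changed paths."""
    selected = set()
    for path in paths:
        name = path.rsplit("/", 1)[-1]
        if name in CHECKS_BY_FILE:
            selected |= CHECKS_BY_FILE[name]
        elif name.endswith(".schema.json") or "/fixtures/" in f"/{path}":
            # Schema or fixture changes both require the schema check.
            selected.add("schema")
    return sorted(selected)


if __name__ == "__main__":
    # Inside a Git repository, this prints the checks for the staged changes.
    print(checks_for(staged_files()))
```

A hook built this way stays fast enough to run on every commit, while the full three-layer run remains the job of the required CI gate.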
Additional resources: GitHub Spec Kit: https://github.com/github/spec-kit (reference implementation of the SDD approach, where requirements, plan, and tasks become verifiable layers)
Chapter runnable examples: book2/examples/spec-ci/scripts/check_coverage.py and validate_schema.py (local smoke test without external dependencies)
JSON Schema validation: https://json-schema.org/understanding-json-schema/ (reference guide for writing strict fixture contracts)
Volume 1 Part 9: part-09-feature-validation.md (linking validation.md with facts, foundation for understanding fixtures)
Volume 1 Part 16: part-16-team-code-review.md (team review of the proof package, context for automation)
Volume 1 Part 12: part-12-mvp.md (REQ-identifiers and payload schemas in the educational AgentClinic)
Summary: The Specification Gateway (Spec CI) turns requirements.md, plan.md, validation.md, and API contracts from reference documentation into executable artifacts that block merge. The three verification layers — coverage (REQ-* → implements graph), scope (incident-response domain model conformance), schema (JSON fixtures with negative examples) — are independent and complement unit tests. Key benefit: any specification dispute reduces to diagnostics with file, line, rule, and action, reducing reviewer load. For adoption: first local runnable example, then GitHub Actions with a mandatory spec_gate in branch protection, always on push and pull_request. In incident automation, such strictness prevents false escalations, dangerous auto-operations, and loss of trust in remediation.