Reading: Applied Part 7. Specification CI: Specification as an Executable Artifact


Status: Recommendation. Running specification checks in CI is an established practice. The specific set of gates (coverage, scope, schema, spec_gate) and the JSON diagnostics format are a recommended framework that most teams adapt. JSON Schema validation of fixtures from validation.md is standard tool usage.

In this chapter, "specification gate" is a short name for the pipeline that checks the specification the way regular CI checks code. The check itself consists of ordinary steps: parsing requirements.md and plan.md, validating examples against JSON Schema, and checking consistency between files. The other terms (gate, fixture, schema-check, coverage-check) are introduced below, where they first appear in a command or script comment.

The specification gate automates exactly the procedure that Part 9 of the first volume performed manually and Part 16 performed as a team in a pull request. In the educational AgentClinic, REQ identifiers and payload schemas for reviews from Part 12 could still remain an informal agreement. Production leaves no such margin of trust: the same relationships must become a mandatory gate that green unit tests cannot bypass.

Before Reading

  • Foundation from the first volume: Part 9 links validation.md to facts, Part 16 shows review of a proof package.
  • Local educational case: incident payload, because coverage and JSON Schema can be checked locally without CI.
  • Trace for capstone/: one line of Spec CI for high_memory_usage: command, proven fact, and negative example.
  • Key term for the first pass: Spec CI. The terms gate, fixture, schema-check, and coverage-check are reference terms that appear directly in commands and script comments.
  • What to defer: GitHub Actions workflow, scope-gate, and extraction of fixtures from arbitrary validation.md.

Goal

In this chapter, the specification gate turns from the idea of "checking documents" into a working GitHub Actions pipeline for an incident project. Every push and every pull request goes through a mandatory gate. The gate blocks merge on three classes of violations:

  • unfulfilled requirements,
  • out-of-scope actions,
  • JSON Schema errors.

The reader will get a practical repository schema where requirements.md, plan.md, validation.md, and API contracts are checked as executable artifacts, not as reference documentation.

The main gain: the team gets a reproducible blocking mechanism in CI. A dispute about specification quality reduces to a specific line, a rule, and a corrective action.

Minimal Educational Scenario

Educational Case

incident payload: check that requirements.md is linked to plan.md, and that JSON fixtures (input examples we extract from validation.md) contain the mandatory incident_id. The goal is to see Spec CI as a small local gate (a mandatory check without which merge is blocked), not as a large GitHub Actions process.

Preparation

  • book2/examples/spec-ci/requirements.md.
  • book2/examples/spec-ci/plan.md.
  • book2/examples/spec-ci/fixtures/valid-incident.json.
  • book2/examples/spec-ci/fixtures/invalid-missing-incident-id.json.
  • Scripts check_coverage.py and validate_schema.py.

Steps

  1. cd book2/examples/spec-ci. Expected: you are in the runnable example directory.
  2. python3 scripts/check_coverage.py --requirements requirements.md --plan plan.md. *Expected: exit code 0, all REQ-* have a link to the plan.*
  3. python3 scripts/validate_schema.py --schema schemas/incident_payload.schema.json --fixtures fixtures. Expected: valid fixture passes, negative one fails predictably.
  4. Open the error message for the negative fixture. Expected: it is clear which field is missing and which file to fix.
  5. Record in validation.md what exactly blocks the gate: coverage, scope, or schema.

Control Fact

One local run shows two types of truth: a requirement is linked to a plan, and data matches the contract. If a CI error does not point to a file, rule, and action, it is not ready for the team.

How This Gets Into capstone/

Transfer to capstone/validation.md one line of Spec CI: which command was run, what it proved, which negative example was blocked. Do not transfer the full GitHub Actions workflow if it is not created; for the educational minimum, the runnable analog from examples/spec-ci is sufficient.

Minimal fragment:

| Spec CI | `python3 scripts/check_coverage.py ...` | all REQ-* linked to plan | PASS |
| Schema negative | `python3 scripts/validate_schema.py ...` | missing incident_id blocked | PASS |

Transfer to high_memory_usage

The educational example works on incident payload, but in capstone/ you need a line for high_memory_usage. Apply the same two classes of checks to your requirements:

| What we check | Command (educational) | What it proves for high_memory_usage |
| --- | --- | --- |
| Coverage | check_coverage.py --requirements requirements.md --plan plan.md | requirement REQ-HM-01 "do not restart pod without confirmed RSS > 90% for 5 minutes" is linked to a task in plan.md |
| Schema negative | validate_schema.py --schema schemas/incident_payload.schema.json --fixtures fixtures | fixture without incident_id or with severity: "P0" without backup_verified is blocked |

If the Coverage and Schema negative lines for high_memory_usage cannot be written, then requirements.md does not yet have a checkable requirement, or the schema does not yet distinguish P0 without backup.
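
To make "the schema distinguishes P0 without backup" concrete, here is a minimal sketch. The schema fragment uses the standard JSON Schema keywords required, const, and if/then, but its exact layout is an assumption, not the contents of schemas/incident_payload.schema.json. The check function is a hand-rolled evaluator for only this keyword subset, used so the example runs without third-party packages; a real gate would rely on a full JSON Schema validator.

```python
# Hypothetical schema fragment: the field names come from the table above,
# the layout is an assumption made for this sketch.
INCIDENT_SCHEMA = {
    "type": "object",
    "required": ["incident_id", "severity"],
    "if": {"properties": {"severity": {"const": "P0"}}, "required": ["severity"]},
    "then": {"required": ["backup_verified"]},
}


def check(schema, payload):
    """Evaluate only the keyword subset used above: required, const, if/then."""
    errors = [
        f"missing required property {name}"
        for name in schema.get("required", [])
        if name not in payload
    ]
    for name, rule in schema.get("properties", {}).items():
        if name in payload and "const" in rule and payload[name] != rule["const"]:
            errors.append(f"{name} must equal {rule['const']!r}")
    # The "then" branch applies only when the payload satisfies the "if" branch.
    if "if" in schema and not check(schema["if"], payload):
        errors.extend(check(schema.get("then", {}), payload))
    return errors


p0_without_backup = {"incident_id": "INC-1", "severity": "P0"}
print(check(INCIDENT_SCHEMA, p0_without_backup))
```

If the P0 payload above produced an empty list, the schema would be too permissive for the Schema negative line.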

Reviewable Trace

In the educational package, preserve changes to requirements.md, plan.md, validation.md, or the schema. Temporary fixtures created only for local diagnostics are not needed if they did not become part of the regression set.

Key Ideas

Here "executable artifact" does not mean running Markdown as a program. It refers to checking requirements, plans, and examples with ordinary CI scripts.

Run a mandatory GitHub Actions gate on both pull_request and push to the protected branch. Both triggers are needed because a specification violation can enter through two paths: a regular pull request or a direct push that updates the tracked files.

Minimal set of tracked artifacts:

  • requirements.md,
  • plan.md,
  • validation.md,
  • contracts/**,
  • constitution.md — as needed, if it contains domain constraints for the incident pipeline.

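In branch protection settings, mark the final spec_gate job as required. Otherwise, green unit tests can bypass the semantic check. Keep in mind that when a workflow is skipped by a paths filter, its required check stays pending, which blocks pull requests that touch none of the listed files; if that matters for your repository, trigger the workflow on every pull request and decide inside the jobs whether the heavy steps need to run. This scheme aligns with the SDD approach, where requirements, plan, and tasks become checkable layers rather than static text (GitHub Spec Kit).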

> [project script] .github/workflows/spec-ci.yml calls project scripts scripts/spec_ci/*.py.

name: spec-ci

on:
  pull_request:
    paths:
      - 'requirements.md'
      - 'plan.md'
      - 'validation.md'
      - 'contracts/**'
      - 'constitution.md'
  push:
    branches: [main]
    paths:
      - 'requirements.md'
      - 'plan.md'
      - 'validation.md'
      - 'contracts/**'
      - 'constitution.md'

jobs:
  coverage:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - run: python3 scripts/spec_ci/check_coverage.py --requirements requirements.md --plan plan.md --out out/spec-ci/coverage-report.json

  scope:
    runs-on: ubuntu-latest
    needs: [coverage]
    steps:
      - uses: actions/checkout@v4
      - run: python3 scripts/spec_ci/check_scope.py --domain models/incident-response.yaml --plan plan.md --contracts contracts/api.md --out out/spec-ci/scope-violations.ndjson

  schema:
    runs-on: ubuntu-latest
    needs: [coverage, scope]
    steps:
      - uses: actions/checkout@v4
      - run: python3 scripts/spec_ci/extract_fixtures.py --from validation.md --out out/spec-ci/fixtures
      - run: python3 scripts/spec_ci/validate_schema.py --schemas schemas --fixtures out/spec-ci/fixtures --out out/spec-ci/schema-audit.json

  spec_gate:
    runs-on: ubuntu-latest
    needs: [coverage, scope, schema]
    steps:
      - run: echo "Specification gate passed"

Coverage checking starts with the graph requirements → plan, not with matching words. Let's define traceability rules:

  • every user story in requirements.md gets a stable identifier REQ-*;
  • every task or step in plan.md must reference one or more such identifiers via a field like implements: [REQ-014].

What counts as an error. If a story has no traceable task, it is an orphan requirement and the check fails: the team has already broken a promise to the user before implementation begins. The reverse violation is also important. A task without implements becomes a rogue task. This means the plan has started adding functionality not backed by requirements.

> [project script] scripts/spec_ci/check_coverage.py; runnable analog: [examples/spec-ci/scripts/check_coverage.py](examples/spec-ci/).

python3 scripts/spec_ci/check_coverage.py \
  --requirements requirements.md \
  --plan plan.md \
  --out out/spec-ci/coverage-report.json
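
The traceability rules can be sketched as a small graph check. This is not the code of check_coverage.py: it assumes that every plan item sits on one line carrying a TASK-* identifier and an optional implements: [...] field, and that requirement identifiers look like REQ-014 or REQ-HM-01. Both conventions are assumptions made for illustration.

```python
import re

# Assumed identifier formats: REQ-014, REQ-HM-01, TASK-033.
REQ_ID = re.compile(r"\bREQ-(?:[A-Z]+-)?\d+\b")
TASK_ID = re.compile(r"\bTASK-\d+\b")
IMPLEMENTS = re.compile(r"implements:\s*\[([^\]]*)\]")


def check_coverage(requirements_text, plan_text):
    """Return (orphan_requirements, rogue_tasks) as sorted lists."""
    declared = set(REQ_ID.findall(requirements_text))
    implemented, rogue = set(), []
    for line in plan_text.splitlines():
        task = TASK_ID.search(line)
        if not task:
            continue  # not a plan item under the assumed one-line format
        refs = IMPLEMENTS.search(line)
        ids = set(REQ_ID.findall(refs.group(1))) if refs else set()
        if not ids:
            rogue.append(task.group(0))  # task adds work no requirement asks for
        implemented |= ids
    return sorted(declared - implemented), sorted(rogue)


requirements = "REQ-014: escalation confirmation\nREQ-015: audit trail\n"
plan = "- TASK-033 send confirmation, implements: [REQ-014]\n- TASK-034 notify finance\n"
orphans, rogue_tasks = check_coverage(requirements, plan)
print("orphan requirements:", orphans)
print("rogue tasks:", rogue_tasks)
```

The check works on explicit identifiers and edges only; it never tries to match the wording of a story against the wording of a task.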

An out-of-bounds detector is needed for cases where formal traceability exists, but the content of the step goes beyond the incident domain. Match actions from plan.md with the domain model incident-response. Allowed operations, for example:

  • acknowledge,
  • escalate,
  • annotate,
  • rollback,
  • notify_on_call.

Arbitrary business actions (like notify_finance, close_customer_contract, or force_resolve_without_operator) are not included.

Consider not only the verb. Also check:

  • the actor,
  • the endpoint,
  • the trigger condition.

Why this matters. resolve incident may be allowed for a human on-call operator and forbidden for an autonomous agent. The practical rule is simple: if a step cannot be explained through the incident model and the allowed API contract, it blocks the pull request.

> [project script] scripts/spec_ci/check_scope.py. No ready analog exists; implement yourself on top of the domain model and API contract.

python3 scripts/spec_ci/check_scope.py \
  --domain models/incident-response.yaml \
  --plan plan.md \
  --contracts contracts/api.md \
  --out out/spec-ci/scope-violations.ndjson
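
Since no ready analog exists, the following is only a starting sketch of the scope rule. The DOMAIN dictionary stands in for models/incident-response.yaml; the actor names (agent, operator) and the resolve endpoint are hypothetical, and plan steps are assumed to be already parsed into dictionaries. The trigger condition is left out to keep the example short.

```python
# Hypothetical in-memory form of the domain model; the actor names and the
# resolve endpoint are assumptions made for this sketch.
DOMAIN = {
    "acknowledge": {"actors": {"agent", "operator"}, "endpoint": "POST /incidents/{id}/ack"},
    "resolve": {"actors": {"operator"}, "endpoint": "POST /incidents/{id}/resolve"},
}


def check_scope(steps, domain=DOMAIN):
    """Return one violation per plan step that the domain model does not allow."""
    violations = []
    for step in steps:
        rule = domain.get(step["action"])
        if rule is None:
            reason = f"{step['action']} is not an operation of the domain model"
        elif step["actor"] not in rule["actors"]:
            reason = f"{step['action']} is not allowed for actor {step['actor']}"
        elif step["endpoint"] != rule["endpoint"]:
            reason = f"{step['action']} must use {rule['endpoint']}"
        else:
            continue  # verb, actor, and endpoint all match the model
        violations.append({"file": "plan.md", "line": step["line"], "reason": reason})
    return violations


plan_steps = [
    {"line": 31, "action": "acknowledge", "actor": "agent", "endpoint": "POST /incidents/{id}/ack"},
    {"line": 40, "action": "resolve", "actor": "agent", "endpoint": "POST /incidents/{id}/resolve"},
    {"line": 48, "action": "force_resolve", "actor": "agent", "endpoint": "POST /pagerduty/force-resolve"},
]
for violation in check_scope(plan_steps):
    print(f"{violation['file']}:{violation['line']}: {violation['reason']}")
```

The second step shows why the verb alone is not enough: resolve exists in the model, but not for an autonomous agent.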

JSON Schema checks cover the layer of fixtures and payload examples, where silent integration regressions often occur. What to do:

  • extract all JSON blocks from validation.md;
  • convert them into separate fixtures;
  • validate against schemas from schemas/** — for example, incident_payload.schema.json, pagerduty_webhook.schema.json, or grafana_alert.schema.json.

Watch both directions. Valid examples must pass without errors. Deliberately negative examples must fail predictably. If a negative payload passes, the schema is too permissive and does not protect the contract.

Before merging into the protected branch, this is checked as strictly as application tests. An invalid incident_id, incorrect severity, or empty source can break the entire remediation pipeline.

> [project script] scripts/spec_ci/extract_fixtures.py and scripts/spec_ci/validate_schema.py; runnable analog for schema checking: [examples/spec-ci/scripts/validate_schema.py](examples/spec-ci/). Implement the extraction step yourself for your project's validation.md format.

python3 scripts/spec_ci/extract_fixtures.py \
  --from validation.md \
  --out out/spec-ci/fixtures

python3 scripts/spec_ci/validate_schema.py \
  --schemas schemas \
  --fixtures out/spec-ci/fixtures \
  --out out/spec-ci/schema-audit.json
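
One possible shape for the extraction step and the two-directional verdict is sketched below. It assumes that validation.md keeps its examples in fenced json blocks and that negative examples carry the _expected_invalid marker used by the runnable example. The required-field test is a deliberately simple stand-in for real JSON Schema validation.

```python
import json
import re

FENCE = "`" * 3  # built this way so the marker does not close this code block
JSON_BLOCK = re.compile(FENCE + r"json\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_fixtures(markdown_text):
    """Return every fenced json block of the document as a parsed object."""
    return [json.loads(body) for body in JSON_BLOCK.findall(markdown_text)]


def audit(fixtures, required=("incident_id", "severity", "source")):
    """Check both directions: valid fixtures pass, negative ones are rejected."""
    failures = []
    for index, fixture in enumerate(fixtures):
        expected_invalid = fixture.get("_expected_invalid", False)
        missing = [name for name in required if not fixture.get(name)]
        if missing and not expected_invalid:
            failures.append(f"fixture {index}: missing {', '.join(missing)}")
        elif not missing and expected_invalid:
            failures.append(f"fixture {index}: negative example passed, schema too permissive")
    return failures


document = (
    "Valid payload:\n" + FENCE + "json\n"
    '{"incident_id": "INC-1", "severity": "P2", "source": "grafana"}\n' + FENCE + "\n"
    "Negative payload:\n" + FENCE + "json\n"
    '{"_expected_invalid": true, "severity": "P2", "source": "grafana"}\n' + FENCE + "\n"
)
print(audit(extract_fixtures(document)))
```

An empty list means both directions hold: the valid payload was accepted and the negative one was rejected.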

Design the CI failure diagnostics for quick fixes, not for log investigation.

Bad:

> Coverage failed: missing REQ

Problem: the reviewer does not know which requirement was orphaned and where to look. The error cannot be fixed without separate investigation.

Good:

> requirements.md:42: REQ-014 has no reference in plan.md. Add a task with implements: REQ-014 to plan.md or remove the requirement.

In every error, specify four elements:

  • a clear reason,
  • a reference to file and line,
  • an identifier for the violated rule,
  • a specific action for the specification team.

Typical messages for each error type:

  • coverage: REQ-021 has no implementing plan item; add implements: [REQ-021] to plan.md or remove the requirement;
  • scope: plan.md:48 uses force_resolve without domain permission;
  • schema: validation.md:72 missing required property incident_id.

This format reduces the load on the reviewer. The person checks the meaning of the change, rather than reconstructing what exactly broke the CI.

{
  "status": "failed",
  "check": "scope",
  "file": "plan.md",
  "line": 48,
  "rule": "IR-SCOPE-007",
  "reason": "Autonomous force resolve is outside the incident-response domain model",
  "action": "Replace with POST /incidents/{id}/ack or add an approved requirement and domain rule"
}
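
A record like the one above can be produced by a small helper that refuses to emit incomplete diagnostics. The class below is a sketch: the field names mirror the JSON example, while the class name, the validation rules, and the one-line text rendering are assumptions rather than part of the project scripts.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Diagnostic:
    check: str
    file: str
    line: int
    rule: str
    reason: str
    action: str
    status: str = "failed"

    def __post_init__(self):
        # Reject diagnostics that omit any of the four mandatory elements.
        for name in ("reason", "file", "rule", "action"):
            if not getattr(self, name):
                raise ValueError(f"diagnostic is missing {name}")
        if self.line < 1:
            raise ValueError("diagnostic needs a 1-based line number")

    def to_text(self):
        """One line for the CI log: location, rule, reason, action."""
        return f"{self.file}:{self.line}: [{self.rule}] {self.reason}. {self.action}"

    def to_json(self):
        """Machine-readable form with the same fields as the JSON example."""
        return json.dumps(asdict(self), sort_keys=True)


diagnostic = Diagnostic(
    check="scope",
    file="plan.md",
    line=48,
    rule="IR-SCOPE-007",
    reason="Autonomous force resolve is outside the incident-response domain model",
    action="Replace with POST /incidents/{id}/ack or add an approved requirement and domain rule",
)
print(diagnostic.to_text())
print(diagnostic.to_json())
```

Raising an error on an empty rule or action moves the "fixable without investigation" requirement from reviewer discipline into the tooling.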

Examples and Application

flowchart LR
A[pre-commit hook]
B[local quick run and light duel before push]
C[PR push]
D[changed files selection]
E[check_coverage requirements plan tasks graph]
F[check_scope domain model and contracts/api]
G[check_schema validation and counterexamples]
H[gate report and PR status]
A --> B --> C --> D --> E --> F --> G --> H

A typical pull request in the educational incident repository changes three files:

  • requirements.md,
  • plan.md,
  • validation.md.

The author describes the story REQ-014: as an on-call engineer, I want to receive escalation confirmation. The author then adds task TASK-033 with implements: [REQ-014] to the plan and places a webhook payload example with the fields incident_id, severity, source, and escalation_target in validation.md.

What is checked. The coverage check passes if the link REQ-014 → TASK-033 exists. The scope check passes if the action matches the domain model. The schema check passes if the payload matches the contract. If any of the three layers breaks, spec_gate returns a red status and GitHub does not allow merge.

A telling failure: the author tries to "speed up" processing and adds to plan.md a step POST /pagerduty/force-resolve without a separate requirement and without permission in the domain model. Coverage may remain green if the step is formally linked to an existing story. But the scope check will block the pull request: autonomous incident closure without operator confirmation is not among the agreed operations.

If the same pull request adds to validation.md a payload with event_code instead of the mandatory incident_id, the schema check produces an independent blocker. The team gets two different classes of errors:

  • semantic out-of-bounds,
  • data structure violation.

A local quick run before push saves time and makes the specification gate a habitual part of the work cycle. In pre-commit, run only changed files. Leave the full process to GitHub Actions, so as not to slow the developer with a long check of all fixtures.

For an incident project, a command that does three things is sufficient:

  • builds the coverage graph,
  • checks scope by diff,
  • validates affected JSON blocks.

If the local report already shows orphan requirement, rogue task, or schema mismatch, the author fixes the specification before creating a pull request, not after getting a red status in remote CI.
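
For the pre-commit case, deciding which checks to run from the list of changed files might look like the sketch below. The mapping of files to checks is an assumption derived from the inputs each script takes; the list of changed paths would typically come from a command such as git diff --cached --name-only.

```python
from fnmatch import fnmatch

# Assumed mapping from a check to the files whose change should trigger it.
TRIGGERS = {
    "coverage": ["requirements.md", "plan.md"],
    "scope": ["plan.md", "contracts/*", "models/*"],
    "schema": ["validation.md", "schemas/*"],
}


def select_checks(changed_files):
    """Return the names of the checks affected by the changed files, in gate order."""
    return [
        check
        for check, patterns in TRIGGERS.items()
        if any(fnmatch(path, pattern) for path in changed_files for pattern in patterns)
    ]


print(select_checks(["plan.md"]))
print(select_checks(["validation.md", "README.md"]))
```

A change that touches none of the tracked files selects no checks, so the hook can exit immediately.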

> [project script] — example local wrapper for scripts/spec_ci/*.py.

#!/usr/bin/env bash
set -euo pipefail

python3 scripts/spec_ci/check_coverage.py \
  --requirements requirements.md \
  --plan plan.md \
  --out out/spec-ci/coverage-report.json

python3 scripts/spec_ci/check_scope.py \
  --domain models/incident-response.yaml \
  --plan plan.md \
  --contracts contracts/api.md \
  --out out/spec-ci/scope-violations.ndjson

python3 scripts/spec_ci/extract_fixtures.py \
  --from validation.md \
  --out out/spec-ci/fixtures

python3 scripts/spec_ci/validate_schema.py \
  --schemas schemas \
  --fixtures out/spec-ci/fixtures \
  --out out/spec-ci/schema-audit.json

Summary

The specification gate makes the specification an executable arbiter of the repository. GitHub Actions blocks the pull request on three classes of violations:

  • uncovered user stories,
  • extraneous scenarios in the plan,
  • JSON Schema errors in validation examples.

For the team, this changes the nature of review. Instead of a subjective argument about requirement completeness, a diagnostic report appears with file, line, rule, and action.

In incident automation, such strictness is especially important. An incorrect scope or a weak payload contract can lead to three consequences:

  • false escalations,
  • dangerous auto-operations,
  • loss of trust in remediation.

Next, this pipeline will become the basis for file-based arbitration of disputed changes.

The minimal runnable set for this chapter is in examples/spec-ci/. Go through it before implementing the full GitHub Actions process. First achieve a green local gate. Then transfer the same commands to CI.

> [runnable] — runnable example: examples/spec-ci/scripts/check_coverage.py and examples/spec-ci/scripts/validate_schema.py.

cd book2/examples/spec-ci
python3 scripts/check_coverage.py --requirements requirements.md --plan plan.md
python3 scripts/validate_schema.py --schema schemas/incident_payload.schema.json --fixtures fixtures

Artifacts and Readiness Criteria

| Artifact | Ready when |
| --- | --- |
| Local run of book2/examples/spec-ci | smoke-pass without external dependencies |
| Coverage check requirements → plan | every REQ-* has an implementing task, every task has implements |
| JSON Schema check | valid fixture passes, negative one fails predictably |
| Record in validation.md | gate error message points to file, rule, and action for correction |

The full track adds .github/workflows/spec-ci.yml or its project analog, out/spec-ci/coverage-report.json for the requirements → plan graph, out/spec-ci/scope-violations.ndjson with domain model violations, out/spec-ci/schema-audit.json for fixtures from validation.md, and a local quick run wrapper. Consider it ready if the scope check blocks autonomous actions outside the incident-response model, the spec_gate task is mandatory in the protected branch, and the CI diagnostic format specifies file, line, rule identifier, and action.

Practice

  1. cd book2/examples/spec-ci && python3 scripts/check_coverage.py --requirements requirements.md --plan plan.md. *Expected: exit code 0; stdout is a single line, coverage ok: 3 requirements covered.*
  2. python3 scripts/validate_schema.py --schema schemas/incident_payload.schema.json --fixtures fixtures. *Expected: exit code 0; stdout contains valid-incident.json: valid and invalid-missing-incident-id.json: expected invalid, rejected: missing required property incident_id (the negative fixture is marked _expected_invalid: true and is therefore counted as successfully rejected).*
  3. Transfer to capstone/validation.md one line of Spec CI: "coverage ok: 3/3, schema ok: 2/2 (negative rejected by missing required property incident_id)". *Expected: on the next regression, the line lets you recover what exactly blocks merge without reading CI logs.*

Review Questions

  1. Why is word-based coverage weaker than the requirements → plan graph?
  2. Which violations should the scope check catch, and which should it not?
  3. What makes a CI error fixable without investigation?
  4. The specification gate blocks merge due to a mismatched REQ-ID. The programmer wants to add REQ-ID to an existing plan item and merge the pull request. What is dangerous in this approach?

Production SDD for Qwen Code CLI. Part 2