Study guide: Applied Part 1. Recovering Specifications from Legacy

Lesson 3 of 5 in module «Applied Part 1. Recovering Specifications from Legacy»


Topic: Applied Part 1. Recovering Specifications from Legacy

Difficulty level: Medium

Estimated study time: 4-5 hours

Prerequisites: Knowledge of the material from Part 13 of Volume One (recovering the constitution of an existing project).

Understanding of cloud infrastructure principles (Kubernetes, monitoring, alerts).

Basic skills working with JSON, YAML, and Markdown formats.

General understanding of the SDD (Specification-Driven Development) concept.

Learning objectives: Learn to clearly separate actual requirements (contracts) from background infrastructure context (memory bank).

Master the collection and normalization of evidence (evidence_ref) from disparate sources (logs, Slack, post-mortems) into a single timeline.

Gain the skill of extracting implicit rules and converting them into verifiable statements (claims) using AI (Qwen Code).

Be able to describe recovered requirements in behavior language (Given/When/Then) and design machine-readable contracts (JSON Schema) for them.

Learn to maintain a registry of requirement origins (genealogy.md) to ensure transparency and auditability of specifications.

Overview: This study module examines the process of recovering an engineering-grade specification (SDD) from disparate artifacts of legacy systems, such as unstructured logs, operator chats, and post-mortems. Using the example of the AgentClinic educational production model, you will learn how to turn data chaos into a strict, verifiable contract. You will master techniques for normalizing timelines, using AI to extract requirements with an evidence base, and separating business logic from infrastructure context (memory bank). The course emphasizes that every recovered requirement must be backed by sources, not guesses.

Key concepts: Spec necromancy (recovering specifications): An engineering technique for reconstructing specifications based on observable artifacts (logs, metrics, chats). It allows restoring system logic after team turnover, relying on a verifiable chain of evidence instead of abstract guesses.

Memory bank (background model): A separate layer of infrastructure context (historical agreements, cluster topology, team names). This information helps interpret facts, but is not itself a business requirement (contract) and must not enter the triage logic.

Evidence ref (evidence tag): A reference to a specific place in the source artifact (for example, a line in a log, a Slack message) that confirms the truth of a statement. A requirement without an evidence_ref is considered merely a hypothesis.

Genealogy.md (provenance registry): A file that describes the origin of each requirement. Unlike git log, it shows not only who and when changed the file, but also where the rule came from, the level of confidence (uncertainty), supporting sources, and open questions.

Candidate statement (claim): A pattern or rule extracted from data, supplied with evidence, counterexamples, and a confidence estimate. A candidate's status can be 'approved', 'needs_clarity', or 'rejected'.

Normalized timeline chain: A sequence of events brought to a unified time (UTC) and format, cleared of duplicates. It links logs, alerts, and operator actions in chronological order.
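As an illustration of the normalized timeline chain, here is a minimal Python sketch. The event shape (`ts`, `source`, `message`) and the sample values are assumptions made for this example, not a format prescribed by the course; it also assumes every timestamp carries an explicit UTC offset.

```python
from datetime import datetime, timezone


def normalize_timeline(events):
    """Convert timestamps to UTC, drop exact duplicates, sort chronologically."""
    seen = set()
    timeline = []
    for event in events:
        ts = datetime.fromisoformat(event["ts"])
        if ts.tzinfo is None:
            # Assumption: a timestamp without an offset is ambiguous, so reject it.
            raise ValueError(f"timestamp without offset: {event['ts']}")
        ts_utc = ts.astimezone(timezone.utc)
        key = (ts_utc, event["source"], event["message"])
        if key in seen:
            continue  # same instant, source, and text: treat as a duplicate
        seen.add(key)
        timeline.append(
            {"ts_utc": ts_utc, "source": event["source"], "message": event["message"]}
        )
    timeline.sort(key=lambda e: e["ts_utc"])
    return timeline


# Hypothetical sample: the two grafana rows are the same instant in different offsets.
raw_events = [
    {"ts": "2026-05-17T13:05:00+03:00", "source": "slack", "message": "not a planned deployment"},
    {"ts": "2026-05-17T10:00:00+00:00", "source": "grafana", "message": "memory_percent=92"},
    {"ts": "2026-05-17T12:00:00+02:00", "source": "grafana", "message": "memory_percent=92"},
]
timeline = normalize_timeline(raw_events)
for entry in timeline:
    print(entry["ts_utc"].isoformat(), entry["source"], entry["message"])
```

Deduplicating only after conversion to UTC matters here: the two Grafana rows look different as raw strings and only collapse once they share a time base.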

Practice exercises: Name: Separating the contract from background context

Problem: You are given a list of statements from an incident post-mortem:

1. 'The appointments-api service began consuming >90% memory in 10 minutes.'
2. 'On-duty engineer Ivan reported in Slack channel #incidents that this was not a planned deployment.'
3. 'The deployment occurred in the canary namespace.'
4. 'The NOC team received the alert after 15 minutes.'

Task: Separate these facts into two categories: 'Requirements (SDD)' and 'Memory Bank'. Justify your decision.

Solution: 1 and 4 are Requirements (SDD). They describe the observable behavior of the system and the SLA (memory consumption trigger and response time). 2 and 3 are Memory Bank. The on-duty engineer's name and the fact of using the canary namespace are context that helps understand the situation, but should not be hard-coded into the business logic of the triage pipeline as a universal rule.

Complexity: beginner

Name: Forming an entry in genealogy.md

Problem: Based on the educational excerpt:

grafana:HM-2026-05-17-01 cluster=prod-k8s memory_percent=92 window=10m
postmortem:api-memory-2026-05 note='auto-resolve was rejected until stable'

Form a YAML fragment for the genealogy.md file. Statement: 'When memory_percent >= 90% over 10m for appointments-api, a P1 is created'. Specify the status 'needs_clarity', since there is no data on closure conditions.

Solution:

- claim: "When memory_percent >= 90% over 10m for appointments-api, a P1 is created."
  status: needs_clarity
  evidence_ref:
    - "grafana:HM-2026-05-17-01"
    - "postmortem:api-memory-2026-05"
  uncertainty: medium
  open_questions:
    - "Is the prohibition on auto-resolve without stable windows confirmed?"
    - "What is the exact threshold for closing the incident?"

Complexity: intermediate
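A genealogy entry like the one in this exercise can be checked mechanically. The sketch below is one possible review function, assuming the entry has already been loaded into a Python dict with the same keys as the YAML fragment. The function name `review_entry` is hypothetical, and the rule that an approved claim may not carry open questions is an assumption derived from the course's advice, not a stated requirement.

```python
ALLOWED_STATUSES = {"approved", "needs_clarity", "rejected"}


def review_entry(entry):
    """Return a list of problems found in one genealogy entry (empty list means OK)."""
    problems = []
    if entry.get("status") not in ALLOWED_STATUSES:
        problems.append("unknown status")
    if not entry.get("evidence_ref"):
        # Per the key concepts: without evidence_ref a claim is only a hypothesis.
        problems.append("no evidence_ref: treat as hypothesis")
    if entry.get("status") == "approved" and entry.get("open_questions"):
        # Assumption for this sketch: open questions block approval.
        problems.append("approved claim still has open questions")
    return problems


entry = {
    "claim": "When memory_percent >= 90% over 10m for appointments-api, a P1 is created.",
    "status": "needs_clarity",
    "evidence_ref": ["grafana:HM-2026-05-17-01", "postmortem:api-memory-2026-05"],
    "uncertainty": "medium",
    "open_questions": ["What is the exact threshold for closing the incident?"],
}
unsupported = {"claim": "Canary deployments never page the NOC.", "status": "approved"}

print(review_entry(entry))
print(review_entry(unsupported))
```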

Name: Double recording of the specification (Given/When/Then + JSON Schema)

Problem: Translate the recovered requirement into Given/When/Then format and write a minimal JSON Schema for validating the trigger threshold and SLA. Requirement: 'If >=3 NodeNotReady events are recorded in 10 minutes on a single node, a P1 incident is created with an expected response time of 8 minutes.'

Solution:

Given the cluster is in active shift and the monitoring system is recording metrics
When >=3 NodeNotReady events arrive for a single node within 10 minutes
Then the system creates an incident with severity=P1 and sets a response SLA of 8 minutes

JSON Schema:

{
  "$id": "urn:spec:node-not-ready:v1",
  "type": "object",
  "required": ["rule_id", "severity", "sla_minutes", "conditions"],
  "properties": {
    "rule_id": {"type": "string"},
    "severity": {"type": "string", "enum": ["P0", "P1", "P2", "P3"]},
    "sla_minutes": {"type": "integer", "minimum": 1, "maximum": 120},
    "conditions": {
      "type": "object",
      "required": ["event_code", "count", "window_minutes"],
      "properties": {
        "event_code": {"type": "string"},
        "count": {"type": "integer", "minimum": 3},
        "window_minutes": {"type": "integer", "minimum": 1}
      }
    }
  }
}

Complexity: advanced
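To see how the Given/When/Then statement from this exercise could be exercised against data, here is a small Python sketch of the trigger itself. It assumes the input has already been filtered to NodeNotReady events and is given as `(timestamp, node)` pairs, and that the 10-minute window is inclusive at both ends. These details, and the function name, are assumptions of the example rather than part of the recovered requirement.

```python
from datetime import datetime, timedelta


def evaluate_node_not_ready(events, count=3, window_minutes=10, sla_minutes=8):
    """Return one P1 incident per node that has >= count events inside the window."""
    window = timedelta(minutes=window_minutes)
    by_node = {}
    for ts, node in events:
        by_node.setdefault(node, []).append(ts)

    incidents = []
    for node, stamps in by_node.items():
        stamps.sort()
        # Slide over every run of `count` consecutive events for this node.
        for i in range(len(stamps) - count + 1):
            if stamps[i + count - 1] - stamps[i] <= window:
                incidents.append(
                    {"node": node, "severity": "P1", "sla_minutes": sla_minutes}
                )
                break  # one incident per node is enough
    return incidents


# Hypothetical sample: worker-07 crosses the threshold, worker-03 does not.
events = [
    (datetime(2026, 5, 17, 10, 0), "worker-07"),
    (datetime(2026, 5, 17, 10, 4), "worker-07"),
    (datetime(2026, 5, 17, 10, 9), "worker-07"),
    (datetime(2026, 5, 17, 10, 0), "worker-03"),
    (datetime(2026, 5, 17, 10, 20), "worker-03"),
]
incidents = evaluate_node_not_ready(events)
print(incidents)
```

The defaults mirror the values in the requirement (3 events, 10 minutes, 8-minute SLA), so a test can change one parameter at a time to probe whether the threshold is a hard rule.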

Case studies: Name: Recovering escalation logic after SRE team turnover

Scenario: A project for automatic incident management experienced the departure of key specialists. Left behind were 47 pages of unfiltered logs, Slack threads, dashboard screenshots, and text post-mortems. The new team needs to build a triage pipeline based on Qwen Code, but no formal specification document (SDD) exists.

Challenge: The information is chaotic: logs contain over 1200 events with different time zones, chats mix real incidents with planned work discussions. There is a high risk that the AI model (Qwen Code) will perceive a random phrase from a chat or a specific cluster topology as a universal business rule.

Solution: The team applied the 'Spec necromancy' method:

1. Inventory and normalization of data: bringing all timestamps to UTC, filtering noise, creating a single timeline.
2. Separation of layers: identifying verifiable requirements (triggers, SLAs, closure conditions) and moving everything else to the Memory Bank.
3. Extracting requirements via AI: using Qwen Code in analysis mode with a strict requirement to provide evidence_ref for every statement.
4. Maintaining genealogy.md: recording the origin of each rule, to distinguish facts firmly confirmed by post-mortems from unconfirmed hypotheses.

Result: Instead of a set of plausible guesses, the team obtained an engineering-grade specification. The agreed-upon contract was expressed in Given/When/Then and JSON Schema format, which made it possible to automatically validate the behavior of the triage pipeline.

Lessons learned: Never mask disputed hypotheses as approved contracts. Use the needs_clarity status and uncertainty level.

The window filter (for example, [-15m,+5m] relative to the alert) is critically important for linking manual actions in chat with automatic events in logs.

Exceptions (for example, canary namespace) should not be removed as noise; they often point to hidden specification conditions.
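The window filter mentioned in the lessons above can be sketched in a few lines of Python. The message shape (`ts`, `text`), the sample chat lines, and the choice of inclusive bounds are assumptions for illustration only; all timestamps are taken to be already normalized to UTC.

```python
from datetime import datetime, timedelta, timezone


def link_chat_to_alert(alert_ts, chat_messages,
                       before=timedelta(minutes=15), after=timedelta(minutes=5)):
    """Keep chat messages that fall inside [-before, +after] around the alert."""
    lower, upper = alert_ts - before, alert_ts + after
    return [m for m in chat_messages if lower <= m["ts"] <= upper]


def utc(hour, minute):
    """Helper for the hypothetical sample data below."""
    return datetime(2026, 5, 17, hour, minute, tzinfo=timezone.utc)


alert_ts = utc(10, 0)
chat_messages = [
    {"ts": utc(9, 40), "text": "planning tomorrow's maintenance"},  # too early
    {"ts": utc(9, 50), "text": "memory on appointments-api looks odd"},
    {"ts": utc(10, 3), "text": "this was not a planned deployment"},
    {"ts": utc(10, 6), "text": "lunch?"},  # too late
]
linked = link_chat_to_alert(alert_ts, chat_messages)
print([m["text"] for m in linked])
```

The asymmetric window reflects the idea that operators often notice symptoms before the alert fires but react to it within minutes; the exact bounds should come from the evidence, not from this sketch.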

Related concepts: evidence_ref

memory bank

genealogy.md

Spec necromancy

Qwen Code

Name: Analyzing the node_not_ready incident: identifying hidden thresholds

Scenario: A specific historical incident NR-2026-05-17-01 is being analyzed. Grafana shows 3 NodeNotReady events on node worker-07 over 10 minutes. The system created a P1 escalation. The post-mortem states: 'auto-resolve was rejected until two stable OK windows'.

Challenge: It is required to recover the exact rules: when exactly an event becomes P1, and when it can be closed automatically. Engineers need to understand whether the threshold of 3 events is a hard rule or a coincidence, and how exactly to check 'stable OK windows' in the code.

Solution:

1. Compose a behavioral history in Given/When/Then format.
2. Formulate a candidate statement (Claim): 'When >=3 NodeNotReady occur in 10 minutes, a P1 is created'.
3. Bind evidence (link to Grafana and post-mortem) via evidence_ref.
4. Record the closure condition in a JSON Schema that requires the presence of two consecutive OK windows.
5. Highlight the disputed fact (canary namespace) as an open question marked with uncertainty: medium.

Result: A verifiable specification was created that restores the triage logic. The disputed point about the canary namespace did not enter the final SDD as a universal rule, but remained in the status of a hypothesis requiring verification on historical data.
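One way to make the closure condition from this case checkable in code is shown below. The sketch assumes that each evaluation window has already been reduced to a status string and that "two stable OK windows" means the two most recent consecutive windows are both "OK". That reading is an interpretation, which is exactly the kind of point the service owner should confirm.

```python
def can_auto_resolve(window_statuses, required_ok=2):
    """True if the most recent `required_ok` windows are all "OK"."""
    if len(window_statuses) < required_ok:
        return False
    return all(status == "OK" for status in window_statuses[-required_ok:])


# Hypothetical window histories, oldest first.
print(can_auto_resolve(["FIRING", "OK"]))
print(can_auto_resolve(["FIRING", "OK", "OK"]))
print(can_auto_resolve(["OK", "OK", "FIRING"]))
```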

Lessons learned: A smooth textual formulation of a requirement is less useful than a record showing where the requirement is solid and where it requires verification by the service owner.

Double recording (behavior + JSON Schema) eliminates the gap between human understanding and machine validation.

Related concepts: Candidate statement (Claim)

Normalized timeline chain

JSON Schema

Given/When/Then

Study tips: Start with a narrow focus: for the first practical step, choose one claim, two sources, and one open question. Don't try to recover the entire architecture at once.

Train the separation of layers: when reading any post-mortem, ask yourself whether a statement describes the observable behavior of a contract or is simply context for the situation.

Practice working with Qwen Code in headless mode (Plan Mode): demand that the model return not finished text, but structured JSON with source, counterexample, and missing_context fields.

Pay attention to the difference between git blame and genealogy.md. Git will show who added a line of code, while genealogy.md will explain which logs and chats the business decision was based on.
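The tip about demanding structured JSON from the model can be backed by a small gate on the receiving side. The Python sketch below only checks that the three fields named in the tips are present. The sample response is hand-written for illustration and is not actual Qwen Code output, and the function name is hypothetical.

```python
import json

REQUIRED_FIELDS = ("source", "counterexample", "missing_context")


def parse_claim_response(raw):
    """Parse a model response and refuse it if any required field is absent."""
    data = json.loads(raw)
    missing = [field for field in REQUIRED_FIELDS if field not in data]
    if missing:
        raise ValueError(f"model response lacks fields: {missing}")
    return data


# Hand-written sample response (an assumption, not real model output).
raw_response = json.dumps({
    "claim": "When >=3 NodeNotReady occur in 10 minutes, a P1 is created",
    "source": ["grafana:HM-2026-05-17-01"],
    "counterexample": None,
    "missing_context": ["closure threshold"],
})
parsed = parse_claim_response(raw_response)
print(parsed["source"])
```

Rejecting incomplete responses at this point keeps free-form prose from slipping into the claim list without a source attached.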

Additional resources: Part 13 of Volume One of the textbook: Basic material on recovering the constitution of an existing project. Recommended reading before starting this module.

Part 8 (multi-agent arbitration): Advanced material for the full production track. Describes the roles of Verifier, Implementor, and Safety for resolving disputed specifications.

GitHub Spec Kit: An external resource describing the SDD philosophy of "specification as an executable artifact".

genealogy.md template (book2/examples/templates/): A practical template required for completing the course exercises.

Summary: Recovering specifications from legacy is the process of transforming the chaos of historical data into a strict, validatable contract. The key to success lies in a strict separation of actual requirements (SDD) and infrastructure context (Memory Bank). Using AI helps extract requirement candidates, but every such statement must be backed by evidence (evidence_ref). Double codification (human-readable Given/When/Then and machine-readable JSON Schema) combined with maintaining the origin registry (genealogy.md) ensures that the specification will be not just a set of plausible guesses, but an auditable engineering artifact.