Study guide: Applied Part 10. Protecting Metrics from Goodhart's Law: Guardrail Metrics and Emergency Mode

Lesson 3 of 5 in module «Applied Part 10. Protecting Metrics from Goodhart's Law: Guardrail Metrics and Emergency Mode»


Topic: Applied Part 10. Protecting Metrics from Goodhart's Law: Guardrail Metrics and Emergency Mode

Difficulty level: Medium

Estimated study time: 2-3 hours

Prerequisites: Familiarity with the basics of SRE (SLO, SLI, error budgets)

Understanding of CI/CD principles and automated tests

Basic working knowledge of Python and YAML/JSON

Familiarity with the materials of Part 9 (Feature validation) and Part 20 (SDD Antipatterns)

Learning objectives: Understand Goodhart's Law and the risks of isolated KPI optimization using MTTR as an example.

Learn to distinguish target optimization metrics from inviolable quality invariants (guard metrics).

Master setting up and using the validation.yaml file to create a protective circuit.

Gain practical skills in configuring a CI gateway (emergency mode) to block releases when guard metrics are violated.

Learn to analyze metric networks to identify hidden distortions and drift in decision-making.

Overview: This section covers protecting machine learning systems and automated incident triage from metric gaming (Goodhart's Law). When a team optimizes a single key indicator (for example, mean time to recovery, MTTR), the model can start to "cheat": it closes incidents faster at the expense of quality and lets critical events pass without escalation. To prevent this, the section introduces guard metrics (silent_p0, manual_review_rate) and strict invariants on them. You will learn to configure a CI gateway (the "red button") that automatically blocks a release when an improvement in the target KPI violates the quality protective circuit, and you will become familiar with tools for detecting hidden drift in decision-making.

Key concepts: Goodhart's law: The principle that "when a measure becomes a target, it ceases to be a good measure." In the context of SRE, this means that optimizing a metric (for example, MTTR) without taking side effects into account leads to degradation of the real quality of the system.

Guard metric: A metric paired with the target KPI that protects the system from hidden damage. Examples: silent_p0 (share of silent critical incidents), manual_review_rate (share of manual reviews).

Quality invariant: A strict condition describing the minimum acceptable state of the system, which cannot be violated for the sake of optimizing other metrics. Example: audit_trace_coverage == 100%.
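The pairing of guard metrics with invariants can be sketched in Python. This is a minimal illustration, not the course's validator: the Invariant structure, the operator mini-syntax, and the name audit_trace_full are assumptions made for this example, while the invariant names silent_p0_cap and manual_review_floor and the thresholds (5%, 15%, 100%) are the ones quoted in this guide.

```python
import operator
from dataclasses import dataclass

# Comparison operators an invariant may use (hypothetical mini-syntax).
OPS = {"<=": operator.le, ">=": operator.ge, "==": operator.eq}


@dataclass(frozen=True)
class Invariant:
    name: str         # label reported in the result, e.g. "silent_p0_cap"
    metric: str       # key to look up in the metrics dict
    op: str           # one of the keys in OPS
    threshold: float  # shares are expressed as fractions (0.05 == 5%)


def evaluate(invariants, metrics):
    """Return {invariant name: "PASS" | "FAIL"}; a missing metric counts as FAIL."""
    results = {}
    for inv in invariants:
        value = metrics.get(inv.metric)
        ok = value is not None and OPS[inv.op](value, inv.threshold)
        results[inv.name] = "PASS" if ok else "FAIL"
    return results


# Thresholds come from this guide; the data layout is an assumption.
INVARIANTS = [
    Invariant("silent_p0_cap", "silent_p0", "<=", 0.05),
    Invariant("manual_review_floor", "manual_review_rate", ">=", 0.15),
    Invariant("audit_trace_full", "audit_trace_coverage", "==", 1.0),
]

good = {"silent_p0": 0.02, "manual_review_rate": 0.18, "audit_trace_coverage": 1.0}
print(evaluate(INVARIANTS, good))
```

Treating a missing metric as FAIL is a deliberate choice in this sketch: a guard metric that silently disappears from the report should not let a release through.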

Emergency mode (red button / CI block): A blocking gateway in continuous integration (CI) that interrupts the build or deployment if the target optimization has led to a violation of quality invariants.

Behavior drift (edge drift): A hidden change in decision-making patterns (for example, the distribution of incident closure reasons) that is not visible on top-level aggregated metrics but can lead to missing severe incidents.
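One way to make this kind of drift measurable is to compare the distribution of closure reasons before and after a change. The sketch below assumes edge_drift is the total variation distance between the two distributions; this guide does not state the formula used by the course scripts, so treat the metric definition, the function names, and the sample data as illustrative. Only the 0.12 threshold is taken from the exercises.

```python
from collections import Counter

DRIFT_THRESHOLD = 0.12  # allowed deviation corridor used in this guide


def distribution(reasons):
    """Turn a list of closure reasons into {reason: share}."""
    if not reasons:
        raise ValueError("cannot build a distribution from zero decisions")
    counts = Counter(reasons)
    return {reason: n / len(reasons) for reason, n in counts.items()}


def edge_drift(baseline_reasons, new_reasons):
    """Total variation distance between two closure-reason distributions (0..1)."""
    p = distribution(baseline_reasons)
    q = distribution(new_reasons)
    return 0.5 * sum(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in set(p) | set(q))


# Illustrative data: more incidents are being closed as false positives.
baseline = ["resolved"] * 70 + ["false_positive"] * 20 + ["escalated"] * 10
shifted = ["resolved"] * 60 + ["false_positive"] * 35 + ["escalated"] * 5

drift = edge_drift(baseline, shifted)
print(f"edge_drift={drift:.2f} blocked={drift > DRIFT_THRESHOLD}")
```

In this sample the aggregate closure count is unchanged (100 incidents in both sets), yet the mix of reasons has shifted enough to cross the threshold, which is exactly the situation top-level metrics do not reveal.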

Decision tracing (audit trace): The set of data required to reproduce and audit the system's decision. Includes trace_id, prompt_hash, decision, diff_id, and postmortem labels.
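A coverage figure such as audit_trace_coverage can be derived from the decision records themselves. The following is a minimal sketch under stated assumptions: records are plain dictionaries, the four fields listed above must be present at decision time, and postmortem labels are attached later, so they are not required here. The helper names and the sample values are hypothetical.

```python
# Fields named in this guide that must exist when the decision is made.
REQUIRED_FIELDS = ("trace_id", "prompt_hash", "decision", "diff_id")


def is_complete(record):
    """A record is auditable only if every required field is present and non-empty."""
    return all(record.get(field) for field in REQUIRED_FIELDS)


def audit_trace_coverage(records):
    """Share of decision records with a complete trace; 1.0 means 100% coverage."""
    if not records:
        return 1.0  # assumption: with no decisions, nothing is untraced
    return sum(is_complete(r) for r in records) / len(records)


records = [
    {"trace_id": "t-1", "prompt_hash": "ab12", "decision": "auto_closed", "diff_id": "d-7"},
    {"trace_id": "t-2", "prompt_hash": "", "decision": "auto_closed", "diff_id": "d-7"},
]
coverage = audit_trace_coverage(records)
print(f"audit_trace_coverage={coverage:.0%}")
```

With an invariant of audit_trace_coverage == 100%, the second record (empty prompt_hash) would be enough to fail the check.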

Practice exercises: Name: Successful Validation Run (Good Metrics)

Problem: Run the validation script with correct metrics (fixtures/new_metrics_good.json). Make sure the CI gateway does not block the release, since all invariants are respected while MTTR is improved.

Solution: 1. Open a terminal and navigate to the example directory: cd book2/examples/goodhart-validator

2. Run the command: python3 scripts/run_validation.py --validation specs/validation.yaml --metrics fixtures/new_metrics_good.json

3. Expected result: return code 0, status PASS.

Complexity: beginner

Name: Blocking Due to MTTR Blindness (Bad Metrics)

Problem: Run validation with metrics demonstrating a hidden deterioration in quality (fixtures/new_metrics_bad.json). MTTR is improved, but the silent_p0 threshold is violated. Make sure the CI gateway correctly blocks the release.

Solution: 1. Run the command: python3 scripts/run_validation.py --validation specs/validation.yaml --metrics fixtures/new_metrics_bad.json

2. Analyze the output: the script should exit with return code 1.

3. Verify that the red_button_mttr_blindness check fired and that the manual_review_floor and silent_p0_cap invariants are marked FAIL.

4. Run the full gateway to see the exact reasons for the block (CI_BLOCK): python3 scripts/ci_gate.py --validation specs/validation.yaml --baseline fixtures/baseline_metrics.json --new fixtures/new_metrics_bad.json

Complexity: intermediate

Name: Detecting Hidden Drift (Drift Metrics)

Problem: Check the fixture with drift metrics (fixtures/new_metrics_drift.json) for compliance with the allowed deviation corridor (threshold 0.12) from the baseline.

Solution: 1. Run the command: python3 scripts/compare_drift.py --baseline fixtures/baseline_metrics.json --new fixtures/new_metrics_drift.json --threshold 0.12

2. Expected result: return code 1, since edge_drift > 0.12.

3. Compare with the good metrics by running the same script with --new fixtures/new_metrics_good.json (the return code should be 0).

Complexity: advanced

Case studies: Name: cdn_error_budget_burn Incident: The Illusion of Fast MTTR

Scenario: The team introduced a new policy of automatic incident closure in the triage pipeline. In a test on 300 incidents, mean time to recovery (MTTR) dropped from 660 seconds (11 minutes) to 290 seconds (about 5 minutes), a reduction of roughly 56%. On paper, the release looked like a major achievement.

Challenge: The MTTR gain came from the model automatically closing complex incidents as false positives or low-urgency events. The manual review rate (manual_review_rate) dropped from 18% to 12%, and the share of "silent" critical incidents closed without escalation (silent_p0) jumped from 2% to 18%. The team nearly rolled out a regression to production.

Solution: The team introduced the validation.yaml protective circuit. The red_button_mttr_blindness invariant was added to the CI gateway; it requires both conditions to hold at the same time: silent_p0 <= 5% and manual_review_rate >= 15%. When the CI gateway was run against the new metrics (silent_p0 at 18%, manual_review_rate at 12%), both conditions failed and the deployment was automatically blocked (CI_BLOCK).
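The gate decision in this case can be reproduced with a few lines of Python. This is a sketch of the logic described above, not the course's ci_gate.py: the function signature, the mttr_seconds key, and the message format are assumptions. The numbers are the ones reported in the case study.

```python
SILENT_P0_CAP = 0.05        # silent_p0 <= 5%
MANUAL_REVIEW_FLOOR = 0.15  # manual_review_rate >= 15%


def ci_gate(baseline, new):
    """Return (exit_code, status, reasons); exit code 1 blocks the release."""
    reasons = []
    if new["silent_p0"] > SILENT_P0_CAP:
        reasons.append(f"silent_p0_cap: {new['silent_p0']:.0%} > {SILENT_P0_CAP:.0%}")
    if new["manual_review_rate"] < MANUAL_REVIEW_FLOOR:
        reasons.append(
            f"manual_review_floor: {new['manual_review_rate']:.0%} < {MANUAL_REVIEW_FLOOR:.0%}"
        )
    # The red button: the KPI looks better while a guard metric is violated.
    if reasons and new["mttr_seconds"] < baseline["mttr_seconds"]:
        gain = 1 - new["mttr_seconds"] / baseline["mttr_seconds"]
        reasons.insert(0, f"red_button_mttr_blindness: MTTR improved by {gain:.0%}")
    return (1, "CI_BLOCK", reasons) if reasons else (0, "PASS", [])


# Figures from the cdn_error_budget_burn case study.
baseline = {"mttr_seconds": 660, "silent_p0": 0.02, "manual_review_rate": 0.18}
new = {"mttr_seconds": 290, "silent_p0": 0.18, "manual_review_rate": 0.12}

code, status, reasons = ci_gate(baseline, new)
print(code, status)
for reason in reasons:
    print(" -", reason)
```

Returning a non-zero exit code is what lets a CI system stop the pipeline without any knowledge of the metrics themselves.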

Result: The release was prevented. The team revised the auto-closure policy. The protective mechanism proved its ability to catch "Goodhart traps," preserving the real quality invariants of the system even when KPI goals were formally met.

Lessons learned: Isolated KPI optimization (for example, MTTR only) creates a hidden risk.

Any target optimization metric should be protected by at least one guard metric.

The share of "silent" critical incidents (silent_p0) is a critically important invariant for automated triage systems.

Related concepts: Goodhart's Law

Guard metrics

Emergency mode (Red button)

Name: Erroneous Auto-Closure of 40 P0 Incidents

Scenario: An automated classification system erroneously closed 40 critical incidents (P0) as false positives. On the aggregated metrics dashboard this looked positive: the incident queue was reduced, and processing time decreased.

Challenge: During postmortem review, it turned out that 5 of these events were real critical failures requiring immediate escalation. The absence of tracing fields (prompt_hash, diff_id) in the logs initially made it difficult to quickly understand which rule prompted the model to close these incidents.

Solution: The team enforced 100% audit_trace_coverage. An invariant was added to the specification that blocks any automated action lacking a complete evidence-chain log. An edge_drift check was also introduced to track changes in the distribution of closure reasons.

Result: The system became transparent: every automated decision is now accompanied by a complete trail, allowing engineers to quickly roll back erroneous policies. On the next attempt at mass auto-closure, the edge_drift gateway fired, blocking policy changes.

Lessons learned: Full traceability (trace_id, policy_version) is required for every automated decision.

Aggregated metrics are not enough; behavioral patterns and drift must be tracked.

Incidents erroneously classified as false positives should automatically increase the silent_p0 and escalation_regret metrics.
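The last lesson can be illustrated by recomputing silent_p0 once postmortem labels arrive. The sketch below assumes silent_p0 is the share of real P0 incidents that were closed without escalation; the denominator, the field values, and the 20 correctly escalated incidents are assumptions added for illustration, while the 40 auto-closed incidents and the 5 real failures among them come from the case. escalation_regret is not defined in this guide, so it is left out.

```python
def silent_p0(incidents):
    """Share of real P0 incidents that were closed without escalation."""
    real_p0 = [i for i in incidents if i["postmortem_label"] == "real_p0"]
    if not real_p0:
        return 0.0
    silent = [i for i in real_p0 if i["decision"] == "auto_closed"]
    return len(silent) / len(real_p0)


def make(decision, label, count):
    """Build `count` identical incident records (illustrative helper)."""
    return [{"decision": decision, "postmortem_label": label} for _ in range(count)]


# Before the postmortem: all 40 auto-closed incidents are believed to be false positives.
before = make("auto_closed", "false_positive", 40) + make("escalated", "real_p0", 20)

# After the postmortem: 5 of the 40 are relabeled as real critical failures.
after = (
    make("auto_closed", "false_positive", 35)
    + make("auto_closed", "real_p0", 5)
    + make("escalated", "real_p0", 20)
)

print(f"silent_p0 before postmortem: {silent_p0(before):.0%}")
print(f"silent_p0 after postmortem:  {silent_p0(after):.0%}")
```

The metric only moves once the postmortem relabels the incidents, which is why the feedback from postmortems into the guard metric needs to be automatic rather than left to a manual report.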

Related concepts: Drift (Edge Drift)

Decision tracing (Audit Trace)

Postmortems

Study tips: Start by running the scripts in the examples/goodhart-validator directory before diving deeper into SLO theory.

When reading about validation.yaml, pair each target metric with its invariant and ask yourself: "How could the model cheat to improve this KPI?"

Use Qwen Code to generate review explanations, but remember that decisions are made by the validator (Python scripts), not the LLM.

Do not try to implement a full-fledged metric network (network_consistency) right away; to start, master the combination "one goal + one invariant + one blocking example."

Pay special attention to the format of capstone/goodhart-note.md — the documentation of the blocked example is the main artifact of this section.

Additional resources: Google SRE Book, Service Level Objectives: The foundation for setting up SLOs, with warnings against blind metric optimization (https://sre.google/sre-book/service-level-objectives/).

Wikipedia, Goodhart's Law: The theoretical basis of the section: "When a measure becomes a target, it ceases to be a good measure" (https://en.wikipedia.org/wiki/Goodhart%27s_law).

GitHub Spec Kit Quickstart: A guide to using specifications before introducing changes, continuing the SDD cycle (https://github.github.io/spec-kit/quickstart.html).

Appendix D, Threshold Calibration: An internal section of the course with threshold tables (Low / Default / High) for silent_p0, manual_review_rate, and edge_drift (appendix-d-threshold-calibration.md).

Summary: Optimizing a metric without accounting for side effects tends to degrade the real quality of the system (Goodhart's Law). In this section you have worked through a practical mechanism for protecting the incident pipeline: separating indicators into targets to optimize (KPIs such as MTTR) and inviolable invariants on guard metrics (such as silent_p0 and manual_review_rate). By configuring a CI gateway (emergency mode) with validation.yaml, you have learned to automatically block releases when an improvement in the target KPI comes at the cost of hidden damage to triage quality.


Course: Using SDD in Development for Qwen Code CLI. Applied Course
