Reading: Practical Part 0. AgentClinic-production Laboratory

Status: Standard for the learning route. This part introduces no new technique. It explains how to read the second volume as a single laboratory branch after the first volume.

The first volume builds a small AgentClinic: routes, SQLite, feature specifications, checks, and review. In the second volume, the same project is used as a learning production model. We do not require real Kubernetes, Grafana, PagerDuty, or GitOps. These words denote roles in scenarios: where the signal came from, which action may be dangerous, where a rollback is needed, and which artifact proves the decision.

If you read the chapters as a set of independent advanced techniques, the volume will quickly become heavy. Read it differently: one project, one main production circuit, one growing evidence package. For assessment, use high_memory_usage by default; the other incidents are needed as small laboratory windows for individual mechanisms.

Practical rule for the first pass: capstone/README.md should cover exactly one incident case. A chapter's local example may use a different incident, but only a verifiable principle is carried over into the assessment package. For example, from autoscale_200pct you carry over not a second case but the guard rule "do not expand the blast radius beyond quota". From cdn_error_budget_burn you carry over not a new service but the anti-Goodhart invariant "MTTR must not be improved at the cost of a silent P0".

Before reading

Part 0 is methodological and has no learning case of its own. It has no case steps and no case-specific checkpoint fact; the task is to lay out a map of the volume and agree on the execution stack. From the next chapter on, the standard "Before reading" block returns and works as a contract between the chapter and the assessment.

Goal

Before chapter 1, you need to understand four things:

which learning production case runs through the assessment package;
which files are considered the result of each chapter;
which commands are truly runnable and which are only the interface of a future production layer;
where the learning minimum ends and the full adoption track begins.

End-to-end case

The base scenario is called AgentClinic-production. It is the same AgentClinic, but now with an operational circuit around it. The main assessment case is high_memory_usage in appointments-api: it is convenient to carry it through webhook normalization, a readiness gate, a trial run, and the final evidence package. Additional cases demonstrate individual mechanisms and do not have to be merged into a single capstone/.

The scenario includes:

service: appointments-api;
alerts: high_memory_usage, autoscale_200pct, appointment_latency / appointment_latency_spike, node_not_ready, cdn_error_budget_burn;
specifications that must survive /clear, a model switch, and review by another person;
a prohibition on dangerous actions without evidence: blast-radius expansion, audit loss, silent P0 closure, automatic rollback bypass.

The learning branch is not required to contain real production code. Artifacts in capstone/, templates from examples/templates/, and runnable examples from examples/ are sufficient. If a chapter uses something other than high_memory_usage, write into capstone/ only the verifiable output: which defect, counterexample, budget risk, or invariant must be carried over to the main case.

Short carryover map:

| Chapter's local case | What to carry over to `high_memory_usage` |
| --- | --- |
| `node_not_ready` | requirement provenance and the rule "do not close without restoration evidence" |
| `appointment_latency` / `appointment_latency_spike` | one class of specification defect or the result of a stress-mutator (distinction: `appointment_latency` is the general class of incident "latency on the `/agents` route", `appointment_latency_spike` is a specific learning payload in `examples/stress-mutator/base/base_spec.json` for chapters 2 and 5) |
| `autoscale_200pct` | a counterexample to blast-radius expansion or a budget risk |
| `cdn_error_budget_burn` | a pair of KPIs + a guard metric against Goodhart |
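The carryover map can also be kept as data, so that a note in capstone/ always names a principle rather than a second case. The following is a minimal sketch under stated assumptions: the names `MAIN_CASE`, `CARRYOVER`, and `carryover_note` are hypothetical and are not claimed to exist in book2/examples/; the strings are copied from the table above.

```python
# Hypothetical helper for illustration; not part of book2/examples/.
MAIN_CASE = "high_memory_usage"

# Local case of a chapter -> what is carried over to the main case
# (wording taken from the carryover table).
CARRYOVER = {
    "node_not_ready": "requirement provenance and the rule "
                      "'do not close without restoration evidence'",
    "appointment_latency": "one class of specification defect "
                           "or the result of a stress-mutator",
    "appointment_latency_spike": "one class of specification defect "
                                 "or the result of a stress-mutator",
    "autoscale_200pct": "a counterexample to blast-radius expansion "
                        "or a budget risk",
    "cdn_error_budget_burn": "a pair of KPIs + a guard metric against Goodhart",
}


def carryover_note(local_case: str) -> str:
    """Return a one-line note for capstone/ that keeps the main case fixed."""
    if local_case == MAIN_CASE:
        return f"{MAIN_CASE}: main case, record the result directly"
    if local_case not in CARRYOVER:
        raise KeyError(f"unknown local case: {local_case}")
    return f"from {local_case} -> {MAIN_CASE}: {CARRYOVER[local_case]}"


if __name__ == "__main__":
    print(carryover_note("autoscale_200pct"))
```

The function deliberately refuses unknown cases: a local incident that is not in the map should be added to the map first, not written into the package as a new domain.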

Execution stack

In the first volume, AgentClinic is an application in TypeScript, Hono, server-side JSX, SQLite, and Vitest. This stack does not go away: in the learning model AgentClinic-production, it remains the stack of the product itself.

In the second volume, a second code layer appears: small runnable scripts in book2/examples/. They use only the Python standard library and exist so that one person can run a chapter's minimal example on their own machine in a couple of seconds, without standing up infrastructure. This is not a product stack switch and not a hint that the production AgentClinic has been rewritten in Python. These are learning simulators: stress-mutator, duel, Spec CI, token budget, readiness calculator. In a real project, such checks are more often packaged as pre-commit hooks, GitHub Actions, an MCP tool, or a service on its own stack; Python here is simply the cheapest language to run without a build step.

The rule is simple: anything that appears in book2/examples/ can be run as python3 ... with no dependencies. Anything marked in a chapter as [project script] or [conceptual interface] is the form of a future script or integration in your project, not tied to Python.

Minimum route

If you are short on time, go through the second volume like this:

Read this part and the README of the chosen runnable example.
In chapters 1–3, fill in three manual artifacts: genealogy.md, a poisoned/fixed pair, and constitution.md.
In chapters 4–11, run only the [runnable] commands from examples/; carry over the results of other cases to capstone/ as a principle, not as a new domain.
In chapter 12, check the package against the diagnostic checklist.
In chapter 13, assemble a small capstone/ around a single incident.

The minimum route does not require writing external orchestrators, MCP servers, Kubernetes integrations, or real CI gateways. Those elements belong to the full track.

The route check is simple: after each chapter, one new verifiable output should appear in capstone/. Not a full production process, but a small record that can be shown to another person.

> How to read the table. The "output" column is intentionally described in plain words, without terms from chapters 4–13. If a word appears in the right column that is not yet in the short dictionary below, it is introduced in the chapter that needs it. Do not try to learn the volume's vocabulary from this table.

| After chapter | Minimum output |
| --- | --- |
| 1 | one requirement with two sources in `genealogy.md` |
| 2 | a pair of specifications "defective / fixed" showing one class of error |
| 3 | `constitution.md` with two immutable rules and one rule with an expiration date |
| 4 | a minimal counterexample to one rule or a formulation of the next limiter |
| 5 | the result of a stress-mutator smoke run or a brief report on which mutations the validator caught |
| 6 | one accepted and one rejected shadow candidate (rules that could have entered the specification) |
| 7 | a Spec CI line: what is covered, what is blocked |
| 8 | one file with a verdict on a controversial change and a link to evidence |
| 9 | a model budget exhaustion risk and the threshold at which you switch to a cheap tier |
| 10 | a target metric and a paired protective metric against its Goodhart skew |
| 11 | a production admission verdict and a dry-run of the permitted action |
| 12 | three blocker / owner / next_check items |
| 13 | an assembled `capstone/` for one incident with five PASS lines of the rubric |

If a chapter introduces additional terms but this output is missing, close the output first. The terms can be finished later.

For orientation, keep a filled-in example of [examples/templates/capstone-dossier.md](examples/templates/capstone-dossier.md) at hand. This is not a template to copy mindlessly, but the minimum form of an answer to the question: "what trace should remain after the first pass?"

The minimum vocabulary for the first pass is short:

capstone/ — the final evidence package for one incident;
genealogy.md — the origin of a requirement and the confidence level;
validation.md — commands, manual facts, and blockers;
judgment.md — a verdict on a controversial change;
readiness.md — why an action is admitted, blocked, or moves to a semi-manual mode.

All other terms are needed only when they help fill in one of these files. If a term does not affect the current chapter's output, do not dwell on it on first reading.

What to actually run

The second volume uses three types of command blocks:

[runnable] — run as written. The example lives in book2/examples/.
[project script] — this is a contract for a future script in your project. If a runnable counterpart is not specified nearby, the command is not required to exist in the textbook repository.
[conceptual interface] — the form of a future integration. It does not need to be run during a learning pass.

The rule is simple: the assessment package may reference only facts you actually ran, or manual artifacts that can be read without chat history.
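To make the distinction concrete, here is a minimal sketch that sorts command lines by their marker and keeps only those that may be run as written. The three marker names come from the text; everything else is an assumption made for illustration: the `[marker] command` line format, the function names `classify` and `citable_commands`, and the sample path and endpoint.

```python
# Hypothetical sketch: marker names come from the text,
# the "[marker] command" line format is an assumption.
import re

MARKERS = ("runnable", "project script", "conceptual interface")
_PATTERN = re.compile(r"^\[(?P<marker>[^\]]+)\]\s*(?P<command>.*)$")


def classify(line: str):
    """Split '[marker] command' into (marker, command); None for other lines."""
    match = _PATTERN.match(line.strip())
    if match is None or match.group("marker") not in MARKERS:
        return None
    return match.group("marker"), match.group("command")


def citable_commands(lines):
    """Keep only commands that can be run as written in the textbook repository."""
    result = []
    for line in lines:
        parsed = classify(line)
        if parsed is not None and parsed[0] == "runnable":
            result.append(parsed[1])
    return result


if __name__ == "__main__":
    sample = [
        "[runnable] bash book2/examples/smoke_all.sh",
        "[project script] ./scripts/check_readiness.sh",  # hypothetical path
        "[conceptual interface] POST /hooks/alerts",       # hypothetical endpoint
    ]
    print(citable_commands(sample))
```

Being marked [runnable] is a necessary condition, not a sufficient one: a command becomes a fact for the assessment package only after you have actually run it.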

First smoke run

Before reading chapters 4–11, it is useful to make sure the local examples work:

bash book2/examples/smoke_all.sh

The script runs a smoke test on a temporary copy of book2/examples, so it leaves no out/ or __pycache__ in the working tree. If you are short on time, open examples/README.md and select only the block of the chapter you are currently working through.
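The "temporary copy" idea is easy to reproduce for a single example. The sketch below is an assumption about the general pattern, not the contents of smoke_all.sh: the function name `run_in_temp_copy` is hypothetical, and the command in the usage section is a placeholder.

```python
# Hypothetical sketch of "run on a temporary copy";
# it does not reproduce what smoke_all.sh actually does.
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_temp_copy(examples_dir: Path, command: list[str]) -> int:
    """Copy examples_dir to a temp directory, run command there, return exit code."""
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / examples_dir.name
        shutil.copytree(examples_dir, work)
        # Any out/ or __pycache__ is created inside the copy
        # and is removed together with the temporary directory.
        completed = subprocess.run(command, cwd=work)
        return completed.returncode


if __name__ == "__main__":
    source = Path("book2/examples")
    if source.is_dir():
        # Placeholder command; substitute a [runnable] command from a chapter.
        code = run_in_temp_copy(source, [sys.executable, "-c", "print('smoke ok')"])
        print("exit code:", code)
    else:
        print("book2/examples not found; run from the repository root")
```

The working tree is only read, never written, so a failed or interrupted run cannot leave stray artifacts next to the examples.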

Working directory for assessment

Create the directory for the future package:

mkdir -p capstone

Leave it empty for now. In chapters 1–12, you will gradually understand which files will end up there. Do not mix multiple incidents into a single evidence package: one file may reference a runnable counterpart from another case, but the decision must explain one main incident.

capstone/
  README.md
  genealogy.md
  poisoned-spec.md
  fixed-spec.md
  constitution.md
  validation.md
  judgment.md
  budget-note.md
  goodhart-note.md
  readiness.md
  antipattern-audit.md

This structure repeats the first volume: first the intent and boundaries, then the plan and facts, then the review and the final package. The only difference in the second volume is that the facts relate not to a single feature but to a production admission of a dangerous action.

On the first pass, do not add files from the full track to capstone/ just because they are named in a chapter. scorebook, metric_network, decision_hash, precedents.md, and CI reports are needed when you have actually created them or can explain which runnable counterpart confirms the same principle.

To make it easier to orient, which chapter opens which file:

| `capstone/` file | Opens |
| --- | --- |
| `genealogy.md` | chapter 1 |
| `poisoned-spec.md` / `fixed-spec.md` | chapter 2 |
| `constitution.md` | chapter 3 |
| `validation.md` (happy + negative + counterexample) | chapters 4 and 7 |
| `judgment.md` | chapter 8 |
| `budget-note.md` | chapter 9 |
| `goodhart-note.md` | chapter 10 |
| `readiness.md` | chapter 11 |
| `antipattern-audit.md` | chapter 12 |
| `README.md` (final assembly) | chapter 13 |

If a chapter introduces an extra file that is not in this list, it belongs to the full track. Record the principle in one line and move on.
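The file-to-chapter mapping lends itself to a small self-check after each chapter. This is a sketch under stated assumptions: the names `OPENED_BY_CHAPTER` and `missing_files` are hypothetical, `validation.md` is attached to chapter 4 (where it is first opened) even though chapter 7 extends it, and "non-empty" is used as a rough stand-in for "filled in".

```python
# Hypothetical checker; the mapping is taken from the file-to-chapter list.
from pathlib import Path

OPENED_BY_CHAPTER = {
    1: ["genealogy.md"],
    2: ["poisoned-spec.md", "fixed-spec.md"],
    3: ["constitution.md"],
    4: ["validation.md"],  # first opened in chapter 4, extended in chapter 7
    8: ["judgment.md"],
    9: ["budget-note.md"],
    10: ["goodhart-note.md"],
    11: ["readiness.md"],
    12: ["antipattern-audit.md"],
    13: ["README.md"],
}


def missing_files(capstone: Path, finished_chapter: int) -> list[str]:
    """List files that should exist and be non-empty after the given chapter."""
    missing = []
    for chapter, names in sorted(OPENED_BY_CHAPTER.items()):
        if chapter > finished_chapter:
            break
        for name in names:
            path = capstone / name
            if not path.is_file() or path.stat().st_size == 0:
                missing.append(name)
    return missing


if __name__ == "__main__":
    for name in missing_files(Path("capstone"), finished_chapter=3):
        print("missing or empty:", name)
```

The check covers only file presence. Whether the content is a verifiable output still has to be judged by a person reading the file without chat history.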

Checkpoint fact

After this chapter, one main incident-case is selected, an empty capstone/ is created, and the runnable examples are verified with the command bash book2/examples/smoke_all.sh or set aside with an explicit reason. If you cannot name the main case and the first file that will end up in capstone/, it is too early to move on to chapter 1.

Check questions

Why is AgentClinic-production a learning model and not a requirement to stand up real infrastructure?
How does [runnable] differ from [project script]?
Why can't high_memory_usage and autoscale_200pct be mixed in the final capstone/ as two equally weighted cases?
Why must the final capstone/ be understandable without chat history?