Topic: Appendix B. Compatibility with Qwen Code
Difficulty level: Medium
Estimated study time: 6-8 hours (theory + practice)
Prerequisites: Basic familiarity with Qwen Code and its CLI interface
Understanding of Git fundamentals and repository structure
Experience working with Markdown files and configurations
Basic knowledge of Python or another scripting language
Understanding of CI/CD concepts (preferred)
Learning objectives: Distinguish between three maturity levels (Standard, Recommendation, Frontier) and correctly classify production processes by implementation layer
Create custom commands in the .qwen/commands/ structure with proper naming and stop contracts
Design reproducible checks as project scripts, independent of the model's persuasiveness
Configure Qwen Code hooks (PreToolUse, PostToolUse, UserPromptSubmit, etc.) for guardrails
Integrate external production APIs via MCP servers with an allowlist of tools and secret protection
Overview: Appendix B draws the boundary between Qwen Code's built-in capabilities and the processes a team must implement in its own project. The second volume of the course describes production processes around Qwen Code: some are built into the tool, while others require custom commands, skills, hooks, MCP servers, or ordinary scripts. The study guide is organized around the canonical scale of three maturity levels (Standard, Recommendation, and Frontier), which sets the reader's expectations and identifies the implementation layer. Understanding this boundary prevents mistaking a designed process for a standard CLI command and makes it possible to divide responsibility correctly between the built-in tool, project configuration, and external orchestration.
Key concepts: Canonical scale of three maturity levels: A classification that does not rate the quality of an idea; it tells the reader what to expect and where the process is implemented. Standard: works in stock Qwen Code with no additional platform; covers basic course material and built-in capabilities. Recommendation: worth formalizing in the project when a process recurs; requires repository files such as custom commands, skills, hooks, or scripts. Frontier: production orchestration around Qwen Code for teams with external APIs, SRE processes, and model budgeting; implemented through an external orchestrator, MCP, or external services.
Built-in Qwen Code layer: Basic capabilities used as-is without additional configuration: /plan (planning mode without edits and shell execution), /review (built-in code review with deterministic checks and parallel review agents), /skills (viewing and explicitly running skills), /memory, /remember, /forget (memory management and QWEN.md), /mcp and qwen mcp add (connecting MCP servers), @path (adding a file or directory to context), !command (shell commands in an interactive session), qwen -p "..." (headless launch for CI and scripts).
Custom commands: Mechanism for extending Qwen Code with Markdown files at .qwen/commands/<namespace>/<command>.md. Commands are invoked as /<namespace>:<command> and contain a prompt with {{args}}, references to @specs/..., and stop rules. They make it possible to reproduce the behavior of first-volume commands (e.g., /clarify) without implicitly assuming those commands are built in.
Project scripts: Reproducible checks, independent of the model, implemented as regular scripts in the project directory. Green CI status must depend on the check code, not on the model's persuasiveness. Examples: check_coverage.py, validate_schema.py, mutate_specs.py, run_duel.py, compile.py for budget.
Hooks: Official Qwen Code events for implementing guardrails: PreToolUse, PostToolUse, UserPromptSubmit, SessionStart, Stop, SubagentStop, Notification, PreCompact and others. Require using exact event names from current documentation, not free-form variants.
MCP servers and external APIs: Method for integrating production APIs (Grafana, PagerDuty, Kubernetes, Jira) without turning them into unrestricted shell commands. Requires an allowlist of tools: read-only tools for triage, separate write tools for safe actions, explicit confirmation and rollback conditions, and a prohibition on passing secrets in prompts, traces, and QWEN.md.
Production process roles: The Verifier votes via /review, a separate session, a sub-agent, or a script. The Implementor votes via Qwen Code in default or auto-edit mode after the plan is approved. Safety votes with a veto at critical_risk via a separate session or a blast-radius check script. The Coordinator is a non-voting minute-taker, implemented by a human, CI, or an external orchestrator.
File tribunal: Not a built-in command; combination of /review, scripts, reports, and validation.md rules for resolving conflicts during simultaneous editing.
Spec gateway (Spec CI): GitHub Actions or local scripts in which the main check is deterministic code; qwen -p may be used only as an auxiliary layer.
Budget keeper: External service or script; Qwen Code itself does not manage daily model tier quotas.
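Because Qwen Code does not track daily quotas itself, the Budget Keeper has to live outside it. The following is a minimal sketch of that idea, not a reference implementation: the class name BudgetKeeper, the tier names, and the limits are hypothetical, and spending is kept in memory, whereas a real service would persist it.

```python
from collections import defaultdict
from datetime import date


class BudgetKeeper:
    """Tracks model spending per tier and per day (hypothetical helper)."""

    def __init__(self, daily_limits: dict[str, float]) -> None:
        self._limits = dict(daily_limits)
        self._spent: dict[tuple[date, str], float] = defaultdict(float)

    def remaining(self, tier: str, day: date) -> float:
        """Budget left for the tier on the given day."""
        return self._limits[tier] - self._spent[(day, tier)]

    def charge(self, tier: str, cost: float, day: date) -> bool:
        """Record the cost and return True, or refuse and return False."""
        if cost < 0:
            raise ValueError("cost must be non-negative")
        if cost > self.remaining(tier, day):
            return False  # the caller must block the model call
        self._spent[(day, tier)] += cost
        return True


keeper = BudgetKeeper({"large": 10.0, "small": 2.0})  # assumed tiers and limits
today = date(2025, 1, 15)
print(keeper.charge("large", 6.0, today))  # accepted
print(keeper.charge("large", 5.0, today))  # refused: would exceed the daily limit
```

An orchestrator would call charge before each model invocation and skip or downgrade the call when it returns False.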
Practice exercises: Name: Classifying processes by maturity level
Problem: Five production processes are given: (1) running /plan for task decomposition, (2) automatic test coverage check via check_coverage.py script, (3) integration with Kubernetes for rolling deployment, (4) using /review for code review, (5) custom command /sdd:validate for specification checking. Classify each process according to the canonical scale (Standard, Recommendation, Frontier). Justify the decision for each case.
Solution: 1) Standard: /plan is a built-in Qwen Code command and requires no additional files. 2) Recommendation: the check_coverage.py script requires a file in the repository, is reproducible without the model, and pays off when the check recurs. 3) Frontier: a Kubernetes rolling deployment depends on external production infrastructure and SRE processes. 4) Standard: /review is built into Qwen Code with deterministic checks. 5) Recommendation: the custom command requires creating .qwen/commands/sdd/validate.md and belongs to the project layer. Key criterion: a process that needs nothing beyond stock Qwen Code is Standard, one that needs files in the repository is Recommendation, and one that needs external services is Frontier.
Complexity: beginner
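The decision rule used in the classification exercise above can be written down as a small function. This is a sketch under the assumption that two yes/no questions are enough to place a process on the scale; the Process type and its field names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Maturity(Enum):
    STANDARD = "Standard"
    RECOMMENDATION = "Recommendation"
    FRONTIER = "Frontier"


@dataclass(frozen=True)
class Process:
    name: str
    needs_repo_files: bool         # custom commands, skills, hooks, scripts
    needs_external_services: bool  # orchestrator, MCP server, external APIs


def classify(process: Process) -> Maturity:
    """External dependencies dominate; repository files come second."""
    if process.needs_external_services:
        return Maturity.FRONTIER
    if process.needs_repo_files:
        return Maturity.RECOMMENDATION
    return Maturity.STANDARD


examples = [
    Process("/plan for task decomposition", False, False),
    Process("check_coverage.py in CI", True, False),
    Process("Kubernetes rolling deployment", True, True),
]
for p in examples:
    print(f"{p.name}: {classify(p).value}")
```

The order of the checks matters: a process that needs both repository files and external services is Frontier, not Recommendation.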
Name: Creating a custom clarify command
Problem: The first volume used the /clarify command for clarifying requirements before planning. It is not built into Qwen Code. Create the file structure and Markdown file content for the custom command /sdd:clarify that reproduces the behavior from the first volume. Define the stop contract and rules for using {{args}} and @specs/.
Solution: Structure: .qwen/commands/sdd/clarify.md. File content must include: (1) description of purpose — clarifying ambiguous requirements before planning; (2) template with {{args}} for passing context; (3) references to @specs/ for loading specifications; (4) stop rules — command stops when all ambiguities are resolved and a list of clarified requirements is formed; (5) instruction not to start planning or code editing. Example invocation: /sdd:clarify 'API authentication requirements'. Stop contract: exit upon reaching state 'all questions clarified, answers documented in list format'.
Complexity: intermediate
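To make the /sdd:clarify exercise above concrete, the sketch below writes such a command file into a repository. The wording of the template is illustrative only, and the helper name scaffold_command is hypothetical; the only facts taken from the text are the .qwen/commands/<namespace>/<command>.md layout, the {{args}} placeholder, and the @specs/ reference.

```python
from pathlib import Path

# Illustrative prompt body; adapt the wording to the project's own process.
CLARIFY_TEMPLATE = """\
# Clarify requirements before planning

Purpose: surface and resolve ambiguous requirements.

Topic to clarify: {{args}}

Specifications to read first: @specs/

Stop contract:
- Keep asking until no ambiguity remains.
- Finish by printing the clarified requirements as a list.
- Stop there. Do not start planning and do not edit code.
"""


def scaffold_command(repo_root: Path, namespace: str, command: str, body: str) -> Path:
    """Create .qwen/commands/<namespace>/<command>.md and return its path."""
    target = repo_root / ".qwen" / "commands" / namespace / f"{command}.md"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(body, encoding="utf-8")
    return target


def invocation_name(namespace: str, command: str) -> str:
    """Name under which the command is invoked, per the naming rule in the text."""
    return f"/{namespace}:{command}"
```

Calling scaffold_command(Path("."), "sdd", "clarify", CLARIFY_TEMPLATE) creates the file in the current repository; committing it is what makes the command reproducible after git clone.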
Name: Designing an MCP server for safe integration
Problem: A team uses Grafana for monitoring and PagerDuty for alerting. Qwen Code needs access to these services without turning them into unrestricted shell commands. Design an MCP server architecture with an allowlist of tools, define confirmation conditions for write operations, and a secret protection mechanism.
Solution: MCP server architecture: (1) Read-only tools: grafana_query_metrics (PromQL queries), grafana_list_dashboards, pagerduty_list_incidents, pagerduty_get_incident_details; used for triage and checks, no confirmation required. (2) Write tools with explicit confirmation: pagerduty_acknowledge_incident (confirmation with a stated reason), pagerduty_escalate_incident (double confirmation). (3) Guard and rollback conditions: if a pagerduty_create_incident tool is added, it validates mandatory fields, is limited to 5 incidents per hour, and cancels automatically on a 5xx error. (4) Secret protection: API keys live in the server's environment variables and are never passed in prompts; QWEN.md contains no endpoint addresses with credentials; request logging masks authorization headers. (5) Configuration: register the server with qwen mcp add and restrict it to the allowlisted tools, for example qwen mcp add pagerduty-ops --url http://localhost:3001/sse --tools allowlist:grafana_query_metrics,grafana_list_dashboards,pagerduty_list_incidents,pagerduty_acknowledge_incident. The flag names in this command are illustrative; check the exact qwen mcp add syntax in the current MCP documentation.
Complexity: advanced
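The allowlist and confirmation rules from the MCP exercise above can be enforced inside the server before any upstream API is touched. The sketch below shows only that gate; the tool names come from the exercise, while the authorize function and the Decision type are hypothetical and not part of any MCP SDK.

```python
from dataclasses import dataclass

READ_ONLY = frozenset({
    "grafana_query_metrics",
    "grafana_list_dashboards",
    "pagerduty_list_incidents",
    "pagerduty_get_incident_details",
})
# Write tools mapped to the number of confirmations they require.
WRITE = {
    "pagerduty_acknowledge_incident": 1,
    "pagerduty_escalate_incident": 2,
}


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


def authorize(tool: str, confirmations: int = 0) -> Decision:
    """Allow read-only tools, gate write tools, reject everything else."""
    if tool in READ_ONLY:
        return Decision(True, "read-only tool")
    if tool in WRITE:
        needed = WRITE[tool]
        if confirmations >= needed:
            return Decision(True, "write tool, confirmed")
        return Decision(False, f"write tool needs {needed} confirmation(s)")
    return Decision(False, "tool is not on the allowlist")


print(authorize("grafana_query_metrics"))
print(authorize("pagerduty_escalate_incident", confirmations=1))
print(authorize("kubectl_delete_pod"))
```

Denying by default is the design choice that matters here: a tool that nobody listed is rejected without any further analysis.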
Name: Configuring hooks for guardrails
Problem: Guardrails need to be implemented for the scenario: (1) prohibit execution of shell commands containing rm -rf / or DROP TABLE, (2) log all external API accesses before execution, (3) archive session context before memory compaction. Select appropriate official Qwen Code events and describe the hook configuration.
Solution: (1) PreToolUse: inspect the arguments of the shell tool (called shell_execute here; confirm the actual tool name in the documentation) for prohibited patterns: rm -rf /, DROP TABLE, TRUNCATE. On a match, cancel execution and notify the user. (2) PreToolUse for logging external API calls before execution, recording endpoint, method, and timestamp in a structured log; add PostToolUse if the result must also be recorded. (3) PreCompact: archive the current session context to .qwen/archive/sessions/<timestamp>.md before memory compaction. Important: use the exact event names from the documentation (PreToolUse, PostToolUse, PreCompact), not variants such as pretooluse. Verify the hook configuration format and parameters against the current Qwen Code documentation.
Complexity: intermediate
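The pattern check from part (1) of the hooks exercise above could be implemented as a small script that the PreToolUse hook runs. This is a sketch under two loudly stated assumptions that must be verified against the Qwen Code hooks documentation: that the event arrives as JSON with the shell command under tool_input.command, and that exit code 2 blocks the tool call. The pattern matching itself does not depend on either assumption.

```python
import json
import re
from typing import Optional

FORBIDDEN = {
    # Matches removal of the filesystem root only, not paths such as /tmp/build.
    "rm -rf /": re.compile(r"\brm\s+-rf\s+/(?:\s|$)"),
    "DROP TABLE": re.compile(r"\bDROP\s+TABLE\b", re.IGNORECASE),
    # Broad on purpose: also matches the coreutils 'truncate' command.
    "TRUNCATE": re.compile(r"\bTRUNCATE\b", re.IGNORECASE),
}


def find_violation(command: str) -> Optional[str]:
    """Return the label of the first forbidden pattern found, or None."""
    for label, pattern in FORBIDDEN.items():
        if pattern.search(command):
            return label
    return None


def handle_event(raw_event: str) -> tuple[int, str]:
    """Return (exit_code, message) for one hook event given as JSON text.

    ASSUMPTION: payload shape and the meaning of exit code 2 are not
    confirmed by the source text; check the hooks documentation.
    """
    event = json.loads(raw_event)
    command = event.get("tool_input", {}).get("command", "")
    violation = find_violation(command)
    if violation:
        return 2, f"Blocked: command matches forbidden pattern '{violation}'"
    return 0, ""
```

A wrapper script would read the event from standard input, call handle_event, print the message to standard error, and exit with the returned code.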
Name: Separation of responsibilities in Spec CI
Problem: A team wants to automatically check specifications on every PR. Two approaches are discussed: (A) use qwen -p 'check the specification' as the main mechanism, (B) create a validate_schema.py script with deterministic checks, and use qwen -p only for generating reports. Apply Appendix B principles to choose the correct approach and describe the architecture.
Solution: Correct approach — (B), corresponds to Recommendation/Frontier layer. Architecture: (1) Main check — validate_schema.py script, checking JSON Schema, link integrity, identifier duplication. Green CI status depends only on check code. (2) Auxiliary layer — qwen -p 'generate readable report based on validate_schema.py results', used only for formatting output, does not affect pass/fail. (3) GitHub Actions integration: 'Validate Schema' step runs python scripts/spec_ci/validate_schema.py --strict; 'Generate Report' step optionally uses qwen -p for improved readability. (4) Principle: green status must not depend on the model's persuasiveness — this is a key Appendix B requirement for project scripts.
Complexity: intermediate
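The deterministic core of approach (B) above might look like the sketch below, which covers two of the checks named in the solution: identifier duplication and link integrity. The data shape is an assumption made for illustration: each specification is a dictionary with an "id" string and an optional "links" list of other ids. JSON Schema validation is left out because it would need a third-party library.

```python
from collections import Counter


def validate_specs(specs: list[dict]) -> list[str]:
    """Return human-readable errors; an empty list means the check passes."""
    errors: list[str] = []
    ids = [spec.get("id") for spec in specs]

    # Every specification needs a non-empty string identifier.
    for index, spec_id in enumerate(ids):
        if not isinstance(spec_id, str) or not spec_id:
            errors.append(f"spec #{index}: missing or empty 'id'")

    # Identifiers must be unique.
    counts = Counter(i for i in ids if isinstance(i, str) and i)
    for spec_id, n in sorted(counts.items()):
        if n > 1:
            errors.append(f"duplicate id '{spec_id}' appears {n} times")

    # Every link must point at a known identifier.
    known = set(counts)
    for spec in specs:
        for link in spec.get("links", []):
            if link not in known:
                errors.append(f"spec '{spec.get('id')}': broken link to '{link}'")
    return errors


def exit_code(errors: list[str]) -> int:
    """CI status depends on this value only, never on model output."""
    return 1 if errors else 0
```

The 'Validate Schema' step would load the specification files, call validate_specs, print the errors, and exit with exit_code(errors); the optional report step runs afterwards and cannot change that result.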
Case studies: Name: Migrating a development team from implicit assumptions to explicit Qwen Code architecture
Scenario: A team of 12 developers used Qwen Code for half a year to develop a microservices platform. Team members informally used commands like '/clarify', '/specify', '/tasks', assuming they were built into Qwen Code. During onboarding of new developers, constant conflicts arose: commands didn't work in new projects, behavior differed between sessions, processes couldn't be reproduced in CI.
Challenge: Implicitly assuming that these commands were built in led to: (1) process fragmentation, because each developer had their own understanding of the steps; (2) no reproducibility in CI, because commands that worked in interactive mode were absent in headless mode; (3) missing documentation, because stop contracts existed only verbally; (4) scaling risk, because new team members spent weeks figuring out 'how it works here'.
Solution: The team ran an audit using the Appendix B methodology. Step 1: Classify every process in use on the canonical scale. /plan and /review turned out to be Standard, while /clarify, /specify, /tasks, and /validate required custom commands. Step 2: Create the .qwen/commands/sdd/ structure with the namespace 'sdd' (Spec-Driven Development), including clarify.md, specify.md, tasks.md, validate.md, and constitution.md. Each file contained an explicit stop contract, templates with {{args}}, and references to @specs/. Step 3: Move deterministic checks into project scripts: validate_schema.py, check_coverage.py, mutate_specs.py. Step 4: Set up Spec CI in GitHub Actions, using qwen -p only for reports. Step 5: Document the boundaries: the README states explicitly which commands are built in and which are project-specific.
Result: After 3 weeks: onboarding time reduced from 2 weeks to 2 days; 100% reproducibility of CI checks; unified process language across teams; ability to version commands via Git. The team was able to scale to 20 developers without process degradation. Key mindset change: transition from 'Qwen Code can do everything' to 'we explicitly design what Qwen Code does and what our project does'.
Lessons learned: Implicit assumptions about which commands are built in are the main source of process fragmentation in teams
The canonical scale is a communication tool, not just an architectural one; it synchronizes expectations between developers
Custom commands with explicit stop contracts cost more to create but almost nothing to scale
Green CI status must depend on code, not on the model; this is hard to accept but critical for production
Related concepts: Canonical scale of three maturity levels
Custom commands
Project scripts
Spec gateway (Spec CI)
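Step 3 of the case study above moves checks such as check_coverage.py into project scripts. A minimal sketch of such a script follows. It assumes the report was produced by coverage.py's JSON reporter, which stores the overall figure under totals.percent_covered; the 80 percent threshold in the usage note is an arbitrary example.

```python
import json
from pathlib import Path


def load_report(path: Path) -> dict:
    """Read a coverage report written as JSON."""
    return json.loads(path.read_text(encoding="utf-8"))


def check(report: dict, threshold: float) -> tuple[int, str]:
    """Return (exit_code, message); 0 only when coverage meets the threshold."""
    percent = float(report["totals"]["percent_covered"])
    if percent < threshold:
        return 1, f"FAIL: coverage {percent:.1f}% is below {threshold:.1f}%"
    return 0, f"OK: coverage {percent:.1f}% meets {threshold:.1f}%"
```

In CI the script would call check(load_report(Path("coverage.json")), 80.0), print the message, and exit with the returned code, so the result is the same no matter which model or prompt is in use.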
Name: Integrating SRE processes via MCP server for a fintech company
Scenario: A fintech company with regulatory requirements used Qwen Code to develop a payment gateway. Integration with PagerDuty for incident escalation, Grafana for metric checking before deployment, and internal API for audit was required. Direct shell commands with API keys in prompts created critical security risks.
Challenge: Regulatory requirements: (1) all actions with production systems must be auditable; (2) API keys must not leak into logs and prompts; (3) incident escalation requires double confirmation; (4) rollback of changes must be possible within 5 minutes. Meanwhile, developers wanted to use Qwen Code for operational tasks without switching context.
Solution: Architecture according to Appendix B principles (Frontier layer): (1) MCP server 'ops-bridge' in Go, deployed inside VPC, with an allowlist of 8 tools. Read-only: grafana_query_metrics, pagerduty_list_incidents, pagerduty_get_incident_details, audit_log_query. Write with confirmation: pagerduty_acknowledge_incident (requires reason ≥ 20 characters), pagerduty_escalate_incident (requires approval_code from SMS), audit_log_append (only structured records). (2) Prohibition on shell access to production via PreToolUse hook — blocking any !command containing kubectl, ssh, curl to production domains. (3) Secrets: API keys in HashiCorp Vault, MCP server authenticates via mTLS; QWEN.md contains only tool descriptions, never endpoint URLs with credentials. (4) PostToolUse hook logs all calls to structured audit trail. (5) Budgeting via external Budget Keeper — separate service tracking model call costs and blocking on exceeding daily quota.
Result: Passed regulator audit without remarks regarding AI tools. Incident response time reduced by 40% due to metric access from Qwen Code. Zero incidents with credential leaks. Model budget is controlled and predictable. The team got a 'single window' for development and operations without violating security boundary.
Lessons learned: In regulated industries, production APIs should reach Qwen Code only through MCP with an allowlist, never through unrestricted shell commands
PreToolUse/PostToolUse hooks are critically important for defense in depth, but require exact event names from documentation
External Budget Keeper is necessary because Qwen Code does not manage daily quotas — this is frontier layer by definition
QWEN.md must never contain sensitive data, even in encrypted form
Related concepts: MCP servers and external APIs
Hooks
Budget Keeper
Frontier — maturity level
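The fintech case above combines two requirements that are easy to get wrong together: every call must be auditable, and no credential may reach the log. The sketch below shows one way to build an audit record with sensitive headers masked. The function names and the list of sensitive header names are assumptions made for illustration.

```python
import json

# Assumed set of headers that may carry credentials; extend per service.
SENSITIVE_HEADERS = {"authorization", "x-api-key", "cookie"}


def mask_headers(headers: dict[str, str]) -> dict[str, str]:
    """Replace credential-bearing header values with a fixed marker."""
    return {
        name: "***" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def audit_record(tool: str, method: str, url: str,
                 headers: dict[str, str], timestamp: str) -> str:
    """Serialize one tool call as a structured, secret-free log line."""
    return json.dumps(
        {
            "timestamp": timestamp,
            "tool": tool,
            "method": method,
            "url": url,
            "headers": mask_headers(headers),
        },
        sort_keys=True,
    )


print(audit_record(
    "pagerduty_list_incidents", "GET", "https://api.example.invalid/incidents",
    {"Authorization": "Bearer not-a-real-token", "Accept": "application/json"},
    "2025-01-15T10:00:00Z",
))
```

Masking happens before serialization, so a secret cannot slip into the audit trail through a later formatting step.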
Study tips: Create a physical or digital 'decision map': for any production process, ask 'Can this be done with a built-in Qwen Code command?' → if yes, it's Standard; if it requires project files but not external services — Recommendation; if Kubernetes, Grafana, external APIs are needed — Frontier
Practice on a real repository: create .qwen/commands/demo/ with 2-3 custom commands, invoke them, check reproducibility after git clone to another directory
For understanding hooks: set up a test project, configure PreToolUse and PostToolUse logging, analyze which events are generated during different actions — this will give intuition about interception points
Study MCP through the security lens: for each tool you add, explicitly document 'what is the worst that can happen' and how allowlist/confirmation prevents it
Pair learning: one person designs a process as 'everything built into Qwen Code', another as 'everything external scripts'; then apply the canonical scale for synthesis — this trains architectural thinking
Keep a 'boundary journal': record cases when you or colleagues assumed a command was built-in, but it turned out to be project-specific — this is a typical error, and its patterns repeat
For headless mode (qwen -p): set up a minimal CI pipeline in GitHub Actions or GitLab CI to feel the difference between interactive session and automated launch
Study the connection between second volume roles (Verifier, Implementor, Safety, Coordinator) and specific Qwen Code mechanisms — create a correspondence table for your project
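For the correspondence table suggested in the last tip, it can help to write down how the role votes combine. The sketch below encodes what the text states (three voting roles, a Safety veto at critical_risk, a non-voting Coordinator) and adds one assumption that the text does not make: without a veto, a simple majority of the votes cast decides.

```python
from dataclasses import dataclass
from typing import Iterable

VOTING_ROLES = {"verifier", "implementor", "safety"}


@dataclass(frozen=True)
class Vote:
    role: str
    approve: bool
    critical_risk: bool = False  # only meaningful for the Safety role


def tally(votes: Iterable[Vote]) -> str:
    """Return 'vetoed', 'approved', or 'rejected'.

    ASSUMPTION: simple majority when there is no veto.
    """
    votes = list(votes)
    for v in votes:
        if v.role not in VOTING_ROLES:
            # The Coordinator records the outcome but does not vote.
            raise ValueError(f"{v.role!r} is not a voting role")
    if any(v.role == "safety" and v.critical_risk for v in votes):
        return "vetoed"
    approvals = sum(v.approve for v in votes)
    return "approved" if approvals * 2 > len(votes) else "rejected"


print(tally([
    Vote("verifier", True),
    Vote("implementor", True),
    Vote("safety", False, critical_risk=True),
]))
```

The veto is checked before any counting, which is what separates a veto from an ordinary negative vote.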
Additional resources: Qwen Code documentation — commands: https://qwenlm.github.io/qwen-code-docs/en/users/features/commands/
Qwen Code documentation — headless mode: https://qwenlm.github.io/qwen-code-docs/en/users/features/headless/
Qwen Code documentation — hooks: https://qwenlm.github.io/qwen-code-docs/en/users/features/hooks/
Qwen Code documentation — skills: https://qwenlm.github.io/qwen-code-docs/en/users/features/skills/
Qwen Code documentation — memory: https://qwenlm.github.io/qwen-code-docs/en/users/features/memory/
Qwen Code documentation — MCP: https://qwenlm.github.io/qwen-code-docs/en/users/features/mcp/
Qwen Code documentation — approval mode: https://qwenlm.github.io/qwen-code-docs/en/users/features/approval-mode/
Qwen Code documentation — code review: https://qwenlm.github.io/qwen-code-docs/en/users/features/code-review/
GitHub Spec Kit: https://github.com/github/spec-kit
AWS Kiro documentation overview: https://aws.amazon.com/documentation-overview/kiro/
OWASP Top 10 for LLM Applications: https://owasp.org/www-project-top-10-for-large-language-model-applications/
Google SRE Book: https://sre.google/sre-book/
Goodhart's law (Wikipedia): https://en.wikipedia.org/wiki/Goodhart%27s_law
MCP specification (Model Context Protocol): study the official specification, originally published by Anthropic, for a deeper understanding of MCP server architecture
Summary: Appendix B establishes the architectural boundary between Qwen Code's built-in capabilities and processes that teams must implement themselves. The key tool is the canonical scale of three maturity levels (Standard, Recommendation, Frontier), which defines expectations and the implementation layer. Built-in commands (/plan, /review, /skills, /memory, /mcp, @path, !command, qwen -p) are used as-is. Custom commands require creating .qwen/commands/<namespace>/<command>.md with explicit stop contracts. Project scripts ensure reproducibility independent of the model. Hooks implement guardrails via official Qwen Code events. MCP servers with an allowlist of tools integrate external APIs safely. Understanding this boundary prevents the error of implicit assumptions about built-inness, ensures process scalability, and makes production use of Qwen Code predictable and auditable.