Note (canon v3): the operational content of this appendix is carried by the s4u-product-scale-planning skill — agents load THAT; this appendix remains the full reference.

Appendix L: Product-Scale Planning + Cross-Consistency Review

This appendix expands Section 5.6 of the main methodology document with the concrete process for product-scale planning, the cross-consistency review checklist, and the bounded-autonomy checkpoint shape. The three load-bearing rules — plan-at-the-right-resolution-per-distance, cross-consistency-as-discrete-phase, bounded-autonomy-checkpoints — live in §5.6 and are the canonical reference. This appendix is where the process earns its keep through specifics.

Why this is in its own appendix

Product-scale planning is conceptually a small extension to per-milestone planning (§3.1) but operationally a meaningful shift: it moves the human from "between every milestone" to "between every milestone-cluster." The process steps and review checklist need to be specific enough that anyone running the methodology can execute them mechanically — not "design and review your plans," but "apply this checklist to these inputs and produce this output."

The Process: From Empty Project to /loop-Ready

The product-scale planning pass produces six artifacts, in order:

1. Product brief / PRD: a single document and the "what are we building" anchor. Already part of the methodology (§3.1's design-before-code rule).
2. ADR set: every architectural decision the product needs. Includes the mandatory ADR-0001 (tech stack, referencing the canonical stack from §4.5 + appendix-m), plus per-product-area ADRs (data model, multi-tenancy, payment routing, etc.). Each ADR is independently reviewable; the cross-consistency review pass below verifies that they cohere.
3. Milestone plan set: one plan document per milestone, drafted at the resolution-per-distance described in §5.6 rule 1. The current milestone gets a fully-walked-through plan; near-future milestones get wave structure + acceptance criteria; far-future milestones get sketches.
4. Roadmap: orders the milestones, lists their dependencies on each other, and marks the bounded-autonomy checkpoint between each milestone pair.
5. Cross-consistency review report: the output of running the review checklist below against artifacts 1-4. Its verdict is either "all clear, dispatch authorised" or "N issues, fix and re-review."
6. STATE.md: initialised with the project's starting position, then updated each time a milestone ships during /loop execution.

Each artifact is independently authored (or reviewed) by the human; /loop only kicks off after all six exist and the cross-consistency review report says "all clear."

The Cross-Consistency Review Checklist

Run this checklist as a discrete phase — not ambient verification during execution. Output a written report (Markdown, committed to docs/reviews/cross-consistency-{date}.md) listing every check performed and its result. The report's existence is the gate for /loop dispatch.

Check A — ADR↔plan↔baseline-migration cross-reference integrity (mechanical)

Extended in v2.1.2 (2026-04-30) to include the schema-name leg. Extended in v2.1.3 (2026-05-05) to include column nullability, frontend filesystem prerequisites, and schema/enum constraints on test-data shapes. The original ADR↔ADR scope missed four schema-vs-plan divergences in one project's M6 and three plan-vs-reality discoveries in the same project's M9, all of which surfaced at dispatch time; the extended grep would have caught every one of them at plan freeze.

Inputs: every accepted ADR in docs/adr/, every milestone plan in docs/superpowers/plans/, every existing migration in alembic/versions/, the frontend scaffold filesystem (if applicable), and the schema/enum definitions of any test framework that plans claim to populate.

ADR cross-references:

A.1: List every other ADR referenced in any ADR body (e.g., "per ADR-0003"). Verify each referenced ADR exists in the same docs/adr/ directory.
A.2: List every "deferred to ADR-NNNN" or "to be decided in ADR-NNNN" marker. Verify the target ADR exists and is in Accepted status. A deferred decision pointing at a Draft ADR is a yellow flag (acceptable if the draft is being held for legitimate reasons, e.g., privacy ADR awaiting GDPR review); a deferred decision pointing at a non-existent ADR is a red flag.
A.3: Identify "supersedes" / "superseded by" pairs. Verify both sides reference each other and the superseded one is in Superseded status.
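A minimal sketch of how A.1 and A.2 might be mechanised, assuming each ADR has already been loaded into a dict holding its status and body text. The function name `check_adr_refs`, the dict layout, and the severity labels are illustrative assumptions, not part of the methodology's tooling. Treating every non-Accepted deferral target as a yellow flag is also an assumption; A.3 (supersedes pairs) is not covered here.

```python
import re

# Matches identifiers such as ADR-0003 anywhere in an ADR body.
ADR_REF = re.compile(r"\bADR-(\d{4})\b")
# Matches the two deferral phrasings named in A.2.
DEFERRED = re.compile(r"(?:deferred to|to be decided in)\s+ADR-(\d{4})", re.IGNORECASE)


def check_adr_refs(adrs: dict[str, dict[str, str]]) -> list[tuple[str, str, str]]:
    """Return (severity, source_adr, message) findings.

    `adrs` maps an id like "ADR-0003" to {"status": ..., "body": ...}.
    This layout is hypothetical; adapt it to however the ADRs are parsed.
    """
    findings = []
    for adr_id, adr in sorted(adrs.items()):
        body = adr["body"]
        deferred = {f"ADR-{n}" for n in DEFERRED.findall(body)}
        referenced = {f"ADR-{n}" for n in ADR_REF.findall(body)} - {adr_id}
        for target in sorted(referenced):
            if target not in adrs:
                # A.1 / A.2: a reference to a non-existent ADR is a red flag.
                findings.append(("red", adr_id, f"references missing {target}"))
            elif target in deferred and adrs[target]["status"] != "Accepted":
                # A.2: a deferral to a not-yet-Accepted ADR is a yellow flag.
                status = adrs[target]["status"]
                findings.append(("yellow", adr_id, f"defers to {target} in status {status}"))
    return findings
```

Each finding would become one row of the cross-consistency review report; whether a yellow flag is acceptable remains a human call.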

Schema-name cross-references (added v2.1.2):

A.4: For each plan, extract every fully-qualified schema name it references — <schema>.<table>.<column> patterns (e.g., public.tenants.payment_enabled, <tenant>.payments.status, payment_routing.expires_at). Build a "claimed schema names" set per plan.
A.5: For each claimed schema name, check whether it exists in any baseline migration under alembic/versions/ — or whether the same plan contains an explicit migration task that creates it. A name claimed by plan N that is neither in an existing migration nor produced by a migration task within plan N is a red flag.
A.6: For each migration filename pinned in any plan (0004_X.py, 0005_Y.py), check whether that exact filename already exists. A pinned filename collision is a red flag — plans should refer to migrations by purpose, never by filename (see also Section 5.6 rule 2(d)).
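The schema-name leg (A.4 to A.6) could be sketched as follows, assuming plans backtick-quote their schema names and migration filenames. The regexes, the function names, and the `ignore_prefixes` filter are assumptions for illustration. The filter exists because dotted Python module paths such as `app.llm.router` look identical to `schema.table.column` names; a real script would need a more robust way to tell them apart. The `existing` set is assumed to have been extracted from the migrations by the caller, using the same placeholder spelling (for example `<tenant>`) that the plans use.

```python
import re

# Backtick-quoted dotted names with two or three parts; the first part may be
# a placeholder such as <tenant>.
SCHEMA_NAME = re.compile(r"`((?:<\w+>|\w+)(?:\.\w+){1,2})`")
# Backtick-quoted migration filenames such as 0004_X.py.
MIGRATION_FILE = re.compile(r"`(\d{4}_\w+\.py)`")


def claimed_schema_names(plan_text: str, ignore_prefixes: tuple[str, ...] = ("app.",)) -> set[str]:
    """A.4: collect the schema names a plan claims."""
    names = set()
    for name in SCHEMA_NAME.findall(plan_text):
        # Skip filenames and Python module paths, which share the dotted shape.
        if name.endswith(".py") or name.startswith(ignore_prefixes):
            continue
        names.add(name)
    return names


def unsatisfied_names(plan_text: str, existing: set[str], created_by_plan: set[str]) -> set[str]:
    """A.5: names neither in a baseline migration nor created by this plan."""
    return claimed_schema_names(plan_text) - existing - created_by_plan


def pinned_collisions(plan_text: str, existing_files: set[str]) -> set[str]:
    """A.6: pinned migration filenames that already exist on disk."""
    return set(MIGRATION_FILE.findall(plan_text)) & existing_files
```

Every name returned by `unsatisfied_names` or `pinned_collisions` corresponds to a red flag in the report.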

Column nullability + filesystem + test-data shape cross-references (added v2.1.3):

A.7: For each schema column referenced in plan logic with a nullability assumption (e.g., "filter WHERE phone_number IS NOT NULL", "INSERT skipping the optional phone_number column"), verify the column's actual NULL/NOT NULL shape against the migration. Codified after an M9 T3 dispatch: the plan brief assumed that filtering on tenant_admins.phone_number IS NOT NULL was meaningful, but the column is declared NOT NULL, so the filter is a no-op and the subagent had to fall back to is_active=TRUE as a proxy.
A.8: For each plan-claimed file path under a directory that requires scaffolding (e.g., frontend/src/... requires create-next-app; docs/site/... requires Docusaurus init; infra/k8s/... requires Helm chart), verify the parent scaffold exists. If frontend/ is empty except for .env.example, a plan listing frontend/src/app/admin/chat/page.tsx is making a scaffold assumption that won't survive dispatch. Codified after an M9 T6 dispatch — the plan listed Next.js paths as if scaffolded; reality forced a T6 split into T6a (bootstrap) + T6b (chat panel).
A.9: For each plan-claimed test scenario or fixture, verify it fits the schema/enum constraints of the test framework's data shape. If a plan adds "admin command scenarios" to a calibration set whose schema is Literal["customer", "agent"] for the role field, the scenarios literally can't be authored against that schema. Codified after an M9 T10 dispatch — the plan listed admin command scenarios for the customer-facing calibration set; required scope correction at dispatch time.
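A.8 and A.9 lend themselves to small pure functions. The sketch below assumes the role schema is the `Literal["customer", "agent"]` from the example above, and that a scaffold can be recognised by a marker file. The alias `Role`, the function names, and the use of `frontend/package.json` as a marker are hypothetical choices, not something the methodology prescribes.

```python
from typing import Literal, get_args

# Role schema taken from the A.9 example; the alias name is an assumption.
Role = Literal["customer", "agent"]


def invalid_roles(scenarios: list[dict], role_type=Role) -> list[dict]:
    """A.9: scenarios whose role cannot be expressed in the schema."""
    allowed = set(get_args(role_type))
    return [s for s in scenarios if s.get("role") not in allowed]


def missing_scaffolds(
    claimed_paths: list[str],
    scaffold_markers: dict[str, str],
    existing_files: set[str],
) -> list[str]:
    """A.8: plan-claimed paths whose parent scaffold has not been created.

    `scaffold_markers` maps a path prefix to a file whose presence implies
    the scaffold exists (hypothetical convention).
    """
    flagged = []
    for path in claimed_paths:
        for prefix, marker in scaffold_markers.items():
            if path.startswith(prefix) and marker not in existing_files:
                flagged.append(path)
                break
    return flagged
```

Passing the set of existing files in as data, rather than touching the filesystem directly, keeps the check easy to exercise against a recorded snapshot of the repository.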

Tooling: a Python script that parses each plan markdown for backtick-quoted schema names, migration filenames, file paths, and test-data shapes; cross-references against ls alembic/versions/, grep -r "CREATE TABLE\|ADD COLUMN" alembic/versions/, the project's filesystem scaffold (parent-dir-exists checks for plan-claimed paths), and the test framework's Pydantic / TypedDict / Literal schemas. The script's output: per-plan list of unsatisfied references across all 9 sub-checks.

Check B — Plan capability handoffs (mechanical)

Inputs: every milestone plan in docs/superpowers/plans/.

For each plan:

B.1: Extract the "Files to create / modify" section. The set of new files a plan creates is its produced capability surface.
B.2: For each subsequent plan, scan its task descriptions and code snippets for imports of any path. If plan N+k imports a path that no earlier plan (M0..N+k-1) produces, flag it.
B.3: Special case: forward references to "TBD in milestone X" or "deferred to plan Y" are tracked. Verify each forward-reference target plan exists.

Tooling: a Python script that parses the plan markdown (Files to create is a stable section header), builds a "produced capabilities" map, then for each plan does grep -E '^from app\.|^import app\.' and checks each import against the produced-by-prior-plans map. Output: a CSV of unsatisfied imports.
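One possible shape for that script's core, assuming the "Files to create / modify" section of each plan has already been parsed into a list of paths. The tuple layout and function names are assumptions, and letting a plan import modules it creates itself is an interpretation of B.2 rather than something the checklist states.

```python
import re

# Same intent as the grep pattern above: top-of-line imports from the app package.
IMPORT = re.compile(r"^(?:from|import)\s+(app\.[\w.]+)", re.MULTILINE)


def module_for(path: str) -> str:
    """Convert 'app/llm/router.py' to 'app.llm.router'."""
    module = path.removesuffix(".py").replace("/", ".")
    return module.removesuffix(".__init__")


def unsatisfied_imports(plans: list[tuple[str, list[str], str]]) -> list[tuple[str, str]]:
    """Return (plan_name, module) pairs for imports nothing has produced.

    `plans` must be in milestone order; each item is
    (plan_name, files_created, plan_text).
    """
    produced: set[str] = set()
    flagged = []
    for name, files, text in plans:
        own = {module_for(f) for f in files if f.endswith(".py")}
        for module in IMPORT.findall(text):
            # Assumption: a plan may import what it creates itself.
            if module not in produced and module not in own:
                flagged.append((name, module))
        produced |= own
    return flagged
```

The returned pairs map directly onto CSV rows, for instance via `csv.writer(...).writerows(...)` from the standard library.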

Check C — Decision contradictions across plans (judgment)

Inputs: every milestone plan + every accepted ADR.

This check is a judgment call. The pattern that catches contradictions:

C.1: Read every plan's "Review-resolved decisions" sections (if the design received second-party review per §3.1) and every ADR's "Decision" section. List the load-bearing decisions per plan/ADR.
C.2: Look for the same property decided differently in two places. Examples from real project history: ADR-0001 said "Provider-A BSP" → ADR-0008 said "Provider-A Cloud API direct" (resolved by superseding); plan v2 said "soft-fail on cost ceiling" → plan v3 said "M5 ships SOFT-only, hard deferred" (consistent — same call expressed two ways). The latter is fine; the former requires either the supersedes pair OR a contradiction flag.
C.3: Look for properties that should logically agree but don't. Examples: plan N says "use Redis SETNX with 30s TTL" + plan N+1 says "Redis SETNX with 60s TTL" — pick one or document why they differ.

Tooling: this check relies on human judgment. The report should list every load-bearing decision found and pair up contradictions explicitly. No script can recognise "the schema field name was tenant_id in plan 3 and tenantId in plan 5" as a meaningful contradiction without semantic understanding.

Check D — Canonical stack drift (mechanical)

Inputs: project's ADR-0001 + the canonical stack from §4.5 + appendix-m.

For each library in the canonical stack:

D.1: If mandatory and the project's ADR-0001 doesn't include it, that's a flag — the project must explicitly accept or override the mandatory item.
D.2: If default and the project picks something else, ADR-0001 must have a "deviation rationale" section for that swap.
D.3: If forbidden and any project plan references the forbidden library, that's a hard fail.

Tooling: a 50-line shell script that diffs the project's ADR-0001 stack section against the canonical-stack list.
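The D.1 to D.3 rules reduce to set membership once both stack lists are parsed. The sketch below is written in Python rather than shell purely to match the other sketches in this appendix. The function name, the three status strings, and the severity labels are assumptions; library names are passed in as data, so nothing here claims to know what the canonical stack contains.

```python
def stack_drift(
    canonical: dict[str, str],
    adr_stack: set[str],
    deviations: set[str],
    plan_refs: set[str],
) -> list[tuple[str, str]]:
    """Return (severity, library) findings.

    canonical  : library -> "mandatory" | "default" | "forbidden"
    adr_stack  : libraries listed in the project's ADR-0001
    deviations : default libraries for which ADR-0001 gives a deviation rationale
    plan_refs  : libraries referenced anywhere in the project's plans
    """
    findings = []
    for lib, status in sorted(canonical.items()):
        if status == "mandatory" and lib not in adr_stack:
            findings.append(("flag", lib))  # D.1
        elif status == "default" and lib not in adr_stack and lib not in deviations:
            findings.append(("flag", lib))  # D.2: swapped without a rationale
        elif status == "forbidden" and lib in plan_refs:
            findings.append(("hard-fail", lib))  # D.3
    return findings
```

An empty result means ADR-0001 is consistent with the canonical stack under these three rules.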

Check E — Roadmap dependency consistency (mechanical)

Inputs: ROADMAP.md + the milestone plan set.

E.1: Roadmap milestone order matches plan dependency order. If plan M5 imports from plan M3, the roadmap must list M3 before M5.
E.2: Bounded-autonomy checkpoints are explicitly marked between milestones in the roadmap (e.g., a horizontal rule + "Checkpoint: review M3 discoveries before M4 dispatch").
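Both E checks can be sketched against a parsed milestone order and the raw roadmap text. The sketch assumes the dependency map comes from Check B's import analysis and that checkpoint lines follow the example wording above ("Checkpoint: ... M3 ... M4 ..."); the regex and function names are illustrative only.

```python
import re

# One checkpoint line names the milestone just finished and the next one.
CHECKPOINT = re.compile(r"Checkpoint:.*?\b(M\d+)\b.*?\b(M\d+)\b")


def order_violations(roadmap_order: list[str], depends_on: dict[str, set[str]]) -> list[tuple[str, str]]:
    """E.1: (milestone, dependency) pairs where the dependency is not listed earlier."""
    position = {m: i for i, m in enumerate(roadmap_order)}
    violations = []
    for milestone, deps in depends_on.items():
        for dep in sorted(deps):
            if (
                milestone not in position
                or dep not in position
                or position[dep] >= position[milestone]
            ):
                violations.append((milestone, dep))
    return violations


def missing_checkpoints(roadmap_order: list[str], roadmap_text: str) -> list[tuple[str, str]]:
    """E.2: adjacent milestone pairs with no checkpoint marker between them."""
    marked = set(CHECKPOINT.findall(roadmap_text))
    return [pair for pair in zip(roadmap_order, roadmap_order[1:]) if pair not in marked]
```

A dependency on a milestone that is absent from the roadmap is reported as a violation too, since the roadmap cannot order what it does not list.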

The Bounded-Autonomy Checkpoint Shape

At each checkpoint (between milestones), /loop pauses and produces a structured prompt for the human:

# Checkpoint: M{N} → M{N+1}

## What landed in M{N}
- {commit list with one-line summaries}
- {test count delta}
- {coverage delta}

## Surprises pinned during M{N}
- {list of surprises that went into the close-out memo}

## What M{N+1} plan assumes
- {the load-bearing assumptions M{N+1} makes about the state at this point}

## Question
Did anything in M{N} invalidate the assumptions M{N+1} is making?

## Options
- **Proceed** — M{N+1} plan is still valid; dispatch Wave 1 of M{N+1}
- **Amend** — M{N+1} plan needs revision; pause loop, re-review M{N+1}
- **Halt** — fundamental discovery; suspend product-scale execution, return to ad-hoc planning

Typical answer: "proceed" (5 minutes). Sometimes: "amend" (15-30 minutes of plan revision before re-dispatch). Rarely: "halt" (the discovery is fundamental enough to break product-scale assumptions; revert to per-milestone planning until the discovery is reconciled).
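For illustration, the prompt above could be rendered from structured data along these lines. The `Checkpoint` dataclass, `render`, and `parse_answer` are hypothetical names, and the option descriptions are abbreviated from the template; how /loop actually assembles its prompt is not specified here.

```python
from dataclasses import dataclass

# Option names follow the checkpoint template; descriptions are abbreviated.
OPTIONS = {
    "Proceed": "plan is still valid; dispatch Wave 1",
    "Amend": "plan needs revision; pause loop and re-review",
    "Halt": "fundamental discovery; suspend product-scale execution",
}


@dataclass
class Checkpoint:
    milestone: int          # N in "M{N} → M{N+1}"
    landed: list[str]       # commit summaries, test and coverage deltas
    surprises: list[str]    # items from the close-out memo
    assumptions: list[str]  # what the next plan assumes


def _bullets(items: list[str]) -> list[str]:
    return [f"- {item}" for item in items] or ["- (none)"]


def render(cp: Checkpoint) -> str:
    cur, nxt = f"M{cp.milestone}", f"M{cp.milestone + 1}"
    lines = [
        f"# Checkpoint: {cur} → {nxt}", "",
        f"## What landed in {cur}", *_bullets(cp.landed), "",
        f"## Surprises pinned during {cur}", *_bullets(cp.surprises), "",
        f"## What {nxt} plan assumes", *_bullets(cp.assumptions), "",
        "## Question",
        f"Did anything in {cur} invalidate the assumptions {nxt} is making?", "",
        "## Options",
    ]
    lines += [f"- **{name}**: {text}" for name, text in OPTIONS.items()]
    return "\n".join(lines)


def parse_answer(answer: str) -> str:
    """Normalise the human's reply to one of the three option names."""
    choice = answer.strip().capitalize()
    if choice not in OPTIONS:
        raise ValueError(f"expected one of {sorted(OPTIONS)}, got {answer!r}")
    return choice
```

Rejecting anything other than the three named options keeps the checkpoint a hard gate: an ambiguous reply pauses the loop instead of being read as "proceed".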

Worked Example: M5 → M6 (planned)

This is a planning-time worked example, not a post-execution one. Picture a project whose M5 plan v3 is dispatching (Wave 1 in flight) while M6 — "payment-provider STK push integration" — has not yet started. Product-scale planning would draft the M6 plan now (not after M5 closes) and run cross-consistency review.

M6 plan sketch (resolution per §5.6 rule 1, since M6 is current+1):

Wave structure: 5-6 tasks across 2-3 waves.
Acceptance criteria: per ADR-0007 (payments orchestration). Provider-A exclusively for cards, Provider-B direct via its native API, 8min/30min nudge/abandon timing, daily 3am consolidated reaper, dead-letter table.
Imports from M5: app.orchestrator.booking_graph, app.persistence.tenant_scoped_saver, app.llm.router. All produced by M5 plans.
Imports from M3: app.tenancy.context.current_tenant. Produced.

Cross-consistency review against M3+M4+M5 (Check B):

✅ Every import M6 needs is produced by an earlier milestone.
⚠️ M6 references app.orchestrator.booking_graph.PAYMENT_PENDING state — M5 plan v3 lists BookingState.fsm_state enum but doesn't enumerate PAYMENT_PENDING. Either M5 plan v3 needs to add it (and we're discovering a missing field at planning time, not at execution time) OR M6 needs to explicitly add the state via an FSM amendment task. Flag this explicitly.

Bounded-autonomy checkpoint between M5 and M6:

"Did M5's FSM state model include payment-related states?" — if yes, proceed; if no, amend M6 plan to start with a state-extension task.

This is the kind of cross-cutting concern the front-loaded planning surfaces before /loop dispatches anything, rather than after M5 closes and M6 hits a missing state at task 1.

Summary

Product-scale planning is the discipline of producing all six artifacts — PRD, ADR set, milestone plan set, roadmap, cross-consistency review, STATE.md — before any subagent dispatches Wave 1 of milestone 1. The cost is 1-2 days of upfront planning effort; the benefit is multi-day autonomous /loop execution with 5-minute checkpoints between milestones. The cross-consistency review checklist (A-E) is the gate for /loop authorisation; the bounded-autonomy checkpoint shape is the human contract that surfaces discovery-during-execution at the right boundaries.

This is not "no humans in the loop." It is "humans at the boundaries that matter, machines through the territory between."
