Product-Scale Planning + Cross-Consistency Review

The /loop pattern (Section 5.5) gives autonomous execution within a milestone. This section extends that autonomy across milestones: draft every plan upfront, run cross-consistency review as a discrete phase, then let /loop dispatch the whole sequence with explicit human checkpoints between milestones.

In one line: Front-load every known milestone's plan (at decreasing detail with distance), run one cross-consistency review, then let /loop dispatch the whole sequence with 5-minute human checkpoints between milestones.

What: Product-scale planning front-loads the architectural decisions and plans for every known milestone before /loop dispatches the first one. Instead of design-M3, build-M3, design-M4, build-M4, the project produces all M3-Mn plans (detail decreasing with distance) plus the gating ADR set, runs one cross-consistency review pass, and only then dispatches Wave 1 of M3. After the review passes, /loop runs M3 → M4 → M5 → … with bounded human checkpoints between milestones.

Why: Plan-build-plan-build is wasteful in two ways. First, each per-milestone planning cycle costs high-judgment human attention; across n milestones that is n rounds of context-switching. Second, cross-cutting concerns surface late: an M5 decision reveals that M3's schema needs a field, forcing an ADR amendment plus a follow-up commit. Front-loading surfaces those concerns at planning time, when amendments are cheap edits rather than data migrations.

The compounding benefit: /loop persists across milestone boundaries instead of terminating at each, halting only at the explicit checkpoints. A two-week product becomes a multi-day autonomous run with an hour of upfront planning and 5-minute review checkpoints.

Four load-bearing rules:

Plan at the right resolution per milestone distance. M{current} gets a fully reviewed design with task-level deliverables and ‖ parallelisable annotations. M{current+1} gets a plan with wave structure + task names + acceptance criteria but task internals deferred until design-review time. M{current+2..n} get a plan sketch — milestone goal, expected dependencies on prior milestones, expected ADRs gating it, expected wave count. Detail decreases with distance because the cost of churn on far-future plans is high (every M{current} discovery may force a far-future plan rewrite). Trying to fully plan M{current+5} at the same fidelity as M{current} is the over-spec failure mode this rule exists to prevent.
Cross-consistency review is a discrete phase, not an ambient property. After all plans are drafted, before any subagent dispatches the first task, run a single review pass with seven checks. The review's output is either "all clear, dispatch" or "N issues: fix and re-review." Checks (a), (b), (d), (e), (f), and (g) are mechanical greps; (c) is judgment, but the review forces it before execution rather than during. The seven checks:
  (a) ADR↔plan↔baseline-migration cross-references: every ADR referenced from another ADR or plan must exist; every "deferred to ADR-NNNN" marker must point at a real, accepted ADR; and every schema name (table, column, constraint) mentioned in any plan must appear either in the existing baseline migrations or in an explicit migration task within the same plan. The third leg exists because schema-vs-plan divergences otherwise surface at dispatch time, when a mechanical schema-name grep would have caught them at review.
  (b) Plan↔plan capability handoffs: if plan N+1 imports app.frob.bar, plan N (or earlier) must deliver it. List every cross-plan import; verify the producer plan exists.
  (c) No contradictory decisions across plans: if plan N says X is JSONB and plan M says X is a foreign key, pick one and amend the other.
  (d) Migration filename pinning is forbidden: plans refer to migrations by purpose ("the unrouted-callbacks migration"), never by pinned filename, since an earlier task often ships a different filename than the plan anticipated.
  (e) Column nullability cross-grep: for each schema column referenced in plan logic (e.g., "filter WHERE x IS NOT NULL"), verify the column's actual NULL/NOT NULL shape against the migration. A plan that treats a column as nullable when the migration declares it NOT NULL is a silent bug.
  (f) Filesystem prerequisites: for every plan-claimed file path, verify the parent scaffold exists. A plan that lists frontend/src/app/.../page.tsx against an unscaffolded frontend/ directory will fail at dispatch. This applies to any scaffold-assumed path (frontend, docs site, IaC).
  (g) Schema/enum constraints on test-data shapes: for every plan-claimed test scenario or fixture, verify it fits the schema/enum constraints of the test framework's data shape (e.g., a scenario role that no Literal permits).
Bounded autonomy: explicit checkpoints between milestones. Autonomy means "no humans for known territory," not "no humans." Between milestone N close and N+1 first dispatch, /loop pauses for one question: "did we discover anything in M{N} that invalidates the plan for M{N+1}?" Typical answer: no, proceed (5 minutes). Sometimes: yes, amend N+1 (15-30 minutes). Either way the checkpoint surfaces the discovery-during-execution risk that purely-autonomous systems suppress — without it, an M3 discovery silently invalidates an M5 plan that runs to completion before anyone notices.
Schema-invariant tests pin behavior, not raw counts. Tests pinning a single source of truth (column sets, scenario counts, downgrade walks) are valuable but brittle when written as raw integers — adding one column should not break three unrelated test files. Two prescriptions: (a) Downgrade walks use an explicit revision id, never a -N count: alembic downgrade ${baseline_rev} survives every future migration; alembic downgrade -3 breaks on every additive one. (b) Count assertions derive from behavior: prefer assert all(s.final.state for s in scenarios) over assert len(scenarios) == 10. The behavior form catches genuine drift (a malformed scenario) while tolerating intentional additions.
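The count-versus-behavior prescription in the last rule can be illustrated with a minimal sketch. The `Scenario` and `Final` dataclasses and the `malformed_scenarios` helper are hypothetical stand-ins for whatever scenario objects a project actually uses; only the `s.final.state` attribute path comes from the rule itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Final:
    state: str  # assumption: an empty string models a malformed scenario


@dataclass(frozen=True)
class Scenario:
    name: str
    final: Final


def malformed_scenarios(scenarios: list[Scenario]) -> list[str]:
    """Return the names of scenarios whose final state is missing."""
    return [s.name for s in scenarios if not s.final.state]


scenarios = [
    Scenario("happy-path", Final("done")),
    Scenario("retry", Final("done")),
]

# Behavior form: holds for any number of well-formed scenarios.
assert all(s.final.state for s in scenarios)

# An intentional addition does not break the behavior-derived assertion,
# whereas `assert len(scenarios) == 2` would now fail.
scenarios.append(Scenario("timeout", Final("failed")))
assert all(s.final.state for s in scenarios)

# Genuine drift (a scenario with no final state) is still caught.
drifted = scenarios + [Scenario("broken", Final(""))]
assert malformed_scenarios(drifted) == ["broken"]
```

Returning the offending names, rather than a bare boolean, keeps the failure message useful when a scenario does drift.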

Evidence: The cross-consistency review is mechanically tractable, which is what makes it a real gate rather than a wish: ADR cross-references check via grep against a list of accepted ADR numbers; capability handoffs check by parsing each plan's "Files to create" section against later plans' imports; the schema, nullability, filesystem, and test-data checks are all greps. The bounded-autonomy checkpoint is the explicit acknowledgment that real engineering reveals decisions plans cannot anticipate. The detailed process, checklist, and worked example are in appendix-l-product-scale-planning.md.
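As a rough illustration of how mechanical checks (a) and (b) can be, the sketch below implements the ADR cross-reference check and the capability-handoff check over in-memory data. It rests on assumptions that the source does not specify: ADR references are taken to follow a four-digit `ADR-NNNN` pattern, and each plan is modelled as a hypothetical dict with `name`, `creates`, and `imports` keys rather than being parsed from a real plan file.

```python
import re

# Assumption: ADRs are cited as "ADR-" followed by four digits.
ADR_REF = re.compile(r"ADR-(\d{4})")


def missing_adrs(plan_text: str, accepted: set[str]) -> set[str]:
    """Check (a): ADR numbers cited in a plan that are not accepted ADRs."""
    return {num for num in ADR_REF.findall(plan_text) if num not in accepted}


def unmet_imports(plans: list[dict]) -> list[tuple[str, str]]:
    """Check (b): (plan, module) pairs imported before any plan delivers them.

    `plans` must be in milestone order. A plan may import modules that it
    creates itself or that an earlier plan created.
    """
    delivered: set[str] = set()
    issues: list[tuple[str, str]] = []
    for plan in plans:
        delivered |= set(plan["creates"])
        for module in plan["imports"]:
            if module not in delivered:
                issues.append((plan["name"], module))
    return issues


plan_m4_text = "Storage format deferred to ADR-0012; retry policy per ADR-0007."
adr_issues = missing_adrs(plan_m4_text, accepted={"0007"})

plans = [
    {"name": "M3", "creates": ["app.frob"], "imports": []},
    {"name": "M4", "creates": [], "imports": ["app.frob", "app.frob.bar"]},
]
handoff_issues = unmet_imports(plans)

total = len(adr_issues) + len(handoff_issues)
verdict = "all clear, dispatch" if total == 0 else f"{total} issues: fix and re-review"
print(verdict)
```

Here the M4 plan cites an ADR that is not yet accepted and imports `app.frob.bar`, which no plan delivers, so the review reports two issues instead of dispatching.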

Product-scale planning interacts with prior amendments:

Design review (Section 3.1) happens at design draft time for every milestone in the product-scale set, not just the next one.
Wave dispatch (Section 5.4) operates within each milestone's plan unchanged.
/loop (Section 5.5) sequences across milestones rather than terminating at milestone boundaries; the bounded-autonomy checkpoints become the explicit pause points in the loop.
STATE.md (Appendix J) tracks "current milestone + last-shipped within milestone" identically to per-milestone planning; the checkpoint moments are the obvious update boundaries.
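The cross-milestone sequencing with bounded-autonomy checkpoints might be sketched as below. This is not the /loop implementation; `run_milestones`, `execute`, and `checkpoint` are hypothetical names, and the checkpoint callback stands in for the human answering whether the last milestone invalidated the next plan.

```python
from typing import Callable, Iterable


def run_milestones(
    milestones: Iterable[str],
    execute: Callable[[str], None],
    checkpoint: Callable[[str, str], bool],
) -> list[str]:
    """Run milestones in order, pausing between each pair for a checkpoint.

    checkpoint(done, upcoming) returns True when the upcoming plan is still
    valid. Returns the milestones that were actually executed.
    """
    names = list(milestones)
    shipped: list[str] = []
    for i, name in enumerate(names):
        execute(name)
        shipped.append(name)
        has_next = i + 1 < len(names)
        if has_next and not checkpoint(name, names[i + 1]):
            break  # halt so the next plan can be amended before dispatch
    return shipped


log: list[str] = []
shipped = run_milestones(
    ["M3", "M4", "M5"],
    execute=log.append,
    # Stand-in for the human answer: M4's execution invalidated the M5 plan.
    checkpoint=lambda done, upcoming: upcoming != "M5",
)
assert shipped == ["M3", "M4"]
```

The loop halts before M5 rather than running an invalidated plan to completion; after the plan is amended, the run would resume from the remaining milestones.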