TenantAtlas/specs/209-heavy-governance-cost/spec.md
2026-04-17 15:15:23 +02:00


# Feature Specification: Heavy Governance Lane Cost Reduction
**Feature Branch**: `209-heavy-governance-cost`

**Created**: 2026-04-17

**Status**: Draft

**Input**: User description: "Spec 209 - Heavy Governance Lane Cost Reduction"
## Spec Candidate Check *(mandatory — SPEC-GATE-001)*
- **Problem**: The Heavy Governance lane now carries the correct expensive governance families after Spec 208, but the lane itself remains above its documented budget and lacks a precise, shared explanation of which families dominate that cost and why.
- **Today's failure**: Confidence has been repaired, yet Heavy Governance is still too expensive, overbroad family boundaries hide duplicate discovery or validation work, and later CI budget enforcement would punish an honest but still unstable lane.
- **User-visible improvement**: Contributors and reviewers get a Heavy Governance lane whose dominant costs are visible, intentionally sliced, and either brought back under budget or consciously recalibrated with evidence.
- **Smallest enterprise-capable version**: Inventory the dominant heavy-governance families, decompose the top hotspots by trust type and duplicated work, refactor the most overbroad families, rerun the lane with before-and-after reporting, and publish concise author or reviewer guidance for future heavy tests.
- **Explicit non-goals**: No re-run of the lane-segmentation decisions from Spec 208, no general fixture-slimming program, no CI matrix rollout, no browser strategy work, and no blanket removal of legitimate governance coverage.
- **Permanent complexity imported**: A checked-in heavy-family inventory, cost-driver vocabulary, hotspot reporting discipline, budget-recovery decision record, and concise author or reviewer guidance for future heavy tests.
- **Why now**: Spec 208 made the heavy cost honest. Without tightening the Heavy Governance lane now, the next CI-enforcement phase would be built on a lane that is semantically correct but still not economically stable.
- **Why not local**: One-off optimizations cannot explain or control a lane-wide cost class. The repository needs a shared inventory, a shared slimming rule set, and an explicit budget decision that reviewers can evaluate consistently.
- **Approval class**: Cleanup
- **Red flags triggered**: Another governance taxonomy and the possibility of budget recalibration. Defense: the scope is intentionally limited to repository test-lane cost control and does not introduce new product runtime truth, new product persistence, or new operator-facing surfaces.
- **Score**: Benefit: 2 | Urgency: 2 | Scope: 2 | Complexity: 1 | Product proximity: 1 | Reuse: 2 | **Total: 10/12**
- **Decision**: approve
## Spec Scope Fields *(mandatory)*
- **Scope**: workspace
- **Primary Routes**: No end-user HTTP routes change. The affected surfaces are repository-owned Heavy Governance lane manifests, hotspot inventories, runtime reports, family-level governance tests, and contributor guidance.
- **Data Ownership**: Workspace-owned classification notes, hotspot evidence, lane-budget reporting, family refactoring rules, and reviewer guidance. No tenant-owned runtime records or new product data are introduced.
- **RBAC**: No end-user authorization behavior changes. The affected actors are repository contributors, reviewers, and maintainers who need an honest and governable Heavy Governance lane.
## Proportionality Review *(mandatory when structural complexity is introduced)*
- **New source of truth?**: no
- **New persisted entity/table/artifact?**: no new product persistence; only repository-owned inventory, reporting evidence, and guidance updates
- **New abstraction?**: yes, but limited to a repository-level inventory and decomposition model for dominant heavy-governance families
- **New enum/state/reason family?**: no product runtime state is added
- **New cross-domain UI framework/taxonomy?**: no new cross-domain product taxonomy; this work only sharpens the repository's heavy-governance cost vocabulary
- **Current operator problem**: Maintainers cannot tell which heavy-governance families are legitimately expensive, which are redundant, and which are simply overbroad, so the lane stays expensive without a stable correction path.
- **Existing structure is insufficient because**: Spec 208 fixed lane placement, but not family width. The current heavy families can still combine discovery, surface, workflow, and guard trust in ways that repeat work and make budget drift hard to control.
- **Narrowest correct implementation**: Inventory the dominant heavy families, analyze the top hotspots, refactor only the most overbroad families, rerun the lane, and end with an explicit budget-recovery or recalibration decision.
- **Ownership cost**: The team must maintain the hotspot inventory, the family-slimming rules, the before-and-after budget evidence, and the short guidance that keeps future heavy tests from regressing.
- **Alternative intentionally rejected**: Blindly removing assertions or moving families back into lighter lanes, because that would hide cost instead of explaining and controlling it.
- **Release truth**: Current-release repository truth that stabilizes the Heavy Governance lane before any CI budget enforcement is made binding.
## Problem Statement
Spec 208 resolved the earlier dishonesty in lane placement: heavy governance cost is now concentrated in the Heavy Governance lane instead of leaking into Confidence.

That exposed a new and separate problem: the Heavy Governance lane itself is still too expensive for its documented budget.

The cost drivers are now believed to be structural rather than classificatory:
- Some heavy-governance families are too broad and try to prove multiple trust types in one pass.
- Discovery, validation, render, or surface scans may be repeated inside the same family or across closely related families.
- Workflow-heavy and surface-heavy guards may be bundled together even when they should be separate test concerns.
- The most expensive families are now visible, but their internal cost drivers are not yet decomposed well enough to target the right fix.
- The lane is semantically honest but not yet economically stable enough for later CI budget enforcement.

If this remains unresolved, the repository reaches an unhealthy middle state: the right tests live in the right lane, but the lane is still too expensive to use as a dependable budgeted contract.
## Dependencies
- Depends on Spec 206 - Test Suite Governance & Performance Foundation for lane vocabulary, baseline reporting discipline, and budget governance.
- Depends on Spec 207 - Shared Test Fixture Slimming for the earlier reduction of default fixture cost.
- Depends on Spec 208 - Heavy Suite Segmentation for the corrected placement of heavy governance families into the Heavy Governance lane.
- Recommended before any CI matrix or runtime budget enforcement follow-up.
- Does not block ordinary feature delivery, provided new heavy-governance tests continue to follow the current lane-classification rules.
## Goals
- Bring the Heavy Governance lane back under its documented budget or end with an explicit and justified recalibration.
- Inventory the dominant heavy-governance families and their main hotspot files.
- Decompose the top heavy families by trust type, breadth, and duplicate work.
- Reduce redundant discovery, validation, render, or surface work inside the targeted hotspot families.
- Preserve the governance trust delivered by heavy families while making their cost class more controllable and understandable.
- Create a stable basis for later CI budget enforcement.
## Non-Goals
- Reopening the main lane-assignment decisions from Spec 208.
- Replacing the fixture-cost work from Spec 207.
- General CI wiring or hard-fail enforcement.
- Browser-lane optimization.
- Broad rollback of legitimate governance guards merely to improve runtime headlines.
- Primary optimization of Confidence or Fast Feedback beyond any indirect gains that happen naturally.
## Assumptions
- The current Heavy Governance lane already reflects the correct high-level family placement after Spec 208.
- Baseline reporting from the current heavy-governance run can be regenerated before and after this work.
- Some families will remain intentionally expensive, but their purpose and residual cost should still be explicit.
- Not every heavy hotspot will require file splitting; some can be recovered by centralizing repeated work or tightening family scope.
- If a dominant hotspot turns out to be mostly helper-driven or fixture-driven rather than family-breadth-driven, that cause will be recorded explicitly instead of being disguised as a family problem.
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Inventory Heavy Governance Hotspots (Priority: P1)
As a maintainer, I want the dominant Heavy Governance families and files explicitly inventoried so I can see which families are making the lane miss its budget and what kind of cost each family represents.

**Why this priority**: Budget recovery cannot be honest until the lane's dominant costs are visible and named.

**Independent Test**: Run the Heavy Governance lane reporting path, review the resulting inventory, and confirm that the dominant families have named purposes, hotspot files, and cost-driver classifications.

**Acceptance Scenarios**:

1. **Given** a current Heavy Governance baseline run, **When** the hotspot inventory is produced, **Then** the dominant families are listed with family name, purpose, hotspot files, and primary cost-driver classification.
2. **Given** a family is known to dominate the lane, **When** the inventory is reviewed, **Then** it is labeled as overbroad, redundant, discovery-heavy, workflow-heavy, surface-heavy, helper-driven, fixture-driven, intentionally heavy, or another explicitly explained cost type.
3. **Given** the lane is over budget, **When** a maintainer reads the inventory, **Then** they can identify which families are responsible for most of the overrun.
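
One way to keep the labels from scenario 2 consistent is a closed enumeration. The following is a minimal Python sketch for illustration only, not repository code: `CostDriver` and `classify` are hypothetical names, and only the label strings are taken from this spec.

```python
from enum import Enum


class CostDriver(Enum):
    """Cost-driver labels named in this spec (the enum itself is hypothetical)."""

    OVERBROAD = "overbroad"
    REDUNDANT = "redundant"
    DISCOVERY_HEAVY = "discovery-heavy"
    WORKFLOW_HEAVY = "workflow-heavy"
    SURFACE_HEAVY = "surface-heavy"
    HELPER_DRIVEN = "helper-driven"
    FIXTURE_DRIVEN = "fixture-driven"
    INTENTIONALLY_HEAVY = "intentionally heavy"
    OTHER = "other"  # only valid together with a written explanation


def classify(label: str, explanation: str = "") -> tuple[CostDriver, str]:
    """Map a free-text label onto the vocabulary.

    Labels outside the named vocabulary are accepted only when an
    explanation is supplied, mirroring "another explicitly explained cost type".
    """
    normalized = label.strip().lower()
    for driver in CostDriver:
        if driver is not CostDriver.OTHER and driver.value == normalized:
            return driver, explanation
    if not explanation.strip():
        raise ValueError(f"unrecognized cost driver {label!r} requires an explanation")
    return CostDriver.OTHER, explanation
```

Rejecting unexplained labels at classification time keeps the inventory from accumulating ad-hoc cost types that reviewers cannot compare.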
---
### User Story 2 - Slim Overbroad Heavy Families Without Losing Trust (Priority: P1)
As a maintainer, I want the most overbroad heavy families split or tightened so the lane stops repeating equivalent discovery or workflow work while keeping governance trust legible.

**Why this priority**: The largest gains are expected to come from fixing the breadth and duplication of the worst hotspot families, not from cosmetic file moves.

**Independent Test**: Refactor one of the top hotspot families, run its focused pack, and confirm that the resulting families have clearer trust boundaries, less repeated work, and preserved guard intent.

**Acceptance Scenarios**:

1. **Given** a heavy family mixes discovery, workflow, and surface trust, **When** it is refactored, **Then** the resulting tests have clearer semantic boundaries and no unnecessary catch-all scope.
2. **Given** duplicate discovery or validation work exists within a targeted family, **When** the family is slimmed, **Then** repeated passes are reduced or centralized without removing the intended governance guard.
3. **Given** a family remains legitimately heavy after refactoring, **When** it is reviewed, **Then** its remaining cost and trust purpose are explicitly documented.
---
### User Story 3 - Resolve Heavy Lane Budget Failure Explicitly (Priority: P2)
As a maintainer, I want the Heavy Governance lane rerun after the slimming pass so I can prove whether the lane is back within budget or whether the budget itself must be consciously recalibrated.

**Why this priority**: The feature is only complete when budget failure ends in a real decision rather than an unexplained acceptance of ongoing overrun.

**Independent Test**: Rerun the lane after the targeted family changes, compare the baseline and post-change results, and confirm that the outcome records either budget recovery or a justified recalibration decision.

**Acceptance Scenarios**:

1. **Given** the top hotspot families have been slimmed, **When** the Heavy Governance lane is rerun, **Then** the before-and-after delta and remaining hotspots are documented.
2. **Given** the lane still exceeds its former budget after real slimming, **When** the results are reviewed, **Then** the outcome records a conscious recalibration decision with evidence instead of silently treating the overrun as acceptable.
3. **Given** a touched heavy-governance family, **When** the post-change manifest and outcome are reviewed, **Then** the family remains in Heavy Governance unless an explicit non-budget rationale and spec-backed reclassification are recorded.
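
Scenario 3 could be checked mechanically by comparing lane membership before and after the change. The sketch below is an assumption-laden Python illustration: the lane identifier strings, the manifest shape (family name mapped to lane name), and the `find_cost_hiding` helper are all hypothetical and not taken from the repository.

```python
HEAVY_LANE = "heavy-governance"  # hypothetical lane identifier


def find_cost_hiding(
    before: dict[str, str],
    after: dict[str, str],
    approved_reclassifications: set[str],
) -> list[str]:
    """Return families that left Heavy Governance without a recorded rationale.

    Only relocations are flagged. A family that no longer appears in the
    post-change manifest (for example because it was split or renamed) is
    ignored here and needs a separate review.
    """
    violations = []
    for family, lane in before.items():
        if lane != HEAVY_LANE or family not in after:
            continue
        if after[family] != HEAVY_LANE and family not in approved_reclassifications:
            violations.append(family)
    return sorted(violations)
```

An empty result means no touched family was relocated without an approved, spec-backed reclassification.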
---
### User Story 4 - Guide Future Heavy Test Authors (Priority: P2)
As a reviewer or contributor, I want short guidance for Heavy Governance families so new heavy tests do not reintroduce the same overbroad patterns that made the lane unstable.

**Why this priority**: Without author guidance, the same structural mistakes can reappear immediately after the cleanup.

**Independent Test**: Review the guidance against a new or recently modified heavy-governance test and confirm that an author can decide whether the test belongs in an existing family, needs a new family, or should be split.

**Acceptance Scenarios**:

1. **Given** a new heavy-governance test is proposed, **When** the author guidance is applied, **Then** the reviewer can decide whether it belongs in an existing family or needs a separate family with explicit rationale.
2. **Given** a proposed test mixes discovery, surface, and workflow trust, **When** the guidance is applied, **Then** the reviewer can tell whether those concerns must be split to avoid another catch-all family.
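
The review decision in these scenarios can be expressed as a small rule. This Python sketch is only one possible reading of the guidance: the trust-type names follow the wording used in this spec, while `review_proposed_test` and its return labels are hypothetical.

```python
TRUST_TYPES = {"discovery", "workflow", "surface", "guard"}


def review_proposed_test(trust_types: set[str], justification: str = "") -> str:
    """Suggest a review outcome for a proposed heavy-governance test.

    A test covering one trust type fits a single family. A test mixing
    several trust types is acceptable only with a documented justification;
    otherwise a split is recommended.
    """
    if not trust_types:
        raise ValueError("a proposed test must name at least one trust type")
    unknown = trust_types - TRUST_TYPES
    if unknown:
        raise ValueError(f"unknown trust types: {sorted(unknown)}")
    if len(trust_types) == 1:
        return "single-family"
    if justification.strip():
        return "combined-with-justification"
    return "split-recommended"
```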
### Edge Cases
- A hotspot family appears expensive, but the dominant cost is actually a shared helper or fixture path outside the family itself.
- Splitting a family creates clearer files but does not reduce runtime because the same central discovery work still runs repeatedly.
- A family is intentionally expensive and should remain in Heavy Governance, but it still needs an explicit explanation so reviewers do not mistake it for accidental bloat.
- The lane remains over budget even after duplicate work is removed, requiring a conscious recalibration rather than more arbitrary trimming.
- Multiple hotspot families depend on the same discovery or validation topology, so the correct optimization is centralization rather than repeated local edits.
## Requirements *(mandatory)*
**Constitution alignment:** This feature changes no end-user routes, no Microsoft Graph behavior, no queued operation semantics, no authorization planes, and no operator-facing product surfaces. It extends repository test-governance rules only, so the heavy-family inventory, slimming rules, hotspot evidence, and budget decision must remain explicit, reviewable, and measurable.

**Constitution alignment (PROP-001 / ABSTR-001 / BLOAT-001 / TEST-TRUTH-001):** The only new structure is a narrow repository-level inventory and decomposition model for heavy-governance hotspots. It is justified because the current over-budget lane cannot be corrected reliably through isolated local edits or by hiding cost in lighter lanes. The solution must stay explicit, avoid speculative frameworking, and preserve clear trust-to-family mapping.

**Budget authority for this rollout:** Until the normalization work is implemented, the Heavy Governance lane summary threshold of `300s` is the authoritative pre-normalization contract for recovery-or-recalibration decisions. The `200s` lane evaluation still emitted by `budgetTargets()` is legacy drift evidence that must remain visible until reconciled, but it is not a second passing threshold.
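
The precedence rule above can be illustrated with a short Python sketch. It does not call or model the real `budgetTargets()` function; the constants come from this spec, while `BudgetEvaluation`, `evaluate_heavy_lane`, and the inclusive `<=` comparison are assumptions made for illustration.

```python
from dataclasses import dataclass

AUTHORITATIVE_THRESHOLD_S = 300.0  # lane summary threshold named in this spec
LEGACY_TARGET_S = 200.0  # legacy value, reported as drift evidence only


@dataclass(frozen=True)
class BudgetEvaluation:
    lane_seconds: float
    within_budget: bool  # decided by the authoritative threshold alone
    legacy_drift_seconds: float  # stays visible, never used as a pass/fail gate


def evaluate_heavy_lane(lane_seconds: float) -> BudgetEvaluation:
    """Evaluate a lane runtime against the single authoritative threshold."""
    return BudgetEvaluation(
        lane_seconds=lane_seconds,
        # Assumption: "within budget" is inclusive of the threshold itself.
        within_budget=lane_seconds <= AUTHORITATIVE_THRESHOLD_S,
        legacy_drift_seconds=lane_seconds - LEGACY_TARGET_S,
    )
```

A run of 250 seconds would pass under this reading even though it exceeds the legacy value by 50 seconds; the drift is recorded but does not change the outcome.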
### Functional Requirements
- **FR-001 Heavy Family Inventory**: The repository MUST produce an explicit inventory of the dominant Heavy Governance families, including family name, primary purpose, primary hotspot files, and primary cost-driver classification.
- **FR-002 Known Hotspot Inclusion**: The inventory MUST include the current top 5 Heavy Governance families by lane time, or enough families to explain at least 80% of the lane's current runtime, whichever set is larger. This inclusion boundary MUST include `baseline-profile-start-surfaces`, `findings-workflow-surfaces`, and `finding-bulk-actions-workflow` while they remain inside that boundary, and MUST expand to any newly discovered family above the same boundary.
- **FR-003 Cost-Driver Classification**: Each inventoried family MUST be classified using a consistent vocabulary that distinguishes at least overbroad, redundant, discovery-heavy, workflow-heavy, surface-heavy, helper-driven, fixture-driven, and intentionally heavy causes.
- **FR-004 Internal Cost Decomposition**: The top hotspot families selected for action MUST each document which breadth is genuinely required, where duplicate discovery or validation work occurs, and which trust types are currently combined.
- **FR-005 No Catch-All Families**: A targeted Heavy Governance family MUST NOT remain a catch-all that mixes unrelated discovery, workflow, and surface trust without explicit documented justification.
- **FR-006 Duplicate Work Reduction**: Where targeted families repeat semantically equivalent discovery, validation, render, or surface-scanning work, that work MUST be reduced, centralized, or otherwise made non-duplicative.
- **FR-007 Guard Preservation**: Every refactored heavy family MUST retain a clear explanation of which governance rule or trust type it protects so that runtime gains do not come from hidden guard loss.
- **FR-008 Hotspot Reporting**: The Heavy Governance reporting output MUST show lane time before and after the slimming pass, top hotspot files or families, families that remain open and expensive, stabilized families, and whether the checked-in hotspot inventory satisfies the top-5-or-80%-of-runtime inclusion rule.
- **FR-009 Honest Budget Outcome**: After the targeted slimming pass, the Heavy Governance lane MUST be rerun and end with one of two explicit outcomes against the authoritative heavy-governance budget contract: documented recovery within the current threshold or a documented recalibration of the heavy-lane budget based on the now-correct lane composition.
- **FR-010 No Cost Hiding**: Heavy-governance families touched by this spec MUST retain Heavy Governance lane membership unless a non-budget rationale for reclassification is explicitly recorded and approved via a spec update. They MUST NOT be moved into lighter lanes solely to satisfy the budget target.
- **FR-011 Residual Cause Recording**: If a hotspot's dominant cost is determined to be primarily fixture-driven, helper-driven, or otherwise outside family scope, that residual cause MUST be recorded explicitly and routed as follow-up debt rather than being misreported as family-width improvement.
- **FR-012 Author And Reviewer Guidance**: The repository MUST provide concise guidance stating when a new heavy family is justified, when a test belongs in an existing heavy family, when a test is too broad, and when discovery, surface, and workflow concerns should be separated.
- **FR-013 Budget Contract Precedence**: For Spec 209 acceptance and reporting, the Heavy Governance lane summary threshold of `300s` is the authoritative pre-normalization budget contract until the normalization work publishes one reconciled threshold. The `200s` value still emitted by `budgetTargets()` MUST remain visible as legacy drift evidence but MUST NOT be interpreted as a second passing threshold.
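
The inclusion rule in FR-002 (top 5 families, or enough families to explain 80% of runtime, whichever set is larger) can be sketched as follows. This is an illustrative Python reading of the rule; `inclusion_boundary` and the input shape (family name mapped to seconds) are hypothetical.

```python
def inclusion_boundary(
    family_seconds: dict[str, float],
    top_n: int = 5,
    runtime_share: float = 0.80,
) -> list[str]:
    """Return the families the inventory must cover, most expensive first."""
    ranked = sorted(family_seconds.items(), key=lambda item: item[1], reverse=True)
    total = sum(family_seconds.values())

    # Count how many of the most expensive families are needed to reach the share.
    needed_for_share = 0
    running = 0.0
    for needed_for_share, (_, seconds) in enumerate(ranked, start=1):
        running += seconds
        if running >= runtime_share * total:
            break

    # "Whichever set is larger": take the bigger of the two counts.
    count = max(min(top_n, len(ranked)), needed_for_share)
    return [name for name, _ in ranked[:count]]
```

With a steep distribution the top-5 rule dominates; with a flat distribution the 80% rule pulls in more families, which is the case the "whichever set is larger" wording protects.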
### Non-Functional Requirements
- **NFR-001 Budget Honesty**: Heavy Governance cost must remain visible and attributable rather than being hidden through relabeling or silent exclusions.
- **NFR-002 Review Clarity**: A reviewer must be able to explain why a dominant family is heavy and what trust it delivers without relying on local tribal knowledge.
- **NFR-003 Incremental Slimming**: The highest-cost families must be reducible in targeted slices rather than requiring a full rewrite of the lane.
- **NFR-004 Stable Enforcement Readiness**: The resulting lane must be stable enough that later CI budget enforcement can treat its budget as a credible contract. Concretely, a heavy-governance rerun through the standard wrappers must emit summary, budget, and report artifacts that expose the same authoritative threshold and the same budget outcome classification.
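
The agreement that NFR-004 asks for across the summary, budget, and report artifacts could be verified with a check along these lines. The field names `threshold_seconds` and `budget_outcome`, the artifact shape, and the helper name are assumptions for illustration; the real artifact schema is not described in this spec.

```python
REQUIRED_FIELDS = ("threshold_seconds", "budget_outcome")  # hypothetical field names


def artifact_disagreements(artifacts: dict[str, dict]) -> list[str]:
    """Return human-readable problems; an empty list means the artifacts agree."""
    problems = []
    for field_name in REQUIRED_FIELDS:
        values = {name: data.get(field_name) for name, data in artifacts.items()}
        missing = sorted(name for name, value in values.items() if value is None)
        if missing:
            problems.append(f"{field_name} missing in: {', '.join(missing)}")
        distinct = {value for value in values.values() if value is not None}
        if len(distinct) > 1:
            problems.append(f"{field_name} differs across artifacts: {values}")
    return problems
```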
## Work Packages
### Work Package A - Heavy Hotspot Inventory
- Profile the current Heavy Governance lane.
- Identify the dominant families and hotspot files.
- Classify each dominant family by primary cost driver.
- Record the current heavy-lane baseline that the recovery pass will be measured against.
### Work Package B - Family Semantics Audit
- For each top hotspot family, identify which trust type it delivers.
- Separate required breadth from accidental breadth.
- Identify duplicate discovery, validation, or surface work.
- Distinguish true family-width issues from helper or fixture issues.
### Work Package C - Dominant Family Refactoring
- Split overbroad families into narrower units when needed.
- Centralize repeated discovery or validation work when that is the real hotspot.
- Separate workflow-heavy checks from surface-heavy or discovery-heavy guards where that improves both clarity and cost.
- Keep each resulting family's trust purpose explicit.
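
Centralizing a repeated discovery pass can be as simple as memoizing it so that several guards share one result. The sketch below shows the general technique in Python using `functools.cache`; it is not the repository's test framework, and `discover_surfaces` with its placeholder return values is a hypothetical stand-in for whatever expensive scan a family performs.

```python
from functools import cache


@cache
def discover_surfaces(root: str) -> tuple[str, ...]:
    """Stand-in for an expensive discovery pass; runs once per distinct root."""
    # Placeholder result. A real pass would scan the project under `root`.
    return (f"{root}/surface-a", f"{root}/surface-b")


def workflow_guard_holds(root: str) -> bool:
    """A workflow-oriented check that needs the discovered surfaces."""
    return all(surface.startswith(root) for surface in discover_surfaces(root))


def surface_guard_holds(root: str) -> bool:
    """A surface-oriented check that reuses the same discovery result."""
    return len(discover_surfaces(root)) > 0
```

Both guards stay separate and keep their own trust purpose, while the shared pass is paid for only once. `discover_surfaces.cache_info()` reports hits and misses if the reuse needs to be demonstrated.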
### Work Package D - Budget Recovery Validation
- Rerun the Heavy Governance lane after the slimming pass.
- Document the before-and-after lane delta and remaining hotspots.
- Decide explicitly whether the lane is back under budget or whether the budget must be recalibrated.
- Record any bounded residual debt that remains after honest slimming.
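
The before-and-after comparison in this work package amounts to simple arithmetic over per-family timings. A minimal Python sketch, assuming timings are available as a mapping from family name to seconds (the `lane_delta` helper and its output keys are hypothetical):

```python
def lane_delta(before: dict[str, float], after: dict[str, float]) -> dict:
    """Summarize the lane-level and per-family change between two runs.

    Families present in only one run (for example after a split) are
    treated as costing 0 seconds in the run where they are absent.
    """
    families = sorted(set(before) | set(after))
    before_total = sum(before.values())
    after_total = sum(after.values())
    return {
        "before_seconds": before_total,
        "after_seconds": after_total,
        "lane_delta_seconds": after_total - before_total,
        "family_deltas": {
            name: after.get(name, 0.0) - before.get(name, 0.0) for name in families
        },
    }
```

A negative `lane_delta_seconds` indicates recovery; positive per-family deltas highlight where cost moved rather than disappeared.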
### Work Package E - Author And Reviewer Guidance
- Add short rules for when a heavy family is justified.
- Explain when a new test belongs in an existing heavy family.
- Explain when a proposed heavy test is too broad.
- Give reviewers a clear rule for splitting discovery, workflow, and surface trust when they are unnecessarily combined.
## Deliverables
- A checked-in inventory of dominant Heavy Governance families and hotspot files.
- A cost-driver classification for the heavy-lane hotspots.
- Slimmed or decomposed versions of the dominant overbroad families.
- Updated before-and-after heavy-lane reporting and budget evidence.
- An explicit budget outcome: recovery within the current threshold or a consciously documented recalibration.
- Short author or reviewer guidance for future Heavy Governance tests.
- A final summary of residual risks or remaining debt.
## Risks
### Governance Hollowing
Over-aggressive slimming could remove real governance trust instead of only removing redundant work.
### Cosmetic Decomposition
Renaming or splitting files without actually reducing duplicate work could make the lane look cleaner without recovering budget.
### Budget Gaming
Moving heavy cost into lighter lanes or redefining scope to satisfy the target would damage the credibility of the repository's lane governance.
### Misclassified Residual Cost
Some hotspots may ultimately be helper-driven or fixture-driven; if that is not recorded honestly, the family-level analysis will be misleading.
### Overfitting To Today's Hotspots
If the spec focuses only on today's top families and does not leave reviewer guidance behind, the next wave of heavy tests can recreate the same problem.
## Rollout Guidance
- Profile the current lane before making structural edits.
- Audit the top families semantically before trimming any assertions.
- Prefer removing duplicate work or clarifying family scope before deleting coverage.
- Rerun the lane after the targeted slimming pass and record the delta immediately.
- If the lane is still honestly over budget, end with a conscious recalibration or bounded follow-up debt instead of silent acceptance.
## Design Rules
- **Heavy stays honest**: Heavy cost must remain visible in the Heavy Governance lane.
- **Trust first, then trimming**: Protect the governance rule first, then remove accidental breadth or duplication.
- **No catch-all families**: A family should not mix unrelated trust types without explicit reason.
- **No duplicate discovery without cause**: Equivalent discovery or validation passes require a clear justification.
- **Budget failure resolves explicitly**: Over-budget status must end in recovery or a conscious recalibration decision.
- **Reviewer legibility is required**: A reviewer must be able to see why a family is heavy and what it protects.
### Key Entities *(include if feature involves data)*
- **Heavy Governance Family**: A named cluster of governance tests that contributes deliberate high-cost safety to the Heavy Governance lane.
- **Cost-Driver Classification**: The recorded explanation of why a heavy family is expensive, such as overbroad scope, duplicate work, discovery breadth, workflow breadth, or intentionally retained depth.
- **Hotspot Inventory**: The checked-in record of dominant heavy-lane families, their purpose, hotspot files, and primary cost drivers.
- **Budget Outcome Record**: The documented result that states whether the Heavy Governance lane recovered within budget or required a conscious recalibration.
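
The entities above could be represented by a small record type that also reports which required details are still missing. This is a hypothetical Python sketch: the field names are derived from the descriptions in this spec, not from any existing repository schema.

```python
from dataclasses import dataclass, field


@dataclass
class InventoryEntry:
    """One Heavy Governance family as recorded in the hotspot inventory."""

    family: str
    purpose: str = ""
    trust_type: str = ""
    cost_driver: str = ""
    hotspot_files: list[str] = field(default_factory=list)

    def missing_fields(self) -> list[str]:
        """List the documentation fields that are still empty."""
        text_fields = ("family", "purpose", "trust_type", "cost_driver")
        missing = [name for name in text_fields if not getattr(self, name).strip()]
        if not self.hotspot_files:
            missing.append("hotspot_files")
        return missing
```

An entry with an empty `missing_fields()` result carries a purpose, trust type, hotspot files, and cost-driver classification.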
## Success Criteria *(mandatory)*
### Measurable Outcomes
- **SC-001**: The hotspot inventory covers at least the top 5 Heavy Governance families by current lane time, or enough families to explain at least 80% of the lane's current runtime, whichever set is larger.
- **SC-002**: Every targeted hotspot family has a documented purpose, primary trust type, hotspot files, and primary cost-driver classification.
- **SC-003**: At least the top 3 targeted hotspot families are slimmed, decomposed, or explicitly retained as intentionally heavy with a documented reason.
- **SC-004**: After the slimming pass, the Heavy Governance lane either runs within the authoritative threshold for this rollout (`300s` until the normalization work publishes a reconciled threshold) or ends with a newly documented budget, backed by evidence that the lane composition is now semantically correct and the remaining cost is legitimate.
- **SC-005**: The post-change reporting view exposes at least the top 10 Heavy Governance hotspots, confirms that the inventory covers the top 5 families or at least 80% of runtime, whichever is larger, and marks which targeted families improved, which remain open, and which were intentionally retained as heavy.
- **SC-006**: Reviewer guidance enables a contributor to classify a new heavy-governance test without undocumented local knowledge and to detect when a proposed test is too broad.
- **SC-007**: No targeted heavy-governance family is moved into Fast Feedback or Confidence solely for budget reasons during this rollout.
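
The reporting view described in SC-005 combines a ranked hotspot list with a status mark per targeted family. The Python sketch below is one hedged interpretation: the helper names and status labels are hypothetical, and treating any runtime reduction as "improved" is an assumption that a real report may want to tighten.

```python
def top_hotspots(seconds_by_family: dict[str, float], limit: int = 10) -> list[str]:
    """Return the most expensive families, most expensive first."""
    ranked = sorted(seconds_by_family.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:limit]]


def mark_family(before_seconds: float, after_seconds: float, retained_reason: str = "") -> str:
    """Mark a targeted family as retained, improved, or still open."""
    if retained_reason.strip():
        return "retained-heavy"  # intentionally heavy, with a documented reason
    # Assumption: any reduction counts as improvement.
    return "improved" if after_seconds < before_seconds else "open"
```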