ahmido a2fdca43fd feat: implement heavy governance cost recovery (#242)

## Summary
- implement Spec 209 heavy-governance cost recovery end to end
- add the heavy-governance contract, hotspot inventory, decomposition, snapshots, budget outcome, and author-guidance surfaces in the shared lane support seams
- slim the baseline and findings hotspot families, harden wrapper behavior, and refresh the spec, quickstart, and contract artifacts

## Validation
- `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Guards/TestLaneCommandContractTest.php tests/Feature/Guards/ActionSurfaceContractTest.php tests/Feature/Guards/OperationLifecycleOpsUxGuardTest.php`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineProfileCaptureStartSurfaceTest.php tests/Feature/Filament/BaselineProfileCompareStartSurfaceTest.php tests/Feature/Filament/BaselineActionAuthorizationTest.php tests/Feature/Findings/FindingsListFiltersTest.php tests/Feature/Findings/FindingExceptionRenewalTest.php tests/Feature/Findings/FindingWorkflowRowActionsTest.php tests/Feature/Findings/FindingWorkflowViewActionsTest.php tests/Feature/Guards/ActionSurfaceContractTest.php tests/Feature/Guards/OperationLifecycleOpsUxGuardTest.php`
- `./scripts/platform-sail artisan test --compact`

## Outcome
- heavy-governance latest artifacts now agree on an authoritative `330s` threshold with `recalibrated` outcome after the honest rerun
- full suite result: `3760 passed`, `8 skipped`, `23535 assertions`

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #242

2026-04-17 13:17:13 +00:00


Research: Heavy Governance Lane Cost Reduction

Decision 1: Reuse the existing heavy-lane governance seams instead of adding a new runner or metadata system

Decision: Spec 209 should extend the existing TestLaneManifest, TestLaneBudget, TestLaneReport, scripts/platform-test-lane, and scripts/platform-test-report seams rather than introducing a second planning or reporting framework.
Rationale: The repository already has checked-in heavy-family attribution, family budgets, lane budgets, and report artifacts under apps/platform/storage/logs/test-lanes. The missing work is family decomposition and budget recovery, not missing lane infrastructure.
Alternatives considered:
- Add a separate heavy-cost analysis runner: rejected because it would duplicate the current test-governance contract.
- Use only ad-hoc profiling commands: rejected because Spec 209 requires repeatable before-and-after evidence and reviewer-visible outputs.

Decision 2: Treat the feature as a budget-recovery exercise against the current measured heavy-governance overrun

Decision: The planning baseline for Spec 209 is the current heavy-governance artifact set showing 318.296962 seconds wall-clock versus the authoritative pre-normalization lane summary budget of 300 seconds.
Rationale: Spec 208 already moved the correct heavy families into the heavy-governance lane. The remaining issue is now an explicit cost-recovery problem, not a classification problem.
Alternatives considered:
- Re-profile from scratch without using the current heavy artifact: rejected because the current artifact already captures the relevant runtime signal and hotspot attribution.
- Treat the lane as healthy because the overrun is relatively small: rejected because the spec requires an explicit budget outcome, not quiet acceptance of ongoing drift.
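
As a quick check on the size of the overrun described in this decision, the arithmetic can be sketched as follows. The two figures are the ones quoted above; the variable names are illustrative and do not come from the repository.

```python
# Figures quoted in Decision 2; variable names are illustrative only.
measured_seconds = 318.296962  # current heavy-governance wall-clock
budget_seconds = 300.0         # pre-normalization lane summary budget

# Absolute and relative overrun against the lane summary budget.
overrun_seconds = measured_seconds - budget_seconds
overrun_percent = overrun_seconds / budget_seconds * 100

print(f"overrun: {overrun_seconds:.6f}s ({overrun_percent:.1f}% over budget)")
```

The overrun is small in relative terms, which is why the second rejected alternative needs stating: the spec asks for an explicit budget outcome regardless of how small the gap is.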

Decision 3: Make the dual heavy-lane budget signal an explicit planning concern

Decision: Spec 209 should explicitly reconcile the current heavy-governance budget mismatch. Until one normalized threshold is published, the lane summary threshold of 300 seconds is treated as the authoritative pre-normalization contract, and the separate budgetTargets() lane target of 200 seconds is treated as legacy drift evidence.
Rationale: The current report summary and the detailed budget evaluation do not describe the same target. A later CI budget-enforcement phase cannot be credible while that inconsistency exists.
Alternatives considered:
- Ignore the mismatch and optimize only against the 300-second lane summary: rejected because the stricter 200-second target still appears in checked-in budget evaluations and will confuse reviewers.
- Force the lane to 200 seconds immediately: rejected because the spec first needs to determine whether the 200-second target is still realistic for the now-honest heavy lane.
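
The reconciliation rule in this decision can be illustrated with a minimal sketch. The function and key names below are hypothetical and are not part of TestLaneBudget or any other repository class; only the 300-second and 200-second values come from the text.

```python
def reconcile_lane_budget(summary_threshold: float, budget_target: float) -> dict:
    """Compare the two budget signals and mark the summary threshold as authoritative.

    Hypothetical helper: it mirrors the rule in Decision 3, not an actual
    repository API.
    """
    return {
        "authoritative_seconds": summary_threshold,   # lane summary threshold wins
        "legacy_seconds": budget_target,              # kept as drift evidence
        "consistent": summary_threshold == budget_target,
        "drift_seconds": summary_threshold - budget_target,
    }


result = reconcile_lane_budget(summary_threshold=300.0, budget_target=200.0)
print(result)
```

A check of this shape makes the mismatch visible to reviewers rather than leaving two silently different targets in the checked-in artifacts.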

Decision 4: Prioritize the dominant ui-workflow families before second-wave surface guards

Decision: The first slimming pass should prioritize baseline-profile-start-surfaces, findings-workflow-surfaces, and finding-bulk-actions-workflow, with workspace-settings-slice-management as the next workflow-heavy fallback if more recovery is required.
Rationale: Current heavy-governance attribution shows ui-workflow at 190.606431 seconds, or roughly 60% of lane cost. The three named families together account for about 161.06 seconds and directly align with the spec's required hotspot set.
Alternatives considered:
- Start with action-surface-contract: rejected as the first pass because it is clearly expensive but already documented as an intentional governance guard and may have less removable duplication than the workflow-heavy hotspots.
- Start with all surface-guard families equally: rejected because the current runtime evidence shows ui-workflow as the dominant cost bucket.
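
The shares behind this prioritization can be recomputed from the quoted figures. This sketch assumes that the 318.296962-second wall-clock value from Decision 2 is the lane total against which attribution is measured; the text does not state that explicitly, and the names are illustrative.

```python
# Figures quoted in Decisions 2 and 4; the pairing of wall-clock total with
# attributed family cost is an assumption made for this illustration.
lane_total_seconds = 318.296962
ui_workflow_seconds = 190.606431
top_three_families_seconds = 161.06  # "about 161.06 seconds" in the rationale


def share(part: float, whole: float) -> float:
    """Return part as a percentage of whole."""
    return part / whole * 100


ui_workflow_share = share(ui_workflow_seconds, lane_total_seconds)
top_three_lane_share = share(top_three_families_seconds, lane_total_seconds)
top_three_bucket_share = share(top_three_families_seconds, ui_workflow_seconds)

print(f"ui-workflow share of lane: {ui_workflow_share:.1f}%")
print(f"top three families, share of lane: {top_three_lane_share:.1f}%")
print(f"top three families, share of ui-workflow: {top_three_bucket_share:.1f}%")
```

Under that assumption, the three named families alone cover about half of the lane and most of the ui-workflow bucket, which supports starting there.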

Decision 5: Decompose targeted families by repeated work before splitting files mechanically

Decision: Each targeted hotspot family should first be decomposed by repeated Livewire mounts, header-action gating matrices, filter-state persistence checks, bulk-action fan-out, evidence or audit verification, and any helper-driven fixture cost before deciding whether to split files.
Rationale: Spec 209 is about real cost reduction, not cosmetic decomposition. A family can remain overbroad even after being split if the same expensive setup still runs in every resulting file.
Alternatives considered:
- Split all top families immediately: rejected because that can produce cleaner file boundaries without removing the dominant repeated work.
- Only centralize helper setup without family-level analysis: rejected because some cost may be due to semantic breadth rather than helper shape.
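
The argument that splitting files does not by itself reduce cost can be shown with a toy cost model. All numbers below are hypothetical and chosen only to make the arithmetic easy to follow. They are not measurements from the heavy-governance lane, and the function names are invented for this sketch.

```python
def family_cost(tests: int, setup_seconds: float, body_seconds: float) -> float:
    """Cost when every test pays the full setup (for example a component mount)."""
    return tests * (setup_seconds + body_seconds)


def split_cost(tests: int, files: int, setup_seconds: float, body_seconds: float) -> float:
    """Cost after a mechanical split: tests move between files, setup still repeats."""
    per_file = [tests // files + (1 if i < tests % files else 0) for i in range(files)]
    return sum(family_cost(n, setup_seconds, body_seconds) for n in per_file)


def deduplicated_cost(tests: int, distinct_setups: int,
                      setup_seconds: float, body_seconds: float) -> float:
    """Cost when repeated setup is removed and only distinct setups are paid."""
    return distinct_setups * setup_seconds + tests * body_seconds


# Hypothetical family: 40 tests, 1.5s setup each, 0.5s of actual assertions.
before = family_cost(40, 1.5, 0.5)
after_split = split_cost(40, 4, 1.5, 0.5)
after_dedup = deduplicated_cost(40, 10, 1.5, 0.5)

print(before, after_split, after_dedup)
```

In this model the split leaves the total unchanged, while removing repeated setup is what lowers it. Real families will be messier, which is why the decision asks for decomposition before any split.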

Decision 6: Record helper-driven or fixture-driven cost as residual debt instead of forcing a family explanation that is not true

Decision: If a targeted hotspot is found to be dominated by helper, fixture, or support-path cost rather than family breadth, the resulting plan should record that as explicit residual debt and treat it as follow-up work instead of pretending the family itself was narrowed.
Rationale: The spec requires honest attribution. Mislabeling helper or fixture cost as family-width improvement would create false confidence and make later budget work less reliable.
Alternatives considered:
- Force all heavy cost into family-width categories: rejected because it would violate the spec's explicit residual-cause requirement.
- Reopen Spec 207 inside Spec 209: rejected because fixture slimming remains a separate concern even when its residual effects appear here.

Decision 7: Treat `action-surface-contract` and `ops-ux-governance` as intentional heavy families unless decomposition exposes repeatable duplication

Decision: action-surface-contract and ops-ux-governance should be treated as second-wave slimming candidates. They remain in heavy-governance by default and should only be narrowed where clear duplicate discovery or repeated governance passes can be shown.
Rationale: Together they account for 79.636413 seconds and are meaningful heavy governance checks. They may still contain removable redundancy, but their default assumption should be “intentionally heavy until proven otherwise,” not “overbroad by default.”
Alternatives considered:
- Treat all surface-guard cost as excessive: rejected because these families intentionally protect cross-resource governance contracts.
- Exclude them from the plan entirely: rejected because they are still major contributors to lane cost and may need second-pass analysis.

Decision 8: Use the existing heavy-governance report artifacts as the before-and-after evidence contract

Decision: Pre-change and post-change evidence should continue to flow through heavy-governance-latest.summary.md, heavy-governance-latest.budget.json, and heavy-governance-latest.report.json under apps/platform/storage/logs/test-lanes.
Rationale: The repository already reads and writes these artifacts. Extending the same contract keeps Spec 209 measurable without introducing a new artifact root or new tool surface.
Alternatives considered:
- Add a second report directory specifically for Spec 209: rejected because the current lane artifact contract is already canonical.
- Depend only on terminal output: rejected because reviewers need checked-in, inspectable budget evidence.

Decision 9: Keep author guidance repo-local and adjacent to the existing lane contract

Decision: Spec 209 should place future heavy-family guidance in the existing test-governance seam, centered on TestLaneManifest semantics, guard expectations, and checked-in review guidance rather than creating a separate framework or documentation tree.
Rationale: Authors and reviewers already need the manifest and guard seams to understand lane ownership. Keeping the guidance there avoids a new abstraction layer and keeps maintenance local to the existing contract.
Alternatives considered:
- Create a new standalone documentation subsystem for heavy tests: rejected because the guidance is specific to the repository's existing lane contract.
- Leave guidance only in the spec artifacts: rejected because authors need a lasting checked-in hint near the implementation seam after the spec is complete.

Decision 10: Success requires explicit recovery or explicit recalibration, not quiet tolerance

Decision: The feature should end in exactly one of two explicit outcomes. Either the heavy-governance lane recovers within the authoritative threshold for the rollout, which starts at 300 seconds until normalization completes, or the repository documents a conscious recalibration once the honest lane composition and dominant residual costs are understood.
Rationale: Spec 209 exists to stabilize the heavy lane before CI enforcement. A vague “improved but still heavy” outcome would not satisfy that purpose.
Alternatives considered:
- Accept any measurable improvement as sufficient: rejected because the spec explicitly requires a budget decision.
- Hard-code recalibration in advance: rejected because the plan must first test whether real recovery is feasible from the dominant hotspot families.
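
The two-outcome rule can be sketched as a small classifier. The function name and the `recovered` label are assumptions made for illustration; `recalibrated` and the 330-second threshold appear in the commit message. The measured value reused here is the planning baseline from Decision 2, not the post-change measurement, which is not quoted in this document.

```python
from typing import Optional


def budget_outcome(measured_seconds: float,
                   threshold_seconds: float,
                   recalibrated_threshold: Optional[float] = None) -> str:
    """Return an explicit outcome, or fail when neither outcome applies.

    Hypothetical helper mirroring Decision 10; not a repository API.
    """
    if measured_seconds <= threshold_seconds:
        return "recovered"
    if recalibrated_threshold is not None and measured_seconds <= recalibrated_threshold:
        return "recalibrated"
    # "Improved but still heavy" is deliberately not an accepted result.
    raise ValueError("no explicit budget outcome: lane exceeds every documented threshold")


outcome = budget_outcome(318.296962, 300.0, recalibrated_threshold=330.0)
print(outcome)
```

Raising an error in the third case reflects the intent of the decision: quiet tolerance is treated as a failure rather than as a third outcome.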
