TenantAtlas/specs/209-heavy-governance-cost/quickstart.md
ahmido a2fdca43fd feat: implement heavy governance cost recovery (#242)
## Summary
- implement Spec 209 heavy-governance cost recovery end to end
- add the heavy-governance contract, hotspot inventory, decomposition, snapshots, budget outcome, and author-guidance surfaces in the shared lane support seams
- slim the baseline and findings hotspot families, harden wrapper behavior, and refresh the spec, quickstart, and contract artifacts

## Validation
- `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Guards/TestLaneCommandContractTest.php tests/Feature/Guards/ActionSurfaceContractTest.php tests/Feature/Guards/OperationLifecycleOpsUxGuardTest.php`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament/BaselineProfileCaptureStartSurfaceTest.php tests/Feature/Filament/BaselineProfileCompareStartSurfaceTest.php tests/Feature/Filament/BaselineActionAuthorizationTest.php tests/Feature/Findings/FindingsListFiltersTest.php tests/Feature/Findings/FindingExceptionRenewalTest.php tests/Feature/Findings/FindingWorkflowRowActionsTest.php tests/Feature/Findings/FindingWorkflowViewActionsTest.php tests/Feature/Guards/ActionSurfaceContractTest.php tests/Feature/Guards/OperationLifecycleOpsUxGuardTest.php`
- `./scripts/platform-sail artisan test --compact`

## Outcome
- heavy-governance latest artifacts now agree on an authoritative `330s` threshold with `recalibrated` outcome after the honest rerun
- full suite result: `3760 passed`, `8 skipped`, `23535 assertions`

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #242
2026-04-17 13:17:13 +00:00

# Quickstart: Heavy Governance Lane Cost Reduction
## Goal
Stabilize the heavy-governance lane so that its dominant costs are visible and intentionally sliced, and so that the lane is either brought back within the authoritative heavy-lane budget (`300s` before normalization) or consciously recalibrated with evidence.
## Current Outcome
The latest honest rerun ends in explicit recalibration rather than recovery.
| Signal | Current value | Meaning |
|-------|---------------|---------|
| Final wall clock | `329.305382s` | Current heavy-governance lane runtime after the slimming pass |
| Final authoritative threshold | `330s` | Normalized threshold used consistently by summary, budget, and report artifacts |
| Outcome | `recalibrated` | The lane no longer has dual active thresholds, but it still needs a slightly higher honest contract |
| Baseline delta | `+11.008420s` (`+3.458905%`) | Current rerun versus the preserved pre-slimming baseline |
| Legacy drift signal | `200s` | Preserved as historical detailed-budget evidence only |
| Pre-normalization summary threshold | `300s` | Preserved as the rollout acceptance contract before normalization |
Final reconciled rationale: the slimming pass reduced workflow-heavy duplication, but the settled lane still carries intentional surface-guard depth and the residual helper cost of the workspace settings family, so the contract is now `330s`.
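The relationship between the wall clock, the two thresholds, and the recorded outcome can be sketched as follows. This is only an illustration using the values from the table above: `classify_outcome` and `baseline_delta` are hypothetical helper names, not part of the checked-in wrappers, and the `over-budget` label is an assumption for the case the document does not name.

```bash
# Hypothetical helper: classify a lane run against the pre-normalization
# threshold ($2) and the final normalized threshold ($3).
classify_outcome() {
  awk -v w="$1" -v p="$2" -v f="$3" 'BEGIN {
    if (w + 0 <= p + 0)      print "recovered"      # fits the original contract
    else if (w + 0 <= f + 0) print "recalibrated"   # fits only the raised contract
    else                     print "over-budget"    # assumed label: fits neither
  }'
}

# Signed delta in seconds of a rerun ($1) versus the preserved baseline ($2).
baseline_delta() {
  awk -v now="$1" -v base="$2" 'BEGIN { printf "%+.6f\n", now - base }'
}

classify_outcome 329.305382 300 330      # prints: recalibrated
baseline_delta 329.305382 318.296962     # prints: +11.008420
```

With the recorded numbers, the run misses the `300s` contract but fits `330s`, which is why the outcome is `recalibrated` rather than recovered.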
## Implementation Order
1. Capture a fresh heavy-governance baseline through the existing lane wrappers and preserve the current summary, report, and budget artifacts. The checked-in wrappers now support `--capture-baseline` for heavy-governance baseline copies.
2. Build or refresh the hotspot inventory for the current top 5 families by runtime, or enough families to explain at least 80% of lane runtime, whichever set is larger.
3. Decompose the primary ui-workflow hotspots first: `baseline-profile-start-surfaces`, `findings-workflow-surfaces`, and `finding-bulk-actions-workflow`.
4. Decide per family whether the right move is to split it, centralize repeated work, trim duplicate assertions, or retain it as intentionally heavy.
5. Audit second-wave surface-guard families such as `action-surface-contract` and `ops-ux-governance` only after the workflow-heavy hotspots are understood.
6. Extend or adjust manifest and report seams so decomposition, residual causes, and the final budget outcome remain visible.
7. Once the honest lane shape is established, normalize the heavy-governance budget contract so that the authoritative pre-normalization `300s` summary threshold and the legacy `200s` budget-target evaluation collapse into one intentional rule.
8. Rerun the focused hotspot packs and the full heavy-governance lane.
9. Record the final outcome as budget recovery or explicit recalibration and add short reviewer guidance for future heavy tests.
## Suggested Code Touches
```text
apps/platform/tests/Support/TestLaneBudget.php
apps/platform/tests/Support/TestLaneManifest.php
apps/platform/tests/Support/TestLaneReport.php
apps/platform/tests/Feature/Baselines/*
apps/platform/tests/Feature/Filament/BaselineActionAuthorizationTest.php
apps/platform/tests/Feature/Filament/BaselineProfileCaptureStartSurfaceTest.php
apps/platform/tests/Feature/Filament/BaselineProfileCompareStartSurfaceTest.php
apps/platform/tests/Feature/Findings/*
apps/platform/tests/Feature/Guards/ActionSurfaceContractTest.php
apps/platform/tests/Feature/OpsUx/*
apps/platform/tests/Feature/SettingsFoundation/WorkspaceSettingsManageTest.php
scripts/platform-test-lane
scripts/platform-test-report
```
## Validation Flow
Use the existing checked-in lane wrappers first:
```bash
./scripts/platform-test-report heavy-governance --capture-baseline
./scripts/platform-test-lane heavy-governance --capture-baseline
./scripts/platform-test-report heavy-governance
./scripts/platform-test-lane heavy-governance
./scripts/platform-test-report heavy-governance
cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent
```
Keep the implementation loop tight with the most relevant focused suites before rerunning the whole lane:
```bash
# Each command runs in a subshell so the block can be pasted as a whole from the repository root.
(cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament --filter=BaselineProfileCaptureStartSurfaceTest)
(cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament --filter=BaselineProfileCompareStartSurfaceTest)
(cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament --filter=BaselineActionAuthorizationTest)
(cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Findings --filter=FindingBulkActionsTest)
(cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Findings --filter=FindingWorkflow)
(cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Guards --filter=ActionSurfaceContractTest)
(cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/OpsUx)
```
## Current Baseline
Use the checked-in heavy-governance artifacts under `apps/platform/storage/logs/test-lanes` as the starting point.
| Signal | Current value | Planning note |
|-------|---------------|---------------|
| Lane wall clock | `318.296962s` | Pre-slimming baseline; `18.296962s` over the `300s` summary threshold |
| Lane summary threshold | `300s` | Authoritative pre-normalization contract for Spec 209 acceptance |
| Budget target evaluation threshold | `200s` | Legacy drift evidence that must remain visible until the contract is normalized |
| `ui-workflow` total | `190.606431s` | Dominant class; first slimming target |
| `surface-guard` total | `106.845887s` | Second-wave analysis target |
| `discovery-heavy` total | `0.863003s` | Already bounded; not the main cost problem |
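To make the planning notes concrete, the overrun and the share of lane runtime per class can be derived from the table values. The sketch below is illustrative only; `lane_overrun` and `class_share` are hypothetical helper names and do not exist in the repository.

```bash
# Hypothetical helper: signed overrun in seconds of a wall clock ($1)
# against a threshold ($2).
lane_overrun() {
  awk -v w="$1" -v t="$2" 'BEGIN { printf "%+.6f\n", w - t }'
}

# Hypothetical helper: share of the lane wall clock ($2) consumed by one
# classification total ($1), as a percentage with two decimals.
class_share() {
  awk -v c="$1" -v w="$2" 'BEGIN { printf "%.2f\n", 100 * c / w }'
}

lane_overrun 318.296962 300          # prints: +18.296962
class_share 190.606431 318.296962    # ui-workflow,     prints: 59.88
class_share 106.845887 318.296962    # surface-guard,   prints: 33.57
class_share 0.863003 318.296962      # discovery-heavy, prints: 0.27
```

The shares show why `ui-workflow` is the first slimming target: it alone accounts for roughly three fifths of the baseline lane runtime.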
## Current Canonical Inventory
The canonical inventory now covers six families because the top five alone do not clear the required `80%` runtime threshold.
| Family | Baseline measured time | Current status | Driver |
|-------|-------------------------|----------------|--------|
| `baseline-profile-start-surfaces` | `98.112193s` | `slimmed` | workflow-heavy |
| `action-surface-contract` | `40.841552s` | `retained` | intentionally-heavy |
| `ops-ux-governance` | `38.794861s` | `retained` | intentionally-heavy |
| `findings-workflow-surfaces` | `36.459493s` | `slimmed` | workflow-heavy |
| `finding-bulk-actions-workflow` | `26.491446s` | `slimmed` | redundant |
| `workspace-settings-slice-management` | `21.740839s` | `follow-up` | helper-driven |
Together these six families explain `263.617244s`, or `80.052516%`, of the latest heavy-governance runtime (`329.305382s`). These figures use the latest rerun timings, not the baseline column above.
## Latest Rerun Hotspots
| Family | Latest measured time | Current intent |
|-------|------------------------|----------------|
| `baseline-profile-start-surfaces` | `101.895415s` | Still dominant after slimming; governance trust retained |
| `action-surface-contract` | `38.323501s` | Intentionally heavy and retained |
| `ops-ux-governance` | `36.497049s` | Intentionally heavy and retained |
| `findings-workflow-surfaces` | `35.990272s` | Slimmed, but still a meaningful workflow-heavy slice |
| `finding-bulk-actions-workflow` | `30.145259s` | Slimmed fixture fan-out, still a top single test family |
| `workspace-settings-slice-management` | `20.765748s` | Recorded as explicit follow-up debt |
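The inventory rule (the top 5 families, or enough families to explain at least 80% of lane runtime, whichever set is larger) can be checked against the latest rerun timings in the table above. The following is a minimal sketch, assuming input lines of the form `<family> <seconds>` sorted by descending runtime; `inventory_size` is a hypothetical helper, not a checked-in script.

```bash
# Hypothetical helper. $1 is the lane wall clock; stdin carries
# "<family> <seconds>" lines sorted by descending runtime.
# Prints: <families needed> <seconds covered> <percent covered>
inventory_size() {
  awk -v wall="$1" -v min_families=5 -v min_share=0.80 '
    { n++; cum[n] = cum[n - 1] + $2 }          # running total per rank
    END {
      size = n                                 # fall back to every listed family
      for (i = 1; i <= n; i++)
        if (cum[i] / wall >= min_share) { size = i; break }
      if (size < min_families)                 # never go below the top 5
        size = (n < min_families) ? n : min_families
      printf "%d %.6f %.2f\n", size, cum[size], 100 * cum[size] / wall
    }'
}

inventory_size 329.305382 <<'EOF'
baseline-profile-start-surfaces 101.895415
action-surface-contract 38.323501
ops-ux-governance 36.497049
findings-workflow-surfaces 35.990272
finding-bulk-actions-workflow 30.145259
workspace-settings-slice-management 20.765748
EOF
# prints: 6 263.617244 80.05
```

With these timings the top five families cover only about 74% of the lane, so the sixth family is required to clear the 80% bar.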
## Decomposition Checklist
For each primary hotspot family, answer these questions before changing file structure:
1. What governance trust does this family deliver?
2. What breadth is genuinely required for that trust?
3. Which repeated work sources dominate runtime?
4. Is the main cost family-breadth, helper-driven setup, or fixture-driven setup?
5. Is the correct fix a split, a centralization, a duplicate-assertion trim, or intentional retention?
6. What focused tests and lane reruns prove the change did not hollow out governance trust?
## Reviewer Guidance Targets
The implementation should leave behind short rules that cover:
1. When a new heavy family is justified.
2. When a test should join an existing heavy family instead.
3. When discovery, workflow, and surface trust must be separated.
4. When a family should stay intentionally heavy.
5. When a helper or fixture cost must be recorded as residual debt instead of disguised as family improvement.
The canonical reviewer rules now live in `TestLaneManifest::heavyGovernanceAuthorGuidance()` and are:
1. `heavy-family-reuse-before-creation`
2. `heavy-family-create-only-for-new-trust`
3. `split-discovery-workflow-surface-concerns`
4. `retain-intentional-heavy-depth-explicitly`
5. `record-helper-or-fixture-residuals`
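A reviewer who wants to confirm that a guidance document or report still mentions all five rule identifiers could use a check along these lines. This is a sketch under stated assumptions: `missing_guidance_rules` is a hypothetical helper that only scans text on stdin, and it does not call `TestLaneManifest::heavyGovernanceAuthorGuidance()` or know its return format.

```bash
# Hypothetical helper: reads text on stdin and prints every canonical rule
# identifier that the text does not mention. Prints nothing when all five
# identifiers are present.
missing_guidance_rules() {
  text="$(cat)"
  for rule in \
    heavy-family-reuse-before-creation \
    heavy-family-create-only-for-new-trust \
    split-discovery-workflow-surface-concerns \
    retain-intentional-heavy-depth-explicitly \
    record-helper-or-fixture-residuals
  do
    # grep -F matches the identifier literally; -q suppresses output.
    printf '%s\n' "$text" | grep -Fq -- "$rule" || printf '%s\n' "$rule"
  done
}

# Example: a note that names only one rule is missing the other four.
printf 'See heavy-family-reuse-before-creation.\n' | missing_guidance_rules
```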
## Exit Criteria
1. The heavy-governance budget contract is normalized to one authoritative threshold, and the summary, budget, and report artifacts do not disagree about it.
2. The primary hotspot families have decomposition records and explicit slimming decisions.
3. The heavy-governance lane has fresh before and after evidence in the standard artifact paths, including inventory coverage for the top 5 families or enough families to explain at least 80% of runtime, whichever set is larger.
4. The final outcome is explicit: recovered within the authoritative threshold for the rollout or consciously recalibrated.
5. Reviewer guidance exists for future heavy-family authoring.
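Exit criterion 1 amounts to an equality check across the three artifacts. The sketch below assumes the threshold values have already been extracted; how they are read from the summary, budget, and report artifacts depends on file formats that this document does not describe, so `thresholds_agree` is a hypothetical helper and the second call uses illustrative mismatched values.

```bash
# Hypothetical helper: succeeds (exit status 0) only when the summary ($1),
# budget ($2), and report ($3) thresholds are numerically equal.
thresholds_agree() {
  awk -v s="$1" -v b="$2" -v r="$3" 'BEGIN {
    exit !((s + 0 == b + 0) && (b + 0 == r + 0))
  }'
}

thresholds_agree 330 330 330 && echo "normalized"
# Illustrative mismatch, such as a 300s summary next to a 200s budget target:
thresholds_agree 300 200 300 || echo "dual thresholds still active"
```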