TenantAtlas/specs/209-heavy-governance-cost/quickstart.md
2026-04-17 15:15:23 +02:00

Quickstart: Heavy Governance Lane Cost Reduction

Goal

Stabilize the heavy-governance lane so that its dominant costs are visible and intentionally sliced, and so that the lane is either brought back within the authoritative heavy-lane budget (300s before normalization) or consciously recalibrated with evidence.

Current Outcome

The latest honest rerun ends in explicit recalibration rather than recovery.

| Signal | Current value | Meaning |
| --- | --- | --- |
| Final wall clock | 329.305382s | Current heavy-governance lane runtime after the slimming pass |
| Final authoritative threshold | 330s | Normalized threshold used consistently by summary, budget, and report artifacts |
| Outcome | recalibrated | The lane no longer has dual active thresholds, but it still needs a slightly higher honest contract |
| Baseline delta | +11.008420s (+3.458905%) | Current rerun versus the preserved pre-slimming baseline |
| Legacy drift signal | 200s | Preserved as historical detailed-budget evidence only |
| Pre-normalization summary threshold | 300s | Preserved as the rollout acceptance contract before normalization |

The final reconciled rationale is: workflow-heavy duplication was reduced, but the settled lane still retains intentional surface-guard depth plus the workspace settings residual helper cost, so the contract is now 330s.
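
The outcome above follows from simple arithmetic on the listed figures. A minimal sketch, assuming the rule "recovered if the wall clock fits the 300s pre-normalization threshold, recalibrated if it only fits the final 330s threshold"; the function name, constant names, and the `over-budget` label are illustrative assumptions and are not taken from the lane tooling:

```python
# Illustrative only: names are hypothetical, values are copied from the table above.
BASELINE_WALL_CLOCK_S = 318.296962      # preserved pre-slimming baseline
FINAL_WALL_CLOCK_S = 329.305382         # latest rerun
PRE_NORMALIZATION_THRESHOLD_S = 300.0   # rollout acceptance contract
FINAL_THRESHOLD_S = 330.0               # normalized contract


def classify_outcome(wall_clock_s, original_threshold_s, final_threshold_s):
    """Return 'recovered', 'recalibrated', or 'over-budget' for one lane run."""
    if wall_clock_s <= original_threshold_s:
        return "recovered"
    if wall_clock_s <= final_threshold_s:
        return "recalibrated"
    return "over-budget"  # assumed label for a run that misses both thresholds


delta_s = FINAL_WALL_CLOCK_S - BASELINE_WALL_CLOCK_S
headroom_s = FINAL_THRESHOLD_S - FINAL_WALL_CLOCK_S
outcome = classify_outcome(
    FINAL_WALL_CLOCK_S, PRE_NORMALIZATION_THRESHOLD_S, FINAL_THRESHOLD_S
)

print(f"delta vs baseline: {delta_s:+.6f}s")
print(f"headroom under final threshold: {headroom_s:.6f}s")
print(f"outcome: {outcome}")
```

The headroom figure is worth watching: under these numbers the recalibrated contract leaves less than one second of slack.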

Implementation Order

  1. Capture a fresh heavy-governance baseline through the existing lane wrappers and preserve the current summary, report, and budget artifacts. The checked-in wrappers now support --capture-baseline for heavy-governance baseline copies.
  2. Build or refresh the hotspot inventory for the current top 5 families by runtime, or enough families to explain at least 80% of lane runtime, whichever set is larger.
  3. Decompose the primary ui-workflow hotspots first: baseline-profile-start-surfaces, findings-workflow-surfaces, and finding-bulk-actions-workflow.
  4. Decide per family whether the right move is split, centralize repeated work, trim duplicate assertions, or retain as intentionally heavy.
  5. Audit second-wave surface-guard families such as action-surface-contract and ops-ux-governance only after the workflow-heavy hotspots are understood.
  6. Extend or adjust manifest and report seams so decomposition, residual causes, and the final budget outcome remain visible.
  7. Once the honest lane shape is established, normalize the heavy-governance budget contract so that the authoritative pre-normalization 300s summary threshold and the legacy 200s budget-target evaluation are reconciled into one intentional rule.
  8. Rerun the focused hotspot packs and the full heavy-governance lane.
  9. Record the final outcome as budget recovery or explicit recalibration and add short reviewer guidance for future heavy tests.

Suggested Code Touches

apps/platform/tests/Support/TestLaneBudget.php
apps/platform/tests/Support/TestLaneManifest.php
apps/platform/tests/Support/TestLaneReport.php
apps/platform/tests/Feature/Baselines/*
apps/platform/tests/Feature/Filament/BaselineActionAuthorizationTest.php
apps/platform/tests/Feature/Filament/BaselineProfileCaptureStartSurfaceTest.php
apps/platform/tests/Feature/Filament/BaselineProfileCompareStartSurfaceTest.php
apps/platform/tests/Feature/Findings/*
apps/platform/tests/Feature/Guards/ActionSurfaceContractTest.php
apps/platform/tests/Feature/OpsUx/*
apps/platform/tests/Feature/SettingsFoundation/WorkspaceSettingsManageTest.php
scripts/platform-test-lane
scripts/platform-test-report

Validation Flow

Use the existing checked-in lane wrappers first:

./scripts/platform-test-report heavy-governance --capture-baseline
./scripts/platform-test-lane heavy-governance --capture-baseline
./scripts/platform-test-report heavy-governance
./scripts/platform-test-lane heavy-governance
./scripts/platform-test-report heavy-governance
cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent

Keep the implementation loop tight with the most relevant focused suites before rerunning the whole lane:

cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament --filter=BaselineProfileCaptureStartSurfaceTest
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament --filter=BaselineProfileCompareStartSurfaceTest
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Filament --filter=BaselineActionAuthorizationTest
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Findings --filter=FindingBulkActionsTest
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Findings --filter=FindingWorkflow
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Guards --filter=ActionSurfaceContractTest
cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/OpsUx

Current Baseline

Use the checked-in heavy-governance artifacts under apps/platform/storage/logs/test-lanes as the starting point.

| Signal | Current value | Planning note |
| --- | --- | --- |
| Lane wall clock | 318.296962s | Current measured overrun |
| Lane summary threshold | 300s | Authoritative pre-normalization contract for Spec 209 acceptance |
| Budget target evaluation threshold | 200s | Legacy drift evidence that must remain visible until the contract is normalized |
| ui-workflow total | 190.606431s | Dominant class; first slimming target |
| surface-guard total | 106.845887s | Second-wave analysis target |
| discovery-heavy total | 0.863003s | Already bounded; not the main cost problem |
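
The baseline signals can be read as one measurement judged by two different thresholds, plus a per-class split. A minimal sketch of that reading, using only the values listed above; the dictionary names are illustrative assumptions, and the sketch does not try to attribute the part of the wall clock that the three class totals leave unexplained:

```python
# Illustrative only: one baseline measurement judged by the two thresholds above.
LANE_WALL_CLOCK_S = 318.296962
THRESHOLDS_S = {"lane summary": 300.0, "budget target evaluation": 200.0}
CLASS_TOTALS_S = {
    "ui-workflow": 190.606431,
    "surface-guard": 106.845887,
    "discovery-heavy": 0.863003,
}

# How far the same run overshoots each threshold.
overruns_s = {name: LANE_WALL_CLOCK_S - limit for name, limit in THRESHOLDS_S.items()}

# Share of the lane wall clock attributed to each cost class.
class_share_pct = {
    name: 100.0 * total / LANE_WALL_CLOCK_S for name, total in CLASS_TOTALS_S.items()
}

for name, overrun in overruns_s.items():
    print(f"{name}: over by {overrun:.6f}s")
for name, share in class_share_pct.items():
    print(f"{name}: {share:.1f}% of lane wall clock")
```

The two overrun figures differ by exactly the 100s gap between the thresholds, which is why the contract needs a single authoritative rule.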

Current Canonical Inventory

The canonical inventory now covers six families because the top five alone do not clear the required 80% runtime threshold.

| Family | Baseline measured time | Current status | Driver |
| --- | --- | --- | --- |
| baseline-profile-start-surfaces | 98.112193s | slimmed | workflow-heavy |
| action-surface-contract | 40.841552s | retained | intentionally-heavy |
| ops-ux-governance | 38.794861s | retained | intentionally-heavy |
| findings-workflow-surfaces | 36.459493s | slimmed | workflow-heavy |
| finding-bulk-actions-workflow | 26.491446s | slimmed | redundant |
| workspace-settings-slice-management | 21.740839s | follow-up | helper-driven |

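Together these six families account for 263.617244s, or 80.052516%, of the latest heavy-governance runtime (329.305382s). That total is the sum of the latest rerun times listed under Latest Rerun Hotspots, not of the baseline measured times.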
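
The "top 5 families or at least 80% of runtime, whichever is larger" rule can be sketched as a loop that keeps adding families until both floors are met. This is a minimal illustration using the latest rerun times recorded in this document; `select_inventory` and its parameters are hypothetical names, and the real lane contains more families than the six listed here:

```python
# Illustrative only: hypothetical helper; times are the latest rerun measurements.
LANE_WALL_CLOCK_S = 329.305382
FAMILY_TIMES_S = {
    "baseline-profile-start-surfaces": 101.895415,
    "action-surface-contract": 38.323501,
    "ops-ux-governance": 36.497049,
    "findings-workflow-surfaces": 35.990272,
    "finding-bulk-actions-workflow": 30.145259,
    "workspace-settings-slice-management": 20.765748,
}


def select_inventory(family_times_s, lane_total_s, min_families=5, min_coverage=0.80):
    """Add families by descending runtime until count and coverage floors both hold."""
    ranked = sorted(family_times_s.items(), key=lambda item: item[1], reverse=True)
    selected = []
    covered_s = 0.0
    for name, seconds in ranked:
        if len(selected) >= min_families and covered_s / lane_total_s >= min_coverage:
            break
        selected.append(name)
        covered_s += seconds
    return selected, covered_s


inventory, covered_s = select_inventory(FAMILY_TIMES_S, LANE_WALL_CLOCK_S)
coverage_pct = 100.0 * covered_s / LANE_WALL_CLOCK_S
print(len(inventory), f"{covered_s:.6f}s", f"{coverage_pct:.6f}%")
```

With these inputs the first five families stay below the 80% floor, so the sixth family is pulled in.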

Latest Rerun Hotspots

| Family | Latest measured time | Current intent |
| --- | --- | --- |
| baseline-profile-start-surfaces | 101.895415s | Still dominant after slimming; trust retained |
| action-surface-contract | 38.323501s | Intentionally heavy and retained |
| ops-ux-governance | 36.497049s | Intentionally heavy and retained |
| findings-workflow-surfaces | 35.990272s | Slimmed, but still a meaningful workflow-heavy slice |
| finding-bulk-actions-workflow | 30.145259s | Slimmed fixture fan-out, still a top single test family |
| workspace-settings-slice-management | 20.765748s | Recorded as explicit follow-up debt |
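
Comparing the latest times with the baseline times per family makes the movement of each hotspot explicit. A minimal sketch using only the values recorded in this document; the variable names are illustrative assumptions:

```python
# Illustrative only: values copied from the baseline and latest family measurements.
BASELINE_S = {
    "baseline-profile-start-surfaces": 98.112193,
    "action-surface-contract": 40.841552,
    "ops-ux-governance": 38.794861,
    "findings-workflow-surfaces": 36.459493,
    "finding-bulk-actions-workflow": 26.491446,
    "workspace-settings-slice-management": 21.740839,
}
LATEST_S = {
    "baseline-profile-start-surfaces": 101.895415,
    "action-surface-contract": 38.323501,
    "ops-ux-governance": 36.497049,
    "findings-workflow-surfaces": 35.990272,
    "finding-bulk-actions-workflow": 30.145259,
    "workspace-settings-slice-management": 20.765748,
}

# Positive delta means the family ran slower in the latest rerun.
deltas_s = {name: LATEST_S[name] - BASELINE_S[name] for name in BASELINE_S}
net_delta_s = sum(deltas_s.values())

for name, delta in sorted(deltas_s.items(), key=lambda item: item[1], reverse=True):
    print(f"{name}: {delta:+.6f}s")
print(f"net across inventory: {net_delta_s:+.6f}s")
```

A single rerun does not separate real change from run-to-run noise, so the signs are a prompt for investigation rather than a verdict on any one family.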

Decomposition Checklist

For each primary hotspot family, answer these questions before changing file structure:

  1. What governance trust does this family deliver?
  2. What breadth is genuinely required for that trust?
  3. Which repeated work sources dominate runtime?
  4. Is the main cost family-breadth, helper-driven setup, or fixture-driven setup?
  5. Is the correct fix a split, a centralization, a duplicate-assertion trim, or intentional retention?
  6. What focused tests and lane reruns prove the change did not hollow out governance trust?
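
One way to keep the six answers together is a small record per family that reports which questions are still open. This is a sketch under stated assumptions only: the class, field names, and allowed values are hypothetical and do not describe `TestLaneManifest`, and the example record uses placeholder text rather than real findings:

```python
from dataclasses import dataclass, field

# Illustrative only: every name below is hypothetical.
COST_DRIVERS = {"family-breadth", "helper-driven", "fixture-driven"}
FIXES = {"split", "centralize", "trim-duplicate-assertions", "retain"}


@dataclass
class DecompositionRecord:
    family: str
    trust_delivered: str                        # question 1
    required_breadth: str                       # question 2
    repeated_work_sources: list                 # question 3
    cost_driver: str                            # question 4
    fix: str                                    # question 5
    proof: list = field(default_factory=list)   # question 6

    def problems(self):
        """List the checklist questions that are unanswered or invalid."""
        found = []
        if not self.trust_delivered.strip():
            found.append("trust delivered is empty")
        if not self.required_breadth.strip():
            found.append("required breadth is empty")
        if not self.repeated_work_sources:
            found.append("no repeated work source named")
        if self.cost_driver not in COST_DRIVERS:
            found.append(f"unknown cost driver: {self.cost_driver!r}")
        if self.fix not in FIXES:
            found.append(f"unknown fix: {self.fix!r}")
        if not self.proof:
            found.append("no focused test or lane rerun listed as proof")
        return found


draft = DecompositionRecord(
    family="example-family",  # placeholder, not a real lane family
    trust_delivered="placeholder trust statement",
    required_breadth="placeholder breadth statement",
    repeated_work_sources=["placeholder setup step"],
    cost_driver="fixture-driven",
    fix="undecided",
)
print(draft.problems())
```

A record with an empty `problems()` list would be the signal that file structure can change.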

Reviewer Guidance Targets

The implementation should leave behind short rules that cover:

  1. When a new heavy family is justified.
  2. When a test should join an existing heavy family instead.
  3. When discovery, workflow, and surface trust must be separated.
  4. When a family should stay intentionally heavy.
  5. When a helper or fixture cost must be recorded as residual debt instead of disguised as family improvement.

The canonical reviewer rules now live in TestLaneManifest::heavyGovernanceAuthorGuidance() and are:

  1. heavy-family-reuse-before-creation
  2. heavy-family-create-only-for-new-trust
  3. split-discovery-workflow-surface-concerns
  4. retain-intentional-heavy-depth-explicitly
  5. record-helper-or-fixture-residuals

Exit Criteria

  1. The heavy-governance budget contract is normalized to one authoritative threshold, and the summary, budget, and report artifacts do not disagree about it.
  2. The primary hotspot families have decomposition records and explicit slimming decisions.
  3. The heavy-governance lane has fresh before and after evidence in the standard artifact paths, including inventory coverage for the top 5 families or at least 80% of runtime, whichever is larger.
  4. The final outcome is explicit: recovered within the authoritative threshold for the rollout or consciously recalibrated.
  5. Reviewer guidance exists for future heavy-family authoring.
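
The first exit criterion amounts to checking that every artifact reports the same threshold. A minimal sketch of that check, assuming each artifact's threshold has already been read into a plain mapping; the function name and the mapping shape are illustrative assumptions, not the format of the real summary, budget, or report artifacts:

```python
# Illustrative only: hypothetical helper and input shape.
def authoritative_threshold(thresholds_by_artifact):
    """Return the single agreed threshold, or raise if the artifacts disagree."""
    distinct = set(thresholds_by_artifact.values())
    if len(distinct) != 1:
        raise ValueError(f"artifacts disagree on the threshold: {thresholds_by_artifact}")
    return distinct.pop()


# Before normalization: the summary and budget-target evaluations used different values.
before = {"summary": 300.0, "budget": 200.0}
# After normalization: summary, budget, and report all use the 330s contract.
after = {"summary": 330.0, "budget": 330.0, "report": 330.0}

print(authoritative_threshold(after))
try:
    authoritative_threshold(before)
except ValueError as error:
    print(f"not normalized: {error}")
```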