## Summary

- implement Spec 211 runtime trend reporting with bounded lane history, drift classification, hotspot trend output, and recalibration evidence handling
- extend the repo-truth governance seams and workflow wrappers for comparable-bundle hydration, trend artifact publication, and contract-backed reporting
- add the Spec 211 planning artifacts, data model, quickstart, tasks, and repository contract documents

## Validation

- parsed `specs/211-runtime-trend-recalibration/contracts/test-runtime-trend-history.schema.json`
- parsed `specs/211-runtime-trend-recalibration/contracts/test-runtime-trend.logical.openapi.yaml`
- re-ran cross-artifact consistency analysis for the Spec 211 artifact set until no material findings remained
- no application test suite was re-run as part of this final commit/push/PR step

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #244
Tasks: Test Runtime Trend Reporting & Baseline Recalibration
Input: Design documents from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/211-runtime-trend-recalibration/
Prerequisites: plan.md (required), spec.md (required), research.md, data-model.md, contracts/, quickstart.md
Tests: Required. This feature changes repository test-governance runtime behavior, so each user story includes Pest guard coverage plus focused lane and wrapper validation through Sail and the repo-root test-governance scripts.
Organization: Tasks are grouped by user story so each story can be implemented and validated independently where possible.
Phase 1: Setup (Shared Context)
Purpose: Freeze the real repo-truth seams and artifact boundaries before implementation begins.
- T001 [P] Audit `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneBudget.php`, `apps/platform/tests/Support/TestLaneReport.php`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, and `.gitea/workflows/*.yml` as the only valid trend-history and runtime-governance seams before implementation
Phase 2: Foundational (Blocking Prerequisites)
Purpose: Extend the shared manifest, artifact, and wrapper seams that every story depends on.
Critical: No user story work should begin until this phase is complete.
- T002 Extend `apps/platform/tests/Support/TestLaneManifest.php` with lane trend policy metadata, retention and comparison-window defaults, comparison-fingerprint inputs, hotspot limits, and `trend-history.json` artifact contracts aligned to `specs/211-runtime-trend-recalibration/data-model.md`
- T003 [P] Extend the `apps/platform/tests/Support/TestLaneReport.php` artifact-path, read/write, and staging helpers so `apps/platform/storage/logs/test-lanes/<lane>-latest.trend-history.json` can be published alongside the existing summary, budget, report, and JUnit artifacts
- T004 [P] Update `scripts/platform-test-report` and `scripts/platform-test-artifacts` to discover, select, and hydrate the latest comparable prior bundle or explicit local history input, then export the canonical `trend-history.json` artifact through the existing repo-root wrappers
- T005 [P] Add or update shared guard coverage in `apps/platform/tests/Feature/Guards/TestLaneManifestTest.php`, `apps/platform/tests/Feature/Guards/TestLaneArtifactsContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneHistoryHydrationContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendContractSchemaTest.php`, and `apps/platform/tests/Feature/Guards/TestLaneTrendLogicalContractTest.php` to lock lane trend policy metadata, latest-comparable-bundle hydration semantics, JSON schema sync against `specs/211-runtime-trend-recalibration/contracts/test-runtime-trend-history.schema.json`, logical contract sync against `specs/211-runtime-trend-recalibration/contracts/test-runtime-trend.logical.openapi.yaml`, and staged bundle completeness for `trend-history.json`
Checkpoint: The shared trend-governance seams are ready for story-specific summary, recalibration, and hotspot work.
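The hydration step in T004 can be pictured with a minimal sketch, written in Python purely for illustration (the real seams are the PHP support classes and the repo-root wrapper scripts). The fingerprint inputs, the `fingerprint` and `recordedAt` keys, and all function names are assumptions made for this example; only the artifact path pattern comes from T003.

```python
import hashlib
import json
from pathlib import PurePosixPath
from typing import Dict, List, Optional

# Directory taken from T003; everything else in this sketch is hypothetical.
ARTIFACT_DIR = PurePosixPath("apps/platform/storage/logs/test-lanes")


def trend_history_path(lane: str) -> PurePosixPath:
    """Build the <lane>-latest.trend-history.json path for a lane."""
    return ARTIFACT_DIR / f"{lane}-latest.trend-history.json"


def comparison_fingerprint(inputs: Dict) -> str:
    """Hash the comparison inputs in a key-order-independent way."""
    canonical = json.dumps(inputs, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def latest_comparable(bundles: List[Dict], fingerprint: str) -> Optional[Dict]:
    """Pick the newest prior bundle whose fingerprint matches, or None.

    Assumes each bundle carries an ISO-8601 UTC "recordedAt" string, so
    lexicographic comparison matches chronological order.
    """
    comparable = [b for b in bundles if b["fingerprint"] == fingerprint]
    return max(comparable, key=lambda b: b["recordedAt"], default=None)
```

Returning `None` when nothing comparable exists keeps the "no prior history" case explicit instead of silently comparing against an unrelated run.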
Phase 3: User Story 1 - See Lane Drift Before It Becomes A Repeated Gate (Priority: P1) 🎯 MVP
Goal: Publish lane-first trend summaries that show current, previous, baseline, budget, and health status before a lane becomes a recurring blocker.
Independent Test: Review representative three-sample run sequences for fast-feedback and confidence, confirm the summary shows current, previous, baseline, and budget values, and verify that healthy, near-budget, worsening, and noisy cases are distinguishable without manual arithmetic.
Tests for User Story 1
- T006 [P] [US1] Add `apps/platform/tests/Feature/Guards/TestLaneTrendSummaryContractTest.php` and update `apps/platform/tests/Feature/Guards/TestLaneArtifactsContractTest.php` to assert bounded history windows and current, previous, baseline, and budget fields for `fast-feedback` and `confidence`
- T007 [P] [US1] Add `apps/platform/tests/Feature/Guards/TestLaneTrendClassificationTest.php` to cover `healthy`, `budget-near`, `trending-worse`, `regressed`, and `unstable` outcomes, including one-off noisy spike handling
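To make the five health classes concrete, here is one possible classification rule as a Python sketch. It is not the rule the spec defines: the `near_ratio` and `drift_ratio` thresholds, the three-sample trend check, and the decision to treat a single over-budget run as `unstable` are all assumptions chosen for illustration.

```python
from typing import List


def classify(samples: List[float], baseline: float, budget: float,
             near_ratio: float = 0.9, drift_ratio: float = 1.1) -> str:
    """Classify a lane from its bounded window of wall-clock seconds.

    samples is ordered oldest first; the last entry is the current run.
    near_ratio and drift_ratio are hypothetical thresholds.
    """
    current = samples[-1]
    previous = samples[-2] if len(samples) >= 2 else None

    if current > budget:
        # Two consecutive over-budget runs count as a regression; a single
        # over-budget run is treated as a one-off noisy spike.
        if previous is not None and previous > budget:
            return "regressed"
        return "unstable"

    # Three strictly increasing samples that have drifted past the baseline.
    rising = len(samples) >= 3 and samples[-3] < samples[-2] < samples[-1]
    if rising and current > baseline * drift_ratio:
        return "trending-worse"

    if current >= budget * near_ratio:
        return "budget-near"
    return "healthy"
```

With a baseline of 100 and a budget of 150, the window `[105, 115, 125]` is classified `trending-worse`, while `[100, 101, 170]` is `unstable` because only the latest run exceeds the budget.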
Implementation for User Story 1
- T008 [US1] Extend `apps/platform/tests/Support/TestLaneReport.php` with `LaneTrendRecord` generation, comparison-window evaluation, comparison fingerprints, and trend-aware `summary.md` plus `report.json` output for `fast-feedback` and `confidence`
- T009 [US1] Update `apps/platform/tests/Support/TestLaneManifest.php`, `.gitea/workflows/test-pr-fast-feedback.yml`, and `.gitea/workflows/test-main-confidence.yml` so pull-request and mainline bundles discover and hydrate the latest comparable history bundle, then republish the refreshed `trend-history.json` artifact without widening lane execution
- T010 [US1] Update `README.md` and `specs/211-runtime-trend-recalibration/quickstart.md` with reviewer guidance and local validation steps for reading lane health summaries across `fast-feedback` and `confidence`
- T011 [US1] Run the narrowest proving path with `./scripts/platform-test-lane fast-feedback`, `./scripts/platform-test-report fast-feedback`, `./scripts/platform-test-lane confidence`, and `./scripts/platform-test-report confidence`, then record representative three-sample `healthy`, `budget-near`, and `unstable` evidence in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`
Checkpoint: At this point, lane drift visibility for the main contributor lanes should be independently functional and reviewable.
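The bounded history and the "no manual arithmetic" summary from this story can be sketched as follows. This is an illustrative Python sketch, not the `LaneTrendRecord` shape from the data model: the `wallClockSeconds` key, the derived field names, and the default retention of 10 are assumptions.

```python
from typing import Dict, List, Optional


def append_bounded(history: List[Dict], record: Dict, retention: int = 10) -> List[Dict]:
    """Return a new history with the record appended, keeping the newest entries."""
    if retention < 1:
        raise ValueError("retention must be at least 1")
    return (history + [record])[-retention:]


def lane_summary(history: List[Dict], baseline: float, budget: float) -> Dict:
    """Derive the current, previous, baseline, and budget view for one lane."""
    current = history[-1]["wallClockSeconds"]
    previous: Optional[float] = (
        history[-2]["wallClockSeconds"] if len(history) > 1 else None
    )
    return {
        "current": current,
        "previous": previous,
        "baseline": baseline,
        "budget": budget,
        # Precomputed deltas so reviewers do not have to subtract by hand.
        "deltaVsPrevious": None if previous is None else round(current - previous, 2),
        "deltaVsBaseline": round(current - baseline, 2),
        "budgetUsedPercent": round(100 * current / budget, 1),
    }
```

Returning a new list rather than mutating the input keeps the previously published history intact until the refreshed artifact is written.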
Phase 4: User Story 2 - Decide Recalibration With Evidence Instead Of Habit (Priority: P1)
Goal: Separate baseline and budget recalibration from ordinary health status and make every recalibration decision evidence-backed.
Independent Test: Review one justified recalibration case and one rejected recalibration case, and confirm the report plus policy make the outcome understandable without private notes.
Tests for User Story 2
- T012 [P] [US2] Add `apps/platform/tests/Feature/Guards/TestLaneRecalibrationPolicyTest.php` to assert baseline-vs-budget separation, evidence-window requirements, and approved versus rejected rationale handling
- T013 [P] [US2] Add `apps/platform/tests/Feature/Guards/TestLaneRecalibrationEvidenceContractTest.php` to assert candidate, approved, and rejected recalibration records together with explicit summary disclosure for recalibration outcomes
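The kind of record these guards would check can be sketched in Python as a small validator. The `candidate`, `approved`, and `rejected` statuses, the baseline-versus-budget split, and the `recordedIn` field come from the tasks; the `target`, `status`, `rationale`, and `evidenceRuns` keys and the three-run evidence window are assumptions for this example. The sketch only checks preconditions; approving or rejecting remains a reviewer decision.

```python
from typing import Dict, List

VALID_TARGETS = {"baseline", "budget"}
VALID_STATUSES = {"candidate", "approved", "rejected"}


def decision_problems(record: Dict, required_evidence_runs: int = 3) -> List[str]:
    """List the reasons a recalibration record is incomplete (empty means OK)."""
    problems = []
    if record.get("target") not in VALID_TARGETS:
        # Baseline and budget are recalibrated separately, never together.
        problems.append("target must be 'baseline' or 'budget'")
    status = record.get("status")
    if status not in VALID_STATUSES:
        problems.append("status must be candidate, approved, or rejected")
    if status in {"approved", "rejected"} and not (record.get("rationale") or "").strip():
        problems.append("approved and rejected decisions need a rationale")
    if status == "approved":
        if len(record.get("evidenceRuns") or []) < required_evidence_runs:
            problems.append("approved decisions need a full evidence window")
        if not record.get("recordedIn"):
            problems.append("approved decisions need a recordedIn reference")
    return problems
```

A rejected record still needs a rationale, so a reviewer can understand the outcome without private notes.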
Implementation for User Story 2
- T014 [US2] Extend `apps/platform/tests/Support/TestLaneBudget.php` with recalibration recommendation helpers, lane-specific tolerance reuse, and explicit baseline plus budget review rules aligned to `specs/211-runtime-trend-recalibration/data-model.md`
- T015 [US2] Extend `apps/platform/tests/Support/TestLaneManifest.php` and `apps/platform/tests/Support/TestLaneReport.php` to emit structured recalibration policy metadata, decision records, evidence run references, and `recordedIn` guidance pointing to `specs/211-runtime-trend-recalibration/spec.md` or the implementation PR without mutating manifest truth automatically
- T016 [US2] Update `README.md` and `specs/211-runtime-trend-recalibration/quickstart.md` with the approved and rejected recalibration policy, required evidence windows, and reviewer follow-up rules
- T017 [US2] Run recalibration validation with `./scripts/platform-test-report fast-feedback` and `./scripts/platform-test-report confidence` against seeded prior histories, then record one approved and one rejected recalibration example in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`
Checkpoint: At this point, recalibration guidance should be independently testable and clearly separated from ordinary lane health.
Phase 5: User Story 3 - Track Dominant Hotspots Over Time (Priority: P2)
Goal: Surface persistent, worsening, and newly dominant hotspots so follow-up optimization work targets the real cost drivers.
Independent Test: Review representative hotspot summaries for each primary lane across multiple runs and confirm that persistent, worsening, newly dominant, and unavailable hotspot states are visible.
Tests for User Story 3
- T018 [P] [US3] Add `apps/platform/tests/Feature/Guards/TestLaneHotspotTrendContractTest.php` to assert top family and file delta output, new or dropped hotspot detection, and explicit unavailable-hotspot disclosure
- T019 [P] [US3] Update `apps/platform/tests/Feature/Guards/ProfileLaneContractTest.php`, `apps/platform/tests/Feature/Guards/FastFeedbackLaneContractTest.php`, `apps/platform/tests/Feature/Guards/ConfidenceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/HeavyGovernanceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/BrowserLaneIsolationTest.php`, and `apps/platform/tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php` to assert support-lane hotspot evidence and hotspot visibility for all primary lanes plus the chosen `junit` or `profiling` support example
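The hotspot states named in this story (persistent, worsening, newly dominant, dropped, unavailable) can be illustrated with a short Python sketch over two `familyTotals`-style mappings. The function name, the output shape, the `limit` of 3, and the 10% `worsening_ratio` are assumptions; the real limits are meant to come from the manifest policy.

```python
from typing import Dict, Optional


def hotspot_trends(previous: Optional[Dict[str, float]],
                   current: Optional[Dict[str, float]],
                   limit: int = 3,
                   worsening_ratio: float = 1.1) -> Dict:
    """Compare the top hotspots of two runs (name -> seconds)."""
    if previous is None or current is None:
        # Missing hotspot evidence is disclosed explicitly, not hidden.
        return {"status": "unavailable", "entries": [], "dropped": []}

    top_now = sorted(current, key=current.get, reverse=True)[:limit]
    top_before = set(sorted(previous, key=previous.get, reverse=True)[:limit])

    entries = []
    for name in top_now:
        if name not in top_before:
            state = "newly-dominant"
        elif current[name] > previous[name] * worsening_ratio:
            state = "worsening"
        else:
            state = "persistent"
        delta = round(current[name] - previous.get(name, 0.0), 3)
        entries.append({"name": name, "state": state, "delta": delta})

    dropped = sorted(top_before - set(top_now))
    return {"status": "available", "entries": entries, "dropped": dropped}
```

Capping both runs at the same `limit` keeps the readable output bounded while still reporting which previously dominant hotspots fell out of the top set.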
Implementation for User Story 3
- T020 [US3] Extend `apps/platform/tests/Support/TestLaneReport.php` with hotspot delta computation from `classificationTotals`, `familyTotals`, `hotspotFiles`, and `slowestEntries`, capping readable output to the policy limits defined in `apps/platform/tests/Support/TestLaneManifest.php`
- T021 [US3] Update `apps/platform/tests/Support/TestLaneManifest.php`, `.gitea/workflows/test-heavy-governance.yml`, and `.gitea/workflows/test-browser.yml` so heavy and browser bundles retain hotspot-supporting history context and surface missing hotspot evidence explicitly
- T022 [US3] Update `README.md` and `specs/211-runtime-trend-recalibration/quickstart.md` with hotspot investigation guidance, `profiling` and `junit` support-lane usage, and examples of persistent versus newly dominant hotspots
- T023 [US3] Run representative hotspot validation with `./scripts/platform-test-report fast-feedback`, `./scripts/platform-test-report confidence`, `./scripts/platform-test-lane heavy-governance`, `./scripts/platform-test-report heavy-governance`, `./scripts/platform-test-lane browser`, `./scripts/platform-test-report browser`, and one support-lane report path from `./scripts/platform-test-report profiling` or `./scripts/platform-test-report junit`, then record persistent, worsening, newly dominant, and unavailable hotspot evidence for each primary lane in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`
Checkpoint: At this point, hotspot trend visibility should be independently functional without depending on recalibration rollout evidence.
Phase 6: Polish & Cross-Cutting Concerns
Purpose: Validate the full trend-governance slice, record evidence, and finish formatting.
- T024 Run focused Pest coverage for `apps/platform/tests/Feature/Guards/TestLaneTrendSummaryContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendClassificationTest.php`, `apps/platform/tests/Feature/Guards/TestLaneRecalibrationPolicyTest.php`, `apps/platform/tests/Feature/Guards/TestLaneRecalibrationEvidenceContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneHotspotTrendContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneHistoryHydrationContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendContractSchemaTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendLogicalContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneManifestTest.php`, `apps/platform/tests/Feature/Guards/TestLaneArtifactsContractTest.php`, `apps/platform/tests/Feature/Guards/FastFeedbackLaneContractTest.php`, `apps/platform/tests/Feature/Guards/ConfidenceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/ProfileLaneContractTest.php`, `apps/platform/tests/Feature/Guards/HeavyGovernanceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/BrowserLaneIsolationTest.php`, and `apps/platform/tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php` with `cd apps/platform && ./vendor/bin/sail artisan test --compact ...`
- T025 [P] Execute the representative local and Gitea evidence set across `.gitea/workflows/test-pr-fast-feedback.yml`, `.gitea/workflows/test-main-confidence.yml`, `.gitea/workflows/test-heavy-governance.yml`, and `.gitea/workflows/test-browser.yml`, capture at least three sequential comparable samples for each primary lane, include one support-lane example from `junit` or `profiling`, time-box a reviewer dry run to confirm the summary remains decidable within two minutes, and record lane, health class, hotspot availability, recalibration outcome, and any material runtime drift follow-up in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`
- T026 Run `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent` for changes in `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneBudget.php`, `apps/platform/tests/Support/TestLaneReport.php`, and the new or updated guard tests under `apps/platform/tests/Feature/Guards/`
Dependencies & Execution Order
Phase Dependencies
- Setup (Phase 1): No dependencies and can start immediately.
- Foundational (Phase 2): Depends on Phase 1 and blocks all user story work.
- User Story 1 (Phase 3): Depends on Phase 2 only and is the MVP slice.
- User Story 2 (Phase 4): Depends on Phase 2 and benefits from the trend-history infrastructure completed for User Story 1.
- User Story 3 (Phase 5): Depends on Phase 2 and should follow User Story 1 because hotspot deltas reuse the same history and assessment outputs.
- Polish (Phase 6): Depends on all desired user stories being complete.
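The phase ordering above can be expressed as a small dependency graph. The following Python sketch uses the standard-library `graphlib` module; the `PHASES` mapping is a hypothetical encoding of the list above, with the recommended "User Story 3 follows User Story 1" ordering modelled as a hard edge for simplicity.

```python
from graphlib import TopologicalSorter
from typing import List

# Each phase maps to the set of phases it depends on.
PHASES = {
    "Phase 1: Setup": set(),
    "Phase 2: Foundational": {"Phase 1: Setup"},
    "Phase 3: User Story 1": {"Phase 2: Foundational"},
    "Phase 4: User Story 2": {"Phase 2: Foundational"},
    # Assumption: the "should follow User Story 1" advice is treated as a dependency.
    "Phase 5: User Story 3": {"Phase 2: Foundational", "Phase 3: User Story 1"},
    "Phase 6: Polish": {
        "Phase 3: User Story 1",
        "Phase 4: User Story 2",
        "Phase 5: User Story 3",
    },
}


def phase_order() -> List[str]:
    """Return one valid execution order that respects every dependency."""
    return list(TopologicalSorter(PHASES).static_order())
```

Several orders are valid, for example User Story 2 may run before or after User Story 1, which is exactly the parallelism the story dependencies allow.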
User Story Dependencies
- User Story 1 (P1): Can begin immediately after Foundational and delivers the first usable runtime-trend surface.
- User Story 2 (P1): Requires the same history contract as User Story 1 but remains independently valuable once that contract exists.
- User Story 3 (P2): Reuses the bounded history from User Story 1 and the policy limits from Foundational, but does not need User Story 2 to be useful.
Within Each User Story
- Story-specific guard tests should be written and fail before implementation.
- Manifest and wrapper contract changes should be in place before finalizing report output, schema validation, and comparable-bundle hydration steps.
- README and quickstart guidance should land after the corresponding runtime behavior exists.
- Lane validation and evidence capture should complete before closing a story.
Parallel Opportunities
- T003, T004, and T005 can proceed in parallel once T002 fixes the shared manifest shape.
- In User Story 1, T006 and T007 can run in parallel because they cover separate guard surfaces.
- In User Story 2, T012 and T013 can run in parallel because policy rules and evidence-record assertions are independent tests.
- In User Story 3, T018 and T019 can run in parallel because they touch separate guard suites.
- T025 can run in parallel with final formatting once all implementation and guard work is stable.
Parallel Example: User Story 1
# After T002-T005 establish the shared history contract, these can proceed in parallel:
Task: "Add apps/platform/tests/Feature/Guards/TestLaneTrendSummaryContractTest.php and update TestLaneArtifactsContractTest.php"
Task: "Add apps/platform/tests/Feature/Guards/TestLaneTrendClassificationTest.php"
Parallel Example: User Story 2
# After User Story 1 exposes comparable history, these can proceed in parallel:
Task: "Add apps/platform/tests/Feature/Guards/TestLaneRecalibrationPolicyTest.php"
Task: "Add apps/platform/tests/Feature/Guards/TestLaneRecalibrationEvidenceContractTest.php"
Parallel Example: User Story 3
# After the shared hotspot-ready report shape exists, these can proceed in parallel:
Task: "Add apps/platform/tests/Feature/Guards/TestLaneHotspotTrendContractTest.php"
Task: "Update apps/platform/tests/Feature/Guards/ProfileLaneContractTest.php and apps/platform/tests/Feature/Guards/HeavyGovernanceLaneContractTest.php"
Implementation Strategy
MVP First (User Story 1 Only)
- Complete Phase 1: Setup.
- Complete Phase 2: Foundational.
- Complete Phase 3: User Story 1.
- Validate `fast-feedback` and `confidence` trend summaries independently before continuing.
Incremental Delivery
- Deliver bounded history and lane health summaries first.
- Add explicit recalibration policy and evidence records next.
- Add hotspot delta visibility for heavy, browser, and support-lane-assisted investigations last.
- Finish with focused guard validation, real evidence capture, and formatting.
Parallel Team Strategy
- One contributor can extend `apps/platform/tests/Support/TestLaneManifest.php` and wrapper scripts while another prepares the new guard suites.
- After Foundational completes, User Story 1 test work and workflow hydration changes can be split across contributors.
- User Story 2 recalibration logic and User Story 3 hotspot logic can proceed separately once the history contract is stable.
Notes
- `[P]` tasks operate on different files or independent guard suites and can run in parallel once dependencies are satisfied.
- `[US1]`, `[US2]`, and `[US3]` map tasks directly to the user stories in `spec.md`.
- This feature changes runtime-governance behavior, so the narrowest relevant lane reruns and evidence capture remain part of the definition of done.
- Live Gitea validation remains required because local wrapper tests alone cannot prove cross-run artifact hydration and uploaded bundle behavior.