# Tasks: Test Runtime Trend Reporting & Baseline Recalibration

**Input**: Design documents from `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/211-runtime-trend-recalibration/`
**Prerequisites**: `plan.md` (required), `spec.md` (required), `research.md`, `data-model.md`, `contracts/`, `quickstart.md`

**Tests**: Required. This feature changes repository test-governance runtime behavior, so each user story includes Pest guard coverage plus focused lane and wrapper validation through Sail and the repo-root test-governance scripts.

**Organization**: Tasks are grouped by user story so each story can be implemented and validated independently where possible.

## Phase 1: Setup (Shared Context)

**Purpose**: Freeze the real repo-truth seams and artifact boundaries before implementation begins.

- [X] T001 [P] Audit `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneBudget.php`, `apps/platform/tests/Support/TestLaneReport.php`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, and `.gitea/workflows/*.yml` as the only valid trend-history and runtime-governance seams before implementation

---

## Phase 2: Foundational (Blocking Prerequisites)

**Purpose**: Extend the shared manifest, artifact, and wrapper seams that every story depends on.

**Critical**: No user story work should begin until this phase is complete.

- [X] T002 Extend `apps/platform/tests/Support/TestLaneManifest.php` with lane trend policy metadata, retention and comparison-window defaults, comparison-fingerprint inputs, hotspot limits, and `trend-history.json` artifact contracts aligned to `specs/211-runtime-trend-recalibration/data-model.md`
- [X] T003 [P] Extend `apps/platform/tests/Support/TestLaneReport.php` artifact path, read or write, and staging helpers so `apps/platform/storage/logs/test-lanes/-latest.trend-history.json` can be published alongside the existing summary, budget, report, and JUnit artifacts
- [X] T004 [P] Update `scripts/platform-test-report` and `scripts/platform-test-artifacts` to discover, select, and hydrate the latest comparable prior bundle or explicit local history input, then export the canonical `trend-history.json` artifact through the existing repo-root wrappers
- [X] T005 [P] Add or update shared guard coverage in `apps/platform/tests/Feature/Guards/TestLaneManifestTest.php`, `apps/platform/tests/Feature/Guards/TestLaneArtifactsContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneHistoryHydrationContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendContractSchemaTest.php`, and `apps/platform/tests/Feature/Guards/TestLaneTrendLogicalContractTest.php` to lock lane trend policy metadata, latest-comparable-bundle hydration semantics, JSON schema sync against `specs/211-runtime-trend-recalibration/contracts/test-runtime-trend-history.schema.json`, logical contract sync against `specs/211-runtime-trend-recalibration/contracts/test-runtime-trend.logical.openapi.yaml`, and staged bundle completeness for `trend-history.json`

**Checkpoint**: The shared trend-governance seams are ready for story-specific summary, recalibration, and hotspot work.
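
The hydration step in T004 depends on selecting the latest comparable prior bundle. The sketch below illustrates one way that selection could work. It is not taken from `scripts/platform-test-artifacts`: the index format (one `<epoch> <fingerprint> <bundle-path>` entry per line on stdin) and the function name `select_latest_comparable` are assumptions made for illustration, and the real wrappers may discover bundles differently.

```bash
#!/bin/sh
# Hypothetical helper, not part of the repository wrappers.
# Usage: select_latest_comparable <current-fingerprint> < index
# Assumed index format: "<epoch> <fingerprint> <bundle-path>" per line,
# with no whitespace inside any field.
select_latest_comparable() {
  awk -v fp="$1" '
    # Force a string comparison so numeric-looking fingerprints are not
    # compared as numbers, then keep the entry with the newest timestamp.
    ($2 "") == (fp "") && ($1 + 0) > best { best = $1 + 0; path = $3 }
    END { if (path != "") print path }
  '
}
```

Entries with a different fingerprint are skipped, so a lane without comparable history prints nothing; a caller could treat that as a first-sample run rather than an error.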

---

## Phase 3: User Story 1 - See Lane Drift Before It Becomes A Repeated Gate (Priority: P1) 🎯 MVP

**Goal**: Publish lane-first trend summaries that show current, previous, baseline, budget, and health status before a lane becomes a recurring blocker.

**Independent Test**: Review representative three-sample run sequences for `fast-feedback` and `confidence`, confirm the summary shows current, previous, baseline, and budget values, and verify that healthy, near-budget, worsening, and noisy cases are distinguishable without manual arithmetic.

### Tests for User Story 1

- [X] T006 [P] [US1] Add `apps/platform/tests/Feature/Guards/TestLaneTrendSummaryContractTest.php` and update `apps/platform/tests/Feature/Guards/TestLaneArtifactsContractTest.php` to assert bounded history windows and current, previous, baseline, and budget fields for `fast-feedback` and `confidence`
- [X] T007 [P] [US1] Add `apps/platform/tests/Feature/Guards/TestLaneTrendClassificationTest.php` to cover `healthy`, `budget-near`, `trending-worse`, `regressed`, and `unstable` outcomes, including one-off noisy spike handling

### Implementation for User Story 1

- [X] T008 [US1] Extend `apps/platform/tests/Support/TestLaneReport.php` with `LaneTrendRecord` generation, comparison-window evaluation, comparison fingerprints, and trend-aware `summary.md` plus `report.json` output for `fast-feedback` and `confidence`
- [X] T009 [US1] Update `apps/platform/tests/Support/TestLaneManifest.php`, `.gitea/workflows/test-pr-fast-feedback.yml`, and `.gitea/workflows/test-main-confidence.yml` so pull-request and mainline bundles discover and hydrate the latest comparable history bundle, then republish the refreshed `trend-history.json` artifact without widening lane execution
- [X] T010 [US1] Update `README.md` and `specs/211-runtime-trend-recalibration/quickstart.md` with reviewer guidance and local validation steps for reading lane health summaries across `fast-feedback` and `confidence`
- [X] T011 [US1] Run the narrowest proving path with `./scripts/platform-test-lane fast-feedback`, `./scripts/platform-test-report fast-feedback`, `./scripts/platform-test-lane confidence`, and `./scripts/platform-test-report confidence`, then record representative three-sample `healthy`, `budget-near`, and `unstable` evidence in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`

**Checkpoint**: At this point, lane drift visibility for the main contributor lanes should be independently functional and reviewable.

---

## Phase 4: User Story 2 - Decide Recalibration With Evidence Instead Of Habit (Priority: P1)

**Goal**: Separate baseline and budget recalibration from ordinary health status and make every recalibration decision evidence-backed.

**Independent Test**: Review one justified recalibration case and one rejected recalibration case, and confirm the report plus policy make the outcome understandable without private notes.

### Tests for User Story 2

- [X] T012 [P] [US2] Add `apps/platform/tests/Feature/Guards/TestLaneRecalibrationPolicyTest.php` to assert baseline-vs-budget separation, evidence-window requirements, and approved versus rejected rationale handling
- [X] T013 [P] [US2] Add `apps/platform/tests/Feature/Guards/TestLaneRecalibrationEvidenceContractTest.php` to assert candidate, approved, and rejected recalibration records together with explicit summary disclosure for recalibration outcomes

### Implementation for User Story 2

- [X] T014 [US2] Extend `apps/platform/tests/Support/TestLaneBudget.php` with recalibration recommendation helpers, lane-specific tolerance reuse, and explicit baseline plus budget review rules aligned to `specs/211-runtime-trend-recalibration/data-model.md`
- [X] T015 [US2] Extend `apps/platform/tests/Support/TestLaneManifest.php` and `apps/platform/tests/Support/TestLaneReport.php` to emit structured recalibration policy metadata, decision records, evidence run references, and `recordedIn` guidance pointing to `specs/211-runtime-trend-recalibration/spec.md` or the implementation PR without mutating manifest truth automatically
- [X] T016 [US2] Update `README.md` and `specs/211-runtime-trend-recalibration/quickstart.md` with the approved and rejected recalibration policy, required evidence windows, and reviewer follow-up rules
- [X] T017 [US2] Run recalibration validation with `./scripts/platform-test-report fast-feedback` and `./scripts/platform-test-report confidence` against seeded prior histories, then record one approved and one rejected recalibration example in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`

**Checkpoint**: At this point, recalibration guidance should be independently testable and clearly separated from ordinary lane health.

---

## Phase 5: User Story 3 - Track Dominant Hotspots Over Time (Priority: P2)

**Goal**: Surface persistent, worsening, and newly dominant hotspots so follow-up optimization work targets the real cost drivers.

**Independent Test**: Review representative hotspot summaries for each primary lane across multiple runs and confirm that persistent, worsening, newly dominant, and unavailable hotspot states are visible.
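
The four hotspot states named above can be told apart by comparing the previous and current hotspot lists. The sketch below shows one way to do that and is not the logic in `apps/platform/tests/Support/TestLaneReport.php`. The input format (`<name> <seconds>` per line), the function name `classify_hotspots`, the `HOTSPOT_TOLERANCE_PERCENT` variable with its 10 percent default, and the state labels are all assumptions made for illustration.

```bash
#!/bin/sh
# Hypothetical helper, not part of the repository.
# Usage: classify_hotspots <previous-file> <current-file>
# Assumed file format: "<name> <seconds>" per line, no whitespace in names.
classify_hotspots() {
  prev_file=$1
  cur_file=$2

  # No current hotspot evidence: disclose that instead of guessing.
  if [ ! -s "$cur_file" ]; then
    echo "unavailable"
    return 0
  fi

  # Without a previous sample, every current hotspot counts as new.
  [ -f "$prev_file" ] || prev_file=/dev/null

  awk -v tolerance="${HOTSPOT_TOLERANCE_PERCENT:-10}" '
    # Records from the first operand populate the previous-run table.
    FILENAME == ARGV[1] { prev[$1] = $2; next }
    {
      if (!($1 in prev))
        state = "newly-dominant"
      else if ($2 > prev[$1] * (1 + tolerance / 100))
        state = "worsening"
      else
        state = "persistent"
      print $1, state
    }
  ' "$prev_file" "$cur_file"
}
```

The sketch reports `unavailable` for the whole lane when the current list is missing or empty, which mirrors the explicit unavailable-hotspot disclosure the guard tests are meant to assert. Dropped hotspots are not handled here.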

### Tests for User Story 3

- [X] T018 [P] [US3] Add `apps/platform/tests/Feature/Guards/TestLaneHotspotTrendContractTest.php` to assert top family and file delta output, new or dropped hotspot detection, and explicit unavailable-hotspot disclosure
- [X] T019 [P] [US3] Update `apps/platform/tests/Feature/Guards/ProfileLaneContractTest.php`, `apps/platform/tests/Feature/Guards/FastFeedbackLaneContractTest.php`, `apps/platform/tests/Feature/Guards/ConfidenceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/HeavyGovernanceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/BrowserLaneIsolationTest.php`, and `apps/platform/tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php` to assert support-lane hotspot evidence and hotspot visibility for all primary lanes plus the chosen `junit` or `profiling` support example

### Implementation for User Story 3

- [X] T020 [US3] Extend `apps/platform/tests/Support/TestLaneReport.php` with hotspot delta computation from `classificationTotals`, `familyTotals`, `hotspotFiles`, and `slowestEntries`, capping readable output to the policy limits defined in `apps/platform/tests/Support/TestLaneManifest.php`
- [X] T021 [US3] Update `apps/platform/tests/Support/TestLaneManifest.php`, `.gitea/workflows/test-heavy-governance.yml`, and `.gitea/workflows/test-browser.yml` so heavy and browser bundles retain hotspot-supporting history context and surface missing hotspot evidence explicitly
- [X] T022 [US3] Update `README.md` and `specs/211-runtime-trend-recalibration/quickstart.md` with hotspot investigation guidance, `profiling` and `junit` support-lane usage, and examples of persistent versus newly dominant hotspots
- [X] T023 [US3] Run representative hotspot validation with `./scripts/platform-test-report fast-feedback`, `./scripts/platform-test-report confidence`, `./scripts/platform-test-lane heavy-governance`, `./scripts/platform-test-report heavy-governance`, `./scripts/platform-test-lane browser`, `./scripts/platform-test-report browser`, and one support-lane report path from `./scripts/platform-test-report profiling` or `./scripts/platform-test-report junit`, then record persistent, worsening, newly dominant, and unavailable hotspot evidence for each primary lane in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`

**Checkpoint**: At this point, hotspot trend visibility should be independently functional without depending on recalibration rollout evidence.

---

## Phase 6: Polish & Cross-Cutting Concerns

**Purpose**: Validate the full trend-governance slice, record evidence, and finish formatting.

- [X] T024 Run focused Pest coverage for `apps/platform/tests/Feature/Guards/TestLaneTrendSummaryContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendClassificationTest.php`, `apps/platform/tests/Feature/Guards/TestLaneRecalibrationPolicyTest.php`, `apps/platform/tests/Feature/Guards/TestLaneRecalibrationEvidenceContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneHotspotTrendContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneHistoryHydrationContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendContractSchemaTest.php`, `apps/platform/tests/Feature/Guards/TestLaneTrendLogicalContractTest.php`, `apps/platform/tests/Feature/Guards/TestLaneManifestTest.php`, `apps/platform/tests/Feature/Guards/TestLaneArtifactsContractTest.php`, `apps/platform/tests/Feature/Guards/FastFeedbackLaneContractTest.php`, `apps/platform/tests/Feature/Guards/ConfidenceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/ProfileLaneContractTest.php`, `apps/platform/tests/Feature/Guards/HeavyGovernanceLaneContractTest.php`, `apps/platform/tests/Feature/Guards/BrowserLaneIsolationTest.php`, and `apps/platform/tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php` with `cd apps/platform && ./vendor/bin/sail artisan test --compact ...`
- [X] T025 [P] Execute the representative local and Gitea evidence set across `.gitea/workflows/test-pr-fast-feedback.yml`, `.gitea/workflows/test-main-confidence.yml`, `.gitea/workflows/test-heavy-governance.yml`, and `.gitea/workflows/test-browser.yml`, capture at least three sequential comparable samples for each primary lane, include one support-lane example from `junit` or `profiling`, time-box a reviewer dry run to confirm the summary remains decidable within two minutes, and record lane, health class, hotspot availability, recalibration outcome, and any material runtime drift follow-up in `specs/211-runtime-trend-recalibration/spec.md` and `specs/211-runtime-trend-recalibration/quickstart.md`
- [X] T026 Run `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent` for changes in `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneBudget.php`, `apps/platform/tests/Support/TestLaneReport.php`, and the new or updated guard tests under `apps/platform/tests/Feature/Guards/`

---

## Dependencies & Execution Order

### Phase Dependencies

- **Setup (Phase 1)**: No dependencies and can start immediately.
- **Foundational (Phase 2)**: Depends on Phase 1 and blocks all user story work.
- **User Story 1 (Phase 3)**: Depends on Phase 2 only and is the MVP slice.
- **User Story 2 (Phase 4)**: Depends on Phase 2 and benefits from the trend-history infrastructure completed for User Story 1.
- **User Story 3 (Phase 5)**: Depends on Phase 2 and should follow User Story 1 because hotspot deltas reuse the same history and assessment outputs.
- **Polish (Phase 6)**: Depends on all desired user stories being complete.

### User Story Dependencies

- **User Story 1 (P1)**: Can begin immediately after Foundational and delivers the first usable runtime-trend surface.
- **User Story 2 (P1)**: Requires the same history contract as User Story 1 but remains independently valuable once that contract exists.
- **User Story 3 (P2)**: Reuses the bounded history from User Story 1 and the policy limits from Foundational, but does not need User Story 2 to be useful.

### Within Each User Story

- Story-specific guard tests should be written and fail before implementation.
- Manifest and wrapper contract changes should be in place before finalizing report output, schema validation, and comparable-bundle hydration steps.
- README and quickstart guidance should land after the corresponding runtime behavior exists.
- Lane validation and evidence capture should complete before closing a story.

### Parallel Opportunities

- T003, T004, and T005 can proceed in parallel once T002 fixes the shared manifest shape.
- In User Story 1, T006 and T007 can run in parallel because they cover separate guard surfaces.
- In User Story 2, T012 and T013 can run in parallel because policy rules and evidence-record assertions are independent tests.
- In User Story 3, T018 and T019 can run in parallel because they touch separate guard suites.
- T025 can run in parallel with final formatting once all implementation and guard work is stable.

---

## Parallel Example: User Story 1

```bash
# After T002-T005 establish the shared history contract, these can proceed in parallel:
Task: "Add apps/platform/tests/Feature/Guards/TestLaneTrendSummaryContractTest.php and update TestLaneArtifactsContractTest.php"
Task: "Add apps/platform/tests/Feature/Guards/TestLaneTrendClassificationTest.php"
```

---

## Parallel Example: User Story 2

```bash
# After User Story 1 exposes comparable history, these can proceed in parallel:
Task: "Add apps/platform/tests/Feature/Guards/TestLaneRecalibrationPolicyTest.php"
Task: "Add apps/platform/tests/Feature/Guards/TestLaneRecalibrationEvidenceContractTest.php"
```

---

## Parallel Example: User Story 3

```bash
# After the shared hotspot-ready report shape exists, these can proceed in parallel:
Task: "Add apps/platform/tests/Feature/Guards/TestLaneHotspotTrendContractTest.php"
Task: "Update apps/platform/tests/Feature/Guards/ProfileLaneContractTest.php and apps/platform/tests/Feature/Guards/HeavyGovernanceLaneContractTest.php"
```

---

## Implementation Strategy

### MVP First (User Story 1 Only)

1. Complete Phase 1: Setup.
2. Complete Phase 2: Foundational.
3. Complete Phase 3: User Story 1.
4. Validate `fast-feedback` and `confidence` trend summaries independently before continuing.

### Incremental Delivery

1. Deliver bounded history and lane health summaries first.
2. Add explicit recalibration policy and evidence records next.
3. Add hotspot delta visibility for heavy, browser, and support-lane-assisted investigations last.
4. Finish with focused guard validation, real evidence capture, and formatting.

### Parallel Team Strategy

1. One contributor can extend `apps/platform/tests/Support/TestLaneManifest.php` and wrapper scripts while another prepares the new guard suites.
2. After Foundational completes, User Story 1 test work and workflow hydration changes can be split across contributors.
3. User Story 2 recalibration logic and User Story 3 hotspot logic can proceed separately once the history contract is stable.

---

## Notes

- `[P]` tasks operate on different files or independent guard suites and can run in parallel once dependencies are satisfied.
- `[US1]`, `[US2]`, and `[US3]` map tasks directly to the user stories in `spec.md`.
- This feature changes runtime-governance behavior, so the narrowest relevant lane reruns and evidence capture remain part of the definition of done.
- Live Gitea validation remains required because local wrapper tests alone cannot prove cross-run artifact hydration and uploaded bundle behavior.