TenantAtlas/specs/295-full-suite-ci-baseline/tasks.md
ahmido f03555eae1 Spec 295: full suite CI lane baseline (#350)
## Summary
- add the Spec 295 artifacts for full-suite failure classification and CI lane baseline work
- fix `scripts/platform-test-artifacts` so Sail passes artifact staging inputs into the embedded PHP script via argv
- add a guard test covering the artifact staging input contract

## Scope guards
- no browser screenshot baselines included
- no generated test artifacts included
- no runtime application code changes included

## Notes
- classification evidence and follow-up ownership are documented in `specs/295-full-suite-ci-baseline/failure-classification.md`
- this PR is intentionally limited to the CI/lane/artifact contract slice for Spec 295

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #350
2026-05-11 11:14:56 +00:00


# Tasks: Full Suite Failure Classification & CI Lane Baseline
**Input**: Design documents from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/`
**Prerequisites**: `spec.md`, `plan.md`, `research.md`, `data-model.md`, `quickstart.md`, `failure-classification.md`, `checklists/requirements.md`
**Review Artifact**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md`
**Failure Inventory**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
## Review Metadata
- **Review outcome class**: `acceptable-special-case`
- **Workflow outcome**: `keep`
- **Test-governance outcome**: `keep`
- **Stop / split triggers**: broad product/runtime repair, new CI framework, new permanent lane, new browser family, new heavy-governance family, runtime application changes, Filament resource/page changes, route restoration, TenantPanelProvider restoration, `/admin/t/...` restoration, provider/verification runtime expansion, historical-spec rewrite, or budget relaxation without classification evidence
## Pinned Failure-Classification Categories
- `ci-signal-restored`
- `ci-wrapper-or-manifest-regression`
- `artifact-publication-regression`
- `budget-or-trend-baseline-drift`
- `product-runtime-or-test-regression`
- `browser-lane-regression`
- `flaky-or-environment`
- `follow-up-spec-required`
- `resolved-or-not-needed`
## Pinned CI / Suite Seams
- `raw-full-suite`
- `fast-feedback-lane`
- `confidence-lane`
- `heavy-governance-lane`
- `browser-lane`
- `profiling-or-junit-support`
- `lane-reporting`
- `artifact-publication`
- `budget-trend-baseline`
- `legacy-cutover-regression-guard`
- `provider-verification-regression-guard`
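
The two pinned lists above lend themselves to a mechanical check. The following is a minimal sketch, assuming each row of `failure-classification.md` has been parsed into a dict with hypothetical `category` and `seam` keys; the parsing step and the key names are not part of this spec.

```python
# Pinned values copied from the two lists above.
PINNED_CATEGORIES = frozenset({
    "ci-signal-restored",
    "ci-wrapper-or-manifest-regression",
    "artifact-publication-regression",
    "budget-or-trend-baseline-drift",
    "product-runtime-or-test-regression",
    "browser-lane-regression",
    "flaky-or-environment",
    "follow-up-spec-required",
    "resolved-or-not-needed",
})

PINNED_SEAMS = frozenset({
    "raw-full-suite",
    "fast-feedback-lane",
    "confidence-lane",
    "heavy-governance-lane",
    "browser-lane",
    "profiling-or-junit-support",
    "lane-reporting",
    "artifact-publication",
    "budget-trend-baseline",
    "legacy-cutover-regression-guard",
    "provider-verification-regression-guard",
})


def row_errors(row):
    """Return human-readable problems for one classification row.

    `row` is a hypothetical dict; "category" and "seam" are assumed key names.
    """
    errors = []
    if row.get("category") not in PINNED_CATEGORIES:
        errors.append(f"unknown category: {row.get('category')!r}")
    if row.get("seam") not in PINNED_SEAMS:
        errors.append(f"unknown seam: {row.get('seam')!r}")
    return errors
```

Because each key holds a single value, a row that passes this check carries exactly one pinned category and one pinned seam.
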
## Test Governance Checklist
- [x] Lane assignment is named and is the narrowest sufficient proof for each observed failure group.
- [x] New or changed tests stay in the smallest honest family, and any heavy-governance or browser addition is explicit.
- [x] Shared helpers, factories, seeds, fixtures, and context defaults stay cheap by default; any widening is isolated or documented.
- [x] Planned validation commands cover the change without pulling in unrelated lane cost beyond classification.
- [x] The declared surface test profile or `standard-native-filament` relief is explicit.
- [x] Any material budget, baseline, trend, or escalation note is recorded in `failure-classification.md`.
## Phase 1: Setup and Scope Lock
**Purpose**: Confirm Spec `295` remains a classification and CI lane baseline package before any suite command runs.
- [x] T001 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/plan.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/research.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/quickstart.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md` before changing runtime or tests
- [x] T002 [P] Confirm current branch, working tree, and baseline diff using `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch` and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat`, then record any pre-existing changes in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T003 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/293-post-cutover-suite-stabilization/failure-classification.md` and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/294-provider-verification-runtime-semantics/failure-classification.md` as context only, confirming no task edits are made to Specs `293` or `294`
- [x] T004 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-report`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` to confirm current lane entry points and failure classes
- [x] T005 Confirm the explicit forbidden scope in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`: no TenantPanelProvider restoration, no `/admin/t/...` restoration, no broad product repair, and no historical-spec rewrite
---
## Phase 2: User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1)
**Goal**: Establish the raw full-suite readiness signal or an explicit fallback split before any fix work begins.
**Independent Test**: the raw full-suite result or fallback lane split is represented by classified rows in `failure-classification.md`, with no red group left unclassified.
- [x] T006 [US1] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)` and record pass/fail counts, failing files, and any timeout/noisy-output reason in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T007 [US1] If T006 cannot produce a classifiable result, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance`, and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser`, then record each lane outcome in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T008 [US1] Group every failing test file, assertion cluster, wrapper error, report error, artifact error, budget breach, or environment issue into one row in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with exactly one pinned category and one pinned seam
- [x] T009 [US1] Classify any legacy route or panel-related group under `legacy-cutover-regression-guard` without restoring `/admin/t/...`, TenantPanelProvider, tenant-scoped provider fallback routes, or historical compatibility behavior
- [x] T010 [US1] Classify any provider/verification group under `provider-verification-regression-guard` without rewriting Spec `294`; only mark it in-scope if the failure is a direct CI/lane contract defect rather than provider runtime behavior
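
The "no red group left unclassified" condition can be checked with a small set comparison. This is only a sketch: it assumes the failing files from the suite run and the classified rows are already available as plain Python values, and the `files` key and the example paths in the usage lines are hypothetical.

```python
from collections import Counter


def coverage_gaps(failing_files, rows):
    """Compare observed failing files against the classified rows.

    Returns (unclassified, duplicated): files that appear in no row, and
    files that appear in more than one row.
    """
    counts = Counter(path for row in rows for path in row["files"])
    unclassified = sorted(p for p in failing_files if counts[p] == 0)
    duplicated = sorted(p for p in failing_files if counts[p] > 1)
    return unclassified, duplicated


# Usage with made-up file names:
rows = [
    {"files": ["tests/Feature/ATest.php", "tests/Feature/BTest.php"]},
    {"files": ["tests/Feature/BTest.php"]},
]
failing = ["tests/Feature/ATest.php", "tests/Feature/BTest.php", "tests/Feature/CTest.php"]
unclassified, duplicated = coverage_gaps(failing, rows)
```
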
---
## Phase 3: User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1)
**Goal**: Prove existing CI wrappers, reports, artifacts, budgets, and failure classes are interpretable after the suite run.
**Independent Test**: every lane either passes with complete report/artifact output or fails with the correct primary failure class.
- [x] T011 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T012 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T013 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T014 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T015 [P] [US2] If machine-readable confidence output is needed for follow-up ownership, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit` and classify the JUnit support result in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (not run separately because the lane wrappers produced the needed JUnit artifacts)
- [x] T016 [P] [US2] If artifact publication is suspected, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-fast-feedback-artifacts` or the matching affected lane and classify any missing required artifacts under `artifact-publication-regression`
- [x] T017 [US2] Verify existing failure classes from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` classify lane outcomes as `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, or `infrastructure-failure`, and record mismatches in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
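
The check in T017 can be illustrated as follows. This sketch does not reproduce the logic of `TestLaneReport.php`; it only assumes that each lane's reported primary failure class has been collected into a dict (with `None` for a passing lane), which is a hypothetical shape chosen for illustration.

```python
from typing import Dict, List, Optional

# Failure classes named in T017.
FAILURE_CLASSES = frozenset({
    "test-failure",
    "wrapper-failure",
    "budget-breach",
    "artifact-publication-failure",
    "infrastructure-failure",
})

# Lanes exercised by T011 through T014.
LANES = ("fast-feedback", "confidence", "heavy-governance", "browser")


def lane_mismatches(reported: Dict[str, Optional[str]]) -> List[str]:
    """List lanes with no recorded outcome or an unexpected failure class."""
    notes = []
    for lane in LANES:
        if lane not in reported:
            notes.append(f"{lane}: no outcome recorded")
            continue
        failure_class = reported[lane]
        # None means the lane passed; anything else must be a known class.
        if failure_class is not None and failure_class not in FAILURE_CLASSES:
            notes.append(f"{lane}: unexpected failure class {failure_class!r}")
    return notes
```
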
---
## Phase 4: User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1)
**Goal**: Keep Spec `295` limited to CI signal readiness by splitting product/runtime failures into explicit follow-up ownership.
**Independent Test**: every non-CI failure group has a follow-up recommendation, owner, or environment disposition.
- [x] T018 [US3] For each row classified as `product-runtime-or-test-regression`, decide whether it is a follow-up spec, lane-specific debt, or active feature blocker, then record the decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T019 [US3] For each row classified as `browser-lane-regression`, record the affected browser file under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Browser/`, whether the failure is smoke/environment/product behavior, and the follow-up path in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T020 [US3] For each row classified as `flaky-or-environment`, rerun the narrowest affected command once when safe and record the rerun evidence or environment blocker in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (no flaky/environment row was identified)
- [x] T021 [US3] Confirm no failure group is being fixed under `295` solely because it is small or nearby; it must be directly tied to CI wrapper, manifest, report, artifact, or budget/trend contract drift
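
The scope rule in T021 can be sketched as a guard over the classified rows. The set of categories eligible for a fix under `295` is an assumption here, taken from the three conditional tasks in Phase 5 (T022 through T024); the `id`, `category`, and `fix_in_295` keys are hypothetical.

```python
# Assumption: only these categories may carry a fix inside Spec 295,
# mirroring the conditions of T022, T023, and T024.
FIXABLE_IN_295 = frozenset({
    "ci-wrapper-or-manifest-regression",
    "artifact-publication-regression",
    "budget-or-trend-baseline-drift",
})


def scope_violations(rows):
    """Return ids of rows that plan a fix outside the CI/lane contract scope."""
    return [
        row["id"]
        for row in rows
        if row["fix_in_295"] and row["category"] not in FIXABLE_IN_295
    ]
```
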
---
## Phase 5: User Story 4 - Apply Only Small CI-Signal Fixes (Priority: P2)
**Goal**: Correct narrow CI/lane contract defects only when classification proves they block a trustworthy CI signal.
**Independent Test**: the directly affected lane/report/artifact guard passes after the minimal fix, and unrelated red groups remain classified.
- [x] T022 [US4] If a `ci-wrapper-or-manifest-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` (not applicable: no `ci-wrapper-or-manifest-regression` row was proven)
- [x] T023 [US4] If an `artifact-publication-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected artifact guard test
- [x] T024 [US4] If a `budget-or-trend-baseline-drift` row is proven, update only the documented budget/trend baseline owner in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneBudget.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test when the classification row explains why the evidence supports the change (not applicable: no budget/trend baseline rewrite was justified)
- [x] T025 [US4] Add or adjust Pest coverage only when a CI/lane contract defect was fixed, keeping tests under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` or `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Unit/Support/` and avoiding new browser/heavy families by default
- [x] T026 [US4] Re-run the narrowest affected lane/report/artifact command after any CI/lane fix and update `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with the final status
---
## Phase 6: Final Readiness Decision and Validation
**Purpose**: Publish one final CI readiness decision and prove no unclassified failure or hidden scope expansion remains.
- [x] T027 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` and confirm every row has category, seam, observed command, candidate owner, fix-in-295 decision, follow-up, and status
- [x] T028 Set the final readiness decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` to exactly one of `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`
- [x] T029 Re-run the final narrowest proof command set for the decision: raw full suite if classifiable, otherwise the exact affected lane/report commands from Phases 2 through 5
- [x] T030 Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)` if any PHP or script-adjacent PHP files changed
- [x] T031 Confirm Filament remains v5 on Livewire v4, provider registration remains in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/bootstrap/providers.php`, no globally searchable resource changed, no destructive action changed, no asset registration changed, no `/admin/t/...` route or TenantPanelProvider behavior was restored, and no Specs `293` or `294` artifact was rewritten
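
The completeness review in T027 and the single-decision rule in T028 can be expressed as one small check. This is a sketch under assumptions: the snake_case field names are hypothetical renderings of the columns listed in T027, and rows are assumed to be pre-parsed dicts.

```python
# Hypothetical key names for the columns listed in T027.
REQUIRED_FIELDS = (
    "category",
    "seam",
    "observed_command",
    "candidate_owner",
    "fix_in_295",
    "follow_up",
    "status",
)

# Allowed final decisions from T028.
READINESS_DECISIONS = frozenset({
    "restored-ci-signal",
    "classified-follow-up-required",
    "blocked-by-environment",
})


def missing_fields(row):
    """Names of required fields that are absent or empty (False is a valid value)."""
    return [name for name in REQUIRED_FIELDS if row.get(name) in (None, "")]


def check_inventory(rows, decision):
    """Collect every problem that would block the final readiness decision."""
    problems = []
    for index, row in enumerate(rows, start=1):
        for name in missing_fields(row):
            problems.append(f"row {index}: missing {name}")
    if decision not in READINESS_DECISIONS:
        problems.append(f"invalid readiness decision: {decision!r}")
    return problems
```

An empty result from `check_inventory` means every row is complete and the decision is one of the three allowed values.
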
## Dependencies & Execution Order
- **Phase 1** must complete before any suite command.
- **Phase 2** must classify raw suite or fallback lane output before any fix work.
- **Phase 3** depends on Phase 2 because lane reports must be interpreted against observed lane outcomes.
- **Phase 4** depends on the failure group inventory from Phases 2 and 3.
- **Phase 5** depends on classified CI/lane contract defects; skip it entirely if no in-scope CI/lane defect is proven.
- **Phase 6** depends on all classification and any bounded fixes.
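
The ordering above forms a small dependency graph. The sketch below encodes it with the standard-library `graphlib` module; treating Phase 5 as depending on Phase 4 is an assumption (the text only says it depends on classified CI/lane contract defects), and `execution_order` is a hypothetical helper name.

```python
from graphlib import TopologicalSorter

# Each phase maps to the set of phases it depends on.
PHASE_DEPENDENCIES = {
    1: set(),
    2: {1},
    3: {2},
    4: {2, 3},
    5: {4},        # assumption: classification is complete after Phase 4
    6: {4, 5},
}


def execution_order(ci_defect_proven):
    """Return phases in dependency order, skipping Phase 5 when no defect is proven."""
    order = list(TopologicalSorter(PHASE_DEPENDENCIES).static_order())
    if not ci_defect_proven:
        order = [phase for phase in order if phase != 5]
    return order


print(execution_order(ci_defect_proven=False))  # [1, 2, 3, 4, 6]
```
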
## Parallel Execution Examples
- T003 and T004 can run in parallel after T001.
- T011 through T014 can run independently after their corresponding lane outputs exist.
- T018 through T020 can be split by failure group once T008 has created the grouped inventory.
- T022 through T024 must not run until a corresponding classification row proves the in-scope defect.
## Implementation Strategy
### Suggested MVP Scope
MVP = Phases 1 through 4. That is enough to answer whether the suite is green or which follow-up owns each red group. Phase 5 runs only when classification proves a narrow CI/lane contract defect.
### Incremental Delivery
1. Lock scope and read prior stabilization artifacts.
2. Run raw full suite or fallback lane split.
3. Classify every red group.
4. Validate lane/report/artifact signal.
5. Split product/runtime failures to follow-up ownership.
6. Apply only proven CI/lane fixes.
7. Publish the final readiness decision.
## Explicit Follow-Ups / Out of Scope
- Product/runtime failing-test repair outside CI/lane contract defects
- Browser UI repair
- Package Execution
- Guided Operations
- Microsoft Starter Pack
- Virtual Consultant
- Tenant cutover rework
- Provider/verification runtime expansion beyond Spec `294`
- New permanent CI lane or framework
- Historical-spec cleanup