TenantAtlas/specs/295-full-suite-ci-baseline/tasks.md
ahmido f03555eae1 Spec 295: full suite CI lane baseline (#350)
## Summary
- add the Spec 295 artifacts for full-suite failure classification and CI lane baseline work
- fix `scripts/platform-test-artifacts` so Sail passes artifact staging inputs into the embedded PHP script via argv
- add a guard test covering the artifact staging input contract

## Scope guards
- no browser screenshot baselines included
- no generated test artifacts included
- no runtime application code changes included

## Notes
- classification evidence and follow-up ownership are documented in `specs/295-full-suite-ci-baseline/failure-classification.md`
- this PR is intentionally limited to the CI/lane/artifact contract slice for Spec 295

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #350
2026-05-11 11:14:56 +00:00

Tasks: Full Suite Failure Classification & CI Lane Baseline

Input: Design documents from /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/
Prerequisites: spec.md, plan.md, research.md, data-model.md, quickstart.md, failure-classification.md, checklists/requirements.md

Review Artifact: /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md
Failure Inventory: /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md

Review Metadata

  • Review outcome class: acceptable-special-case
  • Workflow outcome: keep
  • Test-governance outcome: keep
  • Stop / split triggers: broad product/runtime repair, new CI framework, new permanent lane, new browser family, new heavy-governance family, runtime application changes, Filament resource/page changes, route restoration, TenantPanelProvider restoration, /admin/t/... restoration, provider/verification runtime expansion, historical-spec rewrite, or budget relaxation without classification evidence

Pinned Failure-Classification Categories

  • ci-signal-restored
  • ci-wrapper-or-manifest-regression
  • artifact-publication-regression
  • budget-or-trend-baseline-drift
  • product-runtime-or-test-regression
  • browser-lane-regression
  • flaky-or-environment
  • follow-up-spec-required
  • resolved-or-not-needed

Pinned CI / Suite Seams

  • raw-full-suite
  • fast-feedback-lane
  • confidence-lane
  • heavy-governance-lane
  • browser-lane
  • profiling-or-junit-support
  • lane-reporting
  • artifact-publication
  • budget-trend-baseline
  • legacy-cutover-regression-guard
  • provider-verification-regression-guard
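
The two pinned lists above act as closed vocabularies for failure-classification.md. A minimal sketch of how a row could be checked against them follows; the helper name `row_errors` is hypothetical and not part of the repository.

```python
# Pinned vocabularies copied verbatim from the two lists above.
CATEGORIES = frozenset({
    "ci-signal-restored",
    "ci-wrapper-or-manifest-regression",
    "artifact-publication-regression",
    "budget-or-trend-baseline-drift",
    "product-runtime-or-test-regression",
    "browser-lane-regression",
    "flaky-or-environment",
    "follow-up-spec-required",
    "resolved-or-not-needed",
})

SEAMS = frozenset({
    "raw-full-suite",
    "fast-feedback-lane",
    "confidence-lane",
    "heavy-governance-lane",
    "browser-lane",
    "profiling-or-junit-support",
    "lane-reporting",
    "artifact-publication",
    "budget-trend-baseline",
    "legacy-cutover-regression-guard",
    "provider-verification-regression-guard",
})


def row_errors(category: str, seam: str) -> list[str]:
    """Hypothetical helper: list the problems with one classification row."""
    errors = []
    if category not in CATEGORIES:
        errors.append(f"unknown category: {category!r}")
    if seam not in SEAMS:
        errors.append(f"unknown seam: {seam!r}")
    return errors
```

Because each row carries a single category string and a single seam string, the "exactly one" rule from T008 holds by construction; the helper only has to reject values outside the pinned lists.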

Test Governance Checklist

  • Lane assignment is named and is the narrowest sufficient proof for each observed failure group.
  • New or changed tests stay in the smallest honest family, and any heavy-governance or browser addition is explicit.
  • Shared helpers, factories, seeds, fixtures, and context defaults stay cheap by default; any widening is isolated or documented.
  • Planned validation commands cover the change without pulling in unrelated lane cost beyond classification.
  • The declared surface test profile or standard-native-filament relief is explicit.
  • Any material budget, baseline, trend, or escalation note is recorded in failure-classification.md.

Phase 1: Setup and Scope Lock

Purpose: Confirm Spec 295 remains a classification and CI lane baseline package before any suite command runs.

  • T001 Review /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md, /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/plan.md, /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/research.md, /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md, /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/quickstart.md, /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md, and /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md before changing runtime or tests
  • T002 [P] Confirm current branch, working tree, and baseline diff using export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch and export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat, then record any pre-existing changes in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T003 [P] Inspect /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/293-post-cutover-suite-stabilization/failure-classification.md and /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/294-provider-verification-runtime-semantics/failure-classification.md as context only, confirming no task edits are made to Specs 293 or 294
  • T004 [P] Inspect /Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane, /Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-report, /Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts, /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json, /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php, and /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php to confirm current lane entry points and failure classes
  • T005 Confirm the explicit forbidden scope in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md: no TenantPanelProvider restoration, no /admin/t/... restoration, no broad product repair, and no historical-spec rewrite

Phase 2: User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1)

Goal: Establish the raw full-suite readiness signal or an explicit fallback split before any fix work begins.

Independent Test: the raw full-suite result or fallback lane split is represented by classified rows in failure-classification.md, with no red group left unclassified.

  • T006 [US1] Run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact) and record pass/fail counts, failing files, and any timeout/noisy-output reason in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T007 [US1] If T006 cannot produce a classifiable result, run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback, export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence, export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance, and export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser, then record each lane outcome in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T008 [US1] Group every failing test file, assertion cluster, wrapper error, report error, artifact error, budget breach, or environment issue into one row in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md with exactly one pinned category and one pinned seam
  • T009 [US1] Classify any legacy route or panel-related group under legacy-cutover-regression-guard without restoring /admin/t/..., TenantPanelProvider, tenant-scoped provider fallback routes, or historical compatibility behavior
  • T010 [US1] Classify any provider/verification group under provider-verification-regression-guard without rewriting Spec 294; only mark it in-scope if the failure is a direct CI/lane contract defect rather than provider runtime behavior
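
The T006/T007 fallback rule can be sketched as a small decision function. This is only an illustration: `fallback_commands` is a hypothetical name, and whether a raw result counts as classifiable remains a human judgment recorded in failure-classification.md.

```python
# Lane names taken from T007; the wrapper path is the one named in this task list.
FALLBACK_LANES = ("fast-feedback", "confidence", "heavy-governance", "browser")


def fallback_commands(raw_result_classifiable: bool) -> list[str]:
    """Hypothetical helper: lane commands to run after the raw full suite (T006).

    Returns an empty list when the raw run already produced a classifiable
    result, otherwise one wrapper command per fallback lane (T007).
    """
    if raw_result_classifiable:
        return []
    return [f"./scripts/platform-test-lane {lane}" for lane in FALLBACK_LANES]
```

The function only plans the commands; running them and recording each lane outcome stays a manual step.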

Phase 3: User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1)

Goal: Prove existing CI wrappers, reports, artifacts, budgets, and failure classes are interpretable after the suite run.

Independent Test: every lane either passes with complete report/artifact output or fails with the correct primary failure class.

  • T011 [US2] Run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback and classify report, budget, trend, and artifact status in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T012 [US2] Run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence and classify report, budget, trend, and artifact status in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T013 [US2] Run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance and classify report, budget, trend, and artifact status in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T014 [US2] Run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser and classify report, budget, trend, and artifact status in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T015 [P] [US2] If machine-readable confidence output is needed for follow-up ownership, run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit and classify the JUnit support result in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md (not run separately because the lane wrappers produced the needed JUnit artifacts)
  • T016 [P] [US2] If artifact publication is suspected, run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-fast-feedback-artifacts (or the equivalent command for the affected lane) and classify any missing required artifacts under artifact-publication-regression
  • T017 [US2] Verify existing failure classes from /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php classify lane outcomes as test-failure, wrapper-failure, budget-breach, artifact-publication-failure, or infrastructure-failure, and record mismatches in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
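
The T017 comparison can be pictured as follows. This is a sketch under assumptions: it does not reproduce the logic in TestLaneReport.php, the function name `class_mismatches` is hypothetical, and `None` is used here to mean "lane passed".

```python
from typing import Optional

# Failure classes named in T017.
FAILURE_CLASSES = frozenset({
    "test-failure",
    "wrapper-failure",
    "budget-breach",
    "artifact-publication-failure",
    "infrastructure-failure",
})


def class_mismatches(
    outcomes: dict[str, tuple[Optional[str], Optional[str]]],
) -> dict[str, str]:
    """Hypothetical helper: compare reported vs. observed failure class per lane.

    `outcomes` maps a lane name to (reported_class, observed_class).
    """
    mismatches = {}
    for lane, (reported, observed) in outcomes.items():
        for label in (reported, observed):
            if label is not None and label not in FAILURE_CLASSES:
                raise ValueError(f"unknown failure class for {lane}: {label!r}")
        if reported != observed:
            mismatches[lane] = (
                f"reported {reported or 'pass'}, observed {observed or 'pass'}"
            )
    return mismatches
```

Each entry in the returned mapping corresponds to one mismatch note to record in failure-classification.md.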

Phase 4: User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1)

Goal: Keep Spec 295 limited to CI signal readiness by splitting product/runtime failures into explicit follow-up ownership.

Independent Test: every non-CI failure group has a follow-up recommendation, owner, or environment disposition.

  • T018 [US3] For each row classified as product-runtime-or-test-regression, decide whether it is a follow-up spec, lane-specific debt, or active feature blocker, then record the decision in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T019 [US3] For each row classified as browser-lane-regression, record the affected browser file under /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Browser/, whether the failure is smoke/environment/product behavior, and the follow-up path in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
  • T020 [US3] For each row classified as flaky-or-environment, rerun the narrowest affected command once when safe and record the rerun evidence or environment blocker in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md (no flaky/environment row was identified)
  • T021 [US3] Confirm no failure group is being fixed under 295 solely because it is small or nearby; it must be directly tied to CI wrapper, manifest, report, artifact, or budget/trend contract drift
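
The Phase 4 independent test can be approximated with a short check. The sketch below assumes rows are plain dictionaries with hypothetical keys (`id`, `category`, `follow_up`, `owner`, `environment_disposition`), and it assumes that ci-signal-restored and resolved-or-not-needed rows need no follow-up; neither assumption comes from the spec.

```python
# Categories that T021 ties directly to the CI wrapper, manifest, report,
# artifact, or budget/trend contract.
IN_SCOPE_CATEGORIES = frozenset({
    "ci-wrapper-or-manifest-regression",
    "artifact-publication-regression",
    "budget-or-trend-baseline-drift",
})

# Assumption: rows in these categories are already closed.
CLOSED_CATEGORIES = frozenset({"ci-signal-restored", "resolved-or-not-needed"})

# Hypothetical field names for the three accepted dispositions.
DISPOSITION_FIELDS = ("follow_up", "owner", "environment_disposition")


def rows_missing_disposition(rows: list[dict]) -> list[str]:
    """Return ids of non-CI rows with no follow-up, owner, or environment disposition."""
    missing = []
    for row in rows:
        category = row["category"]
        if category in IN_SCOPE_CATEGORIES or category in CLOSED_CATEGORIES:
            continue
        if not any(row.get(field) for field in DISPOSITION_FIELDS):
            missing.append(row["id"])
    return missing
```

An empty result means every split-out failure group has an explicit disposition.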

Phase 5: User Story 4 - Apply Only Small CI-Signal Fixes (Priority: P2)

Goal: Correct narrow CI/lane contract defects only when classification proves they block a trustworthy CI signal.

Independent Test: the directly affected lane/report/artifact guard passes after the minimal fix, and unrelated red groups remain classified.

  • T022 [US4] If a ci-wrapper-or-manifest-regression row is proven, apply the minimal correction in /Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane, /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json, /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php, or the directly affected guard test under /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/ (not applicable: no ci-wrapper-or-manifest-regression row was proven)
  • T023 [US4] If an artifact-publication-regression row is proven, apply the minimal correction in /Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts, /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php, /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php, or the directly affected artifact guard test
  • T024 [US4] If a budget-or-trend-baseline-drift row is proven, update only the documented budget/trend baseline owner in /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneBudget.php, /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php, or the directly affected guard test, and only when the classification row explains why the evidence supports the change (not applicable: no budget/trend baseline rewrite was justified)
  • T025 [US4] Add or adjust Pest coverage only when a CI/lane contract defect was fixed, keeping tests under /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/ or /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Unit/Support/ and avoiding new browser/heavy families by default
  • T026 [US4] Re-run the narrowest affected lane/report/artifact command after any CI/lane fix and update /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md with the final status
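
T022 through T024 each pair one proven category with a short list of files that may be touched. A minimal sketch of that gate is shown below, with paths written relative to the repository root; `allowed_targets` is a hypothetical name, and the "directly affected guard test" is left out because it cannot be named in advance.

```python
# Category -> files that T022-T024 allow a minimal correction in.
FIX_TARGETS = {
    "ci-wrapper-or-manifest-regression": (
        "scripts/platform-test-lane",
        "apps/platform/composer.json",
        "apps/platform/tests/Support/TestLaneManifest.php",
    ),
    "artifact-publication-regression": (
        "scripts/platform-test-artifacts",
        "apps/platform/tests/Support/TestLaneReport.php",
        "apps/platform/tests/Support/TestLaneManifest.php",
    ),
    "budget-or-trend-baseline-drift": (
        "apps/platform/tests/Support/TestLaneBudget.php",
        "apps/platform/tests/Support/TestLaneManifest.php",
    ),
}


def allowed_targets(proven_categories: list[str]) -> set[str]:
    """Hypothetical helper: union of files that proven rows permit editing.

    An empty set means no in-scope defect was proven and Phase 5 is skipped.
    """
    targets: set[str] = set()
    for category in proven_categories:
        targets.update(FIX_TARGETS.get(category, ()))
    return targets
```

Categories outside the three keys contribute nothing, which mirrors the rule that product, browser, and environment groups are never fixed under Spec 295.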

Phase 6: Final Readiness Decision and Validation

Purpose: Publish one final CI readiness decision and prove no unclassified failure or hidden scope expansion remains.

  • T027 Review /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md and confirm every row has category, seam, observed command, candidate owner, fix-in-295 decision, follow-up, and status
  • T028 Set the final readiness decision in /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md to exactly one of restored-ci-signal, classified-follow-up-required, or blocked-by-environment
  • T029 Re-run the final narrowest proof command set for the decision: raw full suite if classifiable, otherwise the exact affected lane/report commands from Phases 2 through 5
  • T030 Run export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent) if any PHP or script-adjacent PHP files changed
  • T031 Confirm Filament remains v5 on Livewire v4, provider registration remains in /Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/bootstrap/providers.php, no globally searchable resource changed, no destructive action changed, no asset registration changed, no /admin/t/... route or TenantPanelProvider behavior was restored, and no Specs 293 or 294 artifact was rewritten
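
The T027 and T028 checks are mechanical enough to sketch. The field keys and function names below are hypothetical stand-ins for whatever layout failure-classification.md actually uses; only the seven required columns and the three decision values come from the tasks above.

```python
# Columns required by T027 (key spellings are assumptions).
REQUIRED_FIELDS = (
    "category",
    "seam",
    "observed_command",
    "candidate_owner",
    "fix_in_295",
    "follow_up",
    "status",
)

# Decision values pinned by T028.
READINESS_DECISIONS = frozenset({
    "restored-ci-signal",
    "classified-follow-up-required",
    "blocked-by-environment",
})


def incomplete_rows(rows: list[dict]) -> dict[str, list[str]]:
    """Map row id -> missing fields; a value of False counts as filled in."""
    problems = {}
    for row in rows:
        missing = [
            field for field in REQUIRED_FIELDS
            if field not in row or row[field] in (None, "")
        ]
        if missing:
            problems[row["id"]] = missing
    return problems


def validate_decision(decision: str) -> str:
    """Reject anything other than one of the three pinned readiness decisions."""
    if decision not in READINESS_DECISIONS:
        raise ValueError(f"unknown readiness decision: {decision!r}")
    return decision
```

An empty mapping from `incomplete_rows` plus a decision that passes `validate_decision` corresponds to T027 and T028 both being satisfied.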

Dependencies & Execution Order

  • Phase 1 must complete before any suite command.
  • Phase 2 must classify the raw suite output or the fallback lane output before any fix work begins.
  • Phase 3 depends on Phase 2 because lane reports must be interpreted against observed lane outcomes.
  • Phase 4 depends on the failure group inventory from Phases 2 and 3.
  • Phase 5 depends on classified CI/lane contract defects; skip it entirely if no in-scope CI/lane defect is proven.
  • Phase 6 depends on all classification and any bounded fixes.
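
The ordering rules above form a small dependency graph. The sketch below uses Python's standard `graphlib` module to derive a valid execution order; the exact dependency sets are one reading of the bullets (in particular, Phase 5 is assumed to depend on Phases 2 and 3 only), and `execution_order` is a hypothetical name.

```python
from graphlib import TopologicalSorter

# Phase -> phases that must finish first (an interpretation of the list above).
PHASE_DEPENDENCIES = {
    1: set(),
    2: {1},
    3: {2},
    4: {2, 3},
    5: {2, 3},
    6: {1, 2, 3, 4, 5},
}


def execution_order(ci_defect_proven: bool) -> list[int]:
    """Return one valid phase order; Phase 5 is dropped when no defect is proven."""
    graph = {phase: set(deps) for phase, deps in PHASE_DEPENDENCIES.items()}
    if not ci_defect_proven:
        # Skip Phase 5 entirely, as the dependency list requires.
        graph.pop(5)
        for deps in graph.values():
            deps.discard(5)
    return list(TopologicalSorter(graph).static_order())
```

When a CI/lane defect is proven, Phases 4 and 5 have no ordering constraint between them in this reading, so either may appear first in the result.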

Parallel Execution Examples

  • T002, T003, and T004 (all marked [P]) can run in parallel after T001.
  • T011 through T014 can run independently after their corresponding lane outputs exist.
  • T018 through T020 can be split by failure group once T008 has created the grouped inventory.
  • T022 through T024 must not run until a corresponding classification row proves the in-scope defect.

Implementation Strategy

Suggested MVP Scope

MVP = Phases 1 through 4. That is enough to answer whether the suite is green or which follow-up owns each red group. Phase 5 runs only when classification proves a narrow CI/lane contract defect.

Incremental Delivery

  1. Lock scope and read prior stabilization artifacts.
  2. Run raw full suite or fallback lane split.
  3. Classify every red group.
  4. Validate lane/report/artifact signal.
  5. Split product/runtime failures to follow-up ownership.
  6. Apply only proven CI/lane fixes.
  7. Publish the final readiness decision.

Explicit Follow-Ups / Out of Scope

  • Product/runtime failing-test repair outside CI/lane contract defects
  • Browser UI repair
  • Package Execution
  • Guided Operations
  • Microsoft Starter Pack
  • Virtual Consultant
  • Tenant cutover rework
  • Provider/verification runtime expansion beyond Spec 294
  • New permanent CI lane or framework
  • Historical-spec cleanup