## Summary

- add the Spec 295 artifacts for full-suite failure classification and CI lane baseline work
- fix `scripts/platform-test-artifacts` so Sail passes artifact staging inputs into the embedded PHP script via argv
- add a guard test covering the artifact staging input contract

## Scope guards

- no browser screenshot baselines included
- no generated test artifacts included
- no runtime application code changes included

## Notes

- classification evidence and follow-up ownership are documented in `specs/295-full-suite-ci-baseline/failure-classification.md`
- this PR is intentionally limited to the CI/lane/artifact contract slice for Spec 295

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #350
Tasks: Full Suite Failure Classification & CI Lane Baseline
Input: Design documents from /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/
Prerequisites: spec.md, plan.md, research.md, data-model.md, quickstart.md, failure-classification.md, checklists/requirements.md
Review Artifact: /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md
Failure Inventory: /Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md
Review Metadata
- Review outcome class: `acceptable-special-case`
- Workflow outcome: `keep`
- Test-governance outcome: `keep`
- Stop / split triggers: broad product/runtime repair, new CI framework, new permanent lane, new browser family, new heavy-governance family, runtime application changes, Filament resource/page changes, route restoration, TenantPanelProvider restoration, `/admin/t/...` restoration, provider/verification runtime expansion, historical-spec rewrite, or budget relaxation without classification evidence
Pinned Failure-Classification Categories
- `ci-signal-restored`
- `ci-wrapper-or-manifest-regression`
- `artifact-publication-regression`
- `budget-or-trend-baseline-drift`
- `product-runtime-or-test-regression`
- `browser-lane-regression`
- `flaky-or-environment`
- `follow-up-spec-required`
- `resolved-or-not-needed`
Pinned CI / Suite Seams
- `raw-full-suite`
- `fast-feedback-lane`
- `confidence-lane`
- `heavy-governance-lane`
- `browser-lane`
- `profiling-or-junit-support`
- `lane-reporting`
- `artifact-publication`
- `budget-trend-baseline`
- `legacy-cutover-regression-guard`
- `provider-verification-regression-guard`
Test Governance Checklist
- Lane assignment is named and is the narrowest sufficient proof for each observed failure group.
- New or changed tests stay in the smallest honest family, and any heavy-governance or browser addition is explicit.
- Shared helpers, factories, seeds, fixtures, and context defaults stay cheap by default; any widening is isolated or documented.
- Planned validation commands cover the change without pulling in unrelated lane cost beyond classification.
- The declared surface test profile or `standard-native-filament` relief is explicit.
- Any material budget, baseline, trend, or escalation note is recorded in `failure-classification.md`.
Phase 1: Setup and Scope Lock
Purpose: Confirm Spec 295 remains a classification and CI lane baseline package before any suite command runs.
- T001 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/plan.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/research.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/quickstart.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md` before changing runtime or tests
- T002 [P] Confirm current branch, working tree, and baseline diff using `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch` and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat`, then record any pre-existing changes in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T003 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/293-post-cutover-suite-stabilization/failure-classification.md` and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/294-provider-verification-runtime-semantics/failure-classification.md` as context only, confirming no task edits are made to Specs `293` or `294`
- T004 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-report`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` to confirm current lane entry points and failure classes
- T005 Confirm the explicit forbidden scope in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`: no TenantPanelProvider restoration, no `/admin/t/...` restoration, no broad product repair, and no historical-spec rewrite
Phase 2: User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1)
Goal: Establish the raw full-suite readiness signal or an explicit fallback split before any fix work begins.
Independent Test: the raw full-suite result or fallback lane split is represented by classified rows in failure-classification.md, with no red group left unclassified.
- T006 [US1] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)` and record pass/fail counts, failing files, and any timeout/noisy-output reason in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T007 [US1] If T006 cannot produce a classifiable result, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance`, and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser`, then record each lane outcome in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T008 [US1] Group every failing test file, assertion cluster, wrapper error, report error, artifact error, budget breach, or environment issue into one row in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with exactly one pinned category and one pinned seam
- T009 [US1] Classify any legacy route or panel-related group under `legacy-cutover-regression-guard` without restoring `/admin/t/...`, TenantPanelProvider, tenant-scoped provider fallback routes, or historical compatibility behavior
- T010 [US1] Classify any provider/verification group under `provider-verification-regression-guard` without rewriting Spec `294`; only mark it in-scope if the failure is a direct CI/lane contract defect rather than provider runtime behavior
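The "exactly one pinned category and one pinned seam" rule from T008 can be checked mechanically. The following is a minimal sketch rather than part of the repository: the helper names `in_list` and `validate_row` are hypothetical, and the two lists are copied from the pinned sections of this document.

```shell
# Hypothetical helper, not part of the repository scripts.
# The lists mirror the pinned categories and seams named in this document.
PINNED_CATEGORIES="ci-signal-restored ci-wrapper-or-manifest-regression artifact-publication-regression budget-or-trend-baseline-drift product-runtime-or-test-regression browser-lane-regression flaky-or-environment follow-up-spec-required resolved-or-not-needed"
PINNED_SEAMS="raw-full-suite fast-feedback-lane confidence-lane heavy-governance-lane browser-lane profiling-or-junit-support lane-reporting artifact-publication budget-trend-baseline legacy-cutover-regression-guard provider-verification-regression-guard"

# Returns 0 when $1 equals exactly one word of the space-separated list in $2.
in_list() {
  needle="$1"
  for item in $2; do
    [ "$item" = "$needle" ] && return 0
  done
  return 1
}

# Prints "ok" when the row uses one pinned category and one pinned seam;
# otherwise prints the offending value and returns 1.
validate_row() {
  category="$1"
  seam="$2"
  in_list "$category" "$PINNED_CATEGORIES" || { echo "unknown category: $category"; return 1; }
  in_list "$seam" "$PINNED_SEAMS" || { echo "unknown seam: $seam"; return 1; }
  echo "ok"
}
```

For example, `validate_row browser-lane-regression browser-lane` prints `ok`, while a value that is not in the pinned lists, or two values passed as one string, is rejected.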
Phase 3: User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1)
Goal: Prove existing CI wrappers, reports, artifacts, budgets, and failure classes are interpretable after the suite run.
Independent Test: every lane either passes with complete report/artifact output or fails with the correct primary failure class.
- T011 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T012 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T013 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T014 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T015 [P] [US2] If machine-readable confidence output is needed for follow-up ownership, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit` and classify the JUnit support result in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (not run separately because the lane wrappers produced the needed JUnit artifacts)
- T016 [P] [US2] If artifact publication is suspected, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-fast-feedback-artifacts` or the matching affected lane and classify any missing required artifacts under `artifact-publication-regression`
- T017 [US2] Verify existing failure classes from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` classify lane outcomes as `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, or `infrastructure-failure`, and record mismatches in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
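T011 through T014 run the same report command once per lane, so they can be sketched as a loop. This is an illustrative sketch only: `report_commands`, `run_reports`, and `is_known_failure_class` are hypothetical names, the `PATH` export from the task text is omitted for brevity, and `run_reports` assumes it is invoked from the repository root.

```shell
# Lanes named in T011-T014.
LANES="fast-feedback confidence heavy-governance browser"

# Prints the report command for each lane without running anything (dry run).
report_commands() {
  for lane in $LANES; do
    printf './scripts/platform-test-report %s\n' "$lane"
  done
}

# Runs every lane report and keeps going after a failure, so each lane's exit
# status can be recorded in failure-classification.md.
run_reports() {
  for lane in $LANES; do
    status=0
    ./scripts/platform-test-report "$lane" || status=$?
    printf '%s exit=%s\n' "$lane" "$status"
  done
}

# Returns 0 when $1 is one of the failure classes listed in T017.
is_known_failure_class() {
  case "$1" in
    test-failure|wrapper-failure|budget-breach|artifact-publication-failure|infrastructure-failure) return 0 ;;
    *) return 1 ;;
  esac
}
```

Continuing past a failing lane is a deliberate choice here: the goal of this phase is a complete classification, so one red lane should not hide the status of the others.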
Phase 4: User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1)
Goal: Keep Spec 295 limited to CI signal readiness by splitting product/runtime failures into explicit follow-up ownership.
Independent Test: every non-CI failure group has a follow-up recommendation, owner, or environment disposition.
- T018 [US3] For each row classified as `product-runtime-or-test-regression`, decide whether it is a follow-up spec, lane-specific debt, or active feature blocker, then record the decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T019 [US3] For each row classified as `browser-lane-regression`, record the affected browser file under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Browser/`, whether the failure is smoke/environment/product behavior, and the follow-up path in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- T020 [US3] For each row classified as `flaky-or-environment`, rerun the narrowest affected command once when safe and record the rerun evidence or environment blocker in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (no flaky/environment row was identified)
- T021 [US3] Confirm no failure group is being fixed under `295` solely because it is small or nearby; it must be directly tied to CI wrapper, manifest, report, artifact, or budget/trend contract drift
Phase 5: User Story 4 - Apply Only Small CI-Signal Fixes (Priority: P2)
Goal: Correct narrow CI/lane contract defects only when classification proves they block a trustworthy CI signal.
Independent Test: the directly affected lane/report/artifact guard passes after the minimal fix, and unrelated red groups remain classified.
- T022 [US4] If a `ci-wrapper-or-manifest-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` (not applicable: no `ci-wrapper-or-manifest-regression` row was proven)
- T023 [US4] If an `artifact-publication-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected artifact guard test
- T024 [US4] If a `budget-or-trend-baseline-drift` row is proven, update only the documented budget/trend baseline owner in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneBudget.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test when the classification row explains why the evidence supports the change (not applicable: no budget/trend baseline rewrite was justified)
- T025 [US4] Add or adjust Pest coverage only when a CI/lane contract defect was fixed, keeping tests under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` or `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Unit/Support/` and avoiding new browser/heavy families by default
- T026 [US4] Re-run the narrowest affected lane/report/artifact command after any CI/lane fix and update `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with the final status
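The category-to-file mapping in T022 through T024 works as an allowlist: a proven row unlocks a small set of files and nothing else. A rough sketch of that gate follows. `allowed_fix_targets` is a hypothetical helper, and paths are written relative to the repository root rather than as the absolute paths used in the task list. The "directly affected guard test" that T023 and T024 also allow is left out because the tasks do not name its path.

```shell
# Prints the files a proven classification row may touch, one per line.
# Categories with no in-scope fix under Spec 295 print nothing and return 1.
allowed_fix_targets() {
  case "$1" in
    ci-wrapper-or-manifest-regression)
      printf '%s\n' \
        scripts/platform-test-lane \
        apps/platform/composer.json \
        apps/platform/tests/Support/TestLaneManifest.php \
        apps/platform/tests/Feature/Guards/
      ;;
    artifact-publication-regression)
      printf '%s\n' \
        scripts/platform-test-artifacts \
        apps/platform/tests/Support/TestLaneReport.php \
        apps/platform/tests/Support/TestLaneManifest.php
      ;;
    budget-or-trend-baseline-drift)
      printf '%s\n' \
        apps/platform/tests/Support/TestLaneBudget.php \
        apps/platform/tests/Support/TestLaneManifest.php
      ;;
    *)
      # Product, browser, flaky, and follow-up rows are split out, not fixed here.
      return 1
      ;;
  esac
}
```

A `product-runtime-or-test-regression` row therefore yields no targets, which matches the rule that such failures are split into follow-up ownership instead of being repaired under Spec 295.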
Phase 6: Final Readiness Decision and Validation
Purpose: Publish one final CI readiness decision and prove no unclassified failure or hidden scope expansion remains.
- T027 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` and confirm every row has category, seam, observed command, candidate owner, fix-in-295 decision, follow-up, and status
- T028 Set the final readiness decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` to exactly one of `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`
- T029 Re-run the final narrowest proof command set for the decision: raw full suite if classifiable, otherwise the exact affected lane/report commands from Phases 2 through 5
- T030 Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)` if any PHP or script-adjacent PHP files changed
- T031 Confirm Filament remains v5 on Livewire v4, provider registration remains in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/bootstrap/providers.php`, no globally searchable resource changed, no destructive action changed, no asset registration changed, no `/admin/t/...` route or TenantPanelProvider behavior was restored, and no Specs `293` or `294` artifact was rewritten
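The T027 and T028 checks are simple enough to script. The sketch below makes an assumption that this document does not confirm: each inventory row is a markdown table row with exactly seven cells (category, seam, observed command, candidate owner, fix-in-295 decision, follow-up, status). The helper names are hypothetical.

```shell
# Returns 0 when a markdown table row such as
#   | category | seam | command | owner | fix-in-295 | follow-up | status |
# has exactly seven non-empty cells. The seven-column layout is an assumption.
row_is_complete() {
  printf '%s\n' "$1" | awk -F'|' '
    {
      filled = 0
      # Fields 2..NF-1 are the cells between the outer pipes.
      for (i = 2; i < NF; i++) {
        cell = $i
        gsub(/^[ \t]+|[ \t]+$/, "", cell)
        if (cell != "") filled++
      }
      # Seven cells plus the two empty outer fields gives NF == 9.
      exit !(filled == 7 && NF == 9)
    }'
}

# Returns 0 when $1 is one of the three readiness decisions allowed by T028.
is_valid_decision() {
  case "$1" in
    restored-ci-signal|classified-follow-up-required|blocked-by-environment) return 0 ;;
    *) return 1 ;;
  esac
}
```

A row with an empty owner or status cell fails `row_is_complete`, which is the condition T027 is meant to catch before the final decision is set.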
Dependencies & Execution Order
- Phase 1 must complete before any suite command.
- Phase 2 must classify raw suite or fallback lane output before any fix work.
- Phase 3 depends on Phase 2 because lane reports must be interpreted against observed lane outcomes.
- Phase 4 depends on the failure group inventory from Phases 2 and 3.
- Phase 5 depends on classified CI/lane contract defects; skip it entirely if no in-scope CI/lane defect is proven.
- Phase 6 depends on all classification and any bounded fixes.
Parallel Execution Examples
- T003 and T004 can run in parallel after T001.
- T011 through T014 can run independently after their corresponding lane outputs exist.
- T018 through T020 can be split by failure group once T008 has created the grouped inventory.
- T022 through T024 must not run until a corresponding classification row proves the in-scope defect.
Implementation Strategy
Suggested MVP Scope
MVP = Phases 1 through 4. That is enough to answer whether the suite is green or which follow-up owns each red group. Phase 5 runs only when classification proves a narrow CI/lane contract defect.
Incremental Delivery
- Lock scope and read prior stabilization artifacts.
- Run raw full suite or fallback lane split.
- Classify every red group.
- Validate lane/report/artifact signal.
- Split product/runtime failures to follow-up ownership.
- Apply only proven CI/lane fixes.
- Publish the final readiness decision.
Explicit Follow-Ups / Out of Scope
- Product/runtime failing-test repair outside CI/lane contract defects
- Browser UI repair
- Package Execution
- Guided Operations
- Microsoft Starter Pack
- Virtual Consultant
- Tenant cutover rework
- Provider/verification runtime expansion beyond Spec `294`
- New permanent CI lane or framework
- Historical-spec cleanup