## Summary

- add the Spec 295 artifacts for full-suite failure classification and CI lane baseline work
- fix `scripts/platform-test-artifacts` so Sail passes artifact staging inputs into the embedded PHP script via argv
- add a guard test covering the artifact staging input contract

## Scope guards

- no browser screenshot baselines included
- no generated test artifacts included
- no runtime application code changes included

## Notes

- classification evidence and follow-up ownership are documented in `specs/295-full-suite-ci-baseline/failure-classification.md`
- this PR is intentionally limited to the CI/lane/artifact contract slice for Spec 295

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #350
Feature Specification: Full Suite Failure Classification & CI Lane Baseline
Feature Branch: 295-full-suite-ci-baseline
Created: 2026-05-11
Status: Ready
Input: User description: "Spec 295 - Full Suite Failure Classification & CI Lane Baseline. After Specs 293 and 294, run a full-suite classification to determine whether the full platform suite is again a reliable CI signal or whether remaining failures must be classified into separate follow-up specs or lanes. Do not blindly fix the full suite, do not scope-creep, do not reopen tenant cutover, do not restore legacy /admin/t/... or TenantPanelProvider behavior, and perform only small clearly in-scope fixes."
Spec Candidate Check (mandatory - SPEC-GATE-001)
- Problem: Specs `293` and `294` closed the known post-cutover route/action-surface and ProviderConnections/Verification failure blocks, but the complete platform suite has not yet been classified as a restored CI signal. Maintainers need one bounded pass that distinguishes green signal, CI wrapper or lane baseline failures, remaining product regressions, flaky or environment failures, and follow-up-spec debt.
- Today's failure: targeted lanes can be green while the raw full suite or CI lane wrappers may still fail for unrelated product debt, wrapper/report/artifact drift, budget baseline changes, browser-specific fallout, or environment-only failures. Without classification, future work cannot tell whether a red run means "fix this PR", "rerun because infrastructure failed", "update lane baseline", or "open a follow-up spec".
- User-visible improvement: maintainers get an attributable CI readiness decision: either the complete platform suite is a reliable blocking signal again, or every remaining red group is explicitly assigned to the right lane, owner, and follow-up path without reviving retired tenant routes or reopening Specs `293` and `294`.
- Smallest enterprise-capable version: one classification-first package that runs the raw full suite or its explicit fallback lane split, records every failing group in `failure-classification.md`, validates existing lane wrappers/report/artifact contracts, applies only small CI-signal fixes when the failure is clearly in scope, and records all product/runtime failures as follow-up candidates instead of absorbing them.
- Explicit non-goals: no broad full-suite repair, no tenant-cutover rework, no TenantPanelProvider reactivation, no `/admin/t/...` route restoration, no provider/verification runtime expansion beyond Spec `294`, no new CI framework, no new permanent test lane by default, no new browser family, no new runtime persistence, no UI redesign, no product feature work, no unrelated failing-test cleanup, and no historical-spec rewrites.
- Permanent complexity imported: one spec-local `failure-classification.md` artifact, one bounded failure-category inventory, one bounded CI/lane seam inventory, and focused tasks against existing test lane scripts, lane manifest/report support, and current Pest lane commands. No runtime table, model, enum, provider abstraction, Filament resource, or product surface is introduced.
- Why now: after `293` and `294`, the next quality question is no longer one known red cluster. It is whether CI can be trusted again as a whole. If this is not classified now, later specs will either over-trust a partially red suite or keep rediscovering unrelated failures as local surprises.
- Why not local: the signal spans raw Pest execution, `scripts/platform-test-lane`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, `Tests\Support\TestLaneManifest`, `Tests\Support\TestLaneReport`, browser isolation, heavy-governance budget/reporting, and current workflow profiles. A one-file patch would not prove CI readiness.
- Approval class: Cleanup
- Red flags triggered: full-suite scope, cross-cutting test governance, and possible temptation to repair unrelated product failures. Defense: this spec is classification-first, uses existing lane/failure-class contracts, imports only a spec-local artifact, and forbids broad repair or legacy route restoration.
- Score: Benefit: 2 | Urgency: 2 | Scope: 2 | Complexity: 1 | Product proximity: 1 | Reuse: 2 | Total: 10/12
- Decision: approve
Review Outcome
- Outcome class: `acceptable-special-case`
- Workflow outcome: `keep`
- Test-governance outcome: `keep`
- Reason: full-suite work is normally too broad, but this package is justified because it is a classification and CI-signal baseline pass after two completed stabilization slices, not a fix-all implementation.
- Workflow result: Ready for implementation as one bounded suite-signal classification package after Specs `293` and `294`.
Candidate Selection Gate
- Selected candidate: Full Suite Failure Classification & CI Lane Baseline
- Source location: explicit user-provided manual follow-up after `specs/293-post-cutover-suite-stabilization/` and `specs/294-provider-verification-runtime-semantics/`
- Why selected now: the known cutover and provider/verification red blocks have been stabilized, so the remaining decision is whether the full platform suite and lane wrappers now form a trustworthy CI signal.
- Why close alternatives were deferred:
  - reopening Spec `293` would blur route/action-surface cutover cleanup with full-suite CI readiness
  - reopening Spec `294` would blur provider/verification runtime semantics with unrelated suite failures
  - starting Package Execution, Guided Operations, Microsoft Starter Pack, or Virtual Consultant would hide CI uncertainty under new product work
  - creating a new permanent full-suite lane would import CI framework complexity before proving the existing lanes are insufficient
  - fixing every failing test in one pass would scope-creep beyond classification and make follow-up ownership unclear
- Roadmap relationship: test-governance and platform quality follow-through under `TEST-GOV-001`; this is not a new product roadmap lane and not an automatic active queue promotion.
- Completed-spec guardrail result: Specs `293` and `294` are context only and are excluded from refresh. Spec `294` carries implementation close-out evidence. Spec `293` is treated as the completed post-cutover baseline described by the user and its failure-classification history is preserved; this spec does not rewrite 293 tasks or close-out history. Specs `287` and `288` remain prior cutover and no-legacy guard context only.
- Smallest viable implementation slice: run the full suite or explicit lane split, classify every remaining failure group, validate CI wrapper/report/artifact contracts, and perform only small CI-signal fixes that do not change product behavior.
- Proposed concise feature description to feed into specify: Classify the full platform test suite after Specs 293 and 294 and establish whether existing CI lanes provide a trustworthy baseline, while splitting unrelated failures into explicit follow-up ownership instead of repairing the suite blindly.
Pinned Failure-Classification Categories
- `ci-signal-restored`
- `ci-wrapper-or-manifest-regression`
- `artifact-publication-regression`
- `budget-or-trend-baseline-drift`
- `product-runtime-or-test-regression`
- `browser-lane-regression`
- `flaky-or-environment`
- `follow-up-spec-required`
- `resolved-or-not-needed`
Pinned CI / Suite Seams
- `raw-full-suite`
- `fast-feedback-lane`
- `confidence-lane`
- `heavy-governance-lane`
- `browser-lane`
- `profiling-or-junit-support`
- `lane-reporting`
- `artifact-publication`
- `budget-trend-baseline`
- `legacy-cutover-regression-guard`
- `provider-verification-regression-guard`
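As a rough illustration of how the two pinned vocabularies above could be enforced while reviewing a classification row, the sketch below checks a candidate value against each list. The helper names `in_list`, `is_pinned_category`, and `is_pinned_seam` are hypothetical and are not part of the existing wrapper scripts; only the category and seam identifiers are taken from this spec.

```shell
#!/bin/sh
# Illustrative sketch only: helper names are hypothetical, not repo scripts.
# The identifiers are the pinned categories and seams listed in this spec.

PINNED_CATEGORIES="ci-signal-restored ci-wrapper-or-manifest-regression artifact-publication-regression budget-or-trend-baseline-drift product-runtime-or-test-regression browser-lane-regression flaky-or-environment follow-up-spec-required resolved-or-not-needed"

PINNED_SEAMS="raw-full-suite fast-feedback-lane confidence-lane heavy-governance-lane browser-lane profiling-or-junit-support lane-reporting artifact-publication budget-trend-baseline legacy-cutover-regression-guard provider-verification-regression-guard"

# Return 0 when $1 appears as a whole word in the space-separated list $2.
in_list() {
  case " $2 " in
    *" $1 "*) return 0 ;;
    *) return 1 ;;
  esac
}

is_pinned_category() { in_list "$1" "$PINNED_CATEGORIES"; }
is_pinned_seam() { in_list "$1" "$PINNED_SEAMS"; }
```

For example, `is_pinned_category flaky-or-environment` succeeds, while `is_pinned_seam ci-signal-restored` fails because that identifier is a category rather than a seam.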
Spec Scope Fields (mandatory)
- Scope: repository / CI test-governance workflow
- Primary Routes: N/A - no application routes or operator-facing navigation are added or restored. Retired `/admin/t/...` routes and TenantPanelProvider behavior remain forbidden.
- Data Ownership:
  - no new application persistence is introduced
  - no runtime source of truth is introduced
  - `failure-classification.md` is a spec-local implementation artifact and is not product/runtime truth
  - existing test lane truth remains in `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneReport.php`, and the wrapper scripts under `scripts/`
- RBAC:
  - no authorization model changes are introduced
  - existing workspace and managed-environment isolation tests remain ordinary suite participants
  - if a failing group concerns RBAC, it must be classified as product/runtime debt or a follow-up spec unless it is clearly only a stale CI/lane assertion
For canonical-view specs, the spec MUST define:
- Default filter behavior when tenant-context is active: N/A - no canonical-view application surface is added or changed.
- Explicit entitlement checks preventing cross-tenant leakage: N/A for this prep package. Any suite failure suggesting leakage must be classified as product-runtime debt and not hidden as a lane issue.
Cross-Cutting / Shared Pattern Reuse (mandatory when the feature touches notifications, status messaging, action links, header actions, dashboard signals/cards, alerts, navigation entry points, evidence/report viewers, or any other existing shared operator interaction family; otherwise write N/A - no shared interaction family touched)
- Cross-cutting feature?: yes
- Interaction class(es): CI lane execution, full-suite signal classification, lane report generation, artifact publication, budget/trend baseline review, and follow-up-spec routing
- Systems touched:
  - `scripts/platform-test-lane`
  - `scripts/platform-test-report`
  - `scripts/platform-test-artifacts`
  - `apps/platform/composer.json`
  - `apps/platform/tests/Support/TestLaneManifest.php`
  - `apps/platform/tests/Support/TestLaneReport.php`
  - `apps/platform/tests/Support/TestLaneBudget.php`
  - `apps/platform/tests/Feature/Guards/TestLaneManifestTest.php`
  - `apps/platform/tests/Feature/Guards/CiLaneFailureClassificationContractTest.php`
  - `apps/platform/tests/Feature/Guards/CiFastFeedbackWorkflowContractTest.php`
  - `apps/platform/tests/Feature/Guards/CiConfidenceWorkflowContractTest.php`
  - `apps/platform/tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php`
  - existing lane-selected Pest tests and browser smoke files only as classification inputs unless a small CI-signal fix is proven
- Existing pattern(s) to extend: existing `TestLaneManifest` lane definitions, existing `TestLaneReport` failure classes, existing lane wrapper scripts, existing Gitea workflow profile metadata, existing report/artifact publication contracts
- Shared contract / presenter / builder / renderer to reuse: `TestLaneManifest::lanes()`, `TestLaneManifest::workflowProfiles()`, `TestLaneManifest::failureClasses()`, `TestLaneReport::classifyPrimaryFailure()`, `TestLaneReport::buildCiSummary()`, `TestLaneReport::artifactPublicationStatus()`, and `scripts/platform-test-*`
- Why the existing shared path is sufficient or insufficient: the repo already has explicit lane, failure-class, artifact, and budget contracts. Spec `295` must prove whether they are currently enough and fix only small contract drift; it must not create a new CI orchestration layer before existing contracts are classified.
- Allowed deviation and why: only a bounded CI/lane contract correction is allowed when a wrapper, manifest, report, artifact, or budget baseline defect prevents classification. Product/runtime failures must be classified and split instead of fixed here.
- Consistency impact: raw suite output, lane wrapper output, report artifacts, budget/trend summaries, and final follow-up classification must tell the same story about whether the suite is green, blocked, flaky, or split.
- Review focus: reviewers must verify that this spec does not become a general failing-test cleanup, does not restore tenant-cutover legacy behavior, and does not add a new permanent lane unless the artifacts explicitly prove existing lanes are insufficient.
OperationRun UX Impact (mandatory when the feature creates, queues, deduplicates, resumes, blocks, completes, or deep-links to an OperationRun; otherwise write N/A - no OperationRun start or link semantics touched)
- Touches OperationRun start/completion/link UX?: no
- Shared OperationRun UX contract/layer reused: N/A
- Delegated start/completion UX behaviors: N/A
- Local surface-owned behavior that remains: N/A
- Queued DB-notification policy: N/A
- Terminal notification path: N/A
- Exception required?: none
Provider Boundary / Platform Core Check (mandatory when the feature changes shared provider/platform seams, identity scope, governed-subject taxonomy, compare strategy selection, provider connection descriptors, or operator vocabulary that may leak provider-specific semantics into platform-core truth; otherwise write N/A - no shared provider/platform boundary touched)
- Shared provider/platform boundary touched?: no product boundary change
- Boundary classification: N/A
- Seams affected: provider and verification tests may fail during classification, but this spec may only classify them as regression or follow-up debt unless the failure is purely a CI/lane contract issue.
- Neutral platform terms preserved or introduced: `workspace`, `managed environment`, `provider connection`, `operation`, `lane`, `failure group`, `CI signal`
- Provider-specific semantics retained and why: N/A
- Why this does not deepen provider coupling accidentally: Spec `295` does not change provider runtime, provider identity, target-scope semantics, or provider copy. It treats provider-specific failures as test/runtime debt requiring explicit follow-up unless they are already covered by the completed Spec `294` seam and proven to be a small regression in the CI contract.
- Follow-up path: any real provider/verification product failure after Spec `294` must become a follow-up spec or explicitly named failure group, not hidden in this classification pass.
UI / Surface Guardrail Impact (mandatory when operator-facing surfaces are changed; otherwise write N/A)
N/A - no operator-facing surface change. Browser tests may be run as a lane signal only; visible UI repair is out of scope unless a later implementation explicitly stops and opens a follow-up spec.
Decision-First Surface Role (mandatory when operator-facing surfaces are changed)
N/A - no application decision surface is added or changed.
Audience-Aware Disclosure (mandatory when operator-facing surfaces are changed)
N/A - no application disclosure layer is added or changed.
UI/UX Surface Classification (mandatory when operator-facing surfaces are changed)
N/A - no Filament screen, table, widget, relation manager, or resource is added or materially refactored.
Operator Surface Contract (mandatory when operator-facing surfaces are changed)
N/A - no operator-facing page contract is introduced.
Proportionality Review (mandatory when structural complexity is introduced)
- New source of truth?: no runtime source of truth
- New persisted entity/table/artifact?: no application persistence; one spec-local `failure-classification.md` artifact is added for implementation tracking only
- New abstraction?: no
- New enum/state/reason family?: yes, one spec-local failure-classification category set used only inside this spec package
- New cross-domain UI framework/taxonomy?: no
- Current operator problem: maintainers need one reliable answer to whether the full suite is a usable CI signal after Specs `293` and `294`, and if not, exactly which lane or follow-up owns the remaining failures.
- Existing structure is insufficient because: targeted green lanes do not prove full-suite readiness, while raw red output without classification does not tell maintainers whether to fix, split, rerun, or update lane baseline artifacts.
- Narrowest correct implementation: add one spec-local failure-classification artifact, use existing lane wrappers and support classes, classify all remaining groups, and fix only small CI-signal defects that block classification.
- Ownership cost: low to moderate; maintain one temporary classification artifact and any small lane contract correction made during implementation.
- Alternative intentionally rejected: a new full-suite framework, broad test rewrite, or permanent new lane. Those options import durable complexity before the existing lane system is proven insufficient.
- Release truth: current-release CI/test-governance readiness only
Compatibility posture
This feature assumes a pre-production environment.
Backward compatibility, legacy aliases, route shims, TenantPanelProvider restoration, and compatibility-specific tests are out of scope. Canonical replacement remains preferred over preservation.
Testing / Lane / Runtime Impact (mandatory for runtime behavior changes)
- Test purpose / classification: Heavy-Governance, Feature, Browser, Support/JUnit, and full-suite classification
- Validation lane(s): raw full suite, fast-feedback, confidence, heavy-governance, browser, profiling/support when needed, junit/report/artifact publication when needed
- Why this classification and these lanes are sufficient: the goal is not one feature behavior. The proving purpose is whether the complete platform suite and existing CI lanes produce a trustworthy pass/fail signal after the known stabilization work.
- New or expanded test families: none by default. Any new test must be limited to a small CI/lane contract guard if a wrapper/report/artifact regression is proven.
- Fixture / helper cost impact: no new expensive fixture defaults are allowed. If fixture drift appears in the full suite, classify it by failing family and split to follow-up unless a one-line lane/guard baseline is the direct cause.
- Heavy-family visibility / justification: explicit. Heavy-governance and browser lanes are signal inputs, not automatic repair ownership.
- Special surface test profile: `global-context-shell`, `standard-native-filament`, `shared-detail-family`, `browser-smoke`, `surface-guard`, `discovery-heavy`
- Standard-native relief or required special coverage: no UI coverage expansion; browser lane reruns are used only to classify the existing smoke baseline.
- Reviewer handoff: reviewers must confirm that Livewire remains v4.0+, Filament remains v5, provider registration stays in `apps/platform/bootstrap/providers.php`, globally searchable resources are not changed, destructive actions are not changed, no assets are registered, every remaining failure is classified, and any in-scope fix is tied directly to a CI/lane contract defect.
- Budget / baseline / trend impact: the classification may update the documented status of budget or trend baseline drift, but it must not silently relax lane budgets or create a new baseline without an explicit row in `failure-classification.md`.
- Escalation needed: `document-in-feature` for contained lane baseline findings; `follow-up-spec` for product/runtime failures, fixture-family debt, new heavy cost centers, browser fallout, or any repair that exceeds CI/lane contract correction.
- Active feature PR close-out entry: `FullSuiteClassification`
- Planned validation commands:
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit`
  - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)`
User Scenarios & Testing (mandatory)
User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1)
As a maintainer, I want the complete platform suite run or explicit fallback lane split classified before any fixes so the project knows whether CI is green, blocked, flaky, or split into follow-up work.
Why this priority: without classification first, Spec 295 would become an uncontrolled full-suite repair pass.
Independent Test: Run the raw full suite or fallback lane split and prove every failing group has exactly one category, one seam, one owner/follow-up decision, and one status row in failure-classification.md.
Acceptance Scenarios:
- Given the repo after Specs `293` and `294`, When the raw full suite passes, Then `failure-classification.md` records `ci-signal-restored` with the command, date, and pass counts.
- Given the raw full suite fails, When the failure groups are reviewed, Then each group is classified before any repair is attempted.
- Given a failing group points at `/admin/t/...`, TenantPanelProvider, or legacy tenant route behavior, When it is classified, Then the remedy must not restore that behavior and must be split or fixed only through current workspace-first truth.
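The "one category, one seam, one owner/follow-up decision, one status" expectation in this story can be sketched as a completeness check on a single row. The pipe-delimited layout used here (`group|category|seam|command|owner|fix-in-295|follow-up|status`) and the function name `row_is_complete` are assumptions made only for illustration; this spec does not define the actual layout of `failure-classification.md`.

```shell
#!/bin/sh
# Illustrative sketch only. Hypothetical row layout (not defined by the spec):
#   group|category|seam|command|owner|fix-in-295|follow-up|status

# Return 0 when the row has exactly eight non-empty fields, 1 otherwise.
row_is_complete() {
  printf '%s\n' "$1" | awk -F'|' '
    NF != 8 { exit 1 }
    {
      for (i = 1; i <= NF; i++) {
        field = $i
        # Trim surrounding whitespace so "  " counts as empty.
        gsub(/^[ \t]+|[ \t]+$/, "", field)
        if (field == "") exit 1
      }
    }
  '
}
```

A row with a blank owner field, or with a field missing entirely, is rejected, which mirrors the requirement that no failing group is left half-classified before any repair starts.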
User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1)
As a maintainer, I want each existing CI lane wrapper, report, artifact, and failure class to produce a trustworthy signal so Gitea CI failures can be interpreted without reading raw terminal output first.
Why this priority: a green or red Pest run is not enough if wrapper, report, artifact, budget, or failure-class summaries are stale.
Independent Test: Run the existing lane wrappers and report commands, then verify each lane either passes with complete artifacts or fails with the correct primary failure class.
Acceptance Scenarios:
- Given a lane fails because tests fail, When its report summary is generated, Then the primary failure class is `test-failure` rather than wrapper, artifact, or infrastructure failure.
- Given a lane wrapper or manifest no longer resolves to the intended lane, When the lane is classified, Then it is marked `ci-wrapper-or-manifest-regression` and may be fixed in `295`.
- Given required report artifacts are missing after a lane run, When publication is checked, Then it is classified as `artifact-publication-regression` and may be fixed in `295`.
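The scenarios above relate two vocabularies: the lane failure classes reported per lane (`test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, `infrastructure-failure`) and the pinned classification categories. One possible first-pass triage is sketched below. The mapping itself and the function name `suggest_category` are assumptions for illustration only; this spec does not define such a mapping, and it is not the behavior of `TestLaneReport::classifyPrimaryFailure()`. A reviewer would still confirm or override each suggestion.

```shell
#!/bin/sh
# Illustrative sketch only: a hypothetical starting suggestion, not repo logic.
# $1 = lane failure class, $2 = lane name (e.g. confidence, browser)
suggest_category() {
  case "$1" in
    test-failure)
      # Assumption: failing tests in the browser lane start as browser fallout.
      if [ "$2" = "browser" ]; then
        echo "browser-lane-regression"
      else
        echo "product-runtime-or-test-regression"
      fi
      ;;
    wrapper-failure) echo "ci-wrapper-or-manifest-regression" ;;
    budget-breach) echo "budget-or-trend-baseline-drift" ;;
    artifact-publication-failure) echo "artifact-publication-regression" ;;
    infrastructure-failure) echo "flaky-or-environment" ;;
    *)
      echo "unknown failure class: $1" >&2
      return 1
      ;;
  esac
}
```

Keeping the suggestion separate from the final decision preserves the rule that only wrapper, manifest, report, artifact, and budget defects are candidates for a fix in `295`.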
User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1)
As a maintainer, I want remaining product/runtime failures to become explicit follow-up ownership instead of being silently fixed under a CI-baseline spec.
Why this priority: this protects scope discipline and keeps test-governance decisions attributable.
Independent Test: Review every non-CI failure group and prove it either has a targeted follow-up recommendation or is demonstrably flaky/environmental.
Acceptance Scenarios:
- Given a failing group requires a runtime product fix, When classification finishes, Then it is marked `follow-up-spec-required` or `product-runtime-or-test-regression` and not repaired under `295` unless the user explicitly starts that implementation scope later.
- Given a failing group belongs to browser-only behavior, When classification finishes, Then it is marked `browser-lane-regression` with the existing smoke file and follow-up path.
- Given a failing group disappears on rerun or is environment-specific, When classification finishes, Then it is marked `flaky-or-environment` with rerun evidence instead of being treated as restored CI.
User Story 4 - Publish the Final CI Readiness Decision (Priority: P2)
As a maintainer, I want a final readiness statement that says whether the full suite can be used as a CI baseline now, and what exact follow-up remains if it cannot.
Why this priority: the output must be actionable for future specs and Gitea workflows, not just a local debugging note.
Independent Test: Inspect failure-classification.md, lane report outputs, and final validation commands to confirm there are no unclassified failure groups and no hidden scope expansion.
Acceptance Scenarios:
- Given all raw suite and lane signals pass, When close-out is prepared, Then the readiness decision is `restored-ci-signal`.
- Given any group remains red, When close-out is prepared, Then the readiness decision is `classified-follow-up-required` and each group has an owner/follow-up.
- Given a small CI/lane contract fix was applied, When final validation runs, Then the directly affected lane/report/artifact guard passes and unrelated failures remain classified rather than hidden.
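The close-out rule in these scenarios can be sketched as a small decision function. The function name `readiness_decision` and its three inputs are hypothetical. Treating `blocked-by-environment` as the result when the environment prevented a classifiable run is an assumption, since the scenarios above only spell out the other two outcomes.

```shell
#!/bin/sh
# Illustrative sketch only: the function and its inputs are hypothetical.
# $1 = "yes" when the environment prevented a classifiable run (assumption)
# $2 = number of failure groups still red
# $3 = number of red groups without an owner/follow-up
readiness_decision() {
  if [ "$1" = "yes" ]; then
    echo "blocked-by-environment"
    return 0
  fi
  if [ "$2" -eq 0 ]; then
    echo "restored-ci-signal"
    return 0
  fi
  if [ "$3" -eq 0 ]; then
    echo "classified-follow-up-required"
    return 0
  fi
  # Red groups without an owner mean classification is not finished yet.
  echo "unclassified groups remain; no readiness decision yet" >&2
  return 1
}
```

For instance, `readiness_decision no 0 0` prints `restored-ci-signal`, and `readiness_decision no 3 0` prints `classified-follow-up-required`. The non-zero return for unowned red groups reflects that a readiness decision is only meaningful once no unclassified group remains.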
Edge Cases
- The raw full suite times out or produces output too large to classify directly.
- A lane passes tests but fails report or artifact publication.
- A lane fails only because budget/trend baselines drifted, not because tests failed.
- Browser lane failures expose stale screenshots or environment-specific browser state.
- A failure appears to touch Spec `293` or `294` seams but would require reopening retired legacy behavior.
- A failure disappears on rerun, suggesting flaky or environment-only behavior.
- A small lane manifest fix changes which tests run in a lane, which could accidentally widen CI cost.
Requirements (mandatory)
Constitution alignment (required): This spec introduces no Microsoft Graph calls, no write/change behavior, no long-running application work, and no new OperationRun. It must preserve workspace/tenant isolation expectations while classifying test failures. Any failure suggesting isolation, RBAC, or audit regressions must be classified as product/runtime debt and not hidden as a CI wrapper issue.
Constitution alignment (PROP-001 / ABSTR-001 / PERSIST-001 / STATE-001 / BLOAT-001): The only structural addition is one spec-local failure-classification vocabulary and artifact. It solves the current CI readiness problem after two stabilization specs; no runtime persistence, CI framework, test engine, or new lane abstraction is introduced.
Constitution alignment (TEST-GOV-001): Spec 295 must explicitly classify the proving purpose of every lane run, preserve the existing lane family boundaries, keep expensive fixture/context setup opt-in, and end with one review outcome: keep, split, document-in-feature, follow-up-spec, or reject-or-split.
Functional Requirements
- FR-295-001: The implementation MUST run the raw full suite once when feasible using `cd apps/platform && ./vendor/bin/sail artisan test --compact`.
- FR-295-002: If the raw full suite is too slow, noisy, or environment-blocked to classify reliably, the implementation MUST run the explicit fallback lane split: `fast-feedback`, `confidence`, `heavy-governance`, and `browser`.
- FR-295-003: Every failing group MUST be recorded in `failure-classification.md` with exactly one pinned category, one pinned seam, observed command, candidate owner, fix-in-295 decision, follow-up decision, and status.
- FR-295-004: Lane wrapper, report, artifact, budget, and failure-class problems MAY be fixed in `295` only when the failure is clearly isolated to `scripts/platform-test-lane`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, `TestLaneManifest`, `TestLaneReport`, `TestLaneBudget`, or their guard tests.
- FR-295-005: Product/runtime failures MUST NOT be repaired under `295` unless they are also a small, proven CI/lane contract defect; otherwise they must be assigned to a follow-up spec or classified as unrelated existing debt.
- FR-295-006: Any failure related to Specs `293` or `294` MUST be classified without rewriting those completed specs or restoring legacy behavior.
- FR-295-007: The implementation MUST NOT restore TenantPanelProvider, `/admin/t/...`, tenant-scoped provider fallback routes, or other retired cutover behavior.
- FR-295-008: The implementation MUST validate existing lane failure classes: `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, and `infrastructure-failure`.
- FR-295-009: The implementation MUST produce a final CI readiness decision in `failure-classification.md`: `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`.
- FR-295-010: Any new or changed tests MUST be limited to CI/lane contract proof and must use Pest.
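The choice between FR-295-001 and FR-295-002 can be sketched as a helper that only prints the commands for the chosen mode, so it can be inspected without Sail or the repository present. The function name `planned_commands` and its `raw`/`fallback` argument are hypothetical. The printed commands are the ones named in this spec.

```shell
#!/bin/sh
# Illustrative sketch only: prints commands instead of running them.
# $1 = "raw" (FR-295-001) or "fallback" (FR-295-002); both names are hypothetical.
planned_commands() {
  case "$1" in
    raw)
      echo "cd apps/platform && ./vendor/bin/sail artisan test --compact"
      ;;
    fallback)
      # The explicit fallback lane split, in the order listed by FR-295-002.
      for lane in fast-feedback confidence heavy-governance browser; do
        echo "./scripts/platform-test-lane $lane"
      done
      ;;
    *)
      echo "usage: planned_commands raw|fallback" >&2
      return 2
      ;;
  esac
}
```

Printing rather than executing keeps the decision reviewable: the observed command recorded for each failing group can be copied from this output verbatim.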
Non-Functional Requirements
- NFR-295-001: No new runtime persistence, queue, model, service abstraction, provider registry, Filament resource, or browser family is introduced.
- NFR-295-002: Test lane classification must follow actual proving purpose, not file location.
- NFR-295-003: Existing lane budget and trend baselines must not be relaxed silently.
- NFR-295-004: Classification output must be concise enough for future implementers to route work without re-running the entire suite first.
- NFR-295-005: The final package must preserve Filament v5 / Livewire v4 compatibility and must not change panel provider registration.
Key Entities (include if feature involves data)
- Failure Group: one failing test file, failing assertion cluster, wrapper error, artifact error, budget breach, or environment failure sharing one cause and one owner.
- CI Lane Signal: the pass/fail/report/artifact/budget outcome for one lane in `TestLaneManifest`.
- Classification Decision: the spec-local row assigning one category, seam, owner, fix-in-295 decision, and follow-up path.
- Readiness Decision: the final status of the full suite and lane baseline after classification.
Success Criteria (mandatory)
- SC-295-001: `failure-classification.md` exists and contains the pinned category and seam definitions.
- SC-295-002: Raw full suite output or fallback lane split output is represented by classified groups with no unclassified red group remaining.
- SC-295-003: Existing lane wrappers and report/artifact contracts either pass or have a classified failure class and fix/follow-up decision.
- SC-295-004: No implementation step restores TenantPanelProvider, `/admin/t/...`, or retired tenant-scoped fallback behavior.
- SC-295-005: The final readiness decision is explicit and actionable: `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`.
- SC-295-006: If a product/runtime failure remains, the classification identifies a separate follow-up owner instead of treating the full suite as green.
Assumptions
- Specs `293` and `294` have completed the targeted stabilization work described by the user and are context only.
- The repo's existing Gitea-compatible lane system remains the preferred CI shape.
- Local implementation will use Sail-first commands unless a non-Docker fallback is explicitly needed.
- Full-suite execution may be expensive; lane split is an allowed fallback only when the raw full suite is not classifiable.
Risks
- Full-suite output may be too large or slow to classify directly.
- Environment-specific Sail/browser failures may obscure real suite status.
- A tempting product fix may be small locally but still outside this CI-baseline scope.
- Budget/trend drift may be real but not appropriate to fix by silently raising thresholds.
- Multiple failing groups may share a fixture root cause and need careful grouping to avoid duplicate follow-up specs.
Open Questions
- None blocking preparation. During implementation, actual failing groups determine whether follow-up specs are needed.