TenantAtlas/specs/295-full-suite-ci-baseline/research.md
ahmido f03555eae1 Spec 295: full suite CI lane baseline (#350)
## Summary
- add the Spec 295 artifacts for full-suite failure classification and CI lane baseline work
- fix `scripts/platform-test-artifacts` so Sail passes artifact staging inputs into the embedded PHP script via argv
- add a guard test covering the artifact staging input contract

## Scope guards
- no browser screenshot baselines included
- no generated test artifacts included
- no runtime application code changes included

## Notes
- classification evidence and follow-up ownership are documented in `specs/295-full-suite-ci-baseline/failure-classification.md`
- this PR is intentionally limited to the CI/lane/artifact contract slice for Spec 295

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #350
2026-05-11 11:14:56 +00:00


Research: Full Suite Failure Classification & CI Lane Baseline

Decision: Use classification-first implementation

Rationale: The user explicitly asked not to blindly repair the full suite. Specs 293 and 294 already handled the known focused stabilization slices. Spec 295 must first answer whether the full suite is a reliable signal, and only then allow small CI/lane fixes.

Alternatives considered:

  • Fix every failing test immediately: rejected because it hides ownership, scope-creeps into unrelated features, and violates the requested goal.
  • Run only targeted lanes: rejected because the central question is the complete suite signal after the targeted lanes were stabilized.
  • Skip the full-suite run and rely on CI lanes: rejected because the lane split can hide cross-lane fallout or raw-suite issues.

Decision: Prefer raw full suite, then explicit lane split fallback

Rationale: The raw command `cd apps/platform && ./vendor/bin/sail artisan test --compact` is the most direct answer to the full-suite readiness question. If it times out, produces output too large to classify, or is environment-blocked, the existing wrappers provide explicit fallback segmentation: fast-feedback, confidence, heavy-governance, and browser.
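
The raw-first, lane-fallback order in this rationale can be sketched as a small decision function. This is an illustrative Python sketch only, not part of the repository: `RawRunOutcome` and `plan_runs` are hypothetical names, and the three flags simply mirror the three fallback triggers named above.

```python
from dataclasses import dataclass

# Fallback lanes, in the order the rationale lists them.
FALLBACK_LANES = ["fast-feedback", "confidence", "heavy-governance", "browser"]

# Label for the unsegmented run (taken from the pinned seam names).
RAW_SEAM = "raw-full-suite"


@dataclass
class RawRunOutcome:
    """Hypothetical summary of one raw full-suite attempt."""
    timed_out: bool = False
    output_too_large: bool = False
    environment_blocked: bool = False


def plan_runs(outcome: RawRunOutcome) -> list[str]:
    """Return the segments to classify: the raw run alone, or the lane split."""
    needs_fallback = (
        outcome.timed_out
        or outcome.output_too_large
        or outcome.environment_blocked
    )
    # Copy the list so callers cannot mutate the module-level constant.
    return list(FALLBACK_LANES) if needs_fallback else [RAW_SEAM]
```

A clean raw run keeps classification on the single raw segment; any one of the three triggers switches the whole plan to the four existing lanes rather than introducing a new wrapper.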

Alternatives considered:

  • Create a new full-suite wrapper: rejected as premature CI framework growth.
  • Use only confidence: rejected because confidence intentionally excludes browser, heavy-governance, and some discovery-heavy families.

Decision: Reuse existing lane and failure-class contracts

Rationale: TestLaneManifest already defines lanes, workflow profiles, budgets, artifact contracts, and lane scope notes. TestLaneReport already classifies CI failures as test-failure, wrapper-failure, budget-breach, artifact-publication-failure, or infrastructure-failure. Spec 295 should verify and minimally correct those contracts rather than inventing another taxonomy.

Pinned Spec 295 categories: ci-signal-restored, ci-wrapper-or-manifest-regression, artifact-publication-regression, budget-or-trend-baseline-drift, product-runtime-or-test-regression, browser-lane-regression, flaky-or-environment, follow-up-spec-required, resolved-or-not-needed.

Pinned Spec 295 seams: raw-full-suite, fast-feedback-lane, confidence-lane, heavy-governance-lane, browser-lane, profiling-or-junit-support, lane-reporting, artifact-publication, budget-trend-baseline, legacy-cutover-regression-guard, provider-verification-regression-guard.
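
One way to keep the pinned vocabulary stable is to reject any classification record that uses a category or seam outside the two lists above. The following Python sketch is an assumption about how such a check could look; `ClassificationEntry` and its `symptom` field are hypothetical, and only the category and seam strings come from this document.

```python
from dataclasses import dataclass

# Pinned Spec 295 categories, copied from the list above.
CATEGORIES = frozenset({
    "ci-signal-restored",
    "ci-wrapper-or-manifest-regression",
    "artifact-publication-regression",
    "budget-or-trend-baseline-drift",
    "product-runtime-or-test-regression",
    "browser-lane-regression",
    "flaky-or-environment",
    "follow-up-spec-required",
    "resolved-or-not-needed",
})

# Pinned Spec 295 seams, copied from the list above.
SEAMS = frozenset({
    "raw-full-suite",
    "fast-feedback-lane",
    "confidence-lane",
    "heavy-governance-lane",
    "browser-lane",
    "profiling-or-junit-support",
    "lane-reporting",
    "artifact-publication",
    "budget-trend-baseline",
    "legacy-cutover-regression-guard",
    "provider-verification-regression-guard",
})


@dataclass(frozen=True)
class ClassificationEntry:
    """Hypothetical record: one observed failure with its category and seam."""
    symptom: str
    category: str
    seam: str

    def __post_init__(self) -> None:
        # Refuse free-text labels so the taxonomy cannot drift silently.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if self.seam not in SEAMS:
            raise ValueError(f"unknown seam: {self.seam}")
```

Validating at construction time means a typo such as a misspelled seam fails immediately instead of producing a category that no later report recognizes.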

Alternatives considered:

  • Add a separate CI readiness model: rejected because the existing support classes already own this truth.
  • Record only plain-text notes: rejected because future maintainers need stable categories, seams, and follow-up decisions.

Decision: Allow only small CI/lane contract fixes

Rationale: In-scope fixes are limited to wrappers, manifest/report support, artifact publication, budget/report contract drift, and their direct guard tests. This keeps the package focused on CI signal readiness.

Alternatives considered:

  • Fix application/runtime failures discovered by the suite: rejected unless a failure is proven to be a small CI/lane contract defect.
  • Update historical Specs 293 or 294: rejected by the completed-spec guardrail and the user's scope.
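
The scope rule in this decision can be sketched as a gate between classifying a failure and fixing it. This Python sketch rests on two labeled assumptions: that the three categories below are the ones corresponding to the in-scope fix areas (the document does not state this mapping), and that the returned action labels are hypothetical names chosen for illustration.

```python
# Assumption: these pinned categories correspond to the in-scope areas
# (wrappers, manifest/report support, artifact publication, budget/report drift).
IN_SCOPE_CATEGORIES = frozenset({
    "ci-wrapper-or-manifest-regression",
    "artifact-publication-regression",
    "budget-or-trend-baseline-drift",
})


def next_action(category: str, is_small_contract_defect: bool) -> str:
    """Decide whether a classified failure may be fixed inside Spec 295.

    The action labels are hypothetical; they are not defined by the spec.
    """
    if category in IN_SCOPE_CATEGORIES and is_small_contract_defect:
        return "fix-in-spec-295"
    # Everything else is recorded and handed to an owner, not repaired here.
    return "classify-and-hand-off"
```

For example, `next_action("product-runtime-or-test-regression", True)` still returns the hand-off action, which matches the rejected alternative of fixing application or runtime failures discovered by the suite.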

Decision: Preserve legacy cutover retirement

Rationale: The request explicitly forbids reopening tenant cutover, legacy /admin/t/..., or TenantPanelProvider. Any failure that appears to depend on those retired paths must be classified without restoring them.

Alternatives considered:

  • Add temporary route aliases to make old tests pass: rejected as direct conflict with the cutover baseline.
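
A classifier could flag failures that mention the retired paths so they are recorded rather than repaired. The Python sketch below is a minimal illustration under stated assumptions: `touches_retired_path` is a hypothetical helper, and plain substring matching on the `/admin/t/` prefix and the `TenantPanelProvider` name is an assumed heuristic, not the project's guard test.

```python
# Markers taken from the rationale above; the route is matched only by its
# "/admin/t/" prefix because the rest of the path is not given in this document.
LEGACY_MARKERS = ("/admin/t/", "TenantPanelProvider")


def touches_retired_path(failure_output: str) -> bool:
    """Return True when failure output mentions a retired cutover path."""
    return any(marker in failure_output for marker in LEGACY_MARKERS)
```

A match would route the failure to classification under the legacy-cutover-regression-guard seam; it never justifies restoring a route alias.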

Decision: Browser output is classification input, not automatic repair ownership

Rationale: The browser lane is intentionally isolated and may expose environment or smoke fallout. Spec 295 should classify browser failures and repair one only when it is a lane or report contract issue, not a product UI behavior issue.

Alternatives considered:

  • Run a browser smoke fix loop inside Spec 295: rejected because this is not a UI implementation spec.