Spec 210: implement CI test matrix budget enforcement (#243)

## Summary
- add explicit Gitea workflow files for PR Fast Feedback, `dev` Confidence, Heavy Governance, and Browser lanes
- extend the repo-truth lane support seams with workflow profiles, trigger-aware budget enforcement, artifact publication contracts, CI summaries, and failure classification
- add deterministic artifact staging, new CI governance guard coverage, and Spec 210 planning/contracts/docs updates

## Validation
- `cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`
- `cd apps/platform && ./vendor/bin/sail artisan test --compact tests/Feature/Guards/CiFastFeedbackWorkflowContractTest.php tests/Feature/Guards/CiConfidenceWorkflowContractTest.php tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php tests/Feature/Guards/CiLaneFailureClassificationContractTest.php tests/Feature/Guards/FastFeedbackLaneContractTest.php tests/Feature/Guards/ConfidenceLaneContractTest.php tests/Feature/Guards/HeavyGovernanceLaneContractTest.php tests/Feature/Guards/BrowserLaneIsolationTest.php tests/Feature/Guards/FixtureLaneImpactBudgetTest.php tests/Feature/Guards/TestLaneManifestTest.php tests/Feature/Guards/TestLaneArtifactsContractTest.php tests/Feature/Guards/TestLaneCommandContractTest.php`
- `./scripts/platform-test-lane fast-feedback`
- `./scripts/platform-test-lane confidence`
- `./scripts/platform-test-lane heavy-governance`
- `./scripts/platform-test-lane browser`
- `./scripts/platform-test-report fast-feedback`
- `./scripts/platform-test-report confidence`

## Notes
- scheduled Heavy Governance and Browser workflows stay gated behind `TENANTATLAS_ENABLE_HEAVY_GOVERNANCE_SCHEDULE=1` and `TENANTATLAS_ENABLE_BROWSER_SCHEDULE=1`
- the remaining rollout evidence task is capturing the live Gitea run set this PR enables: PR Fast Feedback, `dev` Confidence, manual and scheduled Heavy Governance, and manual and scheduled Browser runs

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #243

2026-04-17 18:04:35 +00:00

Implementation Plan: CI Test Matrix & Runtime Budget Enforcement

Branch: 210-ci-matrix-budget-enforcement | Date: 2026-04-17 | Spec: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/210-ci-matrix-budget-enforcement/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/210-ci-matrix-budget-enforcement/spec.md

Summary

Operationalize Spec 210 by wiring Gitea Actions workflows under .gitea/workflows/ to the existing repo-root lane wrappers. The plan treats dev as the protected confidence branch and keeps pull request validation intentionally narrow, because Gitea validates pull request heads rather than merge-preview refs. It extends the existing TestLaneManifest/TestLaneBudget/TestLaneReport seams with trigger-aware budget and failure semantics, and it standardizes CI artifact staging and upload without duplicating lane-selection logic in workflow YAML.

Technical Context

Language/Version: PHP 8.4.15 for repo-truth test governance, Bash for repo-root wrappers, and GitHub-compatible Gitea Actions workflow YAML under .gitea/workflows/
Primary Dependencies: Laravel 12, Pest v4, PHPUnit 12, Filament v5, Livewire v4, Laravel Sail, Gitea Actions backed by act_runner, and the existing Tests\Support\TestLaneManifest, TestLaneBudget, and TestLaneReport seams
Storage: SQLite :memory: for default lane execution, filesystem artifacts under the app-root contract path storage/logs/test-lanes, checked-in workflow YAML under .gitea/workflows/, and no new product database persistence
Testing: Existing Pest guard suites for lane contracts, focused CI-governance guard tests for workflow/wrapper/artifact policy, and representative live Gitea runs for pull request, dev push, scheduled, and manual workflows
Target Platform: TenantAtlas monorepo on Gitea Actions with act_runner, Docker-isolated jobs, repo-root lane wrappers, and dev as the integration branch for broader confidence validation
Project Type: Monorepo with a Laravel platform app and separate Astro website; this feature is scoped to platform test governance and repository CI infrastructure
Performance Goals: Keep pull request Fast Feedback anchored to the current 200s lane budget, mainline Confidence to 450s, Browser to 150s, and Heavy Governance to the normalized threshold emitted by TestLaneManifest; keep workflow overhead limited to artifact staging and upload rather than additional duplicate lane executions
Constraints: Repo truth first; no inline test-selection logic in workflows; no new product routes, panels, assets, or dependencies; avoid Gitea-ignored GitHub workflow features such as concurrency, continue-on-error, timeout-minutes, complex multi-label runs-on, problem matchers, and annotation-only failure handling; keep PR workflows compatible with Gitea's refs/pull/:number/head behavior; prefer explicit workflow files over dynamic job matrices or heavy conditional logic
Scale/Scope: Six existing lane entry points (fast-feedback, confidence, heavy-governance, browser, profiling, junit), five artifact modes per lane at most, no existing .gitea/workflows/ directory, and an established local governance contract already documented in README.md and guarded in apps/platform/tests/Feature/Guards
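
The constraint on Gitea-ignored workflow features can be checked mechanically. The following is a minimal sketch rather than the project's guard: the function name `workflow_avoids_ignored_keys` is hypothetical, and it only covers the three single-key features named above (concurrency, continue-on-error, timeout-minutes), not multi-label runs-on or problem matchers.

```shell
# Hypothetical helper: succeed only when a workflow file avoids the
# GitHub-only keys that the constraints above say Gitea ignores.
workflow_avoids_ignored_keys() {
  local workflow_file="$1"
  [ -r "$workflow_file" ] || return 2
  # Match the key at the start of a YAML mapping entry, at any indentation,
  # optionally preceded by a list dash. Matching lines are printed for review.
  if grep -nE '^[[:space:]]*(-[[:space:]]+)?(concurrency|continue-on-error|timeout-minutes)[[:space:]]*:' "$workflow_file"; then
    return 1
  fi
  return 0
}
```

A guard could call this once per file under .gitea/workflows/ and fail when any call returns non-zero.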

Filament v5 Implementation Notes

Livewire v4.0+ compliance: Preserved. This feature governs CI around existing Filament and Livewire tests and does not alter the runtime Filament stack.
Provider registration location: Unchanged. Existing panel providers remain registered in bootstrap/providers.php.
Global search rule: No globally searchable resources are added or changed.
Destructive actions: No runtime destructive actions are introduced. Any affected tests continue to validate existing confirmation and authorization behavior only.
Asset strategy: No panel or shared assets are added. Existing filament:assets deployment behavior remains unchanged.
Testing plan: Add or update Pest guards for workflow-to-lane mapping, wrapper-only execution, artifact staging contract, failure classification behavior, and trigger-aware budget semantics, plus representative live Gitea validation for each workflow path.

Test Governance Impact

Affected validation lanes: fast-feedback for blocking pull request validation, confidence for dev push validation, heavy-governance for separate manual and scheduled heavy validation, and browser for separate manual and scheduled browser validation. profiling and junit remain support-only lanes and are not widened by this feature.
Fixture/helper cost risk: Low and bounded to new CI-governance guard files, support-class policy helpers, and scripts/platform-test-artifacts. The implementation must not add shared product fixtures, widen default guard setup, or accidentally promote CI-governance tests into Heavy Governance or Browser lanes.
Heavy/browser impact: No new heavy families or browser scenarios are created. The work only makes the existing heavy/browser lanes explicit in CI triggers, artifact bundles, and reviewer guidance.
Runtime drift follow-up: Record Fast Feedback CI variance tolerance, any material runtime drift or recalibration, and the required validation evidence set in the active spec or PR.
Required validation evidence set: one pull_request Fast Feedback run, one push to dev Confidence run, one manual Heavy Governance run, one scheduled Heavy Governance run, one manual Browser run, and one scheduled Browser run, each with trigger, lane, artifact bundle, budget outcome, and primary failure classification; the Fast Feedback record must reference the chosen CI variance tolerance, and any material runtime recalibration must be documented in the active spec or PR and may be linked from the affected evidence.
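
The required evidence set above is a fixed list of six trigger and lane pairs, so completeness can be checked with a small lookup. This is a sketch under assumptions: the `trigger:lane` string encoding and both function names are illustrative and do not come from the spec.

```shell
# The six required (trigger, lane) pairs from the evidence set above.
# The "trigger:lane" encoding is an assumption made for this sketch.
required_evidence_pairs() {
  printf '%s\n' \
    'pull_request:fast-feedback' \
    'push:confidence' \
    'manual:heavy-governance' \
    'scheduled:heavy-governance' \
    'manual:browser' \
    'scheduled:browser'
}

# Print every required pair that is absent from the recorded pairs
# passed as arguments. Empty output means the set is complete.
missing_evidence_pairs() {
  local required recorded found
  required_evidence_pairs | while IFS= read -r required; do
    found=0
    for recorded in "$@"; do
      if [ "$recorded" = "$required" ]; then
        found=1
        break
      fi
    done
    [ "$found" -eq 1 ] || printf '%s\n' "$required"
  done
}
```

This only checks that each run exists; the per-run fields (artifact bundle, budget outcome, primary failure classification) would still need their own review.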

Frozen Trigger Matrix

| Trigger class | Workflow profile | Lane binding | Budget mode | Rollout note |
| --- | --- | --- | --- | --- |
| `pull-request` | `pr-fast-feedback` | `fast-feedback` | `hard-fail` after documented Fast Feedback CI variance allowance | Always enabled |
| `mainline-push` on `dev` | `main-confidence` | `confidence` | `soft-warn` for budget, blocking for test and artifact failures | Always enabled |
| `manual` heavy | `heavy-governance-manual` | `heavy-governance` | `soft-warn` or `trend-only` until stability improves | Required before enabling schedule |
| `scheduled` heavy | `heavy-governance-scheduled` | `heavy-governance` | `soft-warn` or `trend-only` until stability improves | Disabled until first successful manual run |
| `manual` browser | `browser-manual` | `browser` | `trend-only` or `soft-warn` until stability improves | Required before enabling schedule |
| `scheduled` browser | `browser-scheduled` | `browser` | `trend-only` or `soft-warn` until stability improves | Disabled until first successful manual run |
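
The matrix above is small enough to express as a single lookup. The sketch below is illustrative only: the function name is hypothetical, the plan keeps this truth in TestLaneManifest rather than in shell, and for rows that allow two budget modes the first mode listed is chosen as an assumption.

```shell
# Hypothetical lookup: echo "<workflow profile> <lane> <budget mode>"
# for a trigger class and, for manual/scheduled triggers, a lane family.
resolve_trigger_binding() {
  local trigger="$1" family="${2:-}"
  case "$trigger:$family" in
    pull-request:*)    echo 'pr-fast-feedback fast-feedback hard-fail' ;;
    mainline-push:*)   echo 'main-confidence confidence soft-warn' ;;
    manual:heavy)      echo 'heavy-governance-manual heavy-governance soft-warn' ;;
    scheduled:heavy)   echo 'heavy-governance-scheduled heavy-governance soft-warn' ;;
    manual:browser)    echo 'browser-manual browser trend-only' ;;
    scheduled:browser) echo 'browser-scheduled browser trend-only' ;;
    *)
      # Unknown combinations fail loudly instead of falling back to a lane.
      echo "unknown trigger binding: $trigger $family" >&2
      return 1
      ;;
  esac
}
```

For example, `resolve_trigger_binding scheduled browser` prints `browser-scheduled browser trend-only`.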

No-New-Fixture-Cost Rule

CI-governance helpers and tests for this feature must remain cheap enough for the default non-browser lanes.
No shared product fixture expansion, no broader default seeding, and no accidental Heavy Governance or Browser promotion are permitted as part of this rollout.

Constitution Check

GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.

Inventory-first: PASS. No Inventory, backup, or snapshot truth is changed.
Read/write separation: PASS. This is repository-only CI and test-governance work and introduces no end-user mutations.
Graph contract path: PASS. No Graph calls, contract registry changes, or provider runtime integrations are added.
Deterministic capabilities: PASS. No capability resolver or authorization registry changes.
RBAC-UX, workspace isolation, tenant isolation: PASS. No runtime route, policy, tenant, or workspace access behavior is changed.
Run observability and Ops-UX: PASS. CI artifacts remain filesystem outputs and do not introduce OperationRun or operator notification behavior.
Data minimization: PASS. Lane reports, budget summaries, and CI evidence remain repo-local and must not contain secrets or tenant payloads.
Proportionality and bloat control: PASS WITH LIMITS. The only new semantic layer is a narrow repo-level trigger policy, artifact contract, and failure classification model that extends the existing test-governance seams instead of creating a parallel CI framework.
TEST-TRUTH-001: PASS WITH WORK. The implementation must keep real test failures and workflow misconfiguration legible instead of downgrading them into generic warning noise.
Filament/UI constitutions: PASS / NOT APPLICABLE. No operator-facing runtime UI, action surfaces, badges, or panel IA are changed.

Phase 0 Gate Result: PASS

The feature stays bounded to repository CI wiring, test-governance policy, artifacts, and validation evidence.
No new runtime persistence, product routes, Graph seams, or user-facing surfaces are introduced.
The plan extends existing lane and artifact seams rather than inventing a second governance system.

Project Structure

Documentation (this feature)

specs/210-ci-matrix-budget-enforcement/
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   ├── ci-lane-matrix.schema.json
│   └── ci-lane-governance.logical.openapi.yaml
└── tasks.md

Source Code (repository root)

.gitea/
├── workflows/
│   ├── test-pr-fast-feedback.yml
│   ├── test-main-confidence.yml
│   ├── test-heavy-governance.yml
│   └── test-browser.yml
apps/
├── platform/
│   ├── composer.json
│   ├── tests/
│   │   ├── Feature/Guards/
│   │   └── Support/
│   │       ├── TestLaneBudget.php
│   │       ├── TestLaneManifest.php
│   │       └── TestLaneReport.php
│   └── storage/logs/test-lanes/
scripts/
├── platform-test-lane
├── platform-test-report
└── platform-test-artifacts
README.md

Structure Decision: Keep workflow logic thin and explicit under .gitea/workflows/, keep lane selection and budget truth inside the existing TestLaneManifest/TestLaneBudget/TestLaneReport seams, and add only one narrow repo-root helper for CI artifact staging so artifact naming and upload rules do not become duplicated shell logic across multiple workflows.
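
The "thin workflow" decision implies that workflow steps call only the repo-root wrappers. A rough way to spot inline test logic is sketched below. The function name is hypothetical, and the check is deliberately naive: it inspects single-line `run:` entries only, so multi-line `run: |` blocks and legitimate setup steps would need separate handling.

```shell
# Hypothetical check: print every single-line run: entry that invokes
# something other than the three repo-root wrappers named in this plan.
non_wrapper_run_lines() {
  local workflow_file="$1"
  grep -E '^[[:space:]]*(-[[:space:]]+)?run:' "$workflow_file" \
    | grep -vE 'run:[[:space:]]*\./scripts/platform-test-(lane|report|artifacts)([[:space:]]|$)'
}
```

Empty output means every single-line `run:` entry goes through a wrapper.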

Complexity Tracking

| Violation | Why Needed | Simpler Alternative Rejected Because |
| --- | --- | --- |
| None | Not applicable | Not applicable |

Proportionality Review

Current operator problem: Contributors and reviewers cannot rely on shared CI to preserve lane discipline, budget honesty, or standardized artifacts across pull request and mainline validation.
Existing structure is insufficient because: The repo already has wrapper scripts, budgets, lane manifests, and reports, but none of that is yet wired into checked-in Gitea workflows with explicit trigger, artifact, and failure semantics.
Narrowest correct implementation: Add explicit workflow files, extend the existing lane/report support classes with trigger-aware budget and failure metadata, and stage per-lane artifacts through a narrow helper instead of duplicating rules in YAML.
Ownership cost created: The repo must maintain four workflow files, one CI artifact staging helper, trigger-aware budget policy, and a small set of CI-governance guard tests.
Alternative intentionally rejected: A single dynamic matrix workflow with GitHub-style concurrency and conditionals, or inline lane logic embedded directly inside workflow files, because that would depend on weaker Gitea compatibility and duplicate repository truth.
Release truth: Current-release repository truth required to institutionalize the already implemented test-governance model from Specs 206 through 209.

Phase 0 — Research (complete)

Output: research.md
Resolved key decisions:
- Use explicit workflow files per trigger class instead of a single dynamic job matrix because Gitea ignores or limits several GitHub workflow primitives and the current need does not justify indirection.
- Treat dev as the protected mainline Confidence branch because repo process already uses dev as the integration branch and Gitea pull request refs point to PR heads rather than synthetic merge previews.
- Keep scripts/platform-test-lane and scripts/platform-test-report as the only lane execution entry points and extend the existing support classes instead of creating a second CI-only selection manifest.
- Do not run the separate junit support lane in CI by default; publish the JUnit XML already produced by the Confidence lane to avoid duplicate non-browser cost.
- Stage *-latest.* files into a per-run export directory before upload so CI artifacts have stable names even though the local contract remains lane-latest.*.
- Introduce trigger-aware budget enforcement profiles with a variance allowance and three outcome classes (hard-fail, soft-warn, trend-only) so Fast Feedback can block on mature overruns while heavier lanes remain non-blocking until their budgets stabilize.
- Encode failure classes in repo-produced summary and JSON artifacts rather than relying on Gitea problem matchers or annotations, because Gitea ignores those GitHub-centric UI features.
- Keep profiling outside the default PR/Main matrix and reserve it for manual or follow-up trend work, which aligns with Spec 211 as the next maturity step.
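
The decision to encode failure classes in repo-produced artifacts implies that each run reports one primary class even when several signals fire. A minimal sketch of that selection follows. The class names and the precedence order are assumptions for illustration; the plan names the failure families (test, wrapper, manifest, artifact, budget) but not their ordering.

```shell
# Hypothetical selector: pick one primary failure class from the signals
# observed in a run. Class names and precedence are illustrative assumptions.
primary_failure_class() {
  local precedence='wrapper-failure manifest-failure test-failure artifact-failure budget-breach'
  local class signal
  for class in $precedence; do
    for signal in "$@"; do
      if [ "$signal" = "$class" ]; then
        echo "$class"
        return 0
      fi
    done
  done
  # No recognised signal was passed; unknown signals are ignored.
  echo 'none'
}
```

With this ordering, `primary_failure_class budget-breach test-failure` prints `test-failure`, so a real test failure is never reported as a budget problem.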

Phase 1 — Design & Contracts (complete)

Output: data-model.md formalizes workflow profiles, lane bindings, budget-enforcement profiles, artifact publication contracts, failure classifications, and validation evidence packs.
Output: contracts/ci-lane-matrix.schema.json defines the checked-in contract for workflow profiles, trigger-to-lane bindings, artifact requirements, and budget behavior.
Output: contracts/ci-lane-governance.logical.openapi.yaml captures the logical contract for executing a governed lane, staging artifacts, classifying budget outcomes, and summarizing CI results.
Output: quickstart.md provides the implementation order, validation commands, and first-live-run checklist for the CI rollout.

Post-design Constitution Re-check

PASS: No runtime routes, panels, Graph seams, or authorization planes are introduced.
PASS: Trigger policy, artifact staging, and failure classification remain repo-local governance constructs justified by current shared-validation needs.
PASS: The design extends existing lane/report classes and wrappers rather than adding a generic CI framework or new persistence.
PASS WITH WORK: Fast Feedback hard-fail budget behavior must include a documented tolerance strategy so CI runner noise does not create brittle red builds.
PASS WITH WORK: Heavy Governance and Browser must remain explicitly separated from the fast path in both trigger configuration and workflow naming so the CI contract stays legible.
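
The tolerance requirement for Fast Feedback can be made concrete with a small classifier. This is a sketch under assumptions: the function name, the `within-budget` label, and the percentage-based allowance are illustrative, and the actual tolerance value is whatever the spec or PR documents.

```shell
# Hypothetical classifier for a lane run against its budget.
#   $1 budget mode: hard-fail | soft-warn | trend-only
#   $2 budget in seconds
#   $3 measured runtime in seconds
#   $4 variance allowance in percent (placeholder, defaults to 0)
classify_budget_outcome() {
  local mode="$1" budget="$2" actual="$3" allowance_pct="${4:-0}"
  # Integer arithmetic: allowed = budget * (100 + allowance) / 100
  local allowed=$(( budget * (100 + allowance_pct) / 100 ))
  if [ "$actual" -le "$allowed" ]; then
    echo 'within-budget'
    return 0
  fi
  case "$mode" in
    hard-fail)  echo 'hard-fail';  return 1 ;;  # blocking overrun
    soft-warn)  echo 'soft-warn';  return 0 ;;  # reported, non-blocking
    trend-only) echo 'trend-only'; return 0 ;;  # recorded for trend data only
    *) echo "unknown budget mode: $mode" >&2; return 2 ;;
  esac
}
```

With the 200s Fast Feedback budget and an assumed 5% allowance, a 205s run stays within budget while a 230s run is a blocking `hard-fail`; the same overrun under `soft-warn` is reported without failing the job.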

Phase 2 — Implementation Planning

tasks.md should cover:

Creating .gitea/workflows/test-pr-fast-feedback.yml for pull_request events (opened, reopened, synchronize) and wiring it only to the Fast Feedback lane.
Creating .gitea/workflows/test-main-confidence.yml for pushes to dev and wiring it to the Confidence lane while publishing summary, report, budget, and JUnit artifacts from that same lane run.
Creating .gitea/workflows/test-heavy-governance.yml and .gitea/workflows/test-browser.yml for workflow_dispatch, with scheduled execution enabled only after the first successful manual validation so heavy and browser cost classes stay separately visible and independently re-runnable.
Extending TestLaneManifest with trigger-aware CI policy metadata so workflows consume lane truth instead of carrying duplicate branch, artifact, or budget rules inline.
Extending TestLaneBudget and TestLaneReport so they emit trigger-aware budget outcome classes, single primary failure classes, and CI summary metadata that can be surfaced without GitHub-only annotations.
Adding the narrow repo-root helper scripts/platform-test-artifacts to stage and rename per-lane artifacts from apps/platform/storage/logs/test-lanes into a deterministic upload directory.
Defining the initial budget-enforcement policy: PR Fast Feedback blocking on test, wrapper, manifest, and artifact failures plus mature budget overruns; Main Confidence blocking on test and artifact failures while budget remains warning-first; Heavy Governance and Browser warning-only or trend-only until stability improves; and documenting the Fast Feedback CI variance tolerance explicitly.
Adding or updating guard tests that verify workflow file presence, wrapper-only lane invocation, wrong-lane and unresolved-entry-point classification, dev push mapping, heavy/browser separation, artifact staging completeness, failure-class legibility, and no accidental heavy/browser promotion of CI-governance coverage.
Updating README.md with a concise contributor guide for local reproduction, trigger expectations, blocking semantics, and artifact locations.
Validating each workflow path with the required evidence set of representative pull request, dev push, manual, and scheduled runs, and archiving evidence that documented triggers, expected lanes, published artifacts, and failure semantics all match the contract.
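
The artifact staging task can be sketched as a copy-and-rename step. This is not the contents of scripts/platform-test-artifacts: the function name and the target naming scheme (dropping the `-latest` marker) are assumptions, and the file names in any real run come from the lane contract.

```shell
# Hypothetical staging step: copy <lane>-latest.* files from the lane
# artifact directory into an export directory under stable names.
stage_lane_artifacts() {
  local lane="$1" source_dir="$2" export_dir="$3"
  local file base suffix staged=0
  mkdir -p "$export_dir" || return 1
  for file in "$source_dir/$lane-latest."*; do
    # Skips directories and the unexpanded glob when nothing matches.
    [ -f "$file" ] || continue
    base="${file##*/}"
    suffix="${base#"$lane-latest."}"
    cp "$file" "$export_dir/$lane.$suffix" || return 1
    staged=$((staged + 1))
  done
  # An empty bundle counts as a failure so missing artifacts stay visible.
  [ "$staged" -gt 0 ]
}
```

A workflow step could call it as `stage_lane_artifacts confidence apps/platform/storage/logs/test-lanes "$EXPORT_DIR"`, where `EXPORT_DIR` is a placeholder for the per-run upload directory.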

Contract Implementation Note

The JSON schema is schema-first and repository-tooling-oriented. It defines what the checked-in CI lane matrix must express even if the first implementation stores most policy in PHP arrays and workflow YAML.
The OpenAPI file is logical rather than transport-prescriptive. It documents how workflows, wrappers, and reporting helpers interact, not a public HTTP API.
The design intentionally avoids new database persistence or a separate CI service. Artifacts remain filesystem-based and are uploaded per job after staging.

Deployment Sequencing Note

No database migration is planned.
No asset publish step changes.
Rollout order should be: extend manifest/report policy fields, add artifact staging helper, land PR Fast Feedback workflow, land dev Confidence workflow, land manual Heavy Governance and Browser workflows, enable their schedules after the first successful manual validation, then capture one scheduled Heavy Governance run and one scheduled Browser run as part of rollout evidence.
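
The commit notes name two gate variables for the scheduled workflows, `TENANTATLAS_ENABLE_HEAVY_GOVERNANCE_SCHEDULE` and `TENANTATLAS_ENABLE_BROWSER_SCHEDULE`. A scheduled job could consult them with a check like the one below. This is a sketch: the function name is hypothetical, and how the workflows actually read these variables is not stated in this plan.

```shell
# Hypothetical gate: succeed only when the schedule gate for the lane is
# set to 1. Unset or any other value leaves the schedule disabled.
schedule_enabled() {
  case "$1" in
    heavy-governance) [ "${TENANTATLAS_ENABLE_HEAVY_GOVERNANCE_SCHEDULE:-0}" = '1' ] ;;
    browser)          [ "${TENANTATLAS_ENABLE_BROWSER_SCHEDULE:-0}" = '1' ] ;;
    *)                return 1 ;;  # lanes without a schedule gate
  esac
}
```

Defaulting to disabled matches the rollout order above, where schedules are switched on only after the first successful manual run.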
