Spec 295: full suite CI lane baseline (#350)

## Summary
- add the Spec 295 artifacts for full-suite failure classification and CI lane baseline work
- fix `scripts/platform-test-artifacts` so Sail passes artifact staging inputs into the embedded PHP script via argv
- add a guard test covering the artifact staging input contract

## Scope guards
- no browser screenshot baselines included
- no generated test artifacts included
- no runtime application code changes included

## Notes
- classification evidence and follow-up ownership are documented in `specs/295-full-suite-ci-baseline/failure-classification.md`
- this PR is intentionally limited to the CI/lane/artifact contract slice for Spec 295

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #350
ahmido 2026-05-11 11:14:56 +00:00
parent d3158f5103
commit f03555eae1
17 changed files with 3380 additions and 9 deletions


@ -0,0 +1,295 @@
---
name: browsertest
description: Run a complete smoke browser test in the integrated browser for the current feature, including the happy path, key regressions, a context check, and a reliable summary of results.
license: MIT
metadata:
author: GitHub Copilot
---
# Browser Smoke Test
## What This Skill Does
Use this skill to validate the current feature end-to-end in the integrated browser.
This is a focused smoke test, not a full exploratory test session. The goal is to prove that the primary operator flow:
- loads in the correct auth, workspace, and tenant context
- exposes the expected controls and decision points
- completes the main happy path without blocking issues
- lands in the expected end state or canonical drilldown
- does not show obvious regressions such as broken navigation, missing data, or conflicting actions
The skill should produce a concrete pass or fail result with actionable evidence.
## When To Apply
Activate this skill when:
- the user asks to smoke test the current feature in the browser
- a new Filament page, dashboard signal, report, wizard, or detail flow was just added
- a UI regression fix needs confirmation in a real browser context
- the primary question is whether the feature works from an operator perspective
- you need a quick integration-level check without writing a full browser test suite first
## What Success Looks Like
A successful smoke test confirms all of the following:
- the target route opens successfully
- the visible context is correct
- the main flow is usable
- the expected result appears after interaction
- the route or drilldown destination is correct
- the surface does not obviously violate its intended interaction model
If the test cannot be completed, the output must clearly state whether the blocker is:
- authentication
- missing data or fixture state
- routing
- UI interaction failure
- server error
- an unclear expected behavior contract
Do not guess. If the route or state is blocked, report the blocker explicitly.
## Preconditions
Before running the browser smoke test, make sure you know:
- the canonical route or entry point for the feature
- the primary operator action or happy path
- the expected success state
- whether the feature depends on a specific tenant, workspace, or seeded record
When available, use the feature spec, quickstart, tasks, or current browser page as the source of truth.
## Standard Workflow
### 1. Define the smoke-test scope
Identify:
- the route to open
- the primary action to perform
- the expected end state
- one or two critical regressions that must not break
The smoke test should stay narrow. Prefer one complete happy path plus one critical boundary over broad exploratory clicking.
### 2. Establish the browser state
- Reuse the current browser page if it already matches the target feature.
- Otherwise open the canonical route.
- Confirm the current auth and scope context before interacting.
For this repo, that usually means checking whether the page is on:
- `/admin/...` for workspace-context surfaces
- `/admin/t/{tenant}/...` for tenant-context surfaces
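The context check above can be sketched as a simple prefix match. This is only an illustration: the helper name `classify_route` and the tenant slug `acme` are hypothetical, and it assumes the two route shapes listed above are the only ones that matter.

```bash
# Hypothetical helper: classify a route path by its context prefix.
# Assumes only the two route shapes documented above.
classify_route() {
  case "$1" in
    /admin/t/?*)     echo "tenant" ;;     # /admin/t/{tenant}/...
    /admin|/admin/*) echo "workspace" ;;  # /admin/...
    *)               echo "unknown" ;;    # anything else needs manual review
  esac
}

classify_route "/admin/findings/hygiene"   # workspace-context example
classify_route "/admin/t/acme/findings"    # tenant-context example
```

The tenant pattern is listed first because every tenant route also matches the broader `/admin/*` pattern.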
### 3. Inspect before acting
- Use `read_page` before interacting so you understand the live controls, refs, headings, and route context.
- Prefer `read_page` over screenshots for actual interaction planning.
- Use screenshots only for visual evidence or when the user asks for them.
### 4. Execute the primary happy path
Run the smallest meaningful flow that proves the feature works.
Typical steps include:
- open the page
- verify heading or key summary text
- click the primary CTA or row
- fill the minimum required form fields
- confirm modal or dialog text when relevant
- submit or navigate
- verify the expected destination or changed state
After each meaningful action, re-read the page so the next step is based on current DOM state.
### 5. Validate the outcome
Check the exact result that matters for the feature.
Examples:
- a new row appears
- a status changes
- a success message appears
- a report filter changes the result set
- a row click lands on the canonical detail page
- a dashboard signal links to the correct report page
### 6. Check for obvious regressions
Even in a smoke test, verify a few core non-negotiables:
- the page is not blank or half-rendered
- the main action is present and usable
- the visible context is correct
- the drilldown destination is canonical
- no obviously duplicated primary actions exist
- no stuck modal, spinner, or blocked interaction remains onscreen
### 7. Capture evidence and summarize clearly
Your result should state:
- route tested
- context used
- steps executed
- pass or fail
- exact blocker or discrepancy if failed
Include a screenshot only when it adds value.
## Tool Usage Guidance
Use the browser tools in this order by default:
1. `read_page`
2. `click_element`
3. `type_in_page`
4. `handle_dialog` when needed
5. `navigate_page` or `open_browser_page` only when route changes are required
6. `run_playwright_code` only if the normal browser tools are insufficient
7. `screenshot_page` for evidence, not for primary navigation logic
## Repo-Specific Guidance For TenantPilot
### Workspace surfaces
For `/admin` pages and similar workspace-context surfaces:
- verify the page is reachable without forcing tenant-route assumptions
- confirm any summary signal or CTA lands on the canonical destination
- verify calm-state versus attention-state behavior when the feature defines both
### Tenant surfaces
For `/admin/t/{tenant}/...` pages:
- verify the tenant context is explicit and correct
- verify drilldowns stay in the intended tenant scope
- treat cross-tenant leakage or silent scope changes as failures
### Filament list or report surfaces
For Filament tables, reports, or registry-style pages:
- verify the heading and table shell render
- verify fixed filters or summary controls exist when the spec requires them
- verify row click or the primary inspect affordance behaves as designed
- verify empty-state messaging is specific rather than generic when the feature defines custom behavior
### Filament detail pages
For detail or view surfaces:
- verify the canonical record loads
- verify expected sections or summary content are present
- verify critical actions or drillbacks are usable
## Result Format
Use a compact result format like this:
```text
Browser smoke result: PASS
Route: /admin/findings/hygiene
Context: workspace member with visible hygiene issues
Steps: opened report -> verified filters -> clicked finding row -> landed on canonical finding detail
Verified: report rendered, primary interaction worked, drilldown route was correct
```
If the test fails:
```text
Browser smoke result: FAIL
Route: /admin/findings/hygiene
Context: authenticated workspace member
Failed step: clicking the summary CTA
Expected: navigate to /admin/findings/hygiene
Actual: remained on /admin with no route change
Blocker: CTA appears rendered but is not interactive
```
## Examples
### Example 1: Smoke test a new report page
Use this when the feature adds a new read-only report.
Steps:
- open the canonical report route
- verify the page heading and main controls
- confirm the table or defined empty state is visible
- click one row or primary inspect affordance
- verify navigation lands on the canonical detail route
Pass criteria:
- report loads
- intended controls exist
- primary inspect path works
### Example 2: Smoke test a dashboard signal
Use this when the feature adds a summary signal on `/admin`.
Steps:
- open `/admin`
- find the signal
- verify the visible count or summary text
- click the CTA
- confirm navigation lands on the canonical downstream surface
Pass criteria:
- signal is visible in the correct state
- CTA text is present
- CTA opens the correct route
### Example 3: Smoke test a tenant detail follow-up
Use this when a workspace-level surface should drill into a tenant-level detail page.
Steps:
- open the workspace-level surface
- trigger the drilldown
- verify the target route includes the correct tenant and record
- confirm the target page actually loads the expected detail content
Pass criteria:
- drilldown route is canonical
- tenant context is correct
- destination content matches the selected record
## Common Pitfalls
- Clicking before reading the page state and refs
- Treating a blocked auth session as a feature failure
- Confusing workspace-context routes with tenant-context routes
- Reporting visual impressions without validating the actual interaction result
- Forgetting to re-read the page after a modal opens or a route changes
- Claiming success without verifying the final destination or changed state
## Non-Goals
This skill does not replace:
- full exploratory QA
- formal Pest browser coverage
- accessibility review
- visual regression approval
- backend correctness tests
It is a fast, real-browser confidence pass for the current feature.


@ -0,0 +1,8 @@
---
name: giteaflow
description: Describe what this skill does and when to use it. Include keywords that help agents identify relevant tasks.
---
<!-- Tip: Use /create-skill in chat to generate content with agent assistance -->
Commit all changes, push to remote, and create a pull request against platform-dev with Gitea MCP.


@ -0,0 +1,167 @@
---
name: pest-testing
description: "Tests applications using the Pest 4 PHP framework. Activates when writing tests, creating unit or feature tests, adding assertions, testing Livewire components, browser testing, debugging test failures, working with datasets or mocking; or when the user mentions test, spec, TDD, expects, assertion, coverage, or needs to verify functionality works."
license: MIT
metadata:
author: laravel
---
# Pest Testing 4
## When to Apply
Activate this skill when:
- Creating new tests (unit, feature, or browser)
- Modifying existing tests
- Debugging test failures
- Working with browser testing or smoke testing
- Writing architecture tests or visual regression tests
## Documentation
Use `search-docs` for detailed Pest 4 patterns and documentation.
## Basic Usage
### Creating Tests
All tests must be written using Pest. Use `php artisan make:test --pest {name}`.
### Test Organization
- Unit/Feature tests: `tests/Feature` and `tests/Unit` directories.
- Browser tests: `tests/Browser/` directory.
- Do NOT remove tests without approval - these are core application code.
### Basic Test Structure
<!-- Basic Pest Test Example -->
```php
it('is true', function () {
    expect(true)->toBeTrue();
});
```
### Running Tests
- Run minimal tests with filter before finalizing: `php artisan test --compact --filter=testName`.
- Run all tests: `php artisan test --compact`.
- Run file: `php artisan test --compact tests/Feature/ExampleTest.php`.
## Assertions
Use specific assertions (`assertSuccessful()`, `assertNotFound()`) instead of `assertStatus()`:
<!-- Pest Response Assertion -->
```php
it('returns all', function () {
    $this->postJson('/api/docs', [])->assertSuccessful();
});
```
| Use | Instead of |
|-----|------------|
| `assertSuccessful()` | `assertStatus(200)` |
| `assertNotFound()` | `assertStatus(404)` |
| `assertForbidden()` | `assertStatus(403)` |
## Mocking
Import mock function before use: `use function Pest\Laravel\mock;`
## Datasets
Use datasets for repetitive tests (validation rules, etc.):
<!-- Pest Dataset Example -->
```php
it('has emails', function (string $email) {
    expect($email)->not->toBeEmpty();
})->with([
    'james' => 'james@laravel.com',
    'taylor' => 'taylor@laravel.com',
]);
```
## Pest 4 Features
| Feature | Purpose |
|---------|---------|
| Browser Testing | Full integration tests in real browsers |
| Smoke Testing | Validate multiple pages quickly |
| Visual Regression | Compare screenshots for visual changes |
| Test Sharding | Parallel CI runs |
| Architecture Testing | Enforce code conventions |
### Browser Test Example
Browser tests run in real browsers for full integration testing:
- Browser tests live in `tests/Browser/`.
- Use Laravel features like `Event::fake()`, `assertAuthenticated()`, and model factories.
- Use `RefreshDatabase` for clean state per test.
- Interact with page: click, type, scroll, select, submit, drag-and-drop, touch gestures.
- Test on multiple browsers (Chrome, Firefox, Safari) if requested.
- Test on different devices/viewports (iPhone 14 Pro, tablets) if requested.
- Switch color schemes (light/dark mode) when appropriate.
- Take screenshots or pause tests for debugging.
<!-- Pest Browser Test Example -->
```php
it('may reset the password', function () {
    Notification::fake();

    $this->actingAs(User::factory()->create());

    $page = visit('/sign-in');

    $page->assertSee('Sign In')
        ->assertNoJavaScriptErrors()
        ->click('Forgot Password?')
        ->fill('email', 'nuno@laravel.com')
        ->click('Send Reset Link')
        ->assertSee('We have emailed your password reset link!');

    Notification::assertSent(ResetPassword::class);
});
```
### Smoke Testing
Quickly validate multiple pages have no JavaScript errors:
<!-- Pest Smoke Testing Example -->
```php
$pages = visit(['/', '/about', '/contact']);
$pages->assertNoJavaScriptErrors()->assertNoConsoleLogs();
```
### Visual Regression Testing
Capture and compare screenshots to detect visual changes.
### Test Sharding
Split tests across parallel processes for faster CI runs.
### Architecture Testing
Pest 4 includes architecture testing (from Pest 3):
<!-- Architecture Test Example -->
```php
arch('controllers')
    ->expect('App\Http\Controllers')
    ->toExtendNothing()
    ->toHaveSuffix('Controller');
```
## Common Pitfalls
- Not importing `use function Pest\Laravel\mock;` before using mock
- Using `assertStatus(200)` instead of `assertSuccessful()`
- Forgetting datasets for repetitive validation tests
- Deleting tests without approval
- Forgetting `assertNoJavaScriptErrors()` in browser tests


@ -0,0 +1,625 @@
---
name: platform-feature-finish
description: Commit, push, create a Gitea PR from a TenantPilot platform feature branch into platform-dev, and optionally refresh the platform-dev to dev integration PR by rebase.
---
# Skill: platform-feature-finish
## Purpose
Automate the TenantPilot platform feature completion workflow.
Trigger this skill when the user says something like:
- "alles committen pushen und PR gegen platform-dev"
- "feature fertig, bitte PR erstellen"
- "platform feature abschließen"
- "commit push PR mit Gitea MCP"
- "mach PR gegen platform-dev"
- "finish platform feature"
- "platform-dev nach dev vorbereiten"
- "platform-dev PR aktualisieren"
- "out-of-date mit dev beheben"
- "integration PR refresh"
- "platform-dev auf dev rebasen"
This skill handles:
1. Validate current Git branch
2. Commit all feature changes
3. Push current feature branch
4. Create a Gitea pull request into `platform-dev`
5. Refresh the `platform-dev` → `dev` integration PR when explicitly requested
6. Report the PR link and next integration step
---
## Branch Model
TenantPilot uses area branches:
```text
dev = shared integration branch
platform-dev = platform/application area integration branch
website-dev = website/marketing area integration branch
```
For platform features:
```text
platform-dev
  → feature branch
  → PR back to platform-dev
  → platform-dev → dev integration PR
```
Rules:
- Platform feature branches MUST target `platform-dev`.
- Do NOT target `dev` directly unless the user explicitly asks.
- Do NOT use `website-dev` for platform features.
- `platform-dev` is the default PR base for TenantPilot platform/application work.
- `dev` is the shared integration branch.
### Solo Workflow Rule
The user works alone on `platform-dev`.
When refreshing the integration branch before opening or updating the `platform-dev` → `dev` PR, prefer rebase over merge.
Do not repeatedly merge `origin/dev` into `platform-dev` for refresh.
Avoid creating repeated merge commits like:
```text
Merge remote-tracking branch 'origin/dev' into platform-dev
```
Use `--force-with-lease`, never plain `--force`.
If rebase conflicts occur, stop and report the conflict files.
---
## Preconditions
Before committing:
1. Confirm repository root.
2. Confirm current branch is not protected.
Protected branches:
```text
dev
platform-dev
website-dev
main
master
```
If the current branch is protected, STOP and report:
```text
Ich bin auf einem geschützten Branch. Bitte zuerst einen Feature-Branch auschecken.
```
3. Confirm remote exists.
4. Confirm there are local changes, untracked files, or unpushed commits.
5. Confirm there are no unresolved conflicts.
Do not ask for confirmation unless:
- The current branch is protected.
- Git status indicates unresolved conflicts.
- There is no remote configured.
- `.env` or other local secret/config files would be committed.
- Commit fails.
- Push fails.
- Gitea MCP PR creation fails.
---
## Required Tools
Use terminal for Git operations.
Use Gitea MCP for pull request creation.
Preferred Gitea MCP operation:
```text
create_pull_request
```
Required PR parameters:
```json
{
  "owner": "ahmido",
  "repo": "TenantAtlas",
  "head": "<current-feature-branch>",
  "base": "platform-dev",
  "title": "<generated-title>",
  "body": "<generated-body>"
}
```
---
## Workflow
### Step 1 — Inspect Git state
Run:
```bash
git rev-parse --show-toplevel
git rev-parse --abbrev-ref HEAD
git status --porcelain
git status -sb
git config --get remote.origin.url
git log --oneline --max-count=5
```
Determine:
- repository root
- current branch
- changed files
- untracked files
- remote URL
- whether there are unpushed commits
- whether unresolved conflicts exist
If the current branch is protected, stop.
If unresolved conflicts exist, stop.
If no remote exists, stop.
---
### Step 2 — Check for local environment files
Before `git add -A`, check whether local environment/config files are modified or untracked:
```bash
git status --porcelain | grep -E '(^.. \.env$|^.. apps/platform/\.env$|^.. .*\.env$)' || true
```
If `.env` or another environment file is included, STOP and report:
```text
Achtung: Eine .env-/Environment-Datei ist geändert oder untracked. Ich committe das nicht automatisch. Bitte prüfen oder aus dem Commit entfernen.
```
Do not commit secrets or local runtime configuration.
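The `grep` pattern above only matches paths that end in `.env`, so variants such as `.env.local` slip through. A slightly broader check could look like the sketch below. The function name `find_env_files` is hypothetical, and the sketch assumes the short `git status --porcelain` format (two status characters, a space, then the path) with unquoted paths.

```bash
# Hypothetical helper: read `git status --porcelain` lines on stdin and
# print every path that looks like an environment file.
# Assumes unquoted paths; paths that git quotes (spaces, special
# characters) are not handled by this sketch.
find_env_files() {
  sed -e 's/^...//' -e 's/^.* -> //' \
    | grep -E '(^|/)[^/]*\.env(\.[^/]*)?$' || true
}

# Typical usage inside a repository:
#   git status --porcelain | find_env_files
```

The first `sed` expression strips the status columns and the second keeps only the new name of a renamed file. The pattern also flags files such as `.env.example`, which are often committed on purpose, so a match is a reason to stop and ask rather than proof of a leak.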
---
### Step 3 — Build commit message
Use the current branch name.
If branch starts with a spec number, for example:
```text
256-external-support-desk-handoff
```
Generate:
```text
feat(specs/256): external support desk handoff
```
If branch does not contain a spec number, generate:
```text
feat(platform): complete <branch-name>
```
Rules:
- Use lowercase subject.
- Use feature-style subject.
- Do not include `WIP`.
- Do not include `final`.
- Do not include overly generic `updates`.
Examples:
```text
feat(specs/256): external support desk handoff
feat(specs/252): platform localization v1
feat(platform): improve tenant review workspace
```
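One way to derive the message from the branch name is sketched below. The function name `build_commit_message` is hypothetical; the sketch assumes spec branches follow the `<number>-<slug>` shape shown above and treats every other branch as a generic platform branch.

```bash
# Hypothetical helper: derive a commit message from a branch name.
# Assumes spec branches look like "<number>-<slug>".
build_commit_message() {
  branch="$1"
  number="${branch%%-*}"          # text before the first hyphen

  # A branch without any hyphen cannot be "<number>-<slug>".
  case "$branch" in
    *-*) ;;
    *) number="" ;;
  esac

  case "$number" in
    ''|*[!0-9]*)
      # No leading spec number: fall back to the generic form.
      printf 'feat(platform): complete %s\n' "$branch"
      ;;
    *)
      slug="${branch#*-}"         # text after the first hyphen
      subject="$(printf '%s' "$slug" | tr '[:upper:]' '[:lower:]' | sed 's/-/ /g')"
      printf 'feat(specs/%s): %s\n' "$number" "$subject"
      ;;
  esac
}

build_commit_message "256-external-support-desk-handoff"
```

The fallback reuses the raw branch name, so a hand-written subject may still read better for branches without a spec number.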
---
### Step 4 — Commit all changes
Run:
```bash
git add -A
git commit -m "<commit-message>"
```
If there are no local changes to commit, continue only if the branch has unpushed commits.
Check unpushed commits with:
```bash
git status -sb
git log --oneline origin/<current-branch>..HEAD
```
If there are no local changes and no unpushed commits, report:
```text
Es gibt keine lokalen Änderungen und keine unpushed commits. Ich erstelle keinen leeren Commit.
```
Then continue to PR creation only if the branch already exists remotely or can be pushed.
---
### Step 5 — Push branch
Run:
```bash
git push --set-upstream origin <current-branch>
```
If the upstream already exists, this is acceptable.
Never force-push unless the user explicitly requests it.
---
### Step 6 — Create PR into platform-dev via Gitea MCP
Use Gitea MCP to create a pull request:
```json
{
  "owner": "ahmido",
  "repo": "TenantAtlas",
  "head": "<current-feature-branch>",
  "base": "platform-dev",
  "title": "<commit-message>",
  "body": "Implements platform feature branch `<current-feature-branch>`.\n\nTarget branch: `platform-dev`.\n\nFollow-up integration path after merge:\n\n`platform-dev` → `dev`."
}
```
If a PR already exists for the same branch and base, do not create a duplicate.
Report the existing PR if available.
---
## Optional Step — Check platform-dev to dev PR
After creating the feature PR, check whether an open integration PR exists:
```text
platform-dev → dev
```
If a Gitea MCP list/search pull request function is available, use it.
If one exists, report:
```text
Der Folge-PR `platform-dev` → `dev` existiert bereits: <url>
```
If none exists, report:
```text
Nach dem Merge dieses Feature-PRs sollte der Integrations-PR `platform-dev` → `dev` erstellt oder aktualisiert werden.
```
Do not automatically create the `platform-dev` → `dev` PR unless the user explicitly asks for it.
Reason: before the feature PR is merged into `platform-dev`, the integration PR may not include the new feature yet.
---
## Integration Refresh Mode
Use this mode when the user explicitly says one of the following:
- "platform-dev nach dev vorbereiten"
- "platform-dev PR aktualisieren"
- "out-of-date mit dev beheben"
- "integration PR refresh"
- "platform-dev auf dev rebasen"
- "auch platform-dev nach dev"
- "und danach platform-dev nach dev"
- "full integration"
- "kompletten platform-dev zu dev PR machen"
- "folge-pr erstellen"
This mode prepares or updates the integration PR:
```text
platform-dev → dev
```
Because the user works alone on `platform-dev`, prefer rebase over merge.
### Integration Refresh Preconditions
Before running this mode:
1. Ensure the working tree is clean.
2. Ensure there are no unresolved conflicts.
3. Fetch remote branches.
4. Ensure `origin/platform-dev` exists.
5. Ensure `origin/dev` exists.
If the working tree is dirty, STOP and report:
```text
Der Working Tree ist nicht sauber. Bitte erst Änderungen committen, stashen oder verwerfen, bevor `platform-dev` auf `dev` rebased wird.
```
If unresolved conflicts exist, STOP and report the conflict files.
### Integration Refresh Workflow
Run:
```bash
git fetch origin
git checkout platform-dev
git reset --hard origin/platform-dev
git rebase origin/dev
git push --force-with-lease origin platform-dev
```
After pushing, verify that `origin/dev` is now an ancestor of `origin/platform-dev`:
```bash
git fetch origin
git merge-base --is-ancestor origin/dev origin/platform-dev \
&& echo "OK: platform-dev contains dev" \
|| echo "OUTDATED: platform-dev does not contain dev"
```
If the verification prints `OUTDATED`, stop and report it. Do not claim the PR is up-to-date.
Rules:
- Do not merge `origin/dev` into `platform-dev` for this refresh.
- Do not create repeated merge commits from `origin/dev` into `platform-dev`.
- Use `git push --force-with-lease origin platform-dev` after a successful rebase.
- Never use plain `git push --force`.
- If `git rebase origin/dev` reports conflicts, stop immediately.
- Do not continue to PR creation while a rebase is unresolved.
- Do not auto-merge the PR.
- Do not claim Gitea will remove the out-of-date warning unless the ancestor check succeeds.
If rebase conflicts occur, report:
```text
Rebase-Konflikte erkannt. Ich habe gestoppt.
Konfliktdateien:
<files>
Bitte Konflikte lösen, dann `git rebase --continue` ausführen oder den Rebase mit `git rebase --abort` abbrechen.
```
### Create or Report Integration PR
After the rebase, push, and ancestor verification have all succeeded, use Gitea MCP to create or report the integration PR:
```json
{
  "owner": "ahmido",
  "repo": "TenantAtlas",
  "head": "platform-dev",
  "base": "dev",
  "title": "chore(platform): merge platform-dev into dev",
  "body": "Integrates latest TenantPilot platform changes from `platform-dev` into `dev`.\n\nThis PR was created by agent on user request; do not merge automatically."
}
```
If an open PR already exists for `platform-dev` → `dev`, do not create a duplicate. Report the existing PR.
### Integration Refresh Reporting Format
Final response for this mode must include:
```text
Fertig.
- Branch aktualisiert: platform-dev
- Refresh-Methode: rebase auf origin/dev
- Ancestor-Check: origin/dev ist Ancestor von origin/platform-dev
- Push: --force-with-lease origin/platform-dev
- Integration PR: <url>
- Base: dev
- Hinweis: PR wurde nicht automatisch gemerged.
```
Do not claim tests passed unless they were actually executed.
---
## Reporting Format
Final response must be concise and include:
```text
Fertig.
- Branch: <branch>
- Commit: <commit-sha or "keine neuen Änderungen">
- Push: origin/<branch>
- PR: <url>
- Base: platform-dev
- Nächster Schritt: Nach Merge `platform-dev` → `dev` PR aktualisieren/erstellen
```
If tests were not run, say:
```text
Tests wurden in diesem Skill nicht automatisch ausgeführt.
```
Do not claim tests passed unless the tool actually ran them.
---
## Safety Rules
- Never commit directly to `dev`, `platform-dev`, `website-dev`, `main`, or `master`.
- Never force-push unless explicitly requested.
- For Integration Refresh Mode only, `git push --force-with-lease origin platform-dev` is allowed because the user works alone on `platform-dev`; never use plain `--force`.
- Never auto-merge PRs unless explicitly requested.
- Never target `dev` directly for platform feature PRs unless explicitly requested.
- Never delete branches unless explicitly requested.
- Never claim tests were run unless the tool actually ran them.
- Never commit `.env`, secrets, local tokens, local mock-server configuration, or temporary runtime-only changes.
- If migrations were created, mention that the target environment needs migration execution after deployment.
- If unresolved conflicts exist, stop.
---
## Useful Commands
Inspect:
```bash
git rev-parse --show-toplevel
git rev-parse --abbrev-ref HEAD
git status --porcelain
git status -sb
git config --get remote.origin.url
```
Detect protected branch:
```bash
branch="$(git rev-parse --abbrev-ref HEAD)"
case "$branch" in
  dev|platform-dev|website-dev|main|master)
    echo "PROTECTED_BRANCH:$branch"
    exit 2
    ;;
esac
```
Detect unresolved conflicts:
```bash
git diff --name-only --diff-filter=U
```
Detect `.env` changes:
```bash
git status --porcelain | grep -E '(^.. \.env$|^.. apps/platform/\.env$|^.. .*\.env$)' || true
```
Commit:
```bash
git add -A
git commit -m "<message>"
```
Push:
```bash
git push --set-upstream origin "$(git rev-parse --abbrev-ref HEAD)"
```
Latest commit:
```bash
git rev-parse --short HEAD
git log -1 --pretty=%s
```
Integration refresh:
```bash
git fetch origin
git checkout platform-dev
git reset --hard origin/platform-dev
git rebase origin/dev
git push --force-with-lease origin platform-dev
```
Verify integration refresh:
```bash
git fetch origin
git merge-base --is-ancestor origin/dev origin/platform-dev \
&& echo "OK: platform-dev contains dev" \
|| echo "OUTDATED: platform-dev does not contain dev"
```
Check rebase conflicts:
```bash
git diff --name-only --diff-filter=U
```
---
## Example User Request
User:
```text
alles committen pushen und pr gegen platform-dev mit gitea mcp
```
Assistant should:
1. Check current branch.
2. Stop if branch is protected.
3. Stop if `.env` or secrets would be committed.
4. Commit all changes.
5. Push current branch.
6. Create PR into `platform-dev` with Gitea MCP.
7. Report result.
Do not ask unnecessary follow-up questions.
---
## Example Integration Refresh Request
User:
```text
platform-dev PR aktualisieren
```
Assistant should:
1. Ensure the working tree is clean.
2. Fetch origin.
3. Checkout `platform-dev`.
4. Reset local `platform-dev` to `origin/platform-dev`.
5. Rebase `platform-dev` onto `origin/dev`.
6. Push with `--force-with-lease`.
7. Verify `origin/dev` is an ancestor of `origin/platform-dev`.
8. Create or report the PR `platform-dev` → `dev`.
9. Report result.
Do not merge the PR automatically.


@ -0,0 +1,447 @@
---
name: spec-kit-implementation-loop
description: Implement an existing TenantPilot/TenantAtlas Spec Kit feature, run tests, browser smoke checks where applicable, post-implementation analysis, fix all confirmed in-scope findings when safe and bounded, and repeat until no in-scope findings remain or a stop condition is reached.
---
# Skill: Spec Kit Implementation Loop
## Purpose
Use this skill to implement an already prepared TenantPilot/TenantAtlas Spec Kit feature and verify it with a bounded implementation loop.
This skill assumes `spec.md`, `plan.md`, and `tasks.md` already exist and have passed preparation readiness or have been explicitly accepted by the user.
The intended workflow is:
```text
active or explicitly named spec
→ inspect repo truth, constitution, spec, plan, tasks, and relevant code/tests
→ evaluate implementation gates
→ implement strictly task-by-task
→ run relevant tests/checks
→ run browser smoke test when UI/user-facing flows are affected
→ run strict post-implementation analysis
→ fix confirmed in-scope findings
→ repeat test + browser smoke + analysis + fix loop until clean or bounded stop condition is reached
→ final implementation report
```
## When to Use
Use this skill when the user asks to:
- implement an active or explicitly named Spec Kit feature
- run Spec Kit implement
- analyze after implementation
- fix implementation findings
- repeat implementation verification until no confirmed in-scope findings remain
- run tests and browser smoke checks after implementation
Typical user prompts:
```text
Implementiere die aktive Spec und analysiere danach, ob alles passt.
```
```text
Implementiere specs/243-product-usage-adoption-telemetry streng nach tasks.md.
```
```text
Mach Spec Kit implement und danach analyse. Behebe alle Abweichungen und wiederhole bis sauber.
```
```text
Implementiere die vorbereitete Spec. Danach Tests, Browser Smoke Test falls UI betroffen ist, Analyse und Fix-Loop bis keine In-Scope Findings mehr offen sind.
```
## Hard Rules
- Work strictly repo-based.
- Implement only the active or explicitly named Spec Kit feature.
- Do not choose a new candidate.
- Do not create a new spec.
- Do not expand scope beyond `spec.md`, `plan.md`, and `tasks.md`.
- Do not silently add roadmap features, adjacent UX rewrites, speculative architecture, or unrelated refactors.
- Follow the repository constitution and existing Spec Kit conventions.
- Preserve TenantPilot/TenantAtlas terminology.
- Prefer small, reviewable patches over broad rewrites.
- Treat repository truth as authoritative over assumptions.
- If repository truth conflicts with implementation scope, stop and report the conflict unless there is an obvious minimal correction inside active spec scope.
- Fix only confirmed findings from tests, static checks, browser smoke checks, or post-implementation analysis.
- Fix all confirmed in-scope findings, regardless of severity, when they are safe and bounded.
- Do not leave Medium/Low findings open silently. If they are not fixed, document exactly why.
- Never hide failing tests, weaken assertions, delete meaningful coverage, or mark tasks complete without implementation evidence.
- Do not run destructive commands.
- Do not force checkout, reset, stash, rebase, merge, or delete branches.
- Do not perform database-destructive actions unless the repository test workflow explicitly requires isolated test database resets.
- Do not continue analysis/fix loops indefinitely; stop after three analysis/fix iterations or as soon as any other stop condition applies.
- Do not move from implementation to final status unless the Test Gate, Browser Smoke Test Gate where applicable, and Post-Implementation Analysis Gate have been evaluated.
- Do not claim merge-readiness unless the Merge Readiness Gate passes.
## Required Inputs
The user should provide at least one of:
- explicit spec directory such as `specs/<number>-<slug>/`
- instruction to use the current active Spec Kit feature
- instruction to implement the prepared/current spec
If the active spec cannot be determined safely, inspect the repository Spec Kit context first. If it is still ambiguous, stop and ask for the specific spec directory.
## Required Repository Checks
Always check:
1. active Spec Kit context / current branch
2. git status
3. `.specify/memory/constitution.md`
4. the active spec directory
5. `spec.md`
6. `plan.md`
7. `tasks.md`
8. relevant templates or conventions under `.specify/templates/`
9. nearby existing specs with related terminology or scope
10. application code surfaces referenced by the active spec
11. existing tests related to the changed behavior
## Git and Branch Safety
Before making implementation changes:
1. Check the current branch.
2. Check whether the working tree is clean.
3. If there are unrelated uncommitted changes, stop and report them. Do not continue.
4. If the working tree only contains user-intended changes for this operation, continue cautiously.
5. Do not force checkout, reset, stash, rebase, merge, or delete branches.
6. Do not overwrite unrelated work.
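The checks above can be approximated with read-only git commands. The following is a minimal sketch, not the repository's own tooling: the function names and the `allowed_prefixes` parameter are hypothetical, and rename entries (`old -> new`) are not split apart. It only calls `git rev-parse --abbrev-ref HEAD` and `git status --porcelain`, neither of which modifies the working tree.

```python
import subprocess


def parse_porcelain(output: str) -> list[str]:
    """Return the paths listed in `git status --porcelain` output."""
    paths = []
    for line in output.splitlines():
        if len(line) > 3:
            # Each entry is two status characters, a space, then the path.
            paths.append(line[3:])
    return paths


def unrelated_changes(changed: list[str], allowed_prefixes: tuple[str, ...]) -> list[str]:
    """Changed paths outside the prefixes the user intends to touch (hypothetical rule)."""
    return [p for p in changed if not p.startswith(allowed_prefixes)]


def working_tree_report(allowed_prefixes: tuple[str, ...]) -> dict:
    """Read-only inspection: current branch plus unrelated uncommitted paths."""
    branch = subprocess.run(
        ["git", "rev-parse", "--abbrev-ref", "HEAD"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    status = subprocess.run(
        ["git", "status", "--porcelain"],
        capture_output=True, text=True, check=True,
    ).stdout
    changed = parse_porcelain(status)
    return {"branch": branch, "unrelated": unrelated_changes(changed, allowed_prefixes)}
```

If `unrelated` is non-empty, the safe behavior described above is to stop and report those paths rather than stash or reset them.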
## Quality Gates
### Gate 1: Spec Readiness Gate
Required before implementation starts.
Pass criteria:
- `spec.md`, `plan.md`, and `tasks.md` exist.
- The spec has a clear problem statement, user value, functional requirements, out-of-scope boundaries, acceptance criteria, assumptions, and risks.
- The plan identifies likely affected repo surfaces and does not contradict repository architecture.
- The tasks are small, ordered, verifiable, and include test/validation tasks.
- RBAC, workspace/tenant isolation, auditability, OperationRun semantics, evidence/result-truth, and UX requirements are addressed where relevant.
- No open question blocks safe implementation.
- The scope is small enough for a bounded implementation loop.
Fail behavior:
- Stop before implementation.
- Report readiness gaps.
- Do not compensate for an unclear spec by inventing implementation scope.
### Gate 2: Implementation Scope Gate
Required before changing application code.
Pass criteria:
- The active spec directory is known.
- The implementation target is traceable to specific tasks in `tasks.md`.
- The affected files/surfaces are consistent with `plan.md` or clearly justified by repository truth.
- No required change would introduce unrelated product behavior.
- No required change conflicts with constitution, existing architecture, RBAC/isolation boundaries, or source-of-truth semantics.
Fail behavior:
- Stop before code changes and report the conflict or ambiguity.
- Suggest a minimal spec/plan/tasks correction if the issue is in the artifacts rather than the codebase.
### Gate 3: Test Gate
Required after implementation and after each fix iteration.
Pass criteria:
- Targeted tests for changed behavior pass.
- Relevant existing tests pass or failures are proven unrelated and documented.
- Static analysis, linting, formatting, or type checks used by the repository pass when applicable.
- Security/governance-relevant changes have backend, policy, or domain coverage; UI-only verification is not enough.
- Regression coverage exists for each fixed Blocker or High finding where practical.
Fail behavior:
- Fix in-scope failures before post-implementation analysis.
- If failures are unrelated or pre-existing, document evidence and continue only if they do not invalidate the active spec.
- Do not weaken tests to pass the gate.
### Gate 4: Browser Smoke Test Gate
Required before claiming implementation is ready for manual review/merge when the change affects Filament UI, Livewire interactions, navigation, forms, tables, actions, modals, dashboards, operation drilldowns, tenant/workspace context, or any user-facing flow.
Not required for backend-only, domain-only, enum-only, contract-only, or test-only changes unless those changes alter a user-facing flow.
Pass criteria:
- The relevant page or flow loads in a real browser or the repository's browser-testing harness.
- The primary action introduced or changed by the spec can be executed successfully.
- Expected UI states, labels, badges, actions, empty states, tables, forms, modals, and navigation are visible where relevant.
- Workspace/tenant context is preserved across the tested flow where relevant.
- RBAC/capability-dependent visibility behaves as expected where practical to verify.
- Livewire interactions complete without visible runtime errors.
- No relevant browser console errors occur.
- No failed network requests occur for the tested flow, except known unrelated development noise that is explicitly documented.
- OperationRun, audit, evidence, result, or support-diagnostic drilldowns work where relevant.
- The smoke-tested path is documented in the final response.
Fail behavior:
- Fix in-scope browser, UX, Livewire, navigation, or runtime failures before claiming merge-readiness.
- If a browser issue is unrelated existing debt, document evidence and residual risk.
- Do not treat a passing browser smoke test as a substitute for backend, policy, domain, security, feature, or integration tests.
- Do not expand the smoke test into a full E2E suite unless the user explicitly asks for that.
### Gate 5: Post-Implementation Analysis Gate
Required after implementation and after each fix iteration.
Pass criteria:
- The implementation has been checked against `spec.md`, `plan.md`, `tasks.md`, and constitution.
- All completed tasks have implementation evidence.
- No confirmed in-scope findings remain.
- Medium/Low findings are fixed when they are inside active spec scope, clearly bounded, and safe.
- Medium/Low findings that remain open are explicitly documented with one of these reasons:
- out of scope
- requires separate spec
- risky refactor
- existing unrelated debt
- not reproducible
- blocked by unclear product/architecture decision
- No scope expansion was introduced during fixes.
Fail behavior:
- Fix confirmed in-scope findings, regardless of severity, when the fix is safe and bounded.
- Stop instead of fixing when remediation would expand scope, contradict repo architecture, introduce risky refactors, or repeat the same failed fix twice.
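One way to make the "no silent open findings" rule checkable is to require that every finding is either fixed or carries one of the six documented reasons. This is an illustrative sketch only; the `Finding` type and the function name are hypothetical, and it simplifies the gate by treating any unfixed finding without an allowed reason as a violation.

```python
from dataclasses import dataclass
from typing import Optional

# The six reasons listed in the pass criteria above.
ALLOWED_OPEN_REASONS = {
    "out of scope",
    "requires separate spec",
    "risky refactor",
    "existing unrelated debt",
    "not reproducible",
    "blocked by unclear product/architecture decision",
}


@dataclass
class Finding:
    title: str
    severity: str  # "Blocker", "High", "Medium", or "Low"
    fixed: bool = False
    open_reason: Optional[str] = None


def undocumented_open_findings(findings: list[Finding]) -> list[str]:
    """Titles of findings that are neither fixed nor documented with an allowed reason."""
    return [
        f.title
        for f in findings
        if not f.fixed and f.open_reason not in ALLOWED_OPEN_REASONS
    ]
```

An empty result is necessary for the gate to pass but not sufficient: whether a documented reason is actually valid still needs repo-based judgment.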
### Gate 6: Merge Readiness Gate
Required before claiming the implementation is ready for manual review/merge.
Pass criteria:
- Spec Readiness Gate passed.
- Implementation Scope Gate passed.
- Test Gate passed.
- Browser Smoke Test Gate passed when applicable, or was explicitly marked not applicable with a reason.
- Post-Implementation Analysis Gate passed.
- `tasks.md` reflects actual completion status.
- No confirmed in-scope findings remain.
- All remaining findings are documented as out-of-scope, follow-up candidates, unrelated existing debt, or explicit residual risks.
- Final response includes changed files, tests/checks run, browser smoke result, iterations performed, residual risks, and follow-up candidates.
Fail behavior:
- Do not claim merge-readiness.
- Report the failed gate, remaining risks, and the smallest recommended next action.
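The Merge Readiness Gate is essentially a conjunction over the earlier gates, with one exception: the Browser Smoke Test Gate may be marked not applicable when a reason is given. A minimal sketch of that rule follows, assuming hypothetical gate keys such as `browser_smoke`; it does not cover the documentation criteria (for example `tasks.md` status or the final response contents).

```python
from enum import Enum


class GateResult(Enum):
    PASSED = "passed"
    FAILED = "failed"
    NOT_APPLICABLE = "not applicable"


def merge_ready(
    gates: dict[str, GateResult], na_reasons: dict[str, str]
) -> tuple[bool, list[str]]:
    """Return (ready, names of blocking gates)."""
    blocking = []
    for name, result in gates.items():
        if result is GateResult.PASSED:
            continue
        # Only the browser smoke gate may be skipped, and only with a stated reason.
        if (
            result is GateResult.NOT_APPLICABLE
            and name == "browser_smoke"
            and na_reasons.get(name)
        ):
            continue
        blocking.append(name)
    return (not blocking, blocking)
```

The list of blocking gates maps directly onto the fail behavior: report the failed gate and the smallest recommended next action.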
## Implementation Loop
Execute the loop in bounded phases:
1. Evaluate the Spec Readiness Gate.
2. Evaluate the Implementation Scope Gate before changing application code.
3. Implement the active Spec Kit feature scope task-by-task.
4. Run targeted tests and relevant static/dynamic checks.
5. Evaluate the Test Gate.
6. Run a Browser Smoke Test when the change affects UI/user-facing flows.
7. Evaluate the Browser Smoke Test Gate as passed, failed, or not applicable with a reason.
8. Run strict post-implementation analysis against spec, plan, tasks, constitution, changed code, changed tests, browser smoke results where applicable, and relevant existing patterns.
9. Evaluate the Post-Implementation Analysis Gate.
10. Identify confirmed findings by severity: Blocker, High, Medium, Low.
11. Fix all confirmed in-scope findings regardless of severity when safe and bounded.
12. Do not fix findings that require scope expansion, risky unrelated refactors, or architectural/product decisions outside the active spec; document them as follow-up/residual risks with reasons.
13. Re-run relevant tests and browser smoke checks where applicable after fixes.
14. Repeat the test + browser smoke + analysis + fix loop until no confirmed in-scope findings remain or a stop condition is reached.
15. Evaluate the Merge Readiness Gate.
16. Report final implementation status, changed files, tests, browser smoke result, residual risks, failed/passed gates, and manual review prompt.
## Stop Conditions
Stop the implementation loop when any of the following is true:
- No confirmed in-scope findings remain.
- The same finding appears twice after attempted fixes.
- A required fix conflicts with the spec, plan, constitution, or repository architecture.
- A required fix would expand scope beyond the active spec.
- A required fix would require a risky unrelated refactor.
- A required fix depends on an unresolved product or architecture decision.
- Tests reveal an unrelated pre-existing failure that cannot be safely fixed inside the active spec.
- Browser smoke testing reveals an unrelated pre-existing UI/runtime failure that cannot be safely fixed inside the active spec.
- Three analysis/fix iterations have already been completed.
- The repository state is ambiguous enough that continuing would risk damaging architecture or data semantics.
When stopping before full cleanliness, report exactly why the loop stopped and what remains.
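The bounded nature of the loop can be sketched as follows. This is an assumption-laden illustration: `analyze` and `fix` are hypothetical callables, findings are reduced to string identifiers, and "the same finding appears twice" is read conservatively as "a finding is still present after one fix attempt". Only three of the stop conditions are modeled (clean result, repeated finding, iteration limit).

```python
from typing import Callable

MAX_ITERATIONS = 3  # "Three analysis/fix iterations have already been completed."


def run_fix_loop(
    analyze: Callable[[], set[str]],
    fix: Callable[[set[str]], None],
) -> tuple[str, set[str]]:
    """Run analysis/fix rounds; return (stop reason, findings still open)."""
    attempted: set[str] = set()
    for _ in range(MAX_ITERATIONS):
        findings = analyze()
        if not findings:
            return "clean", set()
        if findings & attempted:
            # A finding survived an earlier fix attempt: stop instead of retrying.
            return "same finding reappeared after a fix attempt", findings
        fix(findings)
        attempted |= findings
    # Final analysis after the last permitted fix iteration.
    remaining = analyze()
    return ("clean" if not remaining else "iteration limit reached"), remaining
```

The returned reason and remaining findings correspond to the requirement to report exactly why the loop stopped and what remains.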
## Post-Implementation Analysis Prompt
Use this prompt internally after implementation and after each fix iteration:
```markdown
You are a Senior Staff Software Engineer, Software Architect, and Enterprise SaaS Reviewer.
Analyze the implementation of the active spec strictly repo-based.
Goal:
Check whether the implementation is complete, consistent, tested, and constitution-compliant.
Check against:
- spec.md
- plan.md
- tasks.md
- .specify/memory/constitution.md
- changed application code
- changed tests
- browser smoke test result, if UI/user-facing flows are affected
- existing repository patterns
Important:
- No speculation without repo evidence.
- No scope expansion.
- No new product ideas as mandatory fixes.
- Group findings by Blocker, High, Medium, Low.
- Cite concrete file/code evidence for each finding.
- Name a minimal remediation for each finding.
- State separately which findings must be fixed within the active spec.
- Also mark Medium/Low findings within the active spec for fixing when they are safe and bounded.
- For UI/Filament/Livewire changes, check whether a browser smoke test was performed and whether the tested operator flow actually works.
- Report findings that should not be fixed as follow-up/residual risk only when they are out of scope, a risky refactor, unrelated existing debt, not reproducible, or blocked by an open product/architecture decision.
- If no confirmed in-scope findings remain, give a clear implementation approval.
```
## Task Completion Rules
- Keep `tasks.md` aligned with actual implementation status.
- Check off tasks only after the implementation and test evidence exist.
- If a task is obsolete because repository truth proves a different path, update the task note with the reason instead of silently deleting it.
- If a task cannot be completed inside scope, leave it unchecked and report why.
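To compare `tasks.md` against actual evidence, it helps to list which tasks are currently checked. The sketch below assumes tasks are written as Markdown checkboxes (`- [ ]` and `- [x]`); that format is an assumption about this repository's `tasks.md`, and the function name is hypothetical.

```python
import re

# Matches "- [ ] text", "- [x] text", or "* [X] text" with optional indentation.
TASK_LINE = re.compile(r"^\s*[-*] \[( |x|X)\] (.+)$")


def task_status(tasks_md: str) -> dict[str, list[str]]:
    """Split checkbox tasks into checked ("done") and unchecked ("open")."""
    done, open_ = [], []
    for line in tasks_md.splitlines():
        match = TASK_LINE.match(line)
        if match:
            target = open_ if match.group(1) == " " else done
            target.append(match.group(2).strip())
    return {"done": done, "open": open_}
```

Every entry under `done` should be traceable to implementation and test evidence; anything that is not should be unchecked and reported.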
## Testing Rules
- Add or update tests for all changed business behavior.
- Include RBAC and workspace/tenant isolation tests where relevant.
- Include OperationRun, audit, evidence, or result-truth tests where relevant.
- Prefer regression tests for every fixed Blocker or High finding.
- Add regression tests for Medium/Low findings when the behavior is important and testable without excessive churn.
- Do not weaken tests to pass the suite.
- Do not treat a green UI path as sufficient without backend or policy coverage when the behavior is security- or governance-relevant.
## Browser Smoke Test Rules
Apply these rules when the active spec changes Filament UI, Livewire interactions, navigation, forms, tables, actions, modals, dashboards, operation drilldowns, tenant/workspace context, or any user-facing flow.
The browser smoke test should be narrow and focused. It is not a full E2E suite unless explicitly requested.
Minimum smoke path:
1. Open the relevant page or entry point.
2. Confirm the expected workspace/tenant context where relevant.
3. Confirm the changed or newly introduced UI element is visible.
4. Execute the primary action or interaction changed by the spec.
5. Confirm the expected result state, notification, redirect, table update, modal state, operation link, or drilldown.
6. Check for relevant console errors.
7. Check for failed network requests related to the tested flow.
8. Document the tested path in the final response.
For TenantPilot/TenantAtlas, pay special attention to:
- Filament actions and header actions
- Livewire polling, modals, validation, and actions
- workspace/tenant context preservation
- RBAC/capability-dependent action visibility
- OperationRun links and drilldown continuity
- audit/evidence/result/support-diagnostic drilldowns where relevant
- empty states, badges, labels, and decision guidance where relevant
Browser smoke testing is required for UI/user-facing changes and optional for backend-only changes.
Do not treat browser smoke success as proof that backend security, policies, domain logic, auditability, or workspace/tenant isolation are correct. Those still require automated tests or repo-based verification.
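A smoke result can be recorded in a small structure so the pass/fail decision and the tested path are reported together. This is a sketch under assumptions: the `SmokeResult` type and its field names are hypothetical, and it only encodes the console-error and failed-request criteria, with documented development noise excluded.

```python
from dataclasses import dataclass, field


@dataclass
class SmokeResult:
    tested_path: list[str]  # ordered steps that were actually exercised
    console_errors: list[str] = field(default_factory=list)
    failed_requests: list[str] = field(default_factory=list)
    documented_noise: set[str] = field(default_factory=set)  # known unrelated failures

    def relevant_failed_requests(self) -> list[str]:
        """Failed requests that are not explicitly documented as unrelated noise."""
        return [r for r in self.failed_requests if r not in self.documented_noise]

    def passed(self) -> bool:
        """Pass only with a documented path, no console errors, and no relevant failures."""
        return (
            bool(self.tested_path)
            and not self.console_errors
            and not self.relevant_failed_requests()
        )
```

A passing `SmokeResult` says nothing about policies, isolation, or domain logic; those remain the job of automated tests.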
## Failure Handling
If an implementation step, test phase, browser smoke phase, or post-implementation analysis fails:
1. Stop at the relevant gate or stop condition.
2. Report the failing command or phase.
3. Summarize the error.
4. Do not attempt unrelated implementation as a workaround.
5. Suggest the smallest safe next action.
If the branch or working tree state is unsafe:
1. Stop before implementation changes.
2. Report the current branch and relevant uncommitted files.
3. Ask the user to commit, stash, or move to a clean worktree.
## Final Response Requirements
Respond with:
1. Active spec directory
2. Summary of implemented changes
3. Tests/checks run and their results
4. Browser smoke test result and tested path, or the reason it was not applicable
5. Quality gates passed/failed and number of analysis/fix iterations performed
6. Remaining in-scope findings, if any
7. Residual risks and follow-up candidates, if relevant
8. Files changed
9. Explicit statement whether the Merge Readiness Gate passed and whether the implementation is ready for manual review/merge
Keep the final response concise, but include enough detail for the user to continue immediately.
## Manual Review Prompt
Provide a ready-to-copy prompt like this, adapted to the active spec number and slug:
```markdown
You are a Senior Staff Software Architect and Enterprise SaaS Reviewer.
Perform a final manual review of the implemented spec `<spec-number>-<slug>` strictly repo-based.
Goal:
Check whether the implementation is really merge-ready after the agent loop.
Important:
- No implementation.
- No code changes.
- No scope expansion.
- Check against spec.md, plan.md, tasks.md, and constitution.md.
- Check the changed files, tests, browser smoke test result, RBAC, workspace/tenant isolation, auditability, UX, and OperationRun semantics, where relevant.
- Name only concrete findings with repo evidence.
- End with a clear decision: merge-ready, merge-ready with notes, or not merge-ready.
```
## Example Invocation
User:
```text
Use the skill spec-kit-implementation-loop.
Implement the active spec.
Then run the tests, run a browser smoke test if UI/user-facing flows are affected, perform the post-implementation analysis, and fix all confirmed in-scope findings regardless of severity when safe and bounded.
Repeat test + browser smoke + analysis + fix until no in-scope findings remain open or a stop condition applies.
```
Expected behavior:
1. Inspect active Spec Kit context, constitution, spec, plan, tasks, relevant code, and relevant tests.
2. Evaluate the Spec Readiness Gate and Implementation Scope Gate.
3. Implement only the active spec scope.
4. Run targeted tests and relevant checks.
5. Evaluate the Test Gate.
6. Run and evaluate Browser Smoke Test when UI/user-facing flows are affected.
7. Run post-implementation analysis.
8. Fix all confirmed in-scope findings regardless of severity when safe and bounded.
9. Repeat the test + browser smoke + analysis + fix loop until no confirmed in-scope findings remain or a stop condition applies.
10. Evaluate the Merge Readiness Gate.
11. Report final status, changed files, tests, browser smoke result, residual risks, gates, and manual review prompt.
```

View File

@ -0,0 +1,612 @@
---
name: spec-kit-next-best-prep
description: Select the next suitable TenantPilot/TenantAtlas spec candidate from roadmap/spec-candidates, run the repository's Spec Kit preparation flow, create or update spec.md/plan.md/tasks.md, run preparation analysis, fix preparation-artifact issues only, and stop before application implementation.
---
# Skill: Spec Kit Next-Best Preparation
## Purpose
Use this skill to prepare the next implementation-ready Spec Kit package for TenantPilot/TenantAtlas without implementing application code.
This skill supports preparation only:
1. Select or scope the next suitable feature from roadmap/spec-candidates.
2. Run the repository's real Spec Kit preparation workflow where available.
3. Create or update `spec.md`, `plan.md`, and `tasks.md`.
4. Run preparation `analyze` when supported.
5. Fix preparation-artifact issues only.
6. Evaluate preparation quality gates.
7. Stop before application implementation.
The intended workflow is:
```text
roadmap / spec-candidates / feature idea
→ inspect repo truth, constitution, roadmap, spec candidates, existing specs, and relevant code
→ select the next suitable candidate or scope the provided idea
→ run Spec Kit specify/plan/tasks/analyze where available
→ create or update spec.md + plan.md + tasks.md
→ fix preparation-artifact issues only
→ evaluate Candidate Selection Gate and Spec Readiness Gate
→ final preparation report
→ explicit implementation step later
```
## When to Use
Use this skill when the user asks to:
- select the next best spec candidate from `docs/product/spec-candidates.md` and roadmap sources
- turn a feature idea, roadmap item, or candidate into `spec.md`, `plan.md`, and `tasks.md`
- prepare Spec Kit artifacts in one pass
- run specify/plan/tasks/analyze without implementation
- fix preparation analysis issues in Spec Kit artifacts only
- prepare a feature package for a later implementation skill
Typical user prompts:
```text
Take the next sensible spec candidate from roadmap/spec-candidates and create spec, plan, and tasks.
```
```text
Turn this into spec, plan, and tasks in one pass, but do not implement yet.
```
```text
From roadmap.md and spec-candidates.md, choose the next most sensible spec and run specify, plan, tasks, and analyze.
```
```text
Fix all analyze issues in the Spec Kit artifacts. No application implementation.
```
## Hard Rules
- Work strictly repo-based.
- This is a preparation-only skill.
- Do not implement application code.
- Do not modify production code.
- Do not modify migrations, models, services, jobs, Filament resources, Livewire components, policies, commands, routes, views, tests, or runtime behavior.
- Use the repository's actual Spec Kit workflow, scripts, templates, branch naming rules, and generated paths when available.
- Do not manually invent spec numbers, branch names, or spec paths if Spec Kit provides a script or command for that.
- Do not bypass Spec Kit branch mechanics.
- Create or update only Spec Kit preparation artifacts unless repository conventions require additional documentation artifacts.
- Do not expand scope beyond the selected feature, `spec.md`, `plan.md`, and `tasks.md`.
- Do not silently add roadmap features, adjacent UX rewrites, speculative architecture, or unrelated refactors.
- Follow the repository constitution and existing Spec Kit conventions.
- Preserve TenantPilot/TenantAtlas terminology.
- Prefer small, reviewable, implementation-ready specs over broad rewrites.
- Treat repository truth as authoritative over assumptions.
- If repository truth conflicts with the user-provided draft or candidate wording, keep repository truth and document the deviation.
- Fix only confirmed preparation-artifact findings from Spec Kit preparation analysis.
- Do not leave preparation findings open silently. If they are not fixed, document exactly why.
- Do not run destructive commands.
- Do not force checkout, reset, stash, rebase, merge, or delete branches.
- Do not overwrite existing specs.
- Do not rewrite completed specs back into preparation state.
- Do not remove or normalize implementation history, close-out notes, validation results, completed task markers, smoke results, or post-implementation review language from completed specs.
- Treat completed-spec close-out and validation language as intentional repository history, not preparation drift.
- Do not move from preparation to an implementation step inside this skill.
## Required Inputs
The user should provide at least one of:
- feature title and short goal
- full spec candidate
- roadmap item
- rough problem statement
- UX or architecture improvement idea
- instruction to choose the next best candidate from roadmap/spec-candidates
If the input is incomplete, proceed with the smallest reasonable interpretation and document assumptions.
If no suitable candidate can be selected safely, stop and report why.
## Required Repository Checks
Always check:
1. `.specify/memory/constitution.md`
2. `.specify/templates/`
3. `.specify/scripts/`
4. existing Spec Kit command usage or repository instructions, if present
5. current branch and git status
6. `specs/`
7. `docs/product/spec-candidates.md`
8. relevant roadmap documents under `docs/product/`, especially `roadmap.md` if present
9. nearby existing specs with related terminology or scope
10. application code only as needed to avoid wrong naming, wrong architecture, duplicate concepts, impossible tasks, duplicated specs, or already-completed candidates
Do not edit application code.
## Completed-Spec Guardrail
Before selecting an existing spec package as a `next-best-prep` target, explicitly check whether the spec is already completed, implementation-closed, or validated.
A spec must be treated as completed if any of the following signals are present in `spec.md`, `plan.md`, `tasks.md`, `quickstart.md`, checklist artifacts, or related Spec Kit package files:
- `Implementation Close-Out`
- `Implementation completed on`
- `Implementation Validation Results`
- `Implemented and validated`
- `Review Outcome` or `Implementation Review Outcome`
- passed validation, smoke, browser, or guardrail results
- completed task checklist markers for the implementation tasks
- post-implementation review or close-out language
- a status marker indicating implemented, completed, closed, or validated
If a spec is completed:
- exclude it from `next-best-prep` candidate selection
- do not patch, normalize, rewrite, or convert it back to preparation-only state
- do not remove close-out sections, validation results, completed task markers, smoke results, or post-implementation review language
- treat those artifacts as historical implementation evidence
- only use the completed spec as context for dependency or roadmap reasoning
If all high-priority candidates are already specced, active, or completed, stop and report `no safe next prep target` instead of modifying existing completed specs.
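The literal markers in the list above lend themselves to a read-only text scan. The sketch below is a rough first pass, not a complete check: function names are hypothetical, it only looks at `*.md` files, and the softer signals (passed results, completed checkboxes, status markers) still need to be read by hand.

```python
from pathlib import Path

# Literal markers from the guardrail list; "Review Outcome" also matches
# "Implementation Review Outcome".
COMPLETION_SIGNALS = (
    "Implementation Close-Out",
    "Implementation completed on",
    "Implementation Validation Results",
    "Implemented and validated",
    "Review Outcome",
)


def completion_signals_in(text: str) -> list[str]:
    """Return the literal completion markers found in one file's text."""
    return [signal for signal in COMPLETION_SIGNALS if signal in text]


def spec_completion_signals(spec_dir: Path) -> dict[str, list[str]]:
    """Map each Markdown file in a spec package to the markers it contains (read-only)."""
    hits: dict[str, list[str]] = {}
    for path in sorted(spec_dir.rglob("*.md")):
        found = completion_signals_in(path.read_text(encoding="utf-8"))
        if found:
            hits[str(path.relative_to(spec_dir))] = found
    return hits
```

Any hit is a reason to exclude the package from selection and to leave its files untouched.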
## Git and Branch Safety
Before running any Spec Kit command:
1. Check the current branch.
2. Check whether the working tree is clean.
3. If there are unrelated uncommitted changes, stop and report them. Do not continue.
4. If the working tree only contains user-intended planning edits for this operation, continue cautiously.
5. Let Spec Kit create or switch to the correct feature branch when that is how the repository workflow works.
6. Do not force checkout, reset, stash, rebase, merge, or delete branches.
7. Do not overwrite existing specs.
If the repo requires an explicit branch creation script for `specify`, use that script rather than manually creating the branch.
## Quality Gates
### Gate 1: Candidate Selection Gate
Required before creating a new spec from roadmap/spec-candidates.
Pass criteria:
- The selected candidate exists in roadmap/spec-candidate material or is directly provided by the user.
- The selected candidate is not already covered by an existing active or completed spec.
- The selected target is not a completed spec package with implementation close-out, validation results, completed tasks, smoke results, or post-implementation review history.
- The selected candidate aligns with current roadmap priorities or explicitly documented product direction.
- The candidate can be scoped as a small, reviewable, implementation-ready slice.
- Major adjacent concerns are listed as follow-up candidates instead of being hidden inside the primary scope.
Fail behavior:
- If no candidate satisfies the gate, stop and report the top candidates plus the reason none is ready.
- If the only plausible targets are completed specs, stop and report `no safe next prep target`; do not modify those completed specs.
- Do not invent a new roadmap direction to force progress.
### Gate 2: Spec Readiness Gate
Required before reporting that the package is ready for implementation.
Pass criteria:
- `spec.md`, `plan.md`, and `tasks.md` exist.
- The spec has a clear problem statement, user value, functional requirements, out-of-scope boundaries, acceptance criteria, assumptions, and risks.
- The plan identifies likely affected repo surfaces and does not contradict repository architecture.
- The tasks are small, ordered, verifiable, and include test/validation tasks.
- RBAC, workspace/tenant isolation, auditability, OperationRun semantics, evidence/result-truth, and UX requirements are addressed where relevant.
- No open question blocks safe implementation.
- The scope is small enough for a bounded implementation loop in a later implementation skill.
- Required checklist artifacts exist when the constitution requires them.
Fail behavior:
- Fix preparation-artifact issues when they are safe and bounded.
- If readiness cannot be achieved without implementation or unresolved product decisions, stop and report the gap.
- Do not compensate for an unclear spec by inventing implementation scope.
## Candidate Selection Rules
When the user asks for the next best spec from roadmap/spec-candidates:
- Read `docs/product/spec-candidates.md`.
- Read relevant roadmap documents under `docs/product/`, especially `roadmap.md` if present.
- Check existing specs to avoid duplicates.
- Check existing specs for completed-spec signals before selecting an existing package as a refresh target.
- Exclude completed specs from next-best-prep selection, even if their artifacts contain close-out, validation, or completed-task language that would look like drift in a preparation-only package.
- Prefer candidates that align with current roadmap priorities, platform foundations, enterprise UX, RBAC/isolation, auditability, observability, and governance workflow maturity.
- Prefer candidates that unlock roadmap progress, reduce architectural drift, harden foundations, or remove known blockers.
- Prefer small, implementation-ready slices over broad platform rewrites.
- If multiple candidates are plausible, choose one primary candidate and document why it was selected.
- Add non-selected relevant candidates as follow-up spec candidates, not hidden scope.
- Do not invent a candidate if existing roadmap/spec-candidate material provides a suitable one.
- Do not pick a spec only because it is listed first.
- Evaluate the Candidate Selection Gate before creating the spec directory.
Evaluate candidates using these criteria:
1. **Roadmap Fit**: Does it support the current roadmap sequence or unlock the next roadmap layer?
2. **Foundation Value**: Does it strengthen reusable platform foundations such as RBAC, isolation, auditability, evidence, OperationRun observability, provider boundaries, vocabulary, baseline/control/finding semantics, or enterprise UX patterns?
3. **Dependency Unblocking**: Does it make future specs smaller, safer, or more consistent?
4. **Scope Size**: Can it be implemented as a narrow, testable slice?
5. **Repo Readiness**: Does the repo already have enough structure to implement the next slice safely?
6. **Risk Reduction**: Does it reduce current architectural or product risk?
7. **User/Product Value**: Does it produce visible operator value or make the platform more sellable without heavy scope?
8. **Completion Safety**: Is the target genuinely unprepared or incomplete, rather than an already completed spec whose historical close-out artifacts should be preserved?
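The criteria above are qualitative, and the skill does not prescribe a numeric scheme. Purely as an illustration of how Completion Safety acts as a hard exclusion while the other seven criteria are weighed, the following sketch uses an assumed 0 to 2 score per criterion with equal weights; the type, field names, and scoring scale are all hypothetical.

```python
from dataclasses import dataclass

# Criteria 1-7; criterion 8 (Completion Safety) is handled as a hard filter.
CRITERIA = (
    "roadmap_fit",
    "foundation_value",
    "dependency_unblocking",
    "scope_size",
    "repo_readiness",
    "risk_reduction",
    "user_product_value",
)


@dataclass
class Candidate:
    title: str
    scores: dict[str, int]  # assumed scale: 0 (weak) to 2 (strong) per criterion
    already_completed: bool = False


def rank(candidates: list[Candidate]) -> list[tuple[str, int]]:
    """Drop completed specs, then order the rest by total score, highest first."""
    eligible = [c for c in candidates if not c.already_completed]
    totals = [(c.title, sum(c.scores.get(k, 0) for k in CRITERIA)) for c in eligible]
    return sorted(totals, key=lambda item: item[1], reverse=True)
```

A ranking like this is only an aid: the selection still has to be justified in prose, including why close alternatives were deferred.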
## Required Selection Output Before Spec Kit Execution
Before running the Spec Kit flow, identify:
- selected candidate title
- source location in roadmap/spec-candidates
- why it was selected
- why close alternatives were deferred
- roadmap relationship
- completed-spec check result for related existing specs
- smallest viable implementation slice
- proposed concise feature description to feed into `specify`
The feature description must be product- and behavior-oriented. It should not be a low-level implementation plan.
## Spec Kit Preparation Flow
### Step 1: Determine the repository's Spec Kit command pattern
Inspect repository instructions and scripts to identify how this repo expects Spec Kit to be run.
Common locations to inspect:
```text
.specify/scripts/
.specify/templates/
.specify/memory/constitution.md
.github/prompts/
.github/skills/
README.md
specs/
```
Use the repo-specific mechanism if present.
### Step 2: Run `specify`
Run the repository's `specify` flow using the selected candidate and the smallest viable slice.
The `specify` input should include:
- selected candidate title
- problem statement
- operator/user value
- roadmap relationship
- out-of-scope boundaries
- key acceptance criteria
- important enterprise constraints
Let Spec Kit create the correct branch and spec location if that is the repo's configured behavior.
### Step 3: Run `plan`
Run the repository's `plan` flow for the generated spec.
The `plan` input should keep the scope tight and should require repo-based alignment with:
- constitution
- existing architecture
- workspace/tenant isolation
- RBAC
- OperationRun/observability where relevant
- evidence/snapshot/truth semantics where relevant
- Filament/Livewire conventions where relevant
- test strategy
### Step 4: Run `tasks`
Run the repository's `tasks` flow for the generated plan.
The generated tasks must be:
- ordered
- small
- testable
- grouped by phase
- limited to the selected scope
- suitable for later implementation or manual analysis before implementation
### Step 5: Run preparation `analyze`
Run the repository's `analyze` flow against the generated Spec Kit artifacts when the repository supports it.
Analyze must check:
- consistency between `spec.md`, `plan.md`, and `tasks.md`
- constitution alignment
- roadmap alignment
- whether the selected candidate was narrowed safely
- whether tasks are complete enough for implementation
- whether tasks accidentally require scope not described in the spec
- whether plan details conflict with repository architecture or terminology
- whether implementation risks are documented instead of silently ignored
Do not use analyze as a trigger to implement application code.
### Step 6: Fix preparation-artifact issues only
If preparation analyze finds issues, first confirm that the selected package is not completed. Then fix only Spec Kit preparation artifacts such as:
- `spec.md`
- `plan.md`
- `tasks.md`
- `checklists/requirements.md` or other generated Spec Kit metadata files, if the repository uses them
Allowed fixes include:
- clarify requirements
- tighten scope
- move out-of-scope work into follow-up candidates
- correct terminology
- add missing tasks
- remove tasks not backed by the spec
- align plan language with repository architecture
- add missing acceptance criteria or validation tasks
- add missing checklist artifacts required by the constitution
Forbidden fixes include:
- modifying application code
- creating migrations
- editing models, services, jobs, policies, Filament resources, Livewire components, tests, commands, routes, or views
- running implementation or test-fix loops
- changing runtime behavior
- removing implementation close-out history from completed specs
- converting completed specs back to preparation-only wording
- changing passed validation or smoke results into planned validation commands
- unchecking completed implementation tasks in a completed spec
### Step 7: Evaluate the Spec Readiness Gate
After preparation analyze has passed or preparation-artifact issues have been fixed, evaluate the Spec Readiness Gate.
Stop after this gate and do not implement.
## Spec Directory Rules
When creating a new spec directory, use the repository's Spec Kit-generated directory or path.
If the repository does not provide a command for spec setup, use the next valid spec number and a kebab-case slug:
```text
specs/<number>-<slug>/
```
The exact number must be derived from the current repository state and existing numbering conventions.
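As a rough sketch of the fallback, assuming spec directories are named `<number>-<slug>` and that "next valid number" means the highest existing numeric prefix plus one. Both helper names are hypothetical, and the repository's own numbering convention takes precedence if it differs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the next spec number for a specs directory.
# Assumption: the next number is the highest existing numeric prefix plus one.
next_spec_number() {
  local specs_dir
  local max
  local dir
  local name
  local num
  specs_dir="${1:-specs}"
  max=0
  for dir in "${specs_dir}"/*/; do
    [ -d "${dir}" ] || continue
    name="$(basename "${dir}")"
    num="${name%%-*}"
    case "${num}" in
      ''|*[!0-9]*) continue ;;  # skip directories without a numeric prefix
    esac
    # Strip leading zeros so a prefix such as "008" is compared as 8.
    num="$(printf '%s' "${num}" | sed 's/^0*//')"
    [ -n "${num}" ] || num=0
    if [ "${num}" -gt "${max}" ]; then
      max="${num}"
    fi
  done
  printf '%d\n' "$((max + 1))"
}

# Hypothetical helper: turn a title into a kebab-case slug.
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -e 's/[^a-z0-9]\{1,\}/-/g' -e 's/^-//' -e 's/-$//'
}
```

A caller could combine the two as `specs/$(next_spec_number specs)-$(slugify "Some Title")/`, but only when the repository provides no setup command of its own.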
Create or update preparation artifacts inside the selected spec directory:
```text
specs/<number>-<slug>/spec.md
specs/<number>-<slug>/plan.md
specs/<number>-<slug>/tasks.md
```
If the repository templates require additional preparation files, create them only when this is consistent with existing Spec Kit conventions.
## `spec.md` Requirements
The spec must be product- and behavior-oriented. It should avoid premature implementation detail unless needed for correctness.
Include:
- Feature title
- Problem statement
- Business/product value
- Primary users/operators
- User stories
- Functional requirements
- Non-functional requirements
- UX requirements
- RBAC/security requirements
- Auditability/observability requirements
- Data/truth-source requirements where relevant
- Out of scope
- Acceptance criteria
- Success criteria
- Risks
- Assumptions
- Open questions
TenantPilot/TenantAtlas specs should preserve enterprise SaaS principles:
- workspace/tenant isolation
- capability-first RBAC
- auditability
- operation/result truth separation
- source-of-truth clarity
- calm enterprise operator UX
- progressive disclosure where useful
- no false positive calmness
## `plan.md` Requirements
The plan must be repo-aware and implementation-oriented, but it must not make code changes by itself.
Include:
- Technical approach
- Existing repository surfaces likely affected
- Domain/model implications
- UI/Filament implications
- Livewire implications where relevant
- OperationRun/monitoring implications where relevant
- RBAC/policy implications
- Audit/logging/evidence implications where relevant
- Data/migration implications where relevant
- Test strategy
- Rollout considerations
- Risk controls
- Implementation phases
The plan should clearly distinguish where relevant:
- execution truth
- artifact truth
- backup/snapshot truth
- recovery/evidence truth
- operator next action
## `tasks.md` Requirements
Tasks must be ordered, small, and verifiable.
Include:
- checkbox tasks
- phase grouping
- tests before or alongside implementation tasks where practical
- final validation tasks
- documentation/update tasks if needed
- explicit non-goals where useful
Avoid vague tasks such as:
```text
Clean up code
Refactor UI
Improve performance
Make it enterprise-ready
```
Prefer concrete tasks such as:
```text
- [ ] Add a feature test covering workspace isolation for <specific behavior>.
- [ ] Update <specific Filament page/resource> to display <specific state>.
- [ ] Add policy coverage for <specific capability>.
```
If exact file names are not known yet, phrase tasks as repo-verification tasks first rather than inventing file paths.
## Preparation Scope Control
If the requested feature implies multiple independent concerns, create one primary spec for the smallest valuable slice and add a `Follow-up spec candidates` section.
Examples of follow-up candidates:
- assigned findings
- pending approvals
- personal work queue
- notification delivery settings
- evidence pack export hardening
- operation monitoring refinements
- autonomous governance decision surfaces
Do not force all follow-up candidates into the primary spec.
## Failure Handling
If a Spec Kit command or preparation analyze phase fails:
1. Stop at the relevant gate.
2. Report the failing command or phase.
3. Summarize the error.
4. Do not attempt implementation as a workaround.
5. Suggest the smallest safe next action.
If the branch or working tree state is unsafe:
1. Stop before running Spec Kit commands.
2. Report the current branch and relevant uncommitted files.
3. Ask the user to commit, stash, or move to a clean worktree.
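The working-tree check above could look roughly like this. It is a sketch that assumes "unsafe" simply means `git status --porcelain` reports any entry; the function name `check_working_tree` is hypothetical, and a real run may need stricter or looser rules.

```shell
#!/usr/bin/env bash
# Hypothetical helper: return 0 when the tree is clean, 1 when it has
# uncommitted or untracked entries, 2 when git status cannot be read.
check_working_tree() {
  local status
  local branch
  status="$(git status --porcelain 2>/dev/null)" || return 2
  if [ -n "${status}" ]; then
    # symbolic-ref fails on a detached HEAD, so fall back to a plain label.
    branch="$(git symbolic-ref --short -q HEAD || printf 'detached HEAD')"
    printf 'branch: %s\n' "${branch}"
    printf 'uncommitted files:\n%s\n' "${status}"
    return 1
  fi
  return 0
}
```

When the function returns 1, its output already contains the branch and file list that step 2 asks to report.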
If a completed spec is accidentally selected or modified:
1. Stop immediately.
2. Report that the selected spec is completed and therefore not a valid preparation target.
3. Revert only the changes made by this operation to that completed spec package, if they are isolated and safe to revert.
4. Run `git status --short` and report remaining changes.
5. Re-run candidate selection excluding completed specs.
6. If no safe unprepared candidate exists, report `no safe next prep target`.
## Final Response Requirements
Respond with:
1. Selected candidate and why it was chosen
2. Why close alternatives were deferred
3. Completed-spec guardrail result for related existing specs
4. Current branch after Spec Kit execution, if changed
5. Generated spec path
6. Files created or updated by Spec Kit
7. Preparation analyze result summary
8. Preparation-artifact fixes applied after analyze
9. Assumptions made
10. Open questions, if any
11. Candidate Selection Gate result
12. Spec Readiness Gate result
13. Recommended next implementation prompt
14. Explicit statement that no application implementation was performed
Keep the final response concise, but include enough detail for the user to continue immediately.
## Manual Review and Next-Step Prompts
Provide a ready-to-copy manual artifact review prompt like this, adapted to the generated spec branch/path:
```markdown
You are a Senior Staff Software Architect and Enterprise SaaS Reviewer.
Analyze the newly created spec `<spec-branch-or-spec-path>` strictly against the repository.
Goal:
Check whether `spec.md`, `plan.md`, and `tasks.md` are complete, consistent, implementable, and constitution-compliant.
Important:
- No implementation.
- No code changes.
- No scope expansion.
- Check only against repo truth.
- Name concrete conflicts with files, patterns, data flows, or existing specs.
- Propose only minimal corrections to `spec.md`, `plan.md`, and `tasks.md`.
- If everything fits, give a clear implementation approval.
```
Also provide a ready-to-copy implementation prompt for the separate implementation skill after analyze has passed or preparation-artifact issues have been fixed:
```markdown
/spec-kit-implementation-loop
Implement the prepared spec `<spec-branch-or-spec-path>` strictly according to `tasks.md`.
Then run the tests, run a browser smoke test if UI or user-facing behavior is affected, perform a post-implementation analysis, and fix all confirmed in-scope findings regardless of severity when the fix is safe and bounded.
Repeat test + browser smoke + analysis + fix until no in-scope findings remain open or a stop condition applies.
```
## Example Invocation
User:
```text
Use the skill spec-kit-next-best-prep.
From roadmap.md and spec-candidates.md, choose the next most sensible spec.
Then run GitHub Spec Kit specify, plan, tasks, and analyze in one pass.
Fix all analyze issues in the Spec Kit artifacts.
No application implementation.
```
Expected behavior:
1. Inspect constitution, Spec Kit scripts/templates, specs, roadmap, and spec candidates.
2. Check branch and working tree safety.
3. Compare candidate suitability.
4. Select the next best candidate.
5. Exclude already completed specs from preparation or refresh targets, preserving their close-out and validation history.
6. Evaluate the Candidate Selection Gate.
7. Run the repository's real Spec Kit `specify` flow, letting it handle branch/spec setup.
8. Run the repository's real Spec Kit `plan` flow.
9. Run the repository's real Spec Kit `tasks` flow.
10. Run the repository's real Spec Kit preparation `analyze` flow.
11. Fix analyze issues only in Spec Kit preparation artifacts.
12. Evaluate the Spec Readiness Gate.
13. Stop before application implementation.
14. Return selection rationale, branch/path summary, artifact summary, analyze summary, fixes applied, gates, and next implementation prompt.
```


@ -0,0 +1,129 @@
---
name: tailwindcss-development
description: "Styles applications using Tailwind CSS v4 utilities. Activates when adding styles, restyling components, working with gradients, spacing, layout, flex, grid, responsive design, dark mode, colors, typography, or borders; or when the user mentions CSS, styling, classes, Tailwind, restyle, hero section, cards, buttons, or any visual/UI changes."
license: MIT
metadata:
author: laravel
---
# Tailwind CSS Development
## When to Apply
Activate this skill when:
- Adding styles to components or pages
- Working with responsive design
- Implementing dark mode
- Extracting repeated patterns into components
- Debugging spacing or layout issues
## Documentation
Use `search-docs` for detailed Tailwind CSS v4 patterns and documentation.
## Basic Usage
- Use Tailwind CSS classes to style HTML. Check and follow existing Tailwind conventions in the project before introducing new patterns.
- Offer to extract repeated patterns into components that match the project's conventions (e.g., Blade, JSX, Vue).
- Consider class placement, order, priority, and defaults. Remove redundant classes, add classes to parent or child elements carefully to reduce repetition, and group elements logically.
## Tailwind CSS v4 Specifics
- Always use Tailwind CSS v4 and avoid deprecated utilities.
- `corePlugins` is not supported in Tailwind v4.
### CSS-First Configuration
In Tailwind v4, configuration is CSS-first using the `@theme` directive — no separate `tailwind.config.js` file is needed:
<!-- CSS-First Config -->
```css
@theme {
    --color-brand: oklch(0.72 0.11 178);
}
```
### Import Syntax
In Tailwind v4, import Tailwind with a regular CSS `@import` statement instead of the `@tailwind` directives used in v3:
<!-- v4 Import Syntax -->
```diff
- @tailwind base;
- @tailwind components;
- @tailwind utilities;
+ @import "tailwindcss";
```
### Replaced Utilities
Tailwind v4 removed deprecated utilities. Use the replacements shown below. Opacity values remain numeric.
| Deprecated | Replacement |
|------------|-------------|
| bg-opacity-* | bg-black/* |
| text-opacity-* | text-black/* |
| border-opacity-* | border-black/* |
| divide-opacity-* | divide-black/* |
| ring-opacity-* | ring-black/* |
| placeholder-opacity-* | placeholder-black/* |
| flex-shrink-* | shrink-* |
| flex-grow-* | grow-* |
| overflow-ellipsis | text-ellipsis |
| decoration-slice | box-decoration-slice |
| decoration-clone | box-decoration-clone |
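To locate leftovers of these utilities in a project, a search along the following lines may help. It is a sketch only: the function name is hypothetical, the pattern covers just the utilities listed in the table, and the leading `(^|[^a-z-])` guard is an assumption meant to avoid flagging the replacements themselves (for example `box-decoration-slice`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: list lines that still use utilities from the table.
# Arguments are files or directories to search.
find_deprecated_utilities() {
  local pattern
  pattern='(^|[^a-z-])((bg|text|border|divide|ring|placeholder)-opacity-[0-9]+|flex-(shrink|grow)|overflow-ellipsis|decoration-(slice|clone))'
  grep -rnE -- "${pattern}" "$@"
}
```

For example, `find_deprecated_utilities resources/views` prints matching lines with file names and line numbers; the path is only an illustration. The exit status is non-zero when nothing is found.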
## Spacing
Use `gap` utilities instead of margins for spacing between siblings:
<!-- Gap Utilities -->
```html
<div class="flex gap-8">
    <div>Item 1</div>
    <div>Item 2</div>
</div>
```
## Dark Mode
If existing pages and components support dark mode, new pages and components must support it the same way, typically using the `dark:` variant:
<!-- Dark Mode -->
```html
<div class="bg-white dark:bg-gray-900 text-gray-900 dark:text-white">
    Content adapts to color scheme
</div>
```
## Common Patterns
### Flexbox Layout
<!-- Flexbox Layout -->
```html
<div class="flex items-center justify-between gap-4">
    <div>Left content</div>
    <div>Right content</div>
</div>
```
### Grid Layout
<!-- Grid Layout -->
```html
<div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
    <div>Card 1</div>
    <div>Card 2</div>
    <div>Card 3</div>
</div>
```
## Common Pitfalls
- Using deprecated v3 utilities (bg-opacity-*, flex-shrink-*, etc.)
- Using `@tailwind` directives instead of `@import "tailwindcss"`
- Trying to use `tailwind.config.js` instead of CSS `@theme` directive
- Using margins for spacing between siblings instead of gap utilities
- Forgetting to add dark mode variants when the project uses dark mode


@ -38,6 +38,19 @@
->and(file_exists(repo_path('scripts/platform-test-artifacts')))->toBeTrue();
});
it('passes artifact staging inputs through php argv for sail execution', function (): void {
    $artifactRunner = (string) file_get_contents(repo_path('scripts/platform-test-artifacts'));
    expect($artifactRunner)
        ->toContain('./vendor/bin/sail php -- "${LANE}" "${STAGING_DIRECTORY}" "${ARTIFACT_DIRECTORY}"')
        ->and($artifactRunner)->toContain('$laneId = (string) ($argv[1] ?? \'\');')
        ->and($artifactRunner)->toContain('$stagingDirectory = (string) ($argv[2] ?? \'\');')
        ->and($artifactRunner)->toContain('$artifactDirectory = (string) ($argv[3] ?? \'\');')
        ->and($artifactRunner)->not->toContain("getenv('LANE_ID')")
        ->and($artifactRunner)->not->toContain("getenv('STAGING_DIRECTORY')")
        ->and($artifactRunner)->not->toContain("getenv('ARTIFACT_DIRECTORY')");
});
it('keeps heavy-governance baseline capture support inside the checked-in wrappers', function (): void {
$laneRunner = (string) file_get_contents(repo_path('scripts/platform-test-lane'));
$reportRunner = (string) file_get_contents(repo_path('scripts/platform-test-report'));


@ -48,20 +48,17 @@ fi
 cd "${APP_DIR}"
-LANE_ID="${LANE}" \
-STAGING_DIRECTORY="${STAGING_DIRECTORY}" \
-ARTIFACT_DIRECTORY="${ARTIFACT_DIRECTORY}" \
-./vendor/bin/sail php <<'PHP'
+./vendor/bin/sail php -- "${LANE}" "${STAGING_DIRECTORY}" "${ARTIFACT_DIRECTORY}" <<'PHP'
 <?php
 declare(strict_types=1);
 require 'vendor/autoload.php';
-$laneId = (string) getenv('LANE_ID');
-$stagingDirectory = (string) getenv('STAGING_DIRECTORY');
-$artifactDirectory = getenv('ARTIFACT_DIRECTORY');
-$artifactDirectory = is_string($artifactDirectory) && trim($artifactDirectory) !== ''
+$laneId = (string) ($argv[1] ?? '');
+$stagingDirectory = (string) ($argv[2] ?? '');
+$artifactDirectory = (string) ($argv[3] ?? '');
+$artifactDirectory = trim($artifactDirectory) !== ''
     ? $artifactDirectory
     : null;
@ -70,4 +67,4 @@ $result = \Tests\Support\TestLaneReport::stageArtifacts($laneId, $stagingDirecto
 echo json_encode($result, JSON_PRETTY_PRINT | JSON_THROW_ON_ERROR).PHP_EOL;
 exit(($result['complete'] ?? false) === true ? 0 : 1);
-PHP
+PHP
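
The hunk above replaces environment variables with positional arguments for a script that is fed on stdin. As an analogy only, the same shape can be tried without Sail or PHP: in this sketch `sh -s --` stands in for `sail php --`, and the function name and sample values are hypothetical.

```shell
#!/usr/bin/env bash
# Analogy only: `sh -s -- ARGS` reads its script from stdin and exposes ARGS
# as $1..$3, much like the embedded PHP script reads $argv[1]..$argv[3].
stage_args_demo() {
  sh -s -- "$1" "$2" "$3" <<'SCRIPT'
lane="${1:-}"
staging="${2:-}"
artifacts="${3:-}"
# Mirror the PHP fallback: a blank artifact directory is reported as null.
if [ -z "$(printf '%s' "${artifacts}" | tr -d '[:space:]')" ]; then
  artifacts="null"
fi
printf 'lane=%s staging=%s artifacts=%s\n' "${lane}" "${staging}" "${artifacts}"
SCRIPT
}
```

The quoted heredoc delimiter keeps the outer shell from expanding anything inside the script body, so the values reach the inner interpreter only through the argument list.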


@ -0,0 +1,45 @@
# Specification Quality Checklist: Full Suite Failure Classification & CI Lane Baseline
**Purpose**: Validate specification completeness and quality before implementation
**Created**: 2026-05-11
**Feature**: `specs/295-full-suite-ci-baseline/spec.md`
## Content Quality
- [x] No application implementation details leak into product requirements beyond required repo-truth paths and validation commands.
- [x] Focused on user value and business needs: restored or classified CI signal after Specs `293` and `294`.
- [x] Written for maintainers and reviewers who must interpret CI output.
- [x] All mandatory Spec Kit sections are completed or explicitly marked N/A.
## Requirement Completeness
- [x] No unresolved clarification markers remain.
- [x] Requirements are testable and unambiguous.
- [x] Success criteria are measurable.
- [x] Success criteria are technology-aware only where repo validation commands require it.
- [x] All acceptance scenarios are defined.
- [x] Edge cases are identified.
- [x] Scope is clearly bounded.
- [x] Dependencies and assumptions are identified.
## Feature Readiness
- [x] All functional requirements have clear acceptance criteria.
- [x] User scenarios cover primary classification, lane signal, follow-up split, and final readiness decision flows.
- [x] Feature meets measurable outcomes defined in Success Criteria.
- [x] No application implementation has been performed during preparation.
## Spec 295 Guardrails
- [x] Pinned categories stay aligned: `ci-signal-restored`, `ci-wrapper-or-manifest-regression`, `artifact-publication-regression`, `budget-or-trend-baseline-drift`, `product-runtime-or-test-regression`, `browser-lane-regression`, `flaky-or-environment`, `follow-up-spec-required`, `resolved-or-not-needed`.
- [x] Pinned seams stay aligned: `raw-full-suite`, `fast-feedback-lane`, `confidence-lane`, `heavy-governance-lane`, `browser-lane`, `profiling-or-junit-support`, `lane-reporting`, `artifact-publication`, `budget-trend-baseline`, `legacy-cutover-regression-guard`, `provider-verification-regression-guard`.
- [x] Completed Specs `293` and `294` are context only and are not rewritten.
- [x] Legacy `/admin/t/...` and TenantPanelProvider restoration is explicitly forbidden.
- [x] In-scope fixes are limited to CI wrapper, manifest, report, artifact, and budget/trend contract defects.
- [x] Product/runtime failures are explicitly split to follow-up ownership.
- [x] No new permanent lane, CI framework, runtime persistence, provider abstraction, Filament resource, or browser family is introduced.
- [x] Filament v5 / Livewire v4 compliance is preserved; no panel provider registration change is planned.
## Notes
- Preparation analyze found no blocking readiness gap after aligning category and seam names across all artifacts.


@ -0,0 +1,67 @@
# Data Model: Full Suite Failure Classification & CI Lane Baseline
Spec `295` introduces no application entity, table, model, enum, migration, or persisted runtime artifact. Its only modeled data is spec-local workflow truth for implementation: a bounded failure-classification artifact and one final CI readiness decision.
## Spec-Local Artifact: `failure-classification.md`
`failure-classification.md` records the classification state for implementation. It is not product truth and must not be read by the application.
### Failure Group
| Field | Meaning |
|---|---|
| `Group` | One failing test file, assertion cluster, wrapper failure, report failure, artifact failure, budget breach, or environment failure sharing the same cause and owner |
| `Observed Command` | Exact command that produced the signal |
| `Seam` | One pinned CI/suite seam |
| `Category` | One pinned failure-classification category |
| `Observed Failure` | Concise description of the failure evidence |
| `Candidate Owner` | Existing script, support class, guard test, product area, or follow-up package likely responsible |
| `Fix In 295?` | `yes`, `no`, or `only-if-ci-contract-proven` |
| `Follow-up` | Follow-up spec/lane decision, or `none` when resolved |
| `Status` | `pending`, `classified`, `fixed`, `resolved`, `follow-up-required`, or `environment-blocked` |
## Pinned Categories
| Category | Meaning |
|---|---|
| `ci-signal-restored` | Full suite or lane split is green and usable as a CI signal |
| `ci-wrapper-or-manifest-regression` | Wrapper, composer script, workflow binding, or lane manifest no longer invokes the intended lane |
| `artifact-publication-regression` | Required report/JUnit/budget/profile/trend artifacts are not generated or staged as contracted |
| `budget-or-trend-baseline-drift` | Tests pass or mostly pass, but runtime budget/trend baseline output is stale or no longer interpretable |
| `product-runtime-or-test-regression` | A real app/test behavior failure outside the CI wrapper/report/artifact contract |
| `browser-lane-regression` | Existing browser lane or smoke failure that needs browser-specific follow-up unless it is a CI artifact issue |
| `flaky-or-environment` | Nondeterministic, local container, browser runtime, database, queue, or runner issue |
| `follow-up-spec-required` | Confirmed out-of-scope failure needing a separate spec/lane owner |
| `resolved-or-not-needed` | Initially suspected group that no longer needs work after rerun or adjacent classification |
## Pinned CI / Suite Seams
| Seam | Meaning |
|---|---|
| `raw-full-suite` | Direct `sail artisan test --compact` complete suite signal |
| `fast-feedback-lane` | Existing fast-feedback wrapper and manifest selection |
| `confidence-lane` | Existing confidence wrapper and manifest selection |
| `heavy-governance-lane` | Existing heavy-governance wrapper and manifest selection |
| `browser-lane` | Existing browser wrapper and smoke selection |
| `profiling-or-junit-support` | Support lanes used for profiling or durable machine-readable output |
| `lane-reporting` | `scripts/platform-test-report` and `TestLaneReport` output |
| `artifact-publication` | `scripts/platform-test-artifacts` and lane artifact contracts |
| `budget-trend-baseline` | `TestLaneBudget`, lane thresholds, and trend-history classification |
| `legacy-cutover-regression-guard` | Failures that appear to challenge the retired route/panel baseline from Specs `287` to `293` |
| `provider-verification-regression-guard` | Failures that appear to challenge Spec `294` provider/verification semantics |
## Final Readiness Decision
| Decision | Meaning |
|---|---|
| `restored-ci-signal` | Full suite and required lanes are green or have only non-blocking documented budget notes |
| `classified-follow-up-required` | CI is not fully green, but every remaining red group is classified with follow-up ownership |
| `blocked-by-environment` | Classification cannot complete because local/runner environment failures prevent trustworthy signal collection |
## Invariants
- Every failing group must use exactly one pinned category and one pinned seam.
- `ci-signal-restored` may be used only when the relevant command output is green.
- Product/runtime failures cannot be fixed in `295` unless the classification proves a direct CI/lane contract defect.
- Retired `/admin/t/...` or TenantPanelProvider behavior must never be restored as a remedy.
- The same category and seam names must appear in `spec.md`, `plan.md`, `tasks.md`, `quickstart.md`, `checklists/requirements.md`, and `failure-classification.md`.
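
The first invariant can be checked mechanically. The following is a sketch, not part of the spec: the function names are hypothetical, and the two lists are copied from the pinned tables above.

```shell
#!/usr/bin/env bash
# Pinned names copied from the tables above, one per line.
pinned_categories() {
  cat <<'EOF'
ci-signal-restored
ci-wrapper-or-manifest-regression
artifact-publication-regression
budget-or-trend-baseline-drift
product-runtime-or-test-regression
browser-lane-regression
flaky-or-environment
follow-up-spec-required
resolved-or-not-needed
EOF
}

pinned_seams() {
  cat <<'EOF'
raw-full-suite
fast-feedback-lane
confidence-lane
heavy-governance-lane
browser-lane
profiling-or-junit-support
lane-reporting
artifact-publication
budget-trend-baseline
legacy-cutover-regression-guard
provider-verification-regression-guard
EOF
}

# Hypothetical check: $1 is the category, $2 is the seam of one failure group.
# grep -Fx matches whole lines literally, so partial names are rejected.
validate_group() {
  pinned_categories | grep -Fxq -- "$1" || { echo "unknown category: $1" >&2; return 1; }
  pinned_seams | grep -Fxq -- "$2" || { echo "unknown seam: $2" >&2; return 1; }
}
```

For example, `validate_group browser-lane-regression browser-lane` succeeds, while swapping the two arguments fails because `browser-lane` is a seam and not a category.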


@ -0,0 +1,122 @@
# Failure Classification: Full Suite Failure Classification & CI Lane Baseline
## Purpose
Use this artifact during implementation of Spec `295` to classify the complete platform suite signal after Specs `293` and `294`.
This artifact is spec-local workflow truth only. It is not application runtime truth.
## Implementation Scope Lock
- Date: 2026-05-11
- Branch: `295-full-suite-ci-baseline`
- Baseline commit: `eb85b76e Added Skill for Codex`
- Pre-run working tree: only the active untracked spec directory `specs/295-full-suite-ci-baseline/` is present; `git diff --stat` is empty.
- Scope confirmation: no runtime application code, Filament UI, routes, provider runtime, TenantPanelProvider behavior, `/admin/t/...` behavior, or completed Spec `293` / `294` artifacts are in scope unless a narrow CI/lane contract defect is proven by classification evidence.
- Forbidden repair confirmation: product/runtime failures, browser UI behavior failures, and provider/verification runtime failures are classification and follow-up candidates only unless the observed failure is directly caused by an existing lane wrapper, manifest, report, artifact, or budget/trend contract.
## Pinned Failure-Classification Categories
| Category | Meaning |
|---|---|
| `ci-signal-restored` | Full suite or lane split is green and usable as a CI signal |
| `ci-wrapper-or-manifest-regression` | Wrapper, composer script, workflow binding, or lane manifest no longer invokes the intended lane |
| `artifact-publication-regression` | Required report/JUnit/budget/profile/trend artifacts are not generated or staged as contracted |
| `budget-or-trend-baseline-drift` | Tests pass or mostly pass, but runtime budget/trend baseline output is stale or no longer interpretable |
| `product-runtime-or-test-regression` | A real app/test behavior failure outside the CI wrapper/report/artifact contract |
| `browser-lane-regression` | Existing browser lane or smoke failure needing browser-specific follow-up unless it is a CI artifact issue |
| `flaky-or-environment` | Nondeterministic, local container, browser runtime, database, queue, or runner issue |
| `follow-up-spec-required` | Confirmed out-of-scope failure needing a separate spec/lane owner |
| `resolved-or-not-needed` | Initially suspected group that no longer needs work after rerun or adjacent classification |
## Pinned CI / Suite Seams
| Seam | Meaning |
|---|---|
| `raw-full-suite` | Direct `sail artisan test --compact` complete suite signal |
| `fast-feedback-lane` | Existing fast-feedback wrapper and manifest selection |
| `confidence-lane` | Existing confidence wrapper and manifest selection |
| `heavy-governance-lane` | Existing heavy-governance wrapper and manifest selection |
| `browser-lane` | Existing browser wrapper and smoke selection |
| `profiling-or-junit-support` | Support lanes used for profiling or durable machine-readable output |
| `lane-reporting` | `scripts/platform-test-report` and `TestLaneReport` output |
| `artifact-publication` | `scripts/platform-test-artifacts` and lane artifact contracts |
| `budget-trend-baseline` | `TestLaneBudget`, lane thresholds, and trend-history classification |
| `legacy-cutover-regression-guard` | Failures that appear to challenge the retired route/panel baseline from Specs `287` to `293` |
| `provider-verification-regression-guard` | Failures that appear to challenge Spec `294` provider/verification semantics |
## Baseline Commands
Primary command:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)
```
Fallback lane split:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser
```
Report commands:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser
```
## Baseline Run Queue
The implementation must run or explicitly skip these targets and then add classified failure or success rows in the classification table below.
| Run Target | Expected Command | Expected Seam | Current Status |
|---|---|---|---|
| `raw-full-suite` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)` | `raw-full-suite` | red; 450 failed, 8 skipped, 4194 passed, 28831 assertions, 4686.08s; output too broad and truncated to assign complete group ownership, so the fallback lane split is required |
| `fast-feedback-lane` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback` | `fast-feedback-lane` | red; 82 failed, 1743 passed, 12151 assertions, 164.11s; report wall clock 171.551792s within 200s warning budget |
| `confidence-lane` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence` | `confidence-lane` | red; 409 failed, 8 skipped, 3853 passed, 25994 assertions, 605.10s; report wall clock 622.531394s over 450s warning budget |
| `heavy-governance-lane` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance` | `heavy-governance-lane` | red; 21 failed, 319 passed, 2443 assertions, 314.28s; report wall clock 314.828382s within 315s warning budget |
| `browser-lane` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser` | `browser-lane` | red; 20 failed, 29 passed, 417 assertions, 285.32s; report wall clock 285.719479s over 150s warning budget |
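
The "within" and "over" notes in the status column compare the report wall clock against the lane's warning budget. A minimal sketch of that comparison follows, assuming a wall clock equal to the budget still counts as within. The function name is hypothetical and is not taken from `TestLaneBudget`.

```shell
#!/usr/bin/env bash
# Hypothetical helper: $1 is the wall clock in seconds, $2 the warning budget.
# awk is used because the values are fractional.
budget_status() {
  awk -v wall="$1" -v budget="$2" \
    'BEGIN { print ((wall + 0 <= budget + 0) ? "within" : "over") }'
}

budget_status 171.551792 200   # fast-feedback
budget_status 622.531394 450   # confidence
budget_status 314.828382 315   # heavy-governance
budget_status 285.719479 150   # browser
```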
## Classification Table
The implementation must append one row per failing group or one `ci-signal-restored` row for a fully green signal.
| Group | Observed Command | Seam | Category | Observed Failure | Candidate Owner | Fix In 295? | Follow-up | Status |
|---|---|---|---|---|---|---|---|---|
| `raw-full-suite-red-baseline` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)` | `raw-full-suite` | `follow-up-spec-required` | Raw full suite completed but is not a restored CI signal: 450 failed, 8 skipped, 4194 passed, 28831 assertions, 4686.08s. Failure output includes unit RBAC/capability assertions, provider boundary/start gate failures, route generation errors for workspace-aware operation routes, Filament panel URL generation errors, and browser smoke failures; full output was too broad/truncated to classify every group from raw output alone. | suite/lane ownership classification | no | run fallback lane split and classify by lane/report artifacts before any repair | classified |
| `fast-feedback-lane-product-red` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback` | `fast-feedback-lane` | `product-runtime-or-test-regression` | Lane completed through the wrapper but is red: 82 failed, 1743 passed, 12151 assertions, 164.11s. JUnit and console output show workspace-aware operation route URL generation without `workspace`, Filament `hasTenancy()` calls with no panel context, authorization expectation drift, RBAC/UI action assertions, provider boundary/start-gate assertions, and monitoring/required-permissions surfaces. | workspace route and Filament panel-context follow-up; RBAC/authorization follow-up; provider verification follow-up | no | split product/test failures into focused follow-up specs; keep 295 limited to lane/report/artifact contracts | classified |
| `confidence-lane-product-red` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence` | `confidence-lane` | `product-runtime-or-test-regression` | Lane completed through the wrapper but is red: 409 failed, 8 skipped, 3853 passed, 25994 assertions, 605.10s. Failure groups include the same workspace-route and Filament panel-context errors, missing/renamed Filament resource routes, bulk-action test helpers returning null actions, deny-as-not-found expectation drift, and legacy admin URL assumptions. | confidence-lane product/runtime owners by resource area | no | create follow-up ownership slices before any product repair; do not absorb broad application repair into 295 | classified |
| `heavy-governance-lane-product-red` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance` | `heavy-governance-lane` | `product-runtime-or-test-regression` | Lane completed through the wrapper but is red: 21 failed, 319 passed, 2443 assertions, 314.28s. Failures are concentrated in canonical operation detail/list tests missing the `workspace` route parameter, Filament URL generation without panel context, one tenant sync summary-count assertion, and RBAC relation-manager UI enforcement. | operations canonical viewer/list follow-up; tenant sync summary follow-up; RBAC UI follow-up | no | follow-up specs should repair these product/test contracts independently of the 295 lane baseline | classified |
| `browser-lane-red` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser` | `browser-lane` | `browser-lane-regression` | Browser lane completed through the wrapper but is red: 20 failed, 29 passed, 417 assertions, 285.32s. Failures include smoke-login pages not showing `Dashboard`, workspace-aware operation route URL generation errors, Filament `hasTenancy()` panel-context errors, a tenant dashboard layout assertion, a Spec 279 `/admin/t/...` path expectation, and a tenant membership page copy/action expectation. | browser smoke/product UI follow-up owners | no | split browser repairs separately; do not treat the lane as green or restore retired tenant routes in 295 | classified |
| `legacy-cutover-route-expectations` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser` and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence` | `legacy-cutover-regression-guard` | `follow-up-spec-required` | Browser Spec 279 still expects `/admin/t/spec-279-production`, while the current path is `/admin/workspaces/{workspace}/environments/{environment}`. Confidence output also includes older expectations around `/admin/operations` and admin operation URLs. | tenant cutover regression guard owner | no | create follow-up only if current cutover truth should change; do not restore `/admin/t/...`, TenantPanelProvider behavior, or historical compatibility routes in 295 | classified |
| `provider-verification-regression-guard` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback` and raw full suite | `provider-verification-regression-guard` | `follow-up-spec-required` | Raw and fast-feedback output include provider boundary/status assertions such as unexpected `provider.capability_registry`, `review_required` vs `blocked`, and provider operation start-gate dispatch count drift. These are provider/runtime semantics, not lane wrapper failures. | Spec 294 provider/verification follow-up owner | no | open a provider verification follow-up if the current runtime semantics are wrong; do not rewrite Spec 294 artifacts under 295 | classified |
| `lane-reporting-all-lanes` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback`, `confidence`, `heavy-governance`, `browser` | `lane-reporting` | `resolved-or-not-needed` | All four report commands exited 0 and rendered `summary.md`, `report.json`, `budget.json`, `junit.xml`, and `trend-history.json` references. Reports preserved the red lane status and exposed budget/trend metadata instead of crashing. | TestLaneReport and report wrapper | no | none for report rendering | classified |
| `budget-trend-baseline-status` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback`, `confidence`, `heavy-governance`, `browser` | `budget-trend-baseline` | `resolved-or-not-needed` | Fast-feedback reported `within-budget` at 171.551792s against a 200s budget. Heavy-governance reported `within-budget` at 314.828382s against a 315s budget. Confidence (622.531394s against a 450s budget) and browser (285.719479s against a 150s budget) reported warning-level budget output, with warning enforcement and no hard budget-blocking failure. Trend windows were either stable, scope-changed, or insufficient-history as documented. | TestLaneBudget/trend baseline owner | no | investigate confidence/browser runtime separately if desired; no 295 budget relaxation or baseline rewrite was justified | classified |
| `artifact-publication-env-forwarding` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-295-fast-feedback-artifacts` and matching commands for confidence, heavy-governance, browser | `artifact-publication` | `artifact-publication-regression` | Initial artifact staging failed for every lane with `Unknown test lane []` because `scripts/platform-test-artifacts` passed lane and staging inputs to the Sail PHP process via host environment variables that were empty inside the container. | `scripts/platform-test-artifacts` | yes | fixed by passing lane, staging directory, and artifact directory as PHP argv through Sail and adding a guard test | resolved |
| `artifact-publication-after-fix` | `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-295-fast-feedback-artifacts` and matching commands for confidence, heavy-governance, browser | `artifact-publication` | `resolved-or-not-needed` | After the wrapper fix, all four artifact commands exited 0 and staged all five required artifacts with `complete: true`, `primaryFailureClassId: null`, and no missing required artifacts. | `scripts/platform-test-artifacts` and TestLaneReport artifact contract | yes | none for artifact publication | classified |
| `junit-support-output` | not run separately; lane wrappers produced `apps/platform/storage/logs/test-lanes/*-latest.junit.xml` | `profiling-or-junit-support` | `resolved-or-not-needed` | Separate `./scripts/platform-test-lane junit` was not needed because fast-feedback, confidence, heavy-governance, and browser wrappers already produced machine-readable JUnit artifacts used for classification. | TestLaneManifest JUnit support | no | none unless a future follow-up needs the dedicated JUnit lane | classified |
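
The `artifact-publication-env-forwarding` row describes inputs that were set on the host but arrived empty inside the container. The following is a minimal sketch of that failure mode and of the argv-based fix. It uses `env -i /bin/sh` as a stand-in for the container boundary, which is an assumption: the real wrapper goes through Sail and an embedded PHP script that is not reproduced here. The variable name `LANE` and both function names are hypothetical.

```bash
# Hypothetical stand-in for the container boundary: `env -i` starts the child
# with an empty environment, like a process that never receives host variables.

# Broken shape: the lane is exported on the host, but the child never sees it.
stage_via_env() (
  LANE="$1"
  export LANE
  env -i /bin/sh -c 'echo "lane=[${LANE:-}]"'
)

# Fixed shape: the lane travels as a positional argument (argv), which crosses
# the process boundary regardless of how the environment is forwarded.
stage_via_argv() (
  env -i /bin/sh -c 'echo "lane=[$1]"' _ "$1"
)
```

Calling `stage_via_env fast-feedback` prints an empty lane, which mirrors the `Unknown test lane []` message, while `stage_via_argv fast-feedback` prints the lane name.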
## Final Readiness Decision
Current decision: `classified-follow-up-required`
Allowed values:
- `restored-ci-signal`
- `classified-follow-up-required`
- `blocked-by-environment`
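The allowed values above can be checked mechanically. This is a minimal sketch, assuming the decision is available as a plain string; the function name `is_allowed_decision` is hypothetical and not part of the repository.

```bash
# Succeed only for one of the three allowed final readiness decisions.
is_allowed_decision() {
  case "$1" in
    restored-ci-signal|classified-follow-up-required|blocked-by-environment) return 0 ;;
    *) return 1 ;;
  esac
}
```

Note that `ci-signal-restored` is a row category rather than a decision value, so this check rejects it.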
## Classification Rules
- Every red group must have exactly one pinned category and one pinned seam.
- Do not use `ci-signal-restored` for a partially red lane.
- Fix in `295` only if the group is directly tied to CI wrapper, manifest, report, artifact, or budget/trend contract drift.
- Split product/runtime failures to follow-up ownership.
- Do not restore TenantPanelProvider, `/admin/t/...`, or retired tenant-scoped fallback routes.
- Do not rewrite completed Specs `293` or `294`.
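The first rule can be sketched as a membership check against the pinned vocabulary. The category and seam lists are the ones pinned for Spec `295`; the helper names `is_member` and `row_is_pinned` are hypothetical, and the sketch assumes a POSIX-style shell where unquoted variables are word-split.

```bash
PINNED_CATEGORIES="ci-signal-restored ci-wrapper-or-manifest-regression artifact-publication-regression budget-or-trend-baseline-drift product-runtime-or-test-regression browser-lane-regression flaky-or-environment follow-up-spec-required resolved-or-not-needed"
PINNED_SEAMS="raw-full-suite fast-feedback-lane confidence-lane heavy-governance-lane browser-lane profiling-or-junit-support lane-reporting artifact-publication budget-trend-baseline legacy-cutover-regression-guard provider-verification-regression-guard"

# Succeed when the first argument equals one of the remaining arguments.
is_member() (
  needle="$1"
  shift
  for item in "$@"; do
    [ "$item" = "$needle" ] && exit 0
  done
  exit 1
)

# Succeed only when a row carries one pinned category and one pinned seam.
row_is_pinned() {
  # Word splitting of the unquoted lists is intentional here.
  is_member "$1" $PINNED_CATEGORIES && is_member "$2" $PINNED_SEAMS
}
```

For example, `row_is_pinned browser-lane-regression browser-lane` succeeds, while a row that uses a readiness decision such as `restored-ci-signal` in the category column fails.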


@ -0,0 +1,181 @@
# Implementation Plan: Full Suite Failure Classification & CI Lane Baseline
**Branch**: `295-full-suite-ci-baseline` | **Date**: 2026-05-11 | **Spec**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md`
**Input**: Feature specification from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md`
## Summary
Spec `295` determines whether the full TenantPilot platform suite is again a reliable CI signal after Specs `293` and `294`. The implementation must run the raw full suite when classifiable, fall back to explicit existing lane wrappers when needed, classify every red group in `failure-classification.md`, validate report/artifact/budget failure classes, and only fix small CI/lane contract defects. Product/runtime failures are split into follow-up ownership instead of repaired here.
## Technical Context
**Language/Version**: PHP 8.4.15, Laravel 12.52.0
**Primary Dependencies**: Pest 4.3.1, PHPUnit 12.5.4, Laravel Sail 1.52.0, Filament 5.2.1, Livewire 4.1.4
**Storage**: no application storage changes; spec-local `failure-classification.md` only
**Testing**: Pest via Sail-first commands and existing lane wrappers
**Validation Lanes**: raw full suite, fast-feedback, confidence, heavy-governance, browser, junit/report support, profiling only if classification needs it
**Target Platform**: local Sail and Gitea-compatible CI wrappers
**Project Type**: Laravel monolith under `apps/platform` with repo-root CI helper scripts
**Performance Goals**: classify the existing suite signal without creating a new permanent lane or widening lane cost
**Constraints**: no broad suite repair, no legacy `/admin/t/...`, no TenantPanelProvider restoration, no runtime persistence, no new test family by default
**Scale/Scope**: complete platform test suite signal plus existing CI lane/report/artifact contracts
## UI / Surface Guardrail Plan
- **Guardrail scope**: no operator-facing surface change
- **Native vs custom classification summary**: N/A
- **Shared-family relevance**: CI/test-governance workflow only
- **State layers in scope**: none
- **Audience modes in scope**: N/A
- **Decision/diagnostic/raw hierarchy plan**: N/A for product UI; classification output keeps summary first and raw failure detail in row notes
- **Raw/support gating plan**: N/A
- **One-primary-action / duplicate-truth control**: one final readiness decision in `failure-classification.md`
- **Handling modes by drift class or surface**: CI/lane contract drift may be fixed; product/runtime drift becomes `follow-up-spec-required` or `product-runtime-or-test-regression`
- **Repository-signal treatment**: review-mandatory for every failing group; hard-stop if a group remains unclassified
- **Special surface test profiles**: `browser-smoke`, `surface-guard`, `discovery-heavy`, `global-context-shell`
- **Required tests or manual smoke**: existing Pest lane wrappers and raw full-suite command; no in-app Browser smoke unless implementation later changes visible UI, which is out of scope
- **Exception path and spread control**: any repair outside CI/lane contract correction triggers follow-up-spec classification
- **Active feature PR close-out entry**: `FullSuiteClassification`
## Shared Pattern & System Fit
- **Cross-cutting feature marker**: yes
- **Systems touched**: `scripts/platform-test-lane`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, `apps/platform/composer.json`, `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneReport.php`, `apps/platform/tests/Support/TestLaneBudget.php`, CI guard tests under `apps/platform/tests/Feature/Guards/`
- **Shared abstractions reused**: `TestLaneManifest`, `TestLaneReport`, `TestLaneBudget`, existing wrapper scripts and composer scripts
- **New abstraction introduced? why?**: none
- **Why the existing abstraction was sufficient or insufficient**: existing lane and failure-class contracts are the current source of truth; this spec proves or minimally corrects them instead of adding another layer
- **Bounded deviation / spread control**: product/runtime failures must be classified and split rather than repaired here
## OperationRun UX Impact
- **Touches OperationRun start/completion/link UX?**: no
- **Central contract reused**: N/A
- **Delegated UX behaviors**: N/A
- **Surface-owned behavior kept local**: N/A
- **Queued DB-notification policy**: N/A
- **Terminal notification path**: N/A
- **Exception path**: none
## Provider Boundary & Portability Fit
- **Shared provider/platform boundary touched?**: no product provider boundary change
- **Provider-owned seams**: provider/verification test failures may be classified, but runtime repair is out of scope unless it is strictly CI/lane contract drift
- **Platform-core seams**: CI lane/report/artifact contract only
- **Neutral platform terms / contracts preserved**: `workspace`, `managed environment`, `provider connection`, `lane`, `failure group`, `CI signal`
- **Retained provider-specific semantics and why**: none added
- **Bounded extraction or follow-up path**: follow-up-spec for any real provider/verification runtime debt after Spec `294`
## Constitution Check
*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*
- Inventory-first: PASS. No inventory or snapshot runtime behavior changes.
- Read/write separation: PASS. No application write/change function is introduced.
- Graph contract path: PASS. No Microsoft Graph calls are introduced or changed.
- Deterministic capabilities: PASS. Capability derivation is not changed.
- RBAC-UX: PASS. Existing RBAC tests may fail and be classified, but this spec does not change authorization behavior; any such change belongs to a follow-up spec.
- Workspace isolation: PASS. Workspace/managed-environment isolation failures are product/runtime debt, not CI-wrapper debt.
- Tenant isolation: PASS. No tenant-plane route or compatibility behavior is restored.
- Run observability: PASS. No new `OperationRun`, queue, scheduled work, or terminal notification policy is introduced.
- Test governance (TEST-GOV-001): PASS. The spec explicitly names proving purpose, lane mix, fixture cost boundaries, heavy/browser visibility, budget/trend treatment, and split decisions.
- Proportionality (PROP-001): PASS. The only new structure is one spec-local classification artifact needed for current CI readiness.
- No premature abstraction (ABSTR-001): PASS. No new CI framework or lane abstraction is introduced.
- Persisted truth (PERSIST-001): PASS. No application persistence; spec artifact is not runtime truth.
- Behavioral state (STATE-001): PASS. The classification vocabulary controls implementation workflow only and does not become product state.
- Shared pattern first (XCUT-001): PASS. Existing `TestLaneManifest`, `TestLaneReport`, wrapper scripts, and guard tests remain the shared path.
- Provider boundary (PROV-001): PASS. No provider runtime or vocabulary boundary is changed.
- V1 explicitness / few layers (V1-EXP-001, LAYER-001): PASS. Use direct classification and existing helpers.
- Spec discipline / bloat check (SPEC-DISC-001, BLOAT-001): PASS with proportionality review in `spec.md`.
- Filament-native UI (UI-FIL-001): PASS. No operator-facing Filament UI change.
- Filament v5 / Livewire v4: PASS. Current app info confirms Filament 5.2.1 and Livewire 4.1.4; this spec does not alter that relationship.
- Provider registration: PASS. No panel provider changes; Laravel provider registration remains in `apps/platform/bootstrap/providers.php`.
**Post-design re-check**: PASS while categories, seams, planned commands, and out-of-scope boundaries remain aligned across `spec.md`, `plan.md`, `research.md`, `data-model.md`, `quickstart.md`, `tasks.md`, `checklists/requirements.md`, and `failure-classification.md`.
## Test Governance Check
- **Pinned categories**: `ci-signal-restored`, `ci-wrapper-or-manifest-regression`, `artifact-publication-regression`, `budget-or-trend-baseline-drift`, `product-runtime-or-test-regression`, `browser-lane-regression`, `flaky-or-environment`, `follow-up-spec-required`, `resolved-or-not-needed`
- **Pinned seams**: `raw-full-suite`, `fast-feedback-lane`, `confidence-lane`, `heavy-governance-lane`, `browser-lane`, `profiling-or-junit-support`, `lane-reporting`, `artifact-publication`, `budget-trend-baseline`, `legacy-cutover-regression-guard`, `provider-verification-regression-guard`
- **Test purpose / classification by changed surface**: full-suite classification, CI lane contract verification, and optional CI/lane guard tests only
- **Affected validation lanes**: raw full suite, fast-feedback, confidence, heavy-governance, browser, junit/report support
- **Why this lane mix is the narrowest sufficient proof**: raw full suite answers the main readiness question; explicit lane split keeps classification possible when the raw run is too noisy; report/artifact commands validate CI interpretability
- **Narrowest proving command(s)**:
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser`
- corresponding `./scripts/platform-test-report <lane>` commands for report/artifact classification
- **Fixture / helper / factory / seed / context cost risks**: no new defaults; classify fixture-heavy failures instead of widening setup by default
- **Expensive defaults or shared helper growth introduced?**: no
- **Heavy-family additions, promotions, or visibility changes**: none by default
- **Surface-class relief / special coverage rule**: browser/heavy lane output is classification-only unless active fix scope explicitly owns it
- **Closing validation and reviewer handoff**: reviewers should confirm no unclassified failing group, no hidden budget relaxation, no new lane family, and no legacy cutover behavior restoration
- **Budget / baseline / trend follow-up**: classify in `failure-classification.md`; only adjust a baseline when the row explains why current evidence supports it
- **Review-stop questions**: lane fit, hidden fixture cost, product repair scope creep, browser scope creep, budget baseline relaxation
- **Escalation path**: `document-in-feature` for CI/lane contract corrections, `follow-up-spec` for product/runtime failures
- **Active feature PR close-out entry**: `FullSuiteClassification`
- **Why no dedicated follow-up spec is needed**: this spec is itself the bounded classification pass. Follow-up specs are created only for classified product/runtime groups.
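The budget treatment referenced in this checklist comes down to comparing a measured lane duration with its budget. The sketch below is illustrative only: `within-budget` is the status the reports use, while the `over-budget` label, the inclusive boundary, and the function name `budget_status` are assumptions, and the real logic lives in `TestLaneBudget`.

```bash
# Print `within-budget` when the measured seconds do not exceed the budget,
# otherwise `over-budget` (a hypothetical label; the reports describe this
# case as warning-level budget output).
budget_status() {
  awk -v measured="$1" -v budget="$2" 'BEGIN {
    if (measured + 0 <= budget + 0) print "within-budget"; else print "over-budget"
  }'
}
```

With the figures recorded in `failure-classification.md`, `budget_status 171.551792 200` falls on the `within-budget` side and `budget_status 622.531394 450` does not.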
## Project Structure
### Documentation (this feature)
```text
specs/295-full-suite-ci-baseline/
├── checklists/
│ └── requirements.md
├── data-model.md
├── failure-classification.md
├── plan.md
├── quickstart.md
├── research.md
├── spec.md
└── tasks.md
```
### Source Code (repository root)
```text
scripts/
├── platform-test-artifacts
├── platform-test-lane
└── platform-test-report
apps/platform/
├── composer.json
└── tests/
├── Feature/Guards/
└── Support/
```
**Structure Decision**: implementation should touch only the documentation artifacts above unless classification proves a small CI/lane contract defect in the listed scripts/support/guard-test surfaces. Runtime application code, migrations, models, Filament resources, routes, views, and provider services are out of scope.
## Complexity Tracking
| Violation | Why Needed | Simpler Alternative Rejected Because |
|---|---|---|
| Spec-local failure-classification vocabulary | The full-suite readiness decision needs one bounded way to classify all red groups after Specs `293` and `294` | Raw terminal notes would not preserve ownership, lane, or follow-up decisions |
## Proportionality Review
- **Current operator problem**: maintainers cannot safely decide whether CI is restored without a classified full-suite baseline.
- **Existing structure is insufficient because**: targeted green lanes and raw full-suite output answer different questions; neither alone assigns follow-up ownership.
- **Narrowest correct implementation**: one spec-local classification artifact and existing lane wrappers.
- **Ownership cost**: temporary classification upkeep during implementation and possibly small lane contract guard adjustments.
- **Alternative intentionally rejected**: new full-suite CI framework or fix-all suite cleanup.
- **Release truth**: current-release test governance and CI readiness.
## Phase 0: Research Output
See `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/research.md`.
## Phase 1: Design Output
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md`
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/quickstart.md`
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
## Phase 2: Task Planning Output
See `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/tasks.md`.


@ -0,0 +1,90 @@
# Quickstart: Full Suite Failure Classification & CI Lane Baseline
## Purpose
Use this package to classify whether the complete platform test suite is a reliable CI signal after Specs `293` and `294`.
## Before Implementation
1. Review:
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md`
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/plan.md`
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md`
- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
2. Confirm the branch is clean.
3. Confirm no implementation step is about restoring TenantPanelProvider, `/admin/t/...`, or tenant-scoped legacy fallbacks.
## Primary Classification Flow
Use only the pinned categories and seams from `failure-classification.md`:
- **Pinned categories**: `ci-signal-restored`, `ci-wrapper-or-manifest-regression`, `artifact-publication-regression`, `budget-or-trend-baseline-drift`, `product-runtime-or-test-regression`, `browser-lane-regression`, `flaky-or-environment`, `follow-up-spec-required`, `resolved-or-not-needed`
- **Pinned seams**: `raw-full-suite`, `fast-feedback-lane`, `confidence-lane`, `heavy-governance-lane`, `browser-lane`, `profiling-or-junit-support`, `lane-reporting`, `artifact-publication`, `budget-trend-baseline`, `legacy-cutover-regression-guard`, `provider-verification-regression-guard`
Run the raw full suite when feasible:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)
```
Record the outcome in `failure-classification.md`.
If the raw full suite is too slow, noisy, or environment-blocked to classify, run the explicit lane split:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser
```
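Because a red lane should not stop the remaining lanes from producing output for classification, the lane split can also be driven by a small loop. This is a sketch under two assumptions: the wrapper exits non-zero when a lane is red (this document does not state its exit codes), and the function name `run_lane_split` is hypothetical. The runner is passed in as a parameter so the loop can be exercised without the real wrapper.

```bash
# Run every lane through a runner command and keep going when a lane is red,
# so all four lanes produce output for classification.
run_lane_split() (
  runner="$1"
  red=0
  for lane in fast-feedback confidence heavy-governance browser; do
    if "$runner" "$lane"; then
      echo "$lane: green"
    else
      echo "$lane: red"
      red=$((red + 1))
    fi
  done
  # Exit non-zero when at least one lane was red.
  [ "$red" -eq 0 ]
)
```

From the repository root this would be invoked as `run_lane_split ./scripts/platform-test-lane`, after the same `PATH` export used in the commands above.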
## CI Report and Artifact Flow
After lane runs, generate lane reports when needed:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser
```
Use artifact staging only if artifact publication itself is being validated:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-fast-feedback-artifacts
```
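After staging, the directory can be checked for the five artifacts named in `failure-classification.md`. This is a rough sketch: the flat directory layout and the exact staged file names are assumptions taken from the report references (`summary.md`, `report.json`, `budget.json`, `junit.xml`, `trend-history.json`), and `missing_artifacts` is a hypothetical helper.

```bash
# Print the name of each required artifact that is missing from a staging
# directory. An empty result means all five files are present.
missing_artifacts() (
  dir="$1"
  for name in summary.md report.json budget.json junit.xml trend-history.json; do
    [ -f "$dir/$name" ] || echo "$name"
  done
)
```

For example, `missing_artifacts /tmp/tenantpilot-fast-feedback-artifacts` prints nothing when staging is complete.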
## Fix Rules
Fix in `295` only when the failure is directly and narrowly caused by:
- `scripts/platform-test-lane`
- `scripts/platform-test-report`
- `scripts/platform-test-artifacts`
- `apps/platform/tests/Support/TestLaneManifest.php`
- `apps/platform/tests/Support/TestLaneReport.php`
- `apps/platform/tests/Support/TestLaneBudget.php`
- directly related CI guard tests under `apps/platform/tests/Feature/Guards/`
Do not fix in `295` when the failure requires:
- application runtime behavior changes
- Filament page/resource changes
- routes, middleware, policies, services, jobs, migrations, views, or models
- provider/verification runtime changes beyond the completed Spec `294`
- browser UI repair
- tenant-cutover compatibility restoration
Classify those as follow-up work instead.
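The path side of these rules can be sketched as a simple pattern match on a changed file, relative to the repository root. A path match is necessary but not sufficient: the rules above also require that the failure is directly and narrowly caused by that file, and that a guard test is directly related. The function name `fix_scope` and its two output labels are hypothetical.

```bash
# Classify a changed path (relative to the repository root) against the Fix Rules.
fix_scope() {
  case "$1" in
    scripts/platform-test-lane|scripts/platform-test-report|scripts/platform-test-artifacts)
      echo "in-scope" ;;
    apps/platform/tests/Support/TestLaneManifest.php|apps/platform/tests/Support/TestLaneReport.php|apps/platform/tests/Support/TestLaneBudget.php)
      echo "in-scope" ;;
    apps/platform/tests/Feature/Guards/*)
      echo "in-scope" ;;
    *)
      # Everything else (runtime code, routes, Filament resources, ...) is split out.
      echo "follow-up" ;;
  esac
}
```

For instance, `fix_scope scripts/platform-test-artifacts` reports `in-scope`, while any path under the application runtime reports `follow-up`.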
## Expected Close-Out
Close out with exactly one final readiness decision:
- `restored-ci-signal`
- `classified-follow-up-required`
- `blocked-by-environment`
Then run formatting for any changed PHP files:
```bash
export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)
```


@ -0,0 +1,58 @@
# Research: Full Suite Failure Classification & CI Lane Baseline
## Decision: Use classification-first implementation
**Rationale**: The user explicitly asked not to blindly repair the full suite. Specs `293` and `294` already handled known focused stabilization slices. `295` must first answer whether the full suite is a reliable signal and only then allow small CI/lane fixes.
**Alternatives considered**:
- **Fix every failing test immediately**: rejected because it hides ownership, scope-creeps into unrelated features, and violates the requested goal.
- **Run only targeted lanes**: rejected because the central question is the complete suite signal after the targeted lanes were stabilized.
- **Skip full-suite run and rely on CI lanes**: rejected because lane split can hide cross-lane fallout or raw-suite issues.
## Decision: Prefer raw full suite, then explicit lane split fallback
**Rationale**: The raw command `cd apps/platform && ./vendor/bin/sail artisan test --compact` is the most direct answer to the full-suite readiness question. If it times out, produces output too large to classify, or is environment-blocked, the existing wrappers provide explicit fallback segmentation: `fast-feedback`, `confidence`, `heavy-governance`, and `browser`.
**Alternatives considered**:
- **Create a new full-suite wrapper**: rejected as premature CI framework growth.
- **Use only `confidence`**: rejected because confidence intentionally excludes browser, heavy-governance, and some discovery-heavy families.
## Decision: Reuse existing lane and failure-class contracts
**Rationale**: `TestLaneManifest` already defines lanes, workflow profiles, budgets, artifact contracts, and lane scope notes. `TestLaneReport` already classifies CI failures as `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, or `infrastructure-failure`. Spec `295` should verify and minimally correct those contracts rather than inventing another taxonomy.
**Pinned Spec 295 categories**: `ci-signal-restored`, `ci-wrapper-or-manifest-regression`, `artifact-publication-regression`, `budget-or-trend-baseline-drift`, `product-runtime-or-test-regression`, `browser-lane-regression`, `flaky-or-environment`, `follow-up-spec-required`, `resolved-or-not-needed`.
**Pinned Spec 295 seams**: `raw-full-suite`, `fast-feedback-lane`, `confidence-lane`, `heavy-governance-lane`, `browser-lane`, `profiling-or-junit-support`, `lane-reporting`, `artifact-publication`, `budget-trend-baseline`, `legacy-cutover-regression-guard`, `provider-verification-regression-guard`.
**Alternatives considered**:
- **Add a separate CI readiness model**: rejected because the existing support classes already own this truth.
- **Record only plain-text notes**: rejected because future maintainers need stable categories, seams, and follow-up decisions.
## Decision: Allow only small CI/lane contract fixes
**Rationale**: In-scope fixes are limited to wrappers, manifest/report support, artifact publication, budget/report contract drift, and their direct guard tests. This keeps the package focused on CI signal readiness.
**Alternatives considered**:
- **Fix application/runtime failures discovered by the suite**: rejected unless a failure is proven to be a small CI/lane contract defect.
- **Update historical Specs `293` or `294`**: rejected by completed-spec guardrail and user scope.
## Decision: Preserve legacy cutover retirement
**Rationale**: The request explicitly forbids reopening tenant cutover, legacy `/admin/t/...`, or TenantPanelProvider. Any failure that appears to depend on those retired paths must be classified without restoring them.
**Alternatives considered**:
- **Add temporary route aliases to make old tests pass**: rejected as direct conflict with the cutover baseline.
## Decision: Browser output is classification input, not automatic repair ownership
**Rationale**: The browser lane is intentionally isolated and may expose environment or smoke fallout. Spec `295` should classify browser failures and repair browser-specific issues only when they are lane/report contract defects, not product UI behavior.
**Alternatives considered**:
- **Run a browser smoke fix loop inside 295**: rejected because this is not a UI implementation spec.


@ -0,0 +1,342 @@
# Feature Specification: Full Suite Failure Classification & CI Lane Baseline
**Feature Branch**: `295-full-suite-ci-baseline`
**Created**: 2026-05-11
**Status**: Ready
**Input**: User description: "Spec 295 - Full Suite Failure Classification & CI Lane Baseline. After Specs 293 and 294, run a full-suite classification to determine whether the full platform suite is again a reliable CI signal or whether remaining failures must be classified into separate follow-up specs or lanes. Do not blindly fix the full suite, do not scope-creep, do not reopen tenant cutover, do not restore legacy `/admin/t/...` or TenantPanelProvider behavior, and perform only small clearly in-scope fixes."
## Spec Candidate Check *(mandatory - SPEC-GATE-001)*
- **Problem**: Specs `293` and `294` closed the known post-cutover route/action-surface and ProviderConnections/Verification failure blocks, but the complete platform suite has not yet been classified as a restored CI signal. Maintainers need one bounded pass that distinguishes green signal, CI wrapper or lane baseline failures, remaining product regressions, flaky or environment failures, and follow-up-spec debt.
- **Today's failure**: targeted lanes can be green while the raw full suite or CI lane wrappers still fail because of unrelated product debt, wrapper/report/artifact drift, budget baseline changes, browser-specific fallout, or environment-only failures. Without classification, maintainers cannot tell whether a red run means "fix this PR", "rerun because infrastructure failed", "update lane baseline", or "open a follow-up spec".
- **User-visible improvement**: maintainers get an attributable CI readiness decision: either the complete platform suite is a reliable blocking signal again, or every remaining red group is explicitly assigned to the right lane, owner, and follow-up path without reviving retired tenant routes or reopening Specs `293` and `294`.
- **Smallest enterprise-capable version**: one classification-first package that runs the raw full suite or its explicit fallback lane split, records every failing group in `failure-classification.md`, validates existing lane wrappers/report/artifact contracts, applies only small CI-signal fixes when the failure is clearly in scope, and records all product/runtime failures as follow-up candidates instead of absorbing them.
- **Explicit non-goals**: no broad full-suite repair, no tenant-cutover rework, no TenantPanelProvider reactivation, no `/admin/t/...` route restoration, no provider/verification runtime expansion beyond Spec `294`, no new CI framework, no new permanent test lane by default, no new browser family, no new runtime persistence, no UI redesign, no product feature work, no unrelated failing-test cleanup, and no historical-spec rewrites.
- **Permanent complexity imported**: one spec-local `failure-classification.md` artifact, one bounded failure-category inventory, one bounded CI/lane seam inventory, and focused tasks against existing test lane scripts, lane manifest/report support, and current Pest lane commands. No runtime table, model, enum, provider abstraction, Filament resource, or product surface is introduced.
- **Why now**: after `293` and `294`, the next quality question is no longer about one known red cluster. It is whether CI as a whole can be trusted again. If this is not classified now, later specs will either over-trust a partially red suite or keep rediscovering unrelated failures as local surprises.
- **Why not local**: the signal spans raw Pest execution, `scripts/platform-test-lane`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, `Tests\Support\TestLaneManifest`, `Tests\Support\TestLaneReport`, browser isolation, heavy-governance budget/reporting, and current workflow profiles. A one-file patch would not prove CI readiness.
- **Approval class**: Cleanup
- **Red flags triggered**: full-suite scope, cross-cutting test governance, and possible temptation to repair unrelated product failures. Defense: this spec is classification-first, uses existing lane/failure-class contracts, imports only a spec-local artifact, and forbids broad repair or legacy route restoration.
- **Score**: Benefit: 2 | Urgency: 2 | Scope: 2 | Complexity: 1 | Product proximity: 1 | Reuse: 2 | **Total: 10/12**
- **Decision**: approve
## Review Outcome
- **Outcome class**: `acceptable-special-case`
- **Workflow outcome**: `keep`
- **Test-governance outcome**: `keep`
- **Reason**: full-suite work is normally too broad, but this package is justified because it is a classification and CI-signal baseline pass after two completed stabilization slices, not a fix-all implementation.
- **Workflow result**: Ready for implementation as one bounded suite-signal classification package after Specs `293` and `294`.
## Candidate Selection Gate
- **Selected candidate**: Full Suite Failure Classification & CI Lane Baseline
- **Source location**: explicit user-provided manual follow-up after `specs/293-post-cutover-suite-stabilization/` and `specs/294-provider-verification-runtime-semantics/`
- **Why selected now**: the known cutover and provider/verification red blocks have been stabilized, so the remaining decision is whether the full platform suite and lane wrappers now form a trustworthy CI signal.
- **Why close alternatives were deferred**:
- reopening Spec `293` would blur route/action-surface cutover cleanup with full-suite CI readiness
- reopening Spec `294` would blur provider/verification runtime semantics with unrelated suite failures
- starting Package Execution, Guided Operations, Microsoft Starter Pack, or Virtual Consultant would hide CI uncertainty under new product work
- creating a new permanent full-suite lane would import CI framework complexity before proving the existing lanes are insufficient
- fixing every failing test in one pass would scope-creep beyond classification and make follow-up ownership unclear
- **Roadmap relationship**: test-governance and platform quality follow-through under `TEST-GOV-001`; this is not a new product roadmap lane and not an automatic active queue promotion.
- **Completed-spec guardrail result**: Specs `293` and `294` are context only and are excluded from refresh. Spec `294` carries implementation close-out evidence. Spec `293` is treated as the completed post-cutover baseline described by the user and its failure-classification history is preserved; this spec does not rewrite 293 tasks or close-out history. Specs `287` and `288` remain prior cutover and no-legacy guard context only.
- **Smallest viable implementation slice**: run the full suite or explicit lane split, classify every remaining failure group, validate CI wrapper/report/artifact contracts, and perform only small CI-signal fixes that do not change product behavior.
- **Proposed concise feature description to feed into specify**: Classify the full platform test suite after Specs 293 and 294 and establish whether existing CI lanes provide a trustworthy baseline, while splitting unrelated failures into explicit follow-up ownership instead of repairing the suite blindly.
## Pinned Failure-Classification Categories
- `ci-signal-restored`
- `ci-wrapper-or-manifest-regression`
- `artifact-publication-regression`
- `budget-or-trend-baseline-drift`
- `product-runtime-or-test-regression`
- `browser-lane-regression`
- `flaky-or-environment`
- `follow-up-spec-required`
- `resolved-or-not-needed`
## Pinned CI / Suite Seams
- `raw-full-suite`
- `fast-feedback-lane`
- `confidence-lane`
- `heavy-governance-lane`
- `browser-lane`
- `profiling-or-junit-support`
- `lane-reporting`
- `artifact-publication`
- `budget-trend-baseline`
- `legacy-cutover-regression-guard`
- `provider-verification-regression-guard`
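The two pinned vocabularies above can be checked mechanically before a row is accepted into `failure-classification.md`. The following is a minimal sketch only: the helper names `is_pinned_category` and `is_pinned_seam` are hypothetical and do not exist in the repository; the values are copied from the lists above.

```shell
# Hypothetical helpers (not part of the repo): succeed only when the argument
# is one of the values pinned in this spec.
is_pinned_category() {
  case "$1" in
    ci-signal-restored|ci-wrapper-or-manifest-regression|artifact-publication-regression|\
    budget-or-trend-baseline-drift|product-runtime-or-test-regression|browser-lane-regression|\
    flaky-or-environment|follow-up-spec-required|resolved-or-not-needed)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

is_pinned_seam() {
  case "$1" in
    raw-full-suite|fast-feedback-lane|confidence-lane|heavy-governance-lane|browser-lane|\
    profiling-or-junit-support|lane-reporting|artifact-publication|budget-trend-baseline|\
    legacy-cutover-regression-guard|provider-verification-regression-guard)
      return 0 ;;
    *)
      return 1 ;;
  esac
}
```

A call such as `is_pinned_category "flaky-or-environment" && is_pinned_seam "browser-lane"` succeeds, while a readiness value such as `restored-ci-signal` is rejected as a category because it belongs to the separate readiness vocabulary.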
## Spec Scope Fields *(mandatory)*
- **Scope**: repository / CI test-governance workflow
- **Primary Routes**: N/A - no application routes or operator-facing navigation are added or restored. Retired `/admin/t/...` routes and TenantPanelProvider behavior remain forbidden.
- **Data Ownership**:
- no new application persistence is introduced
- no runtime source of truth is introduced
- `failure-classification.md` is a spec-local implementation artifact and is not product/runtime truth
- existing test lane truth remains in `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneReport.php`, and the wrapper scripts under `scripts/`
- **RBAC**:
- no authorization model changes are introduced
- existing workspace and managed-environment isolation tests remain ordinary suite participants
- if a failing group concerns RBAC, it must be classified as product/runtime debt or a follow-up spec unless it is clearly only a stale CI/lane assertion
For canonical-view specs, the spec MUST define:
- **Default filter behavior when tenant-context is active**: N/A - no canonical-view application surface is added or changed.
- **Explicit entitlement checks preventing cross-tenant leakage**: N/A for this prep package. Any suite failure suggesting leakage must be classified as product-runtime debt and not hidden as a lane issue.
## Cross-Cutting / Shared Pattern Reuse *(mandatory when the feature touches notifications, status messaging, action links, header actions, dashboard signals/cards, alerts, navigation entry points, evidence/report viewers, or any other existing shared operator interaction family; otherwise write `N/A - no shared interaction family touched`)*
- **Cross-cutting feature?**: yes
- **Interaction class(es)**: CI lane execution, full-suite signal classification, lane report generation, artifact publication, budget/trend baseline review, and follow-up-spec routing
- **Systems touched**:
- `scripts/platform-test-lane`
- `scripts/platform-test-report`
- `scripts/platform-test-artifacts`
- `apps/platform/composer.json`
- `apps/platform/tests/Support/TestLaneManifest.php`
- `apps/platform/tests/Support/TestLaneReport.php`
- `apps/platform/tests/Support/TestLaneBudget.php`
- `apps/platform/tests/Feature/Guards/TestLaneManifestTest.php`
- `apps/platform/tests/Feature/Guards/CiLaneFailureClassificationContractTest.php`
- `apps/platform/tests/Feature/Guards/CiFastFeedbackWorkflowContractTest.php`
- `apps/platform/tests/Feature/Guards/CiConfidenceWorkflowContractTest.php`
- `apps/platform/tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php`
- existing lane-selected Pest tests and browser smoke files only as classification inputs unless a small CI-signal fix is proven
- **Existing pattern(s) to extend**: existing `TestLaneManifest` lane definitions, existing `TestLaneReport` failure classes, existing lane wrapper scripts, existing Gitea workflow profile metadata, existing report/artifact publication contracts
- **Shared contract / presenter / builder / renderer to reuse**: `TestLaneManifest::lanes()`, `TestLaneManifest::workflowProfiles()`, `TestLaneManifest::failureClasses()`, `TestLaneReport::classifyPrimaryFailure()`, `TestLaneReport::buildCiSummary()`, `TestLaneReport::artifactPublicationStatus()`, and `scripts/platform-test-*`
- **Why the existing shared path is sufficient or insufficient**: the repo already has explicit lane, failure-class, artifact, and budget contracts. Spec `295` must prove whether they are currently enough and fix only small contract drift; it must not create a new CI orchestration layer before existing contracts are classified.
- **Allowed deviation and why**: only a bounded CI/lane contract correction is allowed when a wrapper, manifest, report, artifact, or budget baseline defect prevents classification. Product/runtime failures must be classified and split instead of fixed here.
- **Consistency impact**: raw suite output, lane wrapper output, report artifacts, budget/trend summaries, and final follow-up classification must tell the same story about whether the suite is green, blocked, flaky, or split.
- **Review focus**: reviewers must verify that this spec does not become a general failing-test cleanup, does not restore tenant-cutover legacy behavior, and does not add a new permanent lane unless the artifacts explicitly prove existing lanes are insufficient.
## OperationRun UX Impact *(mandatory when the feature creates, queues, deduplicates, resumes, blocks, completes, or deep-links to an `OperationRun`; otherwise write `N/A - no OperationRun start or link semantics touched`)*
- **Touches OperationRun start/completion/link UX?**: no
- **Shared OperationRun UX contract/layer reused**: N/A
- **Delegated start/completion UX behaviors**: N/A
- **Local surface-owned behavior that remains**: N/A
- **Queued DB-notification policy**: N/A
- **Terminal notification path**: N/A
- **Exception required?**: none
## Provider Boundary / Platform Core Check *(mandatory when the feature changes shared provider/platform seams, identity scope, governed-subject taxonomy, compare strategy selection, provider connection descriptors, or operator vocabulary that may leak provider-specific semantics into platform-core truth; otherwise write `N/A - no shared provider/platform boundary touched`)*
- **Shared provider/platform boundary touched?**: no product boundary change
- **Boundary classification**: N/A
- **Seams affected**: provider and verification tests may fail during classification, but this spec may only classify them as regression or follow-up debt unless the failure is purely a CI/lane contract issue.
- **Neutral platform terms preserved or introduced**: `workspace`, `managed environment`, `provider connection`, `operation`, `lane`, `failure group`, `CI signal`
- **Provider-specific semantics retained and why**: N/A
- **Why this does not deepen provider coupling accidentally**: Spec `295` does not change provider runtime, provider identity, target-scope semantics, or provider copy. It treats provider-specific failures as test/runtime debt requiring explicit follow-up unless they are already covered by the completed Spec `294` seam and proven to be a small regression in the CI contract.
- **Follow-up path**: any real provider/verification product failure after Spec `294` must become a follow-up spec or explicitly named failure group, not hidden in this classification pass.
## UI / Surface Guardrail Impact *(mandatory when operator-facing surfaces are changed; otherwise write `N/A`)*
N/A - no operator-facing surface change. Browser tests may be run as a lane signal only; visible UI repair is out of scope unless a later implementation explicitly stops and opens a follow-up spec.
## Decision-First Surface Role *(mandatory when operator-facing surfaces are changed)*
N/A - no application decision surface is added or changed.
## Audience-Aware Disclosure *(mandatory when operator-facing surfaces are changed)*
N/A - no application disclosure layer is added or changed.
## UI/UX Surface Classification *(mandatory when operator-facing surfaces are changed)*
N/A - no Filament screen, table, widget, relation manager, or resource is added or materially refactored.
## Operator Surface Contract *(mandatory when operator-facing surfaces are changed)*
N/A - no operator-facing page contract is introduced.
## Proportionality Review *(mandatory when structural complexity is introduced)*
- **New source of truth?**: no runtime source of truth
- **New persisted entity/table/artifact?**: no application persistence; one spec-local `failure-classification.md` artifact is added for implementation tracking only
- **New abstraction?**: no
- **New enum/state/reason family?**: yes, one spec-local failure-classification category set used only inside this spec package
- **New cross-domain UI framework/taxonomy?**: no
- **Current operator problem**: maintainers need one reliable answer to whether the full suite is a usable CI signal after Specs `293` and `294`, and if not, exactly which lane or follow-up owns the remaining failures.
- **Existing structure is insufficient because**: targeted green lanes do not prove full-suite readiness, while raw red output without classification does not tell maintainers whether to fix, split, rerun, or update lane baseline artifacts.
- **Narrowest correct implementation**: add one spec-local failure-classification artifact, use existing lane wrappers and support classes, classify all remaining groups, and fix only small CI-signal defects that block classification.
- **Ownership cost**: low to moderate; maintain one temporary classification artifact and any small lane contract correction made during implementation.
- **Alternative intentionally rejected**: a new full-suite framework, broad test rewrite, or permanent new lane. Those options import durable complexity before the existing lane system is proven insufficient.
- **Release truth**: current-release CI/test-governance readiness only
### Compatibility posture
This feature assumes a pre-production environment.
Backward compatibility, legacy aliases, route shims, TenantPanelProvider restoration, and compatibility-specific tests are out of scope. Canonical replacement remains preferred over preservation.
## Testing / Lane / Runtime Impact *(mandatory for runtime behavior changes)*
- **Test purpose / classification**: Heavy-Governance, Feature, Browser, Support/JUnit, and full-suite classification
- **Validation lane(s)**: raw full suite, fast-feedback, confidence, heavy-governance, browser, profiling/support when needed, junit/report/artifact publication when needed
- **Why this classification and these lanes are sufficient**: the goal is not to prove one feature behavior. The proving purpose is to establish whether the complete platform suite and existing CI lanes produce a trustworthy pass/fail signal after the known stabilization work.
- **New or expanded test families**: none by default. Any new test must be limited to a small CI/lane contract guard if a wrapper/report/artifact regression is proven.
- **Fixture / helper cost impact**: no new expensive fixture defaults are allowed. If fixture drift appears in the full suite, classify it by failing family and split to follow-up unless a one-line lane/guard baseline is the direct cause.
- **Heavy-family visibility / justification**: explicit. Heavy-governance and browser lanes are signal inputs, not automatic repair ownership.
- **Special surface test profile**: `global-context-shell`, `standard-native-filament`, `shared-detail-family`, `browser-smoke`, `surface-guard`, `discovery-heavy`
- **Standard-native relief or required special coverage**: no UI coverage expansion; browser lane reruns are used only to classify the existing smoke baseline.
- **Reviewer handoff**: reviewers must confirm that Livewire remains v4.0+, Filament remains v5, provider registration stays in `apps/platform/bootstrap/providers.php`, globally searchable resources are not changed, destructive actions are not changed, no assets are registered, every remaining failure is classified, and any in-scope fix is tied directly to a CI/lane contract defect.
- **Budget / baseline / trend impact**: the classification may update the documented status of budget or trend baseline drift, but it must not silently relax lane budgets or create a new baseline without an explicit row in `failure-classification.md`.
- **Escalation needed**: `document-in-feature` for contained lane baseline findings; `follow-up-spec` for product/runtime failures, fixture-family debt, new heavy cost centers, browser fallout, or any repair that exceeds CI/lane contract correction.
- **Active feature PR close-out entry**: `FullSuiteClassification`
- **Planned validation commands**:
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit`
- `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)`
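The four fallback lane commands listed above could be driven from one loop. This is a sketch under assumptions, not the project's tooling: it assumes the wrapper exits non-zero when a lane is red, and `LANE_RUNNER` and `run_fallback_lanes` are hypothetical names introduced here so the loop can be dry-run with a stub.

```shell
# Assumption: the lane wrapper exits non-zero when a lane is red.
# LANE_RUNNER is a hypothetical override; the default is the wrapper named in this spec.
LANE_RUNNER="${LANE_RUNNER:-./scripts/platform-test-lane}"

run_fallback_lanes() {
  # Run every lane even if an earlier one fails, so each lane gets its own outcome.
  for lane in fast-feedback confidence heavy-governance browser; do
    # Lane output goes to stderr so stdout carries only the one-line summaries.
    if "$LANE_RUNNER" "$lane" >&2; then
      echo "$lane: pass"
    else
      echo "$lane: fail"
    fi
  done
}
```

Calling `run_fallback_lanes` from the repository root prints one `lane: pass` or `lane: fail` line per lane. Continuing past a red lane is a deliberate choice in this sketch, because classification needs an outcome for all four lanes.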
## User Scenarios & Testing *(mandatory)*
### User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1)
As a maintainer, I want the complete platform suite run, or the explicit fallback lane split, classified before any fixes are made so the project knows whether CI is green, blocked, flaky, or split into follow-up work.
**Why this priority**: without classification first, Spec `295` would become an uncontrolled full-suite repair pass.
**Independent Test**: Run the raw full suite or fallback lane split and prove every failing group has exactly one category, one seam, one owner/follow-up decision, and one status row in `failure-classification.md`.
**Acceptance Scenarios**:
1. **Given** the repo after Specs `293` and `294`, **When** the raw full suite passes, **Then** `failure-classification.md` records `ci-signal-restored` with the command, date, and pass counts.
2. **Given** the raw full suite fails, **When** the failure groups are reviewed, **Then** each group is classified before any repair is attempted.
3. **Given** a failing group points at `/admin/t/...`, TenantPanelProvider, or legacy tenant route behavior, **When** it is classified, **Then** the remedy must not restore that behavior and must be split or fixed only through current workspace-first truth.
---
### User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1)
As a maintainer, I want each existing CI lane wrapper, report, artifact, and failure class to produce a trustworthy signal so Gitea CI failures can be interpreted without reading raw terminal output first.
**Why this priority**: a green or red Pest run is not enough if wrapper, report, artifact, budget, or failure-class summaries are stale.
**Independent Test**: Run the existing lane wrappers and report commands, then verify each lane either passes with complete artifacts or fails with the correct primary failure class.
**Acceptance Scenarios**:
1. **Given** a lane fails because tests fail, **When** its report summary is generated, **Then** the primary failure class is `test-failure` rather than wrapper, artifact, or infrastructure failure.
2. **Given** a lane wrapper or manifest no longer resolves to the intended lane, **When** the lane is classified, **Then** it is marked `ci-wrapper-or-manifest-regression` and may be fixed in `295`.
3. **Given** required report artifacts are missing after a lane run, **When** publication is checked, **Then** it is classified as `artifact-publication-regression` and may be fixed in `295`.
---
### User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1)
As a maintainer, I want remaining product/runtime failures to become explicit follow-up ownership instead of being silently fixed under a CI-baseline spec.
**Why this priority**: this protects scope discipline and keeps test-governance decisions attributable.
**Independent Test**: Review every non-CI failure group and prove it either has a targeted follow-up recommendation or is demonstrably flaky/environmental.
**Acceptance Scenarios**:
1. **Given** a failing group requires a runtime product fix, **When** classification finishes, **Then** it is marked `follow-up-spec-required` or `product-runtime-or-test-regression` and not repaired under `295` unless the user explicitly starts that implementation scope later.
2. **Given** a failing group belongs to browser-only behavior, **When** classification finishes, **Then** it is marked `browser-lane-regression` with the existing smoke file and follow-up path.
3. **Given** a failing group disappears on rerun or is environment-specific, **When** classification finishes, **Then** it is marked `flaky-or-environment` with rerun evidence instead of being treated as restored CI.
---
### User Story 4 - Publish the Final CI Readiness Decision (Priority: P2)
As a maintainer, I want a final readiness statement that says whether the full suite can be used as a CI baseline now, and what exact follow-up remains if it cannot.
**Why this priority**: the output must be actionable for future specs and Gitea workflows, not just a local debugging note.
**Independent Test**: Inspect `failure-classification.md`, lane report outputs, and final validation commands to confirm there are no unclassified failure groups and no hidden scope expansion.
**Acceptance Scenarios**:
1. **Given** all raw suite and lane signals pass, **When** close-out is prepared, **Then** the readiness decision is `restored-ci-signal`.
2. **Given** any group remains red, **When** close-out is prepared, **Then** the readiness decision is `classified-follow-up-required` and each group has an owner/follow-up.
3. **Given** a small CI/lane contract fix was applied, **When** final validation runs, **Then** the directly affected lane/report/artifact guard passes and unrelated failures remain classified rather than hidden.
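The readiness scenarios above reduce to a small decision rule. The sketch below is illustrative only: `readiness_decision` is a hypothetical helper, and treating "no classifiable run could be produced" as the trigger for `blocked-by-environment` is an assumption, since the spec names that value without pinning its exact condition.

```shell
# Hypothetical helper. Arguments:
#   $1  number of failure groups still red after classification
#   $2  1 when the environment prevented any classifiable run (assumed trigger), else 0
readiness_decision() {
  red_groups="$1"
  env_blocked="${2:-0}"
  if [ "$env_blocked" -eq 1 ]; then
    echo "blocked-by-environment"
  elif [ "$red_groups" -gt 0 ]; then
    echo "classified-follow-up-required"
  else
    echo "restored-ci-signal"
  fi
}
```

For example, `readiness_decision 0` yields `restored-ci-signal` and `readiness_decision 3` yields `classified-follow-up-required`. The rule only selects the label; every red group still needs its owner and follow-up row.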
### Edge Cases
- The raw full suite times out or produces output too large to classify directly.
- A lane passes tests but fails report or artifact publication.
- A lane fails only because budget/trend baselines drifted, not because tests failed.
- Browser lane failures expose stale screenshots or environment-specific browser state.
- A failure appears to touch Spec `293` or `294` seams but would require reopening retired legacy behavior.
- A failure disappears on rerun, suggesting flaky or environment-only behavior.
- A small lane manifest fix changes which tests run in a lane, which could accidentally widen CI cost.
## Requirements *(mandatory)*
**Constitution alignment (required):** This spec introduces no Microsoft Graph calls, no write/change behavior, no long-running application work, and no new `OperationRun`. It must preserve workspace/tenant isolation expectations while classifying test failures. Any failure suggesting isolation, RBAC, or audit regressions must be classified as product/runtime debt and not hidden as a CI wrapper issue.
**Constitution alignment (PROP-001 / ABSTR-001 / PERSIST-001 / STATE-001 / BLOAT-001):** The only structural addition is one spec-local failure-classification vocabulary and artifact. It solves the current CI readiness problem after two stabilization specs; no runtime persistence, CI framework, test engine, or new lane abstraction is introduced.
**Constitution alignment (TEST-GOV-001):** Spec `295` must explicitly classify the proving purpose of every lane run, preserve the existing lane family boundaries, keep expensive fixture/context setup opt-in, and end with one review outcome: `keep`, `split`, `document-in-feature`, `follow-up-spec`, or `reject-or-split`.
### Functional Requirements
- **FR-295-001**: The implementation MUST run the raw full suite once when feasible using `cd apps/platform && ./vendor/bin/sail artisan test --compact`.
- **FR-295-002**: If the raw full suite is too slow, noisy, or environment-blocked to classify reliably, the implementation MUST run the explicit fallback lane split: `fast-feedback`, `confidence`, `heavy-governance`, and `browser`.
- **FR-295-003**: Every failing group MUST be recorded in `failure-classification.md` with exactly one pinned category, one pinned seam, observed command, candidate owner, fix-in-295 decision, follow-up decision, and status.
- **FR-295-004**: Lane wrapper, report, artifact, budget, and failure-class problems MAY be fixed in `295` only when the failure is clearly isolated to `scripts/platform-test-lane`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, `TestLaneManifest`, `TestLaneReport`, `TestLaneBudget`, or their guard tests.
- **FR-295-005**: Product/runtime failures MUST NOT be repaired under `295` unless they are also a small, proven CI/lane contract defect; otherwise they must be assigned to a follow-up spec or classified as unrelated existing debt.
- **FR-295-006**: Any failure related to Specs `293` or `294` MUST be classified without rewriting those completed specs or restoring legacy behavior.
- **FR-295-007**: The implementation MUST NOT restore TenantPanelProvider, `/admin/t/...`, tenant-scoped provider fallback routes, or other retired cutover behavior.
- **FR-295-008**: The implementation MUST validate existing lane failure classes: `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, and `infrastructure-failure`.
- **FR-295-009**: The implementation MUST produce a final CI readiness decision in `failure-classification.md`: `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`.
- **FR-295-010**: Any new or changed tests MUST be limited to CI/lane contract proof and MUST use Pest.
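FR-295-003 asks for a fixed set of fields per failing group, which lends itself to a completeness check. The sketch below assumes a markdown table row with eight cells (group, category, seam, observed command, candidate owner, fix-in-295 decision, follow-up decision, status). That layout and the `row_is_complete` helper are hypothetical and are not taken from `failure-classification.md`.

```shell
# Hypothetical row layout (an assumption, not the real artifact format):
# | group | category | seam | command | owner | fix-in-295 | follow-up | status |
row_is_complete() {
  printf '%s\n' "$1" | awk -F'|' '
    {
      # "| a | b |" splits into an empty first and last field, so 8 cells give NF = 10.
      if (NF != 10) exit 1
      for (i = 2; i < NF; i++) {
        cell = $i
        gsub(/^[ \t]+|[ \t]+$/, "", cell)   # trim surrounding whitespace
        if (cell == "") exit 1              # every cell must carry a value
      }
    }'
}
```

The check verifies only that each cell is present and non-empty. It does not validate that the category or seam is a pinned value, and it would miscount a cell that itself contains a `|` character.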
### Non-Functional Requirements
- **NFR-295-001**: No new runtime persistence, queue, model, service abstraction, provider registry, Filament resource, or browser family is introduced.
- **NFR-295-002**: Test lane classification must follow actual proving purpose, not file location.
- **NFR-295-003**: Existing lane budget and trend baselines must not be relaxed silently.
- **NFR-295-004**: Classification output must be concise enough for future implementers to route work without re-running the entire suite first.
- **NFR-295-005**: The final package must preserve Filament v5 / Livewire v4 compatibility and must not change panel provider registration.
## Key Entities *(include if feature involves data)*
- **Failure Group**: one failing test file, failing assertion cluster, wrapper error, artifact error, budget breach, or environment failure sharing one cause and one owner.
- **CI Lane Signal**: the pass/fail/report/artifact/budget outcome for one lane in `TestLaneManifest`.
- **Classification Decision**: the spec-local row assigning one category, seam, owner, fix-in-295 decision, and follow-up path.
- **Readiness Decision**: the final status of the full suite and lane baseline after classification.
## Success Criteria *(mandatory)*
- **SC-295-001**: `failure-classification.md` exists and contains the pinned category and seam definitions.
- **SC-295-002**: Raw full suite output or fallback lane split output is represented by classified groups with no unclassified red group remaining.
- **SC-295-003**: Existing lane wrappers and report/artifact contracts either pass or have a classified failure class and fix/follow-up decision.
- **SC-295-004**: No implementation step restores TenantPanelProvider, `/admin/t/...`, or retired tenant-scoped fallback behavior.
- **SC-295-005**: The final readiness decision is explicit and actionable: `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`.
- **SC-295-006**: If a product/runtime failure remains, the classification identifies a separate follow-up owner instead of treating the full suite as green.
## Assumptions
- Specs `293` and `294` have completed the targeted stabilization work described by the user and are context only.
- The repo's existing Gitea-compatible lane system remains the preferred CI shape.
- Local implementation will use Sail-first commands unless a non-Docker fallback is explicitly needed.
- Full-suite execution may be expensive; lane split is an allowed fallback only when the raw full suite is not classifiable.
## Risks
- Full-suite output may be too large or slow to classify directly.
- Environment-specific Sail/browser failures may obscure real suite status.
- A tempting product fix may be small locally but still outside this CI-baseline scope.
- Budget/trend drift may be real but not appropriate to fix by silently raising thresholds.
- Multiple failing groups may share a fixture root cause and need careful grouping to avoid duplicate follow-up specs.
## Open Questions
- None blocking preparation. During implementation, actual failing groups determine whether follow-up specs are needed.


@ -0,0 +1,173 @@
# Tasks: Full Suite Failure Classification & CI Lane Baseline
**Input**: Design documents from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/`
**Prerequisites**: `spec.md`, `plan.md`, `research.md`, `data-model.md`, `quickstart.md`, `failure-classification.md`, `checklists/requirements.md`
**Review Artifact**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md`
**Failure Inventory**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
## Review Metadata
- **Review outcome class**: `acceptable-special-case`
- **Workflow outcome**: `keep`
- **Test-governance outcome**: `keep`
- **Stop / split triggers**: broad product/runtime repair, new CI framework, new permanent lane, new browser family, new heavy-governance family, runtime application changes, Filament resource/page changes, route restoration, TenantPanelProvider restoration, `/admin/t/...` restoration, provider/verification runtime expansion, historical-spec rewrite, or budget relaxation without classification evidence
## Pinned Failure-Classification Categories
- `ci-signal-restored`
- `ci-wrapper-or-manifest-regression`
- `artifact-publication-regression`
- `budget-or-trend-baseline-drift`
- `product-runtime-or-test-regression`
- `browser-lane-regression`
- `flaky-or-environment`
- `follow-up-spec-required`
- `resolved-or-not-needed`
## Pinned CI / Suite Seams
- `raw-full-suite`
- `fast-feedback-lane`
- `confidence-lane`
- `heavy-governance-lane`
- `browser-lane`
- `profiling-or-junit-support`
- `lane-reporting`
- `artifact-publication`
- `budget-trend-baseline`
- `legacy-cutover-regression-guard`
- `provider-verification-regression-guard`
## Test Governance Checklist
- [x] Lane assignment is named and is the narrowest sufficient proof for each observed failure group.
- [x] New or changed tests stay in the smallest honest family, and any heavy-governance or browser addition is explicit.
- [x] Shared helpers, factories, seeds, fixtures, and context defaults stay cheap by default; any widening is isolated or documented.
- [x] Planned validation commands cover the change without pulling in unrelated lane cost beyond classification.
- [x] The declared surface test profile or `standard-native-filament` relief is explicit.
- [x] Any material budget, baseline, trend, or escalation note is recorded in `failure-classification.md`.
## Phase 1: Setup and Scope Lock
**Purpose**: Confirm Spec `295` remains a classification and CI lane baseline package before any suite command runs.
- [x] T001 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/plan.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/research.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/quickstart.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md` before changing runtime or tests
- [x] T002 [P] Confirm current branch, working tree, and baseline diff using `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch` and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat`, then record any pre-existing changes in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T003 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/293-post-cutover-suite-stabilization/failure-classification.md` and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/294-provider-verification-runtime-semantics/failure-classification.md` as context only, confirming no task edits are made to Specs `293` or `294`
- [x] T004 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-report`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` to confirm current lane entry points and failure classes
- [x] T005 Confirm the explicit forbidden scope in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`: no TenantPanelProvider restoration, no `/admin/t/...` restoration, no broad product repair, and no historical-spec rewrite
---
## Phase 2: User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1)
**Goal**: Establish the raw full-suite readiness signal or an explicit fallback split before any fix work begins.
**Independent Test**: The raw full-suite result or fallback lane split is represented by classified rows in `failure-classification.md`, with no red group left unclassified.
- [x] T006 [US1] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)` and record pass/fail counts, failing files, and any timeout/noisy-output reason in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T007 [US1] If T006 cannot produce a classifiable result, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance`, and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser`, then record each lane outcome in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T008 [US1] Group every failing test file, assertion cluster, wrapper error, report error, artifact error, budget breach, or environment issue into one row in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with exactly one pinned category and one pinned seam
- [x] T009 [US1] Classify any legacy route or panel-related group under `legacy-cutover-regression-guard` without restoring `/admin/t/...`, TenantPanelProvider, tenant-scoped provider fallback routes, or historical compatibility behavior
- [x] T010 [US1] Classify any provider/verification group under `provider-verification-regression-guard` without rewriting Spec `294`; only mark it in-scope if the failure is a direct CI/lane contract defect rather than provider runtime behavior
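
The fallback split in T007 can be sketched as one loop that keeps going after a red lane, so every lane ends up with an outcome to record. This is a minimal sketch, not the project's wrapper: `LANE_RUNNER`, `run_fallback_lanes`, the log directory argument, and the `exit=` output format are hypothetical names introduced for illustration, and the `PATH` export from the task commands is omitted.

```shell
#!/usr/bin/env bash
# LANE_RUNNER is a hypothetical override for dry runs.
# It defaults to the lane wrapper named in T007.
LANE_RUNNER="${LANE_RUNNER:-./scripts/platform-test-lane}"

# Runs each fallback lane once, writes one log per lane into the
# directory given as $1, and prints "<lane> exit=<code>" per lane.
run_fallback_lanes() {
  local log_dir="$1" lane status
  for lane in fast-feedback confidence heavy-governance browser; do
    # Do not stop on a red lane: each lane needs its own outcome row.
    if "$LANE_RUNNER" "$lane" >"$log_dir/$lane.log" 2>&1; then
      status=0
    else
      status=$?
    fi
    printf '%s exit=%s\n' "$lane" "$status"
  done
}
```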
---
## Phase 3: User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1)
**Goal**: Prove existing CI wrappers, reports, artifacts, budgets, and failure classes are interpretable after the suite run.
**Independent Test**: Every lane either passes with complete report/artifact output or fails with the correct primary failure class.
- [x] T011 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T012 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T013 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T014 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T015 [P] [US2] If machine-readable confidence output is needed for follow-up ownership, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit` and classify the JUnit support result in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (not run separately because the lane wrappers produced the needed JUnit artifacts)
- [x] T016 [P] [US2] If an artifact publication defect is suspected, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-fast-feedback-artifacts` or the matching command for the affected lane, and classify any missing required artifacts under `artifact-publication-regression`
- [x] T017 [US2] Verify existing failure classes from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` classify lane outcomes as `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, or `infrastructure-failure`, and record mismatches in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
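
The mismatch check in T017 could be sketched as a small membership test over the five failure classes listed above. This is an illustrative sketch only: the function names and the `lane class` input format (one failing lane per line) are assumptions, not something `TestLaneReport.php` is known to emit.

```shell
#!/usr/bin/env bash
# Succeeds when $1 is one of the five failure classes named in T017.
is_known_failure_class() {
  case "$1" in
    test-failure|wrapper-failure|budget-breach|artifact-publication-failure|infrastructure-failure)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Reads hypothetical "<lane> <class>" pairs on stdin and prints one
# line for every pair whose class is not in the known set.
report_class_mismatches() {
  local lane class
  while read -r lane class; do
    is_known_failure_class "$class" || printf 'mismatch: %s -> %s\n' "$lane" "$class"
  done
}
```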
---
## Phase 4: User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1)
**Goal**: Keep Spec `295` limited to CI signal readiness by splitting product/runtime failures into explicit follow-up ownership.
**Independent Test**: Every non-CI failure group has a follow-up recommendation, owner, or environment disposition.
- [x] T018 [US3] For each row classified as `product-runtime-or-test-regression`, decide whether it is a follow-up spec, lane-specific debt, or active feature blocker, then record the decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T019 [US3] For each row classified as `browser-lane-regression`, record the affected browser file under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Browser/`, whether the failure is smoke/environment/product behavior, and the follow-up path in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
- [x] T020 [US3] For each row classified as `flaky-or-environment`, rerun the narrowest affected command once when safe and record the rerun evidence or environment blocker in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (no flaky/environment row was identified)
- [x] T021 [US3] Confirm no failure group is being fixed under `295` solely because it is small or nearby; it must be directly tied to CI wrapper, manifest, report, artifact, or budget/trend contract drift
---
## Phase 5: User Story 4 - Apply Only Small CI-Signal Fixes (Priority: P2)
**Goal**: Correct narrow CI/lane contract defects only when classification proves they block a trustworthy CI signal.
**Independent Test**: The directly affected lane/report/artifact guard passes after the minimal fix, and unrelated red groups remain classified.
- [x] T022 [US4] If a `ci-wrapper-or-manifest-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` (not applicable: no `ci-wrapper-or-manifest-regression` row was proven)
- [x] T023 [US4] If an `artifact-publication-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected artifact guard test
- [x] T024 [US4] If a `budget-or-trend-baseline-drift` row is proven, update only the documented budget/trend baseline owner in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneBudget.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test when the classification row explains why the evidence supports the change (not applicable: no budget/trend baseline rewrite was justified)
- [x] T025 [US4] Add or adjust Pest coverage only when a CI/lane contract defect was fixed, keeping tests under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` or `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Unit/Support/` and avoiding new browser/heavy families by default
- [x] T026 [US4] Re-run the narrowest affected lane/report/artifact command after any CI/lane fix and update `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with the final status
---
## Phase 6: Final Readiness Decision and Validation
**Purpose**: Publish one final CI readiness decision and prove no unclassified failure or hidden scope expansion remains.
- [x] T027 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` and confirm every row has category, seam, observed command, candidate owner, fix-in-295 decision, follow-up, and status
- [x] T028 Set the final readiness decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` to exactly one of `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`
- [x] T029 Re-run the final narrowest proof command set for the decision: raw full suite if classifiable, otherwise the exact affected lane/report commands from Phases 2 through 5
- [x] T030 Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)` if any PHP files or scripts with embedded PHP changed
- [x] T031 Confirm Filament remains v5 on Livewire v4, provider registration remains in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/bootstrap/providers.php`, no globally searchable resource changed, no destructive action changed, no asset registration changed, no `/admin/t/...` route or TenantPanelProvider behavior was restored, and no Specs `293` or `294` artifact was rewritten
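
The condition on T030 can be sketched as a gate over `git status --short` output. This is a rough sketch under stated assumptions: `has_php_changes` and `format_if_php_changed` are hypothetical helper names, and the check only detects paths ending in `.php`, so PHP embedded in extensionless scripts still needs a manual decision.

```shell
#!/usr/bin/env bash
# Reads `git status --short` lines on stdin.
# Succeeds if any listed path ends in .php.
has_php_changes() {
  grep -q '\.php$'
}

# Runs the Pint command from T030 only when the working tree
# contains changed PHP files. Defined here but not invoked.
format_if_php_changed() {
  if git status --short | has_php_changes; then
    (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)
  fi
}
```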
## Dependencies & Execution Order
- **Phase 1** must complete before any suite command.
- **Phase 2** must classify the raw suite or fallback lane output before any fix work begins.
- **Phase 3** depends on Phase 2 because lane reports must be interpreted against observed lane outcomes.
- **Phase 4** depends on the failure group inventory from Phases 2 and 3.
- **Phase 5** depends on classified CI/lane contract defects; skip it entirely if no in-scope CI/lane defect is proven.
- **Phase 6** depends on all classification and any bounded fixes.
## Parallel Execution Examples
- T003 and T004 can run in parallel after T001.
- T011 through T014 can run independently after their corresponding lane outputs exist.
- T018 through T020 can be split by failure group once T008 has created the grouped inventory.
- T022 through T024 must not run until a corresponding classification row proves the in-scope defect.
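
Running T011 through T014 side by side might look like the sketch below, which starts each report command in the background and waits for all of them. It assumes the report wrapper invocations do not share mutable state, which is not confirmed here. `REPORT_RUNNER`, `run_reports_in_parallel`, the log directory argument, and the `failed=` summary line are hypothetical.

```shell
#!/usr/bin/env bash
# REPORT_RUNNER is a hypothetical override for dry runs.
# It defaults to the report wrapper used in T011 through T014.
REPORT_RUNNER="${REPORT_RUNNER:-./scripts/platform-test-report}"

# Starts one report job per lane in the background, logs each job
# into the directory given as $1, then prints how many jobs failed.
run_reports_in_parallel() {
  local log_dir="$1" lane pid pids="" failed=0
  for lane in fast-feedback confidence heavy-governance browser; do
    "$REPORT_RUNNER" "$lane" >"$log_dir/$lane.log" 2>&1 &
    pids="$pids $!"
  done
  # wait returns each job's own exit status, so failures are counted
  # per lane instead of stopping at the first red report.
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  printf 'failed=%s\n' "$failed"
}
```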
## Implementation Strategy
### Suggested MVP Scope
MVP = Phases 1 through 4. That is enough to answer whether the suite is green or which follow-up owns each red group. Phase 5 runs only when classification proves a narrow CI/lane contract defect.
### Incremental Delivery
1. Lock scope and read prior stabilization artifacts.
2. Run raw full suite or fallback lane split.
3. Classify every red group.
4. Validate lane/report/artifact signal.
5. Split product/runtime failures to follow-up ownership.
6. Apply only proven CI/lane fixes.
7. Publish the final readiness decision.
## Explicit Follow-Ups / Out of Scope
- Product/runtime failing-test repair outside CI/lane contract defects
- Browser UI repair
- Package Execution
- Guided Operations
- Microsoft Starter Pack
- Virtual Consultant
- Tenant cutover rework
- Provider/verification runtime expansion beyond Spec `294`
- New permanent CI lane or framework
- Historical-spec cleanup