From eb85b76eede1e85d3236dee6c7dcbf21642d13f1 Mon Sep 17 00:00:00 2001 From: Ahmed Darrazi Date: Mon, 11 May 2026 10:43:16 +0200 Subject: [PATCH 1/2] Added Skill for Codex --- .codex/skills/browsertest/SKILL.md | 295 +++++++++ .codex/skills/giteaflow/SKILL.md | 8 + .codex/skills/pest-testing/SKILL.md | 167 +++++ .../skills/platform-feature-finish/SKILL.md | 625 ++++++++++++++++++ .../spec-kit-implementation-loop/SKILL.md | 447 +++++++++++++ .../skills/spec-kit-next-best-prep/SKILL.md | 612 +++++++++++++++++ .../skills/tailwindcss-development/SKILL.md | 129 ++++ 7 files changed, 2283 insertions(+) create mode 100644 .codex/skills/browsertest/SKILL.md create mode 100644 .codex/skills/giteaflow/SKILL.md create mode 100644 .codex/skills/pest-testing/SKILL.md create mode 100644 .codex/skills/platform-feature-finish/SKILL.md create mode 100644 .codex/skills/spec-kit-implementation-loop/SKILL.md create mode 100644 .codex/skills/spec-kit-next-best-prep/SKILL.md create mode 100644 .codex/skills/tailwindcss-development/SKILL.md diff --git a/.codex/skills/browsertest/SKILL.md b/.codex/skills/browsertest/SKILL.md new file mode 100644 index 00000000..efcd7578 --- /dev/null +++ b/.codex/skills/browsertest/SKILL.md @@ -0,0 +1,295 @@ +--- +name: browsertest +description: Run a complete browser smoke test in the integrated browser for the current feature, including the happy path, key regressions, context verification, and a reliable summary of results. +license: MIT +metadata: + author: GitHub Copilot +--- + +# Browser Smoke Test + +## What This Skill Does + +Use this skill to validate the current feature end-to-end in the integrated browser. + +This is a focused smoke test, not a full exploratory test session. 
The goal is to prove that the primary operator flow: + +- loads in the correct auth, workspace, and tenant context +- exposes the expected controls and decision points +- completes the main happy path without blocking issues +- lands in the expected end state or canonical drilldown +- does not show obvious regressions such as broken navigation, missing data, or conflicting actions + +The skill should produce a concrete pass or fail result with actionable evidence. + +## When To Apply + +Activate this skill when: + +- the user asks to smoke test the current feature in the browser +- a new Filament page, dashboard signal, report, wizard, or detail flow was just added +- a UI regression fix needs confirmation in a real browser context +- the primary question is whether the feature works from an operator perspective +- you need a quick integration-level check without writing a full browser test suite first + +## What Success Looks Like + +A successful smoke test confirms all of the following: + +- the target route opens successfully +- the visible context is correct +- the main flow is usable +- the expected result appears after interaction +- the route or drilldown destination is correct +- the surface does not obviously violate its intended interaction model + +If the test cannot be completed, the output must clearly state whether the blocker is: + +- authentication +- missing data or fixture state +- routing +- UI interaction failure +- server error +- an unclear expected behavior contract + +Do not guess. If the route or state is blocked, report the blocker explicitly. + +## Preconditions + +Before running the browser smoke test, make sure you know: + +- the canonical route or entry point for the feature +- the primary operator action or happy path +- the expected success state +- whether the feature depends on a specific tenant, workspace, or seeded record + +When available, use the feature spec, quickstart, tasks, or current browser page as the source of truth. 
+ +## Standard Workflow + +### 1. Define the smoke-test scope + +Identify: + +- the route to open +- the primary action to perform +- the expected end state +- one or two critical regressions that must not break + +The smoke test should stay narrow. Prefer one complete happy path plus one critical boundary over broad exploratory clicking. + +### 2. Establish the browser state + +- Reuse the current browser page if it already matches the target feature. +- Otherwise open the canonical route. +- Confirm the current auth and scope context before interacting. + +For this repo, that usually means checking whether the page is on: + +- `/admin/...` for workspace-context surfaces +- `/admin/t/{tenant}/...` for tenant-context surfaces + +### 3. Inspect before acting + +- Use `read_page` before interacting so you understand the live controls, refs, headings, and route context. +- Prefer `read_page` over screenshots for actual interaction planning. +- Use screenshots only for visual evidence or when the user asks for them. + +### 4. Execute the primary happy path + +Run the smallest meaningful flow that proves the feature works. + +Typical steps include: + +- open the page +- verify heading or key summary text +- click the primary CTA or row +- fill the minimum required form fields +- confirm modal or dialog text when relevant +- submit or navigate +- verify the expected destination or changed state + +After each meaningful action, re-read the page so the next step is based on current DOM state. + +### 5. Validate the outcome + +Check the exact result that matters for the feature. + +Examples: + +- a new row appears +- a status changes +- a success message appears +- a report filter changes the result set +- a row click lands on the canonical detail page +- a dashboard signal links to the correct report page + +### 6. 
Check for obvious regressions + +Even in a smoke test, verify a few core non-negotiables: + +- the page is not blank or half-rendered +- the main action is present and usable +- the visible context is correct +- the drilldown destination is canonical +- no obviously duplicated primary actions exist +- no stuck modal, spinner, or blocked interaction remains onscreen + +### 7. Capture evidence and summarize clearly + +Your result should state: + +- route tested +- context used +- steps executed +- pass or fail +- exact blocker or discrepancy if failed + +Include a screenshot only when it adds value. + +## Tool Usage Guidance + +Use the browser tools in this order by default: + +1. `read_page` +2. `click_element` +3. `type_in_page` +4. `handle_dialog` when needed +5. `navigate_page` or `open_browser_page` only when route changes are required +6. `run_playwright_code` only if the normal browser tools are insufficient +7. `screenshot_page` for evidence, not for primary navigation logic + +## Repo-Specific Guidance For TenantPilot + +### Workspace surfaces + +For `/admin` pages and similar workspace-context surfaces: + +- verify the page is reachable without forcing tenant-route assumptions +- confirm any summary signal or CTA lands on the canonical destination +- verify calm-state versus attention-state behavior when the feature defines both + +### Tenant surfaces + +For `/admin/t/{tenant}/...` pages: + +- verify the tenant context is explicit and correct +- verify drilldowns stay in the intended tenant scope +- treat cross-tenant leakage or silent scope changes as failures + +### Filament list or report surfaces + +For Filament tables, reports, or registry-style pages: + +- verify the heading and table shell render +- verify fixed filters or summary controls exist when the spec requires them +- verify row click or the primary inspect affordance behaves as designed +- verify empty-state messaging is specific rather than generic when the feature defines custom behavior + 
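The workspace-versus-tenant route distinction used in this guidance can be checked mechanically before interacting with a page. The following is a minimal sketch, assuming only the two route shapes named above; the function name `route_context` and the sample paths are hypothetical and not part of the repository.

```bash
# Hypothetical helper: classify a route path by the context it implies.
# Assumes tenant surfaces live under /admin/t/{tenant}/... and every
# other /admin path is a workspace surface.
route_context() {
  case "$1" in
    /admin/t/?*)     echo "tenant" ;;     # tenant-context surface
    /admin|/admin/*) echo "workspace" ;;  # workspace-context surface
    *)               echo "unknown" ;;    # outside the admin panel
  esac
}

route_context "/admin/findings/hygiene"   # prints "workspace"
route_context "/admin/t/acme/findings"    # prints "tenant"
```

Comparing the classification before and after a drilldown is one way to notice a silent scope change; it does not replace reading the page to confirm which tenant is actually shown.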
+### Filament detail pages + +For detail or view surfaces: + +- verify the canonical record loads +- verify expected sections or summary content are present +- verify critical actions or drillbacks are usable + +## Result Format + +Use a compact result format like this: + +```text +Browser smoke result: PASS +Route: /admin/findings/hygiene +Context: workspace member with visible hygiene issues +Steps: opened report -> verified filters -> clicked finding row -> landed on canonical finding detail +Verified: report rendered, primary interaction worked, drilldown route was correct +``` + +If the test fails: + +```text +Browser smoke result: FAIL +Route: /admin/findings/hygiene +Context: authenticated workspace member +Failed step: clicking the summary CTA +Expected: navigate to /admin/findings/hygiene +Actual: remained on /admin with no route change +Blocker: CTA appears rendered but is not interactive +``` + +## Examples + +### Example 1: Smoke test a new report page + +Use this when the feature adds a new read-only report. + +Steps: + +- open the canonical report route +- verify the page heading and main controls +- confirm the table or defined empty state is visible +- click one row or primary inspect affordance +- verify navigation lands on the canonical detail route + +Pass criteria: + +- report loads +- intended controls exist +- primary inspect path works + +### Example 2: Smoke test a dashboard signal + +Use this when the feature adds a summary signal on `/admin`. + +Steps: + +- open `/admin` +- find the signal +- verify the visible count or summary text +- click the CTA +- confirm navigation lands on the canonical downstream surface + +Pass criteria: + +- signal is visible in the correct state +- CTA text is present +- CTA opens the correct route + +### Example 3: Smoke test a tenant detail follow-up + +Use this when a workspace-level surface should drill into a tenant-level detail page. 
+ +Steps: + +- open the workspace-level surface +- trigger the drilldown +- verify the target route includes the correct tenant and record +- confirm the target page actually loads the expected detail content + +Pass criteria: + +- drilldown route is canonical +- tenant context is correct +- destination content matches the selected record + +## Common Pitfalls + +- Clicking before reading the page state and refs +- Treating a blocked auth session as a feature failure +- Confusing workspace-context routes with tenant-context routes +- Reporting visual impressions without validating the actual interaction result +- Forgetting to re-read the page after a modal opens or a route changes +- Claiming success without verifying the final destination or changed state + +## Non-Goals + +This skill does not replace: + +- full exploratory QA +- formal Pest browser coverage +- accessibility review +- visual regression approval +- backend correctness tests + +It is a fast, real-browser confidence pass for the current feature. \ No newline at end of file diff --git a/.codex/skills/giteaflow/SKILL.md b/.codex/skills/giteaflow/SKILL.md new file mode 100644 index 00000000..319b4a3a --- /dev/null +++ b/.codex/skills/giteaflow/SKILL.md @@ -0,0 +1,8 @@ +--- +name: giteaflow +description: Commit all changes, push to the remote, and open a pull request against platform-dev using the Gitea MCP. +--- + + + +Commit all changes, push to the remote, and create a pull request against platform-dev with the Gitea MCP. \ No newline at end of file diff --git a/.codex/skills/pest-testing/SKILL.md b/.codex/skills/pest-testing/SKILL.md new file mode 100644 index 00000000..56198610 --- /dev/null +++ b/.codex/skills/pest-testing/SKILL.md @@ -0,0 +1,167 @@ +--- +name: pest-testing +description: "Tests applications using the Pest 4 PHP framework. 
Activates when writing tests, creating unit or feature tests, adding assertions, testing Livewire components, browser testing, debugging test failures, working with datasets or mocking; or when the user mentions test, spec, TDD, expects, assertion, coverage, or needs to verify functionality works." +license: MIT +metadata: + author: laravel +--- + +# Pest Testing 4 + +## When to Apply + +Activate this skill when: + +- Creating new tests (unit, feature, or browser) +- Modifying existing tests +- Debugging test failures +- Working with browser testing or smoke testing +- Writing architecture tests or visual regression tests + +## Documentation + +Use `search-docs` for detailed Pest 4 patterns and documentation. + +## Basic Usage + +### Creating Tests + +All tests must be written using Pest. Use `php artisan make:test --pest {name}`. + +### Test Organization + +- Unit/Feature tests: `tests/Feature` and `tests/Unit` directories. +- Browser tests: `tests/Browser/` directory. +- Do NOT remove tests without approval - these are core application code. + +### Basic Test Structure + + +```php +it('is true', function () { + expect(true)->toBeTrue(); +}); +``` + +### Running Tests + +- Run minimal tests with filter before finalizing: `php artisan test --compact --filter=testName`. +- Run all tests: `php artisan test --compact`. +- Run file: `php artisan test --compact tests/Feature/ExampleTest.php`. 
+ +## Assertions + +Use specific assertions (`assertSuccessful()`, `assertNotFound()`) instead of `assertStatus()`: + + +```php +it('returns all', function () { + $this->postJson('/api/docs', [])->assertSuccessful(); +}); +``` + +| Use | Instead of | +|-----|------------| +| `assertSuccessful()` | `assertStatus(200)` | +| `assertNotFound()` | `assertStatus(404)` | +| `assertForbidden()` | `assertStatus(403)` | + +## Mocking + +Import mock function before use: `use function Pest\Laravel\mock;` + +## Datasets + +Use datasets for repetitive tests (validation rules, etc.): + + +```php +it('has emails', function (string $email) { + expect($email)->not->toBeEmpty(); +})->with([ + 'james' => 'james@laravel.com', + 'taylor' => 'taylor@laravel.com', +]); +``` + +## Pest 4 Features + +| Feature | Purpose | +|---------|---------| +| Browser Testing | Full integration tests in real browsers | +| Smoke Testing | Validate multiple pages quickly | +| Visual Regression | Compare screenshots for visual changes | +| Test Sharding | Parallel CI runs | +| Architecture Testing | Enforce code conventions | + +### Browser Test Example + +Browser tests run in real browsers for full integration testing: + +- Browser tests live in `tests/Browser/`. +- Use Laravel features like `Event::fake()`, `assertAuthenticated()`, and model factories. +- Use `RefreshDatabase` for clean state per test. +- Interact with page: click, type, scroll, select, submit, drag-and-drop, touch gestures. +- Test on multiple browsers (Chrome, Firefox, Safari) if requested. +- Test on different devices/viewports (iPhone 14 Pro, tablets) if requested. +- Switch color schemes (light/dark mode) when appropriate. +- Take screenshots or pause tests for debugging. 
+ + +```php +it('may reset the password', function () { + Notification::fake(); + + $this->actingAs(User::factory()->create()); + + $page = visit('/sign-in'); + + $page->assertSee('Sign In') + ->assertNoJavaScriptErrors() + ->click('Forgot Password?') + ->fill('email', 'nuno@laravel.com') + ->click('Send Reset Link') + ->assertSee('We have emailed your password reset link!'); + + Notification::assertSent(ResetPassword::class); +}); +``` + +### Smoke Testing + +Quickly validate multiple pages have no JavaScript errors: + + +```php +$pages = visit(['/', '/about', '/contact']); + +$pages->assertNoJavaScriptErrors()->assertNoConsoleLogs(); +``` + +### Visual Regression Testing + +Capture and compare screenshots to detect visual changes. + +### Test Sharding + +Split tests across parallel processes for faster CI runs. + +### Architecture Testing + +Pest 4 includes architecture testing (from Pest 3): + + +```php +arch('controllers') + ->expect('App\Http\Controllers') + ->toExtendNothing() + ->toHaveSuffix('Controller'); +``` + +## Common Pitfalls + +- Not importing `use function Pest\Laravel\mock;` before using mock +- Using `assertStatus(200)` instead of `assertSuccessful()` +- Forgetting datasets for repetitive validation tests +- Deleting tests without approval +- Forgetting `assertNoJavaScriptErrors()` in browser tests \ No newline at end of file diff --git a/.codex/skills/platform-feature-finish/SKILL.md b/.codex/skills/platform-feature-finish/SKILL.md new file mode 100644 index 00000000..a204a3b4 --- /dev/null +++ b/.codex/skills/platform-feature-finish/SKILL.md @@ -0,0 +1,625 @@ + + +--- +name: platform-feature-finish +description: Commit, push, create a Gitea PR from a TenantPilot platform feature branch into platform-dev, and optionally refresh the platform-dev to dev integration PR by rebase. +--- + +# Skill: platform-feature-finish + +## Purpose + +Automate the TenantPilot platform feature completion workflow. 
+ +Trigger this skill when the user says something like: + +- "alles committen pushen und PR gegen platform-dev" +- "feature fertig, bitte PR erstellen" +- "platform feature abschließen" +- "commit push PR mit Gitea MCP" +- "mach PR gegen platform-dev" +- "finish platform feature" +- "platform-dev nach dev vorbereiten" +- "platform-dev PR aktualisieren" +- "out-of-date mit dev beheben" +- "integration PR refresh" +- "platform-dev auf dev rebasen" + +This skill handles: + +1. Validate current Git branch +2. Commit all feature changes +3. Push current feature branch +4. Create a Gitea pull request into `platform-dev` +5. Refresh the `platform-dev` → `dev` integration PR when explicitly requested +6. Report the PR link and next integration step + +--- + +## Branch Model + +TenantPilot uses area branches: + +```text +dev = shared integration branch +platform-dev = platform/application area integration branch +website-dev = website/marketing area integration branch +``` + +For platform features: + +```text +platform-dev + ↓ +feature branch + ↓ +PR back to platform-dev + ↓ +platform-dev → dev integration PR +``` + +Rules: + +- Platform feature branches MUST target `platform-dev`. +- Do NOT target `dev` directly unless the user explicitly asks. +- Do NOT use `website-dev` for platform features. +- `platform-dev` is the default PR base for TenantPilot platform/application work. +- `dev` is the shared integration branch. + +### Solo Workflow Rule + +The user works alone on `platform-dev`. + +For refreshing the integration branch before opening or updating the PR `platform-dev` → `dev`, prefer rebase over merge. + +Do not repeatedly merge `origin/dev` into `platform-dev` for refresh. + +Avoid creating repeated merge commits like: + +```text +Merge remote-tracking branch 'origin/dev' into platform-dev +``` + +Use `--force-with-lease`, never plain `--force`. + +If rebase conflicts occur, stop and report the conflict files. + +--- + +## Preconditions + +Before committing: + +1. 
Confirm repository root. +2. Confirm current branch is not protected. + +Protected branches: + +```text +dev +platform-dev +website-dev +main +master +``` + +If the current branch is protected, STOP and report: + +```text +Ich bin auf einem geschützten Branch. Bitte zuerst einen Feature-Branch auschecken. +``` + +3. Confirm remote exists. +4. Confirm there are local changes, untracked files, or unpushed commits. +5. Confirm there are no unresolved conflicts. + +Do not ask for confirmation unless: + +- The current branch is protected. +- Git status indicates unresolved conflicts. +- There is no remote configured. +- `.env` or other local secret/config files would be committed. +- Commit fails. +- Push fails. +- Gitea MCP PR creation fails. + +--- + +## Required Tools + +Use terminal for Git operations. + +Use Gitea MCP for pull request creation. + +Preferred Gitea MCP operation: + +```text +create_pull_request +``` + +Required PR parameters: + +```json +{ + "owner": "ahmido", + "repo": "TenantAtlas", + "head": "", + "base": "platform-dev", + "title": "", + "body": "" +} +``` + +--- + +## Workflow + +### Step 1 — Inspect Git state + +Run: + +```bash +git rev-parse --show-toplevel +git rev-parse --abbrev-ref HEAD +git status --porcelain +git status -sb +git config --get remote.origin.url +git log --oneline --max-count=5 +``` + +Determine: + +- repository root +- current branch +- changed files +- untracked files +- remote URL +- whether there are unpushed commits +- whether unresolved conflicts exist + +If the current branch is protected, stop. + +If unresolved conflicts exist, stop. + +If no remote exists, stop. + +--- + +### Step 2 — Check for local environment files + +Before `git add -A`, check whether local environment/config files are modified or untracked: + +```bash +git status --porcelain | grep -E '(^.. \.env$|^.. apps/platform/\.env$|^.. 
.*\.env$)' || true +``` + +If `.env` or another environment file is included, STOP and report: + +```text +Achtung: Eine .env-/Environment-Datei ist geändert oder untracked. Ich committe das nicht automatisch. Bitte prüfen oder aus dem Commit entfernen. +``` + +Do not commit secrets or local runtime configuration. + +--- + +### Step 3 — Build commit message + +Use the current branch name. + +If branch starts with a spec number, for example: + +```text +256-external-support-desk-handoff +``` + +Generate: + +```text +feat(specs/256): external support desk handoff +``` + +If branch does not contain a spec number, generate: + +```text +feat(platform): complete +``` + +Rules: + +- Use lowercase subject. +- Use feature-style subject. +- Do not include `WIP`. +- Do not include `final`. +- Do not include overly generic `updates`. + +Examples: + +```text +feat(specs/256): external support desk handoff +feat(specs/252): platform localization v1 +feat(platform): improve tenant review workspace +``` + +--- + +### Step 4 — Commit all changes + +Run: + +```bash +git add -A +git commit -m "" +``` + +If there are no local changes to commit, continue only if the branch has unpushed commits. + +Check unpushed commits with: + +```bash +git status -sb +git log --oneline origin/..HEAD +``` + +If there are no local changes and no unpushed commits, report: + +```text +Es gibt keine lokalen Änderungen und keine unpushed commits. Ich erstelle keinen leeren Commit. +``` + +Then continue to PR creation only if the branch already exists remotely or can be pushed. + +--- + +### Step 5 — Push branch + +Run: + +```bash +git push --set-upstream origin +``` + +If the upstream already exists, this is acceptable. + +Never force-push unless the user explicitly requests it. 
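The commit-message rule from Step 3 can be sketched as a small shell function. This is an illustrative sketch, not the skill's actual implementation: the name `build_commit_message` is hypothetical, and it covers only branches that start with a spec number, returning a non-zero status for every other branch so the caller can apply the platform fallback.

```bash
# Hypothetical helper: derive a commit subject from a spec-numbered branch.
# Prints "feat(specs/<number>): <subject>" and returns 0, or returns 1
# when the branch name does not start with "<digits>-".
build_commit_message() {
  branch="$1"
  number="${branch%%-*}"   # text before the first hyphen
  slug="${branch#*-}"      # text after the first hyphen

  # Reject names whose prefix is empty or contains a non-digit.
  case "$number" in
    ''|*[!0-9]*) return 1 ;;
  esac

  # Reject names with no hyphen or nothing after it.
  if [ -z "$slug" ] || [ "$slug" = "$branch" ]; then
    return 1
  fi

  # Hyphens become spaces and the subject is lowercased, per the rules above.
  subject="$(printf '%s' "$slug" | tr '-' ' ' | tr '[:upper:]' '[:lower:]')"
  printf 'feat(specs/%s): %s\n' "$number" "$subject"
}

build_commit_message "256-external-support-desk-handoff"
# prints: feat(specs/256): external support desk handoff
```

Returning a status instead of printing a fallback keeps the spec-numbered and non-numbered cases separate, so the caller decides how to word the platform-level subject.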
+ +--- + +### Step 6 — Create PR into platform-dev via Gitea MCP + +Use Gitea MCP to create a pull request: + +```json +{ + "owner": "ahmido", + "repo": "TenantAtlas", + "head": "", + "base": "platform-dev", + "title": "", + "body": "Implements platform feature branch ``.\n\nTarget branch: `platform-dev`.\n\nFollow-up integration path after merge:\n\n`platform-dev` → `dev`." +} +``` + +If a PR already exists for the same branch and base, do not create a duplicate. + +Report the existing PR if available. + +--- + +## Optional Step — Check platform-dev to dev PR + +After creating the feature PR, check whether an open integration PR exists: + +```text +platform-dev → dev +``` + +If a Gitea MCP list/search pull request function is available, use it. + +If one exists, report: + +```text +Der Folge-PR `platform-dev` → `dev` existiert bereits: +``` + +If none exists, report: + +```text +Nach dem Merge dieses Feature-PRs sollte der Integrations-PR `platform-dev` → `dev` erstellt oder aktualisiert werden. +``` + +Do not automatically create the `platform-dev` → `dev` PR unless the user explicitly asks for it. + +Reason: before the feature PR is merged into `platform-dev`, the integration PR may not include the new feature yet. + +--- + +## Integration Refresh Mode + +Use this mode when the user explicitly says one of the following: + +- "platform-dev nach dev vorbereiten" +- "platform-dev PR aktualisieren" +- "out-of-date mit dev beheben" +- "integration PR refresh" +- "platform-dev auf dev rebasen" +- "auch platform-dev nach dev" +- "und danach platform-dev nach dev" +- "full integration" +- "kompletten platform-dev zu dev PR machen" +- "folge-pr erstellen" + +This mode prepares or updates the integration PR: + +```text +platform-dev → dev +``` + +Because the user works alone on `platform-dev`, prefer rebase over merge. + +### Integration Refresh Preconditions + +Before running this mode: + +1. Ensure the working tree is clean. +2. Ensure there are no unresolved conflicts. 
+3. Fetch remote branches. +4. Ensure `origin/platform-dev` exists. +5. Ensure `origin/dev` exists. + +If the working tree is dirty, STOP and report: + +```text +Der Working Tree ist nicht sauber. Bitte erst Änderungen committen, stashen oder verwerfen, bevor `platform-dev` auf `dev` rebased wird. +``` + +If unresolved conflicts exist, STOP and report the conflict files. + +### Integration Refresh Workflow + +Run: + +```bash +git fetch origin +git checkout platform-dev +git reset --hard origin/platform-dev +git rebase origin/dev +git push --force-with-lease origin platform-dev +``` + +After pushing, verify that `origin/dev` is now an ancestor of `origin/platform-dev`: + +```bash +git fetch origin +git merge-base --is-ancestor origin/dev origin/platform-dev \ + && echo "OK: platform-dev contains dev" \ + || echo "OUTDATED: platform-dev does not contain dev" +``` + +If the verification prints `OUTDATED`, stop and report it. Do not claim the PR is up-to-date. + +Rules: + +- Do not merge `origin/dev` into `platform-dev` for this refresh. +- Do not create repeated merge commits from `origin/dev` into `platform-dev`. +- Use `git push --force-with-lease origin platform-dev` after a successful rebase. +- Never use plain `git push --force`. +- If `git rebase origin/dev` reports conflicts, stop immediately. +- Do not continue to PR creation while a rebase is unresolved. +- Do not auto-merge the PR. +- Do not claim Gitea will remove the out-of-date warning unless the ancestor check succeeds. + +If rebase conflicts occur, report: + +```text +Rebase-Konflikte erkannt. Ich habe gestoppt. + +Konfliktdateien: + + +Bitte Konflikte lösen, dann `git rebase --continue` ausführen oder den Rebase mit `git rebase --abort` abbrechen. 
+``` + +### Create or Report Integration PR + +After the rebase, push, and ancestor verification succeeded, use Gitea MCP to create or report the integration PR: + +```json +{ + "owner": "ahmido", + "repo": "TenantAtlas", + "head": "platform-dev", + "base": "dev", + "title": "chore(platform): merge platform-dev into dev", + "body": "Integrates latest TenantPilot platform changes from `platform-dev` into `dev`.\n\nThis PR was created by agent on user request; do not merge automatically." +} +``` + +If an open PR already exists for `platform-dev` → `dev`, do not create a duplicate. Report the existing PR. + +### Integration Refresh Reporting Format + +Final response for this mode must include: + +```text +Fertig. + +- Branch aktualisiert: platform-dev +- Refresh-Methode: rebase auf origin/dev +- Ancestor-Check: origin/dev ist Ancestor von origin/platform-dev +- Push: --force-with-lease origin/platform-dev +- Integration PR: +- Base: dev +- Hinweis: PR wurde nicht automatisch gemerged. +``` + +Do not claim tests passed unless they were actually executed. + +--- + +## Reporting Format + +Final response must be concise and include: + +```text +Fertig. + +- Branch: +- Commit: +- Push: origin/ +- PR: +- Base: platform-dev +- Nächster Schritt: Nach Merge `platform-dev` → `dev` PR aktualisieren/erstellen +``` + +If tests were not run, say: + +```text +Tests wurden in diesem Skill nicht automatisch ausgeführt. +``` + +Do not claim tests passed unless the tool actually ran them. + +--- + +## Safety Rules + +- Never commit directly to `dev`, `platform-dev`, `website-dev`, `main`, or `master`. +- Never force-push unless explicitly requested. +- For Integration Refresh Mode only, `git push --force-with-lease origin platform-dev` is allowed because the user works alone on `platform-dev`; never use plain `--force`. +- Never auto-merge PRs unless explicitly requested. +- Never target `dev` directly for platform feature PRs unless explicitly requested. 
+- Never delete branches unless explicitly requested. +- Never claim tests were run unless the tool actually ran them. +- Never commit `.env`, secrets, local tokens, local mock-server configuration, or temporary runtime-only changes. +- If migrations were created, mention that the target environment needs migration execution after deployment. +- If unresolved conflicts exist, stop. + +--- + +## Useful Commands + +Inspect: + +```bash +git rev-parse --show-toplevel +git rev-parse --abbrev-ref HEAD +git status --porcelain +git status -sb +git config --get remote.origin.url +``` + +Detect protected branch: + +```bash +branch="$(git rev-parse --abbrev-ref HEAD)" +case "$branch" in + dev|platform-dev|website-dev|main|master) + echo "PROTECTED_BRANCH:$branch" + exit 2 + ;; +esac +``` + +Detect unresolved conflicts: + +```bash +git diff --name-only --diff-filter=U +``` + +Detect `.env` changes: + +```bash +git status --porcelain | grep -E '(^.. \.env$|^.. apps/platform/\.env$|^.. .*\.env$)' || true +``` + +Commit: + +```bash +git add -A +git commit -m "" +``` + +Push: + +```bash +git push --set-upstream origin "$(git rev-parse --abbrev-ref HEAD)" +``` + +Latest commit: + +```bash +git rev-parse --short HEAD +git log -1 --pretty=%s +``` + +Integration refresh: + +```bash +git fetch origin +git checkout platform-dev +git reset --hard origin/platform-dev +git rebase origin/dev +git push --force-with-lease origin platform-dev +``` + +Verify integration refresh: + +```bash +git fetch origin +git merge-base --is-ancestor origin/dev origin/platform-dev \ + && echo "OK: platform-dev contains dev" \ + || echo "OUTDATED: platform-dev does not contain dev" +``` + +Check rebase conflicts: + +```bash +git diff --name-only --diff-filter=U +``` + +--- + +## Example User Request + +User: + +```text +alles committen pushen und pr gegen platform-dev mit gitea mcp +``` + +Assistant should: + +1. Check current branch. +2. Stop if branch is protected. +3. 
Stop if `.env` or secrets would be committed. +4. Commit all changes. +5. Push current branch. +6. Create PR into `platform-dev` with Gitea MCP. +7. Report result. + +Do not ask unnecessary follow-up questions. + +--- + +## Example Integration Refresh Request + +User: + +```text +platform-dev PR aktualisieren +``` + +Assistant should: + +1. Ensure the working tree is clean. +2. Fetch origin. +3. Checkout `platform-dev`. +4. Reset local `platform-dev` to `origin/platform-dev`. +5. Rebase `platform-dev` onto `origin/dev`. +6. Push with `--force-with-lease`. +7. Verify `origin/dev` is an ancestor of `origin/platform-dev`. +8. Create or report the PR `platform-dev` → `dev`. +9. Report result. + +Do not merge the PR automatically. \ No newline at end of file diff --git a/.codex/skills/spec-kit-implementation-loop/SKILL.md b/.codex/skills/spec-kit-implementation-loop/SKILL.md new file mode 100644 index 00000000..bcf2ca30 --- /dev/null +++ b/.codex/skills/spec-kit-implementation-loop/SKILL.md @@ -0,0 +1,447 @@ +--- +name: spec-kit-implementation-loop +description: Implement an existing TenantPilot/TenantAtlas Spec Kit feature, run tests, browser smoke checks where applicable, post-implementation analysis, fix all confirmed in-scope findings when safe and bounded, and repeat until no in-scope findings remain or a stop condition is reached. +--- + +# Skill: Spec Kit Implementation Loop + +## Purpose + +Use this skill to implement an already prepared TenantPilot/TenantAtlas Spec Kit feature and verify it with a bounded implementation loop. + +This skill assumes `spec.md`, `plan.md`, and `tasks.md` already exist and have passed preparation readiness or have been explicitly accepted by the user. 
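The assumption that the three Spec Kit files exist can be verified before any implementation work starts. The sketch below is one possible pre-check, not part of the skill itself: the function name `check_spec_files` is hypothetical, and treating an empty file as missing is an added assumption.

```bash
# Hypothetical pre-check: confirm spec.md, plan.md, and tasks.md exist
# and are non-empty in the given spec directory.
# Returns 0 when all three are present, 1 otherwise.
check_spec_files() {
  spec_dir="$1"
  missing=0
  for file in spec.md plan.md tasks.md; do
    # -s is true only for files that exist and have a size greater than zero.
    if [ ! -s "$spec_dir/$file" ]; then
      echo "MISSING: $spec_dir/$file"
      missing=1
    fi
  done
  return "$missing"
}
```

A passing check says only that the files are present. Whether their content is ready for implementation still has to be judged by reading them.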
+ +The intended workflow is: + +```text +active or explicitly named spec +→ inspect repo truth, constitution, spec, plan, tasks, and relevant code/tests +→ evaluate implementation gates +→ implement strictly task-by-task +→ run relevant tests/checks +→ run browser smoke test when UI/user-facing flows are affected +→ run strict post-implementation analysis +→ fix confirmed in-scope findings +→ repeat test + browser smoke + analysis + fix loop until clean or bounded stop condition is reached +→ final implementation report +``` + +## When to Use + +Use this skill when the user asks to: + +- implement an active or explicitly named Spec Kit feature +- run Spec Kit implement +- analyze after implementation +- fix implementation findings +- repeat implementation verification until no confirmed in-scope findings remain +- run tests and browser smoke checks after implementation + +Typical user prompts: + +```text +Implementiere die aktive Spec und analysiere danach, ob alles passt. +``` + +```text +Implementiere specs/243-product-usage-adoption-telemetry streng nach tasks.md. +``` + +```text +Mach Spec Kit implement und danach analyse. Behebe alle Abweichungen und wiederhole bis sauber. +``` + +```text +Implementiere die vorbereitete Spec. Danach Tests, Browser Smoke Test falls UI betroffen ist, Analyse und Fix-Loop bis keine In-Scope Findings mehr offen sind. +``` + +## Hard Rules + +- Work strictly repo-based. +- Implement only the active or explicitly named Spec Kit feature. +- Do not choose a new candidate. +- Do not create a new spec. +- Do not expand scope beyond `spec.md`, `plan.md`, and `tasks.md`. +- Do not silently add roadmap features, adjacent UX rewrites, speculative architecture, or unrelated refactors. +- Follow the repository constitution and existing Spec Kit conventions. +- Preserve TenantPilot/TenantAtlas terminology. +- Prefer small, reviewable patches over broad rewrites. +- Treat repository truth as authoritative over assumptions. 
+- If repository truth conflicts with implementation scope, stop and report the conflict unless there is an obvious minimal correction inside active spec scope. +- Fix only confirmed findings from tests, static checks, browser smoke checks, or post-implementation analysis. +- Fix all confirmed in-scope findings, regardless of severity, when they are safe and bounded. +- Do not leave Medium/Low findings open silently. If they are not fixed, document exactly why. +- Never hide failing tests, weaken assertions, delete meaningful coverage, or mark tasks complete without implementation evidence. +- Do not run destructive commands. +- Do not force checkout, reset, stash, rebase, merge, or delete branches. +- Do not perform database-destructive actions unless the repository test workflow explicitly requires isolated test database resets. +- Do not continue analysis/fix loops indefinitely. +- Do not move from implementation to final status unless the Test Gate, Browser Smoke Test Gate where applicable, and Post-Implementation Analysis Gate have been evaluated. +- Do not claim merge-readiness unless the Merge Readiness Gate passes. + +## Required Inputs + +The user should provide at least one of: + +- explicit spec directory such as `specs/-/` +- instruction to use the current active Spec Kit feature +- instruction to implement the prepared/current spec + +If the active spec cannot be determined safely, inspect the repository Spec Kit context first. If it is still ambiguous, stop and ask for the specific spec directory. + +## Required Repository Checks + +Always check: + +1. active Spec Kit context / current branch +2. git status +3. `.specify/memory/constitution.md` +4. the active spec directory +5. `spec.md` +6. `plan.md` +7. `tasks.md` +8. relevant templates or conventions under `.specify/templates/` +9. nearby existing specs with related terminology or scope +10. application code surfaces referenced by the active spec +11. 
existing tests related to the changed behavior + +## Git and Branch Safety + +Before making implementation changes: + +1. Check the current branch. +2. Check whether the working tree is clean. +3. If there are unrelated uncommitted changes, stop and report them. Do not continue. +4. If the working tree only contains user-intended changes for this operation, continue cautiously. +5. Do not force checkout, reset, stash, rebase, merge, or delete branches. +6. Do not overwrite unrelated work. + +## Quality Gates + +### Gate 1: Spec Readiness Gate + +Required before implementation starts. + +Pass criteria: + +- `spec.md`, `plan.md`, and `tasks.md` exist. +- The spec has clear problem statement, user value, functional requirements, out-of-scope boundaries, acceptance criteria, assumptions, and risks. +- The plan identifies likely affected repo surfaces and does not contradict repository architecture. +- The tasks are small, ordered, verifiable, and include test/validation tasks. +- RBAC, workspace/tenant isolation, auditability, OperationRun semantics, evidence/result-truth, and UX requirements are addressed where relevant. +- No open question blocks safe implementation. +- The scope is small enough for a bounded implementation loop. + +Fail behavior: + +- Stop before implementation. +- Report readiness gaps. +- Do not compensate for an unclear spec by inventing implementation scope. + +### Gate 2: Implementation Scope Gate + +Required before changing application code. + +Pass criteria: + +- The active spec directory is known. +- The implementation target is traceable to specific tasks in `tasks.md`. +- The affected files/surfaces are consistent with `plan.md` or clearly justified by repository truth. +- No required change would introduce unrelated product behavior. +- No required change conflicts with constitution, existing architecture, RBAC/isolation boundaries, or source-of-truth semantics. 
+ +Fail behavior: + +- Stop before code changes and report the conflict or ambiguity. +- Suggest a minimal spec/plan/tasks correction if the issue is in the artifacts rather than the codebase. + +### Gate 3: Test Gate + +Required after implementation and after each fix iteration. + +Pass criteria: + +- Targeted tests for changed behavior pass. +- Relevant existing tests pass or failures are proven unrelated and documented. +- Static analysis, linting, formatting, or type checks used by the repository pass when applicable. +- Security/governance-relevant changes have backend, policy, or domain coverage; UI-only verification is not enough. +- Regression coverage exists for each fixed Blocker or High finding where practical. + +Fail behavior: + +- Fix in-scope failures before post-implementation analysis. +- If failures are unrelated or pre-existing, document evidence and continue only if they do not invalidate the active spec. +- Do not weaken tests to pass the gate. + +### Gate 4: Browser Smoke Test Gate + +Required before claiming implementation is ready for manual review/merge when the change affects Filament UI, Livewire interactions, navigation, forms, tables, actions, modals, dashboards, operation drilldowns, tenant/workspace context, or any user-facing flow. + +Not required for backend-only, domain-only, enum-only, contract-only, or test-only changes unless those changes alter a user-facing flow. + +Pass criteria: + +- The relevant page or flow loads in a real browser or the repository's browser-testing harness. +- The primary action introduced or changed by the spec can be executed successfully. +- Expected UI states, labels, badges, actions, empty states, tables, forms, modals, and navigation are visible where relevant. +- Workspace/tenant context is preserved across the tested flow where relevant. +- RBAC/capability-dependent visibility behaves as expected where practical to verify. +- Livewire interactions complete without visible runtime errors. 
+- No relevant browser console errors occur. +- No failed network requests occur for the tested flow, except known unrelated development noise that is explicitly documented. +- OperationRun, audit, evidence, result, or support-diagnostic drilldowns work where relevant. +- The smoke-tested path is documented in the final response. + +Fail behavior: + +- Fix in-scope browser, UX, Livewire, navigation, or runtime failures before claiming merge-readiness. +- If a browser issue is unrelated existing debt, document evidence and residual risk. +- Do not treat a passing browser smoke test as a substitute for backend, policy, domain, security, feature, or integration tests. +- Do not expand the smoke test into a full E2E suite unless the user explicitly asks for that. + +### Gate 5: Post-Implementation Analysis Gate + +Required after implementation and after each fix iteration. + +Pass criteria: + +- The implementation has been checked against `spec.md`, `plan.md`, `tasks.md`, and constitution. +- All completed tasks have implementation evidence. +- No confirmed in-scope findings remain. +- Medium/Low findings are fixed when they are inside active spec scope, clearly bounded, and safe. +- Medium/Low findings that remain open are explicitly documented with one of these reasons: + - out of scope + - requires separate spec + - risky refactor + - existing unrelated debt + - not reproducible + - blocked by unclear product/architecture decision +- No scope expansion was introduced during fixes. + +Fail behavior: + +- Fix confirmed in-scope findings, regardless of severity, when the fix is safe and bounded. +- Stop instead of fixing when remediation would expand scope, contradict repo architecture, introduce risky refactors, or repeat the same failed fix twice. + +### Gate 6: Merge Readiness Gate + +Required before claiming the implementation is ready for manual review/merge. + +Pass criteria: + +- Spec Readiness Gate passed. +- Implementation Scope Gate passed. 
+- Test Gate passed. +- Browser Smoke Test Gate passed when applicable, or was explicitly marked not applicable with a reason. +- Post-Implementation Analysis Gate passed. +- `tasks.md` reflects actual completion status. +- No confirmed in-scope findings remain. +- All remaining findings are documented as out-of-scope, follow-up candidates, unrelated existing debt, or explicit residual risks. +- Final response includes changed files, tests/checks run, browser smoke result, iterations performed, residual risks, and follow-up candidates. + +Fail behavior: + +- Do not claim merge-readiness. +- Report the failed gate, remaining risks, and the smallest recommended next action. + +## Implementation Loop + +Execute the loop in bounded phases: + +1. Evaluate the Spec Readiness Gate. +2. Evaluate the Implementation Scope Gate before changing application code. +3. Implement the active Spec Kit feature scope task-by-task. +4. Run targeted tests and relevant static/dynamic checks. +5. Evaluate the Test Gate. +6. Run a Browser Smoke Test when the change affects UI/user-facing flows. +7. Evaluate the Browser Smoke Test Gate as passed, failed, or not applicable with a reason. +8. Run strict post-implementation analysis against spec, plan, tasks, constitution, changed code, changed tests, browser smoke results where applicable, and relevant existing patterns. +9. Evaluate the Post-Implementation Analysis Gate. +10. Identify confirmed findings by severity: Blocker, High, Medium, Low. +11. Fix all confirmed in-scope findings regardless of severity when safe and bounded. +12. Do not fix findings that require scope expansion, risky unrelated refactors, or architectural/product decisions outside the active spec; document them as follow-up/residual risks with reasons. +13. Re-run relevant tests and browser smoke checks where applicable after fixes. +14. Repeat test + browser smoke + analysis + fix loop until no confirmed in-scope findings remain or a stop condition is reached. +15. 
Evaluate the Merge Readiness Gate. +16. Report final implementation status, changed files, tests, browser smoke result, residual risks, failed/passed gates, and manual review prompt. + +## Stop Conditions + +Stop the implementation loop when any of the following is true: + +- No confirmed in-scope findings remain. +- The same finding appears twice after attempted fixes. +- A required fix conflicts with the spec, plan, constitution, or repository architecture. +- A required fix would expand scope beyond the active spec. +- A required fix would require a risky unrelated refactor. +- A required fix depends on an unresolved product or architecture decision. +- Tests reveal an unrelated pre-existing failure that cannot be safely fixed inside the active spec. +- Browser smoke testing reveals an unrelated pre-existing UI/runtime failure that cannot be safely fixed inside the active spec. +- Three analysis/fix iterations have already been completed. +- The repository state is ambiguous enough that continuing would risk damaging architecture or data semantics. + +When stopping before full cleanliness, report exactly why the loop stopped and what remains. + +## Post-Implementation Analysis Prompt + +Use this prompt internally after implementation and after each fix iteration: + +```markdown +Du bist ein Senior Staff Software Engineer, Software Architect und Enterprise SaaS Reviewer. + +Analysiere die Implementierung der aktiven Spec streng repo-basiert. + +Ziel: +Prüfe, ob die Umsetzung vollständig, konsistent, getestet und constitution-konform ist. + +Prüfe gegen: +- spec.md +- plan.md +- tasks.md +- .specify/memory/constitution.md +- geänderte Anwendungscodes +- geänderte Tests +- Browser-Smoke-Test-Ergebnis, falls UI/user-facing Flows betroffen sind +- bestehende Repository-Patterns + +Wichtig: +- Keine Spekulation ohne Repo-Beleg. +- Keine Scope-Erweiterung. +- Keine neuen Produktideen als Pflicht-Fixes. +- Findings nach Blocker, High, Medium, Low gruppieren. 
+- Für jedes Finding konkrete Datei-/Code-Belege nennen. +- Für jedes Finding eine minimale Remediation nennen. +- Separat ausweisen, welche Findings innerhalb der aktiven Spec behoben werden müssen. +- Medium/Low Findings innerhalb der aktiven Spec ebenfalls zur Behebung markieren, wenn sie sicher und bounded sind. +- Bei UI-/Filament-/Livewire-Änderungen prüfen, ob ein Browser Smoke Test durchgeführt wurde und ob der getestete Operator-Flow wirklich funktioniert. +- Findings, die nicht behoben werden sollen, nur als Follow-up/Residual Risk ausweisen, wenn sie out of scope, risky refactor, unrelated existing debt, not reproducible oder durch eine offene Produkt-/Architekturentscheidung blockiert sind. +- Wenn keine bestätigten In-Scope Findings verbleiben, klare Implementierungsfreigabe geben. +``` + +## Task Completion Rules + +- Keep `tasks.md` aligned with actual implementation status. +- Check off tasks only after the implementation and test evidence exists. +- If a task is obsolete because repository truth proves a different path, update the task note with the reason instead of silently deleting it. +- If a task cannot be completed inside scope, leave it unchecked and report why. + +## Testing Rules + +- Add or update tests for all changed business behavior. +- Include RBAC and workspace/tenant isolation tests where relevant. +- Include OperationRun, audit, evidence, or result-truth tests where relevant. +- Prefer regression tests for every fixed Blocker or High finding. +- Add regression tests for Medium/Low findings when the behavior is important and testable without excessive churn. +- Do not weaken tests to pass the suite. +- Do not treat a green UI path as sufficient without backend or policy coverage when the behavior is security- or governance-relevant. 
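The Task Completion Rules above depend on knowing which tasks are still open. As a rough sketch, open tasks could be listed before the Merge Readiness Gate is evaluated. This assumes `tasks.md` uses Markdown checkboxes (`- [ ]` and `- [x]`), which the skill does not prescribe; `open_tasks` is a hypothetical helper, not part of Spec Kit.

```python
import re

# Assumption: tasks are Markdown checkbox items such as "- [ ] T001 Do something".
TASK_LINE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


def open_tasks(tasks_markdown: str) -> list[str]:
    """Return the text of every unchecked checkbox item (hypothetical helper)."""
    open_items = []
    for line in tasks_markdown.splitlines():
        match = TASK_LINE.match(line)
        # A single space between the brackets marks an unchecked task.
        if match and match.group(1) == " ":
            open_items.append(match.group(2).strip())
    return open_items
```

An empty result only means that every checkbox is ticked. It does not show that implementation and test evidence exists for each task, so that check still has to be made separately.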
+ +## Browser Smoke Test Rules + +Apply these rules when the active spec changes Filament UI, Livewire interactions, navigation, forms, tables, actions, modals, dashboards, operation drilldowns, tenant/workspace context, or any user-facing flow. + +The browser smoke test should be narrow and focused. It is not a full E2E suite unless explicitly requested. + +Minimum smoke path: + +1. Open the relevant page or entry point. +2. Confirm the expected workspace/tenant context where relevant. +3. Confirm the changed or newly introduced UI element is visible. +4. Execute the primary action or interaction changed by the spec. +5. Confirm the expected result state, notification, redirect, table update, modal state, operation link, or drilldown. +6. Check for relevant console errors. +7. Check for failed network requests related to the tested flow. +8. Document the tested path in the final response. + +For TenantPilot/TenantAtlas, pay special attention to: + +- Filament actions and header actions +- Livewire polling, modals, validation, and actions +- workspace/tenant context preservation +- RBAC/capability-dependent action visibility +- OperationRun links and drilldown continuity +- audit/evidence/result/support-diagnostic drilldowns where relevant +- empty states, badges, labels, and decision guidance where relevant + +Browser smoke testing is required for UI/user-facing changes and optional for backend-only changes. + +Do not treat browser smoke success as proof that backend security, policies, domain logic, auditability, or workspace/tenant isolation are correct. Those still require automated tests or repo-based verification. + +## Failure Handling + +If an implementation step, test phase, browser smoke phase, or post-implementation analysis fails: + +1. Stop at the relevant gate or stop condition. +2. Report the failing command or phase. +3. Summarize the error. +4. Do not attempt unrelated implementation as a workaround. +5. Suggest the smallest safe next action. 
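The bounded loop and two of its stop conditions (a finding that reappears after an attempted fix, and the cap of three analysis/fix iterations) could be sketched roughly as follows. This is an illustration only: `run_bounded_loop`, `analyze`, and `fix` are hypothetical names, and findings are modeled as plain strings rather than whatever structure the agent actually uses.

```python
from typing import Callable, Iterable

MAX_ITERATIONS = 3  # mirrors the "three analysis/fix iterations" stop condition


def run_bounded_loop(
    analyze: Callable[[], Iterable[str]],
    fix: Callable[[str], None],
) -> str:
    """Repeat analysis and fixes until clean or a stop condition is reached.

    `analyze` returns the confirmed in-scope findings; `fix` attempts one fix.
    Returns a short reason describing why the loop stopped.
    """
    attempted: set[str] = set()
    iterations = 0
    while True:
        findings = list(analyze())
        if not findings:
            return "clean"
        # Stop when a finding shows up again after a fix was already attempted.
        repeated = [f for f in findings if f in attempted]
        if repeated:
            return f"repeated finding: {repeated[0]}"
        if iterations >= MAX_ITERATIONS:
            return "iteration limit reached"
        for finding in findings:
            fix(finding)
            attempted.add(finding)
        iterations += 1
```

The other stop conditions (scope expansion, architecture conflicts, unresolved product decisions) need judgment and are deliberately left out of the sketch.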
+ +If the branch or working tree state is unsafe: + +1. Stop before implementation changes. +2. Report the current branch and relevant uncommitted files. +3. Ask the user to commit, stash, or move to a clean worktree. + +## Final Response Requirements + +Respond with: + +1. Active spec directory +2. Summary of implemented changes +3. Tests/checks run and their results +4. Browser smoke test result, tested path, or not-applicable reason +5. Quality gates passed/failed and number of analysis/fix iterations performed +6. Remaining in-scope findings, if any +7. Residual risks and follow-up candidates, if relevant +8. Files changed +9. Explicit statement whether the Merge Readiness Gate passed and whether the implementation is ready for manual review/merge + +Keep the final response concise, but include enough detail for the user to continue immediately. + +## Manual Review Prompt + +Provide a ready-to-copy prompt like this, adapted to the active spec number and slug: + +```markdown +Du bist ein Senior Staff Software Architect und Enterprise SaaS Reviewer. + +Führe eine finale manuelle Review der implementierten Spec `-` streng repo-basiert durch. + +Ziel: +Prüfe, ob die Implementierung nach dem Agenten-Loop wirklich merge-ready ist. + +Wichtig: +- Keine Implementierung. +- Keine Codeänderungen. +- Keine Scope-Erweiterung. +- Prüfe gegen spec.md, plan.md, tasks.md und constitution.md. +- Prüfe die geänderten Dateien, Tests, Browser-Smoke-Test-Ergebnis, RBAC, Workspace-/Tenant-Isolation, Auditability, UX und OperationRun-Semantik, soweit relevant. +- Benenne nur konkrete Findings mit Repo-Beleg. +- Gib am Ende eine klare Entscheidung: Merge-ready, merge-ready with notes, oder not merge-ready. +``` + +## Example Invocation + +User: + +```text +Nutze den Skill spec-kit-implementation-loop. +Implementiere die aktive Spec. 
+Danach Tests ausführen, Browser Smoke Test falls UI/user-facing betroffen ist, Post-Implementation Analyse durchführen und alle bestätigten In-Scope Findings unabhängig von Severity beheben, wenn safe und bounded. +Wiederhole test + browser smoke + analysis + fix bis keine In-Scope Findings mehr offen sind oder eine Stop Condition greift. +``` + +Expected behavior: + +1. Inspect active Spec Kit context, constitution, spec, plan, tasks, relevant code, and relevant tests. +2. Evaluate the Spec Readiness Gate and Implementation Scope Gate. +3. Implement only the active spec scope. +4. Run targeted tests and relevant checks. +5. Evaluate the Test Gate. +6. Run and evaluate Browser Smoke Test when UI/user-facing flows are affected. +7. Run post-implementation analysis. +8. Fix all confirmed in-scope findings regardless of severity when safe and bounded. +9. Repeat test + browser smoke + analysis + fix loop up to the stop conditions. +10. Evaluate the Merge Readiness Gate. +11. Report final status, changed files, tests, browser smoke result, residual risks, gates, and manual review prompt. +``` \ No newline at end of file diff --git a/.codex/skills/spec-kit-next-best-prep/SKILL.md b/.codex/skills/spec-kit-next-best-prep/SKILL.md new file mode 100644 index 00000000..376d38e7 --- /dev/null +++ b/.codex/skills/spec-kit-next-best-prep/SKILL.md @@ -0,0 +1,612 @@ +--- +name: spec-kit-next-best-prep +description: Select the next suitable TenantPilot/TenantAtlas spec candidate from roadmap/spec-candidates, run the repository's Spec Kit preparation flow, create or update spec.md/plan.md/tasks.md, run preparation analysis, fix preparation-artifact issues only, and stop before application implementation. +--- + +# Skill: Spec Kit Next-Best Preparation + +## Purpose + +Use this skill to prepare the next implementation-ready Spec Kit package for TenantPilot/TenantAtlas without implementing application code. + +This skill supports preparation only: + +1. 
Select or scope the next suitable feature from roadmap/spec-candidates. +2. Run the repository's real Spec Kit preparation workflow where available. +3. Create or update `spec.md`, `plan.md`, and `tasks.md`. +4. Run preparation `analyze` when supported. +5. Fix preparation-artifact issues only. +6. Evaluate preparation quality gates. +7. Stop before application implementation. + +The intended workflow is: + +```text +roadmap / spec-candidates / feature idea +→ inspect repo truth, constitution, roadmap, spec candidates, existing specs, and relevant code +→ select the next suitable candidate or scope the provided idea +→ run Spec Kit specify/plan/tasks/analyze where available +→ create or update spec.md + plan.md + tasks.md +→ fix preparation-artifact issues only +→ evaluate Candidate Selection Gate and Spec Readiness Gate +→ final preparation report +→ explicit implementation step later +``` + +## When to Use + +Use this skill when the user asks to: + +- select the next best spec candidate from `docs/product/spec-candidates.md` and roadmap sources +- turn a feature idea, roadmap item, or candidate into `spec.md`, `plan.md`, and `tasks.md` +- prepare Spec Kit artifacts in one pass +- run specify/plan/tasks/analyze without implementation +- fix preparation analysis issues in Spec Kit artifacts only +- prepare a feature package for a later implementation skill + +Typical user prompts: + +```text +Nimm den nächsten sinnvollen Spec Candidate aus Roadmap/spec-candidates und mach spec, plan und tasks. +``` + +```text +Mach daraus spec, plan und tasks in einem Rutsch, aber noch nicht implementieren. +``` + +```text +Wähle aus roadmap.md und spec-candidates.md die nächste sinnvollste Spec und führe specify, plan, tasks und analyze aus. +``` + +```text +Behebe alle analyze-Issues in den Spec-Kit-Artefakten. Keine Application-Implementierung. +``` + +## Hard Rules + +- Work strictly repo-based. +- This is a preparation-only skill. +- Do not implement application code. 
+- Do not modify production code. +- Do not modify migrations, models, services, jobs, Filament resources, Livewire components, policies, commands, routes, views, tests, or runtime behavior. +- Use the repository's actual Spec Kit workflow, scripts, templates, branch naming rules, and generated paths when available. +- Do not manually invent spec numbers, branch names, or spec paths if Spec Kit provides a script or command for that. +- Do not bypass Spec Kit branch mechanics. +- Create or update only Spec Kit preparation artifacts unless repository conventions require additional documentation artifacts. +- Do not expand scope beyond the selected feature, `spec.md`, `plan.md`, and `tasks.md`. +- Do not silently add roadmap features, adjacent UX rewrites, speculative architecture, or unrelated refactors. +- Follow the repository constitution and existing Spec Kit conventions. +- Preserve TenantPilot/TenantAtlas terminology. +- Prefer small, reviewable, implementation-ready specs over broad rewrites. +- Treat repository truth as authoritative over assumptions. +- If repository truth conflicts with the user-provided draft or candidate wording, keep repository truth and document the deviation. +- Fix only confirmed preparation-artifact findings from Spec Kit preparation analysis. +- Do not leave preparation findings open silently. If they are not fixed, document exactly why. +- Do not run destructive commands. +- Do not force checkout, reset, stash, rebase, merge, or delete branches. +- Do not overwrite existing specs. +- Do not rewrite completed specs back into preparation state. +- Do not remove or normalize implementation history, close-out notes, validation results, completed task markers, smoke results, or post-implementation review language from completed specs. +- Treat completed-spec close-out and validation language as intentional repository history, not preparation drift. +- Do not move from preparation to an implementation step inside this skill. 
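One way to make the preparation-only rule checkable is to compare the changed paths against an allow-list before reporting. The sketch below assumes that preparation artifacts live under `specs/` and that the list of changed paths comes from something like `git diff --name-only`; the helper name and the default allow-list are illustrative assumptions, not repository conventions.

```python
from pathlib import PurePosixPath
from typing import Iterable, Sequence


def paths_outside_preparation_scope(
    changed_paths: Iterable[str],
    allowed_roots: Sequence[str] = ("specs",),  # assumption: adjust to the repository
) -> list[str]:
    """Return every changed path whose top-level directory is not allowed."""
    offenders = []
    for raw in changed_paths:
        parts = PurePosixPath(raw).parts
        # Compare the first path segment so "specs-old/" does not pass as "specs/".
        if not parts or parts[0] not in allowed_roots:
            offenders.append(raw)
    return offenders
```

A non-empty result would be a reason to stop and report before finishing, since it suggests that something other than preparation artifacts was touched.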
+ +## Required Inputs + +The user should provide at least one of: + +- feature title and short goal +- full spec candidate +- roadmap item +- rough problem statement +- UX or architecture improvement idea +- instruction to choose the next best candidate from roadmap/spec-candidates + +If the input is incomplete, proceed with the smallest reasonable interpretation and document assumptions. + +If no suitable candidate can be selected safely, stop and report why. + +## Required Repository Checks + +Always check: + +1. `.specify/memory/constitution.md` +2. `.specify/templates/` +3. `.specify/scripts/` +4. existing Spec Kit command usage or repository instructions, if present +5. current branch and git status +6. `specs/` +7. `docs/product/spec-candidates.md` +8. relevant roadmap documents under `docs/product/`, especially `roadmap.md` if present +9. nearby existing specs with related terminology or scope +10. application code only as needed to avoid wrong naming, wrong architecture, duplicate concepts, impossible tasks, duplicated specs, or already-completed candidates + +Do not edit application code. + +## Completed-Spec Guardrail + +Before selecting an existing spec package as a `next-best-prep` target, explicitly check whether the spec is already completed, implementation-closed, or validated. 
+ +A spec must be treated as completed if any of the following signals are present in `spec.md`, `plan.md`, `tasks.md`, `quickstart.md`, checklist artifacts, or related Spec Kit package files: + +- `Implementation Close-Out` +- `Implementation completed on` +- `Implementation Validation Results` +- `Implemented and validated` +- `Review Outcome` or `Implementation Review Outcome` +- passed validation, smoke, browser, or guardrail results +- completed task checklist markers for the implementation tasks +- post-implementation review or close-out language +- a status marker indicating implemented, completed, closed, or validated + +If a spec is completed: + +- exclude it from `next-best-prep` candidate selection +- do not patch, normalize, rewrite, or convert it back to preparation-only state +- do not remove close-out sections, validation results, completed task markers, smoke results, or post-implementation review language +- treat those artifacts as historical implementation evidence +- only use the completed spec as context for dependency or roadmap reasoning + +If all high-priority candidates are already specced, active, or completed, stop and report `no safe next prep target` instead of modifying existing completed specs. + +## Git and Branch Safety + +Before running any Spec Kit command: + +1. Check the current branch. +2. Check whether the working tree is clean. +3. If there are unrelated uncommitted changes, stop and report them. Do not continue. +4. If the working tree only contains user-intended planning edits for this operation, continue cautiously. +5. Let Spec Kit create or switch to the correct feature branch when that is how the repository workflow works. +6. Do not force checkout, reset, stash, rebase, merge, or delete branches. +7. Do not overwrite existing specs. + +If the repo requires an explicit branch creation script for `specify`, use that script rather than manually creating the branch. 
+ +## Quality Gates + +### Gate 1: Candidate Selection Gate + +Required before creating a new spec from roadmap/spec-candidates. + +Pass criteria: + +- The selected candidate exists in roadmap/spec-candidate material or is directly provided by the user. +- The selected candidate is not already covered by an existing active or completed spec. +- The selected target is not a completed spec package with implementation close-out, validation results, completed tasks, smoke results, or post-implementation review history. +- The selected candidate aligns with current roadmap priorities or explicitly documented product direction. +- The candidate can be scoped as a small, reviewable, implementation-ready slice. +- Major adjacent concerns are listed as follow-up candidates instead of being hidden inside the primary scope. + +Fail behavior: + +- If no candidate satisfies the gate, stop and report the top candidates plus the reason none is ready. +- If the only plausible targets are completed specs, stop and report `no safe next prep target`; do not modify those completed specs. +- Do not invent a new roadmap direction to force progress. + +### Gate 2: Spec Readiness Gate + +Required before reporting that the package is ready for implementation. + +Pass criteria: + +- `spec.md`, `plan.md`, and `tasks.md` exist. +- The spec has clear problem statement, user value, functional requirements, out-of-scope boundaries, acceptance criteria, assumptions, and risks. +- The plan identifies likely affected repo surfaces and does not contradict repository architecture. +- The tasks are small, ordered, verifiable, and include test/validation tasks. +- RBAC, workspace/tenant isolation, auditability, OperationRun semantics, evidence/result-truth, and UX requirements are addressed where relevant. +- No open question blocks safe implementation. +- The scope is small enough for a bounded implementation loop in a later implementation skill. 
+- Required checklist artifacts exist when the constitution requires them. + +Fail behavior: + +- Fix preparation-artifact issues when they are safe and bounded. +- If readiness cannot be achieved without implementation or unresolved product decisions, stop and report the gap. +- Do not compensate for an unclear spec by inventing implementation scope. + +## Candidate Selection Rules + +When the user asks for the next best spec from roadmap/spec-candidates: + +- Read `docs/product/spec-candidates.md`. +- Read relevant roadmap documents under `docs/product/`, especially `roadmap.md` if present. +- Check existing specs to avoid duplicates. +- Check existing specs for completed-spec signals before selecting an existing package as a refresh target. +- Exclude completed specs from next-best-prep selection, even if their artifacts contain close-out, validation, or completed-task language that would look like drift in a preparation-only package. +- Prefer candidates that align with current roadmap priorities, platform foundations, enterprise UX, RBAC/isolation, auditability, observability, and governance workflow maturity. +- Prefer candidates that unlock roadmap progress, reduce architectural drift, harden foundations, or remove known blockers. +- Prefer small, implementation-ready slices over broad platform rewrites. +- If multiple candidates are plausible, choose one primary candidate and document why it was selected. +- Add non-selected relevant candidates as follow-up spec candidates, not hidden scope. +- Do not invent a candidate if existing roadmap/spec-candidate material provides a suitable one. +- Do not pick a spec only because it is listed first. +- Evaluate the Candidate Selection Gate before creating the spec directory. + +Evaluate candidates using these criteria: + +1. **Roadmap Fit**: Does it support the current roadmap sequence or unlock the next roadmap layer? +2. 
**Foundation Value**: Does it strengthen reusable platform foundations such as RBAC, isolation, auditability, evidence, OperationRun observability, provider boundaries, vocabulary, baseline/control/finding semantics, or enterprise UX patterns? +3. **Dependency Unblocking**: Does it make future specs smaller, safer, or more consistent? +4. **Scope Size**: Can it be implemented as a narrow, testable slice? +5. **Repo Readiness**: Does the repo already have enough structure to implement the next slice safely? +6. **Risk Reduction**: Does it reduce current architectural or product risk? +7. **User/Product Value**: Does it produce visible operator value or make the platform more sellable without heavy scope? +8. **Completion Safety**: Is the target genuinely unprepared or incomplete, rather than an already completed spec whose historical close-out artifacts should be preserved? + +## Required Selection Output Before Spec Kit Execution + +Before running the Spec Kit flow, identify: + +- selected candidate title +- source location in roadmap/spec-candidates +- why it was selected +- why close alternatives were deferred +- roadmap relationship +- completed-spec check result for related existing specs +- smallest viable implementation slice +- proposed concise feature description to feed into `specify` + +The feature description must be product- and behavior-oriented. It should not be a low-level implementation plan. + +## Spec Kit Preparation Flow + +### Step 1: Determine the repository's Spec Kit command pattern + +Inspect repository instructions and scripts to identify how this repo expects Spec Kit to be run. + +Common locations to inspect: + +```text +.specify/scripts/ +.specify/templates/ +.specify/memory/constitution.md +.github/prompts/ +.github/skills/ +README.md +specs/ +``` + +Use the repo-specific mechanism if present. + +### Step 2: Run `specify` + +Run the repository's `specify` flow using the selected candidate and the smallest viable slice. 
+ +The `specify` input should include: + +- selected candidate title +- problem statement +- operator/user value +- roadmap relationship +- out-of-scope boundaries +- key acceptance criteria +- important enterprise constraints + +Let Spec Kit create the correct branch and spec location if that is the repo's configured behavior. + +### Step 3: Run `plan` + +Run the repository's `plan` flow for the generated spec. + +The `plan` input should keep the scope tight and should require repo-based alignment with: + +- constitution +- existing architecture +- workspace/tenant isolation +- RBAC +- OperationRun/observability where relevant +- evidence/snapshot/truth semantics where relevant +- Filament/Livewire conventions where relevant +- test strategy + +### Step 4: Run `tasks` + +Run the repository's `tasks` flow for the generated plan. + +The generated tasks must be: + +- ordered +- small +- testable +- grouped by phase +- limited to the selected scope +- suitable for later implementation or manual analysis before implementation + +### Step 5: Run preparation `analyze` + +Run the repository's `analyze` flow against the generated Spec Kit artifacts when the repository supports it. + +Analyze must check: + +- consistency between `spec.md`, `plan.md`, and `tasks.md` +- constitution alignment +- roadmap alignment +- whether the selected candidate was narrowed safely +- whether tasks are complete enough for implementation +- whether tasks accidentally require scope not described in the spec +- whether plan details conflict with repository architecture or terminology +- whether implementation risks are documented instead of silently ignored + +Do not use analyze as a trigger to implement application code. + +### Step 6: Fix preparation-artifact issues only + +If preparation analyze finds issues, first confirm that the selected package is not completed. 
Then fix only Spec Kit preparation artifacts such as: + +- `spec.md` +- `plan.md` +- `tasks.md` +- `checklists/requirements.md` or other generated Spec Kit metadata files, if the repository uses them + +Allowed fixes include: + +- clarify requirements +- tighten scope +- move out-of-scope work into follow-up candidates +- correct terminology +- add missing tasks +- remove tasks not backed by the spec +- align plan language with repository architecture +- add missing acceptance criteria or validation tasks +- add missing checklist artifacts required by the constitution + +Forbidden fixes include: + +- modifying application code +- creating migrations +- editing models, services, jobs, policies, Filament resources, Livewire components, tests, commands, routes, or views +- running implementation or test-fix loops +- changing runtime behavior +- removing implementation close-out history from completed specs +- converting completed specs back to preparation-only wording +- changing passed validation or smoke results into planned validation commands +- unchecking completed implementation tasks in a completed spec + +### Step 7: Evaluate the Spec Readiness Gate + +After preparation analyze has passed or preparation-artifact issues have been fixed, evaluate the Spec Readiness Gate. + +Stop after this gate and do not implement. + +## Spec Directory Rules + +When creating a new spec directory, use the repository's Spec Kit-generated directory or path. + +If the repository does not provide a command for spec setup, use the next valid spec number and a kebab-case slug: + +```text +specs/-/ +``` + +The exact number must be derived from the current repository state and existing numbering conventions. 
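When no setup command exists, the next number can be derived from the directories that already live under `specs/`. The following is a minimal sketch under stated assumptions, not the repository's own tooling: `next_spec_dir` is a hypothetical helper, and it assumes directories follow an `NNN-slug` pattern with three-digit zero padding (consistent with names such as `295-full-suite-ci-baseline`, but worth verifying against the real repository first).

```shell
#!/usr/bin/env bash
# Sketch: derive the next spec directory name from existing `NNN-slug` directories.
# `next_spec_dir` is a hypothetical helper, not part of Spec Kit.
next_spec_dir() {
  local specs_root="$1" slug="$2" highest=0 dir name number
  for dir in "${specs_root}"/*/; do
    [ -d "${dir}" ] || continue
    name="$(basename "${dir}")"
    number="${name%%-*}"
    # Skip directories whose prefix is not purely numeric.
    case "${number}" in
      ''|*[!0-9]*) continue ;;
    esac
    # Force base 10 so zero-padded prefixes such as 089 are not read as octal.
    if (( 10#${number} > highest )); then
      highest=$(( 10#${number} ))
    fi
  done
  # Assumption: three-digit zero padding.
  printf '%s/%03d-%s\n' "${specs_root}" "$(( highest + 1 ))" "${slug}"
}
```

A call such as `next_spec_dir specs my-feature-slug` prints the candidate path only; it creates nothing, so the result can still be checked against existing numbering conventions before use.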
+ +Create or update preparation artifacts inside the selected spec directory: + +```text +specs/-/spec.md +specs/-/plan.md +specs/-/tasks.md +``` + +If the repository templates require additional preparation files, create them only when this is consistent with existing Spec Kit conventions. + +## `spec.md` Requirements + +The spec must be product- and behavior-oriented. It should avoid premature implementation detail unless needed for correctness. + +Include: + +- Feature title +- Problem statement +- Business/product value +- Primary users/operators +- User stories +- Functional requirements +- Non-functional requirements +- UX requirements +- RBAC/security requirements +- Auditability/observability requirements +- Data/truth-source requirements where relevant +- Out of scope +- Acceptance criteria +- Success criteria +- Risks +- Assumptions +- Open questions + +TenantPilot/TenantAtlas specs should preserve enterprise SaaS principles: + +- workspace/tenant isolation +- capability-first RBAC +- auditability +- operation/result truth separation +- source-of-truth clarity +- calm enterprise operator UX +- progressive disclosure where useful +- no false positive calmness + +## `plan.md` Requirements + +The plan must be repo-aware and implementation-oriented, but it must not make code changes by itself. 
+ +Include: + +- Technical approach +- Existing repository surfaces likely affected +- Domain/model implications +- UI/Filament implications +- Livewire implications where relevant +- OperationRun/monitoring implications where relevant +- RBAC/policy implications +- Audit/logging/evidence implications where relevant +- Data/migration implications where relevant +- Test strategy +- Rollout considerations +- Risk controls +- Implementation phases + +The plan should clearly distinguish where relevant: + +- execution truth +- artifact truth +- backup/snapshot truth +- recovery/evidence truth +- operator next action + +## `tasks.md` Requirements + +Tasks must be ordered, small, and verifiable. + +Include: + +- checkbox tasks +- phase grouping +- tests before or alongside implementation tasks where practical +- final validation tasks +- documentation/update tasks if needed +- explicit non-goals where useful + +Avoid vague tasks such as: + +```text +Clean up code +Refactor UI +Improve performance +Make it enterprise-ready +``` + +Prefer concrete tasks such as: + +```text +- [ ] Add a feature test covering workspace isolation for . +- [ ] Update to display . +- [ ] Add policy coverage for . +``` + +If exact file names are not known yet, phrase tasks as repo-verification tasks first rather than inventing file paths. + +## Preparation Scope Control + +If the requested feature implies multiple independent concerns, create one primary spec for the smallest valuable slice and add a `Follow-up spec candidates` section. + +Examples of follow-up candidates: + +- assigned findings +- pending approvals +- personal work queue +- notification delivery settings +- evidence pack export hardening +- operation monitoring refinements +- autonomous governance decision surfaces + +Do not force all follow-up candidates into the primary spec. + +## Failure Handling + +If a Spec Kit command or preparation analyze phase fails: + +1. Stop at the relevant gate. +2. 
Report the failing command or phase. +3. Summarize the error. +4. Do not attempt implementation as a workaround. +5. Suggest the smallest safe next action. + +If the branch or working tree state is unsafe: + +1. Stop before running Spec Kit commands. +2. Report the current branch and relevant uncommitted files. +3. Ask the user to commit, stash, or move to a clean worktree. + +If a completed spec is accidentally selected or modified: + +1. Stop immediately. +2. Report that the selected spec is completed and therefore not a valid preparation target. +3. Revert only the changes made by this operation to that completed spec package, if they are isolated and safe to revert. +4. Run `git status --short` and report remaining changes. +5. Re-run candidate selection excluding completed specs. +6. If no safe unprepared candidate exists, report `no safe next prep target`. + +## Final Response Requirements + +Respond with: + +1. Selected candidate and why it was chosen +2. Why close alternatives were deferred +3. Completed-spec guardrail result for related existing specs +4. Current branch after Spec Kit execution, if changed +5. Generated spec path +6. Files created or updated by Spec Kit +7. Preparation analyze result summary +8. Preparation-artifact fixes applied after analyze +9. Assumptions made +10. Open questions, if any +11. Candidate Selection Gate result +12. Spec Readiness Gate result +13. Recommended next implementation prompt +14. Explicit statement that no application implementation was performed + +Keep the final response concise, but include enough detail for the user to continue immediately. + +## Manual Review and Next-Step Prompts + +Provide a ready-to-copy manual artifact review prompt like this, adapted to the generated spec branch/path: + +```markdown +Du bist ein Senior Staff Software Architect und Enterprise SaaS Reviewer. + +Analysiere die neu erstellte Spec `` streng repo-basiert. 
+ +Ziel: +Prüfe, ob `spec.md`, `plan.md` und `tasks.md` vollständig, konsistent, implementierbar und constitution-konform sind. + +Wichtig: +- Keine Implementierung. +- Keine Codeänderungen. +- Keine Scope-Erweiterung. +- Prüfe nur gegen Repo-Wahrheit. +- Benenne konkrete Konflikte mit Dateien, Patterns, Datenflüssen oder bestehenden Specs. +- Schlage nur minimale Korrekturen an `spec.md`, `plan.md` und `tasks.md` vor. +- Wenn alles passt, gib eine klare Implementierungsfreigabe. +``` + +Also provide a ready-to-copy implementation prompt for the separate implementation skill after analyze has passed or preparation-artifact issues have been fixed: + +```markdown +/spec-kit-implementation-loop + +Implementiere die vorbereitete Spec `` streng anhand von `tasks.md`. + +Danach Tests ausführen, Browser Smoke Test falls UI/user-facing betroffen ist, Post-Implementation Analyse durchführen und alle bestätigten In-Scope Findings unabhängig von Severity beheben, wenn safe und bounded. + +Wiederhole test + browser smoke + analysis + fix bis keine In-Scope Findings mehr offen sind oder eine Stop Condition greift. +``` + +## Example Invocation + +User: + +```text +Nutze den Skill spec-kit-next-best-prep. +Wähle aus roadmap.md und spec-candidates.md die nächste sinnvollste Spec. +Führe danach GitHub Spec Kit specify, plan, tasks und analyze in einem Rutsch aus. +Behebe alle analyze-Issues in den Spec-Kit-Artefakten. +Keine Application-Implementierung. +``` + +Expected behavior: + +1. Inspect constitution, Spec Kit scripts/templates, specs, roadmap, and spec candidates. +2. Check branch and working tree safety. +3. Compare candidate suitability. +4. Select the next best candidate. +5. Exclude already completed specs from preparation or refresh targets, preserving their close-out and validation history. +6. Evaluate the Candidate Selection Gate. +7. Run the repository's real Spec Kit `specify` flow, letting it handle branch/spec setup. +8. 
Run the repository's real Spec Kit `plan` flow. +9. Run the repository's real Spec Kit `tasks` flow. +10. Run the repository's real Spec Kit preparation `analyze` flow. +11. Fix analyze issues only in Spec Kit preparation artifacts. +12. Evaluate the Spec Readiness Gate. +13. Stop before application implementation. +14. Return selection rationale, branch/path summary, artifact summary, analyze summary, fixes applied, gates, and next implementation prompt. +``` \ No newline at end of file diff --git a/.codex/skills/tailwindcss-development/SKILL.md b/.codex/skills/tailwindcss-development/SKILL.md new file mode 100644 index 00000000..21a7e463 --- /dev/null +++ b/.codex/skills/tailwindcss-development/SKILL.md @@ -0,0 +1,129 @@ +--- +name: tailwindcss-development +description: "Styles applications using Tailwind CSS v4 utilities. Activates when adding styles, restyling components, working with gradients, spacing, layout, flex, grid, responsive design, dark mode, colors, typography, or borders; or when the user mentions CSS, styling, classes, Tailwind, restyle, hero section, cards, buttons, or any visual/UI changes." +license: MIT +metadata: + author: laravel +--- + +# Tailwind CSS Development + +## When to Apply + +Activate this skill when: + +- Adding styles to components or pages +- Working with responsive design +- Implementing dark mode +- Extracting repeated patterns into components +- Debugging spacing or layout issues + +## Documentation + +Use `search-docs` for detailed Tailwind CSS v4 patterns and documentation. + +## Basic Usage + +- Use Tailwind CSS classes to style HTML. Check and follow existing Tailwind conventions in the project before introducing new patterns. +- Offer to extract repeated patterns into components that match the project's conventions (e.g., Blade, JSX, Vue). +- Consider class placement, order, priority, and defaults. 
Remove redundant classes, add classes to parent or child elements carefully to reduce repetition, and group elements logically. + +## Tailwind CSS v4 Specifics + +- Always use Tailwind CSS v4 and avoid deprecated utilities. +- `corePlugins` is not supported in Tailwind v4. + +### CSS-First Configuration + +In Tailwind v4, configuration is CSS-first using the `@theme` directive — no separate `tailwind.config.js` file is needed: + + +```css +@theme { + --color-brand: oklch(0.72 0.11 178); +} +``` + +### Import Syntax + +In Tailwind v4, import Tailwind with a regular CSS `@import` statement instead of the `@tailwind` directives used in v3: + + +```diff +- @tailwind base; +- @tailwind components; +- @tailwind utilities; ++ @import "tailwindcss"; +``` + +### Replaced Utilities + +Tailwind v4 removed deprecated utilities. Use the replacements shown below. Opacity values remain numeric. + +| Deprecated | Replacement | +|------------|-------------| +| bg-opacity-* | bg-black/* | +| text-opacity-* | text-black/* | +| border-opacity-* | border-black/* | +| divide-opacity-* | divide-black/* | +| ring-opacity-* | ring-black/* | +| placeholder-opacity-* | placeholder-black/* | +| flex-shrink-* | shrink-* | +| flex-grow-* | grow-* | +| overflow-ellipsis | text-ellipsis | +| decoration-slice | box-decoration-slice | +| decoration-clone | box-decoration-clone | + +## Spacing + +Use `gap` utilities instead of margins for spacing between siblings: + + +```html +
+<div class="flex gap-8">
+    <div>Item 1</div>
+    <div>Item 2</div>
+</div>
+``` + +## Dark Mode + +If existing pages and components support dark mode, new pages and components must support it the same way, typically using the `dark:` variant: + + +```html +
+<div class="bg-white dark:bg-gray-900 text-gray-900 dark:text-white">
+    Content adapts to color scheme
+</div>
+``` + +## Common Patterns + +### Flexbox Layout + + +```html +
+<div class="flex items-center justify-between gap-4">
+    <div>Left content</div>
+    <div>Right content</div>
+</div>
+``` + +### Grid Layout + + +```html +
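+<div class="grid grid-cols-1 md:grid-cols-2 lg:grid-cols-3 gap-6">
+    <div>Card 1</div>
+    <div>Card 2</div>
+    <div>Card 3</div>
+</div>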
+``` + +## Common Pitfalls + +- Using deprecated v3 utilities (bg-opacity-*, flex-shrink-*, etc.) +- Using `@tailwind` directives instead of `@import "tailwindcss"` +- Trying to use `tailwind.config.js` instead of CSS `@theme` directive +- Using margins for spacing between siblings instead of gap utilities +- Forgetting to add dark mode variants when the project uses dark mode \ No newline at end of file -- 2.45.2 From e6550ee1c6fe84b40009f091583d413b1e2fc264 Mon Sep 17 00:00:00 2001 From: Ahmed Darrazi Date: Mon, 11 May 2026 13:11:06 +0200 Subject: [PATCH 2/2] test: capture spec 295 full suite ci baseline --- .../Guards/TestLaneCommandContractTest.php | 13 + scripts/platform-test-artifacts | 15 +- .../checklists/requirements.md | 45 +++ .../295-full-suite-ci-baseline/data-model.md | 67 ++++ .../failure-classification.md | 122 +++++++ specs/295-full-suite-ci-baseline/plan.md | 181 +++++++++ .../295-full-suite-ci-baseline/quickstart.md | 90 +++++ specs/295-full-suite-ci-baseline/research.md | 58 +++ specs/295-full-suite-ci-baseline/spec.md | 342 ++++++++++++++++++ specs/295-full-suite-ci-baseline/tasks.md | 173 +++++++++ 10 files changed, 1097 insertions(+), 9 deletions(-) create mode 100644 specs/295-full-suite-ci-baseline/checklists/requirements.md create mode 100644 specs/295-full-suite-ci-baseline/data-model.md create mode 100644 specs/295-full-suite-ci-baseline/failure-classification.md create mode 100644 specs/295-full-suite-ci-baseline/plan.md create mode 100644 specs/295-full-suite-ci-baseline/quickstart.md create mode 100644 specs/295-full-suite-ci-baseline/research.md create mode 100644 specs/295-full-suite-ci-baseline/spec.md create mode 100644 specs/295-full-suite-ci-baseline/tasks.md diff --git a/apps/platform/tests/Feature/Guards/TestLaneCommandContractTest.php b/apps/platform/tests/Feature/Guards/TestLaneCommandContractTest.php index 7e3f0f23..1e8ae1db 100644 --- a/apps/platform/tests/Feature/Guards/TestLaneCommandContractTest.php +++ 
b/apps/platform/tests/Feature/Guards/TestLaneCommandContractTest.php @@ -38,6 +38,19 @@ ->and(file_exists(repo_path('scripts/platform-test-artifacts')))->toBeTrue(); }); +it('passes artifact staging inputs through php argv for sail execution', function (): void { + $artifactRunner = (string) file_get_contents(repo_path('scripts/platform-test-artifacts')); + + expect($artifactRunner) + ->toContain('./vendor/bin/sail php -- "${LANE}" "${STAGING_DIRECTORY}" "${ARTIFACT_DIRECTORY}"') + ->and($artifactRunner)->toContain('$laneId = (string) ($argv[1] ?? \'\');') + ->and($artifactRunner)->toContain('$stagingDirectory = (string) ($argv[2] ?? \'\');') + ->and($artifactRunner)->toContain('$artifactDirectory = (string) ($argv[3] ?? \'\');') + ->and($artifactRunner)->not->toContain("getenv('LANE_ID')") + ->and($artifactRunner)->not->toContain("getenv('STAGING_DIRECTORY')") + ->and($artifactRunner)->not->toContain("getenv('ARTIFACT_DIRECTORY')"); +}); + it('keeps heavy-governance baseline capture support inside the checked-in wrappers', function (): void { $laneRunner = (string) file_get_contents(repo_path('scripts/platform-test-lane')); $reportRunner = (string) file_get_contents(repo_path('scripts/platform-test-report')); diff --git a/scripts/platform-test-artifacts b/scripts/platform-test-artifacts index df974095..c6e512bf 100755 --- a/scripts/platform-test-artifacts +++ b/scripts/platform-test-artifacts @@ -48,20 +48,17 @@ fi cd "${APP_DIR}" -LANE_ID="${LANE}" \ -STAGING_DIRECTORY="${STAGING_DIRECTORY}" \ -ARTIFACT_DIRECTORY="${ARTIFACT_DIRECTORY}" \ -./vendor/bin/sail php <<'PHP' +./vendor/bin/sail php -- "${LANE}" "${STAGING_DIRECTORY}" "${ARTIFACT_DIRECTORY}" <<'PHP' ` commands for report/artifact classification +- **Fixture / helper / factory / seed / context cost risks**: no new defaults; classify fixture-heavy failures instead of widening setup by default +- **Expensive defaults or shared helper growth introduced?**: no +- **Heavy-family additions, promotions, or 
visibility changes**: none by default +- **Surface-class relief / special coverage rule**: browser/heavy lane output is classification-only unless active fix scope explicitly owns it +- **Closing validation and reviewer handoff**: reviewers should confirm no unclassified failing group, no hidden budget relaxation, no new lane family, and no legacy cutover behavior restoration +- **Budget / baseline / trend follow-up**: classify in `failure-classification.md`; only adjust a baseline when the row explains why current evidence supports it +- **Review-stop questions**: lane fit, hidden fixture cost, product repair scope creep, browser scope creep, budget baseline relaxation +- **Escalation path**: `document-in-feature` for CI/lane contract corrections, `follow-up-spec` for product/runtime failures +- **Active feature PR close-out entry**: `FullSuiteClassification` +- **Why no dedicated follow-up spec is needed**: this spec is itself the bounded classification pass. Follow-up specs are created only for classified product/runtime groups. + +## Project Structure + +### Documentation (this feature) + +```text +specs/295-full-suite-ci-baseline/ +├── checklists/ +│ └── requirements.md +├── data-model.md +├── failure-classification.md +├── plan.md +├── quickstart.md +├── research.md +├── spec.md +└── tasks.md +``` + +### Source Code (repository root) + +```text +scripts/ +├── platform-test-artifacts +├── platform-test-lane +└── platform-test-report + +apps/platform/ +├── composer.json +└── tests/ + ├── Feature/Guards/ + └── Support/ +``` + +**Structure Decision**: implementation should touch only the documentation artifacts above unless classification proves a small CI/lane contract defect in the listed scripts/support/guard-test surfaces. Runtime application code, migrations, models, Filament resources, routes, views, and provider services are out of scope. 
+ +## Complexity Tracking + +| Violation | Why Needed | Simpler Alternative Rejected Because | +|---|---|---| +| Spec-local failure-classification vocabulary | The full-suite readiness decision needs one bounded way to classify all red groups after Specs `293` and `294` | Raw terminal notes would not preserve ownership, lane, or follow-up decisions | + +## Proportionality Review + +- **Current operator problem**: maintainers cannot safely decide whether CI is restored without a classified full-suite baseline. +- **Existing structure is insufficient because**: targeted green lanes and raw full-suite output answer different questions; neither alone assigns follow-up ownership. +- **Narrowest correct implementation**: one spec-local classification artifact and existing lane wrappers. +- **Ownership cost**: temporary classification upkeep during implementation and possibly small lane contract guard adjustments. +- **Alternative intentionally rejected**: new full-suite CI framework or fix-all suite cleanup. +- **Release truth**: current-release test governance and CI readiness. + +## Phase 0: Research Output + +See `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/research.md`. + +## Phase 1: Design Output + +- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md` +- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/quickstart.md` +- `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` + +## Phase 2: Task Planning Output + +See `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/tasks.md`. 
diff --git a/specs/295-full-suite-ci-baseline/quickstart.md b/specs/295-full-suite-ci-baseline/quickstart.md new file mode 100644 index 00000000..6612d771 --- /dev/null +++ b/specs/295-full-suite-ci-baseline/quickstart.md @@ -0,0 +1,90 @@ +# Quickstart: Full Suite Failure Classification & CI Lane Baseline + +## Purpose + +Use this package to classify whether the complete platform test suite is a reliable CI signal after Specs `293` and `294`. + +## Before Implementation + +1. Review: + - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md` + - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/plan.md` + - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md` + - `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` +2. Confirm the branch is clean. +3. Confirm no implementation step is about restoring TenantPanelProvider, `/admin/t/...`, or tenant-scoped legacy fallbacks. + +## Primary Classification Flow + +Use only the pinned categories and seams from `failure-classification.md`: `ci-signal-restored`, `ci-wrapper-or-manifest-regression`, `artifact-publication-regression`, `budget-or-trend-baseline-drift`, `product-runtime-or-test-regression`, `browser-lane-regression`, `flaky-or-environment`, `follow-up-spec-required`, `resolved-or-not-needed`; and `raw-full-suite`, `fast-feedback-lane`, `confidence-lane`, `heavy-governance-lane`, `browser-lane`, `profiling-or-junit-support`, `lane-reporting`, `artifact-publication`, `budget-trend-baseline`, `legacy-cutover-regression-guard`, `provider-verification-regression-guard`. + +Run the raw full suite when feasible: + +```bash +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact) +``` + +Record the outcome in `failure-classification.md`. 
+ +If the raw full suite is too slow, noisy, or environment-blocked to classify, run the explicit lane split: + +```bash +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser +``` + +## CI Report and Artifact Flow + +After lane runs, generate lane reports when needed: + +```bash +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser +``` + +Use artifact staging only if artifact publication itself is being validated: + +```bash +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-fast-feedback-artifacts +``` + +## Fix Rules + +Fix in `295` only when the failure is directly and narrowly caused by: + +- `scripts/platform-test-lane` +- `scripts/platform-test-report` +- `scripts/platform-test-artifacts` +- `apps/platform/tests/Support/TestLaneManifest.php` +- `apps/platform/tests/Support/TestLaneReport.php` +- `apps/platform/tests/Support/TestLaneBudget.php` +- directly related CI guard tests under `apps/platform/tests/Feature/Guards/` + +Do not fix in `295` when the failure requires: + +- application runtime behavior changes +- Filament page/resource changes +- routes, middleware, policies, services, jobs, migrations, views, or models +- provider/verification runtime changes beyond the completed Spec `294` +- browser UI repair +- tenant-cutover compatibility 
restoration + +Classify those as follow-up work instead. + +## Expected Close-Out + +Close out with exactly one final readiness decision: + +- `restored-ci-signal` +- `classified-follow-up-required` +- `blocked-by-environment` + +Then run formatting for any changed PHP files: + +```bash +export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent) +``` diff --git a/specs/295-full-suite-ci-baseline/research.md b/specs/295-full-suite-ci-baseline/research.md new file mode 100644 index 00000000..7f298064 --- /dev/null +++ b/specs/295-full-suite-ci-baseline/research.md @@ -0,0 +1,58 @@ +# Research: Full Suite Failure Classification & CI Lane Baseline + +## Decision: Use classification-first implementation + +**Rationale**: The user explicitly asked not to blindly repair the full suite. Specs `293` and `294` already handled known focused stabilization slices. `295` must first answer whether the full suite is a reliable signal and only then allow small CI/lane fixes. + +**Alternatives considered**: + +- **Fix every failing test immediately**: rejected because it hides ownership, scope-creeps into unrelated features, and violates the requested goal. +- **Run only targeted lanes**: rejected because the central question is the complete suite signal after the targeted lanes were stabilized. +- **Skip full-suite run and rely on CI lanes**: rejected because lane split can hide cross-lane fallout or raw-suite issues. + +## Decision: Prefer raw full suite, then explicit lane split fallback + +**Rationale**: The raw command `cd apps/platform && ./vendor/bin/sail artisan test --compact` is the most direct answer to the full-suite readiness question. If it times out, produces output too large to classify, or is environment-blocked, the existing wrappers provide explicit fallback segmentation: `fast-feedback`, `confidence`, `heavy-governance`, and `browser`. 
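The fallback segmentation described above could be driven by a small loop that keeps going after a red lane, so that every lane ends up with a recorded status. This is only a sketch: `run_lane_split` is a hypothetical helper, the log directory is an assumption, and the runner is passed in as an argument so the loop does not hard-code how `scripts/platform-test-lane` behaves.

```shell
#!/usr/bin/env bash
# Sketch: run each lane through a runner command and continue after a failing lane,
# so every lane gets a recorded exit status. `run_lane_split` is hypothetical.
run_lane_split() {
  local runner="$1" log_dir="$2"
  shift 2
  local lane status overall=0
  mkdir -p "${log_dir}"
  for lane in "$@"; do
    status=0
    # Capture output per lane so it can be classified afterwards.
    "${runner}" "${lane}" >"${log_dir}/${lane}.log" 2>&1 || status=$?
    # One tab-separated row per lane: lane name, exit status.
    printf '%s\t%s\n' "${lane}" "${status}"
    if [ "${status}" -ne 0 ]; then
      overall=1
    fi
  done
  return "${overall}"
}
```

For example, `run_lane_split ./scripts/platform-test-lane /tmp/lane-logs fast-feedback confidence heavy-governance browser` would print one row per lane and return non-zero if any lane failed. The exit status alone does not classify a failure; the per-lane logs still have to be read.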
+ +**Alternatives considered**: + +- **Create a new full-suite wrapper**: rejected as premature CI framework growth. +- **Use only `confidence`**: rejected because confidence intentionally excludes browser, heavy-governance, and some discovery-heavy families. + +## Decision: Reuse existing lane and failure-class contracts + +**Rationale**: `TestLaneManifest` already defines lanes, workflow profiles, budgets, artifact contracts, and lane scope notes. `TestLaneReport` already classifies CI failures as `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, or `infrastructure-failure`. Spec `295` should verify and minimally correct those contracts rather than inventing another taxonomy. + +**Pinned Spec 295 categories**: `ci-signal-restored`, `ci-wrapper-or-manifest-regression`, `artifact-publication-regression`, `budget-or-trend-baseline-drift`, `product-runtime-or-test-regression`, `browser-lane-regression`, `flaky-or-environment`, `follow-up-spec-required`, `resolved-or-not-needed`. + +**Pinned Spec 295 seams**: `raw-full-suite`, `fast-feedback-lane`, `confidence-lane`, `heavy-governance-lane`, `browser-lane`, `profiling-or-junit-support`, `lane-reporting`, `artifact-publication`, `budget-trend-baseline`, `legacy-cutover-regression-guard`, `provider-verification-regression-guard`. + +**Alternatives considered**: + +- **Add a separate CI readiness model**: rejected because the existing support classes already own this truth. +- **Record only plain-text notes**: rejected because future maintainers need stable categories, seams, and follow-up decisions. + +## Decision: Allow only small CI/lane contract fixes + +**Rationale**: In-scope fixes are limited to wrappers, manifest/report support, artifact publication, budget/report contract drift, and their direct guard tests. This keeps the package focused on CI signal readiness. 
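One way to make this boundary checkable is a path allowlist applied to the changed files. The sketch below is illustrative only: `is_in_scope_path` and `list_out_of_scope` are hypothetical helpers, the paths are the ones named in this package's fix rules, and treating every file under `Feature/Guards/` as in scope is a simplifying assumption (the spec only allows directly related guard tests).

```shell
#!/usr/bin/env bash
# Sketch: succeed only for paths this spec treats as fixable CI/lane surfaces.
# Both helpers are hypothetical and not part of the repository.
is_in_scope_path() {
  case "$1" in
    scripts/platform-test-lane|scripts/platform-test-report|scripts/platform-test-artifacts)
      return 0 ;;
    apps/platform/tests/Support/TestLaneManifest.php|apps/platform/tests/Support/TestLaneReport.php|apps/platform/tests/Support/TestLaneBudget.php)
      return 0 ;;
    # Assumption: every guard test counts; the spec only covers directly related ones.
    apps/platform/tests/Feature/Guards/*)
      return 0 ;;
    # Spec-local documentation artifacts.
    specs/295-full-suite-ci-baseline/*)
      return 0 ;;
    *)
      return 1 ;;
  esac
}

# Print every path read from stdin that falls outside the allowlist.
list_out_of_scope() {
  local path
  while IFS= read -r path; do
    [ -n "${path}" ] || continue
    is_in_scope_path "${path}" || printf '%s\n' "${path}"
  done
}
```

Piping `git diff --name-only` into `list_out_of_scope` would then list any touched file that probably belongs in a follow-up spec rather than in `295`. An empty result does not prove a change is narrow enough; it only shows that no file outside the listed surfaces was modified.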
+ +**Alternatives considered**: + +- **Fix application/runtime failures discovered by the suite**: rejected unless a failure is proven to be a small CI/lane contract defect. +- **Update historical Specs `293` or `294`**: rejected by completed-spec guardrail and user scope. + +## Decision: Preserve legacy cutover retirement + +**Rationale**: The request explicitly forbids reopening tenant cutover, legacy `/admin/t/...`, or TenantPanelProvider. Any failure that appears to depend on those retired paths must be classified without restoring them. + +**Alternatives considered**: + +- **Add temporary route aliases to make old tests pass**: rejected as direct conflict with the cutover baseline. + +## Decision: Browser output is classification input, not automatic repair ownership + +**Rationale**: The browser lane is intentionally isolated and may expose environment or smoke fallout. Spec `295` should classify browser failures and only repair browser-specific contract issues if they are lane/report artifacts, not product UI behavior. + +**Alternatives considered**: + +- **Run a browser smoke fix loop inside 295**: rejected because this is not a UI implementation spec. diff --git a/specs/295-full-suite-ci-baseline/spec.md b/specs/295-full-suite-ci-baseline/spec.md new file mode 100644 index 00000000..8ca947ec --- /dev/null +++ b/specs/295-full-suite-ci-baseline/spec.md @@ -0,0 +1,342 @@ +# Feature Specification: Full Suite Failure Classification & CI Lane Baseline + +**Feature Branch**: `295-full-suite-ci-baseline` +**Created**: 2026-05-11 +**Status**: Ready +**Input**: User description: "Spec 295 - Full Suite Failure Classification & CI Lane Baseline. After Specs 293 and 294, run a full-suite classification to determine whether the full platform suite is again a reliable CI signal or whether remaining failures must be classified into separate follow-up specs or lanes. 
Do not blindly fix the full suite, do not scope-creep, do not reopen tenant cutover, do not restore legacy `/admin/t/...` or TenantPanelProvider behavior, and perform only small clearly in-scope fixes." + +## Spec Candidate Check *(mandatory - SPEC-GATE-001)* + +- **Problem**: Specs `293` and `294` closed the known post-cutover route/action-surface and ProviderConnections/Verification failure blocks, but the complete platform suite has not yet been classified as a restored CI signal. Maintainers need one bounded pass that distinguishes green signal, CI wrapper or lane baseline failures, remaining product regressions, flaky or environment failures, and follow-up-spec debt. +- **Today's failure**: targeted lanes can be green while the raw full suite or CI lane wrappers may still fail for unrelated product debt, wrapper/report/artifact drift, budget baseline changes, browser-specific fallout, or environment-only failures. Without classification, future work cannot tell whether a red run means "fix this PR", "rerun because infrastructure failed", "update lane baseline", or "open a follow-up spec". +- **User-visible improvement**: maintainers get an attributable CI readiness decision: either the complete platform suite is a reliable blocking signal again, or every remaining red group is explicitly assigned to the right lane, owner, and follow-up path without reviving retired tenant routes or reopening Specs `293` and `294`. +- **Smallest enterprise-capable version**: one classification-first package that runs the raw full suite or its explicit fallback lane split, records every failing group in `failure-classification.md`, validates existing lane wrappers/report/artifact contracts, applies only small CI-signal fixes when the failure is clearly in scope, and records all product/runtime failures as follow-up candidates instead of absorbing them. 
+- **Explicit non-goals**: no broad full-suite repair, no tenant-cutover rework, no TenantPanelProvider reactivation, no `/admin/t/...` route restoration, no provider/verification runtime expansion beyond Spec `294`, no new CI framework, no new permanent test lane by default, no new browser family, no new runtime persistence, no UI redesign, no product feature work, no unrelated failing-test cleanup, and no historical-spec rewrites. +- **Permanent complexity imported**: one spec-local `failure-classification.md` artifact, one bounded failure-category inventory, one bounded CI/lane seam inventory, and focused tasks against existing test lane scripts, lane manifest/report support, and current Pest lane commands. No runtime table, model, enum, provider abstraction, Filament resource, or product surface is introduced. +- **Why now**: after `293` and `294`, the next quality question is no longer one known red cluster. It is whether CI can be trusted again as a whole. If this is not classified now, later specs will either over-trust a partially red suite or keep rediscovering unrelated failures as local surprises. +- **Why not local**: the signal spans raw Pest execution, `scripts/platform-test-lane`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, `Tests\Support\TestLaneManifest`, `Tests\Support\TestLaneReport`, browser isolation, heavy-governance budget/reporting, and current workflow profiles. A one-file patch would not prove CI readiness. +- **Approval class**: Cleanup +- **Red flags triggered**: full-suite scope, cross-cutting test governance, and possible temptation to repair unrelated product failures. Defense: this spec is classification-first, uses existing lane/failure-class contracts, imports only a spec-local artifact, and forbids broad repair or legacy route restoration. 
+- **Score**: Benefit: 2 | Urgency: 2 | Scope: 2 | Complexity: 1 | Product proximity: 1 | Reuse: 2 | **Total: 10/12** +- **Decision**: approve + +## Review Outcome + +- **Outcome class**: `acceptable-special-case` +- **Workflow outcome**: `keep` +- **Test-governance outcome**: `keep` +- **Reason**: full-suite work is normally too broad, but this package is justified because it is a classification and CI-signal baseline pass after two completed stabilization slices, not a fix-all implementation. +- **Workflow result**: Ready for implementation as one bounded suite-signal classification package after Specs `293` and `294`. + +## Candidate Selection Gate + +- **Selected candidate**: Full Suite Failure Classification & CI Lane Baseline +- **Source location**: explicit user-provided manual follow-up after `specs/293-post-cutover-suite-stabilization/` and `specs/294-provider-verification-runtime-semantics/` +- **Why selected now**: the known cutover and provider/verification red blocks have been stabilized, so the remaining decision is whether the full platform suite and lane wrappers now form a trustworthy CI signal. +- **Why close alternatives were deferred**: + - reopening Spec `293` would blur route/action-surface cutover cleanup with full-suite CI readiness + - reopening Spec `294` would blur provider/verification runtime semantics with unrelated suite failures + - starting Package Execution, Guided Operations, Microsoft Starter Pack, or Virtual Consultant would hide CI uncertainty under new product work + - creating a new permanent full-suite lane would import CI framework complexity before proving the existing lanes are insufficient + - fixing every failing test in one pass would scope-creep beyond classification and make follow-up ownership unclear +- **Roadmap relationship**: test-governance and platform quality follow-through under `TEST-GOV-001`; this is not a new product roadmap lane and not an automatic active queue promotion. 
+- **Completed-spec guardrail result**: Specs `293` and `294` are context only and are excluded from refresh. Spec `294` carries implementation close-out evidence. Spec `293` is treated as the completed post-cutover baseline described by the user and its failure-classification history is preserved; this spec does not rewrite 293 tasks or close-out history. Specs `287` and `288` remain prior cutover and no-legacy guard context only. +- **Smallest viable implementation slice**: run the full suite or explicit lane split, classify every remaining failure group, validate CI wrapper/report/artifact contracts, and perform only small CI-signal fixes that do not change product behavior. +- **Proposed concise feature description to feed into specify**: Classify the full platform test suite after Specs 293 and 294 and establish whether existing CI lanes provide a trustworthy baseline, while splitting unrelated failures into explicit follow-up ownership instead of repairing the suite blindly. + +## Pinned Failure-Classification Categories + +- `ci-signal-restored` +- `ci-wrapper-or-manifest-regression` +- `artifact-publication-regression` +- `budget-or-trend-baseline-drift` +- `product-runtime-or-test-regression` +- `browser-lane-regression` +- `flaky-or-environment` +- `follow-up-spec-required` +- `resolved-or-not-needed` + +## Pinned CI / Suite Seams + +- `raw-full-suite` +- `fast-feedback-lane` +- `confidence-lane` +- `heavy-governance-lane` +- `browser-lane` +- `profiling-or-junit-support` +- `lane-reporting` +- `artifact-publication` +- `budget-trend-baseline` +- `legacy-cutover-regression-guard` +- `provider-verification-regression-guard` + +## Spec Scope Fields *(mandatory)* + +- **Scope**: repository / CI test-governance workflow +- **Primary Routes**: N/A - no application routes or operator-facing navigation are added or restored. Retired `/admin/t/...` routes and TenantPanelProvider behavior remain forbidden. 
+- **Data Ownership**: + - no new application persistence is introduced + - no runtime source of truth is introduced + - `failure-classification.md` is a spec-local implementation artifact and is not product/runtime truth + - existing test lane truth remains in `apps/platform/tests/Support/TestLaneManifest.php`, `apps/platform/tests/Support/TestLaneReport.php`, and the wrapper scripts under `scripts/` +- **RBAC**: + - no authorization model changes are introduced + - existing workspace and managed-environment isolation tests remain ordinary suite participants + - if a failing group concerns RBAC, it must be classified as product/runtime debt or a follow-up spec unless it is clearly only a stale CI/lane assertion + +For canonical-view specs, the spec MUST define: + +- **Default filter behavior when tenant-context is active**: N/A - no canonical-view application surface is added or changed. +- **Explicit entitlement checks preventing cross-tenant leakage**: N/A for this prep package. Any suite failure suggesting leakage must be classified as product-runtime debt and not hidden as a lane issue. 
+ +## Cross-Cutting / Shared Pattern Reuse *(mandatory when the feature touches notifications, status messaging, action links, header actions, dashboard signals/cards, alerts, navigation entry points, evidence/report viewers, or any other existing shared operator interaction family; otherwise write `N/A - no shared interaction family touched`)* + +- **Cross-cutting feature?**: yes +- **Interaction class(es)**: CI lane execution, full-suite signal classification, lane report generation, artifact publication, budget/trend baseline review, and follow-up-spec routing +- **Systems touched**: + - `scripts/platform-test-lane` + - `scripts/platform-test-report` + - `scripts/platform-test-artifacts` + - `apps/platform/composer.json` + - `apps/platform/tests/Support/TestLaneManifest.php` + - `apps/platform/tests/Support/TestLaneReport.php` + - `apps/platform/tests/Support/TestLaneBudget.php` + - `apps/platform/tests/Feature/Guards/TestLaneManifestTest.php` + - `apps/platform/tests/Feature/Guards/CiLaneFailureClassificationContractTest.php` + - `apps/platform/tests/Feature/Guards/CiFastFeedbackWorkflowContractTest.php` + - `apps/platform/tests/Feature/Guards/CiConfidenceWorkflowContractTest.php` + - `apps/platform/tests/Feature/Guards/CiHeavyBrowserWorkflowContractTest.php` + - existing lane-selected Pest tests and browser smoke files only as classification inputs unless a small CI-signal fix is proven +- **Existing pattern(s) to extend**: existing `TestLaneManifest` lane definitions, existing `TestLaneReport` failure classes, existing lane wrapper scripts, existing Gitea workflow profile metadata, existing report/artifact publication contracts +- **Shared contract / presenter / builder / renderer to reuse**: `TestLaneManifest::lanes()`, `TestLaneManifest::workflowProfiles()`, `TestLaneManifest::failureClasses()`, `TestLaneReport::classifyPrimaryFailure()`, `TestLaneReport::buildCiSummary()`, `TestLaneReport::artifactPublicationStatus()`, and `scripts/platform-test-*` +- 
**Why the existing shared path is sufficient or insufficient**: the repo already has explicit lane, failure-class, artifact, and budget contracts. Spec `295` must prove whether they are currently enough and fix only small contract drift; it must not create a new CI orchestration layer before existing contracts are classified. +- **Allowed deviation and why**: only a bounded CI/lane contract correction is allowed when a wrapper, manifest, report, artifact, or budget baseline defect prevents classification. Product/runtime failures must be classified and split instead of fixed here. +- **Consistency impact**: raw suite output, lane wrapper output, report artifacts, budget/trend summaries, and final follow-up classification must tell the same story about whether the suite is green, blocked, flaky, or split. +- **Review focus**: reviewers must verify that this spec does not become a general failing-test cleanup, does not restore tenant-cutover legacy behavior, and does not add a new permanent lane unless the artifacts explicitly prove existing lanes are insufficient. 
+ +## OperationRun UX Impact *(mandatory when the feature creates, queues, deduplicates, resumes, blocks, completes, or deep-links to an `OperationRun`; otherwise write `N/A - no OperationRun start or link semantics touched`)* + +- **Touches OperationRun start/completion/link UX?**: no +- **Shared OperationRun UX contract/layer reused**: N/A +- **Delegated start/completion UX behaviors**: N/A +- **Local surface-owned behavior that remains**: N/A +- **Queued DB-notification policy**: N/A +- **Terminal notification path**: N/A +- **Exception required?**: none + +## Provider Boundary / Platform Core Check *(mandatory when the feature changes shared provider/platform seams, identity scope, governed-subject taxonomy, compare strategy selection, provider connection descriptors, or operator vocabulary that may leak provider-specific semantics into platform-core truth; otherwise write `N/A - no shared provider/platform boundary touched`)* + +- **Shared provider/platform boundary touched?**: no product boundary change +- **Boundary classification**: N/A +- **Seams affected**: provider and verification tests may fail during classification, but this spec may only classify them as regression or follow-up debt unless the failure is purely a CI/lane contract issue. +- **Neutral platform terms preserved or introduced**: `workspace`, `managed environment`, `provider connection`, `operation`, `lane`, `failure group`, `CI signal` +- **Provider-specific semantics retained and why**: N/A +- **Why this does not deepen provider coupling accidentally**: Spec `295` does not change provider runtime, provider identity, target-scope semantics, or provider copy. It treats provider-specific failures as test/runtime debt requiring explicit follow-up unless they are already covered by the completed Spec `294` seam and proven to be a small regression in the CI contract. 
+- **Follow-up path**: any real provider/verification product failure after Spec `294` must become a follow-up spec or explicitly named failure group, not hidden in this classification pass. + +## UI / Surface Guardrail Impact *(mandatory when operator-facing surfaces are changed; otherwise write `N/A`)* + +N/A - no operator-facing surface change. Browser tests may be run as a lane signal only; visible UI repair is out of scope unless a later implementation explicitly stops and opens a follow-up spec. + +## Decision-First Surface Role *(mandatory when operator-facing surfaces are changed)* + +N/A - no application decision surface is added or changed. + +## Audience-Aware Disclosure *(mandatory when operator-facing surfaces are changed)* + +N/A - no application disclosure layer is added or changed. + +## UI/UX Surface Classification *(mandatory when operator-facing surfaces are changed)* + +N/A - no Filament screen, table, widget, relation manager, or resource is added or materially refactored. + +## Operator Surface Contract *(mandatory when operator-facing surfaces are changed)* + +N/A - no operator-facing page contract is introduced. + +## Proportionality Review *(mandatory when structural complexity is introduced)* + +- **New source of truth?**: no runtime source of truth +- **New persisted entity/table/artifact?**: no application persistence; one spec-local `failure-classification.md` artifact is added for implementation tracking only +- **New abstraction?**: no +- **New enum/state/reason family?**: yes, one spec-local failure-classification category set used only inside this spec package +- **New cross-domain UI framework/taxonomy?**: no +- **Current operator problem**: maintainers need one reliable answer to whether the full suite is a usable CI signal after Specs `293` and `294`, and if not, exactly which lane or follow-up owns the remaining failures. 
+- **Existing structure is insufficient because**: targeted green lanes do not prove full-suite readiness, while raw red output without classification does not tell maintainers whether to fix, split, rerun, or update lane baseline artifacts. +- **Narrowest correct implementation**: add one spec-local failure-classification artifact, use existing lane wrappers and support classes, classify all remaining groups, and fix only small CI-signal defects that block classification. +- **Ownership cost**: low to moderate; maintain one temporary classification artifact and any small lane contract correction made during implementation. +- **Alternative intentionally rejected**: a new full-suite framework, broad test rewrite, or permanent new lane. Those options import durable complexity before the existing lane system is proven insufficient. +- **Release truth**: current-release CI/test-governance readiness only + +### Compatibility posture + +This feature assumes a pre-production environment. + +Backward compatibility, legacy aliases, route shims, TenantPanelProvider restoration, and compatibility-specific tests are out of scope. Canonical replacement remains preferred over preservation. + +## Testing / Lane / Runtime Impact *(mandatory for runtime behavior changes)* + +- **Test purpose / classification**: Heavy-Governance, Feature, Browser, Support/JUnit, and full-suite classification +- **Validation lane(s)**: raw full suite, fast-feedback, confidence, heavy-governance, browser, profiling/support when needed, junit/report/artifact publication when needed +- **Why this classification and these lanes are sufficient**: the goal is not one feature behavior. The proving purpose is whether the complete platform suite and existing CI lanes produce a trustworthy pass/fail signal after the known stabilization work. +- **New or expanded test families**: none by default. Any new test must be limited to a small CI/lane contract guard if a wrapper/report/artifact regression is proven. 
+- **Fixture / helper cost impact**: no new expensive fixture defaults are allowed. If fixture drift appears in the full suite, classify it by failing family and split to follow-up unless a one-line lane/guard baseline is the direct cause. +- **Heavy-family visibility / justification**: explicit. Heavy-governance and browser lanes are signal inputs, not automatic repair ownership. +- **Special surface test profile**: `global-context-shell`, `standard-native-filament`, `shared-detail-family`, `browser-smoke`, `surface-guard`, `discovery-heavy` +- **Standard-native relief or required special coverage**: no UI coverage expansion; browser lane reruns are used only to classify the existing smoke baseline. +- **Reviewer handoff**: reviewers must confirm that Livewire remains v4.0+, Filament remains v5, provider registration stays in `apps/platform/bootstrap/providers.php`, globally searchable resources are not changed, destructive actions are not changed, no assets are registered, every remaining failure is classified, and any in-scope fix is tied directly to a CI/lane contract defect. +- **Budget / baseline / trend impact**: the classification may update the documented status of budget or trend baseline drift, but it must not silently relax lane budgets or create a new baseline without an explicit row in `failure-classification.md`. +- **Escalation needed**: `document-in-feature` for contained lane baseline findings; `follow-up-spec` for product/runtime failures, fixture-family debt, new heavy cost centers, browser fallout, or any repair that exceeds CI/lane contract correction. 
+- **Active feature PR close-out entry**: `FullSuiteClassification` +- **Planned validation commands**: + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit` + - `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)` + +## User Scenarios & Testing *(mandatory)* + +### User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1) + +As a maintainer, I want the complete platform suite run or explicit fallback lane split classified before any fixes so the project knows whether CI is green, blocked, flaky, or split into follow-up work. + +**Why this priority**: without classification first, Spec `295` would become an uncontrolled full-suite repair pass. 
+ +**Independent Test**: Run the raw full suite or fallback lane split and prove every failing group has exactly one category, one seam, one owner/follow-up decision, and one status row in `failure-classification.md`. + +**Acceptance Scenarios**: + +1. **Given** the repo after Specs `293` and `294`, **When** the raw full suite passes, **Then** `failure-classification.md` records `ci-signal-restored` with the command, date, and pass counts. +2. **Given** the raw full suite fails, **When** the failure groups are reviewed, **Then** each group is classified before any repair is attempted. +3. **Given** a failing group points at `/admin/t/...`, TenantPanelProvider, or legacy tenant route behavior, **When** it is classified, **Then** the remedy must not restore that behavior and must be split or fixed only through current workspace-first truth. + +--- + +### User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1) + +As a maintainer, I want each existing CI lane wrapper, report, artifact, and failure class to produce a trustworthy signal so Gitea CI failures can be interpreted without reading raw terminal output first. + +**Why this priority**: a green or red Pest run is not enough if wrapper, report, artifact, budget, or failure-class summaries are stale. + +**Independent Test**: Run the existing lane wrappers and report commands, then verify each lane either passes with complete artifacts or fails with the correct primary failure class. + +**Acceptance Scenarios**: + +1. **Given** a lane fails because tests fail, **When** its report summary is generated, **Then** the primary failure class is `test-failure` rather than wrapper, artifact, or infrastructure failure. +2. **Given** a lane wrapper or manifest no longer resolves to the intended lane, **When** the lane is classified, **Then** it is marked `ci-wrapper-or-manifest-regression` and may be fixed in `295`. +3. 
**Given** required report artifacts are missing after a lane run, **When** publication is checked, **Then** it is classified as `artifact-publication-regression` and may be fixed in `295`. + +--- + +### User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1) + +As a maintainer, I want remaining product/runtime failures to become explicit follow-up ownership instead of being silently fixed under a CI-baseline spec. + +**Why this priority**: this protects scope discipline and keeps test-governance decisions attributable. + +**Independent Test**: Review every non-CI failure group and prove it either has a targeted follow-up recommendation or is demonstrably flaky/environmental. + +**Acceptance Scenarios**: + +1. **Given** a failing group requires a runtime product fix, **When** classification finishes, **Then** it is marked `follow-up-spec-required` or `product-runtime-or-test-regression` and not repaired under `295` unless the user explicitly starts that implementation scope later. +2. **Given** a failing group belongs to browser-only behavior, **When** classification finishes, **Then** it is marked `browser-lane-regression` with the existing smoke file and follow-up path. +3. **Given** a failing group disappears on rerun or is environment-specific, **When** classification finishes, **Then** it is marked `flaky-or-environment` with rerun evidence instead of treated as restored CI. + +--- + +### User Story 4 - Publish the Final CI Readiness Decision (Priority: P2) + +As a maintainer, I want a final readiness statement that says whether the full suite can be used as a CI baseline now, and what exact follow-up remains if it cannot. + +**Why this priority**: the output must be actionable for future specs and Gitea workflows, not just a local debugging note. 
+ +**Independent Test**: Inspect `failure-classification.md`, lane report outputs, and final validation commands to confirm there are no unclassified failure groups and no hidden scope expansion. + +**Acceptance Scenarios**: + +1. **Given** all raw suite and lane signals pass, **When** close-out is prepared, **Then** the readiness decision is `restored-ci-signal`. +2. **Given** any group remains red, **When** close-out is prepared, **Then** the readiness decision is `classified-follow-up-required` and each group has an owner/follow-up. +3. **Given** a small CI/lane contract fix was applied, **When** final validation runs, **Then** the directly affected lane/report/artifact guard passes and unrelated failures remain classified rather than hidden. + +### Edge Cases + +- The raw full suite times out or produces output too large to classify directly. +- A lane passes tests but fails report or artifact publication. +- A lane fails only because budget/trend baselines drifted, not because tests failed. +- Browser lane failures expose stale screenshots or environment-specific browser state. +- A failure appears to touch Spec `293` or `294` seams but would require reopening retired legacy behavior. +- A failure disappears on rerun, suggesting flaky or environment-only behavior. +- A small lane manifest fix changes which tests run in a lane, which could accidentally widen CI cost. + +## Requirements *(mandatory)* + +**Constitution alignment (required):** This spec introduces no Microsoft Graph calls, no write/change behavior, no long-running application work, and no new `OperationRun`. It must preserve workspace/tenant isolation expectations while classifying test failures. Any failure suggesting isolation, RBAC, or audit regressions must be classified as product/runtime debt and not hidden as a CI wrapper issue. 
+ +**Constitution alignment (PROP-001 / ABSTR-001 / PERSIST-001 / STATE-001 / BLOAT-001):** The only structural addition is one spec-local failure-classification vocabulary and artifact. It solves the current CI readiness problem after two stabilization specs; no runtime persistence, CI framework, test engine, or new lane abstraction is introduced. + +**Constitution alignment (TEST-GOV-001):** Spec `295` must explicitly classify the proving purpose of every lane run, preserve the existing lane family boundaries, keep expensive fixture/context setup opt-in, and end with one review outcome: `keep`, `split`, `document-in-feature`, `follow-up-spec`, or `reject-or-split`. + +### Functional Requirements + +- **FR-295-001**: The implementation MUST run the raw full suite once when feasible using `cd apps/platform && ./vendor/bin/sail artisan test --compact`. +- **FR-295-002**: If the raw full suite is too slow, noisy, or environment-blocked to classify reliably, the implementation MUST run the explicit fallback lane split: `fast-feedback`, `confidence`, `heavy-governance`, and `browser`. +- **FR-295-003**: Every failing group MUST be recorded in `failure-classification.md` with exactly one pinned category, one pinned seam, observed command, candidate owner, fix-in-295 decision, follow-up decision, and status. +- **FR-295-004**: Lane wrapper, report, artifact, budget, and failure-class problems MAY be fixed in `295` only when the failure is clearly isolated to `scripts/platform-test-lane`, `scripts/platform-test-report`, `scripts/platform-test-artifacts`, `TestLaneManifest`, `TestLaneReport`, `TestLaneBudget`, or their guard tests. +- **FR-295-005**: Product/runtime failures MUST NOT be repaired under `295` unless they are also a small, proven CI/lane contract defect; otherwise they must be assigned to a follow-up spec or classified as unrelated existing debt. 
+- **FR-295-006**: Any failure related to Specs `293` or `294` MUST be classified without rewriting those completed specs or restoring legacy behavior. +- **FR-295-007**: The implementation MUST NOT restore TenantPanelProvider, `/admin/t/...`, tenant-scoped provider fallback routes, or other retired cutover behavior. +- **FR-295-008**: The implementation MUST validate existing lane failure classes: `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, and `infrastructure-failure`. +- **FR-295-009**: The implementation MUST produce a final CI readiness decision in `failure-classification.md`: `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`. +- **FR-295-010**: Any new or changed tests MUST be limited to CI/lane contract proof and must use Pest. + +### Non-Functional Requirements + +- **NFR-295-001**: No new runtime persistence, queue, model, service abstraction, provider registry, Filament resource, or browser family is introduced. +- **NFR-295-002**: Test lane classification must follow actual proving purpose, not file location. +- **NFR-295-003**: Existing lane budget and trend baselines must not be relaxed silently. +- **NFR-295-004**: Classification output must be concise enough for future implementers to route work without re-running the entire suite first. +- **NFR-295-005**: The final package must preserve Filament v5 / Livewire v4 compatibility and must not change panel provider registration. + +## Key Entities *(include if feature involves data)* + +- **Failure Group**: one failing test file, failing assertion cluster, wrapper error, artifact error, budget breach, or environment failure sharing one cause and one owner. +- **CI Lane Signal**: the pass/fail/report/artifact/budget outcome for one lane in `TestLaneManifest`. +- **Classification Decision**: the spec-local row assigning one category, seam, owner, fix-in-295 decision, and follow-up path. 
+- **Readiness Decision**: the final status of the full suite and lane baseline after classification.
+
+## Success Criteria *(mandatory)*
+
+- **SC-295-001**: `failure-classification.md` exists and contains the pinned category and seam definitions.
+- **SC-295-002**: Raw full suite output or fallback lane split output is represented by classified groups with no unclassified red group remaining.
+- **SC-295-003**: Existing lane wrappers and report/artifact contracts either pass or have a classified failure class and fix/follow-up decision.
+- **SC-295-004**: No implementation step restores TenantPanelProvider, `/admin/t/...`, or retired tenant-scoped fallback behavior.
+- **SC-295-005**: The final readiness decision is explicit and actionable: `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`.
+- **SC-295-006**: If a product/runtime failure remains, the classification identifies a separate follow-up owner instead of treating the full suite as green.
+
+## Assumptions
+
+- Specs `293` and `294` have completed the targeted stabilization work described by the user and are context only.
+- The repo's existing Gitea-compatible lane system remains the preferred CI shape.
+- Local implementation will use Sail-first commands unless a non-Docker fallback is explicitly needed.
+- Full-suite execution may be expensive; lane split is an allowed fallback only when the raw full suite is not classifiable.
+
+## Risks
+
+- Full-suite output may be too large or slow to classify directly.
+- Environment-specific Sail/browser failures may obscure real suite status.
+- A tempting product fix may be small locally but still outside this CI-baseline scope.
+- Budget/trend drift may be real but not appropriate to fix by silently raising thresholds.
+- Multiple failing groups may share a fixture root cause and need careful grouping to avoid duplicate follow-up specs.
+
+## Open Questions
+
+- None blocking preparation. During implementation, actual failing groups determine whether follow-up specs are needed.
diff --git a/specs/295-full-suite-ci-baseline/tasks.md b/specs/295-full-suite-ci-baseline/tasks.md
new file mode 100644
index 00000000..d6d2a29f
--- /dev/null
+++ b/specs/295-full-suite-ci-baseline/tasks.md
@@ -0,0 +1,173 @@
+# Tasks: Full Suite Failure Classification & CI Lane Baseline
+
+**Input**: Design documents from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/`
+**Prerequisites**: `spec.md`, `plan.md`, `research.md`, `data-model.md`, `quickstart.md`, `failure-classification.md`, `checklists/requirements.md`
+
+**Review Artifact**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md`
+**Failure Inventory**: `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+
+## Review Metadata
+
+- **Review outcome class**: `acceptable-special-case`
+- **Workflow outcome**: `keep`
+- **Test-governance outcome**: `keep`
+- **Stop / split triggers**: broad product/runtime repair, new CI framework, new permanent lane, new browser family, new heavy-governance family, runtime application changes, Filament resource/page changes, route restoration, TenantPanelProvider restoration, `/admin/t/...` restoration, provider/verification runtime expansion, historical-spec rewrite, or budget relaxation without classification evidence
+
+## Pinned Failure-Classification Categories
+
+- `ci-signal-restored`
+- `ci-wrapper-or-manifest-regression`
+- `artifact-publication-regression`
+- `budget-or-trend-baseline-drift`
+- `product-runtime-or-test-regression`
+- `browser-lane-regression`
+- `flaky-or-environment`
+- `follow-up-spec-required`
+- `resolved-or-not-needed`
+
+## Pinned CI / Suite Seams
+
+- `raw-full-suite`
+- `fast-feedback-lane`
+- `confidence-lane`
+- `heavy-governance-lane`
+- `browser-lane`
+- `profiling-or-junit-support`
+- `lane-reporting`
+- `artifact-publication`
+- `budget-trend-baseline`
+- `legacy-cutover-regression-guard`
+- `provider-verification-regression-guard`
+
+## Test Governance Checklist
+
+- [x] Lane assignment is named and is the narrowest sufficient proof for each observed failure group.
+- [x] New or changed tests stay in the smallest honest family, and any heavy-governance or browser addition is explicit.
+- [x] Shared helpers, factories, seeds, fixtures, and context defaults stay cheap by default; any widening is isolated or documented.
+- [x] Planned validation commands cover the change without pulling in unrelated lane cost beyond classification.
+- [x] The declared surface test profile or `standard-native-filament` relief is explicit.
+- [x] Any material budget, baseline, trend, or escalation note is recorded in `failure-classification.md`.
+
+## Phase 1: Setup and Scope Lock
+
+**Purpose**: Confirm Spec `295` remains a classification and CI lane baseline package before any suite command runs.
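
The pinned category and seam lists above lend themselves to a mechanical check. The following is a minimal sketch, not part of the patch: it assumes each row of `failure-classification.md` has been read into a dictionary with hypothetical `category` and `seam` keys, and the function name `row_problems` is invented for illustration. Only the pinned values themselves come from this task file.

```python
# Pinned values copied from the lists in this task file.
PINNED_CATEGORIES = {
    "ci-signal-restored",
    "ci-wrapper-or-manifest-regression",
    "artifact-publication-regression",
    "budget-or-trend-baseline-drift",
    "product-runtime-or-test-regression",
    "browser-lane-regression",
    "flaky-or-environment",
    "follow-up-spec-required",
    "resolved-or-not-needed",
}

PINNED_SEAMS = {
    "raw-full-suite",
    "fast-feedback-lane",
    "confidence-lane",
    "heavy-governance-lane",
    "browser-lane",
    "profiling-or-junit-support",
    "lane-reporting",
    "artifact-publication",
    "budget-trend-baseline",
    "legacy-cutover-regression-guard",
    "provider-verification-regression-guard",
}


def row_problems(row: dict) -> list[str]:
    """Return the problems found in one classification row.

    The row shape (keys "category" and "seam") is an assumption made
    for this sketch; the real inventory is a Markdown file.
    """
    problems = []
    if row.get("category") not in PINNED_CATEGORIES:
        problems.append(f"unpinned category: {row.get('category')!r}")
    if row.get("seam") not in PINNED_SEAMS:
        problems.append(f"unpinned seam: {row.get('seam')!r}")
    return problems
```

A row passes when `row_problems` returns an empty list. Because each key holds a single value, a row that passes carries exactly one pinned category and one pinned seam.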
+
+- [x] T001 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/spec.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/plan.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/research.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/data-model.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/quickstart.md`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/checklists/requirements.md` before changing runtime or tests
+- [x] T002 [P] Confirm current branch, working tree, and baseline diff using `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git status --short --branch` and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && git diff --stat`, then record any pre-existing changes in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T003 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/293-post-cutover-suite-stabilization/failure-classification.md` and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/294-provider-verification-runtime-semantics/failure-classification.md` as context only, confirming no task edits are made to Specs `293` or `294`
+- [x] T004 [P] Inspect `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-report`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, and `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` to confirm current lane entry points and failure classes
+- [x] T005 Confirm the explicit forbidden scope in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`: no TenantPanelProvider restoration, no `/admin/t/...` restoration, no broad product repair, and no historical-spec rewrite
+
+---
+
+## Phase 2: User Story 1 - Classify the Full Suite Before Any Repair (Priority: P1)
+
+**Goal**: Establish the raw full-suite readiness signal or an explicit fallback split before any fix work begins.
+
+**Independent Test**: the raw full-suite result or fallback lane split is represented by classified rows in `failure-classification.md`, with no red group left unclassified.
+
+- [x] T006 [US1] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail artisan test --compact)` and record pass/fail counts, failing files, and any timeout/noisy-output reason in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T007 [US1] If T006 cannot produce a classifiable result, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane fast-feedback`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane confidence`, `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane heavy-governance`, and `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane browser`, then record each lane outcome in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T008 [US1] Group every failing test file, assertion cluster, wrapper error, report error, artifact error, budget breach, or environment issue into one row in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with exactly one pinned category and one pinned seam
+- [x] T009 [US1] Classify any legacy route or panel-related group under `legacy-cutover-regression-guard` without restoring `/admin/t/...`, TenantPanelProvider, tenant-scoped provider fallback routes, or historical compatibility behavior
+- [x] T010 [US1] Classify any provider/verification group under `provider-verification-regression-guard` without rewriting Spec `294`; only mark it in-scope if the failure is a direct CI/lane contract defect rather than provider runtime behavior
+
+---
+
+## Phase 3: User Story 2 - Validate CI Lane and Artifact Signal (Priority: P1)
+
+**Goal**: Prove existing CI wrappers, reports, artifacts, budgets, and failure classes are interpretable after the suite run.
+
+**Independent Test**: every lane either passes with complete report/artifact output or fails with the correct primary failure class.
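
The Independent Test for this phase can be read as a simple predicate over each lane outcome. The sketch below is illustrative only and is not taken from the repository: the function `lane_is_interpretable` and its parameters are hypothetical, and it only checks that a failing lane reports one of the failure-class names this task file attributes to `TestLaneReport.php`. Whether that class is the *correct* one for the observed failure still needs human review.

```python
# Failure-class names as listed for TestLaneReport.php in this task file.
KNOWN_FAILURE_CLASSES = {
    "test-failure",
    "wrapper-failure",
    "budget-breach",
    "artifact-publication-failure",
    "infrastructure-failure",
}


def lane_is_interpretable(passed, artifacts_complete, failure_class=None):
    """Hypothetical check mirroring the Phase 3 Independent Test.

    A passing lane needs complete report/artifact output and no failure
    class; a failing lane needs a recognised primary failure class.
    """
    if passed:
        return artifacts_complete and failure_class is None
    return failure_class in KNOWN_FAILURE_CLASSES
```

Under these assumptions, a green lane with missing artifacts is not interpretable, and neither is a red lane without a recognised failure class; both would need their own row in the failure inventory.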
+
+- [x] T011 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report fast-feedback` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T012 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report confidence` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T013 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report heavy-governance` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T014 [US2] Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-report browser` and classify report, budget, trend, and artifact status in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T015 [P] [US2] If machine-readable confidence output is needed for follow-up ownership, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-lane junit` and classify the JUnit support result in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (not run separately because the lane wrappers produced the needed JUnit artifacts)
+- [x] T016 [P] [US2] If artifact publication is suspected, run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && ./scripts/platform-test-artifacts fast-feedback /tmp/tenantpilot-fast-feedback-artifacts` or the matching affected lane and classify any missing required artifacts under `artifact-publication-regression`
+- [x] T017 [US2] Verify existing failure classes from `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php` classify lane outcomes as `test-failure`, `wrapper-failure`, `budget-breach`, `artifact-publication-failure`, or `infrastructure-failure`, and record mismatches in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+
+---
+
+## Phase 4: User Story 3 - Split Product Failures Instead of Absorbing Them (Priority: P1)
+
+**Goal**: Keep Spec `295` limited to CI signal readiness by splitting product/runtime failures into explicit follow-up ownership.
+
+**Independent Test**: every non-CI failure group has a follow-up recommendation, owner, or environment disposition.
+
+- [x] T018 [US3] For each row classified as `product-runtime-or-test-regression`, decide whether it is a follow-up spec, lane-specific debt, or active feature blocker, then record the decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T019 [US3] For each row classified as `browser-lane-regression`, record the affected browser file under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Browser/`, whether the failure is smoke/environment/product behavior, and the follow-up path in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md`
+- [x] T020 [US3] For each row classified as `flaky-or-environment`, rerun the narrowest affected command once when safe and record the rerun evidence or environment blocker in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` (no flaky/environment row was identified)
+- [x] T021 [US3] Confirm no failure group is being fixed under `295` solely because it is small or nearby; it must be directly tied to CI wrapper, manifest, report, artifact, or budget/trend contract drift
+
+---
+
+## Phase 5: User Story 4 - Apply Only Small CI-Signal Fixes (Priority: P2)
+
+**Goal**: Correct narrow CI/lane contract defects only when classification proves they block a trustworthy CI signal.
+
+**Independent Test**: the directly affected lane/report/artifact guard passes after the minimal fix, and unrelated red groups remain classified.
+
+- [x] T022 [US4] If a `ci-wrapper-or-manifest-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-lane`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/composer.json`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` (not applicable: no `ci-wrapper-or-manifest-regression` row was proven)
+- [x] T023 [US4] If an `artifact-publication-regression` row is proven, apply the minimal correction in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/scripts/platform-test-artifacts`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneReport.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected artifact guard test
+- [x] T024 [US4] If a `budget-or-trend-baseline-drift` row is proven, update only the documented budget/trend baseline owner in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneBudget.php`, `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Support/TestLaneManifest.php`, or the directly affected guard test when the classification row explains why the evidence supports the change (not applicable: no budget/trend baseline rewrite was justified)
+- [x] T025 [US4] Add or adjust Pest coverage only when a CI/lane contract defect was fixed, keeping tests under `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Feature/Guards/` or `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/tests/Unit/Support/` and avoiding new browser/heavy families by default
+- [x] T026 [US4] Re-run the narrowest affected lane/report/artifact command after any CI/lane fix and update `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` with the final status
+
+---
+
+## Phase 6: Final Readiness Decision and Validation
+
+**Purpose**: Publish one final CI readiness decision and prove no unclassified failure or hidden scope expansion remains.
+
+- [x] T027 Review `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` and confirm every row has category, seam, observed command, candidate owner, fix-in-295 decision, follow-up, and status
+- [x] T028 Set the final readiness decision in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/specs/295-full-suite-ci-baseline/failure-classification.md` to exactly one of `restored-ci-signal`, `classified-follow-up-required`, or `blocked-by-environment`
+- [x] T029 Re-run the final narrowest proof command set for the decision: raw full suite if classifiable, otherwise the exact affected lane/report commands from Phases 2 through 5
+- [x] T030 Run `export PATH="/bin:/usr/bin:/usr/local/bin:$PATH" && (cd apps/platform && ./vendor/bin/sail bin pint --dirty --format agent)` if any PHP or script-adjacent PHP files changed
+- [x] T031 Confirm Filament remains v5 on Livewire v4, provider registration remains in `/Users/ahmeddarrazi/Documents/projects/wt-plattform/apps/platform/bootstrap/providers.php`, no globally searchable resource changed, no destructive action changed, no asset registration changed, no `/admin/t/...` route or TenantPanelProvider behavior was restored, and no Specs `293` or `294` artifact was rewritten
+
+## Dependencies & Execution Order
+
+- **Phase 1** must complete before any suite command.
+- **Phase 2** must classify raw suite or fallback lane output before any fix work.
+- **Phase 3** depends on Phase 2 because lane reports must be interpreted against observed lane outcomes.
+- **Phase 4** depends on the failure group inventory from Phases 2 and 3.
+- **Phase 5** depends on classified CI/lane contract defects; skip it entirely if no in-scope CI/lane defect is proven.
+- **Phase 6** depends on all classification and any bounded fixes.
+
+## Parallel Execution Examples
+
+- T003 and T004 can run in parallel after T001.
+- T011 through T014 can run independently after their corresponding lane outputs exist.
+- T018 through T020 can be split by failure group once T008 has created the grouped inventory.
+- T022 through T024 must not run until a corresponding classification row proves the in-scope defect.
+
+## Implementation Strategy
+
+### Suggested MVP Scope
+
+MVP = Phases 1 through 4. That is enough to answer whether the suite is green or which follow-up owns each red group. Phase 5 runs only when classification proves a narrow CI/lane contract defect.
+
+### Incremental Delivery
+
+1. Lock scope and read prior stabilization artifacts.
+2. Run raw full suite or fallback lane split.
+3. Classify every red group.
+4. Validate lane/report/artifact signal.
+5. Split product/runtime failures to follow-up ownership.
+6. Apply only proven CI/lane fixes.
+7. Publish the final readiness decision.
+
+## Explicit Follow-Ups / Out of Scope
+
+- Product/runtime failing-test repair outside CI/lane contract defects
+- Browser UI repair
+- Package Execution
+- Guided Operations
+- Microsoft Starter Pack
+- Virtual Consultant
+- Tenant cutover rework
+- Provider/verification runtime expansion beyond Spec `294`
+- New permanent CI lane or framework
+- Historical-spec cleanup
-- 
2.45.2