TenantAtlas/specs/391-operations-hub-stability-debug-safe-runtime/tasks.md
ahmido 40b866604a feat: add operations hub stability and safety runtime checks (#462)
Automated PR created by Codex via Gitea API.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #462
2026-06-20 14:16:20 +00:00

Tasks: Spec 391 - Operations Hub Stability and Debug-Safe Runtime

Input: Design documents from /specs/391-operations-hub-stability-debug-safe-runtime/
Prerequisites: plan.md, spec.md
Tests: Required. Use Pest 4 feature/Livewire/browser coverage. No seeders, provider syncs, restore execution, exports, deletes, archives, force-deletes, notifications, or customer-facing delivery actions.

Test Governance Checklist

  • Lane assignment is named and is the narrowest sufficient proof for the changed behavior.
  • New or changed tests stay in the smallest honest family, and the browser addition is explicit.
  • Shared helpers, factories, seeds, fixtures, and context defaults stay cheap by default; any widening is isolated or documented.
  • Planned validation commands cover the change without pulling in unrelated lane cost.
  • The declared surface test profile (monitoring-state-page plus global-context-shell) is explicit.
  • Any material budget, baseline, trend, or escalation note is recorded in the active spec or PR.

Phase 1: Setup and Safety Boundary

  • T001 Record initial git status --short, current branch, and latest commit in specs/391-operations-hub-stability-debug-safe-runtime/artifacts/verification.md.
  • T002 Re-read specs/391-operations-hub-stability-debug-safe-runtime/spec.md, plan.md, tasks.md, specs/browser-productization-bug-audit/browser-bug-report.md, and completed context-only Specs 328, 361, 362, 364, 367, and 377 before editing runtime code.
  • T003 Confirm the implementation scope excludes Evidence, Provider, Review Pack, Restore, dashboard semantics, provider mutations, restore jobs, exports, deletes, archives, force-deletes, notifications, customer-facing delivery actions, migrations, seeders, and max_execution_time changes.
  • T004 Confirm Filament v5 / Livewire v4.0+ compliance and no Livewire v3/Filament legacy API use in touched code.
  • T005 Confirm panel provider registration remains in apps/platform/bootstrap/providers.php and no panel provider path changes are required.
  • T006 Confirm OperationRunResource remains non-globally-searchable, or update this spec before changing global-search posture.
  • T007 Confirm no new persisted entity, migration, enum/status family, operation type, summary-count key, or domain abstraction is needed; if one appears necessary, stop and update spec.md and plan.md first.

Phase 2: Reproduce and Locate Root Cause

  • T008 Reproduce or confirm BUG-001 with a browser/Playwright session or a targeted route request for /admin/workspaces/3/operations?environment_id=4, recording HTTP status, elapsed time, and visible/debug output in artifacts/verification.md.
  • T009 Inspect the latest Laravel error/log context for the audited max-execution failure without mutating data; record whether HasAttributes.php:1577 still appears.
  • T010 Inspect apps/platform/app/Filament/Pages/Monitoring/Operations.php render methods, especially decisionWorkbench(), selectedWorkbenchOperation(), topOperationFromQuery(), summaryCount(), table(), scopedSummaryQuery(), filter handling, and environment entitlement helpers.
  • T011 Inspect apps/platform/app/Filament/Resources/OperationRunResource.php table columns, filters, actions, URL builders, status/outcome descriptions, target-scope helpers, and any helpers used per visible row.
  • T012 Inspect apps/platform/app/Models/OperationRun.php accessors/casts used by the list and workbench, including context, failure_summary, summary_counts, problemClass(), freshnessState(), requiresOperatorReview(), and actionability-related helpers.
  • T013 Identify whether the render cost comes from unbounded row hydration, query option scans, relationship N+1, JSON casts/accessors, PHP sorting over hydrated rows, actionability/freshness evaluation, or table column/action helper work; record the confirmed root cause in artifacts/verification.md.

Phase 3: Automated Regression Tests First

  • T014 Add apps/platform/tests/Feature/Monitoring/Spec391OperationsHubRendersWithEnvironmentFilterTest.php proving an authenticated admin can open the Operations route with an entitled environment filter, receives a successful response, sees Operations title/context/table or empty state, and does not see Laravel debug-page, stack-trace, or Maximum execution time text.
  • T015 Add a test in the same feature file proving the environment filter remains scoped: rows, counts, and filter context for another environment or workspace do not appear, and non-entitled environment filters fail closed according to the existing 404/filter-discard contract.
  • T016 Add a test proving dashboard/workspace links that target Operations with environment_id produce the canonical Operations URL and the target route renders.
  • T017 Add apps/platform/tests/Feature/Monitoring/Spec391OperationRunResourceIndexPerformanceTest.php with more operation runs than a table page and large context/failure_summary payloads, asserting the index remains bounded and does not require unbounded rows to render.
  • T018 Add or extend a no-Graph render guard proving Operations index/workbench rendering never invokes GraphClientInterface or provider clients.
  • T019 Add a focused empty-state test proving that an entitled environment with no runs displays controlled copy and makes no false health claim.
  • T020 Add a loading-state/context test where feasible, or a browser assertion, proving the Operations route preserves the active workspace/environment filter and does not flash raw framework/debug output while loading.
  • T021 Add a safe detail-link test proving at least one authorized row still opens the tenantless OperationRun detail route.
  • T022 If a smoke/runtime helper is introduced, add a Unit or Feature test proving it is opt-in and does not disable Debugbar/Vite behavior for normal local requests.
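
The scoping behavior that T015 asks the tests to prove can be illustrated with a small language-neutral sketch. It is written in Python only for brevity; the real tests are Pest tests against the Laravel route. Everything named here (`Run`, `EnvironmentNotFound`, `resolve_environment_filter`, `visible_runs`) is hypothetical, and the sketch assumes the 404 variant of the existing 404/filter-discard contract, which the implementation must confirm.

```python
from dataclasses import dataclass


class EnvironmentNotFound(Exception):
    """Stands in for an HTTP 404 response in this sketch."""


@dataclass(frozen=True)
class Run:
    id: int
    workspace_id: int
    environment_id: int


def resolve_environment_filter(requested_id, entitled_ids):
    """Return the environment id to filter on, or fail closed."""
    if requested_id is None:
        return None  # no filter requested; workspace scope still applies
    if requested_id not in entitled_ids:
        # Assumption: non-entitled filters map to a 404 rather than a silent discard.
        raise EnvironmentNotFound(requested_id)
    return requested_id


def visible_runs(runs, workspace_id, entitled_ids, requested_id=None):
    """Apply workspace and entitlement scope first, then the optional filter."""
    env_id = resolve_environment_filter(requested_id, entitled_ids)
    scoped = [
        run for run in runs
        if run.workspace_id == workspace_id and run.environment_id in entitled_ids
    ]
    if env_id is not None:
        scoped = [run for run in scoped if run.environment_id == env_id]
    return scoped
```

A test in this style would assert three things: the filtered list contains only rows for the requested environment, rows from other workspaces never appear, and a non-entitled `environment_id` raises instead of widening the result.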

Phase 4: Browser/Productization Smoke Tests

  • T023 Add apps/platform/tests/Browser/Spec391OperationsHubProductizationSmokeTest.php using existing browser smoke-login/auth fixture patterns where possible.
  • T024 Make the browser test discover or create a safe workspace/environment fixture instead of hardcoding ids, unless the audited workspace 3/environment 4 fixture is explicitly present and safe to use.
  • T025 Browser-smoke the authenticated route /admin/workspaces/{workspace}/operations?environment_id={environment} and assert the page renders successfully with Operations/Operations Hub, active environment context, and a bounded table or controlled empty state.
  • T026 Add a browser render-time guard targeting under 3 seconds after authentication for the audited local data shape; if that proves too flaky for CI, record the browser timing and rely on a deterministic lower-level render/query guard.
  • T027 Add browser assertions that no visible Laravel debug page, stack trace, Maximum execution time, _debugbar, phpstorm://open, raw source links, or debug exception text is visible in productization-smoke mode.
  • T028 Add browser console assertions that fail on missing Filament/Livewire/Alpine runtime globals needed by the route, including filamentSchema is not defined, filamentSchemaComponent is not defined, filamentTable is not defined, and selectFormComponent is not defined.
  • T029 Add browser network/console assertions that fail on Vite dev-client connection failures for http://localhost:5173/@vite/client when running in productization-smoke mode.
  • T030 Add browser network assertions that fail on Operations HTTP 500s and _debugbar requests in productization-smoke mode.
  • T031 Capture the final screenshot under specs/391-operations-hub-stability-debug-safe-runtime/artifacts/screenshots/ or record why screenshot capture is unavailable.
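
The leakage checks in T027 through T030 amount to matching a fixed set of signatures against visible text, console messages, and network events. The sketch below shows that matching logic in isolation. It is a language-neutral illustration in Python, not the Pest browser test itself. The function names and the `(url, status)` tuple shape are assumptions, and only the literal strings listed in the tasks are used. Non-literal items such as "stack trace" or "debug exception text" would need their own project-specific markers.

```python
# Literal signatures copied from T027-T030; anything else is deliberately ignored.
VISIBLE_TEXT_SIGNATURES = ("Maximum execution time", "_debugbar", "phpstorm://open")
MISSING_GLOBALS = (
    "filamentSchema",
    "filamentSchemaComponent",
    "filamentTable",
    "selectFormComponent",
)
CONSOLE_SIGNATURES = tuple(f"{name} is not defined" for name in MISSING_GLOBALS)
VITE_CLIENT_URL = "http://localhost:5173/@vite/client"


def text_leaks(page_text):
    """Return the visible-text signatures found in the rendered page text."""
    return [sig for sig in VISIBLE_TEXT_SIGNATURES if sig in page_text]


def console_leaks(messages):
    """Return only the console messages that match a known signature."""
    return [
        msg for msg in messages
        if VITE_CLIENT_URL in msg or any(sig in msg for sig in CONSOLE_SIGNATURES)
    ]


def network_leaks(requests):
    """Classify (url, status) pairs; status is None when the request never completed."""
    leaks = []
    for url, status in requests:
        if "_debugbar" in url:
            leaks.append(("debugbar-request", url))
        elif url.startswith(VITE_CLIENT_URL) and status is None:
            leaks.append(("vite-client-failed", url))
        elif "/operations" in url and status == 500:
            leaks.append(("operations-http-500", url))
    return leaks
```

Because each function returns only matches against the explicit list, unrelated local warnings pass through untouched, and a test can fail simply when any returned list is non-empty.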

Phase 5: Operations Render-Path Stabilization

  • T032 Update apps/platform/app/Filament/Pages/Monitoring/Operations.php so workspace and environment entitlement filters apply at the query level before list rows, summary counts, selected workbench operation, and filter state render.
  • T033 Keep the Operations table paginated with TablePaginationProfiles::resource() or a narrower documented equivalent.
  • T034 Bound selectedWorkbenchOperation() / topOperationFromQuery() so it does not hydrate unbounded rows or sort expensive accessor-derived state across large result sets.
  • T035 Replace or defer expensive per-row work in OperationRunResource::table() columns/actions; keep default list columns useful without parsing raw context/failure payloads for every visible row.
  • T036 Restrict eager loading to relationships actually rendered on the index (tenant, user, or narrower selected columns) and avoid N+1 relationship traversal for status/scope/next-action display.
  • T037 Avoid default index hydration/presentation of large JSON payloads (context, failure_summary, summary_counts) unless a visible column truly needs them; move heavy diagnostics to detail/collapsed support paths.
  • T038 Scope and bound filter option queries for type and initiator so they do not scan unrelated workspaces or unbounded historical rows during normal index render.
  • T039 Preserve existing OperationRun status/outcome/actionability semantics; do not change lifecycle truth to make the list faster.
  • T040 Preserve existing canonical detail/view links through OperationRunLinks and tenantless OperationRun viewer routes.
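
The stabilization tasks above share one idea: push scope, ordering, and limits into the query, and leave heavy JSON columns out of the index read. The following sketch demonstrates that idea with Python's built-in `sqlite3` module rather than Eloquent. The table layout, column names, status values, page size, and the "failed first" ordering are all assumptions made for illustration. The real ordering rule must stay whatever the existing workbench semantics define.

```python
import sqlite3

PAGE_SIZE = 25  # assumption; the real value comes from the pagination profile

# Narrow column list: context and failure_summary are deliberately not selected.
INDEX_COLUMNS = "id, type, status, outcome, created_at"


def create_schema(conn):
    """Create a hypothetical operation_runs table for the demonstration."""
    conn.execute(
        "CREATE TABLE operation_runs ("
        "id INTEGER PRIMARY KEY, workspace_id INTEGER, environment_id INTEGER, "
        "type TEXT, status TEXT, outcome TEXT, created_at INTEGER, "
        "context TEXT, failure_summary TEXT)"
    )


def index_page(conn, workspace_id, environment_id, page=1, page_size=PAGE_SIZE):
    """Fetch one bounded page; scope is applied in SQL before any row is read."""
    offset = (page - 1) * page_size
    return conn.execute(
        f"SELECT {INDEX_COLUMNS} FROM operation_runs "
        "WHERE workspace_id = ? AND environment_id = ? "
        "ORDER BY created_at DESC, id DESC LIMIT ? OFFSET ?",
        (workspace_id, environment_id, page_size, offset),
    ).fetchall()


def top_operation(conn, workspace_id, environment_id):
    """Pick the workbench candidate in SQL instead of sorting hydrated rows."""
    return conn.execute(
        f"SELECT {INDEX_COLUMNS} FROM operation_runs "
        "WHERE workspace_id = ? AND environment_id = ? "
        # Hypothetical priority: failed runs first, then newest.
        "ORDER BY CASE status WHEN 'failed' THEN 0 ELSE 1 END, "
        "created_at DESC, id DESC LIMIT 1",
        (workspace_id, environment_id),
    ).fetchone()
```

With this shape, the cost of rendering depends on the page size and not on the total number of historical runs, and large `context` or `failure_summary` payloads are read only when a detail view asks for them.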

Phase 6: Controlled States and Runtime Smoke Mode

  • T041 Ensure the Operations empty state is specific to the active workspace/environment scope, customer-ready, and avoids false health claims.
  • T042 Ensure loading behavior preserves the active workspace/environment filter and does not expose framework/debug output.
  • T043 Add a controlled display-only error/notice state only if implementation proves one is appropriate; do not use a catch-all to hide the expensive path or raw exceptions.
  • T044 Reuse App\Http\Middleware\SuppressDebugbarForSmokeRequests for smoke-cookie/session suppression where possible.
  • T045 Reuse or extend App\Support\Filament\PanelThemeAsset behavior so productization-smoke mode can run without requiring the Vite dev client when built assets are available.
  • T046 If a new env/config flag is required, name it narrowly for productization/browser smoke, document it in this spec's verification artifact, and ensure normal local developer Debugbar/Vite workflow remains unchanged.
  • T047 Ensure productization-smoke assertions do not fail on arbitrary local warnings; fail only on the explicit runtime/debug leakage signatures from this spec.
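
The opt-in requirement in T044 through T046 can be pictured as a small decision function: smoke behavior switches on only when it is explicitly requested, and normal local requests keep Debugbar and the Vite dev client. The sketch below is a language-neutral illustration in Python. The flag name `PRODUCTIZATION_SMOKE`, the cookie name `productization_smoke`, and the rule that both must be present are assumptions, not the behavior of the existing middleware or of `PanelThemeAsset`.

```python
SMOKE_FLAG = "PRODUCTIZATION_SMOKE"    # hypothetical env/config flag name
SMOKE_COOKIE = "productization_smoke"  # hypothetical smoke cookie name


def smoke_mode_active(env, cookies):
    """True only when the flag is enabled AND the request carries the smoke cookie."""
    flag_on = env.get(SMOKE_FLAG, "").strip().lower() in {"1", "true"}
    return flag_on and cookies.get(SMOKE_COOKIE) == "1"


def runtime_features(env, cookies, built_assets_available):
    """Decide per request what smoke mode changes; the default changes nothing."""
    smoke = smoke_mode_active(env, cookies)
    return {
        "suppress_debugbar": smoke,
        # Built assets replace the Vite dev client only if they actually exist.
        "use_built_assets": smoke and built_assets_available,
    }
```

A unit test for T022 could then assert that an empty environment and cookie jar yield no suppression at all, which is the property that protects the normal developer workflow.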

Phase 7: Validation and Formatting

  • T048 Run targeted feature tests for Spec 391 render/scoping/bounded behavior.
  • T049 Run targeted browser smoke for Spec 391.
  • T050 Run targeted formatting for touched PHP files with php vendor/bin/pint --test <touched php files> or the project-equivalent narrow formatting command.
  • T051 Run git diff --check from the repository root.
  • T052 Open the Operations route in the browser after implementation and record route, HTTP status, render time, page title/header, table/empty state, workspace/environment context, console errors, network errors, absence of debug page, and absence of Debugbar/source-link leakage in artifacts/verification.md.
  • T053 Confirm in artifacts/verification.md that no provider mutations, restore jobs, exports, deletes, archives, force-deletes, notifications, customer-facing delivery actions, migrations, seeders, or destructive commands were executed.
  • T054 Record final git status --short, intentionally changed files, pre-existing unrelated dirty files if any, and known limitations in artifacts/verification.md.

Non-Tasks / Guardrails

  • NT001 Do not increase PHP max_execution_time.
  • NT002 Do not hide or remove the Operations route or links.
  • NT003 Do not mask the error with a generic catch-all while leaving the expensive render path intact.
  • NT004 Do not change Evidence, Provider, Review Pack, Restore, dashboard, or customer-facing artifact semantics.
  • NT005 Do not run provider syncs, provider mutations, restore jobs, exports, deletes, archives, force-deletes, seeders, or destructive commands.
  • NT006 Do not add migrations unless spec/plan are updated first with proof.
  • NT007 Do not add new OperationRun types, statuses, outcomes, summary-count keys, lifecycle semantics, or unscoped caching.
  • NT008 Do not rewrite or normalize completed Operations/productization specs.