ahmido 40b866604a feat: add operations hub stability and safety runtime checks (#462 )

Automated PR created by Codex via Gitea API.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #462

2026-06-20 14:16:20 +00:00

Feature Specification: Spec 391 - Operations Hub Stability and Debug-Safe Runtime

Feature Branch: 391-operations-hub-stability-debug-safe-runtime
Created: 2026-06-20
Status: Draft
Input: User-provided Spec 391 prompt plus browser productization audit BUG-001 and BUG-009.

Problem

The Operations hub times out under environment filtering and exposes debug/runtime leakage during productization audit.

Goals

Stable Operations render for the audited environment-filtered route.
Scoped environment filtering with bounded, paginated index rendering.
Controlled empty, error, and loading states.
Browser-smoke guard for debug/runtime leakage.
No destructive, provider-mutating, restore, export, or customer-delivery side effects.

Non-Goals

Evidence/provider/review-pack semantics.
Restore workflow redesign.
System login branding.
Broad app-wide UI overhaul.
Broad productization infrastructure redesign beyond the narrow smoke path required for this regression.

Spec Candidate Check (mandatory - SPEC-GATE-001)

Problem: The admin Operations hub can time out under an environment filter and expose a raw Laravel debug page, while productization browser audits are polluted by Debugbar/source links, missing Filament globals, and Vite dev-client failures.
Today's failure: /admin/workspaces/3/operations?environment_id=4 was observed taking roughly 40 seconds, returning HTTP 500, and showing a debug page with Maximum execution time of 30 seconds exceeded; browser logs also showed missing Filament/Alpine globals and debug/source-link leakage.
User-visible improvement: Operators can open the Operations hub from dashboard/workspace drilldowns and receive a bounded, scoped, customer-ready operations list or controlled empty/error state without debug/runtime pollution.
Smallest enterprise-capable version: Stabilize only the existing admin Operations hub and productization-smoke runtime checks needed to catch the audited regression; keep evidence, provider readiness, review pack, restore, and dashboard semantics out of scope.
Explicit non-goals: No Evidence anchor changes, provider permission/readiness semantics, review-pack download gating, Customer Review Workspace labeling, system login branding, restore readiness redesign, broad UI redesign, production infrastructure overhaul, provider mutations, restore jobs, exports, deletes, archives, or notifications.
Permanent complexity imported: No new persisted entity, table, enum/status family, domain abstraction, taxonomy, or operation lifecycle truth is intended. Some focused tests/browser-smoke helpers may be added if existing smoke controls are insufficient.
Why now: The route is a common drilldown from operations/workspace surfaces and currently blocks productization browser validation with a P1 500/timeout.
Why not local: The fix should stay local to Operations render/query/runtime-smoke paths, but it still requires a spec because the route is a strategic operator surface, uses OperationRun execution truth, and adds explicit Browser lane guardrails.
Approval class: Core Enterprise.
Red flags triggered: None requiring defense. Browser-smoke guardrails are bounded to productization validation and do not create a general UI/runtime framework.
Score: Value: 2 | Urgency: 2 | Scope: 2 | Complexity: 2 | Product proximity: 2 | Reusability: 1 | Total: 11/12
Decision: approve.

Spec Scope Fields (mandatory)

Scope: canonical-view.
Primary Routes: /admin/workspaces/{workspace}/operations, including ?environment_id={managedEnvironment}.
Data Ownership: operation_runs are tenant-owned execution records with workspace_id and nullable managed_environment_id; the Operations hub is a workspace-context canonical view and must enforce workspace and environment entitlement before revealing rows.
RBAC: Admin plane only. A workspace member may view the workspace Operations route; environment-filtered data must be limited to environments the actor is entitled to view. Non-member or non-entitled workspace/environment access remains deny-as-not-found (404). Member-without-capability semantics for any existing detail/action links remain unchanged.

For canonical-view specs:

Default filter behavior when tenant-context is active: environment_id is an explicit URL/table filter owned by the Operations page; it must not rely on hidden global environment context, legacy aliases, or remembered tenant state.
Explicit entitlement checks preventing cross-tenant leakage: The query must constrain by current workspace and permitted managed environment ids before rendering rows, summary counts, filter options, or drilldown links.
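
The ordering described above (entitlement scope first, explicit filter second, and everything visible derived from the scoped set) can be sketched in a language-neutral way. This is a minimal illustration in Python, not the project's query code: `Run`, `entitled_runs`, and `operations_view` are hypothetical names, and the treatment of runs with a null `managed_environment_id` is an assumption.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass(frozen=True)
class Run:
    """Hypothetical stand-in for one operation_runs row."""
    id: int
    workspace_id: int
    managed_environment_id: Optional[int]
    status: str


def entitled_runs(runs: List[Run], workspace_id: int, permitted_env_ids: Set[int]) -> List[Run]:
    """Constrain to the current workspace and permitted environments first."""
    return [
        r for r in runs
        if r.workspace_id == workspace_id
        # Assumption: runs without an environment are workspace-level rows
        # that any workspace member may see.
        and (r.managed_environment_id is None or r.managed_environment_id in permitted_env_ids)
    ]


def operations_view(
    runs: List[Run],
    workspace_id: int,
    permitted_env_ids: Set[int],
    environment_id: Optional[int] = None,
) -> Dict[str, object]:
    """Derive rows, counts, and filter options from the scoped set only."""
    scoped = entitled_runs(runs, workspace_id, permitted_env_ids)
    # Filter options come from the entitled set, never from the raw table.
    options = sorted(
        {r.managed_environment_id for r in scoped if r.managed_environment_id is not None}
    )
    if environment_id is None:
        rows = scoped
    else:
        rows = [r for r in scoped if r.managed_environment_id == environment_id]
    counts: Dict[str, int] = {}
    for r in rows:
        counts[r.status] = counts.get(r.status, 0) + 1
    return {"rows": rows, "counts": counts, "filter_options": options}
```

Because counts and options are computed after the entitlement constraint, a filter value for a non-permitted environment yields an empty result rather than revealing rows or options from outside the actor's scope.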

UI Surface Impact (mandatory - UI-COV-001)

Does this spec add, remove, rename, or materially change any reachable UI surface?

No UI surface impact
Existing page changed
New page/route added
Navigation changed
Filament panel/provider surface changed
New modal/drawer/wizard/action added
New table/form/state added
Customer-facing surface changed
Dangerous action changed
Status/evidence/review presentation changed
Workspace/environment context presentation changed

UI/Productization Coverage (mandatory when UI Surface Impact is not "No UI surface impact"; otherwise write `N/A - no reachable UI surface impact` plus rationale)

Route/page/surface: Admin Operations hub, App\Filament\Pages\Monitoring\Operations, backed by App\Filament\Resources\OperationRunResource.
Current or new page archetype: Existing Operations Hub strategic surface, UI-016.
Design depth: Strategic Surface, but this spec is a stability/runtime guardrail pass rather than a visual redesign.
Repo-truth level: repo-verified for route, page class, resource, OperationRun model, existing browser tests, and audit evidence.
Existing pattern reused: Existing Spec 328 Operations Hub workbench, OperationRun monitoring/detail family, OperationRunLinks, OperationUxPresenter, BadgeCatalog / BadgeRenderer, TablePaginationProfiles, SuppressDebugbarForSmokeRequests, and PanelThemeAsset patterns.
New pattern required: none expected; add only narrow productization-smoke assertions/helpers if existing smoke controls cannot express BUG-009 checks.
Screenshot required: yes for final browser smoke if implementation changes visible Operations states; store under specs/391-operations-hub-stability-debug-safe-runtime/artifacts/screenshots/.
Page audit required: no new full page audit by default; this is a regression-stability pass over an existing audited strategic surface. Escalate only if implementation materially changes the page archetype.
Customer-safe review required: no, this route is admin/operator-facing. It still must avoid raw debug pages, stack traces, raw provider secrets, and customer-facing artifact leakage in productization-smoke mode.
Dangerous-action review required: no new dangerous actions. Existing detail/actions must retain existing authorization, confirmation, and audit behavior.
Coverage files updated or explicitly not needed:
- docs/ui-ux-enterprise-audit/route-inventory.md
- docs/ui-ux-enterprise-audit/design-coverage-matrix.md
- docs/ui-ux-enterprise-audit/page-reports/...
- docs/ui-ux-enterprise-audit/strategic-surfaces.md
- docs/ui-ux-enterprise-audit/grouped-follow-up-candidates.md
- docs/ui-ux-enterprise-audit/unresolved-pages.md
- N/A - existing UI-016 Operations route coverage remains valid unless implementation discovers a material archetype or route change
No-impact rationale when applicable: N/A.

Cross-Cutting / Shared Pattern Reuse (mandatory when the feature touches notifications, status messaging, action links, header actions, dashboard signals/cards, alerts, navigation entry points, evidence/report viewers, or any other existing shared operator interaction family; otherwise write `N/A - no shared interaction family touched`)

Cross-cutting feature?: yes, bounded.
Interaction class(es): status messaging, table/list rendering, action links, navigation/drilldown, browser-smoke runtime guardrails.
Systems touched: Operations hub, OperationRun list/detail links, productization browser smoke, Debugbar/Vite/Filament runtime checks.
Existing pattern(s) to extend: OperationRun monitoring family, OperationRunLinks, OperationUxPresenter, BadgeCatalog / BadgeRenderer, TablePaginationProfiles, SuppressDebugbarForSmokeRequests, PanelThemeAsset, existing Pest Browser smoke tests.
Shared contract / presenter / builder / renderer to reuse: Existing OperationRun and Filament-native presentation paths; no new shared runtime framework unless a tiny helper is required to keep smoke assertions deterministic.
Why the existing shared path is sufficient or insufficient: Existing paths already own status/action/link semantics; the gap is bounded render performance and runtime-smoke coverage, not a missing domain contract.
Allowed deviation and why: none expected. Any productization-smoke helper must remain test/support-local and not change normal local developer workflow.
Consistency impact: Operations list/detail language must continue to use OperationRun execution truth and existing run-link vocabulary. Debug/runtime checks must not fail arbitrary local development warnings outside smoke mode.
Review focus: Verify no parallel status language, action-link path, or broad runtime framework is introduced.

OperationRun UX Impact (mandatory when the feature creates, queues, deduplicates, resumes, blocks, completes, or deep-links to an `OperationRun`; otherwise write `N/A - no OperationRun start or link semantics touched`)

Touches OperationRun start/completion/link UX?: yes, link/render semantics only. No OperationRun creation, queueing, status transition, completion, deduplication, or reconciliation write is in scope.
Shared OperationRun UX contract/layer reused: OperationRunLinks, OperationUxPresenter, OperationRunResource, existing tenantless OperationRun viewer/detail routes.
Delegated start/completion UX behaviors: Open operation / View run links and tenant/workspace-safe URL resolution stay delegated to existing OperationRun link helpers. Start/completion messaging is N/A.
Local surface-owned behavior that remains: Query scoping, environment filter display, bounded list rendering, empty/error/loading state copy, and smoke-regression checks.
Queued DB-notification policy: N/A - no queued operation starts or notifications.
Terminal notification path: N/A - no terminal lifecycle notification changes.
Exception required?: none.

Provider Boundary / Platform Core Check (mandatory when the feature changes shared provider/platform seams, identity scope, governed-subject taxonomy, compare strategy selection, provider connection descriptors, or operator vocabulary that may leak provider-specific semantics into platform-core truth; otherwise write `N/A - no shared provider/platform boundary touched`)

Shared provider/platform boundary touched?: no.
Boundary classification: N/A.
Seams affected: N/A.
Neutral platform terms preserved or introduced: Operations, OperationRun, workspace, managed environment, execution truth.
Provider-specific semantics retained and why: none added.
Why this does not deepen provider coupling accidentally: The feature must not call Graph, mutate provider state, or add provider-specific filters/labels beyond existing recorded run context.
Follow-up path: none.

UI / Surface Guardrail Impact (mandatory when operator-facing surfaces are changed; otherwise write `N/A`)

| Surface / Change | Operator-facing surface change? | Native vs Custom | Shared-Family Relevance | State Layers Touched | Exception Needed? | Low-Impact / `N/A` Note |
| --- | --- | --- | --- | --- | --- | --- |
| Operations hub environment-filtered index stability | yes | Native Filament page/resource plus existing Blade composition | OperationRun monitoring family | page, table, URL-query, browser runtime | no | Existing surface; stability and bounded states only |
| Productization browser smoke runtime checks | yes, validation workflow only | Pest Browser / existing smoke helpers | Browser smoke guardrail | browser session, console/network/DOM assertions | no | Smoke mode must not alter normal local dev workflow |

Decision-First Surface Role (mandatory when operator-facing surfaces are changed)

| Surface | Decision Role | Human-in-the-loop Moment | Immediately Visible for First Decision | On-Demand Detail / Evidence | Why This Is Primary or Why Not | Workflow Alignment | Attention-load Reduction |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Operations hub | Primary Decision Surface for execution follow-up | Operator decides whether an operation needs inspection or whether the current filtered scope has no runs | Page title, workspace/environment context, bounded table or empty state, status/outcome, time, safe next action | Operation detail, diagnostics, raw context, stack traces, provider payloads | Primary because it is the canonical execution monitoring hub | Follows operations triage and drilldown from dashboard/workspace surfaces | Removes blocker caused by timeout/debug page and keeps rows bounded |

Audience-Aware Disclosure (mandatory when operator-facing surfaces are changed)

| Surface | Audience Modes In Scope | Decision-First Default-Visible Content | Operator Diagnostics | Support / Raw Evidence | One Dominant Next Action | Hidden / Gated By Default | Duplicate-Truth Prevention |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Operations hub | operator-MSP, manager, support-platform | operation type/name, status/outcome, environment, started/updated time, duration if available, attention/error indicator, empty/error state | run detail and collapsed diagnostics | raw context, failure summary, stack trace, provider payloads | Open operation/detail for safe records | raw debug pages, stack traces, provider secrets, source links, Debugbar | list states the run outcome once; detail adds proof only |

UI/UX Surface Classification (mandatory when operator-facing surfaces are changed)

| Surface | Action Surface Class | Surface Type | Likely Next Operator Action | Primary Inspect/Open Model | Row Click | Secondary Actions Placement | Destructive Actions Placement | Canonical Collection Route | Canonical Detail Route | Scope Signals | Canonical Noun | Critical Truth Visible by Default | Exception Type / Justification |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Operations hub | List / Table / Monitoring | Read-only Registry / Report Surface | Open an operation or clear/adjust filter | row/detail route | allowed | existing filters/contextual links only | none introduced | `/admin/workspaces/{workspace}/operations` | `/admin/workspaces/{workspace}/operations/{run}` | workspace route plus explicit environment filter chip | Operations / Operation | successful render, scoped rows, status/outcome, environment, time, safe next action | none |

Operator Surface Contract (mandatory when operator-facing surfaces are changed)

| Surface | Primary Persona | Decision / Operator Action Supported | Surface Type | Primary Operator Question | Default-visible Information | Diagnostics-only Information | Status Dimensions Used | Mutation Scope | Primary Actions | Dangerous Actions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Operations hub | Operations responder / MSP operator | Determine whether filtered operations need attention and open safe detail | Monitoring list/workbench | Did the selected workspace/environment operations route load successfully, and what run needs attention? | page title, context, active environment filter, bounded rows or empty state, status/outcome, timing, duration, next action | raw context, stack traces, debug/source links, provider payloads, support diagnostics | execution status, terminal outcome, environment scope, lifecycle/freshness where already supported | none in this spec | open operation/detail; clear filter | none introduced |

Proportionality Review (mandatory when structural complexity is introduced)

New source of truth?: no.
New persisted entity/table/artifact?: no.
New abstraction?: no domain abstraction expected. Test/support helpers are allowed only if existing smoke controls cannot express the checks.
New enum/state/reason family?: no.
New cross-domain UI framework/taxonomy?: no.
Current operator problem: Operations route fails to render and productization smoke cannot distinguish real UX issues from debug/runtime leakage.
Existing structure is insufficient because: Existing tests did not catch the environment-filtered timeout/debug-page regression or the BUG-009 runtime pollution path.
Narrowest correct implementation: Optimize existing Operations query/render path and add focused smoke assertions for the affected route/runtime conditions.
Ownership cost: A small feature/browser test family and possibly a productization-smoke test helper; no new runtime truth.
Alternative intentionally rejected: Increasing PHP max_execution_time, hiding/removing the route, generic catch-all masking, broad UI redesign, or app-wide debug infrastructure rewrite.
Release truth: Current-release productization blocker.

Compatibility posture

This feature assumes a pre-production environment. Backward compatibility, legacy aliases, migration shims, historical fixtures, and compatibility-specific tests are out of scope unless implementation proves an existing contract requires them.

Testing / Lane / Runtime Impact (mandatory for runtime behavior changes)

Test purpose / classification: Feature/Livewire for route/render/scoping/query guards; Browser for authenticated productization smoke and JS/runtime leak checks; Unit only if a small asset/debug helper is introduced.
Validation lane(s): fast-feedback/confidence for targeted Pest feature tests; browser for productization runtime smoke; profiling only if implementation needs query/render measurement.
Why this classification and these lanes are sufficient: The regression is both server-render and browser-runtime visible; a feature-only test would miss console/Vite/Debugbar leakage, while browser-only proof would be too slow and less deterministic for query/scoping guards.
New or expanded test families: One explicit Spec 391 Operations Hub feature/Livewire family and one explicit Spec 391 browser smoke file.
Fixture / helper cost impact: Must use factories or existing browser smoke-login helpers, no seeders, no provider setup, no real Graph access, no queues/jobs that mutate provider/customer state.
Heavy-family visibility / justification: Browser smoke is explicit because BUG-009 is browser/runtime-specific. It must remain named and scoped to Operations/productization smoke.
Special surface test profile: monitoring-state-page plus global-context-shell.
Standard-native relief or required special coverage: Special coverage required for environment-filtered render budget, debug-page absence, missing Filament globals, Vite client failures, Debugbar/source-link leakage, and network 500s.
Reviewer handoff: Reviewers must confirm lane fit, no hidden seed/provider setup, no broad browser suite drift, and exact proof commands.
Budget / baseline / trend impact: The browser smoke should fail when the Operations render exceeds an agreed time threshold; the target is under 3 seconds for the audited local data shape. If CI timing is flaky, keep a lower-level query/render guard and record the measured browser timing in verification.
Escalation needed: document-in-feature.
Active feature PR close-out entry: Guardrail / Exception / Smoke Coverage.
Planned validation commands:
- cd apps/platform && php vendor/bin/pest tests/Feature/Monitoring/Spec391OperationsHubRendersWithEnvironmentFilterTest.php
- cd apps/platform && php vendor/bin/pest tests/Feature/Monitoring/Spec391OperationRunResourceIndexPerformanceTest.php
- cd apps/platform && php artisan test --compact tests/Browser/Spec391OperationsHubProductizationSmokeTest.php
- cd apps/platform && php vendor/bin/pint --test <touched php files>
- git diff --check

User Scenarios & Testing (mandatory)

User Story 1 - Environment-filtered Operations route renders (Priority: P1)

As an admin operator, I want /admin/workspaces/{workspace}/operations?environment_id={id} to render a bounded Operations hub for an entitled environment so dashboard/workspace drilldowns do not land on a 500 or timeout.

Why this priority: This is the audited P1 blocker.

Independent Test: Authenticate as a workspace/environment-entitled admin and open the route with fixture operation runs; assert HTTP success, no debug page text, bounded rows, active environment context, and safe detail link for a record.

Acceptance Scenarios:

Given an entitled workspace user and an environment filter, When the Operations hub opens, Then the response is successful and renders within the render budget (target under 3 seconds, see NFR-391-001).
Given another environment exists in the same or another workspace, When the filter is applied, Then rows, counts, and filter options remain scoped to the permitted workspace/environment.

User Story 2 - Bounded Operations index rendering (Priority: P1)

As an operator, I want the Operations index to paginate and avoid per-row heavy accessors so the page stays responsive even when many operation runs exist.

Why this priority: The observed max-execution error points to render-path cost rather than missing infrastructure.

Independent Test: Create many OperationRun rows with large context/failure payloads and assert the index route/table render does not hydrate unbounded rows or scan expensive per-row details for every record.

Acceptance Scenarios:

Given more operation runs than one table page, When the Operations index renders, Then only the bounded page/list context is evaluated.
Given operation rows contain large JSON context, When the list renders, Then default columns do not parse or present raw detail payloads per row.
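
The two scenarios above can be illustrated with a small sketch: the list path projects only lightweight columns for one page window, and payload decoding happens only on the detail path. This is an illustrative Python model, not the Filament table implementation; `StoredRun`, `list_page`, `detail`, and the page size of 25 are hypothetical, and a real implementation would push the window into the database query rather than slice an in-memory list.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class StoredRun:
    """Hypothetical row: light columns plus a potentially large raw payload."""
    id: int
    status: str
    context_json: str


PAGE_SIZE = 25  # assumption: illustrative default, not taken from the spec


def list_page(rows: List[StoredRun], page: int, page_size: int = PAGE_SIZE) -> List[Dict[str, object]]:
    """Return one bounded page of list rows without touching context_json."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    start = (page - 1) * page_size
    window = rows[start:start + page_size]
    # Only default-visible columns are projected; the payload stays undecoded.
    return [{"id": r.id, "status": r.status} for r in window]


def detail(row: StoredRun) -> Dict[str, object]:
    """Decode the payload only when a single record is opened."""
    return {"id": row.id, "status": row.status, "context": json.loads(row.context_json)}
```

A guard test in this style can seed rows whose payload is deliberately undecodable: if the list still renders, the list path provably never parsed it.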

User Story 3 - Controlled empty/error/loading states (Priority: P2)

As an operator, I want filtered Operations states to be understandable and customer-ready so no-data or recoverable render problems do not look like application crashes.

Why this priority: The route should fail closed and explain the current scope without masking the real performance issue.

Independent Test: Open Operations with no rows for an entitled environment and with a safely simulated render failure where applicable; assert the page shows controlled copy and no raw Laravel stack trace.

Acceptance Scenarios:

Given no operation runs exist for the active environment filter, When the page loads, Then a specific empty state is visible and no false health claim appears.
Given a non-debug productization-smoke browser session, When Operations encounters a handled display-only state, Then no raw Laravel debug page, stack trace, or Maximum execution time text is visible.

User Story 4 - Productization-safe browser smoke catches runtime leakage (Priority: P2)

As a productization reviewer, I want the Operations smoke path to fail on debug/runtime leakage so future audits are not polluted by Debugbar, Vite dev-client failures, or missing Filament globals.

Why this priority: BUG-009 directly affected audit signal quality and Filament table/action reliability.

Independent Test: Run the Spec 391 browser smoke in productization-smoke mode and assert no missing Filament globals, Vite client connection failures, _debugbar requests/DOM, phpstorm://open links, visible stack traces, network 500s, or debug page text.

Acceptance Scenarios:

Given productization-smoke mode is active, When the browser opens Operations, Then Debugbar/source links are absent and compiled/stable assets or existing test asset fallbacks are used.
Given Filament/Livewire runtime is missing, When the smoke runs, Then the test fails with a specific console/global/runtime assertion.
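
The smoke assertions in this story amount to scanning a handful of browser observations for known leak markers and reporting each one specifically. The sketch below models that check in Python under stated assumptions: `find_leaks` and its inputs are hypothetical, the required global names `Livewire` and `Alpine` are assumed rather than confirmed by this spec, and the real test would collect these observations through the existing Pest Browser helpers.

```python
from typing import Iterable, List, Mapping, Tuple

# Text markers named in this spec.
TEXT_MARKERS = ("Maximum execution time", "phpstorm://open", "_debugbar")
# Assumption: the globals the page needs; adjust to what the route really uses.
REQUIRED_GLOBALS = ("Livewire", "Alpine")
# Path of the Vite dev-server client script.
VITE_DEV_CLIENT = "/@vite/client"


def find_leaks(
    page_html: str,
    globals_present: Mapping[str, bool],
    responses: Iterable[Tuple[str, int]],
    failed_requests: Iterable[str],
) -> List[str]:
    """Return one specific finding per detected debug/runtime leak."""
    findings: List[str] = []
    for marker in TEXT_MARKERS:
        if marker in page_html:
            findings.append(f"text:{marker}")
    for name in REQUIRED_GLOBALS:
        if not globals_present.get(name, False):
            findings.append(f"missing-global:{name}")
    for url, status in responses:
        if status >= 500:
            findings.append(f"http-{status}:{url}")
        if "_debugbar" in url:
            findings.append(f"debugbar-request:{url}")
    for url in failed_requests:
        if VITE_DEV_CLIENT in url:
            findings.append(f"vite-dev-client-failed:{url}")
    return findings
```

Returning named findings instead of a single boolean keeps the failure message specific, which is what the second scenario asks for.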

Edge Cases

The audited workspace/environment ids may not exist in every test database; automated browser tests must discover or create a safe fixture instead of hardcoding ids unless the audited fixture is explicitly present.
Environment filter values from another workspace must not leak rows or options.
Empty filters must render workspace-wide entitled rows only.
Invalid environment_id must be discarded or rejected according to existing Operations route contract without leaking existence.
Large context, failure_summary, or summary count payloads must not become default-visible list content.
Productization-smoke mode must not disable normal local developer Debugbar/Vite behavior outside the explicit smoke session.
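
The filter edge cases above can be modeled as a single resolution step that returns "no filter", a valid id, or one uniform denial. This Python sketch assumes the reject variant of "discarded or rejected"; the existing route contract decides which applies. `NotFound` and `resolve_environment_filter` are hypothetical names.

```python
from typing import Optional, Set


class NotFound(Exception):
    """Uniform deny-as-not-found signal (maps to a 404 in the route)."""


def resolve_environment_filter(
    raw_value: Optional[str],
    workspace_env_ids: Set[int],
    permitted_env_ids: Set[int],
) -> Optional[int]:
    """Resolve the environment_id query value without leaking existence."""
    if raw_value is None or raw_value == "":
        return None  # empty filter: workspace-wide entitled rows
    try:
        env_id = int(raw_value)
    except (TypeError, ValueError):
        raise NotFound()
    # Same outcome for "does not exist", "other workspace", and "not entitled",
    # so the response cannot be used to probe for environment ids.
    if env_id not in workspace_env_ids or env_id not in permitted_env_ids:
        raise NotFound()
    return env_id
```

Collapsing every failure into one signal is the design point: a caller cannot distinguish a malformed id from an environment that exists in another workspace.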

Requirements (mandatory)

Functional Requirements

FR-391-001: The admin Operations hub MUST render successfully for an authenticated, workspace-entitled user with an entitled environment_id filter.
FR-391-002: The Operations hub MUST NOT return HTTP 500 or expose a raw Laravel debug page for the audited environment-filtered path.
FR-391-003: The Operations hub MUST apply workspace and environment filters at the query level before rows, summary counts, filter options, or links are rendered.
FR-391-004: The Operations list MUST remain paginated and bounded using the existing table pagination profile or a narrower documented equivalent.
FR-391-005: The Operations index MUST avoid expensive per-row work during render, including Graph calls, unbounded relationship traversal, and default parsing/presentation of large JSON context/failure payloads.
FR-391-006: Visible Operations columns MUST remain useful for operation type/name, status/outcome, environment/scope, started/updated time, duration when available, and attention/error/next-action signal.
FR-391-007: Filter option queries MUST be scoped and bounded enough not to scan unrelated workspaces or unbounded historical rows during normal index render.
FR-391-008: The route MUST show a controlled empty state when no operation runs exist for the active workspace/environment filter.
FR-391-009: The route MUST preserve filters/context during normal loading states and MUST NOT flash raw framework/debug output while loading.
FR-391-010: The route MUST show controlled error/notice states only for appropriate display conditions and MUST NOT hide an expensive render path behind a catch-all.
FR-391-011: Existing safe operation detail/view links from the list MUST still route to the canonical tenantless OperationRun detail viewer for authorized records.
FR-391-012: Dashboard/workspace navigation links that point to Operations MUST no longer lead to a broken Operations page.
FR-391-013: Productization browser smoke MUST fail on visible Laravel debug pages, stack traces, Maximum execution time, network 500s, missing Filament globals, missing Livewire/Alpine runtime needed by the page, Vite dev-client connection failures in smoke mode, _debugbar leakage, or phpstorm://open source links.
FR-391-014: Productization-smoke mode MUST use existing environment/test controls when possible and MUST NOT disable Debugbar or Vite globally for ordinary local development.
FR-391-015: Tests MUST be deterministic and MUST NOT require real provider access, seeders, provider syncs, restore execution, exports, deletes, archives, or queued customer/provider mutations.
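
FR-391-014 hinges on the suppression decision being made per request rather than globally. A minimal Python sketch of that decision follows; the header name is a hypothetical marker invented for illustration, and the actual signal used by SuppressDebugbarForSmokeRequests is not specified here.

```python
from typing import Mapping

# Hypothetical marker; the real smoke signal may be a cookie, env flag, or other control.
SMOKE_HEADER = "x-productization-smoke"


def is_smoke_request(headers: Mapping[str, str]) -> bool:
    """True only when the request explicitly opts in to smoke mode."""
    normalized = {key.lower(): value for key, value in headers.items()}
    return normalized.get(SMOKE_HEADER) == "1"


def debug_tooling_enabled(app_debug: bool, headers: Mapping[str, str]) -> bool:
    """Debug tooling stays on for ordinary local requests and off for smoke requests."""
    return app_debug and not is_smoke_request(headers)
```

With this shape, an ordinary local request with debugging on keeps Debugbar and the Vite dev client, while only a request carrying the explicit marker has them suppressed.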

Non-Functional Requirements

NFR-391-001: Target browser render budget is under 3 seconds after authentication for the audited local data shape.
NFR-391-002: If browser timing is too flaky for CI, implementation MUST add a lower-level query/render guard and record observed browser timing in verification.
NFR-391-003: Operations render must remain DB-only and must not invoke GraphClientInterface or external provider clients.
NFR-391-004: No migrations are expected. If implementation proves a migration or index is required, update this spec and plan before continuing.
NFR-391-005: No new operation type, status, outcome, reason family, summary-count key, or persisted truth is allowed.
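
NFR-391-001 through NFR-391-003 combine a timing target with deterministic lower-level guards. The Python sketch below shows one way such a check could be structured: the 3 second budget comes from this spec, while `MAX_QUERIES`, `timed`, and `budget_violations` are hypothetical, and the clock is injectable so the check itself stays deterministic in tests.

```python
import time
from typing import Callable, List, Tuple, TypeVar

T = TypeVar("T")

BUDGET_SECONDS = 3.0  # NFR-391-001 target: under 3 seconds
MAX_QUERIES = 25      # assumption: illustrative ceiling, not taken from the spec


def timed(render: Callable[[], T], clock: Callable[[], float] = time.perf_counter) -> Tuple[T, float]:
    """Run a render callable and return its result with elapsed seconds."""
    started = clock()
    result = render()
    return result, clock() - started


def budget_violations(
    elapsed: float,
    query_count: int,
    external_calls: int,
    budget: float = BUDGET_SECONDS,
    max_queries: int = MAX_QUERIES,
) -> List[str]:
    """Report each violated guard; an empty list means the render passed."""
    problems: List[str] = []
    if elapsed >= budget:
        problems.append(f"render took {elapsed:.2f}s, budget is under {budget:.2f}s")
    if query_count > max_queries:
        problems.append(f"render issued {query_count} queries, ceiling is {max_queries}")
    if external_calls > 0:
        problems.append("render invoked an external provider client")
    return problems
```

If wall-clock timing proves too flaky for CI, the query-count and external-call guards still hold without any timing input, which matches the fallback described in NFR-391-002.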

Acceptance Criteria

/admin/workspaces/3/operations?environment_id=4 renders successfully for the audited workspace/environment or the implementation browser test discovers an equivalent safe fixture when exact ids differ.
The route does not return 500.
The route does not expose a raw Laravel debug page or stack trace.
The route renders in a bounded time under normal local productization validation conditions.
Target browser render budget is under 3 seconds after authentication for the audited data shape.
Operations list is paginated/bounded.
Environment filtering does not trigger N+1-heavy presenter/model accessor work.
Table columns/actions do not perform expensive per-row work during render.
Empty states are controlled and customer-ready.
Error states are controlled and customer-ready.
Existing operation detail/view actions still work for safe records.
Navigation links from dashboard/workspace surfaces to Operations no longer lead to a broken page.
Browser smoke test catches a future Operations 500/timeout regression.
Browser smoke test catches raw Laravel debug-page exposure on Operations.
Browser/runtime smoke check fails on missing Filament JS globals on the Operations route.
Browser/runtime smoke check fails on Vite dev-client connection failures when running in productization-smoke mode.
Browser/runtime smoke check fails on visible Debugbar/source-link leakage when running in productization-smoke mode.
Tests are deterministic and do not require real provider access.
No destructive operations are performed.
No unrelated Evidence/Provider/Review/Restore semantics are changed.

UI Action Matrix (mandatory when Filament is changed)

| Surface | Location | Header Actions | Inspect Affordance (List/Table) | Row Actions (max 2 visible) | Bulk Actions (grouped) | Empty-State CTA(s) | View Header Actions | Create/Edit Save+Cancel | Audit log? | Notes / Exemptions |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Operations hub | `apps/platform/app/Filament/Pages/Monitoring/Operations.php`, `apps/platform/app/Filament/Resources/OperationRunResource.php` | Existing scope/back/filter-reset navigation only | Existing row/detail route | Existing safe detail/open links only | none | controlled no-runs state; no mutation CTA unless already permitted and existing | owned by tenantless OperationRun detail viewer | N/A | existing run lifecycle audit only | No destructive action added; no retry/cancel/start/export/delete/archive behavior in scope |

Key Entities (include if feature involves data)

OperationRun: Existing execution-truth record used for list rows, detail links, status/outcome, timing, and scoped monitoring.
ManagedEnvironment: Existing environment filter target; filter values must be workspace-entitled and active according to existing route rules.
Workspace: Existing primary route/session context for admin Operations.
Productization smoke session: Test/browser-mode behavior, not persisted product truth, used to suppress Debugbar/source-link leakage and use stable assets where supported.

Success Criteria (mandatory)

Measurable Outcomes

SC-391-001: Environment-filtered Operations route returns a successful response in targeted feature/Livewire coverage.
SC-391-002: Browser smoke opens the Operations hub and observes no network 500s, debug page text, stack trace text, missing Filament globals, Vite dev-client failures, _debugbar leakage, or phpstorm://open source links in productization-smoke mode.
SC-391-003: Browser verification records render timing, with target under 3 seconds for the audited local data shape or a documented lower-level guard if browser timing is unsuitable for CI.
SC-391-004: Feature/performance guard proves index rendering stays bounded when more operation rows exist than a single table page.
SC-391-005: No provider mutations, restore jobs, exports, deletes, archives, or customer-facing delivery actions are executed during tests or verification.

Expected UX

The Operations hub should present a clear title, workspace/environment context, visible active environment filter, bounded table or controlled empty state, useful operation status columns, safe detail actions, and no raw stack traces, Debugbar/source links, missing runtime globals, or framework/debug branding in productization-smoke validation.

Risks

The root cause may be a combination of table filter option scans, summary/top-run queries, OperationRunResource column/action helpers, JSON casts, and actionability/freshness accessors. Implementation must profile or instrument enough to fix the render path rather than masking it.
Browser timing can be flaky in local/CI environments. If so, keep browser leak assertions and add deterministic lower-level query/render guards.
Productization-smoke mode could accidentally disable normal local debugging if implemented too broadly; keep it explicit to smoke requests.
Existing completed Operations specs contain validated productization behavior and must not be rewritten.
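
The deterministic lower-level guard mentioned in the risks above could take a shape like the following sketch. It is an illustration in Python, not the project's test code: `RenderSample`, `is_bounded`, and the `query_slack` parameter are hypothetical names, and how rendered rows and query counts are actually measured is left to the implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RenderSample:
    total_rows: int     # operation rows that exist for the active filter
    rendered_rows: int  # rows actually emitted into the table
    query_count: int    # database queries issued during one render


def is_bounded(samples: list[RenderSample], per_page: int, query_slack: int = 0) -> bool:
    """True when every sample renders at most one page of rows and the query
    count does not grow beyond the smallest sample's count plus the slack."""
    if not samples:
        return True
    ordered = sorted(samples, key=lambda s: s.total_rows)
    baseline = ordered[0].query_count
    for sample in ordered:
        if sample.rendered_rows > per_page:
            return False  # more than a single table page was rendered
        if sample.query_count > baseline + query_slack:
            return False  # query cost scales with the total row count
    return True
```

For the comparison to be meaningful, the smallest sample should already fill one table page; otherwise a per-row query pattern would lower the baseline and cause a false failure on larger samples.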

Assumptions

Spec 391 is a fresh regression/stability package, not a continuation of Spec 328 productization redesign.
The audited IDs workspace_id=3 and environment_id=4 may be available locally, but tests should create or discover safe fixtures when they are not.
No schema migration is required unless implementation proves the current query path cannot be bounded without an index or schema change.
Existing SuppressDebugbarForSmokeRequests and PanelThemeAsset patterns are the preferred starting point for BUG-009 smoke controls.

Open Questions

None blocking preparation. Implementation must confirm the exact render-path root cause before changing code.

Out Of Scope

Evidence anchor selection.
Provider permission/readiness semantics.
Review pack download gating.
Customer Review Workspace evidence labeling.
System login branding.
Restore readiness behavior unless the Operations hub directly depends on it.
Broad app-wide UI redesign.
Broad production infrastructure configuration changes unrelated to this spec.
Real provider mutations, provider syncs, restore jobs, destructive actions, exports, notifications, customer-facing delivery actions, archives, deletes, or force-deletes.
Increasing PHP max_execution_time.
Hiding/removing the Operations route or links.
Generic catch-all error masking while leaving the expensive render path intact.

Follow-up Spec Candidates

System login branding and cross-panel debug-safe branding from BUG-008/BUG-009, if the productization audit keeps them separate.
Evidence/current-vs-anchored follow-up from BUG-002/BUG-003.
Review pack/customer download gating from BUG-004/BUG-007.
Provider readiness semantics from BUG-005/BUG-006 if separately promoted.

Verification

Planned verification details live in specs/391-operations-hub-stability-debug-safe-runtime/artifacts/verification.md.

Implementation verification must capture:

HTTP status.
Render time.
Page title/header.
Visible table/empty state.
Active workspace/environment context.
Console errors.
Network 500s.
Absence of Laravel debug page.
Absence of Debugbar/source-link leakage in productization-smoke mode.
Confirmation that no provider mutations, restore jobs, exports, deletes, archives, notifications, or customer-facing delivery actions were executed.
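
One way to keep the captured items above together is a single record with a pass/fail summary. The sketch below is illustrative Python, not the format of the verification artifact: the class name, field names, and `failures` method are hypothetical. It assumes any 2xx status counts as a successful response and reuses the 3-second target from SC-391-003, which may be replaced by a lower-level guard where browser timing is unsuitable.

```python
from dataclasses import dataclass, field

RENDER_TARGET_SECONDS = 3.0  # target taken from SC-391-003


@dataclass
class VerificationRecord:
    http_status: int
    render_seconds: float
    page_title: str
    table_or_empty_state_visible: bool
    workspace_context: str
    environment_context: str
    console_errors: list[str] = field(default_factory=list)
    network_500s: list[str] = field(default_factory=list)
    debug_page_present: bool = False
    leak_markers: list[str] = field(default_factory=list)
    side_effects: list[str] = field(default_factory=list)

    def failures(self) -> list[str]:
        """Return the names of the captured items that fail verification."""
        problems = []
        if not 200 <= self.http_status < 300:
            problems.append("http_status")
        if self.render_seconds >= RENDER_TARGET_SECONDS:
            problems.append("render_seconds")
        if not self.page_title.strip():
            problems.append("page_title")
        if not self.table_or_empty_state_visible:
            problems.append("table_or_empty_state")
        if not (self.workspace_context and self.environment_context):
            problems.append("context")
        if self.console_errors:
            problems.append("console_errors")
        if self.network_500s:
            problems.append("network_500s")
        if self.debug_page_present:
            problems.append("debug_page")
        if self.leak_markers:
            problems.append("leak_markers")
        if self.side_effects:
            problems.append("side_effects")
        return problems
```

A run counts as verified only when `failures()` returns an empty list; any recorded side effect, such as a provider mutation or export, fails the record regardless of the other fields.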
