TenantAtlas/specs/367-operationrun-actionability-system/plan.md
ahmido 564da05096 feat: implement operation run actionability system (#439)
This PR introduces the Operation Run Actionability System.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #439
2026-06-08 13:34:25 +00:00

Implementation Plan: OperationRun Actionability System v1

Branch: 367-operationrun-actionability-system | Date: 2026-06-08 | Spec: specs/367-operationrun-actionability-system/spec.md
Input: Feature specification from specs/367-operationrun-actionability-system/spec.md

Summary

Introduce a derived OperationRun actionability layer that separates historical terminal execution truth from current UI follow-up truth. The implementation will keep operation_runs immutable history intact, classify every known operation type deliberately, migrate dashboard/Operations/current-follow-up consumers away from raw terminalFollowUp() semantics, and prove the known Provider Connection CTA loop is closed.

Technical Context

Language/Version: PHP 8.4.15, Laravel 12.52.0
Primary Dependencies: Filament 5.2.1, Livewire 4.1.4, Pest 4.3.1, PostgreSQL
Storage: Existing PostgreSQL tables only; no migration planned
Testing: Pest 4 Unit, Feature, Architecture/guard, optional Browser smoke
Validation Lanes: fast-feedback, confidence, browser only if rendered UI changes
Target Platform: Laravel Sail locally, Dokploy container deployment for staging/production
Project Type: Laravel monolith under apps/platform
Performance Goals: DB-only render-time evaluation; batch-friendly dashboard and Operations table evaluation; no Graph calls during UI render
Constraints: preserve workspace/managed-environment isolation, RBAC, OperationRunService lifecycle ownership, global-search-disabled OperationRun posture, and existing Operations routes
Scale/Scope: Known operation types in OperationCatalog, provider registry, reconciliation registry, jobs, factories, seeders, and current tests

UI / Surface Guardrail Plan

  • Guardrail scope: changed existing dashboard and Operations status/follow-up surfaces.
  • Affected routes/pages/actions/states/navigation/panel/provider surfaces:
    • /admin/workspaces/{workspace}/operations
    • /admin/workspaces/{workspace}/operations/{run}
    • tenant dashboard widgets NeedsAttention and BaselineCompareNow
    • shell active-work hint via BulkOperationProgress / ActiveRuns
  • No-impact class, if applicable: N/A.
  • Native vs custom classification summary: mixed existing native Filament table/detail plus existing dashboard widget Blade/Livewire surfaces.
  • Shared-family relevance: status messaging, dashboard signals, operation links, current follow-up filters, primary action guidance.
  • State layers in scope: widget, page, detail, shell hint, URL query for Operations filters.
  • Audience modes in scope: operator-MSP and support-platform; no customer-facing surface.
  • Decision/diagnostic/raw hierarchy plan: current actionability is default-visible; historical status remains visible; raw/support diagnostics stay secondary/collapsed/capability-gated.
  • Raw/support gating plan: unchanged existing Operations detail gating.
  • One-primary-action / duplicate-truth control: use one actionability result for dashboard count, Operations current-follow-up filters, and OperationRun action eligibility.
  • Handling modes by drift class or surface: review-mandatory for high-risk operation families and insufficient correlation proof.
  • Repository-signal treatment: hard-stop-candidate for any UI consumer that still directly treats raw terminal historical status as current follow-up after migration.
  • Special surface test profiles: monitoring-state-page, dashboard-signal, shared-detail-family.
  • Required tests or manual smoke: Unit actionability policies, Feature dashboard/Operations regressions, guard tests, optional browser smoke for the provider loop.
  • Exception path and spread control: none expected.
  • Active feature PR close-out entry: Guardrail + Smoke Coverage.
  • UI/Productization coverage decision: implementation must update existing coverage artifacts or document checked no-update rationale.
  • Coverage artifacts to update: existing Operations/dashboard entries only if visible surface contract materially changes; otherwise record no-update rationale in close-out.
  • No-impact rationale: N/A.
  • Navigation / Filament provider-panel handling: no panel/provider changes; providers remain in apps/platform/bootstrap/providers.php.
  • Screenshot or page-report need: screenshot only if rendered dashboard/Operations layout or copy changes materially.

Shared Pattern & System Fit

  • Shared pattern touched: OperationRun monitoring, dashboard attention, links, current follow-up, and action eligibility.
  • Existing shared paths to reuse:
    • App\Support\OperationCatalog
    • App\Support\OperationRunLinks
    • App\Support\Operations\OperationRunActionEligibility
    • App\Support\Operations\Reconciliation\OperationRunReconciliationRegistry
    • App\Support\Operations\OperationRunCorrelationResolver
    • App\Support\OpsUx\OperationUxPresenter
    • App\Support\OpsUx\ActiveRuns
  • New shared path allowed: bounded actionability resolver/registry/policies under App\Support\Operations\Actionability or equivalent local namespace.
  • Forbidden drift: no parallel UI action framework, no persisted actionability table, no ad-hoc dashboard-only special case, no Graph calls, no broad Operations UX rebuild.
  • OperationRun UX contract: no run-start or terminal-notification changes. Existing OperationRun lifecycle remains service-owned.

Constitution Check

  • Inventory-first / snapshots-second: N/A; uses existing OperationRun and domain state only.
  • Read/write separation: PASS; actionability is read-only and DB-only.
  • Single Contract Path to Graph: PASS; no Graph calls allowed in evaluation or render.
  • Proportionality First / BLOAT-001: REQUIRED and satisfied in spec.md; new derived status/resolver/registry is justified by a confirmed false CTA loop and multiple current consumers.
  • No new persisted truth: PASS; no migration/table.
  • No new state without behavioral consequence: PASS; actionability states change dashboard counts, CTA routing, filters, and manual-review behavior.
  • Workspace/Tenant isolation: REQUIRED; same-workspace/same-environment proof only unless operation is explicitly workspace-owned.
  • RBAC-UX: REQUIRED; existing policies stay authoritative; UI state is not security.
  • OperationRun standards: REQUIRED; historical execution truth remains OperationRun; current actionability is separate derived truth.
  • UI-COV-001: REQUIRED; existing reachable UI surfaces change in status/follow-up semantics.
  • TEST-GOV-001: REQUIRED; test lanes and fixture costs are explicit.
  • LEAN-001: PASS; no legacy compatibility shims or data migration.

Project Structure

Likely application surfaces for later implementation:

apps/platform/app/Models/OperationRun.php
apps/platform/app/Support/OperationCatalog.php
apps/platform/app/Support/OperationRunLinks.php
apps/platform/app/Support/Operations/Actionability/
apps/platform/app/Support/Operations/OperationRunActionEligibility.php
apps/platform/app/Support/Operations/Reconciliation/
apps/platform/app/Support/OpsUx/ActiveRuns.php
apps/platform/app/Support/OpsUx/OperationUxPresenter.php
apps/platform/app/Support/GovernanceInbox/GovernanceInboxSectionBuilder.php
apps/platform/app/Support/EnvironmentDashboard/EnvironmentDashboardSummaryBuilder.php
apps/platform/app/Support/Workspaces/WorkspaceOverviewBuilder.php
apps/platform/app/Filament/Widgets/Dashboard/NeedsAttention.php
apps/platform/app/Filament/Widgets/Dashboard/BaselineCompareNow.php
apps/platform/app/Filament/Widgets/Operations/OperationsWorkbenchStats.php
apps/platform/app/Filament/Pages/Monitoring/Operations.php
apps/platform/app/Filament/Resources/OperationRunResource.php
apps/platform/app/Filament/Pages/Operations/TenantlessOperationRunViewer.php
apps/platform/app/Livewire/BulkOperationProgress.php
specs/367-operationrun-actionability-system/repo-truth-map.md
apps/platform/lang/en/localization.php
apps/platform/lang/de/localization.php
apps/platform/tests/Unit/Support/Operations/
apps/platform/tests/Feature/Operations/
apps/platform/tests/Feature/Monitoring/
apps/platform/tests/Feature/Guards/
apps/platform/tests/Browser/

Data Model / Persistence Impact

  • No new tables.
  • No migration.
  • No new persisted actionability column.
  • Tenant-bound OperationRun records remain tenant-owned operational artifacts for authorization purposes even though canonical Operations routes are workspace-scoped; workspace-only runs are valid only for explicitly workspace-owned operation types.
  • Existing operation_runs.context may be read for correlation proof; implementation must not mutate historical context as part of evaluation.
  • Related current-state proof may read existing models such as ProviderConnection, BaselineSnapshot, EvidenceSnapshot, EnvironmentReview, ReviewPack, BackupSet, and RestoreRun only where the policy can prove same-scope ownership.

Domain Model / Service Approach

  1. Add a derived actionability result object.
  2. Add a derived actionability status enum/value object.
  3. Add a resolver with single-run and batch evaluation APIs.
  4. Add a registry that maps canonical operation families to policies and fails coverage tests when known types are uncovered.
  5. Implement initial policies:
    • provider connection checks
    • repeatable sync operations
    • baseline operations
    • evidence/review/review-pack artifact operations
    • backup operations
    • restore/promotion/destructive-like operations
    • alert/notification/informational fallback decisions discovered during implementation
  6. Explicitly classify every remaining OperationCatalog::canonicalInventory() entry as actionable, manual-review, superseded-capable, resolved-by-current-state-capable, or informational-only.
  7. Reuse OperationCatalog for canonical/alias type resolution.
  8. Reuse existing reconciliation proof where it is already same-scope and safe.
  9. Feed result into existing OperationRunActionEligibility instead of replacing it.
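Steps 4 and 6 above can be sketched as a small registry with a coverage check. This is a minimal illustration under stated assumptions, not the project's implementation: the class name `ActionabilityRegistry`, its methods, the policy keys, and the operation type strings are all hypothetical, and the `$knownTypes` array only stands in for whatever `OperationCatalog::canonicalInventory()` actually returns.

```php
<?php

// Hypothetical registry sketch: class name, method names, and the operation
// type strings below are illustrative assumptions, not the project's identifiers.
final class ActionabilityRegistry
{
    /** @var array<string, string> canonical operation type => policy key */
    private array $policies = [];

    public function register(string $operationType, string $policyKey): void
    {
        $this->policies[$operationType] = $policyKey;
    }

    public function policyFor(string $operationType): ?string
    {
        return $this->policies[$operationType] ?? null;
    }

    /**
     * Returns the known operation types that have no registered policy.
     *
     * @param list<string> $knownTypes
     * @return list<string>
     */
    public function uncoveredTypes(array $knownTypes): array
    {
        return array_values(array_filter(
            $knownTypes,
            fn (string $type): bool => ! array_key_exists($type, $this->policies),
        ));
    }
}

$registry = new ActionabilityRegistry();
$registry->register('provider.connection.check', 'provider_connection');
$registry->register('inventory.sync', 'repeatable_sync');

// Stand-in for the catalog inventory; the real return shape is not shown in this plan.
$knownTypes = ['provider.connection.check', 'inventory.sync', 'restore.execute'];

// A coverage test would fail whenever this list is non-empty.
$uncovered = $registry->uncoveredTypes($knownTypes);
```

In this sketch a coverage test only needs to assert that `uncoveredTypes()` returns an empty list, so adding a new operation type without classifying it fails the build rather than silently defaulting.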

Policy Semantics

  • Actionable: current operator action exists and should count in dashboard/current filters.
  • Requires manual review: current action remains necessary because safe automatic resolution is not provable.
  • Superseded by later success: a later same-scope successful run proves the old terminal problem is no longer current.
  • Resolved by current state: the current domain model proves the old terminal problem is no longer current.
  • Informational only: historical record only; never dashboard CTA.
  • Not terminal: active or non-problem run; stale active runs remain owned by the existing freshness/reconciliation path.
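The six states above could be modelled as a backed enum. The sketch below is an assumption for illustration: the enum name `ActionabilityStatus`, the helper methods, and most string values are hypothetical (only `resolved_by_current_state` appears elsewhere in this plan), and treating manual-review as counting toward current follow-up is a reading of the semantics above rather than a confirmed rule.

```php
<?php

// Hypothetical enum: names and string values are illustrative, not the project's API.
enum ActionabilityStatus: string
{
    case Actionable = 'actionable';
    case RequiresManualReview = 'requires_manual_review';
    case SupersededByLaterSuccess = 'superseded_by_later_success';
    case ResolvedByCurrentState = 'resolved_by_current_state';
    case InformationalOnly = 'informational_only';
    case NotTerminal = 'not_terminal';

    // True only for states that should count in dashboard and current-follow-up filters.
    public function countsAsCurrentFollowUp(): bool
    {
        return match ($this) {
            self::Actionable, self::RequiresManualReview => true,
            default => false,
        };
    }

    // True when the run stays visible in history but no longer asks for attention.
    public function isHistoricalOnly(): bool
    {
        return match ($this) {
            self::SupersededByLaterSuccess,
            self::ResolvedByCurrentState,
            self::InformationalOnly => true,
            default => false,
        };
    }
}
```

Keeping the "counts as follow-up" decision on the status itself is one way to give dashboard counts, Operations filters, and action eligibility a single shared answer.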

UI / Filament / Livewire Implications

  • Filament v5 / Livewire v4.0+ compliance remains required; current app has Livewire 4.1.4.
  • Panel providers remain registered in apps/platform/bootstrap/providers.php; no panel provider changes planned.
  • OperationRunResource keeps protected static bool $isGloballySearchable = false; no global search changes.
  • No destructive actions are added. Existing high-impact Operations actions continue using existing confirmation/authorization patterns.
  • Dashboard widgets must show current actionability counts, not raw historical terminal follow-up.
  • Operations list/current-follow-up filters must distinguish current actionable/manual-review rows from historical resolved/superseded rows.
  • Operations workbench stats, Governance Inbox, environment dashboard summaries, workspace overview signals, and OperationUxPresenter decision copy must use actionability for current-follow-up truth or explicitly remain historical-only.
  • Operation detail should show historical status and current actionability separately if implementation touches rendered detail.
  • Shell active-run hint must not mix terminal actionability with active progress semantics; existing terminal follow-up visibility in ActiveRuns::shellVisibleQueryForTenantId() must be removed or converted into a distinct actionability-backed non-active signal.

RBAC / Policy Implications

  • Existing OperationRunPolicy and workspace/environment entitlement checks remain authoritative.
  • Any actionability result that references a superseding run or resolving model must be same workspace and same managed environment or explicitly workspace-owned.
  • Non-members must not learn that hidden runs or hidden resolving models exist.
  • Dashboard counts and Operations filters must run inside actor-visible scope.
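The same-scope rule above can be sketched as a pure predicate. This is a hedged illustration only: `RunScope`, its property names, and `isSameScopeProof()` are hypothetical, and the real check would sit alongside the existing OperationRunPolicy and entitlement checks rather than replace them.

```php
<?php

// Hypothetical value object: property names are assumptions for illustration.
final class RunScope
{
    public function __construct(
        public readonly int $workspaceId,
        public readonly ?int $environmentId, // null for workspace-owned runs
    ) {}
}

// Returns true only when a candidate (superseding run or resolving model)
// may be used as proof for the original run.
function isSameScopeProof(RunScope $original, RunScope $candidate, bool $operationIsWorkspaceOwned): bool
{
    if ($original->workspaceId !== $candidate->workspaceId) {
        return false; // never resolve across workspaces
    }

    if ($operationIsWorkspaceOwned) {
        return true; // explicitly workspace-owned types need no environment match
    }

    // Tenant-bound runs need a concrete, identical managed environment on both sides.
    return $original->environmentId !== null
        && $original->environmentId === $candidate->environmentId;
}
```

Returning a plain `false` for any mismatch, without saying which side failed, also fits the requirement that non-members must not learn that hidden runs or models exist.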

Audit / Observability / Evidence Implications

  • No new AuditLog writes for read-only evaluation.
  • No changes to OperationRun lifecycle transitions.
  • Historical terminal status remains audit truth.
  • Current actionability explanation is derived UI truth; it should be testable and explainable but not persisted.

Performance Plan

  • Use batch evaluation for dashboard/Operations collections.
  • Group lookups by operation family and scope.
  • Avoid per-row ProviderConnection, artifact, or later-run queries in table render and in multi-tenant aggregate builders such as workspace overview and governance inbox.
  • Keep evaluation DB-only and deterministic.
  • Add tests or code-review tasks to catch obvious N+1 regressions in actionability consumer paths.
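The grouping idea above can be illustrated with plain arrays. This is a simplified sketch, not the planned resolver: the function names and array keys are hypothetical, `$lookup` stands in for one batched database query per group, and the sketch assumes every run in a group shares the same proof, which a real policy may need to refine (for example by comparing timestamps per run).

```php
<?php

/**
 * Groups runs by (family, workspace, environment) so each group needs one lookup.
 *
 * @param list<array{id:int, family:string, workspace_id:int, environment_id:?int}> $runs
 * @return array<string, list<int>> group key => run ids
 */
function groupRunsForBatchLookup(array $runs): array
{
    $groups = [];

    foreach ($runs as $run) {
        $key = implode('|', [
            $run['family'],
            $run['workspace_id'],
            $run['environment_id'] ?? 'workspace',
        ]);

        $groups[$key][] = $run['id'];
    }

    return $groups;
}

/**
 * Resolves proof once per group instead of once per row.
 *
 * @param list<array{id:int, family:string, workspace_id:int, environment_id:?int}> $runs
 * @param callable(string): bool $lookup hypothetical stand-in for one batched query
 * @return array<int, bool> run id => has resolving proof
 */
function evaluateBatch(array $runs, callable $lookup): array
{
    $results = [];

    foreach (groupRunsForBatchLookup($runs) as $key => $runIds) {
        $hasProof = $lookup($key); // one call per group, not per run

        foreach ($runIds as $runId) {
            $results[$runId] = $hasProof;
        }
    }

    return $results;
}
```

Counting how often `$lookup` is invoked gives a cheap N+1 regression check: the number of calls should track the number of groups, not the number of rows.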

Test Strategy

  • Unit:
    • resolver/result/status behavior
    • provider connection policy
    • repeatable sync policy
    • baseline/artifact/backup policies
    • high-risk manual-review policy
    • registry coverage against operation catalog
  • Feature:
    • dashboard provider loop regression
    • dashboard/BaselineCompareNow current follow-up counts
    • Operations filters and action links
    • cross-workspace/cross-environment non-resolution
    • action eligibility alignment
  • Guard / Architecture:
    • no direct current-follow-up UI consumption of terminalFollowUp() / dashboardNeedsFollowUp()
    • new operation types require actionability coverage
    • no Graph calls in actionability/render tests through fail-hard binding where practical
  • Browser:
    • bounded provider-loop smoke if UI rendering changes: old provider blocker, current healthy state, dashboard no false CTA, Operations history still visible.
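The guard against direct consumption of `terminalFollowUp()` / `dashboardNeedsFollowUp()` could be approximated by a source scan. The sketch below is an assumption about how such a guard might work, written as a plain function rather than a Pest test: `findRawFollowUpConsumers()` is a hypothetical helper, and a regex scan is only a heuristic that a real guard may replace with an allow-list or static analysis.

```php
<?php

/**
 * Returns the paths whose source still calls a raw follow-up method.
 *
 * @param array<string, string> $sources path => file contents
 * @param list<string> $forbiddenMethods method names without parentheses
 * @return list<string>
 */
function findRawFollowUpConsumers(array $sources, array $forbiddenMethods): array
{
    $alternatives = implode('|', array_map(
        fn (string $method): string => preg_quote($method, '/'),
        $forbiddenMethods,
    ));

    // Matches instance (->) and static (::) calls, but not the method definition itself.
    $pattern = '/(?:->|::)(?:' . $alternatives . ')\s*\(/';

    $offenders = [];

    foreach ($sources as $path => $code) {
        if (preg_match($pattern, $code) === 1) {
            $offenders[] = $path;
        }
    }

    return $offenders;
}
```

In a guard test, the `$sources` map would be built from the migrated UI consumer files, and the assertion would be that the returned list is empty.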

Rollout / Deployment Considerations

  • Env vars: none.
  • Migrations: none.
  • Queues/workers: none.
  • Scheduler: none.
  • Storage/volumes: none.
  • Filament assets: no new assets; filament:assets not newly required by this spec.
  • Staging/production: validate on Staging before Production because dashboard current-follow-up semantics change operator attention routing.
  • Rollback: code rollback restores old raw terminal-follow-up behavior; no data rollback needed.

Implementation Phases

Phase 1 - Repo Truth Inventory

  • Confirm all known operation types from OperationCatalog, provider registry, reconciliation registry, jobs, factories, seeders, and tests.
  • Confirm every current consumer of terminal follow-up / dashboard follow-up / problem class.
  • Confirm ProviderConnection current-state proof fields.

Phase 2 - Failing Proof

  • Add Unit/Feature/Guard tests before runtime changes where practical.
  • Prove the provider CTA loop, repeatable superseded success, high-risk manual review, cross-scope non-resolution, and guard coverage.

Phase 3 - Core Actionability Contract

  • Add status/result/resolver/registry/policies.
  • Add batch evaluation APIs.
  • Register known operation families.

Phase 4 - Consumer Migration

  • Migrate Dashboard, BaselineCompareNow, Operations filters/list/detail, OperationRunActionEligibility, OperationRunLinks where relevant, and ActiveRuns boundaries.
  • Keep history visible.

Phase 5 - Validation and Close-Out

  • Run focused tests and optional browser smoke.
  • Update or document UI coverage artifacts.
  • Record Filament/Livewire/global-search/destructive-action/asset/deployment posture in close-out.

Risk Controls

  • Default to actionable/manual-review when correlation proof is incomplete.
  • Keep high-risk mutation families manual-review.
  • Require same-scope proof for superseded/resolved outcomes.
  • Do not evaluate with Graph calls.
  • Do not add persistence.
  • Do not remove terminalFollowUp() until all current consumers are migrated and guard tests exist.

Implementation Verification Notes

  • No product question blocks implementation. The implementation loop must verify which exact ProviderConnection timestamp or verification proof is safe for resolved_by_current_state when no later successful run exists. If no reliable field exists, the provider policy can still resolve via later same-scope successful runs and must otherwise leave the old blocker actionable/manual-review.
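The fallback described above can be sketched as a small decision function. Everything here is illustrative: `resolveProviderBlocker()` is hypothetical, timestamps are plain integers for simplicity, `$verifiedHealthyAt` stands in for whichever ProviderConnection proof field turns out to be reliable (or `null` if none is), and apart from `resolved_by_current_state` the status strings are assumed names.

```php
<?php

/**
 * Decides the current status of an old provider blocker.
 *
 * @param int $failedAt unix timestamp of the failed run
 * @param list<int> $laterSameScopeSuccesses completion timestamps of successful same-scope runs
 * @param int|null $verifiedHealthyAt hypothetical proof timestamp; null when no reliable field exists
 */
function resolveProviderBlocker(int $failedAt, array $laterSameScopeSuccesses, ?int $verifiedHealthyAt): string
{
    // A later successful same-scope run is the strongest proof available.
    foreach ($laterSameScopeSuccesses as $succeededAt) {
        if ($succeededAt > $failedAt) {
            return 'superseded_by_later_success';
        }
    }

    // Current-state proof only applies when a reliable field exists and postdates the failure.
    if ($verifiedHealthyAt !== null && $verifiedHealthyAt > $failedAt) {
        return 'resolved_by_current_state';
    }

    // Fail-safe: incomplete proof leaves the old blocker visible to operators.
    return 'actionable';
}
```

If no reliable proof field is found during implementation, callers would always pass `null` and the function degrades to the later-success path only, which matches the fallback stated above.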

Readiness Assessment

Ready for implementation following tasks.md. The only open verification note is ProviderConnection proof-field selection, which is not a product blocker because the provider policy can rely on later same-scope successful runs if timestamp proof is insufficient.