TenantAtlas/specs/178-ops-truth-alignment/plan.md
2026-04-05 23:40:45 +02:00

Implementation Plan: Operations Lifecycle Alignment & Cross-Surface Truth Consistency

Branch: 178-ops-truth-alignment | Date: 2026-04-05 | Spec: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/178-ops-truth-alignment/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/178-ops-truth-alignment/spec.md

Summary

Align operations truth across tenant dashboard summaries, workspace overview summaries, BulkOperationProgress, recent-operations widgets, the canonical admin monitoring hub, the canonical run-detail page, and the system-panel stuck or failure surfaces without changing the OperationRun schema, inventing a second lifecycle model, or widening authorization scope. The implementation stays narrow by reusing existing OperationRun status, outcome, freshness, reconciliation metadata, and route structure, then hardening four seams that already exist but drift independently today: summary bucketing, local progress freshness, canonical drill-through continuity, and decision-zone emphasis.

The first slice adds one shared derived problem-class contract on top of the existing lifecycle truth so all covered surfaces can separate terminal follow-up from active stale/stuck attention without creating new persistence. The second slice applies that contract to local progress and summary surfaces, aligns the admin and system monitoring surfaces around the same stale or reconciled story, and preserves that story through notifications and drill-throughs. Focused Pest coverage then locks in cross-surface truth, polling freshness, system visibility of reconciled stale lineage, and decision-zone emphasis.

Technical Context

Language/Version: PHP 8.4.15
Primary Dependencies: Laravel 12, Filament v5, Livewire v4, Pest v4, existing OperationRun, OperationLifecyclePolicy, OperationRunFreshnessState, OperationUxPresenter, OperationRunLinks, ActiveRuns, StuckRunClassifier, WorkspaceOverviewBuilder, dashboard widgets, workspace widgets, and system ops pages
Storage: PostgreSQL unchanged; existing operation_runs JSONB-backed context, summary_counts, and failure_summary; no schema change
Testing: Pest 4 feature and Livewire or Filament component tests through Laravel Sail, plus existing system-panel and monitoring guard coverage
Target Platform: Laravel monolith web application in Sail locally and containerized Linux deployment in staging/production
Project Type: web application
Performance Goals: keep tenant, workspace, admin, and system monitoring surfaces DB-only at render; converge local progress truth within one polling cycle after canonical state changes; poll only while relevant active runs exist; preserve existing 10-second active-surface polling cadence where polling is used
Constraints: no schema migration; no new persisted lifecycle truth; no enum rewrite; no new route family; no cross-plane leakage; no ad-hoc status or badge mappings; lifecycle transitions remain service-owned; system stuck truth must remain discoverable after reconciliation; no new panel assets or provider-registration changes
Scale/Scope: 8 operator-facing surfaces across tenant, workspace, canonical admin, and system panels plus existing operation notifications and shared presenter or query seams; one canonical lifecycle model reused across existing operation types

Constitution Check

GATE: Passed before Phase 0 research. Re-checked after Phase 1 design and still passing.

| Principle | Pre-Research | Post-Design | Notes |
|---|---|---|---|
| Inventory-first / snapshots-second | PASS | PASS | No inventory, backup, or snapshot ownership semantics change. |
| Read/write separation | PASS | PASS | The slice is read-time truth alignment only; no new mutation path or action surface is introduced. |
| Graph contract path | N/A | N/A | No Microsoft Graph call path is touched. |
| Deterministic capabilities | PASS | PASS | Existing admin-plane and system-plane authorization remains authoritative. |
| RBAC-UX plane separation | PASS | PASS | /admin and /system remain separate; no cross-plane bypass is introduced. |
| Workspace + tenant isolation | PASS | PASS | Canonical admin routes remain workspace- and tenant-safe; system routes remain platform-scoped only. |
| Destructive confirmation standard | PASS | PASS | No new destructive action is introduced. Existing destructive flows remain governed by their originating features. |
| Global search safety | PASS | PASS | No global-search behavior changes are part of this slice. |
| Run observability / Ops-UX 3-surface contract | PASS | PASS | Existing OperationRun truth remains canonical; the feature changes presentation and polling seams only. |
| Ops lifecycle ownership | PASS | PASS | OperationRun.status and OperationRun.outcome remain service-owned; summary surfaces stay read-only. |
| Ops summary counts | PASS | PASS | No new summary_counts shape or key family is introduced. |
| Data minimization / DB-only render | PASS | PASS | Monitoring and dashboard surfaces remain DB-only and do not add render-time external calls. |
| Proportionality / no premature abstraction | PASS | PASS | The design reuses model scopes, presenter seams, and existing route helpers instead of adding a new lifecycle framework. |
| Persisted truth / behavioral state | PASS | PASS | No new table, persisted artifact, or top-level state family is added. |
| UI semantics / few layers | PASS | PASS | Only a thin derived problem-class split is introduced; it remains derived from existing lifecycle truth. |
| Badge semantics (BADGE-001) | PASS | PASS | Existing badge and presenter seams remain authoritative for status, outcome, and freshness meaning. |
| Filament-native UI / Action Surface Contract | PASS | PASS | Existing widgets, tables, pages, and detail surfaces remain in place; no redundant inspect model is introduced. |
| Filament UX-001 | PASS | PASS | Existing detail and list hierarchies remain intact; stale or reconciled truth is elevated inside existing summary structures. |
| Filament v5 / Livewire v4 compliance | PASS | PASS | The feature stays inside the current Filament v5 + Livewire v4 stack. |
| Provider registration location | PASS | PASS | No panel or provider change is required; Laravel 11+ registration remains in bootstrap/providers.php. |
| Global-search hard rule | PASS | PASS | No globally searchable resource is added or altered. |
| Asset strategy | PASS | PASS | No new assets or filament:assets deployment changes are needed. |
| Testing truth (TEST-TRUTH-001) | PASS | PASS | The plan adds business-truth regression coverage for alignment, visibility, and drill-through continuity rather than thin view-only tests. |

Phase 0 Research

Research outcomes are captured in /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/178-ops-truth-alignment/research.md.

Key decisions:

  • Reuse the existing OperationRunFreshnessState, lifecycle policy, reconciliation metadata, and OperationRun query scopes as the canonical lifecycle base instead of introducing a second lifecycle or problem-state model.
  • Introduce a thin derived split between terminal follow-up and active stale/stuck attention through existing model or presenter seams rather than a new taxonomy framework.
  • Extend the repo's current conditional polling pattern to BulkOperationProgress instead of introducing a new live-refresh mechanism.
  • Keep BulkOperationProgress as an active-only surface: terminal or reconciled runs should disappear from the overlay within one refresh cycle, while recent and attention surfaces carry their follow-up semantics.
  • Use /admin/operations as the sole canonical collection route and preserve problem-class continuity through filter or tab state rather than new routes.
  • Keep /system/ops/stuck focused on active stale candidates, but make reconciled stale lineage explicitly discoverable on system runs, failures, and detail surfaces so the stale truth chain does not disappear after reconciliation.
  • Elevate stale and reconciled lifecycle truth through the existing canonical decision-zone and guidance seams instead of a new detail surface or banner framework.
  • Keep notification and entry-point changes narrow by extending the existing OperationRunCompleted / OperationUxPresenter path instead of redesigning the notification subsystem.

Phase 1 Design

Design artifacts are created under /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/178-ops-truth-alignment/:

  • data-model.md: existing persistent source truth plus the derived cross-surface truth contracts for this slice
  • contracts/operations-truth-alignment.openapi.yaml: internal route and UI contract for summary buckets, canonical drill-through state, and system/admin monitoring continuity
  • quickstart.md: focused implementation and verification workflow for admin-plane, system-plane, and local-progress truth alignment

Design decisions:

  • OperationRun remains the only persistent lifecycle source of truth; the design uses existing freshness and reconciliation semantics rather than adding a new table or state family.
  • The narrowest shared seam is an extension of existing OperationRun query scopes and OperationUxPresenter-style rendering helpers so dashboard, workspace, monitoring, detail, and notification surfaces can agree on one derived problem-class split.
  • BulkOperationProgress gains the same conditional polling discipline already used elsewhere and remains an active-only affordance rather than becoming a second summary surface.
  • Dashboard, workspace, and recent-operation surfaces carry problem-class-specific drill-through metadata into /admin/operations, where the admin monitoring hub becomes the single canonical collection route for both terminal follow-up and active stale/stuck attention.
  • System monitoring stays within the current page family. Stuck remains active-stale focused, while Runs, Failures, and detail surfaces make reconciled stale lineage visible so operators can still recover the stale story after reconciliation.
  • Canonical run detail hardening happens inside existing summary and decision-zone seams so stale/reconciled attention is promoted without changing routing or page ownership.

Project Structure

Documentation (this feature)

specs/178-ops-truth-alignment/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── operations-truth-alignment.openapi.yaml
├── checklists/
│   └── requirements.md
└── tasks.md

Source Code (repository root)

app/
├── Filament/
│   ├── Pages/
│   │   ├── Monitoring/
│   │   │   └── Operations.php
│   │   └── Operations/
│   │       └── TenantlessOperationRunViewer.php
│   ├── System/
│   │   └── Pages/
│   │       └── Ops/
│   │           ├── Runs.php
│   │           ├── Failures.php
│   │           ├── Stuck.php
│   │           └── ViewRun.php
│   └── Widgets/
│       ├── Dashboard/
│       │   ├── DashboardKpis.php
│       │   ├── NeedsAttention.php
│       │   └── RecentOperations.php
│       └── Workspace/
│           ├── WorkspaceNeedsAttention.php
│           └── WorkspaceRecentOperations.php
├── Livewire/
│   └── BulkOperationProgress.php
├── Models/
│   └── OperationRun.php
├── Notifications/
│   └── OperationRunCompleted.php
└── Support/
    ├── Operations/
    │   ├── OperationLifecyclePolicy.php
    │   └── OperationRunFreshnessState.php
    ├── OpsUx/
    │   ├── ActiveRuns.php
    │   └── OperationUxPresenter.php
    ├── SystemConsole/
    │   └── StuckRunClassifier.php
    ├── Workspaces/
    │   └── WorkspaceOverviewBuilder.php
    └── OperationRunLinks.php

resources/
└── views/
    ├── filament/
    │   └── system/
    │       └── pages/
    │           └── ops/
    │               └── view-run.blade.php
    └── livewire/
        └── bulk-operation-progress.blade.php

tests/
└── Feature/
    ├── Filament/
    │   ├── DashboardKpisWidgetTest.php
    │   ├── NeedsAttentionWidgetTest.php
    │   ├── RecentOperationsSummaryWidgetTest.php
    │   └── WorkspaceOverviewOperationsTest.php
    ├── Monitoring/
    │   ├── MonitoringOperationsTest.php
    │   ├── OperationsDashboardDrillthroughTest.php
    │   ├── OperationsDbOnlyRenderTest.php
    │   ├── OperationsDbOnlyTest.php
    │   └── OperationsTenantScopeTest.php
    ├── Notifications/
    │   └── OperationRunNotificationTest.php
    ├── OpsUx/
    │   └── BulkOperationProgressDbOnlyTest.php
    ├── System/
    │   └── Spec114/
    │       ├── CanonicalRunDetailTest.php
    │       ├── OpsFailuresViewTest.php
    │       ├── OpsStuckViewTest.php
    │       └── OpsTriageActionsTest.php
    ├── Guards/
    │   └── ActionSurfaceContractTest.php
    └── RunAuthorizationTenantIsolationTest.php

Structure Decision: Keep the existing Laravel monolith structure. The implementation should extend current model scopes, presenters, widgets, monitoring pages, and system pages instead of introducing a new operations-overview layer or new directory family.

Implementation Strategy

Phase A — Introduce One Shared Derived Problem-Class Contract

Goal: Let every covered surface derive the same operator-facing split between terminal follow-up and active stale/stuck attention from the existing lifecycle truth.

| Step | File | Change |
|---|---|---|
| A.1 | app/Models/OperationRun.php | Add narrow query scopes or helpers for terminal follow-up and active stale/stuck attention, keeping dashboardNeedsFollowUp() as a compatibility umbrella if needed. |
| A.2 | app/Support/OpsUx/OperationUxPresenter.php and existing freshness helpers | Centralize derived problem-class text, stale-lineage wording, and row/detail/notification display decisions using existing freshness and reconciliation truth. |
| A.3 | app/Support/OpsUx/ActiveRuns.php | Extend or refine active-run polling helpers so summary and progress surfaces can poll only while relevant active runs still exist. |
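
As a rough illustration of the Phase A contract, the derived split could look something like the sketch below. Every name in it is hypothetical: ProblemClass, deriveProblemClass(), and the status and outcome strings are stand-ins chosen for this example, not the repo's actual enum cases or scope names. The only point it tries to show is that the class is computed at read time from lifecycle facts that already exist, with nothing persisted.

```php
<?php

// Hypothetical sketch: names and string values are assumptions, not the repo's API.
enum ProblemClass: string
{
    case None = 'none';
    case TerminalFollowUp = 'terminal_follow_up';
    case ActiveStaleAttention = 'active_stale_attention';
}

/**
 * Derive the operator-facing problem class from existing lifecycle truth.
 * Nothing is stored; the result is recomputed on every read.
 */
function deriveProblemClass(
    string $status,
    ?string $outcome,
    bool $isLikelyStale,
    bool $wasReconciled,
): ProblemClass {
    $isTerminal = $status === 'completed'; // assumed terminal status value

    // A run that is still active but looks stale needs attention now.
    if (! $isTerminal && $isLikelyStale) {
        return ProblemClass::ActiveStaleAttention;
    }

    // A terminal run needs follow-up when it did not succeed, or when it
    // only reached a terminal state through reconciliation.
    if ($isTerminal && ($outcome !== 'succeeded' || $wasReconciled)) {
        return ProblemClass::TerminalFollowUp;
    }

    return ProblemClass::None;
}
```

If the real implementation lands as query scopes on OperationRun, the same two conditions would presumably be expressed as where clauses rather than a pure function; the branching order is the part worth keeping consistent across surfaces.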

Phase B — Harden Local Progress And Summary Surfaces

Goal: Remove stale local-progress residue and separate terminal problems from active stale attention on tenant and workspace entry surfaces.

| Step | File | Change |
|---|---|---|
| B.1 | app/Livewire/BulkOperationProgress.php and resources/views/livewire/bulk-operation-progress.blade.php | Add conditional polling, canonical refresh behavior, and active-only visibility so terminal or reconciled runs disappear within one refresh cycle. |
| B.2 | app/Filament/Widgets/Dashboard/DashboardKpis.php and app/Filament/Widgets/Dashboard/NeedsAttention.php | Split mixed operations follow-up into explicit terminal vs stale-active buckets with one matching destination each. |
| B.3 | app/Filament/Widgets/Dashboard/RecentOperations.php | Surface freshness/problem-class truth per row so stale candidates and terminal follow-up do not read like generic recent activity. |
| B.4 | app/Support/Workspaces/WorkspaceOverviewBuilder.php, app/Filament/Widgets/Workspace/WorkspaceNeedsAttention.php, and app/Filament/Widgets/Workspace/WorkspaceRecentOperations.php | Mirror the same split and row semantics on workspace surfaces so workspace and tenant summaries speak the same truth grammar. |
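
The conditional polling and active-only visibility described for B.1 can be sketched in plain PHP as follows. This is a minimal illustration under assumptions: visibleProgressRuns() and progressPollInterval() are hypothetical helper names, the run arrays stand in for OperationRun records, and 'queued' and 'running' are assumed active status values. The 10-second value mirrors the existing cadence stated in the performance goals.

```php
<?php

/**
 * Keep only runs that are still active (assumed statuses: queued, running).
 * Terminal or reconciled runs drop out on the next refresh.
 *
 * @param  list<array{id: int, status: string}>  $runs
 * @return list<array{id: int, status: string}>
 */
function visibleProgressRuns(array $runs): array
{
    return array_values(array_filter(
        $runs,
        static fn (array $run): bool => in_array($run['status'], ['queued', 'running'], true),
    ));
}

/**
 * Return a polling interval only while something is still active;
 * null means the surface should stop polling altogether.
 *
 * @param  list<array{id: int, status: string}>  $runs
 */
function progressPollInterval(array $runs): ?string
{
    return visibleProgressRuns($runs) === [] ? null : '10s';
}
```

In a Livewire view, a non-null interval could plausibly drive a wire:poll.10s attribute while a null result omits the attribute, so the overlay stops polling as soon as no relevant run remains.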

Phase C — Align Canonical Operations Hub And Drill-Through State

Goal: Make /admin/operations the canonical collection route for both problem classes and preserve the originating class through every entry point.

| Step | File | Change |
|---|---|---|
| C.1 | app/Filament/Pages/Monitoring/Operations.php | Add or tighten problem-class-aware filters/tabs for active stale/stuck attention and terminal follow-up without creating a new monitoring page. |
| C.2 | app/Support/OperationRunLinks.php | Carry tenant-safe problem-class filter state and existing navigation context into canonical operations links from dashboard, workspace, and notification entry points. |
| C.3 | Existing summary and recent-operation surfaces | Replace broad "needs follow-up" drill-throughs with explicit problem-class destinations so the landing page confirms the originating operator story. |
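
One way to picture the drill-through continuity in Phase C is a single link builder that always targets /admin/operations and encodes the originating problem class as query state. The sketch below is hypothetical: operationsHubUrl(), the problemClass and tenant query keys, and the two class values are assumptions for illustration, and the real OperationRunLinks helper may carry this state through Filament filter or tab parameters instead. The tenant value here is only a filter hint; authorization would remain with the existing policies.

```php
<?php

/**
 * Build a canonical operations-hub link that preserves the problem class
 * the operator clicked on. Query keys are illustrative assumptions.
 */
function operationsHubUrl(string $problemClass, ?int $tenantId = null): string
{
    $allowed = ['terminal_follow_up', 'active_stale_attention'];

    if (! in_array($problemClass, $allowed, true)) {
        throw new InvalidArgumentException("Unknown problem class: {$problemClass}");
    }

    // Drop the tenant key entirely when no tenant filter applies.
    $query = array_filter(
        ['problemClass' => $problemClass, 'tenant' => $tenantId],
        static fn (mixed $value): bool => $value !== null,
    );

    return '/admin/operations?' . http_build_query($query);
}
```

Rejecting unknown classes at link-build time keeps dashboard, workspace, and notification entry points from silently falling back to an unfiltered or umbrella view.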

Phase D — Preserve Stale Truth Across Canonical And System Monitoring Surfaces

Goal: Keep stale/reconciled lineage visible to operators after auto-reconciliation and promote stale/reconciled truth inside the primary decision hierarchy.

| Step | File | Change |
|---|---|---|
| D.1 | Existing canonical run-detail composition seams under OperationRunResource / TenantlessOperationRunViewer | Elevate likely-stale and reconciled lifecycle truth inside the existing decision-zone/current-state summary rather than leaving it only in secondary banners or diagnostics. |
| D.2 | app/Filament/System/Pages/Ops/Runs.php, Failures.php, Stuck.php, and ViewRun.php | Keep Stuck focused on active stale candidates while exposing reconciled stale lineage on system runs, failures, and detail so platform operators can still recover the stale story after reconciliation. |
| D.3 | resources/views/filament/system/pages/ops/view-run.blade.php | Strengthen stale/reconciled visual emphasis in the existing guidance/current-state rendering instead of adding a new detail surface. |
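
The placement rule behind D.2 can be sketched as a small pure function: Stuck only ever lists active stale candidates, while reconciled stale lineage is flagged on the surfaces that still show the run after it becomes terminal. Everything here is an assumption made for illustration, including systemPlacement(), the array keys, the surface keys, and the premise that Failures lists completed runs with a failed outcome; the real classifier and page queries may differ.

```php
<?php

/**
 * Decide which system-panel surfaces list a run and whether stale lineage
 * should be shown alongside it. All keys and values are hypothetical.
 *
 * @param  array{status: string, outcome: ?string, likely_stale: bool, reconciled_from_stale: bool}  $run
 * @return array{surfaces: list<string>, show_stale_lineage: bool}
 */
function systemPlacement(array $run): array
{
    $isTerminal = $run['status'] === 'completed'; // assumed terminal status value

    $surfaces = ['runs']; // assumed: Runs lists every run

    if (! $isTerminal && $run['likely_stale']) {
        $surfaces[] = 'stuck'; // active stale candidates only
    }

    if ($isTerminal && $run['outcome'] === 'failed') {
        $surfaces[] = 'failures';
    }

    return [
        'surfaces' => $surfaces,
        // Lineage survives reconciliation on Runs, Failures, and detail.
        'show_stale_lineage' => $isTerminal && $run['reconciled_from_stale'],
    ];
}
```

Under these assumptions, a run reconciled to a failed outcome leaves Stuck but keeps a visible lineage marker on Runs and Failures, which is the recovery path the plan wants platform operators to retain.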

Phase E — Keep Notifications And Entry-Point Wording Truthful

Goal: Ensure entry points never frame a run more calmly than its current lifecycle or freshness truth.

| Step | File | Change |
|---|---|---|
| E.1 | app/Notifications/OperationRunCompleted.php and existing presenter seams | Preserve problem-class wording and stale-lineage emphasis in terminal notifications and linked entry-point copy. |
| E.2 | Existing dashboard/workspace/navigation copy | Keep "needs follow-up" as an umbrella only when the concrete sub-class remains visible and recoverable. |
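
The Phase E goal, that an entry point never frames a run more calmly than its current truth, amounts to checking stale lineage before falling back to outcome-based wording. The sketch below is a hypothetical illustration: notificationTitle(), the outcome strings, and the copy itself are invented for this example and are not the wording used by OperationRunCompleted or OperationUxPresenter.

```php
<?php

/**
 * Pick a terminal-notification title. Reconciled-from-stale lineage wins
 * over the plain outcome so the copy is never calmer than the truth.
 * Outcome values and titles are illustrative assumptions.
 */
function notificationTitle(string $outcome, bool $reconciledFromStale): string
{
    if ($reconciledFromStale) {
        return 'Operation was reconciled after going stale';
    }

    return match ($outcome) {
        'succeeded' => 'Operation completed',
        'partially_succeeded' => 'Operation completed with warnings',
        default => 'Operation failed', // unknown outcomes are treated as failures
    };
}
```

Treating unknown outcomes as failures is a deliberate bias in this sketch: when the presenter cannot classify a run, the louder wording is the safer default.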

Phase F — Regression Protection And Verification

Goal: Lock the truth contract into tests and preserve DB-only rendering, authorization semantics, and system/admin continuity.

| Step | File | Change |
|---|---|---|
| F.1 | tests/Feature/Filament/DashboardKpisWidgetTest.php, NeedsAttentionWidgetTest.php, RecentOperationsSummaryWidgetTest.php, and WorkspaceOverviewOperationsTest.php | Add assertions for terminal-vs-stale separation, row truth, and workspace/tenant summary parity. |
| F.2 | tests/Feature/OpsUx/BulkOperationProgressDbOnlyTest.php | Prove polling freshness and active-only visibility without enqueue-event dependence. |
| F.3 | tests/Feature/Monitoring/MonitoringOperationsTest.php, OperationsDashboardDrillthroughTest.php, OperationsDbOnlyRenderTest.php, OperationsDbOnlyTest.php, OperationsTenantScopeTest.php, and RunAuthorizationTenantIsolationTest.php | Prove canonical hub filters, drill-through continuity, DB-only rendering, and tenant-safe access semantics. |
| F.4 | tests/Feature/System/Spec114/CanonicalRunDetailTest.php, OpsFailuresViewTest.php, OpsStuckViewTest.php, OpsTriageActionsTest.php, and tests/Feature/Notifications/OperationRunNotificationTest.php | Prove stale/reconciled visibility across system detail, failures, stuck surfaces, and notifications. |
| F.5 | tests/Feature/Guards/ActionSurfaceContractTest.php plus vendor/bin/sail bin pint --dirty --format agent | Preserve surface-contract compliance and formatting before implementation is considered complete. |

Key Design Decisions

D-001 — Preserve OperationRunFreshnessState as the lifecycle base and derive problem class above it

The repo already has a narrow, useful freshness enum. The plan builds the operator-facing split above that existing truth instead of adding a second enum or persisted state family.

D-002 — Use existing model/presenter seams instead of a new taxonomy framework

The narrowest implementation is to extend OperationRun query scopes, OperationUxPresenter, and existing route helpers. A new cross-domain classification framework would be disproportionate to the problem.

D-003 — Keep BulkOperationProgress active-only

The overlay should not become a second follow-up surface. Once a run is terminal or reconciled, it should leave the overlay and let recent/attention/canonical surfaces tell the rest of the story.

D-004 — /admin/operations remains the only canonical collection route

Problem-class continuity should be achieved with filters/tabs and link state, not by introducing tenant-specific or class-specific duplicate routes.

D-005 — Stuck remains active-stale focused, but stale lineage must survive reconciliation elsewhere in system monitoring

Widening /system/ops/stuck into a mixed registry would blur its purpose. The narrower design keeps active stale candidates there and makes reconciled stale lineage visible on /system/ops/runs, /system/ops/failures, and system detail.

D-006 — Decision-zone emphasis is the canonical fix for stale/reconciled truth

The canonical detail surface already has the right ownership. The plan promotes stale/reconciled truth inside that existing hierarchy instead of inventing a new banner or side-panel architecture.

D-007 — Notification changes stay on the existing OperationRunCompleted path

The feature should extend current terminal notification semantics, not create a new notification subsystem or duplicate entry-point logic elsewhere.

Risk Assessment

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Summary surfaces still share labels but not identical filter meaning | High | Medium | Carry problem-class state through OperationRunLinks and verify destination continuity in focused tests. |
| Bulk progress stays live too long or polls too often | High | Medium | Reuse the existing conditional polling pattern and verify convergence/stoppage behavior in DB-only tests. |
| System surfaces lose stale lineage after reconciliation | High | Medium | Keep stale lineage explicit on system runs, failures, and detail even when Stuck remains active-only. |
| Canonical detail duplicates stale/reconciled emphasis in multiple equal-priority areas | Medium | Medium | Reuse the existing decision-zone hierarchy and verify emphasis through detail tests instead of adding parallel warning surfaces. |
| The thin derived split drifts into a new framework over time | Medium | Low | Keep it constrained to existing model scopes, presenter seams, route helpers, and regression coverage. |

Test Strategy

  • Extend tenant and workspace summary widget coverage so dashboard and workspace attention surfaces clearly separate terminal follow-up from active stale/stuck attention.
  • Extend BulkOperationProgressDbOnlyTest.php to prove terminal/reconciled runs disappear within one refresh cycle and polling stops when no relevant active runs remain.
  • Extend canonical monitoring tests so /admin/operations filters, row rendering, and drill-throughs preserve problem-class continuity and remain DB-only and tenant-safe.
  • Extend system-panel tests so /system/ops/stuck, /system/ops/failures, /system/ops/runs, and system detail preserve stale/reconciled lineage in a way platform operators can recover.
  • Extend notification tests so completed notifications and linked destinations do not frame reconciled or stale-derived terminal runs more calmly than current truth.
  • Run focused Pest suites through Sail plus vendor/bin/sail bin pint --dirty --format agent; full-suite execution is not required for planning artifacts.

Post-Design Constitution Re-check

  • PASS The design keeps OperationRun as the only persistent lifecycle truth and introduces no new schema or state family.
  • PASS The derived problem-class split remains narrow and stays within existing model, presenter, and route-helper seams.
  • PASS Admin-plane and system-plane surfaces stay separated and tenant-safe; no cross-plane leakage path is introduced.
  • PASS The run-detail emphasis work reuses existing decision-zone ownership instead of adding a second detail hierarchy.
  • PASS No new destructive actions, global-search changes, asset registration, or provider registration changes are required.
  • PASS Livewire v4 and Filament v5 compliance remains intact.

Complexity Tracking

No constitution waiver is expected. This slice hardens shared truth semantics and removes drift without introducing new persistence, new orchestration, or a new semantic framework.