TenantAtlas/specs/365-operations-ui-operator-actions-regression-gate/plan.md
ahmido 6ac0913ff8 feat: implement operations UI operator actions regression gate (#436)
Implemented operations UI operator actions regression gate.

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #436
2026-06-08 01:21:14 +00:00


Implementation Plan: Operations UI Operator Actions & Regression Gate

Branch: 365-operations-ui-operator-actions-regression-gate | Date: 2026-06-07 | Spec: spec.md
Input: Feature specification from /specs/365-operations-ui-operator-actions-regression-gate/spec.md

Note: This plan is a preparation artifact only. It defines the implementation path and validation gate; it does not implement application code.

Summary

Spec 365 completes the OperationRun/Reconciliation program by making existing run truth actionable for operators in the existing Operations hub and OperationRun detail page. The implementation approach is to add one central, derived action-eligibility resolver; reuse the existing reconciliation and related-link seams; integrate safe actions into existing Filament surfaces; and add a final regression matrix across Specs 358-364.

The plan intentionally does not introduce a new adapter framework, a generic retry engine, a new persisted action table, or new top-level Operations pages. Retry is limited to operation families with a repo-verified safe retry/start seam; unsupported families fail closed with a disabled/deferred reason.
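
The fail-closed retry rule described above could be sketched roughly as follows. This is a minimal illustration, not the planned implementation: `ActionDecision`, `resolveRetry`, the `$safeRetryFamilies` allowlist, and the reason strings are hypothetical names that do not come from the repository.

```php
<?php

// Hypothetical value object: the derived decision for one action on one run.
final class ActionDecision
{
    public function __construct(
        public readonly string $action,
        public readonly bool $enabled,
        public readonly ?string $disabledReason = null,
    ) {}
}

/**
 * Fail-closed retry decision (illustrative only).
 *
 * @param string   $family            operation family of the run (hypothetical values)
 * @param bool     $isHighRisk        true for high-risk runs such as restore
 * @param string[] $safeRetryFamilies families with a verified safe retry/start seam
 */
function resolveRetry(string $family, bool $isHighRisk, array $safeRetryFamilies): ActionDecision
{
    // High-risk runs never get a retry, regardless of the allowlist.
    if ($isHighRisk) {
        return new ActionDecision('retry', false, 'high_risk_retry_forbidden');
    }

    // Unknown or unverified family: disabled with a deferred reason.
    if (! in_array($family, $safeRetryFamilies, true)) {
        return new ActionDecision('retry', false, 'retry_seam_not_verified');
    }

    return new ActionDecision('retry', true);
}
```

The point of the sketch is the ordering: the high-risk check runs first, and anything not explicitly allowlisted ends up disabled with a reason rather than silently hidden or enabled.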

Technical Context

Language/Version: PHP 8.4.15, Laravel 12.52.0
Primary Dependencies: Filament 5.2.1, Livewire 4.1.4, Pest 4.3.1, Laravel Sail 1.x
Storage: PostgreSQL; existing operation_runs table/context JSON only. No new table planned.
Testing: Pest 4 unit/feature/browser tests. Filament action tests must mount Livewire components/pages, not static resource classes.
Validation Lanes: fast-feedback, confidence, browser.
Target Platform: Laravel web application under apps/platform.
Project Type: Laravel + Filament application.
Performance Goals: Operations hub and detail remain DB-only render paths for run status; no Graph/provider calls during UI render.
Constraints: Fail closed for uncertain actions; no high-risk retry; no force-success; no raw diagnostics by default; no asset changes unless proven necessary.
Scale/Scope: Existing Operations hub/detail and related domain links. No new top-level IA.

Repo Truth Captured During Prep

  • Current branch at prep time: 365-operations-ui-operator-actions-regression-gate
  • Baseline HEAD: 3ce1cae7 feat: implement restore high risk operation reconciliation (#435)
  • Baseline status before artifacts: clean except the new Spec 365 directory.
  • Spec 364 baseline context: restore high-risk reconciliation has completed task/checklist artifacts and is treated as the immediate completed predecessor.
  • Current package baseline from Laravel Boost:
    • PHP 8.4.15
    • Laravel 12.52.0
    • Filament 5.2.1
    • Livewire 4.1.4
    • Pest 4.3.1

Relevant implementation files discovered:

apps/platform/app/Models/OperationRun.php
apps/platform/app/Services/OperationRunService.php
apps/platform/app/Services/AdapterRunReconciler.php
apps/platform/app/Support/Operations/Reconciliation/OperationRunReconciliationRegistry.php
apps/platform/app/Support/Operations/Reconciliation/*
apps/platform/app/Support/OperationRunType.php
apps/platform/app/Support/OperationRunStatus.php
apps/platform/app/Support/OperationRunOutcome.php
apps/platform/app/Support/OperationCatalog.php
apps/platform/app/Support/OperationRunCapabilityResolver.php
apps/platform/app/Support/Auth/Capabilities.php
apps/platform/app/Policies/OperationRunPolicy.php
apps/platform/app/Support/OpsUx/OperationUxPresenter.php
apps/platform/app/Support/OpsUx/OperationRunProgressContract.php
apps/platform/app/Support/OperationRunLinks.php
apps/platform/app/Support/Navigation/RelatedNavigationResolver.php
apps/platform/app/Support/Navigation/RelatedActionLabelCatalog.php
apps/platform/app/Filament/Pages/Monitoring/Operations.php
apps/platform/app/Filament/Resources/OperationRunResource.php
apps/platform/app/Filament/Pages/Operations/TenantlessOperationRunViewer.php
apps/platform/app/Services/Audit/AuditRecorder.php
apps/platform/app/Services/Audit/WorkspaceAuditLogger.php

Repo-specific decisions:

  • OperationRunService remains the write seam for OperationRun state/outcome/reconciliation transitions.
  • AdapterRunReconciler and OperationRunReconciliationRegistry are the only approved reconciliation execution path.
  • OperationRunLinks / RelatedNavigationResolver are the preferred related-object navigation path.
  • OperationRunPolicy and OperationRunCapabilityResolver remain the authorization entry points; do not use raw capability strings.
  • TenantlessOperationRunViewer::resumeCaptureAction() is a narrow existing resume-like seam and must be reconciled into the central eligibility model if touched.
  • No repo-wide generic retry seam was found during prep; broad retry must remain unavailable/deferred unless implementation verifies or adds a bounded safe seam per operation family.

UI / Surface Guardrail Plan

  • Guardrail scope: changed surfaces.
  • Affected routes/pages/actions/states/navigation/panel/provider surfaces:
    • /admin/workspaces/{workspace}/operations
    • tenantless OperationRun detail page
    • OperationRun detail header action groups
    • related domain object action links
  • No-impact class, if applicable: N/A.
  • Native vs custom classification summary: mixed but Filament-native first. Reuse existing native table/detail actions and existing OperationRun detail sections; avoid new styling systems.
  • Shared-family relevance: status messaging, header actions, related navigation, evidence/report/restore links, diagnostics disclosure.
  • State layers in scope: page, table row, detail header, detail sections, action modal state.
  • Audience modes in scope: operator-MSP and support-platform; customer-readable defaults must remain calm and non-technical.
  • Decision/diagnostic/raw hierarchy plan: decision-first default, diagnostics second, support/raw third.
  • Raw/support gating plan: collapsed and capability-gated.
  • One-primary-action / duplicate-truth control: OperationRunActionEligibility output is the single source for primary action and disabled reasons consumed by list/detail/actions.
  • Handling modes by drift class or surface: review-mandatory for high-risk action surface and raw leakage guard.
  • Repository-signal treatment: report-only for existing Operations page audit docs unless implementation materially changes IA; review-mandatory for dangerous-action and customer-safe checks.
  • Special surface test profiles: monitoring-state-page, shared-detail-family.
  • Required tests or manual smoke: functional core + state-contract + browser smoke.
  • Exception path and spread control: none planned. Any generic retry exception must be documented in this feature and covered by tests.
  • Active feature PR close-out entry: Guardrail + Smoke Coverage.
  • UI/Productization coverage decision: Existing strategic Operations page is materially changed; update existing coverage docs or record proportional no-update rationale during implementation close-out.
  • Coverage artifacts to update: update docs/ui-ux-enterprise-audit/page-reports/ui-003-operations.md / design matrix when implementation changes layout, action hierarchy, state hierarchy, or screenshots materially. A no-update rationale is allowed only when changes are limited to existing pattern-compatible action/copy wiring, and that rationale must be recorded in tasks.md close-out.
  • No-impact rationale: N/A.
  • Navigation / Filament provider-panel handling: no panel provider or navigation change planned.
  • Screenshot or page-report need: screenshot/browser artifact recommended for the final smoke; no new page report required during prep.
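
The decision/diagnostic/raw hierarchy and the raw/support gating in the list above might look roughly like this in code. It is a sketch under stated assumptions: `DisclosureTier` and `sectionsFor` are hypothetical names, and the sketch assumes diagnostics render collapsed for anyone who can view the run while raw/support content needs an explicit capability.

```php
<?php

// Hypothetical tiers mirroring the decision / diagnostics / raw hierarchy.
enum DisclosureTier: string
{
    case Decision = 'decision';
    case Diagnostics = 'diagnostics';
    case Raw = 'raw';
}

/**
 * Returns the sections to render, in order, with their default collapsed state.
 * Assumption: raw/support content is added only when the capability is present,
 * and it starts collapsed even then.
 *
 * @return array<int, array{tier: DisclosureTier, collapsed: bool}>
 */
function sectionsFor(bool $canViewRawDiagnostics): array
{
    $sections = [
        ['tier' => DisclosureTier::Decision, 'collapsed' => false],
        ['tier' => DisclosureTier::Diagnostics, 'collapsed' => true],
    ];

    if ($canViewRawDiagnostics) {
        $sections[] = ['tier' => DisclosureTier::Raw, 'collapsed' => true];
    }

    return $sections;
}
```

In the real application the boolean would come from the existing policy/capability resolver rather than being passed in directly.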

Shared Pattern & System Fit

  • Cross-cutting feature marker: yes.
  • Systems touched: OperationRun UX presenter/progress, related navigation, reconciliation registry/service, authorization policy/capability resolver, audit logging, localization.
  • Shared abstractions reused: OperationUxPresenter, OperationRunProgressContract, OperationRunLinks, RelatedNavigationResolver, AdapterRunReconciler, OperationRunService, OperationRunPolicy.
  • New abstraction introduced? why?: Yes, a narrow action eligibility resolver because no existing shared layer combines status/outcome/freshness/risk/adapter/RBAC/scope/related metadata into a single action contract.
  • Why the existing abstraction was sufficient or insufficient: Existing presenters and services are sufficient for summaries, links, and writes, but insufficient for enforcing one primary action and consistent action permissions across list and detail surfaces.
  • Bounded deviation / spread control: The resolver must not create a new operation taxonomy, adapter registry, or persisted action model.

OperationRun UX Impact

  • Touches OperationRun start/completion/link UX?: yes.
  • Central contract reused: OperationUxPresenter, OperationRunProgressContract, OperationRunLinks, OperationRunService.
  • Delegated UX behaviors: artifact links, run detail links, reconciliation result handling, lifecycle result display, tenant/workspace-safe URLs.
  • Surface-owned behavior kept local: visible hierarchy and invocation of approved actions.
  • Queued DB-notification policy: no new policy.
  • Terminal notification path: existing central lifecycle mechanism.
  • Exception path: Generic retry is not approved by this plan. If implemented, it must be through an existing or bounded safe start seam with explicit tests and audit.

Provider Boundary & Portability Fit

  • Shared provider/platform boundary touched?: yes, bounded.
  • Provider-owned seams: existing provider reason codes and provider failure summaries in canonical OperationRun context.
  • Platform-core seams: OperationRun action eligibility, outcome summaries, high-risk classification, related action display.
  • Neutral platform terms / contracts preserved: operation, outcome, attention, reconcile, retry, verification, partial, blocked, related evidence.
  • Retained provider-specific semantics and why: Provider reason codes remain diagnostics only because they help operators/support understand blocked/partial states.
  • Bounded extraction or follow-up path: none planned.

Constitution Check

GATE: Must pass before implementation. Re-check after design and before code merge.

  • Inventory-first: PASS. This spec reads existing OperationRun and related artifact truth only.
  • Read/write separation: PASS with constraints. Reconcile/retry are explicit operator actions with authorization, confirmation where appropriate, and audit.
  • Graph contract path: PASS. No Graph calls in UI render or action eligibility. Retry seam, if any, must use existing service/jobs and Graph abstractions.
  • Deterministic capabilities: PASS. Use Capabilities constants and OperationRunCapabilityResolver/policies. Add no raw strings.
  • RBAC-UX: PASS. Preserve non-member/not-entitled 404 and member-missing-capability 403 semantics.
  • Workspace isolation: PASS. All actions must enforce workspace/environment scope.
  • Destructive-like actions: PASS. This spec forbids destructive and force-success actions. State-changing reconcile/retry actions require confirmation/audit as appropriate.
  • Global search: PASS. OperationRunResource remains not globally searchable unless separately changed with View/Edit contract.
  • Run observability: PASS. Reconcile uses existing run write seam; retry creates a run only through safe start seam.
  • OperationRun start UX: PASS with constraint. Retry must reuse central start UX if implemented.
  • Ops-UX 3-surface feedback: PASS. No new terminal DB notification behavior planned.
  • Ops-UX lifecycle: PASS. OperationRunService owns state/outcome/reconciliation writes.
  • Ops-UX summary counts: PASS. No new summary count keys planned; any touched keys must use OperationSummaryKeys.
  • Data minimization: PASS. Raw context hidden/gated; no secrets in audit.
  • Test governance: PASS. Unit, feature, and browser lane plans are explicit.
  • Proportionality: PASS. New resolver is justified by safety, RBAC, high-risk guard, and multiple current concrete run families.
  • No premature abstraction: PASS. One resolver replaces scattered action decision logic and is not a registry/framework.
  • Persisted truth: PASS. No new independent persisted truth.
  • Behavioral state: PASS. No new status/outcome family.
  • UI semantics: PASS. Derived summaries and actions map from existing domain/run truth.
  • Shared pattern first: PASS. Existing presenters/links/services are reused.
  • Provider boundary: PASS. Provider details remain diagnostics only.
  • V1 explicitness / few layers: PASS. One narrow derived layer.
  • Spec discipline / bloat check: PASS. Proportionality review complete.
  • Badge semantics: PASS. Implementation must reuse badge/shared status rendering if status-like badges change.
  • Filament-native UI: PASS. Use native Filament actions/sections and existing shared primitives.
  • UI/UX surface taxonomy: PASS. Surfaces classified.
  • Decision-first operating model: PASS. Required default-visible fields and hierarchy defined.
  • Audience-aware disclosure: PASS. Raw/support content hidden/gated.
  • Filament UI Action Surface Contract: PASS with implementation tasks.
  • UI/Productization coverage: PASS. Impact classified; coverage update/no-update rationale required at close-out.
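
The RBAC-UX item in the checklist above distinguishes two denial outcomes. A minimal sketch of that mapping, using a hypothetical helper name (`authorizationStatus`) and plain booleans in place of the real policy and capability resolver:

```php
<?php

/**
 * Maps scope membership and capability to an HTTP status (illustrative only).
 * Non-members and not-entitled users get 404; members who lack the
 * capability get 403.
 */
function authorizationStatus(bool $isMemberOfScope, bool $hasCapability): int
{
    // Membership is checked first, so a non-member never sees a 403.
    if (! $isMemberOfScope) {
        return 404;
    }

    return $hasCapability ? 200 : 403;
}
```

Feature tests for direct action invocation could assert both branches, so that hiding an action in the UI is never the only protection.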

Test Governance Check

  • Test purpose / classification by changed surface: Unit for resolver/presenter; Feature for actions/RBAC/scope/audit/related links; Browser for Operations UI decision-first and raw leakage guard.
  • Affected validation lanes: fast-feedback, confidence, browser.
  • Why this lane mix is the narrowest sufficient proof: Action safety is mostly deterministic logic plus server-side enforcement; browser coverage is limited to the user-visible matrix and leakage guard.
  • Narrowest proving command(s):
    • cd apps/platform && ./vendor/bin/sail artisan test --compact --filter=Spec365
    • cd apps/platform && ./vendor/bin/pint --dirty
    • git diff --check
  • Fixture / helper / factory / seed / context cost risks: Need canonical OperationRun fixtures for each matrix state; keep as explicit Spec365 helpers only.
  • Expensive defaults or shared helper growth introduced?: no; browser fixtures must not become global defaults.
  • Heavy-family additions, promotions, or visibility changes: one explicit browser smoke file.
  • Surface-class relief / special coverage rule: monitoring-state-page/shared-detail-family special coverage applies.
  • Closing validation and reviewer handoff: Run Spec365 plus Spec358-364 regression filters and browser smoke. Review retry deferrals explicitly.
  • Budget / baseline / trend follow-up: none expected.
  • Review-stop questions: Does any action bypass resolver/policy? Does any retry path lack a safe start seam? Does high-risk restore expose unsafe copy/action? Are raw diagnostics default-visible? Does every state-changing action disclose TenantPilot-only, Microsoft-tenant, or simulation-only scope before execution?
  • Escalation path: document-in-feature for retry/acknowledge deferrals; follow-up-spec for a generic retry framework.
  • Active feature PR close-out entry: Guardrail + Smoke Coverage.
  • Why no dedicated follow-up spec is needed: This completes the current OperationRun/Reconciliation program without introducing a larger governance inbox.

Project Structure

Documentation (this feature)

specs/365-operations-ui-operator-actions-regression-gate/
├── spec.md
├── plan.md
├── tasks.md
├── checklists/
│   └── requirements.md
└── artifacts/
    ├── spec365-action-eligibility-matrix.md
    └── spec365-regression-gate-matrix.md

Source Code (repository root)

Expected implementation paths:

apps/platform/app/
├── Filament/
│   ├── Pages/Monitoring/Operations.php
│   ├── Pages/Operations/TenantlessOperationRunViewer.php
│   └── Resources/OperationRunResource.php
├── Policies/OperationRunPolicy.php
├── Services/OperationRunService.php
├── Services/AdapterRunReconciler.php
├── Support/
│   ├── Auth/Capabilities.php
│   ├── OperationRunCapabilityResolver.php
│   ├── OperationRunLinks.php
│   ├── OpsUx/
│   │   ├── OperationUxPresenter.php
│   │   └── OperationRunProgressContract.php
│   └── Operations/
│       ├── OperationRunActionEligibility.php
│       └── Reconciliation/
│           └── OperationRunReconciliationRegistry.php
└── Services/Audit/
    ├── AuditRecorder.php
    └── WorkspaceAuditLogger.php

apps/platform/lang/
├── en/localization.php
└── de/localization.php

apps/platform/tests/
├── Unit/Support/Operations/
├── Unit/Support/OpsUx/
├── Feature/Operations/
└── Browser/

Phase 0 - Research

Completed during prep:

  • Read repository governance and architecture docs required by AGENTS.md.
  • Verified installed Laravel/Filament/Livewire/Pest versions with Laravel Boost.
  • Searched current Filament/Livewire/Pest docs through Laravel Boost for action confirmation/testing/global search constraints.
  • Audited OperationRun model/service/reconciliation registry/link/policy/presenter/UI files.
  • Checked completed Specs 358-364 for baseline context and scope continuity.

Research conclusions:

  • Central action eligibility is justified.
  • Reconcile can reuse existing registry/reconciler/service seams.
  • Related links should reuse existing link/navigation resolvers.
  • Generic retry is not repo-real yet and must be explicitly bounded.
  • OperationRun acknowledge should be deferred unless a clean existing seam is verified during implementation.

Phase 1 - Design

Design artifacts:

  • artifacts/spec365-action-eligibility-matrix.md
  • artifacts/spec365-regression-gate-matrix.md

Primary design decisions:

  1. Use one derived resolver for action decisions.
  2. Keep writes service-owned.
  3. Keep related navigation canonical and scope-safe.
  4. Keep high-risk operations fail-closed.
  5. Keep raw/support diagnostics hidden and gated.
  6. Keep Operations as the central surface.

Phase 2 - Implementation Approach

Implementation sequence:

  1. Add resolver and DTO/presenter tests first.
  2. Add high-risk guard and forbidden-action tests.
  3. Wire related links and detail/list primary action display.
  4. Add safe reconcile Filament action with policy/audit/scope enforcement.
  5. Verify retry seams; implement only safe non-high-risk retry or document deferral.
  6. Add localization and summary presenter copy.
  7. Add feature tests for actions/RBAC/scope/audit.
  8. Add browser smoke for representative Operations UI states.
  9. Run Spec365 and Spec358-364 regression gate, including Spec363.

Risk Register

| Risk | Impact | Mitigation |
| --- | --- | --- |
| Generic retry grows into unsafe re-execution framework | High | Implement only repo-verified seams; otherwise disabled/deferred reason |
| UI action visibility diverges from direct action authorization | High | Central resolver plus policy checks plus direct-action feature tests |
| Restore/high-risk shows unsafe next action | High | High-risk guard unit/feature/browser tests |
| Raw provider/SQL/queue leakage appears in default UI | High | Presenter sanitization and browser leakage guard |
| Related links bypass scope checks | High | Reuse existing link/policy resolvers and add cross-scope tests |
| New resolver becomes taxonomy framework | Medium | Keep derived, non-persisted, no new enum/status family |
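
One possible shape for the presenter sanitization named as a mitigation for raw leakage is an allowlist rather than pattern-based filtering. This is only a sketch: `defaultVisibleSummary` and the key names are hypothetical and not taken from the repository.

```php
<?php

/**
 * Builds the default-visible summary from an allowlist of context keys.
 * Anything not allowlisted (for example raw provider payloads, SQL, or queue
 * details) is dropped instead of being filtered by pattern.
 * The key names are hypothetical.
 *
 * @param array<string, mixed> $context
 * @return array<string, string>
 */
function defaultVisibleSummary(array $context): array
{
    $allowed = ['outcome_label', 'attention_reason', 'next_step'];
    $summary = [];

    foreach ($allowed as $key) {
        // Only plain strings pass; nested structures are never shown by default.
        if (isset($context[$key]) && is_string($context[$key])) {
            $summary[$key] = $context[$key];
        }
    }

    return $summary;
}
```

An allowlist fails closed: a new context key stays hidden until someone deliberately adds it, which matches the plan's default of not showing raw diagnostics.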

Deployment / Ops Impact

  • Migrations: none planned.
  • Environment variables: none planned.
  • Queue/cron workers: no new workers planned; retry, if implemented, must use existing queues/jobs for that operation family.
  • Storage/volumes: none.
  • Assets: none planned. If implementation unexpectedly registers Filament assets, deploy must include cd apps/platform && php artisan filament:assets.
  • Staging validation: required before production because this touches operator action affordances and high-risk restore safety.

Open Questions for Implementation

  1. Which existing capability constants should govern reconcile, retry, and view diagnostics for each operation family?
  2. Is there a clean existing OperationRun acknowledge/note/audit seam? If not, acknowledge is deferred.
  3. Which non-high-risk operation families have a safe idempotent retry/start seam today?
  4. Should action metadata live only in existing audit logs, or also in bounded context.operator_actions for UI display?
  5. Does implementation materially change the Operations page enough to require updating the UI audit page report/design matrix?

Implementation Stop Conditions

  • Stop and update spec/plan before adding a generic retry framework.
  • Stop and update spec/plan before adding a new table/entity/status/outcome family.
  • Stop before introducing any restore retry/re-execute or force-success action.
  • Stop if action execution cannot be made server-side RBAC/scope-safe.
  • Stop if state-changing action copy cannot make mutation scope explicit before execution.
  • Stop if raw diagnostics cannot be gated/collapsed without broader UI redesign.
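
The stop condition on mutation scope can be illustrated with a small sketch. The three scopes come from the review-stop questions in this plan; the enum, the function name `confirmationCopy`, and the wording are hypothetical placeholders, not the product's localized strings.

```php
<?php

// The three scopes named in the review-stop questions; case names are hypothetical.
enum MutationScope: string
{
    case TenantPilotOnly = 'tenantpilot_only';
    case MicrosoftTenant = 'microsoft_tenant';
    case SimulationOnly = 'simulation_only';
}

/**
 * Returns confirmation copy for a state-changing action, or null when the
 * scope is unknown, in which case the caller should not offer the action.
 */
function confirmationCopy(?MutationScope $scope): ?string
{
    return match ($scope) {
        MutationScope::TenantPilotOnly => 'This changes TenantPilot records only.',
        MutationScope::MicrosoftTenant => 'This changes the Microsoft tenant.',
        MutationScope::SimulationOnly => 'This is a simulation; nothing is changed.',
        null => null,
    };
}
```

Because `match` is exhaustive, adding a fourth scope without copy would fail loudly at runtime instead of producing an action with no scope disclosure.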