TenantAtlas/specs/233-stale-run-visibility/data-model.md
ahmido 6fdd45fb02
feat: surface stale active operation runs (#269)
## Summary
- keep stale active operation runs visible in the tenant progress overlay and polling state
- align tenant and canonical operation surfaces around the shared stale-active presentation contract
- add Spec 233 artifacts and clean the promoted-candidate backlog entries

## Validation
- browser smoke: `/admin/t/18000000-0000-4000-8000-000000000180` -> stale dashboard CTA -> `/admin/operations?tenant_id=7&activeTab=active_stale_attention&problemClass=active_stale_attention` -> `/admin/operations/15`
- verified healthy vs likely-stale tenant cards, canonical stale list row, and canonical run detail consistency

## Notes
- local smoke fixture seeded with one fresh and one stale running `baseline_compare` operation for browser validation
- Pest suite was not re-run in this session before opening this PR

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #269
2026-04-23 15:10:06 +00:00


# Data Model: Operation Run Active-State Visibility & Stale Escalation
## Overview
This feature introduces no new persisted entity, table, or stored projection. It formalizes one derived active-state presentation contract over existing `OperationRun` lifecycle truth so tenant and workspace monitoring surfaces present the same meaning.
## Source Entity: OperationRun
- **Purpose**: Canonical lifecycle and outcome record for long-running admin-plane work.
- **Existing fields used by this feature**:
  - `id`
  - `workspace_id`
  - `tenant_id`
  - `type`
  - `status`
  - `outcome`
  - `created_at`
  - `started_at`
  - `completed_at`
  - `context`
  - `failure_summary`
- **Existing relationships used by this feature**:
  - `tenant`
  - `user` where available for initiator context
- **Existing invariants**:
  - Lifecycle status and outcome remain service-owned.
  - Reconciliation metadata stays inside `context.reconciliation`.
  - No new persisted status or outcome values are introduced for visibility purposes.
## Derived Truth: OperationRunFreshnessState
- **Type**: Existing enum `App\Support\Operations\OperationRunFreshnessState`
- **Values**:
  - `fresh_active`
  - `likely_stale`
  - `reconciled_failed`
  - `terminal_normal`
  - `unknown`
- **Inputs**:
  - `status`
  - `created_at`
  - `started_at`
  - `context.reconciliation`
  - existing `OperationLifecyclePolicy`
- **Behavioral rule**:
  - This remains the only stale/late truth input for surface rendering.
  - No widget, page, or Livewire component may introduce its own threshold logic.
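
The derivation from inputs to freshness value can be sketched roughly as follows. This is a language-neutral illustration in Python, not the project's PHP enum. The `Run` dataclass, the `STALE_AFTER` threshold, and the choice of `started_at` over `created_at` as the age anchor are all assumptions made for the example; the real threshold comes from `OperationLifecyclePolicy`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class FreshnessState(Enum):
    FRESH_ACTIVE = "fresh_active"
    LIKELY_STALE = "likely_stale"
    RECONCILED_FAILED = "reconciled_failed"
    TERMINAL_NORMAL = "terminal_normal"
    UNKNOWN = "unknown"


@dataclass
class Run:
    """Hypothetical stand-in for the OperationRun fields used as inputs."""
    status: str
    created_at: Optional[datetime]
    started_at: Optional[datetime] = None
    context: dict = field(default_factory=dict)


# Assumed value for illustration only; the real threshold is policy-owned.
STALE_AFTER = timedelta(minutes=30)


def freshness_state(run: Run, now: datetime) -> FreshnessState:
    if run.status == "completed":
        # Reconciliation metadata marks a run that was closed out after going stale.
        if run.context.get("reconciliation"):
            return FreshnessState.RECONCILED_FAILED
        return FreshnessState.TERMINAL_NORMAL
    if run.status in ("queued", "running"):
        anchor = run.started_at or run.created_at  # assumed precedence
        if anchor is None:
            return FreshnessState.UNKNOWN
        if now - anchor > STALE_AFTER:
            return FreshnessState.LIKELY_STALE
        return FreshnessState.FRESH_ACTIVE
    return FreshnessState.UNKNOWN


now = datetime(2026, 4, 23, 15, 0, tzinfo=timezone.utc)
stuck = Run(status="running", created_at=now - timedelta(hours=2))
print(freshness_state(stuck, now).value)  # prints "likely_stale"
```

Keeping the threshold comparison in one function mirrors the behavioral rule above: surfaces consume the resulting state and never compare timestamps themselves.
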
## Derived Truth: OperationRun Problem Class
- **Type**: Existing derived string on `OperationRun`
- **Values**:
  - `none`
  - `active_stale_attention`
  - `terminal_follow_up`
- **Purpose**:
  - Separates active stale attention from terminal follow-up while keeping both distinct from calm/no-action runs.
- **Relationship to freshness**:
  - `likely_stale` freshness yields `active_stale_attention`.
  - `reconciled_failed` freshness yields `terminal_follow_up`.
  - Completed blocked/partial/failed runs may also yield `terminal_follow_up` without stale lineage.
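
A minimal sketch of the relationship to freshness, written in Python for illustration. The function name and the exact outcome strings (`blocked`, `partial`, `failed`) are assumptions; the document names the outcomes only informally, and says they "may" yield `terminal_follow_up`, so the unconditional branch here is a simplification.

```python
from typing import Optional

# Assumed outcome identifiers; the persisted values may be spelled differently.
FOLLOW_UP_OUTCOMES = {"blocked", "partial", "failed"}


def problem_class(freshness: str, status: str, outcome: Optional[str]) -> str:
    """Map freshness plus lifecycle fields to the derived problem class."""
    if freshness == "likely_stale":
        return "active_stale_attention"
    if freshness == "reconciled_failed":
        return "terminal_follow_up"
    # Terminal follow-up without stale lineage (simplified to always apply).
    if status == "completed" and outcome in FOLLOW_UP_OUTCOMES:
        return "terminal_follow_up"
    return "none"


print(problem_class("likely_stale", "running", None))        # prints "active_stale_attention"
print(problem_class("terminal_normal", "completed", "failed"))  # prints "terminal_follow_up"
```
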
## Derived View Model: Active-State Presentation Contract
- **Type**: Derived, request-scoped presentation payload. Prefer reuse of `OperationUxPresenter::decisionZoneTruth()` and existing badge/presenter outputs before adding any new helper.
- **Required fields across covered surfaces**:
  - `freshness_state`
  - `problem_class`
  - `is_currently_active`
  - `is_reconciled`
  - `compact_label`
  - `diagnostic_label`
  - `guidance`
  - `stale_lineage_note`
  - `show_in_active_progress`
  - `keep_active_polling`
- **Presentation categories**:
  - `healthy_active`
  - `past_expected_lifecycle`
  - `likely_stale`
  - `no_longer_active`
  - `unknown` fallback
- **Category mapping rules**:
  - `fresh_active` + active run -> `healthy_active`
  - `likely_stale` on compact summary surfaces -> `past_expected_lifecycle`
  - `likely_stale` on canonical or stronger diagnostic surfaces -> `likely_stale`
  - `terminal_normal` or `reconciled_failed` -> `no_longer_active`
  - `unknown` -> fallback copy without false stale escalation
- **Important constraint**:
  - `past_expected_lifecycle` and `likely_stale` are density variants over the same stale truth, not separate persisted states.
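
The category mapping rules can be read as a small pure function. The sketch below is in Python for illustration; the function name and the boolean `diagnostic_surface` flag are hypothetical and do not reflect the signature of `OperationUxPresenter`.

```python
def presentation_category(freshness: str, is_active: bool, diagnostic_surface: bool) -> str:
    """Pick a presentation category from shared freshness truth.

    diagnostic_surface is True for canonical or stronger diagnostic surfaces
    and False for compact summary surfaces (an assumed way to model density).
    """
    if freshness == "fresh_active" and is_active:
        return "healthy_active"
    if freshness == "likely_stale":
        # Same stale truth, two density variants.
        return "likely_stale" if diagnostic_surface else "past_expected_lifecycle"
    if freshness in ("terminal_normal", "reconciled_failed"):
        return "no_longer_active"
    # Anything else falls back without escalating to stale.
    return "unknown"


print(presentation_category("likely_stale", True, diagnostic_surface=False))  # prints "past_expected_lifecycle"
print(presentation_category("likely_stale", True, diagnostic_surface=True))   # prints "likely_stale"
```

Because the surface density is an argument rather than stored state, the two stale labels cannot drift apart: both are computed from the same `likely_stale` input.
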
## Derived Surface Policy: Tenant Active Progress Visibility
- **Current consumers**:
  - `App\Livewire\BulkOperationProgress`
  - `App\Support\OpsUx\ActiveRuns`
- **Former issue**:
  - Both used `healthyActive()` and therefore suppressed stale-active runs from the tenant progress overlay and polling decision.
- **Implemented rule**:
  - Fresh and stale active runs remain visible as active work.
  - Terminal runs disappear on the next refresh cycle.
  - Polling continues while any visible active work remains, including stale-active runs.
  - Overlay rendering uses the existing status badge and `OperationUxPresenter` guidance path so stale-active elevation stays derived from shared freshness truth.
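
The implemented rule amounts to a visibility filter plus a polling decision derived from that filter. The Python sketch below illustrates this under assumptions: runs are plain dictionaries carrying a precomputed `freshness_state`, the helper names are hypothetical, and `unknown` runs are treated as not visible, which the document does not specify.

```python
# Freshness values that count as visible active work on the tenant overlay.
ACTIVE_FRESHNESS = {"fresh_active", "likely_stale"}


def visible_active_runs(runs: list) -> list:
    """Keep fresh and stale active runs; terminal runs drop out on refresh."""
    return [run for run in runs if run["freshness_state"] in ACTIVE_FRESHNESS]


def keep_polling(runs: list) -> bool:
    """Poll for as long as any visible active work remains."""
    return len(visible_active_runs(runs)) > 0


runs = [
    {"id": 1, "freshness_state": "likely_stale"},
    {"id": 2, "freshness_state": "terminal_normal"},
]
print([run["id"] for run in visible_active_runs(runs)])  # prints [1]
print(keep_polling(runs))                                # prints True
```

Under the former `healthyActive()` behavior described above, the stale run with `id` 1 would have been filtered out and polling would have stopped.
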
## Covered Surface Consumers
| Consumer | Current Truth Inputs | Required Change |
|---|---|---|
| `BulkOperationProgress` | Active run query, `healthyActive()`, `ActiveRuns` | Include stale-active work in visibility and polling semantics while keeping terminal runs excluded |
| `RecentOperationsSummary` | Raw recent runs for tenant | Ensure active-state emphasis and copy stay aligned with canonical freshness meaning |
| `Dashboard\RecentOperations` | Badge rendering + `OperationUxPresenter` | Preserve and tighten existing freshness-aware row semantics |
| `Dashboard\NeedsAttention` / `DashboardKpis` | Problem-class counts + links | Keep stale-active counts and linked monitoring semantics aligned |
| `WorkspaceOverviewBuilder` / `WorkspaceRecentOperations` | Badge rendering + `OperationUxPresenter` | Preserve workspace summary consistency and diagnostic separation |
| `OperationRunResource` | Status/outcome badges + lifecycle summaries | Keep canonical list/detail authoritative and consistent with compact surfaces |
| `TenantlessOperationRunViewer` | Canonical detail page around resource truth | Preserve diagnostic-first explanation of stale versus terminal meaning |
## State Transitions Relevant To This Feature
1. `queued` or `running` within lifecycle threshold
   - Freshness: `fresh_active`
   - Presentation: `healthy_active`
   - Visible on active-only compact surfaces: yes
2. `queued` or `running` beyond lifecycle threshold
   - Freshness: `likely_stale`
   - Presentation: `past_expected_lifecycle` on compact surfaces, `likely_stale` on diagnostic surfaces
   - Visible on active-only compact surfaces: yes
3. `completed` without reconciliation
   - Freshness: `terminal_normal`
   - Presentation: `no_longer_active`
   - Visible on active-only compact surfaces: no
4. `completed` with reconciliation metadata
   - Freshness: `reconciled_failed`
   - Presentation: `no_longer_active` with stale-lineage diagnostics
   - Visible on active-only compact surfaces: no
## Validation Rules And Invariants
- No new `OperationRun.status` or `OperationRun.outcome` values may be added.
- No new persisted `operation_runs` summary or mirror table may be added.
- All stale/late meaning must derive from existing freshness truth.
- Tenant-scoped surfaces must only reflect runs already visible to the current tenant-entitled operator.
- Workspace summaries must stay limited to entitled tenant slices.
- Healthy queued/running work must not inherit stale emphasis.
- Terminal runs must stop appearing in active-only surfaces on the next refresh cycle.