## Summary

- keep stale active operation runs visible in the tenant progress overlay and polling state
- align tenant and canonical operation surfaces around the shared stale-active presentation contract
- add Spec 233 artifacts and clean the promoted-candidate backlog entries

## Validation

- browser smoke: `/admin/t/18000000-0000-4000-8000-000000000180` -> stale dashboard CTA -> `/admin/operations?tenant_id=7&activeTab=active_stale_attention&problemClass=active_stale_attention` -> `/admin/operations/15`
- verified healthy vs likely-stale tenant cards, canonical stale list row, and canonical run detail consistency

## Notes

- local smoke fixture seeded with one fresh and one stale running `baseline_compare` operation for browser validation
- Pest suite was not re-run in this session before opening this PR

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #269
Research: Operation Run Active-State Visibility & Stale Escalation
Decision 1: Keep lifecycle freshness truth in the existing run model and reconciler
- Decision: Use `OperationRunFreshnessState`, `OperationRun::freshnessState()`, `OperationRun::problemClass()`, and `OperationLifecycleReconciler` as the only lifecycle-truth inputs for this feature.
- Rationale: The application already computes `fresh_active`, `likely_stale`, `reconciled_failed`, `terminal_normal`, and `unknown` from the run record plus `OperationLifecyclePolicy`. Canonical monitoring surfaces already rely on that truth, so adding a second stale heuristic would immediately recreate the drift this spec is trying to remove.
- Alternatives considered:
  - Add new `OperationRun.status` values such as `stale` or `late`: rejected because the distinction is presentation and triage-oriented, not a new persisted lifecycle state.
  - Add page-local thresholds per widget: rejected because it would create conflicting meaning across tenant, workspace, and canonical monitoring surfaces.
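The "one source of lifecycle truth" idea can be illustrated with a small, language-neutral sketch in Python (the application itself is PHP). Only the five state names come from the text above. The field names, the single-window policy, and the classification rules are assumptions for illustration, and the real logic in `OperationRun::freshnessState()` and `OperationLifecyclePolicy` may differ.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from enum import Enum
from typing import Optional


class FreshnessState(Enum):
    FRESH_ACTIVE = "fresh_active"
    LIKELY_STALE = "likely_stale"
    RECONCILED_FAILED = "reconciled_failed"
    TERMINAL_NORMAL = "terminal_normal"
    UNKNOWN = "unknown"


ACTIVE_STATUSES = {"queued", "running"}
TERMINAL_STATUSES = {"completed", "failed"}  # assumed status names


@dataclass(frozen=True)
class LifecyclePolicy:
    # Assumption: a single expected-lifetime window per operation type.
    expected_lifetime: timedelta


@dataclass(frozen=True)
class Run:
    status: str
    last_activity_at: Optional[datetime] = None  # hypothetical field
    reconciled: bool = False                     # hypothetical: set by a reconciler


def freshness_state(run: Run, policy: LifecyclePolicy, now: datetime) -> FreshnessState:
    """Derive freshness from the run record plus policy; nothing is persisted."""
    if run.status in ACTIVE_STATUSES:
        if run.last_activity_at is None:
            return FreshnessState.UNKNOWN
        if now - run.last_activity_at > policy.expected_lifetime:
            return FreshnessState.LIKELY_STALE
        return FreshnessState.FRESH_ACTIVE
    if run.status in TERMINAL_STATUSES:
        if run.status == "failed" and run.reconciled:
            return FreshnessState.RECONCILED_FAILED
        return FreshnessState.TERMINAL_NORMAL
    return FreshnessState.UNKNOWN
```

Because every surface would call the same derivation, a widget cannot disagree with the canonical list about whether a run is stale, which is the drift the rejected alternatives would reintroduce.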
Decision 2: Reuse the existing Ops UX presenter path before introducing a new helper
- Decision: Prefer `OperationUxPresenter::decisionZoneTruth()`, `lifecycleAttentionSummary()`, `surfaceGuidance()`, and centralized badge rendering as the presentation backbone.
- Rationale: The code already exposes a derived decision-zone payload and shared stale/reconciled copy. `OperationRunStatusBadge` already renders `Likely stale` when queued/running work carries `freshness_state=likely_stale`, and `OperationUxPresenter` already provides compact and diagnostic explanations off the same truth.
- Alternatives considered:
  - New dedicated presenter family for active-state visibility: rejected unless the existing presenter path proves insufficient during implementation.
  - Widget-local copy branches: rejected because they would increase semantic spread and regression risk.
Decision 3: Treat stale-active runs as still active for tenant progress visibility
- Decision: Change tenant-local active-progress visibility to include freshness-elevated active runs rather than suppressing them via `healthyActive()`.
- Rationale: `BulkOperationProgress` and `ActiveRuns::existForTenantId()` previously used `healthyActive()`, which caused stale queued/running work to disappear from the tenant progress overlay and stopped polling when only stale runs remained. That was the clearest concrete contradiction with the canonical monitoring surfaces.
- Alternatives considered:
  - Keep stale runs hidden in the progress overlay and rely on dashboard/list only: rejected because the spec explicitly covers tenant-local active-run cards and progress summaries.
  - Add a separate stale-only overlay: rejected because it would create a second active-work surface family instead of fixing the existing one.
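The behavioral difference in this decision can be sketched as two selection functions over the same rows. This is a language-neutral Python illustration, not the application's PHP query scopes. `RunRow`, `visible_active`, and `should_poll` are hypothetical names, and `healthy_active` only approximates what the text says `healthyActive()` did.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List

ACTIVE_STATUSES = {"queued", "running"}


@dataclass(frozen=True)
class RunRow:
    tenant_id: int
    status: str
    freshness_state: str  # e.g. "fresh_active" or "likely_stale"


def healthy_active(runs: Iterable[RunRow]) -> List[RunRow]:
    """Previous behavior (approximation): stale active runs are filtered out."""
    return [
        r for r in runs
        if r.status in ACTIVE_STATUSES and r.freshness_state == "fresh_active"
    ]


def visible_active(runs: Iterable[RunRow]) -> List[RunRow]:
    """New behavior (sketch): queued/running runs stay visible even when stale."""
    return [r for r in runs if r.status in ACTIVE_STATUSES]


def should_poll(
    runs: Iterable[RunRow],
    tenant_id: int,
    scope: Callable[[Iterable[RunRow]], List[RunRow]],
) -> bool:
    """Poll while the chosen scope still yields a run for this tenant."""
    return any(r.tenant_id == tenant_id for r in scope(runs))


# A tenant whose only active run is stale, plus an unrelated finished run.
rows = [
    RunRow(tenant_id=7, status="running", freshness_state="likely_stale"),
    RunRow(tenant_id=7, status="completed", freshness_state="terminal_normal"),
]
```

With `rows` as defined, `should_poll(rows, 7, healthy_active)` is false while `should_poll(rows, 7, visible_active)` is true, which reproduces the disappearing-overlay contradiction and its fix in miniature.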
Decision 4: Preserve current surface roles and drill-through flow
- Decision: Keep the current route and surface model: tenant dashboard and tenant progress remain secondary context, `/admin/operations` remains the primary triage list, and `/admin/operations/{run}` remains diagnostic-first.
- Rationale: Existing links already converge through `OperationRunLinks`, and current pages/widgets match the constitution's decision-first model. The gap is the honesty of compact active-state messaging, not missing routes.
- Alternatives considered:
  - New operations hub or new tenant-local detail page: rejected as unnecessary workflow expansion.
  - New notification channel for stale active work: rejected because the spec explicitly excludes new notification behavior.
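The drill-through flow relies on one place building both the list and detail URLs. The following Python sketch is a language-neutral illustration of that convergence, not the `OperationRunLinks` API. The two route shapes come from the decision above. The query parameter names and the `active_stale_attention` value are taken from the browser smoke URL in the PR description, and the function names are hypothetical.

```python
from urllib.parse import urlencode

OPERATIONS_LIST_PATH = "/admin/operations"


def stale_triage_list_url(tenant_id: int) -> str:
    """Primary triage list, pre-filtered to stale-active runs for one tenant."""
    query = urlencode(
        {
            "tenant_id": tenant_id,
            "activeTab": "active_stale_attention",
            "problemClass": "active_stale_attention",
        }
    )
    return f"{OPERATIONS_LIST_PATH}?{query}"


def run_detail_url(run_id: int) -> str:
    """Diagnostic-first detail page for a single run."""
    return f"{OPERATIONS_LIST_PATH}/{run_id}"
```

Keeping both builders in one module means a tenant card and a list row always point at the same targets, so no new route or hub is needed.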
Decision 5: Extend existing focused tests and invert stale-hidden assumptions where necessary
- Decision: Update existing monitoring, Filament, and Ops UX tests rather than creating a new broad suite.
- Rationale: The repository already has focused coverage for lifecycle presentation and tenant progress behavior. In particular, `BulkOperationProgressDbOnlyTest` and `ProgressWidgetFiltersTest` currently codify the stale-hidden behavior that this feature must deliberately replace.
- Alternatives considered:
  - Add a brand-new browser suite: rejected because feature tests already cover the underlying business truth and UI copy.
  - Leave old progress-widget tests untouched and add parallel tests: rejected because the old assertions would preserve the wrong contract.
Decision 6: Keep “past expected lifecycle” and “likely stale” as density-specific labels over the same stale truth
- Decision: Model compact “past expected lifecycle” phrasing and stronger “likely stale” diagnostic phrasing as different density outputs over the same `likely_stale` freshness truth rather than as separate persisted states.
- Rationale: The spec allows same meaning, different density. The current code already points in that direction: `OperationUxPresenter::surfaceGuidance()` says the run is “past its lifecycle window,” while `OperationRunStatusBadge` can label the same run `Likely stale`.
- Alternatives considered:
  - Create two separate freshness states for “late” and “likely stale”: rejected because existing lifecycle truth has only one stale boundary and no additional behavioral consequence.
  - Collapse all stale-active copy to a single label everywhere: rejected because compact surfaces and canonical detail need different density without changing meaning.
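"Same meaning, different density" amounts to a lookup keyed by one freshness value and a density hint. The Python sketch below is a language-neutral illustration, not the presenter's actual code. The density names, the function name, and the exact label capitalization are assumptions. Only the `likely_stale` state and the two phrasings come from the text above.

```python
from typing import Optional

# One stale truth, two renderings. Keys are (freshness_state, density).
STALE_LABELS = {
    ("likely_stale", "compact"): "Past expected lifecycle",
    ("likely_stale", "diagnostic"): "Likely stale",
}


def stale_label(freshness_state: str, density: str) -> Optional[str]:
    """Return the stale label for a surface density, or None if the run is not stale."""
    if density not in ("compact", "diagnostic"):
        raise ValueError(f"unknown density: {density}")
    return STALE_LABELS.get((freshness_state, density))
```

Because both labels hang off the single `likely_stale` key, adding a second persisted state such as "late" would have nothing extra to drive: the table shows there is only one stale boundary to render.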