TenantAtlas/specs/184-dashboard-recovery-honesty/research.md
ahmido f1a73490e4 feat: finalize dashboard recovery honesty (#215)
## Summary
- add a dedicated Recovery Readiness dashboard widget for backup posture and recovery evidence
- group Needs Attention items by domain and elevate the recovery call-to-action
- align restore-run and recovery posture tests with the extracted widget and continuity flows
- include the related spec artifacts for 184-dashboard-recovery-honesty

## Verification
- `cd /Users/ahmeddarrazi/Documents/projects/TenantAtlas/apps/platform && ./vendor/bin/sail bin pint --dirty --format agent`
- `cd /Users/ahmeddarrazi/Documents/projects/TenantAtlas/apps/platform && ./vendor/bin/sail artisan test --compact --filter="DashboardKpisWidget|DashboardRecoveryPosture|TenantDashboardDbOnly|TenantpilotSeedBackupHealthBrowserFixtureCommand|NeedsAttentionWidget"`
- browser smoke verified on the calm, unvalidated, and weakened dashboard states

## Notes
- Livewire v4.0+ compliant with Filament v5
- no panel provider changes; Laravel 11+ provider registration remains in `bootstrap/providers.php`
- Recovery Readiness stays within the existing tenant dashboard asset strategy; no new Filament asset registration required

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #215
2026-04-08 23:21:36 +00:00

Research: Dashboard Recovery Posture Honesty

Decision 1: Keep the slice on tenant dashboard and restore-history confirmation surfaces

Decision: Implement Spec 184 on the tenant dashboard (DashboardKpis, NeedsAttention) and the canonical restore-run list or detail drilldowns. Do not expand WorkspaceOverviewBuilder in this slice.

Rationale: The current workspace overview logic does not restate backup or recovery posture; it surfaces governance, compare, findings, alerts, and operations, then links into the tenant dashboard when appropriate. Leaving it unchanged keeps the blast radius small and avoids introducing a second, partially overlapping recovery summary.

Alternatives considered:

  • Extend workspace overview now with recovery posture. Rejected because it broadens the slice, increases query and wording risk, and is not needed to stop the tenant-dashboard overclaim.
  • Create a new recovery overview page. Rejected because the spec explicitly avoids a new recovery-confidence surface.

Decision 2: Reuse the existing backup positive-claim boundary

Decision: Reuse TenantBackupHealthAssessment::positiveClaimBoundary on summary-level dashboard surfaces.

Rationale: The product already has canonical copy that says backup health reflects backup inputs only and does not prove restore success. Reusing it avoids page-local rewrites and keeps the claim boundary consistent across backup-detail and dashboard contexts.

Alternatives considered:

  • Introduce new dashboard-only claim-boundary copy. Rejected because it creates semantic drift for the same truth.
  • Leave the boundary only on detail pages. Rejected because the spec specifically hardens summary surfaces.

Decision 3: Define relevant restore history as executed, non-preview restore runs only

Decision: Treat only non-dry-run, non-preview, executed restore runs as relevant restore history for overview recovery language.

Rationale: RestoreSafetyResolver::resultAttentionForRun(...) already treats dry-runs and preview states as not_executed and explicitly says they do not prove execution. Counting those records as recovery evidence would overstate confidence.

Alternatives considered:

  • Count every RestoreRun record, including preview-only runs. Rejected because preview truth is not execution truth.
  • Count only fully completed runs. Rejected because failed, partial, and follow-up runs are exactly the weak-history evidence the overview must surface.
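The selection rule above can be sketched as a plain filter. This is a minimal illustration, not the project's implementation: `RunSnapshot` and `relevantRestoreHistory()` are hypothetical names, and the boolean fields are assumed stand-ins for whatever the real `RestoreRun` model exposes.

```php
<?php

declare(strict_types=1);

// Hypothetical, simplified stand-in for a RestoreRun record.
// The real model's fields are not shown in this document.
final class RunSnapshot
{
    public function __construct(
        public readonly int $id,
        public readonly bool $isDryRun,
        public readonly bool $isPreview,
        public readonly bool $wasExecuted,
        public readonly string $status,
    ) {
    }
}

/**
 * Keep only runs that count as recovery evidence: executed, not dry-run,
 * not preview. Failed and partial runs stay in, because they are the
 * weak-history evidence the overview must surface.
 *
 * @param  list<RunSnapshot>  $runs
 * @return list<RunSnapshot>
 */
function relevantRestoreHistory(array $runs): array
{
    return array_values(array_filter(
        $runs,
        static fn (RunSnapshot $run): bool => $run->wasExecuted
            && ! $run->isDryRun
            && ! $run->isPreview,
    ));
}
```

Note that the filter never looks at `status`: a failed executed run is still relevant history, while a "completed" dry-run is not.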

Decision 4: Use RestoreResultAttention as the sole authority for weak-history states

Decision: Reuse RestoreSafetyResolver::resultAttentionForRun(...) for failed, partial, completed-with-follow-up, and completed semantics instead of remapping raw statuses in widgets.

Rationale: The resolver already inspects status, operation outcome, item results, assignment failures, skips, and metadata such as non_applied. It is the narrowest existing source of truth for result quality and already carries recovery-claim boundaries.

Alternatives considered:

  • Build dashboard-specific mapping from raw RestoreRun.status. Rejected because it would ignore existing deeper result logic and duplicate truth.
  • Introduce a new tenant-level recovery enum. Rejected because the spec is explicitly not a recovery-confidence engine.
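To illustrate what "sole authority" means for a widget, the sketch below consumes already-resolved attention states and never reads a raw status. The `Attention` and `Posture` enums, their case names, and the precedence rule (any non-clean executed run weakens the summary) are assumptions made for this example; the real resolver output and precedence may differ.

```php
<?php

declare(strict_types=1);

// Hypothetical mirror of the states the resolver is described as producing.
enum Attention: string
{
    case Failed = 'failed';
    case Partial = 'partial';
    case CompletedWithFollowUp = 'completed_with_follow_up';
    case Completed = 'completed';
    case NotExecuted = 'not_executed';
}

// Hypothetical summary states, named after the dashboard states in this spec.
enum Posture: string
{
    case Unvalidated = 'unvalidated';
    case Weakened = 'weakened';
    case Calm = 'calm';
}

/**
 * Derive the summary posture from resolver output only.
 *
 * @param  list<Attention>  $attentions  one resolved state per restore run
 */
function summarizePosture(array $attentions): Posture
{
    // Dry-run and preview results prove nothing, so they are ignored.
    $executed = array_filter(
        $attentions,
        static fn (Attention $a): bool => $a !== Attention::NotExecuted,
    );

    if ($executed === []) {
        return Posture::Unvalidated; // no executed history at all
    }

    foreach ($executed as $attention) {
        if ($attention !== Attention::Completed) {
            return Posture::Weakened; // assumed rule: any weak run wins
        }
    }

    return Posture::Calm;
}
```

Because the widget sees only `Attention` values, a change in how the resolver weighs item results or assignment failures would flow through without touching dashboard code.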

Decision 5: Use list-subheading continuity for no-history and fallback drilldowns

Decision: Reuse the backup_health_reason continuity pattern on ListRestoreRuns with a restore-history-specific reason query parameter for list fallbacks.

Rationale: The repo already uses list subheadings on backup-set and backup-schedule pages to explain why a dashboard drillthrough landed on a list. The same pattern fits no-history and weak-history list fallbacks without new UI shells or modal layers.

Alternatives considered:

  • Always deep-link to a restore-run detail. Rejected because no-history cases have no record, and weak-history detail links can become brittle when a record is deleted or inaccessible.
  • Use only existing table filters with no continuity copy. Rejected because current restore-run filters cannot explain a no-history state and do not guarantee self-explanatory arrival states.
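A minimal sketch of the continuity pattern, assuming a query parameter named `recovery_posture_reason` with two allowed values. The parameter name, the values, and the subheading copy are all placeholders invented for this example; the document does not specify them.

```php
<?php

declare(strict_types=1);

// Placeholder copy keyed by hypothetical reason values.
const RESTORE_HISTORY_SUBHEADINGS = [
    'no_history' => 'No executed restore runs exist for this tenant yet.',
    'weak_history' => 'Recent restore runs need follow-up.',
];

/**
 * Resolve the list subheading from the request query.
 * Unknown or malformed reasons yield null, so an arbitrary query string
 * can never inject continuity copy.
 *
 * @param  array<string, mixed>  $query
 */
function restoreHistorySubheading(array $query): ?string
{
    $reason = $query['recovery_posture_reason'] ?? null;

    if (! is_string($reason)) {
        return null;
    }

    return RESTORE_HISTORY_SUBHEADINGS[$reason] ?? null;
}
```

Restricting the reason to an allowlist keeps the list page self-explanatory on arrival while leaving a plain visit to the list unchanged.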

Decision 6: Link to restore-run detail only when it is the clearest confirmation

Decision: Link to restore-run detail only when a recent problematic run exists and the detail is the clearest confirmation. Otherwise fall back to the tenant restore-run list with continuity context.

Rationale: The list is the stable canonical collection route and is already accessible to readonly members. It is also the only truthful destination for the no-history case.

Alternatives considered:

  • Always link to the most recent executed run. Rejected because that can create dead ends or misleading confirmations when the run is gone or no longer the right representative.
  • Link to admin operations pages instead of restore runs. Rejected because Spec 184 is about restore history and result attention, not generic operation monitoring.
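The detail-versus-list choice can be sketched as a single function. The URL shapes, the function name, and the `recovery_posture_reason` parameter are hypothetical; a null run id here stands for "no recent problematic run that still resolves".

```php
<?php

declare(strict_types=1);

/**
 * Pick the drilldown target for the recovery call-to-action.
 *
 * @param  string    $listUrl           tenant restore-run list URL (assumed shape)
 * @param  int|null  $problematicRunId  id of a recent problematic run, or null
 * @param  string    $reason            continuity reason for the list fallback
 */
function recoveryDrilldownUrl(string $listUrl, ?int $problematicRunId, string $reason): string
{
    // Detail is used only when there is a concrete run to confirm against.
    if ($problematicRunId !== null) {
        return $listUrl.'/'.$problematicRunId;
    }

    // Otherwise land on the stable list route with continuity context.
    return $listUrl.'?'.http_build_query(['recovery_posture_reason' => $reason]);
}
```

The list branch is the default, so a deleted or missing run degrades to a page that still explains itself rather than to a dead end.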

Decision 7: Keep summary language cautious under RBAC restrictions

Decision: Summary surfaces must stay cautious even if the current user cannot open the most specific restore evidence. Action links may disable or fall back, but the claim must never grow stronger.

Rationale: Existing tests show readonly tenant members can open restore-run history while mutations remain disabled. Even if a more specific deep link is unavailable, the summary must still express unvalidated or weakened, not healthy.

Alternatives considered:

  • Hide recovery-honesty signals when drilldown is limited. Rejected because that would falsify the surface by omission.
  • Treat inaccessible detail as proof that no issues exist. Rejected because it directly violates the spec's honesty boundary.
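One way to picture this rule: the claim is computed first and passed through untouched, while access only affects the action link. The function and array keys below are hypothetical; a null URL is assumed to mean the current user cannot open that destination.

```php
<?php

declare(strict_types=1);

/**
 * Build the summary payload. Access limits change the action, never the claim.
 *
 * @param  string       $posture    already-derived claim, e.g. 'weakened'
 * @param  string|null  $detailUrl  null when the user cannot open the detail
 * @param  string|null  $listUrl    null when the user cannot open the list
 * @return array{posture: string, actionUrl: string|null, actionDisabled: bool}
 */
function recoverySummaryCard(string $posture, ?string $detailUrl, ?string $listUrl): array
{
    // Prefer the most specific destination, then fall back to the list.
    $actionUrl = $detailUrl ?? $listUrl;

    return [
        'posture' => $posture,               // intentionally not derived from access
        'actionUrl' => $actionUrl,
        'actionDisabled' => $actionUrl === null,
    ];
}
```

Since `posture` has no code path that depends on the URLs, a restricted user can see a disabled or less specific action but never a stronger claim.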

Decision 8: Extend existing widget and restore-run tests instead of introducing a new harness

Decision: Build coverage on top of DashboardKpisWidgetTest, NeedsAttentionWidgetTest, RestoreResultAttentionSurfaceTest, RestoreRunUiEnforcementTest, and focused unit tests under Support/RestoreSafety.

Rationale: The repo already has targeted Livewire and Pest coverage for the exact seams being changed. Extending them is cheaper and keeps the tests aligned with business truth instead of test-only indirection.

Alternatives considered:

  • Create a new browser-only or end-to-end suite. Rejected because the feature is a UI-truth-hardening slice on existing server-rendered surfaces.
  • Skip unit coverage and rely only on widget tests. Rejected because executed-history selection and attention precedence are easier and cheaper to pin down at the narrow derivation seam.

Decision 9: No new assets, panel registration, or global-search behavior

Decision: Keep the implementation entirely within existing Filament widgets, resource pages, and Blade views. Do not add assets, providers, or search changes.

Rationale: This feature changes honesty and drillthrough continuity, not panel infrastructure. The current deployment step for filament:assets remains unchanged because there are no new assets.

Alternatives considered:

  • Add custom JavaScript or CSS for richer attention states. Rejected because existing Filament primitives are sufficient.
  • Change panel or navigation registration. Rejected because it is unrelated to the feature goal.