Implementation Plan: Tenant Backup Health Signals
Branch: 180-tenant-backup-health | Date: 2026-04-07 | Spec: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/180-tenant-backup-health/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/180-tenant-backup-health/spec.md
Summary
Harden the tenant dashboard so operators can tell within seconds whether a tenant has no usable backup basis, a stale latest backup basis, a degraded latest backup basis, or a healthy recent backup basis without opening deep backup surfaces first. The implementation keeps BackupSet, BackupItem, BackupSchedule, and the existing backup-quality layer as the only sources of truth, introduces one narrow derived tenant backup-health resolver over those records, adds a config-backed freshness policy with schedule follow-up semantics, integrates the result into DashboardKpis and NeedsAttention, and preserves reason-driven drillthrough into existing backup-set and backup-schedule surfaces without adding a new persistence model or recovery-confidence framework.
Key approach: work inside the existing TenantDashboard, DashboardKpis, NeedsAttention, BackupSetResource, BackupScheduleResource, and BackupQualityResolver seams; derive tenant posture from the latest relevant completed backup set plus existing backup-quality truth and enabled-schedule timing; keep the feature Filament v5 and Livewire v4 compliant; avoid new tables, Graph calls, jobs, or asset registration; validate the result with focused Pest, Livewire, truth-alignment, and RBAC coverage.
Technical Context
Language/Version: PHP 8.4, Laravel 12, Blade, Filament v5, Livewire v4
Primary Dependencies: Filament v5, Livewire v4, Pest v4, Laravel Sail, existing DashboardKpis, NeedsAttention, BackupSetResource, BackupScheduleResource, BackupQualityResolver, BackupQualitySummary, ScheduleTimeService, shared badge infrastructure, and existing RBAC helpers
Storage: PostgreSQL with existing tenant-owned backup_sets, backup_items, and backup_schedules records plus existing JSON-backed backup metadata; no schema change planned
Testing: Pest feature tests, Livewire widget and resource tests, and unit tests for the narrow backup-health derivation layer, all run through Sail
Target Platform: Laravel web application in Sail locally and containerized Linux deployment in staging and production
Project Type: Laravel monolith web application
Performance Goals: Keep tenant-dashboard rendering DB-only and query-bounded, avoid new N+1 query hotspots while deriving the latest relevant backup basis, and preserve 5 to 10 second operator scannability on the tenant dashboard and its drillthrough destinations
Constraints: No new backup-health table, no recovery-confidence score, no new Graph contract path, no new queue or OperationRun, no RBAC drift, no calmness leakage beyond evidence, no ad-hoc badge mappings, and no new global Filament assets
Scale/Scope: One tenant-scoped dashboard composition, two existing dashboard widgets, one narrow derived backup-health layer, optional config additions in config/tenantpilot.php, and focused regression coverage across resolver, widget, drillthrough, and RBAC behavior
Constitution Check
GATE: Passed before Phase 0 research. Re-checked after Phase 1 design and still passing.
| Principle | Status | Notes |
|---|---|---|
| Inventory-first | Pass | Backups remain immutable snapshot truth; the feature only summarizes existing backup and schedule state on read |
| Read/write separation | Pass | This is a read-first dashboard hardening slice; existing backup and schedule mutations remain unchanged and separately confirmed or audited |
| Graph contract path | Pass | No new Microsoft Graph calls or contract-registry changes are introduced |
| Deterministic capabilities | Pass | Existing capability registry and tenant-scoped authorization remain authoritative; no raw capability strings are introduced |
| RBAC-UX planes and 404 vs 403 | Pass | The feature stays in the tenant/admin plane under /admin/t/{tenant}/...; non-members remain 404, and existing in-scope authorization stays server-side |
| Workspace isolation | Pass | No workspace-scope broadening or cross-workspace aggregation is added |
| Tenant isolation | Pass | Backup sets, backup items, schedules, and dashboard summaries stay tenant-owned and tenant-scoped |
| Dangerous and destructive confirmations | Pass | No new destructive action is introduced. Existing backup and schedule destructive actions remain ->requiresConfirmation() and capability-gated |
| Global search safety | Pass | No new globally searchable resource is introduced or changed. BackupSetResource already has a view page, BackupScheduleResource already has an edit page, and global search configuration remains unchanged |
| Run observability | Pass | No new long-running work or OperationRun usage is introduced |
| Ops-UX 3-surface feedback | Pass | No new queued action or run feedback surface is added |
| Ops-UX lifecycle ownership | Pass | OperationRun.status and OperationRun.outcome are untouched |
| Ops-UX summary counts | Pass | No new summary_counts keys are required |
| Data minimization | Pass | The feature reuses existing metadata and timestamps only; no new secret or payload exposure is planned |
| Proportionality (PROP-001) | Pass | Added logic is limited to one narrow tenant backup-health layer plus config-backed freshness semantics |
| Persisted truth (PERSIST-001) | Pass | No new table, column, or stored backup-health mirror is introduced |
| Behavioral state (STATE-001) | Pass | New posture and reason families are derived only because they change operator guidance and dashboard calmness behavior |
| Badge semantics (BADGE-001) | Pass | Existing badge and tag infrastructure remains the semantic source; any new backup-health tone stays inside shared UI primitives rather than local mappings |
| Filament-native UI (UI-FIL-001) | Pass | Existing Filament widgets, stats, tables, and shared primitives remain the implementation seams |
| UI naming (UI-NAMING-001) | Pass | Operator-facing vocabulary stays bounded to backup health, last backup, stale, degraded, no backups, and schedule follow-up, without recoverable or proven claims |
| Operator surfaces (OPSURF-001) | Pass | Default-visible tenant-dashboard content becomes more operator-first by exposing backup posture before deep diagnostics |
| Filament Action Surface Contract | Pass | BackupSetResource and BackupScheduleResource keep existing inspect models and destructive placement; TenantDashboard remains under the current dashboard exemption |
| Filament UX-001 | Pass with documented variance | No new create or edit screen is added. Existing backup-set and backup-schedule resources remain the canonical follow-up surfaces, with summary-first truth added where needed |
| Filament v5 / Livewire v4 compliance | Pass | The implementation stays inside the current Filament v5 and Livewire v4 stack |
| Provider registration location | Pass | No panel or provider changes are planned; Laravel 11+ provider registration remains in bootstrap/providers.php |
| Asset strategy | Pass | No new panel assets are planned; deployment keeps the existing php artisan filament:assets step unchanged |
Phase 0 Research
Research outcomes are captured in /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/180-tenant-backup-health/research.md.
Key decisions:
- Derive tenant backup health from existing BackupSet, BackupItem, BackupSchedule, and BackupQualityResolver truth instead of introducing persisted backup-health state.
- Let the latest relevant completed backup set govern tenant posture rather than allowing older healthier history to calm the dashboard.
- Reuse existing backup-quality summaries for degradation truth and add no competing backup-quality taxonomy.
- Define backup freshness through one config-backed fallback window on the latest relevant completed backup set, while treating schedule timing as a secondary follow-up signal rather than health proof.
- Derive schedule follow-up from enabled schedules whose current next_run_at or last_run_at semantics indicate missed or overdue execution beyond a small grace window.
- Integrate backup health into the existing DashboardKpis and NeedsAttention widgets and keep healthy wording suppressed unless the backing evidence is fully supportive.
- Route dashboard drillthroughs by problem class: no usable backup basis opens the backup-set list, stale or degraded latest backup opens the latest relevant backup-set detail, and schedule follow-up opens the backup-schedules list.
- Extend the current widget, truth-alignment, backup-set, schedule, and tenant-scope Pest coverage instead of creating a browser-first harness.
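The freshness decision above can be sketched as a pure function. This is a minimal illustration only: the function name `isBackupStale`, its signature, and the 48-hour window used in the usage note are assumptions, not names or values taken from the plan.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper (name and signature are assumptions).
 * Returns true when the latest completed backup is older than the
 * configured fallback freshness window.
 */
function isBackupStale(
    DateTimeImmutable $completedAt,
    DateTimeImmutable $now,
    int $freshnessHours,
): bool {
    // Age of the backup basis in whole seconds.
    $ageSeconds = $now->getTimestamp() - $completedAt->getTimestamp();

    // Strictly older than the window counts as stale.
    return $ageSeconds > $freshnessHours * 3600;
}
```

With an assumed 48-hour window, a backup completed 49 hours before `$now` would be stale, while one completed 24 hours before would not. Passing `$now` in explicitly keeps the check deterministic in unit tests.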
Phase 1 Design
Design artifacts are created under /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/180-tenant-backup-health/:
- research.md: implementation and domain decisions for tenant backup-health derivation
- data-model.md: existing entities, config inputs, and derived backup-health models
- contracts/tenant-backup-health.openapi.yaml: internal logical contract for dashboard summary, backup-set confirmation, and schedule follow-up surfaces
- quickstart.md: focused automated and manual validation workflow for tenant backup-health signals
Design decisions:
- No schema migration is required. The design adds only a narrow derived resolver layer and a small config section in config/tenantpilot.php for backup-health freshness semantics.
- Tenant backup health is derived at render time from the latest relevant completed backup set, existing BackupQualitySummary, and enabled-schedule timing. No new Tenant field, cache table, or materialized rollup is planned.
- Stale versus degraded precedence is deterministic: absent outranks everything, stale outranks degraded, degraded outranks healthy, and schedule_follow_up remains a secondary reason family. When the latest backup basis is fresh and non-degraded, posture may remain healthy, but schedule_follow_up becomes the active reason and suppresses any positive healthy confirmation until resolved.
- DashboardKpis owns the primary backup-health stat or card, while NeedsAttention owns reason-specific backup follow-up items and the positive healthy backup check.
- Backup-set detail remains the confirmation surface for stale and degraded latest-backup posture by combining recency and existing backup-quality summary. Backup-schedules list remains the confirmation surface for schedule-follow-up posture and must foreground one derived follow-up indicator so the missed-run or overdue reason stays scan-fast.
- The feature stays Filament v5 and Livewire v4 compliant, introduces no new panel provider, and requires no new asset registration.
Project Structure
Documentation (this feature)
specs/180-tenant-backup-health/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── tenant-backup-health.openapi.yaml
├── checklists/
│   └── requirements.md
└── tasks.md
Source Code (repository root, including planned additions for this feature)
app/
├── Filament/
│   ├── Pages/
│   │   └── TenantDashboard.php
│   ├── Resources/
│   │   ├── BackupScheduleResource.php
│   │   └── BackupSetResource.php
│   └── Widgets/
│       └── Dashboard/
│           ├── DashboardKpis.php
│           └── NeedsAttention.php
├── Models/
│   ├── BackupItem.php
│   ├── BackupSchedule.php
│   ├── BackupSet.php
│   └── Tenant.php
├── Support/
│   ├── BackupHealth/
│   │   ├── TenantBackupHealthAssessment.php
│   │   ├── BackupFreshnessEvaluation.php
│   │   ├── BackupScheduleFollowUpEvaluation.php
│   │   ├── BackupHealthActionTarget.php
│   │   ├── BackupHealthDashboardSignal.php
│   │   └── TenantBackupHealthResolver.php
│   ├── BackupQuality/
│   │   ├── BackupQualityResolver.php
│   │   └── BackupQualitySummary.php
│   └── Badges/
│       └── [existing shared badge seams only if new backup-health tone mapping is needed]
config/
└── tenantpilot.php
tests/
├── Feature/
│   ├── BackupScheduling/
│   │   └── BackupScheduleLifecycleTest.php
│   └── Filament/
│       ├── BackupSetListContinuityTest.php
│       ├── BackupSetEnterpriseDetailPageTest.php
│       ├── DashboardKpisWidgetTest.php
│       ├── NeedsAttentionWidgetTest.php
│       ├── TenantDashboardDbOnlyTest.php
│       ├── TenantDashboardTenantScopeTest.php
│       └── TenantDashboardTruthAlignmentTest.php
└── Unit/
    └── Support/
        └── BackupHealth/
            └── TenantBackupHealthResolverTest.php
Structure Decision: Standard Laravel monolith. The implementation stays inside existing dashboard widgets, backup resources, shared support helpers, and current test structure. Any new helper types and lightweight dashboard-facing value objects live under app/Support/BackupHealth/ as a narrow derived layer shared by the dashboard and drillthrough logic.
Implementation Strategy
Phase A — Introduce Narrow Tenant Backup-Health Derivation
Goal: Create one derived path that can answer absent, stale, degraded, or healthy from existing backup and schedule truth without introducing new persistence.
| Step | File | Change |
|---|---|---|
| A.1 | New narrow helper(s) under app/Support/BackupHealth/ | Introduce TenantBackupHealthResolver plus lightweight TenantBackupHealthAssessment, BackupFreshnessEvaluation, BackupScheduleFollowUpEvaluation, BackupHealthActionTarget, and BackupHealthDashboardSignal value objects that derive the latest relevant completed backup basis, posture, primary reason, supporting message, drillthrough target, and healthy-claim boundary with query-bounded latest-basis loading |
| A.2 | app/Support/BackupQuality/BackupQualityResolver.php plus the new backup-health layer | Explicitly reuse BackupQualityResolver and BackupQualitySummary output to classify material degradation instead of creating a second backup-quality system |
| A.3 | config/tenantpilot.php | Add a small backup_health config section for canonical freshness hours and schedule overdue grace so stale logic is explicit, testable, and not hard-coded in widgets |
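As a rough illustration of step A.3, the config section might look like the fragment below. Only the `backup_health` section name comes from the plan; the key names, environment variable names, and default values are placeholders chosen for the example, not decisions recorded in the spec.

```php
<?php

// config/tenantpilot.php (fragment). Key names, env variable names, and
// defaults below are illustrative assumptions, not values from the spec.
return [
    // ... existing tenantpilot sections stay unchanged ...

    'backup_health' => [
        // Fallback freshness window for the latest relevant completed backup set.
        'freshness_hours' => (int) env('TENANTPILOT_BACKUP_FRESHNESS_HOURS', 24),

        // Grace period before an enabled schedule counts as overdue.
        'schedule_overdue_grace_minutes' => (int) env('TENANTPILOT_BACKUP_SCHEDULE_GRACE_MINUTES', 30),
    ],
];
```

Reading both values through `config('tenantpilot.backup_health.…')` in the resolver, rather than in the widgets, would keep the stale logic in one testable place, as the step requires.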
Phase B — Integrate Backup Health Into Primary Tenant Dashboard Surfaces
Goal: Make tenant backup posture visible on the dashboard before the operator has to open deep backup pages.
| Step | File | Change |
|---|---|---|
| B.1 | app/Filament/Widgets/Dashboard/DashboardKpis.php | Add a backup-health stat or card that reflects the derived posture, last relevant backup timing, current reason, color tone, and one reason-driven destination |
| B.2 | app/Filament/Widgets/Dashboard/NeedsAttention.php | Add backup-health attention items for no usable backup basis, stale latest backup, degraded latest backup, and schedule follow-up |
| B.3 | app/Filament/Widgets/Dashboard/NeedsAttention.php | Add Backups are recent and healthy to the healthy-check set only when the derived assessment positively supports it and no backup-health attention item, including schedule_follow_up, remains |
Phase C — Preserve Drillthrough Continuity On Backup And Schedule Surfaces
Goal: Ensure the dashboard warning or healthy claim can be rediscovered on the destination surface without guesswork.
| Step | File | Change |
|---|---|---|
| C.1 | app/Support/BackupHealth/TenantBackupHealthResolver.php plus app/Support/BackupHealth/BackupHealthActionTarget.php | Centralize reason-driven URL selection in the existing backup-health layer so no-basis goes to backup-set index, stale or degraded latest backup goes to the relevant backup-set detail, and schedule follow-up goes to backup-schedules index |
| C.2 | app/Filament/Resources/BackupSetResource.php | Reuse or slightly harden the backup-set list and detail presentation so the index confirms no usable backup basis and the latest relevant backup-set detail clearly confirms stale or degraded posture on arrival |
| C.3 | app/Filament/Resources/BackupScheduleResource.php | Add one derived schedule-follow-up confirmation signal on the list surface so existing last_run_at, last_run_status, and next_run_at evidence remains scan-fast on arrival |
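The reason-driven routing in step C.1 can be sketched as a single mapping function. This is a framework-free illustration under stated assumptions: the function name `backupHealthTarget`, the reason strings, and the `backup-sets` and `backup-schedules` path segments are hypothetical. Only the `/admin/t/{tenant}/...` prefix appears in the plan, and the real implementation would presumably build URLs through the existing Filament resources rather than by string concatenation.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical mapping from backup-health reason to drillthrough path.
 * Path segments after the tenant prefix are assumptions for illustration.
 */
function backupHealthTarget(string $reason, string $tenantKey, ?int $latestBackupSetId): string
{
    $base = "/admin/t/{$tenantKey}";

    return match ($reason) {
        // No usable backup basis: land on the backup-set list.
        'absent' => "{$base}/backup-sets",

        // Stale or degraded: land on the latest relevant backup-set detail,
        // falling back to the list if no record id is available.
        'stale', 'degraded' => $latestBackupSetId !== null
            ? "{$base}/backup-sets/{$latestBackupSetId}"
            : "{$base}/backup-sets",

        // Schedule follow-up: land on the backup-schedules list.
        'schedule_follow_up' => "{$base}/backup-schedules",

        default => throw new InvalidArgumentException("Unknown backup-health reason: {$reason}"),
    };
}
```

Keeping the mapping in one place means both dashboard widgets resolve the same destination for the same reason, which is the continuity property Phase C is meant to protect.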
Phase D — Lock Semantics With Focused Regression Coverage
Goal: Protect resolver truth, dashboard truth, continuity, and tenant safety from regression.
| Step | File | Change |
|---|---|---|
| D.1 | New unit tests under tests/Unit/Support/BackupHealth/ | Cover no-backup, stale, degraded, healthy, schedule-follow-up, and latest-history-governs derivation |
| D.2 | tests/Feature/Filament/DashboardKpisWidgetTest.php | Extend KPI payload and URL assertions for backup-health posture and reason-driven drillthrough |
| D.3 | tests/Feature/Filament/NeedsAttentionWidgetTest.php | Extend attention and healthy-check coverage for no-backup, stale-backup, degraded-latest-backup, schedule-follow-up, and healthy-backup scenarios |
| D.4 | tests/Feature/Filament/TenantDashboardTruthAlignmentTest.php | Ensure backup-health calmness and caution align with the rest of the tenant dashboard and do not reintroduce calmness leakage |
| D.5 | tests/Feature/Filament/BackupSetListContinuityTest.php, tests/Feature/Filament/BackupSetEnterpriseDetailPageTest.php, and tests/Feature/BackupScheduling/BackupScheduleLifecycleTest.php | Prove that no-basis, stale, degraded, and schedule-follow-up drillthrough destinations confirm the same problem class the dashboard named |
| D.6 | tests/Feature/Filament/TenantDashboardTenantScopeTest.php or a new RBAC-safe visibility test | Preserve tenant-scope truth and non-member-safe behavior for dashboard summary and backup follow-up routes |
| D.7 | vendor/bin/sail bin pint --dirty --format agent and focused Pest runs | Required formatting and targeted verification before implementation is considered complete |
Key Design Decisions
D-001 — Tenant backup health is derived, not stored
The product already stores the facts this slice needs: completed backup sets, backup-item quality metadata, and backup schedule timing. The missing piece is a tenant-level interpretation layer for overview truth, not a new persistence model.
D-002 — The latest relevant completed backup set governs posture
Older healthy history cannot calm the dashboard if the latest relevant completed backup is stale or degraded. This keeps the overview aligned with the operator's current recovery starting point.
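A minimal sketch of latest-basis selection, written against plain arrays so it runs without the framework. The `status` value `'completed'`, the array shape, and the function name are assumptions; the plan does not define what makes a backup set "relevant", so this example treats every completed set as relevant.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical selection of the governing backup set.
 * Assumes ISO 8601 UTC timestamps, which sort correctly as strings.
 *
 * @param list<array{id:int,status:string,completed_at:?string}> $backupSets
 * @return array{id:int,status:string,completed_at:?string}|null
 */
function latestCompletedBackupSet(array $backupSets): ?array
{
    // Keep only sets that actually finished.
    $completed = array_filter(
        $backupSets,
        static fn (array $set): bool => $set['status'] === 'completed' && $set['completed_at'] !== null,
    );

    if ($completed === []) {
        return null; // No usable backup basis.
    }

    // Newest first; older healthy history never outranks the newest set.
    usort(
        $completed,
        static fn (array $a, array $b): int => strcmp($b['completed_at'], $a['completed_at']),
    );

    return $completed[0];
}
```

In the application itself this would more likely be one bounded query (ordering by completion time and taking a single row) so the dashboard stays query-bounded.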
D-003 — Stale and degraded remain distinct, with deterministic precedence
absent, stale, degraded, and healthy are mutually exclusive primary posture states. When the latest relevant backup is both old and degraded, stale becomes the primary posture while degradation remains visible as supporting detail rather than disappearing.
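The precedence can be expressed as an ordered match. The enum and function names here are hypothetical; the four posture values and their ordering come from this decision.

```php
<?php

declare(strict_types=1);

// Hypothetical enum for the four mutually exclusive primary postures.
enum BackupPosture: string
{
    case Absent = 'absent';
    case Stale = 'stale';
    case Degraded = 'degraded';
    case Healthy = 'healthy';
}

/**
 * Resolves the primary posture. Arms are evaluated top to bottom, so
 * absent outranks stale, and stale outranks degraded.
 */
function resolvePosture(bool $hasCompletedBackup, bool $isStale, bool $isDegraded): BackupPosture
{
    return match (true) {
        ! $hasCompletedBackup => BackupPosture::Absent,
        $isStale => BackupPosture::Stale,
        $isDegraded => BackupPosture::Degraded,
        default => BackupPosture::Healthy,
    };
}
```

A backup that is both old and degraded resolves to `Stale`; the caller would still carry the `$isDegraded` flag forward so degradation can be shown as supporting detail.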
D-004 — Schedule timing is follow-up truth, not health proof
An enabled schedule can support the operator's diagnosis, but it cannot prove healthy backup posture. Overdue or never-successful schedules add schedule_follow_up; they do not substitute for a recent healthy completed backup basis. If the backup basis is otherwise healthy, posture may stay healthy, but schedule_follow_up becomes the active reason and suppresses calm confirmation until the schedule concern clears.
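The overdue half of this rule can be sketched as follows. The function name and parameters are assumptions, and the sketch covers only the overdue case based on `next_run_at` plus a grace window; detecting never-successful schedules would need additional evidence (such as `last_run_status` history) that is not modeled here.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical overdue check for one schedule.
 * Returns true when an enabled schedule's next run time has passed
 * by more than the grace window.
 */
function scheduleNeedsFollowUp(
    bool $enabled,
    ?DateTimeImmutable $nextRunAt,
    DateTimeImmutable $now,
    int $graceMinutes,
): bool {
    // Disabled schedules, or schedules without a planned run, add no signal here.
    if (! $enabled || $nextRunAt === null) {
        return false;
    }

    // The schedule is overdue only after the grace window has elapsed.
    $deadline = $nextRunAt->modify("+{$graceMinutes} minutes");

    return $now > $deadline;
}
```

Note that the function never returns evidence of health: a `false` result only means no follow-up was detected, which matches the rule that schedules cannot prove healthy posture on their own.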
D-005 — Healthy wording is stricter than mere backup existence
Backups are recent and healthy is reserved for tenants whose latest relevant completed backup exists, meets the freshness window, and carries no material degradation under existing backup-quality truth. Lack of evidence must suppress calmness.
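One way to make "lack of evidence must suppress calmness" concrete is to model each input as a nullable boolean, where `null` means the evidence is unknown. The function name and parameter set are hypothetical, and including schedule follow-up in the gate is drawn from the Phase B healthy-check rule rather than from this decision alone.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical gate for the positive healthy confirmation.
 * Every input must be positively known; null (unknown) never passes.
 */
function mayClaimHealthy(
    ?bool $hasCompletedBackup,
    ?bool $withinFreshnessWindow,
    ?bool $hasMaterialDegradation,
    ?bool $scheduleFollowUp,
): bool {
    // Strict comparisons ensure that null is treated as "not proven".
    return $hasCompletedBackup === true
        && $withinFreshnessWindow === true
        && $hasMaterialDegradation === false
        && $scheduleFollowUp === false;
}
```

Because the checks are strict, an unknown degradation state blocks the healthy wording just as a confirmed degradation would.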
D-006 — Existing Filament seams are sufficient
The current DashboardKpis, NeedsAttention, BackupSetResource, and BackupScheduleResource surfaces already provide the right seams. This slice does not need a new page shell, a new dashboard module, or a new front-end state layer.
D-007 — Keep the claim boundary below recovery confidence
The feature can say that backups are absent, stale, degraded, or healthy as backup inputs. It cannot say that the tenant is recoverable, that restore will succeed, or that recovery posture is proven.
Risk Assessment
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Latest-basis selection drifts from operator expectation and lets older history calm the dashboard | High | Medium | Make latest relevant completed backup selection explicit in the resolver and cover mixed-history precedence with unit tests |
| Dashboard calmness returns because schedule presence is treated as a proxy for health | High | Medium | Keep schedule follow-up secondary in the resolver and test that schedules never make a tenant healthy on their own |
| Backup health duplicates or contradicts existing backup-quality truth | High | Medium | Reuse BackupQualityResolver and existing degradation families rather than adding a second backup-quality mapping |
| Schedule drillthrough lands on a surface that does not clearly confirm the warning | Medium | Medium | Use the schedule list as the primary follow-up destination and add one scan-fast confirmation signal if timestamps alone are insufficient |
| Tight stale thresholds create noise or false calmness over time | Medium | Medium | Externalize fallback freshness and schedule grace in config and pin the semantics with unit and feature tests |
Test Strategy
- Add unit tests for the narrow backup-health resolver so latest-basis selection, stale precedence, degraded detection reuse, healthy-gate logic, and schedule-follow-up derivation remain deterministic.
- Extend DashboardKpisWidgetTest to assert the backup-health stat label, value, description, color, and destination across absent, stale, degraded, and healthy scenarios.
- Extend NeedsAttentionWidgetTest to assert backup-health attention items, healthy-check inclusion or suppression, and safe degraded-link behavior when appropriate.
- Extend TenantDashboardTruthAlignmentTest so backup-health calmness or caution cannot contradict the rest of the dashboard's operator truth.
- Extend backup-set and schedule surface tests so dashboard drillthroughs recover the same problem class on the target page.
- Extend tenant-scope or RBAC coverage so entitled users see truthful summary state and non-members receive deny-as-not-found semantics without cross-tenant hints.
- Keep all tests Livewire v4 compatible and run the smallest affected subset through Sail before asking for a full-suite pass.
- Run vendor/bin/sail bin pint --dirty --format agent before final verification.
Complexity Tracking
No constitution violations or exception-driven complexity were identified. The only added structure is a narrow derived backup-health layer and a small derived posture or reason family already justified by the proportionality review.
Proportionality Review
- Current operator problem: The tenant dashboard can look healthy while backup posture is missing, stale, or degraded, which hides a recovery-relevant truth from the operator's primary overview surface.
- Existing structure is insufficient because: Existing backup-quality truth lives in backup-set, item, version, and restore-adjacent surfaces, but there is no tenant-level rollup that answers the dashboard question directly.
- Narrowest correct implementation: Add one narrow derived tenant backup-health layer, wire it into the existing dashboard widgets, and reuse current backup and schedule destinations for continuity without creating new persistence or a broader recovery-confidence system.
- Ownership cost created: A small amount of resolver logic, a small config-backed freshness policy, limited widget wiring, and focused unit and feature tests.
- Alternative intentionally rejected: A persisted backup-health table, a workspace-wide recovery rollup, or a recovery-confidence score. Each adds broader truth and maintenance cost than the current tenant-dashboard problem requires.
- Release truth: Current-release truth. The feature corrects a trust gap on already-shipped tenant overview surfaces.