TenantAtlas/specs/176-backup-quality-truth/plan.md
ahmido e840007127 feat: add backup quality truth surfaces (#211)
## Summary
- add a shared backup-quality resolver and summary model for backup sets, backup items, policy versions, and restore selection
- surface backup-quality truth across Filament backup-set, policy-version, and restore-wizard entry points
- add focused Pest coverage and the full Spec Kit artifact set for spec 176

## Testing
- focused backup-quality verification and integrated-browser smoke coverage were completed during implementation
- degraded browser smoke path was validated with temporary seeded records and then cleaned up again
- the workspace already has a prior `vendor/bin/sail artisan test --compact` run exiting non-zero; that full-suite failure was not reworked as part of this PR

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #211
2026-04-07 11:39:40 +00:00

Implementation Plan: Backup Quality Truth Surfaces

Branch: 176-backup-quality-truth | Date: 2026-04-07 | Spec: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/176-backup-quality-truth/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/176-backup-quality-truth/spec.md

Summary

Harden the backup and versioning surfaces so operators can distinguish stored from usable and degraded recovery input before they reach restore-safety or execution surfaces. The implementation keeps BackupSet, BackupItem, and PolicyVersion as the existing sources of truth, introduces only a narrow derived backup-quality layer over current metadata and relationships, aggregates existing metadata-only and assignment-quality signals into summary facts, and hardens backup-set list and detail, backup-item relation, policy-version list and detail, and restore wizard step 1 and step 2 selection seams without adding a new persistence model.

Key approach: work inside the existing BackupSetResource, BackupItemsRelationManager, PolicyVersionResource, RestoreRunResource, and CreateRestoreRun seams; derive per-item and aggregate quality from existing metadata keys such as source, snapshot_source, assignments_fetch_failed, assignment_capture_reason, and has_orphaned_assignments; reuse Filament v5 tables, infolists, enterprise-detail builders, and shared badge infrastructure; keep all changes Livewire v4 compliant; avoid new tables, new Graph calls, and new asset registration; validate the result with focused Pest, Livewire, RBAC, and regression coverage.
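
The per-item derivation described above can be sketched in plain PHP. This is a minimal illustration, not the shipped resolver: the metadata key names come from the plan, while the function name, the returned array shape, and the literal `metadata_only` marker value are assumptions made for the example.

```php
<?php

declare(strict_types=1);

/**
 * Derive quality facts for one backup item from its metadata array.
 * Key names follow the plan; the 'metadata_only' marker value is an assumption.
 *
 * @param  array<string, mixed>  $metadata
 * @return array{metadata_only: bool, assignment_issue: bool, assignment_reason: ?string, orphaned_assignments: bool}
 */
function deriveItemQualityFacts(array $metadata): array
{
    // Either source key may carry the metadata-only marker (assumed value).
    $sources = [$metadata['source'] ?? null, $metadata['snapshot_source'] ?? null];

    $reason = $metadata['assignment_capture_reason'] ?? null;

    return [
        'metadata_only' => in_array('metadata_only', $sources, true),
        'assignment_issue' => ($metadata['assignments_fetch_failed'] ?? false) === true,
        // The capture reason is carried along as detail, not treated as an issue on its own.
        'assignment_reason' => is_string($reason) ? $reason : null,
        'orphaned_assignments' => ($metadata['has_orphaned_assignments'] ?? false) === true,
    ];
}

$facts = deriveItemQualityFacts([
    'snapshot_source' => 'metadata_only',
    'assignments_fetch_failed' => true,
    'assignment_capture_reason' => 'fetch_failed',
]);

echo json_encode($facts), PHP_EOL;
```

Keeping the function free of labels and colors matches the plan's intent that presentation stays in the shared badge and copy seams.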

Technical Context

Language/Version: PHP 8.4, Laravel 12, Blade, Filament v5, Livewire v4
Primary Dependencies: Filament v5, Livewire v4, Pest v4, Laravel Sail, existing BackupSetResource, BackupItemsRelationManager, PolicyVersionResource, RestoreRunResource, CreateRestoreRun, AssignmentBackupService, VersionService, PolicySnapshotService, RestoreRiskChecker, BadgeRenderer, PolicySnapshotModeBadge, EnterpriseDetailBuilder, and existing RBAC helpers
Storage: PostgreSQL with existing tenant-owned backup_sets, backup_items, policy_versions, and restore wizard input state; JSON-backed metadata, snapshot, assignments, and scope_tags; no schema change planned
Testing: Pest feature tests, Livewire page or action tests, unit tests for narrow derived backup-quality helpers, all run through Sail
Target Platform: Laravel web application in Sail locally and containerized Linux deployment in staging and production
Project Type: Laravel monolith web application
Performance Goals: Keep backup, version, and restore-selection surfaces server-driven and DB-backed at render time, avoid new render-time external calls, preserve fast list scanability, and avoid introducing new N+1 query hotspots while computing quality summaries
Constraints: No new backup-health table, no new Graph contract path, no new queue or OperationRun, no route identity change, no RBAC drift, no conflation of backup quality with restore safety or tenant recoverability, no page-local badge mappings, and no new global Filament assets
Scale/Scope: One tenant-scoped backup-set list and detail flow, one backup-items relation-manager table, one tenant-scoped policy-version list and detail flow, restore wizard step 1 and step 2 selection surfaces, one narrow derived backup-quality helper layer, and focused regression coverage across truth presentation and RBAC behavior

Constitution Check

GATE: Passed before Phase 0 research. Re-checked after Phase 1 design and still passing.

| Principle | Status | Notes |
|---|---|---|
| Inventory-first | Pass | Backups and versions remain immutable snapshot truth; no inventory ownership rule changes |
| Read/write separation | Pass | This slice is read-first truth hardening; existing restore and delete flows retain their current confirmations, audits, and tests |
| Graph contract path | Pass | No new Graph endpoints, no new Graph calls, and no contract registry changes are introduced |
| Deterministic capabilities | Pass | Existing capability registry, CapabilityResolver, and UiEnforcement remain authoritative |
| RBAC-UX planes and 404 vs 403 | Pass | All changed surfaces remain tenant-scoped; non-members still get 404, in-scope members without mutation capability still get 403 on execution |
| Workspace isolation | Pass | No workspace-scope broadening or cross-workspace visibility changes are planned |
| Tenant isolation | Pass | BackupSet, BackupItem, and PolicyVersion stay tenant-owned and tenant-entitled across list, detail, and wizard selection surfaces |
| Dangerous and destructive confirmations | Pass | Existing archive, restore, force-delete, and remove actions stay confirmation-gated and server-authorized |
| Global search safety | Pass | This feature adds no new globally searchable resource. PolicyVersionResource remains non-globally-searchable. BackupSetResource already has a view page if current configuration exposes it to search, and this slice adds no new cross-tenant hints |
| Run observability | Pass | No new long-running work or OperationRun usage is introduced |
| Ops-UX 3-surface feedback | Pass | No new operation start, toast, progress, or terminal notification surface is added |
| Ops-UX lifecycle ownership | Pass | OperationRun.status and OperationRun.outcome are untouched |
| Ops-UX summary counts | Pass | No new summary_counts keys or operation metrics are required |
| Data minimization | Pass | The slice reuses existing metadata and keeps diagnostics secondary; no new secret or raw payload exposure is planned |
| Proportionality (PROP-001) | Pass | Added logic is limited to a narrow derived backup-quality helper and direct surface integration across existing resources |
| Persisted truth (PERSIST-001) | Pass | No new table, column, or stored mirror is introduced; quality remains derived |
| Behavioral state (STATE-001) | Pass | Quality distinctions remain derived presentation truth from existing metadata, not new persisted lifecycle state |
| Badge semantics (BADGE-001) | Pass | Snapshot-mode rendering continues through BadgeDomain::PolicySnapshotMode; any new quality chips or labels stay inside shared badge or copy seams |
| Filament-native UI (UI-FIL-001) | Pass | Existing Filament tables, infolists, enterprise-detail cards, and wizard form descriptions remain the primary seams |
| UI naming (UI-NAMING-001) | Pass | The plan preserves operator vocabulary such as metadata-only, assignment issues, degraded, full payload, and recovery input, while avoiding safe to restore claims |
| Operator surfaces (OPSURF-001) | Pass | Changed surfaces become more operator-first by surfacing quality summary before diagnostics or later restore checks |
| Filament Action Surface Contract | Pass | No new inspect model, redundant View action, or empty action group is introduced; action placement remains unchanged |
| Filament UX-001 | Pass with documented variance | Backup-set detail continues to use the existing enterprise-detail layout and relation manager, but the plan adds a summary-first quality section before technical detail |
| Filament v5 / Livewire v4 compliance | Pass | The implementation stays inside the current Filament v5 and Livewire v4 stack |
| Provider registration location | Pass | No provider or panel changes; Laravel 11+ registration remains in bootstrap/providers.php |
| Asset strategy | Pass | No new panel assets are planned; deployment keeps the existing php artisan filament:assets step unchanged |

Phase 0 Research

Research outcomes are captured in /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/176-backup-quality-truth/research.md.

Key decisions:

  • Derive backup quality from existing item and version metadata rather than introducing a persisted backup-health model.
  • Treat backup lifecycle status and backup quality as separate truths on every affected surface.
  • Reuse the central snapshot-mode badge and shared badge semantics instead of introducing page-local color or status logic.
  • Extend the existing backup-set enterprise-detail builder, backup-items relation manager, policy-version resource, and restore wizard descriptions instead of creating a parallel dashboard or UI shell.
  • Surface backup-set and item quality in restore wizard selection steps before the current restore-safety checks and preview steps, without turning quality hints into safety claims.
  • Keep quality truth visible for TENANT_VIEW users even when restore actions remain unavailable.
  • Use unknown quality only when the existing record does not contain authoritative metadata that can justify a stronger claim.
  • Extend the existing Pest and Livewire test surfaces rather than creating a new browser-first harness.

Phase 1 Design

Design artifacts are created under /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/176-backup-quality-truth/:

  • research.md: design and framework decisions for deriving and surfacing backup quality
  • data-model.md: existing entities, current metadata signals, and narrow derived backup-quality models
  • contracts/backup-quality-truth.openapi.yaml: internal logical contract for backup-set list and detail, backup-item relation rows, policy-version list and detail, and restore wizard selection surfaces
  • quickstart.md: focused automated and manual validation workflow for backup-quality truth hardening

Design decisions:

  • No schema migration is required; the design derives quality from existing backup_items.metadata, policy_versions.metadata, relationships, and current restore wizard state.
  • A narrow derived helper layer is justified because the same quality truth must appear consistently across backup-set list, backup-set detail, backup-items, policy versions, and restore selection surfaces.
  • Backup-set detail hardening stays inside BackupSetResource::enterpriseDetailPage() and existing enterprise-detail cards or sections rather than a new page shell.
  • Policy-version hardening stays inside the existing table and infolist schema, replacing disabled-action-only signaling with explicit quality truth.
  • Restore selection hardening stays inside RestoreRunResource::getWizardSteps() and restoreItemOptionData() so input quality appears before the existing checks and preview steps.
  • Snapshot mode remains the primary quality badge, while aggregate counts and next-action language stay derived and secondary.

Project Structure

Documentation (this feature)

specs/176-backup-quality-truth/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── backup-quality-truth.openapi.yaml
├── checklists/
│   └── requirements.md
└── tasks.md

Source Code (repository root)

app/
├── Filament/
│   └── Resources/
│       ├── BackupSetResource.php
│       ├── PolicyVersionResource.php
│       ├── RestoreRunResource.php
│       └── BackupSetResource/
│           └── RelationManagers/
│               └── BackupItemsRelationManager.php
├── Models/
│   ├── BackupItem.php
│   ├── BackupSet.php
│   └── PolicyVersion.php
├── Services/
│   ├── AssignmentBackupService.php
│   └── Intune/
│       ├── PolicySnapshotService.php
│       ├── RestoreRiskChecker.php
│       ├── RestoreService.php
│       └── VersionService.php
└── Support/
    ├── BackupQuality/
    │   ├── BackupQualityResolver.php
    │   └── BackupQualitySummary.php
    ├── Badges/
    │   └── Domains/
    │       └── PolicySnapshotModeBadge.php
    └── Ui/
        └── EnterpriseDetail/

tests/
├── Feature/
│   ├── Filament/
│   │   ├── BackupSetUiEnforcementTest.php
│   │   ├── BackupSetEnterpriseDetailPageTest.php
│   │   ├── BackupItemsRelationManagerFiltersTest.php
│   │   ├── BackupQualityTruthSurfaceTest.php
│   │   ├── PolicyVersionQualityTruthSurfaceTest.php
│   │   ├── PolicyVersionTest.php
│   │   ├── PolicyVersionRestoreViaWizardTest.php
│   │   ├── RestoreItemSelectionTest.php
│   │   └── RestoreSelectionQualityTruthTest.php
│   ├── Rbac/
│   │   ├── BackupItemsRelationManagerUiEnforcementTest.php
│   │   ├── BackupQualityVisibilityTest.php
│   │   ├── CreateRestoreRunAuthorizationTest.php
│   │   └── PolicyVersionsRestoreToIntuneUiEnforcementTest.php
│   └── RestoreRiskChecksWizardTest.php
└── Unit/
    ├── Support/
    │   └── BackupQuality/
    │       ├── BackupQualityResolverTest.php
    │       └── BackupSetQualitySummaryTest.php
    ├── AssignmentBackupServiceTest.php
    └── BackupItemTest.php

Structure Decision: Standard Laravel monolith. The implementation stays inside existing Filament resources, existing models and services that already hold the underlying metadata, and the current test structure. Any new helper types stay under the existing app/Support/BackupQuality/ namespace as a narrow derived layer shared across backup, version, and restore-selection surfaces.

Implementation Strategy

Phase A — Introduce Narrow Derived Backup-Quality Facts

Goal: Create one reusable derivation path for backup quality from current metadata without adding a new persistence model.

| Step | File | Change |
|---|---|---|
| A.1 | New narrow helper(s) under app/Support/ if needed | Introduce a minimal backup-quality resolver or read-model helper that computes snapshot mode, assignment capture issues, orphaned assignment flags, integrity warnings, aggregate counts, and next-action guidance from existing BackupItem and PolicyVersion metadata |
| A.2 | app/Models/BackupItem.php and, only if clearly justified, app/Models/PolicyVersion.php | Add small convenience helpers for repeated metadata checks where this reduces duplication without embedding presentation language into the models |
| A.3 | app/Support/Badges/Domains/PolicySnapshotModeBadge.php and shared copy seams only if needed | Reuse the current snapshot-mode badge as the canonical item or version completeness signal; add no new badge domain unless a shared value cannot be expressed through current badge semantics |

Phase B — Harden Backup-Set List And Detail Truth

Goal: Make backup-set surfaces answer stored versus degraded before diagnostics or restore intent.

| Step | File | Change |
|---|---|---|
| B.1 | app/Filament/Resources/BackupSetResource.php | Add a compact backup-quality summary to the table that stays separate from lifecycle status and uses aggregate degraded counts rather than status to imply quality |
| B.2 | app/Filament/Resources/BackupSetResource.php | Update enterpriseDetailPage() to place a quality summary card or section ahead of technical detail, including metadata-only count, assignment issue count, orphaned assignment count, one primary next action, and contextual related links that stay out of the header |
| B.3 | app/Filament/Resources/BackupSetResource.php query seams | Ensure the list and detail surfaces eager-load or aggregate the needed backup-item quality facts without introducing a new N+1 hotspot |
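
As a rough sketch of the aggregation in steps B.1 to B.3, the following plain-PHP example computes per-set counts in one pass over already-loaded items, so no per-row lookup is needed. The row shape, function names, `metadata_only` marker value, and next-action wording are all hypothetical; in the application the input would more likely come from an eager load or an aggregate query.

```php
<?php

declare(strict_types=1);

/**
 * Aggregate quality counts per backup set in a single pass over preloaded items.
 * Row shape and the 'metadata_only' marker value are assumptions for this sketch.
 *
 * @param  list<array{backup_set_id: int, metadata: array<string, mixed>}>  $items
 * @return array<int, array{total: int, metadata_only: int, assignment_issues: int, orphaned: int}>
 */
function summarizeBackupSets(array $items): array
{
    $summaries = [];

    foreach ($items as $item) {
        $setId = $item['backup_set_id'];
        $meta = $item['metadata'];

        $summaries[$setId] ??= ['total' => 0, 'metadata_only' => 0, 'assignment_issues' => 0, 'orphaned' => 0];

        $summaries[$setId]['total']++;
        $summaries[$setId]['metadata_only'] += (int) (($meta['source'] ?? null) === 'metadata_only');
        $summaries[$setId]['assignment_issues'] += (int) (($meta['assignments_fetch_failed'] ?? false) === true);
        $summaries[$setId]['orphaned'] += (int) (($meta['has_orphaned_assignments'] ?? false) === true);
    }

    return $summaries;
}

/** Pick one primary next action; the priority order and wording are hypothetical. */
function primaryNextAction(array $summary): string
{
    return match (true) {
        $summary['metadata_only'] > 0 => 'Review metadata-only items',
        $summary['assignment_issues'] > 0 => 'Review assignment capture issues',
        $summary['orphaned'] > 0 => 'Review orphaned assignments',
        default => 'No quality follow-up identified from current metadata',
    };
}

$summaries = summarizeBackupSets([
    ['backup_set_id' => 1, 'metadata' => ['source' => 'metadata_only']],
    ['backup_set_id' => 1, 'metadata' => ['assignments_fetch_failed' => true]],
    ['backup_set_id' => 2, 'metadata' => []],
]);

foreach ($summaries as $setId => $summary) {
    echo $setId, ': ', json_encode($summary), ' -> ', primaryNextAction($summary), PHP_EOL;
}
```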

Phase C — Harden Backup-Item And Policy-Version Truth

Goal: Expose item-level and version-level input quality directly where operators inspect captured records.

| Step | File | Change |
|---|---|---|
| C.1 | app/Filament/Resources/BackupSetResource/RelationManagers/BackupItemsRelationManager.php | Add per-item snapshot mode, assignment capture issue, and orphaned-assignment truth to the relation table, preserving the current inspect model and action placement |
| C.2 | app/Filament/Resources/PolicyVersionResource.php | Add explicit snapshot mode or quality columns plus a single empty-state CTA to the policy-version list so metadata-only versions are visible at scan speed |
| C.3 | app/Filament/Resources/PolicyVersionResource.php | Add an explicit backup-quality section to the policy-version detail infolist so restore availability no longer acts as the only quality signal |
| C.4 | app/Filament/Resources/PolicyVersionResource.php | Preserve current restore-via-wizard gating and tooltip behavior while making quality truth visible independently from action disablement |

Phase D — Harden Restore Selection Entry Points

Goal: Expose weak backup inputs before existing restore-safety checks and preview steps begin.

| Step | File | Change |
|---|---|---|
| D.1 | app/Filament/Resources/RestoreRunResource.php | Enrich backup-set option labels or helper copy on wizard step 1 with backup-quality summary facts and degraded counts |
| D.2 | app/Filament/Resources/RestoreRunResource.php | Enrich restoreItemOptionData() so wizard step 2 descriptions include snapshot mode and item-level degradation truth before any risk checks run |
| D.3 | app/Filament/Resources/RestoreRunResource.php and app/Filament/Resources/RestoreRunResource/Pages/CreateRestoreRun.php | Preserve the current step order and restore-safety authority, while ensuring backup-quality messaging stops short of safe to restore or recovery guaranteed language |
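
One way to picture steps D.2 and D.3 is a small description builder paired with a copy guard. The sketch below is illustrative only: both function names are hypothetical, the labels reuse the plan's operator vocabulary, and the two guarded phrases are the ones named in D.3.

```php
<?php

declare(strict_types=1);

/**
 * Build the quality description for one restore-wizard item option.
 * Labels reuse the plan's operator vocabulary; the function itself is hypothetical.
 */
function restoreItemQualityDescription(bool $metadataOnly, bool $assignmentIssue, bool $orphanedAssignments): string
{
    $parts = [$metadataOnly ? 'Metadata only' : 'Full payload'];

    if ($assignmentIssue) {
        $parts[] = 'Assignment issues';
    }

    if ($orphanedAssignments) {
        $parts[] = 'Orphaned assignments';
    }

    return implode(', ', $parts);
}

/** Return true when copy drifts into claims that belong to later restore-safety steps. */
function containsSafetyClaim(string $copy): bool
{
    foreach (['safe to restore', 'recovery guaranteed'] as $phrase) {
        if (stripos($copy, $phrase) !== false) {
            return true;
        }
    }

    return false;
}

$description = restoreItemQualityDescription(true, true, false);
echo $description, PHP_EOL;
echo containsSafetyClaim($description) ? 'claim found' : 'no safety claim', PHP_EOL;
```

A guard like this could back a feature test that asserts the wizard copy never makes a safety claim, including for items with no detected degradation.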

Phase E — Regression Protection And Focused Verification

Goal: Lock the new truth semantics into automated tests without weakening existing backup or restore behavior.

| Step | File | Change |
|---|---|---|
| E.1 | Existing and new unit tests under tests/Unit/Support/ | Add deterministic coverage for item-level quality derivation, aggregate backup-set counts, metadata-only detection, assignment failure mapping, and unknown-quality fallback |
| E.2 | tests/Feature/Filament/BackupSetEnterpriseDetailPageTest.php and new backup-set truth tests | Cover list or detail quality summary visibility, mixed-quality aggregation, and summary-first ordering |
| E.3 | tests/Feature/Filament/PolicyVersionTest.php, tests/Feature/Filament/PolicyVersionRestoreViaWizardTest.php, and new policy-version truth tests | Cover snapshot mode visibility, explicit detail quality truth, and non-reliance on disabled actions |
| E.4 | tests/Feature/Filament/RestoreItemSelectionTest.php and new restore-selection truth tests | Cover backup-set quality in step 1 and per-item quality in step 2 before risk checks |
| E.5 | RBAC tests under tests/Feature/Rbac/ | Preserve 404 versus 403 behavior and verify that TENANT_VIEW users still see quality truth without restore rights |
| E.6 | vendor/bin/sail bin pint --dirty --format agent and focused Pest runs | Required formatting and targeted verification before implementation is considered complete |

Key Design Decisions

D-001 — Backup quality is derived from existing capture truth, not stored separately

The current product already records the signals that matter: metadata-only source markers, assignment fetch failures, orphaned assignments, warnings, and integrity hints. The missing piece is a consistent way to aggregate and display them across surfaces.

D-002 — Backup lifecycle status and backup quality stay orthogonal

completed, partial, and failed remain capture-lifecycle truth. Aggregate backup-quality summaries answer whether the captured inputs appear strong or degraded as recovery input. The plan never reuses lifecycle status as a proxy for quality.
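
A minimal sketch of that separation, assuming a hypothetical row shape and quality labels: lifecycle status is passed through untouched, and the quality text is computed from degraded counts alone.

```php
<?php

declare(strict_types=1);

// Capture-lifecycle truth, using the status values named in the plan.
enum LifecycleStatus: string
{
    case Completed = 'completed';
    case Partial = 'partial';
    case Failed = 'failed';
}

/**
 * Build a list row that keeps lifecycle and quality as separate facts.
 * The row shape and quality labels are assumptions for illustration.
 *
 * @return array{status: string, quality: string}
 */
function backupSetRow(LifecycleStatus $status, int $totalItems, int $degradedItems): array
{
    // Quality is computed from item counts only; $status never feeds into it.
    $quality = match (true) {
        $totalItems === 0 => 'Unknown quality',
        $degradedItems > 0 => sprintf('%d of %d items degraded', $degradedItems, $totalItems),
        default => 'No degradation detected',
    };

    return ['status' => $status->value, 'quality' => $quality];
}

$row = backupSetRow(LifecycleStatus::Completed, 12, 3);
echo $row['status'], ' / ', $row['quality'], PHP_EOL;
```

The example row is deliberately a completed set with degraded items, which is the combination a status-only column would hide.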

D-003 — Snapshot completeness stays on the central badge system

The existing PolicySnapshotModeBadge already defines the primary full versus metadata only language. This slice reuses that badge instead of introducing a second status vocabulary for the same truth.

D-004 — Restore selection surfaces expose input quality, not safety approval

Step 1 and step 2 only need to tell the operator whether the chosen backup set or items look degraded. Restore safety, preview decisions, and execution readiness remain owned by the later steps and existing restore-safety logic.

D-005 — RBAC can suppress actions, not truth

Users with view rights must still see backup-quality truth even when restore entry points or maintenance actions are unavailable. Hiding or muting quality because of missing restore capability would falsify the surface.
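
The rule can be illustrated with a small surface builder in which capabilities filter only the action list. This is a sketch under assumptions: `TENANT_RESTORE` and `restore_via_wizard` are hypothetical names (only TENANT_VIEW appears in this plan), and the real checks go through the existing capability registry.

```php
<?php

declare(strict_types=1);

/**
 * Build the data for a backup-set surface for one user.
 *
 * @param  array<string, int>  $qualitySummary
 * @param  list<string>  $capabilities
 * @return array{quality: array<string, int>, actions: list<string>}
 */
function buildBackupSetSurface(array $qualitySummary, array $capabilities): array
{
    $actions = [];

    // 'TENANT_RESTORE' is a hypothetical capability name used only for this sketch.
    if (in_array('TENANT_RESTORE', $capabilities, true)) {
        $actions[] = 'restore_via_wizard';
    }

    // Quality is attached unconditionally: capabilities filter actions, never truth.
    return ['quality' => $qualitySummary, 'actions' => $actions];
}

$summary = ['metadata_only' => 2, 'assignment_issues' => 1];

$viewer = buildBackupSetSurface($summary, ['TENANT_VIEW']);
$operator = buildBackupSetSurface($summary, ['TENANT_VIEW', 'TENANT_RESTORE']);

echo json_encode($viewer), PHP_EOL;
echo json_encode($operator), PHP_EOL;
```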

D-006 — Existing Filament seams are sufficient

The current enterprise-detail builder, table columns, infolist sections, and checkbox-list descriptions already provide the UI seams this slice needs. A dashboard, custom shell, or new client-side state layer would be disproportionate.

D-007 — Unknown quality is an explicit fallback, not the default

The product should only emit unknown quality where current records truly lack authoritative metadata. If existing metadata can justify metadata-only, assignment issue, or orphaned assignments, the surface must say so directly.
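
A sketch of that fallback order follows. Which keys count as authoritative, the `metadata_only` marker value, and the returned labels are assumptions for the example. It also returns only the first matching finding, whereas the real summary reports several facts at once.

```php
<?php

declare(strict_types=1);

/**
 * Classify one record, falling back to 'unknown' only when no authoritative key is present.
 * Which keys count as authoritative, and the returned labels, are assumptions.
 *
 * @param  array<string, mixed>  $metadata
 */
function classifyQuality(array $metadata): string
{
    $authoritativeKeys = ['source', 'snapshot_source', 'assignments_fetch_failed', 'has_orphaned_assignments'];

    // Without any authoritative key there is nothing to justify a stronger claim.
    if (array_intersect_key($metadata, array_flip($authoritativeKeys)) === []) {
        return 'unknown';
    }

    if (in_array('metadata_only', [$metadata['source'] ?? null, $metadata['snapshot_source'] ?? null], true)) {
        return 'metadata-only';
    }

    if (($metadata['assignments_fetch_failed'] ?? false) === true) {
        return 'assignment issue';
    }

    if (($metadata['has_orphaned_assignments'] ?? false) === true) {
        return 'orphaned assignments';
    }

    return 'no issues detected';
}

echo classifyQuality(['display_name' => 'Baseline policy']), PHP_EOL;
echo classifyQuality(['source' => 'metadata_only']), PHP_EOL;
```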

Risk Assessment

| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Aggregation logic diverges between backup items, policy versions, and restore selection descriptions | High | Medium | Use one narrow derived helper path and cover it with mixed-quality unit and feature tests |
| Quality summary introduces N+1 queries or heavy per-row work on backup-set list pages | High | Medium | Preload relations or aggregate counts deliberately and add list-focused regression coverage |
| UI wording slips from backup quality into restore safety or tenant recoverability claims | High | Medium | Keep operator copy centralized and test for explicit non-claims on degraded and healthy-looking cases |
| Read-only users lose quality visibility because existing restore gating is accidentally reused | High | Medium | Add dedicated RBAC visibility tests for TENANT_VIEW members without restore capability |
| Metadata-only restore blocking semantics regress because selection hints are coupled too tightly to risk checks | Medium | Medium | Keep restore selection quality read-only and rerun focused restore-safety regression tests alongside the new surface tests |

Test Strategy

  • Extend existing backup-set, backup-items, policy-version, restore-selection, and RBAC Pest coverage before introducing any new harness.
  • Add unit tests for the narrow backup-quality helper so metadata-only detection, assignment issue mapping, orphaned-assignment mapping, and aggregate counts remain deterministic.
  • Add feature tests that prove a completed lifecycle status and a good-quality backup are no longer visually conflated on backup-set list and detail surfaces.
  • Add feature tests that prove metadata-only and assignment-capture issues are visible on backup items and policy versions without relying on disabled actions or late restore checks.
  • Add feature tests that prove restore wizard step 1 and step 2 expose degraded input before risk checks or preview generation.
  • Add RBAC tests that prove TENANT_VIEW users still see backup-quality truth while restore actions remain unavailable, and non-members still receive 404 semantics.
  • Re-run existing restore-safety and restore-selection tests so earlier input-quality visibility does not change existing risk-check or execution behavior.
  • Keep all tests Livewire v4 compatible and run the smallest affected subset through Sail before asking for a full-suite pass.

Complexity Tracking

No constitution violations or exception-driven complexity were identified. The only added structure is a narrow derived backup-quality helper layer justified by cross-surface reuse and the need to keep current metadata interpretation consistent across list, detail, and wizard selection surfaces.

Proportionality Review

  • Current operator problem: Operators can currently tell that a backup set, backup item, or policy version exists, but they cannot quickly tell whether it is strong, degraded, or metadata-only as recovery input before they reach deep diagnostics or restore-safety surfaces.
  • Existing structure is insufficient because: The relevant truth is fragmented across backup metadata, version metadata, assignment fetch flags, orphaned-assignment markers, and disabled restore actions. Presence is visible earlier than usefulness, which creates false trust.
  • Narrowest correct implementation: Add one narrow derived backup-quality helper path and integrate it directly into existing backup-set, backup-item, policy-version, and restore-selection surfaces without adding new persistence or a broad taxonomy framework.
  • Ownership cost created: A small amount of derivation logic, additional list or detail wiring, and focused unit and feature tests to keep the mapping stable.
  • Alternative intentionally rejected: A persisted backup-health table, a recovery-confidence score, or a dashboard-wide backup-health program. Each would create broader truth and ownership cost than the current operator problem requires.
  • Release truth: Current-release truth. This slice corrects the truth on already-shipped backup and version surfaces before later backup-health or recovery-confidence work builds on them.