Implementation Plan: Restore Safety Integrity
Branch: 181-restore-safety-integrity | Date: 2026-04-06 | Spec: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/181-restore-safety-integrity/spec.md
Input: Feature specification from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/181-restore-safety-integrity/spec.md
Summary
Harden the restore flow so operators can distinguish stale versus current preview truth, stale versus current checks truth, technical startability versus safety readiness, and run completion versus real follow-up truth without adding a new recovery persistence model. The implementation keeps RestoreRun and OperationRun as the existing sources of truth, introduces a narrow derived restore-safety layer for scope fingerprinting and integrity assessment, persists only a compact execution-time safety snapshot inside existing RestoreRun.metadata when needed, hardens the wizard and detail surfaces, and preserves restore-specific truth on the canonical operation detail page.
Key approach: work inside the existing RestoreRunResource, CreateRestoreRun, restore form component views, restore infolist entry views, and restore-linked OperationRunResource seams; add derived restore safety helpers under the existing application structure; keep all changes Filament v5 and Livewire v4 compliant; avoid new tables and new Graph contract paths; validate the result with focused Pest, Livewire, hardening, ops-UX, and RBAC coverage.
Technical Context
Language/Version: PHP 8.4, Laravel 12, Blade, Filament v5, Livewire v4
Primary Dependencies: Filament v5, Livewire v4, Pest v4, Laravel Sail, existing RestoreRunResource, RestoreService, RestoreRiskChecker, RestoreDiffGenerator, OperationRunResource, TenantlessOperationRunViewer, shared badge infrastructure, and existing RBAC or write-gate helpers
Storage: PostgreSQL with existing restore_runs and operation_runs records plus JSON or array-backed metadata, preview, results, and context; no schema change planned
Testing: Pest feature tests, Livewire page and action tests, unit tests for narrow derived restore-safety helpers, all run through Sail
Target Platform: Laravel web application in Sail locally and containerized Linux deployment in staging and production
Project Type: Laravel monolith web application
Performance Goals: Keep restore wizard and detail surfaces server-driven, avoid new render-time external calls, preserve quick operator scanability on confirm and result surfaces, and keep canonical operation detail DB-only at render time
Constraints: No new central recovery-state table, no new Graph contract path, no route identity change, no RBAC drift, no collapse of executable versus safe versus recovered semantics, no ad-hoc badge mappings, and no new global Filament assets
Scale/Scope: One tenant-scoped restore wizard, one tenant restore detail surface, one restore-linked canonical operation detail surface, a narrow derived restore-safety layer, and focused regression coverage across wizard, result, RBAC, and ops-UX behavior
Constitution Check
GATE: Passed before Phase 0 research. Re-checked after Phase 1 design and still passing.
| Principle | Status | Notes |
|---|---|---|
| Inventory-first | Pass | Backups remain immutable snapshots and no inventory ownership rule changes |
| Read/write separation | Pass | Real restore execution stays behind preview, checks, hard confirmation, audit, and tests |
| Graph contract path | Pass | No new Graph endpoints or contract registry changes; existing restore calls stay behind current restore services and GraphClientInterface |
| Deterministic capabilities | Pass | Existing capability registry and UiEnforcement or capability resolver remain authoritative |
| RBAC-UX planes and 404 vs 403 | Pass | Tenant restore surfaces remain tenant-scoped; canonical /admin/operations/{run} remains workspace-safe and tenant-safe |
| Workspace isolation | Pass | No workspace scope broadening; canonical monitoring remains workspace-member gated |
| Tenant isolation | Pass | Restore runs, restore previews, checks, and result detail stay tenant-owned and tenant-entitled |
| Dangerous and destructive confirmations | Pass | Existing archive, restore, rerun, and force-delete actions remain confirmation-gated; real execution remains hard-confirmed in the wizard |
| Global search safety | Pass | OperationRunResource remains non-globally-searchable, and this feature adds no new globally searchable resource. RestoreRunResource is not made newly searchable, and it already has a view page if search is later enabled |
| Run observability | Pass | Existing restore.execute operations continue to create or reuse OperationRun; no new run model is introduced |
| Ops-UX 3-surface feedback | Pass | Existing queued toast, progress surfaces, and terminal monitoring behavior remain authoritative |
| Ops-UX lifecycle ownership | Pass | OperationRun.status and OperationRun.outcome remain service-owned; this feature only adds restore-specific read truth |
| Ops-UX summary counts | Pass | No new OperationRun summary-count keys are required; restore-specific integrity stays on restore context |
| Data minimization | Pass | No new secrets or external payload exposure; detail diagnostics remain secondary |
| Proportionality (PROP-001) | Pass | New logic is limited to derived restore-safety helpers and optional nested metadata snapshotting on existing records |
| Persisted truth (PERSIST-001) | Pass | No new table; only a narrow execution-time safety snapshot may be stored on the existing restore run |
| Behavioral state (STATE-001) | Pass | New integrity, safety, and follow-up states directly change operator guidance and execution gating semantics |
| Badge semantics (BADGE-001) | Pass | Any new restore safety badges or chips must route through central badge or shared primitive semantics, not page-local mapping |
| Filament-native UI (UI-FIL-001) | Pass | Existing Filament wizard, sections, view fields, infolist entries, and shared primitives remain the primary UI seams |
| UI naming (UI-NAMING-001) | Pass | The plan preserves preview, checks, dry-run, restore, partial, and follow-up as operator vocabulary |
| Operator surfaces (OPSURF-001) | Pass | Wizard and result surfaces become more operator-first, not more diagnostic-first |
| Filament Action Surface Contract | Pass | No redundant view actions or empty action groups are introduced; list inspect model remains row click |
| Filament UX-001 | Pass with documented variance | The wizard remains structured and the detail page remains infolist-based with custom entry views, but still follows summary-first information architecture |
| Filament v5 / Livewire v4 compliance | Pass | The implementation stays inside the current Filament v5 and Livewire v4 stack |
| Provider registration location | Pass | No panel or provider changes; Laravel 11+ provider registration remains in bootstrap/providers.php |
| Asset strategy | Pass | No new panel assets are planned; deployment keeps the existing php artisan filament:assets step unchanged |
Phase 0 Research
Research outcomes are captured in /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/181-restore-safety-integrity/research.md.
Key decisions:
- Derive a deterministic restore scope fingerprint from existing wizard inputs instead of introducing a new persisted scope entity.
- Separate preview and checks integrity from blocker and warning severity so `no blockers` can no longer be misread as `safe`.
- Preserve invalidation evidence in wizard state instead of silently clearing prior preview and checks truth.
- Persist only a narrow execution-time safety snapshot inside `RestoreRun.metadata` when historical truth is required for restore detail.
- Derive result follow-up truth from existing results, assignment outcomes, and linked `OperationRun` outcome without adding a recovery entity.
- Preserve restore-specific follow-up truth on canonical operation detail via enrichment or a safe deep link rather than an `OperationRun` schema change.
- Reuse Filament wizard, action, and infolist seams plus existing Pest and Livewire test patterns instead of introducing a new UI shell or browser-first harness.
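The first decision, a deterministic scope fingerprint derived from wizard inputs, could be sketched roughly as follows. This is a minimal illustration, not the project's helper: the class name `RestoreScopeFingerprint`, the input keys in the usage note, and the choice to treat list-valued inputs as unordered sets are all assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: hashes a canonical form of the wizard scope inputs so
 * that equal scopes always produce the same fingerprint.
 */
final class RestoreScopeFingerprint
{
    /** @param array<string, mixed> $scope */
    public static function fromScope(array $scope): string
    {
        return hash('sha256', json_encode(self::canonicalize($scope), JSON_THROW_ON_ERROR));
    }

    private static function canonicalize(mixed $value): mixed
    {
        if (! is_array($value)) {
            return $value;
        }

        if (array_is_list($value)) {
            // Assumption: lists are selections, so their order must not matter.
            $items = array_map(self::canonicalize(...), $value);
            usort($items, static fn (mixed $a, mixed $b): int => strcmp(
                json_encode($a, JSON_THROW_ON_ERROR),
                json_encode($b, JSON_THROW_ON_ERROR),
            ));

            return $items;
        }

        // Associative input: sort keys so input order does not matter either.
        ksort($value);

        return array_map(self::canonicalize(...), $value);
    }
}
```

Under these assumptions, `RestoreScopeFingerprint::fromScope(['backup_set_id' => 7, 'item_ids' => [3, 1, 2]])` and the same scope with reordered keys or item IDs yield the same hash, while removing one item ID yields a different one.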
Phase 1 Design
Design artifacts are created under /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/181-restore-safety-integrity/:
- data-model.md: existing entities, narrow metadata additions, and derived restore safety models
- contracts/restore-safety-integrity.openapi.yaml: internal logical contract for the wizard, create submission, restore detail, and restore-linked canonical operation detail
- quickstart.md: focused automated and manual validation workflow for restore safety hardening
Design decisions:
- No schema migration is required; the design reuses `RestoreRun`, `OperationRun`, and existing JSON-backed fields.
- Historical execution truth may be captured inside existing `RestoreRun.metadata` as a narrow safety snapshot rather than as a new entity.
- Wizard hardening remains inside `RestoreRunResource::getWizardSteps()` and `CreateRestoreRun`, with restore form component views displaying integrity state and guidance.
- Result hardening remains inside existing restore detail infolist entry views and the restore-linked canonical operation detail seams.
- Test coverage stays focused on restore wizard, restore detail, linked operation detail, hardening, ops-UX, and RBAC behavior.
Project Structure
Documentation (this feature)
specs/181-restore-safety-integrity/
├── spec.md
├── plan.md
├── research.md
├── data-model.md
├── quickstart.md
├── contracts/
│   └── restore-safety-integrity.openapi.yaml
├── checklists/
│   └── requirements.md
└── tasks.md
Source Code (repository root)
app/
├── Filament/
│   ├── Pages/
│   │   └── Operations/
│   │       └── TenantlessOperationRunViewer.php
│   └── Resources/
│       ├── OperationRunResource.php
│       └── RestoreRunResource.php
│           └── Pages/
│               ├── CreateRestoreRun.php
│               ├── ListRestoreRuns.php
│               └── ViewRestoreRun.php
├── Models/
│   └── RestoreRun.php
├── Services/
│   └── Intune/
│       ├── RestoreDiffGenerator.php
│       ├── RestoreRiskChecker.php
│       └── RestoreService.php
├── Support/
│   ├── Badges/
│   │   └── Domains/
│   │       ├── RestoreCheckSeverityBadge.php
│   │       ├── RestorePreviewDecisionBadge.php
│   │       ├── RestoreResultStatusBadge.php
│   │       └── RestoreRunStatusBadge.php
│   ├── OpsUx/
│   │   └── OperationUxPresenter.php
│   └── RestoreRunStatus.php
resources/
└── views/
    └── filament/
        ├── forms/
        │   └── components/
        │       ├── restore-run-checks.blade.php
        │       └── restore-run-preview.blade.php
        └── infolists/
            └── entries/
                ├── restore-preview.blade.php
                └── restore-results.blade.php
tests/
├── Feature/
│   ├── Filament/
│   │   ├── RestorePreviewTest.php
│   │   ├── RestoreRunUiEnforcementTest.php
│   │   └── [new or expanded restore safety integrity page tests]
│   ├── OpsUx/
│   │   └── RestoreExecutionOperationRunSyncTest.php
│   ├── Operations/
│   │   └── [new or expanded restore-linked operation detail tests]
│   ├── Hardening/
│   │   └── [existing restore start gate tests]
│   ├── RestoreRiskChecksWizardTest.php
│   └── RestoreRunWizardExecuteTest.php
└── Unit/
    └── [new narrow restore safety resolver tests under app/Support]
Structure Decision: Standard Laravel monolith. The implementation stays inside existing Filament resources, Blade views, restore services, and monitoring seams. Any new helper types stay under existing app/Support or another already-established application namespace. No new base folders or standalone subsystems are required.
Implementation Strategy
Phase A — Introduce Scope Fingerprinting And Derived Integrity State
Goal: Create the smallest possible restore-safety layer that can explain whether preview and checks still apply to the current scope.
| Step | File | Change |
|---|---|---|
| A.1 | app/Support/RestoreRunStatus.php plus a new narrow restore safety helper namespace under app/Support/ | Introduce derived scope fingerprint and integrity assessment helpers without changing persisted RestoreRunStatus, and make invalidate_after_mutation the explicit freshness policy for wizard-scoped evidence |
| A.2 | app/Models/RestoreRun.php | Add narrow metadata accessors or helpers for scope_basis, check_basis, preview_basis, and execution_safety_snapshot |
| A.3 | app/Support/Badges/Domains/ and any shared primitive seam needed | Add central state-to-badge or label mappings only if the new integrity or safety states are surfaced as badges |
Phase B — Harden Wizard Invalidation And Confirmation
Goal: Turn the existing wizard into an explicit restore safety gate instead of a sequence that silently forgets prior evaluation work.
| Step | File | Change |
|---|---|---|
| B.1 | app/Filament/Resources/RestoreRunResource.php | Extend getWizardSteps() to compute and compare scope fingerprints, preserve invalidation evidence, and separate execution readiness from safety readiness |
| B.2 | app/Filament/Resources/RestoreRunResource/Pages/CreateRestoreRun.php | Ensure the final create flow validates current preview, current checks, matching fingerprint, and hard-confirm state before a real restore queues |
| B.3 | resources/views/filament/forms/components/restore-run-checks.blade.php | Surface current, stale, invalidated, or not_run states with one primary next step |
| B.4 | resources/views/filament/forms/components/restore-run-preview.blade.php | Surface preview integrity state, generated-at truth, and rerun guidance without calm false positives |
| B.5 | app/Filament/Resources/RestoreRunResource.php or app/Services/Intune/RestoreService.php | Persist a narrow execution_safety_snapshot inside existing RestoreRun.metadata when a real restore is queued |
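The confirm-step validation in B.2 might look something like the sketch below. It is framework-free on purpose and only illustrates the gating idea. The function name `confirmGateRefusals`, its parameters, and the refusal messages are hypothetical; only the state names come from this plan.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical confirm-step gate. Returns the reasons a real restore must not
 * queue yet; an empty list means every condition named in B.2 is satisfied.
 *
 * @return list<string>
 */
function confirmGateRefusals(
    string $previewState,
    string $checksState,
    string $evidenceFingerprint,
    string $currentFingerprint,
    bool $hardConfirmed,
): array {
    $refusals = [];

    if ($previewState !== 'current') {
        $refusals[] = "preview is {$previewState}; regenerate it for the current scope";
    }

    if ($checksState !== 'current') {
        $refusals[] = "checks are {$checksState}; rerun them for the current scope";
    }

    if (! hash_equals($currentFingerprint, $evidenceFingerprint)) {
        $refusals[] = 'scope fingerprint no longer matches the evaluated scope';
    }

    if (! $hardConfirmed) {
        $refusals[] = 'hard confirmation is missing';
    }

    return $refusals;
}
```

Returning reasons rather than a single boolean lets the wizard show one primary next step while tests assert on each refusal separately.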
Phase C — Harden Restore Result And Detail Truth
Goal: Ensure restore detail answers follow-up truth and next action before raw result lists.
| Step | File | Change |
|---|---|---|
| C.1 | app/Filament/Resources/RestoreRunResource.php | Build a result-attention model from existing results, assignment outcomes, and linked run context |
| C.2 | resources/views/filament/infolists/entries/restore-preview.blade.php | Show which preview basis applied and whether it was current, stale, or invalidated |
| C.3 | resources/views/filament/infolists/entries/restore-results.blade.php | Elevate overall result truth, follow-up truth, primary cause family, and one primary next action above raw item detail |
Phase D — Preserve Restore Truth On Canonical Operation Detail
Goal: Keep restore-specific follow-up truth visible in canonical monitoring without duplicating restore persistence.
| Step | File | Change |
|---|---|---|
| D.1 | app/Filament/Resources/OperationRunResource.php | Add restore-linked continuation truth for restore.execute runs using existing restore linkage and tenant-safe deep-link behavior |
| D.2 | app/Filament/Pages/Operations/TenantlessOperationRunViewer.php | Preserve restore-specific guidance or safe restore-detail links without broken navigation when deeper access is unavailable |
Phase E — Regression Protection And Focused Verification
Goal: Lock the new safety semantics into automated tests and protect existing restore orchestration behavior.
| Step | File | Change |
|---|---|---|
| E.1 | tests/Feature/RestoreRunWizardExecuteTest.php | Extend confirmation coverage to include fingerprint and integrity-state validation |
| E.2 | tests/Feature/RestoreRiskChecksWizardTest.php | Extend checks-state persistence and invalidation coverage |
| E.3 | tests/Feature/Filament/RestorePreviewTest.php and new restore safety detail tests | Cover preview integrity, stale versus invalidated display, and calmness suppression |
| E.4 | tests/Feature/Filament/RestoreRunUiEnforcementTest.php | Preserve 404 versus 403 behavior and disabled-action truth under reduced capability |
| E.5 | tests/Feature/OpsUx/RestoreExecutionOperationRunSyncTest.php and new restore-linked operation detail tests | Preserve OperationRun continuity and restore-specific follow-up visibility from canonical monitoring |
| E.6 | New unit tests under tests/Unit/Support/ | Cover scope fingerprint generation, integrity classification, safety assessment, and result attention derivation |
| E.7 | vendor/bin/sail bin pint --dirty --format agent and focused Pest runs | Required formatting and targeted verification before implementation is considered complete |
Key Design Decisions
D-001 — Scope mismatch must be explicit, not inferred from missing data
The current wizard safety behavior already clears preview and checks when some scope inputs change. This plan formalizes that behavior as explicit invalidation truth so the operator can see that prior work existed and was invalidated by a specific change.
D-002 — Execution-time safety truth belongs to the restore run, not a new recovery entity
The operator needs historical truth about what basis was used when a real restore was queued. That justifies a narrow metadata snapshot on the existing RestoreRun but does not justify a second persisted model.
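As a rough illustration of how small such a snapshot can stay, the sketch below nests it under the `execution_safety_snapshot` key named in the implementation strategy. The function name and the inner keys (`scope_fingerprint`, `preview_integrity`, `checks_integrity`, `captured_at`) are assumptions made for this example, not the agreed data model.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: returns the metadata array with a compact snapshot of
 * the safety basis in force when the real restore was queued. All other
 * metadata keys are left untouched.
 *
 * @param  array<string, mixed>  $metadata
 * @return array<string, mixed>
 */
function withExecutionSafetySnapshot(
    array $metadata,
    string $scopeFingerprint,
    string $previewIntegrity,
    string $checksIntegrity,
    DateTimeImmutable $capturedAt,
): array {
    $metadata['execution_safety_snapshot'] = [
        'scope_fingerprint' => $scopeFingerprint,   // illustrative key
        'preview_integrity' => $previewIntegrity,   // illustrative key
        'checks_integrity' => $checksIntegrity,     // illustrative key
        'captured_at' => $capturedAt->format(DATE_ATOM),
    ];

    return $metadata;
}
```

Because the snapshot is written once at queue time, the detail page can render historical truth from it instead of recomputing from current logic.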
D-003 — Result meaning must be derived from existing restore outputs, not from RestoreRun.status alone
completed, partial, and failed remain important lifecycle statuses, but the operator-facing follow-up truth comes from the combination of lifecycle, item results, assignment outcomes, and linked operation context.
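One way to picture that combination is the sketch below. The lifecycle values `completed`, `partial`, and `failed` come from this plan; the per-row `applied` status, the `failed` operation outcome, the function name, and the returned labels are placeholders invented for the example.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical derivation of operator-facing follow-up truth from lifecycle
 * status, item results, assignment outcomes, and the linked operation outcome.
 *
 * @param list<array{status: string}> $itemResults
 * @param list<array{status: string}> $assignmentOutcomes
 */
function deriveFollowUpTruth(
    string $runStatus,
    array $itemResults,
    array $assignmentOutcomes,
    ?string $operationOutcome,
): string {
    // Count rows that did not end in the assumed "applied" status.
    $unresolved = static fn (array $rows): int => count(array_filter(
        $rows,
        static fn (array $row): bool => ($row['status'] ?? '') !== 'applied',
    ));

    if ($runStatus === 'failed' || $operationOutcome === 'failed') {
        return 'failed_needs_review';
    }

    if ($runStatus === 'partial' || $unresolved($itemResults) > 0 || $unresolved($assignmentOutcomes) > 0) {
        return 'follow_up_required';
    }

    if ($runStatus === 'completed') {
        // Deliberately weaker than "recovered": nothing outstanding was detected.
        return 'no_follow_up_detected';
    }

    return 'not_yet_known';
}
```

In this sketch a `completed` run with one unapplied assignment still reports `follow_up_required`, which is the distinction D-003 asks the result surface to make visible.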
D-004 — Canonical operation detail must acknowledge restore-specific follow-up without becoming the restore source of truth
OperationRun stays the monitoring record. RestoreRun stays the restore truth. The canonical operation surface should expose restore continuation meaning or link to it, not clone restore persistence.
D-005 — Filament-native seams are sufficient for this hardening slice
Filament wizard steps, view fields, custom infolist views, confirmation patterns, and Livewire action tests already fit the feature. The plan therefore avoids a parallel UI framework or custom client-side state layer.
D-006 — Restore evidence freshness is mutation-sensitive, not age-window-driven
This slice uses the repo's existing invalidate_after_mutation freshness language for wizard-scoped derived state. Matching fingerprint plus valid capture markers is enough for current inside the active draft. invalidated represents explicit scope drift after a covered mutation, while stale is reserved for legacy or incomplete persisted evidence that cannot prove currentness.
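The classification described in D-006 can be expressed as a small pure function. The sketch below is one possible reading: the four state values appear in this plan, while the enum name, the function name, and the `fingerprint` and `generated_at` basis keys are assumptions.

```php
<?php

declare(strict_types=1);

enum EvidenceIntegrity: string
{
    case NotRun = 'not_run';
    case Current = 'current';
    case Invalidated = 'invalidated';
    case Stale = 'stale';
}

/**
 * Hypothetical classifier for preview or checks evidence.
 *
 * @param array{fingerprint?: mixed, generated_at?: mixed}|null $basis
 */
function classifyEvidence(?array $basis, string $currentFingerprint): EvidenceIntegrity
{
    if ($basis === null) {
        return EvidenceIntegrity::NotRun;
    }

    $fingerprint = $basis['fingerprint'] ?? null;
    $generatedAt = $basis['generated_at'] ?? null;

    // Legacy or incomplete evidence cannot prove currentness, so it is stale.
    if (! is_string($fingerprint) || $fingerprint === '' || ! is_string($generatedAt) || $generatedAt === '') {
        return EvidenceIntegrity::Stale;
    }

    // Complete evidence is current only while the scope fingerprint still matches.
    return hash_equals($currentFingerprint, $fingerprint)
        ? EvidenceIntegrity::Current
        : EvidenceIntegrity::Invalidated;
}
```

No clock is consulted anywhere in the function, which reflects the mutation-sensitive rather than age-window-driven policy.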
Risk Assessment
| Risk | Impact | Likelihood | Mitigation |
|---|---|---|---|
| Scope fingerprint is too narrow and misses a real execution-affecting change | High | Medium | Define the fingerprint from actual restore inputs used by checks and preview, cover it with unit tests and wizard regression tests |
| Historical safety truth drifts if the detail page recomputes everything from current logic | High | Medium | Persist a narrow execution-time safety snapshot on the existing restore run |
| New integrity states exist but the UI still reads calmly | High | Medium | Lock calmness suppression into wizard and detail tests, not only into helper code |
| Restore-specific truth disappears on canonical operation detail | Medium | Medium | Add explicit restore continuation coverage on the operation detail seams |
| The slice grows into a recovery dashboard or new persisted health system | Medium | Low | Keep the design constrained to existing restore and operation records, with no new table |
Test Strategy
- Extend existing restore wizard, preview, hardening, RBAC, and ops-UX Pest coverage before adding any new test harness.
- Add unit tests for the narrow derived restore safety helpers so fingerprint, integrity, safety, and result attention logic stay deterministic.
- Extend existing restore audit, execution-job, and preview-diff tests so invalidation reasoning remains derivable from restore records and the current execution and diff flows remain behaviorally intact.
- Add feature tests that prove stale or invalidated preview and checks suppress calm execution language.
- Add feature tests that prove scope changes invalidate prior readiness and that confirm-step validation refuses calm execution when integrity conditions are not met.
- Add feature tests that prove partial or completed-with-follow-up results are elevated above raw item lists and do not imply tenant recovery.
- Add canonical operation-detail tests that prove restore follow-up truth remains visible or safely linked.
- Re-run the existing ops-UX constitution and notification guards for direct status transitions, terminal DB notifications, canonical View run links, queued toast copy, and whitelisted `summary_counts` so reuse of `OperationRun` cannot regress the three-surface feedback contract.
- Keep the manual `quickstart.md` validation pass as an explicit completion step so the 15-second and one-click operator outcomes are verified, not merely assumed from automated coverage.
- Keep all tests Livewire v4 compatible and run the smallest affected subset through Sail before asking for a full-suite pass.
Complexity Tracking
No constitution violations or exception-driven complexity were identified. The only added complexity is the narrow derived restore-safety layer and the compact persisted execution-time safety snapshot already justified by the proportionality review.
Proportionality Review
- Current operator problem: Operators can currently treat stale preview or stale checks as if they still authorize the current restore scope, and can read `completed` as calmer than the product can prove.
- Existing structure is insufficient because: Existing restore flow data exists, but presence alone does not distinguish current versus invalid or safe versus merely executable. Existing result rendering does not elevate follow-up truth strongly enough.
- Narrowest correct implementation: Add a narrow derived restore-safety layer plus optional nested metadata snapshotting on the existing restore run. Reuse existing wizard, result, and operation-detail surfaces instead of creating a second workflow or persistence model.
- Ownership cost created: A small set of derived helpers, central state mapping, new view-model wiring, and additional unit and feature tests.
- Alternative intentionally rejected: A new recovery-health table, a tenant-wide recovery dashboard, or a generalized trust framework. Each was rejected as too broad for the current operator problem.
- Release truth: Current-release truth. This feature hardens already-shipped restore behavior before broader backup-quality or recovery-confidence work depends on it.