TenantAtlas/specs/181-restore-safety-integrity/data-model.md
ahmido a107e7e41b feat: restore safety integrity and queue slide-over (#210)
## Summary
- add the Spec 181 restore-safety layer with scope fingerprinting, preview/check integrity states, execution safety snapshots, result attention, and operator-facing copy across the wizard, restore detail, and canonical operation detail
- add focused unit and feature coverage for restore-safety assessment, result attention, and restore-linked operation detail
- switch the finding exceptions queue `Inspect exception` action to a native Filament slide-over while preserving query-param-backed inline summary behavior

## Testing
- `vendor/bin/sail artisan test --compact tests/Feature/Monitoring/FindingExceptionsQueueTest.php tests/Feature/Filament/RestoreSafetyIntegrityWizardTest.php tests/Feature/Filament/RestoreResultAttentionSurfaceTest.php tests/Feature/Operations/RestoreLinkedOperationDetailTest.php tests/Unit/Support/RestoreSafety`

## Notes
- Spec 181 checklist is complete (`specs/181-restore-safety-integrity/checklists/requirements.md`)
- the branch still has unchecked follow-up tasks in `specs/181-restore-safety-integrity/tasks.md`: `T012`, `T018`, `T019`, `T023`, `T025`, `T029`, `T032`, `T033`, `T041`, `T042`, `T043`, `T044`
- Filament v5 / Livewire v4 compliance is preserved, no panel provider registration changes were made, no global-search behavior was added, destructive actions remain confirmation-gated, and no new Filament assets were introduced

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #210
2026-04-06 23:37:14 +00:00


# Data Model: Restore Safety Integrity
## Overview
This feature does not add or change a top-level persisted domain entity. It introduces a tighter derived safety model around the existing restore flow using current `RestoreRun`, `OperationRun`, risk-check, preview, and result data.

The central design task is to turn existing restore inputs and outputs into explicit operator truth without changing:
- `RestoreRun` ownership or route identity
- `OperationRun` ownership or lifecycle ownership
- existing backup, policy-version, and assignment storage
- existing write-gate, RBAC, and audit responsibilities
- the no-new-table boundary of this feature
## Existing Persistent Entities
### 1. RestoreRun
- Purpose: Tenant-owned restore record for scope selection, preview basis, checks basis, execution intent, and restore result detail.
- Existing persistent fields used by this feature:
  - `id`
  - `tenant_id`
  - `backup_set_id`
  - `operation_run_id`
  - `status`
  - `is_dry_run`
  - `requested_items`
  - `group_mapping`
  - `preview`
  - `results`
  - `metadata`
  - `requested_by`
  - `started_at`
  - `completed_at`
- Existing relationships used by this feature:
  - `tenant`
  - `backupSet`
  - `operationRun`
#### Proposed nested metadata additions
No new columns are required. If persisted historical truth is needed, this feature may add the following nested structures inside `RestoreRun.metadata`:
| Key | Type | Purpose |
|---|---|---|
| `scope_basis` | object | Historical snapshot of the restore scope used for checks, preview, or execution |
| `check_basis` | object | Fingerprint and timing for the last checks considered valid enough to persist with the run |
| `preview_basis` | object | Fingerprint and timing for the last preview considered valid enough to persist with the run |
| `execution_safety_snapshot` | object | Exact safety truth captured when a real restore was queued or executed |

Minimal persisted shape:
```text
metadata
├── scope_basis
│ ├── fingerprint
│ ├── scope_mode
│ ├── selected_item_ids
│ ├── group_mapping_fingerprint
│ └── captured_at
├── check_basis
│ ├── fingerprint
│ ├── ran_at
│ ├── blocking_count
│ ├── warning_count
│ └── result_codes
├── preview_basis
│ ├── fingerprint
│ ├── generated_at
│ └── summary
└── execution_safety_snapshot
├── evaluated_at
├── scope_fingerprint
├── preview_state
├── checks_state
├── safety_state
├── blocking_count
├── warning_count
├── primary_issue_code
└── follow_up_boundary
```
Notes:
- `scope_basis`, `check_basis`, and `preview_basis` may be persisted only when needed for historical result truth. They do not require independent lifecycle behavior.
- The snapshot is intentionally narrow. It stores the safety basis used at execution time, not a tenant-wide recovery claim.
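As a minimal sketch of the persisted shape above, the following Python builds the `execution_safety_snapshot` object and merges it into existing metadata. The function names, the ISO 8601 timestamp encoding, and the default `follow_up_boundary` value are assumptions for illustration; only the key names come from the tree above.

```python
from datetime import datetime, timezone
from typing import Optional


def build_execution_safety_snapshot(
    scope_fingerprint: str,
    preview_state: str,
    checks_state: str,
    safety_state: str,
    blocking_count: int,
    warning_count: int,
    primary_issue_code: Optional[str],
    evaluated_at: datetime,
    follow_up_boundary: str = "run_completed_not_recovery_proven",  # assumed default
) -> dict:
    """Return the narrow snapshot stored under metadata.execution_safety_snapshot."""
    return {
        # Assumption: timestamps are stored as UTC ISO 8601 strings.
        "evaluated_at": evaluated_at.astimezone(timezone.utc).isoformat(),
        "scope_fingerprint": scope_fingerprint,
        "preview_state": preview_state,
        "checks_state": checks_state,
        "safety_state": safety_state,
        "blocking_count": blocking_count,
        "warning_count": warning_count,
        "primary_issue_code": primary_issue_code,
        "follow_up_boundary": follow_up_boundary,
    }


def attach_snapshot(metadata: dict, snapshot: dict) -> dict:
    """Return a copy of metadata with the snapshot added; other keys are untouched."""
    return {**metadata, "execution_safety_snapshot": snapshot}
```

Returning a new dictionary rather than mutating the input keeps the existing metadata keys intact, which matches the no-new-columns boundary of this feature.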
### 2. OperationRun
- Purpose: Canonical workspace-owned monitoring record for restore execution.
- Existing persistent fields used by this feature:
  - `id`
  - `workspace_id`
  - `tenant_id`
  - `type`
  - `status`
  - `outcome`
  - `context`
  - `summary_counts`
  - `created_at`
  - `started_at`
  - `completed_at`
- Existing relationship and linkage used by this feature:
  - restore execution runs already carry `context.restore_run_id` or a direct `RestoreRun.operation_run_id` link

No schema change is planned for `OperationRun`.
## Derived Models
### 1. RestoreScopeFingerprint
Deterministic representation of the current restore scope.
| Field | Type | Source | Notes |
|---|---|---|---|
| `backupSetId` | integer | `backup_set_id` | Required |
| `scopeMode` | string | `scope_mode` | `all` or `selected` |
| `selectedItemIds` | `list<integer>` | `backup_item_ids` or `requested_items` | Sorted, unique, empty for `all` scope |
| `groupMapping` | object | normalized `group_mapping` | Keys sorted, explicit `SKIP` retained |
| `fingerprint` | string | derived hash | Canonical equality signal |

Rules:
- The fingerprint must change whenever any execution-affecting restore input changes.
- Pure confirmation inputs like `tenant_confirm` or `acknowledged_impact` are not part of the scope fingerprint.
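One way to satisfy these rules is to hash a canonical serialization of the execution-affecting inputs. The sketch below is in Python for illustration only; the function name and the choice of SHA-256 over sorted-key JSON are assumptions, not the feature's confirmed implementation.

```python
import hashlib
import json


def restore_scope_fingerprint(backup_set_id, scope_mode, selected_item_ids, group_mapping):
    """Return a deterministic hex digest for the execution-affecting restore scope."""
    # 'all' scope carries no item selection; otherwise sort and de-duplicate the IDs.
    items = [] if scope_mode == "all" else sorted({int(i) for i in selected_item_ids})
    canonical = {
        "backup_set_id": int(backup_set_id),
        "scope_mode": scope_mode,
        "selected_item_ids": items,
        # Explicit SKIP entries are kept, so skipping a group differs from not mapping it.
        "group_mapping": {str(k): str(v) for k, v in group_mapping.items()},
    }
    # sort_keys makes the serialization independent of insertion order.
    payload = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Confirmation inputs such as `tenant_confirm` or `acknowledged_impact` are simply never passed in, so toggling them cannot change the fingerprint.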
### 2. PreviewIntegrityState
Derived trust state for preview.
| Field | Type | Source | Notes |
|---|---|---|---|
| `state` | string | derived | `not_generated`, `current`, `stale`, `invalidated` |
| `freshnessPolicy` | string | derived | Fixed to `invalidate_after_mutation` for this feature |
| `fingerprint` | string or null | `preview_basis.fingerprint` or wizard state | Null if never generated |
| `generatedAt` | datetime or null | `preview_ran_at` or `preview_basis.generated_at` | Null if never generated |
| `invalidationReasons` | `list<string>` | derived | e.g. `scope_mismatch`, `mapping_changed`, `backup_set_changed` |
| `rerunRequired` | boolean | derived | True for all states except `current` |
| `displaySummary` | string | derived | Operator-facing explanation |
### 3. ChecksIntegrityState
Derived trust state for restore checks.
| Field | Type | Source | Notes |
|---|---|---|---|
| `state` | string | derived | `not_run`, `current`, `stale`, `invalidated` |
| `freshnessPolicy` | string | derived | Fixed to `invalidate_after_mutation` for this feature |
| `fingerprint` | string or null | `check_basis.fingerprint` or wizard state | Null if never run |
| `ranAt` | datetime or null | `checks_ran_at` or `check_basis.ran_at` | Null if never run |
| `blockingCount` | integer | `check_summary.blocking` | Preserved even if the state becomes invalid |
| `warningCount` | integer | `check_summary.warning` | Preserved even if the state becomes invalid |
| `invalidationReasons` | `list<string>` | derived | Same family as preview invalidation |
| `rerunRequired` | boolean | derived | True for all states except `current` |
### 4. ExecutionReadinessState
Technical ability to start restore execution.
| Field | Type | Source | Notes |
|---|---|---|---|
| `allowed` | boolean | derived from RBAC, write-gate, provider operability, hard blockers | Answers “can the system start?” |
| `blockingReasons` | `list<string>` | derived | `missing_capability`, `write_gate_blocked`, `provider_unavailable`, `risk_blocker` |
| `mutationScope` | string | derived | `simulation_only` or `microsoft_tenant` |
| `requiredCapability` | string | derived | existing registry entry, not a raw string literal in feature code |
### 5. RestoreSafetyAssessment
Decision-layer state that separates executable from safe.
| Field | Type | Source | Notes |
|---|---|---|---|
| `state` | string | derived | `blocked`, `risky`, `ready_with_caution`, `ready` |
| `executionReadiness` | object | `ExecutionReadinessState` | Technical startability |
| `previewIntegrity` | object | `PreviewIntegrityState` | Decision basis currentness |
| `checksIntegrity` | object | `ChecksIntegrityState` | Decision basis currentness |
| `positiveClaimSuppressed` | boolean | derived | True when warnings or integrity issues suppress calm claims |
| `primaryIssueCode` | string or null | derived | Most important blocker or warning reason |
| `primaryNextAction` | string | derived | e.g. `rerun_checks`, `regenerate_preview`, `adjust_scope`, `review_warnings` |

Derived-state rules:
- `blocked`: execution readiness is false, or risk blockers are present.
- `risky`: execution may be technically possible, but preview or checks are not current enough to support calm execution, or another integrity problem suppresses approval.
- `ready_with_caution`: current preview and current checks exist, blockers are absent, but at least one suppressive warning remains.
- `ready`: current preview and current checks exist, blockers are absent, warnings are absent or non-suppressive, and the operator can receive a calm execution signal.
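The derived-state rules above can be read as an ordered decision, sketched here in Python. The evaluation order (blockers first, then integrity, then warnings) and the parameter names are assumptions. The broader "another integrity problem" case under `risky` is not modeled.

```python
def assess_restore_safety(
    execution_allowed: bool,
    preview_state: str,
    checks_state: str,
    blocking_count: int,
    suppressive_warning_count: int,
) -> str:
    """Map readiness, integrity, and check counts to a RestoreSafetyAssessment state."""
    # Technical inability to start, or any risk blocker, always wins.
    if not execution_allowed or blocking_count > 0:
        return "blocked"
    # Executable, but the decision basis is not current: no calm claim.
    if preview_state != "current" or checks_state != "current":
        return "risky"
    # Current basis, no blockers, but warnings still suppress a calm signal.
    if suppressive_warning_count > 0:
        return "ready_with_caution"
    return "ready"


def positive_claim_suppressed(state: str) -> bool:
    """Assumed reading: only `ready` is allowed to show a calm positive claim."""
    return state != "ready"
```

For example, an allowed execution with a `stale` preview evaluates to `risky` even when there are zero blockers and zero warnings, which is the separation of executable from safe that this model exists to express.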
### 6. RestoreExecutionSafetySnapshot
Historical snapshot stored on the existing restore run when a real restore is queued.
| Field | Type | Source | Notes |
|---|---|---|---|
| `evaluatedAt` | datetime | confirmation time | Historical anchor |
| `scopeFingerprint` | string | `RestoreScopeFingerprint` | Basis used to queue execution |
| `previewState` | string | `PreviewIntegrityState.state` | Historical truth at queue time |
| `checksState` | string | `ChecksIntegrityState.state` | Historical truth at queue time |
| `safetyState` | string | `RestoreSafetyAssessment.state` | Historical decision truth |
| `blockingCount` | integer | checks summary | Historical fact |
| `warningCount` | integer | checks summary | Historical fact |
| `primaryIssueCode` | string or null | `RestoreSafetyAssessment.primaryIssueCode` | Audit-friendly summary |
| `followUpBoundary` | string | derived | e.g. `run_completed_not_recovery_proven` |
### 7. RestoreResultAttention
Derived result-follow-up truth for restore detail and linked monitoring surfaces.
| Field | Type | Source | Notes |
|---|---|---|---|
| `state` | string | derived | `not_executed`, `completed`, `partial`, `failed`, `completed_with_follow_up` |
| `followUpRequired` | boolean | derived | Primary operator signal |
| `primaryCauseFamily` | string | derived | `execution_failure`, `write_gate_or_rbac`, `provider_operability`, `missing_dependency_or_mapping`, `payload_quality`, `scope_mismatch`, `item_level_failure`, `none` |
| `summary` | string | derived | Short operator-facing summary |
| `primaryNextAction` | string | derived | One leading next step |
| `recoveryClaimBoundary` | string | derived | Explicitly states what the surface is not proving |

Decision rules:
- `partial`: mixed item outcomes or mixed assignment outcomes remain after execution.
- `completed_with_follow_up`: execution reached a terminal completed path, but unresolved warnings, skipped items, or open recovery work remain.
- `completed`: execution finished and no derived follow-up remains at the restore-run level. This state does not imply tenant recovery.
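A rough Python sketch of how these decision rules might be evaluated follows. The per-item outcome vocabulary (`applied`, `failed`, `skipped`) and the treatment of `not_executed` and `failed` are assumptions, because the rules above define only `partial`, `completed_with_follow_up`, and `completed`.

```python
def derive_result_state(executed, item_outcomes, has_unresolved_warnings=False):
    """Return a RestoreResultAttention state from per-item outcomes.

    item_outcomes is an iterable of hypothetical outcome strings:
    'applied', 'failed', or 'skipped'.
    """
    if not executed:
        return "not_executed"  # assumed: nothing ran, e.g. a dry run or draft
    outcomes = set(item_outcomes)
    if outcomes == {"failed"}:
        return "failed"  # assumed: every item failed
    if "failed" in outcomes:
        return "partial"  # mixed outcomes remain after execution
    if "skipped" in outcomes or has_unresolved_warnings:
        return "completed_with_follow_up"
    # No follow-up visible at the restore-run level; not a tenant recovery claim.
    return "completed"
```

Mixed assignment outcomes could be folded into `item_outcomes` in the same way, although how assignments are represented is not specified here.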
### 8. RestoreWizardPageModel
Server-driven page model for the wizard.
| Field | Type | Purpose |
|---|---|---|
| `currentScope` | `RestoreScopeFingerprint` | Shows what the operator is about to restore |
| `previewIntegrity` | `PreviewIntegrityState` | Shows whether preview still applies |
| `checksIntegrity` | `ChecksIntegrityState` | Shows whether checks still apply |
| `executionReadiness` | `ExecutionReadinessState` | Shows whether the system can technically start |
| `safetyAssessment` | `RestoreSafetyAssessment` | Shows whether the action is safe enough to claim calm readiness |
| `primaryGuidance` | object | One primary next step and supporting explanation |
### 9. RestoreRunDetailPageModel
Page model for the restore-run detail and result surface.
| Field | Type | Purpose |
|---|---|---|
| `header` | object | identity, backup set, mode, requested by, timestamps |
| `basisTruth` | object | preview basis, checks basis, execution safety snapshot |
| `resultAttention` | `RestoreResultAttention` | overall result truth and next step |
| `itemBreakdown` | `list<object>` | per-item and assignment outcomes |
| `diagnostics` | `list<object>` | raw preview, raw results, provider details, mapping detail |
### 10. RestoreOperationContinuationModel
Minimal restore-specific truth exposed on the canonical operation detail.
| Field | Type | Purpose |
|---|---|---|
| `restoreRunId` | integer | linked restore record |
| `resultAttention` | `RestoreResultAttention` | restore follow-up truth summary |
| `restoreDetailUrl` | string or null | safe deep link when entitled |
| `accessState` | string | `linked`, `unavailable`, `forbidden_by_scope` |
| `unavailableReason` | string or null | truthful degradation without broken links |
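The degradation behavior implied by `accessState` and `unavailableReason` might look like the Python sketch below. The reason codes, the `run_exists` and `viewer_entitled` inputs, and the `url_for` callback are hypothetical. The point is only that a deep link is emitted in the `linked` state and nowhere else.

```python
from typing import Callable


def continuation_link(
    restore_run_id: int,
    run_exists: bool,
    viewer_entitled: bool,
    url_for: Callable[[int], str],
) -> dict:
    """Build the link-related fields of a RestoreOperationContinuationModel."""
    if not run_exists:
        # Degrade truthfully instead of rendering a broken link.
        return {
            "access_state": "unavailable",
            "restore_detail_url": None,
            "unavailable_reason": "restore_run_missing",  # hypothetical code
        }
    if not viewer_entitled:
        return {
            "access_state": "forbidden_by_scope",
            "restore_detail_url": None,
            "unavailable_reason": "outside_viewer_scope",  # hypothetical code
        }
    return {
        "access_state": "linked",
        "restore_detail_url": url_for(restore_run_id),
        "unavailable_reason": None,
    }
```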
## Validation Rules
- Preview is `current` only when a preview basis exists, its fingerprint matches the current scope fingerprint, a parseable generated timestamp exists, and no covered mutation has invalidated the basis.
- Checks are `current` only when a check basis exists, its fingerprint matches the current scope fingerprint, a parseable checks timestamp exists, and no covered mutation has invalidated the basis.
- A fingerprint mismatch must classify preview or checks as `invalidated`, not merely `stale`.
- Preview or checks are classified as `stale` when evidence exists and no explicit fingerprint mismatch can be shown, but the required basis markers are incomplete, legacy, or otherwise insufficient to prove currentness on a persisted draft or run.
- This feature uses freshness policy `invalidate_after_mutation`; it does not add a separate age-based timeout for preview or checks inside the active wizard draft.
- `ready` requires `ExecutionReadinessState.allowed = true`, `PreviewIntegrityState.state = current`, `ChecksIntegrityState.state = current`, and no suppressive warnings or blockers.
- `ready_with_caution` requires current preview and checks integrity and zero blockers, with at least one suppressive warning remaining.
- `risky` remains possible when execution readiness is true but calm approval is suppressed by integrity or warning truth.
- `completed` on the result surface must never imply tenant recovery unless another feature later supplies external reconciliation proof.
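The `current`, `stale`, and `invalidated` rules for preview and checks share one shape, so a single classifier can serve both. The Python sketch below assumes the basis is a dictionary with a `fingerprint` key and an ISO 8601 timestamp under a caller-supplied key. The function names and the `mutated_since` flag are illustrative.

```python
from datetime import datetime
from typing import Optional


def _parseable(value) -> bool:
    """True when value is an ISO 8601 string that datetime can parse."""
    try:
        datetime.fromisoformat(value)
        return True
    except (TypeError, ValueError):
        return False


def classify_integrity(
    basis: Optional[dict],
    current_fingerprint: str,
    timestamp_key: str,   # 'generated_at' for preview, 'ran_at' for checks
    never_state: str,     # 'not_generated' for preview, 'not_run' for checks
    mutated_since: bool = False,
) -> str:
    """Classify a preview or checks basis against the current scope fingerprint."""
    if basis is None:
        return never_state
    stored = basis.get("fingerprint")
    # An explicit mismatch or a covered mutation is invalidation, not staleness.
    if mutated_since or (stored and stored != current_fingerprint):
        return "invalidated"
    # Evidence exists, but the markers cannot prove currentness.
    if not stored or not _parseable(basis.get(timestamp_key)):
        return "stale"
    return "current"
```

There is deliberately no age check: under `invalidate_after_mutation`, a basis with a matching fingerprint and a valid timestamp stays `current` until a covered mutation occurs.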
## State Notes
- `RestoreRunStatus` remains the persisted execution lifecycle enum. This feature does not replace it.
- Preview integrity, checks integrity, restore safety, and result attention are derived state families. They are not new top-level persisted enums.
- The only persisted addition this design allows is a narrow snapshot of the safety basis used for an actual restore run.