Summary
This PR implements Spec 049 – Backup/Restore Job Orchestration: all critical Backup/Restore execution paths are job-only, idempotent, tenant-scoped, and observable via run records + DB notifications (Phase 1). The UI no longer performs heavy Graph work inside request/Filament actions for these flows.

Why
We want predictable UX and operations at MSP scale:
• no timeouts / long-running requests
• reproducible run state + per-item results
• safe error persistence (no secrets / no token leakage)
• strict tenant isolation + auditability for write paths

What changed

Foundational (Runs + Idempotency + Observability)
• Added a shared RunIdempotency helper (dedupe while queued/running).
• Added a read-only BulkOperationRuns surface (list + view) for status/progress.
• Added DB notifications for run status changes (with “View run” link).

US1 – Policy “Capture snapshot” is job-only
• Policy detail “Capture snapshot” now:
  • creates/reuses a run (dedupe key: tenant + policy.capture_snapshot + policy DB id)
  • dispatches a queued job
  • returns immediately with notification + link to run detail
• Graph capture work moved fully into the job; request path stays Graph-free.

US3 – Restore runs orchestration is job-only + safe
• Live restore execution is queued and updates RestoreRun status/progress.
• Per-item outcomes are persisted deterministically (per internal DB record).
• Audit logging is written for live restore.
• Preview/dry-run is enforced as read-only (no writes).

Tenant isolation / authorization (non-negotiable)
• Run list/view/start are tenant-scoped and policy-guarded (cross-tenant access => 403, not 404).
• Explicit Pest tests cover cross-tenant denial and start authorization.

Tests / Verification
• ./vendor/bin/pint --dirty
• Targeted suite (examples):
  • policy capture snapshot queued + idempotency tests
  • restore orchestration + audit logging + preview read-only tests
  • run authorization / tenant isolation tests

Notes / Scope boundaries
• Phase 1 UX = DB notifications + run detail page. A global “progress widget” is tracked as Phase 2 and not required for merge.
• Resilience/backoff is tracked in tasks but can be iterated further after merge.

Review focus
• Dedupe behavior for queued/running runs (reuse vs create-new)
• Tenant scoping & policy gates for all run surfaces
• Restore safety: audit event + preview no-writes

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #56
Data Model: Backup/Restore Job Orchestration (049)
This feature relies on existing “run record” models/tables and (optionally) extends them to meet the orchestration requirements.
Entities
1) RestoreRun (restore_runs)
Purpose: Run record for restore executions and dry-run/preview workflows.
Model: App\Models\RestoreRun
Key fields (existing):
- id (PK)
- tenant_id (FK → tenants)
- backup_set_id (FK → backup_sets)
- requested_by (string|null)
- is_dry_run (bool)
- status (string)
- requested_items (json|null)
- preview (json|null): persisted preview output
- results (json|null): persisted execution output (may include per-item outcomes)
- failure_reason (text|null)
- started_at / completed_at (timestamp|null)
- metadata (json|null)
Relationships:
- RestoreRun belongsTo Tenant
- RestoreRun belongsTo BackupSet
State transitions (target):
queued → running → succeeded|failed|partial
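The transition rule above can be sketched as a small lookup table. This is an illustrative Python sketch, not the project's Laravel code: `RunStatus`, `ALLOWED_TRANSITIONS`, and `can_transition` are hypothetical names, and it assumes the diagram is strict (for example, no direct queued → failed move), which the spec does not state either way.

```python
from enum import Enum


class RunStatus(str, Enum):
    QUEUED = "queued"
    RUNNING = "running"
    SUCCEEDED = "succeeded"
    FAILED = "failed"
    PARTIAL = "partial"


# Allowed moves, read strictly from: queued -> running -> succeeded|failed|partial.
# Terminal states have no outgoing transitions (assumption).
ALLOWED_TRANSITIONS = {
    RunStatus.QUEUED: {RunStatus.RUNNING},
    RunStatus.RUNNING: {RunStatus.SUCCEEDED, RunStatus.FAILED, RunStatus.PARTIAL},
    RunStatus.SUCCEEDED: set(),
    RunStatus.FAILED: set(),
    RunStatus.PARTIAL: set(),
}


def can_transition(current: RunStatus, target: RunStatus) -> bool:
    """Return True when moving from `current` to `target` is allowed."""
    return target in ALLOWED_TRANSITIONS[current]


def transition(current: RunStatus, target: RunStatus) -> RunStatus:
    """Return the new status, or raise if the move is not allowed."""
    if not can_transition(current, target):
        raise ValueError(f"invalid transition: {current.value} -> {target.value}")
    return target
```

Keeping the table in one place means a job can call `transition` before persisting a status change, so an out-of-order update fails loudly rather than silently overwriting a terminal state.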
Validation constraints (creation/dispatch):
- tenant-scoped access required
- backup_set_id must belong to tenant
- preview/dry-run must not perform writes (constitution Read/Write Separation)
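As a rough illustration of the constraints listed above, the checks could look like the following Python sketch. All names here (`RestoreValidationError`, `BackupSet`, `validate_restore_request`, `guard_write`) are hypothetical and stand in for whatever policies and guards the application actually uses.

```python
from dataclasses import dataclass


class RestoreValidationError(Exception):
    """Raised when a restore request violates a creation/dispatch constraint."""


@dataclass(frozen=True)
class BackupSet:
    id: int
    tenant_id: int


def validate_restore_request(current_tenant_id: int, backup_set: BackupSet) -> None:
    # The backup set must belong to the tenant the caller is scoped to.
    if backup_set.tenant_id != current_tenant_id:
        raise RestoreValidationError("backup set does not belong to tenant")


def guard_write(is_dry_run: bool) -> None:
    # Call this before any write; preview/dry-run runs must stay read-only.
    if is_dry_run:
        raise RestoreValidationError("preview/dry-run must not perform writes")
```

In this sketch the tenant check runs once at creation/dispatch time, while `guard_write` would sit directly in front of each write so a dry run cannot reach the write path by accident.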
2) BulkOperationRun (bulk_operation_runs)
Purpose: Run record for background operations that process many internal items, including backup-set capture-like actions.
Model: App\Models\BulkOperationRun
Key fields (existing):
- id (PK)
- tenant_id (FK → tenants)
- user_id (FK → users)
- resource (string): e.g. policy, backup_set
- action (string): e.g. export, add_policies
- status (string): pending, running, completed, completed_with_errors, failed, aborted
- total_items, processed_items, succeeded, failed, skipped
- item_ids (jsonb)
- failures (jsonb|null): safe per-item error summaries
- audit_log_id (FK → audit_logs|null)
Relationships:
- BulkOperationRun belongsTo Tenant
- BulkOperationRun belongsTo User
Recommended additions (to satisfy FR-002/FR-004 cleanly):
- idempotency_key (string, indexed; uniqueness enforced for active statuses via partial index)
- started_at / finished_at (timestampTz)
- error_code (string|null)
- error_context (jsonb|null)
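To illustrate how an idempotency key with "unique while active" semantics behaves, here is a minimal in-memory Python sketch. It is not the RunIdempotency helper from the PR: `build_idempotency_key`, `Run`, and `RunStore` are hypothetical, the key format is an assumption built from the tenant + operation + record id parts, and the in-memory lookup only mimics what a partial unique index over active statuses would enforce in the database.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Statuses in which a run still blocks a duplicate (dedupe while queued/running).
ACTIVE_STATUSES = {"queued", "running"}


def build_idempotency_key(tenant_id: int, operation: str, target_id: int) -> str:
    # Hypothetical format: tenant + operation name + internal DB id of the target.
    return f"tenant:{tenant_id}:{operation}:{target_id}"


@dataclass
class Run:
    id: int
    idempotency_key: str
    status: str = "queued"


class RunStore:
    """In-memory stand-in for the runs table."""

    def __init__(self) -> None:
        self._runs: List[Run] = []

    def find_active(self, key: str) -> Optional[Run]:
        for run in self._runs:
            if run.idempotency_key == key and run.status in ACTIVE_STATUSES:
                return run
        return None

    def create_or_reuse(self, key: str) -> Tuple[Run, bool]:
        """Return (run, created). Reuse an active run instead of creating a duplicate."""
        existing = self.find_active(key)
        if existing is not None:
            return existing, False
        run = Run(id=len(self._runs) + 1, idempotency_key=key)
        self._runs.append(run)
        return run, True
```

Because only active runs block reuse, a finished run with the same key does not prevent a fresh run later. In a real database, the partial unique index would also need to cover the race between two concurrent inserts, which this single-threaded sketch does not model.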
State transitions (target):
queued → running → succeeded|failed|partial
- pending maps to queued
- completed_with_errors maps to partial
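The two stated mappings can be expressed as a small translation function. This Python sketch is an assumption-laden illustration: `to_target_status` is a hypothetical name, and because the spec only defines mappings for pending and completed_with_errors, the sketch deliberately refuses to guess how completed or aborted should map.

```python
# Target vocabulary from the state-transition line.
TARGET_STATUSES = {"queued", "running", "succeeded", "failed", "partial"}

# Only the mappings the spec states explicitly.
STATED_MAPPINGS = {
    "pending": "queued",
    "completed_with_errors": "partial",
}


def to_target_status(stored_status: str) -> str:
    """Translate a stored bulk_operation_runs status to the target vocabulary."""
    if stored_status in STATED_MAPPINGS:
        return STATED_MAPPINGS[stored_status]
    if stored_status in TARGET_STATUSES:
        # Values such as "running" and "failed" exist in both vocabularies.
        return stored_status
    # "completed" and "aborted" land here: their mapping is not defined in the spec.
    raise ValueError(f"no target mapping defined for status: {stored_status}")
```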
3) Notification Event (DB notifications)
Purpose: Persist state transitions and completion notices for the initiating user.
Storage: Laravel Notifications (DB channel).
Payload shape (target):
- tenant_id
- run_type (restore_run / bulk_operation_run)
- run_id
- status (queued/running/succeeded/failed/partial)
- counts (optional)
- safe_error_code + safe_error_context (optional)
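A payload of the shape listed above could be assembled as in the following Python sketch. `build_run_notification` is a hypothetical helper, not part of the Laravel notification classes; the field names come from the list above, while the choice to omit optional fields when they are not supplied is an assumption.

```python
from typing import Any, Dict, Optional

RUN_TYPES = {"restore_run", "bulk_operation_run"}
STATUSES = {"queued", "running", "succeeded", "failed", "partial"}


def build_run_notification(
    tenant_id: int,
    run_type: str,
    run_id: int,
    status: str,
    counts: Optional[Dict[str, int]] = None,
    safe_error_code: Optional[str] = None,
    safe_error_context: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build the notification payload; optional fields are omitted when absent."""
    if run_type not in RUN_TYPES:
        raise ValueError(f"unknown run_type: {run_type}")
    if status not in STATUSES:
        raise ValueError(f"unknown status: {status}")

    payload: Dict[str, Any] = {
        "tenant_id": tenant_id,
        "run_type": run_type,
        "run_id": run_id,
        "status": status,
    }
    if counts is not None:
        payload["counts"] = dict(counts)
    if safe_error_code is not None:
        payload["safe_error_code"] = safe_error_code
    if safe_error_context is not None:
        # Callers are expected to pass only sanitized context (no secrets or tokens).
        payload["safe_error_context"] = dict(safe_error_context)
    return payload
```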
Notes on “per-item outcomes” (FR-005)
- For restore workflows, per-item outcomes can initially be stored in restore_runs.results as a structured JSON array/object keyed by internal item identifiers.
- For bulk operations, per-item outcomes are already persisted as bulk_operation_runs.failures plus the counter columns.
- If Phase 1 needs relational per-item tables for querying/filtering, introduce a dedicated “run item results” table per run type (Phase 2+ preferred).
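As one possible reading of the first note, per-item outcomes keyed by internal item identifier could be summarized into counters and an overall run status as in this Python sketch. The per-item shape (a `status` of succeeded or failed plus an optional `error_code`) and the `summarize_results` helper are assumptions for illustration; the spec does not fix the JSON layout.

```python
from typing import Any, Dict


def summarize_results(results: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Derive counters and an overall status from per-item outcomes.

    `results` is keyed by internal item identifier, for example:
    {"101": {"status": "succeeded"}, "102": {"status": "failed", "error_code": "x"}}
    """
    succeeded = sum(1 for r in results.values() if r["status"] == "succeeded")
    failed = sum(1 for r in results.values() if r["status"] == "failed")

    if failed == 0:
        # An empty result set also lands here; a real implementation may treat it differently.
        status = "succeeded"
    elif succeeded == 0:
        status = "failed"
    else:
        status = "partial"

    return {"status": status, "succeeded": succeeded, "failed": failed}
```

Keying by internal identifier keeps the stored outcome deterministic per DB record, so re-running the summary over the same `results` value always yields the same status.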