Summary

This PR implements Spec 049 – Backup/Restore Job Orchestration: all critical Backup/Restore execution paths are job-only, idempotent, tenant-scoped, and observable via run records + DB notifications (Phase 1). The UI no longer performs heavy Graph work inside request/Filament actions for these flows.

Why

We want predictable UX and operations at MSP scale:
• no timeouts / long-running requests
• reproducible run state + per-item results
• safe error persistence (no secrets / no token leakage)
• strict tenant isolation + auditability for write paths

What changed

Foundational (Runs + Idempotency + Observability)
• Added a shared RunIdempotency helper (dedupe while queued/running).
• Added a read-only BulkOperationRuns surface (list + view) for status/progress.
• Added DB notifications for run status changes (with “View run” link).

US1 – Policy “Capture snapshot” is job-only
• Policy detail “Capture snapshot” now:
  • creates/reuses a run (dedupe key: tenant + policy.capture_snapshot + policy DB id)
  • dispatches a queued job
  • returns immediately with notification + link to run detail
• Graph capture work moved fully into the job; request path stays Graph-free.

US3 – Restore runs orchestration is job-only + safe
• Live restore execution is queued and updates RestoreRun status/progress.
• Per-item outcomes are persisted deterministically (per internal DB record).
• Audit logging is written for live restore.
• Preview/dry-run is enforced as read-only (no writes).

Tenant isolation / authorization (non-negotiable)
• Run list/view/start are tenant-scoped and policy-guarded (cross-tenant access => 403, not 404).
• Explicit Pest tests cover cross-tenant denial and start authorization.

Tests / Verification
• ./vendor/bin/pint --dirty
• Targeted suite (examples):
  • policy capture snapshot queued + idempotency tests
  • restore orchestration + audit logging + preview read-only tests
  • run authorization / tenant isolation tests

Notes / Scope boundaries
• Phase 1 UX = DB notifications + run detail page. A global “progress widget” is tracked as Phase 2 and not required for merge.
• Resilience/backoff is tracked in tasks but can be iterated further after merge.

Review focus
• Dedupe behavior for queued/running runs (reuse vs create-new)
• Tenant scoping & policy gates for all run surfaces
• Restore safety: audit event + preview no-writes

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #56
# Implementation Plan: Backup/Restore Job Orchestration (049)

**Branch**: `feat/049-backup-restore-job-orchestration-session-1768091854` | **Date**: 2026-01-11 | **Spec**: [specs/049-backup-restore-job-orchestration/spec.md](specs/049-backup-restore-job-orchestration/spec.md)
**Input**: Feature specification from `specs/049-backup-restore-job-orchestration/spec.md`

**Note**: This template is filled in by the `/speckit.plan` command. See `.specify/scripts/` for helper scripts.

## Summary

Move all backup/restore “start/execute” actions off the interactive request path.

- Interactive actions must only create (or reuse) a tenant-scoped Run Record and enqueue work.
- Background jobs perform Graph calls, capture/restore work, and update run records with status + counts + safe error summaries.
- Idempotency prevents double-click duplicates by reusing an active run for the same `(tenant + operation type + target)`.
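
The reuse rule in the last bullet can be sketched in a few lines. This is a framework-neutral illustration in Python, not the Laravel implementation: `RunStore`, `Run`, and the status names `queued`/`running` are assumptions made for the example, and the in-memory list stands in for the run table.

```python
from dataclasses import dataclass, field
from itertools import count

# Assumed names for "active" statuses; the plan only says "active run".
ACTIVE_STATUSES = {"queued", "running"}


@dataclass
class Run:
    id: int
    tenant_id: int
    operation: str
    target_id: int
    status: str = "queued"


@dataclass
class RunStore:
    """Hypothetical in-memory stand-in for the run table."""
    runs: list = field(default_factory=list)
    _ids: count = field(default_factory=lambda: count(1))

    def start(self, tenant_id, operation, target_id):
        """Return (run, created); reuse an active run with the same key."""
        key = (tenant_id, operation, target_id)
        for run in self.runs:
            same_key = (run.tenant_id, run.operation, run.target_id) == key
            if same_key and run.status in ACTIVE_STATUSES:
                return run, False  # double-click: hand back the existing run
        run = Run(next(self._ids), tenant_id, operation, target_id)
        self.runs.append(run)
        return run, True  # caller would enqueue the job only in this case
```

Calling `start` twice with the same key returns the same run while it is queued or running; once that run reaches a terminal status, the next call creates a new one. A database-backed version would additionally need a guard against two requests racing between the lookup and the insert, which this sketch does not model.
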

Design choices are captured in [specs/049-backup-restore-job-orchestration/research.md](specs/049-backup-restore-job-orchestration/research.md).

## Phasing

### Phase 1 (this spec’s implementation target)

- Ensure all in-scope operations are job-only (no heavy work inline).
- Create/reuse run records with idempotency for active runs.
- Provide **Run detail** views for progress (status + counts) and **DB notifications** for state transitions.
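
To make "status + counts" and "notifications for state transitions" concrete, here is a minimal Python sketch of a run's lifecycle. The status names, the allowed transitions, and the rule that any failed item marks the run as failed are all assumptions for illustration; the `notifications` list stands in for DB notifications.

```python
from dataclasses import dataclass, field

# Assumed lifecycle; the plan only requires status + counts + notifications.
ALLOWED_TRANSITIONS = {
    "queued": {"running"},
    "running": {"completed", "failed"},
}


@dataclass
class RunProgress:
    total: int
    status: str = "queued"
    succeeded: int = 0
    failed: int = 0
    notifications: list = field(default_factory=list)

    def transition(self, new_status):
        """Move to new_status and record one notification per transition."""
        if new_status not in ALLOWED_TRANSITIONS.get(self.status, set()):
            raise ValueError(f"illegal transition {self.status} -> {new_status}")
        self.status = new_status
        # Stand-in for a DB notification carrying a "View run" link.
        self.notifications.append(f"Run is now {new_status}")

    def record_item(self, ok):
        """Count one per-item outcome; only valid while the run is running."""
        if self.status != "running":
            raise ValueError("items can only be recorded while running")
        if ok:
            self.succeeded += 1
        else:
            self.failed += 1

    def finish(self):
        # Assumption: any failed item marks the whole run as failed.
        self.transition("failed" if self.failed else "completed")
```

A run detail view would then only need to read `status`, `succeeded`, `failed`, and `total` from the record; it never has to call Graph itself.
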

### Phase 2 (explicitly out-of-scope for Phase 1)

- Add a **global progress widget** that surfaces all run types (not just bulk ops) across the admin UI.

## Technical Context

**Language/Version**: PHP 8.4.15
**Primary Dependencies**: Laravel 12, Filament 4, Livewire 3
**Storage**: PostgreSQL (JSONB used for run payloads/summaries where appropriate)
**Testing**: Pest 4 (feature tests + job tests)
**Target Platform**: Containerized web app (Sail for local dev; Dokploy for staging/prod)
**Project Type**: Web application (Laravel monolith)
**Performance Goals**: 95% of start actions confirm “queued” within 2 seconds (SC-001)
**Constraints**: No heavy work during interactive requests; jobs must be idempotent + observable; no secrets in run records
**Scale/Scope**: Multi-tenant MSP usage; long-running Graph operations; frequent retries/double-click scenarios

## Constitution Check

*GATE: Must pass before Phase 0 research. Re-check after Phase 1 design.*

- Inventory-first: orchestration is run-record centric; inventory stays “last observed”, backups remain explicit actions.
- Read/write separation: preview/dry-run stays read-only; live restore remains behind explicit confirmation + audit + tests.
- Graph contract path: all Graph calls remain behind `GraphClientInterface` and contract registry (`config/graph_contracts.php`).
- Deterministic capabilities: no new capability derivation introduced by this feature (existing resolver remains authoritative).
- Tenant isolation: all run visibility + execution is tenant-scoped; no cross-tenant run access.
- Automation: enforce de-duplication for active runs; jobs use locks and backoff for HTTP 429/503 responses where applicable.
- Data minimization: run records store only safe summaries (error codes + whitelisted context), never secrets/tokens.
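
The data-minimization rule can be illustrated with a small whitelist filter. This is a Python sketch under stated assumptions: `safe_error_summary` and the keys in `ALLOWED_CONTEXT_KEYS` are hypothetical names chosen for the example, not identifiers from the codebase.

```python
# Hypothetical whitelist; the real set of safe keys is a project decision.
ALLOWED_CONTEXT_KEYS = {"policy_id", "http_status", "graph_error_code"}


def safe_error_summary(error_code, context):
    """Build the summary stored on a run record.

    Only the error code and whitelisted context keys are kept, so raw
    exception messages, headers, and tokens never reach the run record.
    """
    return {
        "error_code": error_code,
        "context": {
            key: value
            for key, value in context.items()
            if key in ALLOWED_CONTEXT_KEYS
        },
    }
```

Filtering by an allow-list rather than stripping known-bad keys means a newly added context field stays out of run records until someone explicitly marks it as safe.
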

## Project Structure

### Documentation (this feature)

```text
specs/049-backup-restore-job-orchestration/
├── plan.md              # This file (/speckit.plan command output)
├── research.md          # Phase 0 output (/speckit.plan command)
├── data-model.md        # Phase 1 output (/speckit.plan command)
├── quickstart.md        # Phase 1 output (/speckit.plan command)
├── contracts/           # Phase 1 output (/speckit.plan command)
└── tasks.md             # Phase 2 output (/speckit.tasks command - NOT created by /speckit.plan)
```

### Source Code (repository root)

```text
app/
├── Filament/
│   └── Resources/
├── Jobs/
├── Livewire/
├── Models/
├── Services/
└── Support/

database/
└── migrations/

resources/
└── views/

tests/
├── Feature/
└── Unit/
```

**Structure Decision**: Laravel monolith; orchestration implemented via queued jobs + run records in existing models/tables.

## Complexity Tracking

> **Fill ONLY if Constitution Check has violations that must be justified**

| Violation | Why Needed | Simpler Alternative Rejected Because |
|-----------|------------|-------------------------------------|
| [e.g., 4th project] | [current need] | [why 3 projects insufficient] |
| [e.g., Repository pattern] | [specific problem] | [why direct DB access insufficient] |

This feature requires no constitution violations; the rows above are unfilled template examples.