Tasks: Backup/Restore Job Orchestration (049)
Input: Design documents from specs/049-backup-restore-job-orchestration/
Prerequisites: plan.md (required), spec.md (required), research.md, data-model.md, contracts/, quickstart.md
Tests: REQUIRED (Pest) for these runtime behavior changes.
MVP scope: Strictly limited to T001–T016 (US1 only). The global progress widget (T037, listed under Phase 7 of this task list) belongs to product Phase 2 and is explicitly NOT part of the MVP.
Phase 1: Setup (Shared Infrastructure)
- T001 Verify queue + DB notifications prerequisites in config/queue.php and database/migrations/notifications (add missing migration if needed)
- T002 Confirm existing run tables and status enums used by RestoreRun in app/Support/RestoreRunStatus.php and database/migrations/2025_12_10_000150_create_restore_runs_table.php
- T003 [P] Add quickstart sanity commands for this feature in specs/049-backup-restore-job-orchestration/quickstart.md
Phase 2: Foundational (Blocking Prerequisites)
⚠️ CRITICAL: No user story work should begin until this phase is complete.
- T004 Add idempotency support to bulk_operation_runs via database/migrations/2026_01_11_120001_add_idempotency_key_to_bulk_operation_runs_table.php
- T005 Add idempotency support to restore_runs via database/migrations/2026_01_11_120002_add_idempotency_key_to_restore_runs_table.php
- T006 [P] Add casts/fillables for idempotency + timestamps in app/Models/BulkOperationRun.php and app/Models/RestoreRun.php
- T007 Implement idempotency key helpers in app/Support/RunIdempotency.php (build key, find active run, enforce reuse)
- T008 [P] Add a read-only Filament resource to inspect run details for BulkOperationRun in app/Filament/Resources/BulkOperationRunResource.php
- T009 [P] Add notification for run status transitions in app/Notifications/RunStatusChangedNotification.php (DB channel)
- T010 Add unit tests for RunIdempotency helpers in tests/Unit/RunIdempotencyTest.php
CRITICAL (must-fix before implementing any new run flows): Tenant isolation + authorization
- T042 Add tenant-scoped authorization for run list/view/start across all run flows (BulkOperationRun + RestoreRun) using policies/resources and ensure every query is tenant-scoped (e.g., app/Filament/Resources/BulkOperationRunResource.php, app/Filament/Resources/RestoreRunResource.php, and each start action/page that creates runs)
- T043 [P] Add Pest feature tests that run list/view are tenant-scoped (cannot list/view another tenant’s runs) in tests/Feature/RunAuthorizationTenantIsolationTest.php
- T044 [P] Add Pest feature tests that unaffiliated users cannot start runs (capture snapshot / restore execute / preview / backup set capture) in tests/Feature/RunStartAuthorizationTest.php
Checkpoint: Foundation ready (idempotency + run detail view + notifications + tenant-scoped authorization).
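The idempotency helpers from T007 (build key, find active run, enforce reuse) can be sketched independently of the framework. The following is a minimal, language-neutral illustration in Python, not the contents of app/Support/RunIdempotency.php. The names `build_key`, `RunStore`, and `start_or_reuse`, the SHA-256 hash, and the choice of `queued`/`running` as the "active" statuses are all assumptions made for this sketch.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Optional

# Assumption: only non-terminal runs block a new start with the same key.
ACTIVE_STATUSES = {"queued", "running"}


def build_key(tenant_id: str, operation: str, target_id: str, scope: dict) -> str:
    """Hash the run identity so equivalent starts map to the same key."""
    # sort_keys makes the hash independent of key order in the scope payload.
    payload = json.dumps(
        {"tenant": tenant_id, "operation": operation, "target": target_id, "scope": scope},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


@dataclass
class Run:
    id: int
    idempotency_key: str
    status: str = "queued"


@dataclass
class RunStore:
    """In-memory stand-in for a runs table (hypothetical)."""

    runs: list = field(default_factory=list)

    def find_active(self, key: str) -> Optional[Run]:
        for run in self.runs:
            if run.idempotency_key == key and run.status in ACTIVE_STATUSES:
                return run
        return None

    def start_or_reuse(self, key: str) -> tuple:
        """Return (run, created); created is False when an active run was reused."""
        existing = self.find_active(key)
        if existing is not None:
            return existing, False
        run = Run(id=len(self.runs) + 1, idempotency_key=key)
        self.runs.append(run)
        return run, True
```

A double-click then resolves to the same run: two calls to `start_or_reuse` with the same key return the same object, and only the first reports `created=True`. A database-backed version would also need a guard against two requests racing past `find_active` at the same moment; that is outside the scope of this sketch.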
Phase 3: User Story 1 - Capture snapshot runs in background (Priority: P1) 🎯 MVP
Goal: Capturing a policy snapshot never blocks the UI; it creates/reuses a run record and processes in a queued job with visible progress.
Independent Test: Trigger “Capture snapshot” on a policy; the request returns quickly and a BulkOperationRun transitions queued → running → succeeded|failed|partial, with details viewable.
Tests (write first)
- T011 [P] [US1] Add Pest feature test that capture snapshot queues a job (no inline capture) in tests/Feature/PolicyCaptureSnapshotQueuedTest.php
- T012 [P] [US1] Add Pest feature test that double-click reuses the active run (idempotency) in tests/Feature/PolicyCaptureSnapshotIdempotencyTest.php
Implementation
- T013 [US1] Create queued job to capture one policy snapshot in app/Jobs/CapturePolicySnapshotJob.php (updates BulkOperationRun counts + failures)
- T014 [US1] Update UI action to create/reuse run and dispatch job in app/Filament/Resources/PolicyResource/Pages/ViewPolicy.php
- T015 [P] [US1] Add linking from UI notifications to BulkOperationRunResource view page in app/Filament/Resources/BulkOperationRunResource.php
- T016 [US1] Ensure failures are safe/minimized (no secrets) when recording run failures in app/Services/BulkOperationService.php
Checkpoint: User Story 1 is independently usable and testable.
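The run lifecycle named in the independent test (queued → running → succeeded|failed|partial) can be expressed as a small transition table. This is an illustrative Python sketch only; the function names are hypothetical, and the rule that a run with zero failures counts as succeeded (including an empty run) is an assumption rather than something the spec states.

```python
# Allowed status transitions, taken from the lifecycle described above.
ALLOWED = {
    "queued": {"running"},
    "running": {"succeeded", "failed", "partial"},
}
TERMINAL = {"succeeded", "failed", "partial"}


def transition(current: str, new: str) -> str:
    """Return the new status, or raise if the move is not allowed."""
    if new not in ALLOWED.get(current, set()):
        raise ValueError(f"illegal transition {current} -> {new}")
    return new


def terminal_status(succeeded: int, failed: int) -> str:
    """Derive the terminal status from item counts (assumed rule)."""
    if failed == 0:
        return "succeeded"
    if succeeded == 0:
        return "failed"
    return "partial"


def finish(current: str, succeeded: int, failed: int) -> str:
    """Move a running run to the terminal status implied by its counts."""
    return transition(current, terminal_status(succeeded, failed))
```

Keeping the table explicit means a job cannot move a run out of a terminal status or skip `running`, which is what lets the run detail view and notifications trust the recorded status.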
Phase 4: User Story 3 - Restore runs in background with per-item results (Priority: P1)
Goal: Restore execution and re-run restore operate exclusively via queued jobs, with persisted per-item outcomes and safe error summaries visible in the run detail UI.
Independent Test: Starting restore creates/reuses a RestoreRun in queued state, queues execution, and later shows item outcomes without relying on logs.
Tests (write first)
- T017 [P] [US3] Add Pest feature test that restore execution reuses active run for identical (tenant+backup_set+scope) starts in tests/Feature/RestoreRunIdempotencyTest.php
- T018 [P] [US3] Extend existing restore job test to assert per-item outcome persistence in tests/Feature/ExecuteRestoreRunJobTest.php
- T045 [P] [US3] Add Pest feature test that live restore writes an audit event (run-id linked) in tests/Feature/RestoreAuditLoggingTest.php
Implementation
- T019 [US3] Implement idempotency key computation for restore runs (tenant + operation + target + scope hash) in app/Support/RunIdempotency.php
- T020 [US3] Update restore run creation/execute flow to reuse active runs (no duplicates) in app/Filament/Resources/RestoreRunResource.php
- T021 [US3] Update app/Jobs/ExecuteRestoreRunJob.php to set started/finished timestamps and emit DB notifications (queued/running/terminal)
- T022 [US3] Persist deterministic per-item outcomes into restore_runs.results (keyed by backup_item_id) in app/Services/Intune/RestoreService.php
- T023 [US3] Derive total/succeeded/failed counts from persisted results and surface in RestoreRunResource view/table in app/Filament/Resources/RestoreRunResource.php
- T046 [US3] Ensure live restore execution emits an auditable event linked to the run (e.g., audit_logs FK or structured audit record) in app/Jobs/ExecuteRestoreRunJob.php and/or app/Services/Intune/RestoreService.php
Checkpoint: Restore runs are job-only, idempotent, and observable with item outcomes.
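T022 and T023 together imply that the counts shown in the UI are computed from the persisted per-item results rather than stored separately. A minimal Python sketch of that derivation follows. The shape of `restore_runs.results` used here (a mapping from `backup_item_id` to an entry with a `status` field) is an assumption for illustration; the real shape is defined by the data model, not by this example.

```python
from collections import Counter


def summarize(results: dict) -> dict:
    """Derive total/succeeded/failed from results keyed by backup_item_id."""
    # Entries without a status are counted in the total but in neither bucket.
    counts = Counter(entry.get("status", "unknown") for entry in results.values())
    return {
        "total": len(results),
        "succeeded": counts["succeeded"],
        "failed": counts["failed"],
    }
```

Because the results are keyed by item id, re-running the job for the same item overwrites its entry instead of appending a duplicate, so the derived counts stay deterministic.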
Phase 5: User Story 2 - Backup set create/capture runs in background (Priority: P2)
Goal: Creating a backup set and adding policies to a backup set does not perform Graph-heavy snapshot capture inline; capture occurs in jobs with a run record.
Independent Test: Creating a backup set returns quickly and produces a BulkOperationRun showing progress; adding policies via the picker also queues work.
Tests (write first)
- T024 [P] [US2] Add Pest feature test that backup set create does not run capture inline and instead queues a job in tests/Feature/BackupSetCreateCaptureQueuedTest.php
- T025 [P] [US2] Add Pest feature test that “Add selected” in policy picker queues background work in tests/Feature/BackupSetPolicyPickerQueuesCaptureTest.php
Implementation
- T026 [US2] Refactor capture work out of BackupService::createBackupSet into separate methods in app/Services/Intune/BackupService.php
- T027 [US2] Create queued job to capture backup set items in app/Jobs/CaptureBackupSetJob.php (uses BackupService; updates BulkOperationRun)
- T028 [US2] Update backup set create flow to create backup_set record quickly and dispatch CaptureBackupSetJob in app/Filament/Resources/BackupSetResource.php
- T029 [US2] Create queued job to add policies to a backup set (and capture foundations if requested) in app/Jobs/AddPoliciesToBackupSetJob.php
- T030 [US2] Update bulk action in app/Livewire/BackupSetPolicyPickerTable.php to create/reuse BulkOperationRun and dispatch AddPoliciesToBackupSetJob
Checkpoint: Backup set capture workloads are job-only and observable.
Phase 6: User Story 4 - Dry-run/preview runs in background (Priority: P2)
Goal: Restore preview generation is queued, persisted, and viewable without re-execution.
Independent Test: Clicking “Generate preview” returns quickly; a queued RestoreRun performs the diff generation asynchronously and persists preview output that the UI can display.
Tests (write first)
- T031 [P] [US4] Add Pest feature test that preview generation queues a job (no inline RestoreDiffGenerator call) in tests/Feature/RestorePreviewQueuedTest.php
- T032 [P] [US4] Add Pest feature test that preview results persist and are reusable in tests/Feature/RestorePreviewPersistenceTest.php
- T047 [P] [US4] Add Pest feature test that preview/dry-run never performs writes (must be read-only) in tests/Feature/RestorePreviewReadOnlySafetyTest.php
Implementation
- T033 [US4] Create queued job to generate preview diffs and persist to restore_runs.preview + metadata in app/Jobs/GenerateRestorePreviewJob.php
- T034 [US4] Update preview action in app/Filament/Resources/RestoreRunResource.php to create/reuse a dry-run RestoreRun and dispatch GenerateRestorePreviewJob
- T035 [US4] Update restore run view component to read preview from the persisted run record in resources/views/filament/forms/components/restore-run-preview.blade.php
- T036 [US4] Emit DB notifications for preview queued/running/completed/failed transitions in app/Jobs/GenerateRestorePreviewJob.php
- T048 [US4] Enforce preview/dry-run read-only behavior: block write-capable operations and record a safe failure if a write would occur (in app/Jobs/GenerateRestorePreviewJob.php and/or restore diff generation service)
Checkpoint: Preview is asynchronous, persisted, and visible.
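One way to approach the read-only requirement in T048 is to wrap the client used during preview so that write-capable calls fail before they are sent, and to record a safe failure on the run. The Python sketch below illustrates the idea under stated assumptions: `ReadOnlyClient`, `WriteBlockedError`, `run_preview`, and the `preview_write_blocked` error code are hypothetical names, and the inner client is assumed to expose a `request(method, url, **kwargs)` method.

```python
READ_ONLY_METHODS = {"GET", "HEAD", "OPTIONS"}


class WriteBlockedError(RuntimeError):
    """Raised when a preview attempts a write-capable call."""


class ReadOnlyClient:
    """Wraps a client and rejects any method that could modify state."""

    def __init__(self, inner):
        self._inner = inner

    def request(self, method: str, url: str, **kwargs):
        if method.upper() not in READ_ONLY_METHODS:
            # Fail before anything is sent to the remote API.
            raise WriteBlockedError(f"{method.upper()} blocked during preview")
        return self._inner.request(method, url, **kwargs)


def run_preview(client, generate) -> dict:
    """Run generate() against a guarded client; return a safe outcome record."""
    guarded = ReadOnlyClient(client)
    try:
        return {"status": "succeeded", "preview": generate(guarded)}
    except WriteBlockedError:
        # Record only a stable code, not the request payload.
        return {"status": "failed", "error_code": "preview_write_blocked"}
```

Some APIs use POST for read-like calls; a guard of this kind would need an explicit allow-list for those endpoints rather than a blanket method check.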
Phase 7: Global Progress Widget (All Run Types; product Phase 2, not MVP)
- T037 [P] Add a global progress widget for restore runs (product Phase 2 requirement) by extending app/Livewire/BulkOperationProgress.php or adding a dedicated Livewire component in app/Livewire/RestoreRunProgress.php
Phase 8: Polish & Cross-Cutting Concerns
- T038 Ensure Graph throttling/backoff behavior is applied inside queued jobs (429/503) in app/Services/Intune/PolicySnapshotService.php and app/Services/Intune/RestoreService.php
- T039 [P] Add/extend run status notification formatting to include safe error codes/contexts in app/Notifications/RunStatusChangedNotification.php
- T040 Run formatter on modified files: vendor/bin/pint --dirty
- T041 Run targeted tests for affected areas: tests/Feature/Restore, tests/Feature/BackupSet, tests/Feature/Policy (use php artisan test with filters)
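The throttling behavior in T038 (retry on 429/503 inside queued jobs) typically combines the server's `Retry-After` hint with exponential backoff. The Python sketch below shows that general pattern; it is not the logic in PolicySnapshotService or RestoreService. The `send` callable, its `(status, headers, body)` return shape, and the default limits are assumptions chosen for illustration.

```python
import random
import time

RETRYABLE = {429, 503}


def call_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call send() until it returns a non-retryable status or attempts run out.

    send() is assumed to return a (status_code, headers, body) tuple.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status not in RETRYABLE or attempt == max_attempts:
            return status, headers, body
        retry_after = headers.get("Retry-After")
        if retry_after is not None and str(retry_after).isdigit():
            # Prefer the server's hint when it is given in whole seconds.
            delay = float(retry_after)
        else:
            # Exponential backoff with jitter to avoid synchronized retries.
            delay = base_delay * (2 ** (attempt - 1)) + random.uniform(0, base_delay)
        sleep(min(delay, max_delay))
```

Passing `sleep` as a parameter keeps the function testable without real waiting. In a queue worker, releasing the job back to the queue with a delay is often preferable to sleeping inside the worker process.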
Dependencies & Execution Order
Story order
- Phase 1, then Phase 2, must complete before any user story work starts.
- After Phase 2:
  - US1 and US3 can proceed in parallel.
  - US4 can proceed in parallel but may be easiest after US3 (shared RestoreRun patterns).
  - US2 can proceed independently after Phase 2.
Dependency graph
- Without the global widget: Setup → Foundational → { US1, US2, US3, US4 } → Polish
- With the global widget (product Phase 2): Setup → Foundational → { US1, US2, US3, US4 } → Phase 2 Global Widget → Polish
- Suggested minimal MVP: Setup → Foundational → US1
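The dependency graph above can be checked mechanically. The Python sketch below encodes the variant that includes the global widget and uses the standard library's `graphlib` to produce a valid execution order; the node labels and the `ready_after` helper are illustrative names, not part of the spec.

```python
from graphlib import TopologicalSorter

# Each node maps to the set of nodes it depends on.
DEPENDS_ON = {
    "Setup": set(),
    "Foundational": {"Setup"},
    "US1": {"Foundational"},
    "US2": {"Foundational"},
    "US3": {"Foundational"},
    "US4": {"Foundational"},
    "Global Widget": {"US1", "US2", "US3", "US4"},
    "Polish": {"Global Widget"},
}


def execution_order(graph: dict) -> list:
    """Return one valid order in which every dependency precedes its dependents."""
    return list(TopologicalSorter(graph).static_order())


def ready_after(graph: dict, done: set) -> list:
    """List the nodes that can start once everything in `done` has finished."""
    return sorted(n for n, deps in graph.items() if n not in done and deps <= done)
```

Calling `ready_after(DEPENDS_ON, {"Setup", "Foundational"})` lists all four user stories, which matches the statement that they can proceed in parallel once the foundational phase is complete.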
Parallel execution examples
US1
- In parallel: T011 (queues test), T012 (idempotency test)
- In parallel: T013 (job), T014 (UI action update) after foundational tasks
US2
- In parallel: T024 (create queues test), T025 (picker queues test)
- In parallel: T027 (job) and T029 (job) after BackupService refactor task T026
US3
- In parallel: T017 (idempotency test), T018 (job behavior test)
- In parallel: T021 (job notifications) and T023 (UI view enhancements) once results format is defined
US4
- In parallel: T031 (queues test), T032 (persistence test)
- In parallel: T033 (job) and T035 (view reads persisted preview) once run persistence shape is agreed
Implementation strategy
- MVP (fastest value): deliver US1 first (policy snapshot capture becomes queued + idempotent + observable).
- Next: US3 + US4 to fully de-risk restore execution and preview.
- Then: US2 to eliminate inline Graph work from backup set flows.
Format validation
All tasks above follow the required checklist format:
- [ ] T### [P?] [US#?] Description with file path