**What**

Implements tenant-scoped backup scheduling end-to-end: schedules CRUD, minute-based dispatch, queued execution, run history, manual “Run now/Retry”, retention (keep last N), and auditability.

**Key changes**

- Filament UI: Backup Schedules resource with tenant scoping + SEC-002 role gating.
- Scheduler + queue: `tenantpilot:schedules:dispatch` command wired in scheduler (runs every minute), creates idempotent `BackupScheduleRun` records and dispatches jobs.
- Execution: `RunBackupScheduleJob` syncs policies, creates immutable backup sets, updates run status, writes audit logs, applies retry/backoff mapping, and triggers retention.
- Run history: Relation manager + “View” modal rendering run details.
- UX polish: row actions grouped; bulk actions grouped (run now / retry / delete). Bulk dispatch writes DB notifications (shows in notifications panel).
- Validation: policy type hard-validation on save; unknown policy types handled safely at runtime (skipped/partial).
- Tests: comprehensive Pest coverage for CRUD/scoping/validation, idempotency, job outcomes, error mapping, retention, view modal, run-now/retry notifications, bulk delete (incl. operator forbidden).

**Files / Areas**

- Filament: BackupScheduleResource.php and app/Filament/Resources/BackupScheduleResource/*
- Scheduling/Jobs: app/Console/Commands/TenantpilotDispatchBackupSchedules.php, app/Jobs/RunBackupScheduleJob.php, app/Jobs/ApplyBackupScheduleRetentionJob.php, console.php
- Models/Migrations: app/Models/BackupSchedule.php, app/Models/BackupScheduleRun.php, database/migrations/backup_schedules, backup_schedule_runs
- Notifications: BackupScheduleRunDispatchedNotification.php
- Specs: specs/032-backup-scheduling-mvp/* (tasks/checklist/quickstart updates)

**How to test (Sail)**

- Run tests: `./vendor/bin/sail artisan test tests/Feature/BackupScheduling`
- Run formatter: `./vendor/bin/sail php ./vendor/bin/pint --dirty`
- Apply migrations: `./vendor/bin/sail artisan migrate`
- Manual dispatch: `./vendor/bin/sail artisan tenantpilot:schedules:dispatch`

**Notes**

- Uses DB notifications for queued UI actions to ensure they appear in the notifications panel even under queue fakes in tests.
- Checklist gate for 032 is PASS; tasks updated accordingly.

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #34

# Research: Backup Scheduling MVP (032)

**Date**: 2026-01-05

This document resolves technical decisions and clarifies the implementation approach for Feature 032.

## Decisions

### 1) Reuse existing sync + backup services
- **Decision**: Use `App\Services\Intune\PolicySyncService::syncPoliciesWithReport(Tenant $tenant, ?array $supportedTypes = null): array` and `App\Services\Intune\BackupService::createBackupSet(...)`.
- **Rationale**: These are already tenant-aware and use `GraphClientInterface` behind the scenes (via `PolicySyncService`); `BackupService` also already writes a `backup.created` audit log entry.
- **Alternatives considered**:
  - Implement new Graph calls directly in the scheduler job → rejected (violates Graph abstraction gate; duplicates logic).

### 2) Policy type source of truth + validation
- **Decision**:
  - Persist `backup_schedules.policy_types` as `array<string>` of **type keys** present in `config('tenantpilot.supported_policy_types')`.
  - **Hard validation at save-time**: unknown keys are rejected.
  - **Runtime defensive check** (legacy/DB): unknown keys are skipped.
    - If ≥1 valid type remains → run becomes `partial` and `error_code=UNKNOWN_POLICY_TYPE`.
    - If 0 valid types remain → run becomes `skipped` and `error_code=UNKNOWN_POLICY_TYPE` (no `BackupSet` created).
- **Rationale**: Prevent silent misconfiguration and enforce fail-safe behavior at entry points, while still handling legacy data safely.
- **Alternatives considered**:
  - Save unknown keys and ignore silently → rejected (silent misconfiguration).
  - Fail the run for any unknown type → rejected (too brittle for legacy).

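The save-time and runtime rules in decision 2 can be sketched as follows. This is a minimal, language-neutral illustration in Python, not the project's PHP code: `SUPPORTED_POLICY_TYPES`, `validate_on_save`, `check_at_runtime`, and the two example type keys are hypothetical stand-ins. Only the `partial` and `skipped` statuses and the `UNKNOWN_POLICY_TYPE` code come from the decision itself.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical stand-in for config('tenantpilot.supported_policy_types').
SUPPORTED_POLICY_TYPES = {"deviceConfiguration", "deviceCompliancePolicy"}


def validate_on_save(policy_types: List[str]) -> None:
    """Save-time hard validation: reject the schedule if any key is unknown."""
    unknown = [t for t in policy_types if t not in SUPPORTED_POLICY_TYPES]
    if unknown:
        raise ValueError("Unknown policy types: " + ", ".join(unknown))


@dataclass
class RuntimeCheck:
    valid_types: List[str]
    status: Optional[str]      # None means "no override, run normally"
    error_code: Optional[str]


def check_at_runtime(policy_types: List[str]) -> RuntimeCheck:
    """Runtime defensive check for legacy rows: skip unknown keys."""
    valid = [t for t in policy_types if t in SUPPORTED_POLICY_TYPES]
    if len(valid) == len(policy_types):
        return RuntimeCheck(valid, None, None)
    if valid:
        # At least one valid type remains: back up what we can, flag the rest.
        return RuntimeCheck(valid, "partial", "UNKNOWN_POLICY_TYPE")
    # Nothing valid remains: no backup set is created.
    return RuntimeCheck([], "skipped", "UNKNOWN_POLICY_TYPE")
```

Keeping the two checks separate mirrors the decision: the strict check guards the entry point, while the tolerant one only downgrades the run outcome for data that predates the validation.
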
### 3) Graph calls and contracts
- **Decision**: Do not hardcode Graph endpoints. All Graph access happens via `GraphClientInterface` (through `PolicySyncService` and `BackupService`).
- **Rationale**: Matches constitution requirements and existing code paths.
- **Alternatives considered**:
  - Calling `deviceManagement/{type}` directly → rejected (explicitly forbidden by constitution; also unsafe for unknown types).

### 4) Scheduling mechanism
- **Decision**: Add an Artisan command `tenantpilot:schedules:dispatch` and register it with the Laravel scheduler to run every minute.
- **Rationale**: Fits the Laravel 12 structure (no Kernel) and supports the Dokploy operation models (`schedule:run` via cron, or `schedule:work`).
- **Alternatives considered**:
  - Long-running daemon polling the DB directly → rejected (less idiomatic; harder ops).

### 5) Due calculation + time semantics
- **Decision**:
  - `scheduled_for` is minute-slot based and stored in UTC.
  - Due calculation uses the schedule timezone.
  - DST (MVP): a nonexistent local time (spring-forward gap) → skip; an ambiguous local time (fall-back overlap) → first occurrence.
- **Rationale**: Predictable and testable; avoids “surprise catch-up”.
- **Alternatives considered**:
  - Catch-up missed slots → rejected by spec (MVP explicitly “no catch-up”).

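The time semantics in decision 5 can be illustrated with a short sketch. It is written in Python with the standard `zoneinfo` module rather than the project's PHP, and it assumes a hypothetical "daily at HH:MM" schedule; `scheduled_for_utc` and `is_due` are illustrative names, not functions from the codebase.

```python
from datetime import date, datetime, time, timezone
from typing import Optional
from zoneinfo import ZoneInfo


def scheduled_for_utc(day: date, local_time: time, tz_name: str) -> Optional[datetime]:
    """Map a local wall-clock time to its UTC minute slot, or None in a DST gap."""
    tz = ZoneInfo(tz_name)
    wall = local_time.replace(second=0, microsecond=0)
    # fold=0 selects the first occurrence of an ambiguous wall-clock time.
    local = datetime.combine(day, wall, tzinfo=tz).replace(fold=0)
    utc = local.astimezone(timezone.utc)
    # A wall-clock time inside a DST gap does not survive the round trip.
    if utc.astimezone(tz).replace(tzinfo=None) != local.replace(tzinfo=None):
        return None
    return utc


def is_due(now_utc: datetime, local_time: time, tz_name: str) -> bool:
    """True only for the current minute slot, so missed slots are never caught up."""
    slot = now_utc.replace(second=0, microsecond=0)
    local_day = slot.astimezone(ZoneInfo(tz_name)).date()
    return scheduled_for_utc(local_day, local_time, tz_name) == slot
```

Because `is_due` compares against the current minute slot only, a dispatcher that was down during a slot simply never sees it as due, which matches the "no catch-up" rule.
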
### 6) Idempotency + concurrency
- **Decision**:
  - DB unique constraint: `(backup_schedule_id, scheduled_for)`.
  - Cache lock per schedule (`lock:backup_schedule:{id}`) to prevent parallel execution.
  - If the lock is already held, do not run in parallel: mark the run `skipped` with a clear `error_code`.
- **Rationale**: Prevents double runs and provides deterministic behavior.
- **Alternatives considered**:
  - Only cache lock (no DB constraint) → rejected (less robust under crashes/restarts).

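The two layers in decision 6 can be sketched together. This is an illustrative Python example, not the project's PHP: an in-memory SQLite table stands in for the real migration, a `threading.Lock` registry stands in for the cache lock, and the `pending`/`success` statuses and the `CONCURRENT_RUN` error code are hypothetical (the decision only asks for "a clear `error_code`").

```python
import sqlite3
import threading
from typing import Callable, Optional

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE backup_schedule_runs (
           id INTEGER PRIMARY KEY,
           backup_schedule_id INTEGER NOT NULL,
           scheduled_for TEXT NOT NULL,
           status TEXT NOT NULL,
           error_code TEXT,
           UNIQUE (backup_schedule_id, scheduled_for)
       )"""
)


def create_run_once(schedule_id: int, scheduled_for: str) -> Optional[int]:
    """Insert the run for this slot; return None if the slot already has one."""
    try:
        with conn:  # commits on success, rolls back on error
            cur = conn.execute(
                "INSERT INTO backup_schedule_runs "
                "(backup_schedule_id, scheduled_for, status) VALUES (?, ?, 'pending')",
                (schedule_id, scheduled_for),
            )
        return cur.lastrowid
    except sqlite3.IntegrityError:
        return None  # the unique constraint caught a duplicate dispatch


_locks = {}
_locks_guard = threading.Lock()


def try_lock(schedule_id: int) -> Optional[threading.Lock]:
    """Non-blocking acquire of the per-schedule lock; None if already held."""
    key = f"lock:backup_schedule:{schedule_id}"
    with _locks_guard:
        lock = _locks.setdefault(key, threading.Lock())
    return lock if lock.acquire(blocking=False) else None


def _set_status(run_id: int, status: str, error_code: Optional[str]) -> None:
    with conn:
        conn.execute(
            "UPDATE backup_schedule_runs SET status = ?, error_code = ? WHERE id = ?",
            (status, error_code, run_id),
        )


def execute_run(run_id: int, schedule_id: int, work: Callable[[], None]) -> str:
    lock = try_lock(schedule_id)
    if lock is None:
        _set_status(run_id, "skipped", "CONCURRENT_RUN")  # hypothetical code
        return "skipped"
    try:
        work()
    finally:
        lock.release()
    _set_status(run_id, "success", None)
    return "success"
```

The constraint and the lock cover different failures: the constraint survives process crashes and cache flushes, while the lock prevents two different slots of the same schedule from executing at the same time.
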
### 7) Retry/backoff policy
- **Decision**:
  - Transient/throttling failures (e.g. 429/503) → retries with backoff.
  - Auth/permission failures (401/403) → no retry.
  - Unknown failures → limited retries, then fail.
- **Rationale**: Avoid noisy retry loops for non-recoverable errors.

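The mapping in decision 7 can be expressed as a small pure function. This Python sketch is illustrative only: the attempt limits, the 60-second base delay, and the exponential growth are assumptions, since the decision specifies the three failure classes but not concrete numbers. `RetryDecision` and `decide_retry` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional

# Assumed values; the decision does not specify them.
MAX_TRANSIENT_ATTEMPTS = 5
MAX_UNKNOWN_ATTEMPTS = 2
BASE_DELAY_SECONDS = 60


@dataclass(frozen=True)
class RetryDecision:
    retry: bool
    delay_seconds: int


def decide_retry(status_code: Optional[int], attempt: int) -> RetryDecision:
    """Decide whether to retry after the given 1-based attempt has failed."""
    if status_code in (401, 403):
        # Auth/permission failures will not fix themselves: never retry.
        return RetryDecision(False, 0)
    if status_code in (429, 503):
        # Transient/throttling failures: retry with exponential backoff.
        if attempt >= MAX_TRANSIENT_ATTEMPTS:
            return RetryDecision(False, 0)
        return RetryDecision(True, BASE_DELAY_SECONDS * 2 ** (attempt - 1))
    # Unknown failures: a small, fixed number of retries, then fail.
    if attempt >= MAX_UNKNOWN_ATTEMPTS:
        return RetryDecision(False, 0)
    return RetryDecision(True, BASE_DELAY_SECONDS)
```

Keeping the classification free of side effects makes the error mapping easy to cover with table-driven tests.
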
### 8) Audit logging
- **Decision**: Use `App\Services\Intune\AuditLogger` for:
  - dispatch cycle (optional, aggregated)
  - run start + completion
  - retention applied (with deletion count)
- **Rationale**: The constitution requires an audit log entry for every operation; the existing `BackupService` already writes `backup.created`.

### 9) Notifications
- **Decision**: Only interactive actions (Run now / Retry) notify the acting user (database notifications). Scheduled runs rely on Run history.
- **Rationale**: Avoid undefined “who gets notified” without adding new ownership fields.

## Open Items

None blocking Phase 1 design.