# Research: Backup Scheduling MVP (032)

**Date**: 2026-01-05

This document resolves technical decisions and clarifies the implementation approach for Feature 032.

## Decisions

### 1) Reuse existing sync + backup services
- **Decision**: Use `App\Services\Intune\PolicySyncService::syncPoliciesWithReport(Tenant $tenant, ?array $supportedTypes = null): array` and `App\Services\Intune\BackupService::createBackupSet(...)`.
- **Rationale**: Both are already tenant-aware and use `GraphClientInterface` behind the scenes (via `PolicySyncService`); `BackupService` also already writes a `backup.created` audit log entry.
- **Alternatives considered**:
  - Implement new Graph calls directly in the scheduler job → rejected (violates Graph abstraction gate; duplicates logic).

### 2) Policy type source of truth + validation
- **Decision**:
  - Persist `backup_schedules.policy_types` as `array<string>` of **type keys** present in `config('tenantpilot.supported_policy_types')`.
  - **Hard validation at save-time**: unknown keys are rejected.
  - **Runtime defensive check** (legacy/DB): unknown keys are skipped.
    - If ≥1 valid type remains → run becomes `partial` and `error_code=UNKNOWN_POLICY_TYPE`.
    - If 0 valid types remain → run becomes `skipped` and `error_code=UNKNOWN_POLICY_TYPE` (no `BackupSet` created).
- **Rationale**: Prevent silent misconfiguration and enforce fail-safe behavior at entry points, while still handling legacy data safely.
- **Alternatives considered**:
  - Save unknown keys and ignore silently → rejected (silent misconfiguration).
  - Fail the run for any unknown type → rejected (too brittle for legacy).

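The two validation layers in decision 2 can be sketched in plain PHP as follows. This is a minimal illustration, not the project's code: the function names are hypothetical, and `$supported` is assumed to be a flat list of type keys (the actual shape of `config('tenantpilot.supported_policy_types')` is not specified here).

```php
<?php
// Hypothetical helpers for illustration; names are not taken from the codebase.

/**
 * Save-time validation: reject the whole input if any key is unknown.
 *
 * @param string[] $requested keys submitted by the user
 * @param string[] $supported assumed flat list of supported type keys
 * @return string[] the validated keys, re-indexed
 */
function validatePolicyTypesForSave(array $requested, array $supported): array
{
    $unknown = array_values(array_diff($requested, $supported));
    if ($unknown !== []) {
        throw new InvalidArgumentException('Unknown policy types: ' . implode(', ', $unknown));
    }

    return array_values($requested);
}

/**
 * Runtime check for legacy rows: skip unknown keys and derive the run outcome.
 * A null status means no unknown key was found, so the run is not downgraded.
 *
 * @param string[] $stored    keys read from backup_schedules.policy_types
 * @param string[] $supported assumed flat list of supported type keys
 * @return array{types: string[], status: ?string, error_code: ?string}
 */
function resolvePolicyTypesForRun(array $stored, array $supported): array
{
    $valid = array_values(array_intersect($stored, $supported));

    if (count($valid) === count($stored)) {
        return ['types' => $valid, 'status' => null, 'error_code' => null];
    }

    return [
        'types' => $valid,
        // No valid type left: skip entirely; otherwise run with what remains.
        'status' => $valid === [] ? 'skipped' : 'partial',
        'error_code' => 'UNKNOWN_POLICY_TYPE',
    ];
}
```

With this split, the save path fails loudly, while the run path degrades to `partial` or `skipped` without throwing.
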
### 3) Graph calls and contracts
- **Decision**: Do not hardcode Graph endpoints. All Graph access happens via `GraphClientInterface` (through `PolicySyncService` and `BackupService`).
- **Rationale**: Matches constitution requirements and existing code paths.
- **Alternatives considered**:
  - Calling `deviceManagement/{type}` directly → rejected (explicitly forbidden by constitution; also unsafe for unknown types).

### 4) Scheduling mechanism
- **Decision**: Add an Artisan command `tenantpilot:schedules:dispatch` and register it with Laravel scheduler to run every minute.
- **Rationale**: Fits Laravel 12 structure (no Kernel), supports Dokploy operation models (`schedule:run` cron or `schedule:work`).
- **Alternatives considered**:
  - Long-running daemon polling DB directly → rejected (less idiomatic; harder ops).

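For orientation, registering the command without a Kernel class would look roughly like this. It is a configuration sketch: placing it in `routes/console.php` follows the usual Laravel 11+ convention and is an assumption about this project's layout.

```php
<?php
// routes/console.php (assumed location)

use Illuminate\Support\Facades\Schedule;

// Evaluate due backup schedules once per minute.
Schedule::command('tenantpilot:schedules:dispatch')->everyMinute();
```

The entry only fires if the scheduler itself is running, via either of the two operation models named in the rationale (`schedule:run` from cron or a long-lived `schedule:work`).
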
### 5) Due calculation + time semantics
- **Decision**:
  - `scheduled_for` is minute-slot based and stored in UTC.
  - Due calculation uses the schedule timezone.
  - DST (MVP): invalid local time → skip; ambiguous local time → first occurrence.
- **Rationale**: Predictable and testable; avoids “surprise catch-up”.
- **Alternatives considered**:
  - Catch-up missed slots → rejected by spec (MVP explicitly “no catch-up”).

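One way to implement the DST rules above is to test each plausible UTC offset against the zone and keep only the instants that map back to the requested wall-clock time. The sketch below assumes a hypothetical helper name and a `Y-m-d H:i` input format; it returns `null` for an invalid local time and the earlier instant for an ambiguous one.

```php
<?php
/**
 * Resolve a local minute slot to its UTC instant (hypothetical helper).
 *
 * @param string $localMinute wall-clock time in the schedule timezone, "Y-m-d H:i"
 * @param string $timezone    IANA timezone name of the schedule
 * @return DateTimeImmutable|null UTC instant, or null if the local time does not exist
 */
function resolveLocalSlotToUtc(string $localMinute, string $timezone): ?DateTimeImmutable
{
    $tz = new DateTimeZone($timezone);

    // Read the wall-clock value as if it were UTC so it can be shifted by candidate offsets.
    $naive = DateTimeImmutable::createFromFormat('!Y-m-d H:i', $localMinute, new DateTimeZone('UTC'));
    if ($naive === false) {
        throw new InvalidArgumentException("Expected 'Y-m-d H:i', got: {$localMinute}");
    }

    // Offsets in force one day before and after cover both sides of any DST transition.
    $offsets = array_unique([
        $tz->getOffset($naive->modify('-1 day')),
        $tz->getOffset($naive->modify('+1 day')),
    ]);

    $candidates = [];
    foreach ($offsets as $offset) {
        $instant = $naive->setTimestamp($naive->getTimestamp() - $offset);
        // Keep the instant only if the zone really uses this offset at that moment.
        if ($tz->getOffset($instant) === $offset) {
            $candidates[] = $instant;
        }
    }

    if ($candidates === []) {
        return null; // invalid local time (spring-forward gap): skip the slot
    }

    // Ambiguous local time yields two candidates: take the first occurrence.
    usort($candidates, fn (DateTimeImmutable $a, DateTimeImmutable $b) => $a <=> $b);

    return $candidates[0];
}
```

For `Europe/Berlin` in 2026, `2026-03-29 02:30` falls into the spring-forward gap and resolves to `null`, while `2026-10-25 02:30` occurs twice and resolves to the earlier instant, `2026-10-25 00:30` UTC.
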
### 6) Idempotency + concurrency
- **Decision**:
  - DB unique constraint: `(backup_schedule_id, scheduled_for)`.
  - Cache lock per schedule (`lock:backup_schedule:{id}`) to prevent parallel execution.
  - If lock held, do not run in parallel: mark run `skipped` with a clear `error_code`.
- **Rationale**: Prevents double runs and provides deterministic behavior.
- **Alternatives considered**:
  - Only cache lock (no DB constraint) → rejected (less robust under crashes/restarts).

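The interplay of the two guards can be shown with an in-memory model. This is only a sketch of the decision flow: the class is hypothetical, the arrays stand in for the unique index and the cache lock, and in the application these would presumably be the database constraint and Laravel's atomic cache locks.

```php
<?php
// In-memory stand-ins for the unique index and the cache lock (hypothetical).
final class RunClaimer
{
    /** @var array<string, string> run status keyed by "scheduleId|scheduledFor" */
    private array $runs = [];

    /** @var array<string, true> lock keys that are currently held */
    private array $locks = [];

    /** Try to claim a slot; returns 'running', 'skipped', or 'duplicate'. */
    public function claim(int $scheduleId, string $scheduledForUtc): string
    {
        $slot = $scheduleId . '|' . $scheduledForUtc;

        // Unique constraint: a second run for the same slot is never created.
        if (isset($this->runs[$slot])) {
            return 'duplicate';
        }

        // Per-schedule lock: another run of this schedule is still executing.
        $lockKey = "lock:backup_schedule:{$scheduleId}";
        if (isset($this->locks[$lockKey])) {
            $this->runs[$slot] = 'skipped';

            return 'skipped';
        }

        $this->locks[$lockKey] = true;
        $this->runs[$slot] = 'running';

        return 'running';
    }

    /** Record the final status and release the schedule's lock. */
    public function finish(int $scheduleId, string $scheduledForUtc, string $status): void
    {
        $this->runs[$scheduleId . '|' . $scheduledForUtc] = $status;
        unset($this->locks["lock:backup_schedule:{$scheduleId}"]);
    }
}
```

The slot check answers "has this minute already been handled?", which survives crashes because it lives in the database, while the lock answers "is this schedule busy right now?".
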
### 7) Retry/backoff policy
- **Decision**:
  - Transient/throttling failures (e.g. 429/503) → retries with backoff.
  - Auth/permission failures (401/403) → no retry.
  - Unknown failures → limited retries, then fail.
- **Rationale**: Avoid noisy retry loops for non-recoverable errors.

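The retry classification might be expressed as a small pure function. The attempt limits and delays below are placeholder assumptions, since this document does not fix concrete numbers, and the function name is hypothetical.

```php
<?php
// Assumed limits for illustration; the research notes do not specify them.
const MAX_TRANSIENT_ATTEMPTS = 5;
const MAX_UNKNOWN_ATTEMPTS = 2;
const BASE_DELAY_SECONDS = 30;

/**
 * Decide whether a failed attempt should be retried (hypothetical helper).
 *
 * @param int|null $httpStatus status of the failed Graph call, null if unknown
 * @param int      $attempt    1-based number of the attempt that just failed
 * @return array{retry: bool, delay: int} delay in seconds before the next attempt
 */
function retryDecision(?int $httpStatus, int $attempt): array
{
    // Auth/permission failures will not heal on their own: never retry.
    if ($httpStatus === 401 || $httpStatus === 403) {
        return ['retry' => false, 'delay' => 0];
    }

    $transient = in_array($httpStatus, [429, 503], true);
    $maxAttempts = $transient ? MAX_TRANSIENT_ATTEMPTS : MAX_UNKNOWN_ATTEMPTS;

    if ($attempt >= $maxAttempts) {
        return ['retry' => false, 'delay' => 0];
    }

    // Exponential backoff for transient errors, a fixed delay for unknown ones.
    $delay = $transient
        ? BASE_DELAY_SECONDS * (2 ** ($attempt - 1))
        : BASE_DELAY_SECONDS;

    return ['retry' => true, 'delay' => $delay];
}
```

Keeping the decision separate from the job makes the three failure classes easy to unit test.
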
### 8) Audit logging
- **Decision**: Use `App\Services\Intune\AuditLogger` for:
  - dispatch cycle (optional aggregated)
  - run start + completion
  - retention applied (count deletions)
- **Rationale**: Constitution requires audit log for every operation; existing `BackupService` already writes `backup.created`.

### 9) Notifications
- **Decision**: Only interactive actions (Run now / Retry) notify the acting user (database notifications). Scheduled runs rely on Run history.
- **Rationale**: Avoid undefined “who gets notified” without adding new ownership fields.

## Open Items

None blocking Phase 1 design.