Research: Backup Scheduling MVP (032)
Date: 2026-01-05
This document resolves technical decisions and clarifies implementation approach for Feature 032.
Decisions
1) Reuse existing sync + backup services
- Decision: Use `App\Services\Intune\PolicySyncService::syncPoliciesWithReport(Tenant $tenant, ?array $supportedTypes = null): array` and `App\Services\Intune\BackupService::createBackupSet(...)`.
- Rationale: These are already tenant-aware, use `GraphClientInterface` behind the scenes (via `PolicySyncService`), and `BackupService` already writes a `backup.created` audit log entry.
- Alternatives considered:
  - Implement new Graph calls directly in the scheduler job → rejected (violates Graph abstraction gate; duplicates logic).
2) Policy type source of truth + validation
- Decision:
  - Persist `backup_schedules.policy_types` as `array<string>` of type keys present in `config('tenantpilot.supported_policy_types')`.
  - Hard validation at save-time: unknown keys are rejected.
  - Runtime defensive check (legacy/DB): unknown keys are skipped.
    - If ≥1 valid type remains → run becomes `partial` and `error_code=UNKNOWN_POLICY_TYPE`.
    - If 0 valid types remain → run becomes `skipped` and `error_code=UNKNOWN_POLICY_TYPE` (no `BackupSet` created).
- Rationale: Prevent silent misconfiguration and enforce fail-safe behavior at entry points, while still handling legacy data safely.
- Alternatives considered:
  - Save unknown keys and ignore silently → rejected (silent misconfiguration).
  - Fail the run for any unknown type → rejected (too brittle for legacy).
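
The runtime defensive check can be sketched in plain PHP as below. This is a minimal illustration, not the actual implementation: `resolvePolicyTypes()` and its return shape are hypothetical, and the supported types are assumed to be passed in as a flat list of keys.

```php
<?php

declare(strict_types=1);

/**
 * Runtime defensive check: drop unknown policy type keys and derive the run outcome.
 * resolvePolicyTypes() and the returned array shape are hypothetical names.
 *
 * @param list<string> $configured keys stored on the schedule
 * @param list<string> $supported  keys assumed to come from the supported-types config
 * @return array{types: list<string>, status: ?string, error_code: ?string}
 */
function resolvePolicyTypes(array $configured, array $supported): array
{
    // Keep only the keys that are still supported, preserving their order.
    $valid = array_values(array_intersect($configured, $supported));
    $hasUnknown = count($valid) < count($configured);

    if (! $hasUnknown) {
        // Nothing was dropped: no status override, no error code.
        return ['types' => $valid, 'status' => null, 'error_code' => null];
    }

    return [
        'types' => $valid,
        // 0 valid types → skipped (no BackupSet); ≥1 valid type → partial.
        'status' => $valid === [] ? 'skipped' : 'partial',
        'error_code' => 'UNKNOWN_POLICY_TYPE',
    ];
}
```

A `null` status here means "no override": the run outcome is then decided by the sync and backup steps alone.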
3) Graph calls and contracts
- Decision: Do not hardcode Graph endpoints. All Graph access happens via `GraphClientInterface` (through `PolicySyncService` and `BackupService`).
- Rationale: Matches constitution requirements and existing code paths.
- Alternatives considered:
  - Calling `deviceManagement/{type}` directly → rejected (explicitly forbidden by constitution; also unsafe for unknown types).
4) Scheduling mechanism
- Decision: Add an Artisan command `tenantpilot:schedules:dispatch` and register it with Laravel scheduler to run every minute.
- Rationale: Fits Laravel 12 structure (no Kernel), supports Dokploy operation models (`schedule:run` cron or `schedule:work`).
- Alternatives considered:
  - Long-running daemon polling DB directly → rejected (less idiomatic; harder ops).
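
With no console Kernel, the registration would typically live in `routes/console.php`. The fragment below is a sketch of that registration, assuming the command is implemented under the signature named above; it needs a Laravel application to run.

```php
<?php

use Illuminate\Support\Facades\Schedule;

// Evaluated every minute by `schedule:run` (cron) or `schedule:work`.
Schedule::command('tenantpilot:schedules:dispatch')->everyMinute();
```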
5) Due calculation + time semantics
- Decision:
  - `scheduled_for` is minute-slot based and stored in UTC.
  - Due calculation uses the schedule timezone.
  - DST (MVP): invalid local time → skip; ambiguous local time → first occurrence.
- Rationale: Predictable and testable; avoids “surprise catch-up”.
- Alternatives considered:
  - Catch-up missed slots → rejected by spec (MVP explicitly “no catch-up”).
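
One way to make the DST rules explicit, rather than relying on how the date parser resolves gaps and overlaps, is to test each candidate UTC offset against the requested wall-clock time. The sketch below uses only PHP's built-in date classes; `resolveSlotUtc()` is a hypothetical helper, and it assumes at most one offset transition within a day on either side of the slot.

```php
<?php

declare(strict_types=1);

/**
 * Resolve a local wall-clock slot ("Y-m-d H:i") in $tz to its UTC scheduled_for value.
 * Returns null for an invalid local time (DST gap) and the earlier instant for an
 * ambiguous local time (DST overlap). resolveSlotUtc() is a hypothetical helper.
 */
function resolveSlotUtc(string $localSlot, DateTimeZone $tz): ?DateTimeImmutable
{
    $utcZone = new DateTimeZone('UTC');

    // Read the wall-clock digits as if they were UTC to get a neutral base timestamp.
    $base = DateTimeImmutable::createFromFormat('!Y-m-d H:i', $localSlot, $utcZone);
    if ($base === false) {
        throw new InvalidArgumentException("Invalid slot: {$localSlot}");
    }
    $baseTs = $base->getTimestamp();

    // Candidate offsets: those in effect one day before and one day after the slot.
    $offsets = array_unique([
        $tz->getOffset(new DateTimeImmutable('@' . ($baseTs - 86400))),
        $tz->getOffset(new DateTimeImmutable('@' . ($baseTs + 86400))),
    ]);

    $matches = [];
    foreach ($offsets as $offset) {
        $candidate = (new DateTimeImmutable('@' . ($baseTs - $offset)))->setTimezone($tz);
        // Keep the candidate only if it really displays the requested local time.
        if ($candidate->format('Y-m-d H:i') === $localSlot) {
            $matches[] = $candidate->getTimestamp();
        }
    }

    if ($matches === []) {
        return null; // invalid local time (spring-forward gap) → skip
    }

    // Ambiguous local time yields two matches; the smaller timestamp is the first occurrence.
    return (new DateTimeImmutable('@' . min($matches)))->setTimezone($utcZone);
}
```

For `Europe/Berlin`, a slot inside the spring-forward gap resolves to `null`, and a slot inside the autumn overlap resolves to the earlier of its two UTC instants.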
6) Idempotency + concurrency
- Decision:
  - DB unique constraint: `(backup_schedule_id, scheduled_for)`.
  - Cache lock per schedule (`lock:backup_schedule:{id}`) to prevent parallel execution.
  - If lock held, do not run in parallel: mark run `skipped` with a clear `error_code`.
- Rationale: Prevents double runs and provides deterministic behavior.
- Alternatives considered:
  - Only cache lock (no DB constraint) → rejected (less robust under crashes/restarts).
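
The interplay of the two guards can be modelled in memory as below. This is only an illustration of the decision flow: `RunRegistry` and its methods are hypothetical, the `$runs` array stands in for the unique constraint, and the `$locks` array stands in for the cache lock. The ordering (claim the slot first, then try the lock) is an assumption drawn from "mark run `skipped`", which implies a run record exists when the lock is found held.

```php
<?php

declare(strict_types=1);

/** In-memory stand-in for the run table and cache locks; every name here is hypothetical. */
final class RunRegistry
{
    /** @var array<string, string> run status keyed by "scheduleId|scheduled_for" */
    public array $runs = [];

    /** @var array<string, true> currently held lock keys */
    public array $locks = [];

    /** Returns 'duplicate', 'skipped', or 'running'. */
    public function start(int $scheduleId, string $scheduledFor): string
    {
        $slot = $scheduleId . '|' . $scheduledFor;

        // Unique (backup_schedule_id, scheduled_for): a second claim of the same slot is a no-op.
        if (isset($this->runs[$slot])) {
            return 'duplicate';
        }

        $lockKey = "lock:backup_schedule:{$scheduleId}";

        // Another run of this schedule is in flight: record the slot as skipped.
        if (isset($this->locks[$lockKey])) {
            return $this->runs[$slot] = 'skipped';
        }

        $this->locks[$lockKey] = true;

        return $this->runs[$slot] = 'running';
    }

    /** Marks the run finished and releases the schedule's lock. */
    public function finish(int $scheduleId, string $scheduledFor): void
    {
        $this->runs[$scheduleId . '|' . $scheduledFor] = 'success';
        unset($this->locks["lock:backup_schedule:{$scheduleId}"]);
    }
}
```

In the real system the slot claim would be a database insert that fails on the unique index, which is what keeps the guarantee intact after a crash or cache flush.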
7) Retry/backoff policy
- Decision:
  - Transient/throttling failures (e.g. 429/503) → retries with backoff.
  - Auth/permission failures (401/403) → no retry.
  - Unknown failures → limited retries, then fail.
- Rationale: Avoid noisy retry loops for non-recoverable errors.
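
The classification above could look roughly like the following. `retryDelay()` is a hypothetical helper, and the attempt limits and delays in it are illustrative assumptions; the decision does not fix concrete numbers.

```php
<?php

declare(strict_types=1);

/**
 * Decide whether a failed attempt should be retried.
 * Returns the delay in seconds before the next attempt, or null for "do not retry".
 * The attempt limits and delays are illustrative assumptions, not values from the spec.
 *
 * @param int|null $httpStatus status of the failed Graph call, null if unknown
 * @param int      $attempt    1-based number of the attempt that just failed
 */
function retryDelay(?int $httpStatus, int $attempt): ?int
{
    $transient = [429, 503];
    $auth = [401, 403];

    if (in_array($httpStatus, $auth, true)) {
        return null; // auth/permission failure: retrying cannot help
    }

    // Transient failures get more attempts than unknown ones (assumed limits).
    $maxAttempts = in_array($httpStatus, $transient, true) ? 5 : 2;
    if ($attempt >= $maxAttempts) {
        return null; // retries exhausted → fail the run
    }

    // Exponential backoff: 30s, 60s, 120s, ... capped at 10 minutes.
    $exponent = max(0, $attempt - 1);

    return min(30 * (2 ** $exponent), 600);
}
```

In a queued job, a `null` result would map to failing the job immediately, and an integer to releasing it back onto the queue with that delay.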
8) Audit logging
- Decision: Use `App\Services\Intune\AuditLogger` for:
  - dispatch cycle (optional aggregated)
  - run start + completion
  - retention applied (count deletions)
- Rationale: Constitution requires audit log for every operation; existing `BackupService` already writes `backup.created`.
9) Notifications
- Decision: Only interactive actions (Run now / Retry) notify the acting user (database notifications). Scheduled runs rely on Run history.
- Rationale: Avoid undefined “who gets notified” without adding new ownership fields.
Open Items
None blocking Phase 1 design.