feat/032-backup-scheduling-mvp #36
specs/032-backup-scheduling-mvp/checklists/requirements.md
@ -0,0 +1,11 @@
# Requirements Checklist (032)

- [ ] Tenant-scoped tables use `tenant_id` consistently.
- [ ] 1 Run = 1 BackupSet (no rolling reuse in MVP).
- [ ] Dispatcher is idempotent (unique `backup_schedule_id` + `scheduled_for`).
- [ ] Concurrency lock prevents parallel runs per schedule.
- [ ] Run stores status + summary + error_code/error_message.
- [ ] UI shows schedule list + run history + link to backup set.
- [ ] Run now + Retry are permission-gated and write DB notifications.
- [ ] Retention keeps last N and soft-deletes older backup sets.
- [ ] Tests cover due-calculation, idempotency, job success/failure, retention.

specs/032-backup-scheduling-mvp/plan.md
@ -0,0 +1,56 @@
# Plan: Backup Scheduling MVP (032)

**Date**: 2026-01-05
**Input**: spec.md

## Architecture / Reuse
- Reuse existing services:
  - `PolicySyncService::syncPoliciesWithReport()` for selected policy types
  - `BackupService::createBackupSet()` to create immutable snapshots + items (include_foundations supported)
- Store selection as `policy_types` (config keys), not free-form categories.
- Use tenant scoping (`tenant_id`) consistent with existing tables (`backup_sets`, `backup_items`).

## Scheduling Mechanism
- Add Artisan command: `tenantpilot:schedules:dispatch`.
- Scheduler integration (Laravel 12): schedule the command every minute via `routes/console.php` + ops configuration (Dokploy cron `schedule:run` or long-running `schedule:work`).
- Dispatcher algorithm:
  1) load enabled schedules
  2) compute whether due for the current minute in schedule timezone
  3) create run with `scheduled_for` slot (minute precision) using DB unique constraint
  4) dispatch `RunBackupScheduleJob(schedule_id, run_id)`
- Concurrency:
  - Cache lock per schedule (`lock:backup_schedule:{id}`) plus DB unique slot constraint for idempotency.

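The dispatcher steps above (due check at minute precision, then slot creation guarded by a unique constraint) can be sketched roughly as follows. This is a language-neutral illustration in Python with an in-memory SQLite table, not the planned Laravel command. The helper names `due_slot`, `create_run_once` and `open_demo_db` are hypothetical, and the timezone is passed in as a `tzinfo` object that would in practice be built from the schedule's `timezone` string.

```python
import sqlite3
from datetime import datetime, time, timezone, tzinfo
from typing import Optional, Sequence


def due_slot(now_utc: datetime, tz: tzinfo, time_of_day: time,
             frequency: str = "daily",
             days_of_week: Optional[Sequence[int]] = None) -> Optional[datetime]:
    """Return the UTC minute slot if the schedule is due at now_utc, else None."""
    # Truncate to minute precision in the schedule's own timezone.
    local = now_utc.astimezone(tz).replace(second=0, microsecond=0)
    if (local.hour, local.minute) != (time_of_day.hour, time_of_day.minute):
        return None
    # isoweekday() yields 1=Mon..7=Sun, the same convention as days_of_week.
    if frequency == "weekly" and local.isoweekday() not in (days_of_week or ()):
        return None
    return local.astimezone(timezone.utc)


def open_demo_db() -> sqlite3.Connection:
    """Create a reduced runs table that only carries the unique slot constraint."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE backup_schedule_runs ("
        " id INTEGER PRIMARY KEY,"
        " backup_schedule_id INTEGER NOT NULL,"
        " scheduled_for TEXT NOT NULL,"
        " status TEXT NOT NULL,"
        " UNIQUE (backup_schedule_id, scheduled_for))"
    )
    return conn


def create_run_once(conn: sqlite3.Connection, schedule_id: int,
                    slot: datetime) -> Optional[int]:
    """Insert a run for the slot; return its id, or None if the slot is taken."""
    try:
        cur = conn.execute(
            "INSERT INTO backup_schedule_runs"
            " (backup_schedule_id, scheduled_for, status)"
            " VALUES (?, ?, 'running')",
            (schedule_id, slot.isoformat()),
        )
    except sqlite3.IntegrityError:
        return None  # an earlier dispatcher tick already claimed this slot
    conn.commit()
    return cur.lastrowid
```

Relying on the database constraint rather than a prior existence check means two overlapping dispatcher ticks cannot both create a run for the same slot; the cache lock then only has to prevent parallel execution, not duplicate creation.
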
## Run Execution
- `RunBackupScheduleJob`:
  1) load schedule + tenant
  2) preflight: tenant active; Graph/auth errors mapped to error_code
  3) sync policies for selected types (collect report)
  4) select policy IDs from local DB for those types (exclude ignored)
  5) create backup set:
     - name: `{schedule_name} - {Y-m-d H:i}`
     - includeFoundations: schedule flag
  6) set run status:
     - success if backup_set.status == completed
     - partial if backup_set.status == partial OR sync had failures but backup succeeded
     - failed if nothing backed up / hard error
  7) update schedule last_run_* and compute/persist next_run_at
  8) dispatch retention job

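The status rules in step 6 can be read as a small decision function. The sketch below is one possible reading, not the job's actual code: the function name and parameters are hypothetical, and it assumes that a completed backup set combined with sync failures counts as `partial`, and that any unrecognised backup set status is treated as `failed`.

```python
from typing import Optional


def derive_run_status(backup_set_status: Optional[str],
                      backed_up_count: int,
                      sync_failures: int) -> str:
    """Map the backup outcome to a run status (success | partial | failed)."""
    # Hard error (no backup set at all) or nothing backed up.
    if backup_set_status is None or backed_up_count == 0:
        return "failed"
    # Backup set itself is partial, or the sync reported failures
    # even though the backup succeeded.
    if backup_set_status == "partial" or sync_failures > 0:
        return "partial"
    if backup_set_status == "completed":
        return "success"
    # Assumption: any other backup set status is treated as a failure.
    return "failed"
```

Checking the failure case first keeps a run with an empty backup set from being reported as `partial` just because the sync also had failures.
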
## Retention
- `ApplyBackupScheduleRetentionJob(schedule_id)`:
  - identify runs ordered newest→oldest
  - keep last N runs that created a backup_set_id
  - for older ones: soft-delete referenced BackupSets (and cascade soft-delete items)

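A minimal sketch of the retention selection described above, assuming runs are ordered by `scheduled_for` and that runs without a `backup_set_id` do not count towards N. The `Run` dataclass and the function name are hypothetical; the actual soft delete would happen on the returned IDs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class Run:
    id: int
    scheduled_for: datetime
    backup_set_id: Optional[int]


def backup_sets_to_soft_delete(runs: List[Run], keep_last: int) -> List[int]:
    """Return backup_set_ids of runs that fall outside the newest keep_last."""
    # Only runs that actually produced a backup set count towards N.
    with_sets = sorted(
        (r for r in runs if r.backup_set_id is not None),
        key=lambda r: r.scheduled_for,
        reverse=True,  # newest first
    )
    return [r.backup_set_id for r in with_sets[max(keep_last, 0):]]
```

With `keep_last=2` and four runs of which one failed without a backup set, only the oldest backup set is returned for soft deletion.
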
## Filament UX
- Tenant-scoped resources:
  - `BackupScheduleResource`
  - Runs UI via RelationManager under schedule (or a dedicated resource if needed)
- Actions: enable/disable, run now, retry
- Notifications: persist via `->sendToDatabase($user)` for the DB info panel.

## Ops / Deployment Notes
- Requires queue worker.
- Requires scheduler running.
- Missed runs policy (MVP): no catch-up.
specs/032-backup-scheduling-mvp/spec.md
@ -0,0 +1,114 @@
# Feature Specification: Backup Scheduling MVP (032)

**Feature**: Automated scheduled backups (per tenant)
**Created**: 2026-01-05
**Status**: Ready for implementation (MVP)
**Risk**: Medium (Backup-only, no restore scheduling)
**Dependencies**: Tenant Portfolio + Tenant Context Switch ✅

## Context
TenantPilot supports manual backups. Customers/MSPs need regular, reliable backups per tenant (e.g. nightly), including traceable runs, error codes, and retention.

## Goals
- 1..n backup schedules can be created per tenant.
- Schedules run automatically via queue/worker.
- Every execution is stored as an auditable Run (status, counts, errors).
- Retention deletes old backups according to policy.
- Filament UI: manage schedules, view run history, “Run now”, “Retry”.

## Non-Goals (MVP)
- No calendar UI is required (can be added later).
- No cross-tenant bulk scheduling (MSP templates later).
- No “drift-triggered scheduling” (comes after the Drift MVP).
- No restore via scheduling (backup only).

## Definitions
- **Schedule**: Recurring plan (daily/weekly, timezone).
- **Run**: Concrete execution of a schedule (scheduled_for + status).
- **BackupSet**: Result container of a run.

**MVP semantics**: **1 Run = 1 new BackupSet** (no rolling reuse in the MVP).

## Requirements

### Functional Requirements
- **FR-001**: Schedules are tenant-scoped via `tenant_id` (FK to `tenants.id`).
- **FR-002**: The dispatcher detects due schedules and creates exactly one run per time slot (idempotent).
- **FR-003**: A run uses existing services:
  - Sync policies (selected policy types only)
  - Create BackupSet from local policy IDs (foundations optional)
- **FR-004**: A run writes to `backup_schedule_runs` with status + summary + error codes.
- **FR-005**: “Run now” immediately creates a run (scheduled_for=now) and dispatches the job.
- **FR-006**: “Retry” creates a new run for the same schedule.
- **FR-007**: Retention keeps only the last N runs/BackupSets per schedule (soft-deletes BackupSets).
- **FR-008**: Concurrency: only one run per schedule may execute at a time.

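FR-008 could be enforced with a non-blocking lock per schedule that expires after a time-to-live, so a crashed job cannot block the schedule forever. The sketch below uses a single-process, in-memory stand-in for a shared cache lock. The class `ScheduleLocks`, its methods, and the TTL value are hypothetical, and the key format is assumed to follow plan.md (`lock:backup_schedule:{id}`).

```python
import time
from typing import Callable, Dict


class ScheduleLocks:
    """In-memory stand-in for a cache lock with a TTL (illustration only)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._expires: Dict[str, float] = {}
        self._clock = clock

    @staticmethod
    def _key(schedule_id: int) -> str:
        return f"lock:backup_schedule:{schedule_id}"

    def acquire(self, schedule_id: int, ttl_seconds: float) -> bool:
        """Take the lock without waiting; False means a run is still active."""
        key = self._key(schedule_id)
        now = self._clock()
        if key in self._expires and self._expires[key] > now:
            return False
        self._expires[key] = now + ttl_seconds
        return True

    def release(self, schedule_id: int) -> None:
        """Drop the lock when the run finishes (success or failure)."""
        self._expires.pop(self._key(schedule_id), None)
```

A job would call `acquire` before doing any work and skip or fail the run when it returns `False`; a real deployment needs a lock store shared by all workers, which this in-process dictionary is not.
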
### UX Requirements (Filament)
- **UX-001**: The schedule list shows Enabled, Frequency, Time+Timezone, Policy Types Summary, Retention, Last Run, Next Run.
- **UX-002**: The run history per schedule shows scheduled_for, status, duration, counts, error_code/message, and a link to the BackupSet.
- **UX-003**: “Run now” and “Retry” are only available with the appropriate permissions.

### Security / Authorization
- **SEC-001**: Tenant isolation: a user sees and manages only the schedules of the current tenant.
- **SEC-002**: Permissions (RBAC):
  - `backup_schedules.view`
  - `backup_schedules.manage`
  - `backup_schedules.run_now`
  - `backup_schedules.runs.view`
- **SEC-003**: Runs write tenant-scoped audit logs (no secrets/tokens).

### Reliability / Non-Functional Requirements
- **NFR-001**: Idempotency through a unique slot constraint (`backup_schedule_id` + `scheduled_for`).
- **NFR-002**: Clear error codes (e.g. TOKEN_EXPIRED, PERMISSION_MISSING, GRAPH_THROTTLE, UNKNOWN).
- **NFR-003**: Retries: throttling → backoff; 401/403 → no blind retry.
- **NFR-004**: Missed runs policy (MVP): **No catch-up**. If the system was offline, missed runs are not made up; only the next slot runs.

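NFR-002 and NFR-003 together suggest a mapping from the failure to an `error_code`, plus a retry decision based on that code. The following is a sketch under assumptions: mapping 401 to TOKEN_EXPIRED, 403 to PERMISSION_MISSING and 429 to GRAPH_THROTTLE is one plausible reading, the backoff parameters are placeholders, UNKNOWN is assumed not to retry automatically, and both function names are hypothetical.

```python
from typing import Optional


def map_error_code(http_status: Optional[int]) -> str:
    """Translate an HTTP status from the Graph call into a run error_code."""
    if http_status == 401:
        return "TOKEN_EXPIRED"       # assumption: 401 means the token is invalid
    if http_status == 403:
        return "PERMISSION_MISSING"  # assumption: 403 means a missing permission
    if http_status == 429:
        return "GRAPH_THROTTLE"
    return "UNKNOWN"


def retry_delay_seconds(error_code: str, attempt: int,
                        retry_after: Optional[int] = None,
                        base: int = 30, cap: int = 900) -> Optional[int]:
    """Return the delay before the next attempt, or None for no automatic retry.

    attempt is 1 for the first retry. Only throttling is retried.
    """
    if error_code != "GRAPH_THROTTLE":
        return None
    if retry_after is not None:
        # Prefer the server-provided hint, bounded by the cap.
        return min(max(retry_after, 0), cap)
    # Exponential backoff: base, 2*base, 4*base, ... up to cap.
    return min(base * 2 ** (attempt - 1), cap)
```

Returning `None` for 401/403 codes makes the "no blind retry" rule explicit at the call site: the run is marked failed with its `error_code` and a user can trigger “Retry” after fixing credentials or permissions.
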
## Data Model

### backup_schedules
- `id` bigint
- `tenant_id` FK tenants.id
- `name` string
- `is_enabled` bool default true
- `timezone` string default 'UTC'
- `frequency` string enum: daily|weekly
- `time_of_day` time
- `days_of_week` json nullable (array<int>, weekly only; 1=Mon..7=Sun)
- `policy_types` jsonb (array<string>)
- `include_foundations` bool default true
- `retention_keep_last` int default 30
- `last_run_at` datetime nullable
- `last_run_status` string nullable
- `next_run_at` datetime nullable
- timestamps

Indexes:
- (tenant_id, is_enabled)
- (next_run_at) optional

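How `next_run_at` might be derived from `timezone`, `frequency`, `time_of_day` and `days_of_week` is sketched below. This is an illustration rather than the planned implementation: the function name is hypothetical, the timezone is passed as a `tzinfo` object built elsewhere from the `timezone` column, and daylight-saving edge cases (a local time that is skipped or repeated) are not handled.

```python
from datetime import datetime, time, timedelta, timezone, tzinfo
from typing import Optional, Sequence


def next_run_at(after_utc: datetime, tz: tzinfo, time_of_day: time,
                frequency: str = "daily",
                days_of_week: Optional[Sequence[int]] = None) -> Optional[datetime]:
    """Return the first slot strictly after after_utc, as a UTC datetime."""
    local_after = after_utc.astimezone(tz)
    # Looking 0..7 days ahead covers "same weekday next week" for weekly plans.
    for offset in range(0, 8):
        day = (local_after + timedelta(days=offset)).date()
        candidate = datetime.combine(day, time_of_day, tzinfo=tz)
        if candidate <= local_after:
            continue  # today's slot has already passed
        # isoweekday(): 1=Mon..7=Sun, matching the days_of_week column.
        if frequency == "weekly" and candidate.isoweekday() not in (days_of_week or ()):
            continue
        return candidate.astimezone(timezone.utc)
    return None  # weekly schedule without any selected weekday
```

Because the search starts strictly after the given instant, calling it with the slot that just ran yields the following slot, which fits the no-catch-up policy: missed slots are never returned.
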
### backup_schedule_runs
- `id` bigint
- `backup_schedule_id` FK
- `tenant_id` FK (denormalized)
- `scheduled_for` datetime
- `started_at` datetime nullable
- `finished_at` datetime nullable
- `status` string enum: running|success|partial|failed|canceled|skipped
- `summary` jsonb (policies_total, policies_backed_up, errors_count, type_breakdown, warnings)
- `error_code` string nullable
- `error_message` text nullable
- `backup_set_id` FK nullable
- timestamps

Indexes:
- (backup_schedule_id, scheduled_for)
- (tenant_id, created_at)
- **Unique**: (backup_schedule_id, scheduled_for)

## Acceptance Criteria
- A user can create a schedule per tenant (daily/weekly, time, timezone, policy types, retention).
- The dispatcher creates runs at the scheduled time (requires a running queue worker).
- The UI shows Last Run + Next Run + run history.
- “Run now” starts a run immediately.
- Error cases (token/permission/throttle) are marked as failed/partial with an error_code.
- Retention keeps only the last N BackupSets per schedule.
specs/032-backup-scheduling-mvp/tasks.md
@ -0,0 +1,38 @@
# Tasks: Backup Scheduling MVP (032)

**Date**: 2026-01-05
**Input**: spec.md, plan.md

## Phase 1: Spec & Setup
- [ ] T001 Create specs/032-backup-scheduling-mvp (spec/plan/tasks + checklist).

## Phase 2: Data Model
- [ ] T002 Add migrations: backup_schedules + backup_schedule_runs (tenant-scoped, indexes, unique slot).
- [ ] T003 Add models + relationships (Tenant->schedules, Schedule->runs, Run->backupSet).

## Phase 3: Scheduling + Dispatch
- [ ] T004 Add command `tenantpilot:schedules:dispatch`.
- [ ] T005 Register scheduler to run every minute.
- [ ] T006 Implement due-calculation (timezone, daily/weekly) + next_run_at computation.
- [ ] T007 Implement idempotent run creation (unique slot) + cache lock.

## Phase 4: Jobs
- [ ] T008 Implement `RunBackupScheduleJob` (sync -> select policy IDs -> create backup set -> update run + schedule).
- [ ] T009 Implement `ApplyBackupScheduleRetentionJob` (keep last N, soft-delete backup sets).
- [ ] T010 Add error mapping to `error_code` (TOKEN_EXPIRED, PERMISSION_MISSING, GRAPH_THROTTLE, UNKNOWN).

## Phase 5: Filament UI
- [ ] T011 Add `BackupScheduleResource` (tenant-scoped): CRUD + enable/disable.
- [ ] T012 Add Runs UI (relation manager or resource) with details + link to BackupSet.
- [ ] T013 Add actions: Run now + Retry (permission-gated); notifications persisted to DB.

## Phase 6: Tests
- [ ] T014 Unit: due-calculation + next_run_at.
- [ ] T015 Feature: dispatcher idempotency (unique slot); lock behavior.
- [ ] T016 Job-level: successful run creates backup set, updates run/schedule (Graph mocked).
- [ ] T017 Job-level: token/permission/throttle errors map to error_code and status.
- [ ] T018 Retention: keeps last N and soft-deletes older backup sets.

## Phase 7: Verification
- [ ] T019 Run targeted tests (Pest).
- [ ] T020 Run Pint (`./vendor/bin/pint --dirty`).