Summary

This PR implements Spec 049 – Backup/Restore Job Orchestration: all critical Backup/Restore execution paths are job-only, idempotent, tenant-scoped, and observable via run records + DB notifications (Phase 1). The UI no longer performs heavy Graph work inside request/Filament actions for these flows.

Why

We want predictable UX and operations at MSP scale:
• no timeouts / long-running requests
• reproducible run state + per-item results
• safe error persistence (no secrets / no token leakage)
• strict tenant isolation + auditability for write paths

What changed

Foundational (Runs + Idempotency + Observability)
• Added a shared RunIdempotency helper (dedupe while queued/running).
• Added a read-only BulkOperationRuns surface (list + view) for status/progress.
• Added DB notifications for run status changes (with “View run” link).

US1 – Policy “Capture snapshot” is job-only
• Policy detail “Capture snapshot” now:
  • creates/reuses a run (dedupe key: tenant + policy.capture_snapshot + policy DB id)
  • dispatches a queued job
  • returns immediately with notification + link to run detail
• Graph capture work moved fully into the job; request path stays Graph-free.

US3 – Restore runs orchestration is job-only + safe
• Live restore execution is queued and updates RestoreRun status/progress.
• Per-item outcomes are persisted deterministically (per internal DB record).
• Audit logging is written for live restore.
• Preview/dry-run is enforced as read-only (no writes).

Tenant isolation / authorization (non-negotiable)
• Run list/view/start are tenant-scoped and policy-guarded (cross-tenant access => 403, not 404).
• Explicit Pest tests cover cross-tenant denial and start authorization.
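The dedupe rule described above (reuse a run while a matching one is queued or running, otherwise create a new one) can be sketched roughly as follows. This is a minimal in-memory illustration in Python, not the PR's RunIdempotency helper: `RunStore`, `build_dedupe_key`, and `start_run` are hypothetical names, and hashing the key with SHA-256 is an assumption made only to keep the example compact.

```python
import hashlib
import itertools
from dataclasses import dataclass, field
from typing import Dict, Iterator, Optional, Tuple

# Statuses that count as "active"; a matching active run is reused, not duplicated.
ACTIVE_STATUSES = frozenset({"queued", "running"})


def build_dedupe_key(tenant_id: str, run_type: str, target_id: str) -> str:
    """Combine tenant + run type + target id into one stable key (assumed format)."""
    # The unit separator avoids ambiguous joins such as ("a", "bc") vs ("ab", "c").
    raw = "\x1f".join([tenant_id, run_type, target_id])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


@dataclass
class Run:
    id: int
    dedupe_key: str
    status: str = "queued"


@dataclass
class RunStore:
    """Hypothetical in-memory stand-in for the run table."""
    runs: Dict[int, Run] = field(default_factory=dict)
    _next_id: Iterator[int] = field(default_factory=lambda: itertools.count(1))

    def find_active(self, dedupe_key: str) -> Optional[Run]:
        """Return the queued/running run for this key, if any."""
        for run in self.runs.values():
            if run.dedupe_key == dedupe_key and run.status in ACTIVE_STATUSES:
                return run
        return None

    def start_run(self, tenant_id: str, run_type: str, target_id: str) -> Tuple[Run, bool]:
        """Return (run, reused). A real implementation would also dispatch the job."""
        key = build_dedupe_key(tenant_id, run_type, target_id)
        existing = self.find_active(key)
        if existing is not None:
            return existing, True
        run = Run(id=next(self._next_id), dedupe_key=key)
        self.runs[run.id] = run
        return run, False
```

Calling `start_run("tenant-a", "policy.capture_snapshot", "42")` twice in a row returns the same run with `reused` set to `True` on the second call; once that run reaches a finished status, the next call creates a fresh run. The check-then-insert shown here is not safe under concurrency, so a production version would presumably need a database constraint or lock.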

Tests / Verification
• ./vendor/bin/pint --dirty
• Targeted suite (examples):
  • policy capture snapshot queued + idempotency tests
  • restore orchestration + audit logging + preview read-only tests
  • run authorization / tenant isolation tests

Notes / Scope boundaries
• Phase 1 UX = DB notifications + run detail page. A global “progress widget” is tracked as Phase 2 and not required for merge.
• Resilience/backoff is tracked in tasks but can be iterated further after merge.

Review focus
• Dedupe behavior for queued/running runs (reuse vs create-new)
• Tenant scoping & policy gates for all run surfaces
• Restore safety: audit event + preview no-writes

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #56
170 lines
4.2 KiB
YAML
openapi: 3.0.3
info:
  title: TenantPilot Admin Run Orchestration (049)
  version: 0.1.0
  description: |
    Internal admin contracts for starting long-running backup/restore operations
    and reading run status/progress. These endpoints are tenant-scoped.

servers:
  - url: /admin

paths:
  /t/{tenantExternalId}/runs/{runType}:
    post:
      operationId: startRun
      summary: Start a background run
      description: |
        Starts an operation by creating (or reusing) a Run Record and enqueueing
        background work. Must return quickly.
      parameters:
        - in: path
          name: tenantExternalId
          required: true
          schema:
            type: string
        - in: path
          name: runType
          required: true
          schema:
            type: string
            enum:
              - backup_set_add_policies
              - restore_execute
              - restore_preview
              - snapshot_capture
      requestBody:
        required: false
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/RunStartRequest'
      responses:
        '201':
          description: Run created and queued
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/RunStartResponse'
        '200':
          description: Existing active run reused
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/RunStartResponse'
        '403':
          description: Forbidden

  /t/{tenantExternalId}/runs/{runType}/{runId}:
    get:
      operationId: getRun
      summary: Get run status and progress
      parameters:
        - in: path
          name: tenantExternalId
          required: true
          schema:
            type: string
        - in: path
          name: runType
          required: true
          schema:
            type: string
        - in: path
          name: runId
          required: true
          schema:
            type: string
      responses:
        '200':
          description: Run record
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/RunRecord'
        '404':
          description: Not found

components:
  schemas:
    RunStartRequest:
      type: object
      additionalProperties: false
      properties:
        targetObjectId:
          type: string
          nullable: true
          description: Operation target used for de-duplication.
        payloadHash:
          type: string
          nullable: true
          description: Optional stable hash of relevant payload to strengthen idempotency.
        itemIds:
          type: array
          items:
            type: string
          nullable: true
          description: Optional internal item ids to process.

    RunStartResponse:
      type: object
      required: [run]
      properties:
        reused:
          type: boolean
          default: false
        run:
          $ref: '#/components/schemas/RunRecord'

    RunRecord:
      type: object
      required:
        - id
        - tenantExternalId
        - type
        - status
      properties:
        id:
          type: string
        tenantExternalId:
          type: string
        type:
          type: string
        status:
          type: string
          enum: [queued, running, succeeded, failed, partial]
        createdAt:
          type: string
          format: date-time
        startedAt:
          type: string
          format: date-time
          nullable: true
        finishedAt:
          type: string
          format: date-time
          nullable: true
        counts:
          type: object
          additionalProperties: false
          properties:
            total:
              type: integer
              minimum: 0
            succeeded:
              type: integer
              minimum: 0
            failed:
              type: integer
              minimum: 0
        safeError:
          type: object
          nullable: true
          additionalProperties: false
          properties:
            code:
              type: string
            context:
              type: object
              additionalProperties: true
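
As a rough illustration of how a caller might read the `startRun` responses defined in this contract, the sketch below maps the documented status codes (201 created, 200 reused, 403 forbidden) onto a small result object. It is written in Python for brevity and involves no real HTTP call; `StartOutcome`, `interpret_start_response`, and `is_finished` are hypothetical names, and treating `succeeded`, `failed`, and `partial` as terminal statuses is an assumption the contract does not state explicitly.

```python
from dataclasses import dataclass
from typing import Any, Dict

# Status values taken from the RunRecord.status enum in the contract.
ACTIVE_STATUSES = frozenset({"queued", "running"})
# Assumption: these three are terminal; the contract does not say so explicitly.
TERMINAL_STATUSES = frozenset({"succeeded", "failed", "partial"})
KNOWN_STATUSES = ACTIVE_STATUSES | TERMINAL_STATUSES


@dataclass(frozen=True)
class StartOutcome:
    run_id: str
    status: str
    reused: bool


def interpret_start_response(status_code: int, body: Dict[str, Any]) -> StartOutcome:
    """Translate a startRun HTTP status + JSON body into a StartOutcome."""
    if status_code == 403:
        raise PermissionError("starting this run is forbidden")
    if status_code not in (200, 201):
        raise ValueError(f"unexpected status code: {status_code}")

    run = body["run"]  # 'run' is the only required field of RunStartResponse
    status = run["status"]
    if status not in KNOWN_STATUSES:
        raise ValueError(f"unknown run status: {status!r}")

    # 200 means an existing active run was reused; 201 means a new run was queued.
    return StartOutcome(run_id=run["id"], status=status, reused=(status_code == 200))


def is_finished(status: str) -> bool:
    """True when polling getRun can stop (under the terminal-status assumption)."""
    return status in TERMINAL_STATUSES
```

The sketch derives `reused` from the status code rather than from the optional `reused` body field, since that field defaults to `false` and may be absent; a stricter client could check that the two agree.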