Summary

This PR introduces Unified Operations Runs + Monitoring Hub (053).

Goal: Standardize how long-running operations are tracked and monitored using the existing tenant-scoped run record (BulkOperationRun) as the canonical “operation run”, and surface it in a single Monitoring → Operations hub (view-only, tenant-scoped, role-aware).

Phase 1 adoption scope (per spec):
• Drift generation (drift.generate)
• Backup Set “Add Policies” (backup_set.add_policies)

Note: This PR does not convert every run type yet (e.g. GroupSyncRuns / InventorySyncRuns remain separate for now). This is intentionally incremental.

⸻

What changed

Monitoring / Operations hub
• Moved/organized run monitoring under Monitoring → Operations
• Added:
  • status buckets (queued / running / succeeded / partially succeeded / failed)
  • filters (run type, status bucket, time range)
  • run detail “Related” links (e.g. Drift findings, Backup Set context)
• All hub pages are DB-only and view-only (no rerun/cancel/delete actions)

Canonical run semantics
• Added canonical helpers on BulkOperationRun:
  • runType() (resource.action)
  • statusBucket() derived from status + counts (testable semantics)

Drift integration (Phase 1)
• Drift generation start behavior now:
  • creates/reuses a BulkOperationRun with drift context payload (scope_key + baseline/current run ids)
  • dispatches generation job
  • emits DB notifications including “View run” link
• On generation failure: stores sanitized failure entries + sends failure notification

Permissions / tenant isolation
• Monitoring run list/view is tenant-scoped and returns 403 for cross-tenant access
• Readonly can view runs but cannot start drift generation

⸻

Tests

Added/updated Pest coverage:
• BulkOperationRunStatusBucketTest.php
• DriftGenerationDispatchTest.php
• GenerateDriftFindingsJobNotificationTest.php
• RunAuthorizationTenantIsolationTest.php

Validation run locally:
• ./vendor/bin/pint --dirty
• targeted tests from feature quickstart / drift monitoring tests

⸻

Manual QA

1. Go to Monitoring → Operations
   • verify filters (run type / status / time range)
   • verify run detail shows counts + sanitized failures + “Related” links
2. Open Drift Landing
   • with >=2 successful inventory runs for scope: should queue drift generation + show notification with “View run”
   • as readonly: should not start generation
3. Run detail
   • drift.generate runs show “Drift findings” related link
   • failure entries are sanitized (no secrets/tokens/raw payload dumps)

⸻

Notes / Ops
• Queue workers must be restarted after deploy so they load the new code:
  • php artisan queue:restart (or Sail equivalent)
• This PR standardizes monitoring for Phase 1 producers only; follow-ups will migrate additional run types into the unified pattern.

⸻

Spec / Docs
• SpecKit artifacts added under specs/053-unify-runs-monitoring/
• Checklists are complete:
  • requirements checklist PASS
  • writing checklist PASS

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #60

# Data Model: Unified Operations Runs + Monitoring Hub (053)

This feature primarily standardizes and surfaces existing run records for long-running operations, and links operators to the underlying business artifacts (e.g., drift findings).

## Entities

## 1) BulkOperationRun (`bulk_operation_runs`)

**Purpose:** Canonical tenant-scoped run record for long-running operations (Phase 1).

**Model:** `App\Models\BulkOperationRun`

**Key fields (existing):**
- `id` (PK)
- `tenant_id` (FK → tenants)
- `user_id` (FK → users)
- `resource` (string) — e.g. `drift`, `backup_set`
- `action` (string) — e.g. `generate`, `add_policies`
- `idempotency_key` (string|null)
- `status` (string) — `pending`, `running`, `completed`, `completed_with_errors`, `failed`, `aborted`
- counters: `total_items`, `processed_items`, `succeeded`, `failed`, `skipped`
- `item_ids` (jsonb) — stable identifiers for the items/scope of the run
  - Example (`drift.generate`): `{ scope_key, baseline_run_id, current_run_id }`
  - Example (`backup_set.add_policies`): `{ backup_set_id, policy_ids, options }`
- `failures` (jsonb|null) — sanitized failure details (including per-item failures for itemized operations)
- `audit_log_id` (FK → audit_logs|null)
- `created_at`, `updated_at`

**Relationships:**
- `BulkOperationRun belongsTo Tenant`
- `BulkOperationRun belongsTo User`
- `BulkOperationRun belongsTo AuditLog` (nullable)

**Uniqueness / idempotency:**
- Active-run uniqueness enforced via a partial unique index on `(tenant_id, idempotency_key)` for active statuses.
- Idempotency keys are deterministic and stable per tenant + operation type + scope.

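One way to obtain keys with the properties listed above is to hash a canonical form of the tenant, operation type, and scope. The following is a minimal sketch, not the project's implementation: the function names `canonicalize` and `idempotencyKey`, the SHA-256 digest, and the decision to sort map keys while leaving list order untouched are all assumptions made for illustration. It requires PHP 8.1 or later for `array_is_list`.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: recursively sort associative keys so that
 * logically equal scopes serialize to the same JSON string.
 */
function canonicalize(array $value): array
{
    foreach ($value as $key => $item) {
        if (is_array($item)) {
            $value[$key] = canonicalize($item);
        }
    }

    // Only maps are sorted; list order (e.g. policy_ids) is kept as given.
    if (!array_is_list($value)) {
        ksort($value);
    }

    return $value;
}

/**
 * Hypothetical helper: derive a deterministic key from
 * tenant + operation type (resource.action) + scope.
 */
function idempotencyKey(int $tenantId, string $resource, string $action, array $scope): string
{
    $payload = json_encode(
        [$tenantId, "{$resource}.{$action}", canonicalize($scope)],
        JSON_THROW_ON_ERROR
    );

    return hash('sha256', $payload);
}

// The same scope given in a different key order yields the same key.
$a = idempotencyKey(1, 'drift', 'generate', [
    'scope_key' => 'abc',
    'baseline_run_id' => 10,
    'current_run_id' => 11,
]);
$b = idempotencyKey(1, 'drift', 'generate', [
    'current_run_id' => 11,
    'baseline_run_id' => 10,
    'scope_key' => 'abc',
]);

var_dump($a === $b); // bool(true)
```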
**State transitions (storage):**
- `pending → running → completed | completed_with_errors | failed | aborted`

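The arrow notation above can be read as a small transition table. The sketch below takes the strictest reading (a run leaves `pending` only by entering `running`, and terminal statuses have no outgoing transitions). Whether the real code also permits shortcuts such as `pending → failed` is not stated here, so treat that restriction, along with the names `ALLOWED_TRANSITIONS` and `canTransition`, as assumptions.

```php
<?php

declare(strict_types=1);

// Allowed storage transitions, taken from the arrow notation.
// Assumption: terminal statuses have no outgoing transitions.
const ALLOWED_TRANSITIONS = [
    'pending' => ['running'],
    'running' => ['completed', 'completed_with_errors', 'failed', 'aborted'],
    'completed' => [],
    'completed_with_errors' => [],
    'failed' => [],
    'aborted' => [],
];

/**
 * Hypothetical guard: true when moving from $from to $to is allowed.
 * Unknown source statuses are treated as having no transitions.
 */
function canTransition(string $from, string $to): bool
{
    return in_array($to, ALLOWED_TRANSITIONS[$from] ?? [], true);
}

var_dump(canTransition('pending', 'running'));   // bool(true)
var_dump(canTransition('completed', 'running')); // bool(false)
```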
**Status mapping (operator UI semantics):**
- `pending` → `queued`
- `running` → `running`
- `completed` → `succeeded`
- `completed_with_errors` → `partially succeeded`
- `failed`/`aborted` → `failed`

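The mapping above can be expressed as a single `match` expression. This is a sketch of the status-only mapping exactly as listed; it does not look at the run counters, and the free function `statusBucket` shown here is a stand-in for illustration rather than the model's actual helper. Throwing on an unknown status is also an assumption.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical stand-in: map a stored status to the
 * operator-facing status bucket.
 */
function statusBucket(string $status): string
{
    return match ($status) {
        'pending' => 'queued',
        'running' => 'running',
        'completed' => 'succeeded',
        'completed_with_errors' => 'partially succeeded',
        'failed', 'aborted' => 'failed',
        // Assumption: an unrecognized status is a programming error.
        default => throw new InvalidArgumentException("Unknown run status: {$status}"),
    };
}

echo statusBucket('completed_with_errors'), PHP_EOL; // partially succeeded
echo statusBucket('aborted'), PHP_EOL;               // failed
```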
**Failure entry shape (sanitized):**
- `reason_code` (string, stable) + `reason` (short sanitized message)
- for itemized runs: `item_id` per failure entry (and optional `type=skipped` for non-failure outcomes)

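To make the entry shape above concrete, the following sketch builds one entry with exactly those keys. It only normalizes whitespace and caps the message length; it does not attempt to detect or redact secrets, so the caller is assumed to pass a message that is already safe. The function name `failureEntry`, the 200-character cap, and the use of the `mbstring` extension are assumptions.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical builder for one failure entry.
 * Assumption: $message is already free of secrets; this only shortens it.
 */
function failureEntry(
    string $reasonCode,
    string $message,
    int|string|null $itemId = null,
    bool $skipped = false
): array {
    // Collapse runs of whitespace and cap the length (200 is an arbitrary choice).
    $reason = trim((string) preg_replace('/\s+/', ' ', $message));
    $reason = mb_substr($reason, 0, 200);

    $entry = [
        'reason_code' => $reasonCode,
        'reason' => $reason,
    ];

    // Only itemized runs carry an item_id.
    if ($itemId !== null) {
        $entry['item_id'] = $itemId;
    }

    // Non-failure outcomes are marked with type=skipped.
    if ($skipped) {
        $entry['type'] = 'skipped';
    }

    return $entry;
}

print_r(failureEntry('policy_not_found', "Policy   could not\nbe resolved", 42));
```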
---

## 2) Finding (`findings`) — Drift results

**Purpose:** Persisted analytic findings; drift findings are the primary “related artifact” for Drift generation runs.

**Model:** `App\Models\Finding`

**Key fields (existing, drift-related):**
- `id` (PK)
- `tenant_id` (FK → tenants)
- `finding_type` (`drift`)
- `scope_key` (string)
- `baseline_run_id` (FK → inventory_sync_runs|null)
- `current_run_id` (FK → inventory_sync_runs|null)
- `fingerprint` (string; deterministic; unique per tenant)
- `subject_type`, `subject_external_id`
- `status` (`new|acknowledged`)
- `evidence_jsonb` (jsonb; sanitized allowlist)
- `created_at`, `updated_at`

**Relationships:**
- `Finding belongsTo Tenant`
- `Finding belongsTo InventorySyncRun` via `baseline_run_id` and `current_run_id` (nullable)

**Notes:**
- Phase 1 can link operators from the drift run to findings through scope/baseline/current identifiers without introducing a new DB foreign key.
- If later needed, introduce an explicit link (e.g., `findings.bulk_operation_run_id`) to make navigation and reporting easier.

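The foreign-key-free link described in the first note can be sketched as a function that turns a drift run's `item_ids` payload into lookup criteria for `findings`. The function name `driftFindingCriteria` is hypothetical, and the sketch assumes the payload carries the three keys shown in the `drift.generate` example. The resulting column-to-value array could, for instance, be handed to a query builder `where()` call.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical helper: build column => value criteria that select the
 * drift findings belonging to one drift.generate run.
 */
function driftFindingCriteria(int $tenantId, array $itemIds): array
{
    // Assumption: all three drift context keys must be present.
    foreach (['scope_key', 'baseline_run_id', 'current_run_id'] as $key) {
        if (!array_key_exists($key, $itemIds)) {
            throw new InvalidArgumentException("Missing drift context key: {$key}");
        }
    }

    return [
        'tenant_id' => $tenantId, // keeps the lookup tenant-scoped
        'finding_type' => 'drift',
        'scope_key' => $itemIds['scope_key'],
        'baseline_run_id' => $itemIds['baseline_run_id'],
        'current_run_id' => $itemIds['current_run_id'],
    ];
}

print_r(driftFindingCriteria(7, [
    'scope_key' => 'abc123',
    'baseline_run_id' => 10,
    'current_run_id' => 11,
]));
```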
---

## 3) InventorySyncRun (`inventory_sync_runs`) — Drift inputs

**Purpose:** “Last observed” inventory run records used as baseline/current inputs for drift comparisons.

**Model:** `App\Models\InventorySyncRun`

**Relevant fields (existing):**
- `tenant_id`
- `status`
- `selection_hash` (used as `scope_key`)
- `finished_at`

---

## 4) Notification Event (DB notifications)

**Purpose:** Persist run lifecycle notifications (queued/completed) linking operators to the run detail page.

**Storage:** Laravel Notifications (DB channel).

**Payload (target):**
- tenant identifier
- run identifier + type (`bulk_operation_run`)
- status bucket (queued/running/succeeded/partially succeeded/failed)
- summary counts and a safe error summary (when applicable)

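As an illustration of the target payload listed above, the sketch below assembles a plain array. The array key names (`tenant_id`, `run_type`, `run_id`, `status_bucket`, `counts`, `error_summary`) and the function name `runNotificationPayload` are hypothetical, since only the payload's contents are specified here, not its field names. The counters are filtered against the counter columns of `bulk_operation_runs` so that no other data leaks into the notification.

```php
<?php

declare(strict_types=1);

/**
 * Hypothetical builder for the DB notification payload.
 * All array key names are assumptions for illustration.
 */
function runNotificationPayload(
    int $tenantId,
    int $runId,
    string $statusBucket,
    array $counts,
    ?string $errorSummary = null
): array {
    // Keep only the known counter fields.
    $allowed = ['total_items', 'processed_items', 'succeeded', 'failed', 'skipped'];

    $payload = [
        'tenant_id' => $tenantId,
        'run_type' => 'bulk_operation_run',
        'run_id' => $runId,
        'status_bucket' => $statusBucket,
        'counts' => array_intersect_key($counts, array_flip($allowed)),
    ];

    // The error summary is included only when applicable.
    if ($errorSummary !== null) {
        $payload['error_summary'] = $errorSummary;
    }

    return $payload;
}

print_r(runNotificationPayload(7, 123, 'partially succeeded', [
    'total_items' => 5,
    'succeeded' => 4,
    'failed' => 1,
    'raw_payload' => 'dropped by the allowlist',
], '1 of 5 items failed'));
```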