Summary

This PR introduces Unified Operations Runs + Monitoring Hub (053).

Goal: Standardize how long-running operations are tracked and monitored using the existing tenant-scoped run record (BulkOperationRun) as the canonical “operation run”, and surface it in a single Monitoring → Operations hub (view-only, tenant-scoped, role-aware).

Phase 1 adoption scope (per spec):
• Drift generation (drift.generate)
• Backup Set “Add Policies” (backup_set.add_policies)

Note: This PR does not convert every run type yet (e.g. GroupSyncRuns / InventorySyncRuns remain separate for now). This is intentionally incremental.

⸻

What changed

Monitoring / Operations hub
• Moved/organized run monitoring under Monitoring → Operations
• Added:
  • status buckets (queued / running / succeeded / partially succeeded / failed)
  • filters (run type, status bucket, time range)
  • run detail “Related” links (e.g. Drift findings, Backup Set context)
• All hub pages are DB-only and view-only (no rerun/cancel/delete actions)

Canonical run semantics
• Added canonical helpers on BulkOperationRun:
  • runType() (resource.action)
  • statusBucket() derived from status + counts (testable semantics)

Drift integration (Phase 1)
• Drift generation start behavior now:
  • creates/reuses a BulkOperationRun with drift context payload (scope_key + baseline/current run ids)
  • dispatches generation job
  • emits DB notifications including “View run” link
• On generation failure: stores sanitized failure entries + sends failure notification

Permissions / tenant isolation
• Monitoring run list/view is tenant-scoped and returns 403 for cross-tenant access
• Readonly can view runs but cannot start drift generation

⸻

Tests

Added/updated Pest coverage:
• BulkOperationRunStatusBucketTest.php
• DriftGenerationDispatchTest.php
• GenerateDriftFindingsJobNotificationTest.php
• RunAuthorizationTenantIsolationTest.php

Validation run locally:
• ./vendor/bin/pint --dirty
• targeted tests from feature quickstart / drift monitoring tests

⸻

Manual QA

1. Go to Monitoring → Operations
  • verify filters (run type / status / time range)
  • verify run detail shows counts + sanitized failures + “Related” links
2. Open Drift Landing
  • with >=2 successful inventory runs for scope: should queue drift generation + show notification with “View run”
  • as readonly: should not start generation
3. Run detail
  • drift.generate runs show “Drift findings” related link
  • failure entries are sanitized (no secrets/tokens/raw payload dumps)

⸻

Notes / Ops
• Queue workers must be restarted after deploy so they load the new code:
  • php artisan queue:restart (or Sail equivalent)
• This PR standardizes monitoring for Phase 1 producers only; follow-ups will migrate additional run types into the unified pattern.

⸻

Spec / Docs
• SpecKit artifacts added under specs/053-unify-runs-monitoring/
• Checklists are complete:
  • requirements checklist PASS
  • writing checklist PASS

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #60
Data Model: Unified Operations Runs + Monitoring Hub (053)
This feature primarily standardizes and surfaces existing run records for long-running operations, and links operators to the underlying business artifacts (e.g., drift findings).
Entities
1) BulkOperationRun (bulk_operation_runs)
Purpose: Canonical tenant-scoped run record for long-running operations (Phase 1).
Model: App\Models\BulkOperationRun
Key fields (existing):
- id (PK)
- tenant_id (FK → tenants)
- user_id (FK → users)
- resource (string): e.g. drift, backup_set
- action (string): e.g. generate, add_policies
- idempotency_key (string|null)
- status (string): pending, running, completed, completed_with_errors, failed, aborted
- counters: total_items, processed_items, succeeded, failed, skipped
- item_ids (jsonb): stable identifiers for the items/scope of the run
  - Example (drift.generate): { scope_key, baseline_run_id, current_run_id }
  - Example (backup_set.add_policies): { backup_set_id, policy_ids, options }
- failures (jsonb|null): sanitized failure details (including per-item failures for itemized operations)
- audit_log_id (FK → audit_logs|null)
- created_at, updated_at
Relationships:
- BulkOperationRun belongsTo Tenant
- BulkOperationRun belongsTo User
- BulkOperationRun belongsTo AuditLog (nullable)
Uniqueness / idempotency:
- Active-run uniqueness enforced via a partial unique index on (tenant_id, idempotency_key) for active statuses.
- Idempotency keys are deterministic and stable per tenant + operation type + scope.
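One way to obtain a key that is deterministic per tenant + operation type + scope is to hash a canonical encoding of those three inputs. The sketch below is illustrative only: the function name makeIdempotencyKey, the JSON layout, and the choice of SHA-256 are assumptions, not the project's actual scheme.

```php
<?php

/**
 * Hypothetical helper (not from the codebase): derives a stable key from
 * tenant + run type + scope so that repeated requests for the same work
 * collide on the (tenant_id, idempotency_key) unique index.
 */
function makeIdempotencyKey(int $tenantId, string $resource, string $action, array $scope): string
{
    // Sort top-level scope keys so logically equal scopes encode identically.
    // Nested arrays (e.g. policy_ids) are NOT reordered in this sketch.
    ksort($scope);

    $canonical = json_encode([
        'tenant_id' => $tenantId,
        'run_type'  => $resource . '.' . $action,
        'scope'     => $scope,
    ], JSON_THROW_ON_ERROR);

    return hash('sha256', $canonical);
}
```

Two calls with the same tenant, run type, and scope (in any key order) yield the same 64-character hex string, while a different tenant or scope yields a different one.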
State transitions (storage):
pending → running → completed | completed_with_errors | failed | aborted
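The transition rule above can be expressed as a small lookup table. This is a minimal sketch under a strict reading of the diagram (pending may only move to running); the constant and function names are hypothetical, and the real model may permit additional transitions such as aborting a pending run.

```php
<?php

// Allowed storage-level transitions under a strict reading of the diagram.
// Terminal statuses have no entry and therefore no outgoing transitions.
const RUN_TRANSITIONS = [
    'pending' => ['running'],
    'running' => ['completed', 'completed_with_errors', 'failed', 'aborted'],
];

/** Hypothetical guard: true when moving from $from to $to is allowed. */
function canTransition(string $from, string $to): bool
{
    return in_array($to, RUN_TRANSITIONS[$from] ?? [], true);
}
```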
Status mapping (operator UI semantics):
- pending → queued
- running → running
- completed → succeeded
- completed_with_errors → partially succeeded
- failed / aborted → failed
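The mapping can be sketched as a pure function. Note the assumptions: the PR describes statusBucket() as derived from status + counts, whereas this standalone sketch (hypothetical name statusBucketFor) covers only the status part of that mapping and requires PHP 8.0+ for match.

```php
<?php

/**
 * Hypothetical standalone version of the status → bucket mapping.
 * The count-based part of the real statusBucket() helper is not modeled.
 */
function statusBucketFor(string $status): string
{
    return match ($status) {
        'pending'               => 'queued',
        'running'               => 'running',
        'completed'             => 'succeeded',
        'completed_with_errors' => 'partially succeeded',
        'failed', 'aborted'     => 'failed',
        // Fail loudly on statuses outside the documented set.
        default => throw new InvalidArgumentException("Unknown run status: {$status}"),
    };
}
```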
Failure entry shape (sanitized):
- reason_code (string, stable) + reason (short sanitized message)
- for itemized runs: item_id per failure entry (and optional type=skipped for non-failure outcomes)
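A failure entry of this shape could be assembled as follows. This is a sketch only: makeFailureEntry, the 200-byte cap, and the example reason code are assumptions, and whitespace collapsing plus truncation is not a substitute for real secret scrubbing.

```php
<?php

/**
 * Hypothetical builder for a failure entry in the shape described above.
 * Collapsing whitespace and capping length keeps messages short; it does
 * NOT by itself remove secrets or tokens.
 */
function makeFailureEntry(string $reasonCode, string $reason, ?string $itemId = null, bool $skipped = false): array
{
    $reason = trim((string) preg_replace('/\s+/', ' ', $reason));

    $entry = [
        'reason_code' => $reasonCode,
        // Byte-based cap; the limit of 200 is an arbitrary illustrative value.
        'reason'      => substr($reason, 0, 200),
    ];

    if ($itemId !== null) {
        $entry['item_id'] = $itemId; // only present for itemized runs
    }
    if ($skipped) {
        $entry['type'] = 'skipped'; // marks a non-failure outcome
    }

    return $entry;
}
```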
2) Finding (findings) — Drift results
Purpose: Persisted analytic findings; drift findings are the primary “related artifact” for Drift generation runs.
Model: App\Models\Finding
Key fields (existing, drift-related):
- id (PK)
- tenant_id (FK → tenants)
- finding_type (drift)
- scope_key (string)
- baseline_run_id (FK → inventory_sync_runs|null)
- current_run_id (FK → inventory_sync_runs|null)
- fingerprint (string; deterministic; unique per tenant)
- subject_type, subject_external_id
- status (new|acknowledged)
- evidence_jsonb (jsonb; sanitized allowlist)
- created_at, updated_at
Relationships:
- Finding belongsTo Tenant
- Finding belongsTo InventorySyncRun via baseline_run_id and current_run_id (nullable)
Notes:
- Phase 1 can link operators from the drift run to findings through scope/baseline/current identifiers without introducing a new DB foreign key.
- If later needed, introduce an explicit link (e.g., findings.bulk_operation_run_id) to make navigation and reporting easier.
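Without a foreign key, the link from a drift run to its findings comes down to reusing the run's stored context as filter criteria. The sketch below shows that translation in plain PHP; driftFindingCriteria is a hypothetical name, and it assumes the drift context is available as an array with the three keys listed for drift.generate.

```php
<?php

/**
 * Hypothetical helper: maps a drift run's context to the column filters
 * that identify its findings. Tenant scoping is included explicitly.
 */
function driftFindingCriteria(int $tenantId, array $runContext): array
{
    return [
        'tenant_id'       => $tenantId,
        'finding_type'    => 'drift',
        'scope_key'       => $runContext['scope_key'],
        'baseline_run_id' => $runContext['baseline_run_id'],
        'current_run_id'  => $runContext['current_run_id'],
    ];
}
```

In a Laravel application, an array like this could be passed to an Eloquent where() call on the Finding model, though how the project actually builds the “Related” link is not shown here.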
3) InventorySyncRun (inventory_sync_runs) — Drift inputs
Purpose: “Last observed” inventory run records used as baseline/current inputs for drift comparisons.
Model: App\Models\InventorySyncRun
Relevant fields (existing):
- tenant_id
- status
- selection_hash (used as scope_key)
- finished_at
4) Notification Event (DB notifications)
Purpose: Persist run lifecycle notifications (queued/completed) linking operators to the run detail page.
Storage: Laravel Notifications (DB channel).
Payload (target):
- tenant identifier
- run identifier + type (bulk_operation_run)
- status bucket (queued/running/succeeded/partial/failed)
- summary counts and a safe error summary (when applicable)
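As an illustration of the target payload, the following sketch assembles the listed fields into an array suitable for a database notification. The function name and all array keys are assumptions; the spec only names the kinds of data, not their keys.

```php
<?php

/**
 * Hypothetical payload builder for a run lifecycle notification.
 * Only the documented counter names are copied from $counts, so stray
 * data cannot leak into the stored notification.
 */
function buildRunNotificationPayload(
    int $tenantId,
    int $runId,
    string $statusBucket,
    array $counts,
    ?string $errorSummary = null
): array {
    $allowed = ['total_items', 'processed_items', 'succeeded', 'failed', 'skipped'];

    $payload = [
        'tenant_id'     => $tenantId,
        'run_type'      => 'bulk_operation_run',
        'run_id'        => $runId,
        'status_bucket' => $statusBucket,
        'counts'        => array_intersect_key($counts, array_flip($allowed)),
    ];

    if ($errorSummary !== null) {
        $payload['error_summary'] = $errorSummary; // must already be sanitized
    }

    return $payload;
}
```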