TenantAtlas/specs/053-unify-runs-monitoring/data-model.md
ahmido 30ad57baab feat/053-unify-runs-monitoring (#60)
Summary

This PR introduces the Unified Operations Runs + Monitoring Hub feature (053).

Goal: Standardize how long-running operations are tracked and monitored using the existing tenant-scoped run record (BulkOperationRun) as the canonical “operation run”, and surface it in a single Monitoring → Operations hub (view-only, tenant-scoped, role-aware).

Phase 1 adoption scope (per spec):
	•	Drift generation (drift.generate)
	•	Backup Set “Add Policies” (backup_set.add_policies)

Note: This PR does not convert every run type yet (e.g. GroupSyncRuns / InventorySyncRuns remain separate for now). This is intentionally incremental.

⸻

What changed

Monitoring / Operations hub
	•	Moved/organized run monitoring under Monitoring → Operations
	•	Added:
		•	status buckets (queued / running / succeeded / partially succeeded / failed)
		•	filters (run type, status bucket, time range)
		•	run detail “Related” links (e.g. Drift findings, Backup Set context)
	•	All hub pages are DB-only and view-only (no rerun/cancel/delete actions)

Canonical run semantics
	•	Added canonical helpers on BulkOperationRun:
		•	runType() (resource.action)
		•	statusBucket() derived from status + counts (testable semantics)

Drift integration (Phase 1)
	•	Drift generation start behavior now:
		•	creates/reuses a BulkOperationRun with drift context payload (scope_key + baseline/current run ids)
		•	dispatches generation job
		•	emits DB notifications including “View run” link
	•	On generation failure: stores sanitized failure entries + sends failure notification

Permissions / tenant isolation
	•	Monitoring run list/view is tenant-scoped and returns 403 for cross-tenant access
	•	Readonly can view runs but cannot start drift generation

⸻

Tests

Added/updated Pest coverage:
	•	BulkOperationRunStatusBucketTest.php
	•	DriftGenerationDispatchTest.php
	•	GenerateDriftFindingsJobNotificationTest.php
	•	RunAuthorizationTenantIsolationTest.php

Validation run locally:
	•	./vendor/bin/pint --dirty
	•	targeted tests from feature quickstart / drift monitoring tests

⸻

Manual QA
	1.	Go to Monitoring → Operations
		•	verify filters (run type / status / time range)
		•	verify run detail shows counts + sanitized failures + “Related” links
	2.	Open Drift Landing
		•	with >=2 successful inventory runs for scope: should queue drift generation + show notification with “View run”
		•	as readonly: should not start generation
	3.	Run detail
		•	drift.generate runs show “Drift findings” related link
		•	failure entries are sanitized (no secrets/tokens/raw payload dumps)

⸻

Notes / Ops
	•	Queue workers must be restarted after deploy so they load the new code:
		•	php artisan queue:restart (or Sail equivalent)
	•	This PR standardizes monitoring for Phase 1 producers only; follow-ups will migrate additional run types into the unified pattern.

⸻

Spec / Docs
	•	SpecKit artifacts added under specs/053-unify-runs-monitoring/
	•	Checklists are complete:
		•	requirements checklist PASS
		•	writing checklist PASS

Co-authored-by: Ahmed Darrazi <ahmeddarrazi@adsmac.local>
Reviewed-on: #60
2026-01-16 15:10:31 +00:00


Data Model: Unified Operations Runs + Monitoring Hub (053)

This feature primarily standardizes and surfaces existing run records for long-running operations, and links operators to the underlying business artifacts (e.g., drift findings).

Entities

1) BulkOperationRun (bulk_operation_runs)

Purpose: Canonical tenant-scoped run record for long-running operations (Phase 1).

Model: App\Models\BulkOperationRun

Key fields (existing):

  • id (PK)
  • tenant_id (FK → tenants)
  • user_id (FK → users)
  • resource (string) — e.g. drift, backup_set
  • action (string) — e.g. generate, add_policies
  • idempotency_key (string|null)
  • status (string) — pending, running, completed, completed_with_errors, failed, aborted
  • counters: total_items, processed_items, succeeded, failed, skipped
  • item_ids (jsonb) — stable identifiers for the items/scope of the run
    • Example (drift.generate): { scope_key, baseline_run_id, current_run_id }
    • Example (backup_set.add_policies): { backup_set_id, policy_ids, options }
  • failures (jsonb|null) — sanitized failure details (including per-item failures for itemized operations)
  • audit_log_id (FK → audit_logs|null)
  • created_at, updated_at

Relationships:

  • BulkOperationRun belongsTo Tenant
  • BulkOperationRun belongsTo User
  • BulkOperationRun belongsTo AuditLog (nullable)

Uniqueness / idempotency:

  • Active-run uniqueness enforced via a partial unique index on (tenant_id, idempotency_key) for active statuses.
  • Idempotency keys are deterministic and stable per tenant + operation type + scope.
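
To illustrate what "deterministic and stable per tenant + operation type + scope" could mean in practice, here is a minimal sketch in plain PHP. The function name `idempotencyKey`, the payload layout, and the choice of SHA-256 are assumptions made for illustration; the spec does not define how the key is derived.

```php
<?php

/**
 * Hypothetical helper (not from the codebase): derive a stable key
 * from tenant, run type (resource.action) and scope.
 */
function idempotencyKey(int $tenantId, string $resource, string $action, array $scope): string
{
    // Sort top-level scope keys so the same scope always serializes identically.
    // Nested lists (e.g. policy_ids) keep their given order in this sketch.
    ksort($scope);

    $payload = json_encode(
        [
            'tenant_id' => $tenantId,
            'run_type' => "{$resource}.{$action}",
            'scope' => $scope,
        ],
        JSON_THROW_ON_ERROR
    );

    // Assumption: a SHA-256 hex digest is an acceptable key format.
    return hash('sha256', $payload);
}
```

With a key like this, two requests for the same tenant, run type, and scope produce the same value, so the partial unique index on (tenant_id, idempotency_key) can reject a second active run.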

State transitions (storage):

  • pending → running → completed | completed_with_errors | failed | aborted
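
The transition line above can be read as a small lookup table. The sketch below encodes it literally in plain PHP; `RUN_TRANSITIONS` and `canTransition` are hypothetical names, and whether the real model enforces transitions at all is not stated in this spec.

```php
<?php

// Allowed storage transitions, copied from the state list above.
// Terminal statuses have no outgoing transitions.
const RUN_TRANSITIONS = [
    'pending' => ['running'],
    'running' => ['completed', 'completed_with_errors', 'failed', 'aborted'],
    'completed' => [],
    'completed_with_errors' => [],
    'failed' => [],
    'aborted' => [],
];

/** Hypothetical guard: true only if $from may move directly to $to. */
function canTransition(string $from, string $to): bool
{
    return in_array($to, RUN_TRANSITIONS[$from] ?? [], true);
}
```

This reading treats pending → running as the only way out of pending. If a run can fail or abort before it starts, the table would need a direct pending → failed/aborted entry.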

Status mapping (operator UI semantics):

  • pending → queued
  • running → running
  • completed → succeeded
  • completed_with_errors → partially succeeded
  • failed/aborted → failed
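
As a sketch, the status-only mapping above could be written as a single `match` expression. This is a standalone illustration, not the model's code: the PR summary says the `statusBucket()` helper on BulkOperationRun derives the bucket from status plus counts, and that count-based logic is not specified here, so it is left out.

```php
<?php

/**
 * Illustrative mapping from storage status to operator-facing bucket.
 * Assumption: unknown statuses are an error rather than a silent default.
 */
function statusBucket(string $status): string
{
    return match ($status) {
        'pending' => 'queued',
        'running' => 'running',
        'completed' => 'succeeded',
        'completed_with_errors' => 'partially succeeded',
        'failed', 'aborted' => 'failed',
        default => throw new InvalidArgumentException("Unknown run status: {$status}"),
    };
}
```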

Failure entry shape (sanitized):

  • reason_code (string, stable) + reason (short sanitized message)
  • for itemized runs: item_id per failure entry (and optional type=skipped for non-failure outcomes)
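
A failure entry with this shape might be built as follows. The function name `failureEntry`, the whitespace collapsing, and the 200-byte cap are assumptions for illustration. The sketch shows only the entry shape and is not a full sanitizer: redacting secrets or tokens would need rules this spec does not describe.

```php
<?php

/**
 * Hypothetical builder for one failure entry.
 * $type is optional, e.g. 'skipped' for non-failure outcomes.
 */
function failureEntry(string $reasonCode, string $reason, ?string $itemId = null, ?string $type = null): array
{
    // Collapse whitespace and cap the length (byte-based) so multi-line
    // dumps do not end up in storage. The 200-byte limit is an assumption.
    $message = trim((string) preg_replace('/\s+/', ' ', $reason));

    $entry = [
        'reason_code' => $reasonCode,
        'reason' => substr($message, 0, 200),
    ];

    // item_id is only present for itemized runs.
    if ($itemId !== null) {
        $entry['item_id'] = $itemId;
    }
    if ($type !== null) {
        $entry['type'] = $type;
    }

    return $entry;
}
```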

2) Finding (findings) — Drift results

Purpose: Persisted analytic findings; drift findings are the primary “related artifact” for Drift generation runs.

Model: App\Models\Finding

Key fields (existing, drift-related):

  • id (PK)
  • tenant_id (FK → tenants)
  • finding_type (drift)
  • scope_key (string)
  • baseline_run_id (FK → inventory_sync_runs|null)
  • current_run_id (FK → inventory_sync_runs|null)
  • fingerprint (string; deterministic; unique per tenant)
  • subject_type, subject_external_id
  • status (new|acknowledged)
  • evidence_jsonb (jsonb; sanitized allowlist)
  • created_at, updated_at

Relationships:

  • Finding belongsTo Tenant
  • Finding belongsTo InventorySyncRun via baseline_run_id and current_run_id (nullable)

Notes:

  • Phase 1 can link operators from the drift run to findings through scope/baseline/current identifiers without introducing a new DB foreign key.
  • If later needed, introduce an explicit link (e.g., findings.bulk_operation_run_id) to make navigation and reporting easier.
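
The Phase 1 linking approach can be sketched as a function that turns a drift run's `item_ids` payload into filter criteria for the findings table. The function name `driftFindingFilter` is hypothetical; the column names come from the field lists above.

```php
<?php

/**
 * Hypothetical helper: build findings filter criteria from a
 * drift.generate run's item_ids, without a run foreign key.
 */
function driftFindingFilter(int $tenantId, array $itemIds): array
{
    if (!isset($itemIds['scope_key'])) {
        throw new InvalidArgumentException('Drift run context is missing scope_key.');
    }

    return [
        'tenant_id' => $tenantId,          // keeps the lookup tenant-scoped
        'finding_type' => 'drift',
        'scope_key' => $itemIds['scope_key'],
        'baseline_run_id' => $itemIds['baseline_run_id'] ?? null,
        'current_run_id' => $itemIds['current_run_id'] ?? null,
    ];
}
```

The resulting array could then be applied as equality conditions on the findings query behind the "Drift findings" related link. This matches on shared identifiers only, which is why an explicit findings.bulk_operation_run_id column may be worth adding later.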

3) InventorySyncRun (inventory_sync_runs) — Drift inputs

Purpose: “Last observed” inventory run records used as baseline/current inputs for drift comparisons.

Model: App\Models\InventorySyncRun

Relevant fields (existing):

  • tenant_id
  • status
  • selection_hash (used as scope_key)
  • finished_at

4) Notification Event (DB notifications)

Purpose: Persist run lifecycle notifications (queued/completed) linking operators to the run detail page.

Storage: Laravel Notifications (DB channel).

Payload (target):

  • tenant identifier
  • run identifier + type (bulk_operation_run)
  • status bucket (queued/running/succeeded/partially succeeded/failed)
  • summary counts and a safe error summary (when applicable)
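
A payload with these fields might look like the sketch below. The function name `runNotificationPayload` and all key names are assumptions, since the spec lists the target contents but not the exact structure. Counts are restricted to the counters defined on BulkOperationRun so unrelated data cannot leak into the notification.

```php
<?php

/** Hypothetical builder for the DB notification payload. */
function runNotificationPayload(
    int $tenantId,
    int $runId,
    string $statusBucket,
    array $counts,
    ?string $errorSummary = null
): array {
    // Allowlist of counters taken from the BulkOperationRun field list.
    $allowed = ['total_items', 'processed_items', 'succeeded', 'failed', 'skipped'];

    $payload = [
        'tenant_id' => $tenantId,
        'run' => ['id' => $runId, 'type' => 'bulk_operation_run'],
        'status_bucket' => $statusBucket,
        'counts' => array_intersect_key($counts, array_flip($allowed)),
    ];

    // Only include an error summary when there is one to show.
    if ($errorSummary !== null && $errorSummary !== '') {
        $payload['error_summary'] = $errorSummary;
    }

    return $payload;
}
```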