TenantAtlas/specs/119-baseline-drift-engine/data-model.md

# Data Model — Drift Golden Master Cutover (Spec 119)

This spec extends existing models and introduces no new tables.

## Entities

### 1) Finding (existing: `App\Models\Finding`)
Baseline Compare drift findings are tenant-owned rows in `findings`.

Key fields used/extended by this feature:
- `workspace_id` (derived from tenant; required)
- `tenant_id` (required)
- `finding_type = drift` (required)
- `source = baseline.compare` (mandatory contract post-cutover)
- `scope_key` (baseline compare grouping key; stable across re-runs)
- `fingerprint` + `recurrence_key` (stable identity for idempotent upsert and lifecycle)
- `severity` (`low|medium|high|critical`)
- `status` (lifecycle): `new`, `reopened`, other open states, terminal states
- `evidence_fidelity` (string): `content|meta|mixed`
- `evidence_jsonb` (JSONB): enriched evidence contract (see below)
- `current_operation_run_id` (the Baseline Compare `OperationRun` that observed the finding)
- `baseline_operation_run_id` (unused for Baseline Compare drift; legacy run-to-run drift used it)

#### Evidence contract (`findings.evidence_jsonb`)
Minimum required keys for diff-UX compatibility:
- `change_type` (string): `missing_policy|unexpected_policy|different_version`
- `policy_type` (string)
- `subject_key` (string)
- `summary.kind` (string): `policy_snapshot|policy_assignments|policy_scope_tags`
- `baseline.policy_version_id` (int|null)
- `current.policy_version_id` (int|null)
- `baseline.hash` / `current.hash` (string; may be empty for missing/unexpected)
- `baseline.provenance` / `current.provenance` (object with `fidelity`, `source`, `observed_at`, `observed_operation_run_id`)
- `fidelity` (string): `content|meta|mixed`
- `provenance` (object): `baseline_profile_id`, `baseline_snapshot_id`, `compare_operation_run_id`, and `inventory_sync_run_id` when applicable
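The minimum-key list above can be checked mechanically. The following is a minimal sketch, not the project's actual validator: it assumes the dotted names (e.g. `summary.kind`) map to nested JSON objects and that a key counts as present even when its value is `null`. The names `REQUIRED_PATHS`, `lookup`, and `evidence_problems` are hypothetical.

```python
from typing import Any

# Dotted paths copied from the minimum-key list; nesting is an assumption.
REQUIRED_PATHS = (
    "change_type",
    "policy_type",
    "subject_key",
    "summary.kind",
    "baseline.policy_version_id",
    "current.policy_version_id",
    "baseline.hash",
    "current.hash",
    "baseline.provenance",
    "current.provenance",
    "fidelity",
    "provenance",
)

# Enumerated values as listed in the contract.
ALLOWED = {
    "change_type": {"missing_policy", "unexpected_policy", "different_version"},
    "summary.kind": {"policy_snapshot", "policy_assignments", "policy_scope_tags"},
    "fidelity": {"content", "meta", "mixed"},
}

_MISSING = object()  # sentinel so that an explicit null is not confused with "absent"


def lookup(evidence: Any, path: str) -> Any:
    """Walk a dotted path through nested dicts; return _MISSING if any part is absent."""
    node = evidence
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            return _MISSING
        node = node[part]
    return node


def evidence_problems(evidence: dict) -> list[str]:
    """Return human-readable problems; an empty list means the minimum contract holds."""
    problems = []
    for path in REQUIRED_PATHS:
        value = lookup(evidence, path)
        if value is _MISSING:
            problems.append(f"missing key: {path}")
        elif path in ALLOWED and (not isinstance(value, str) or value not in ALLOWED[path]):
            problems.append(f"invalid value for {path}: {value!r}")
    return problems
```

The sketch only checks key presence and enumerated values; it does not verify types such as `int|null` or the inner shape of the provenance objects.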

Invariants:
- `different_version` requires both policy version ids to render a detailed diff.
- `missing_policy` may render a detailed diff with only `baseline.policy_version_id`.
- `unexpected_policy` may render a detailed diff with only `current.policy_version_id`.
- If the required policy version reference(s) for the finding’s change type are missing, the UI must treat the finding as “diff unavailable”.
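The rules above reduce to a lookup from change type to the policy version references it needs. A minimal sketch under that reading follows; `REQUIRED_VERSION_REFS` and `diff_available` are hypothetical names, and treating an unknown change type as "diff unavailable" is an assumption rather than something this spec states.

```python
# Which sides must carry a policy_version_id for a detailed diff, per change type.
REQUIRED_VERSION_REFS = {
    "different_version": ("baseline", "current"),
    "missing_policy": ("baseline",),
    "unexpected_policy": ("current",),
}


def diff_available(evidence: dict) -> bool:
    """Return True when the evidence carries every reference its change type needs."""
    sides = REQUIRED_VERSION_REFS.get(evidence.get("change_type"))
    if sides is None:
        # Assumption: an unrecognized or absent change type cannot render a diff.
        return False
    for side in sides:
        side_obj = evidence.get(side)
        if not isinstance(side_obj, dict) or side_obj.get("policy_version_id") is None:
            return False
    return True


# Usage: a missing_policy finding needs only the baseline reference.
example = {
    "change_type": "missing_policy",
    "baseline": {"policy_version_id": 42},
    "current": {"policy_version_id": None},
}
print(diff_available(example))
```

Keeping the mapping in one table means a new change type only needs one new entry, and the UI's "diff unavailable" fallback is the negation of this check.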

### 2) OperationRun (existing: `App\Models\OperationRun`)
Baseline Compare runs:
- `type = baseline_compare`
- `tenant_id` is required (tenant-scoped operation)
- `status/outcome` transitions are service-owned (must go through `OperationRunService`)
- `context.baseline_profile_id` and `context.baseline_snapshot_id` identify the compare baseline inputs
- `context.baseline_compare.*` contains coverage + evidence gap breakdowns (read-only reporting)

Baseline Capture runs:
- `type = baseline_capture`
- `context.baseline_profile_id` and `context.source_tenant_id` identify the capture inputs
- `context.baseline_capture.inventory_sync_run_id` records the completed Inventory Sync run that bounded subject selection, when one existed
- `context.baseline_capture.gaps.*` remains the canonical reporting block for snapshot-capture ambiguity or evidence issues

Legacy run-to-run drift generation runs:
- `type = drift_generate_findings` (no longer created post-cutover; existing rows may remain historical)

### 3) BaselineProfile / BaselineSnapshot / BaselineSnapshotItem (existing)
Workspace-owned baseline objects:
- Baseline Profile (`baseline_profiles`) defines scope + capture mode and points to `active_snapshot_id`.
- Baseline Snapshot (`baseline_snapshots`) is immutable and stores `summary_jsonb`.
- Baseline Snapshot Items (`baseline_snapshot_items`) store:
  - `baseline_hash` (string)
  - `meta_jsonb.evidence` provenance fields (`fidelity`, `source`, `observed_at`), intentionally without tenant-owned identifiers (e.g., no `policy_version_id`).

### 4) PolicyVersion (existing: `App\Models\PolicyVersion`)
Tenant-owned immutable policy snapshots used for:
- Content hashing/normalization for Baseline Compare evidence
- Rendering diffs in the Findings detail view when the policy version reference(s) required for the finding's change type exist (see the evidence contract invariants)

Relevant fields:
- `id`
- `tenant_id`
- `policy_id` (ties versions to a tenant policy)
- `capture_purpose` (e.g., `baseline_capture`, `baseline_compare`)
- `operation_run_id` (which run captured it)
- `captured_at`
- `snapshot`, `assignments`, `scope_tags` (JSON/arrays)

## Derived/Computed Values
- `evidence_jsonb.fidelity` should align with (or be derived from) the per-finding `evidence_fidelity` column.
- `summary.kind` may be set conservatively (e.g., `policy_snapshot`) when dimension-level detection is not available.

## Invariants
- Post-cutover drift findings must be queryable as: `finding_type = drift AND source = baseline.compare`.
- Legacy drift findings are deleted by a one-time migration using: `finding_type = drift AND (source IS NULL OR source <> 'baseline.compare')`.
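The two predicates above partition all drift rows, and the `source IS NULL` branch matters because `source <> 'baseline.compare'` evaluates to NULL (not true) for rows with a NULL source. The sketch below illustrates this with an in-memory SQLite table as a stand-in for the real `findings` table; the reduced three-column schema, the `legacy.source` value, and the `other_type` finding type are hypothetical, and the actual migration is not shown here.

```python
import sqlite3

POST_CUTOVER = "finding_type = 'drift' AND source = 'baseline.compare'"
LEGACY = "finding_type = 'drift' AND (source IS NULL OR source <> 'baseline.compare')"
# Same as LEGACY but without the IS NULL branch, to show what it would miss.
NAIVE_LEGACY = "finding_type = 'drift' AND source <> 'baseline.compare'"


def ids_matching(conn: sqlite3.Connection, predicate: str) -> list[int]:
    """Return the ids of findings rows that satisfy the given WHERE predicate."""
    rows = conn.execute(f"SELECT id FROM findings WHERE {predicate} ORDER BY id")
    return [row[0] for row in rows]


def demo() -> tuple[list[int], list[int], list[int]]:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE findings (id INTEGER PRIMARY KEY, finding_type TEXT, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO findings (id, finding_type, source) VALUES (?, ?, ?)",
        [
            (1, "drift", "baseline.compare"),  # post-cutover row, must survive
            (2, "drift", None),                # legacy row with NULL source
            (3, "drift", "legacy.source"),     # legacy row, hypothetical source value
            (4, "other_type", None),           # non-drift row, untouched by both predicates
        ],
    )
    result = (
        ids_matching(conn, POST_CUTOVER),
        ids_matching(conn, LEGACY),
        ids_matching(conn, NAIVE_LEGACY),
    )
    conn.close()
    return result


kept, legacy, naive = demo()
print(kept, legacy, naive)  # [1] [2, 3] [3]
```

Row 2 appears in `legacy` but not in `naive`, which is why the deletion predicate spells out the NULL case explicitly.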