TenantAtlas/specs/117-baseline-drift-engine/data-model.md
ahmido f08924525d Spec 117: Baseline Drift Engine + evidence fidelity/provenance (#142)
Implements Spec 117 (Golden Master Baseline Drift Engine):

- Adds provider-chain resolver for current state hashes (content evidence via PolicyVersion, meta evidence via inventory)
- Updates baseline capture + compare jobs to use resolver and persist provenance + fidelity
- Adds evidence_fidelity column/index + Filament UI badge/filter/provenance display for findings
- Adds performance guard test + integration tests for drift, fidelity semantics, provenance, filter behavior
- UX fix: Policies list shows "Sync from Intune" header action only when records exist; empty-state CTA remains and is functional

Tests:
- `vendor/bin/sail artisan test --compact tests/Feature/Filament/PolicySyncCtaPlacementTest.php`
- `vendor/bin/sail artisan test --compact --filter=Baseline`

Checklist:
- specs/117-baseline-drift-engine/checklists/requirements.md ✓

Co-authored-by: Ahmed Darrazi <ahmed.darrazi@live.de>
Reviewed-on: #142
2026-03-03 07:23:01 +00:00

# Data Model — Spec 117 Baseline Drift Engine
This document describes the data shapes required to implement deep settings drift via a provider chain and to satisfy provenance requirements (baseline + current).
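
As a rough illustration of the provider-chain idea, the sketch below tries a content-level provider first and falls back to a meta-level provider. The names `ResolvedEvidence`, `resolve`, `content_provider`, and `meta_provider` are hypothetical, the in-memory dictionaries stand in for PolicyVersion and inventory lookups, and the content-before-meta ordering is an assumption rather than something this document specifies.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class ResolvedEvidence:
    hash: str
    fidelity: str  # "content" or "meta"
    source: str    # "policy_version" or "inventory"


# A provider maps (subject_type, subject_id) to evidence, or None if it has none.
Provider = Callable[[str, str], Optional[ResolvedEvidence]]


def resolve(providers: list, subject_type: str, subject_id: str) -> Optional[ResolvedEvidence]:
    """Return evidence from the first provider that can supply it, else None."""
    for provider in providers:
        evidence = provider(subject_type, subject_id)
        if evidence is not None:
            return evidence
    return None


# Hypothetical in-memory stand-ins for the real data sources.
content_hashes = {("policy", "a"): "c-111"}
meta_hashes = {("policy", "a"): "m-111", ("policy", "b"): "m-222"}


def content_provider(subject_type: str, subject_id: str) -> Optional[ResolvedEvidence]:
    value = content_hashes.get((subject_type, subject_id))
    return ResolvedEvidence(value, "content", "policy_version") if value else None


def meta_provider(subject_type: str, subject_id: str) -> Optional[ResolvedEvidence]:
    value = meta_hashes.get((subject_type, subject_id))
    return ResolvedEvidence(value, "meta", "inventory") if value else None


# Assumed ordering: highest fidelity first.
chain = [content_provider, meta_provider]
```

With this chain, subject `a` resolves at `content` fidelity, subject `b` falls back to `meta`, and an unknown subject resolves to `None`, which corresponds to an evidence gap.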
## Entities (existing)
### `baseline_snapshots`
- Purpose: immutable reference snapshot for a baseline capture.
- Key fields (known from repo):
  - `id`
  - `captured_at` (timestamp; the “since” reference)
  - `baseline_profile_id` (profile reference)
### `baseline_snapshot_items`
- Purpose: per-subject snapshot item, stored without tenant identifiers.
- Fields (known from repo):
  - `baseline_snapshot_id`
  - `subject_type`
  - `subject_id`
  - `baseline_hash` (currently meta contract hash)
  - `meta_jsonb` (currently holds provenance-like info)
### `operation_runs`
- Purpose: operational lifecycle for queued capture/compare.
- Contract: summary counts are numeric-only and key-whitelisted; extended detail goes in `context`.
### `findings`
- Purpose: drift findings produced by compare.
- Current: uses `evidence_jsonb` for drift evidence shape.
## Proposed changes
### 1) Findings: add `evidence_fidelity`
**Add column**: `findings.evidence_fidelity` (string)
- Allowed values: `content`, `meta`
- Index: `index_findings_evidence_fidelity` (optionally a composite index with tenant/status instead, or in addition, if those filters are commonly combined)
**Why**: supports fast filtering and stable semantics, while provenance remains in JSON.
### 2) Evidence JSON shape: include provenance for both sides
Store under `findings.evidence_jsonb` (existing column) with a stable top-level shape:
```json
{
  "change_type": "created|updated|deleted|unchanged",
  "baseline": {
    "hash": "...",
    "provenance": {
      "fidelity": "content|meta",
      "source": "policy_version|inventory",
      "observed_at": "2026-03-02T10:11:12Z",
      "observed_operation_run_id": "uuid-or-int-or-null"
    }
  },
  "current": {
    "hash": "...",
    "provenance": {
      "fidelity": "content|meta",
      "source": "policy_version|inventory",
      "observed_at": "2026-03-02T10:11:12Z",
      "observed_operation_run_id": "uuid-or-int-or-null"
    }
  }
}
```
Notes:
- `source` is intentionally constrained to the two v1.5 sources.
- `observed_operation_run_id` is optional; include when available for traceability.
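
A minimal sketch of how a compare job might assemble this shape. The helper names `provenance` and `build_evidence` are hypothetical, and rejecting unknown `fidelity` or `source` values at build time is an assumption about where the constraint is enforced.

```python
from datetime import datetime, timezone

ALLOWED_FIDELITY = {"content", "meta"}
ALLOWED_SOURCES = {"policy_version", "inventory"}


def provenance(fidelity: str, source: str, observed_at: datetime, run_id=None) -> dict:
    """Build one provenance object; run_id stays None when not available."""
    if fidelity not in ALLOWED_FIDELITY:
        raise ValueError(f"unknown fidelity: {fidelity!r}")
    if source not in ALLOWED_SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return {
        "fidelity": fidelity,
        "source": source,
        # Expects a timezone-aware datetime; rendered as UTC with a trailing Z.
        "observed_at": observed_at.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "observed_operation_run_id": run_id,
    }


def build_evidence(change_type: str, baseline_hash: str, baseline_prov: dict,
                   current_hash: str, current_prov: dict) -> dict:
    """Assemble the top-level evidence shape with provenance on both sides."""
    return {
        "change_type": change_type,
        "baseline": {"hash": baseline_hash, "provenance": baseline_prov},
        "current": {"hash": current_hash, "provenance": current_prov},
    }
```

Keeping both sides symmetric means the UI can render baseline and current provenance with the same code path.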
### 3) Baseline snapshot item provenance
Baseline capture should persist provenance for the baseline-side evidence:
- Continue storing `baseline_hash` on `baseline_snapshot_items`.
- Store baseline-side provenance in `baseline_snapshot_items.meta_jsonb` (existing) in a stable structure:
```json
{
  "evidence": {
    "fidelity": "content|meta",
    "source": "policy_version|inventory",
    "observed_at": "...",
    "observed_operation_run_id": "..."
  }
}
```
Notes:
- This does not add columns to snapshot items (keeps schema minimal).
- Snapshot items remain tenant-identifier-free.
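
Because `meta_jsonb` already holds other data, one way to add the `evidence` key is a non-destructive merge. This is only a sketch: `with_baseline_evidence` is a hypothetical helper, and preserving existing keys is an assumption about how capture should treat prior `meta_jsonb` content.

```python
from typing import Optional


def with_baseline_evidence(meta: Optional[dict], evidence: dict) -> dict:
    """Return a copy of meta with the `evidence` key set; other keys are kept."""
    merged = dict(meta or {})
    merged["evidence"] = dict(evidence)
    return merged


existing_meta = {"display_name": "Example policy"}  # hypothetical prior content
updated_meta = with_baseline_evidence(
    existing_meta,
    {
        "fidelity": "meta",
        "source": "inventory",
        "observed_at": "2026-03-02T10:11:12Z",
        "observed_operation_run_id": None,
    },
)
```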
### 4) Operation run context for compare coverage
Store compare coverage and evidence gaps in `operation_runs.context`:
```json
{
  "baseline_compare": {
    "since": "...baseline captured_at...",
    "coverage": {
      "subjects_total": 500,
      "resolved_total": 480,
      "resolved_content": 120,
      "resolved_meta": 360
    },
    "evidence_gaps": {
      "missing_baseline": 0,
      "missing_current": 20,
      "missing_both": 0
    }
  }
}
```
Notes:
- Keep this out of `summary_counts` due to key restrictions.
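
The counters above could be derived roughly as follows. This sketch rests on two assumptions that this document does not state: a subject counts as resolved only when both sides have evidence, and it counts toward `resolved_content` only when both sides are `content` (otherwise `resolved_meta`). The function name `summarize_coverage` is hypothetical.

```python
def summarize_coverage(since: str, subjects: list) -> dict:
    """subjects: (baseline_fidelity, current_fidelity) pairs; None means no evidence."""
    coverage = {"subjects_total": 0, "resolved_total": 0,
                "resolved_content": 0, "resolved_meta": 0}
    gaps = {"missing_baseline": 0, "missing_current": 0, "missing_both": 0}

    for baseline_fidelity, current_fidelity in subjects:
        coverage["subjects_total"] += 1
        if baseline_fidelity is None and current_fidelity is None:
            gaps["missing_both"] += 1
        elif baseline_fidelity is None:
            gaps["missing_baseline"] += 1
        elif current_fidelity is None:
            gaps["missing_current"] += 1
        else:
            coverage["resolved_total"] += 1
            # Assumption: the weaker side determines the subject's fidelity.
            if baseline_fidelity == "content" and current_fidelity == "content":
                coverage["resolved_content"] += 1
            else:
                coverage["resolved_meta"] += 1

    return {"baseline_compare": {"since": since, "coverage": coverage,
                                 "evidence_gaps": gaps}}


# Reproduces the example numbers: 120 content, 360 meta, 20 missing current.
example_subjects = ([("content", "content")] * 120
                    + [("meta", "meta")] * 360
                    + [("meta", None)] * 20)
context = summarize_coverage("2026-03-02T10:11:12Z", example_subjects)
```

Under these assumptions, `subjects_total` always equals `resolved_total` plus the sum of the three gap counters, which is a cheap invariant to assert in tests.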
## Validation rules
- `evidence_fidelity` must be either `content` or `meta`.
- Findings must include both `baseline.provenance` and `current.provenance`.
- When no evidence exists for a subject (per spec), record an evidence gap in the run context and do not create a finding.
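
The first two rules can be checked with a small validator. This is a sketch only: `validate_finding` is a hypothetical name, and returning a list of error strings (rather than raising) is an arbitrary choice for illustration.

```python
def validate_finding(evidence_fidelity: str, evidence: dict) -> list:
    """Return a list of rule violations; an empty list means the finding is valid."""
    errors = []
    if evidence_fidelity not in ("content", "meta"):
        errors.append("evidence_fidelity must be 'content' or 'meta'")
    for side in ("baseline", "current"):
        side_data = evidence.get(side)
        prov = side_data.get("provenance") if isinstance(side_data, dict) else None
        if not isinstance(prov, dict):
            errors.append(f"{side}.provenance is required")
    return errors


valid_evidence = {
    "change_type": "updated",
    "baseline": {"hash": "h1", "provenance": {"fidelity": "meta", "source": "inventory"}},
    "current": {"hash": "h2", "provenance": {"fidelity": "meta", "source": "inventory"}},
}
```

The third rule is a control-flow decision in the compare job (skip the finding and count a gap), so it is not something a per-finding validator can enforce.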
## Migration strategy
- Add a single migration that adds `evidence_fidelity` to `findings` and backfills existing rows to `meta`.
- Keep backward compatibility for older findings by defaulting missing JSON paths to `meta`/`inventory` at render time (until backfill completes).
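
The render-time fallback might look like the sketch below, which fills `meta`/`inventory` when an older finding lacks the provenance paths. The helper name `provenance_for_display` is hypothetical, and leaving `observed_at` and `observed_operation_run_id` as `None` for older rows is an assumption.

```python
def provenance_for_display(evidence: dict, side: str) -> dict:
    """Read provenance for 'baseline' or 'current', defaulting missing paths."""
    side_data = evidence.get(side)
    if not isinstance(side_data, dict):
        side_data = {}
    prov = side_data.get("provenance")
    if not isinstance(prov, dict):
        prov = {}
    return {
        "fidelity": prov.get("fidelity") or "meta",
        "source": prov.get("source") or "inventory",
        "observed_at": prov.get("observed_at"),
        "observed_operation_run_id": prov.get("observed_operation_run_id"),
    }


# An older finding without any provenance still renders with safe defaults.
legacy_evidence = {"change_type": "updated", "baseline": {"hash": "h1"}}
legacy_view = provenance_for_display(legacy_evidence, "current")
```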