Ahmed Darrazi f32e226f6f feat: baseline drift engine (spec 117)

2026-03-03 08:21:24 +01:00


Data Model — Spec 117 Baseline Drift Engine

This document describes the data shapes required to implement deep settings drift via a provider chain and to satisfy provenance requirements (baseline + current).

Entities (existing)

`baseline_snapshots`

Purpose: immutable reference snapshot for a baseline capture.
Key fields (known from repo):
- id
- captured_at (timestamp; the “since” reference)
- baseline_profile_id (profile reference)

`baseline_snapshot_items`

Purpose: per-subject snapshot item, stored without tenant identifiers.
Fields (known from repo):
- baseline_snapshot_id
- subject_type
- subject_id
- baseline_hash (currently meta contract hash)
- meta_jsonb (currently holds provenance-like info)

`operation_runs`

Purpose: operational lifecycle for queued capture/compare.
Contract: summary counts are numeric-only and key-whitelisted; extended detail goes in context.

`findings`

Purpose: drift findings produced by compare.
Current: stores the drift evidence in `evidence_jsonb`.

Proposed changes

1) Findings: add `evidence_fidelity`

Add column: findings.evidence_fidelity (string)

- Allowed values: `content`, `meta`
- Index: `index_findings_evidence_fidelity`; consider a composite index with tenant and status as well (or instead) if those filters are commonly combined

Why: supports fast filtering and stable semantics, while provenance remains in JSON.

2) Evidence JSON shape: include provenance for both sides

Store under findings.evidence_jsonb (existing column) with a stable top-level shape:

{
  "change_type": "created|updated|deleted|unchanged",
  "baseline": {
    "hash": "...",
    "provenance": {
      "fidelity": "content|meta",
      "source": "policy_version|inventory",
      "observed_at": "2026-03-02T10:11:12Z",
      "observed_operation_run_id": "uuid-or-int-or-null"
    }
  },
  "current": {
    "hash": "...",
    "provenance": {
      "fidelity": "content|meta",
      "source": "policy_version|inventory",
      "observed_at": "2026-03-02T10:11:12Z",
      "observed_operation_run_id": "uuid-or-int-or-null"
    }
  }
}

Notes:

- `source` is intentionally constrained to the two v1.5 sources.
- `observed_operation_run_id` is optional; include it when available for traceability.
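
As an illustration of the shape above, here is a minimal Python sketch that builds an evidence payload. The names `make_provenance` and `make_evidence` are hypothetical and do not come from the repository; only the field names and allowed values are taken from this section.

```python
from typing import Optional, TypedDict, Union

# Allowed values copied from the evidence shape in this document.
FIDELITIES = ("content", "meta")
SOURCES = ("policy_version", "inventory")
CHANGE_TYPES = ("created", "updated", "deleted", "unchanged")

RunId = Optional[Union[str, int]]


class Provenance(TypedDict):
    fidelity: str
    source: str
    observed_at: str
    observed_operation_run_id: RunId


class Side(TypedDict):
    hash: str
    provenance: Provenance


class Evidence(TypedDict):
    change_type: str
    baseline: Side
    current: Side


def make_provenance(fidelity: str, source: str, observed_at: str,
                    observed_operation_run_id: RunId = None) -> Provenance:
    """Hypothetical helper: build one provenance object, rejecting unknown values."""
    if fidelity not in FIDELITIES:
        raise ValueError(f"unknown fidelity: {fidelity!r}")
    if source not in SOURCES:
        raise ValueError(f"unknown source: {source!r}")
    return {
        "fidelity": fidelity,
        "source": source,
        "observed_at": observed_at,
        "observed_operation_run_id": observed_operation_run_id,
    }


def make_evidence(change_type: str, baseline_hash: str, baseline: Provenance,
                  current_hash: str, current: Provenance) -> Evidence:
    """Hypothetical helper: assemble the top-level evidence_jsonb shape."""
    if change_type not in CHANGE_TYPES:
        raise ValueError(f"unknown change_type: {change_type!r}")
    return {
        "change_type": change_type,
        "baseline": {"hash": baseline_hash, "provenance": baseline},
        "current": {"hash": current_hash, "provenance": current},
    }
```

Keeping the run id as an explicit `None` rather than dropping the key is a choice made for this sketch so that both sides always expose the same four keys; the spec only says the field is optional.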

3) Baseline snapshot item provenance

Baseline capture should persist provenance for the baseline-side evidence:

- Continue storing `baseline_hash` on `baseline_snapshot_items`.
- Store baseline-side provenance in `baseline_snapshot_items.meta_jsonb` (existing column) in a stable structure:

{
  "evidence": {
    "fidelity": "content|meta",
    "source": "policy_version|inventory",
    "observed_at": "...",
    "observed_operation_run_id": "..."
  }
}

Notes:

- This does not add columns to snapshot items, which keeps the schema minimal.
- Snapshot items remain tenant-identifier-free.
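
One way to read the relationship between the snapshot item and the finding is that the compare step copies the stored `evidence` object into the baseline side of `evidence_jsonb`. The sketch below assumes exactly that mapping. The function names `snapshot_item_meta` and `baseline_side` are hypothetical, and rows are modeled as plain dictionaries.

```python
from typing import Any, Dict, Optional, Union

PROVENANCE_KEYS = ("fidelity", "source", "observed_at", "observed_operation_run_id")


def snapshot_item_meta(fidelity: str, source: str, observed_at: str,
                       observed_operation_run_id: Optional[Union[str, int]] = None
                       ) -> Dict[str, Any]:
    """Hypothetical helper: build meta_jsonb for one baseline_snapshot_items row.

    Only provenance keys are written, so no tenant identifier can leak in.
    """
    return {
        "evidence": {
            "fidelity": fidelity,
            "source": source,
            "observed_at": observed_at,
            "observed_operation_run_id": observed_operation_run_id,
        }
    }


def baseline_side(item: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: derive the baseline side of a finding's evidence.

    Assumes meta_jsonb.evidence maps one-to-one onto baseline.provenance.
    """
    evidence = (item.get("meta_jsonb") or {}).get("evidence") or {}
    return {
        "hash": item["baseline_hash"],
        "provenance": {key: evidence.get(key) for key in PROVENANCE_KEYS},
    }


# Usage: a snapshot item row as it might be read back from storage.
item = {
    "baseline_snapshot_id": 1,
    "subject_type": "policy",
    "subject_id": "abc",
    "baseline_hash": "h1",
    "meta_jsonb": snapshot_item_meta("meta", "inventory", "2026-03-02T10:11:12Z"),
}
side = baseline_side(item)
```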

4) Operation run context for compare coverage

Store compare coverage and evidence gaps in operation_runs.context:

{
  "baseline_compare": {
    "since": "...baseline captured_at...",
    "coverage": {
      "subjects_total": 500,
      "resolved_total": 480,
      "resolved_content": 120,
      "resolved_meta": 360
    },
    "evidence_gaps": {
      "missing_baseline": 0,
      "missing_current": 20,
      "missing_both": 0
    }
  }
}

Notes:

- Keep this out of `summary_counts`, whose keys are whitelisted and whose values must be numeric.
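
The counters above could be accumulated along these lines. This is a sketch under two assumptions that the spec does not state: a subject counts as resolved only when both sides were found, and each resolved subject carries a single fidelity. `SubjectResolution` and `build_compare_context` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Optional


@dataclass
class SubjectResolution:
    """Hypothetical per-subject outcome of evidence resolution."""
    baseline_found: bool
    current_found: bool
    fidelity: Optional[str] = None  # "content" or "meta" when both sides resolved


def build_compare_context(since: str,
                          subjects: Iterable[SubjectResolution]) -> Dict[str, Any]:
    """Accumulate coverage and evidence gaps for operation_runs.context."""
    coverage = {"subjects_total": 0, "resolved_total": 0,
                "resolved_content": 0, "resolved_meta": 0}
    gaps = {"missing_baseline": 0, "missing_current": 0, "missing_both": 0}

    for subject in subjects:
        coverage["subjects_total"] += 1
        if subject.baseline_found and subject.current_found:
            if subject.fidelity not in ("content", "meta"):
                raise ValueError(f"unknown fidelity: {subject.fidelity!r}")
            coverage["resolved_total"] += 1
            coverage[f"resolved_{subject.fidelity}"] += 1
        elif subject.current_found:
            gaps["missing_baseline"] += 1
        elif subject.baseline_found:
            gaps["missing_current"] += 1
        else:
            gaps["missing_both"] += 1

    return {"baseline_compare": {"since": since,
                                 "coverage": coverage,
                                 "evidence_gaps": gaps}}


# Usage: reproduces the numbers from the example context above.
subjects = (
    [SubjectResolution(True, True, "content")] * 120
    + [SubjectResolution(True, True, "meta")] * 360
    + [SubjectResolution(True, False)] * 20
)
context = build_compare_context("2026-03-02T10:11:12Z", subjects)
```

With mutually exclusive gap buckets, `resolved_total` plus the three gap counters always equals `subjects_total`, which gives a cheap consistency check on a stored run context.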

Validation rules

- `evidence_fidelity` must be either `content` or `meta`.
- Every finding must include both `baseline.provenance` and `current.provenance`.
- When no evidence exists for a subject (per spec), record the evidence gap in the run context and do not create a finding.
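
The first two rules can be checked before a finding is persisted. The following is a sketch only: `validate_finding` is a hypothetical name, and collecting error strings instead of raising is a choice made for the example rather than something the spec prescribes.

```python
from typing import Any, Dict, List


def validate_finding(evidence_fidelity: Any,
                     evidence_jsonb: Dict[str, Any]) -> List[str]:
    """Return a list of rule violations; an empty list means the finding is valid."""
    errors: List[str] = []

    # Rule 1: evidence_fidelity must be either content or meta.
    if evidence_fidelity not in ("content", "meta"):
        errors.append("evidence_fidelity must be 'content' or 'meta'")

    # Rule 2: both sides must carry a provenance object.
    for side in ("baseline", "current"):
        side_data = evidence_jsonb.get(side)
        provenance = side_data.get("provenance") if isinstance(side_data, dict) else None
        if not isinstance(provenance, dict):
            errors.append(f"{side}.provenance is missing")

    return errors


# Usage: a finding that lacks current-side provenance.
problems = validate_finding(
    "meta",
    {"change_type": "updated",
     "baseline": {"hash": "a", "provenance": {"fidelity": "meta"}},
     "current": {"hash": "b"}},
)
```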

Migration strategy

- Add a single migration that adds `evidence_fidelity` to `findings` and backfills existing rows with `meta`.
- Keep backward compatibility for older findings by defaulting missing JSON paths to `meta`/`inventory` at render time until the backfill completes.
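
The render-time fallback for older findings might look like the sketch below. `provenance_for_render` is a hypothetical helper. It assumes that an absent or null `fidelity` falls back to `meta` and an absent or null `source` falls back to `inventory`, and that the remaining fields are simply left empty.

```python
from typing import Any, Dict, Optional


def provenance_for_render(evidence_jsonb: Optional[Dict[str, Any]],
                          side: str) -> Dict[str, Any]:
    """Read one side's provenance, defaulting missing paths for legacy rows."""
    if side not in ("baseline", "current"):
        raise ValueError(f"unknown side: {side!r}")
    side_data = (evidence_jsonb or {}).get(side) or {}
    provenance = side_data.get("provenance") or {}
    return {
        # Legacy findings predate provenance, so treat them as meta/inventory.
        "fidelity": provenance.get("fidelity") or "meta",
        "source": provenance.get("source") or "inventory",
        "observed_at": provenance.get("observed_at"),
        "observed_operation_run_id": provenance.get("observed_operation_run_id"),
    }


# Usage: a legacy finding with no provenance on either side.
legacy = {"change_type": "updated", "baseline": {"hash": "a"}, "current": {"hash": "b"}}
rendered = provenance_for_render(legacy, "baseline")
```

Centralizing the defaults in one reader keeps the fallback easy to delete once every row has been migrated.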
