TenantAtlas/specs/159-baseline-snapshot-truth/data-model.md

# Data Model: BaselineSnapshot Artifact Truth

## Entity: BaselineSnapshot

Purpose:
- Workspace-owned baseline artifact produced by `baseline.capture` and consumed by `baseline.compare` only when explicitly complete.

Existing fields used by this feature:
- `id`
- `workspace_id`
- `baseline_profile_id`
- `snapshot_identity_hash`
- `captured_at`
- `summary_jsonb`
- `created_at`
- `updated_at`

New or revised fields for V1:
- `lifecycle_state`
  - Type: enum/string
  - Allowed values: `building`, `complete`, `incomplete`
  - Default: `building` for new capture attempts
- `completed_at`
  - Type: nullable timestamp
  - Meaning: when the artifact was proven complete and became consumable
- `failed_at`
  - Type: nullable timestamp
  - Meaning: when the artifact was finalized incomplete
- `completion_meta_jsonb`
  - Type: nullable JSONB
  - Purpose: minimal integrity proof and failure diagnostics
  - Suggested contents:
    - `expected_items`
    - `persisted_items`
    - `finalization_reason_code`
    - `was_empty_capture`
    - `producer_run_id`

Relationships:
- Belongs to one BaselineProfile
- Has many BaselineSnapshotItems
- Is optionally referenced by one BaselineProfile as `active_snapshot_id`
- Is optionally referenced by many compare runs via `operation_runs.context.baseline_snapshot_id`

Validation / invariants:
- Only `complete` snapshots are consumable.
- `completed_at` may be non-null only when `lifecycle_state = complete`; it must be null in every other state.
- `failed_at` may be non-null only when `lifecycle_state = incomplete`; it must be null in every other state.
- A snapshot may become `complete` only after completion proof passes.
- A snapshot may never transition from `incomplete` to `complete` in V1.
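
The timestamp invariants above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical in-memory `Snapshot` dataclass standing in for a row; `invariant_violations` is an illustrative name, not part of the existing codebase.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

LIFECYCLE_STATES = {"building", "complete", "incomplete"}


@dataclass
class Snapshot:
    # Hypothetical stand-in for a BaselineSnapshot row.
    lifecycle_state: str = "building"
    completed_at: Optional[datetime] = None
    failed_at: Optional[datetime] = None


def invariant_violations(s: Snapshot) -> List[str]:
    """Return readable violations; an empty list means the row is valid."""
    problems = []
    if s.lifecycle_state not in LIFECYCLE_STATES:
        problems.append(f"unknown lifecycle_state: {s.lifecycle_state}")
    if s.completed_at is not None and s.lifecycle_state != "complete":
        problems.append("completed_at set on a non-complete snapshot")
    if s.failed_at is not None and s.lifecycle_state != "incomplete":
        problems.append("failed_at set on a non-incomplete snapshot")
    return problems
```

The same rules could equally be enforced as database check constraints; the sketch only shows the logic.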

State transitions:
- `building -> complete`
  - Trigger: capture assembly completes successfully and finalization proof passes
- `building -> incomplete`
  - Trigger: capture fails or terminates after snapshot creation but before successful completion
- Forbidden in V1:
  - `complete -> building`
  - `incomplete -> complete`
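
The transition rules above amount to a small allow-list. This is a minimal sketch, assuming a hypothetical `can_transition` guard and a boolean `proof_passed` flag that represents the outcome of the finalization proof.

```python
# Only the two V1 transitions listed above are permitted.
ALLOWED_TRANSITIONS = {
    ("building", "complete"),
    ("building", "incomplete"),
}


def can_transition(current: str, target: str, proof_passed: bool = False) -> bool:
    """Return True if a snapshot may move from `current` to `target` in V1."""
    if (current, target) not in ALLOWED_TRANSITIONS:
        return False
    # Reaching `complete` additionally requires the finalization proof.
    if target == "complete":
        return proof_passed
    return True


print(can_transition("building", "complete", proof_passed=True))    # True
print(can_transition("incomplete", "complete", proof_passed=True))  # False
```

Because both terminal states have no outgoing entries in the allow-list, anything not listed is rejected by default.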

Derived historical presentation:
- A `complete` snapshot is rendered as `superseded` or historical when a newer `complete` snapshot for the same profile becomes the effective current truth.
- This is a derived presentation concept, not a persisted lifecycle transition, so immutable snapshot rows keep their recorded terminal lifecycle state.
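
To illustrate that `superseded` is computed at read time and never stored, here is a minimal sketch. It assumes snapshots are plain dicts and that `presentation_state` is a hypothetical view helper; comparing `completed_at` values is one possible way to decide "newer".

```python
from datetime import datetime
from typing import Optional


def presentation_state(snapshot: dict, effective_current: Optional[dict]) -> str:
    """Derive the label shown to operators; nothing is written back to the row."""
    if snapshot["lifecycle_state"] != "complete":
        return snapshot["lifecycle_state"]
    if (
        effective_current is not None
        and effective_current["id"] != snapshot["id"]
        and effective_current["completed_at"] > snapshot["completed_at"]
    ):
        return "superseded"
    return "complete"


older = {"id": 1, "lifecycle_state": "complete", "completed_at": datetime(2024, 1, 1)}
newer = {"id": 2, "lifecycle_state": "complete", "completed_at": datetime(2024, 2, 1)}
print(presentation_state(older, newer))  # superseded
print(older["lifecycle_state"])          # complete (the persisted state is unchanged)
```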

## Entity: BaselineSnapshotItem

Purpose:
- Per-subject record within a BaselineSnapshot; immutable once written, or rewritten deterministically on retry.

Existing fields:
- `id`
- `baseline_snapshot_id`
- `subject_type`
- `subject_external_id`
- `policy_type`
- `baseline_hash`
- `meta_jsonb`
- `created_at`
- `updated_at`

Relationships:
- Belongs to one BaselineSnapshot

Validation / invariants:
- Unique per snapshot on `(baseline_snapshot_id, subject_type, subject_external_id)`.
- Item persistence must remain deterministic across retries/reruns.
- Item presence alone does not make the parent snapshot consumable.
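
The uniqueness and determinism rules above can be sketched with a keyed store, where the dict key plays the role of the unique index. `upsert_items` and the in-memory `store` are hypothetical; a real implementation would presumably rely on a database unique constraint and an upsert.

```python
from typing import Dict, Iterable, Tuple

ItemKey = Tuple[int, str, str]


def upsert_items(store: Dict[ItemKey, dict], snapshot_id: int, items: Iterable[dict]) -> None:
    """Write items keyed by (snapshot, subject_type, subject_external_id)."""
    for item in items:
        key = (snapshot_id, item["subject_type"], item["subject_external_id"])
        # Writing the same key again overwrites, so a retry cannot duplicate rows.
        store[key] = item


store: Dict[ItemKey, dict] = {}
batch = [
    {"subject_type": "policy", "subject_external_id": "a1", "baseline_hash": "h1"},
    {"subject_type": "policy", "subject_external_id": "b2", "baseline_hash": "h2"},
]
upsert_items(store, 10, batch)
upsert_items(store, 10, batch)  # simulated retry
print(len(store))  # 2
```

Note that a non-empty store says nothing about consumability; that is decided by the parent snapshot's `lifecycle_state` alone.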

## Entity: BaselineProfile

Purpose:
- Workspace-owned baseline definition that exposes the effective current baseline truth to operators and compare consumers.

Relevant existing fields:
- `id`
- `workspace_id`
- `status`
- `capture_mode`
- `scope_jsonb`
- `active_snapshot_id`

Revised semantics:
- `active_snapshot_id` points only to a `complete` snapshot.
- Effective current baseline truth is the latest `complete` snapshot for the profile.
- The latest attempted snapshot may differ from the effective current snapshot when the latest attempt is `building` or `incomplete`.

Validation / invariants:
- A profile must never advance `active_snapshot_id` to a non-consumable snapshot.
- When no `complete` snapshot exists, `active_snapshot_id` may remain null.
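
A guard for advancing the pointer might look like the sketch below. It assumes dict-shaped rows and a hypothetical `advance_active_snapshot` helper; the check that the snapshot belongs to the profile is an added assumption that follows from the relationship described earlier.

```python
def advance_active_snapshot(profile: dict, snapshot: dict) -> bool:
    """Point the profile at `snapshot` only if it is consumable; report success."""
    if snapshot["lifecycle_state"] != "complete":
        return False  # never advance to a building or incomplete snapshot
    if snapshot["baseline_profile_id"] != profile["id"]:
        return False  # assumption: a profile may only reference its own snapshots
    profile["active_snapshot_id"] = snapshot["id"]
    return True


profile = {"id": 1, "active_snapshot_id": None}
failed = {"id": 7, "baseline_profile_id": 1, "lifecycle_state": "incomplete"}
print(advance_active_snapshot(profile, failed))  # False
print(profile["active_snapshot_id"])             # None
```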

## Entity: OperationRun (capture/compare interaction only)

Purpose:
- Execution/audit record for `baseline.capture` and `baseline.compare`.

Relevant fields:
- `id`
- `workspace_id`
- `tenant_id`
- `type`
- `status`
- `outcome`
- `summary_counts`
- `context`
- `failure_summary`

Feature constraints:
- Run truth remains separate from snapshot truth.
- `summary_counts` remain numeric-only execution metrics.
- Snapshot lifecycle and consumability do not live in `summary_counts`.
- Compare execution must revalidate snapshot consumability even if `context.baseline_snapshot_id` is present.
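
The revalidation constraint can be illustrated as a check performed at compare time rather than at dispatch time. This is a sketch only: `SnapshotNotConsumable`, `snapshot_for_compare`, and the `load_snapshot` callback are hypothetical names.

```python
from typing import Callable, Optional


class SnapshotNotConsumable(Exception):
    """Hypothetical error raised when compare cannot trust its baseline."""


def snapshot_for_compare(
    context: dict, load_snapshot: Callable[[int], Optional[dict]]
) -> dict:
    """Reload the referenced snapshot and re-check its lifecycle state."""
    snapshot_id = context.get("baseline_snapshot_id")
    if snapshot_id is None:
        raise SnapshotNotConsumable("no baseline snapshot referenced")
    snapshot = load_snapshot(snapshot_id)
    # The id being present in the run context proves nothing about state.
    if snapshot is None or snapshot["lifecycle_state"] != "complete":
        raise SnapshotNotConsumable(f"snapshot {snapshot_id} is not consumable")
    return snapshot
```

Keeping the check inside the compare path means a run dispatched against a snapshot that later finalizes as `incomplete` still fails safely.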

## Derived Concepts

### Consumable Snapshot

Definition:
- A BaselineSnapshot whose `lifecycle_state = complete`.

Authoritative rule:
- Exposed through one shared helper/service and mirrored by `BaselineSnapshot::isConsumable()`.

### Effective Current Snapshot

Definition:
- The snapshot a profile should use as current baseline truth.

Resolution rule:
- Latest `complete` snapshot for the profile.
- Prefer `active_snapshot_id` as a cached pointer when valid.
- Never resolve to `building`, `incomplete`, or a historically superseded complete snapshot as current truth.

Explicit compare override rule:
- In V1, a historically superseded complete snapshot is viewable but is not a valid explicit compare input once a newer complete snapshot is the effective current truth.
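
The resolution and override rules can be sketched together. The example assumes dict-shaped snapshots for a single profile, orders them by `completed_at`, and treats `active_snapshot_id` as valid only when it matches the latest `complete` snapshot; both function names are hypothetical.

```python
from datetime import datetime
from typing import List, Optional, Tuple


def resolve_effective_current(
    snapshots: List[dict], active_snapshot_id: Optional[int] = None
) -> Tuple[Optional[dict], bool]:
    """Return (effective snapshot or None, whether the cached pointer is stale)."""
    complete = [s for s in snapshots if s["lifecycle_state"] == "complete"]
    if not complete:
        # Nothing consumable: any non-null pointer is stale.
        return None, active_snapshot_id is not None
    latest = max(complete, key=lambda s: s["completed_at"])
    return latest, active_snapshot_id != latest["id"]


def is_valid_compare_input(snapshot: dict, effective: Optional[dict]) -> bool:
    """V1 rule: only the effective current snapshot may be compared against."""
    return (
        effective is not None
        and snapshot["lifecycle_state"] == "complete"
        and snapshot["id"] == effective["id"]
    )


history = [
    {"id": 1, "lifecycle_state": "complete", "completed_at": datetime(2024, 1, 1)},
    {"id": 2, "lifecycle_state": "complete", "completed_at": datetime(2024, 2, 1)},
    {"id": 3, "lifecycle_state": "incomplete", "completed_at": None},
]
effective, stale = resolve_effective_current(history, active_snapshot_id=2)
print(effective["id"], stale)                         # 2 False
print(is_valid_compare_input(history[0], effective))  # False
```

Here the latest attempt (id 3) is `incomplete`, so the effective current snapshot stays at id 2, and the superseded id 1 is rejected as an explicit compare input.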

### Legacy Snapshot Proof Classification

Definition:
- Backfill-time decision of whether an existing pre-feature snapshot can be trusted as complete.

Proof sources:
- persisted item count
- `summary_jsonb.total_items` or equivalent expected-item metadata
- producing run context/result proving successful finalization, if available
- explicit zero-item completion proof for empty snapshots

Fallback:
- If proof is ambiguous, classify as `incomplete`.

Decision order:
1. Count proof with exact expected-item to persisted-item match.
2. Producing-run success proof with expected-item to persisted-item reconciliation.
3. Proven empty capture proof where expected items and persisted items are both zero.
4. Otherwise `incomplete`.
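
The decision order above could be sketched as a single classification function. Several details are assumptions: the parameter names, the returned reason codes, and the choice to require non-zero counts in steps 1 and 2 so that empty snapshots always need the explicit proof from step 3.

```python
from typing import Optional, Tuple


def classify_legacy_snapshot(
    persisted_items: int,
    summary_total_items: Optional[int] = None,
    run_succeeded: bool = False,
    run_expected_items: Optional[int] = None,
    empty_capture_proven: bool = False,
) -> Tuple[str, str]:
    """Return (lifecycle_state, reason_code) for a pre-feature snapshot."""
    # 1. Count proof: expected and persisted counts match exactly.
    if persisted_items > 0 and summary_total_items == persisted_items:
        return "complete", "count_proof"
    # 2. Producing-run success proof, reconciled against the persisted count.
    if run_succeeded and persisted_items > 0 and run_expected_items == persisted_items:
        return "complete", "run_proof"
    # 3. Proven empty capture: both counts are zero and the proof is explicit.
    expected = summary_total_items if summary_total_items is not None else run_expected_items
    if empty_capture_proven and expected == 0 and persisted_items == 0:
        return "complete", "empty_capture_proof"
    # 4. Anything ambiguous falls back to incomplete.
    return "incomplete", "ambiguous"


print(classify_legacy_snapshot(5, summary_total_items=5))  # ('complete', 'count_proof')
print(classify_legacy_snapshot(0, summary_total_items=0))  # ('incomplete', 'ambiguous')
```

The second call shows the conservative fallback: matching zero counts alone are not treated as proof of a deliberately empty capture.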