TenantAtlas/specs/159-baseline-snapshot-truth/data-model.md
2026-03-23 11:58:46 +01:00

# Data Model: BaselineSnapshot Artifact Truth
## Entity: BaselineSnapshot
Purpose:
- Workspace-owned baseline artifact produced by `baseline.capture` and consumed by `baseline.compare` only when explicitly complete.
Existing fields used by this feature:
- `id`
- `workspace_id`
- `baseline_profile_id`
- `snapshot_identity_hash`
- `captured_at`
- `summary_jsonb`
- `created_at`
- `updated_at`
New or revised fields for V1:
- `lifecycle_state`
  - Type: enum/string
  - Allowed values: `building`, `complete`, `incomplete`
  - Default: `building` for new capture attempts
- `completed_at`
  - Type: nullable timestamp
  - Meaning: when the artifact was proven complete and became consumable
- `failed_at`
  - Type: nullable timestamp
  - Meaning: when the artifact was finalized incomplete
- `completion_meta_jsonb`
  - Type: nullable JSONB
  - Purpose: minimal integrity proof and failure diagnostics
  - Suggested contents:
    - `expected_items`
    - `persisted_items`
    - `finalization_reason_code`
    - `was_empty_capture`
    - `producer_run_id`
Relationships:
- Belongs to one BaselineProfile
- Has many BaselineSnapshotItems
- Is optionally referenced by one BaselineProfile as `active_snapshot_id`
- Is optionally referenced by many compare runs via `operation_runs.context.baseline_snapshot_id`
Validation / invariants:
- Only `complete` snapshots are consumable.
- `completed_at` may be set only when `lifecycle_state = complete`; it must be null in every other state.
- `failed_at` may be set only when `lifecycle_state = incomplete`; it must be null in every other state.
- A snapshot may become `complete` only after completion proof passes.
- A snapshot may never transition from `incomplete` back to `complete` in V1.
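
The timestamp and state invariants above can be sketched as a small validator. This is a minimal illustration, not the project's implementation: `SnapshotRow` and `invariant_violations` are hypothetical names, and the sketch only enforces the direction the invariants state (a timestamp implies its state), not the reverse.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional

ALLOWED_STATES = {"building", "complete", "incomplete"}


@dataclass
class SnapshotRow:
    """Hypothetical in-memory view of a BaselineSnapshot row."""
    lifecycle_state: str
    completed_at: Optional[datetime] = None
    failed_at: Optional[datetime] = None


def invariant_violations(row: SnapshotRow) -> List[str]:
    """Return human-readable violations of the state/timestamp invariants."""
    problems: List[str] = []
    if row.lifecycle_state not in ALLOWED_STATES:
        problems.append(f"unknown lifecycle_state: {row.lifecycle_state}")
    # completed_at may be set only on complete snapshots.
    if row.completed_at is not None and row.lifecycle_state != "complete":
        problems.append("completed_at set while state is not complete")
    # failed_at may be set only on incomplete snapshots.
    if row.failed_at is not None and row.lifecycle_state != "incomplete":
        problems.append("failed_at set while state is not incomplete")
    return problems
```

A check like this could run in a model-level validator or be mirrored by database check constraints; which layer owns it is left open here.
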
State transitions:
- `building -> complete`
  - Trigger: capture assembly completes successfully and finalization proof passes
- `building -> incomplete`
  - Trigger: capture fails or terminates after snapshot creation but before successful completion
- Forbidden in V1:
  - `complete -> building`
  - `incomplete -> complete`
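
The transition rules can be sketched as an allow-list state machine. This sketch assumes that the two listed transitions are the only legal ones, which is stricter than the explicit forbidden list; `LifecycleState`, `IllegalTransition`, and `transition` are hypothetical names.

```python
from enum import Enum


class LifecycleState(str, Enum):
    BUILDING = "building"
    COMPLETE = "complete"
    INCOMPLETE = "incomplete"


# Assumption: both terminal states allow no further transitions in V1.
ALLOWED_TRANSITIONS = {
    LifecycleState.BUILDING: {LifecycleState.COMPLETE, LifecycleState.INCOMPLETE},
    LifecycleState.COMPLETE: set(),
    LifecycleState.INCOMPLETE: set(),
}


class IllegalTransition(Exception):
    """Raised when a lifecycle change is not permitted."""


def transition(
    current: LifecycleState,
    target: LifecycleState,
    proof_passed: bool = False,
) -> LifecycleState:
    """Return the new state, or raise if the change is not allowed."""
    if target not in ALLOWED_TRANSITIONS[current]:
        raise IllegalTransition(f"{current.value} -> {target.value} is forbidden in V1")
    # A snapshot may become complete only after the completion proof passes.
    if target is LifecycleState.COMPLETE and not proof_passed:
        raise IllegalTransition("cannot complete without a passing completion proof")
    return target
```

Keeping the table as data makes the forbidden cases fall out of the same lookup rather than needing one guard per rule.
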
Derived historical presentation:
- A `complete` snapshot is rendered as `superseded` or historical when a newer `complete` snapshot for the same profile becomes the effective current truth.
- This is a derived presentation concept, not a persisted lifecycle transition, so immutable snapshot rows keep their recorded terminal lifecycle state.
## Entity: BaselineSnapshotItem
Purpose:
- Immutable or deterministic per-subject record within a BaselineSnapshot.
Existing fields:
- `id`
- `baseline_snapshot_id`
- `subject_type`
- `subject_external_id`
- `policy_type`
- `baseline_hash`
- `meta_jsonb`
- `created_at`
- `updated_at`
Relationships:
- Belongs to one BaselineSnapshot
Validation / invariants:
- Unique per snapshot on `(baseline_snapshot_id, subject_type, subject_external_id)`.
- Item persistence must remain deterministic across retries/reruns.
- Item presence alone does not make the parent snapshot consumable.
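
One way to keep item persistence deterministic across retries is to key every write on the unique tuple, so a rerun overwrites rather than duplicates. The sketch below uses an in-memory dict as a stand-in for the table; `upsert_item` is a hypothetical helper, and a real implementation would more likely rely on a database unique index plus an upsert.

```python
from typing import Dict, Tuple

# (baseline_snapshot_id, subject_type, subject_external_id)
ItemKey = Tuple[int, str, str]


def upsert_item(store: Dict[ItemKey, dict], item: dict) -> None:
    """Insert or overwrite an item under its unique per-snapshot key."""
    key = (
        item["baseline_snapshot_id"],
        item["subject_type"],
        item["subject_external_id"],
    )
    # Re-running the same capture step writes to the same key, so retries
    # cannot create a second row for the same subject.
    store[key] = item
```

Note that a fully populated store still says nothing about consumability; that decision stays with the parent snapshot's `lifecycle_state`.
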
## Entity: BaselineProfile
Purpose:
- Workspace-owned baseline definition that exposes the effective current baseline truth to operators and compare consumers.
Relevant existing fields:
- `id`
- `workspace_id`
- `status`
- `capture_mode`
- `scope_jsonb`
- `active_snapshot_id`
Revised semantics:
- `active_snapshot_id` points only to a `complete` snapshot.
- Effective current baseline truth is the latest complete snapshot for the profile.
- The latest attempted snapshot may differ from the effective current snapshot when the latest attempt is `building` or `incomplete`.
Validation / invariants:
- A profile must never advance `active_snapshot_id` to a non-consumable snapshot.
- When no complete snapshot exists, `active_snapshot_id` may remain null.
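
The pointer invariant can be sketched as a guard that refuses to advance `active_snapshot_id` to anything non-consumable. `advance_active_snapshot` is a hypothetical helper operating on plain dicts; the ownership check is an added assumption based on the "belongs to one BaselineProfile" relationship.

```python
def advance_active_snapshot(profile: dict, snapshot: dict) -> bool:
    """Point the profile at `snapshot` only if it is consumable.

    Returns True when the pointer was advanced, False when it was left as-is.
    """
    # Only complete snapshots are consumable.
    if snapshot["lifecycle_state"] != "complete":
        return False
    # Assumption: a profile may only point at its own snapshots.
    if snapshot["baseline_profile_id"] != profile["id"]:
        return False
    profile["active_snapshot_id"] = snapshot["id"]
    return True
```

When the guard returns `False`, the previous pointer (possibly null) is kept, so a failed capture never displaces the last known-good baseline.
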
## Entity: OperationRun (capture/compare interaction only)
Purpose:
- Execution/audit record for `baseline.capture` and `baseline.compare`.
Relevant fields:
- `id`
- `workspace_id`
- `tenant_id`
- `type`
- `status`
- `outcome`
- `summary_counts`
- `context`
- `failure_summary`
Feature constraints:
- Run truth remains separate from snapshot truth.
- `summary_counts` remain numeric-only execution metrics.
- Snapshot lifecycle and consumability do not live in `summary_counts`.
- Compare execution must revalidate snapshot consumability even if `context.baseline_snapshot_id` is present.
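
The revalidation constraint can be sketched as follows: the compare path treats `context.baseline_snapshot_id` as a hint and re-checks the snapshot's lifecycle state before using it. `SnapshotNotConsumable`, `resolve_compare_snapshot`, and the `load_snapshot` callback are hypothetical names for illustration.

```python
from typing import Callable, Optional


class SnapshotNotConsumable(Exception):
    """Raised when a compare run cannot use the referenced snapshot."""


def resolve_compare_snapshot(
    context: dict,
    load_snapshot: Callable[[int], Optional[dict]],
) -> dict:
    """Load the snapshot named in the run context and revalidate it."""
    snapshot_id = context.get("baseline_snapshot_id")
    if snapshot_id is None:
        raise SnapshotNotConsumable("run context has no baseline_snapshot_id")
    snapshot = load_snapshot(snapshot_id)
    if snapshot is None:
        raise SnapshotNotConsumable(f"snapshot {snapshot_id} not found")
    # The id being present in the context proves nothing about consumability.
    if snapshot.get("lifecycle_state") != "complete":
        raise SnapshotNotConsumable(
            f"snapshot {snapshot_id} is {snapshot.get('lifecycle_state')}, not complete"
        )
    return snapshot
```

The check reads the snapshot row itself rather than anything in `summary_counts`, which keeps run truth and snapshot truth separate.
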
## Derived Concepts
### Consumable Snapshot
Definition:
- A BaselineSnapshot whose `lifecycle_state = complete`.
Authoritative rule:
- Exposed through one shared helper/service and mirrored by `BaselineSnapshot::isConsumable()`.
### Effective Current Snapshot
Definition:
- The snapshot a profile should use as current baseline truth.
Resolution rule:
- Latest `complete` snapshot for the profile.
- Prefer `active_snapshot_id` as a cached pointer when valid.
- Never resolve to `building`, `incomplete`, or a historically superseded complete snapshot as current truth.
Explicit compare override rule:
- In V1, a historically superseded complete snapshot is viewable but is not a valid explicit compare input once a newer complete snapshot is the effective current truth.
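
The resolution and override rules can be sketched over plain dicts. This is a sketch under assumptions: "latest" is ordered by `completed_at` with `id` as a tie-breaker, complete snapshots always carry a `completed_at`, and all function names are hypothetical.

```python
from typing import Iterable, Optional


def effective_current_snapshot(profile_id: int, snapshots: Iterable[dict]) -> Optional[dict]:
    """Return the latest complete snapshot for the profile, or None."""
    complete = [
        s for s in snapshots
        if s["baseline_profile_id"] == profile_id and s["lifecycle_state"] == "complete"
    ]
    if not complete:
        return None
    # completed_at values must be mutually comparable (datetimes, or
    # ISO-8601 strings in a single timezone).
    return max(complete, key=lambda s: (s["completed_at"], s["id"]))


def active_pointer_is_valid(profile: dict, snapshots: Iterable[dict]) -> bool:
    """True when the cached active_snapshot_id matches the effective snapshot."""
    current = effective_current_snapshot(profile["id"], list(snapshots))
    return current is not None and profile.get("active_snapshot_id") == current["id"]


def is_valid_explicit_compare_input(
    snapshot_id: int, profile_id: int, snapshots: Iterable[dict]
) -> bool:
    """In V1 only the effective current snapshot is a valid explicit input."""
    current = effective_current_snapshot(profile_id, list(snapshots))
    return current is not None and current["id"] == snapshot_id
```

A caller would typically trust `active_snapshot_id` while `active_pointer_is_valid` holds and fall back to `effective_current_snapshot` otherwise.
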
### Legacy Snapshot Proof Classification
Definition:
- Backfill-time decision of whether an existing pre-feature snapshot can be trusted as complete.
Proof sources:
- persisted item count
- `summary_jsonb.total_items` or equivalent expected-item metadata
- producing run context/result proving successful finalization, if available
- explicit zero-item completion proof for empty snapshots
Fallback:
- If proof is ambiguous, classify as `incomplete`.
Decision order:
1. Count proof: the expected-item count exactly matches the persisted-item count.
2. Producing-run proof: the producing run succeeded and expected items reconcile with persisted items.
3. Empty-capture proof: expected items and persisted items are both proven to be zero.
4. Otherwise `incomplete`.
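
The decision order can be sketched as a pure function over the available evidence. This is one possible reading, not a definitive rule set: it assumes non-zero counts for steps 1 and 2 (so that zero-item snapshots always need the explicit empty-capture proof), and `LegacyEvidence`, its field names, and the reason codes are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LegacyEvidence:
    persisted_items: int
    summary_expected_items: Optional[int] = None  # e.g. summary_jsonb.total_items
    run_succeeded: bool = False                   # producing run finalized successfully
    run_expected_items: Optional[int] = None      # expected count from the run, if any
    empty_capture_proven: bool = False            # explicit zero-item completion proof


def classify_legacy_snapshot(e: LegacyEvidence) -> Tuple[str, str]:
    """Return (lifecycle_state, reason_code) for a pre-feature snapshot."""
    # 1. Count proof: exact match between expected and persisted items.
    if e.summary_expected_items and e.summary_expected_items == e.persisted_items:
        return ("complete", "count_proof")
    # 2. Producing-run proof: successful run whose expected count reconciles.
    if e.run_succeeded and e.run_expected_items and e.run_expected_items == e.persisted_items:
        return ("complete", "producer_run_proof")
    # 3. Empty-capture proof: every known expected count and the persisted
    #    count are zero, and the empty capture is explicitly proven.
    known = [v for v in (e.summary_expected_items, e.run_expected_items) if v is not None]
    if e.empty_capture_proven and e.persisted_items == 0 and known and all(v == 0 for v in known):
        return ("complete", "empty_capture_proof")
    # 4. Anything ambiguous falls back to incomplete.
    return ("incomplete", "ambiguous_proof")
```

Returning a reason code alongside the state gives the backfill something to record in `completion_meta_jsonb.finalization_reason_code`, if that is how the field ends up being used.
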