# Phase 1 — Data Model This is the proposed data model for **101 — Golden Master / Baseline Governance v1**. ## Ownership Model - **Workspace-owned** - Baseline profiles - Baseline snapshots + snapshot items (data-minimized, reusable across tenants) - **Tenant-owned** - Tenant assignments (joins a tenant to the workspace baseline standard) - Operation runs for capture/compare - Findings produced by compares This follows constitution SCOPE-001 conventions: tenant-owned tables include `workspace_id` + `tenant_id` NOT NULL; workspace-owned tables include `workspace_id` and do not include `tenant_id`. ## Entities ## 1) baseline_profiles (workspace-owned) **Purpose**: Defines the baseline (“what good looks like”) and its scope. **Fields** - `id` (pk) - `workspace_id` (fk workspaces, NOT NULL) - `name` (string, NOT NULL) - `description` (text, nullable) - `version_label` (string, nullable) - `status` (string enum: `draft|active|archived`, NOT NULL) - `scope_jsonb` (jsonb, NOT NULL) - v1 schema: `{ "policy_types": ["string", ...] }` — array of policy type keys from `InventoryPolicyTypeMeta` - An empty array means "all types" (no filtering); each string must be a known policy type key - Future versions may add additional filter dimensions (e.g., `platforms`, `tags`) - `active_snapshot_id` (fk baseline_snapshots, nullable) - `created_by_user_id` (fk users, nullable) - timestamps **Indexes/constraints** - index: `(workspace_id, status)` - uniqueness: `(workspace_id, name)` (optional but recommended if UI expects names unique per workspace) **Validation notes** - status transitions: `draft ↔ active → archived` (archived is terminal in v1; deactivate returns active → draft) ## 2) baseline_snapshots (workspace-owned) **Purpose**: Immutable baseline snapshot captured from a tenant. **Fields** - `id` (pk) - `workspace_id` (fk workspaces, NOT NULL) - `baseline_profile_id` (fk baseline_profiles, NOT NULL) - `snapshot_identity_hash` (string(64), NOT NULL) - sha256 of normalized captured content - `captured_at` (timestamp tz, NOT NULL) - `summary_jsonb` (jsonb, nullable) - counts/metadata (e.g., total items) - timestamps **Indexes/constraints** - unique: `(workspace_id, baseline_profile_id, snapshot_identity_hash)` (dedupe) - index: `(workspace_id, baseline_profile_id, captured_at desc)` ## 3) baseline_snapshot_items (workspace-owned) **Purpose**: Immutable items within a snapshot. **Fields** - `id` (pk) - `baseline_snapshot_id` (fk baseline_snapshots, NOT NULL) - `subject_type` (string, NOT NULL) — e.g. `policy` - `subject_external_id` (string, NOT NULL) — stable key for the policy within the tenant inventory - `policy_type` (string, NOT NULL) — for filtering and summary - `baseline_hash` (string(64), NOT NULL) — stable content hash for the baseline version - `meta_jsonb` (jsonb, nullable) — minimized display metadata (no secrets) - timestamps **Indexes/constraints** - unique: `(baseline_snapshot_id, subject_type, subject_external_id)` - index: `(baseline_snapshot_id, policy_type)` ## 4) baseline_tenant_assignments (tenant-owned) **Purpose**: Assigns exactly one baseline profile per tenant (v1), with optional scope override that can only narrow. **Fields** - `id` (pk) - `workspace_id` (fk workspaces, NOT NULL) - `tenant_id` (fk tenants, NOT NULL) - `baseline_profile_id` (fk baseline_profiles, NOT NULL) - `override_scope_jsonb` (jsonb, nullable) - `assigned_by_user_id` (fk users, nullable) - timestamps **Indexes/constraints** - unique: `(workspace_id, tenant_id)` - index: `(workspace_id, baseline_profile_id)` **Validation notes** - Override must be subset of profile scope (enforced server-side; store the final effective scope hash in the compare run context for traceability). ## 5) findings additions (tenant-owned) **Purpose**: Persist baseline compare drift findings using existing findings system. **Required v1 additions** - add `findings.source` (string, **nullable**, default `NULL`) - baseline compare uses `source='baseline.compare'` - existing findings receive `NULL`; legacy drift-generate findings are queried with `whereNull('source')` or unconditionally - index: `(tenant_id, source)` **Baseline compare finding conventions** - `finding_type = 'drift'` - `scope_key = 'baseline_profile:'` - `fingerprint = sha256(tenant_id|scope_key|subject_type|subject_external_id|change_type|baseline_hash|current_hash)` (using `DriftHasher`) - `evidence_jsonb` includes: - `source` (until DB column exists) - `baseline.profile_id`, `baseline.snapshot_id` - `baseline.hash`, `current.hash` - `change_type` (missing_policy|different_version|unexpected_policy) - any minimized diff pointers required for UI ## 6) operation_runs conventions **Operation types** - `baseline_capture` - `baseline_compare` **Run context** (json) - capture: - `baseline_profile_id` - `source_tenant_id` - `effective_scope` / `selection_hash` - compare: - `baseline_profile_id` - `baseline_snapshot_id` (frozen at enqueue time) - `effective_scope` / `selection_hash` **Summary counts** - compare should set a compact breakdown for dashboards: - totals and severity breakdowns