Ahmed Darrazi add136cc3c spec(116): baseline drift engine specs

2026-03-02 02:08:28 +01:00

Phase 1 — Data Model (Baseline Drift Engine)

This document identifies the data/entities involved in Spec 116 and the minimal schema/config changes needed to implement it in this repository.

Existing Entities (Confirmed)

BaselineProfile

Represents a baseline definition.

Fields (expected): id, name, description, scope (jsonb), created_by, timestamps
Relationships: has many snapshots; assigned to tenants via BaselineTenantAssignment

BaselineSnapshot

Immutable capture of baseline state at a point in time.

Fields (expected): id, baseline_profile_id, captured_at, status, operation_run_id, timestamps
Relationships: has many items; belongs to baseline profile

BaselineSnapshotItem

One item in a baseline snapshot.

Fields (expected):
- id, baseline_snapshot_id
- policy_type
- external_id
- subject_json (jsonb) or subject fields
- baseline_hash (string)
- meta_jsonb (jsonb)
- timestamps

Finding

Generic drift finding storage.

Fields (confirmed by usage): tenant_id, fingerprint (unique with tenant), recurrence_key (nullable), scope_key, lifecycle fields (first_seen_at, last_seen_at, times_seen), evidence (jsonb)

OperationRun

Tracks long-running operations.

Fields (by convention): type, status/outcome, summary_counts (numeric map), context (jsonb)

New / Adjusted Data Requirements

1) Inventory sync coverage context

Goal: Baseline compare must know which policy types were actually processed successfully by inventory sync.

Where: operation_runs.context for the latest inventory sync run.

Shape (proposed):

{
  "inventory": {
    "coverage": {
      "policy_types": {
        "deviceConfigurations": {"status": "succeeded", "item_count": 123},
        "compliancePolicies": {"status": "failed", "error": "..."}
      },
      "foundation_types": {
        "securityBaselines": {"status": "succeeded", "item_count": 4}
      }
    }
  }
}

Notes:

summary_counts must remain a map of numeric values only; detailed coverage lists live in context instead.
For Spec 116 v1, it’s sufficient to store policy_types coverage; adding foundation_types coverage at the same time keeps parity with scope rules.
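
As a rough illustration of how baseline compare might consume this context, the sketch below extracts the types whose sync status is "succeeded". The function name covered_types and the idea of treating a missing entry as "not covered" are assumptions for illustration, not part of the spec.

```python
from typing import Any


def covered_types(context: dict[str, Any], group: str = "policy_types") -> set[str]:
    """Return the type names in `group` whose sync status is "succeeded".

    Hypothetical helper: a type that is missing from the coverage map is
    treated the same as a failed one (i.e. not covered).
    """
    coverage = context.get("inventory", {}).get("coverage", {})
    entries = coverage.get(group, {})
    return {
        name
        for name, entry in entries.items()
        if entry.get("status") == "succeeded"
    }


# Example context mirroring the proposed shape above.
example_context = {
    "inventory": {
        "coverage": {
            "policy_types": {
                "deviceConfigurations": {"status": "succeeded", "item_count": 123},
                "compliancePolicies": {"status": "failed", "error": "..."},
            },
            "foundation_types": {
                "securityBaselines": {"status": "succeeded", "item_count": 4},
            },
        }
    }
}
```

With this shape, compare could skip (rather than report as missing) any policy type that is absent from the returned set.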

2) Baseline scope schema

Goal: Support both policy and foundation scope with correct defaults.

Current: policy_types only.

Target:

{
  "policy_types": ["deviceConfigurations", "compliancePolicies"],
  "foundation_types": ["securityBaselines"]
}

Default semantics:

Empty policy_types means “all supported policy types excluding foundations”.
Empty foundation_types means “none”.
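
The default semantics above can be sketched as a small resolver. The function name resolve_scope and the two "supported" lists passed in are hypothetical; where the supported types actually come from in this repository is not specified here.

```python
def resolve_scope(
    scope: dict,
    supported_policy_types: list[str],
    supported_foundation_types: list[str],
) -> dict[str, list[str]]:
    """Expand a stored scope into the effective lists of types to compare.

    Assumed inputs: `supported_policy_types` may or may not include
    foundation types, so foundations are filtered out explicitly.
    """
    foundations = set(supported_foundation_types)
    policy = list(scope.get("policy_types") or [])
    foundation = list(scope.get("foundation_types") or [])

    if not policy:
        # Empty policy_types: all supported policy types excluding foundations.
        policy = [t for t in supported_policy_types if t not in foundations]

    # Empty foundation_types stays empty ("none").
    return {"policy_types": policy, "foundation_types": foundation}
```

An explicit, non-empty policy_types list is passed through unchanged in this sketch; whether it should also be validated against the supported list is left open.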

3) Findings recurrence strategy

Goal: Stable identity per snapshot and per subject.

findings.recurrence_key: populated for baseline compare findings.
findings.fingerprint: set to the same value as recurrence_key, to satisfy the existing per-tenant uniqueness constraint on fingerprint.

Recurrence key inputs:

tenant_id
baseline_snapshot_id
policy_type
subject_external_id
change_type

Grouping (scope_key):

Keep findings.scope_key = baseline_profile:{baselineProfileId} for baseline compare findings.
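
One way to derive these keys is sketched below. The choice of SHA-256 over a JSON-encoded list of the inputs is an assumption (the spec lists the inputs but not the encoding or hash), and the function names are hypothetical.

```python
import hashlib
import json


def recurrence_key(
    tenant_id: int,
    baseline_snapshot_id: int,
    policy_type: str,
    subject_external_id: str,
    change_type: str,
) -> str:
    """Derive a stable key from the five recurrence inputs.

    Assumption: inputs are JSON-encoded as an ordered list before hashing,
    which avoids ambiguity that a plain delimiter join could introduce.
    """
    parts = [tenant_id, baseline_snapshot_id, policy_type, subject_external_id, change_type]
    payload = json.dumps(parts, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def scope_key(baseline_profile_id: int) -> str:
    """Build the grouping key for baseline compare findings."""
    return f"baseline_profile:{baseline_profile_id}"
```

Because baseline_snapshot_id is part of the input, the same subject compared against a newer snapshot yields a different key, which matches the "per snapshot and per subject" goal.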

4) Inventory meta contract

Goal: Explicitly define what is hashed for v1 comparisons.

Implemented as a dedicated builder class (no schema change required).
Used by baseline capture to compute baseline_hash and by compare to compute current_hash.
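
A minimal sketch of such a builder follows, assuming the contract is a fixed whitelist of meta fields hashed as canonical JSON. The class name InventoryMetaContract, the whitelist approach, and SHA-256 are all illustrative assumptions; the spec does not enumerate the hashed fields.

```python
import hashlib
import json
from typing import Any, Iterable


class InventoryMetaContract:
    """Hypothetical builder: selects contract fields and hashes them."""

    def __init__(self, fields: Iterable[str]) -> None:
        # Sorted so the contract is independent of declaration order.
        self.fields = tuple(sorted(fields))

    def build(self, meta: dict[str, Any]) -> dict[str, Any]:
        """Keep only contract fields; missing fields become None."""
        return {name: meta.get(name) for name in self.fields}

    def hash(self, meta: dict[str, Any]) -> str:
        """Hash the canonical JSON form of the contract payload."""
        canonical = json.dumps(
            self.build(meta), sort_keys=True, separators=(",", ":")
        )
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Using the same builder instance for capture (baseline_hash) and compare (current_hash) keeps the two hashes comparable, since fields outside the contract never influence either value.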

Potential Migrations (Likely)

If baseline_profiles.scope is not already jsonb, a migration is needed to change the column. If it is jsonb but lacks foundation types, the column can stay as-is and support is added in code, so a DB change may be optional.
If coverage context needs persistence beyond the operation run context, avoid adding tables unless proven necessary; context-based storage is sufficient for v1.

Index / Performance Notes

Findings queries commonly filter by tenant_id + scope_key; ensure there is an index on (tenant_id, scope_key).
Baseline snapshot items must be efficiently loadable by (baseline_snapshot_id, policy_type); consider a composite index on those columns if one does not already exist.
