# Feature Specification: Drift Golden Master Cutover (Baseline Compare)

**Feature Branch**: `feat/119-baseline-drift-engine`  
**Created**: 2026-03-04  
**Status**: Draft (ready for implementation)  
**Owner**: Governance/Platform  
**Input**: User description: "Spec 119 — Drift Unification Cutover & Legacy Cleanup — Option A: Golden Master (Baseline Compare) becomes the single source of truth for drift findings, keeping the existing diff UX while removing the legacy run-to-run drift generator and cleaning up legacy findings."

## Spec Scope Fields *(mandatory)*

- **Scope**: tenant (drift findings + compare runs) + workspace (baseline profiles/snapshots referenced for compare)
- **Primary Routes**:
  - Tenant-context admin: Baseline Compare landing (Drift entry), Findings (list + view), Operation Run detail
  - Workspace admin: Baseline Profiles (read-only impact via provenance links)
- **Data Ownership**:
  - Tenant-owned: drift findings, compare runs, policy versions/evidence used to explain drift
  - Workspace-owned: baseline profiles, baseline snapshots (this spec does not change snapshot semantics; it only references them for provenance)
- **RBAC**:
  - Membership: workspace membership + tenant access are required for tenant-context surfaces (deny-as-not-found for non-members)
  - Capabilities (existing):
    - `tenant_findings.view`: view drift findings and any diff surfaces
    - `tenant.sync`: start baseline compare runs (manual) and authorize compare-time evidence refresh where applicable
    - `tenant.view`: view baseline compare landing and run summaries
  - 404 vs 403 semantics:
    - Non-member / not entitled to workspace scope OR tenant scope → 404 (deny-as-not-found)
    - Member but missing capability → 403
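The 404 vs 403 ordering above can be sketched as a small decision function. This is a minimal illustration, not the project's authorization layer: the `Actor` model and its field names are assumptions; only the capability names and the status semantics come from this spec.

```python
from dataclasses import dataclass
from http import HTTPStatus


@dataclass(frozen=True)
class Actor:
    """Hypothetical actor model; field names are illustrative assumptions."""
    is_workspace_member: bool
    has_tenant_access: bool
    capabilities: frozenset = frozenset()


def authorize(actor: Actor, required_capability: str) -> HTTPStatus:
    """Return the status a tenant-context surface should answer with."""
    # Membership is checked first so non-members cannot learn the resource exists.
    if not (actor.is_workspace_member and actor.has_tenant_access):
        return HTTPStatus.NOT_FOUND
    # Members who lack the capability get an explicit 403.
    if required_capability not in actor.capabilities:
        return HTTPStatus.FORBIDDEN
    return HTTPStatus.OK
```

Checking membership before capability is what keeps a non-member who happens to hold a capability from receiving a 403 that would confirm the tenant exists.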

For canonical-view specs: not applicable (this is a tenant-context feature).

## Clarifications

### Session 2026-03-05

- Q: What is the drift finding `source` value for Baseline Compare? → A: `baseline.compare`
- Q: What are the allowed `evidence.summary.kind` values for drift diff rendering? → A: `policy_snapshot`, `policy_assignments`, `policy_scope_tags`
- Q: When is a `different_version` drift finding considered diff-renderable? → A: Only when both `baseline.policy_version_id` and `current.policy_version_id` are present; otherwise show “diff unavailable”.
- Q: What is the legacy drift findings deletion rule for cleanup? → A: Delete any findings where `finding_type = drift` AND (`source` is null OR `source` is not equal to `baseline.compare`).
- Q: What should the Drift navigation entry point be after cutover? → A: Baseline Compare landing.

### Session 2026-03-06

- Q: How should `missing_policy` and `unexpected_policy` findings render in the Finding view when exactly one policy-version reference exists? → A: Render a diff against an empty side (`removed` for baseline-only, `added` for current-only) instead of showing “diff unavailable”.
- Q: Which inventory rows may baseline snapshot capture treat as current subjects when multiple historical Inventory Sync runs exist? → A: Use only rows from the latest completed Inventory Sync run; stale rows from older syncs must not create new snapshot ambiguity gaps.
- Q: How should full-content Baseline Compare treat an identical `PolicyVersion` that was re-used instead of newly inserted during the current compare run? → A: Treat it as valid current content evidence for the current compare run; it must not be dropped as `missing_current` just because the reused row’s original `captured_at` predates the snapshot.
- Q: How should the Baseline Compare landing-page duplicate-name warning be computed when historical Inventory Sync rows exist? → A: Warn only for duplicates present in the latest completed Inventory Sync scope; stale historical rows must not keep the warning visible.
- Q: Should Intune `Actions for noncompliance` changes count as compliance-policy drift? → A: Yes. Canonicalize `scheduledActionsForRule` by semantic fields (`actionType`, `gracePeriodHours`, notification template id), ignore opaque IDs/order-only noise, and treat those changes as `policy_snapshot` drift.
- Q: How should existing baseline snapshots remain comparable when the policy snapshot drift signal is expanded? → A: When a baseline item has content provenance, recompute the effective baseline hash from the resolved baseline `policy_version` during compare so previously captured snapshots do not require forced recapture.
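The `scheduledActionsForRule` canonicalization answer above might look roughly like the following sketch. It assumes the Graph payload nests action items under `scheduledActionConfigurations` with a `notificationTemplateId` field; the function names and the SHA-256 hashing step are illustrative assumptions, not the project's normalizer.

```python
import hashlib
import json
from typing import Any


def canonicalize_scheduled_actions(rules: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep only semantic fields and sort them so ordering noise cannot flap."""
    actions = []
    for rule in rules:
        # Assumed payload shape: each rule carries a list of action items.
        for item in rule.get("scheduledActionConfigurations") or []:
            actions.append({
                "actionType": item.get("actionType"),
                "gracePeriodHours": item.get("gracePeriodHours"),
                # Opaque ids and Graph metadata are dropped on purpose.
                "notificationTemplateId": item.get("notificationTemplateId") or None,
            })
    # Sort by the serialized form to get a total, deterministic order.
    return sorted(actions, key=lambda a: json.dumps(a, sort_keys=True))


def scheduled_actions_hash(rules: list[dict[str, Any]]) -> str:
    """Hash the canonical form; equal semantics give equal hashes."""
    canonical = canonicalize_scheduled_actions(rules)
    encoded = json.dumps(canonical, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(encoded.encode("utf-8")).hexdigest()
```

Two payloads that differ only in item `id` values or item order produce the same hash, while a changed `gracePeriodHours` produces a different one.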

## Problem Statement

Today, drift findings can be produced by two different generators with different “change types” and different evidence detail levels. This creates mixed data states, inconsistent UI rendering, and operator confusion (“two truths”) that is not acceptable for enterprise governance workflows.

This spec hard-cuts drift generation to a single golden-master source: Baseline Compare.

## Goals

- **Single source of truth**: all drift findings are generated exclusively by Baseline Compare.
- **Preserve Diff UX**: baseline-compare drift findings include enough evidence references to render the existing normalized diff views when content evidence exists, and provide a clear “diff unavailable” explanation when it does not.
- **Hard cut (no dual-write)**: no feature flags, no dual-write period, no backward compatibility work for the legacy drift generator (project is not yet production).
- **Legacy cleanup**: remove the legacy drift generator stack end-to-end (jobs, scheduling, UI affordances, tests/docs) and delete legacy drift findings to prevent confusion.

## Non-Goals

- No new reporting layer (e.g., Stored Reports / Evidence Items).
- No soft deprecation, rollout toggles, or staged migration.
- No change to Baseline Compare drift change-type semantics (missing policy / unexpected policy / different version remain conceptually the same).

## Definitions

- **Drift finding**: a detected deviation between an approved baseline and a tenant’s current observed state, recorded as a finding for triage and audit.
- **Baseline Compare**: the comparison process that evaluates baseline vs current state and emits drift findings.
- **Legacy drift generator**: the older drift workflow that compares two historical runs (“run-to-run”) and writes drift findings independently of Baseline Compare.
- **Source (origin label)**: a mandatory short identifier indicating which drift engine produced a drift finding; for this feature, the only allowed drift finding source is `baseline.compare`.
- **Evidence**: the explanatory payload attached to a drift finding, including what changed and what reference snapshots/versions support a diff view.
- **Evidence fidelity**:
  - **content**: evidence includes sufficient full-content references to support detailed diffs
  - **meta**: evidence is metadata-level only; diffs are not available
  - **mixed**: some dimensions have content-level evidence and others do not; UI must communicate limitations
- **Diff kind**: the drift dimension the UI should render. Allowed values are `policy_snapshot`, `policy_assignments`, and `policy_scope_tags`.
- **Provenance**: which baseline profile/snapshot and which compare run produced the finding (and when).

## Assumptions

- The system is not yet production; deleting legacy drift findings is acceptable to ensure a clean cutover.
- Baseline Compare already exists as an observable operation (run history, outcome, counts) and already emits drift findings.
- The Findings UI already supports rendering normalized diffs when evidence includes baseline/current references.

## User Scenarios & Testing *(mandatory)*

### User Story 1 - Understand drift with consistent diffs (Priority: P1)

As a tenant admin/operator, I can run Baseline Compare and open the resulting drift findings with a consistent, enterprise-friendly explanation. When content evidence exists, I can view a detailed diff; when it does not, the UI clearly explains why.

**Why this priority**: This is the core value of drift governance: explain what changed in a way that is actionable and trustworthy.

**Independent Test**: Run Baseline Compare to produce (a) a `different_version` finding with content-level evidence, (b) a `missing_policy` or `unexpected_policy` finding with a single policy-version reference, and (c) a meta-only finding, then verify the finding detail view renders the correct diff or an explicit “diff unavailable” explanation.

**Acceptance Scenarios**:

1. **Given** a Baseline Compare run that produces a `different_version` drift finding with content-level evidence, **When** I open the finding detail view, **Then** the UI renders the appropriate diff view for the change dimension and shows baseline vs current references.
2. **Given** a `missing_policy` or `unexpected_policy` drift finding with exactly one policy-version reference, **When** I open the finding detail view, **Then** the UI renders a diff against an empty side so the operator can see the added or removed policy content.
3. **Given** a drift finding with meta-only evidence, **When** I open the finding detail view, **Then** the UI shows a clear “diff unavailable” explanation and still displays the finding summary and provenance.
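
The three scenarios above reduce to one decision per change type. The sketch below is a hedged illustration: `DiffPlan`, `plan_diff`, and the `"changed"` mode label are hypothetical names, while `removed`/`added` and the change-type values come from this spec. Returning `None` stands for the “diff unavailable” state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiffPlan:
    """Hypothetical description of what the finding view should render."""
    mode: str  # "changed", "removed", or "added"
    baseline_version_id: Optional[int]
    current_version_id: Optional[int]


def plan_diff(change_type: str,
              baseline_id: Optional[int],
              current_id: Optional[int]) -> Optional[DiffPlan]:
    """Return a render plan, or None when the view must show 'diff unavailable'."""
    if change_type == "different_version":
        # Two-sided diff needs both references.
        if baseline_id is not None and current_id is not None:
            return DiffPlan("changed", baseline_id, current_id)
    elif change_type == "missing_policy":
        # Baseline-only: render against an empty current side.
        if baseline_id is not None:
            return DiffPlan("removed", baseline_id, None)
    elif change_type == "unexpected_policy":
        # Current-only: render against an empty baseline side.
        if current_id is not None:
            return DiffPlan("added", None, current_id)
    return None
```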

---

### User Story 2 - Eliminate “two truths” for drift (Priority: P2)

As a tenant admin/operator, I only ever see drift findings coming from one source (Baseline Compare). I do not need to understand or choose between multiple drift engines or evidence formats.

**Why this priority**: Trust and supportability depend on consistent terminology and behavior.

**Independent Test**: After the cutover, verify that all drift findings shown in the UI are labeled as Baseline Compare-origin and no UI surface offers a legacy “Generate drift” action or “source” switching.

**Acceptance Scenarios**:

1. **Given** any drift findings are visible for a tenant, **When** I view the Findings list and open drift findings, **Then** each drift finding clearly indicates a single origin (“Baseline Compare”) and the UI contains no “choose drift source” affordances.
2. **Given** I navigate to the Drift entry point (Baseline Compare landing), **When** I view available actions, **Then** I can view Baseline Compare status and view findings, but I cannot start any legacy run-to-run drift generation workflow.

---

### User Story 3 - Clean cutover & legacy removal (Priority: P3)

As a platform/governance owner, I can deploy this change as a hard cut: legacy drift generation is removed end-to-end and legacy drift findings are deleted so operators never encounter mixed states.

**Why this priority**: This removes ambiguity and reduces long-term maintenance and support costs.

**Independent Test**: After deployment and the one-time cleanup step, verify that legacy drift findings no longer exist, legacy drift generation cannot be scheduled or started, and Baseline Compare drift continues to work.

**Acceptance Scenarios**:

1. **Given** a tenant previously had legacy drift findings, **When** the system is upgraded and cleanup is applied, **Then** those legacy drift findings are no longer visible and only Baseline Compare drift findings remain.
2. **Given** I inspect scheduled operations and run history, **When** I look for legacy drift generation, **Then** no scheduling or run types exist for the legacy drift generator and drift generation is attributable only to Baseline Compare runs.

### Edge Cases

- **Missing/unexpected policy outcomes**: a drift finding may only have a baseline reference or only a current reference; the UI should render a diff against an empty side when that single reference exists, and only fall back to “diff unavailable” when the required single reference is missing.
- **Mixed fidelity**: a single compare run can produce findings where some drift dimensions have content-level evidence and others do not; fidelity labels and UI messaging must reflect the weakest/limiting evidence.
- **Evidence gaps**: when evidence references are missing, the finding detail view must not error; it must show “diff unavailable” with a reason.
- **Stale inventory rows**: when a policy was deleted or renamed after an older Inventory Sync, a new Baseline Snapshot capture must ignore stale `inventory_items` rows from older sync runs so historical duplicates do not create false `ambiguous_match` gaps.
- **No baseline configured**: the Baseline Compare landing (Drift entry point) must show a blocked state with clear guidance (no silent empty states).
- **Repeat compares**: repeated Baseline Compare runs without changes must not create confusing duplicates; findings lifecycle should remain stable and understandable.
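
The stale-inventory edge case can be illustrated with a short sketch. The `InventoryItem` fields and both helper names are assumptions, as are two behaviors the spec leaves open: falling back to all rows when no completed sync run exists, and grouping duplicate names per policy type.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class InventoryItem:
    """Hypothetical row shape; real `inventory_items` columns may differ."""
    sync_run_id: int
    policy_type: str
    display_name: str


def scope_to_latest_run(items: Iterable[InventoryItem],
                        latest_completed_run_id: Optional[int]) -> list[InventoryItem]:
    """Keep only rows from the latest completed Inventory Sync run."""
    if latest_completed_run_id is None:
        # Assumption: with no completed run there is nothing to scope by.
        return list(items)
    return [i for i in items if i.sync_run_id == latest_completed_run_id]


def duplicate_names(items: Iterable[InventoryItem]) -> list[tuple[str, str]]:
    """Return (policy_type, display_name) pairs that occur more than once."""
    counts = Counter((i.policy_type, i.display_name) for i in items)
    return sorted(key for key, n in counts.items() if n > 1)
```

Running `duplicate_names` on the scoped rows rather than the full table is what keeps a renamed or deleted policy from an older sync from producing a false `ambiguous_match` gap or a lingering duplicate-name warning.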

## Requirements *(mandatory)*

### Constitution alignment (required)

- Drift evaluation and findings persistence are tenant-owned and MUST remain strictly tenant-scoped.
- This spec changes the drift-finding contract and removes legacy scheduled/queued work; it MUST preserve run observability and auditability through existing operations monitoring surfaces.
- No new external integrations are introduced by this spec; it unifies drift generation pathways and evidence contracts.

### Constitution alignment (OPS-UX)

- Baseline Compare runs MUST remain observable operations with:
  - intent-only toast feedback when manually started,
  - progress visibility only via the active-ops widget and operation run detail,
  - a single terminal outcome notification for user-initiated runs (scheduled/system runs rely on Monitoring).
- Run lifecycle transitions remain service-owned.
- Run summary counts MUST remain numeric, stable, and suitable for operator understanding (e.g., subjects processed, findings created, evidence gaps).

### Constitution alignment (RBAC-UX)

- Authorization planes involved: tenant/admin `/admin` with tenant-context `/admin/t/{tenant}/...`.
- Drift findings and any diff views MUST be deny-as-not-found (404) for non-members and capability-gated (403) for members without permission.
- Any action that starts Baseline Compare is an operation-start mutation and MUST be enforced server-side, not only via UI visibility.

### Functional Requirements

#### Phase 1 — Baseline Compare becomes the drift “golden master”

- **FR-119-P1-01 Single drift source**: The system MUST generate drift findings exclusively via Baseline Compare going forward; no legacy run-to-run drift generator may write drift findings.
- **FR-119-P1-02 Mandatory origin label**: Every drift finding MUST carry a mandatory origin label (`source`). For Baseline Compare drift findings, `source` MUST be exactly `baseline.compare`.
- **FR-119-P1-03 Evidence supports diff UX**: Baseline Compare drift findings MUST include evidence that allows the existing diff views to render when content references exist, including:
  - `evidence_jsonb.summary.kind` with one of: `policy_snapshot`, `policy_assignments`, `policy_scope_tags`, and
  - `evidence_jsonb.baseline.policy_version_id` and `evidence_jsonb.current.policy_version_id` (int|null) as content references when available, and
  - `evidence_jsonb.fidelity` and `findings.evidence_fidelity` with one of: `content`, `meta`, `mixed`.
- **FR-119-P1-03a Diff-renderability rule**: The finding detail view MUST render detailed diffs according to change type:
  - `different_version` requires both `baseline.policy_version_id` and `current.policy_version_id`,
  - `missing_policy` requires `baseline.policy_version_id` and renders against an empty current side,
  - `unexpected_policy` requires `current.policy_version_id` and renders against an empty baseline side.
  If the required reference(s) are missing for the change type, the finding detail view MUST show an explicit “diff unavailable” explanation.
- **FR-119-P1-03b Baseline policy-version resolution**: When baseline snapshot provenance indicates content evidence and provides an `observed_at` timestamp, the system MUST attempt to resolve `baseline.policy_version_id` deterministically using the baseline snapshot item identity (`policy_type` + `subject_key`) and the baseline item’s `observed_at`. If no matching policy version is found, `baseline.policy_version_id` MUST be set to null.
- **FR-119-P1-03c Baseline capture freshness boundary**: When Baseline Snapshot capture builds its subject list and a latest completed Inventory Sync run exists for the tenant, it MUST scope eligible `inventory_items` to that run only. Stale rows from older sync runs MUST NOT create `ambiguous_match` gaps or leak removed policies into new snapshots.
- **FR-119-P1-03d Reused current content evidence remains valid**: When full-content Baseline Compare captures current policy content and the capture layer reuses an identical existing `policy_version` row instead of inserting a new one, the compare run MUST still treat that reused version as valid current content evidence for the current run. It MUST NOT produce `missing_current` solely because the reused row’s persisted `captured_at` is older than the baseline snapshot.
- **FR-119-P1-03e Duplicate-name warnings follow the current inventory boundary**: Any UI warning about duplicate policy display names preventing baseline matching MUST be computed from the latest completed Inventory Sync scope when one exists. Stale rows from older sync runs MUST NOT keep the warning visible.
- **FR-119-P1-03f Compliance noncompliance actions are part of policy drift**: For `deviceCompliancePolicy`, the drift signal MUST include a canonical representation of `scheduledActionsForRule` using semantic fields only:
  - `actionType`,
  - `gracePeriodHours`,
  - notification template id when present.
  The representation MUST be deterministically sorted and MUST ignore opaque IDs, Graph metadata fields, and order-only noise so semantically identical payloads do not flap.
- **FR-119-P1-03g Baseline hash compatibility for content-backed snapshots**: When a baseline item has content provenance and its baseline `policy_version_id` can be resolved, Baseline Compare MUST use the resolved baseline `policy_version` as the effective content-hash source during drift evaluation. Stored snapshot hashes from older normalization semantics MUST NOT force operators to recapture an unchanged baseline just to stay comparable.
- **FR-119-P1-04 Provenance is explicit**: Baseline Compare drift findings MUST include provenance linking them to the baseline profile/snapshot and the compare run that produced them.
- **FR-119-P1-04a Provenance keys are stable**: The provenance block MUST be present as `evidence_jsonb.provenance` and MUST include:
  - `baseline_profile_id` (int),
  - `baseline_snapshot_id` (int),
  - `compare_operation_run_id` (int; the Baseline Compare `OperationRun` id),
  - `inventory_sync_run_id` (int|null) when applicable.
- **FR-119-P1-05 Fidelity is explicit**: Drift findings MUST include an evidence fidelity label (`content`, `meta`, or `mixed`). It MUST be deterministic and computed as:
  - `content` when both `baseline.policy_version_id` and `current.policy_version_id` are present,
  - `mixed` when exactly one of the two references is present,
  - `meta` when neither reference is present.
  When evidence is meta-only, the UI MUST clearly communicate that diffs are not available.
- **FR-119-P1-06 Change-type semantics unchanged**: Baseline Compare drift change types remain conceptually the same (missing policy / unexpected policy / different version); this spec only standardizes evidence and origin labeling.
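
The fidelity rule in FR-119-P1-05 depends only on which policy-version references are present, so it can be sketched as a pure function. The function name is hypothetical; the three labels and the counting rule are taken from the requirement.

```python
from typing import Optional


def evidence_fidelity(baseline_version_id: Optional[int],
                      current_version_id: Optional[int]) -> str:
    """Derive the fidelity label from the presence of the two references."""
    present = sum(v is not None for v in (baseline_version_id, current_version_id))
    # 2 references -> content, exactly 1 -> mixed, none -> meta.
    return {2: "content", 1: "mixed", 0: "meta"}[present]
```

Because the label is a function of the two references alone, repeated compares over the same evidence cannot yield different fidelity values.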

#### Phase 2 — Hard cut cleanup (“Search & Destroy”)

- **FR-119-P2-01 Remove legacy drift generation workflow**: The legacy run-to-run drift generation workflow MUST be removed end-to-end (operation types, scheduling, UI affordances, and any related documentation/tests).
- **FR-119-P2-02 Drift entry point is Baseline Compare**: The Drift navigation entry MUST open the Baseline Compare landing and MUST no longer start or configure legacy drift generation. It MUST present Baseline Compare-driven drift status and a path to view findings.
- **FR-119-P2-03 Legacy drift findings deleted**: A one-time cleanup step MUST delete legacy drift findings so the dataset is not polluted by mixed evidence formats. Baseline Compare drift findings MUST remain intact.
- **FR-119-P2-03a Cleanup filter**: The cleanup step MUST delete all findings where `finding_type = drift` AND (`source` is null OR `source` is not equal to `baseline.compare`). It MUST NOT delete drift findings where `source = baseline.compare`.
- **FR-119-P2-04 No “legacy source” UI states**: The UI MUST NOT include badges, filters, or labels that reference the legacy drift generator after cutover.
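
The cleanup filter in FR-119-P2-03a needs an explicit null check, because in SQL `source <> 'baseline.compare'` does not match rows where `source` is NULL. The sketch below demonstrates this against an in-memory SQLite table; the table layout, the `legacy.run_to_run` label, and the `other` finding type are illustrative assumptions, and the real cleanup would run against the project's own database.

```python
import sqlite3

CLEANUP_SQL = """
DELETE FROM findings
WHERE finding_type = 'drift'
  AND (source IS NULL OR source <> 'baseline.compare')
"""


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway table with legacy, current, and non-drift rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE findings ("
        "id INTEGER PRIMARY KEY, finding_type TEXT NOT NULL, source TEXT)"
    )
    conn.executemany(
        "INSERT INTO findings (finding_type, source) VALUES (?, ?)",
        [
            ("drift", None),                 # legacy row without a source
            ("drift", "legacy.run_to_run"),  # hypothetical legacy label
            ("drift", "baseline.compare"),   # must survive
            ("other", None),                 # non-drift finding, must survive
        ],
    )
    return conn


def run_cleanup(conn: sqlite3.Connection) -> int:
    """Delete legacy drift findings and return how many rows were removed."""
    cursor = conn.execute(CLEANUP_SQL)
    conn.commit()
    return cursor.rowcount
```

On the demo table, `run_cleanup(make_demo_db())` removes the two legacy drift rows and leaves the `baseline.compare` row and the non-drift row in place.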

### Non-Functional Requirements

- **NFR-119-01 Determinism**: For the same baseline snapshot and the same current observed state, Baseline Compare MUST produce consistent drift outcomes and evidence labels (no nondeterministic “flapping” source/fidelity values).
- **NFR-119-02 Contract stability**: The drift evidence contract used for diff rendering MUST be stable across runs so operators do not experience regressions in how drift is explained.

## UI Action Matrix *(mandatory when Filament is changed)*

| Surface | Location | Header Actions | Inspect Affordance (List/Table) | Row Actions (max 2 visible) | Bulk Actions (grouped) | Empty-State CTA(s) | View Header Actions | Create/Edit Save+Cancel | Audit log? | Notes / Exemptions |
|---|---|---|---|---|---|---|---|---|---|---|
| Page | app/Filament/Pages/BaselineCompareLanding.php | “Compare Now” (confirmation; capability-gated) | Link to Operation Run detail | None | None | Single CTA when blocked: “Fix prerequisites” guidance | N/A | N/A | Yes | This is the Drift navigation entry point after cutover, and remains the only drift generation entry point. |
| Resource | app/Filament/Resources/FindingResource.php | “Triage all matching” (confirmation; capability-gated) | View action to open finding | “More” workflow actions | Bulk actions grouped under “More” | None (no create) | Workflow actions (capability-gated) | N/A | Yes | Evidence/diff is read-only; lifecycle actions unchanged by this spec. |

### Key Entities *(include if feature involves data)*

- **Drift Finding**: A tenant-owned governance record describing a deviation (type, severity/status, origin, evidence fidelity, provenance, evidence payload for diff rendering).
- **Baseline Compare Run**: An observable operation that compares baseline vs current state for a tenant and produces drift findings with counts and coverage/fidelity context.
- **Evidence Payload**: The structured explanation attached to a drift finding, including diff kind, baseline/current references (when available), provenance, and fidelity.
- **Policy Version**: An immutable snapshot/version reference used to support detailed diffs when content evidence exists.
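
As a rough illustration of the Evidence Payload contract, the sketch below checks a decoded `evidence_jsonb` document against the keys named in FR-119-P1-03 and FR-119-P1-04a. The validator itself and its error messages are assumptions; treating an absent `inventory_sync_run_id` as acceptable is one reading of “when applicable”.

```python
from typing import Any

ALLOWED_KINDS = {"policy_snapshot", "policy_assignments", "policy_scope_tags"}
ALLOWED_FIDELITY = {"content", "meta", "mixed"}
REQUIRED_PROVENANCE_KEYS = (
    "baseline_profile_id",
    "baseline_snapshot_id",
    "compare_operation_run_id",
)


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_evidence(evidence: dict[str, Any]) -> list[str]:
    """Return a list of contract violations; an empty list means valid."""
    errors = []
    if (evidence.get("summary") or {}).get("kind") not in ALLOWED_KINDS:
        errors.append("summary.kind is not an allowed diff kind")
    if evidence.get("fidelity") not in ALLOWED_FIDELITY:
        errors.append("fidelity is not an allowed value")
    for side in ("baseline", "current"):
        version_id = (evidence.get(side) or {}).get("policy_version_id")
        if version_id is not None and not _is_int(version_id):
            errors.append(f"{side}.policy_version_id must be int or null")
    provenance = evidence.get("provenance")
    if not isinstance(provenance, dict):
        errors.append("provenance block is missing")
        return errors
    for key in REQUIRED_PROVENANCE_KEYS:
        if not _is_int(provenance.get(key)):
            errors.append(f"provenance.{key} must be an int")
    sync_run_id = provenance.get("inventory_sync_run_id")
    if sync_run_id is not None and not _is_int(sync_run_id):
        errors.append("provenance.inventory_sync_run_id must be int or null")
    return errors
```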

## Success Criteria *(mandatory)*

### Measurable Outcomes

- **SC-119-01 Single-source drift**: 100% of drift findings shown to operators have `source = baseline.compare`; no UI surface references multiple drift sources.
- **SC-119-02 Diff UX preserved**: For drift findings whose required policy-version reference(s) are present, operators can open the finding and view an appropriate diff (settings / assignments / scope tags) without errors, including one-sided diffs for `missing_policy` and `unexpected_policy`; for findings that lack the required reference(s), including meta-only findings, operators see a clear “diff unavailable” explanation.
- **SC-119-03 Legacy workflow removed**: Operators cannot start, schedule, or configure legacy run-to-run drift generation anywhere in the product after deployment.
- **SC-119-04 Legacy dataset cleaned**: After the one-time cleanup step, zero legacy drift findings remain visible; Baseline Compare drift findings remain available.
- **SC-119-05 Timely visibility**: Drift findings from a completed Baseline Compare run are visible in the Findings list within 5 minutes of run completion under normal operating conditions.