- Enrich drift findings evidence_jsonb for diff UX (summary.kind, refs, fidelity, provenance)
- Add baseline policy version resolver and contract asserts
- Remove legacy drift generator + DriftLanding surfaces
- Add one-time cleanup migration for legacy drift findings
- Scope baseline capture/landing warnings to latest inventory sync
- Canonicalize compliance scheduledActionsForRule drift signal
Research — Drift Golden Master Cutover (Spec 119)
This document resolves planning unknowns and records implementation decisions for making Baseline Compare the single source of truth for drift findings while preserving the existing diff UI.
Decisions
1) Golden-master drift source
- Decision: All drift findings generated by Baseline Compare will use `findings.source = baseline.compare`.
- Rationale: This is the single “origin label” used across the spec and is already set in `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Jobs/CompareBaselineToTenantJob.php` when upserting findings.
- Alternatives considered:
  - Keep `source` nullable / optional → rejected because it enables mixed states and breaks the single-source contract.
2) Drift navigation entry point (post-cutover)
- Decision: The Drift navigation entry point becomes the Baseline Compare landing page (`/admin/t/{tenant}/baseline-compare-landing`).
- Rationale: This preserves a single operational entry point for drift generation and reduces duplicated UI “landing” surfaces.
- Alternatives considered:
  - Keep a separate Drift landing page and repurpose it → rejected (extra surface to maintain and re-explain).
3) Evidence contract for diff UX compatibility
- Decision: Baseline Compare drift findings will write the `evidence_jsonb` keys required by the existing diff renderer:
  - `summary.kind` with allowed values: `policy_snapshot`, `policy_assignments`, `policy_scope_tags`
  - `baseline.policy_version_id` and `current.policy_version_id` when content evidence exists
  - Explicit fidelity labeling + explicit compare provenance (baseline profile/snapshot + compare run id + inventory sync run id when available)
- Rationale: `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Filament/Resources/FindingResource.php` uses `summary.kind` to decide which diff UI to render and reads the policy version IDs from `baseline.policy_version_id` and `current.policy_version_id`.
- Alternatives considered:
  - Introduce a new diff UI for Baseline Compare evidence → rejected (scope; requires new UI + new contract).
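
The evidence contract above can be sketched as a small contract assert. This is an illustrative Python sketch, not the project's PHP implementation: `summary.kind` and its allowed values come from the decision, while the `fidelity` and `provenance.compare_run_id` key names are assumptions made for the example.

```python
from typing import Any

# Allowed values for summary.kind, as listed in the decision above.
ALLOWED_KINDS = {"policy_snapshot", "policy_assignments", "policy_scope_tags"}


def evidence_contract_errors(evidence: dict[str, Any]) -> list[str]:
    """Return contract violations; an empty list means the payload passes."""
    errors: list[str] = []

    kind = evidence.get("summary", {}).get("kind")
    if kind not in ALLOWED_KINDS:
        errors.append(f"summary.kind must be one of {sorted(ALLOWED_KINDS)}, got {kind!r}")

    # Hypothetical key names for the fidelity label and compare provenance.
    if not evidence.get("fidelity"):
        errors.append("fidelity label is missing")
    if "compare_run_id" not in evidence.get("provenance", {}):
        errors.append("provenance.compare_run_id is missing")

    # Policy version references are deliberately not required here:
    # they are only present when content evidence exists.
    return errors
```

Returning a list of violations rather than raising on the first one keeps the check usable both as a test assertion and as a log message.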
4) Diff renderability rule (avoid misleading empty diffs)
- Decision: Only render a detailed diff when both `baseline.policy_version_id` and `current.policy_version_id` are present; otherwise show “diff unavailable”.
- Rationale: The diff builder can otherwise compare empty/null versions and display misleading results; the spec requires an explicit “diff unavailable” explanation.
- Alternatives considered:
  - Render diffs even when one side is missing → rejected (misleading output; violates clarified rule).
8) One-sided diff rendering for policy presence changes
- Decision: Render diffs against an empty side for `missing_policy` (baseline-only reference) and `unexpected_policy` (current-only reference). Keep the stricter two-reference rule only for `different_version`.
- Rationale: Policy presence changes are easier to understand when operators can inspect the captured policy content that exists, instead of receiving a generic “diff unavailable” message.
- Alternatives considered:
  - Keep treating all single-reference findings as non-renderable → rejected (hides useful evidence even when one side is fully captured).
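
Taken together, decisions 4 and 8 amount to a small decision table. The Python sketch below is illustrative only: the change types `different_version`, `missing_policy`, and `unexpected_policy` come from the text, while the function name and the returned mode labels are hypothetical.

```python
from typing import Optional


def diff_render_mode(
    change_type: str,
    baseline_ref: Optional[int],
    current_ref: Optional[int],
) -> str:
    """Pick how a finding's diff should be rendered (mode labels are hypothetical)."""
    if change_type == "different_version":
        # Strict rule: both policy version references must be present.
        if baseline_ref is not None and current_ref is not None:
            return "two_sided"
        return "unavailable"
    if change_type == "missing_policy":
        # Baseline-only reference: diff the captured baseline against an empty side.
        return "baseline_only" if baseline_ref is not None else "unavailable"
    if change_type == "unexpected_policy":
        # Current-only reference: diff an empty side against the captured current content.
        return "current_only" if current_ref is not None else "unavailable"
    # Unknown change types never produce a detailed diff.
    return "unavailable"
```

For example, `diff_render_mode("missing_policy", 12, None)` selects the one-sided baseline view, whereas `diff_render_mode("different_version", 12, None)` falls back to “diff unavailable”.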
9) Baseline capture must ignore stale inventory rows
- Decision: When a latest completed Inventory Sync exists, Baseline Snapshot capture scopes `inventory_items` to that run before deriving `subject_key` matches.
- Rationale: Capture and Compare must agree on the same “current observed state” boundary; otherwise deleted/renamed policies from older syncs can create false `ambiguous_match` gaps and omit valid baseline subjects.
- Alternatives considered:
  - Continue scanning all tenant inventory rows during capture → rejected (nondeterministic snapshot gaps as historical rows accumulate).
  - Hard-fail capture when no completed Inventory Sync exists → deferred (larger product behavior change than this fix; current fallback remains acceptable).
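
A minimal Python sketch of the scoping rule, including the fallback when no completed sync exists. The field names `status`, `finished_at`, and `last_seen_run_id` are assumptions for illustration and may not match the real schema.

```python
def scope_to_latest_sync(items: list[dict], sync_runs: list[dict]) -> list[dict]:
    """Keep only inventory items observed by the latest completed sync run."""
    completed = [run for run in sync_runs if run["status"] == "completed"]
    if not completed:
        # Current fallback: without a completed sync, consider all tenant rows.
        return list(items)

    latest = max(completed, key=lambda run: run["finished_at"])
    # Rows last seen by older runs are stale and must not affect subject_key matching.
    return [item for item in items if item["last_seen_run_id"] == latest["id"]]
```

With two rows sharing a `subject_key`, one from an older run and one from the latest run, only the latter survives, so the key no longer looks ambiguous.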
10) Full-content compare must reuse same-run deduplicated evidence
- Decision: When the compare-time content capture fetches current policy content successfully but reuses an older identical `policy_version` row instead of inserting a new one, the compare run will consume that returned version directly as current evidence for the run.
- Rationale: The capture step has already validated current Graph content. Re-querying only by `captured_at >= snapshot.captured_at` misclassifies these successful deduplicated captures as `missing_current`, which incorrectly downgrades fidelity and emits `evidence_capture_incomplete`.
- Alternatives considered:
  - Always insert a new `policy_version` row per compare run → rejected (breaks immutable dedupe strategy and inflates storage).
  - Keep relying only on the post-capture `since` query → rejected (produces false partial-success outcomes when content is unchanged).
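
The resolution order described above can be sketched as follows. This is a hedged Python illustration; the function name and the shapes of `capture_result` and `versions_since_snapshot` are hypothetical.

```python
from typing import Optional


def resolve_current_version_id(
    capture_result: Optional[dict],
    versions_since_snapshot: list[dict],
) -> Optional[int]:
    """Resolve the current-side policy version for one compare run."""
    # 1) Prefer the version returned by this run's capture, even when it is a
    #    deduplicated row whose captured_at predates the baseline snapshot.
    if capture_result is not None and capture_result.get("policy_version_id") is not None:
        return capture_result["policy_version_id"]

    # 2) Otherwise fall back to the newest version captured since the snapshot.
    if versions_since_snapshot:
        newest = max(versions_since_snapshot, key=lambda v: v["captured_at"])
        return newest["id"]

    # 3) Nothing usable: the caller would classify this as missing_current.
    return None
```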
11) Landing-page duplicate warnings must use the latest sync boundary
- Decision: The Baseline Compare landing-page duplicate-name warning uses the latest completed Inventory Sync run when one exists, matching compare/capture subject selection.
- Rationale: Operators should not keep seeing a duplicate-name warning after the duplicate only survives in stale historical inventory rows; the landing page must reflect the same current boundary as the underlying compare logic.
- Alternatives considered:
  - Keep scanning all tenant inventory rows for the warning → rejected (UI keeps reporting already-resolved duplicates until stale rows are cleaned up out-of-band).
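
As an illustration, the warning check might look like the Python sketch below. The field names `display_name` and `last_seen_run_id`, and the case-insensitive name comparison, are assumptions rather than confirmed behavior.

```python
from collections import Counter
from typing import Optional


def duplicate_display_names(items: list[dict], latest_run_id: Optional[int]) -> list[str]:
    """Return normalized names that occur more than once within the sync boundary."""
    if latest_run_id is not None:
        # Same boundary as compare/capture: ignore rows from older sync runs.
        items = [i for i in items if i["last_seen_run_id"] == latest_run_id]

    # Assumed normalization: trim whitespace and compare case-insensitively.
    counts = Counter(i["display_name"].strip().lower() for i in items)
    return sorted(name for name, count in counts.items() if count > 1)
```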
12) Compliance noncompliance actions belong in the policy drift signal
- Decision: `deviceCompliancePolicy.scheduledActionsForRule` participates in `policy_snapshot` drift through a canonical semantic projection of each configured action.
- Rationale: A compliance policy’s security effect depends on both the rule and its enforcement timeline/consequences. Changing `gracePeriodHours`, removing `retire`, or swapping notification templates changes governance behavior and must produce drift.
- Alternatives considered:
  - Ignore noncompliance actions entirely → rejected (false negatives on meaningful governance changes).
  - Hash the raw Graph array directly → rejected (opaque IDs and order churn would create false positives).
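
One way to picture a canonical projection is the Python sketch below. It keeps the fields that change governance behavior, drops each item's opaque `id`, and sorts the result so array order does not matter. The choice of projected fields (including keeping `notificationTemplateId` as the template signal) is an assumption; the real projection may differ.

```python
import hashlib
import json


def project_scheduled_actions(scheduled_actions_for_rule: list[dict]) -> list[dict]:
    """Project Graph scheduledActionsForRule into an order-independent, ID-free form."""
    projected = []
    for rule in scheduled_actions_for_rule or []:
        for cfg in rule.get("scheduledActionConfigurations") or []:
            projected.append({
                "ruleName": rule.get("ruleName"),
                "actionType": cfg.get("actionType"),
                "gracePeriodHours": cfg.get("gracePeriodHours"),
                # Assumption: the template reference is treated as semantic.
                "notificationTemplateId": cfg.get("notificationTemplateId"),
            })
    # Sort by canonical JSON so reordering in Graph responses causes no drift.
    projected.sort(key=lambda action: json.dumps(action, sort_keys=True))
    return projected


def scheduled_actions_hash(scheduled_actions_for_rule: list[dict]) -> str:
    """Hash the canonical projection rather than the raw Graph array."""
    canonical = json.dumps(
        project_scheduled_actions(scheduled_actions_for_rule),
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Under this projection, reordering actions or regenerating item IDs leaves the hash unchanged, while editing `gracePeriodHours` or removing the `retire` action changes it.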
13) Expand the drift signal without forcing baseline recapture
- Decision: When baseline content provenance resolves to a tenant `policy_version`, Compare recomputes the effective baseline content hash from that immutable version instead of trusting only the stored snapshot hash.
- Rationale: Existing baseline snapshots were captured under older normalization semantics. Recomputing from the resolved baseline version keeps those snapshots comparable as the canonical drift signal expands, which avoids rollout-time false positives and avoids forcing operators to recapture unchanged baselines.
- Alternatives considered:
  - Require every tenant to recapture their baseline after signal changes → rejected (operationally brittle and easy to miss).
  - Keep comparing only the stored snapshot hash → rejected (old snapshots would flap as soon as the drift signal grows).
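
The following Python sketch illustrates why recomputing helps. The function names are hypothetical, and `normalize` stands in for whatever normalizer defines the current drift signal.

```python
import hashlib
import json
from typing import Any, Callable, Optional


def content_hash(normalized: Any) -> str:
    """Stable hash of already-normalized policy content."""
    canonical = json.dumps(normalized, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def effective_baseline_hash(
    stored_snapshot_hash: str,
    baseline_version_content: Optional[dict],
    normalize: Callable[[dict], Any],
) -> str:
    """Prefer a hash recomputed with today's normalizer over the stored one."""
    if baseline_version_content is None:
        # No resolvable policy_version: the stored snapshot hash is all we have.
        return stored_snapshot_hash
    # The resolved version is immutable, so re-normalizing it is safe and keeps
    # both sides of the comparison under the same signal definition.
    return content_hash(normalize(baseline_version_content))
```

If a snapshot was hashed under a normalizer that ignored a field and the current normalizer includes it, the stored hash no longer matches unchanged content, whereas the recomputed hash does.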
5) How policy version references are populated
- Decision:
  - Current-side `policy_version_id`: taken from content evidence (`ResolvedEvidence.meta.policy_version_id`) when content fidelity is used.
  - Baseline-side `policy_version_id`: resolved opportunistically for the same tenant policy when baseline-side evidence is content-based (e.g., via baseline-capture policy versions), otherwise set to null.
- Rationale: Baseline snapshots are workspace-owned and intentionally avoid persisting tenant-owned identifiers; the finding (tenant-owned) is the correct place to attach tenant-specific policy version references.
- Alternatives considered:
  - Persist baseline policy version IDs in baseline snapshots → rejected (violates scope/ownership model for workspace-owned snapshots).
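
A rough Python sketch of how the two references could be assembled on the finding. The evidence dictionaries, the `fidelity` values, and the `resolve_baseline_version_id` callback are hypothetical stand-ins; only `meta.policy_version_id` is taken from the text.

```python
from typing import Callable, Optional


def policy_version_refs(
    baseline_evidence: Optional[dict],
    current_evidence: Optional[dict],
    resolve_baseline_version_id: Callable[[dict], Optional[int]],
) -> dict:
    """Build the baseline/current policy version references for a finding."""
    current_id = None
    if current_evidence and current_evidence.get("fidelity") == "content":
        # Current side: read directly from the content evidence metadata.
        current_id = current_evidence.get("meta", {}).get("policy_version_id")

    baseline_id = None
    if baseline_evidence and baseline_evidence.get("fidelity") == "content":
        # Baseline side: best-effort lookup at compare time; may return None.
        baseline_id = resolve_baseline_version_id(baseline_evidence)

    return {
        "baseline": {"policy_version_id": baseline_id},
        "current": {"policy_version_id": current_id},
    }
```

Resolving the baseline reference at compare time keeps tenant-owned identifiers out of the workspace-owned snapshot.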
6) Legacy drift findings deletion criteria
- Decision: One-time cleanup deletes drift findings where `source` is null or not equal to `baseline.compare` (scoped to `finding_type = drift`), and keeps `source = baseline.compare` rows.
- Rationale: Legacy drift generator rows often have `source = NULL`; this filter removes mixed evidence formats without risking Baseline Compare drift data.
- Alternatives considered:
  - Delete by “old evidence shape” heuristics only → rejected (brittle; source is the canonical differentiator post-cutover).
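
The filter needs an explicit `IS NULL` branch, because in SQL `source <> 'baseline.compare'` evaluates to unknown for NULL rows and would skip them. The Python sketch below demonstrates the predicate against SQLite for convenience; the production database and the actual migration are not shown here, and the `findings` table layout is reduced to the columns named in the decision.

```python
import sqlite3

# Scoped to drift findings; the NULL check is required because
# `source <> 'baseline.compare'` alone never matches NULL rows.
CLEANUP_SQL = """
DELETE FROM findings
WHERE finding_type = 'drift'
  AND (source IS NULL OR source <> 'baseline.compare')
"""


def run_cleanup(conn: sqlite3.Connection) -> int:
    """Delete legacy drift findings and return the number of removed rows."""
    cursor = conn.execute(CLEANUP_SQL)
    conn.commit()
    return cursor.rowcount
```

Non-drift findings are left alone regardless of their `source`, and Baseline Compare drift rows survive.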
7) Legacy drift generator removal scope
- Decision: Remove legacy run-to-run drift generation end-to-end:
  - `GenerateDriftFindingsJob` + generator-only services
  - Drift landing UI surface that triggers legacy drift generation
  - Operation run type catalog entries and any related UI/widget/alert producer references
  - Legacy tests that assert drift generation dispatch/notifications
- Rationale: Hard cut means no dual-write/no feature flags; leaving legacy entry points risks reintroducing “two truths”.
- Alternatives considered:
  - Leave legacy components present but unreachable → rejected (dead code + drift risk).
Notes / Repo Facts Used
- Baseline Compare upserts findings and already hard-sets `source = baseline.compare` in `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Jobs/CompareBaselineToTenantJob.php`.
- The existing diff UI reads the following keys in `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Filament/Resources/FindingResource.php`:
  - `evidence_jsonb.summary.kind`
  - `evidence_jsonb.baseline.policy_version_id`
  - `evidence_jsonb.current.policy_version_id`
- Content evidence already carries `policy_version_id` in `ResolvedEvidence.meta` via `/Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Services/Baselines/Evidence/ContentEvidenceProvider.php`.