Research — Spec 118 Golden Master Deep Drift v2

This document resolves planning unknowns for implementing /Users/ahmeddarrazi/Documents/projects/TenantAtlas/specs/118-baseline-drift-engine/spec.md in the existing Laravel + Filament codebase.

Decision 1 — Full-content evidence capture orchestration

Decision: Introduce a dedicated “baseline content capture” phase that can be invoked from both baseline capture and baseline compare:

Baseline capture (baseline_capture run): capture evidence needed to build a content-fidelity baseline snapshot (as budget allows).
Baseline compare (baseline_compare run): refresh current evidence before drift evaluation (as budget allows).

The phase reuses the existing Intune capture orchestration (/Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Services/Intune/PolicyCaptureOrchestrator.php) so we do not introduce a second capture implementation.

Rationale:

Aligns with Spec 118 goal: deep drift by default, without per-policy manual capture.
Keeps a single source of truth for content capture (policy payload + assignments + scope tags).
Makes quota management, retries, and resumability explicit at the operation level.

Alternatives considered:

Opportunistic capture only (rejected: repeats the fragility of Spec 117; a “no drift” result can still mask a silent failure).
UI-driven per-policy capture (rejected: explicitly outside the UX goals).

Decision 2 — PolicyVersion purpose tagging + run traceability

Decision: Extend policy_versions with baseline-purpose attribution:

capture_purpose: backup | baseline_capture | baseline_compare
operation_run_id (nullable): link to the run that captured the version
baseline_profile_id (nullable): link to the baseline profile, set for baseline_* captures

Rationale:

Enables audit/debug (“which run produced this evidence, for what purpose?”) without introducing a separate evidence table.
Supports idempotency and “resume capture” semantics (skip already-captured subjects for the same run/purpose).
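The attribution columns and the "skip already-captured subjects" rule can be sketched as follows. This is a minimal, language-neutral illustration in Python rather than the project's PHP; the `PolicyVersion` shape, the `policy_id` field, and the `pending_subjects` helper are hypothetical names chosen for the example, not part of the existing codebase.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Optional


class CapturePurpose(str, Enum):
    # The three values listed for capture_purpose above.
    BACKUP = "backup"
    BASELINE_CAPTURE = "baseline_capture"
    BASELINE_COMPARE = "baseline_compare"


@dataclass(frozen=True)
class PolicyVersion:
    policy_id: str                             # hypothetical subject identifier
    capture_purpose: CapturePurpose
    operation_run_id: Optional[int] = None     # nullable: run that captured the version
    baseline_profile_id: Optional[int] = None  # nullable: set for baseline_* captures


def pending_subjects(
    policy_ids: Iterable[str],
    versions: Iterable[PolicyVersion],
    run_id: int,
    purpose: CapturePurpose,
) -> List[str]:
    """Return the subjects that still need capture for this run and purpose."""
    already_captured = {
        v.policy_id
        for v in versions
        if v.operation_run_id == run_id and v.capture_purpose == purpose
    }
    return [p for p in policy_ids if p not in already_captured]
```

Because the purpose and run ID live in real columns, the same filter could be expressed as an indexed query, which is the main argument against storing the purpose only in metadata.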

Alternatives considered:

Store purpose only in policy_versions.metadata (rejected: harder to index/query; weaker guardrails).
Create an EvidenceItems model now (rejected: explicitly not required in Spec 118).

Decision 3 — Golden Master subject matching across tenants

Decision: Treat the Golden Master “subject identity” as a cross-tenant match key derived from policy display name:

Subject match key: policy_type + normalized_display_name
normalized_display_name rules: trim leading/trailing whitespace, collapse internal whitespace to single spaces, lowercase.

Implementation uses a dedicated snapshot-item field (e.g., baseline_snapshot_items.subject_key) for matching, while preserving tenant-specific external IDs separately for evidence resolution.
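The normalization rules above are small enough to sketch directly. The example is in Python for brevity (the real implementation would live in the PHP codebase), and the `|` separator between the two key parts is an assumption: the decision only states that the key combines policy_type and normalized_display_name.

```python
import re


def normalize_display_name(display_name: str) -> str:
    """Trim the ends, collapse internal whitespace runs to one space, lowercase."""
    return re.sub(r"\s+", " ", display_name.strip()).lower()


def subject_key(policy_type: str, display_name: str) -> str:
    # The "|" separator is an assumption made for this sketch.
    return f"{policy_type}|{normalize_display_name(display_name)}"
```

Under these rules, "  Windows   Baseline Policy " and "windows baseline policy" map to the same key within a policy type, so cosmetic renames do not break cross-tenant matching.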

Ambiguous/missing match handling:

Missing match in current tenant → eligible for “missing policy” (only with coverage proof).
Multiple matches for the same key within a tenant/type → record evidence gap and suppress drift evaluation for that subject key (no finding).
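The two handling rules can be read as a small classification step that runs before drift evaluation. The sketch below is in Python and uses hypothetical outcome labels (`compare`, `missing_policy`, `evidence_gap_ambiguous`, `not_evaluated`); the `not_evaluated` outcome for a missing match without coverage proof is an assumption, since the text only says that "missing policy" requires coverage proof.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def classify_subjects(
    baseline_keys: Iterable[str],
    current_policies: Iterable[Tuple[str, str]],  # (subject_key, tenant external ID)
    coverage_proven: bool,
) -> Dict[str, str]:
    """Decide, per baseline subject key, how the compare step should treat it."""
    matches_by_key: Dict[str, List[str]] = defaultdict(list)
    for key, external_id in current_policies:
        matches_by_key[key].append(external_id)

    outcomes: Dict[str, str] = {}
    for key in baseline_keys:
        match_count = len(matches_by_key.get(key, []))
        if match_count == 1:
            outcomes[key] = "compare"                 # unambiguous: evaluate drift
        elif match_count > 1:
            outcomes[key] = "evidence_gap_ambiguous"  # record gap, emit no finding
        elif coverage_proven:
            outcomes[key] = "missing_policy"          # eligible for a finding
        else:
            outcomes[key] = "not_evaluated"           # assumption: no coverage proof
    return outcomes
```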

Rationale:

Baselines are workspace-owned and can be assigned to multiple tenants; external IDs are tenant-specific and cannot be used for cross-tenant matching.
The match key keeps snapshot items free of tenant identifiers while enabling consistent comparisons.

Alternatives considered:

Match by tenant external ID (rejected: breaks cross-tenant baseline assignment).
Require per-tenant baseline snapshots (rejected for Spec 118: changes product semantics and assignment UX).
Introduce an explicit mapping table (rejected for R1: higher effort and requires operational UX not described in spec).

Decision 4 — Quota-aware capture + resumable token

Decision: Evidence capture is bounded and resumable:

Enforce per-run limits (max items, max concurrency, max retry attempts).
Store an opaque “resume token” in operation_runs.context when a run cannot complete within budget.
Provide a “Resume capture” UI action that starts a follow-up run continuing from that token.
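One way to picture the bounded, resumable loop is shown below. It is a simplified Python sketch under several assumptions: subjects are processed sequentially in a stable order (concurrency limits are left out), the token is a base64-encoded JSON offset, and `run_capture`, `encode_resume_token`, and `decode_resume_token` are hypothetical names. The actual token format and storage in operation_runs.context are not specified here.

```python
import base64
import json
from typing import Callable, Dict, List, Optional


def encode_resume_token(offset: int) -> str:
    # Opaque to callers; the offset-based payload is an assumption of this sketch.
    return base64.urlsafe_b64encode(json.dumps({"offset": offset}).encode()).decode()


def decode_resume_token(token: Optional[str]) -> int:
    if not token:
        return 0
    return json.loads(base64.urlsafe_b64decode(token.encode()))["offset"]


def run_capture(
    subjects: List[str],
    capture: Callable[[str], None],
    max_items: int,
    max_attempts: int,
    resume_token: Optional[str] = None,
) -> Dict[str, object]:
    """Capture at most max_items subjects, retrying each up to max_attempts times."""
    start = decode_resume_token(resume_token)
    end = min(start + max_items, len(subjects))
    captured: List[str] = []
    failed: List[str] = []
    for subject in subjects[start:end]:
        for _ in range(max_attempts):
            try:
                capture(subject)
            except Exception:
                continue          # retry until attempts are exhausted
            captured.append(subject)
            break
        else:
            failed.append(subject)  # partial failure: record it and move on
    # A token is returned only when work remains outside this run's budget.
    next_token = encode_resume_token(end) if end < len(subjects) else None
    return {"captured": captured, "failed": failed, "resume_token": next_token}
```

A follow-up run started from the "Resume capture" action would pass the stored token back in and continue from the first subject that was not attempted.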

Rationale:

Large tenants/scopes must not create uncontrolled queue storms or long-running jobs.
Operators need explicit visibility into “what was captured vs skipped” and a safe path to completion.

Alternatives considered:

“Always finish no matter what” (rejected: risks rate limiting and operational instability).
Mark the run as failed on any capture failure (rejected: Spec 118 allows partial failure with warnings).

Decision 5 — Ops-UX + run context contract (“Why no findings?”)

Decision: Baseline runs explicitly populate:

context.target_scope (required for Monitoring run detail; avoids “No target scope details…”)
context.effective_scope + context.capture_mode
evidence capture stats + gaps + reason codes when subjects processed = 0 or findings = 0

Keep summary_counts numeric-only and limited to keys from /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Support/OpsUx/OperationSummaryKeys.php; store richer detail in context.
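The split between summary_counts and context might look like the sketch below. It is written in Python for illustration only: the allowed key set is a placeholder (the real list lives in OperationSummaryKeys.php and is not reproduced here), and the `evidence_capture` and `reason_codes` context keys, the scope shapes, and the `build_run_payload` helper are hypothetical.

```python
from typing import Any, Dict, Iterable, List

# Placeholder whitelist; the authoritative keys are defined in OperationSummaryKeys.php.
ALLOWED_SUMMARY_KEYS = {"total", "processed", "succeeded", "failed", "skipped"}


def build_run_payload(
    target_scope: Dict[str, Any],
    effective_scope: Dict[str, Any],
    capture_mode: str,
    counts: Dict[str, Any],
    findings: int,
    evidence_gaps: Iterable[str],
    reason_codes: Iterable[str],
) -> Dict[str, Dict[str, Any]]:
    """Keep summary_counts numeric and whitelisted; put richer detail in context."""
    summary_counts = {
        key: value
        for key, value in counts.items()
        if key in ALLOWED_SUMMARY_KEYS
        and isinstance(value, int)
        and not isinstance(value, bool)
    }
    context: Dict[str, Any] = {
        "target_scope": target_scope,
        "effective_scope": effective_scope,
        "capture_mode": capture_mode,
        "evidence_capture": {"stats": dict(counts), "gaps": list(evidence_gaps)},
    }
    # Explain empty outcomes so "0 findings" is never ambiguous to the operator.
    if counts.get("processed", 0) == 0 or findings == 0:
        context["reason_codes"] = list(reason_codes)
    return {"summary_counts": summary_counts, "context": context}
```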

Rationale:

Eliminates ambiguous “0 findings” outcomes and improves operator trust.
Conforms to Ops-UX 3-surface feedback contract and Monitoring expectations.

Alternatives considered:

Put details into summary_counts (rejected: key whitelist contract).
Only log details (rejected: operators need UI visibility).

Notes on current codebase (facts observed)

Baseline capture run creation: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Services/Baselines/BaselineCaptureService.php
Baseline compare run creation: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Services/Baselines/BaselineCompareService.php
Capture job (currently opportunistic content): /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Jobs/CaptureBaselineSnapshotJob.php
Compare job (provider-chain evidence): /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Jobs/CompareBaselineToTenantJob.php
Evidence providers + resolver: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Services/Baselines/CurrentStateHashResolver.php and /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Services/Baselines/Evidence/*
Monitoring target scope rendering expects context.target_scope: /Users/ahmeddarrazi/Documents/projects/TenantAtlas/app/Filament/Resources/OperationRunResource.php
