Ahmed Darrazi 04d61cbad0 feat: baseline drift engine v1

- Implement Spec 116 baseline capture/compare + coverage guard
- Add UI surfaces and widgets for baseline compare
- Add tests and research report

2026-03-02 23:01:39 +01:00


Golden Master / Baseline Drift — Deep Settings-Drift (Content-Fidelity) Analysis

Enterprise Research Report for TenantAtlas / TenantPilot
Date: 2025-07-15
Scope: Architecture, code evidence, implementation proposal

Executive Summary
System Map — Side-by-Side Comparison
Architecture Decision Record (ADR-001): Unify vs Separate
Deep-Dive: Why Settings Changes Don't Produce Baseline Drift
Code Evidence Table
Type Coverage Matrix
Proposal: Deep Drift Implementation Plan
Test Plan (Enterprise)
Open Questions / Assumptions
Key Questions Answered (KQ-01 through KQ-06)

1. Executive Summary

Two parallel drift systems exist: Baseline Compare (meta fidelity, inventory-sourced) and Backup Drift (content fidelity, PolicyVersion-sourced). They share DriftHasher but are otherwise separate data paths with separate finding generators.
The core gap: CompareBaselineToTenantJob hashes InventoryMetaContract v1 — which contains only odata_type, etag, scope_tag_ids, assignment_target_count — never actual policy settings. When an admin changes a Wi-Fi password or a compliance threshold in Intune, none of these meta signals necessarily change.
Inventory sync uses Graph LIST endpoints, which return metadata and display fields only. Per-item GET (which fetches settings, assignments, scope tags) is only performed during Backup via PolicyCaptureOrchestrator.
DriftFindingGenerator (the backup drift system) does detect settings changes — it normalizes PolicyVersion.snapshot via SettingsNormalizer → PolicyNormalizer::flattenForDiff() → type-specific normalizers, then hashes with DriftHasher.
Spec 116 already designs v2 with a provider precedence chain (PolicyVersion → Inventory content → Meta fallback), which is the correct architectural direction. The v1 meta baseline shipped first as a deliberate, safe-to-ship initial milestone.
Unification is recommended (provider chain approach) — not merging the two jobs, but enabling CompareBaselineToTenantJob to optionally consume PolicyVersion snapshots as a content-fidelity provider, falling back to InventoryMetaContract when no PolicyVersion is available.
28 supported policy types are registered in tenantpilot.php, plus 3 foundation types. Of these, 10+ have complex hydration (settings catalog, group policy, security baselines, compliance actions) and would benefit most from deep-drift detection.
The etag signal is unreliable as a settings-change proxy: Microsoft Graph etag semantics vary per resource type, and etag may or may not change when settings are modified. It is useful as a hint but not a guarantee.
API cost is the primary constraint: content-fidelity compare requires per-item GET calls (or a recent Backup that already captured PolicyVersions). The hybrid provider chain avoids this by opportunistically reusing existing PolicyVersions without requiring a full backup before every compare.
Coverage Guard is critical for v2: the baseline system must know which types have fresh PolicyVersions and suppress content-fidelity findings for types where no recent version exists (falling back to meta fidelity).
Risk profile: Shipping deep drift for types that lack proper per-type normalization could produce false positives. Type-specific normalizers already exist for the backup drift path; reusing them is safe.
Recommended phasing: v1.5 (current sprint) = add content_hash_source column to baseline_snapshot_items + provider chain in compare job. v2.0 = on-demand per-item GET during baseline capture for types lacking recent PolicyVersions.

2. System Map

Side-by-Side Comparison Table

Dimension	System A: Baseline Compare	System B: Backup Drift
Entry point	`CompareBaselineToTenantJob`	`GenerateDriftFindingsJob` → `DriftFindingGenerator`
Data source (current)	`InventoryItem` (from LIST sync)	`PolicyVersion` (from per-item GET backup)
Data source (baseline)	`BaselineSnapshotItem` (captured from inventory)	Earlier `PolicyVersion` from prior `OperationRun`
Hash contract	`InventoryMetaContract` v1 → `DriftHasher::hashNormalized()`	`SettingsNormalizer` → `PolicyNormalizer::flattenForDiff()` → `DriftHasher::hashNormalized()`
Hash inputs	`version`, `policy_type`, `subject_external_id`, `odata_type`, `etag`, `scope_tag_ids`, `assignment_target_count`	Full `PolicyVersion.snapshot` JSON (with volatile key removal)
Fidelity	`meta` (persisted as `fidelity='meta'` in snapshot context)	`content` (settings + assignments + scope_tags)
Dimensions detected	`missing_policy`, `different_version`, `unexpected_policy`	`policy_snapshot` (added/removed/modified), `policy_assignments` (modified), `policy_scope_tags` (modified)
Finding identity	`recurrence_key = sha256(tenantId|snapshotId|policyType|extId|changeType)`	`recurrence_key = sha256(drift:tenantId:scopeKey:subjectType:extId:dimension)`
Scope key	`baseline_profile:{profileId}`	`DriftScopeKey::fromSelectionHash()`
Auto-close	`BaselineAutoCloseService` (stale finding resolution)	`resolveStaleDriftFindings()` within `DriftFindingGenerator`
Coverage guard	`InventoryCoverage::fromContext()` → uncovered types → partial outcome	None (trusts backup captured all types)
Graph API calls	Zero at compare time (reads from DB)	Zero at compare time (reads PolicyVersions from DB)
Graph API calls (capture)	Zero (inventory sync did LIST)	Per-item GET via `PolicyCaptureOrchestrator`
Normalizer pipeline	None (meta contract is the normalization)	`SettingsNormalizer` → `PolicyNormalizer` → type normalizers
Shared components	`DriftHasher`, `Finding` model	`DriftHasher`, `Finding` model
Trigger	After inventory sync, on schedule/manual	After backup, on schedule/manual
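The two "Finding identity" rows can be sketched as plain functions. This is a minimal illustration that takes the delimiters literally from the table; the function names are hypothetical and the real implementations (for example `DriftFindingGenerator`'s recurrence key) may order or encode the fields differently.

```php
<?php
declare(strict_types=1);

// Hypothetical helpers mirroring the "Finding identity" row of the table.

/** System A: baseline compare identity. The item hash is not part of the key. */
function baselineRecurrenceKey(int $tenantId, int $snapshotId, string $policyType, string $extId, string $changeType): string
{
    return hash('sha256', implode('|', [$tenantId, $snapshotId, $policyType, $extId, $changeType]));
}

/** System B: backup drift identity, scoped by scope key and drift dimension. */
function driftRecurrenceKey(int $tenantId, string $scopeKey, string $subjectType, string $extId, string $dimension): string
{
    return hash('sha256', implode(':', ['drift', $tenantId, $scopeKey, $subjectType, $extId, $dimension]));
}
```

Because neither key includes the content hash itself, a finding for the same policy and change type keeps its identity when the hash value changes between runs.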

Data Flow Diagrams

SYSTEM A — Baseline Compare (Meta Fidelity)
============================================
Graph LIST ──► InventorySyncService ──► InventoryItem (meta_jsonb)
                                             │
                                             ▼
                                    CaptureBaselineSnapshotJob
                                    ├─ InventoryMetaContract.build()
                                    ├─ DriftHasher.hashNormalized()
                                    └─► BaselineSnapshotItem (baseline_hash)
                                             │
                                             ▼
                                    CompareBaselineToTenantJob
                                    ├─ loadCurrentInventory() → InventoryItem
                                    ├─ BaselineSnapshotIdentity.hashItemContent()
                                    │   └─ InventoryMetaContract.build()
                                    │   └─ DriftHasher.hashNormalized()
                                    ├─ computeDrift() → hash compare
                                    └─ upsertFindings() → Finding records


SYSTEM B — Backup Drift (Content Fidelity)
============================================
Graph GET ──► PolicySnapshotService.fetch() ──► full JSON snapshot
                     │
                     ▼
              PolicyCaptureOrchestrator.capture()
              ├─ assignments GET
              ├─ scope tags resolve
              └─► VersionService.captureVersion() ──► PolicyVersion
                                                         │
                                                         ▼
                                               DriftFindingGenerator.generate()
                                               ├─ versionForRun() → baseline/current PV
                                               ├─ SettingsNormalizer.normalizeForDiff()
                                               │   └─ PolicyNormalizer.flattenForDiff()
                                               ├─ DriftHasher.hashNormalized() × 3
                                               │   (snapshot, assignments, scope_tags)
                                               └─ upsertDriftFinding() → Finding records

3. ADR-001: Unify vs Separate

Title

ADR-001: Golden Master Baseline Compare — Provider Chain for Content Fidelity

Status

PROPOSED

Context

TenantPilot has two drift detection systems that evolved independently:

System A (Baseline Compare): Designed for the "does the tenant still match the golden master?" use case. Ships with meta fidelity (v1): fast, cheap, zero additional Graph calls at compare time. Detects structural drift (policy added/removed/meta-changed) but is blind to settings changes.
System B (Backup Drift): Designed for the "what changed between two backup points?" use case. Content fidelity: full PolicyVersion snapshots with per-type normalization. Detects settings, assignments, and scope tag changes.

The two systems cannot be merged into one without fundamentally changing their triggering, scoping, and API cost models. However, System A's accuracy can be dramatically improved by consuming data already produced by System B.

Decision

Adopt the Provider Chain pattern as already designed in Spec 116 v2:

ContentProvider = PolicyVersion → InventoryContent → MetaFallback

Specifically:

CompareBaselineToTenantJob gains a ContentProviderChain that, for each (policy_type, external_id):
- First: Looks for a PolicyVersion captured since the last baseline snapshot timestamp. If found, normalizes via SettingsNormalizer → DriftHasher → returns content fidelity hash.
- Second (future): Looks for enriched inventory content if inventory sync is upgraded to capture settings (v2.0+).
- Fallback: Builds InventoryMetaContract v1 → DriftHasher → returns meta fidelity hash.
Each baseline snapshot item records its fidelity (meta | content) and content_hash_source (inventory_meta_v1 | policy_version:{id} | inventory_content_v2).
Compare findings carry fidelity in evidence, enabling UI to display confidence level.
Coverage Guard is extended: a type is content-covered only if PolicyVersions exist for ≥N% of items. Below that threshold, the type falls back to meta fidelity; its findings are not suppressed.
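The coverage rule in the last point can be sketched as a small pure function. This is a sketch under stated assumptions: the function name, the input shape (`total` and `with_version` counts per type), and the threshold parameter are all hypothetical; the report only fixes the "≥N%" rule itself.

```php
<?php
declare(strict_types=1);

/**
 * Decide per policy type whether compare may use content fidelity.
 *
 * @param array<string, array{total: int, with_version: int}> $countsByType hypothetical input shape
 * @return array<string, string> policy_type => 'content' | 'meta'
 */
function resolveFidelityPerType(array $countsByType, float $minCoveragePercent): array
{
    $result = [];
    foreach ($countsByType as $policyType => $counts) {
        // Cross-multiply instead of dividing to avoid a division by zero for empty types.
        $covered = $counts['total'] > 0
            && $counts['with_version'] * 100 >= $minCoveragePercent * $counts['total'];

        // Below the threshold the type falls back to meta fidelity; it is not dropped.
        $result[$policyType] = $covered ? 'content' : 'meta';
    }

    return $result;
}
```

For example, with a threshold of 80, a type with 9 of 10 items backed by PolicyVersions resolves to `content`, while a type with 5 of 10 resolves to `meta`.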

Consequences

Positive: No new Graph API calls needed (reuses existing PolicyVersions from backups). Zero additional infrastructure. Incremental rollout per policy type. Existing meta-fidelity behavior preserved as fallback.
Negative: Content fidelity depends on backup recency. If a tenant hasn't been backed up, only meta fidelity is available. Could create "mixed fidelity" findings within a single compare run.
Rejected Alternative: Full merge of System A and B into a single system. Rejected because they serve different use cases (golden master comparison vs point-in-time drift), have different scoping models (BaselineProfile vs selection_hash), and different triggering models (post-inventory-sync vs post-backup).
Rejected Alternative: Always-GET during baseline compare. Rejected due to API cost (30+ types × 100s of policies = 1000s of GET calls per tenant per compare run).

Compliance Notes

Livewire v4.0+ / Filament v5: no UI changes in core ADR; provider chain is purely backend.
Provider registration: n/a (backend services only).
No destructive actions.
Asset strategy: no new assets.

4. Deep-Dive: Why Settings Changes Don't Produce Baseline Drift

The Root Cause Chain

Step 1: Inventory Sync captures only LIST metadata

InventorySyncService::executeSelectionUnderLock() (line ~340-450) calls Graph LIST endpoints. For each policy, it extracts:

display_name, category, platform (display fields)
odata_type, etag, scope_tag_ids, assignment_target_count (meta signals)

These are stored in InventoryItem.meta_jsonb. No settings values are fetched or stored.

Step 2: Baseline Capture hashes only the Meta Contract

CaptureBaselineSnapshotJob::collectSnapshotItems() reads from InventoryItem, then calls BaselineSnapshotIdentity::hashItemContent():

// BaselineSnapshotIdentity.php, line 56-67
public function hashItemContent(string $policyType, string $subjectExternalId, array $metaJsonb): string
{
    $contract = $this->metaContract->build(
        policyType: $policyType,
        subjectExternalId: $subjectExternalId,
        metaJsonb: $metaJsonb,
    );
    return $this->hasher->hashNormalized($contract);
}

The InventoryMetaContract::build() output is:

[
    'version'                => 1,
    'policy_type'            => 'settingsCatalogPolicy',
    'subject_external_id'    => '<guid>',
    'odata_type'             => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag'                   => '"abc..."',           // ← unreliable change indicator
    'scope_tag_ids'          => ['0'],
    'assignment_target_count' => 3,
]

This is ALL that gets hashed. Actual policy settings (the Wi-Fi password, the compliance threshold, the firewall rule) are nowhere in this contract.

Step 3: Baseline Compare re-computes the same meta hash

CompareBaselineToTenantJob::loadCurrentInventory() (line 367-409) reads current InventoryItem records and calls the same BaselineSnapshotIdentity::hashItemContent() with the same InventoryMetaContract, producing the same hash structure.

computeDrift() (line 435-500) then compares baseline_hash vs current_hash:

if ($baselineItem['baseline_hash'] !== $currentItem['current_hash']) {
    $drift[] = ['change_type' => 'different_version', /* ... */];
}

If the admin changed a policy setting but the meta signals (etag, scope_tag_ids, assignment_target_count) stayed the same, baseline_hash === current_hash and NO drift is detected.
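The blind spot can be reproduced in isolation. The sketch below uses simplified, hypothetical stand-ins for `InventoryMetaContract::build()` and `DriftHasher::hashNormalized()`; the `settings` key is a thought experiment only, since real inventory rows do not carry settings at all.

```php
<?php
declare(strict_types=1);

// Simplified stand-in for InventoryMetaContract::build(); the field names follow
// the contract shown above, the function itself is hypothetical.
function buildMetaContract(string $policyType, string $externalId, array $meta): array
{
    $scopeTagIds = $meta['scope_tag_ids'] ?? [];
    sort($scopeTagIds);

    return [
        'version' => 1,
        'policy_type' => $policyType,
        'subject_external_id' => $externalId,
        'odata_type' => $meta['odata_type'] ?? null,
        'etag' => $meta['etag'] ?? null,
        'scope_tag_ids' => $scopeTagIds,
        'assignment_target_count' => $meta['assignment_target_count'] ?? null,
    ];
}

// Simplified stand-in for the shared hasher.
function metaHash(array $contract): string
{
    return hash('sha256', json_encode($contract, JSON_THROW_ON_ERROR));
}

$meta = [
    'odata_type' => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag' => '"abc"',
    'scope_tag_ids' => ['0'],
    'assignment_target_count' => 3,
];

// Same policy before and after an admin edit: settings differ, meta signals do not.
$before = $meta + ['settings' => ['wifi_password' => 'old']];
$after = $meta + ['settings' => ['wifi_password' => 'new']];

$baselineHash = metaHash(buildMetaContract('settingsCatalogPolicy', 'guid-1', $before));
$currentHash = metaHash(buildMetaContract('settingsCatalogPolicy', 'guid-1', $after));

var_dump($baselineHash === $currentHash); // bool(true): the settings change is invisible
```

The hashes only diverge if one of the seven contract fields changes, for example if Graph happens to bump the etag.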

Why etag is unreliable

Microsoft Graph etag behavior varies by resource type:

Some types update etag on any property change (including settings)
Some types update etag only on top-level property changes (not nested settings)
Settings Catalog policies may or may not update the parent resource etag when child settings are modified (the settings are a separate subresource at /configurationPolicies/{id}/settings)
Group Policy Configurations have settings in definitionValues → presentationValues (multi-level nesting); etag at root level may not reflect these changes

The Contrast: How Backup Drift Does Detect Settings Changes

DriftFindingGenerator::generate() (line 32-80) operates on PolicyVersion.snapshot — the full JSON captured via per-item GET:

$baselineSnapshot = $baselineVersion->snapshot;  // Full JSON from Graph GET
$currentSnapshot  = $currentVersion->snapshot;

$baselineNormalized = $this->settingsNormalizer->normalizeForDiff($baselineSnapshot, $policyType, $platform);
$currentNormalized  = $this->settingsNormalizer->normalizeForDiff($currentSnapshot, $policyType, $platform);

$baselineSnapshotHash = $this->hasher->hashNormalized($baselineNormalized);
$currentSnapshotHash  = $this->hasher->hashNormalized($currentNormalized);

if ($baselineSnapshotHash !== $currentSnapshotHash) {
    // → Drift finding with change_type = 'modified'
}

This pipeline captures actual settings values, normalizes them per policy type, strips volatile metadata, and hashes the result. If a setting changed, the hash changes, and drift is detected.
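A minimal sketch of the "normalize, strip volatile metadata, hash" idea follows. It is not the actual `SettingsNormalizer` or `DriftHasher` code: the volatile key list and both function names are assumptions chosen for illustration, and per-type normalization is omitted entirely. It requires PHP 8.1+ for `array_is_list()`.

```php
<?php
declare(strict_types=1);

// Assumed volatile keys, for illustration only; the real list is defined by the
// hasher and the per-type normalizers.
const VOLATILE_KEYS = ['@odata.etag', 'lastModifiedDateTime', 'createdDateTime'];

/** Recursively drop volatile keys and sort associative keys. */
function normalizeForHash(array $value): array
{
    $out = [];
    foreach ($value as $key => $item) {
        if (is_string($key) && in_array($key, VOLATILE_KEYS, true)) {
            continue; // fields that change without a settings change
        }
        $out[$key] = is_array($item) ? normalizeForHash($item) : $item;
    }

    if (!array_is_list($out)) {
        ksort($out); // key order must not influence the hash
    }

    return $out;
}

function contentHash(array $snapshot): string
{
    return hash('sha256', json_encode(normalizeForHash($snapshot), JSON_THROW_ON_ERROR | JSON_UNESCAPED_SLASHES));
}
```

With this shape, two snapshots that differ only in key order or in `lastModifiedDateTime` hash identically, while a changed setting value yields a different hash. List order is preserved, so a reordered settings list would still count as a change unless a type normalizer sorts it first.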

Summary Visualization

Admin changes Wi-Fi password in Intune
          │
          ▼
┌─────────────────────────────────┐
│ Graph LIST (inventory sync)     │
│ returns: displayName, etag, ... │
│                                 │
│ etag MAY change, settings NOT   │
│ returned by LIST endpoint       │
└────────────┬────────────────────┘
             │
     ┌───────┴────────┐
     ▼                ▼
 InventoryItem    PolicyVersion
 (meta only)      (if backup ran)
     │                │
     ▼                ▼
 Meta Contract    Full Snapshot
 hash unchanged   hash CHANGED
     │                │
     ▼                ▼
 Baseline         Backup Drift:
 Compare:         "modified" ✅
 NO DRIFT ❌

5. Code Evidence Table

#	Class / File	Lines	Role	Key Finding
1	`app/Jobs/CompareBaselineToTenantJob.php`	785 total; L367-409 (loadCurrentInventory), L435-500 (computeDrift)	Core baseline compare job	Reads from `InventoryItem` only; hashes via `InventoryMetaContract` → blind to settings
2	`app/Services/Baselines/InventoryMetaContract.php`	75 total; L30-57 (build)	Meta hash contract builder	Hashes only: version, policy_type, external_id, odata_type, etag, scope_tag_ids, assignment_target_count — no settings content
3	`app/Services/Baselines/BaselineSnapshotIdentity.php`	73 total; L56-67 (hashItemContent)	Per-item hash via meta contract	Delegates to `InventoryMetaContract.build()` → `DriftHasher.hashNormalized()`
4	`app/Jobs/CaptureBaselineSnapshotJob.php`	305 total	Captures snapshot from inventory	Reads `InventoryItem`, stores `fidelity='meta'` and `source='inventory'`
5	`app/Services/Drift/DriftFindingGenerator.php`	484 total; L32-80 (generate), L250-267 (recurrenceKey)	Backup drift finding generator	Uses `PolicyVersion.snapshot` with `SettingsNormalizer` → detects settings changes
6	`app/Services/Drift/DriftHasher.php`	100 total; L13-24 (hashNormalized)	Shared hasher	`sha256(json_encode(normalized))` with volatile key removal. SHARED by both systems.
7	`app/Services/Drift/Normalizers/SettingsNormalizer.php`	22 total	Thin wrapper	Delegates to `PolicyNormalizer::flattenForDiff()`. Used by System B only.
8	`app/Services/Intune/PolicyNormalizer.php`	67 total	Type-specific normalizer router	Routes to per-type normalizers for diff operations
9	`app/Services/Inventory/InventorySyncService.php`	652 total; L340-450 (executeSelectionUnderLock)	LIST-based sync	Fetches from Graph LIST endpoints; extracts meta signals only; upserts `InventoryItem`
10	`app/Services/Intune/BackupService.php`	438 total	Backup orchestration	Creates `BackupSet`, uses `PolicyCaptureOrchestrator` for per-item GET → PolicyVersion
11	`app/Services/Intune/PolicyCaptureOrchestrator.php`	429 total	Per-item GET + hydration	Fetches full snapshot, assignments, scope tags; creates PolicyVersion with all content
12	`app/Services/Intune/PolicySnapshotService.php`	852 total	Per-item Graph GET	Type-specific hydration (hydrateConfigurationPolicySettings, hydrateGroupPolicyConfiguration, etc.)
13	`app/Services/Intune/VersionService.php`	312 total; L1-150 (captureVersion)	PolicyVersion persistence	Transactional, locking, consecutive version_number
14	`app/Models/PolicyVersion.php`	Model	PolicyVersion model	Casts: snapshot(array), assignments(array), scope_tags(array), plus hash columns
15	`app/Models/InventoryItem.php`	Model	Inventory item model	Casts: meta_jsonb(array) — no settings content
16	`app/Models/BaselineSnapshotItem.php`	Model	Snapshot item model	Has `baseline_hash(64)`, `meta_jsonb`
17	`app/Support/Inventory/InventoryCoverage.php`	173 total	Coverage parser	`fromContext()` extracts per-type status from sync run context
18	`app/Services/Drift/DriftRunSelector.php`	~60 total	Run pair selector	Selects 2 most recent sync runs with same `selection_hash` (System B only)
19	`app/Jobs/GenerateDriftFindingsJob.php`	~200 total	Dispatcher for System B	Dispatches `DriftFindingGenerator` for policy-version-based drift
20	`config/graph_contracts.php`	867 total	Policy type registry	Defines endpoints, hydration strategies, subresources, type families per policy type
21	`config/tenantpilot.php`	385 total; L18-293 (supported_policy_types)	Application config	28 supported policy types + 3 foundation types
22	`specs/116-baseline-drift-engine/spec.md`	237 total	Feature spec	Defines v1 (meta) and v2 (content fidelity) requirements
23	`specs/116-baseline-drift-engine/research.md`	200 total	Phase 0 research	6 key decisions including v2 architecture strategy
24	`specs/116-baseline-drift-engine/plan.md`	259 total	Implementation plan	Steps 1-7 for v1; v2 deferred

6. Type Coverage Matrix

Coverage assessment for deep-drift feasibility: which types have per-type normalization and hydration support?

#	`policy_type`	Label	Hydration	Subresources	Per-Type Normalizer	Deep-Drift Feasible	Notes
1	`settingsCatalogPolicy`	Settings Catalog Policy	`configurationPolicies`	`settings` (list)	Yes (via PolicyNormalizer)	YES	Most impactful — complex nested settings
2	`endpointSecurityPolicy`	Endpoint Security Policies	`configurationPolicies`	`settings` (list)	Yes (shared with settings catalog)	YES	Same endpoint family as settings catalog
3	`securityBaselinePolicy`	Security Baselines	`configurationPolicies`	`settings` (list)	Yes (shared)	YES	Same pipeline
4	`groupPolicyConfiguration`	Administrative Templates	`groupPolicyConfigurations`	`definitionValues` → `presentationValues`	Yes (via PolicyNormalizer)	YES	Multi-level nesting; hydration required
5	`deviceConfiguration`	Device Configuration	`deviceConfigurations`	None (properties on root)	Yes (via PolicyNormalizer)	YES	Properties directly on resource
6	`deviceCompliancePolicy`	Device Compliance	`deviceCompliancePolicies`	`scheduledActionsForRule` (expand)	Yes (via PolicyNormalizer)	YES	Actions subresource needs expand
7	`windowsUpdateRing`	Software Update Ring	`deviceConfigurations` (filtered)	None (properties on root)	Yes (shared with deviceConfig)	YES	Subset of deviceConfiguration
8	`appProtectionPolicy`	App Protection (MAM)	`managedAppPolicies`	None (properties)	Partial (via PolicyNormalizer)	YES	Mobile-focused
9	`conditionalAccessPolicy`	Conditional Access	`identity/conditionalAccess/policies`	None (properties)	Yes (via PolicyNormalizer)	YES	High-risk, preview-only restore
10	`deviceManagementScript`	PowerShell Scripts	`deviceManagementScripts`	None (scriptContent base64)	Partial	PARTIAL	Script content is base64 in snapshot
11	`deviceShellScript`	macOS Shell Scripts	`deviceShellScripts`	None (scriptContent base64)	Partial	PARTIAL	Same pattern as PS scripts
12	`deviceHealthScript`	Proactive Remediations	`deviceHealthScripts`	None	Partial	PARTIAL	Detection + remediation scripts
13	`deviceComplianceScript`	Custom Compliance Scripts	`deviceComplianceScripts`	None	Partial	PARTIAL	Script content
14	`windowsFeatureUpdateProfile`	Feature Updates	`windowsFeatureUpdateProfiles`	None	Yes	YES	Simple properties
15	`windowsQualityUpdateProfile`	Quality Updates	`windowsQualityUpdateProfiles`	None	Yes	YES	Simple properties
16	`windowsDriverUpdateProfile`	Driver Updates	`windowsDriverUpdateProfiles`	None	Yes	YES	Simple properties
17	`mamAppConfiguration`	App Config (MAM)	`targetedManagedAppConfigurations`	None	Partial	YES	Properties-based
18	`managedDeviceAppConfiguration`	App Config (Device)	`mobileAppConfigurations`	None	Partial	YES	Properties-based
19	`windowsAutopilotDeploymentProfile`	Autopilot Profiles	`windowsAutopilotDeploymentProfiles`	None	Minimal	YES	Properties-based
20	`windowsEnrollmentStatusPage`	Enrollment Status Page	`deviceEnrollmentConfigurations`	None	Minimal	META-ONLY	Enrollment types have limited settings
21	`deviceEnrollmentLimitConfiguration`	Enrollment Limits	`deviceEnrollmentConfigurations`	None	Minimal	META-ONLY	Numeric limit only
22	`deviceEnrollmentPlatformRestrictionsConfiguration`	Platform Restrictions	`deviceEnrollmentConfigurations`	None	Minimal	META-ONLY	Nested restriction config
23	`deviceEnrollmentNotificationConfiguration`	Enrollment Notifications	`deviceEnrollmentConfigurations`	None	Minimal	META-ONLY	Template snapshots nested
24	`enrollmentRestriction`	Enrollment Restrictions	`deviceEnrollmentConfigurations`	None	Minimal	META-ONLY	Mixed config type
25	`termsAndConditions`	Terms & Conditions	`termsAndConditions`	None	Yes	YES	bodyText, acceptanceStatement
26	`endpointSecurityIntent`	Endpoint Security Intents	`intents`	categories/settings (legacy)	Partial	PARTIAL	Legacy intent API; migrating to configPolicies
27	`mobileApp`	Applications	`mobileApps`	None	Minimal	META-ONLY	Metadata-only backup per config
28	`policySet`	Policy Sets	(if supported)	assignments	Minimal	META-ONLY	Container for other policies

Foundation Types:

#	`foundation_type`	Label	Deep-Drift	Notes
F1	`assignmentFilter`	Assignment Filter	YES	`rule` property is key content
F2	`roleScopeTag`	Scope Tag	META-ONLY	displayName + description only
F3	`notificationMessageTemplate`	Notification Template	PARTIAL	Localized messages are subresource

Summary:

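Full content-fidelity feasible: 16 of the 28 policy types (settingsCatalog, endpointSecurity, securityBaseline, groupPolicy, deviceConfig, compliance, update ring and update profiles, appProtection, conditionalAccess, appConfigs, autopilot, termsAndConditions), plus the assignmentFilter foundation type
Partial (script content / legacy APIs): 5 policy types (the four script types and endpointSecurityIntent), plus the notificationMessageTemplate foundation type
Meta-only sufficient: 7 policy types (enrollment configs, mobileApp, policySet), plus the roleScopeTag foundation type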

7. Proposal: Deep Drift Implementation Plan

Phase v1.5 — Provider Chain (Opportunistic Content Fidelity)

Goal: Enable baseline compare to use existing PolicyVersions for content-fidelity hash when available, with meta-fidelity fallback.

Estimated effort: 3-5 days

Step 1: ContentHashProvider Interface

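// app/Contracts/Baselines/ContentHashProvider.php
namespace App\Contracts\Baselines; // namespace assumed from the file path

use Carbon\CarbonImmutable;

interface ContentHashProvider
{
    /**
     * Resolve a hash for one policy, or null if this provider has no data for it.
     *
     * @return array{hash: string, fidelity: string, source: string}|null
     */
    public function resolve(string $policyType, string $externalId, int $tenantId, CarbonImmutable $since): ?array;
}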

Step 2: PolicyVersionContentProvider

// app/Services/Baselines/PolicyVersionContentProvider.php
// Looks up the latest PolicyVersion for (tenant_id, external_id, policy_type)
// captured_at >= $since (baseline snapshot timestamp)
// Returns SettingsNormalizer → DriftHasher hash with fidelity='content'
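The lookup described in these comments could look roughly like the following. This is an in-memory sketch, not the planned service: the row shape, the function name, and the use of `DateTimeImmutable` in place of `CarbonImmutable` are assumptions, and the snapshot is hashed directly where the real path would first run `SettingsNormalizer`.

```php
<?php
declare(strict_types=1);

/**
 * Pick the latest usable PolicyVersion for one policy, captured at or after $since.
 *
 * @param list<array{id: int, tenant_id: int, policy_type: string, external_id: string, captured_at: DateTimeImmutable, snapshot: array}> $versions hypothetical row shape
 * @return array{hash: string, fidelity: string, source: string}|null
 */
function resolveFromPolicyVersions(array $versions, string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array
{
    $latest = null;
    foreach ($versions as $version) {
        if ($version['tenant_id'] !== $tenantId
            || $version['policy_type'] !== $policyType
            || $version['external_id'] !== $externalId) {
            continue; // different policy or tenant
        }
        if ($version['captured_at'] < $since || $version['snapshot'] === []) {
            continue; // stale or empty versions are never used as a content source
        }
        if ($latest === null || $version['captured_at'] > $latest['captured_at']) {
            $latest = $version;
        }
    }

    if ($latest === null) {
        return null; // lets the chain fall through to the meta provider
    }

    return [
        // The real provider would normalize the snapshot before hashing.
        'hash' => hash('sha256', json_encode($latest['snapshot'], JSON_THROW_ON_ERROR)),
        'fidelity' => 'content',
        'source' => 'policy_version:' . $latest['id'],
    ];
}
```

Returning `null` for stale or empty versions matches the behavior expected by the fallback test cases in the test plan (F5 and I4).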

Step 3: MetaFallbackProvider (existing logic)

// Wraps InventoryMetaContract → DriftHasher → fidelity='meta'

Step 4: ContentProviderChain

// Iterates [PolicyVersionContentProvider, MetaFallbackProvider]
// Returns first non-null result
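The "first non-null result" rule is small enough to sketch directly. The example below models providers as closures to stay self-contained; in the plan they would be classes implementing `ContentHashProvider`. The two provider closures and their return values are hypothetical.

```php
<?php
declare(strict_types=1);

/**
 * @param list<callable(string, string): ?array> $providers ordered by precedence
 * @return array{hash: string, fidelity: string, source: string}|null
 */
function resolveViaChain(array $providers, string $policyType, string $externalId): ?array
{
    foreach ($providers as $provider) {
        $result = $provider($policyType, $externalId);
        if ($result !== null) {
            return $result; // first provider with data wins
        }
    }

    return null;
}

// Hypothetical provider: only one policy has a PolicyVersion from a backup.
$policyVersionProvider = static fn (string $type, string $id): ?array =>
    $id === 'guid-with-backup'
        ? ['hash' => hash('sha256', 'normalized-settings'), 'fidelity' => 'content', 'source' => 'policy_version:42']
        : null;

// Hypothetical fallback: always answers, at meta fidelity.
$metaFallbackProvider = static fn (string $type, string $id): ?array =>
    ['hash' => hash('sha256', "meta:{$type}:{$id}"), 'fidelity' => 'meta', 'source' => 'inventory_meta_v1'];

$chain = [$policyVersionProvider, $metaFallbackProvider];

$withBackup = resolveViaChain($chain, 'settingsCatalogPolicy', 'guid-with-backup');
$withoutBackup = resolveViaChain($chain, 'settingsCatalogPolicy', 'guid-no-backup');
```

Here `$withBackup` carries `fidelity = 'content'` and `$withoutBackup` carries `fidelity = 'meta'`, which is the mixed-fidelity situation the ADR lists as a consequence.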

Step 5: Integration in CompareBaselineToTenantJob

loadCurrentInventory() accepts optional ContentProviderChain
For each item: try chain, record fidelity + source
computeDrift() unchanged (still hash vs hash comparison)
Finding evidence includes fidelity and content_hash_source
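A sketch of how the unchanged hash-vs-hash comparison could carry fidelity into finding evidence is shown below. The function name, the `policyType|externalId` key format, and the array shapes are hypothetical. It also assumes that the baseline and current hash of an item were produced at the same fidelity (which Step 6 aims for); comparing a meta baseline hash with a content current hash would always report a difference.

```php
<?php
declare(strict_types=1);

/**
 * @param array<string, array{hash: string, fidelity: string}> $baseline keyed by "policyType|externalId"
 * @param array<string, array{hash: string, fidelity: string, source: string}> $current same keying
 * @return list<array{key: string, change_type: string, evidence: array<string, string>}>
 */
function computeDriftSketch(array $baseline, array $current): array
{
    $drift = [];

    foreach ($baseline as $key => $baselineItem) {
        $currentItem = $current[$key] ?? null;

        if ($currentItem === null) {
            $drift[] = ['key' => $key, 'change_type' => 'missing_policy', 'evidence' => ['fidelity' => $baselineItem['fidelity']]];
            continue;
        }

        if ($baselineItem['hash'] !== $currentItem['hash']) {
            $drift[] = [
                'key' => $key,
                'change_type' => 'different_version',
                'evidence' => ['fidelity' => $currentItem['fidelity'], 'content_hash_source' => $currentItem['source']],
            ];
        }
    }

    // Items present in the tenant but absent from the baseline.
    foreach (array_diff_key($current, $baseline) as $key => $currentItem) {
        $drift[] = [
            'key' => $key,
            'change_type' => 'unexpected_policy',
            'evidence' => ['fidelity' => $currentItem['fidelity'], 'content_hash_source' => $currentItem['source']],
        ];
    }

    return $drift;
}
```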

Step 6: CaptureBaselineSnapshotJob enhancement

Optional: during capture, also try PolicyVersionContentProvider to store content-fidelity baseline_hash
Store content_hash_source in baseline_snapshot_items.meta_jsonb
This means: if a backup was taken before baseline capture, the baseline itself is content-fidelity

Step 7: Coverage extension

Add content_coverage to compare run context: which types had PolicyVersions, which fell back to meta
Display in operation detail UI

Migration

-- Optional: add column for source tracking
ALTER TABLE baseline_snapshot_items
    ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1';

Phase v2.0 — On-Demand Content Capture (Future)

Goal: For types without recent PolicyVersions, perform targeted per-item GET during baseline capture/compare.

Estimated effort: 5-8 days

Introduce BaselineContentCaptureJob that, for a given baseline profile's scope, identifies items lacking recent PolicyVersions and performs targeted GET + PolicyVersion creation.
Reuses existing PolicyCaptureOrchestrator with a new "baseline-triggered" context.
Adds capture_mode to baseline profile: meta_only (v1), opportunistic (v1.5), full_content (v2.0).
Rate limiting: per-tenant throttle to avoid Graph API quota issues.
Budget guard: max N items per capture run, with continuation support.
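The budget guard with continuation can be illustrated with a cursor over a stable-ordered list of items that lack recent PolicyVersions. This is only a sketch of the batching idea; the function name and cursor representation are assumptions, and per-tenant rate limiting is not modeled.

```php
<?php
declare(strict_types=1);

/**
 * Return the next batch of items to capture and the cursor for the following run.
 *
 * @param list<string> $itemIdsLackingVersions external ids in a stable order
 * @return array{batch: list<string>, next_cursor: int|null} next_cursor is null when done
 */
function nextCaptureBatch(array $itemIdsLackingVersions, int $maxItemsPerRun, int $cursor = 0): array
{
    $batch = array_slice($itemIdsLackingVersions, $cursor, $maxItemsPerRun);
    $next = $cursor + count($batch);

    return [
        'batch' => $batch,
        'next_cursor' => $next < count($itemIdsLackingVersions) ? $next : null,
    ];
}
```

With five pending items and a budget of two per run, three runs drain the list: the first two return a cursor for continuation and the third returns `null`.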

Phase v2.5 — Inventory Content Enrichment (Future, Optional)

Goal: Optionally have inventory sync capture settings content inline during LIST (where type supports $expand).

Some types may support $expand=settings on LIST (settings catalog, endpoint security); this is not yet validated against the Graph API (see OQ-5).
This would give "free" content fidelity without per-item GET.
High complexity: varies per type, may increase LIST payload size significantly.
Evaluate ROI after v2.0 ships.

8. Test Plan (Enterprise)

Unit Tests

#	Test File	Scope	Key Assertions
U1	`tests/Unit/Baselines/ContentProviderChainTest.php`	Provider chain resolution	First provider wins; null fallback; fidelity recorded correctly
U2	`tests/Unit/Baselines/PolicyVersionContentProviderTest.php`	PolicyVersion lookup + normalization	Correct hash for known snapshot; returns null when no PV; respects `$since` cutoff
U3	`tests/Unit/Baselines/MetaFallbackProviderTest.php`	Meta contract fallback	Produces `fidelity='meta'`; matches existing `InventoryMetaContract` behavior exactly
U4	`tests/Unit/Baselines/InventoryMetaContractTest.php`	(Existing) contract stability	Null handling, ordering, versioning — extend for edge cases

Feature Tests

#	Test File	Scope	Key Assertions
F1	`tests/Feature/Baselines/BaselineCompareContentFidelityTest.php`	End-to-end compare with PolicyVersions available	Settings change → `different_version` finding with `fidelity='content'`
F2	`tests/Feature/Baselines/BaselineCompareMixedFidelityTest.php`	Some types have PV, some don't	Mixed `fidelity` values in findings; coverage context records both
F3	`tests/Feature/Baselines/BaselineCompareFallbackTest.php`	No PolicyVersions available	Falls back to meta fidelity; identical behavior to v1
F4	`tests/Feature/Baselines/BaselineCaptureFidelityTest.php`	Capture with PolicyVersions present	`baseline_hash` uses content fidelity; `content_hash_source` recorded
F5	`tests/Feature/Baselines/BaselineCompareStaleVersionTest.php`	PolicyVersion older than snapshot	Falls back to meta (stale PV not used)
F6	`tests/Feature/Baselines/BaselineCompareCoverageGuardContentTest.php`	Coverage reporting for content types	`content_coverage` in run context shows which types are content-covered

Existing Tests to Preserve

#	Test File	Impact
E1	`tests/Feature/Baselines/BaselineCompareFindingsTest.php`	Must still pass — meta fidelity is default when no PV exists
E2	`tests/Feature/Baselines/BaselineComparePreconditionsTest.php`	No change expected
E3	`tests/Feature/Baselines/BaselineCompareStatsTest.php`	Stats remain grouped by scope_key; may need fidelity breakdown
E4	`tests/Feature/Baselines/BaselineOperabilityAutoCloseTest.php`	Auto-close unaffected by fidelity source

Integration / Regression

#	Test	Scope
I1	Content hash stability across serialization	JSON encode/decode round-trip does not change hash
I2	PolicyVersion normalizer alignment	Same snapshot → `SettingsNormalizer` produces same hash in both System A (via provider) and System B (via DriftFindingGenerator)
I3	Hash collision protection	Different settings → different hashes (property-based test with sample data)
I4	Empty snapshot edge case	PolicyVersion with empty/null snapshot → provider returns null → fallback works

Performance Tests

#	Test	Acceptance Criteria
P1	Compare job with 500 items, 50% with PolicyVersions	Completes in < 30s (DB-only, no Graph calls)
P2	Provider chain query efficiency	PolicyVersion lookup uses batch query, not N+1

9. Open Questions / Assumptions

Open Questions

#	Question	Impact	Proposed Resolution
OQ-1	Staleness threshold for PolicyVersions: How old can a PolicyVersion be before we reject it as a content source?	Determines false-negative risk	Default: PolicyVersion must be captured after the baseline snapshot's `captured_at`. Configurable per workspace.
OQ-2	Mixed fidelity UX: How should the UI display findings with different fidelity levels?	User trust and understanding	Badge/icon on finding cards: "High confidence (content)" vs "Structural only (meta)". Filterable in findings table.
OQ-3	Should baseline capture force a backup if no recent PolicyVersions exist?	API cost vs accuracy trade-off	No for v1.5 (opportunistic only). Yes for v2.0 as opt-in `capture_mode: full_content`.
OQ-4	etag as change hint: Should we use etag changes as a trigger for on-demand PolicyVersion capture?	Could reduce unnecessary GETs	Worth investigating in v2.0. If etag changes during inventory sync, schedule targeted per-item GET for that policy only.
OQ-5	Settings Catalog `$expand=settings` on LIST: Does Microsoft Graph support this?	Could give "free" content fidelity for settings catalog types	Needs validation against Graph API. If supported, would eliminate per-item GET for the most impactful type.
OQ-6	Retention / pruning interaction: If old PolicyVersions are pruned, does that affect baseline compare?	Could lose content fidelity for old baselines	Baseline compare only needs versions captured after baseline snapshot. Pruning policy should respect active baseline snapshots.
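For OQ-4, the etag-as-hint idea amounts to diffing the etags seen in two consecutive inventory syncs and scheduling a targeted GET only for policies whose etag moved. The sketch below is hypothetical (function name and input shape are assumptions), and it is a hint only: an unchanged etag does not prove that settings are unchanged.

```php
<?php
declare(strict_types=1);

/**
 * @param array<string, string> $previousEtags external_id => etag from the prior sync
 * @param array<string, string> $currentEtags external_id => etag from the latest sync
 * @return list<string> external ids worth a targeted per-item GET
 */
function policiesNeedingRecapture(array $previousEtags, array $currentEtags): array
{
    $changed = [];
    foreach ($currentEtags as $externalId => $etag) {
        // New policies and policies with a moved etag both qualify.
        if (!array_key_exists($externalId, $previousEtags) || $previousEtags[$externalId] !== $etag) {
            $changed[] = (string) $externalId;
        }
    }

    return $changed;
}
```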

Assumptions

#	Assumption	Risk if Wrong
A-1	`DriftHasher::hashNormalized()` is deterministic across PHP serialization boundaries	Hash mismatch → false drift findings. Validated: uses `json_encode` with stable flags + `ksort`.
A-2	`SettingsNormalizer` / `PolicyNormalizer` produce the same output for the same input regardless of call context (System A vs System B)	Hash inconsistency between systems. Low risk: same code path.
A-3	PolicyVersions from backups contain complete settings (not partial hydration)	Incomplete content → false negatives or incorrect hashes. Validated: `PolicySnapshotService` performs full hydration per type.
A-4	The `Finding` model's `fingerprint`/`recurrence_key` identity allows mixed fidelity sources	Identity collision if fidelity changes source. Safe: recurrence_key includes snapshot_id, not hash value.
A-5	Graph LIST endpoints do NOT return settings values for any supported policy type	If wrong, inventory sync could capture settings "for free". Validated: LIST returns only `$select` fields per `graph_contracts.php`.
A-6	Per-type normalizers in backup drift path handle all 28 supported policy types	If not, some types would produce unstable hashes. Partially validated: `PolicyNormalizer` has a fallback for unknown types.
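
The determinism property behind A-1 can be shown with a short Python sketch. It is not the `DriftHasher` implementation: it mirrors the idea described there (recursive key sort, stable JSON encoding, sha256), and the function names and encoding flags are assumptions made for illustration.

```python
import hashlib
import json


def canonicalize(value):
    """Recursively sort dict keys; list order is kept because it may be meaningful."""
    if isinstance(value, dict):
        return {key: canonicalize(value[key]) for key in sorted(value)}
    if isinstance(value, list):
        return [canonicalize(item) for item in value]
    return value


def hash_normalized(value) -> str:
    """Hash a normalized structure so that key order never affects the result."""
    payload = json.dumps(
        canonicalize(value),
        separators=(",", ":"),  # no insignificant whitespace
        ensure_ascii=False,     # stable handling of non-ASCII strings
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Same content, different key order.
a = {"minimumPasswordLength": 8, "nested": {"x": 1, "y": [1, 2]}}
b = {"nested": {"y": [1, 2], "x": 1}, "minimumPasswordLength": 8}
```

In this sketch `a` and `b` hash identically, while changing any value (or reordering a list) yields a different hash. Whether list order should count as drift is a per-type normalizer decision and is outside what this example covers.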

10. Key Questions Answered

KQ-01: Are Baseline Compare and Backup Drift truly separate systems?

Yes. They share DriftHasher and the Finding model, but differ in:

Data source: InventoryItem vs PolicyVersion
Hash contract: InventoryMetaContract (7 fields, meta only) vs SettingsNormalizer → PolicyNormalizer (full snapshot)
Finding generator: CompareBaselineToTenantJob::computeDrift() vs DriftFindingGenerator::generate()
Finding identity: different recurrence key structures
Scope model: BaselineProfile-scoped vs selection_hash-scoped
Trigger: post-inventory-sync vs post-backup
Coverage: InventoryCoverage guard vs none (trusts backup completeness)

KQ-02: Should they be unified or remain separate?

Hybrid approach (Provider Chain) — as designed in Spec 116 v2. Keep separate triggering and scoping, but let System A consume data produced by System B (PolicyVersions) via a provider chain. This avoids:

Merging two fundamentally different scoping models
Introducing new Graph API costs
Disrupting existing backup drift workflows

KQ-03: What is the minimal viable "v1.5" to bridge the gap?

Add a PolicyVersionContentProvider that checks for recent PolicyVersions as part of baseline compare's hash computation. For types where a PolicyVersion exists (i.e., a backup was taken), the compare immediately gains content-fidelity. For types without, meta-fidelity continues as before. Net code change: ~200-300 lines (interface + 2 providers + chain + integration).
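
The shape of that change (interface, two providers, chain) can be sketched as follows. This is a Python illustration only, not the project's PHP: `ResolvedHash`, the `resolve` method, and the constructor arguments are hypothetical, and both providers read from pre-loaded in-memory maps instead of the database.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional, Protocol, Tuple


@dataclass(frozen=True)
class ResolvedHash:
    hash: str
    fidelity: str  # "content" or "meta"
    source: str    # e.g. "policy_version:42" or "inventory_meta_v1"


class ContentProvider(Protocol):
    def resolve(self, external_id: str) -> Optional[ResolvedHash]: ...


class PolicyVersionContentProvider:
    """Content fidelity: answers only when a PolicyVersion hash is known."""

    def __init__(self, versions: Dict[str, Tuple[int, str]]):
        self._versions = versions  # external_id -> (version_id, content_hash)

    def resolve(self, external_id: str) -> Optional[ResolvedHash]:
        entry = self._versions.get(external_id)
        if entry is None:
            return None
        version_id, content_hash = entry
        return ResolvedHash(content_hash, "content", f"policy_version:{version_id}")


class MetaFallbackProvider:
    """Meta fidelity: the existing v1 behaviour."""

    def __init__(self, meta_hashes: Dict[str, str]):
        self._meta_hashes = meta_hashes  # external_id -> meta hash

    def resolve(self, external_id: str) -> Optional[ResolvedHash]:
        meta_hash = self._meta_hashes.get(external_id)
        if meta_hash is None:
            return None
        return ResolvedHash(meta_hash, "meta", "inventory_meta_v1")


class ContentProviderChain:
    """Return the first non-null answer, in precedence order."""

    def __init__(self, providers: Iterable[ContentProvider]):
        self._providers = list(providers)

    def resolve(self, external_id: str) -> Optional[ResolvedHash]:
        for provider in self._providers:
            resolved = provider.resolve(external_id)
            if resolved is not None:
                return resolved
        return None


chain = ContentProviderChain([
    PolicyVersionContentProvider({"policy-a": (42, "c0ffee")}),
    MetaFallbackProvider({"policy-a": "aaa111", "policy-b": "bbb222"}),
])
```

Here `policy-a` resolves at content fidelity and `policy-b` falls back to meta. One point the sketch makes visible: the baseline side and the current side of a comparison need to resolve at the same fidelity, otherwise a content hash would be compared against a meta hash.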

KQ-04: Which types benefit most from content-fidelity drift?

Top priority (complex settings, high change frequency):

settingsCatalogPolicy — most common, deeply nested settings
groupPolicyConfiguration — multi-level nesting (definitionValues → presentationValues)
deviceCompliancePolicy — compliance rules + scheduled actions
deviceConfiguration — broad category, many OData sub-types
endpointSecurityPolicy — critical security settings
securityBaselinePolicy — security-critical baselines
conditionalAccessPolicy — identity security gate

Medium priority (simpler settings but still valuable):

appProtectionPolicy, windowsUpdateRing, windowsFeatureUpdateProfile, windowsQualityUpdateProfile

KQ-05: How does coverage work and how should it extend for content fidelity?

Currently: InventoryCoverage::fromContext(latestSyncRun->context) → coveredTypes() returns types with status=succeeded. Uncovered types → findings suppressed, outcome = partially_succeeded.

For v1.5: Add content_coverage alongside meta_coverage:

content_covered_types: types where PolicyVersion exists post-baseline
meta_only_types: types where only meta is available
uncovered_types: types with no coverage at all (findings suppressed)

Finding evidence should include:

{
  "fidelity": "content",
  "content_hash_source": "policy_version:42",
  "note": "Hash computed from PolicyVersion #42 captured 2025-07-14T10:30:00Z"
}
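
The three coverage buckets can be derived with plain set arithmetic. The Python sketch below is illustrative: `classify_coverage` is a hypothetical helper, and it assumes that a type with a post-baseline PolicyVersion counts as content-covered even if the latest inventory sync did not cover it, a rule the report does not settle.

```python
def classify_coverage(baseline_types, meta_covered_types, content_covered_types):
    """Split the baseline's policy types into the three v1.5 coverage buckets."""
    baseline = set(baseline_types)
    content = baseline & set(content_covered_types)
    meta_only = (baseline & set(meta_covered_types)) - content
    uncovered = baseline - content - meta_only
    return {
        "content_covered_types": sorted(content),
        "meta_only_types": sorted(meta_only),
        "uncovered_types": sorted(uncovered),  # findings suppressed for these
    }


coverage = classify_coverage(
    baseline_types=["settingsCatalogPolicy", "deviceCompliancePolicy", "windowsUpdateRing"],
    meta_covered_types=["settingsCatalogPolicy", "deviceCompliancePolicy"],
    content_covered_types=["settingsCatalogPolicy"],
)
```

With these sample inputs, `settingsCatalogPolicy` is content-covered, `deviceCompliancePolicy` is meta-only, and `windowsUpdateRing` is uncovered, so the run would end as partially_succeeded under the current outcome rule.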

KQ-06: What is the long-term unified architecture?

Provider precedence chain with configurable capture modes:

BaselineProfile.capture_mode:
  'meta_only'        → InventoryMetaContract only (v1)
  'opportunistic'    → PolicyVersion if available → meta fallback (v1.5)
  'full_content'     → On-demand GET for missing types → PolicyVersion → meta (v2.0)

ContentProviderChain:
  1. PolicyVersionContentProvider    (checks existing PolicyVersions)
  2. InventoryContentProvider        (future: if inventory sync enriched)
  3. MetaFallbackProvider            (InventoryMetaContract v1)

The long-term vision is that baseline capture + compare use the same normalizer pipeline as backup drift, producing identical hashes for identical content regardless of which system produced the PolicyVersion. This is achievable because DriftHasher and SettingsNormalizer are already shared code.

Appendix: Database Schema Reference

`baseline_snapshot_items` (current)

id                    BIGINT PK
baseline_snapshot_id  BIGINT FK → baseline_snapshots
subject_type          VARCHAR(255)    -- 'policy'
subject_external_id   VARCHAR(255)    -- Graph resource GUID
policy_type           VARCHAR(255)    -- e.g. 'settingsCatalogPolicy'
baseline_hash         VARCHAR(64)     -- sha256 of InventoryMetaContract
meta_jsonb            JSONB           -- {display_name, category, platform, meta_contract: {...}, fidelity, source}
created_at            TIMESTAMP
updated_at            TIMESTAMP

`inventory_items` (current)

id                          BIGINT PK
tenant_id                   BIGINT FK → tenants
policy_type                 VARCHAR(255)
external_id                 VARCHAR(255)
display_name                VARCHAR(255)
category                    VARCHAR(255) NULL
platform                    VARCHAR(255) NULL
meta_jsonb                  JSONB         -- {odata_type, etag, scope_tag_ids, assignment_target_count}
last_seen_at                TIMESTAMP NULL
last_seen_operation_run_id  BIGINT NULL
created_at                  TIMESTAMP
updated_at                  TIMESTAMP

`policy_versions` (current)

id                  BIGINT PK
tenant_id           BIGINT FK → tenants
policy_id           BIGINT FK → policies
version_number      INTEGER
policy_type         VARCHAR(255)
platform            VARCHAR(255) NULL
created_by          VARCHAR(255) NULL
captured_at         TIMESTAMP
snapshot            JSON          -- FULL Graph GET response (hydrated)
metadata            JSON          -- additional metadata
assignments         JSON NULL     -- full assignments array
scope_tags          JSON NULL     -- scope tag IDs
assignments_hash    VARCHAR(64) NULL
scope_tags_hash     VARCHAR(64) NULL
created_at          TIMESTAMP
updated_at          TIMESTAMP
deleted_at          TIMESTAMP NULL  -- soft delete

Proposed v1.5 Addition

ALTER TABLE baseline_snapshot_items
    ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1';
-- Values: 'inventory_meta_v1', 'policy_version:{id}', 'inventory_content_v2'
