Golden Master / Baseline Drift — Deep Settings-Drift (Content-Fidelity) Analysis
Enterprise Research Report for TenantAtlas / TenantPilot
Date: 2025-07-15
Scope: Architecture, code evidence, implementation proposal
Table of Contents
- Executive Summary
- System Map — Side-by-Side Comparison
- Architecture Decision Record (ADR-001): Unify vs Separate
- Deep-Dive: Why Settings Changes Don't Produce Baseline Drift
- Code Evidence Table
- Type Coverage Matrix
- Proposal: Deep Drift Implementation Plan
- Test Plan (Enterprise)
- Open Questions / Assumptions
- Key Questions Answered (KQ-01 through KQ-06)
1. Executive Summary
- Two parallel drift systems exist: Baseline Compare (meta fidelity, inventory-sourced) and Backup Drift (content fidelity, PolicyVersion-sourced). They share DriftHasher but are otherwise separate data paths with separate finding generators.
- The core gap: CompareBaselineToTenantJob hashes InventoryMetaContract v1, which contains only odata_type, etag, scope_tag_ids, assignment_target_count, and never actual policy settings. When an admin changes a Wi-Fi password or a compliance threshold in Intune, none of these meta signals necessarily change.
- Inventory sync uses Graph LIST endpoints, which return metadata and display fields only. Per-item GET (which fetches settings, assignments, scope tags) is only performed during Backup via PolicyCaptureOrchestrator.
- DriftFindingGenerator (the backup drift system) does detect settings changes: it normalizes PolicyVersion.snapshot via SettingsNormalizer → PolicyNormalizer::flattenForDiff() → type-specific normalizers, then hashes with DriftHasher.
- Spec 116 already designs v2 with a provider precedence chain (PolicyVersion → Inventory content → Meta fallback), which is the correct architectural direction. The v1 meta baseline shipped first as a deliberate, safe-to-ship initial milestone.
- Unification is recommended (provider chain approach): not merging the two jobs, but enabling CompareBaselineToTenantJob to optionally consume PolicyVersion snapshots as a content-fidelity provider, falling back to InventoryMetaContract when no PolicyVersion is available.
- 28 supported policy types are registered in tenantpilot.php, plus 3 foundation types. Of these, 10+ have complex hydration (settings catalog, group policy, security baselines, compliance actions) and would benefit most from deep-drift detection.
- The etag signal is unreliable as a settings-change proxy: Microsoft Graph etag semantics vary per resource type, and etag may or may not change when settings are modified. It is useful as a hint but not a guarantee.
- API cost is the primary constraint: content-fidelity compare requires per-item GET calls (or a recent Backup that already captured PolicyVersions). The hybrid provider chain avoids this by opportunistically reusing existing PolicyVersions without requiring a full backup before every compare.
- Coverage Guard is critical for v2: the baseline system must know which types have fresh PolicyVersions and suppress content-fidelity findings for types where no recent version exists (falling back to meta fidelity).
- Risk profile: Shipping deep-drift for wrong types (without proper per-type normalization) could produce false positives. Type-specific normalizers already exist for the backup drift path; reusing them is safe.
- Recommended phasing: v1.5 (current sprint) = add content_hash_source column to baseline_snapshot_items + provider chain in compare job. v2.0 = on-demand per-item GET during baseline capture for types lacking recent PolicyVersions.
2. System Map
Side-by-Side Comparison Table
| Dimension | System A: Baseline Compare | System B: Backup Drift |
|---|---|---|
| Entry point | CompareBaselineToTenantJob | GenerateDriftFindingsJob → DriftFindingGenerator |
| Data source (current) | InventoryItem (from LIST sync) | PolicyVersion (from per-item GET backup) |
| Data source (baseline) | BaselineSnapshotItem (captured from inventory) | Earlier PolicyVersion from prior OperationRun |
| Hash contract | InventoryMetaContract v1 → DriftHasher::hashNormalized() | SettingsNormalizer → PolicyNormalizer::flattenForDiff() → DriftHasher::hashNormalized() |
| Hash inputs | version, policy_type, subject_external_id, odata_type, etag, scope_tag_ids, assignment_target_count | Full PolicyVersion.snapshot JSON (with volatile key removal) |
| Fidelity | meta (persisted as fidelity='meta' in snapshot context) | content (settings + assignments + scope_tags) |
| Dimensions detected | missing_policy, different_version, unexpected_policy | policy_snapshot (added/removed/modified), policy_assignments (modified), policy_scope_tags (modified) |
| Finding identity | recurrence_key = sha256(tenantId\|snapshotId\|policyType\|extId\|changeType) | recurrence_key = sha256(drift:tenantId:scopeKey:subjectType:extId:dimension) |
| Scope key | baseline_profile:{profileId} | DriftScopeKey::fromSelectionHash() |
| Auto-close | BaselineAutoCloseService (stale finding resolution) | resolveStaleDriftFindings() within DriftFindingGenerator |
| Coverage guard | InventoryCoverage::fromContext() → uncovered types → partial outcome | None (trusts backup captured all types) |
| Graph API calls | Zero at compare time (reads from DB) | Zero at compare time (reads PolicyVersions from DB) |
| Graph API calls (capture) | Zero (inventory sync did LIST) | Per-item GET via PolicyCaptureOrchestrator |
| Normalizer pipeline | None (meta contract is the normalization) | SettingsNormalizer → PolicyNormalizer → type normalizers |
| Shared components | DriftHasher, Finding model | DriftHasher, Finding model |
| Trigger | After inventory sync, on schedule/manual | After backup, on schedule/manual |
Data Flow Diagrams
SYSTEM A — Baseline Compare (Meta Fidelity)
============================================
Graph LIST ──► InventorySyncService ──► InventoryItem (meta_jsonb)
│
▼
CaptureBaselineSnapshotJob
├─ InventoryMetaContract.build()
├─ DriftHasher.hashNormalized()
└─► BaselineSnapshotItem (baseline_hash)
│
▼
CompareBaselineToTenantJob
├─ loadCurrentInventory() → InventoryItem
├─ BaselineSnapshotIdentity.hashItemContent()
│ └─ InventoryMetaContract.build()
│ └─ DriftHasher.hashNormalized()
├─ computeDrift() → hash compare
└─ upsertFindings() → Finding records
SYSTEM B — Backup Drift (Content Fidelity)
============================================
Graph GET ──► PolicySnapshotService.fetch() ──► full JSON snapshot
│
▼
PolicyCaptureOrchestrator.capture()
├─ assignments GET
├─ scope tags resolve
└─► VersionService.captureVersion() ──► PolicyVersion
│
▼
DriftFindingGenerator.generate()
├─ versionForRun() → baseline/current PV
├─ SettingsNormalizer.normalizeForDiff()
│ └─ PolicyNormalizer.flattenForDiff()
├─ DriftHasher.hashNormalized() × 3
│ (snapshot, assignments, scope_tags)
└─ upsertDriftFinding() → Finding records
3. ADR-001: Unify vs Separate
Title
ADR-001: Golden Master Baseline Compare — Provider Chain for Content Fidelity
Status
PROPOSED
Context
TenantPilot has two drift detection systems that evolved independently:
- System A (Baseline Compare): Designed for the "does the tenant still match the golden master?" use case. Ships with meta fidelity (v1): fast, cheap, zero additional Graph calls at compare time. Detects structural drift (policy added/removed/meta-changed) but is blind to settings changes.
- System B (Backup Drift): Designed for the "what changed between two backup points?" use case. Content fidelity: full PolicyVersion snapshots with per-type normalization. Detects settings, assignments, and scope tag changes.
The two systems cannot be merged into one without fundamentally changing their triggering, scoping, and API cost models. However, System A's accuracy can be dramatically improved by consuming data already produced by System B.
Decision
Adopt the Provider Chain pattern as already designed in Spec 116 v2:
ContentProvider = PolicyVersion → InventoryContent → MetaFallback
Specifically:
- CompareBaselineToTenantJob gains a ContentProviderChain that, for each (policy_type, external_id):
  - First: Looks for a PolicyVersion captured since the last baseline snapshot timestamp. If found, normalizes via SettingsNormalizer → DriftHasher → returns content fidelity hash.
  - Second (future): Looks for enriched inventory content if inventory sync is upgraded to capture settings (v2.0+).
  - Fallback: Builds InventoryMetaContract v1 → DriftHasher → returns meta fidelity hash.
- Each baseline snapshot item records its fidelity (meta | content) and content_hash_source (inventory_meta_v1 | policy_version:{id} | inventory_content_v2).
- Compare findings carry fidelity in evidence, enabling UI to display confidence level.
- Coverage Guard is extended: a type is content-covered only if PolicyVersions exist for ≥N% of items. Below that threshold, fall back to meta fidelity (do not suppress).
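The ≥N% rule above could be sketched roughly as follows. This is only an illustration: the function name resolveFidelityByType(), the shape of the per-type counts, and the 0.9 threshold are all assumptions, since the report leaves N open.

```php
<?php
declare(strict_types=1);

/**
 * Decide, per policy type, whether a compare run may report content fidelity.
 *
 * @param array<string, array{total: int, with_policy_version: int}> $countsByType
 * @param float $minCoverage hypothetical threshold (the "N%" above) as a ratio in [0, 1]
 * @return array<string, string> policy type => 'content' or 'meta'
 */
function resolveFidelityByType(array $countsByType, float $minCoverage): array
{
    $fidelity = [];
    foreach ($countsByType as $policyType => $counts) {
        // A type with no items cannot be content-covered; treat it as meta.
        $ratio = $counts['total'] > 0
            ? $counts['with_policy_version'] / $counts['total']
            : 0.0;
        // Below the threshold the type falls back to meta fidelity; it is not suppressed.
        $fidelity[$policyType] = $ratio >= $minCoverage ? 'content' : 'meta';
    }

    return $fidelity;
}

// Example counts are made up for illustration.
$fidelity = resolveFidelityByType([
    'settingsCatalogPolicy' => ['total' => 40, 'with_policy_version' => 38],
    'deviceCompliancePolicy' => ['total' => 10, 'with_policy_version' => 2],
], 0.9);

foreach ($fidelity as $policyType => $level) {
    echo "{$policyType}: {$level}", PHP_EOL;
}
```

With these sample counts, settingsCatalogPolicy (95% covered) is reported at content fidelity and deviceCompliancePolicy (20% covered) falls back to meta.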
Consequences
- Positive: No new Graph API calls needed (reuses existing PolicyVersions from backups). Zero additional infrastructure. Incremental rollout per policy type. Existing meta-fidelity behavior preserved as fallback.
- Negative: Content fidelity depends on backup recency. If a tenant hasn't been backed up, only meta fidelity is available. Could create "mixed fidelity" findings within a single compare run.
- Rejected Alternative: Full merge of System A and B into a single system. Rejected because they serve different use cases (golden master comparison vs point-in-time drift), have different scoping models (BaselineProfile vs selection_hash), and different triggering models (post-inventory-sync vs post-backup).
- Rejected Alternative: Always-GET during baseline compare. Rejected due to API cost (30+ types × 100s of policies = 1000s of GET calls per tenant per compare run).
Compliance Notes
- Livewire v4.0+ / Filament v5: no UI changes in core ADR; provider chain is purely backend.
- Provider registration: n/a (backend services only).
- No destructive actions.
- Asset strategy: no new assets.
4. Deep-Dive: Why Settings Changes Don't Produce Baseline Drift
The Root Cause Chain
Step 1: Inventory Sync captures only LIST metadata
InventorySyncService::executeSelectionUnderLock() (line ~340-450) calls Graph LIST endpoints. For each policy, it extracts:
- display_name, category, platform (display fields)
- odata_type, etag, scope_tag_ids, assignment_target_count (meta signals)
These are stored in InventoryItem.meta_jsonb. No settings values are fetched or stored.
Step 2: Baseline Capture hashes only the Meta Contract
CaptureBaselineSnapshotJob::collectSnapshotItems() reads from InventoryItem, then calls BaselineSnapshotIdentity::hashItemContent():
// BaselineSnapshotIdentity.php, line 56-67
public function hashItemContent(string $policyType, string $subjectExternalId, array $metaJsonb): string
{
    $contract = $this->metaContract->build(
        policyType: $policyType,
        subjectExternalId: $subjectExternalId,
        metaJsonb: $metaJsonb,
    );

    return $this->hasher->hashNormalized($contract);
}
The InventoryMetaContract::build() output is:
[
    'version' => 1,
    'policy_type' => 'settingsCatalogPolicy',
    'subject_external_id' => '<guid>',
    'odata_type' => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag' => '"abc..."', // ← unreliable change indicator
    'scope_tag_ids' => ['0'],
    'assignment_target_count' => 3,
]
This is ALL that gets hashed. Actual policy settings (the Wi-Fi password, the compliance threshold, the firewall rule) are nowhere in this contract.
Step 3: Baseline Compare re-computes the same meta hash
CompareBaselineToTenantJob::loadCurrentInventory() (line 367-409) reads current InventoryItem records and calls the same BaselineSnapshotIdentity::hashItemContent() with the same InventoryMetaContract, producing the same hash structure.
computeDrift() (line 435-500) then compares baseline_hash vs current_hash:
if ($baselineItem['baseline_hash'] !== $currentItem['current_hash']) {
    $drift[] = ['change_type' => 'different_version', ...];
}
If the admin changed a policy setting but the meta signals (etag, scope_tag_ids, assignment_target_count) stayed the same, baseline_hash === current_hash and NO drift is detected.
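The blind spot can be reproduced in a few lines. The sketch below is not the project's code: hashNormalized() is a simplified stand-in for DriftHasher::hashNormalized() (recursive key sort, then SHA-256 over the JSON encoding), and buildMetaContract() is a hypothetical helper that mirrors the seven contract fields listed above.

```php
<?php
declare(strict_types=1);

// Sort array keys recursively so that key order never affects the hash.
function sortKeysRecursive(array $value): array
{
    foreach ($value as $key => $item) {
        if (is_array($item)) {
            $value[$key] = sortKeysRecursive($item);
        }
    }
    ksort($value);

    return $value;
}

// Simplified stand-in for DriftHasher::hashNormalized(); the real class may differ.
function hashNormalized(array $data): string
{
    return hash('sha256', json_encode(sortKeysRecursive($data), JSON_THROW_ON_ERROR));
}

// Hypothetical helper mirroring the meta contract fields shown above.
function buildMetaContract(string $policyType, string $externalId, array $meta): array
{
    return [
        'version' => 1,
        'policy_type' => $policyType,
        'subject_external_id' => $externalId,
        'odata_type' => $meta['odata_type'] ?? null,
        'etag' => $meta['etag'] ?? null,
        'scope_tag_ids' => $meta['scope_tag_ids'] ?? [],
        'assignment_target_count' => $meta['assignment_target_count'] ?? null,
    ];
}

// Same meta signals before and after; only a setting value changes.
$meta = [
    'odata_type' => '#microsoft.graph.deviceManagementConfigurationPolicy',
    'etag' => '"abc"',
    'scope_tag_ids' => ['0'],
    'assignment_target_count' => 3,
];
$settingsBefore = ['minPasswordLength' => 8];
$settingsAfter = ['minPasswordLength' => 4];

$metaBefore = hashNormalized(buildMetaContract('settingsCatalogPolicy', 'guid-1', $meta));
$metaAfter = hashNormalized(buildMetaContract('settingsCatalogPolicy', 'guid-1', $meta));
$contentBefore = hashNormalized($settingsBefore);
$contentAfter = hashNormalized($settingsAfter);

var_dump($metaBefore === $metaAfter);       // bool(true): meta hash sees no drift
var_dump($contentBefore === $contentAfter); // bool(false): content hash sees the change
```

The sample setting name and values are invented; the point is only that a hash over the meta contract cannot change when the settings are not part of its input.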
Why etag is unreliable
Microsoft Graph etag behavior varies by resource type:
- Some types update etag on any property change (including settings)
- Some types update etag only on top-level property changes (not nested settings)
- Settings Catalog policies may or may not update the parent resource etag when child settings are modified (the settings are a separate subresource at /configurationPolicies/{id}/settings)
- Group Policy Configurations have settings in definitionValues → presentationValues (multi-level nesting); etag at root level may not reflect these changes
The Contrast: How Backup Drift Does Detect Settings Changes
DriftFindingGenerator::generate() (line 32-80) operates on PolicyVersion.snapshot — the full JSON captured via per-item GET:
$baselineSnapshot = $baselineVersion->snapshot; // Full JSON from Graph GET
$currentSnapshot = $currentVersion->snapshot;
$baselineNormalized = $this->settingsNormalizer->normalizeForDiff($baselineSnapshot, $policyType, $platform);
$currentNormalized = $this->settingsNormalizer->normalizeForDiff($currentSnapshot, $policyType, $platform);
$baselineSnapshotHash = $this->hasher->hashNormalized($baselineNormalized);
$currentSnapshotHash = $this->hasher->hashNormalized($currentNormalized);
if ($baselineSnapshotHash !== $currentSnapshotHash) {
// → Drift finding with change_type = 'modified'
}
This pipeline captures actual settings values, normalizes them per policy type, strips volatile metadata, and hashes the result. If a setting changed, the hash changes, and drift is detected.
Summary Visualization
Admin changes Wi-Fi password in Intune
│
▼
┌─────────────────────────────────┐
│ Graph LIST (inventory sync) │
│ returns: displayName, etag, ... │
│ │
│ etag MAY change, settings NOT │
│ returned by LIST endpoint │
└────────────┬────────────────────┘
│
┌───────┴────────┐
▼ ▼
InventoryItem PolicyVersion
(meta only) (if backup ran)
│ │
▼ ▼
Meta Contract Full Snapshot
hash unchanged hash CHANGED
│ │
▼ ▼
Baseline Backup Drift:
Compare: "modified" ✅
NO DRIFT ❌
5. Code Evidence Table
| # | Class / File | Lines | Role | Key Finding |
|---|---|---|---|---|
| 1 | app/Jobs/CompareBaselineToTenantJob.php | 785 total; L367-409 (loadCurrentInventory), L435-500 (computeDrift) | Core baseline compare job | Reads from InventoryItem only; hashes via InventoryMetaContract → blind to settings |
| 2 | app/Services/Baselines/InventoryMetaContract.php | 75 total; L30-57 (build) | Meta hash contract builder | Hashes only: version, policy_type, external_id, odata_type, etag, scope_tag_ids, assignment_target_count; no settings content |
| 3 | app/Services/Baselines/BaselineSnapshotIdentity.php | 73 total; L56-67 (hashItemContent) | Per-item hash via meta contract | Delegates to InventoryMetaContract.build() → DriftHasher.hashNormalized() |
| 4 | app/Jobs/CaptureBaselineSnapshotJob.php | 305 total | Captures snapshot from inventory | Reads InventoryItem, stores fidelity='meta' and source='inventory' |
| 5 | app/Services/Drift/DriftFindingGenerator.php | 484 total; L32-80 (generate), L250-267 (recurrenceKey) | Backup drift finding generator | Uses PolicyVersion.snapshot with SettingsNormalizer → detects settings changes |
| 6 | app/Services/Drift/DriftHasher.php | 100 total; L13-24 (hashNormalized) | Shared hasher | sha256(json_encode(normalized)) with volatile key removal. SHARED by both systems. |
| 7 | app/Services/Drift/Normalizers/SettingsNormalizer.php | 22 total | Thin wrapper | Delegates to PolicyNormalizer::flattenForDiff(). Used by System B only. |
| 8 | app/Services/Intune/PolicyNormalizer.php | 67 total | Type-specific normalizer router | Routes to per-type normalizers for diff operations |
| 9 | app/Services/Inventory/InventorySyncService.php | 652 total; L340-450 (executeSelectionUnderLock) | LIST-based sync | Fetches from Graph LIST endpoints; extracts meta signals only; upserts InventoryItem |
| 10 | app/Services/Intune/BackupService.php | 438 total | Backup orchestration | Creates BackupSet, uses PolicyCaptureOrchestrator for per-item GET → PolicyVersion |
| 11 | app/Services/Intune/PolicyCaptureOrchestrator.php | 429 total | Per-item GET + hydration | Fetches full snapshot, assignments, scope tags; creates PolicyVersion with all content |
| 12 | app/Services/Intune/PolicySnapshotService.php | 852 total | Per-item Graph GET | Type-specific hydration (hydrateConfigurationPolicySettings, hydrateGroupPolicyConfiguration, etc.) |
| 13 | app/Services/Intune/VersionService.php | 312 total; L1-150 (captureVersion) | PolicyVersion persistence | Transactional, locking, consecutive version_number |
| 14 | app/Models/PolicyVersion.php | Model | PolicyVersion model | Casts: snapshot(array), assignments(array), scope_tags(array), plus hash columns |
| 15 | app/Models/InventoryItem.php | Model | Inventory item model | Casts: meta_jsonb(array); no settings content |
| 16 | app/Models/BaselineSnapshotItem.php | Model | Snapshot item model | Has baseline_hash(64), meta_jsonb |
| 17 | app/Support/Inventory/InventoryCoverage.php | 173 total | Coverage parser | fromContext() extracts per-type status from sync run context |
| 18 | app/Services/Drift/DriftRunSelector.php | ~60 total | Run pair selector | Selects 2 most recent sync runs with same selection_hash (System B only) |
| 19 | app/Jobs/GenerateDriftFindingsJob.php | ~200 total | Dispatcher for System B | Dispatches DriftFindingGenerator for policy-version-based drift |
| 20 | config/graph_contracts.php | 867 total | Policy type registry | Defines endpoints, hydration strategies, subresources, type families per policy type |
| 21 | config/tenantpilot.php | 385 total; L18-293 (supported_policy_types) | Application config | 28 supported policy types + 3 foundation types |
| 22 | specs/116-baseline-drift-engine/spec.md | 237 total | Feature spec | Defines v1 (meta) and v2 (content fidelity) requirements |
| 23 | specs/116-baseline-drift-engine/research.md | 200 total | Phase 0 research | 6 key decisions including v2 architecture strategy |
| 24 | specs/116-baseline-drift-engine/plan.md | 259 total | Implementation plan | Steps 1-7 for v1; v2 deferred |
6. Type Coverage Matrix
Coverage assessment for deep-drift feasibility: which types have per-type normalization and hydration support?
| # | policy_type | Label | Hydration | Subresources | Per-Type Normalizer | Deep-Drift Feasible | Notes |
|---|---|---|---|---|---|---|---|
| 1 | settingsCatalogPolicy | Settings Catalog Policy | configurationPolicies | settings (list) | Yes (via PolicyNormalizer) | YES | Most impactful: complex nested settings |
| 2 | endpointSecurityPolicy | Endpoint Security Policies | configurationPolicies | settings (list) | Yes (shared with settings catalog) | YES | Same endpoint family as settings catalog |
| 3 | securityBaselinePolicy | Security Baselines | configurationPolicies | settings (list) | Yes (shared) | YES | Same pipeline |
| 4 | groupPolicyConfiguration | Administrative Templates | groupPolicyConfigurations | definitionValues → presentationValues | Yes (via PolicyNormalizer) | YES | Multi-level nesting; hydration required |
| 5 | deviceConfiguration | Device Configuration | deviceConfigurations | None (properties on root) | Yes (via PolicyNormalizer) | YES | Properties directly on resource |
| 6 | deviceCompliancePolicy | Device Compliance | deviceCompliancePolicies | scheduledActionsForRule (expand) | Yes (via PolicyNormalizer) | YES | Actions subresource needs expand |
| 7 | windowsUpdateRing | Software Update Ring | deviceConfigurations (filtered) | None (properties on root) | Yes (shared with deviceConfig) | YES | Subset of deviceConfiguration |
| 8 | appProtectionPolicy | App Protection (MAM) | managedAppPolicies | None (properties) | Partial (via PolicyNormalizer) | YES | Mobile-focused |
| 9 | conditionalAccessPolicy | Conditional Access | identity/conditionalAccess/policies | None (properties) | Yes (via PolicyNormalizer) | YES | High-risk, preview-only restore |
| 10 | deviceManagementScript | PowerShell Scripts | deviceManagementScripts | None (scriptContent base64) | Partial | PARTIAL | Script content is base64 in snapshot |
| 11 | deviceShellScript | macOS Shell Scripts | deviceShellScripts | None (scriptContent base64) | Partial | PARTIAL | Same pattern as PS scripts |
| 12 | deviceHealthScript | Proactive Remediations | deviceHealthScripts | None | Partial | PARTIAL | Detection + remediation scripts |
| 13 | deviceComplianceScript | Custom Compliance Scripts | deviceComplianceScripts | None | Partial | PARTIAL | Script content |
| 14 | windowsFeatureUpdateProfile | Feature Updates | windowsFeatureUpdateProfiles | None | Yes | YES | Simple properties |
| 15 | windowsQualityUpdateProfile | Quality Updates | windowsQualityUpdateProfiles | None | Yes | YES | Simple properties |
| 16 | windowsDriverUpdateProfile | Driver Updates | windowsDriverUpdateProfiles | None | Yes | YES | Simple properties |
| 17 | mamAppConfiguration | App Config (MAM) | targetedManagedAppConfigurations | None | Partial | YES | Properties-based |
| 18 | managedDeviceAppConfiguration | App Config (Device) | mobileAppConfigurations | None | Partial | YES | Properties-based |
| 19 | windowsAutopilotDeploymentProfile | Autopilot Profiles | windowsAutopilotDeploymentProfiles | None | Minimal | YES | Properties-based |
| 20 | windowsEnrollmentStatusPage | Enrollment Status Page | deviceEnrollmentConfigurations | None | Minimal | META-ONLY | Enrollment types have limited settings |
| 21 | deviceEnrollmentLimitConfiguration | Enrollment Limits | deviceEnrollmentConfigurations | None | Minimal | META-ONLY | Numeric limit only |
| 22 | deviceEnrollmentPlatformRestrictionsConfiguration | Platform Restrictions | deviceEnrollmentConfigurations | None | Minimal | META-ONLY | Nested restriction config |
| 23 | deviceEnrollmentNotificationConfiguration | Enrollment Notifications | deviceEnrollmentConfigurations | None | Minimal | META-ONLY | Template snapshots nested |
| 24 | enrollmentRestriction | Enrollment Restrictions | deviceEnrollmentConfigurations | None | Minimal | META-ONLY | Mixed config type |
| 25 | termsAndConditions | Terms & Conditions | termsAndConditions | None | Yes | YES | bodyText, acceptanceStatement |
| 26 | endpointSecurityIntent | Endpoint Security Intents | intents | categories/settings (legacy) | Partial | PARTIAL | Legacy intent API; migrating to configPolicies |
| 27 | mobileApp | Applications | mobileApps | None | Minimal | META-ONLY | Metadata-only backup per config |
| 28 | policySet | Policy Sets | (if supported) | assignments | Minimal | META-ONLY | Container for other policies |
Foundation Types:
| # | foundation_type | Label | Deep-Drift | Notes |
|---|---|---|---|---|
| F1 | assignmentFilter | Assignment Filter | YES | rule property is key content |
| F2 | roleScopeTag | Scope Tag | META-ONLY | displayName + description only |
| F3 | notificationMessageTemplate | Notification Template | PARTIAL | Localized messages are subresource |
Summary:
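- Full content-fidelity feasible: 16 of the 28 policy types (settingsCatalog, endpointSecurity, securityBaseline, groupPolicy, deviceConfig, compliance, updateRings/profiles, appProtection, conditionalAccess, appConfigs, autopilot, termsAndConditions), plus the assignmentFilter foundation type
- Partial (script content / legacy APIs): 5 policy types (the four script types and endpointSecurityIntent), plus the notificationMessageTemplate foundation type
- Meta-only sufficient: 7 policy types (enrollment configs, mobileApp, policySet), plus the roleScopeTag foundation type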
7. Proposal: Deep Drift Implementation Plan
Phase v1.5 — Provider Chain (Opportunistic Content Fidelity)
Goal: Enable baseline compare to use existing PolicyVersions for content-fidelity hash when available, with meta-fidelity fallback.
Estimated effort: 3-5 days
Step 1: ContentHashProvider Interface
// app/Contracts/Baselines/ContentHashProvider.php
use Carbon\CarbonImmutable;

interface ContentHashProvider
{
    /**
     * Resolve a hash for one policy, or null if this provider has no usable data.
     *
     * @return array{hash: string, fidelity: string, source: string}|null
     */
    public function resolve(string $policyType, string $externalId, int $tenantId, CarbonImmutable $since): ?array;
}
Step 2: PolicyVersionContentProvider
// app/Services/Baselines/PolicyVersionContentProvider.php
// Looks up the latest PolicyVersion for (tenant_id, external_id, policy_type)
// captured_at >= $since (baseline snapshot timestamp)
// Returns SettingsNormalizer → DriftHasher hash with fidelity='content'
Step 3: MetaFallbackProvider (existing logic)
// Wraps InventoryMetaContract → DriftHasher → fidelity='meta'
Step 4: ContentProviderChain
// Iterates [PolicyVersionContentProvider, MetaFallbackProvider]
// Returns first non-null result
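Steps 1 to 4 could fit together roughly as sketched below. This is a dependency-free illustration, not the planned implementation: DateTimeImmutable replaces CarbonImmutable, InMemoryPolicyVersionProvider is a hypothetical stand-in for PolicyVersionContentProvider, the arrays stand in for database lookups, and plain json_encode + SHA-256 replaces the SettingsNormalizer → DriftHasher pipeline.

```php
<?php
declare(strict_types=1);

interface ContentHashProvider
{
    /** @return array{hash: string, fidelity: string, source: string}|null */
    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array;
}

// Hypothetical in-memory stand-in for PolicyVersionContentProvider.
final class InMemoryPolicyVersionProvider implements ContentHashProvider
{
    /** @param array<string, array{id: int, captured_at: DateTimeImmutable, snapshot: array}> $versions keyed by "type|externalId" */
    public function __construct(private array $versions) {}

    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array
    {
        $version = $this->versions["{$policyType}|{$externalId}"] ?? null;
        // Missing, stale, or empty versions are skipped so the next provider can answer.
        if ($version === null || $version['captured_at'] < $since || $version['snapshot'] === []) {
            return null;
        }

        return [
            'hash' => hash('sha256', json_encode($version['snapshot'], JSON_THROW_ON_ERROR)),
            'fidelity' => 'content',
            'source' => 'policy_version:' . $version['id'],
        ];
    }
}

// Always answers, using whatever meta is known for the item.
final class MetaFallbackProvider implements ContentHashProvider
{
    /** @param array<string, array> $metaByKey keyed by "type|externalId" */
    public function __construct(private array $metaByKey) {}

    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array
    {
        $meta = $this->metaByKey["{$policyType}|{$externalId}"] ?? [];

        return [
            'hash' => hash('sha256', json_encode($meta, JSON_THROW_ON_ERROR)),
            'fidelity' => 'meta',
            'source' => 'inventory_meta_v1',
        ];
    }
}

// Returns the first non-null result, in provider order.
final class ContentProviderChain implements ContentHashProvider
{
    /** @param list<ContentHashProvider> $providers */
    public function __construct(private array $providers) {}

    public function resolve(string $policyType, string $externalId, int $tenantId, DateTimeImmutable $since): ?array
    {
        foreach ($this->providers as $provider) {
            $result = $provider->resolve($policyType, $externalId, $tenantId, $since);
            if ($result !== null) {
                return $result;
            }
        }

        return null;
    }
}

$snapshotAt = new DateTimeImmutable('2025-07-01T00:00:00Z');
$chain = new ContentProviderChain([
    new InMemoryPolicyVersionProvider([
        'settingsCatalogPolicy|fresh' => ['id' => 7, 'captured_at' => new DateTimeImmutable('2025-07-10T00:00:00Z'), 'snapshot' => ['x' => 1]],
        'settingsCatalogPolicy|stale' => ['id' => 3, 'captured_at' => new DateTimeImmutable('2025-06-01T00:00:00Z'), 'snapshot' => ['x' => 1]],
    ]),
    new MetaFallbackProvider([]),
]);

$fresh = $chain->resolve('settingsCatalogPolicy', 'fresh', 1, $snapshotAt);
$stale = $chain->resolve('settingsCatalogPolicy', 'stale', 1, $snapshotAt);
echo $fresh['source'], ' / ', $stale['source'], PHP_EOL; // policy_version:7 / inventory_meta_v1
```

Keeping the chain itself behind the same interface means the compare job only ever depends on ContentHashProvider, so a later inventory-content provider could be slotted in without touching the job.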
Step 5: Integration in CompareBaselineToTenantJob
- loadCurrentInventory() accepts optional ContentProviderChain
- For each item: try chain, record fidelity + source
- computeDrift() unchanged (still hash vs hash comparison)
- Finding evidence includes fidelity and content_hash_source
Step 6: CaptureBaselineSnapshotJob enhancement
- Optional: during capture, also try PolicyVersionContentProvider to store content-fidelity baseline_hash
- Store content_hash_source in baseline_snapshot_items.meta_jsonb
- This means: if a backup was taken before baseline capture, the baseline itself is content-fidelity
Step 7: Coverage extension
- Add content_coverage to compare run context: which types had PolicyVersions, which fell back to meta
- Display in operation detail UI
Migration
-- Optional: add column for source tracking
ALTER TABLE baseline_snapshot_items
ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1';
Phase v2.0 — On-Demand Content Capture (Future)
Goal: For types without recent PolicyVersions, perform targeted per-item GET during baseline capture/compare.
Estimated effort: 5-8 days
- Introduce BaselineContentCaptureJob that, for a given baseline profile's scope, identifies items lacking recent PolicyVersions and performs targeted GET + PolicyVersion creation.
- Reuses existing PolicyCaptureOrchestrator with a new "baseline-triggered" context.
- Adds capture_mode to baseline profile: meta_only (v1), opportunistic (v1.5), full_content (v2.0).
- Rate limiting: per-tenant throttle to avoid Graph API quota issues.
- Budget guard: max N items per capture run, with continuation support.
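One possible shape for the budget guard with continuation is a cursor over sorted external IDs. The sketch below assumes a hypothetical nextCaptureBatch() helper and string IDs; how the cursor would be persisted between runs is left open, and rate limiting is not covered.

```php
<?php
declare(strict_types=1);

/**
 * Pick the next batch of items to capture, at most $maxItems per run.
 *
 * @param list<string> $pendingIds external IDs that lack a recent PolicyVersion
 * @param string|null $cursor last ID handled by the previous run, or null on the first run
 * @return array{items: list<string>, next_cursor: string|null}
 */
function nextCaptureBatch(array $pendingIds, int $maxItems, ?string $cursor = null): array
{
    if ($maxItems < 1) {
        throw new InvalidArgumentException('maxItems must be at least 1');
    }

    // A stable order makes the cursor meaningful across runs.
    sort($pendingIds, SORT_STRING);
    $remaining = array_values(array_filter(
        $pendingIds,
        static fn (string $id): bool => $cursor === null || strcmp($id, $cursor) > 0
    ));

    $items = array_slice($remaining, 0, $maxItems);
    $hasMore = count($remaining) > count($items);

    // A null cursor signals that nothing is left for a follow-up run.
    return ['items' => $items, 'next_cursor' => $hasMore ? end($items) : null];
}

// Drive the batches until the work is exhausted (IDs are illustrative).
$ids = ['p3', 'p1', 'p2', 'p5', 'p4'];
$cursor = null;
$runs = [];
do {
    $batch = nextCaptureBatch($ids, 2, $cursor);
    $runs[] = $batch['items'];
    $cursor = $batch['next_cursor'];
} while ($cursor !== null);

echo json_encode($runs), PHP_EOL; // [["p1","p2"],["p3","p4"],["p5"]]
```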
Phase v2.5 — Inventory Content Enrichment (Future, Optional)
Goal: Optionally have inventory sync capture settings content inline during LIST (where type supports $expand).
- Some types may support $expand=settings on LIST (settings catalog, endpoint security); this still needs validation against Graph (see OQ-5).
- This would give "free" content fidelity without per-item GET.
- High complexity: varies per type, may increase LIST payload size significantly.
- Evaluate ROI after v2.0 ships.
8. Test Plan (Enterprise)
Unit Tests
| # | Test File | Scope | Key Assertions |
|---|---|---|---|
| U1 | tests/Unit/Baselines/ContentProviderChainTest.php | Provider chain resolution | First provider wins; null fallback; fidelity recorded correctly |
| U2 | tests/Unit/Baselines/PolicyVersionContentProviderTest.php | PolicyVersion lookup + normalization | Correct hash for known snapshot; returns null when no PV; respects $since cutoff |
| U3 | tests/Unit/Baselines/MetaFallbackProviderTest.php | Meta contract fallback | Produces fidelity='meta'; matches existing InventoryMetaContract behavior exactly |
| U4 | tests/Unit/Baselines/InventoryMetaContractTest.php | (Existing) contract stability | Null handling, ordering, versioning; extend for edge cases |
Feature Tests
| # | Test File | Scope | Key Assertions |
|---|---|---|---|
| F1 | tests/Feature/Baselines/BaselineCompareContentFidelityTest.php | End-to-end compare with PolicyVersions available | Settings change → different_version finding with fidelity='content' |
| F2 | tests/Feature/Baselines/BaselineCompareMixedFidelityTest.php | Some types have PV, some don't | Mixed fidelity values in findings; coverage context records both |
| F3 | tests/Feature/Baselines/BaselineCompareFallbackTest.php | No PolicyVersions available | Falls back to meta fidelity; identical behavior to v1 |
| F4 | tests/Feature/Baselines/BaselineCaptureFidelityTest.php | Capture with PolicyVersions present | baseline_hash uses content fidelity; content_hash_source recorded |
| F5 | tests/Feature/Baselines/BaselineCompareStaleVersionTest.php | PolicyVersion older than snapshot | Falls back to meta (stale PV not used) |
| F6 | tests/Feature/Baselines/BaselineCompareCoverageGuardContentTest.php | Coverage reporting for content types | content_coverage in run context shows which types are content-covered |
Existing Tests to Preserve
| # | Test File | Impact |
|---|---|---|
| E1 | tests/Feature/Baselines/BaselineCompareFindingsTest.php | Must still pass: meta fidelity is default when no PV exists |
| E2 | tests/Feature/Baselines/BaselineComparePreconditionsTest.php | No change expected |
| E3 | tests/Feature/Baselines/BaselineCompareStatsTest.php | Stats remain grouped by scope_key; may need fidelity breakdown |
| E4 | tests/Feature/Baselines/BaselineOperabilityAutoCloseTest.php | Auto-close unaffected by fidelity source |
Integration / Regression
| # | Test | Scope |
|---|---|---|
| I1 | Content hash stability across serialization | JSON encode/decode round-trip does not change hash |
| I2 | PolicyVersion normalizer alignment | Same snapshot → SettingsNormalizer produces same hash in both System A (via provider) and System B (via DriftFindingGenerator) |
| I3 | Hash collision protection | Different settings → different hashes (property-based test with sample data) |
| I4 | Empty snapshot edge case | PolicyVersion with empty/null snapshot → provider returns null → fallback works |
Performance Tests
| # | Test | Acceptance Criteria |
|---|---|---|
| P1 | Compare job with 500 items, 50% with PolicyVersions | Completes in < 30s (DB-only, no Graph calls) |
| P2 | Provider chain query efficiency | PolicyVersion lookup uses batch query, not N+1 |
9. Open Questions / Assumptions
Open Questions
| # | Question | Impact | Proposed Resolution |
|---|---|---|---|
| OQ-1 | Staleness threshold for PolicyVersions: How old can a PolicyVersion be before we reject it as a content source? | Determines false-negative risk | Default: PolicyVersion must be captured after the baseline snapshot's captured_at. Configurable per workspace. |
| OQ-2 | Mixed fidelity UX: How should the UI display findings with different fidelity levels? | User trust and understanding | Badge/icon on finding cards: "High confidence (content)" vs "Structural only (meta)". Filterable in findings table. |
| OQ-3 | Should baseline capture force a backup if no recent PolicyVersions exist? | API cost vs accuracy trade-off | No for v1.5 (opportunistic only). Yes for v2.0 as opt-in capture_mode: full_content. |
| OQ-4 | etag as change hint: Should we use etag changes as a trigger for on-demand PolicyVersion capture? | Could reduce unnecessary GETs | Worth investigating in v2.0. If etag changes during inventory sync, schedule targeted per-item GET for that policy only. |
| OQ-5 | Settings Catalog $expand=settings on LIST: Does Microsoft Graph support this? | Could give "free" content fidelity for settings catalog types | Needs validation against Graph API. If supported, would eliminate per-item GET for the most impactful type. |
| OQ-6 | Retention / pruning interaction: If old PolicyVersions are pruned, does that affect baseline compare? | Could lose content fidelity for old baselines | Baseline compare only needs versions captured after baseline snapshot. Pruning policy should respect active baseline snapshots. |
Assumptions
| # | Assumption | Risk if Wrong |
|---|---|---|
| A-1 | DriftHasher::hashNormalized() is deterministic across PHP serialization boundaries | Hash mismatch → false drift findings. Validated: uses json_encode with stable flags + ksort. |
| A-2 | SettingsNormalizer / PolicyNormalizer produce the same output for the same input regardless of call context (System A vs System B) | Hash inconsistency between systems. Low risk: same code path. |
| A-3 | PolicyVersions from backups contain complete settings (not partial hydration) | Incomplete content → false negatives or incorrect hashes. Validated: PolicySnapshotService performs full hydration per type. |
| A-4 | The Finding model's fingerprint/recurrence_key identity allows mixed fidelity sources | Identity collision if fidelity changes source. Safe: recurrence_key includes snapshot_id, not hash value. |
| A-5 | Graph LIST endpoints do NOT return settings values for any supported policy type | If wrong, inventory sync could capture settings "for free". Validated: LIST returns only $select fields per graph_contracts.php. |
| A-6 | Per-type normalizers in backup drift path handle all 28 supported policy types | If not, some types would produce unstable hashes. Partially validated: PolicyNormalizer has a fallback for unknown types. |
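Assumption A-1 rests on key ordering being canonical before serialization. The following is a minimal sketch of that idea, assuming PHP 8.1+; it is not the actual DriftHasher implementation. The names canonicalize and hashNormalizedSketch are hypothetical, and treating list order as significant is an assumption of this sketch.

```php
<?php

/**
 * Recursively sort associative arrays by key so that logically equal
 * structures serialize identically. List (sequential) arrays keep their order.
 */
function canonicalize(mixed $value): mixed
{
    if (!is_array($value)) {
        return $value;
    }

    // Canonicalize children first; array_map preserves keys for a single array.
    $value = array_map('canonicalize', $value);

    if (!array_is_list($value)) {
        ksort($value, SORT_STRING);
    }

    return $value;
}

/**
 * Hypothetical stand-in for a deterministic hash over normalized content.
 */
function hashNormalizedSketch(array $normalized): string
{
    $json = json_encode(
        canonicalize($normalized),
        JSON_UNESCAPED_SLASHES | JSON_UNESCAPED_UNICODE | JSON_THROW_ON_ERROR
    );

    return hash('sha256', $json);
}
```

Two arrays that differ only in key order hash to the same value here, while a changed value or a reordered list produces a different hash.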
10. Key Questions Answered
KQ-01: Are Baseline Compare and Backup Drift truly separate systems?
Yes. They share DriftHasher and the Finding model, but differ in:
- Data source: InventoryItem vs PolicyVersion
- Hash contract: InventoryMetaContract (7 fields, meta only) vs SettingsNormalizer → PolicyNormalizer (full snapshot)
- Finding generator: CompareBaselineToTenantJob::computeDrift() vs DriftFindingGenerator::generate()
- Finding identity: different recurrence key structures
- Scope model: BaselineProfile-scoped vs selection_hash-scoped
- Trigger: post-inventory-sync vs post-backup
- Coverage: InventoryCoverage guard vs none (trusts backup completeness)
KQ-02: Should they be unified or remain separate?
Hybrid approach (Provider Chain) — as designed in Spec 116 v2. Keep separate triggering and scoping, but let System A consume data produced by System B (PolicyVersions) via a provider chain. This avoids:
- Merging two fundamentally different scoping models
- Introducing new Graph API costs
- Disrupting existing backup drift workflows
KQ-03: What is the minimal viable "v1.5" to bridge the gap?
Add a PolicyVersionContentProvider that checks for recent PolicyVersions as part of baseline compare's hash computation. For types where a PolicyVersion exists (i.e., a backup was taken), the compare immediately gains content-fidelity. For types without, meta-fidelity continues as before. Net code change: ~200-300 lines (interface + 2 providers + chain + integration).
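The "interface + 2 providers + chain" shape could look roughly like the sketch below (PHP 8.0+). Everything except the class names PolicyVersionContentProvider, MetaFallbackProvider, and ContentProviderChain is an assumption made for illustration: the ContentProvider interface, the resolve() signature, the "type|externalId" lookup keys, and the in-memory arrays standing in for database queries. The staleness check applies the default proposed under OQ-1, which is still an open question.

```php
<?php

interface ContentProvider
{
    /** @return array{hash: string, fidelity: string, source: string}|null */
    public function resolve(string $policyType, string $externalId, DateTimeImmutable $baselineCapturedAt): ?array;
}

final class PolicyVersionContentProvider implements ContentProvider
{
    /** @param array<string, array{id: int, captured_at: string, hash: string}> $latestVersions */
    public function __construct(private array $latestVersions) {}

    public function resolve(string $policyType, string $externalId, DateTimeImmutable $baselineCapturedAt): ?array
    {
        $version = $this->latestVersions["$policyType|$externalId"] ?? null;

        // No backup taken, or the version predates the baseline: defer to the next provider.
        if ($version === null || new DateTimeImmutable($version['captured_at']) <= $baselineCapturedAt) {
            return null;
        }

        return ['hash' => $version['hash'], 'fidelity' => 'content', 'source' => 'policy_version:' . $version['id']];
    }
}

final class MetaFallbackProvider implements ContentProvider
{
    /** @param array<string, string> $metaHashes precomputed meta hashes */
    public function __construct(private array $metaHashes) {}

    public function resolve(string $policyType, string $externalId, DateTimeImmutable $baselineCapturedAt): ?array
    {
        $hash = $this->metaHashes["$policyType|$externalId"] ?? null;

        return $hash === null ? null : ['hash' => $hash, 'fidelity' => 'meta', 'source' => 'inventory_meta_v1'];
    }
}

final class ContentProviderChain
{
    /** @param ContentProvider[] $providers in precedence order */
    public function __construct(private array $providers) {}

    public function resolve(string $policyType, string $externalId, DateTimeImmutable $baselineCapturedAt): ?array
    {
        foreach ($this->providers as $provider) {
            $result = $provider->resolve($policyType, $externalId, $baselineCapturedAt);
            if ($result !== null) {
                return $result; // first provider with an answer wins
            }
        }

        return null; // no provider covers this subject
    }
}
```

The sketch only selects a hash source per subject. It assumes the hashes themselves are computed elsewhere and that the baseline side was hashed at the same fidelity, otherwise a content hash would be compared against a meta hash.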
KQ-04: Which types benefit most from content-fidelity drift?
Top priority (complex settings, high change frequency):
1. settingsCatalogPolicy: most common, deeply nested settings
2. groupPolicyConfiguration: multi-level nesting (definitionValues → presentationValues)
3. deviceCompliancePolicy: compliance rules + scheduled actions
4. deviceConfiguration: broad category, many OData sub-types
5. endpointSecurityPolicy: critical security settings
6. securityBaselinePolicy: security-critical baselines
7. conditionalAccessPolicy: identity security gate
Medium priority (simpler settings but still valuable):
8. appProtectionPolicy, windowsUpdateRing, windowsFeatureUpdateProfile, windowsQualityUpdateProfile
KQ-05: How does coverage work and how should it extend for content fidelity?
Currently: InventoryCoverage::fromContext(latestSyncRun->context) → coveredTypes() returns the types whose sync status is succeeded. Findings for uncovered types are suppressed, and the run outcome becomes partially_succeeded.
For v1.5: Add content_coverage alongside meta_coverage:
- content_covered_types: types where PolicyVersion exists post-baseline
- meta_only_types: types where only meta is available
- uncovered_types: types with no coverage at all (findings suppressed)
Finding evidence should include:
{
  "fidelity": "content",
  "content_hash_source": "policy_version:42",
  "note": "Hash computed from PolicyVersion #42 captured 2025-07-14T10:30:00Z"
}
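The three-way coverage split proposed for v1.5 reduces to set arithmetic over policy types. The following is a minimal sketch under stated assumptions: the function name classifyCoverage and its parameters are hypothetical, and a type with a post-baseline PolicyVersion is counted as content-covered even if the latest inventory sync did not cover it.

```php
<?php

/**
 * Hypothetical helper: partition the baseline's policy types by coverage level.
 *
 * @param string[] $baselineTypes  policy types in the baseline scope
 * @param string[] $metaCovered    types the latest inventory sync covered
 * @param string[] $contentCovered types with a PolicyVersion captured post-baseline
 * @return array<string, string[]>
 */
function classifyCoverage(array $baselineTypes, array $metaCovered, array $contentCovered): array
{
    // Content fidelity takes precedence over meta fidelity.
    $content = array_values(array_intersect($baselineTypes, $contentCovered));

    // Meta-only: covered by inventory sync but without usable content.
    $metaOnly = array_values(array_diff(array_intersect($baselineTypes, $metaCovered), $content));

    // Everything else has no coverage; findings for these types are suppressed.
    $uncovered = array_values(array_diff($baselineTypes, $content, $metaOnly));

    return [
        'content_covered_types' => $content,
        'meta_only_types' => $metaOnly,
        'uncovered_types' => $uncovered,
    ];
}
```

Every baseline type lands in exactly one bucket, so the three lists can be stored together in the run context without overlap.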
KQ-06: What is the long-term unified architecture?
Provider precedence chain with configurable capture modes:
BaselineProfile.capture_mode:
  'meta_only'     → InventoryMetaContract only (v1)
  'opportunistic' → PolicyVersion if available → meta fallback (v1.5)
  'full_content'  → On-demand GET for missing types → PolicyVersion → meta (v2.0)
ContentProviderChain:
  1. PolicyVersionContentProvider (checks existing PolicyVersions)
  2. InventoryContentProvider (future: if inventory sync enriched)
  3. MetaFallbackProvider (InventoryMetaContract v1)
The long-term vision is that baseline capture + compare use the same normalizer pipeline as backup drift, producing identical hashes for identical content regardless of which system produced the PolicyVersion. This is achievable because DriftHasher and SettingsNormalizer are already shared code.
Appendix: Database Schema Reference
baseline_snapshot_items (current)
id BIGINT PK
baseline_snapshot_id BIGINT FK → baseline_snapshots
subject_type VARCHAR(255) -- 'policy'
subject_external_id VARCHAR(255) -- Graph resource GUID
policy_type VARCHAR(255) -- e.g. 'settingsCatalogPolicy'
baseline_hash VARCHAR(64) -- sha256 of InventoryMetaContract
meta_jsonb JSONB -- {display_name, category, platform, meta_contract: {...}, fidelity, source}
created_at TIMESTAMP
updated_at TIMESTAMP
inventory_items (current)
id BIGINT PK
tenant_id BIGINT FK → tenants
policy_type VARCHAR(255)
external_id VARCHAR(255)
display_name VARCHAR(255)
category VARCHAR(255) NULL
platform VARCHAR(255) NULL
meta_jsonb JSONB -- {odata_type, etag, scope_tag_ids, assignment_target_count}
last_seen_at TIMESTAMP NULL
last_seen_operation_run_id BIGINT NULL
created_at TIMESTAMP
updated_at TIMESTAMP
policy_versions (current)
id BIGINT PK
tenant_id BIGINT FK → tenants
policy_id BIGINT FK → policies
version_number INTEGER
policy_type VARCHAR(255)
platform VARCHAR(255) NULL
created_by VARCHAR(255) NULL
captured_at TIMESTAMP
snapshot JSON -- FULL Graph GET response (hydrated)
metadata JSON -- additional metadata
assignments JSON NULL -- full assignments array
scope_tags JSON NULL -- scope tag IDs
assignments_hash VARCHAR(64) NULL
scope_tags_hash VARCHAR(64) NULL
created_at TIMESTAMP
updated_at TIMESTAMP
deleted_at TIMESTAMP NULL -- soft delete
Proposed v1.5 Addition
ALTER TABLE baseline_snapshot_items
ADD COLUMN content_hash_source VARCHAR(255) NULL DEFAULT 'inventory_meta_v1';
-- Values: 'inventory_meta_v1', 'policy_version:{id}', 'inventory_content_v2'
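Consumers of the proposed column need to turn its string values back into a fidelity level and, where present, a PolicyVersion id. The following is a minimal sketch: the function name parseContentHashSource is hypothetical, and two mappings are assumptions of this sketch, namely NULL being treated like 'inventory_meta_v1' and 'inventory_content_v2' being mapped to content fidelity.

```php
<?php

/**
 * Hypothetical parser for the proposed content_hash_source column.
 *
 * @return array{fidelity: string, policy_version_id: int|null}
 */
function parseContentHashSource(?string $value): array
{
    // Assumption: a NULL column behaves like the v1 default.
    $value ??= 'inventory_meta_v1';

    if ($value === 'inventory_meta_v1') {
        return ['fidelity' => 'meta', 'policy_version_id' => null];
    }

    if ($value === 'inventory_content_v2') {
        return ['fidelity' => 'content', 'policy_version_id' => null];
    }

    if (preg_match('/^policy_version:(\d+)$/', $value, $matches) === 1) {
        return ['fidelity' => 'content', 'policy_version_id' => (int) $matches[1]];
    }

    throw new InvalidArgumentException("Unknown content_hash_source: $value");
}
```

Rejecting unknown values early keeps a malformed source string from being silently shown as meta fidelity in the UI.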